Recover Threads and Posts in vBulletin

How to Selectively Recover Threads and Posts in vBulletin 4

You found this page because you made a foo bar on your vBulletin forum and wish to selectively recover threads and posts from a particular forum that was accidentally removed. I think I can help you do this! I had this project and I’m going to provide the steps and tools and queries I used to fix the situation.

You would follow this information if somehow you deleted or pruned threads from a forum accidentally and need them back but don’t want to restore the entire database and file system. Official vBulletin rules are that you should always restore the entire database if you need to recover information. The following procedures are certainly NOT official or endorsed and guess what, if you screw things up, it will only be your fault! Proceed if you are comfortable giving this a shot.

Things you will need:

  • A complete and working version of your backup forum that contains the missing threads/posts/attachments that you want to recover. It’s okay if the backup is older, you can only recover from your latest backup after all.
  • Moderate knowledge of working with MySQL queries and a means to export data from the queries. (I use Navicat as a DB manager). You will have to adapt this to working in phpMyAdmin or command line as needed.
  • A PHP server where you can upload the files from this page and run the included script. Since you have vBulletin, you already have a PHP server, but please use an isolated folder for testing and don’t try to run any of this on live folders! Note that when I performed this, I was using VB 4.1.3. My current and backup were both the same version. I cannot say how this will work on other versions but I imagine all 4.0 versions have a similar database design with few changes. I also suppose this will work best if your backup and current forum are the same version as well. If not, make sure the table structure is the same and adjust your SQL files accordingly.

 

STEP 1: (Backup Threads)

Things you will need:

  • Access to backup database and ability to run queries and export SQL files from it.
  • If you are a very very careful person, you may even want a third copy of your forum to test with. In other words, one server is running your backup, another server is running a copy of your current forum. This way you can test the imports before going live.

When I did this, I had just one backup server, I first set up my test server with the backup data. Then I pruned the forum to duplicate what happened on the main server, then I tested the re-import. However, I found it necessary to recover the backup data again for further testing, so having two test versions of your forum may be easier, or not. Depends how confident you are! It did waste a lot of time rebuilding my one test server.

FROM NOW ON, NOTHING HAPPENS ON YOUR LIVE SERVER EXCEPT TO LOOK UP AN ID WHEN NEEDED! DO BE CAREFUL!

To get the missing threads, we only need the ID of the forum, you can find this from the admin section of VB in the forum manager area. Click the Go button and read the ID as seen in the next image. Or just hover over the link to the forum and get the ID from the URL. I assume you already know how to do this stuff, just being thorough.

forumid-1

forumid-2

In the image, I see the ID is 202. Once you have the forum ID, we can do the bulk of our work. Our first MySQL query is quite simple, we just have to get all the threads that belong to this forum, no less, no more. The query looks like this:

SELECT thread.* FROM thread WHERE thread.forumid = 73

Where “73” is the ID of the forum you need to recover.

That’s it, you need to export this data into an SQL file and call it ‘thread.sql’. In Navicat, I use the Export Wizard to save the file. In phpMyAdmin you will open your database and click the SQL tab button at the top:

phpmyadmin-query

Check that the query works by noticing how many records it returns, it should return a similar number as the threads you are missing. Look at your backup forum and see how many threads it has, compare this number to your query count. Since the backup is older, don’t expect the numbers to be the same. I will leave it up to you to make sure your query returned what you want. Next you want to click the Export link in phpMyAdmin. Not the link under the results table, but the one in the “Query results operations”.

phpmyadmin-export

Use the correct Export link!

Moving right along, you need to set the settings as shown in the next image when you export. We’re trying to export with as little “extra” stuff as possible, all we want is the INSERT statements. Navicat can automatically make a pretty clean output, but phpMyAdmin will include a bunch of stuff that you should remove from the file. Here is a screenshot of the settings and what was changed is marked:

phpmyadmin-export-settings

Here is a shot of the extra stuff that you can remove from the sql file. Use a text editor like Notepad, if the size is too big (Notepad can’t open it), use something that can handle larger file sizes like Notepad++ or other text editor. Remove everything but the INSERT statements, that’s all we need to import the data.

edit-sql-file

In the image, you can see I’ve removed everything except the clean “INSERT INTO…” lines. Edit and save this SQL file.

Now you have a clean SQL file with the INSERT statement of all your lost threads! If you like, you can test this by importing to your second backup install of your forum if you created one. In other words, if you set up a forum to duplicate the data loss of your primary forum, you can test importing this now, refresh the forum, and see if the empty threads appear. The threads won’t have any posts, of course, but they should be there. Information about importing data is down below. You will also need to run the update forum information utility from the admin section to make the threads/posts count accurate. If you don’t have the second backup server to play with, just hold on and move to the next step.

STEP 2: (Backup Posts)

Things you will need:

  • Access to backup database and ability to run queries and export SQL file. Same as Step 1.

This query will return a crap load more data than the other. You may have only had some hundreds or a few thousand threads in the last query, but the post count will be much higher. When I did this recovery, there were near 40,000 posts. That being said, queries, exports and all that will be a little slower. Like before, we need to query the database to get the posts. This time, we want to grab all the posts that belong to any threads which belong to the single forum. This is a two part query. We do this by joining the post and thread tables. It looks like this:

SELECT post.* FROM post
JOIN thread ON post.threadid = thread.threadid
WHERE thread.forumid = 73

In order not to insult your intelligence, I’ll simply say that you need to export this into an SQL file like we did with the thread table. Name it ‘post.sql’ and save it in the same folder as the thread.sql file created earlier. Also like before, if you happen to have a second backup database set up, you can try to import the posts file, and be sure to edit the post file and remove the extra statements beside the INSERT statements. Now that we have the post and thread files, we have to do a more complicated thing, and that is get all the missing attachments that were purged from the deleted posts.

STEP 3: (Backup Attachments)

Things you will need:

  • Access to backup database and ability to run queries and export SQL file. Same as Step 1 and 2.
  • Access to the file system of the forum in order to download the attachments folder. This procedure applies to forums which are set up to save attachments to the file system and not to the database. If your forum is set to save attachments to the database, none of this will apply to you and I don’t have the procedure for you. I imagine it might even be easier, since it can likely be found with queries as well.

Here, we have to mess with the database as well as the file system. Another thing to note, if your backup is very recent, perhaps just a few days or a week, it may be worth your while to simply replace the entire attachments folder with the backup version. You may create some orphan files this way, or bring some attachments back from the dead, but it’s a small price to pay. My backup was a good 3 months old, and I didn’t want to restore that many orphan files, so I’m selectively restoring the attachment files as well, copying back only those files which belong to the threads in our recovered forum. If you need to selectively restore attachments, continue…

First lets get our data out of the database just like we did with the threads and posts. In this case we need combine our previous query and go a step further, after getting our posts that belong to the threads that belong to the forum, we need the attachments that belong to those posts. The query looks like this:

SELECT attachment.*
FROM attachment
JOIN post ON post.postid = attachment.contentid
WHERE post.attach > 0 AND post.threadid
IN
   (
   SELECT thread.threadid
   FROM thread
   WHERE thread.forumid = 73
   )

Here we join the attachments table to the post table by ID. Then we also only select those posts that have at least one attachment, and then we make sure the posts only belong to threads which are in our recovered forum. Note that your query will not return all posts, only posts which have attachments, so the records returned won’t seem to mesh with anything you can check. For example if you have 2000 threads with 8000 posts, you might have 2000 posts with attachments, but your result will not be 2000, since many posts might have more than one attachment. So it will return 2500, for example. Whatever the case, this number is hard to verify and you might just have to trust it and move on. Most likely, though, the number returned should be higher than our “post” query, unless for some reason none of your posts could possibly have more than one attachment.

Here is where everything changes. Export the data just like you did before, using phpMyAdmin or Navicat or something else. Edit the file and remove the extra commands beside the INSERT statements. Next you need to export the data again, but this time export as an Excel XLS file, or a CSV file. Again, this is simply a second export of the same data but in a different format. We need this file to prepare for use with the script I’ll provide in a minute. The export looks more like this:

phpmyadmin-export-attachment

Export this file as ‘attachment.csv’ and open it up in Excel. If you happen to not own a spreadsheet program, you shouldn’t run a forum! In the file, remove all columns except the yellow columns in the image below:

excel-output

You are keeping the “contentid” column and the “filedataid” column, that’s it. These are what we need to know where files are stored in the file system. Next rename the file from ‘attachment.csv’ to ‘list.txt’. I want you to open the txt file and remove the column names. Your text file should look exactly like the image below, but of course with different IDs and the column names removed.

remove-line

With the list.txt file saved and in the correct format, you will need to move it into the folder where you copy the script downloaded below. The script will use this list in order to search through your backup attachments folder to look for the matching attachments. That is part of step 4. Like before with the thread and post SQL files, those of you who have a secondary copy of your forum installed to test the imports, you can try to import the attachment.sql file into the database. It won’t necessarily “fix” anything, since the physical attachment files are missing, but you can make sure it places the attachment box on the post. Look down in step 4 for a before and after shot of a post with and without attachments data inserted.

Now unlike the post and thread queries, to restore attachments, we actually need TWO more sets of data. We now need to get the data from the attachmentcategoryuser table. To grab this data, we’ll use the following query:

SELECT attachmentcategoryuser.*
FROM attachmentcategoryuser
WHERE attachmentcategoryuser.filedataid
IN
   (
   SELECT attachment.filedataid
   FROM attachment
   JOIN post ON post.postid = attachment.contentid
   WHERE post.attach > 0
   AND post.threadid
   IN
      (
      SELECT thread.threadid
      FROM thread
      WHERE thread.forumid = 73
      )
  )

These queries are getting a little more tricky! This time we grab the data by matching the filedataid to the attachments, and matching attachments that match post which match threads to the forum. Again remember to change the forum ID to the one you are recovering! Same drill applies, export the sql, check the file for extra stuff, and set it aside for later testing/importing. If you are adventurous, you can go ahead and import it to your secondary backup test server. But wait, there’s more! We need a third set of data, this time from the filedata table, and this should be our last set of data to get things working again. Without this last set of data, you will see attachment IDs listed in posts, but the files still won’t display until this is in there too. Here is the query:

SELECT DISTINCT filedata.*
FROM filedata
JOIN attachment ON filedata.filedataid = attachment.filedataid
WHERE attachment.contentid
IN
   (
   SELECT post.postid
   FROM post
   JOIN thread ON post.threadid = thread.threadid
   WHERE thread.forumid = 73
   )

Here we select the filedata by linking it to attachments and only selecting attachments which are in posts which are in threads which are in our forum. It should return a similar number of records as the attachment and attachmentcategoryuser queries. All three queries will probably not return the same number of records, but may be close in proximity though. Export the query data and clean it up. Those with secondary test server can import it and skip the next step, you should now get my script and find any missing attachment files to copy. If you decided to just restore the entire backup attachment folder, likely everything is now working and you should go to the last step. For the rest of you, hold on to all your queries and lets go import them!

STEP 4: (Restore Data)

Things you will need:

  • Your post.sql, thread.sql, attachment.sql, attachmentcategoryuser.sql, and filedata.sql files.
  • Direct access to the database directly on the server (optional, faster import when local to MySQL server).

Now it’s time to go ahead and test importing the data. At this point you should have a copy of your forum running on a test server. This should basically be an exact copy of the live server (missing the same data of course). If you have two test installs, you’re good to go, start working with your live site mirror. If you only have one backup to work with (from which you gathered all the queries) you might want to either install a mirror of your live site to test with, or somehow duplicate the issue it has. For me, to duplicate my lost threads all I did was run a prune threads on the forum. Once the threads were all gone, I could test these imports. However, if you find your queries don’t work or you need to re-export them again, tough luck! Cause now you have no copy of the backup and you have to restore the backup database again to your test server. I actually had to do this twice during testing, restoring the backup, and pruning again. If you a little more adventurous, you could skip the tests and just trying importing all this on your live server, but of course you may want to backup the database first! You would only have to restore the database if something corrupts, since we aren’t changing the filesystem at this time.

Copy your sql files to the server or anywhere they need to be to import, in this case I find it’s easier to import through the MySQL command line interface. My test server was running from WAMP on Windows 7, so I’ll give the example of doing this from Windows, but it’s pretty much the same from a Linux box. Assuming your files on are C: drive, and your MySQL executable is in your %PATH% system variable. On Linux you should be able to run MySQL from anywhere, on Windows you may have to change to the directory where MySQL is installed. In any event, open a command prompt and log in to MySQL with something like  > mysql -u root –p.

Next select your database with the command use vbulletin_forum; If you don’t know the name of the database, for some odd reason, type show databases; OK, so assuming you started up MySQL, logged in, and selected to use your test database, you can now import your files in the following order with these commands:

source c:\thread.sql;  You’ll see some output as it imports the data.

And then

source c:\post.sql;  More output.

Next the attachment.sql file:

source c:\attachment.sql; Output…

Now attachmentcategoryuser.sql file:

source c:attachmentcategoryuser.sql; Output…

And finally the filedata.sql file:

source c:filedata.sql; Output…

Note that in the MySQL output it may occasionally say error because a duplicate ID exists, that’s alright, if it already exists, no problem! But for the most part, it should report that “1 row affected” as it inserts each row. I don’t have the time to deal with every problem that may arise, but there is a way to contact me at the bottom and I may start adding some tech support to this page as problems arise. If you do have issues, it is most likely going to come down to how your sql file is formatted, or perhaps with your MySQL user permissions or something. Luckily you are doing this on a test server first! And I’m serious about the formation of your sql file, you might notice that it puts the wrong table name inside, or the column names didn’t match, or who knows what. It might be worth while to mention at this point a particular quirk with Navicat. When you create the query and then save it, the name you used to save the query will become the table name inside the sql output! Don’t believe, me, check this out:

query-name

Notice I saved my query and named it “filedata from specific forum”.

query-name-block

Now when I export, it uses the same name, and I can’t change it here!

Here is what the sql file looks like after exporting:

query-name-wrong

As you can see, I had to rename my queries in Navicat to match the table name I was importing in to. If you happen to use phpMyAdmin for this, keep in mind about script execution times. If you are importing zillions of lines of data, it’s better to execute directly on server. But in my case I had less than 1300 lines and imports took a number of seconds, no big deal. If all your importing is a success, you should be able to refresh the forum (on test server) and see all the threads and posts now recovered. Attachments won’t be there if the physical files are still missing, we do that next. Just for kicks, here is an image of a post before importing the attachments sql:

before-attachment

And after import:

after-attachment

You can see after importing attachments sql file, the attachments are connected to the post, but clicking on one will cause a 404 because the actual file has not been recovered yet.

Next we will deal with finding the missing attachment files in the file system. And that should be our last bit of work!

STEP 5: (Restore Attachment Files)

Things you will need:

  • Your list.txt file created earlier and properly formatted.
  • This php script and files. 
  • A PHP server from which to run the script. (Optional) Compare utility to select just the orphan files from your backup (to save on disk space).

I know it’s odd to create this thing to run in PHP, but it’s just what I happened to do. If somebody wants to make a desktop app or something, be my guest! The PHP code is fairly simple so it won’t be hard to see how it works. Also, I don’t know how much the script can handle as far as work load, if you pass it a gazillion byte txt file I doubt it will work. Perhaps you would have to split your txt file into like 5k lines each or something and run it multiple times, I don’t know! Mine had 1286 lines and ran over 30 seconds or so. Note that PHP’s default max_execution_time is 30 seconds, meaning you might get about 900 files copied before it dies. However, I wrote the script so that it should stay alive if your server has a normal PHP setup. If you run PHP in Safe Mode, probably not. Unzip the folder to your server someplace where you can access the script. The folder is called “attachfind” and has two subfolders in it, “backup” and “restore”, as well as the index.php file.

The backup folder is for you to copy your attachment files in to. That is, the backup folder which contains the attachments you need to find. vBulletin stores attachments in folders named ‘1’, ‘2’, ‘3’ and so forth with many subfolders. So you should end up with a folder called backup with folders 1,2,3,4 inside. In other words, don’t copy vBulletin’s ‘attachment’ folder into backup, but rather copy the files/folders within their attachment folder. The script needs to look for folders such as “backup/1/1/4/6”. I Hope that makes sense. Here is a screen shot in Windows of what mine looks like:

backup-folder

The empty “restore” folder will contain your missing attachments after the script runs, these will need to be copied back to your server when done. Also, copy your list.txt file to the same folder as index.php. Once you have this folder on your server, your backup attachments files copied into the backup folder, your list.txt file copied in, you can browse to the attachfind folder on your server. The script will run and provide instructions from there including any errors it might have. If the script ran successfully, your restore folder will be populated with all the missing attachments to the recovered posts. These folders should then be merged into the attachments folder on your test server to check if it worked. The script will let you know of any errors. If you have issues running the script, check permissions on the script that it can write the folders etc. It will be up to you to copy the files to your test server and live server, my script only copies from the backup folder to the restore folder, that’s it.

STEP 6: FINISHED!

If you inserted the five sql files and copied the missing attachments, go ahead and run the Update Forum Information utility from VB backend. This will get the thread and post counts accurate. There you go! Test your test server, check if your missing threads/posts/attachments are back in the forum you were restoring. If it looks good and acceptable, you can take the plunge to do it all again on the live server! Naturally it would be a good idea to get a fresh backup of your live server first. Then go ahead and do the imports and copies and test again. If it worked, you’ll want yet another fresh backup of the server so that you can include the recovered threads and posts in your latest backup. The way I did the queries should prevent you from restoring orphan threads OR posts. It only grabbed posts which belonged to threads in the forum, and only grabbed threads that belong in the forum. Nevertheless, an operation this like may lead to unpredictable results. Were there any other tables in the DB that had reference data? Do other counters need updated? Will anything mess up when you try to install the next update to VB? Won’t know until you try! Note also that on my server, we don’t really use the CMS or Blogs, so if there was more to recover in relation to threads/posts with CMS and Blog, I’m not aware. This recovery is for the forum only, not for posts to blogs or articles.

I hope it worked for you! If you liked this document and it helped save your butt, feel free to buy me lunch with the donate button!