Settings Series: Check for Target Retry Errors in the Target Double-Take Logs

Many Double-Take users experience no issues with their replication from the time they first initiate a connection until the time they have to fail over. When problems do arise, our tech support team can get you back on track quickly and effectively. One problem is both fairly common and very easy to correct, however: Double-Take starts complaining about “Retrying Ops” on a Target server while replication is in effect. Today I’d like to discuss why it happens and two ways to correct it.

Retrying can happen for many different reasons, but the most common are backup or other file-level operations temporarily locking a file on the Target, applications running against the data on the Target, and potentially a problem on the server itself. Double-Take’s replication engine can read and replicate any file on a Source machine, even if that file is locked or exclusively held by another application or by Windows itself, so files that are in use on a Source machine pose no problem for us. On the Target, though, we cannot write to a file that’s locked or in use. That’s why the Full Server Failover tools in Double-Take Availability create an SSM_Staging directory on the Target to hold things like Windows system files, since we can’t send them to the Windows directory on that machine. When Double-Take encounters a locked file on the Target, it will attempt to write to the file over and over until the lock is removed, or until the replication queue is full and we’re forced to do a re-mirror operation to correct things.

So, when a backup tool or other system temporarily locks a file, we’ll just retry that operation until the backup tool is done and we can successfully write our change. These types of retry operations are a normal part of what Double-Take does, so they require no corrective action unless you find yourself re-mirroring excessively due to temporary file locks.

However, when Double-Take hits a file that’s permanently locked, you will need to take one of the following two corrective actions to fix the situation. Because we preserve write-order integrity, the op at the front of the line has to be committed before anything queued behind it can be written. If you don’t act, the queue will fill up over and over with operations we cannot write, and you’ll continually re-mirror.
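To make the queue behavior concrete, here is a minimal Python sketch of a write-ordered queue stalled by a permanently locked file. It is only an illustration of the idea described above, not Double-Take's actual engine: the function name, the op format and the `queue_limit` parameter are all assumptions made up for this example.

```python
from collections import deque


def simulate_target(incoming_ops, locked_files, queue_limit):
    """Apply ops strictly in arrival order; return (applied, remirror_needed).

    incoming_ops: list of (file_name, change) tuples in Source write order.
    locked_files: set of file names that stay locked for the whole run.
    queue_limit:  hypothetical maximum number of ops the queue can hold.
    """
    queue = deque()
    applied = []
    for op in incoming_ops:
        queue.append(op)
        # Drain from the front only: write order must be preserved, so a
        # blocked op at the head holds back everything queued behind it.
        while queue and queue[0][0] not in locked_files:
            applied.append(queue.popleft())
        if len(queue) > queue_limit:
            # The queue overflowed while the head op could not be written.
            return applied, True
    return applied, False
```

With ops for `a.txt`, `locked.db`, `b.txt`, `c.txt` and `d.txt`, a lock on `locked.db` and a `queue_limit` of 3, only the `a.txt` op is applied before the queue overflows and a re-mirror is flagged. The later ops target files that are perfectly writable, but they are stuck behind the one that is not.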

First, if there is an application running against the data on the Target, it should be shut down. Applications that write to the data we’re protecting should not be running on the Target while replication is active. You can pause replication, bring the application online, then take it offline again and resume replication, but it’s a much better idea not to use the apps on the Target at all. Exceptions to that rule of thumb are when you are testing or failed over, of course.

Second, if no applications seem to be running against the data, or if you’re attempting to replicate something like a server management tool that has to run independently on the Target anyway, you can exclude the files that are getting stuck from protection. Excluded files will not get replicated, and that’s a critical thing to take into account; only exclude data that you do not need on the Target. In the Double-Take Management Console, open the Source server, expand to the Replication Set and either remove the directory that contains the file or add an exclusion. Exclusions are added by right-clicking the repset and choosing “Properties”, then creating a new rule in the rule engine to exclude a file, directory or wildcard.
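Before adding a wildcard exclusion, it can help to check which paths a pattern would actually catch. The Python sketch below does that with the standard `fnmatch` module. It is not Double-Take's rule engine, and shell-style wildcards may not match its rule syntax exactly; the helper name and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase


def is_excluded(path, exclude_patterns):
    """Return True if path matches any shell-style exclusion pattern.

    Paths and patterns are lowercased and use forward slashes internally,
    so matching is case-insensitive, as Windows paths usually are.
    """
    normalized = path.replace("\\", "/").lower()
    for pattern in exclude_patterns:
        # '*' here matches any run of characters, including path separators.
        if fnmatchcase(normalized, pattern.replace("\\", "/").lower()):
            return True
    return False
```

For example, with the patterns `D:\Data\Mgmt\*` and `*.lck`, a file under `D:\Data\Mgmt\agent` would be excluded and so would any `.lck` file, while `D:\Data\App\orders.mdf` would still be replicated.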

Once you’ve either shut down the running application or added the exclusion, be sure to initiate a re-mirror to make sure nothing was missed between Source and Target. Retrying is something that can usually be fixed quite quickly, and often doesn’t require any intervention at all. If you do run into any problems, just give our Technical Support department a call (http://www.doubletake.com/support) and we’ll be able to get you back on track.

