Monthly Archives: February 2011

Submission Engine on Schedule, Still Working on Success Rates

The submission engine has now fully caught up with the backlog and is running on schedule.

We’re still working on improving the success rates across the board. We tried a new batch of IPs yesterday and they didn’t perform any better, so we’re waiting for another batch to be set up to try next week. We also have a few implementation changes we can make internally to increase the likelihood of success, but two in particular are fairly major and will take some time to implement.

Update on Failure Rates

We have now identified the cause of the failure rates more clearly – the servers providing the IP pool we are using are not as reliable as we would like and are causing timeouts. We have several strategies in mind to attack the problem, but only one can be tested at a time, and each change needs to run for around 24 hours so that we can gather enough stats data and log files to see its effects clearly.

We are implementing one of these changes today, will check the results tomorrow, and will resume testing next week.

Note that we now need to keep multiple IPs in rotation: IMAutomator has reached the point where we are performing so many submissions that our IP quickly gets blocked if they all come from a single address, so we cannot go back to the previous configuration.
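To illustrate the rotation idea, here’s a minimal sketch of a round-robin IP pool; the addresses and function names are made up for illustration and are not taken from our actual engine:

```python
from itertools import cycle

# Hypothetical pool of outbound IP addresses (documentation-range placeholders).
IP_POOL = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]

# Rotate through the pool so no single IP carries every submission
# and trips a site's rate limit.
ip_cycle = cycle(IP_POOL)

def next_outbound_ip():
    """Return the next IP in round-robin order."""
    return next(ip_cycle)

def submit(url, site):
    ip = next_outbound_ip()
    # In a real engine the HTTP request would be bound to `ip`;
    # here we just show which address each submission would use.
    print(f"Submitting {url} to {site} via {ip}")

if __name__ == "__main__":
    for site in ["site-a", "site-b", "site-c", "site-d"]:
        submit("http://example.com/page", site)
```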

Duplicate Submissions Will Be Allowed Once Failure Rates Improved

We have a mechanism in place to prevent duplicate submissions. It exists because the bookmarking sites we submit to do not allow duplicates, so submitting the same link twice simply makes the engine do unnecessary work that ends in a duplication error.
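As a rough sketch, the check boils down to refusing to queue a (URL, site) pair that has already been seen; the names below are invented for illustration and are not our actual implementation:

```python
# Minimal sketch of a duplicate-submission guard: skip any (url, site)
# pair that has already been queued, since the bookmarking site would
# reject it with a duplication error anyway.
already_submitted = set()

def queue_submission(url, site):
    key = (url.lower().rstrip("/"), site)
    if key in already_submitted:
        return False  # duplicate: don't waste a submission slot
    already_submitted.add(key)
    # ...enqueue the real submission job here...
    return True

print(queue_submission("http://example.com/page", "bookmark-site-a"))   # True, queued
print(queue_submission("http://example.com/page/", "bookmark-site-a"))  # False, duplicate
```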

However, we have had a few days where the failure rates were unusually high (we’re still working on bringing them down, but it takes time, as each change needs to run for a while to be tested fully). In some of those cases a link was submitted, most of its individual submissions failed, and there is currently no mechanism to retry them.

Therefore, once we have brought the failure rates down to a more acceptable level, we will temporarily disable the duplicate URL check so that, if you wish, you can delete a link that did not perform well and re-submit it. We will probably leave this in place for around a week and will let you know via this blog when it has happened.

Failure Rates Improved, But Not Done Yet…

We’ve managed to improve the failure rates that we were experiencing a few days ago, but there are still some problems. We have changed the way we perform our submissions to use a pool of IP addresses rather than just one. This is better because we no longer have to worry about our server being blocked for making too many connections, but there are some subtle differences in how submissions work through the IP pool, and these are causing some failures.

All failures are monitored and logged and we’re continuing to tweak the submission process to get the best results. Before the change, we were generally hitting about 90-95% success rates for most sites and that’s what we are continuing to aim for.
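For context, the per-site success rates we watch can be computed from the submission logs along these lines; this is a simplified sketch with an invented log format, not our actual monitoring code:

```python
from collections import defaultdict

# Each record: (site, succeeded?) -- an invented, simplified log format.
log = [
    ("site-a", True), ("site-a", True), ("site-a", False),
    ("site-b", True), ("site-b", True),
]

totals = defaultdict(lambda: [0, 0])  # site -> [successes, attempts]
for site, ok in log:
    totals[site][1] += 1
    if ok:
        totals[site][0] += 1

for site, (successes, attempts) in totals.items():
    rate = 100.0 * successes / attempts
    flag = "" if rate >= 90 else "  <-- below the 90-95% target"
    print(f"{site}: {rate:.1f}% success over {attempts} attempts{flag}")
```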

Submission Failures Being Investigated

Over the last few days the submission engine has been catching up with the submissions that were backlogged, but we’ve noticed a much higher failure rate than normal amongst them. Usually we check these almost every day, but we haven’t had a chance to do so whilst we were fixing the engine.

These failures are now under investigation!

[UPDATE – 4.05pm GMT] Upon reviewing all of the failures, we found that 3 sites had died and had to be removed, but the vast majority of the other failures appear to have been caused by the problem we initially suspected was behind the recent engine slowdown – our server IP address being blocked for making too many concurrent connections.

Initially we did a lot of work putting together a pool of 10 unique IPs, but as this did not speed things up we removed that functionality. With the engine’s submission speed now resolved we’ve just put it back in, and it looks like those sites are working again. We’ll let this run for the remainder of the day to get a better idea of performance and collect more logs to check, but it’s looking positive. If this works as expected we should also be able to add back in a few sites that we had previously removed!

Fast Submit Options Re-Enabled

Now that the engine is running faster and working through the backlog that built up, we have re-enabled the fast submit options, so you can once again select the ‘all links in one day’ speed setting.

However, please note that the engine is still backlogged, so if you use this option your submissions will still be delayed by a few days. We are allowing the backlog to clear slowly (over a period of about a week) to ensure that submissions are still drip-fed using time intervals similar to the original submission schedule. If we cleared them all too fast, too many backlinks would be built in a very short space of time, which could hurt rather than help you.
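To give a feel for what ‘clearing slowly’ means, a backlogged schedule can be stretched out by respacing overdue jobs at intervals close to the original plan instead of firing them all at once. The sketch below uses made-up numbers and is not the live scheduler:

```python
from datetime import datetime, timedelta

def respace_backlog(overdue_jobs, start, interval_hours=6):
    """Spread overdue jobs from `start` onwards at a fixed interval
    instead of running them all at once."""
    schedule = []
    for i, job in enumerate(overdue_jobs):
        schedule.append((job, start + timedelta(hours=interval_hours * i)))
    return schedule

jobs = [f"bookmark-{n}" for n in range(1, 6)]
for job, when in respace_backlog(jobs, datetime(2011, 2, 20, 9, 0)):
    print(job, "->", when)
```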

Do You Want Decaptcher Support? Cast Your Vote

Many of the higher PageRank bookmarking sites (Mixx and Reddit, for example) use a CAPTCHA on their submission page, which means the submission cannot be automated in the normal way. It can be done, but it costs money. We use a service called Decaptcher to automate account creation, and we could extend our submitter to use it, but the cost is $2 per 1,000 captchas.

The sheer volume of submissions made by IMAutomator means that we cannot incorporate it into our submitter unless it is the member’s own Decaptcher account that is used. We can develop a feature that lets you associate your Decaptcher account with IMAutomator and set the minimum PR of sites to use it with – such as PR4+ only. For members without a Decaptcher account, captcha sites would simply be skipped.

It should be affordable for most members – for example, with 10 captcha sites, 50 bookmarks would cost just $1. So we’re running a poll to ask members whether they want Decaptcher support or not. To cast your vote, head over to the members page.
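To show where the $1 figure comes from (50 bookmarks × 10 captcha sites = 500 captchas, at $2 per 1,000) and roughly how the per-member option could behave, here is a small sketch; the function and parameter names are hypothetical, not a finished feature:

```python
COST_PER_CAPTCHA = 2.00 / 1000  # Decaptcher pricing: $2 per 1,000 captchas

def should_solve_captcha(site_pr, member_min_pr, has_decaptcher_account):
    """Only spend the member's Decaptcher credit on sites at or above the
    PR threshold they chose; otherwise captcha-protected sites are skipped."""
    return has_decaptcher_account and site_pr >= member_min_pr

# Worked example: 50 bookmarks submitted to 10 captcha-protected sites.
captchas_needed = 50 * 10
print(f"Estimated cost: ${captchas_needed * COST_PER_CAPTCHA:.2f}")  # $1.00

print(should_solve_captcha(site_pr=5, member_min_pr=4, has_decaptcher_account=True))  # True
print(should_solve_captcha(site_pr=3, member_min_pr=4, has_decaptcher_account=True))  # False (skipped)
```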

The Backlog is Beginning to Clear… at Last!

I’m very pleased to be able to announce that finally the backlog is beginning to clear! We’ve been doing a lot of work over the last week or so trying to analyse exactly what was going on to pinpoint the cause of the slowdown.

There were actually several problems, and this sent us off on the wrong track initially. We have a submission queue which holds all of the individual submission jobs to be executed. As the number of members increases, the number of jobs in the queue goes up. As the number of Pro members increases, it REALLY goes up, because Pro members submit a lot more jobs and, of course, since Release 2 there are now 4 submission tools. The result is that our queue had been steadily growing and its performance had been steadily declining!

The problem was that, as long as the engine was able to process all the jobs on schedule, we never actually noticed that performance was dropping. It was only when it reached the critical tipping point where the workload exceeded the capacity (which happened on the 7th of February) that it began to fall behind and we started our investigations.
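The tipping point is easy to see with a toy model: while daily capacity exceeds the jobs arriving each day the backlog stays at zero, but as soon as arrivals exceed capacity the backlog grows and keeps growing. The numbers below are invented purely to illustrate that dynamic:

```python
def simulate_backlog(arrivals_per_day, capacity_per_day):
    """Toy model: backlog_today = max(0, backlog_yesterday + arrivals - capacity)."""
    backlog = 0
    for day, arrivals in enumerate(arrivals_per_day, start=1):
        backlog = max(0, backlog + arrivals - capacity_per_day)
        print(f"day {day}: {arrivals} new jobs, backlog = {backlog}")

# Workload grows as membership grows; capacity stays fixed at 1,000 jobs/day.
simulate_backlog(arrivals_per_day=[800, 900, 950, 1000, 1100, 1200, 1300],
                 capacity_per_day=1000)
```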

The additional servers that we added at the weekend did not help because they were thwarted by the bottleneck in the database. Thankfully, optimising databases is a lot easier than setting up new hardware! Unfortunately, we wasted a lot of time setting up servers and tweaking things to try and improve performance before we realised that the major bottleneck was in fact in the database.

We’ve made some changes which have had a significant performance impact – enough to push the capacity beyond the workload, so the backlog can now begin to clear. However, an important lesson has been learned here, and we are going to use the opportunity to keep improving performance. We have at least another 4 areas where we think further gains can be made, and with such a major backlog on the queue, we have a ton of test data to work with!

That old saying ‘a chain is only as strong as its weakest link’ is significant here – it may well be that we can get further performance gains from better servers or a greater pool of IP addresses but if there is another area that is causing a greater bottleneck, then those gains cannot be realised. Now that we’ve solved the biggest issue, we can continue testing and move up the chain, improving as we go!

I’d like to thank you all for your support and patience over the last week or so!

Engine Backlog Status Update

Just to let you know the status of the engine backlog – submissions are currently running just under 4 days behind. A few members have expressed concern about the ordering of the delayed submissions, so let me clarify:

When the submission schedule for a job is created, a timestamp is added, and usually the submission would be made within a few minutes of that time. With the engine in backlog, all submissions are still made in strict date order, with the oldest jobs processed first (with the exception of secondary submissions, which have a lower priority). The only difference is that they are taking longer to execute.

When we solve the problem and can begin to clear the backlog, the date order will still be adhered to, and the backlog will clear slowly over time so that submissions remain suitably spaced out. Some people were worried that all their submissions would be made at once, but that is not going to happen.
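Put another way, jobs are pulled in (priority, scheduled time) order, so being in backlog changes when a job runs, never its place in the queue. A minimal sketch of that ordering, with invented job names and dates:

```python
import heapq
from datetime import datetime

# Each job is keyed by (priority, scheduled_time): primary submissions have
# priority 0, secondary submissions priority 1, so within each priority band
# the oldest job always runs first -- even when the engine is behind schedule.
queue = []
heapq.heappush(queue, (1, datetime(2011, 2, 10, 9, 0), "secondary-A"))
heapq.heappush(queue, (0, datetime(2011, 2, 12, 9, 0), "primary-B"))
heapq.heappush(queue, (0, datetime(2011, 2, 10, 9, 0), "primary-A"))

while queue:
    priority, when, job = heapq.heappop(queue)
    print(f"run {job} (scheduled {when}, priority {priority})")
```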

Fast Submissions Temporarily Disabled

Whilst the engine is in backlog, we have decided to disable the fast submission options – anything faster than 1 link per day. You can still submit as normal, but the schedule will be more spread out. We are of course still trying to find a solution to this ongoing problem and can only apologise for the slow speed of submissions.
