80legs spider overloading sites

Friday, January 13th, 2012

Over the last few hours we have noticed an increased number of sessions on several websites we manage. The first investigation showed that these were not human visitors but bots / spiders / crawlers, since these sessions didn’t load the tracking code of monitoring tools (Google Analytics, SiteMeter, StatCounter, etc). So, even with more than 300 sessions on the CMS, Google Analytics real-time reporting showed just 20 or 30 users.

The next step was to find the originator of these extra sessions. There were many different IP addresses, most of them originating in Russia, but also in Ukraine, Saudi Arabia and other countries. Looking at the log files of the web servers receiving these hits, the common signature was the user agent field, which contained 008/0.83. The user agent string in all cases, across all the different source IPs, was:
Mozilla/5.0 (compatible; 008/0.83; Gecko/2008032620
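
For anyone who wants to run the same check on their own servers, here is a minimal sketch of how the hits could be counted per source IP. It assumes the logs are in the Apache/nginx "combined" format, where the user agent is the last double-quoted field on each line. The names LOG_LINE and count_crawler_hits are made up for this illustration and are not part of any tool we used.

```python
import re
from collections import Counter

# Assumption: "combined" log format, i.e. the line starts with the client IP
# and ends with the user agent as the last double-quoted field.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')

# Signature seen in our logs; other sites may see a different version number.
CRAWLER_SIGNATURE = "008/0.83"


def count_crawler_hits(lines, signature=CRAWLER_SIGNATURE):
    """Count hits per source IP whose user agent contains the signature."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and signature in match.group("agent"):
            hits[match.group("ip")] += 1
    return hits
```

Passing an open log file to count_crawler_hits and calling most_common() on the result lists the busiest source IPs first. Matching on the user agent rather than on IP ranges seems the more practical choice here, since the hits came from many unrelated addresses.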