Mike Healan
Sept. 8, 2003
http://www.spywareinfo.com
If you own a web site or a blog, you no doubt know all about the term "referer spam". When you click a hyperlink on one web site, your browser passes to the next site the address of the page where you clicked the link. This is logged by the server hosting the next web site.
The referer information can be faked very easily. Some unscrupulous web site owners will arrange to have several computers access a particular web site with a referer that lists their own web site address. There are a number of ways to accomplish this (see below), but the result is that the web server logs of the targeted site will contain hundreds or possibly thousands of entries with the fake referer information. This is known as "referer spamming".
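To see why the referer is so easy to fake, remember that it is nothing more than a header the client fills in itself. Here is a minimal Python sketch, not any spammer's actual tool. The addresses victim.example and spammersite.example are made-up placeholders, and the request is only built, never sent.

```python
import urllib.request

def build_request(url, referer):
    """Build (but do not send) an HTTP request carrying a chosen Referer."""
    # The Referer header is supplied by the client.
    # The server receiving it has no way to verify the value.
    return urllib.request.Request(url, headers={"Referer": referer})

# Placeholder addresses, for illustration only.
req = build_request("http://victim.example/", "http://spammersite.example/")
print(req.get_header("Referer"))  # prints http://spammersite.example/
```

Whatever string goes into that header is what ends up in the target site's log.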
Why go to the effort of leaving a web site address in someone else's log files?
Most web servers have the ability to log an extensive amount of information about web site visitors. Many webmasters and bloggers use web-based software to parse those log files automatically. The result is one or more pages breaking the information down into very detailed statistics. These statistics include the referer information and often those referers are displayed as hyperlinks.
Bloggers quite often will display a link to the most frequent or most recent web site found in their referer logs using scripts. Some will even put those links right on the front page of their site in a sidebar area.
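As a rough illustration of what such statistics software does, here is a small Python sketch that counts referring host names. It assumes the log is in Apache's "combined" format, and the sample lines and host names are invented for the example. Real stats packages do far more than this.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Apache "combined" log format: the last two quoted fields are
# the referer and the user agent.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$'
)

def top_referers(lines, n=10):
    """Count referring host names and return the n most frequent."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in combined format
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # no referer was sent
        host = urlsplit(referer).hostname
        if host:
            counts[host] += 1
    return counts.most_common(n)

# Invented sample lines, for illustration only.
sample = [
    '10.0.0.1 - - [08/Sep/2003:10:00:00 -0500] "GET / HTTP/1.1" 200 512 "http://www.spammersite.example/" "Mozilla/4.0"',
    '10.0.0.2 - - [08/Sep/2003:10:00:01 -0500] "GET / HTTP/1.1" 200 512 "http://www.spammersite.example/buy" "Mozilla/4.0"',
    '10.0.0.3 - - [08/Sep/2003:10:00:02 -0500] "GET /a.html HTTP/1.1" 200 900 "http://friendly.example/links" "Mozilla/4.0"',
    '10.0.0.4 - - [08/Sep/2003:10:00:03 -0500] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0"',
]
print(top_referers(sample))
# prints [('www.spammersite.example', 2), ('friendly.example', 1)]
```

A script like this has no idea which entries are genuine, so whoever generates the most hits lands at the top of the list.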
Unscrupulous web site owners are spamming the log files in order to have their web sites listed on those referer links. This creates an artificial boost in that site's popularity among those search engines that measure the number of links to a site. It also generates traffic when curious visitors to a victim site click the links displayed in the referer listing.
Simply put, these people are running advertisements on your web site and using it to boost their search engine rankings. They do this without your knowledge, without your permission, and without compensating you in any way for the use of your network.
Spammers decide which sites to spam by automatically checking sites such as blo.gs, weblogs.com, and popdex.com for blogs that have updated recently. They also may do a simple Google search for the phrase "recent referers".
Once the spammer has chosen which sites to spam, there are several ways to go about it. At one point, there was even a company that offered to spam the logs of over 55,000 sites for a fee.
One clever method used by porn sites is to include an image tag in their page's HTML that calls your web site's home page. No image is loaded, but visitors to the porn site generate a hit on your site that is logged by the server.
Another is to use a desktop application similar to a spambot email harvester. Instead of scooping up email addresses, the purpose of this sort of application is to load all of the pages on your site while leaving a custom referer in your logs. One blogger wrote such a program, but never released it fearing abuse (and the wrath of his fellow bloggers no doubt).
Another method that people speculate about is browser hijacking. Internet Explorer can be hijacked to change the start page, search settings, DNS error handling, search hooks, and to have BHOs and strange toolbars installed. Certainly the referer setting can also be altered by a BHO. To my knowledge, this has never been done. Let's hope I'm not giving anyone ideas...
It is worth noting that it is not just porn sites that spam referer logs. A few software developers include an advertisement for their product that is sent out each time a customer uses it to visit a site. Some RSS news aggregators are known to do this.
At least one well-known security company also does this. Outpost firewall by Agnitum spams referer logs with "Field blocked by Outpost firewall (http://www.agnitum.com)" when users turn on referer blocking. This is why you will not see Outpost recommended on my software page. Norton Internet Security also alters the referer in some nonstandard fashion. I don't know if it also spams the log or if it changes the referer to gibberish. Either way, it is not blank and it is not the real referer.
My message to software developers is to leave the referer alone. Either block it entirely and leave it blank or don't touch it at all. By meddling with it, you are causing problems on web sites that use scripts that depend on the referers. Also, you are spamming and legitimate software companies should know better.
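For anyone whose scripts depend on referers, here is a small Python sketch of how a script might sort logged values into blank, real URL, or neither. The function name and the three labels are my own invention for this example, and the pattern is deliberately loose.

```python
import re

# A well-formed referer is an absolute http or https URL.
# Anything else that is not blank is treated as nonstandard.
URL_REFERER = re.compile(r"^https?://[^\s/]+(/\S*)?$", re.IGNORECASE)

def classify_referer(value):
    """Return 'blank', 'url', or 'nonstandard' for a logged referer value."""
    if value is None or value.strip() in ("", "-"):
        return "blank"
    if URL_REFERER.match(value.strip()):
        return "url"
    return "nonstandard"

print(classify_referer("-"))
print(classify_referer("http://www.example.com/page.html"))
print(classify_referer("Field blocked by Outpost firewall (http://www.agnitum.com)"))
```

The first value comes back as blank, the second as a URL, and the Outpost string as nonstandard, which is exactly the kind of value a referer script then has to work around.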
You have this happening on your own web site. The spammers are all over your log files. How do you stop it?
God bless the people at Apache for creating such a simple and powerful web server. If you have a web site, chances are it is hosted on Apache. Apache makes it very simple to block anyone from your site you wish to block, based on just about any parameter you choose.
I have no idea how to block someone on a Windows IIS server. If anyone knows, hopefully they will clue me in.
Update:
A reader has written in to say that ISAPI_Rewrite can make these techniques work on some IIS servers. You will need to have your host install it for you if you are on shared hosting.
Log into your site's FTP server. Make sure it is set to display hidden files on the server. Check your FTP client's documentation for help with that. If there is a file named .htaccess, download it and open it in a text editor. If there isn't one, create one on your hard drive and upload it when you are finished.
Put this in the .htaccess file, changing "spammersite" to the names of the sites found spamming your logs. "NC" makes the match ignore case and can go on every line. Every line except the last needs "OR"; the last one should not have "OR", since no other condition follows it.
If your site starts generating errors after you upload this file, remove the # from the first line. If you use Microsoft Frontpage on your site, do not do this. Changing the .htaccess file could interfere with Frontpage.
# Options +FollowSymlinks
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite1\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite2\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite3\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite4\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite5\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite6\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite7\.com.*$ [NC]
RewriteRule .* http://www.some-other-website.com [R,L]
This will redirect any request with a spammer site in the referer to any other site you wish. If you prefer, you can rewrite that last line as "RewriteRule .* - [F,L]" to give them a "Forbidden" error. Either way, it keeps them out of your referer listings, and there is no way to defeat it short of spamming from a domain that is not yet on your list.
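Typing those conditions by hand gets tedious once the list grows, so here is a hedged Python sketch that writes them from a list of domains. The function name build_htaccess and the sample domains are placeholders of mine. It puts "NC" on every line, which Apache accepts, and leaves "OR" off the last line only.

```python
def build_htaccess(domains, redirect_to=None):
    """Return .htaccess text that blocks requests whose referer is on the list."""
    if not domains:
        # With no conditions at all, the rule would apply to every request.
        raise ValueError("need at least one domain to block")
    lines = ["RewriteEngine On"]
    for index, domain in enumerate(domains):
        is_last = index == len(domains) - 1
        # Escape the dots so they match a literal dot, not any character.
        pattern = "^http://(www\\.)?" + domain.replace(".", "\\.")
        flags = "[NC]" if is_last else "[NC,OR]"
        lines.append("RewriteCond %{HTTP_REFERER} " + pattern + " " + flags)
    if redirect_to:
        # Send the request somewhere else.
        lines.append("RewriteRule .* " + redirect_to + " [R,L]")
    else:
        # Answer with a "Forbidden" error instead.
        lines.append("RewriteRule .* - [F,L]")
    return "\n".join(lines)

print(build_htaccess(["spammersite1.com", "spammersite2.com"]))
```

Paste the output into your .htaccess file, or pass redirect_to to get the redirecting version of the last line instead of the "Forbidden" one.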
If you want a nice, large list to start your blacklist, you may use mine. I was fed up with spammers in my own logs, so I tracked them all down and listed them. It was to the point that six of the top ten referers in my stats were spammers. Some of them had made tens of thousands of entries in my logs.
If you prefer, you can also use wild cards to filter out domain names that are likely to belong to porn sites. That should catch most of these spammers, as porn sites seem to be the biggest offenders. Blogger Joe Maller has a blacklist that should work perfectly for this.
Since people are far more likely to have their browsers hijacked while surfing for porn, there are a large number of porn sites that do link to SpywareInfo. I can't use Joe's list since I don't want to block those people.
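To show how a wild card filter behaves, and why it can catch visitors you would rather keep, here is a small Python sketch. The keywords are neutral stand-ins I picked for illustration, not Joe Maller's actual list, and the host names are made up.

```python
import re
from urllib.parse import urlsplit

# Hypothetical keywords for illustration only; not Joe Maller's actual list.
KEYWORDS = ["casino", "poker", "pills"]
KEYWORD_PATTERN = re.compile(
    "|".join(re.escape(word) for word in KEYWORDS), re.IGNORECASE
)

def blocked_by_keyword(referer):
    """True if the referring host name contains any listed keyword."""
    host = urlsplit(referer).hostname or ""
    return KEYWORD_PATTERN.search(host) is not None

for referer in [
    "http://www.best-casino-deals.example/",
    "http://friendly.example/links.html",
    "http://pokerfaceblog.example/2003/09/",
]:
    print(referer, blocked_by_keyword(referer))
```

The first and third addresses are blocked and the second is not. The third one could just as easily be a harmless blog, which is the trade-off with keyword matching: anyone whose domain happens to contain a listed word is shut out along with the spammers.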
The people who would spam someone else's web site to promote their own are scum. Thankfully, using the method above, it is stopped easily.
If you notice that a web site displaying referers is being spammed, please contact the webmaster of that site and point them here so they can learn how to block it.
This article is located at http://www.spywareinfo.com/articles/referer_spam/. © 2001-2008 Mike Healan. If copied in its entirety to message boards, blogs, and newsgroups, this notice must be included with it. Please see our terms of use for more information.
http://www.spywareinfo.com/articles/bho/ :: BHO article at SWI
http://www.joemaller.com/refererspam.shtml :: Joe Maller's referer spam article
http://www.wired.com/news/culture/0,1284,56017,00.html :: When the Spam Hits the Blogs