Frequently asked questions and answers about spam received on or via websites.

Q: I’m getting junk sent through the feedback form on my website (by email, or being added to a comments page). What is it, and is there anything that can be done?

This has been an increasing problem on websites since about 2005. Automated programs (‘bots’) trawl the web looking for contact forms, and particularly for blogs that allow comments to be posted without moderation. After a few days or weeks, the bots post information as if it had been filled in on the form. Some of it is random and some of it contains links and key words (such as the names of prescription drugs), in the hope of getting it posted on your site and thereby improving the ranking of the spammer’s site on search engines. Another, less common, reason for form spam is to try to find some kind of ‘injection’ exploit in your site, usually in order to send spam email.

In fact, the bots will usually spam any form, whether it is linked to a blog or a forum – or a mailback form where the contents will never make it onto the web anyway. Spam to mail forms is annoying enough if it comes only to you. However, it can be particularly troublesome if some of the recipients are people you are trying to petition: not only does it irritate them, but it devalues your campaign and can get it blocked from their servers. For this reason, it is a good idea to have yourself listed as a recipient in such campaigns.

Also, fake email addresses, either undeliverable or harvested from the web, can be submitted to any mailing list that you run, which is a problem – no-one, especially GreenNet, wants to be seen to send spam. Therefore, to avoid having your supporter list contaminated by fake addresses, it is best to set any mailer program to require confirmation from the recipient (this is very easy in Mailman), and turn on any bounce processing you may have in the mailer. Please contact GreenNet Support if you are having trouble doing this.

Websites which immediately show unmoderated comments are probably more likely to be targeted. You may want to turn on moderation, or separate the comments page from the original article.

Q: What can be done to stop this rubbish being sent in the first place?

Some of this ‘form spam’ is distinctive and is blocked from websites hosted on GreenNet altogether. So if you see several similar abuses, please report them to support@gn.apc.org and we can filter them from everyone’s site (or at least from the email system).

You can add checks in your form processing script to reject spam submissions. A common way of doing this is to require visitors to type in a word shown in an image, known as a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). Some versions of this can cause accessibility problems for people with visual impairments, but there is a well-developed free implementation called reCAPTCHA available from http://www.captcha.net/ which still raises accessibility concerns but does at least have an audio option. (Also, if it suits your site visitors, you can ask them to pick out pictures of cats.)

Simpler text-only versions of these questions can be added to sites running on Drupal (a content management system that GreenNet specialises in). Or for any PHP-based web site you can add code like this to a form:

——

<div id="spamcheck">
Choose and enter only numbers from the following string:

wd1m99nm7xs. <br />
<input type="text" name="disccode" id="disccode" value="">
</div>
<script language="JavaScript" type="text/javascript"> <!--
document.getElementById("disccode").value = "1997";
document.getElementById("spamcheck").style.display = "none";
// --> </script>

——

and then check in your form processing script that the correct answer has been inserted, either by JavaScript or manually by the person visiting your site:

——
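// Run this before any output is sent, so that the redirect header can be set.
$disccode = isset($_POST['disccode']) ? trim((string) $_POST['disccode']) : '';
// Join the submitted text fields so that they can be searched for links.
$alltext = strtolower(implode('', array_filter($_POST, 'is_scalar')));
if ( ( $disccode !== '1997' ) || ( strpos($alltext, 'href') !== false ) ) {
    // Wrong answer, or a link was posted: redirect the visitor and stop.
    header('Location: http://www.example.org', true, 302);
    exit;
}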

——

Choose your own question and answer. Ideally it should be something that all your visitors can answer but that a bot will have trouble with, so don’t mention the answer anywhere in the form.

It is possible but unlikely that some bots are clever and determined enough to interpret the JavaScript. If they do manage to sneak through, you can just modify the page with more complex JavaScript.
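As one illustration of slightly more complex JavaScript, the script can work out the answer from the challenge string instead of containing it literally, so that "1997" never appears in the page source. This is only a sketch: the helper name `digitsOnly` is made up for this example, and it assumes the form uses the same ids ("disccode" and "spamcheck") and the same challenge string as the form above.

```javascript
// Hypothetical helper: keep only the digits of the challenge string.
function digitsOnly(challenge) {
  return challenge.replace(/\D/g, "");
}

// In a browser, fill in the answer and hide the question for visitors
// who have JavaScript. The guard lets the file load outside a browser too.
if (typeof document !== "undefined") {
  var field = document.getElementById("disccode");
  var box = document.getElementById("spamcheck");
  if (field && box) {
    field.value = digitsOnly("wd1m99nm7xs");
    box.style.display = "none";
  }
}
```

A bot that does not run scripts still has to answer the question itself, and visitors without JavaScript simply see the question and type the numbers in.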

Q: This seems quite complicated. What can be done without asking a question or altering the script?

You could hide a field using CSS. It won’t be displayed in most browsers, but if a bot fills it in, the submission will be rejected with a 403 error page. This is because we block submissions that fill in a field with the name used below (“cypha”).

So you can add something to your CSS like
.cyphastyle { display: none; font-size: 25%}
and then to the comments form:

<input type="text" name="cypha" size="50" value="" class="cyphastyle" />

However, since it takes a few weeks for harvesters to update the form definitions, this probably won’t take effect immediately, and is ideally combined with a check like that above.
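If your form handler is not behind a filter that already blocks the "cypha" field, the hidden-field check can be combined with the question check in your own code. The following is a minimal sketch in JavaScript, not a definitive implementation. It assumes the submitted fields have already been parsed into a plain object; the function names `text` and `isLikelySpam` are hypothetical, and the field names are the ones used in the examples above. The logic is small enough to port to PHP or any other language.

```javascript
// Hypothetical helper: turn a possibly missing field into a trimmed string.
function text(value) {
  return value === undefined || value === null ? "" : String(value).trim();
}

// `fields` is a plain object of submitted values, for example
// { comment: "Hello", cypha: "", disccode: "1997" }.
function isLikelySpam(fields, expectedAnswer) {
  // 1. Hidden field: a human never sees "cypha", so it should stay empty.
  if (text(fields.cypha) !== "") return true;

  // 2. Question: the answer must match, ignoring case and outer spaces.
  if (text(fields.disccode).toLowerCase() !== String(expectedAnswer).toLowerCase()) {
    return true;
  }

  // 3. Links: reject HTML anchors anywhere in the submission.
  var allText = Object.keys(fields)
    .map(function (key) { return text(fields[key]); })
    .join(" ")
    .toLowerCase();
  return allText.indexOf("href") !== -1;
}
```

For example, `isLikelySpam({ comment: "Hello", cypha: "", disccode: "1997" }, "1997")` returns `false`, while the same submission with anything in `cypha` returns `true`. What to do with a rejected submission (an error page, a redirect, or a moderation queue) is left to the calling script.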

Q: I’m concerned about accessibility. Are there other methods?

Other solutions may involve a bit more coding in your application. These include using spam-filtering services and public databases of spammer addresses: the popular Akismet, which has plugins for Drupal, WordPress, Joomla and phpBB; the similar Mollom; or more open services such as Project Honeypot and BlogSpam.net.

Q: I’ve taken action against bots but I’m still getting strange comments posted up on the site. What’s happening?

Even if you’re blocking bots with a CAPTCHA or requiring users to register with their email address (also possible in Drupal), some people will still try to abuse your comments sections to promote their own sites. What seems to have started happening in 2008 is that people are marketing a “search engine optimisation” (SEO) service which actually involves employing real people in India to post the customer’s link on your site.

There isn’t much of a technical solution to this. You can block people from posting links altogether (again possible in Drupal, as on this site), but in that case people may just post the link without the www. or http:// prefix. Otherwise, just invest more time in moderating the comments. Requiring approval for comments, so that they do not appear immediately, is likely to discourage this form of human abuse.
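If you do want to catch links that have been posted without the www. or http:// prefix, a rough heuristic is to look for bare domain names as well. The sketch below is an assumption-laden example rather than a recommended filter: the function name `containsLink` is hypothetical and the short list of domain endings is purely illustrative. It will miss links written as "example dot com" and will occasionally flag innocent text, so it is better used to hold a comment for moderation than to discard it.

```javascript
// Hypothetical helper: does the text look as if it contains a web link?
function containsLink(commentText) {
  // Links written in full, e.g. "http://example.org/page" or "www.example.org".
  var withPrefix = /\b(?:https?:\/\/|www\.)\S+/i;
  // Bare domains such as "example.com". The list of endings is an
  // illustrative assumption and should be extended to suit your site.
  var bareDomain = /\b[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|net|org|info|biz|co\.uk)\b/i;
  return withPrefix.test(commentText) || bareDomain.test(commentText);
}
```

A form handler could call `containsLink` on each comment and send anything it flags to the moderation queue.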