I've been following some of the comment spam discussions going on around the ol' INTARWEB.
- The beautiful, sexy, and talented Scott Hanselman has posted a CAPTCHA solution for DasBlog that doesn't require a recompile (nice)
- Robert McLaws has gone a little nuts, but in a good way
Between the two of them, they've covered a lot of the obvious bases.
Scott has a solution that can get you in shape quickly (provided you're running DasBlog), and Robert has a solution that could potentially lead to resolving issues for a much larger population of bloggers.
I've thought about both of these methods. A couple of months ago, I envisioned a centralized membership system with some built-in anti-spam measures, simple and easy to use, and exposed through a platform-agnostic web services interface (yes, to the WS inexperienced: you bet your little tushes that there are platform-dependent WS APIs out there - maybe not in a strictly technical sense, but in a "real world" way).
Problems?
Resources, for one, and I'm not talking about people. A centralized solution would require servers, hosting, coding, and so on. In other words, it would require moolah, which is something that I don't exactly have in spades right now.
Robert might succeed here because he definitely has more resources to work with, and I hope he gets something going.
Another issue is adoption. Unless you publish an extremely simple spec, nobody's going to adopt it. Plus, your paranoid types are going to treat a centralized solution with a lack of trust ("So, I sign up for this free service, give this guy all my membership info, rely on him for spam filtering, and then, six months into it, he decides to start charging me - screw that!").
I think this is the best solution for defeating comment spam (aside from hanging the spammers by their intestines in public places), but also the hardest to implement.
Then, on the other side of the fence is Scott's solution (which has been implemented by many bloggers many times over).
I think CAPTCHA is a great solution, but also a great inconvenience. I'd rather sign up once for a service like Robert's and then never have to worry about it. When I write comments, I don't want to have to copy some computer's LSD dream rendering of a string of letters and digits.
That said, it's something that will work, and it will work now.
There are other solutions, though.
A Lame Solution That Also Works Now
When I thought about the effort required to implement a centralized solution, get the blogosphere to back it, get help to code it, and get the resources to host it, I decided instead to go read a good book and then take a long bath. I already have, like, three careers or something, and I don't need another.
When I thought about implementing a straightforward CAPTCHA solution, I decided against it because I want, first and foremost, for commenting to be easy. Without comments, blogging is, like, stupid and dumb. This is all about arguing, agreeing, insulting, and patting each other on the back. I don't want to do anything to discourage comments.
So, here's what I did:
- Created an Outlook folder called "Blog Comments"
- Created an Outlook folder called "Blog Spam"
- Put together an Outlook rule that routes all comments to the "Blog Comments" folder
- I go through manually (yes: manually) and determine which comments in the "Blog Comments" folder are spam
- Any comment spam that I find, I drag to the "Blog Spam" folder (this is easier than it sounds - spammers nail your blog in sweeps, and you can often just drag ten or more emails at once into the folder)
- Once I've separated the spam from the real comments, I run an app that I wrote which iterates through all the comments in the "Blog Spam" folder, parses them for their .Text IDs, hooks into one of the .Text web service APIs, and then calls the method that deletes the entry associated with the parsed ID - at the end, the app goes through and deletes all the entries in the "Blog Spam" folder so that I never have to see their ugly little mugs again
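
For the curious, the cleanup step above can be roughed out in a few lines. This is a minimal sketch in Python, not the actual app (which is C# and talks to Outlook and .Text directly). The link format in `ENTRY_ID_PATTERN` is an assumption made up for illustration, and `delete_entry` is a hypothetical stand-in for whatever web service call removes the entry:

```python
import re
from typing import Callable, Iterable, List

# ASSUMPTION: each notification email contains a link ending in
# ".aspx#<digits>", where the digits are the comment's entry ID.
# The real notification format may differ.
ENTRY_ID_PATTERN = re.compile(r"\.aspx#(\d+)")


def extract_entry_ids(email_bodies: Iterable[str]) -> List[int]:
    """Pull one entry ID out of each email body that contains one."""
    ids = []
    for body in email_bodies:
        match = ENTRY_ID_PATTERN.search(body)
        if match:
            ids.append(int(match.group(1)))
    return ids


def purge_spam(email_bodies: Iterable[str],
               delete_entry: Callable[[int], None]) -> List[int]:
    """Call delete_entry (hypothetical web service wrapper) for every
    entry ID found, and return the IDs that were deleted."""
    deleted = []
    for entry_id in extract_entry_ids(email_bodies):
        delete_entry(entry_id)
        deleted.append(entry_id)
    return deleted
```

Passing the delete call in as a function keeps the parsing testable without a live blog on the other end. Emails that don't match the pattern are simply skipped rather than guessed at.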
Sound complicated? Yeah, sure. It's not easy, but it beats doing the whole thing manually, and the truth is that I'm so used to it now that I can delete 50+ spam comments in under a couple of minutes, and that's worth it for me. I don't have to make commenting more inconvenient, and I don't have to wait for a centralized solution.
However...
I'm still working on a CAPTCHA solution, but it's not like your typical CAPTCHA system. I've been watching the spammers and getting a feel for how they do things. I think I can put together a system that will block the spammers 99% of the time, and only inconvenience the commenters about 1% of the time.
I made those stats up, by the way. I don't have any idea about just how often the thing will really work.
But it will work.
And, when it's working, I'll 'splain it to y'alls. It's not complicated. Quite stupid, actually.
But so are the spammers.
For now.
The Strongest Solution
In the end, I think the best approach is to:
- Get a system like Robert's up and running
- Implement different measures on our own
Having a centralized solution will take care of large quantities of spam, but the solution itself is, ultimately, reactive instead of proactive, which means that spam will still get through. Somebody has to get spammed before an IP can get blocked.
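
To make the "reactive" part concrete, here's a toy model of a shared IP blocklist. It's a sketch under my own assumptions, not a description of Robert's system: `SharedBlocklist`, `report_spam`, and `accept_comment` are all hypothetical names.

```python
from typing import Dict, Set


class SharedBlocklist:
    """Toy central blocklist: an IP is blocked once any blog reports it."""

    def __init__(self) -> None:
        # Maps a reported IP to the set of blogs that reported it.
        self._reports: Dict[str, Set[str]] = {}

    def report_spam(self, ip: str, blog: str) -> None:
        """Record that `blog` received spam from `ip`."""
        self._reports.setdefault(ip, set()).add(blog)

    def is_blocked(self, ip: str) -> bool:
        return ip in self._reports


def accept_comment(blocklist: SharedBlocklist, ip: str) -> bool:
    """A comment is accepted unless its IP has already been reported."""
    return not blocklist.is_blocked(ip)
```

Note what happens on first contact: a brand new spammer IP sails straight through `accept_comment`, and only gets blocked after some unlucky blog calls `report_spam`.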
For those situations, I think it's best to have a personal system. Scott's using CAPTCHA, and I'm using Outlook/C#. By working like this, we're throwing multiple problems at the spammers. My CAPTCHA system, for example, isn't going to output the same peyote colors that Scott's will, and it won't output in the same manner.
If we have many per-blog solutions that can be plugged in relatively easily by bloggers, then we're throwing the spammers more than they can handle.
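
One way to picture "plugged in relatively easily" is a list of small checks that each blog mixes and matches. The sketch below is purely illustrative: the two checks (`too_many_links`, `honeypot_filled`) and the comment fields they read are hypothetical examples, and they are not the system I'm working on.

```python
from typing import Callable, Dict, List

Comment = Dict[str, str]
# A check returns True when the comment looks like spam.
Check = Callable[[Comment], bool]


def too_many_links(comment: Comment, limit: int = 3) -> bool:
    """Hypothetical check: flag comments stuffed with links."""
    return comment.get("body", "").lower().count("http://") > limit


def honeypot_filled(comment: Comment) -> bool:
    """Hypothetical check: flag comments that fill in a form field
    (assumed here to be named 'website_confirm') hidden from humans."""
    return bool(comment.get("website_confirm", "").strip())


def is_spam(comment: Comment, checks: List[Check]) -> bool:
    """A comment is spam if any of this blog's chosen checks flags it."""
    return any(check(comment) for check in checks)
```

Each blog passes its own list of checks to `is_spam`, so a spammer script tuned against one blog's combination still has to deal with a different combination on the next.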
At the very least, we can keep them busy by forcing them to constantly add ridiculous IF statements to their bloody, stinking, vile, stinking, reprehensible, stinking code...
...until the day that they're hanging in public squares by their gollywots.