I recently had a problem where my Spamassassin install started thinking that a lot of spam messages were really ham (non-spam). Since these were getting BAYES_00 scores of -2.5 they were almost all getting through my spam filter. These particular messages all were regarding STOCK quotes and were pretty obviously spam just by looking at the text of the messages. Somehow my Spamassassin install thought that they were not spam because the messages were being passed as ham by the Bayesian filter. Since they kept getting past, the bayesian filter kept learning them as HAM.

In order to break this vicious cycle, you just need to clear out the bayesian tokens. It’s very easy to do. As root user, type:
sa-learn --clear
This starts you fresh. By default, Spamassassin won’t use the bayes filter until it has 200 spam and ham messages, so until you get to that level it will continue to learn based the other Spamassassin detection settings.

Ideally, I would have sa-learn train using these spam messages. But since I use Outlook, and there is no “easy” way to have it interface with sa-learn, I find it easier to clean out the bayes tokens every once in a while. SpamAssassin Coach is a plugin for outlook which should connect to your spamd server and “learn” a message as ham or spam. But in practice, it did not work for me. It looks like the project has a lot of potential.

For more information on how Bayesian filtering works, check out this wikipedia article.

You May Also Like

PowerDNS flexible, fast DNS Server

I’ve recently been testing/installing PowerDNS for a web hosting provider. Man am…

How to Install SNMP on Tomato Router Firmware and Graph Traffic with Cacti

You’ve flashed your old WRT54G or other vanilla router with the Tomato…

Convert drive from FAT to NTFS

Let’s say you just installed windows, but told it to use FAT…

Cpuspeed Reboot Loop on Dual Opteron

I recently had a problem with a dual Opteron system going into…