This article is a follow-up to Migrating Apple Mail filters to Sieve scripts with Dovecot, where I explained how I moved my Apple Mail triage filters from the client to the server with Sieve scripts, so that my inbox stays clean even when I am on the go, checking email from my iPhone.

Yet, one issue persisted: spam. The logical next step was to move my spam filtering — currently handled on my Mac via SpamSieve — to the server. Here’s how I did it.


10 Years of Spam Filtering on my Mac

I have been running SpamSieve for the last 10 years to clear spam out of my inbox automatically, and once trained, it has proven extraordinarily effective. In my experience it is more effective than Gmail’s or Outlook’s spam filters, because it starts with zero knowledge of what spam looks like and you train it yourself with the good mail and the bad mail that you receive. You end up with a highly personalized spam filter, free of bias. There are few to no false positives, which is when good mail (the mail I want) ends up in spam. There are some false negatives, which is when mail I don’t want ends up in my inbox (typically because spammers get creative and send new content that does not yet look like spam to SpamSieve).

SpamSieve is a Bayesian spam filter, meaning it is trained on hammy and spammy words from examples (i.e. your emails, when you manually classify them as spam or ham). A Bayesian filter stores words, each with a score. These word→score pairs are then used to calculate the probability that a given email you feed to the filter is spam, from 0% to 100%. The user configures the ham/spam cut-offs, that is, the probability thresholds below or above which an email is classified as ham or spam (or left undecided).

For every email you receive that goes through your Bayesian spam filter, a decision is made based on the calculated probability:

  1. Should the email stay in the inbox? (definitely ham)
  2. Be flagged for review? (not sure if it’s ham or spam, i.e. undecided)
  3. Or go to the junk folder? (definitely spam)
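
To make this three-way decision concrete, here is a minimal shell sketch. The function name classify_probability is hypothetical, and the 0.4 and 0.05 cut-offs are assumptions borrowed from the thresholds I configure later in this article. A real filter may treat a probability sitting exactly on a cut-off differently.

```sh
#!/bin/sh

# Map a spam probability (between 0 and 1) to one of the three outcomes.
# $1: probability, $2: spam cut-off (assumed 0.4), $3: ham cut-off (assumed 0.05)
classify_probability() {
  awk -v p="$1" -v spam="${2:-0.4}" -v ham="${3:-0.05}" 'BEGIN {
    # Adding 0 forces a numeric (not string) comparison in awk
    if (p + 0 > spam + 0)     print "Spam"
    else if (p + 0 < ham + 0) print "Ham"
    else                      print "Unsure"
  }'
}
```

For example, classify_probability 0.2 falls between both cut-offs, so the email would be flagged for review.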

My problem with SpamSieve is exactly the same as with Apple Mail filters: if my MacBook is asleep in my backpack, it will not sort through my newly received email. When I access my mailbox on my iPhone, it is a mess of unfiltered email (mostly spam). The solution is therefore to find a Bayesian filtering program that runs on the server.


Finding a Server-Side Bayesian Spam Filter

The typical go-to solution for server-side spam filtering is SpamAssassin. It’s a well-known program that’s effective but which comes with a lot of spam filtering techniques I do not need. I only need a Bayesian spam filter, and I need it lightweight.

Fortunately, such a program exists: bogofilter. It comes as a lightweight binary, built in C, and is called as a command in the terminal. Its usage is simple: to train it with examples of spam and ham, or to calculate the spamminess of an email, you run a command. There’s no need to run it as a daemon like SpamAssassin, speak a complicated network protocol, or deal with milters.

It turns out that adding bogofilter to my stack was very simple, as my self-hosted email setup is fairly standard: I run Postfix and Dovecot, and a DKIM/SPF/DMARC stack. I have Sieve scripts running in Dovecot.

Here’s how we will integrate bogofilter with this existing stack:

  • Postfix will run bogofilter spam checks on all incoming mail, and add a header to each email with the spam/ham score;
  • Dovecot will train bogofilter on what’s spam and what’s ham, based on the emails that the user moves from Inbox to the Junk folder, or from the Junk folder to Inbox.

Let’s modify Postfix and Dovecot configuration now.


Setting up Bogofilter

👉 For all the steps described below, I am running Debian 13 on my server, with Postfix 3.11 and Dovecot 2.4. The version of bogofilter is 1.2 (configuration might change and break if you are reading this article too far in the future; it’s 2026 here).

First things first, we will start by installing bogofilter on our server, where both Postfix and Dovecot will use it:

apt-get install bogofilter-sqlite

Step 1: Classify incoming mail on Postfix

All inbound emails sent to email addresses on our mail server must go through Postfix first, before they get routed to Dovecot via LMTP. Postfix will be the one that’ll call bogofilter for every email that passes through. bogofilter will return the original email content, with an extra header X-Bogosity containing the ham/spam probability score, and its spam/ham/unsure decision based on configured cut-off thresholds.

For instance, this is what I got on a clean email for a password reset from Soundcloud:

X-Bogosity: Ham, tests=bogofilter, spamicity=0.000144, version=1.2.5

And that’s what I got for a random spam email:

X-Bogosity: Spam, tests=bogofilter, spamicity=0.809745, version=1.2.5

The bogofilter command used to generate this output from an input email is the following (for my email: valerian@valeriansaliou.name):

/usr/bin/bogofilter -e -p -o 0.4,0.05 -d /var/lib/bogofilter/valerian@valeriansaliou.name < input_email.eml

Let’s explain what this command does:

  • -e: exits with code 0 whenever the message could be classified, whatever the verdict (instead of using a different exit code per verdict)
  • -p: passthrough option, reads the whole email on stdin and writes the whole email, with the added header, to stdout
  • -o 0.4,0.05: spam/ham cut-off values, so above 40% it’s Spam, below 5% it’s Ham, in-between it’s Unsure
  • -d /path/to/db/: the path to where the bogofilter database is stored for the recipient (each account on your email server trains its own personalized filter, so each recipient has its own database)
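
To see why -e is useful in a pipeline, it helps to know what bogofilter does without it. As far as I understand its manual, the exit code then carries the verdict: 0 for spam, 1 for ham, 2 for unsure, and 3 for errors. The sketch below translates such a code into a label; the function name verdict_from_exit_code is hypothetical.

```sh
#!/bin/sh

# Translate a bogofilter exit code (obtained WITHOUT the -e option) into a label.
# With both -p and -e, every successful classification exits 0 instead, so this
# mapping only applies to plain invocations.
verdict_from_exit_code() {
  case "$1" in
    0) echo "Spam" ;;
    1) echo "Ham" ;;
    2) echo "Unsure" ;;
    *) echo "Error" ;;
  esac
}

# Usage sketch (needs bogofilter and an existing database, so left commented out):
#   bogofilter -d /path/to/db/ < input_email.eml
#   verdict_from_exit_code $?
```

Since a shell treats any non-zero exit code as a failure, a perfectly good ham email would look like a failed command without -e, which is awkward when piping mail through a script.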

We are going to configure Postfix so that it runs this command on every received email. We will use a Postfix content filter: Postfix hands each incoming message to an external program, which re-injects it into the mail queue once processed.

First, create the bogofilter database storage directory:

mkdir /var/lib/bogofilter
chown vmail:vmail /var/lib/bogofilter

Then, create the bogofilter scan script on your server:

nano /usr/local/bin/bogofilter-scan

#!/bin/sh

# $1 is the recipient address, $2 is the envelope sender (both passed by Postfix)
SENDMAIL="/usr/sbin/sendmail -G -i"
RECIPIENT_LOWER=$(echo "$1" | tr '[:upper:]' '[:lower:]')
BOGOFILTER_DB="/var/lib/bogofilter/$RECIPIENT_LOWER"

# No database for user? Queue mail without spam-checking
# (sendmail reads the message directly from stdin)
if [ ! -d "$BOGOFILTER_DB" ]; then
  $SENDMAIL -f "$2" -- "$1"
  exit $?
fi

# Database exists: run spam check (this adds a 'X-Bogosity' header)
# [SPAM] > 40% >= [UNSURE] >= 5% > [HAM]
/usr/bin/bogofilter -e -p -o 0.4,0.05 -d "$BOGOFILTER_DB" | $SENDMAIL -f "$2" -- "$1"
exit $?

Then make the script executable:

chmod +x /usr/local/bin/bogofilter-scan
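
The scan script lowercases the recipient before looking up its database, so that Valerian@… and valerian@… share one filter. Here is that lookup isolated as a small sketch, handy for checking which directory a given address maps to. The function name bogofilter_db_for and the BOGOFILTER_ROOT override are hypothetical.

```sh
#!/bin/sh

# Print the database directory the scan script would use for a recipient.
# BOGOFILTER_ROOT is a hypothetical override, defaulting to the article's path.
bogofilter_db_for() {
  recipient_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf '%s\n' "${BOGOFILTER_ROOT:-/var/lib/bogofilter}/$recipient_lower"
}
```

Note that the directory name must match the IMAP username used later for training, so it is worth making sure your users log in with a lowercase address.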

Now, edit your Postfix main configuration file:

nano /etc/postfix/main.cf

And add this line:

# Bogofilter
bogofilter_destination_recipient_limit = 1

Next, edit your Postfix master configuration:

nano /etc/postfix/master.cf

And apply this change:

# For the 'smtp' service:
smtp       inet  n      -       n       -       -       smtpd
  # <Keep your existing configuration here>
  # Add the following line:
  -o content_filter=bogofilter:

At the end of the file, add this service:

bogofilter  unix  -     n       n       -       2       pipe
  flags=Rq user=vmail null_sender= argv=/usr/local/bin/bogofilter-scan ${recipient} ${sender}

Finally, restart Postfix:

service postfix restart

You can now try sending an email to yourself from a remote sender. Make sure no error appears in the Postfix log, and that the email gets delivered to your inbox.

Since you do not have any bogofilter database at the moment, there will not be any X-Bogosity header yet. That’s expected: your database will be created on the first training. Let’s set up Dovecot now!
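
Later, once training has created a database, you can come back and confirm that the header is being added by looking at a delivered message. The sketch below prints the X-Bogosity line from the header section only, so a matching line quoted in the body is ignored. The function name bogosity_header is hypothetical, and it assumes you can read the raw message file on the server.

```sh
#!/bin/sh

# Print the X-Bogosity header of a raw email file, if present.
# Reading stops at the first empty line, which marks the end of the headers.
bogosity_header() {
  awk '
    /^\r?$/ { exit }                  # blank line: end of the header section
    tolower($0) ~ /^x-bogosity:/ {
      sub(/\r$/, "")                  # drop a trailing carriage return, if any
      print
      exit
    }
  ' "$1"
}
```

An empty output means the message was queued without spam-checking, which is what the scan script does when the recipient has no database.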

Step 2: Moving incoming spam to Junk on Dovecot

This step is very simple. By default, spam caught by Postfix is not moved to Junk when it arrives in Dovecot.

Any user who wants to can set up a rule in their user-defined Sieve script to move emails with certain X-Bogosity headers wherever they please.

The X-Bogosity header can hold the following decision values:

  • Ham: emails definitely ham (those can stay in the inbox)
  • Unsure: emails that cannot be classified (usually because there is not yet enough training data)
  • Spam: emails definitely spam (those should go to the junk folder)

Each user account on your IMAP server can thus specify what they want to do with the Spam and Unsure levels. Some users might want to auto-delete all spam. Others might want to move spam to the Junk folder and auto-mark it as read. Others still might want unsure emails to go to the Junk folder as well. Everyone has their preference.

For my own IMAP account, these are my Sieve rules (put at the top, as they need to execute first):

require ["fileinto", "imap4flags"];

################
### BOGOSITY ###
################

# Bogosity (Spam)
if header :matches "X-Bogosity" "Spam*"
{
    addflag "\\Seen";
    fileinto "Spam";
    stop;
}

# Bogosity (Unsure)
if header :matches "X-Bogosity" "Unsure*"
{
    fileinto "Review";
    stop;
}
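
If you want to sanity-check which folder a given X-Bogosity value would land in before touching your live rules, the matching logic is easy to mirror in shell. This is only a simulation of the two rules above, not a Sieve interpreter. The function name sieve_destination is hypothetical, and it assumes no later rule in your script files the message elsewhere.

```sh
#!/bin/sh

# Simulate the folder chosen by the two Bogosity rules for a header value.
# Sieve's :matches is case-insensitive by default, hence the lowercasing.
sieve_destination() {
  value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$value" in
    spam*)   echo "Spam" ;;    # also flagged as \Seen by the real rule
    unsure*) echo "Review" ;;
    *)       echo "INBOX" ;;   # no rule matched: implicit keep
  esac
}
```

For instance, sieve_destination "Unsure, tests=bogofilter, spamicity=0.2, version=1.2.5" prints Review.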

Moving Unsure emails to a “Review” folder is especially important: it forces the user to give bogofilter explicit feedback on the emails it could not classify accurately, training the filter each time an email is moved from “Review” to the Inbox or the Junk folder.

If you did not yet configure Dovecot for user Sieve scripts, read my related article: Migrating Apple Mail filters to Sieve scripts with Dovecot.

With these rules in place, moving emails from one IMAP folder to another can serve as training feedback, which is covered in the next step.

Step 3: Auto-training from user feedback via Dovecot

In this step, we will configure Dovecot to run server-wide Sieve scripts triggered by IMAP events, such as moving an email from one IMAP folder to another. Those Sieve scripts will run for all IMAP users, and cannot be customized at the user level.

Edit the Dovecot plugins configuration:

nano /etc/dovecot/conf.d/10-plugin.conf

# Add this at the end of the file:

sieve_plugins = sieve_imapsieve sieve_extprograms

# From elsewhere to Spam folder: report as spam
mailbox Spam {
  sieve_script spam_from_any {
    type  = before
    cause = copy
    path  = /usr/lib/sieve/report-spam.sieve
  }
}

# From Spam folder to elsewhere: report as ham
imapsieve_from Spam {
  sieve_script ham_from_spam {
    type  = before
    cause = copy
    path  = /usr/lib/sieve/report-ham.sieve
  }
}

# From Review folder to elsewhere: report as ham
imapsieve_from Review {
  sieve_script ham_from_review {
    type  = before
    cause = copy
    path  = /usr/lib/sieve/report-ham.sieve
  }
}

sieve_pipe_bin_dir = /usr/lib/sieve

sieve_global_extensions = vnd.dovecot.pipe vnd.dovecot.environment

Now, edit the Dovecot IMAP configuration:

nano /etc/dovecot/conf.d/20-imap.conf

protocol imap {
  mail_plugins {
    # <Keep your existing plugins here>
    # Add this plugin:
    imap_sieve = yes
  }
}

Create the Sieve directory (where Sieve scripts will be stored and compiled):

mkdir /usr/lib/sieve
chown vmail:vmail /usr/lib/sieve

Good. Now create the spam-reporting Sieve script:

nano /usr/lib/sieve/report-spam.sieve

require ["vnd.dovecot.pipe", "copy", "imapsieve", "environment", "variables"];

if environment :matches "imap.user" "*" {
  set "username" "${1}";
}

pipe :copy "bogofilter-learn" [ "spam", "${username}" ];

And the ham-reporting Sieve script:

nano /usr/lib/sieve/report-ham.sieve

require ["vnd.dovecot.pipe", "copy", "imapsieve", "environment", "variables"];

if environment :matches "imap.mailbox" "*" {
  set "mailbox" "${1}";
}

# Safety guard: do not train deleted or re-filed spam as ham.
# The 'Review' folder is an optional special-use folder that holds emails
#   with the 'Unsure' bogosity flag until the user sorts them. When an email
#   is moved out of 'Review', the target folder tells us whether it is spam
#   or ham. We therefore ignore emails moved to 'Trash', 'Spam' or back into
#   'Review' here, to avoid wrongly training them as ham.
if anyof (
  string "${mailbox}" "Trash",
  string "${mailbox}" "Spam",
  string "${mailbox}" "Review"
) {
  stop;
}

if environment :matches "imap.user" "*" {
  set "username" "${1}";
}

pipe :copy "bogofilter-learn" [ "ham", "${username}" ];
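
Taken together, the Dovecot rules and the guard in report-ham.sieve boil down to a small decision table: which training, if any, a given move triggers. Here it is as a shell sketch, useful for reasoning about edge cases. The function name training_action is hypothetical, the folder names are the ones used in this article, and unlike Sieve’s string test the comparison here is case-sensitive.

```sh
#!/bin/sh

# Print the training triggered by moving an email: "spam", "ham" or "none".
# $1: source folder, $2: target folder
training_action() {
  from="$1"
  to="$2"

  # Anything moved into Spam is reported as spam (the ham script, if it
  # also runs, stops at its guard because the target is Spam)
  if [ "$to" = "Spam" ]; then
    echo "spam"
    return 0
  fi

  case "$from" in
    Spam|Review)
      case "$to" in
        Trash|Review) echo "none" ;;  # guarded targets: no training
        *)            echo "ham" ;;
      esac
      ;;
    *)
      echo "none"                     # no rule watches other source folders
      ;;
  esac
}
```

So training_action Review INBOX prints ham, while training_action Spam Trash prints none: deleting a spam does not teach the filter anything.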

And we finally need to create the common bogofilter-learn script:

nano /usr/lib/sieve/bogofilter-learn

#!/bin/sh

CLASS="$1"
BOGOFILTER_DB="/var/lib/bogofilter/$2"

# Train as ham
if [ "$CLASS" = "ham" ]; then
  /usr/bin/bogofilter -Sn -d "$BOGOFILTER_DB"
  exit $?
fi

# Train as spam
if [ "$CLASS" = "spam" ]; then
  /usr/bin/bogofilter -Ns -d "$BOGOFILTER_DB"
  exit $?
fi

exit 1

Then make the script executable:

chmod +x /usr/lib/sieve/bogofilter-learn
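
The -Sn and -Ns flag pairs deserve a word. As I read the bogofilter manual, lowercase flags register a message (-s as spam, -n as ham) while uppercase flags unregister it (-S from spam, -N from ham), so each pair undoes the opposite classification before recording the new one. A minimal sketch of that mapping, with a hypothetical function name learn_flags:

```sh
#!/bin/sh

# Print the bogofilter training flags for a class, or fail on an unknown class.
learn_flags() {
  case "$1" in
    ham)  echo "-Sn" ;;  # -S: unregister from spam, -n: register as ham
    spam) echo "-Ns" ;;  # -N: unregister from ham, -s: register as spam
    *)    return 1 ;;    # unknown class, mirrors the script's final 'exit 1'
  esac
}
```

This is what makes corrections work: an email first trained as spam and later moved back to the Inbox does not stay counted on both sides.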

Now, restart Dovecot:

service dovecot restart

Try moving an email that looks like spam to the Junk folder (named “Spam” in the configuration above). Then check the Dovecot logs: if there are no errors, you are good to go!

The bogofilter database for the IMAP user you are logged in as should now have been created in /var/lib/bogofilter/, meaning you are all set! 😃


🇫🇷 Written from Nantes, France.