I've been creating local backups of my email for ages. Recently I noticed how quickly my email archives have been growing. I went from a 3 meg zipped archive in 2003 to a 15 meg zipped archive in 2004. That's not a huge increase in absolute terms, but between 2004 and 2011 the archives shot up to over 300 megs.
After a bit of digging around in my mail archives it was pretty obvious that it wasn't the text that was sucking up all the space, but rather all the attachments that are fired around.
While it's nice to look at the baby pictures, I don't need to save a copy both on Google and in my local backup. I use fetchmail to pull my email from Gmail over POP and procmail to handle the actual delivery. Procmail is a holdover from when I had school and Hotmail accounts that simply piled up with spam. Using a combination of SpamBayes and procmail I could choke off a bit of that spam. Since I switched to Gmail, my spam load has dropped to the point that procmail just runs quietly in the background, delivering everything into mail archives that I hardly ever look at these days.
After a bit of research, I came across a Perl script by Mike Leonetti. It was meant to work with sendmail and do a few extra filtering tasks that I didn't need, so I tweaked it to be a bit simpler and to work with procmail. It quietly removes any attachments whose file types are listed in ~/procmail/filter_attachments, substitutes some text to indicate where the attachments used to live, and then passes the mail on to be delivered.
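To give a rough idea of what that kind of filtering involves, here is a minimal sketch in Python. This is not Mike's Perl script or my tweaked version of it, just an illustration of the same idea using Python's standard email package. The function name strip_attachments and the hard-coded extension list are made up for the example; the real script reads its list from the filter_attachments file.

```python
import email
import os
from email import policy

# Hypothetical list for illustration only; the real script reads its
# file types from a separate filter_attachments file.
BLOCKED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".pdf", ".zip"}


def strip_attachments(raw_bytes, blocked=BLOCKED_EXTENSIONS):
    """Return the message with blocked attachments swapped for a text note."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in msg.walk():
        # Multipart containers only hold other parts; skip them.
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if not filename:
            continue  # no filename, so treat it as ordinary body content
        ext = os.path.splitext(filename)[1].lower()
        if ext in blocked:
            # set_content() drops the old payload and Content-* headers,
            # then stores the note as a plain-text part in the same spot.
            part.set_content(f"[Attachment removed: {filename}]\n")
    return msg.as_bytes()
```

A filter built on this would read the raw message from standard input and write the result of strip_attachments() to standard output, leaving the delivery itself to procmail.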
I should mention that Mike was tremendously helpful in debugging and making suggestions when I couldn't get my tweaked version to work. He's a great guy!
After the jump you can find the procmail recipe and the Perl script. Happy stripping!