Joe and Dave + SASA:
I have combed through my notes. Here's what I have; I have removed some
password info, but I think it can help rebuild the process. Dave, can
you take a look? I can get you the passwords. Talon1 still exists;
spamassassin-vm does not, but I have a backup of spamassassin-vm from 3.7
months ago.
Regards,
KAM
#1 - Some boxes are just names for other boxes
trap-proc.spamassassin.org. Sonic has scripts set up to archive
collected spam to that server.
#2 - My notes from spamassassin-vm.apache.org that catastrophically died:
This was the traps cron job that needed to be added on spamassassin-vm:
20 2 * * * rsync -rze ssh --whole-file --size-only --delete j...@trap-proc.spamassassin.org:/home/jm/cor/. /export/home/bbmass/uploadedcorpora/traps/.
DONE - add this traps account
DONE - fix perms for /export/home/bbmass/uploadedcorpora/traps/
DONE - add cron job
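Since --delete removes anything under the destination that is missing on the source, a dry run is a cheap check before the job goes back into cron. A minimal sketch, assuming a POSIX shell: sync_traps, TRAP_SRC and TRAP_DEST are names made up for illustration, and the remote user has to be supplied by whoever runs it because it is elided in the notes above.

```shell
#!/bin/sh
# Sketch only. TRAP_SRC must be set by the caller (the remote user is
# elided in the notes); TRAP_DEST defaults to the path from the cron line.
TRAP_DEST=${TRAP_DEST:-/export/home/bbmass/uploadedcorpora/traps/.}

# sync_traps [dry-run]
# Runs the same rsync as the cron entry. With "dry-run", rsync only
# reports what it would transfer or delete and changes nothing.
sync_traps() {
    : "${TRAP_SRC:?set TRAP_SRC to USER@trap-proc.spamassassin.org:/home/jm/cor/.}"
    if [ "${1:-}" = "dry-run" ]; then
        set -- --dry-run --verbose
    else
        set --
    fi
    rsync -rze ssh --whole-file --size-only --delete "$@" "$TRAP_SRC" "$TRAP_DEST"
}
```

Calling "sync_traps dry-run" first and reading the output, then "sync_traps" once it looks sane, keeps the flags identical to the cron entry.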
#3 - From April 2017
Let me know if you are not the correct person to talk to about this, but
we are having issues reaching trap-proc.spamassassin.org. It looks like
we have some scripts set up to archive collected spam to that server,
and I haven't seen a successful connection for a few days now.
--
Grant Keller
System Operations
grant.kel...@sonic.com
#4 - from 2014
The box at Sonic is the backend for the SpamAssassin spamtraps feed. To
be honest, I am not sure anyone or anything is consuming the collected
data at this stage -- it should probably be shut down, unless someone
wants to take it over?
incoming.spamassassin.org : this is the spamtrap machine at Sonic.
Basically, qpsmtpd handles the incoming SMTP traffic, handing it off via
a Gearman queue to "gears" -- a set of scripts running in the background
which filter out noise, crap, bounces, etc., then buffer them to mbox
files and upload.
/home/trap contains the code, /home/trapper is the output files.
/etc/init.d/gears starts the scripts which compose it, copying them to
/tmpfs first so they don't hit the disk where possible, for speed.
The main config file is at /home/trap/code/gears/config .
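For anyone rebuilding the buffering step, here is a minimal sketch of appending one filtered message to an mbox file. It is not the gears code, which is not shown in this thread; append_to_mbox and its arguments are hypothetical names, and it assumes a POSIX shell with sed and date.

```shell
#!/bin/sh
# append_to_mbox MBOX_FILE [SENDER]
# Appends one message, read from stdin, to MBOX_FILE (hypothetical helper).
append_to_mbox() {
    mbox_file=$1
    sender=${2:-MAILER-DAEMON}
    {
        # Every mbox entry starts with a "From sender date" separator line.
        printf 'From %s %s\n' "$sender" "$(LC_ALL=C date -u '+%a %b %e %H:%M:%S %Y')"
        # Quote message lines that would otherwise look like a separator
        # (mboxrd style: "From " becomes ">From ", ">From " becomes ">>From ").
        sed 's/^>*From />&/'
        # A blank line ends the entry.
        printf '\n'
    } >> "$mbox_file"
}
```

Usage would be along the lines of "append_to_mbox /path/to/buffer.mbox < message.eml", with the path chosen by whoever sets it up.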
The buffered mbox files are then uploaded to my S3 account, using an IAM
credential which can only access one single bucket called "mailtrap".
After 1 day those files are auto-expired.
This stuff all appears to be working ok, although the volume is pretty
high (and I suspect it's costing me a fair bit of money even despite the
auto-expiration!)
Next step is spamassassin2.zones.apache.org, which has an alias of
"trap-proc.spamassassin.org" in DNS. A cron on my user account runs
"/home/trapscripts/copy_to_corpus" which (at least at some point) appears
to have selected a randomised subset of uploaded spam corpora into
/home/jm/cor/spam and /home/jm/cor/nonspam. Those directories are now
empty, so I think this part may have broken at some point in 2013 :(
I can't track down the script which downloads files from the S3 account,
annoyingly!
Again, everything there runs as "jm".
Finally there is talon1. The host is talon1.pccc.com; username "jm",
password is in spamassassin2.zones.apache.org/root/sought_rules_info.txt
(readable only by root).
That host is being used to generate the SOUGHT rulesets, and as far as I
can see (apologies, I haven't been monitoring it at all recently!) it
still seems to be doing so. It all runs from the "jm" user account, every
4 hours from cron; see "crontab -l". Part of the process is to
rsync-over-ssh the ham and spam corpus from jm@spamassassin2.
Then the final step of that script is to publish the files to my server
at taint.org, by "svn commit"ing in ~/sought on talon1. That directory
commits back to an svn repo on that host over svn+ssh, then SSHes to that
host and runs a script; that generates GPG signatures, updates the DNS
records and pushes it to the Cloudfront/S3 bucket for rules.yerp.org.
If/when you guys take this over, this bit definitely needs to be moved to
another host and account, since that's my main personal server ;)
Having said that, I'm happy to hand over the credentials to all the other
"jm" accounts named above. I've put the passwords into
spamassassin2.zones.apache.org/root/sought_rules_info.txt (readable only
by root).
Feel free to take over those accounts and do what you will with them ;)
Sorry I haven't handed this over earlier -- even reverse-engineering all
this took quite a lot of effort. Legacy systems suck!
Where the trap data comes from: it's partly a typical spamtrapping MX
capturing dead domains, and partly /etc/aliases forwards from other ISPs
around the world, who are following the "how to donate your spamtrap to
SA" instructions on the wiki. Note that the latter means that we have
to strip off a bunch of forwarding steps when/if we act on that data.
The domains are MX records hanging off existing, live domains; e.g. I'd
add a "mx.taint.org", seed a few email addresses in those domains (e.g.
for web scrapers), then MX the entire domain to the traps machine.
> - the alias forwards: where they pointing to?
Essentially there's a *@incoming.spamassassin.org catch-all, and the
alias forwards redirect spam into named addresses there.
sought_rules_info.txt:
jm@talon1 password: <removed>
incoming.spamassassin.org = traps machine: u root p <removed>
u jm p <removed>
jm account on zones2: <removed>
Doing about 74000 messages per day as of 2/5/2014
DONE - 1 - Get SSH access working - Pinged Justin on 4/22 - WORKS as of
4/23 going to 76.191.162.2
DONE - 2 - why does incoming.spamassassin.org have two IPs? - Emailed Justin
incoming.spamassassin.org. 3507 IN A 76.191.162.2
incoming.spamassassin.org. 3507 IN A 75.101.166.134
Huh. I had no idea we were still doing that ;) That is the Mailchannels
spamtrap IP. If you remember back in 2008 (private@ was cc'd), they
donated spamtrap hosting to us, in exchange for spam data. We
eventually moved off the donated spamtrap server (in EC2) which they
were paying for, to the current one in PCCC. It looks like we never
changed the 50:50 split setup on the MX record, though (and I'd forgotten
about it). I think we can probably turn that off now.
76.191.162.2 is our one. I've just verified that I'm able to SSH to it
as root.
DONE - 2a - Remove 75.101.166.134 from incoming.spamassassin.org. DNS entry
3 - more?
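To confirm that item 2a has taken effect, the addresses can be pulled out of dig's answer section. A minimal sketch, assuming a POSIX shell with awk; list_a_records is a made-up helper name, and it expects lines in the same format as the two records quoted in item 2.

```shell
#!/bin/sh
# list_a_records: reads dig-style answer lines on stdin
# ("name TTL IN A address") and prints one address per line.
list_a_records() {
    awk '$3 == "IN" && $4 == "A" { print $5 }'
}
```

For example, "dig +noall +answer incoming.spamassassin.org A | list_a_records" should print only 76.191.162.2 once the 75.101.166.134 entry has been removed and cached copies have expired.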
On 1/15/2018 11:49 AM, Dave Jones wrote:
No problem. No rush. Just didn't hear from you so I thought you
might have missed the last email from Joe.
There's no rsyncd running or listening on port 873 on that box, if the
rsyncs are supposed to be pushing to it.
[root@colo etc]# netstat -tunlap | grep LISTEN
tcp 0 0 0.0.0.0:35469 0.0.0.0:* LISTEN 9693/perl
tcp 0 0 127.0.0.1:4243 0.0.0.0:* LISTEN 20805/java
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 2175/named
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 2306/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 2389/master
tcp 0 0 127.0.0.1:953 0.0.0.0:* LISTEN 2175/named
tcp 0 0 0.0.0.0:7003 0.0.0.0:* LISTEN 9693/perl
tcp 0 0 :::8000 :::* LISTEN 757/httpd
tcp 0 0 ::1:53 :::* LISTEN 2175/named
tcp 0 0 :::22 :::* LISTEN 2306/sshd
tcp 0 0 ::1:953 :::* LISTEN 2175/named
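The same check can be scripted so it is repeatable on whichever box takes over. A minimal sketch, assuming a POSIX shell with awk and netstat output in the column layout of the listing above; has_listener is a hypothetical helper name. Note that the traps cron line in Kevin's notes runs rsync with "-e ssh", which connects to sshd on port 22 and does not need an rsync daemon on 873.

```shell
#!/bin/sh
# has_listener PORT
# Reads netstat-style lines on stdin and succeeds (exit 0) if any
# LISTEN socket has PORT as the last ":"-separated field of its
# local address, so "0.0.0.0:22" and ":::22" both match port 22.
has_listener() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, "netstat -tunl | has_listener 873 || echo no rsyncd listening" reports the situation described here.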
I didn't find any cron jobs scheduled that would be
pulling/transferring via rsync either.
I aliased my email address to root's on that box so I would get all
emails, and so far I have seen just a few cron jobs with minor issues.
I searched the /home/trap directory for any "signs of life" since that
seems to be the main thing set up and running on this box for gearmand.
Nothing found in the logs.
Dave
On 01/15/2018 10:27 AM, Kevin A. McGrail wrote:
Please give me two more days. There are some DNS issues I'm
researching around trap-proc and some old notes for Justin.
I think the system is broken because trap-proc should be a CNAME for
the colo box.
There should be rsyncs and traps happening.
I am traveling for business and have not had the time I hoped this week.
On 1/15/2018 11:24 AM, Dave Jones wrote:
Kevin,
Are you OK with shutting down this colo box? I didn't find anything
running on this box anymore.
Dave
On 01/11/2018 02:01 PM, Joe Muller wrote:
Kevin, are you okay with shutting down the old server today? It sounds
like Dave has finished migrating services.
Also, I'll be looking through our capture scripts to make sure they're
functioning. Expect an email in the next couple days with my
findings. :)
-- Joe
On 01/11/2018 10:40 AM, Dave Jones wrote:
On 01/08/2018 06:56 AM, Kevin A. McGrail wrote:
Hi Joe.
Great to hear that ns b is back online, and thanks for the machine.
Dave has really been leading the effort on the machine. We have a
mirror running on it now. We're still getting some information about
the old server but it looks awesome.
Out of interest, do you have any documentation on the capture scripts
running at Sonic? We are trying to really improve our documentation
on systems.
Regards,
KAM
I have scoured this old colo box for the past week and don't see it
really doing anything. It has a local gearmand running to process
queues but its logs don't show any work happening.
I added my email address to the root alias and all I see is a minimal
logwatch email and some unimportant minor errors from cron output.
The local account "trapper" is full of undeliverable email with this:
trap-proc.spamassassin.org[192.87.106.247]: Connection timed out
It has an Apache webserver running on port 8000 but that appears to
only be for Munin reports.
It has a local BIND DNS server only listening on 127.0.0.1:53 and
127.0.0.1:953.
Unless anyone knows something more, I think the old server can be
shut down and given a few days before being pulled.
Dave
On 1/6/2018 11:37 PM, Joe Muller wrote:
Kevin,
Thanks for letting me know - b.auth-ns was offline for an
emergency hardware swap. Unfortunately, spinning up a fresh OS and
the associated DNS backend took longer than expected (I largely
blame myself) - we are back up to fully operating status as of
8:30pm PST.
On a side note, how's the migration going to the new server? No
huge rush to get it done, mostly curiosity if everything is to your
team's satisfaction. It's not very often that I build up systems for
use by folks outside of Sonic, so I'm always looking for ways to
improve the process.
-- Joe Muller
Sonic System Operations