Re: pauses sync'ing between tmpfs and disk on Linux 2.4.x

2005-03-26 Thread Tim Moore
I use rsync similarly.  In addition to the standard /tmp traffic, junkbuster
has a 4MB jarfile and 18MB log, ccache has 16 directories and 8000 files
of which 3000 are > 10k.  Both are heavily used.

At boot time /tmp is populated from on-disk storage:

    /bin/nice -19 /usr/bin/rsync -aq --whole-file /usr/TMP/ /tmp &

At shutdown the tmpfs deltas are put back:

    /usr/bin/rsync -aq --delete --no-whole-file \
        --exclude "*-lock" --exclude "*-unix" --exclude "*=:0.0" /tmp/ /usr/TMP &

Here are some representative timings (start == boot, stop == shutdown):
[9:26] abit:/tmp > egrep '(stop|start)\) .*  tmp' /var/log/messages.3
Mar 19 20:53:55 abit rc.local: (stop) starting  tmp /tmp/ /usr/TMP
Mar 19 20:53:57 abit rc.local: (stop) finished  tmp /tmp/ /usr/TMP
Mar 20 09:31:18 abit rc.local: (start) starting  tmp /usr/TMP/ /tmp
Mar 20 09:31:55 abit rc.local: (start) finished  tmp /usr/TMP/ /tmp
Mar 20 21:07:37 abit rc.local: (stop) starting  tmp /tmp/ /usr/TMP
Mar 20 21:07:41 abit rc.local: (stop) finished  tmp /tmp/ /usr/TMP
Mar 21 20:33:21 abit rc.local: (start) starting  tmp /usr/TMP/ /tmp
Mar 21 20:34:17 abit rc.local: (start) finished  tmp /usr/TMP/ /tmp
There is never a hang or pause; in fact, the process would be transparent if
I didn't have xosview in a corner window.  The problem probably isn't the
way rsync, tmpfs or 2.4 work, so you may want to look elsewhere.
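
The boot and shutdown steps above could be wrapped in a small helper along
the following lines. This is only a sketch: the names DISK_COPY, TMPFS_DIR,
DRY_RUN, run, marker and tmp_sync are illustrative assumptions, not the
poster's actual rc.local. The trailing slashes on the source paths follow the
log lines quoted above, and the sketch runs rsync in the foreground (rather
than with "&") so that the "finished" marker is written after the copy ends.

```shell
#!/bin/sh
# Sketch of an rc.local-style helper; all names here are hypothetical.

DISK_COPY=${DISK_COPY:-/usr/TMP}   # persistent on-disk copy of /tmp
TMPFS_DIR=${TMPFS_DIR:-/tmp}       # tmpfs mount point
DRY_RUN=${DRY_RUN:-1}              # 1 = only print commands, 0 = run them

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Write a syslog marker shaped like the log lines in the message.
# $1 = start|stop, $2 = starting|finished, $3 = source, $4 = destination
marker() {
    run logger -t rc.local "($1) $2  tmp $3 $4"
}

tmp_sync() {
    case "$1" in
    start)  # boot: populate tmpfs from disk, copying whole files
        marker start starting "$DISK_COPY/" "$TMPFS_DIR"
        run nice -n 19 rsync -aq --whole-file "$DISK_COPY/" "$TMPFS_DIR"
        marker start finished "$DISK_COPY/" "$TMPFS_DIR"
        ;;
    stop)   # shutdown: write changes back, skipping locks and sockets
        marker stop starting "$TMPFS_DIR/" "$DISK_COPY"
        run rsync -aq --delete --no-whole-file \
            --exclude '*-lock' --exclude '*-unix' --exclude '*=:0.0' \
            "$TMPFS_DIR/" "$DISK_COPY"
        marker stop finished "$TMPFS_DIR/" "$DISK_COPY"
        ;;
    *)
        echo "usage: tmp_sync start|stop" >&2
        return 1
        ;;
    esac
}
```

With the default DRY_RUN=1, "tmp_sync stop" only prints the commands it would
run; setting DRY_RUN=0 executes them.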

kernel: 2.4.30-rc1
cpu/mem: athlon XP 2800, 1GB PC2700
rsync: 2.6.3
disk: 3xRAID-5 (old 20GB PATA drives, about 50MB/s throughput)
[9:07] abit:/tmp > df /tmp
Filesystem           1K-blocks      Used Available Use% Mounted on
tmpfs                   460800    159492    301308  35% /tmp
[9:18] abit:/tmp > ls -la
total 7729
drwxrwxrwt  11 root   root       480 Mar 26 08:57 .
drwxr-xr-x  26 root   root      1024 Mar 26 08:51 ..
-r--r--r--   1 root   root        11 Mar 26 08:52 .X0-lock
drwxrwxrwt   2 root   root        60 Mar 26 08:52 .X11-unix
drwxrwxr-x  18 tim    tim     163900 Mar 23 23:34 .ccache
drwxrwxrwt   2 xfs    xfs         60 Mar 26 08:52 .font-unix
drwx------   3 tim    tim         60 Jan 11 23:35 .wine-tim
drwx------   2 root   root        40 Feb  6 00:44 .xf86config1053
drwx------   2 root   root        40 Feb  6 00:46 .xf86config1077
-rw-------   1 tim    tim          0 Feb  9 22:24 0vws1uvh.zip
-rw-------   1 tim    tim          0 Feb  9 22:22 56gtsq4w.zip
-rw-rw-r--   1 tim    tim     178108 Mar 15 21:35 SATA_HotPlug.pdf
-rw-rw-r--   1 tim    tim     234529 Mar 15 21:33 SATA_PCI_CardBusHost.pdf
-rw-rw-r--   1 tim    tim     175906 Mar 15 21:36 SATA_illus_guide.pdf
srwx------   1 tim    tim          0 Mar 26 08:52 afterstep-500.DISPLAY=:0.0
-rw-r--r--   1 root   root     16686 Jan 30 17:18 bonnie.log
-rw-r--r--   1 root   root     35644 Jan  8 18:08 build.log
drwxr-xr-x   2 tim    tim         40 Mar  7 21:05 hsperfdata_tim
drwxr-xr-x   2 nobody nobody      80 Dec 25 23:56 junkbuster
drwx------   2 tim    tim         60 Mar 16 22:46 mcop-tim
-rw-r--r--   1 root   root    485728 Feb 21 12:18 rsync.log
-rw-rw-r--   1 tim    tim    6693164 Mar 15 21:40 serialata10a.pdf
-rw-rw-r--   1 tim    tim      27475 Jan 31 17:07 smp.log
-rw-r--r--   1 root   root         0 Feb 12 15:19 tmp.dd
t.
Ray Van Dolson wrote:
I've set up a 1GB tmpfs filesystem on a system with a single IDE disk and
2GB of memory.  I'm storing a large number of RRD files (~300MB) on the
tmpfs filesystem to make their generation a bit speedier... this part works
great.
However, I want to rsync these files over from time to time to a directory on
the local filesystem (same physical server).  I'm using rsync 2.6.4pre3 and
am hoping to understand a bit better what is happening.
When I run rsync -av /path/to/tmpfs /path/to/diskdir things move along pretty
fast for a while, then there's a big pause... sometimes for up to 30+ seconds
where nothing seems to be happening, but all IO on the system ceases (can't do
anything in another xterm).  Then rsync starts moving along again for a while,
and then pauses again... after about three such pauses it finishes the entire
rsync process.
I'm wondering what I can do to speed things up... perhaps whatever processes
that write to the tmpfs filesystem are fighting with rsync... but doesn't
rsync just need read access?
I've also tried with -W (copy whole file) and used a smaller -B value.
Anyway, I'll explain a bit about the current scenario:
1. Daemon on system receives data from remote devices.
2. Daemon calls rrdtool to write a .rrd file to the tmpfs filesystem.
3. rsync runs periodically (every 5 minutes) to sync up the tmpfs filesystem
   with a directory on the local filesystem.
The tmpfs filesystem is a directory with about 100 directories inside of it,
each containing rrd files.  I'm considering using a script to just rsync one
of these subdirectories at a time over a period of time to "distribute" the
load.
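
The one-subdirectory-at-a-time idea described above might be sketched as
follows. SRC and DEST reuse the placeholder paths from the message; PAUSE,
DRY_RUN and sync_subdirs are hypothetical names, and whether spreading the
copy out this way actually avoids the pauses is untested.

```shell
#!/bin/sh
# Sketch only: sync each top-level subdirectory separately, pausing in
# between, instead of one large rsync run.

SRC=${SRC:-/path/to/tmpfs}       # tmpfs directory holding the rrd subdirs
DEST=${DEST:-/path/to/diskdir}   # on-disk copy
PAUSE=${PAUSE:-2}                # seconds between subdirectories (assumed)
DRY_RUN=${DRY_RUN:-1}            # 1 = only print commands, 0 = run them

sync_subdirs() {
    for dir in "$SRC"/*/; do
        [ -d "$dir" ] || continue        # glob matched nothing
        name=$(basename "$dir")
        if [ "$DRY_RUN" = 1 ]; then
            echo "rsync -a $dir $DEST/$name/"
        else
            # Trailing slash on the source copies the directory contents.
            rsync -a "$dir" "$DEST/$name/"
            sleep "$PAUSE"
        fi
    done
}
```

Calling sync_subdirs with DRY_RUN=1 lists the per-directory commands without
touching any files.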
The main issue is that while rsync is running and enters these random 30
second "pauses", no IO can happen and things get really backed up on

Re: Spam to this list

2005-03-26 Thread John E. Malmberg
Martin Pool wrote:
John Van Essen wrote: 

The policy is to block as much spam as possible without blocking
legitimate posts.  A 100% solution is impossible, even if we had human
moderation (humans make mistakes).
I am seeing reports on news.admin.net-abuse.email from Steve Linford 
that he is getting at least 99% accuracy in removing spam with zero loss 
of real e-mail.

He is removing about 85% of the spam with DNSbls so that it does not
even get inside of the mail server, and then using SpamAssassin 3.0 with
its new test on URLs inside of mail, where if the URL resolves to an IP
address that is known to be controlled by a spammer, the e-mail is rejected.
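
For readers unfamiliar with how a DNSBL check works: the client reverses the
octets of the sender's IPv4 address, appends the blacklist zone, and looks
the resulting name up in DNS. A minimal sketch follows; the function names
are illustrative, the `host` utility is assumed to be installed, and treating
any answer as a listing is a simplification (real zones may use specific
return codes).

```shell
#!/bin/sh
# Sketch of a DNSBL lookup; function names are hypothetical.

# Print the DNS name to query: reversed octets followed by the zone.
# $1 = IPv4 address, $2 = DNSBL zone
dnsbl_query_name() {
    printf '%s\n' "$1" | awk -F. -v zone="$2" '
        NF != 4 { exit 1 }
        { print $4 "." $3 "." $2 "." $1 "." zone }'
}

# Succeed if the address has an A record in the zone.
# Simplification: any successful lookup is treated as "listed".
dnsbl_listed() {
    name=$(dnsbl_query_name "$1" "$2") || return 2
    host -t A "$name" >/dev/null 2>&1
}
```

By convention most DNSBLs list 127.0.0.2 as a test entry, so
"dnsbl_listed 127.0.0.2 <zone>" is a common way to check that a zone is
reachable.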

And he is reporting that he is not using a DHCP list for doing rejections.

The first one has been in the dul.dnsbl.sorbs.net blacklist since Oct.
I use these 4 DNS-based blacklists in the mail server that I manage:
  sbl-xbl.spamhaus.org
I have never seen a report of an incorrect listing in
xbl.spamhaus.org.  I have seen only one reported error in several years
of sbl.spamhaus.org, and it was corrected within half an hour of
being pointed out on news.admin.net-abuse.email.

It is a merging of 3 dnsbls for convenience.
  sbl.spamhaus.org - Hand-maintained list of I.P. addresses controlled
  by spammers.
  The sbl.spamhaus.org is probably now the most widely used dnsbl in
  the world.  An ISP has to work hard at supporting spam to get any
  of its IP addresses listed in the sbl.spamhaus.org.
  xbl.spamhaus.org is a combination of opm.blitzed.org and
  cbl.abuseat.org.
  The cbl.abuseat.org runs spamtraps that filter out auto-responders.
  In the time it has been in existence, I have seen zero reports of
  an incorrect listing.  It will delist on request once per week, and
  listings age off.
  The opm.blitzed.org verifies that the I.P. address is an open proxy,
  and ages off old listings.
  list.dsbl.org
This is a list of known compromised I.P. addresses where no responsible
party has demonstrated they have an RFC-compliant mailbox set up for reading
abuse complaints.  If a real mail server is listed, it means that it is
either an actively compromised machine, or that there is no one
reading messages to their abuse or postmaster e-mail addresses.

It is extremely widely used to reject e-mail, possibly the most used 
after the spamhaus.org.

  dul.dnsbl.sorbs.net
In the past, the dul.dnsbl.sorbs.net used to run a higher false positive 
rate.  Now it is almost not measurable.

dul.dnsbl.sorbs.net now allows owners of mistaken static entries to use 
a webform to remove them as long as they can show a forward DNS name 
pointing to that I.P. with a long enough TTL to show it is static.

Currently a listing in dul.dnsbl.sorbs.net indicates well over a 99% 
chance of spam.

  web.dnsbl.sorbs.net
I have heard nothing good or bad about that one.  In the spam I sent 
through spamcop.net in the past year, I recall seeing it only flag one 
spam that was not detected by either the cbl.abuseat.org or njabl as 
being in that DNSBL.

From what I have seen, the only zone in sorbs that is likely to cause
real e-mail to be rejected is the spam.dnsbl.sorbs.net, as it is usually
listing multi-hop exploits of the mail servers of major ISPs, and they
have to jump through hoops to get off of it.  The other SORBS zones do
not require such extra actions.

And they have helped a LOT.

The other 3 have no reverse DNS entries.  A machine with no reverse DNS
that is sending email is not very likely to be a legitimate email server.
It's much more likely a compromised machine on a clueless ISP's network.
Rejecting email from those unidentified machines also has helped a lot.

Using any of those measures alone tends to block legitimate posters,
Can you find a legitimate post that was blocked by the 
sbl-xbl.spamhaus.org?  I have not heard of an error on that list yet.

From the reports that I have seen on the various e-mail forums, reverse 
DNS is now an RFC requirement for operating any server on the public 
internet.  Networks with no rDNS are demonstrating that they do not 
understand how to be properly connected to the internet and have proven 
to be a large source of problems.  The fastest way to get that problem 
fixed is to take AOL's approach and refuse all e-mail with no rDNS on it 
at all.

particularly those running their own mail server, which to my mind is a
greater harm than letting occasional spam go through.  Our purpose here
is to run a mailing list, not punish ISPs.  So we use all the things you
named as part of a weighted score.
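
A weighted score of the kind described in the paragraph above could be
sketched like this. The weights and threshold are invented for illustration
(the list's real values are not given), and the function names are
hypothetical. They are chosen so that no single test reaches the threshold on
its own, matching the point that any one measure alone blocks legitimate
posters.

```shell
#!/bin/sh
# Sketch of weighted spam scoring; all numbers are hypothetical.

W_DNSBL=${W_DNSBL:-3}       # sender is listed in a DNS blacklist
W_NO_RDNS=${W_NO_RDNS:-2}   # sender has no reverse DNS entry
THRESHOLD=${THRESHOLD:-4}   # score at or above this is treated as spam

# $1 = 1 if listed in a DNSBL, else 0; $2 = 1 if no reverse DNS, else 0
spam_score() {
    echo $(( $1 * $W_DNSBL + $2 * $W_NO_RDNS ))
}

# Succeed if the combined score reaches the threshold.
is_spam() {
    [ "$(spam_score "$1" "$2")" -ge "$THRESHOLD" ]
}
```

With these example numbers, a message that fails only one test is accepted,
while one that fails both is treated as spam.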
Actually, the result is that you are allowing the list recipients
to be punished by incompetent ISPs.

At some point, it is not worth attempting to try to find a potential 
real e-mail from a network that has allowed spammers to infest it by 
either neglect or by willful act.

If you can put a [SPAM?] tag on mail trapped by the following
algorithm, I would be surprised if any real postings