Bogus popularity claims for Sendmail

2001-01-13 Thread D. J. Bernstein

I've set up a web page to combat Sendmail Inc.'s false advertising on
this topic: http://cr.yp.to/surveys/sendmail.html

Sendmail dropped below 50% of the Internet's SMTP servers---including
idle workstations---last year; qmail has climbed past 10%. I suspect
that qmail now handles more Internet mail deliveries than Sendmail does,
although I don't know a good way to measure this.

---Dan



Re: Bogus popularity claims for Sendmail

2001-01-13 Thread Russell Nelson

D. J. Bernstein writes:
 > I've set up a web page to combat Sendmail Inc.'s false advertising on
 > this topic: http://cr.yp.to/surveys/sendmail.html
 > 
 > Sendmail dropped below 50% of the Internet's SMTP servers---including
 > idle workstations---last year; qmail has climbed past 10%. I suspect
 > that qmail now handles more Internet mail deliveries than Sendmail does,
 > although I don't know a good way to measure this.

The problem is getting the random sample.  You can't just count
servers, you have to count traffic.  And when you start to do that, it
becomes quite difficult to come up with a good random sample.  Best I
can think of is to do what the FBI does: arrange with some Internet
provider to put a traffic analyzer somewhere on their backbone, and
sniff for SMTP sessions.  Check the MTAs on both ends and give each
credit for handling an Internet mail delivery.
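
The counting rule above (one credit to the MTA on each end of a sniffed
session) can be sketched as follows. This is only an illustration: the
session list is made up, and it assumes some earlier step has already
identified the MTA on each end, which is the hard part.

```python
from collections import Counter

def credit_both_ends(sessions):
    """Give one credit to the sending MTA and one to the receiving MTA
    of every observed SMTP session."""
    credits = Counter()
    for sender_mta, receiver_mta in sessions:
        credits[sender_mta] += 1
        credits[receiver_mta] += 1
    return credits

# Hypothetical sample of (sender MTA, receiver MTA) pairs, not real data.
sessions = [
    ("qmail", "sendmail"),
    ("qmail", "qmail"),
    ("sendmail", "other"),
]

credits = credit_both_ends(sessions)
total = sum(credits.values())
for mta, count in credits.most_common():
    print(f"{mta}: {count} of {total} credits")
```

Each session contributes two credits, so the totals describe MTA
involvement in deliveries rather than the number of servers.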

You could examine a set of log files, but then how do you count them?
You can't count the MTA that produced the logs, because it appears in
every delivery and is completely non-random.  And yet, leaving it out
throws off your statistics.

I could, for example, get you the log files for Rediff.com.  They're
an Indian portal that probably handles 50% of all email in and out of
India.  From the smtpd and qmail log files you could contact each
sending and receiving site.  You could identify the MTA, and count
that as "an Internet mail delivery".

But that sample would be weighted towards personal email, and away
from workplace email.  That makes it much less random.

I could also get you the log files for two ISPs that send daily mail
to all of their customers.  But that weights the sample towards
people interested in that kind of mail.

-- 
-russ nelson <[EMAIL PROTECTED]>  http://russnelson.com | Government is the
Crynwr sells support for free software  | PGPok | fictitious entity by which
521 Pleasant Valley Rd. | +1 315 268 1925 voice | everyone seeks to live at
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   | everyone else's expense.



Re: Bogus popularity claims for Sendmail

2001-01-13 Thread Ricardo Cerqueira

On Sun, Jan 14, 2001 at 12:02:52AM -0500, Russell Nelson wrote:
> D. J. Bernstein writes:
>  > 
>  > Sendmail dropped below 50% of the Internet's SMTP servers---including
>  > idle workstations---last year; qmail has climbed past 10%. I suspect
>  > that qmail now handles more Internet mail deliveries than Sendmail does,
>  > although I don't know a good way to measure this.
> 
> But that sample would be weighted towards personal email, and away
> from workplace email.  That makes it much less random.
> 
> I could also get you the log files for two ISPs that send daily mail
> to all of their customers.  But that weights the sample towards
> people interested in that kind of mail.

Gathering server stats from this list would be biased, too.
I work at a large ISP (500k+ customers). I could also send you
a bunch of logfiles from residential customers, corporate customers,
and even the offices. But since most of the traffic is probably internal,
it would all be qmail talking to qmail.

RC

-- 
+---
| Ricardo Cerqueira  
| PGP Key fingerprint  -  B7 05 13 CE 48 0A BF 1E  87 21 83 DB 28 DE 03 42 
| Novis Telecom  -  Engenharia ISP / Rede Técnica 
| Pç. Duque Saldanha, 1, 7º E / 1050-094 Lisboa / Portugal
| Tel: +351 2 1010  - Fax: +351 2 1010 4459



Re: Bogus popularity claims for Sendmail

2001-01-14 Thread Jurjen Oskam

On 13 Jan 2001 22:16:34 -, "D. J. Bernstein" <[EMAIL PROTECTED]> wrote:

>I suspect
>that qmail now handles more Internet mail deliveries than Sendmail does,
>although I don't know a good way to measure this.

With this in mind, isn't it a great time to promote QMTP? For example,
by using the QMTP-enabled qmail-remote Russ made?


end
-- 
Jurjen Oskam * carnivore! * http://www.stupendous.org/ for PGP key
assassinate nuclear iraq clinton kill bomb USA eta ira cia fbi nsa kill
president wall street ruin economy disrupt phonenetwork atomic bomb sarin
nerve gas bin laden military -*- DVD Decryption at www.stupendous.org -*-



Re: Bogus popularity claims for Sendmail

2001-01-15 Thread Mark Delany

> On Sat, Jan 13, 2001 at 10:16:34PM -, D. J. Bernstein wrote:
> > I've set up a web page to combat Sendmail Inc.'s false advertising on
> > this topic: http://cr.yp.to/surveys/sendmail.html
> > 
> > Sendmail dropped below 50% of the Internet's SMTP servers---including
> > idle workstations---last year; qmail has climbed past 10%. I suspect
> > that qmail now handles more Internet mail deliveries than Sendmail does,
> > although I don't know a good way to measure this.
> 
> You could examine a set of log files, but then how do you count them?
> You can't count the MTA that sent and received the email because it's
> completely non-random.  And yet, that throws off your statistics.

I would totally exclude the server that generates the logs and just
use the 250 responses from the remote SMTP servers. Unless it's
someone like AOL, I don't think that ignoring the local system will
have much bearing on the stats.

I wouldn't bother chasing down the MX and then probing it. From the
perspective of Sendmail vs qmail vs the-rest, the queue-id responses
are distinct enough to tell apart with a few pattern matches.
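
A minimal sketch of that kind of pattern matching, in Python. The reply
shapes below are assumptions based on commonly seen acceptance lines
("250 ok <timestamp> qp <pid>" for qmail, "250 ... <queue-id> Message
accepted for delivery" for Sendmail); real servers vary with version and
configuration, so the patterns would need tuning against actual logs.

```python
import re
from collections import Counter

# Assumed reply formats; anything that matches neither is "other".
PATTERNS = [
    ("qmail", re.compile(r"^250 ok \d+ qp \d+")),
    ("sendmail",
     re.compile(r"^250 (?:\d\.\d\.\d )?\w+ Message accepted for delivery")),
]

def classify(reply):
    """Guess the remote MTA from its final 250 reply line."""
    for name, pattern in PATTERNS:
        if pattern.search(reply):
            return name
    return "other"

# Illustrative reply lines, not taken from any real log.
replies = [
    "250 ok 979424194 qp 12345",
    "250 2.0.0 f0DMGYx12345 Message accepted for delivery",
    "250 XAA12345 Message accepted for delivery",
    "250 Ok: queued as 1A2B3C4D5E",
]

counts = Counter(classify(reply) for reply in replies)
print(counts)
```

A site that customizes its acceptance message would land in "other", so
the tallies are best read as lower bounds for each MTA.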

The best server logs to look at are probably those that are running
diverse-interest mailing lists. ISP logs - regardless of whether they
are running qmail - are probably fine since we're not counting local
deliveries.


Regards.



Re: Bogus popularity claims for Sendmail

2001-01-16 Thread Gjermund Sorseth


   Mark Delany writes:

   > I would (...) just
   > use the 250 responses from the remote SMTP servers.
   >
   > I wouldn't bother chasing down the MX and then probing it, from the
   > perspective of Sendmail vs qmail vs the-rest, the queue-id responses
   > are sufficiently distinct with a few pattern matches.
   >
   > The best server logs to look at are probably those that are running
   > diverse-interest mailing lists. ISP logs - regardless of whether they
   > are running qmail - are probably fine since we're not counting local
   > deliveries.


Good idea. For fun, I decided to look at the logs from our server for
the last two weeks. The sample size comes to 3,016,454 messages
delivered to 62,786 different SMTP servers around the world.

Out of these 62,786 remote SMTP servers, 16,658 are running sendmail (27%)
and 5098 are running qmail (8%).

(The server providing these logs belongs to an ISP and includes a good
 mix of private, commercial, educational and government users. The remote
 servers are, I presume, mostly active servers at other ISPs, schools or
 businesses, with few `idle workstations'.)

-- 
Gjermund Sorseth



Re: Bogus popularity claims for Sendmail

2001-01-16 Thread Gjermund Sorseth


  > Out of these 62,786 remote SMTP servers, 16,658 are running sendmail (27%)
  > and 5098 are running qmail (8%).


Perhaps it is also interesting to look at how many of the messages
were delivered to what type of server.

Out of the 3,016,454 messages in the sample, 484,010 were delivered
to servers running sendmail (16%) and 313,195 to servers running
qmail (11%).

This shifts the numbers in favor of qmail, which suggests that
large sites prefer to run qmail rather than sendmail.
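
For reference, the two ways of counting can be worked through from the
figures quoted in this thread. The sketch below uses only those figures;
the variable names are made up for illustration. Computed this way, the
qmail message share comes to about 10.4%, slightly under the rounded 11%
given above.

```python
# Figures quoted in this thread for the two-week sample.
servers_total = 62_786
messages_total = 3_016_454
servers = {"sendmail": 16_658, "qmail": 5_098}
messages = {"sendmail": 484_010, "qmail": 313_195}

def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

for mta in ("sendmail", "qmail"):
    server_share = share(servers[mta], servers_total)
    message_share = share(messages[mta], messages_total)
    per_server = messages[mta] / servers[mta]
    print(f"{mta}: {server_share:.1f}% of servers, "
          f"{message_share:.1f}% of messages, "
          f"{per_server:.1f} messages per server")

# Average over all remote servers in the sample, for comparison.
print(f"overall: {messages_total / servers_total:.1f} messages per server")
```

On these numbers the average qmail server received roughly twice as many
messages as the average sendmail server, which is what moves the
per-message shares toward qmail.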

-- 
Gjermund Sorseth



Re: Bogus popularity claims for Sendmail

2001-01-16 Thread Mark Delany

Excellent stats, Gjermund.

Are your scripts suitable for general use? Can they be easily modified
to identify some of the missing 65% and 73% respectively?

Your latter numbers are more useful. IMO, 100 machines running sendmail
and accepting 1 email each "handle" less traffic than 1 machine running
qmail and accepting 101 emails.


Regards.

 
On Tue, Jan 16, 2001 at 12:01:05PM +0100, Gjermund Sorseth wrote:
> 
>   > Out of these 62,786 remote SMTP servers, 16,658 are running sendmail (27%)
>   > and 5098 are running qmail (8%).
> 
> 
> Perhaps it is also interesting to look at how many of the messages
> were delivered to what type of server.
> 
> Out of the 3,016,454 messages in the sample, 484,010 were delivered
> to servers running sendmail (16%) and 313,195 to servers running
> qmail (11%).
> 
> This shifts the numbers in favor of qmail, which suggests that
> large sites prefer to run qmail rather than sendmail.
> 
> -- 
> Gjermund Sorseth



Re: Bogus popularity claims for Sendmail

2001-01-17 Thread D. J. Bernstein

Russell Nelson writes:
> arrange with some Internet
> provider to put a traffic analyzer somewhere on their backbone,

There's a huge amount of mail that doesn't cross any backbones.

There's also a huge amount of mail that isn't sent by ISP mail servers: 
for example, deliveries from dedicated ezmlm machines.

Furthermore, every ISP is different. An ISP with more experienced users
will have more communications with UNIX machines.

---Dan



Re: Bogus popularity claims for Sendmail

2001-01-18 Thread Russell Nelson

D. J. Bernstein writes:
 > Russell Nelson writes:
 > > arrange with some Internet
 > > provider to put a traffic analyzer somewhere on their backbone,
 > 
 > There's a huge amount of mail that doesn't cross any backbones.

Can that mail truly be called "Internet" mail?

 > There's also a huge amount of mail that isn't sent by ISP mail servers: 
 > for example, deliveries from dedicated ezmlm machines.

I don't think anybody is running vanilla ezmlm if they have more than
one list.  Ezmlm doesn't account for bounces across lists.  Instead,
if a user is subscribed to N lists, ezmlm has to run through its
bounce algorithm N times.

But in any case if you want a random sample of email that crosses the
Internet, a reasonable way to do it is to randomly sample the email
that crosses the Internet.

Gee, maybe we could get that information via FOIA from the FBI's
Carnivore records?  :)

 > Furthermore, every ISP is different. An ISP with more experienced users
 > will have more communications with UNIX machines.

I think that's lost in the noise.  Look at the Unix machines that send
out millions of messages per day, e.g. colonize.com, rediffmail.com,
egroups.com, nbci.com and matchlogic.com.  All of these are Unix
machines running qmail, but they all send mail to as many newbies as
experienced users.

-- 
-russ nelson <[EMAIL PROTECTED]>  http://russnelson.com | Government is the
Crynwr sells support for free software  | PGPok | fictitious entity by which
521 Pleasant Valley Rd. | +1 315 268 1925 voice | everyone seeks to live at
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   | everyone else's expense.