Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-27 Thread Greg Cope

Russell Nelson wrote:
 
 
 The problem, simply enough, is that you should try very, very hard not
 to have a separate copy of the email on the disk.  If you're running
 qmail-inject on each message, then yes, three machines aren't going to
 be enough.  On the other hand, three machines of the type you describe
 below will be sufficient to deliver one million emails in about eight
 hours, IF you're doing the mail merge function at delivery time.
 
 You can do that using the qmail-verh patch, you could call
 qmail-remote directly (in theory; I don't know that anyone is doing
 that), or you could purchase my qmail-merge system.  It lets you
 substitute multiple fields into each message.  So you could substitute
 in a first name, a last name, a database ID number, or whatever else
 you want.  Handles bounces, and runs everything through the database.
 Details upon request.
 

Russ,

I emailed you off list a few days ago about your qmail-merge system, but
have not had a reply yet. Did you get it? Could you please contact me off
list?

Apologies to the rest of the list for a gratuitous waste of bandwidth.

Thanks

Greg

 
 --
 -russ nelson [EMAIL PROTECTED]  http://russnelson.com
 Crynwr sells support for free software  | PGPok |
 521 Pleasant Valley Rd. | +1 315 268 1925 voice | #exclude windows.h
 Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   |





Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-23 Thread manav

Hi Brett,

Thanks for the reply.

I am exploring ezmlm right now, so I believe I'd have to trouble the people
on the ezmlm mailing list for queries on that :-)

For tracking forwarded emails, I have a hidden IMG tag which then calls a
servlet. When the user opens the email for the first time, the hit is
registered and a cookie is written. Subsequent email reads by the same
user can now be tracked. When the servlet finds the cookie is not there,
either the cookies were deleted or the user forwarded the email. I don't
think I can make use of any combination of HTTP headers to establish
uniqueness of the recipient (or if there is, please let me know).
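
The cookie logic described above could be sketched roughly as follows. This is only an illustration in Python rather than the servlet itself; the function name and the three labels are hypothetical, and the last case shows why the method cannot tell a forwarded message from a deleted cookie.

```python
def classify_open(has_cookie, seen_before):
    """Classify one hit on the hidden image for a given message ID.

    has_cookie:  the request carried the cookie written on the first open
    seen_before: a hit for this message ID has already been registered
    """
    if not seen_before:
        return "first-open"            # register the hit and write the cookie
    if has_cookie:
        return "repeat-open"           # same browser reading the mail again
    return "forwarded-or-cookie-lost"  # these two cases look identical


for args in [(False, False), (True, True), (False, True)]:
    print(args, "->", classify_open(*args))
```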

Once again, if this discussion offends anyone on the list, I apologize (and
would be glad to carry the same offlist).

Thanks,
Manav.

Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread manav


Hi,

I have been using qmail for the last year and a half and have been closely
following the mailing list at securepoint, and didn't find anything related
to my query, hence I took the liberty of posting it.

The objective is to build a high-volume server capable of doing mail-merged
email blasts to several lists with 10,000 to 1,000,000 users, provide
detailed reports about the status of emails (sent, bounced, bad email
addresses, opened, forwarded), list management (across multiple lists for
each user) and of course, stability.

Over the last 12 months, we explored several options - and finally
settled on qmail (what else?). I am using a Pentium III running Red Hat Linux
6.2, with 512 MB of RAM, a 20 GB HDD and JDK 1.2.2, connected
to a 128 Kbps line.

Following are the topics on which I need your comments/suggestions:-

1. Earlier we used to Runtime.exec() qmail-inject and manually give it the
messages. This way, qmail would go on and do the delivery. We would then
parse the log files to find the status of the message.
1.1 We had a unique from address for each blast for each user to
uniquely identify each email sent (in maillog). Sometimes, instead of
logging the From address, the maillog would have the replyto address.
Any ideas why? Is there anything else that can be used to uniquely identify
a message?
1.2 For each blast we want to handle the bounced emails individually (we
would need to update the appropriate table). What do we do for that? We
cannot just set environment variables since there will be multiple
mail-merges and blasts happening simultaneously.
1.3 Usually after about 5,000 deliveries, the messages would be stuck in
the queue. We then added the CNAME lookup patch, and this increased to about
10,000. Currently, we prune the lists uploaded by the users and send
messages in chunks of 2000, with less than 30 concurrent messages. Any
suggestions what could be the culprit? What can we do to circumvent this
problem?
1.4 What would be the best possible way to handle unsubscribe requests?
Currently we invoke a java program from the .qmail file that updates the
database. Any suggestions how this can be improved upon?

2. We then decided to switch over to using qmail-remote, to circumvent the
queue and the logging problem. This effectively means we will have to do our
own logging. Is there any way to hand over different messages to qmail-remote
rather than invoking it for each message? We have now decided to change the
implementation so that at any point of time, there will be as many threads
sending messages as the qmail concurrency (say around 100), and the messages
themselves will be broken into chunks of 300 to 500 each. How can we improve
this?

3. Currently, we have our own implementation for checking bad e-mail
addresses, list management, handling bounces and mail-merge. Are there any
guidelines/sample code available (any language), that we can look at?

4 . What other things should we keep in mind to provide stability to the
system? What patches to qmail are advisable to be installed? What should be
the typical server configuration for such a system?

5. On a parallel note, what would be the best algorithm to track forwarded
messages? We make use of cookies right now (but that provides 50% accuracy).

I apologize if I broke some protocol and asked some questions that do not
pertain to this list.

Regards,
manav.




Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread Russell Nelson

manav writes:
  I have been using qmail for the last year and a half and have been closely
  following the mailing list at securepoint, and didn't find anything related
  to my query, hence I took the liberty of posting it.
  
  The objective is to build a high-volume server capable of doing mail-merged
  email blasts to several lists with 10,000 to 1,000,000 users, provide
  detailed reports about the status of emails (sent, bounced, bad email
  addresses, opened, forwarded), list management (across multiple lists for
  each user) and of course, stability.
  
  Over the last 12 months, we explored several options - and finally
  settled on qmail (what else?). I am using a Pentium III running Red Hat Linux
  6.2, with 512 MB of RAM, a 20 GB HDD and JDK 1.2.2, connected
  to a 128 Kbps line.

128Kbps?  Surely you mean Mbps.  If that's all the bandwidth you can
afford at your location, you should rent a server at a colocation site
in the US.  Use your server to create and distribute batches of
recipients to a server running qmail-qmqps configured with the
qmail-verh and big-concurrency patches.

Let's say that you're sending a 2K message.  Sent to 1,000,000 users,
that's 2,000,000,000 bytes.  Assuming that you're using qmail-verh (to
merge on the fly), that your system doesn't limit your sending (and if
you've got an IDE disk, it will), and assuming 20% overhead (tcp/ip
packet headers, smtp dialogue, message retries), this blast will take
150,000 seconds to clear your server over a 128Kbps line.  That's 42 hours, minimum.
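
The arithmetic behind that estimate can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes that "2K" means 2,000 bytes (which matches the 2,000,000,000-byte total) and that 128 Kbps means 128,000 bits per second.

```python
def blast_seconds(message_bytes, recipients, line_bps, overhead=0.20):
    """Seconds needed to push one blast through a line of line_bps bits/second."""
    payload_bits = message_bytes * recipients * 8
    return payload_bits * (1 + overhead) / line_bps


# Figures from the paragraph above: 2K message, 1,000,000 users, 20% overhead.
seconds = blast_seconds(2_000, 1_000_000, 128_000)
print(f"{seconds:,.0f} seconds = {seconds / 3600:.1f} hours")
```

Changing line_bps to 128_000_000 shows how quickly the line stops being the limiting factor once the bandwidth is measured in Mbps.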

-- 
-russ nelson [EMAIL PROTECTED]  http://russnelson.com
Crynwr sells support for free software  | PGPok | 
521 Pleasant Valley Rd. | +1 315 268 1925 voice | #exclude windows.h
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   | 



Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread Brett Randall

Hi Manav. For most of this, one word: ezmlm (www.ezmlm.org). For the
rest...

 manav == manav  [EMAIL PROTECTED] writes:

 1.2 For each blast we want to handle the bounced emails individually (we
 would need to update the appropriate table). What do we do for that? We
 cannot just set environment variables since there will be multiple
 mail-merges and blasts happening simultaneously.

Mailing list is the word I think you are after. See above...

 1.3 Usually after about 5,000 deliveries, the messages would be stuck in
 the queue. We then added the CNAME lookup patch, and this increased to about
 10,000. Currently, we prune the lists uploaded by the users and send
 messages in chunks of 2000, with less than 30 concurrent messages. Any
 suggestions what could be the culprit? What can we do to circumvent this
 problem?

The only reason I can see why you would want to do this would be if
you are customising the message for each individual user. If you
are... you will probably want a bit more processing power (ie: more
servers) than this. It is well known that qmail doesn't really enjoy
having 10,000+ e-mails in the queue...

 1.4 What would be the best possible way to handle unsubscribe requests?
 Currently we invoke a java program from the .qmail file that updates the
 database. Any suggestions how this can be improved upon?

Ezmlm

 2. We then decided to switch over to using qmail-remote, to circumvent the
 queue and the logging problem. This effectively means we will have to do our
 own logging. Is there any way to hand over different messages to qmail-remote
 rather than invoking it for each message? We have now decided to change the
 implementation so that at any point of time, there will be as many threads
 sending messages as the qmail concurrency (say around 100), and the messages
 themselves will be broken into chunks of 300 to 500 each. How can we improve
 this?

Ezmlm looks after all of this for you. It is probably easier to hack
up ezmlm-idx to customise messages, than to make your own do
everything that ezmlm does.

 3. Currently, we have our own implementation for checking bad e-mail
 addresses, list management, handling bounces and mail-merge. Are
 there any guidelines/sample code available (any language), that we
 can look at?

Ezmlm...

 4 . What other things should we keep in mind to provide stability to
 the system? What patches to qmail are advisable to be installed?
 What should be the typical server configuration for such a system?

If you are customising messages, you definitely need parallel
processing or clustering. Also, that 128kb line is a MAJOR
bottleneck...

Oh, and RedHat 6.2 is not the best server distribution. I use it on a
number of my servers, but am moving them to Mandrake (for now) until I
find the time to investigate other alternatives such as Turbo Linux
and Debian. Mandrake can be made to work a lot better for you than
RedHat, and so far 8.0 has MUCH fewer bugs in the components than most
RedHat versions...

 5. On a parallel note, what would be the best algorithm to track
 forwarded messages? We make use of cookies right now (but that
 provides 50% accuracy).

We use a blank 1x1 pixel GIF in our e-mails that is like:
<img src="http://my.server.com/cgi-bin/emailcount.pl?2001-06-22-Email-1" width=1 height=1>

That perl script then does whatever it has to (it logs the relevant
data to a file, and increases the count in another file) and then
returns a 1x1 pixel GIF, using the GD library, from
memory... Obviously this requires an HTML e-mail to be going out, but
if you're using cookies then you are obviously already there!

By the way, the parameter on the perl script (?2001-06-blah) is so
that we can use the same script for each e-mail that goes out, and
just change the parameter so that we can count for different
mailouts. On that note, Hotmail doesn't allow the forwarding of HTML
e-mail. I don't know about the other major free e-mail providers.
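
A rough Python sketch of what such a counting script does is shown below. It is not the Perl script described above: the function name, the in-memory counter and the list standing in for the log file are all hypothetical, and the GIF is a fixed 1x1 image held as bytes instead of one generated with the GD library.

```python
import base64
from collections import Counter

# A 1x1 GIF kept in memory, so no image library is needed.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_hit(query_string, counts, log):
    """Record one request for the image and return the HTTP response parts."""
    mailout = query_string or "unknown"   # e.g. "2001-06-22-Email-1"
    counts[mailout] += 1                  # per-mailout counter
    log.append(mailout)                   # stand-in for the log file
    headers = [
        ("Content-Type", "image/gif"),
        ("Content-Length", str(len(PIXEL))),
        ("Cache-Control", "no-cache"),    # ask clients to fetch it every time
    ]
    return "200 OK", headers, PIXEL


counts, log = Counter(), []
status, headers, body = handle_hit("2001-06-22-Email-1", counts, log)
print(status, dict(counts))
```

Because the mailout name travels in the query string, the same handler serves every mailing, as described above.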

HTH

Brett.
-- 
Smash forehead on keyboard to continue



Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread Mike Jackson

manav wrote:

 The objective is to build a high-volume server capable of doing mail-merged
 email blasts to several lists with 10,000 to 1,000,000 users, provide
 detailed reports about the status of emails (sent, bounced, bad email
 addresses, opened, forwarded), list management (across multiple lists for
 each user) and of course, stability.
 
 Over the last 12 months, we explored several options - and finally
 settled on qmail (what else?). I am using a Pentium III running Red Hat Linux
 6.2, with 512 MB of RAM, a 20 GB HDD and JDK 1.2.2, connected
 to a 128 Kbps line.
 

Before you go any further, get a real pipe. Why do people insist that
their Volkswagen Beetle is capable of keeping up with a Ferrari on the
autobahn? The volume of messages that you are trying to send is nothing
short of ridiculous with a 128Kbps line.

--
Mike



Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread manav

Hi Mike, Russ,

I really appreciate that you took the time to reply. Thanks.

Yes, I do have three of my production servers co-located with an ISP in the
US that promises unlimited bandwidth, with a 99.9% uptime. All these
production boxes have a SCSI Disk with hardware alarms to indicate any
malfunction, and 1 GB of RAM. I have a crude load balancing algorithm that
ensures the load is shared across these boxes.

We are running the alpha phase right now (with whatever current
implementations we have), and I have serious doubts about the stability and
scalability of the system. The maximum load that I've put on my production
boxes is 250,000 emails so far and I've had similar issues that I mentioned
on my development boxes (the ones that resemble a Beetle, to quote Mike
:-) ).

Before I move anything to production, I test it on the local (Indian)
servers. These issues appear at both places.

Thanks once again for your responses.

Manav.




Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread Russell Nelson

manav writes:
  I really appreciate that you took the time to reply. Thanks.

And not flame you?  :-) Not everybody on the list is a flamer, and
besides you supplied us with all the necessary information.  You *did*
confuse us by mentioning 10 lakh recipients and 128kbps in the same
paragraph, but that's really no matter.  The real problem is injecting 
bulk email using separate messages.

  We are running the alpha phase right now (with whatever current
  implementations we have), and I have serious doubts about the stability and
  scalability of the system. The maximum load that I've put on my production
  boxes is 250,000 emails so far and I've had similar issues that I mentioned
  on my development boxes (the ones that resemble a Beetle, to quote Mike
  :-) ).

The problem, simply enough, is that you should try very, very hard not 
to have a separate copy of the email on the disk.  If you're running
qmail-inject on each message, then yes, three machines aren't going to 
be enough.  On the other hand, three machines of the type you describe 
below will be sufficient to deliver one million emails in about eight
hours, IF you're doing the mail merge function at delivery time.

You can do that using the qmail-verh patch, you could call
qmail-remote directly (in theory; I don't know that anyone is doing
that), or you could purchase my qmail-merge system.  It lets you
substitute multiple fields into each message.  So you could substitute
in a first name, a last name, a database ID number, or whatever else
you want.  Handles bounces, and runs everything through the database.
Details upon request.
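
To make "merge at delivery time" concrete, here is a small Python sketch. It is not how qmail-verh or qmail-merge work internally; the template text, field names and addresses are invented for illustration. The point is only that one shared template is stored and each recipient's copy is produced at the moment it is sent.

```python
from string import Template

# One shared message; only the per-recipient fields differ.
TEMPLATE = Template(
    "To: $email\n"
    "Subject: Hello $first_name\n"
    "\n"
    "Dear $first_name $last_name, your customer ID is $db_id.\n"
)

# Hypothetical rows, as they might come out of the subscriber database.
recipients = [
    {"email": "a@example.com", "first_name": "Asha", "last_name": "Rao", "db_id": 17},
    {"email": "b@example.com", "first_name": "Bob", "last_name": "Lee", "db_id": 42},
]


def merged_messages(template, rows):
    """Yield merged messages one at a time; no per-recipient copy is stored."""
    for fields in rows:
        yield template.substitute(fields)


messages = list(merged_messages(TEMPLATE, recipients))
print(messages[0])
```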

Dealing with bounces is a whole 'nother headache.  You see, there are
three types of email bounces: 4XX bounces, which are known to be
temporary.  A retry is definitely called for, and qmail will handle
that on its own.  You also get a 5XX bounce, where the smtp server has
told your smtp client that the email will never be deliverable.  These
get handled by parsing the QSBMF message.  And you can also get a
delivered but returned message.  VERP is your friend here, because
parsing bounce messages is a task only attempted by lunatics.
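
As an illustration of why VERP removes the need to parse bounce text: the recipient is encoded in the envelope sender, so whatever address a bounce comes back to already says who it was for. The sketch below uses the common "user=domain" encoding; the function names, list name and hosts are hypothetical.

```python
def verp_encode(list_local, list_domain, recipient):
    """Build an envelope sender that embeds the recipient address."""
    user, _, domain = recipient.rpartition("@")
    return f"{list_local}-{user}={domain}@{list_domain}"


def verp_decode(list_local, envelope_sender):
    """Recover the recipient from the address a bounce was delivered to."""
    local, _, _ = envelope_sender.rpartition("@")
    prefix = list_local + "-"
    if not local.startswith(prefix):
        return None                      # not one of this list's return paths
    user, sep, domain = local[len(prefix):].rpartition("=")
    if not sep:
        return None                      # no encoded recipient present
    return f"{user}@{domain}"


sender = verp_encode("blast42", "lists.example.org", "joe@example.com")
print(sender)
print(verp_decode("blast42", sender))
```

Using a different list_local per blast also gives each mailing its own return path, which keeps simultaneous blasts apart.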

Even then, you can't treat a 5XX or returned message as a permanent
failure.  You have to have a system for retrying these messages at a
later time.
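
One possible shape for such a retry system is sketched below in Python. The class, its method names and the threshold of three failures are assumptions made for illustration, not anything taken from qmail or ezmlm.

```python
from collections import defaultdict

MAX_FAILURES = 3   # assumed threshold, not a figure from this thread


def classify(smtp_code):
    """Map an SMTP reply code to the bounce classes described above."""
    if 400 <= smtp_code < 500:
        return "temporary"   # qmail retries these on its own
    if 500 <= smtp_code < 600:
        return "permanent"   # arrives as a QSBMF bounce
    return "other"


class BounceTracker:
    """Count 5XX and returned-mail failures per address across mailings."""

    def __init__(self, max_failures=MAX_FAILURES):
        self.max_failures = max_failures
        self.failures = defaultdict(int)

    def record_failure(self, address):
        self.failures[address] += 1

    def record_success(self, address):
        self.failures.pop(address, None)   # a good delivery resets the count

    def should_send(self, address):
        # Keep trying until the address has failed max_failures times in a row.
        return self.failures.get(address, 0) < self.max_failures


tracker = BounceTracker()
tracker.record_failure("joe@example.com")
print(classify(550), tracker.should_send("joe@example.com"))
```

Keeping one tracker for all mailings, rather than one per list, is what lets bounce information be shared between lists.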

As someone else pointed out, ezmlm handles this nicely.
Unfortunately, ezmlm doesn't work well when you've got users
subscribed to more than one type of mailing, because it doesn't share
bounce information between lists.

-- 
-russ nelson [EMAIL PROTECTED]  http://russnelson.com
Crynwr sells support for free software  | PGPok | 
521 Pleasant Valley Rd. | +1 315 268 1925 voice | #exclude windows.h
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   | 



Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread Mike Jackson

manav wrote:
 
 Hi Mike, Russ,

Hi !

 
 We are running the alpha phase right now (with whatever current
 implementations we have), and I have serious doubts about the stability and
 scalability of the system. The maximum load that I've put on my production
 boxes is 250,000 emails so far and I've had similar issues that I mentioned
 on my development boxes (the ones that resemble a Beetle, to quote Mike
 :-) ).

Just as an example of the speed of qmail and ezmlm:

Machine: 1U rackmount cheapo 600Mhz Celeron, 128MB RAM, 18GB hard disk
OS: NetBSD 1.5
MTA: Qmail 1.03 with only the verh patch
List Manager: Ezmlm 0.53 with idx 0.40
remoteconcurrency: 120

Here are some stats from the first large mailing with this server. As
you can see, within 15 minutes most of the deliveries were completed.
The only kernel tuning I did was to raise the max processes to 256 and
max open files per process to 512. The numbers look a little off since
there are a few old messages still going through, mostly mail servers
that were previously unreachable.

12.45.21    message sent to 4773 addresses

12.50.00    1738 deliveries
            1924 attempts
            1761 successes
             187 failures

12.55.00    1775 deliveries
            1937 attempts
            1779 successes
             166 failures

13.00.00     423 deliveries
             455 attempts
             433 successes
              32 failures

13.05.00      13 deliveries
              14 attempts

13.10.00       2 deliveries
               2 attempts
---------------------------
Total       3951 deliveries
            4332 attempts
 With the large concurrency patch, this throughput could be increased
significantly. I will put it into use if I get a requirement to send to
at least 10,000 addresses.

 Using qmail-ldap and qmqp with a frontend master server and several
slave servers, you can distribute the load among several servers very
easily. For example, if you have 4 slave servers then use a unique
mailhost attribute for each quarter of your subscriber base. The
scalability of qmail-ldap is almost limitless, I think. The master
server will hand the messages to the slave servers via qmqp
faster than you can even dream of. For more info, see the qmail-ldap
homepage at www.nrg4u.com.
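
Splitting the subscriber base across the slave servers could be done along these lines. This Python sketch only decides which mailhost value each address would get; it does not touch LDAP, the host names are hypothetical, and hashing the address is one assumed way of getting roughly equal quarters.

```python
import zlib


def assign_mailhost(address, hosts):
    """Pick a slave server for an address using a stable hash of the address."""
    return hosts[zlib.crc32(address.lower().encode("utf-8")) % len(hosts)]


def partition(addresses, hosts):
    """Group addresses by the host that would deliver them."""
    batches = {host: [] for host in hosts}
    for address in addresses:
        batches[assign_mailhost(address, hosts)].append(address)
    return batches


hosts = ["slave1.example.org", "slave2.example.org",
         "slave3.example.org", "slave4.example.org"]
addresses = [f"user{i}@example.com" for i in range(1000)]
batches = partition(addresses, hosts)
print({host: len(batch) for host, batch in batches.items()})
```

A stable hash means an address always maps to the same server, so the value only has to be computed once per subscriber.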

Regards,
Mike



Re: Java and Qmail - building a large mailmerge server - plain text version

2001-06-22 Thread Karsten W. Rohrbach

manav([EMAIL PROTECTED])@2001.06.22 21:17:26 +:
 Yes, I do have three of my production servers co-located with an ISP in the
 US that promises unlimited bandwidth, with a 99.9% uptime. All these

wow, daring. my contract with my isp ensures 100mbit/fdx ethernet with
99.87something% availability -- unlimited bandwidth seems a little bit
high to me
;-)

/k

-- 
 MCSE: Management Can't Send E-mail
KR433/KR11-RIPE -- WebMonster Community Founder -- nGENn GmbH Senior Techie
http://www.webmonster.de/ -- ftp://ftp.webmonster.de/ -- http://www.ngenn.net/
karstenrohrbach.de -- alphangenn.net -- alphascene.org -- [EMAIL PROTECTED]
GnuPG 0x2964BF46 2001-03-15 42F9 9FFF 50D4 2F38 DBEE  DF22 3340 4F4E 2964 BF46
Please do not remove my address from To: and Cc: fields in mailing lists. 10x
