Re: Java and Qmail - building a large mailmerge server - plain text version
Russell Nelson wrote:

The problem, simply enough, is that you should try very, very hard not to have a separate copy of the email on the disk. If you're running qmail-inject on each message, then yes, three machines aren't going to be enough. On the other hand, three machines of the type you describe below will be sufficient to deliver one million emails in about eight hours, IF you're doing the mail merge function at delivery time. You can do that using the qmail-verh patch, you could call qmail-remote directly (in theory; I don't know that anyone is doing that), or you could purchase my qmail-merge system. It lets you substitute multiple fields into each message. So you could substitute in a first name, a last name, a database ID number, or whatever else you want. Handles bounces, and runs everything through the database. Details upon request.

Russ, I emailed you off list a few days ago about your qmail-merge system, but have had no reply as yet. Did you get it? Can you please contact me off list?

Apologies to the rest of the list for a gratuitous waste of bandwidth.

Thanks,
Greg

--
-russ nelson [EMAIL PROTECTED] http://russnelson.com
Crynwr sells support for free software | PGPok |
521 Pleasant Valley Rd. | +1 315 268 1925 voice | #exclude windows.h
Potsdam, NY 13676-3213 | +1 315 268 9201 FAX |
Re: Java and Qmail - building a large mailmerge server - plain text version
Hi Brett,

Thanks for the reply. I am exploring ezmlm right now, so I believe I'd have to trouble the people on the ezmlm mailing list for queries on that :-)

For tracking forwarded emails, I have a hidden IMG tag which then calls a servlet. When the user opens the email for the first time, the hit is registered and a cookie is written. Subsequent email reads by the same user can now be tracked. When the servlet finds the cookie is not there, either the cookies were deleted or the user forwarded the email. I don't think I can make use of any combination of HTTP headers to establish uniqueness of the recipient (or if there is, please let me know).

Once again, if this discussion offends anyone on the list, I apologize (and would be glad to carry the same offlist).

Thanks,
Manav.

- Original Message -
From: Brett Randall [EMAIL PROTECTED]
To: manav [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Saturday, June 23, 2001 6:36 AM
Subject: Re: Java and Qmail - building a large mailmerge server - plain text version

Hi Manav. For most of this, one word: ezmlm (www.ezmlm.org). For the rest...

manav == manav [EMAIL PROTECTED] writes:

1.2 For each blast we want to handle the bounced emails individually (we would need to update the appropriate table). What do we do for that? We cannot just set environment variables since there will be multiple mail-merges and blasts happening simultaneously.

Mailing list is the word I think you are after. See above...

1.3 Usually after about 5,000 deliveries, the messages would be stuck in the queue. We then added the CNAME lookup patch, and this increased to about 10,000. Currently, we prune the lists uploaded by the users and send messages in chunks of 2000, with less than 30 concurrent messages. Any suggestions what could be the culprit? What can we do to circumvent this problem?

The only reason I can see why you would want to do this would be if you are customising the message for each individual user. If you are... you will probably want a bit more processing power (i.e. more servers) than this. It is well known that qmail doesn't really enjoy having 10,000+ e-mails in the queue...

1.4 What would be the best possible way to handle unsubscribe requests. Currently we invoke a java program from the .qmail file that updates the database. Any suggestions how this can be improved upon?

Ezmlm

2. We then decided to switch over to using qmail-remote, to circumvent the queue and the logging problem. This effectively means we will have to do our own logging. Is there any way to hand over different messages to qmail-remote rather than invoking it for each message? We have now decided to change the implementation so that at any point of time, there will be as many threads sending messages as the qmail concurrency (say around 100), and the messages themselves will be broken into chunks of 300 to 500 each. How can we improve this?

Ezmlm looks after all of this for you. It is probably easier to hack up ezmlm-idx to customise messages than to make your own do everything that ezmlm does.

3. Currently, we have our own implementation for checking bad e-mail addresses, list management, handling bounces and mail-merge. Are there any guidelines/sample code available (any language), that we can look at?

Ezmlm...

4. What other things should we keep in mind to provide stability to the system? What patches to qmail are advisable to be installed? What should be the typical server configuration for such a system?

If you are customising messages, you definitely need parallel processing or clustering. Also, that 128kb line is a MAJOR bottleneck... Oh, and RedHat 6.2 is not the best server distribution. I use it on a number of my servers, but am moving them to Mandrake (for now) until I find the time to investigate other alternatives such as Turbo Linux and Debian. Mandrake can be made to work a lot better for you than RedHat, and so far 8.0 has MUCH fewer bugs in the components than most RedHat versions...

5. On a parallel note, what would be the best algorithm to track forwarded messages? We make use of cookies right now (but that provides 50% accuracy).

We use a blank 1x1 pixel gif in our e-mails that is like:

<img src="http://my.server.com/cgi-bin/emailcount.pl?2001-06-22-Email-1" width=1 height=1>

That perl script then does whatever it has to (it logs the relevant data to a file, and increases the count in another file) and then returns a 1x1 pixel GIF, using the GD library, from memory... Obviously this requires an HTML e-mail to be going out, but if you're using cookies then you are obviously already there! By the way, the parameter on the perl script (?2001-06-blah) is so that we can use the same script for each e-mail that goes out, and just change the parameter so that we can count for different mailouts. On that note, Hotmail doesn't allow the forwarding of HTML e
Java and Qmail - building a large mailmerge server - plain text version
Hi,

I have been using qmail for the last year and a half and have been closely following the mailing list at securepoint, and didn't find anything related to my query, hence I took the liberty of posting it.

The objective is to build a high-volume server capable of doing mail-merged email blasts to several lists with 10,000 to 1,000,000 users, provide detailed reports about the status of emails (sent, bounced, bad email addresses, opened, forwarded), list management (across multiple lists for each user) and of course, stability. Over the period of last 12 months, we explored several options - and finally settled on qmail (what else?). I am using a Pentium III with Linux Redhat 6.2 installed on it, with 512 MB of RAM, 20 GB HDD and JDK 1.2.2 connected to a 128 Kbps line.

Following are the topics on which I need your comments/suggestions:

1. Earlier we used to Runtime.exec() qmail-inject and manually give it the messages. This way, qmail would go on and do the delivery. We would then parse the log files to find the status of the message.

1.1 We had a unique from address for each blast for each user to uniquely identify each email sent (in maillog). Sometimes, instead of logging the From address, the maillog would have the replyto address. Any ideas why? Is there anything else that can be used to uniquely identify a message?

1.2 For each blast we want to handle the bounced emails individually (we would need to update the appropriate table). What do we do for that? We cannot just set environment variables since there will be multiple mail-merges and blasts happening simultaneously.

1.3 Usually after about 5,000 deliveries, the messages would be stuck in the queue. We then added the CNAME lookup patch, and this increased to about 10,000. Currently, we prune the lists uploaded by the users and send messages in chunks of 2000, with less than 30 concurrent messages. Any suggestions what could be the culprit? What can we do to circumvent this problem?

1.4 What would be the best possible way to handle unsubscribe requests. Currently we invoke a java program from the .qmail file that updates the database. Any suggestions how this can be improved upon?

2. We then decided to switch over to using qmail-remote, to circumvent the queue and the logging problem. This effectively means we will have to do our own logging. Is there any way to hand over different messages to qmail-remote rather than invoking it for each message? We have now decided to change the implementation so that at any point of time, there will be as many threads sending messages as the qmail concurrency (say around 100), and the messages themselves will be broken into chunks of 300 to 500 each. How can we improve this?

3. Currently, we have our own implementation for checking bad e-mail addresses, list management, handling bounces and mail-merge. Are there any guidelines/sample code available (any language), that we can look at?

4. What other things should we keep in mind to provide stability to the system? What patches to qmail are advisable to be installed? What should be the typical server configuration for such a system?

5. On a parallel note, what would be the best algorithm to track forwarded messages? We make use of cookies right now (but that provides 50% accuracy).

I apologize if I broke some protocol and asked some questions that do not pertain to this list.

Regards,
manav.
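
For readers trying to picture the scheme in point 2 (a fixed pool of sender threads working through chunks of a few hundred recipients), here is a minimal sketch in Python. It is not the poster's Java code: `send_chunk` is a hypothetical stand-in that only counts recipients, and the chunk size and worker count are simply the figures quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_chunk(chunk):
    """Hypothetical stand-in for the real delivery step; it only reports the chunk size."""
    return len(chunk)


def run_blast(recipients, chunk_size=500, workers=100):
    """Hand chunks to a bounded thread pool and return how many recipients were processed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_chunk, chunked(recipients, chunk_size)))


if __name__ == "__main__":
    demo = [f"user{i}@example.com" for i in range(1234)]
    print(run_blast(demo, chunk_size=500, workers=4))
```

The pool size caps how many chunks are in flight at once, which is the role the qmail concurrency setting plays in the description above.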
Re: Java and Qmail - building a large mailmerge server - plain text version
manav writes:

I have been using qmail for the last year and a half and have been closely following the mailing list at securepoint, and didn't find anything related to my query, hence I took the liberty of posting it. The objective is to build a high-volume server capable of doing mail-merged email blasts to several lists with 10,000 to 1,000,000 users, provide detailed reports about the status of emails (sent, bounced, bad email addresses, opened, forwarded), list management (across multiple lists for each user) and of course, stability. Over the period of last 12 months, we explored several options - and finally settled on qmail (what else?). I am using a Pentium III with Linux Redhat 6.2 installed on it, with 512 MB of RAM, 20 GB HDD and JDK 1.2.2 connected to a 128 Kbps line.

128Kbps? Surely you mean Mbps. If that's all the bandwidth you can afford at your location, you should rent a server at a colocation site in the US. Use your server to create and distribute batches of recipients to a server running qmail-qmqps configured with the qmail-verh and big-concurrency patches.

Let's say that you're sending a 2K message. Sent to 1,000,000 users, that's 2,000,000,000 bytes. Assuming that you're using qmail-verh (to merge on the fly), that your system doesn't limit your sending (and if you've got an IDE disk, it will), and assuming 20% overhead (tcp/ip packet headers, smtp dialogue, message retries), this blast will take 150,000 seconds to clear your server at 128 Kbps. That's 42 hours, minimum.

--
-russ nelson [EMAIL PROTECTED] http://russnelson.com
Crynwr sells support for free software | PGPok |
521 Pleasant Valley Rd. | +1 315 268 1925 voice | #exclude windows.h
Potsdam, NY 13676-3213 | +1 315 268 9201 FAX |
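
The estimate above can be checked with a few lines of arithmetic. This is only a sketch of the calculation as stated in the message (2,000-byte message, 1,000,000 recipients, 20% overhead, a 128 Kbps line taken as 128,000 bits per second); the function name is made up for illustration.

```python
def blast_seconds(recipients, message_bytes, line_bps, overhead=0.20):
    """Seconds needed to push one copy of the message per recipient through the line."""
    total_bits = recipients * message_bytes * (1 + overhead) * 8
    return total_bits / line_bps


if __name__ == "__main__":
    seconds = blast_seconds(1_000_000, 2_000, 128_000)
    print(f"{seconds:,.0f} seconds, about {seconds / 3600:.1f} hours")
```

With those inputs the transfer alone comes to roughly 150,000 seconds, a little under 42 hours, before any disk or concurrency limits are considered.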
Re: Java and Qmail - building a large mailmerge server - plain text version
Hi Manav. For most of this, one word: ezmlm (www.ezmlm.org). For the rest...

manav == manav [EMAIL PROTECTED] writes:

1.2 For each blast we want to handle the bounced emails individually (we would need to update the appropriate table). What do we do for that? We cannot just set environment variables since there will be multiple mail-merges and blasts happening simultaneously.

Mailing list is the word I think you are after. See above...

1.3 Usually after about 5,000 deliveries, the messages would be stuck in the queue. We then added the CNAME lookup patch, and this increased to about 10,000. Currently, we prune the lists uploaded by the users and send messages in chunks of 2000, with less than 30 concurrent messages. Any suggestions what could be the culprit? What can we do to circumvent this problem?

The only reason I can see why you would want to do this would be if you are customising the message for each individual user. If you are... you will probably want a bit more processing power (i.e. more servers) than this. It is well known that qmail doesn't really enjoy having 10,000+ e-mails in the queue...

1.4 What would be the best possible way to handle unsubscribe requests. Currently we invoke a java program from the .qmail file that updates the database. Any suggestions how this can be improved upon?

Ezmlm

2. We then decided to switch over to using qmail-remote, to circumvent the queue and the logging problem. This effectively means we will have to do our own logging. Is there any way to hand over different messages to qmail-remote rather than invoking it for each message? We have now decided to change the implementation so that at any point of time, there will be as many threads sending messages as the qmail concurrency (say around 100), and the messages themselves will be broken into chunks of 300 to 500 each. How can we improve this?

Ezmlm looks after all of this for you. It is probably easier to hack up ezmlm-idx to customise messages than to make your own do everything that ezmlm does.

3. Currently, we have our own implementation for checking bad e-mail addresses, list management, handling bounces and mail-merge. Are there any guidelines/sample code available (any language), that we can look at?

Ezmlm...

4. What other things should we keep in mind to provide stability to the system? What patches to qmail are advisable to be installed? What should be the typical server configuration for such a system?

If you are customising messages, you definitely need parallel processing or clustering. Also, that 128kb line is a MAJOR bottleneck... Oh, and RedHat 6.2 is not the best server distribution. I use it on a number of my servers, but am moving them to Mandrake (for now) until I find the time to investigate other alternatives such as Turbo Linux and Debian. Mandrake can be made to work a lot better for you than RedHat, and so far 8.0 has MUCH fewer bugs in the components than most RedHat versions...

5. On a parallel note, what would be the best algorithm to track forwarded messages? We make use of cookies right now (but that provides 50% accuracy).

We use a blank 1x1 pixel gif in our e-mails that is like:

<img src="http://my.server.com/cgi-bin/emailcount.pl?2001-06-22-Email-1" width=1 height=1>

That perl script then does whatever it has to (it logs the relevant data to a file, and increases the count in another file) and then returns a 1x1 pixel GIF, using the GD library, from memory... Obviously this requires an HTML e-mail to be going out, but if you're using cookies then you are obviously already there! By the way, the parameter on the perl script (?2001-06-blah) is so that we can use the same script for each e-mail that goes out, and just change the parameter so that we can count for different mailouts.

On that note, Hotmail doesn't allow the forwarding of HTML e-mail. I don't know about the other major free e-mail providers.

HTH
Brett.
-- Smash forehead on keyboard to continue
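
As a rough illustration of the counting-pixel idea described in the message above, here is a sketch in Python rather than the Perl/GD script it mentions. Everything in it is an assumption made for illustration: the function name, the in-memory counter (the original logs to files), and the use of a pre-encoded 1x1 GIF instead of generating one with GD.

```python
import base64
from collections import Counter
from urllib.parse import urlsplit

# A commonly used 1x1 transparent GIF, embedded so no image library is needed.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

# In-memory tally per mailout; a real script would persist this somewhere.
open_counts = Counter()


def handle_pixel_request(url):
    """Count one open for the mailout named in the query string and return the image response."""
    mailout_id = urlsplit(url).query or "unknown"
    open_counts[mailout_id] += 1
    headers = {"Content-Type": "image/gif", "Content-Length": str(len(PIXEL_GIF))}
    return headers, PIXEL_GIF


if __name__ == "__main__":
    url = "http://my.server.com/cgi-bin/emailcount.pl?2001-06-22-Email-1"
    handle_pixel_request(url)
    handle_pixel_request(url)
    print(dict(open_counts))
```

Because the mailout identifier travels in the query string, one handler can serve every mailing, which is the point made about reusing the same script with a different parameter.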
Re: Java and Qmail - building a large mailmerge server - plain text version
manav wrote:

The objective is to build a high-volume server capable of doing mail-merged email blasts to several lists with 10,000 to 1,000,000 users, provide detailed reports about the status of emails (sent, bounced, bad email addresses, opened, forwarded), list management (across multiple lists for each user) and of course, stability. Over the period of last 12 months, we explored several options - and finally settled on qmail (what else?). I am using a Pentium III with Linux Redhat 6.2 installed on it, with 512 MB of RAM, 20 GB HDD and JDK 1.2.2 connected to a 128 Kbps line.

Before you go any further, get a real pipe. Why do people insist that their Volkswagen Beetle is capable of keeping up with a Ferrari on the autobahn? The volume of messages that you are trying to send is nothing short of ridiculous with a 128Kbps line.

--
Mike
Re: Java and Qmail - building a large mailmerge server - plain text version
Hi Mike, Russ,

I really appreciate that you took some time out to reply. Thanks.

Yes, I do have three of my production servers co-located with an ISP in the US that promises unlimited bandwidth, with a 99.9% uptime. All these production boxes have a SCSI disk with hardware alarms to indicate any malfunction, and 1 GB of RAM. I have a crude load balancing algorithm that ensures the load is shared across these boxes.

We are running the alpha phase right now (with whatever current implementations we have), and I have serious doubts about the stability and scalability of the system. The maximum load that I've put on my production boxes is 250,000 emails so far, and I've had issues similar to those I mentioned on my development boxes (the ones that resemble a Beetle, to quote Mike :-) ). Before I move anything to production, I test it on the local (Indian) servers. These issues appear at both places.

Thanks once again for your responses.

Manav.

- Original Message -
From: Mike Jackson [EMAIL PROTECTED]
To: manav [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, June 22, 2001 8:53 PM
Subject: Re: Java and Qmail - building a large mailmerge server - plain text version

manav wrote:

The objective is to build a high-volume server capable of doing mail-merged email blasts to several lists with 10,000 to 1,000,000 users, provide detailed reports about the status of emails (sent, bounced, bad email addresses, opened, forwarded), list management (across multiple lists for each user) and of course, stability. Over the period of last 12 months, we explored several options - and finally settled on qmail (what else?). I am using a Pentium III with Linux Redhat 6.2 installed on it, with 512 MB of RAM, 20 GB HDD and JDK 1.2.2 connected to a 128 Kbps line.

Before you go any further, get a real pipe. Why do people insist that their Volkswagen Beetle is capable of keeping up with a Ferrari on the autobahn? The volume of messages that you are trying to send is nothing short of ridiculous with a 128Kbps line.

--
Mike
Re: Java and Qmail - building a large mailmerge server - plain text version
manav writes:

I really appreciate you took some time out to reply. Thanks.

And not flame you? :-) Not everybody on the list is a flamer, and besides you supplied us with all the necessary information. You *did* confuse us by mentioning 10 lakh recipients and 128kbps in the same paragraph, but that's really no matter. The real problem is injecting bulk email using separate messages.

We are running the alpha phase right now (with whatever current implementations we have), and I have serious doubts about the stability and scalability of the system. The maximum load that I've put on my production boxes is 250,000 emails so far and I've had similar issues that I mentioned on my development boxes (the ones that resemble a Beetle, to quote Mike :-) ).

The problem, simply enough, is that you should try very, very hard not to have a separate copy of the email on the disk. If you're running qmail-inject on each message, then yes, three machines aren't going to be enough. On the other hand, three machines of the type you describe below will be sufficient to deliver one million emails in about eight hours, IF you're doing the mail merge function at delivery time. You can do that using the qmail-verh patch, you could call qmail-remote directly (in theory; I don't know that anyone is doing that), or you could purchase my qmail-merge system. It lets you substitute multiple fields into each message. So you could substitute in a first name, a last name, a database ID number, or whatever else you want. Handles bounces, and runs everything through the database. Details upon request.

Dealing with bounces is a whole 'nother headache. You see, there are three types of email bounces:

- 4XX bounces, which are known to be temporary. A retry is definitely called for, and qmail will handle that on its own.
- 5XX bounces, where the smtp server has told your smtp client that the email will never be deliverable. These get handled by parsing the QSBMF message.
- Delivered-but-returned messages. VERP is your friend here, because parsing bounce messages is a task only attempted by lunatics.

Even then, you can't treat a 5XX or returned message as a permanent failure. You have to have a system for retrying these messages at a later time. As someone else pointed out, ezmlm handles this nicely. Unfortunately, ezmlm doesn't work well when you've got users subscribed to more than one type of mailing, because it doesn't share bounce information between lists.

--
-russ nelson [EMAIL PROTECTED] http://russnelson.com
Crynwr sells support for free software | PGPok |
521 Pleasant Valley Rd. | +1 315 268 1925 voice | #exclude windows.h
Potsdam, NY 13676-3213 | +1 315 268 9201 FAX |
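
To make the VERP remark above concrete: the idea is to put the recipient's address into the envelope sender, so that a returned message identifies its recipient by the address it comes back to and the bounce body never has to be parsed. The sketch below is an assumption-laden illustration in Python, not qmail or ezmlm code; the `blast42-return` prefix, the domain names, and the function names are all hypothetical, and the "=" substitution simply follows the usual VERP convention.

```python
def verp_encode(list_local, list_domain, recipient):
    """Build a per-recipient envelope sender, replacing the recipient's '@' with '='."""
    user, _, domain = recipient.rpartition("@")
    return f"{list_local}-{user}={domain}@{list_domain}"


def verp_decode(list_local, bounce_address):
    """Recover the original recipient from the address a bounce was sent to, or None."""
    local, _, _ = bounce_address.rpartition("@")
    prefix = list_local + "-"
    if not local.startswith(prefix):
        return None
    user, sep, domain = local[len(prefix):].rpartition("=")
    return f"{user}@{domain}" if sep else None


def classify_smtp_reply(code):
    """Map an SMTP reply code to the broad classes discussed above."""
    if 400 <= code <= 499:
        return "temporary"   # the MTA retries these on its own
    if 500 <= code <= 599:
        return "rejected"    # recorded, but still worth retrying at a later date
    return "other"


if __name__ == "__main__":
    sender = verp_encode("blast42-return", "lists.example.org", "jane@example.com")
    print(sender)
    print(verp_decode("blast42-return", sender))
```

Keying the prefix on the blast (here `blast42`) would also let several simultaneous mailings keep their bounces apart, though how that maps onto a database is left open.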
Re: Java and Qmail - building a large mailmerge server - plain text version
manav wrote:

Hi Mike, Russ,

Hi!

We are running the alpha phase right now (with whatever current implementations we have), and I have serious doubts about the stability and scalability of the system. The maximum load that I've put on my production boxes is 250,000 emails so far and I've had similar issues that I mentioned on my development boxes (the ones that resemble a Beetle, to quote Mike :-) ).

Just as an example of the speed of qmail and ezmlm:

Machine: 1U rackmount cheapo 600Mhz Celeron, 128MB RAM, 18GB hard disk
OS: NetBSD 1.5
MTA: Qmail 1.03 with only the verh patch
List Manager: Ezmlm 0.53 with idx 0.40
remoteconcurrency: 120

Here are some stats from the first large mailing with this server. As you can see, within 15 minutes most of the deliveries were completed. The only kernel tuning I did was to raise the max processes to 256 and max open files per process to 512. The numbers look a little off since there are a few old messages still going through, mostly mail servers that were previously unreachable.

12.45.21  message sent to 4773 addresses
12.50.00  1738 deliveries  1924 attempts  1761 successes  187 failures
12.55.00  1775 deliveries  1937 attempts  1779 successes  166 failures
13.00.00   423 deliveries   455 attempts   433 successes   32 failures
13.05.00    13 deliveries    14 attempts
13.10.00     2 deliveries     2 attempts
---
Total     3951 deliveries  4332 attempts

With the large concurrency patch, this throughput could be increased significantly. I will put it into use if I get a requirement to send to at least 10,000 addresses.

Using qmail-ldap and qmqp with a frontend master server and several slave servers, you can distribute the load among several servers very easily. For example, if you have 4 slave servers then use a unique mailhost attribute for each quarter of your subscriber base. The scalability of qmail-ldap is almost limitless, I think. The master server will transfer the messages to the slave servers via qmqp faster than you can even dream of.

For more info, see the qmail-ldap homepage at www.nrg4u.com.

Regards,
Mike
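
One way to picture the "unique mailhost attribute for each quarter of your subscriber base" suggestion is a stable mapping from subscriber address to one of four hosts. The sketch below is only an illustration under assumptions: the host names are invented, and hashing the address is one possible way to split the list into roughly equal parts, not something the message or qmail-ldap prescribes.

```python
import zlib

# Hypothetical slave servers; in qmail-ldap the chosen name would go in the mailhost attribute.
MAILHOSTS = [
    "slave1.example.org",
    "slave2.example.org",
    "slave3.example.org",
    "slave4.example.org",
]


def assign_mailhost(address, mailhosts=MAILHOSTS):
    """Pick a host for an address using a stable hash, so reruns give the same answer."""
    digest = zlib.crc32(address.strip().lower().encode("utf-8"))
    return mailhosts[digest % len(mailhosts)]


if __name__ == "__main__":
    for addr in ("jane@example.com", "raj@example.net", "li@example.org"):
        print(addr, "->", assign_mailhost(addr))
```

A fixed hash keeps each subscriber on the same host between mailings; an alphabetical or ID-range split would serve the same purpose.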
Re: Java and Qmail - building a large mailmerge server - plain text version
manav([EMAIL PROTECTED])@2001.06.22 21:17:26 +:

Yes, I do have three of my production servers co-located with an ISP in the US that promises unlimited bandwidth, with a 99.9% uptime. All these

wow, daring. my contract with my isp ensures 100mbit/fdx ethernet with 99.87something% availability -- unlimited bandwidth seems a little bit high to me ;-)

/k

--
MCSE: Management Can't Send E-mail
KR433/KR11-RIPE -- WebMonster Community Founder -- nGENn GmbH Senior Techie
http://www.webmonster.de/ -- ftp://ftp.webmonster.de/ -- http://www.ngenn.net/
karstenrohrbach.de -- alphangenn.net -- alphascene.org -- [EMAIL PROTECTED]
GnuPG 0x2964BF46 2001-03-15 42F9 9FFF 50D4 2F38 DBEE DF22 3340 4F4E 2964 BF46
Please do not remove my address from To: and Cc: fields in mailing lists. 10x