Not receiving e-mail on submission port
Hi, people have reported to me that the e-mails they send never reach my server. When I first got the report I tested it myself, and it was true. After some more testing I found out that I can't receive e-mail when I switch to the submission port. How can I fix this? You can find my postconf -n and master.cf below (at the moment the submission port is not in use).

postconf -n

alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
append_dot_mydomain = no
biff = no
broken_sasl_auth_clients = yes
config_directory = /etc/postfix
debug_peer_level = 3
debug_peer_list = localhost
html_directory = /usr/share/doc/postfix/html
inet_interfaces = all
mailbox_command = procmail -a "$EXTENSION"
mailbox_size_limit = 0
mydestination = localhost
myhostname = vps.ozses.net
mynetworks = 127.0.0.0/8 127.0.0.2/32 184.82.40.0/24 64.120.177.0/24
myorigin = /etc/mailname
readme_directory = /usr/share/doc/postfix
recipient_delimiter = +
relayhost =
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, reject_non_fqdn_hostname, reject_non_fqdn_sender, reject_non_fqdn_recipient, reject_unauth_destination, reject_unauth_pipelining, reject_invalid_hostname
smtpd_sasl_auth_enable = yes
smtpd_sasl_local_domain = $myhostname
smtpd_sasl_path = private/auth
smtpd_sasl_security_options = noanonymous
smtpd_sasl_type = dovecot
virtual_alias_maps = mysql:/etc/postfix/mysql_virtual_alias_maps.cf
virtual_gid_maps = static:5000
virtual_mailbox_base = /srv/vmail
virtual_mailbox_domains = mysql:/etc/postfix/mysql_virtual_domains_maps.cf
virtual_mailbox_maps = mysql:/etc/postfix/mysql_virtual_mailbox_maps.cf
virtual_minimum_uid = 100
virtual_transport = virtual
virtual_uid_maps = static:5000

root@vps:~# cat /etc/postfix/master.cf
#
# Postfix master process configuration file.  For details on the format
# of the file, see the master(5) manual page (command: "man 5 master").
#
# Do not forget to execute "postfix reload" after editing this file.
#
# ==========================================================================
# service   type  private unpriv  chroot  wakeup  maxproc command + args
#                 (yes)   (yes)   (yes)   (never) (100)
# ==========================================================================
smtp        inet  n       -       n       -       -       smtpd
#submission inet  n       -       -       -       -       smtpd
#  -o smtpd_tls_security_level=encrypt
#  -o smtpd_sasl_auth_enable=yes
#  -o smtpd_client_restrictions=permit_sasl_authenticated,reject
#  -o milter_macro_daemon_name=ORIGINATING
#smtps      inet  n       -       -       -       -       smtpd
#  -o smtpd_tls_wrappermode=yes
#  -o smtpd_sasl_auth_enable=yes
#  -o smtpd_client_restrictions=permit_sasl_authenticated,reject
#  -o milter_macro_daemon_name=ORIGINATING
#628        inet  n       -       -       -       -       qmqpd
pickup      fifo  n       -       -       60      1       pickup
cleanup     unix  n       -       -       -       0       cleanup
qmgr        fifo  n       -       n       300     1       qmgr
#qmgr       fifo  n       -       -       300     1       oqmgr
tlsmgr      unix  -       -       -       1000?   1       tlsmgr
rewrite     unix  -       -       n       -       -       trivial-rewrite
bounce      unix  -       -       -       -       0       bounce
defer       unix  -       -       -       -       0       bounce
trace       unix  -       -       -       -       0       bounce
verify      unix  -       -       -       -       1       verify
flush       unix  n       -       -       1000?   0       flush
proxymap    unix  -       -       n       -       -       proxymap
proxywrite  unix  -       -       n       -       1       proxymap
smtp        unix  -       -       -       -       -       smtp
# When relaying mail as backup MX, disable fallback_relay to avoid MX loops
relay       unix  -       -       -       -       -       smtp
        -o smtp_fallback_relay=
#       -o smtp_helo_timeout=5 -o smtp_connect_timeout=5
showq       unix  n       -       -       -       -       showq
error       unix  -       -       -       -       -       error
retry       unix  -       -       -       -       -       error
discard     unix  -       -       -       -       -       discard
local       unix  -       n       n       -       -       local
virtual     unix  -       n       n       -       -       virtual
lmtp        unix  -       -       -       -       -       lmtp
anvil       unix  -       -       -       -       1       anvil
scache      unix  -       -       -       -       1       scache
#
#
# Interfaces to non-Postfix software. Be sure to examine the manual
# pages of the non-Postfix software to find out what options it wants.
#
# Many of the following services use the Postfix pipe(8) delivery
# agent.  See the pipe(8) man page for information about ${recipient}
# and other message envelope options.
# ==========================================================================
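Assuming the goal is simply to accept authenticated client mail on port 587, a minimal first step would be to uncomment the submission service that is already present (but disabled) in the master.cf above and then run "postfix reload". This is only a sketch based on those commented-out lines, with the chroot field set to n to match the existing smtp entry:

    submission inet n       -       n       -       -       smtpd
      -o smtpd_tls_security_level=encrypt
      -o smtpd_sasl_auth_enable=yes
      -o smtpd_client_restrictions=permit_sasl_authenticated,reject
      -o milter_macro_daemon_name=ORIGINATING

Note that smtpd_tls_security_level=encrypt forces STARTTLS on port 587, so main.cf also needs a usable certificate (smtpd_tls_cert_file / smtpd_tls_key_file, neither of which appears in the postconf -n output above); without TLS configured, clients will likely fail to authenticate and submit even though the port is listening.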
Re: Issue integrating with Cyrus-SASL
As previously mentioned, the chroot for smtp is turned off:

cat /etc/postfix/master.cf | grep "smtp inet"
smtp inet n - n - - smtpd

From: Wietse Venema
To: Postfix users
Sent: Wednesday, September 28, 2011 3:07 PM
Subject: Re: Issue integrating with Cyrus-SASL

Crazedfred:
> Any thoughts?

What does the smtpd line in master.cf look like? If it looks like this:

smtp inet n - - - - smtpd

this means that chroot is turned on, and that Postfix won't be able to talk to saslauthd. To turn off chroot, change it into this and do "postfix reload":

smtp inet n - n - - smtpd

Only Debian ships Postfix with chroot turned on. Complain there.

Wietse
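For reference, a quick sketch of the master.cf columns being discussed; the fifth field is the chroot flag, and "-" means "use the default", which Debian ships as chrooted. The saslauthd socket path in the comments is the usual Debian location and is an assumption here:

    # service  type  private unpriv  chroot  wakeup  maxproc command
    smtp       inet  n       -       n       -       -       smtpd  # chroot off: smtpd can reach /var/run/saslauthd/mux
    #smtp      inet  n       -       -       -       -       smtpd  # chroot defaulted on: the saslauthd socket is not visible inside the chroot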
Always check for irregular mail usage of your mail server
http://www.nk.ca/blog/index.php?/archives/1275-Phishing-spam-mail-script-intercepted.html
Re: Using Postfix for email retention
On Mon, 2011-10-10 at 07:20:22 +0530, Janantha Marasinghe wrote:
> I want to know if postfix can be used to save a copy of every e-mail
> sent and received (including attachments) by a mail server for email
> retention.

See http://article.gmane.org/gmane.mail.postfix.user/221022 and the mailing list archive for similar discussions.

> If it could be indexed for easier searching that would be great!

This has to happen outside of Postfix.

--
Sahil Tandon
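For reference, one approach that regularly comes up in these discussions (the linked thread may well recommend something different) is to have Postfix blind-copy every message it handles to an archive address and leave indexing and search to whatever system stores that archive mailbox; the address below is just a placeholder:

    # main.cf
    always_bcc = archive@example.com

More selective variants use sender_bcc_maps and recipient_bcc_maps to archive mail only for particular senders or recipients.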
Re: LDAP table, recursion filter
On 20/09/2011, at 11:04 AM, Tom Lanyon wrote:
> When using a LDAP lookup table the 'special_result_attribute' parameter is
> available to allow me to recurse to other DNs [e.g. recursing to members of a
> LDAP group]. I can also use the 'leaf_result_attribute' parameter to select
> the attribute I want to return from those recursive DN lookups, but I can't
> find a way to filter that recursive lookup to avoid returning
>
> As an example, I have a group with a bunch of members, but a few of those
> members' objects are marked as 'disabled'. I'd like to recurse through the
> group's member DNs to find their 'mail' attribute, but only for members who
> don't have the 'disabled' attribute set to true [e.g. apply a filter of
> "(!(disabled=true))"].
>
> Is it possible to apply such a filter on the recursive DN search?

No bites on this... perhaps it'd help if I gave an example:

LDAP:

dn: cn=tech-staff,ou=Groups,dc=example,dc=com
objectclass: top
objectclass: ldapgroup
cn: tech-staff
mail: tech-st...@example.com
memberdn: uid=adam,ou=People,dc=example,dc=com
memberdn: uid=bob,ou=People,dc=example,dc=com
memberdn: uid=chuck,ou=People,dc=example,dc=com

dn: uid=adam,ou=People,dc=example,dc=com
objectclass: top
objectclass: ldapuser
uid: adam
mail: a...@example.com

dn: uid=bob,ou=People,dc=example,dc=com
objectclass: top
objectclass: ldapuser
uid: bob
mail: b...@example.com
accountLock: true

Postfix (ldap-group-aliases.cf):

search_base = ou=Groups,dc=example,dc=com
query_filter = mail=%s
result_attribute = mail
special_result_attribute = memberdn

This is fine, and recurses on the memberdn attributes to find the mail attributes for the listed users, but we need a way to filter that recursion with a (!(accountLock=true)) filter so that even though bob is a group member, his account is disabled so his address shouldn't be expanded...

Advice appreciated.

Regards,
Tom
Using Postfix for email retention
Hi All, I want to know if postfix can be used to save a copy of every e-mail sent and received (including attachments) by a mail server for email retention. If it could be indexed for easier searching, that would be great! Thanks, J
Re: Premature "No Space left on device" on XFS
On Sun, Oct 09, 2011 at 06:03:36PM -0500, Stan Hoeppner wrote: > On 10/9/2011 3:29 PM, Bron Gondwana wrote: > > > I'm honestly more interested in maildir type workload too, spool doesn't > > get enough traffic usually to care about IO. > > > > (sorry, getting a bit off topic for the postfix list) > > Maybe not off topic. You're delivering into the maildir mailboxes with > local(8) right? Cyrus via LMTP (through an intermediate proxy, what's more) actually. > > We went with lots of small filesystems to reduce single points of > > failure rather than one giant filesystem across all our spools. > > Not a bad architecture. Has a few downsides but one big upside. Did > you really mean Postfix spools here, or did you mean to say maildir > directories? Destination cyrus directories, yes - sorry, not postfix spools. > > My goodness. That's REALLY recent in filesystem times. Something > > XFS has been seeing substantial development for a few years now due to > interest from RedHat, who plan to make it the default RHEL filesystem in > the future. They've dedicated serious resources to the effort, > including hiring Dave Chinner from SGI. Dave's major contribution while > at RedHat has been the code that yields the 10X+ increase in unlink > performance. It is enabled by default in 2.6.39 and later kernels. Fair enough. It's good to see the extra work going in. > > that recent plus "all my eggs in one basket" of changing to a > > large multi-spindle filesystem that would really get the benefits > > of XFS would be more dangerous than I'm willing to consider. That's > > That's one opinion, probably not shared by most XFS users. I assume > your current architecture is designed to mitigate hardware > failure--focused on the very rare occasion of filesystem corruption in > absence of some hardware failure event. I'd make an educated guess that > the median size XFS filesystem in the wild today is at least 50TB and > spans dozens of spindles housed in multiple FC SAN array chassis. Corruption happens for real. We get maybe 1-2 per month on average. Wouldn't even notice them if we didn't actually have the sha1 of every single email file in the metadata files, and THAT protected with a crc32 per entry as well. So we can actually detect them. > > barely a year old. At least we're not still running Debian's 2.6.32 > > any more, but still. > > We've been discussing a performance patch to a filesystem driver, not a > Gnome release. :) Age is irrelevant. It's the mainline default. If > you have an "age" hangup WRT kernel patches, well that's just silly. Seriously? I do actually build my own kernels still, but upgrading is always an interesting balancing act, random bits of hardware work differently - stability is always a question. Upgrading to a new gnome release is much less risky. > > I'll run up some tests again some time, but I'm not thinking of > > switching soon. > > Don't migrate just to migrate. If you currently have deficient > performance with high mailbox concurrency on many spindles, it may make > sense. If youre performance is fine, and you have plenty of headroom, > stick with what you have. > > I evangelize XFS to the masses because it's great for many things, and > many people haven't heard of it, or know nothing about it. They simply > use EXTx because it's the default. I'm getting to the word out WRT > possibilities and capabilities. I'm not trying to _convert_ everyone to > XFS. > > Apologies to *BSD, AIX, Solaris, HP-UX mail server admins if it appears > I assume the world is all Linux. 
I don't assume that--all the numbers > out here say it has ~99% of all "UNIX like" server installs. Well, yeah. I've heard interestingly mixed things from people running ZFS too, but mostly positive. We keep our backups on ZFS on real Solaris - at least one lot. The others are on XFS on one of those huge SAN thingies. But I don't care so much about performance there, because I'm reading and writing huge .tar.gz files. And XFS is good at that. Bron.
Re: Premature "No Space left on device" on XFS
- Quoting Bron Gondwana (br...@fastmail.fm), on 10.10.2011 at 01:50 -

> On Mon, Oct 10, 2011 at 01:33:31AM +0300, karave...@mail.bg wrote:
>> Nice setup. And thanks for your work on Cyrus. We are
>> looking also to move the metadata on SSDs but we have not
>> found yet cost effective devices - we need at least a pair of
>> 250G disk for 20-30T spool on a server.
>
> You can move cyrus.cache to data now, that's the whole
> point, because it doesn't need to be mmaped in so much.

Thanks for the info.

>> Setting a higher number of allocation groups per XFS
>> filesystem helps a lot for the concurrency. My rule of
>> thumb (learnt from databases) is:
>> number of spindles + 2 * number of CPUs.
>> You have done the same with multiple filesystems.
>>
>> About the fsck times. We experienced a couple of power
>> failures and XFS comes up in 30-45 minutes (30T in
>> RAID5 of 12 SATA disks). If the server is shut down
>> correctly in comes up in a second.
>
> Interesting - is that 30-45 minutes actually a proper
> fsck, or just a log replay?

I think it is some kind of recovery procedure internal to XFS. The XFS log is 2G, so I think it is not just replaying.

> More interestingly, what's your disaster recovery plan
> for when you lose multiple disks? Our design is
> heavily influenced by having lost 3 disks in a RAID6
> within 12 hours. It took a week to get everyone back
> from backups, just because of the IO rate limits of
> the backup server.

Ouch! You had really bad luck. I do not know how long it will take for us to recover from backups. My estimate is 2-3 weeks if one server fails. We are looking for better options. Your partitioning is a better plan here - a smaller probability that the 2 failing disks come from one array, faster recovery time, etc.

>> We know that RAID5 is not the best option for write
>> scalability, but the controller write cache helps a lot.
>
> Yeah, we did RAID5 for a while - but it turned out we
> were still being write limited more than disk space
> limited, so the last RAID5s are being phased out for
> more RAID1.
>
> Bron.

--
Luben Karavelov
Re: Premature "No Space left on device" on XFS
-- From: "Bron Gondwana" Sent: Sunday, October 09, 2011 6:28 PM To: "vg_ us" Cc: "Bron Gondwana" ; "Stan Hoeppner" ; Subject: Re: Premature "No Space left on device" on XFS On Sun, Oct 09, 2011 at 04:42:25PM -0400, vg_ us wrote: From: "Bron Gondwana" >I'm honestly more interested in maildir type workload too, spool doesn't >get enough traffic usually to care about IO. will postmark transaction test do? here - http://www.phoronix.com/scan.php?page=article&item=linux_2639_fs&num=1 stop arguing - I think postmark transaction was the only relevant test XFS was loosing badly - not anymore... search www.phoronix.com for other tests - there is one for every kernel version. Sorry, I don't change filesystems every week just because the latest shiny got a better benchmark. I need a pretty compelling reason, and what's most impressive there is how shockingly bad XFS was before 2.6.39. I don't think there's many stable distributions out there shipping 2.6.39 yet, which means you're bleeding all sorts of edges to get a faster filesystem... Ahhh - which part of "search www.phoronix.com for other tests" did you miss? All I meant - benchmarks are out there... ... and you're storing your customers' email on that. But - you have convinced me that it may be time to take another round of tests - particularly since we've added another couple of database files since my last test, which will increase the linear IO slightly on regular use. It may be worth comparing again. But I will still advise ext4 to anyone who asks right now. Bron.
Re: Premature "No Space left on device" on XFS
On 10/9/2011 3:29 PM, Bron Gondwana wrote:
> I'm honestly more interested in maildir type workload too, spool doesn't
> get enough traffic usually to care about IO.
>
> (sorry, getting a bit off topic for the postfix list)

Maybe not off topic. You're delivering into the maildir mailboxes with local(8) right?

> We went with lots of small filesystems to reduce single points of
> failure rather than one giant filesystem across all our spools.

Not a bad architecture. Has a few downsides but one big upside. Did you really mean Postfix spools here, or did you mean to say maildir directories?

> No, not really. I'm not going to advise people to use something that
> requires a lot of tuning.

My point was that if a workload requires, or can benefit from, XFS, it requires a learning curve, and is worth the effort.

> My goodness. That's REALLY recent in filesystem times. Something

XFS has been seeing substantial development for a few years now due to interest from RedHat, who plan to make it the default RHEL filesystem in the future. They've dedicated serious resources to the effort, including hiring Dave Chinner from SGI. Dave's major contribution while at RedHat has been the code that yields the 10X+ increase in unlink performance. It is enabled by default in 2.6.39 and later kernels.

> that recent plus "all my eggs in one basket" of changing to a
> large multi-spindle filesystem that would really get the benefits
> of XFS would be more dangerous than I'm willing to consider. That's

That's one opinion, probably not shared by most XFS users. I assume your current architecture is designed to mitigate hardware failure--focused on the very rare occasion of filesystem corruption in the absence of some hardware failure event. I'd make an educated guess that the median size XFS filesystem in the wild today is at least 50TB and spans dozens of spindles housed in multiple FC SAN array chassis.

> barely a year old. At least we're not still running Debian's 2.6.32
> any more, but still.

We've been discussing a performance patch to a filesystem driver, not a Gnome release. :) Age is irrelevant. It's the mainline default. If you have an "age" hangup WRT kernel patches, well that's just silly.

> I'll run up some tests again some time, but I'm not thinking of
> switching soon.

Don't migrate just to migrate. If you currently have deficient performance with high mailbox concurrency on many spindles, it may make sense. If your performance is fine, and you have plenty of headroom, stick with what you have.

I evangelize XFS to the masses because it's great for many things, and many people haven't heard of it, or know nothing about it. They simply use EXTx because it's the default. I'm getting the word out WRT possibilities and capabilities. I'm not trying to _convert_ everyone to XFS.

Apologies to *BSD, AIX, Solaris, HP-UX mail server admins if it appears I assume the world is all Linux. I don't assume that--all the numbers out here say it has ~99% of all "UNIX like" server installs.

--
Stan
Re: Premature "No Space left on device" on XFS
On Mon, Oct 10, 2011 at 01:49:44AM +0300, karave...@mail.bg wrote: > I do not trust Postmark - it models mbox appending and skips > fsync-s. So it is too different from our setup. The best benchmark > tool I have found is imaptest (from dovecot fame) - it is actually > end to end benchmarking, including the IMAP server. I use imaptest as something to throw against my Cyrus dev builds to check I haven't broken anything quickly. It's very good. Of course, I run those on tmpfs so my machine doesn't grind to a halt! > The last fs tests I have done were April and there is no > fundamental change in the filesystems since then. Make your > test and see yourself. The setup here was XFS so we changed > only a mount option - delaylog was not default before 2.6.39. > Ext4 is also a nice choice but we have problems with long fsck > times. Agree, long fsck times suck. Then again, we have very reliable UPS setup and multiple power supplies on separate UPSes for every machine. I think the last time we lost a single power channel was about 4 years ago, and I don't recall ever losing both channels. Bron.
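For anyone wanting to reproduce that kind of end-to-end test, an imaptest run looks roughly like the sketch below; the host, credentials and sample mbox file are placeholders, and the parameter names are recalled from the tool's usual key=value interface rather than taken from this thread, so treat them as assumptions:

    imaptest host=127.0.0.1 port=143 user=testuser pass=testpass \
        mbox=dovecot-crlf clients=50 secs=300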
Re: Premature "No Space left on device" on XFS
On Mon, Oct 10, 2011 at 01:33:31AM +0300, karave...@mail.bg wrote: > Nice setup. And thanks for your work on Cyrus. We are > looking also to move the metadata on SSDs but we have not > found yet cost effective devices - we need at least a pair of > 250G disk for 20-30T spool on a server. You can move cyrus.cache to data now, that's the whole point, because it doesn't need to be mmaped in so much. > Setting a higher number of allocation groups per XFS > filesystem helps a lot for the concurrency. My rule of > thumb (learnt from databases) is: > number of spindles + 2 * number of CPUs. > You have done the same with multiple filesystems. > > About the fsck times. We experienced a couple of power > failures and XFS comes up in 30-45 minutes (30T in > RAID5 of 12 SATA disks). If the server is shut down > correctly in comes up in a second. Interesting - is that 30-45 minutes actually a proper fsck, or just a log replay? More interestingly, what's your disaster recovery plan for when you lose multiple disks? Our design is heavily influenced by having lost 3 disks in a RAID6 within 12 hours. It took a week to get everyone back from backups, just because of the IO rate limits of the backup server. > We know that RAID5 is not the best option for write > scalability, but the controller write cache helps a lot. Yeah, we did RAID5 for a while - but it turned out we were still being write limited more than disk space limited, so the last RAID5s are being phased out for more RAID1. Bron.
Re: Premature "No Space left on device" on XFS
- Цитат от Bron Gondwana (br...@fastmail.fm), на 10.10.2011 в 01:28 - > On Sun, Oct 09, 2011 at 04:42:25PM -0400, vg_ us wrote: >> From: "Bron Gondwana" >> >I'm honestly more interested in maildir type workload too, spool doesn't >> >get enough traffic usually to care about IO. >> >> will postmark transaction test do? here - >> http://www.phoronix.com/scan.php?page=article&item=linux_2639_fs&num=1 >> stop arguing - I think postmark transaction was the only relevant >> test XFS was loosing badly - not anymore... >> search www.phoronix.com for other tests - there is one for every >> kernel version. > > Sorry, I don't change filesystems every week just because > the latest shiny got a better benchmark. I need a pretty > compelling reason, and what's most impressive there is > how shockingly bad XFS was before 2.6.39. I don't think > there's many stable distributions out there shipping 2.6.39 > yet, which means you're bleeding all sorts of edges to get > a faster filesystem... > > ... and you're storing your customers' email on that. > > But - you have convinced me that it may be time to take > another round of tests - particularly since we've added > another couple of database files since my last test, > which will increase the linear IO slightly on regular use. > It may be worth comparing again. But I will still advise > ext4 to anyone who asks right now. > > Bron. > I do not trust Postmark - it models mbox appending and skips fsync-s. So it is too different from our setup. The best benchmark tool I have found is imaptest (from dovecot fame) - it is actually end to end benchmarking, including the IMAP server. The last fs tests I have done were April and there is no fundamental change in the filesystems since then. Make your test and see yourself. The setup here was XFS so we changed only a mount option - delaylog was not default before 2.6.39. Ext4 is also a nice choice but we have problems with long fsck times. Best regards -- Luben Karavelov
Re: Premature "No Space left on device" on XFS
On Sun, Oct 09, 2011 at 04:42:25PM -0400, vg_ us wrote: > will postmark transaction test do? here - > http://www.phoronix.com/scan.php?page=article&item=linux_2639_fs&num=1 Oh: http://blog.goolamabbas.org/2007/06/17/postmark-is-not-a-mail-server-benchmark/ "Thus it pains me a lot that they are trying to pass of a benchmark (Postmark) which does not have a single fsync(2) as appropiate for a mail server. " And that other benchmark posted earlier had barriers turned off. Anything that benchmarks without fsync is a lie, because it can create the file, do something with it, and unlink without ever having written a byte down to storage. Woo f'ing hoo. So no, a postmark transaction test won't do unless you can show me a resource that says it does fsyncs now, otherwise you're just playing with improved in-memory datastructures, and my workload is limited by random disk IO, not CPU. Bron.
Re: Premature "No Space left on device" on XFS
- Quoting Bron Gondwana (br...@fastmail.fm), on 10.10.2011 at 01:12 -

> Here's what our current IMAP servers look like:
>
> 2 x 92GB SSD
> 12 x 2TB SATA
>
> two of the SATA drives are hotspares - though I'm
> wondering if that's actually necessary now, we
> haven't lost any yet, and we have 24 hr support in
> our datacentres. Hot swap is probably fine.
>
> so - 5 x RAID1 for a total of 10TB storage.
>
> Each 2TB volume is then further split into 4 x 500Gb
> partitions. The SSD is just a single partition with
> all the metadata, which is a change from our previous
> pattern of separate metadata partitions as well, but
> has been performing OK thanks to the performance of
> SSD.
>
> The SSDs are in RAID1 as well.
>
> This gives us 20 separate mailbox databases, which
> not only keeps the size down, but gives us concurrency
> for free - so there's no single points of contention
> for the entire machine. It gives us small enough
> filesystems that you can actually fsck them in a day,
> and fill up a new replica in a day as well.
>
> And it means when we need to shut down a single machine,
> the masters transfer to quite a few other machines
> rather than one replica host taking all the load, so
> it spreads things around nicely.
>
> This is letting us throw a couple of hundred thousand
> users on a single one of these machines and barely
> break a sweat. It took a year or so of work to rewrite
> the internals of Cyrus IMAP to cut down the IO hits on
> the SATA drives, but it was worth it.
>
> Total cost for one of these boxes, with 48GB RAM and a
> pair of CPUs is under US $13k - and they scale very
> linearly - throw a handful of them into the datacentre
> and toss some replicas on there. Easy.
>
> And there's no single point of failure - each machine
> is totally standalone - with its own CPU, its own
> storage, its own metadata. Nice.
>
> So yeah, I'm quite happy with the sweet spot that I've
> found at the moment - and it means that a single machine
> has 21 separate filesystems on it. So long as there's
> no massive lock that all the filesystems have to go
> through, we get the scalability horizontally rather
> than vertically.
>
> Bron.

Nice setup. And thanks for your work on Cyrus. We are also looking to move the metadata to SSDs but we have not yet found cost-effective devices - we need at least a pair of 250G disks for a 20-30T spool on a server.

Setting a higher number of allocation groups per XFS filesystem helps a lot for the concurrency. My rule of thumb (learnt from databases) is: number of spindles + 2 * number of CPUs. You have done the same with multiple filesystems.

About the fsck times. We experienced a couple of power failures and XFS comes up in 30-45 minutes (30T in RAID5 of 12 SATA disks). If the server is shut down correctly it comes up in a second.

We know that RAID5 is not the best option for write scalability, but the controller write cache helps a lot.

Best regards
--
Luben Karavelov
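Luben's rule of thumb translates directly into the agcount option at mkfs time. A small sketch with made-up numbers (12 spindles and 8 CPU cores give 12 + 2*8 = 28 allocation groups; the device name is a placeholder):

    mkfs.xfs -d agcount=28 /dev/sdX1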
Re: Premature "No Space left on device" on XFS
On Sun, Oct 09, 2011 at 04:42:25PM -0400, vg_ us wrote: > From: "Bron Gondwana" > >I'm honestly more interested in maildir type workload too, spool doesn't > >get enough traffic usually to care about IO. > > will postmark transaction test do? here - > http://www.phoronix.com/scan.php?page=article&item=linux_2639_fs&num=1 > stop arguing - I think postmark transaction was the only relevant > test XFS was loosing badly - not anymore... > search www.phoronix.com for other tests - there is one for every > kernel version. Sorry, I don't change filesystems every week just because the latest shiny got a better benchmark. I need a pretty compelling reason, and what's most impressive there is how shockingly bad XFS was before 2.6.39. I don't think there's many stable distributions out there shipping 2.6.39 yet, which means you're bleeding all sorts of edges to get a faster filesystem... ... and you're storing your customers' email on that. But - you have convinced me that it may be time to take another round of tests - particularly since we've added another couple of database files since my last test, which will increase the linear IO slightly on regular use. It may be worth comparing again. But I will still advise ext4 to anyone who asks right now. Bron.
Re: Premature "No Space left on device" on XFS
On Sun, Oct 09, 2011 at 03:24:44PM -0500, Stan Hoeppner wrote: > That said, there are plenty of mailbox > servers in the wild that would benefit from the XFS + linear concat > setup. It doesn't require an insane drive count, such as the 136 in the > test system above, to demonstrate the gains, especially against EXT3/4 > with RAID5/6 on the same set of disks. I think somewhere between 16-32 > should do it, which is probably somewhat typical of mailbox storage > servers at many sites. Here's what our current IMAP servers look like: 2 x 92GB SSD 12 x 2TB SATA two of the SATA drives are hotspares - though I'm wondering if that's actually necessary now, we haven't lost any yet, and we have 24 hr support in our datacentres. Hot swap is probably fine. so - 5 x RAID1 for a total of 10TB storage. Each 2TB volume is then further split into 4 x 500Gb partitions. The SSD is just a single partition with all the metadata, which is a change from our previous pattern of separate metadata partitions as well, but has been performing OK thanks to the performance of SSD. The SSDs are in RAID1 as well. This gives us 20 separate mailbox databases, which not only keeps the size down, but gives us concurrency for free - so there's no single points of contention for the entire machine. It gives us small enough filesystems that you can actually fsck them in a day, and fill up a new replica in a day as well. And it means when we need to shut down a single machine, the masters transfer to quite a few other machines rather than one replica host taking all the load, so it spreads things around nicely. This is letting us throw a couple of hundred thousand users on a single one of these machines and barely break a sweat. It took a year or so of work to rewrite the internals of Cyrus IMAP to cut down the IO hits on the SATA drives, but it was worth it. Total cost for one of these boxes, with 48GB RAM and a pair of CPUs is under US $13k - and they scale very linearly - throw a handful of them into the datacentre and toss some replicas on there. Easy. And there's no single point of failure - each machine is totally standalone - with its own CPU, its own storage, its own metadata. Nice. So yeah, I'm quite happy with the sweet spot that I've found at the moment - and it means that a single machine has 21 separate filesystems on it. So long as there's no massive lock that all the filesystems have to go through, we get the scalability horizontally rather than vertically. Bron.
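A rough sketch of how one of those SATA pairs might be put together with Linux md and parted; device names are hypothetical and the thread doesn't say which tools are actually used for this, so it is illustration only:

    # one 2TB RAID1 pair (repeated for each of the five data pairs)
    mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sdc /dev/sdd

    # split the mirror into four ~500GB partitions, one mailbox store each
    parted -s /dev/md10 mklabel gpt \
        mkpart data1 0% 25% \
        mkpart data2 25% 50% \
        mkpart data3 50% 75% \
        mkpart data4 75% 100%

    # filesystem choice per preference; elsewhere in the thread Bron
    # mentions still advising ext4
    for p in 1 2 3 4; do mkfs.ext4 /dev/md10p$p; done

    # the two SSDs mirrored for metadata
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
    mkfs.ext4 /dev/md0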
Re: Premature "No Space left on device" on XFS
- Quoting Bron Gondwana (br...@fastmail.fm), on 09.10.2011 at 23:29 -

> My goodness. That's REALLY recent in filesystem times. Something
> that recent plus "all my eggs in one basket" of changing to a
> large multi-spindle filesystem that would really get the benefits
> of XFS would be more dangerous than I'm willing to consider. That's
> barely a year old. At least we're not still running Debian's 2.6.32
> any more, but still.
>
> I'll run up some tests again some time, but I'm not thinking of
> switching soon.

I run a couple of busy postfix MX servers with queues now on XFS: 400 deliveries per minute on average, 1200 deliveries per minute at peak. 4 months ago they were hosted on 8-core Xeon, 6x SAS 10k RAID 10 machines. The spools were on ext4. When I switched the queue filesystem to XFS with the delaylog option (around 2.6.36) the load average dropped from 2.5 to 0.5. Now I run the same servers on smaller machines - dual core Opterons. The queues are on one Intel SLC SSD. The load average of the machines is under 0.2.

Now, about the spools. They are managed by Cyrus, so not Maildir but close. We now have 2 types of servers in use for spools: 24 x 1T SATA disks in RAID5 and 12 x 3T SATA disks in RAID5. The mail spools and other mail-related filesystems are on XFS with the delaylog option. They run at an average of 200 TPS.

Yes, the expunges take some time. But we run the task every night for 1/7 of the mailboxes, so every mailbox is expunged once a week. The expunge task runs for 2-3 hours on around 50k mailboxes.

I have done some tests with BTRFS for spools but I am quite disappointed - horrible performance and horrible stability. The only other promising option was ZFS, but that would mean also switching the OS to FreeBSD or some form of Solaris. And we are not there yet.

Best regards
--
Luben Karavelov
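Concretely, delaylog is just an XFS mount option on pre-2.6.39 kernels (from 2.6.39 on it is the default, so no option is needed). A sketch with a hypothetical device and the Postfix queue directory as the mount point:

    # /etc/fstab
    /dev/sdb1   /var/spool/postfix   xfs   noatime,delaylog   0  0

or as a one-off mount:

    mount -o noatime,delaylog /dev/sdb1 /var/spool/postfix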
Re: Premature "No Space left on device" on XFS
-- From: "Bron Gondwana" Sent: Sunday, October 09, 2011 4:29 PM To: "Stan Hoeppner" Cc: Subject: Re: Premature "No Space left on device" on XFS On Sun, Oct 09, 2011 at 02:31:19PM -0500, Stan Hoeppner wrote: On 10/9/2011 8:36 AM, Bron Gondwana wrote: > How many people are running their mail servers on 24-32 SAS spindles > verses those running them on two spindles in RAID1? These results are for a maildir type workload, i.e. POP/IMAP, not a spool workload. I believe I already stated previously that XFS is not an optimal filesystem for a spool workload but would work well enough if setup properly. There's typically not enough spindles nor concurrency to take advantage of XFS' strengths on a spool workload. I'm honestly more interested in maildir type workload too, spool doesn't get enough traffic usually to care about IO. will postmark transaction test do? here - http://www.phoronix.com/scan.php?page=article&item=linux_2639_fs&num=1 stop arguing - I think postmark transaction was the only relevant test XFS was loosing badly - not anymore... search www.phoronix.com for other tests - there is one for every kernel version. - Vadim Grigoryan (sorry, getting a bit off topic for the postfix list) > Wow - just what I love doing. Building intimate knowledge of the > XFS allocation group architecture to run up a mail server. I'll > get right on it. As with anything you pick the right tool for the job. If your job requires the scalability of XFS you'd learn to use it. Apparently your workload doesn't. We went with lots of small filesystems to reduce single points of failure rather than one giant filesystem across all our spools. I'm still convinced that it's a better way to do it, despite people trying to convince me to throw all my eggs in one basket again. SANs are great they say, never had any problems, they say. > Sarcasm aside - if you ship with stupid-ass defaults, don't be > surprised if people say the product isn't a good choice for > regular users. I think you missed the point. No, not really. I'm not going to advise people to use something that requires a lot of tuning. > I tried XFS for our workload (RAID1 sets, massive set of unlinks once > per week when we do the weekly expunge cleanup) - and the unlinks were > just so nasty that we decided not to use it. I was really hoping for > btrfs to be ready for prime-time by now, but that's looking unlikely > to happen any time soon. Take another look at XFS. See below, specifically the unlink numbers in the 2nd linked doc. hmm... > Maybe my tuning fu was bad - but you know what, I did a bit of reading > and chose options that provided similar consistency guarantees to the > options we were currently using with reiserfs. Besides, 2.6.17 was > still recent memory at the time, and it didn't encourage me much. It was not your lack of tuning fu. XFS metadata write performance was abysmal before 2.6.35. For example deleting a kernel source tree took 10+ times longer than EXT3/4. Look at the performance since the delayed logging patch was introduced in 2.6.35. With a pure unlink workload it's now up to par with EXT4 performance up to 4 threads, and surpasses it by a factor of two or more at 8 threads and greater. XFS' greatest strength, parallelism, now covers unlink performance, where it was severely lacking for many years, both on IRIX and Linux. 
The design document: http://xfs.org/index.php/Improving_Metadata_Performance_By_Reducing_Journal_Overhead Thread discussing the performance gains: http://oss.sgi.com/archives/xfs/2010-05/msg00329.html My goodness. That's REALLY recent in filesystem times. Something that recent plus "all my eggs in one basket" of changing to a large multi-spindle filesystem that would really get the benefits of XFS would be more dangerous than I'm willing to consider. That's barely a year old. At least we're not still running Debian's 2.6.32 any more, but still. I'll run up some tests again some time, but I'm not thinking of switching soon. Bron.
Re: Premature "No Space left on device" on XFS
On Sun, Oct 09, 2011 at 02:31:19PM -0500, Stan Hoeppner wrote: > On 10/9/2011 8:36 AM, Bron Gondwana wrote: > > How many people are running their mail servers on 24-32 SAS spindles > > verses those running them on two spindles in RAID1? > > These results are for a maildir type workload, i.e. POP/IMAP, not a > spool workload. I believe I already stated previously that XFS is not > an optimal filesystem for a spool workload but would work well enough if > setup properly. There's typically not enough spindles nor concurrency > to take advantage of XFS' strengths on a spool workload. I'm honestly more interested in maildir type workload too, spool doesn't get enough traffic usually to care about IO. (sorry, getting a bit off topic for the postfix list) > > Wow - just what I love doing. Building intimate knowledge of the > > XFS allocation group architecture to run up a mail server. I'll > > get right on it. > > As with anything you pick the right tool for the job. If your job > requires the scalability of XFS you'd learn to use it. Apparently your > workload doesn't. We went with lots of small filesystems to reduce single points of failure rather than one giant filesystem across all our spools. I'm still convinced that it's a better way to do it, despite people trying to convince me to throw all my eggs in one basket again. SANs are great they say, never had any problems, they say. > > Sarcasm aside - if you ship with stupid-ass defaults, don't be > > surprised if people say the product isn't a good choice for > > regular users. > > I think you missed the point. No, not really. I'm not going to advise people to use something that requires a lot of tuning. > > I tried XFS for our workload (RAID1 sets, massive set of unlinks once > > per week when we do the weekly expunge cleanup) - and the unlinks were > > just so nasty that we decided not to use it. I was really hoping for > > btrfs to be ready for prime-time by now, but that's looking unlikely > > to happen any time soon. > > Take another look at XFS. See below, specifically the unlink numbers in > the 2nd linked doc. hmm... > > Maybe my tuning fu was bad - but you know what, I did a bit of reading > > and chose options that provided similar consistency guarantees to the > > options we were currently using with reiserfs. Besides, 2.6.17 was > > still recent memory at the time, and it didn't encourage me much. > > It was not your lack of tuning fu. XFS metadata write performance was > abysmal before 2.6.35. For example deleting a kernel source tree took > 10+ times longer than EXT3/4. Look at the performance since the delayed > logging patch was introduced in 2.6.35. With a pure unlink workload > it's now up to par with EXT4 performance up to 4 threads, and surpasses > it by a factor of two or more at 8 threads and greater. XFS' greatest > strength, parallelism, now covers unlink performance, where it was > severely lacking for many years, both on IRIX and Linux. > > The design document: > http://xfs.org/index.php/Improving_Metadata_Performance_By_Reducing_Journal_Overhead > > Thread discussing the performance gains: > http://oss.sgi.com/archives/xfs/2010-05/msg00329.html My goodness. That's REALLY recent in filesystem times. Something that recent plus "all my eggs in one basket" of changing to a large multi-spindle filesystem that would really get the benefits of XFS would be more dangerous than I'm willing to consider. That's barely a year old. At least we're not still running Debian's 2.6.32 any more, but still. 
I'll run up some tests again some time, but I'm not thinking of switching soon. Bron.
Re: Premature "No Space left on device" on XFS
On 10/9/2011 9:32 AM, Wietse Venema wrote:
> Stan Hoeppner:
>> On 10/8/2011 3:33 PM, Wietse Venema wrote:
>>> That's a lot of text. How about some hard numbers?
>>
>> Maybe not the perfect example, but here's one such high concurrency
>> synthetic mail server workload comparison showing XFS with a substantial
>> lead over everything but JFS, in which case the lead is much smaller:
>>
>> http://btrfs.boxacle.net/repository/raid/history/History_Mail_server_simulation._num_threads=128.html
>
> I see no write operations, no unlink operations, and no rename
> operations.

Apologies. I should have provided more links. The site isn't set up for easy navigation...

From the webroot of the site: http://btrfs.boxacle.net/

Mail Server (raid, single-disk)
Start with one million files spread across one thousand directories.
File sizes range from 1 kB to 1 MB.
Each thread creates a new file, reads an entire existing file, or deletes a file:
57% (4/7) reads, 29% (2/7) creates, 14% (1/7) deletes.
All reads and writes are done in 4 kB blocks.

> Comments on performance are welcome, but I prefer that they are
> based on first-hand experience, and preferably on configurations
> that are likely to be seen in the wild.

I would love to publish first hand experience. Unfortunately, to sufficiently demonstrate the gains I would need quite a few more spindles than I currently have available. With my current hardware the gains w/XFS are in the statistical noise range as I can't sustain enough parallelism at the spindles.

That said, there are plenty of mailbox servers in the wild that would benefit from the XFS + linear concat setup. It doesn't require an insane drive count, such as the 136 in the test system above, to demonstrate the gains, especially against EXT3/4 with RAID5/6 on the same set of disks. I think somewhere between 16-32 should do it, which is probably somewhat typical of mailbox storage servers at many sites.

Again, this setup is geared to parallel IMAP/POP and Postfix local delivery type performance, not spool performance. The discussion in this thread drifted at one point away from strictly the spool. I've been addressing the other part. Again, XFS isn't optimal for a typical Postfix spool. I never made that case. For XFS to yield an increase in spool performance would likely require an unrealistically high inbound mail flow rate and a high spindle count to sink the messages.

I'll work on getting access to suitable hardware so I can publish some thorough first hand head-to-head numbers, hopefully with a test harness that will use Postfix/SMTP and Dovecot/IMAP instead of a purely synthetic benchmark.

--
Stan
Re: Premature "No Space left on device" on XFS
On 10/9/2011 8:36 AM, Bron Gondwana wrote: >> http://btrfs.boxacle.net/repository/raid/history/History_Mail_server_simulation._num_threads=128.html > > Sorry - I don't see unlinks there. Maybe I'm not not reading very > carefully... Unfortunately the web isn't littered with a gazillion head-to-head filesystem scalability benchmark results using a spool or maildir type workload. And I've yet to see one covering multiple operating systems. If not for the creation of BTRFS we'd not have the limited set of results above, which to this point is the most comprehensive I've seen for anything resembling recent Linux kernel versions. > How many people are running their mail servers on 24-32 SAS spindles > verses those running them on two spindles in RAID1? These results are for a maildir type workload, i.e. POP/IMAP, not a spool workload. I believe I already stated previously that XFS is not an optimal filesystem for a spool workload but would work well enough if setup properly. There's typically not enough spindles nor concurrency to take advantage of XFS' strengths on a spool workload. > Wow - just what I love doing. Building intimate knowledge of the > XFS allocation group architecture to run up a mail server. I'll > get right on it. As with anything you pick the right tool for the job. If your job requires the scalability of XFS you'd learn to use it. Apparently your workload doesn't. > Sarcasm aside - if you ship with stupid-ass defaults, don't be > surprised if people say the product isn't a good choice for > regular users. I think you missed the point. > I tried XFS for our workload (RAID1 sets, massive set of unlinks once > per week when we do the weekly expunge cleanup) - and the unlinks were > just so nasty that we decided not to use it. I was really hoping for > btrfs to be ready for prime-time by now, but that's looking unlikely > to happen any time soon. Take another look at XFS. See below, specifically the unlink numbers in the 2nd linked doc. > Maybe my tuning fu was bad - but you know what, I did a bit of reading > and chose options that provided similar consistency guarantees to the > options we were currently using with reiserfs. Besides, 2.6.17 was > still recent memory at the time, and it didn't encourage me much. It was not your lack of tuning fu. XFS metadata write performance was abysmal before 2.6.35. For example deleting a kernel source tree took 10+ times longer than EXT3/4. Look at the performance since the delayed logging patch was introduced in 2.6.35. With a pure unlink workload it's now up to par with EXT4 performance up to 4 threads, and surpasses it by a factor of two or more at 8 threads and greater. XFS' greatest strength, parallelism, now covers unlink performance, where it was severely lacking for many years, both on IRIX and Linux. The design document: http://xfs.org/index.php/Improving_Metadata_Performance_By_Reducing_Journal_Overhead Thread discussing the performance gains: http://oss.sgi.com/archives/xfs/2010-05/msg00329.html -- Stan
Re: Premature "No Space left on device" on XFS
Stan Hoeppner: > On 10/8/2011 3:33 PM, Wietse Venema wrote: > > That's a lot of text. How about some hard numbers? > > Maybe not the perfect example, but here's one such high concurrency > synthetic mail server workload comparison showing XFS with a substantial > lead over everything but JFS, in which case the lead is much smaller: > > http://btrfs.boxacle.net/repository/raid/history/History_Mail_server_simulation._num_threads=128.html I see no write operations, no unlink operations, and no rename operations. Comments on performance are welcome, but I prefer that they are based on first-hand experience, and preferably on configurations that are likely to be seen in the wild. Wietse
Re: Premature "No Space left on device" on XFS
On Sun, Oct 09, 2011 at 03:56:39AM -0500, Stan Hoeppner wrote: > On 10/8/2011 3:33 PM, Wietse Venema wrote: > > That's a lot of text. How about some hard numbers? > > Maybe not the perfect example, but here's one such high concurrency > synthetic mail server workload comparison showing XFS with a substantial > lead over everything but JFS, in which case the lead is much smaller: > > http://btrfs.boxacle.net/repository/raid/history/History_Mail_server_simulation._num_threads=128.html Sorry - I don't see unlinks there. Maybe I'm not not reading very carefully... > If anyone has a relatively current (4 years) bare metal "lab" box with > say 24-32 locally attached SAS drives (the more the better) to which I > could get SSH KVM access, have pretty much free reign to destroy > anything on it and build a proper test rig, I'd be happy to do a bunch > of maildir type workload tests of the various Linux filesystems and > publish the results, focusing on getting the XFS+linear concat info into > public view. How many people are running their mail servers on 24-32 SAS spindles verses those running them on two spindles in RAID1? > If not, but if someone with sufficient hardware would like to do this > project him/herself, I'd be glad to assist getting the XFS+linear concat > configured correctly. Unfortunately it's not something one can setup > without already having a somewhat intimate knowledge of the XFS > allocation group architecture. Once performance data is out there, and > there is demand generated, I'll try to publish a how-to. Wow - just what I love doing. Building intimate knowledge of the XFS allocation group architecture to run up a mail server. I'll get right on it. Sarcasm aside - if you ship with stupid-ass defaults, don't be surprised if people say the product isn't a good choice for regular users. > Wietse has called me out on my assertion. The XFS allocation group > design properly combined with a linear concat dictates the performance > is greater for this workload, simply based on the IO math vs striped > RAID. All those who have stated they use it testify to the increased > performance. But no one has published competitive analysis yet. I'd > love to get such data published as it's a great solution and many could > benefit from it, at least Linux users anyway--XFS is only available on > Linux now that IRIX is dead... I tried XFS for our workload (RAID1 sets, massive set of unlinks once per week when we do the weekly expunge cleanup) - and the unlinks were just so nasty that we decided not to use it. I was really hoping for btrfs to be ready for prime-time by now, but that's looking unlikely to happen any time soon. Maybe my tuning fu was bad - but you know what, I did a bit of reading and chose options that provided similar consistency guarantees to the options we were currently using with reiserfs. Besides, 2.6.17 was still recent memory at the time, and it didn't encourage me much. Bron.
Re: Premature "No Space left on device" on XFS
On 10/8/2011 3:33 PM, Wietse Venema wrote: > Stan Hoeppner: > [ Charset ISO-8859-1 unsupported, converting... ] >> On 10/8/2011 5:17 AM, Wietse Venema wrote: >>> Stan Hoeppner: nicely. On the other hand, you won't see an EXTx filesystem capable of anywhere close to 10GB/s or greater file IO. Here XFS doesn't break a sweat. >>> >>> I recall that XFS was optimized for fast read/write with large >>> files, while email files are small, and have a comparatively high >>> metadata overhead (updating directories, inodes etc.). XFS is >>> probably not optimal here. >>> >>> Wietse >> >> >> With modern XFS this really depends on the specific workload and custom >> settings. Default XFS has always been very good with large file >> performance and has been optimized for such. It was historically >> hampered by write heavy metadata operations, but was sufficiently fast >> with metadata read operations, especially at high parallelism. The >> 'delaylog' code introduced in 2009 has mostly alleviated the metadata >> write performance issues. Delaylog is the default mode since Linux 2.6.39. >> >> XFS is not optimized by default for the OP's specific mail workload, but >> is almost infinitely tunable. The OP has been given multiple options on >> the XFS list to fix this problem. XFS is not unsuitable for this >> workload. The 10GB XFS filesystem created by the OP for this workload >> is not suitable. Doubling the FS size or tweaking the inode layout >> fixes the problem. >> >> As with most things, optimizing the defaults for some workloads may >> yield less than optimal performance with others. By default XFS is less >> than optimal for a high concurrency maildir workload. However with a >> proper storage stack architecture and XFS optimizations it handily >> outperforms all other filesystems. This would be the "XFS linear >> concatenation" setup I believe I've described here previously. >> >> XFS can do just about anything you want it to at any performance level >> you need. For the non default use cases, it simply requires knowledge, >> planning, tweaking, testing, and tweaking to get it there, not to >> mention time. Alas, the learning curve is very steep. > > That's a lot of text. How about some hard numbers? > > Wietse Maybe not the perfect example, but here's one such high concurrency synthetic mail server workload comparison showing XFS with a substantial lead over everything but JFS, in which case the lead is much smaller: http://btrfs.boxacle.net/repository/raid/history/History_Mail_server_simulation._num_threads=128.html I don't have access to this system so I'm unable to demonstrate the additional performance of an XFS+linear concat setup. The throughput would be considerably higher still. The 8-way LVM stripe over 17 drive RAID0 stripes would have caused hot and cold spots within the array spindles, as wide stripe arrays always do with small file random IOPS workloads. Using a properly configured XFS+linear concat in these tests would likely guarantee full concurrency on 128 of the 136 spindles. I say likely as I've not read the test code and don't know exactly how it behaves WRT directory parallelism. 
If anyone has a relatively current (4 years) bare metal "lab" box with say 24-32 locally attached SAS drives (the more the better) to which I could get SSH KVM access, have pretty much free reign to destroy anything on it and build a proper test rig, I'd be happy to do a bunch of maildir type workload tests of the various Linux filesystems and publish the results, focusing on getting the XFS+linear concat info into public view. If not, but if someone with sufficient hardware would like to do this project him/herself, I'd be glad to assist getting the XFS+linear concat configured correctly. Unfortunately it's not something one can setup without already having a somewhat intimate knowledge of the XFS allocation group architecture. Once performance data is out there, and there is demand generated, I'll try to publish a how-to. Wietse has called me out on my assertion. The XFS allocation group design properly combined with a linear concat dictates the performance is greater for this workload, simply based on the IO math vs striped RAID. All those who have stated they use it testify to the increased performance. But no one has published competitive analysis yet. I'd love to get such data published as it's a great solution and many could benefit from it, at least Linux users anyway--XFS is only available on Linux now that IRIX is dead... -- Stan
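For readers unfamiliar with the layout Stan keeps referring to, here is a very rough sketch of an XFS + linear concat build; device names and counts are hypothetical, and the per-workload allocation-group geometry is exactly the part he says needs real XFS knowledge, so treat this as an outline rather than a recipe:

    # four RAID1 pairs...
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda /dev/sdb
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
    mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sde /dev/sdf
    mdadm --create /dev/md4 --level=1 --raid-devices=2 /dev/sdg /dev/sdh

    # ...joined end-to-end (linear concat, no striping)
    mdadm --create /dev/md5 --level=linear --raid-devices=4 /dev/md1 /dev/md2 /dev/md3 /dev/md4

    # one allocation group per mirror pair, so concurrent maildir
    # directories land on different spindles
    mkfs.xfs -d agcount=4 /dev/md5
    mount -o noatime /dev/md5 /var/vmail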