Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-11 Thread Sven Hartge
Sven Hartge  wrote:

> I am currently in the planning stage for a "new and improved" mail
> system at my university.

OK, executive summary of the design ideas so far:

- deployment of X (starting with 4, but easily scalable) virtual servers
  on VMware ESX

- storage will be backed by an RDM on our iSCSI SAN.
  + main mailbox storage will be on 15k SAS6 600GB disks
  + backup rsnapshot storage will be on 7.2k SAS6 2TB disks

- XFS filesystem on LVM, allowing easy local snapshots for rsnapshot

- sharing folders from one user to another is not needed

- central public shared folders reside on their own storage server and are
  accessed through the imapc backend configured for the "#shared." namespace
  (needs Dovecot 2.1~rc3 or higher)

- mdbox with compression (23h lifetime, 50MB max size)

- quota in MySQL, allowing my MXes to check the quota for a user
  _before_ accepting any mail for him. This is a much-needed feature
  that is currently not possible and thus causes backscatter right now.
  (A dovecot.conf sketch for the mdbox and quota items follows the list.)

- backup strategy:
  + file-level backup with Bacula every 24 hours (120 days retention)
  + rsnapshot to node-local backup space for easier access (14 days
    retention)
  + possibly SAN-based remote snapshots to a different storage tier
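
For the mdbox, compression and quota items above, the relevant
dovecot.conf fragments might look roughly like this (a sketch against
Dovecot 2.1; the rotation and size limits mirror the summary, the SQL
dict file name is an assumption):

,
| # mdbox with zlib compression, rotating files at 50MB or after 23h
| mail_location = mdbox:~/mdbox
| mail_plugins = $mail_plugins zlib quota
| mdbox_rotate_size = 50M
| mdbox_rotate_interval = 23h
|
| plugin {
|   zlib_save = gz            # compress newly saved mails
|   zlib_save_level = 6
|   # quota kept in MySQL via dict, so the MXes can query the same table
|   quota = dict:User quota::proxy::quota
| }
|
| dict {
|   quota = mysql:/etc/dovecot/dovecot-dict-sql.conf.ext
| }
`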


Because sharing an RDM (or VMDK) between multiple VMs pins those VMs to
an ESX server and prevents HA and DRS in the ESX cluster, and because of
my bad experience with cluster filesystems, I want to avoid one and use
only local storage for the users' personal mailboxes.

Each user is fixed to one server; routing/redirecting of IMAP/POP3
connections happens via perdition (popmap feature via LDAP lookup) in a
frontend server (this component has already been working for some three
years).

So each node is isolated from the other nodes, knows only its users and
does not care about users on other nodes. This rules out the Dovecot
director, which only works if all nodes are able to access all
mailboxes (correct?).

I am aware this creates a SPoF for a 1/X portion of my users in the
case of a VM failure, but this is deemed acceptable, since the use of
VMs will allow me to quickly deploy a new one and reattach the RDM.
(And if my whole iSCSI storage or ESX cluster fails, I have other,
bigger problems than a non-functional mail system.)

Comments?

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-11 Thread Joseba Torre

On 09/01/12 14:50, Phil Turmel wrote:

I've been following this thread with great interest, but no advice to offer.
The content is entirely appropriate, and appreciated.  Don't be embarrassed
by your enthusiasm, Stan.


+1


Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Stan Hoeppner
On 1/9/2012 7:48 AM, Sven Hartge wrote:

> It seems my initial idea was not so bad after all ;) 

Yeah, but you didn't know how "not so bad" it really was until you had
me analyze it, flesh it out, and confirm it.  ;)

> Now I "just" need o
> built a little test setup, put some dummy users on it and see, if
> anything bad happens while accessing the shared folders and how the
> reaction of the system is, should the shared folder server be down.

It won't be down.  Because instead of using NFS you're going to use GFS2
for the shared folder LUN so each user accesses the shared folders
locally just as they do their mailbox.  Pat yourself on the back Sven,
you just eliminated a SPOF. ;)
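
A minimal sketch of that layout; the device name and mount point are
illustrative assumptions, not from the thread. Every backend node mounts
the shared-folder LUN itself and points a public namespace at it:

,
| # /etc/fstab on every Dovecot backend node (hypothetical device)
| /dev/mapper/shared-lun  /srv/shared  gfs2  noatime,_netdev  0  0
|
| # dovecot.conf: public namespace on the locally mounted GFS2 volume
| namespace {
|   type = public
|   separator = .
|   prefix = "#shared."
|   location = mdbox:/srv/shared
|   subscriptions = no
| }
`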

>> How many total cores per VMware node (all sockets)?
> 
> 8

Fairly beefy.  Dual socket quad core Xeons I'd guess.

> Here are the memory statistics at 14:30:
> 
>              total       used       free     shared    buffers     cached
> Mem:         12046      11199        847          0         88       7926
> -/+ buffers/cache:       3185       8861
> Swap:         5718         10       5707

That doesn't look too bad.  How many IMAP user connections at that time?
Is that a high average or low for that day?  The RAM numbers in
isolation only paint a partial picture...

> The SAN has plenty space. Over 70TiB at this time, with another 70TiB
> having just arrived and waiting to be connected.

140TB of 15k storage.  Wow, you're so underprivileged. ;)

> The iSCSI storage nodes (HP P4500) use 600GB SAS6 at 15k rpm with 12
> disks per node, configured in 2 RAID5 sets with 6 disks each.
> 
> But this is internal to each storage node, which are kind of a blackbox
> and have to be treated as such.

I cringe every time I hear 'black box'...

> The HP P4500 is a bit unique, since it does not consist of a head node
> with storage arrays connected to it, but of individual storage nodes
> forming a self balancing iSCSI cluster. (The nodes consist of DL320s G2.)

The 'black box' is Lefthand Networks SAN/iQ software stack.  I wasn't
that impressed with it when I read about it 8 or so years ago.  IIRC,
load balancing across cluster nodes is accomplished by resending host
packets from a receiving node to another node after performing special
sauce calculations regarding cluster load.  Hence the need, apparently,
for a full power, hot running, multi-core x86 CPU instead of an embedded
low power/wattage type CPU such as MIPS, PPC, i960 descended IOP3xx, or
even the Atom if they must stick with x86 binaries.  If this choice was
merely due to economy of scale of their server boards, they could have
gone with a single socket board instead of the dual, which would have
saved money.  So this choice of a dual socket Xeon board wasn't strictly
based on cost or ease of manufacture.

Many/most purpose built SAN arrays on the market don't use full power
x86 chips, but embedded RISC chips, to cut cost, power draw, and heat
generation.  These RISC chips are typically in-order designs, don't have
branch prediction or register renaming logic circuits and they have tiny
caches.  This is because block moving code handles streams of data and
doesn't typically branch nor have many conditionals.  For streaming
apps, data caches simply get in the way, although an instruction cache
is beneficial.  HP's choice of full power CPUs that have such features
suggests branching conditional code is used.  Which makes sense when
running algorithms that attempt to calculate the least busy node.

Thus, this 'least busy node' calculation and packet shipping adds
non-trivial latency to host SCSI IO command completion, compared to
traditional FC/iSCSI SAN arrays, or DAS, and thus has implications for
high IOPS workloads and especially those making heavy use of FSYNC, such
as SMTP and IMAP servers.  FSYNC performance may not be an issue if the
controller instantly acks FSYNC before data hits platter, but then you
may run into bigger problems as you have no guarantee data hit the disk.
Or, you may not run into perceptible performance issues at all given
the number of P4500s you have and the proportionally light IO load of
your 10K mail users.  Sheer horsepower alone may prove sufficient.

Just in case, it may prove beneficial to fire up ImapTest or some other
synthetic mail workload generator to see if array response times are
acceptable under heavy mail loads.
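
A sketch of such a run with Dovecot's ImapTest tool; host, credentials
and counts are placeholders, and the key=value parameters are the
common ones from the ImapTest documentation:

,
| # hammer one backend with 100 concurrent clients for 5 minutes;
| # user=testuser%d makes each client log in as a different test user
| imaptest host=10.0.0.1 port=143 user=testuser%d pass=testpass \
|   clients=100 secs=300 mbox=dovecot-crlf
`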

> So far, I had no performance or other problems with this setup and it
> scales quite nicely, as you "buy as you grow".

I'm glad the Lefthand units are working well for you so far.  Are you
hitting the arrays with any high random IOPS workloads as of yet?

> And again, price was also a factor: deploying an FC SAN would have cost
> us more than three times what the deployment of an iSCSI solution did,
> because the latter is "just" Ethernet, while the former would have
> needed a lot of totally new components.

I guess that depends on the features you need, such as PIT backups,
remote replication …

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Timo Sirainen  wrote:
> On 9.1.2012, at 22.13, Sven Hartge wrote:
>> Timo Sirainen  wrote:
>>> On 9.1.2012, at 21.45, Sven Hartge wrote:
 
>>>>> |   location = imapc:~/imapc-shared

>>>> What is the syntax of this location? What does "imapc-shared" do in
>>>> this case?
>> 
>>> It's the directory for index files. The backend IMAP server is used
>>> as a rather dummy storage, so if for example you do a FETCH 1:*
>>> BODYSTRUCTURE command, all of the message bodies are downloaded to
>>> the user's Dovecot server which parses them. But with indexes this
>>> is done only once (same as with any other mailbox format). If you
>>> want SEARCH BODY to be fast, you'd also need to use some kind of
>>> full text search indexes.
>> 
>>> If your users share the same UID (or 0666 mode would probably work
>>> too), you could share the index files rather than make them
>>> per-user.  Then you could use imapc:/shared/imapc or something.
 
>> Hmm. Yes, this is a fully virtual setup; every user's mail is owned by
>> the virtmail user. Does this sharing of index files have any security
>> or privacy issues?

> There are no privacy issues, at least currently, since there is no
> per-user data. If you had wanted per-user flags this wouldn't have
> worked.

OK. I think I will go with the per-user index files for now and pay the
extra in bandwidth and processing power needed.

All in all, of 10,000 users, only about 100 use shared folders.

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
On 9.1.2012, at 22.13, Sven Hartge wrote:

> Timo Sirainen  wrote:
>> On 9.1.2012, at 21.45, Sven Hartge wrote:
> 
>>>> |   location = imapc:~/imapc-shared
>>> 
>>> What is the syntax of this location? What does "imapc-shared" do in this
>>> case?
> 
>> It's the directory for index files. The backend IMAP server is used as
>> a rather dummy storage, so if for example you do a FETCH 1:*
>> BODYSTRUCTURE command, all of the message bodies are downloaded to the
>> user's Dovecot server which parses them. But with indexes this is done
>> only once (same as with any other mailbox format). If you want SEARCH
>> BODY to be fast, you'd also need to use some kind of full text search
>> indexes.
> 
> The bodies are downloaded but not stored, right? Just the index files
> are stored locally.

Right.

>> If your users share the same UID (or 0666 mode would probably work
>> too), you could share the index files rather than make them per-user.
>> Then you could use imapc:/shared/imapc or something.
> 
> Hmm. Yes, this is a fully virtual setup; every user's mail is owned by
> the virtmail user. Does this sharing of index files have any security or
> privacy issues?

There are no privacy issues, at least currently, since there is no per-user 
data. If you had wanted per-user flags this wouldn't have worked.

> Not every user sees every shared folder, so an information leak has to
> be avoided at all costs.

Oh, that reminds me, it doesn't actually work :) Because Dovecot deletes those 
directories it doesn't see on the remote server. You might be able to use 
imapc:~/imapc:INDEX=/shared/imapc though. The nice thing about shared imapc 
indexes is that each user doesn't have to re-index the messages.
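
An untested sketch of how that could look on the user Dovecots (paths
are illustrative): a per-user imapc location, with only the index
directory shared between all users as suggested above:

,
| namespace {
|   type = public
|   separator = .
|   prefix = "#shared."
|   location = imapc:~/imapc:INDEX=/shared/imapc
|   list = children
|   subscriptions = no
| }
`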

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Timo Sirainen  wrote:
> On 9.1.2012, at 21.45, Sven Hartge wrote:

>>> |   location = imapc:~/imapc-shared
>> 
>> What is the syntax of this location? What does "imapc-shared" do in this
>> case?

> It's the directory for index files. The backend IMAP server is used as
> a rather dummy storage, so if for example you do a FETCH 1:*
> BODYSTRUCTURE command, all of the message bodies are downloaded to the
> user's Dovecot server which parses them. But with indexes this is done
> only once (same as with any other mailbox format). If you want SEARCH
> BODY to be fast, you'd also need to use some kind of full text search
> indexes.

The bodies are downloaded but not stored, right? Just the index files
are stored locally.

> If your users share the same UID (or 0666 mode would probably work
> too), you could share the index files rather than make them per-user.
> Then you could use imapc:/shared/imapc or something.

Hmm. Yes, this is a fully virtual setup; every user's mail is owned by
the virtmail user. Does this sharing of index files have any security or
privacy issues?

Not every user sees every shared folder, so an information leak has to
be avoided at all costs.

> BTW. All message flags are shared between users. If you want per-user
> flags you'd need to modify the code.

No, I need shared message flags, as this is the reason we introduced
shared folders: so one user can see if a mail has already been read or
replied to.

>>> Right. Also in this Dovecot you want a regular namespace without prefix:
>> 
>>> namespace inbox {
>>> separator = /
>>> list = yes
>>> inbox = yes
>>> }
>> 
>>> You might as well use the proper separator here in case you ever change it 
>>> for users.
>> 
>> Is this separator converted to '.' on the frontend?

> Yes, as long as you explicitly specify the separator setting for the
> public namespace.

OK, good to know, one for my documentation with an '!' behind it.

Regards,
Sven

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
On 9.1.2012, at 21.45, Sven Hartge wrote:

>>> |   location = imapc:~/imapc-shared
> 
> What is the syntax of this location? What does "imapc-shared" do in this
> case?

It's the directory for index files. The backend IMAP server is used as a rather 
dummy storage, so if for example you do a FETCH 1:* BODYSTRUCTURE command, all 
of the message bodies are downloaded to the user's Dovecot server which parses 
them. But with indexes this is done only once (same as with any other mailbox 
format). If you want SEARCH BODY to be fast, you'd also need to use some kind 
of full text search indexes.

If your users share the same UID (or 0666 mode would probably work too), you 
could share the index files rather than make them per-user. Then you could use 
imapc:/shared/imapc or something.

BTW. All message flags are shared between users. If you want per-user flags 
you'd need to modify the code.

>> Right. Also in this Dovecot you want a regular namespace without prefix:
> 
>> namespace inbox {
>> separator = /
>> list = yes
>> inbox = yes
>> }
> 
>> You might as well use the proper separator here in case you ever change it 
>> for users.
> 
> Is this separator converted to '.' on the frontend?

Yes, as long as you explicitly specify the separator setting for the public
namespace.

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Timo Sirainen  wrote:
> On 9.1.2012, at 21.31, Sven Hartge wrote:

>> ,
>> | # User's private mail location
>> | mail_location = mdbox:~/mdbox
>> |
>> | # When creating any namespaces, you must also have a private namespace:
>> | namespace {
>> |   type = private
>> |   separator = .
>> |   prefix = INBOX.
>> |   #location defaults to mail_location.
>> |   inbox = yes
>> | }
>> |
>> | namespace {
>> |   type = public
>> |   separator = .
>> |   prefix = #shared.

> I'd probably just use "Shared." as prefix, since it is visible to
> users. Anyway if you want to use # you need to put the value in
> "quotes" or it's treated as comment.

I have to use "#shared.", because this is what Courier uses.
Unfortunately I have to stick to prefixes and seperators used currently.

>> |   location = imapc:~/imapc-shared

What is the syntax of this location? What does "imapc-shared" do in this
case?

>> |   subscriptions = no

> list = children here

>> | }
>> |
>> | imapc_host = m-st-sh-01.foo.bar
>> | imapc_password = master-user-password
>> | imapc_user = shareduser
>> | imapc_master_user = %u
>> `
>> 
>> Where do I add "list = children"? In the user Dovecots' shared namespace
>> or in the shared Dovecot's private namespace?

> The shared Dovecot always has mailboxes (at least INBOX), so list=children
> would equal list=yes.

OK, seems logical.

>> 
>>> 2. Configure the shared Dovecot:
>> 
>>> You need master passdb that allows all existing users to log in as 
>>> "shareduser" user. You can probably simply do (not tested):
>> 
>>> passdb {
>>> type = static
>>> args = user=shareduser pass=master-user-password
>>> master = yes
>>> }
>> 
>>> The "shareduser" owns all of the actual shared mailboxes and has the
>>> necessary ACLs set up for individual users. ACLs use the master
>>> username (= the real username in this case) to do the ACL checks.
>> 
>> So this is kind of "backwards", since normally the imapc_master_user would be
>> the static user and imapc_user would be dynamic, right?

> Right. Also in this Dovecot you want a regular namespace without prefix:

> namespace inbox {
>  separator = /
>  list = yes
>  inbox = yes
> }

> You might as well use the proper separator here in case you ever change it 
> for users.

Is this separator converted to '.' on the frontend? The department
supporting our users will give me hell if anything visible changes in
the layout of the folders for the end user.

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
On 9.1.2012, at 21.31, Sven Hartge wrote:

> ,
> | # User's private mail location
> | mail_location = mdbox:~/mdbox
> |
> | # When creating any namespaces, you must also have a private namespace:
> | namespace {
> |   type = private
> |   separator = .
> |   prefix = INBOX.
> |   #location defaults to mail_location.
> |   inbox = yes
> | }
> |
> | namespace {
> |   type = public
> |   separator = .
> |   prefix = #shared.

I'd probably just use "Shared." as prefix, since it is visible to users. Anyway
if you want to use # you need to put the value in "quotes" or it's treated as a
comment.

> |   location = imapc:~/imapc-shared
> |   subscriptions = no

list = children here

> | }
> |
> | imapc_host = m-st-sh-01.foo.bar
> | imapc_password = master-user-password
> | imapc_user = shareduser
> | imapc_master_user = %u
> `
> 
> Where do I add "list = children"? In the user Dovecots' shared namespace
> or in the shared Dovecot's private namespace?

The shared Dovecot always has mailboxes (at least INBOX), so list=children
would equal list=yes.

> 
>> 2. Configure the shared Dovecot:
> 
>> You need master passdb that allows all existing users to log in as 
>> "shareduser" user. You can probably simply do (not tested):
> 
>> passdb {
>> type = static
>> args = user=shareduser pass=master-user-password
>> master = yes
>> }
> 
>> The "shareduser" owns all of the actual shared mailboxes and has the
>> necessary ACLs set up for individual users. ACLs use the master
>> username (= the real username in this case) to do the ACL checks.
> 
> So this is kind of "backwards", since normally the imapc_master_user would be
> the static user and imapc_user would be dynamic, right?

Right. Also in this Dovecot you want a regular namespace without prefix:

namespace inbox {
  separator = /
  list = yes
  inbox = yes
}

You might as well use the proper separator here in case you ever change it for 
users.
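
Putting these corrections together, the public namespace on the user
Dovecots from Sven's sketch would become something like this (untested;
note the quoted prefix and the added list = children):

,
| namespace {
|   type = public
|   separator = .
|   prefix = "#shared."     # '#' must be quoted or it starts a comment
|   location = imapc:~/imapc-shared
|   list = children
|   subscriptions = no
| }
`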

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Timo Sirainen  wrote:
> On 9.1.2012, at 20.47, Sven Hartge wrote:

 Can "mmap_disable = yes" and the other NFS options be set per
 namespace or only globally?
>> 
>>> Currently only globally.
>> 
>> Ah, too bad.
>> 
>> Back to the drawing board then.

> mmap_disable=yes works pretty well even if you're only using it for local 
> filesystems. It just spends some more memory when reading dovecot.index.cache 
> files.

>> Implementing my idea in my environment using a cluster filesystem would
>> be a very big pain in the lower back, so I need a different idea to
>> share the shared folders with all nodes but still keeping the user
>> specific mailboxes fixed and local to a node.
>> 
>> The imapc backed namespace you mentioned sounds very interesting, but
>> this is not implemented right now for shared folders, is it?

> Well.. If you don't need users sharing mailboxes to each other,

Good heavens, no! If I allowed users to share their mailboxes with other
users, hell would break loose. Nononono, just shared folders set up by
the admin team, statically assigned to groups of users (for example, the
central postmaster@ mail alias ends up in such a shared folder).

> then you can probably already do this with Dovecot v2.1:

> 1. Configure the user Dovecots:

> namespace {
>  type = public
>  prefix = Shared/
>  location = imapc:~/imapc-shared
> }
> imapc_host = sharedmails.example.com
> imapc_password = master-user-password

> # With latest v2.1 hg you can do:
> imapc_user = shareduser
> imapc_master_user = %u
> # With v2.1.rc2 and older you need to do:
> imapc_user = shareduser*%u
> auth_master_user_separator = *

So, in my case, this would look like this:

,
| # User's private mail location
| mail_location = mdbox:~/mdbox
|
| # When creating any namespaces, you must also have a private namespace:
| namespace {
|   type = private
|   separator = .
|   prefix = INBOX.
|   #location defaults to mail_location.
|   inbox = yes
| }
|
| namespace {
|   type = public
|   separator = .
|   prefix = #shared.
|   location = imapc:~/imapc-shared
|   subscriptions = no
| }
|
| imapc_host = m-st-sh-01.foo.bar
| imapc_password = master-user-password
| imapc_user = shareduser
| imapc_master_user = %u
`

Where do I add "list = children"? In the user Dovecots' shared namespace
or in the shared Dovecot's private namespace?

> 2. Configure the shared Dovecot:

> You need master passdb that allows all existing users to log in as 
> "shareduser" user. You can probably simply do (not tested):

> passdb {
>  type = static
>  args = user=shareduser pass=master-user-password
>  master = yes
> }

> The "shareduser" owns all of the actual shared mailboxes and has the
> necessary ACLs set up for individual users. ACLs use the master
> username (= the real username in this case) to do the ACL checks.

So this is kind of "backwards", since normally the imapc_master_user would be
the static user and imapc_user would be dynamic, right?

All in all, a _very_ interesting configuration.

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
On 9.1.2012, at 21.16, Timo Sirainen wrote:

> passdb {
>  type = static
>  args = user=shareduser

Of course you should also require a password:

args = user=shareduser pass=master-user-password



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
On 9.1.2012, at 20.47, Sven Hartge wrote:

>>> Can "mmap_disable = yes" and the other NFS options be set per
>>> namespace or only globally?
> 
>> Currently only globally.
> 
> Ah, too bad.
> 
> Back to the drawing board then.

mmap_disable=yes works pretty well even if you're only using it for local 
filesystems. It just spends some more memory when reading dovecot.index.cache 
files.

> Implementing my idea in my environment using a cluster filesystem would
> be a very big pain in the lower back, so I need a different idea to
> share the shared folders with all nodes but still keeping the user
> specific mailboxes fixed and local to a node.
> 
> The imapc backed namespace you mentioned sounds very interesting, but
> this is not implemented right now for shared folders, is it?

Well.. If you don't need users sharing mailboxes to each other, then you can
probably already do this with Dovecot v2.1:

1. Configure the user Dovecots:

namespace {
  type = public
  prefix = Shared/
  location = imapc:~/imapc-shared
}
imapc_host = sharedmails.example.com
imapc_password = master-user-password

# With latest v2.1 hg you can do:
imapc_user = shareduser
imapc_master_user = %u
# With v2.1.rc2 and older you need to do:
imapc_user = shareduser*%u
auth_master_user_separator = *

2. Configure the shared Dovecot:

You need master passdb that allows all existing users to log in as "shareduser" 
user. You can probably simply do (not tested):

passdb {
  type = static
  args = user=shareduser
  master = yes
}

The "shareduser" owns all of the actual shared mailboxes and has the necessary 
ACLs set up for individual users. ACLs use the master username (= the real 
username in this case) to do the ACL checks.

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Timo Sirainen  wrote:
> On 9.1.2012, at 20.25, Sven Hartge wrote:
>> Timo Sirainen  wrote:
>>> On 8.1.2012, at 0.20, Sven Hartge wrote:
 
>>>> Right now, I am pondering using an additional server with just
>>>> the shared folders on it and using NFS (or a cluster FS) to mount
>>>> the shared folder filesystem on each backend storage server, so
>>>> each user has potential access to the shared folders' data.
>> 
>>> With NFS you'll run into problems with caching
>>> (http://wiki2.dovecot.org/NFS). Some cluster fs might work better.
>> 
>> Can "mmap_disable = yes" and the other NFS options be set per
>> namespace or only globally?

> Currently only globally.

Ah, too bad.

Back to the drawing board then.

Implementing my idea in my environment using a cluster filesystem would
be a very big pain in the lower back, so I need a different idea to
share the shared folders with all nodes but still keeping the user
specific mailboxes fixed and local to a node.

The imapc backed namespace you mentioned sounds very interesting, but
this is not implemented right now for shared folders, is it?

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
On 9.1.2012, at 20.25, Sven Hartge wrote:

> Timo Sirainen  wrote:
>> On 8.1.2012, at 0.20, Sven Hartge wrote:
> 
>>> Right now, I am pondering using an additional server with just
>>> the shared folders on it and using NFS (or a cluster FS) to mount the
>>> shared folder filesystem on each backend storage server, so each user
>>> has potential access to the shared folders' data.
> 
>> With NFS you'll run into problems with caching
>> (http://wiki2.dovecot.org/NFS). Some cluster fs might work better.
> 
> Can "mmap_disable = yes" and the other NFS options be set per namespace
> or only globally?

Currently only globally.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Timo Sirainen  wrote:
> On 8.1.2012, at 0.20, Sven Hartge wrote:

>> Right now, I am pondering using an additional server with just
>> the shared folders on it and using NFS (or a cluster FS) to mount the
>> shared folder filesystem on each backend storage server, so each user
>> has potential access to the shared folders' data.

> With NFS you'll run into problems with caching
> (http://wiki2.dovecot.org/NFS). Some cluster fs might work better.

Can "mmap_disable = yes" and the other NFS options be set per namespace
or only globally?

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
On 9.1.2012, at 17.14, Charles Marcus wrote:

> On 2012-01-09 9:51 AM, Timo Sirainen  wrote:
>> The "proper" solution for this that I've been thinking about would be
>> to use v2.1's imapc backend with master users. So that when user A
>> wants to access user B's shared folder, Dovecot connects to B's IMAP
>> server using master user login, and accesses the mailbox via IMAP.
>> Probably wouldn't be a big job to implement, mainly I'd need to
>> figure out how this should be configured.
> 
> Sounds interesting... would this be the new officially supported method for 
> sharing mailboxes in all cases? Or is this just for shared mailboxes on NFS 
> shares?

Well, it would be one officially supported way to do it. It would also help 
when using multiple UIDs.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Stan Hoeppner  wrote:

> The more I think about your planned architecture the more it reminds
> me of a "shared nothing" database cluster--even a relatively small one
> can outrun a well tuned mainframe, especially doing decision
> support/data mining workloads (TPC-H).

> As long as you're prepared for the extra administration, which you
> obviously are, this setup will yield better performance than the NFS
> setup I recommended.  Performance may not be quite as good as 4
> physical hosts with local storage, but you haven't mentioned the
> details of your SAN storage nor the current load on it, so obviously I
> can't say with any certainty.  If the controller currently has plenty
> of spare IOPS then the performance difference would be minimal.

This is the beauty of the HP P4500: every node is a controller, load is
automagically balanced between all nodes of a storage cluster. The more
nodes (up to ten) you add, the more performance you get.

So far, I have not been able to push our current SAN to its limits, even
with totally artificial benchmarks, so I am quite confident in its
performance for the given task.

But if everything fails and the performance is not good, I can still go
ahead and buy dedicated hardware for the mail system.

The only thing left is the NFS problem with caching Timo mentioned, but
since the accesses to a central public shared folder will be only a
minor portion of a client's accesses, I am hoping the impact will be
minimal. Only testing will tell.

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Stan Hoeppner
On 1/9/2012 8:08 AM, Sven Hartge wrote:
> Stan Hoeppner  wrote:

> The quota for students is 1GiB here. If I provide each of my 4 nodes
> with 500GiB of storage space, this gives me 2TiB now, which should be
> sufficient. If a node fills, I increase its storage space. Only if it
> fills too fast may I have to rebalance users.

That should work.

> And I never wanted to place the users based on their current size. I
> knew this was not going to work because of the reasons you mentioned.
> 
> I just want to hash their username and use this as a function to
> distribute the users, keeping it simple and stupid.

My apologies Sven.  I just re-read your first messages and you did
mention this method.

> Yes, I know. But right now, if I lose my one and only mail storage
> server, all users' mailboxes will be offline until I am either a) able
> to repair the server, b) move the disks to my identical backup system (or
> the backup system to the location of the failed one) or c) start the
> backup system and lose all mails not rsynced since the last rsync run.

True.  3/4 of users remaining online is much better than none. :)

> It is not easy designing a mail system without a SPoF which still
> performs under load.

And many other systems for that matter.

> For example, once upon a time I had a DRBD (active/passive) setup between the
> two storage systems. This would allow me to start my standby system
> without losing (nearly) any mail. But this was awfully slow and sluggish.

Eric Rostetter at University of Texas at Austin has reported good
performance with his twin Dovecot DRBD cluster.  Though in his case he's
doing active/active DRBD with GFS2 sitting on top, so there is no
failover needed.  DRBD is obviously not an option for your current needs.

>> 3.  You will consume more SAN volumes and LUNs.  Most arrays have a
>>    fixed number of each.  May or may not be an issue.
> 
> Not really an issue here. The SAN is exclusive for the VMware cluster,
> so most LUNs are quite big (1TiB to 2TiB) but there are not many of
> them.

I figured this wouldn't be a problem.  I'm just trying to be thorough,
mentioning anything I can think of that might be an issue.


The more I think about your planned architecture the more it reminds me
of a "shared nothing" database cluster--even a relatively small one can
outrun a well tuned mainframe, especially doing decision support/data
mining workloads (TPC-H).

As long as you're prepared for the extra administration, which you
obviously are, this setup will yield better performance than the NFS
setup I recommended.  Performance may not be quite as good as 4 physical
hosts with local storage, but you haven't mentioned the details of your
SAN storage nor the current load on it, so obviously I can't say with
any certainty.  If the controller currently has plenty of spare IOPS
then the performance difference would be minimal.  And using the SAN
allows automatic restart of a VM if a physical node dies.

As with Phil, I'm anxious to see how well it works in production.  When
you send an update please CC me directly as sometimes I don't read all
the list mail.

I hope my participation was helpful to you Sven, even if only to a small
degree.  Best of luck with the implementation.

-- 
Stan




Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Charles Marcus

On 2012-01-09 9:51 AM, Timo Sirainen  wrote:

The "proper" solution for this that I've been thinking about would be
to use v2.1's imapc backend with master users. So that when user A
wants to access user B's shared folder, Dovecot connects to B's IMAP
server using master user login, and accesses the mailbox via IMAP.
Probably wouldn't be a big job to implement, mainly I'd need to
figure out how this should be configured.


Sounds interesting... would this be the new officially supported method 
for sharing mailboxes in all cases? Or is this just for shared mailboxes 
on NFS shares?


It sounds like this might be a proper (fully supported without kludges) 
way to get what I had asked about before, with respect to expanding on 
the concept of Master users for sharing an entire account with one or 
more other users...


--

Best regards,

Charles


Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Timo Sirainen  wrote:
> On 8.1.2012, at 0.20, Sven Hartge wrote:

>> Right now, I am pondering using an additional server with just
>> the shared folders on it and using NFS (or a cluster FS) to mount the
>> shared folder filesystem on each backend storage server, so each user
>> has potential access to the shared folders' data.

> With NFS you'll run into problems with caching
> (http://wiki2.dovecot.org/NFS). Some cluster fs might work better.

> The "proper" solution for this that I've been thinking about would be
> to use v2.1's imapc backend with master users. So that when user A
> wants to access user B's shared folder, Dovecot connects to B's IMAP
> server using master user login, and accesses the mailbox via IMAP.
> Probably wouldn't be a big job to implement, mainly I'd need to figure
> out how this should be configured..

Luckily, in my case, User A does not access anything from User B, but
instead both User A and User B access the same public folder, which is
different from any folder of User A and User B.

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Timo Sirainen
Too much text in the rest of this thread so I haven't read it, but:

On 8.1.2012, at 0.20, Sven Hartge wrote:

> Right now, I am pondering using an additional server with just the
> shared folders on it and using NFS (or a cluster FS) to mount the shared
> folder filesystem on each backend storage server, so each user has
> potential access to the shared folders' data.

With NFS you'll run into problems with caching (http://wiki2.dovecot.org/NFS). 
Some cluster fs might work better.

The "proper" solution for this that I've been thinking about would be to use 
v2.1's imapc backend with master users. So that when user A wants to access 
user B's shared folder, Dovecot connects to B's IMAP server using master user 
login, and accesses the mailbox via IMAP. Probably wouldn't be a big job to 
implement, mainly I'd need to figure out how this should be configured..

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Stan Hoeppner  wrote:
> On 1/8/2012 2:15 PM, Sven Hartge wrote:

>> Wouldn't such a setup be the "Best of Both Worlds"? Having the main
>> traffic going to local disks (being RDMs) and also being able to provide
>> shared folders to every user who needs them without the need to move
>> those users onto one server?

> The only problems I can see at this time are:

> 1.  Some users will have much larger mailboxes than others.
>     Each year ~1/4 of your student population rotates, so if you
>     manually place existing mailboxes now based on current size
>     you have no idea who the big users are in the next freshman
>     class, or the next.  So you may have to do manual re-balancing
>     of mailboxes, maybe frequently.

The quota for students is 1GiB here. If I provide each of my 4 nodes
with 500GiB of storage space, this gives me 2TiB now, which should be
sufficient. If a node fills, I increase its storage space. Only if it
fills too fast may I have to rebalance users.

And I never wanted to place the users based on their current size. I
knew this was not going to work because of the reasons you mentioned.

I just want to hash their username and use this as a function to
distribute the users, keeping it simple and stupid.

> 2.  If you lose a Dovecot VM guest due to image file or other
>     corruption, or some other rare cause, you can't restart that guest,
>     but will have to build a new image from a template.  This could
>     cause either minor or significant downtime for ~1/4 of your mail
>     users w/4 nodes.  This is likely rare enough it's not worth
>     consideration.

Yes, I know. But right now, if I lose my one and only mail storage
server, all users' mailboxes will be offline until I am either a) able
to repair the server, b) move the disks to my identical backup system (or
the backup system to the location of the failed one) or c) start the
backup system and lose all mails not rsynced since the last rsync run.

It is not easy designing a mail system without a SPoF which still
performs under load.

For example, once upon a time I had a DRBD (active/passive) setup between
the two storage systems. This would allow me to start my standby system
without losing (nearly) any mail. But this was awfully slow and sluggish.

> 3.  You will consume more SAN volumes and LUNs.  Most arrays have a
>     fixed number of each.  May or may not be an issue.

Not really an issue here. The SAN is exclusive for the VMware cluster,
so most LUNs are quite big (1TiB to 2TiB) but there are not many of
them.

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Stan Hoeppner  wrote:
> On 1/8/2012 3:07 PM, Sven Hartge wrote:

>> Ah, I forgot: I _already_ have the mechanisms in place to statically
>> redirect/route accesses for users to different backends, since some
>> of the users are already redirected to a different mailsystem at
>> another location of my university.

> I assume you mean IMAP/POP connections, not SMTP.

Yes. perdition uses its popmap feature to redirect users of the other
location to the IMAP/POP servers there. So we only need one central
mailserver for the users to configure while we are able to physically
store their mails at different datacenters.

> I'm guessing no one else has interest in this thread, or maybe simply
> lost interest as the replies have been lengthy, and not wholly Dovecot
> related.  I accept some blame for that.

I will open a new thread with more concrete problems/questions after I
set up my test setup. This will be more technical and less philosophical,
I hope :)

Regards,
Sven

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Phil Turmel
On 01/09/2012 08:38 AM, Stan Hoeppner wrote:
> On 1/8/2012 3:07 PM, Sven Hartge wrote:

[...]

>> (Are my words making any sense? I got the feeling I'm writing German with
>> English words and nobody is really understanding anything ...)
> 
> You're making perfect sense, and frankly, if not for the .de TLD in your
> email address, I'd have thought you were an American.  Your written
> English is probably better than mine, and it's my only language.  To be
> fair to the Brits, I speak/write American English.  ;)

Concur.  My American ear is also perfectly happy.

> I'm guessing no one else has interest in this thread, or maybe simply
> lost interest as the replies have been lengthy, and not wholly Dovecot
> related.  I accept some blame for that.

I've been following this thread with great interest, but no advice to offer.
The content is entirely appropriate, and appreciated.  Don't be embarrassed
by your enthusiasm, Stan.

Sven, a follow-up report when you have it all working as desired would also
be appreciated (and appropriate).

Thanks,

Phil



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Sven Hartge
Stan Hoeppner  wrote:
> On 1/8/2012 9:39 AM, Sven Hartge wrote:

>> Memory size. I am a bit hesitant to deploy a VM with 16GB of RAM. My
>> cluster nodes each have 48GB, so no problem on this side though.

> Shouldn't be a problem if you're going to spread the load over 2 to 4
> cluster nodes.  16/2 = 8GB per VM, 16/4 = 4GB per Dovecot VM.  This,
> assuming you are able to evenly spread user load.

I think I will be able to do that. If I divide my users by using a
hash like MD5 or SHA1 over their username, this should give me an even
distribution.
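
A toy illustration of that placement function (not from the thread; the
real mapping would live in the LDAP popmap that perdition consults):

,
| #!/bin/sh
| # node = first two bits of md5(username), i.e. 4 roughly even buckets
| user="$1"
| first_hex=$(printf '%s' "$user" | md5sum | cut -c1)
| node=$(( 0x$first_hex / 4 ))   # 16 hex values -> buckets 0..3
| echo "$user -> backend node $node"
`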

>> So, this reads like my idea in the first place.
>> 
>> Only you place all the mails on the NFS server, whereas my idea was to
>> just share the shared folders from a central point and keep the normal
>> user dirs local to the different nodes, thus reducing network impact for
>> the way more common user access.

> To be quite honest, after thinking this through a bit, many traditional
> advantages of a single shared mail store start to disappear.  Whether
> you use NFS or a clusterFS, or 'local' disk (RDMs), all IO goes to the
> same array, so the traditional IO load balancing advantage disappears.
> The other main advantage, replacing a dead hardware node, simply mapping
> the LUNs to the new one and booting it up, also disappears due to
> VMware's unique abilities, including vmotion.  Efficient use of storage
> isn't an issue as you can just as easily slice off a small LUN to each
> of 2/4 Dovecot VMs as a larger one to the NFS VM.

Yes. Plus I can much more easily increase a LUNs size, if the need
arises.

> So the only disadvantages I see are with the 'local' disk RDM mailstore
> location: 'manual' connection/mailbox/size balancing, all increasing
> administrator burden.

Well, I don't see size balancing as a problem since I can increase the
size of the disk for a node very easy.

Load should be fairly even, if I distribute the 10,000 users across the
nodes. Even if there is a slight imbalance, the systems should have
enough power to smooth that out.  I could measure the load every user
creates and use that as a distribution key, but I believe this to be a
wee bit over-engineered for my scenario.

Initial placement of a new user will be automatic, during the activation
of the account, so no administrative burden there.

It seems my initial idea was not so bad after all ;) Now I "just" need to
build a little test setup, put some dummy users on it and see if
anything bad happens while accessing the shared folders and how the
system reacts should the shared folder server be down.

>> 2.3GHz for most VMware nodes.

> How many total cores per VMware node (all sockets)?

8

>> You got the numbers wrong. And I got a word wrong ;)
>> 
>> Should have read "900GB _of_ 1300GB used".

> My bad.  I misunderstood.

Here are the memory statistics at 14:30:

             total       used       free     shared    buffers     cached
Mem:         12046      11199        847          0         88       7926
-/+ buffers/cache:       3185       8861
Swap:         5718         10       5707

>> So not much wiggle room left.

> And that one is retiring anyway as you state below.  So do you have
> plenty of space on your VMware SAN arrays?  If not can you add disks
> or do you need another array chassis?

The SAN has plenty space. Over 70TiB at this time, with another 70TiB
having just arrived and waiting to be connected.

>> This is a Transtec Provigo 610. This is a 24 disk enclosure, 12 disks
>> with 150GB (7.2k rpm) each for the main mail storage in RAID6 and
>> another 10 disks with 150GB (5.4k rpm) for a backup LUN. I daily
>> rsnapshot my /home onto this local backup (20 days of retention),
>> because it is easier to restore from than firing up Bacula, which has
>> the long retention time of 90 days. But most users need a restore of
>> mails from $yesterday or $the_day_before.

> And your current iSCSI SAN array(s) backing the VMware farm?  Total
> disks?  Is it monolithic, or do you have multiple array chassis from
> one or multiple vendors?

The iSCSI storage nodes (HP P4500) use 600GB SAS6 at 15k rpm with 12
disks per node, configured in 2 RAID5 sets with 6 disks each.

But this is internal to each storage node, which are kind of a blackbox
and have to be treated as such.

The HP P4500 is a bit unique, since it does not consist of a head node
with storage arrays connected to it, but of individual storage nodes
forming a self balancing iSCSI cluster. (The nodes consist of DL320s G2.)

So far, I had no performance or other problems with this setup and it
scales quite nicely, as you "buy as you grow".

And again, price was also a factor: deploying an FC SAN would have cost
us more than three times what the deployment of an iSCSI solution did,
because the latter is "just" Ethernet, while the former would have
needed a lot of totally new components.

>> Well, it was either Parallel-SCSI or FC back then, as far as I can
>> remember. The price difference …

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Stan Hoeppner
On 1/8/2012 3:07 PM, Sven Hartge wrote:

> Ah, I forgot: I _already_ have the mechanisms in place to statically
> redirect/route accesses for users to different backends, since some of
> the users are already redirected to a different mailsystem at another
> location of my university.

I assume you mean IMAP/POP connections, not SMTP.

> So using this mechanism to also redirect/route users internal to _my_
> location is no big deal.
> 
> This is what got me into the idea of several independent backend
> storages without the need to share the _whole_ storage, but just the
> shared folders for some users.
> 
> (Are my words making any sense? I got the feeling I'm writing German with
> English words and nobody is really understanding anything ...)

You're making perfect sense, and frankly, if not for the .de TLD in your
email address, I'd have thought you were an American.  Your written
English is probably better than mine, and it's my only language.  To be
fair to the Brits, I speak/write American English.  ;)

I'm guessing no one else has interest in this thread, or maybe simply
lost interest as the replies have been lengthy, and not wholly Dovecot
related.  I accept some blame for that.

-- 
Stan


Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Stan Hoeppner
On 1/8/2012 2:15 PM, Sven Hartge wrote:

> Wouldn't such a setup be the "Best of Both Worlds"? Having the main
> traffic going to local disks (being RDMs) and also being able to provide
> shared folders to every user who needs them without the need to move
> those users onto one server?

The only problems I can see at this time are:

1.  Some users will have much larger mailboxes than others.
    Each year ~1/4 of your student population rotates, so if you
    manually place existing mailboxes now based on current size
    you have no idea who the big users are in the next freshman
    class, or the next.  So you may have to do manual re-balancing
    of mailboxes, maybe frequently.

2.  If you lose a Dovecot VM guest due to image file or other
    corruption, or some other rare cause, you can't restart that guest,
    but will have to build a new image from a template.  This could
    cause either minor or significant downtime for ~1/4 of your mail
    users w/4 nodes.  This is likely rare enough it's not worth
    consideration.

3.  You will consume more SAN volumes and LUNs.  Most arrays have a
    fixed number of each.  May or may not be an issue.

-- 
Stan


Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-09 Thread Stan Hoeppner
On 1/8/2012 9:39 AM, Sven Hartge wrote:

> Memory size. I am a bit hesitant to deploy a VM with 16GB of RAM. My
> cluster nodes each have 48GB, so no problem on this side though.

Shouldn't be a problem if you're going to spread the load over 2 to 4
cluster nodes.  16/2 = 8GB per VM, 16/4 = 4GB per Dovecot VM.  This,
assuming you are able to evenly spread user load.

> And our VMware SAN is iSCSI based, so no way to plug a FC-based storage
> into it.

There are standalone FC-iSCSI bridges but they're marketed to bridge FC
SAN islands over an IP WAN.  Director class SAN switches can connect
anything to anything, just buy the cards you need.  Both of these are
rather pricey.  These wouldn't make sense in your environment.  I'm just
pointing out that it can be done.

> So, this reads like my idea in the first place.
> 
> Only you place all the mails on the NFS server, whereas my idea was to
> just share the shared folders from a central point and keep the normal
> user dirs local to the different nodes, thus reducing network impact for
> the way more common user access.

To be quite honest, after thinking this through a bit, many traditional
advantages of a single shared mail store start to disappear.  Whether
you use NFS or a clusterFS, or 'local' disk (RDMs), all IO goes to the
same array, so the traditional IO load balancing advantage disappears.
The other main advantage, replacing a dead hardware node, simply mapping
the LUNs to the new one and booting it up, also disappears due to
VMware's unique abilities, including vmotion.  Efficient use of storage
isn't an issue as you can just as easily slice off a small LUN to each
of 2/4 Dovecot VMs as a larger one to the NFS VM.

So the only disadvantages I see are with the 'local' disk RDM mailstore
location: 'manual' connection/mailbox/size balancing, all increasing
administrator burden.

> 2.3GHz for most VMware nodes.

How many total cores per VMware node (all sockets)?

> You got the numbers wrong. And I got a word wrong ;)
> 
> Should have read "900GB _of_ 1300GB used".

My bad.  I misunderstood.

> So not much wiggle room left.

And that one is retiring anyway as you state below.  So do you have
plenty of space on your VMware SAN arrays?  If not can you add disks or
do you need another array chassis?

> But modifications to our systems are made, which allow me to
> temp-disable a user, convert and move his mailbox and re-enable him,
> which allows me to move them one at a time from the old system to the
> new one, without losing a mail or disrupting service too long or too often.

As it should be.

> This is a Transtec Provigo 610. This is a 24 disk enclosure, 12 disks
> with 150GB (7.2k rpm) each for the main mail storage in RAID6 and another
> 10 disks with 150GB (5.4k rpm) for a backup LUN. I daily rsnapshot my
> /home onto this local backup (20 days of retention), because it is
> easier to restore from than firing up Bacula, which has the long
> retention time of 90 days. But most users need a restore of mails from
> $yesterday or $the_day_before.

And your current iSCSI SAN array(s) backing the VMware farm?  Total
disks?  Is it monolithic, or do you have multiple array chassis from one
or multiple vendors?

> Well, it was either Parallel-SCSI or FC back then, as far as I can
> remember. The price difference between the U320 version and the FC one
> was not so big and I wanted to avoid having to route those big SCSI-U320
> cables through my racks.

Can't blame you there.  I take it you hadn't built the iSCSI SAN yet at
that point?

> See above: not 1500GB disks, but 150GB ones. RAID6, because I wanted the
> double security. I have been kind of burned by the previous system and I
> tend to get nervous when thinking about data loss in my mail storage,
> because I know my users _will_ give me hell if that happens.

And as it turns out RAID10 wouldn't have provided you enough bytes.

> I never used mbox as an admin. The box before the box before this one
> uses uw-imapd with mbox and I experienced the system as a user and it
> was horriffic. Most users back then never heard of IMAP folders and just
> stored their mails inside of INBOX, which of course got huge. If one of
> those users with a big mbox then deleted mails, it would literally lock
> the box up for everyone, as uw-imapd was copying (for example) a 600MB
> mbox file around to delete one mail.

Yeah, ouch.  IMAP with mbox works pretty well when users are marginally
smart about organizing their mail, or a POP then delete setup.  I'd bet
if that was maildir in that era on that box it would have slowed things
way down as well.  Especially if the filesystem was XFS, which had
horrible, abysmal really, unlink performance until 2.6.35 (2009).

> Of course, this was mostly because of the crappy uw-imapd and secondly
> because of some poor design choices in the server itself (underpowered RAID
> controller, too small a cache and a RAID5 setup, low RAM in the server).

That's a recipe for disaster.

> So the first thing we did ba

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-08 Thread Sven Hartge
Sven Hartge  wrote:
> Sven Hartge  wrote:
>> Stan Hoeppner  wrote:

>>> If an individual VMware node don't have sufficient RAM you could build a
>>> VM based Dovecot cluster, run these two VMs on separate nodes, and thin
>>> out the other VMs allowed to run on these nodes.  Since you can't
>>> directly share XFS, build a tiny Debian NFS server VM and map the XFS
>>> LUN to it, export the filesystem to the two Dovecot VMs.  You could
>>> install the Dovecot director on this NFS server VM as well.  Converting
>>> from maildir to mdbox should help eliminate the NFS locking problems.  I
>>> would do the conversion before migrating to this VM setup with NFS.

>>> Also, run the NFS server VM on the same physical node as one of the
>>> Dovecot servers.  The NFS traffic will be a memory-memory copy instead
>>> of going over the GbE wire, decreasing IO latency and increasing
>>> performance for that Dovecot server.  If it's possible to have Dovecot
>>> director or your fav load balancer weight more connections to one
>>> Deovecot node, funnel 10-15% more connections to this one.  (I'm no
>>> director guru, in fact haven't use it yet).

>> So, this reads like my idea in the first place.

>> Only you place all the mails on the NFS server, whereas my idea was to
>> just share the shared folders from a central point and keep the normal
>> user dirs local to the different nodes, thus reducing network impact for
>> the way more common user access.

> To be a bit more concrete on this one:

> a) X backend servers which my frontend (being perdition or dovecot
>   director) redirects users to, fixed, no random redirects.

>   I might start with 4 backend servers, but I can easily scale them,
>   either vertically by adding more RAM or vCPUs or horizontally by
>   adding more VMs and reshuffling some mailboxes during the night.

>  Why 4 and not 2? If I'm going to build a cluster, I already have to do
>  the work to implement this and with 4 backends, I can distribute the
>  load even further without much additional administrative overhead.
>  But the load impact on each node gets lower with more nodes, if I am
>  able to evenly spread my users across those nodes (like md5'ing the
>  username and using the first 2 bits from that to determine which
>  node the user resides on).

Ah, I forgot: I _already_ have the mechanisms in place to statically
redirect/route accesses for users to different backends, since some of
the users are already redirected to a different mailsystem at another
location of my university.

So using this mechanism to also redirect/route users internal to _my_
location is no big deal.

This is what got me into the idea of several independent backend
storages without the need to share the _whole_ storage, but just the
shared folders for some users.

(Are my words making any sense? I got the feeling I'm writing German with
English words and nobody is really understanding anything ...)

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-08 Thread Sven Hartge
Sven Hartge  wrote:
> Stan Hoeppner  wrote:

>> If an individual VMware node doesn't have sufficient RAM you could build
>> a VM based Dovecot cluster, run these two VMs on separate nodes, and thin
>> out the other VMs allowed to run on these nodes.  Since you can't
>> directly share XFS, build a tiny Debian NFS server VM and map the XFS
>> LUN to it, export the filesystem to the two Dovecot VMs.  You could
>> install the Dovecot director on this NFS server VM as well.  Converting
>> from maildir to mdbox should help eliminate the NFS locking problems.  I
>> would do the conversion before migrating to this VM setup with NFS.

>> Also, run the NFS server VM on the same physical node as one of the
>> Dovecot servers.  The NFS traffic will be a memory-memory copy instead
>> of going over the GbE wire, decreasing IO latency and increasing
>> performance for that Dovecot server.  If it's possible to have Dovecot
>> director or your fav load balancer weight more connections to one
>> Dovecot node, funnel 10-15% more connections to this one.  (I'm no
>> director guru, in fact haven't used it yet).

> So, this reads like my idea in the first place.

> Only you place all the mails on the NFS server, whereas my idea was to
> just share the shared folders from a central point and keep the normal
> user dirs local to the different nodes, thus reducing network impact for
> the way more common user access.

To be a bit more concrete on this one:

a) X backend servers which my frontend (being perdition or dovecot
   director) redirects users to, fixed, no random redirects.

   I might start with 4 backend servers, but I can easily scale them,
   either vertically by adding more RAM or vCPUs or horizontally by
   adding more VMs and reshuffling some mailboxes during the night.

  Why 4 and not 2? If I'm going to build a cluster, I already have to do
  the work to implement this and with 4 backends, I can distribute the
  load even further without much additional administrative overhead.
  But the load impact on each node gets lower with more nodes, if I am
  able to evenly spread my users across those nodes (like md5'ing the
  username and using the first 2 bits from that to determine which
  node the user resides on).

b) 1 backend server for the public shared mailboxes, exporting them via
   NFS to the user backend servers

Configuration like this, from http://wiki2.dovecot.org/SharedMailboxes/Public

,
| # User's private mail location
| mail_location = mdbox:~/mdbox
|
| # When creating any namespaces, you must also have a private namespace:
| namespace {
|   type = private
|   separator = .
|   prefix = INBOX.
|   #location defaults to mail_location.
|   inbox = yes
| }
|
| namespace {
|   type = public
|   separator = .
|   prefix = #shared.
|   location = mdbox:/srv/shared/
|   subscriptions = no
| }
`

With /srv/shared being the NFS mountpoint from my central public shared
mailbox server.
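
On each user backend the mount would be nothing more than this (host
name and mount options are just an example, not a tested config):

  # /etc/fstab on every user backend
  sharedmail.example.edu:/srv/shared  /srv/shared  nfs  hard,intr,noatime  0  0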

This setup would keep the amount of data transferred via NFS small (only
a tiny fraction of my 10,000 users have access to a shared folder, mostly
users in the IT team or in the administration of the university).

Wouldn't such a setup be the "Best of Both Worlds", with the main traffic
going to local disks (RDMs) while still being able to provide shared
folders to every user who needs them, without having to move those users
onto one server?

Grüße,
Sven.

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-08 Thread Sven Hartge
Stan Hoeppner  wrote:
> On 1/7/2012 7:55 PM, Sven Hartge wrote:
>> Stan Hoeppner  wrote:
>> 
>>> It's highly likely your problems can be solved without the drastic
>>> architecture change, and new problems it will introduce, that you
>>> describe below.
>> 
>> The main reason is I need to replace the hardware as its service
>> contract ends this year and I am not able to extend it further.
>> 
>> The box so far is fine, there are normally no problems during normal
>> operations with speed or responsiveness towards the end-user.
>> 
>> Sometimes, higher peak loads tend to strain the system a bit and this is
>> starting to occur more often.
> ...
>> First thought was to move this setup into our VMware cluster (yeah, I
>> know, spare me the screams), since the hardware used there is way more
>> powerful than the hardware used now and I wouldn't have to buy new
>> servers for my mail system (which is kind of painful to do in a
>> university environment, especially in Germany, if you want to invest
>> more than a certain amount of money).

> What's wrong with moving it onto VMware?  This actually seems like a
> smart move given your description of the node hardware.  It also gives
> you much greater backup flexibility with VCB (or whatever they call it
> today).  You can snapshot the LUN over the SAN during off-peak hours to
> a backup server and do the actual backup to the library at your leisure.
> Forgive me if the software names have changed as I've not used VMware
> since ESX3 back in '07.

VCB as it was back in the day is dead. But yes, one of the reasons to
use a VM was to be able to easily back up the whole shebang.

>> But then I thought about the problems with VMs this size and got to the
>> idea with the distributed setup, splitting the one server into 4 or 6
>> backend servers.

> Not sure what you mean by "VMs this size".  Do you mean memory
> requirements or filesystem size?  If the nodes have enough RAM that's no
> issue.

Memory size. I am a bit hesitant to deploy a VM with 16GB of RAM. My
cluster nodes each have 48GB, so no problem on this side though.

> And surely you're not thinking of using a .vmdk for the mailbox
> storage.  You'd use an RDM SAN LUN.

No, I was not planning to use a VMDK backed disk for this.

> In fact you should be able to map in the existing XFS storage LUN and
> use it as is.  Assuming it's not going into retirement as well.

It is going to be retired as well, as it is as old as the server.

It is not connected to any SAN either, only locally attached to the
backend server.

And our VMware SAN is iSCSI based, so there is no way to plug FC-based
storage into it.

> If an individual VMware node doesn't have sufficient RAM you could build
> a VM based Dovecot cluster, run these two VMs on separate nodes, and thin
> out the other VMs allowed to run on these nodes.  Since you can't
> directly share XFS, build a tiny Debian NFS server VM and map the XFS
> LUN to it, export the filesystem to the two Dovecot VMs.  You could
> install the Dovecot director on this NFS server VM as well.  Converting
> from maildir to mdbox should help eliminate the NFS locking problems.  I
> would do the conversion before migrating to this VM setup with NFS.

> Also, run the NFS server VM on the same physical node as one of the
> Dovecot servers.  The NFS traffic will be a memory-memory copy instead
> of going over the GbE wire, decreasing IO latency and increasing
> performance for that Dovecot server.  If it's possible to have Dovecot
> director or your fav load balancer weight more connections to one
> Dovecot node, funnel 10-15% more connections to this one.  (I'm no
> director guru, in fact haven't used it yet).

So, this reads like my idea in the first place.

Only you place all the mails on the NFS server, whereas my idea was to
just share the shared folders from a central point and keep the normal
user dirs local to the different nodes, thus reducing network impact for
the way more common user access.

> Assuming the CPUs in the VMware cluster nodes are clocked a decent
> amount higher than 1.8GHz I wouldn't monkey with configuring virtual SMP
> for these two VMs, as they'll be IO-bound, not CPU-bound.

2.3GHz for most VMware nodes.

>>>> Ideas? Suggestions? Nudges in the right direction?
>> 
>>> Yes.  We need more real information.  Please provide:
>> 
>>> 1.  Mailbox count, total maildir file count and size
>> 
>> about 10,000 Maildir++ boxes
>> 
>> 900GB for 1300GB used, "df -i" says 11 million inodes used

> Converting to mdbox will take a large burden off your storage, as you've
> seen.  With ~1.3TB consumed of ~15TB you should have plenty of space to
> convert to mdbox while avoiding filesystem fragmentation.

You got the numbers wrong. And I got a word wrong ;)

Should have read "900GB _of_ 1300GB used".

I am using 900GB of 1300GB. The disks are SATA 1.5Gbit/s (not SATA 3 or
SATA 6Gbit/s), as in data transfer rate. Each disk is 150GB in size, so the
maximum storage size of my underlying VG is 150

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-08 Thread Stan Hoeppner
On 1/7/2012 7:55 PM, Sven Hartge wrote:
> Stan Hoeppner  wrote:
> 
>> It's highly likely your problems can be solved without the drastic
>> architecture change, and new problems it will introduce, that you
>> describe below.
> 
> The main reason is I need to replace the hardware as its service
> contract ends this year and I am not able to extend it further.
> 
> The box so far is fine, there are normally no problems during normal
> operations with speed or responsiveness towards the end-user.
> 
> Sometimes, higher peak loads tend to strain the system a bit and this is
> starting to occur more often.
...
> First thought was to move this setup into our VMware cluster (yeah, I
> know, spare me the screams), since the hardware used there is way more
> powerful than the hardware used now and I wouldn't have to buy new
> servers for my mail system (which is kind of painful to do in a
> university environment, especially in Germany, if you want to invest
> more than a certain amount of money).

What's wrong with moving it onto VMware?  This actually seems like a
smart move given your description of the node hardware.  It also gives
you much greater backup flexibility with VCB (or whatever they call it
today).  You can snapshot the LUN over the SAN during off-peak hours to
a backup server and do the actual backup to the library at your leisure.
 Forgive me if the software names have changed as I've not used VMware
since ESX3 back in '07.

> But then I thought about the problems with VMs this size and got to the
> idea with the distributed setup, splitting the one server into 4 or 6
> backend servers.

Not sure what you mean by "VMs this size".  Do you mean memory
requirements or filesystem size?  If the nodes have enough RAM that's no
issue.  And surely you're not thinking of using a .vmdk for the mailbox
storage.  You'd use an RDM SAN LUN.  In fact you should be able to map
in the existing XFS storage LUN and use it as is.  Assuming it's not
going into retirement as well.

If an individual VMware node doesn't have sufficient RAM you could build
a VM based Dovecot cluster, run these two VMs on separate nodes, and thin
out the other VMs allowed to run on these nodes.  Since you can't
directly share XFS, build a tiny Debian NFS server VM and map the XFS
LUN to it, export the filesystem to the two Dovecot VMs.  You could
install the Dovecot director on this NFS server VM as well.  Converting
from maildir to mdbox should help eliminate the NFS locking problems.  I
would do the conversion before migrating to this VM setup with NFS.

Also, run the NFS server VM on the same physical node as one of the
Dovecot servers.  The NFS traffic will be a memory-memory copy instead
of going over the GbE wire, decreasing IO latency and increasing
performance for that Dovecot server.  If it's possible to have Dovecot
director or your fav load balancer weight more connections to one
Dovecot node, funnel 10-15% more connections to this one.  (I'm no
director guru, in fact haven't used it yet).

Assuming the CPUs in the VMware cluster nodes are clocked a decent
amount higher than 1.8GHz I wouldn't monkey with configuring virtual SMP
for these two VMs, as they'll be IO-bound, not CPU-bound.

> As I said: "idea". Other ideas making my life easier are more than
> welcome.

I hope my suggestions contribute to doing so. :)

>>> Ideas? Suggestions? Nudges in the right direction?
> 
>> Yes.  We need more real information.  Please provide:
> 
>> 1.  Mailbox count, total maildir file count and size
> 
> about 10,000 Maildir++ boxes
> 
> 900GB for 1300GB used, "df -i" says 11 million inodes used

Converting to mdbox will take a large burden off your storage, as you've
seen.  With ~1.3TB consumed of ~15TB you should have plenty of space to
convert to mdbox while avoiding filesystem fragmentation.  With maildir
you likely didn't see heavy fragmentation due to small file sizes.  With
mdbox, especially at 50MB, you'll likely start seeing more
fragmentation.  Use this to periodically check the fragmentation level:

$ xfs_db -r -c frag [device]
e.g.
$ xfs_db -r -c frag /dev/sda7
actual 76109, ideal 75422, fragmentation factor 0.90%

I'd recommend running xfs_fsr when the frag factor exceeds ~20-30%.  The
XFS developers recommend against running xfs_fsr too often as it can
actually increase free space fragmentation while it decreases file
fragmentation, especially on filesystems that are relatively full.
Having heavily fragmented free space is worse than having fragmented
files, as newly created files will automatically be fragged.
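
When you do run it, a bounded pass keeps it from churning all day (-t
limits the runtime in seconds; the device is just the example from above):

$ xfs_fsr -t 7200 /dev/sda7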

> I know, this is very _tiny_ compared to the systems ISPs are using.

Not everyone is an ISP, including me. :)

>> 2.  Average/peak concurrent user connections
> 
> IMAP: Average 800 concurrent user connections, peaking at about 1400.
> POP3: Average 300 concurrent user connections, peaking at about 600.
> 
>> 3.  CPU type/speed/total core count, total RAM, free RAM (incl buffers)
> 
> Currently dual-core AMD Opteron 

Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-07 Thread Sven Hartge
Stan Hoeppner  wrote:

> It's highly likely your problems can be solved without the drastic
> architecture change, and new problems it will introduce, that you
> describe below.

The main reason is I need to replace the hardware as its service
contract ends this year and I am not able to extend it further.

The box so far is fine, there are normally no problems during normal
operations with speed or responsiveness towards the end-user.

Sometimes, higher peak loads tend to strain the system a bit and this is
starting to occur more often.

First thought was to move this setup into our VMware cluster (yeah, I
know, spare me the screams), since the hardware used there is way more
powerful than the hardware used now and I wouldn't have to buy new
servers for my mail system (which is kind of painful to do in a
university environment, especially in Germany, if you want to invest
more than a certain amount of money).

But then I thought about the problems with VMs this size and got to the
idea with the distributed setup, splitting the one server into 4 or 6
backend servers.

As I said: "idea". Other ideas making my life easier are more than
welcome.

>> Ideas? Suggestions? Nudges in the right direction?

> Yes.  We need more real information.  Please provide:

> 1.  Mailbox count, total maildir file count and size

about 10,000 Maildir++ boxes

900GB for 1300GB used, "df -i" says 11 million inodes used

I know, this is very _tiny_ compared to the systems ISPs are using.

> 2.  Average/peak concurrent user connections

IMAP: Average 800 concurrent user connections, peaking at about 1400.
POP3: Average 300 concurrent user connections, peaking at about 600.

> 3.  CPU type/speed/total core count, total RAM, free RAM (incl buffers)

Currently dual-core AMD Opteron 2210, 1.8GHz.

Right now, in the middle of the night (2:30 AM here) on a Sunday, thus a
low point in the usage pattern:

             total       used       free     shared    buffers     cached
Mem:      12335820    9720252    2615568          0      53112     680424
-/+ buffers/cache:    8986716    3349104
Swap:      5855676      10916    5844760


The system reaches its 7th year this summer, which is the end of its
service contract.

> 4.  Storage configuration--total spindles, RAID level, hard or soft RAID

RAID 6 with 12 SATA1.5 disks, external 4Gbit FC 

Back in 2005, a SAS enclosure was way too expensive for us to afford.

> 5.  Filesystem type

XFS on LVM to allow snapshots for backup

I of course aligned the partitions on the RAID correctly and of course
created the filesystem with the correct parameters wrt. spindles, chunk
size, etc.

> 6.  Backup software/method

Full backup with Bacula, taking about 24 hours right now. Because of
this, I switched to virtual full backups, only ever doing incremental
and differential backups off of the real system and creating synthetic
full backups inside Bacula. Works fine though, incrementals taking 2
hours, differentials about 4 hours.

The main problem of the backup time is Maildir++. During a test, I
copied the mail storage to a spare box, converted it to mdbox (50MB
file size) and the backup was lightning fast compared to the Maildir++
format.
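
For the record, with dovecot 2.x that conversion boils down to roughly
this per user (the username is a placeholder; the 50MB file size comes
from mdbox_rotate_size in dovecot.conf):

  # on the target box, in dovecot.conf: mdbox_rotate_size = 50M
  $ dsync -u jdoe mirror mdbox:~/mdbox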

Additionally, compressing the mails inside the mdbox and not having
Bacula compress them for me reduces the backup time further (and speeds
up access through IMAP and POP3).
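
In dovecot 2.x terms this would be the zlib plugin, roughly like so (the
compression level is just an example):

  mail_plugins = $mail_plugins zlib
  plugin {
    zlib_save = gz
    zlib_save_level = 6
  }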

So this is the way to go, I think, regardless of which way I implement
the backend mail server.

> 7.  Operating system

Debian Linux Lenny, currently with kernel 2.6.39

> Instead of telling us what you think the solution to your unidentified
> bottleneck is and then asking "yea or nay", tell us what the problem is
> and allow us to recommend solutions.

I am not asking for "yea or nay", I just pointed out my idea, but I am
open to other suggestions.

If the general idea is to buy a new big single storage system, I am more
than happy to do just that, because it will prevent any problems I might
have with a distributed one before they can even occur.

Maybe two HP DL180s (one for production and one as a test/standby system)
with a SAS-attached enclosure for storage?

Keeping in mind that the new system has to work for some time (again 5 to
7 years), I have to be able to extend the storage space without too much
hassle.

Grüße,
S°

-- 
Sigmentation fault. Core dumped.



Re: [Dovecot] Providing shared folders with multiple backend servers

2012-01-07 Thread Stan Hoeppner
On 1/7/2012 4:20 PM, Sven Hartge wrote:
> Hi *,
> 
> I am currently in the planning stage for a "new and improved" mail
> system at my university.
> 
> Right now, everything is on one big backend server but this is causing
> me increasing amounts of pain, beginning with the time a full backup
> takes.

You failed to mention your analysis and diagnosis identifying the source
of the slow backup, and the other issues you alluded to but didn't mention
specifically.  You also didn't mention how you're doing this full backup
(tar, IMAP; D2D or tape), where the backup bottleneck is, what mailbox
storage format you're using, total mailbox count and filesystem space
occupied.  What is your disk storage configuration?  Direct attach?
Hardware or software RAID?  What RAID level?  How many disks?  SAS or SATA?

It's highly likely your problems can be solved without the drastic
architecture change, and new problems it will introduce, that you
describe below.

> So naturally, I want to split this big server into smaller ones.

Naturally?  Many OPs spend significant x/y/z resources trying to avoid
the "shared nothing" storage backend setup below.

> To keep things simple, I want to pin a user to a server so I can avoid
> things like NFS or cluster-aware filesystems. The mapping for each
> account is then inserted into the LDAP object for each user and the
> frontend proxy (perdition at the moment) then uses this information to
> route each access to the correct backend storage server running dovecot.

Splitting the IMAP workload like this isn't keeping things simple, but
increases complexity, on many levels.  And there's nothing wrong with
NFS and cluster filesystems if they are used correctly.

> So far this has been working nicely with my test setup.
> 
> But: I also have to provide shared folders for users. Thankfully users
> don't have the right to share their own folders, which makes things
> easier (I hope).
> 
> Right now, the setup works like this, using Courier:
> 
>  - complete virtual mail setup
>  - global shared folders configured in /etc/courier/shared/index
>  - inside /home/shared-folder-name/Maildir/courierimapacl specific users
>    get access to a folder
>  - each folder a user has access to is mapped to the namespace #shared
>    like #shared.shared-folder-name
> 
> Now, if I split my backend storage server into multiple ones and user-A
> is on server-1 and user-B is on server-2, but both need to access the
> same shared folder, I have a problem.

Yes, you do.

> I could of course move all users needing access to a shared folder to
> the same server, but in the end, this will be a nightmare for me,
> because I foresee having to move users around on a daily basis.

See my comments above.

> Right now, I am pondering using an additional server with just the
> shared folders on it and using NFS (or a cluster FS) to mount the shared
> folder filesystem on each backend storage server, so each user has
> potential access to a shared folder's data.

So you're going to implement a special case of what you're desperately
trying to avoid?  This makes no sense.

> Ideas? Suggestions? Nudges in the right direction?

Yes.  We need more real information.  Please provide:

1.  Mailbox count, total maildir file count and size
2.  Average/peak concurrent user connections
3.  CPU type/speed/total core count, total RAM, free RAM (incl buffers)
4.  Storage configuration--total spindles, RAID level, hard or soft RAID
5.  Filesystem type
6.  Backup software/method
7.  Operating system

Instead of telling us what you think the solution to your unidentified
bottleneck is and then asking "yea or nay", tell us what the problem is
and allow us to recommend solutions.  This way you'll get some education
and multiple solutions that may very well be a better fit, will perform
better, and possibly cost less in capital outlay and administration
time/effort.

-- 
Stan


[Dovecot] Providing shared folders with multiple backend servers

2012-01-07 Thread Sven Hartge
Hi *,

I am currently in the planning stage for a "new and improved" mail
system at my university.

Right now, everything is on one big backend server but this is causing
me increasing amounts of pain, beginning with the time a full backup
takes.

So naturally, I want to split this big server into smaller ones.

To keep things simple, I want to pin a user to a server so I can avoid
things like NFS or cluster-aware filesystems. The mapping for each
account is then inserted into the LDAP object for each user and the
frontend proxy (perdition at the moment) then uses this information to
route each access to the correct backend storage server running dovecot.
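
The lookup itself is nothing fancy; think of something along these lines
(attribute name and base DN are made up for this example):

  $ ldapsearch -x -LLL -b ou=people,dc=example,dc=edu '(uid=jdoe)' mailHost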

So far this has been working nicely with my test setup.

But: I also have to provide shared folders for users. Thankfully users
don't have the right to share their own folders, which makes things
easier (I hope).

Right now, the setup works like this, using Courier:

 - complete virtual mail setup
 - global shared folders configured in /etc/courier/shared/index
 - inside /home/shared-folder-name/Maildir/courierimapacl specific users
   get access to a folder
 - each folder a user has access to is mapped to the namespace #shared
   like #shared.shared-folder-name
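
For reference, such a courierimapacl file holds one identifier/rights
pair per line; from memory (so double-check against the Courier docs)
an entry granting a user lookup/read access looks like:

  user=jdoe lr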

Now, if I split my backend storage server into multiple ones and user-A
is on server-1 and user-B is on server-2, but both need to access the
same shared folder, I have a problem.

I could of course move all users needing access to a shared folder to
the same server, but in the end, this will be a nightmare for me,
because I foresee having to move users around on a daily basis.

Right now, I am pondering using an additional server with just the
shared folders on it and using NFS (or a cluster FS) to mount the shared
folder filesystem on each backend storage server, so each user has
potential access to a shared folder's data.



Ideas? Suggestions? Nudges in the right direction?


Grüße,
Sven.

-- 
Sigmentation fault. Core dumped.