On Feb 6, 2007, at 6:05 PM, Heywood, Todd wrote:

I know this is an OpenMPI list, but does anyone know how common or uncommon LDAP-based clusters are? I would have thought this issue would have arisen elsewhere, but Googling MPI+LDAP (and similar) doesn't turn up much.

FWIW, when I was back at Indiana University, we had a similar issue with a 128-node cluster -- starting parallel jobs would overwhelm the central slapd servers and logins would start failing.

IIRC, the admins tried a variety of things that didn't end up working or were too complicated to maintain in the long term. So they ended up replicating /etc/shadow and /etc/passwd from LDAP every X hours (24, I think?) so that all authentications on the cluster were local. Then they simply disallowed changing user information on the cluster (password, shell, etc.) and said "if you want to change information, change it elsewhere and it will sync to the cluster within X hours".

Not an optimal solution, but it was the one they opted for because all things being equal, I think it was the simplest.
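
For what it's worth, here is a rough sketch of what a periodic sync like that could look like. This is not what the IU admins actually ran; it only illustrates the idea. It assumes the LDAP entries have already been fetched as plain dicts with the usual posixAccount attributes (uid, uidNumber, gidNumber, gecos, homeDirectory, loginShell). The function names and the min_uid cutoff of 1000 are made up for illustration, and /etc/shadow would need the same treatment with tighter permissions.

```python
import os
import tempfile


def passwd_line(entry):
    """Format one posixAccount-style dict as an /etc/passwd line."""
    fields = [
        entry["uid"],
        "x",  # the password hash lives in the shadow file, not here
        str(entry["uidNumber"]),
        str(entry["gidNumber"]),
        entry.get("gecos", ""),
        entry["homeDirectory"],
        entry.get("loginShell", "/bin/sh"),
    ]
    for field in fields:
        if ":" in field or "\n" in field:
            raise ValueError("illegal character in field: %r" % field)
    return ":".join(fields)


def build_passwd(local_lines, ldap_entries, min_uid=1000):
    """Keep local system accounts (UID < min_uid), then append LDAP users."""
    kept = [l for l in local_lines if int(l.split(":")[2]) < min_uid]
    users = [e for e in ldap_entries if int(e["uidNumber"]) >= min_uid]
    users.sort(key=lambda e: int(e["uidNumber"]))
    return kept + [passwd_line(e) for e in users]


def write_atomically(path, lines, mode=0o644):
    """Write to a temp file in the same directory, then rename over path,
    so a login never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write("\n".join(lines) + "\n")
        os.chmod(tmp, mode)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```

Something like cron would run this every X hours on each node (or once on the head node, then push the files out). Writing to a scratch path first and inspecting the result before pointing it at the real /etc/passwd seems prudent.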

This is all from quite a while ago, so I might not have the details exactly correct.

I don't know much about LDAP, but if proxying / caching LDAP servers exist, it might help considerably (e.g., put a caching proxy on the cluster head node that can respond quickly to hundreds of simultaneous LDAP requests from across the cluster instead of having the cluster nodes all talk to a central LDAP server). I don't know if that even makes sense (caching LDAP queries), but it was just a thought...
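
To make the caching idea a bit more concrete, here is a toy sketch of a time-limited lookup cache. It is not an LDAP server or proxy, and the TTLCache class and its fetch callback are invented for illustration. It only shows the principle: the head node answers repeated lookups from memory and goes back to the central server at most once per key per time window.

```python
import threading
import time


class TTLCache:
    """Cache lookup results for ttl_seconds before asking upstream again."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable that queries the central server
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._lock = threading.Lock()
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        # The lock is held during the upstream fetch, so a burst of
        # simultaneous requests for one key causes a single upstream query.
        # The price is that upstream fetches are serialized.
        with self._lock:
            now = self._clock()
            hit = self._entries.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]
            value = self._fetch(key)
            self._entries[key] = (now + self._ttl, value)
            return value
```

The trade-off is the same as with the file sync above: a changed password or shell is not seen on the cluster until the cached entry expires.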

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
