Re: 6.1 and NFS
Bill Moran writes:

> Have you tried contacting the Foundation?
> http://www.freebsdfoundation.org/
>
> It's my understanding that they coordinate most of this money ->
> developers stuff ...

I think I explored that route. It's been a month or so now, but if
memory serves me well, that was not a viable option. I don't recall
the details, but I think someone told me they were not set up to find
someone, or something along those lines.

One of the Core developers offered to put me in contact with one or
more people who did this type of work, but after a few days with no
response I sent a follow-up message and never heard back.

We ended up giving up on NFS and re-architecting what we were doing so
as not to use NFS. :-(

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: 6.1 and NFS
In response to Francisco Reyes <[EMAIL PROTECTED]>:

> Kris Kennaway writes:
>
> > There are a number of PRs I filed, but those aren't all of the
> > problems. It will require fairly major work to fix - the best hope
> > would be if someone was funded to work on it.
>
> A couple of months back the place I work for had a number of issues with
> NFS. We tried to find someone to work with us and we were offering to pay.
> After weeks searching I was unable to find someone. A few weeks later we got
> in touch with Mohan Srinivasan who graciously spent time during his vacation
> to help us.
>
> Although I believe our problems were in a good deal related to our own
> network quality the state of the NFS server seems to need some considerable
> work. Also we found a couple of additional bugs with the client which made
> things even worse.
>
> So.. if there is someone who is willing to work on NFS.. as a contract there
> needs to be a way for companies willing to fund it to get in touch with such
> person(s). Perhaps there could be a list/forum where people familiar with
> internals such as NFS, can post their availability and willingness to do
> contract work so companies willing to fund development in a particular area
> can get in touch with the right people.

Have you tried contacting the Foundation?
http://www.freebsdfoundation.org/

It's my understanding that they coordinate most of this money ->
developers stuff ...

--
Bill Moran
Collaborative Fusion Inc.
Re: 6.1 and NFS
Kris Kennaway writes:

> There are a number of PRs I filed, but those aren't all of the
> problems. It will require fairly major work to fix - the best hope
> would be if someone was funded to work on it.

A couple of months back the place I work for had a number of issues
with NFS. We tried to find someone to work with us and we were offering
to pay. After weeks of searching I was unable to find someone. A few
weeks later we got in touch with Mohan Srinivasan, who graciously spent
time during his vacation to help us.

Although I believe our problems were in good part related to our own
network quality, the NFS server seems to need considerable work. We
also found a couple of additional bugs in the client which made things
even worse.

So, if there is someone who is willing to work on NFS as a contract,
there needs to be a way for companies willing to fund it to get in
touch with such person(s). Perhaps there could be a list/forum where
people familiar with internals such as NFS can post their availability
and willingness to do contract work, so companies willing to fund
development in a particular area can get in touch with the right
people.
Re: 6.1 and NFS
On Thu, Sep 21, 2006 at 04:12:03PM -0700, Perry Hutchison wrote:
>
> File locking works reasonably well within a single "system" (defined
> as a combination of hardware and software that all crashes together
> :) I doubt anyone will ever get it to work all that well when the
> locks must be shared across a larger entity.

I do not yet have extensive real-world experience with it, but from
what I have learnt and seen myself, DLM (Distributed Lock Manager),
which comes with GFS v6.1 (Global File System), does quite a nice job.
However, it is not NFS and requires quite a bit more pocket money to
set up, as a SAN is required (unless you use iSCSI served off a spare
box...).

So building a reliable lock manager over multiple systems can be done.

Kurt
Re: 6.1 and NFS
Hi,

> That's interesting. Are you getting a "could not lock the passwd
> file: EOPNOTSUPP" failure with rpc.lockd not enabled?

Negative. I rebuilt the kernel on one box today, commented out
rpc.lockd="YES" in /etc/rc.conf, rebooted into single-user mode and
remounted / rw. I then ran adduser:

    [some output deleted]
    Home     : /home/koekoek
    Shell    : /bin/sh
    Locked   : no
    OK? (yes/no): pw: group update: Operation not supported
    pwd_mkdb: flock: Operation not supported
    pw: user 'koekoek' disappeared during update
    adduser: ERROR: There was an error adding user (koekoek).

and uploaded the kdump to
http://members.iphouse.com/robertj/ktrace.adduser.no.rpc.lockd.txt

Same story with vipw:

    vipw: could not lock the passwd file: : Operation not supported

(http://members.iphouse.com/robertj/ktrace.vipw.no.rpc.lockd.txt).

After that, I rebooted once again into normal multi-user mode with
rpc.lockd running and took ktraces for vipw and adduser, for the
record:

http://members.iphouse.com/robertj/ktrace.vipw.rpc.lockd.txt
http://members.iphouse.com/robertj/ktrace.adduser.rpc.lockd.txt

The pxe box runs 5.4-RELEASE-p11, being served by a 4.11-RELEASE-p19
box. Please feel free to contact me if you would like me to do or test
something else.

Regards,
Robert

PS: koekoek is a bird, Dutch for Cuculus canorus :-)
Re: 6.1 and NFS
Hi,

> >I observed running a pxe client running fbsd 5.[45] being served by
> >nfs-box running 5 (and 4
> >nowadays because of asr0 trouble due to geom) having disabled
> >rpc.lockd
> >the box doesn't let me run adduser, but with rpc.lockd enabled it's
> >fine
> >with 'em. Is that strange or am I missing (some) insight about this
> >matter?
>
> That's interesting. Are you getting a "could not lock the passwd
> file: EOPNOTSUPP" failure with rpc.lockd not enabled?

I never witnessed any message. After I press enter (adduser), nothing
appears on the screen at all. But that was a long time ago and I'm
currently unable to experiment. I hope I will be able to tomorrow. I
assume it's a good idea to run a debug kernel?

I also recall there were some occasions where running adduser once
went OK, but the second (and later) runs in a row just locked up.
These cases were never reproducible; however, I had the idea that
something had called some rpc.lockd procedure before. In all these
cases, the box ran the standard sendmail enabled. Maybe on these
occasions the box took a 'lock' before, and all subsequent flock()
calls stalled. Something like that. I'm not sure, but I presume that
at that time I ran a 6.0-rel pxe client served by a 5.3-rel nfs
server; I really forget though and didn't take any notes either.

Regards,
Robert
Re: 6.1 and NFS
On Sep 22, 2006, at 11:02 AM, Robert Joosten wrote:

>>> Hmmm, is there a way to run pxe-boxes without rpc.lockd and then
>>> still able to run adduser and so on ?
>>
>> Safely? No. But then, flock() doesn't work via NFS even if
>> rpc.lockd is running, so you aren't any worse off.
>
> flock() .. hmm yeah, I discovered trouble with sendmail as well, it
> rings my bell. At least I know where to look for digging in the code
> finding clues about why. You say flock() doesn't work with rpc.lockd
> running.

At least at one point, flock() used against an NFS-mounted filesystem
would simply return as if the call was successful, but no locking was
done. Whether rpc.lockd is running or not would have no impact.

> I observed running a pxe client running fbsd 5.[45] being served by
> nfs-box running 5 (and 4 nowadays because of asr0 trouble due to
> geom) having disabled rpc.lockd the box doesn't let me run adduser,
> but with rpc.lockd enabled it's fine with 'em. Is that strange or am
> I missing (some) insight about this matter?

That's interesting. Are you getting a "could not lock the passwd
file: EOPNOTSUPP" failure with rpc.lockd not enabled?

I suspect that the pw_lock() code in libutil ought to use O_EXLOCK in
the open() call rather than calling flock() separately:

    [EOPNOTSUPP]  O_SHLOCK or O_EXLOCK is specified but the underlying
                  file system does not support locking.

...?

--
-Chuck
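
[Editorial sketch] The O_EXLOCK suggestion above could look roughly
like the following. This is only an illustration under stated
assumptions, not the libutil code: the helper name open_locked() is
hypothetical, and the fallback branch exists only because O_EXLOCK is
a BSD extension that some platforms do not define. Whether either
branch really locks anything on an NFS mount depends on the client, as
discussed in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libutil): open `path` read-only and
 * take an exclusive advisory lock without blocking.
 * Returns the locked descriptor, or -1 with errno set.
 */
int
open_locked(const char *path)
{
	int fd, saved;

#ifdef O_EXLOCK
	/* BSD-style: the lock is requested as part of open() itself. */
	fd = open(path, O_RDONLY | O_EXLOCK | O_NONBLOCK);
	if (fd < 0)
		return (-1);	/* e.g. EWOULDBLOCK (busy) or EOPNOTSUPP */
#else
	/* Fallback where O_EXLOCK is not defined: open, then flock(). */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
		saved = errno;
		close(fd);
		errno = saved;
		return (-1);
	}
#endif
	/* Keep the descriptor from leaking into child processes. */
	if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
		saved = errno;
		close(fd);
		errno = saved;
		return (-1);
	}
	return (fd);
}
```

A caller would treat errno == EWOULDBLOCK as "file is busy" and
errno == EOPNOTSUPP as "this filesystem cannot lock", which is the
distinction the question above is after.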
Re: 6.1 and NFS
Hi,

> >Hmmm, is there a way to run pxe-boxes without rpc.lockd and then still
> >able to run adduser and so on ?
>
> Safely? No. But then, flock() doesn't work via NFS even if
> rpc.lockd is running, so you aren't any worse off.

flock() .. hmm yeah, I discovered trouble with sendmail as well, it
rings my bell. At least I know where to look for digging in the code
finding clues about why. You say flock() doesn't work with rpc.lockd
running.

I observed running a pxe client running fbsd 5.[45] being served by
nfs-box running 5 (and 4 nowadays because of asr0 trouble due to geom)
having disabled rpc.lockd the box doesn't let me run adduser, but with
rpc.lockd enabled it's fine with 'em. Is that strange or am I missing
(some) insight about this matter?

> However, I believe that some systems have actually re-implemented the
> BSD flock() call in terms of calling the POSIX lockf(), which would
> attempt to use rpc.lockd and thus have some chance of working over
> NFS. I believe this was done in Linux by Andy Walker and for MacOS X
> by Justin Walker (odd naming coincidence, there), IIRC; perhaps some
> of these changes have made their way back to the other BSDs.

Interesting observation. Thx for your reply!

Regards,
Robert
Re: 6.1 and NFS
>> rpc.lockd remains unreliable; avoid using it if practical.

statd and lockd have been problematic ever since Sun invented them a
couple of decades ago, at least partly because what they are trying to
do is fundamentally not computable. (There is no way to distinguish
between the other side having crashed, and a temporary network
communication problem that has not yet been resolved but will be
eventually. At best, you find out about the other side's crash after
it has rebooted, which could be hours or days later if the crash was
caused by a hardware failure.)

The best solution is to avoid locking files over NFS. For example:

* As pointed out in Mark Crispin's article, use IMAP (or POP) instead
  of having the mailserver export the spool directory.

* ssh to the server to do things like adduser, rather than trying to
  run it on the client.

File locking works reasonably well within a single "system" (defined
as a combination of hardware and software that all crashes together :)
I doubt anyone will ever get it to work all that well when the locks
must be shared across a larger entity.
Re: 6.1 and NFS
On Sep 21, 2006, at 2:43 PM, Robert Joosten wrote:

>> rpc.lockd remains unreliable; avoid using it if practical.
>
> Hmmm, is there a way to run pxe-boxes without rpc.lockd and then
> still able to run adduser and so on ?

Safely? No. But then, flock() doesn't work via NFS even if rpc.lockd
is running, so you aren't any worse off. Details follow:

adduser invokes pw underneath, and pw should share the same password
locking convention that vipw uses to avoid simultaneous/conflicting
updates to the password files. Both pw and vipw use the pw_lock()
routine from src/lib/libutil:

    pw_lock(void)
    {
        if (*masterpasswd == '\0')
            return (-1);

        /*
         * If the master password file doesn't exist, the system is hosed.
         * Might as well try to build one.  Set the close-on-exec bit so
         * that users can't get at the encrypted passwords while editing.
         * Open should allow flock'ing the file; see 4.4BSD.    XXX
         */
        for (;;) {
            struct stat st;

            lockfd = open(masterpasswd, O_RDONLY, 0);
            if (lockfd < 0 || fcntl(lockfd, F_SETFD, 1) == -1)
                err(1, "%s", masterpasswd);
            /* XXX vulnerable to race conditions */
            if (flock(lockfd, LOCK_EX|LOCK_NB) == -1) {
                if (errno == EWOULDBLOCK) {
                    errx(1, "the password db file is busy");
                } else {
                    err(1, "could not lock the passwd file: ");
                }
            }
    [ ... ]

Note the "XXX"es. And, as Mark said in the section I quoted in my
previous email on this thread:

    flock() always returns as if it succeeded on NFS files, when in
    fact it is a no-op. There is no way around this.

However, I believe that some systems have actually re-implemented the
BSD flock() call in terms of calling the POSIX lockf(), which would
attempt to use rpc.lockd and thus have some chance of working over
NFS. I believe this was done in Linux by Andy Walker and for MacOS X
by Justin Walker (odd naming coincidence, there), IIRC; perhaps some
of these changes have made their way back to the other BSDs.

--
-Chuck
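
[Editorial sketch] For comparison with the flock() call in pw_lock(),
this is roughly what the POSIX-style interface mentioned above looks
like when used directly through fcntl(). It is a minimal illustration
only: the helper names try_write_lock() and unlock_all() are
hypothetical, and nothing here is taken from the FreeBSD sources. Per
the message above, this is the family of calls that would attempt to
go through rpc.lockd on an NFS mount, so it is subject to the same
reliability caveats raised in this thread.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to take an exclusive POSIX record lock on
 * the whole file without blocking.  `fd` must be open for writing.
 * Returns 0 on success, -1 with errno set (EACCES or EAGAIN when
 * another process already holds a conflicting lock).
 */
int
try_write_lock(int fd)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;	/* exclusive (write) lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "through end of file" */
	return (fcntl(fd, F_SETLK, &fl));
}

/* Hypothetical helper: release any lock this process holds on `fd`. */
int
unlock_all(int fd)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_UNLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;
	return (fcntl(fd, F_SETLK, &fl));
}
```

Unlike flock(), these locks belong to the process rather than to the
open file, and they are dropped when the process closes any descriptor
for that file, so the two interfaces are not drop-in replacements for
each other.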
Re: 6.1 and NFS
Hi Kris,

> > > rpc.lockd remains unreliable; avoid using it if practical.
> >
> > Hmmm, is there a way to run pxe-boxes without rpc.lockd and then still
> > able to run adduser and so on ?
>
> Use the nolockd option to mount_nfs,

nolockd, aha. Okay, I'll look at that, thx.

Regards,
Robert
Re: 6.1 and NFS
On Thu, Sep 21, 2006 at 11:43:16PM +0200, Robert Joosten wrote:
> Hi,
>
> > rpc.lockd remains unreliable; avoid using it if practical.
>
> Hmmm, is there a way to run pxe-boxes without rpc.lockd and then still
> able to run adduser and so on ?

Use the nolockd option to mount_nfs; that's what I meant by 'avoid
using it'.

Kris
Re: 6.1 and NFS
Hi,

> rpc.lockd remains unreliable; avoid using it if practical.

Hmmm, is there a way to run pxe-boxes without rpc.lockd and then still
able to run adduser and so on ?

Regards,
Robert
Re: 6.1 and NFS
On Sep 21, 2006, at 11:42 AM, Michael Conlen wrote:

>> On Thu, Sep 21, 2006 at 02:21:08PM -0400, Michael Conlen wrote:
>>> I recall that FreeBSD 6.1 had some NFS & lockd issues that were a
>>> show stopper at one time for me however I'm having trouble finding
>>> information on the current state of NFS. Anyone have a pointer to
>>> information?
>>
>> rpc.lockd remains unreliable; avoid using it if practical.
>
> This is becoming a show stopper for us moving forward with FreeBSD
> and may require us moving to a different OS (Linux or Solaris, each
> with significant downsides). Do you have a pointer on where I can
> track the issue so as to make a decision at some point in the future?

Well, Solaris has the best NFS implementation out there and includes a
number of subtle workarounds in its server code to reduce the number
and/or impact of problems seen when doing heterogeneous networking
against clients running other operating systems, but frankly,
rpc.lockd isn't significantly more stable on Solaris than on FreeBSD.
I've done extensive testing on Solaris 2.5.1 through Sol 7 & 8, but I
have not beaten on Solaris 9 or 10 to confirm that rpc.lockd is still
unreliable on the most recent versions of Solaris.

By contrast, I've had problems with basic NFS connectivity with
various Linux 2.2 and 2.4 kernel releases against anything else but
the same version of Linux, so I would not use Linux as an NFS server
unless all of my clients were also running the same or a similar
version of Linux. rpc.lockd on Linux may or may not fare better than
FreeBSD's implementation. I don't know; I couldn't rely on basic NFS
service to work against Solaris, MacOS X, or FreeBSD clients, so I
didn't bother testing how broken locking was on Linux.

In other words, if you plan to use NFS filesharing, you should make
every effort to utilize software which functions with the classic
".lock" file mechanism rather than depending on
lockf()/flock()/fcntl()-based locking working.
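
[Editorial sketch] As an illustration of the classic ".lock" file
mechanism recommended above, here is a minimal version with a bounded
retry instead of an endless loop. The helper names dotlock_acquire()
and dotlock_release(), the one-second retry interval, and the 0644
mode are all assumptions made for this sketch; it is not code from any
particular mail or password tool.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: acquire a lock by exclusively creating
 * `lockpath`.  Tries up to `max_tries` times, sleeping one second
 * between attempts.  Returns 0 once the lock file has been created,
 * -1 with errno set otherwise (EEXIST if it stayed busy).
 */
int
dotlock_acquire(const char *lockpath, int max_tries)
{
	int fd, i;

	for (i = 0; i < max_tries; i++) {
		/* O_CREAT|O_EXCL fails with EEXIST if the file exists. */
		fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0644);
		if (fd >= 0) {
			close(fd);
			return (0);	/* we own the lock */
		}
		if (errno != EEXIST)
			return (-1);	/* real error, do not retry */
		if (i + 1 < max_tries)
			sleep(1);	/* busy, wait and try again */
	}
	errno = EEXIST;
	return (-1);
}

/* Hypothetical helper: release the lock by removing the lock file. */
int
dotlock_release(const char *lockpath)
{
	return (unlink(lockpath));
}
```

The obvious weakness, which this sketch does not address, is a holder
that crashes without removing its lock file; callers need some policy
for stale locks.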
--
-Chuck

PS: Here's a more detailed description of the status of locking,
written by Mark Crispin. At the very least, read the last section
called "TRADEOFFS":

        UNIX Advisory File Locking Implications on c-client
                Mark Crispin, 28 November 1995

    THIS DOCUMENT HAS BEEN UPDATED TO REFLECT THE FACT THAT
    LINUX SUPPORTS BOTH flock() AND fcntl() AND THAT OSF/1
    HAS BEEN BROKEN SO THAT IT ONLY SUPPORTS fcntl().
    -- JUNE 15, 2004

    THIS DOCUMENT HAS BEEN UPDATED TO REFLECT THE CODE IN THE
    IMAP-4 TOOLKIT AS OF NOVEMBER 28, 1995. SOME STATEMENTS
    IN THIS DOCUMENT DO NOT APPLY TO EARLIER VERSIONS OF THE
    IMAP TOOLKIT.

INTRODUCTION

Advisory locking is a mechanism by which cooperating processes can
signal to each other their usage of a resource and whether or not that
usage is critical. It is not a mechanism to protect against processes
which do not cooperate in the locking.

The most basic form of locking involves a counter. This counter is -1
when the resource is available. If a process wants the lock, it
executes an atomic increment-and-test-if-zero. If the value is zero,
the process has the lock and can execute the critical code that needs
exclusive usage of a resource. When it is finished, it sets the lock
back to -1. In C terms:

    while (++lock)              /* try to get lock */
      invoke_other_threads ();  /* failed, try again */
     .
     .                          /* critical code here */
     .
    lock = -1;                  /* release lock */

This particular form of locking appears most commonly in
multi-threaded applications such as operating system kernels. It makes
several presumptions:

 (1) it is alright to keep testing the lock (no overflow)
 (2) the critical resource is single-access only
 (3) there is shared writeable memory between the two threads
 (4) the threads can be trusted to release the lock when finished

In applications programming on multi-user systems, most commonly the
other threads are in an entirely different process, which may even be
logged in as a different user. Few operating systems offer shared
writeable memory between such processes.

A means of communicating this is by use of a file with a mutually
agreed upon name. A binary semaphore can be passed by means of the
existence or non-existence of that file, provided that there is an
atomic means to create a file if and only if that file does not exist.
In C terms:

                                /* try to get lock */
    while ((fd = open ("lockfile",O_WRONLY|O_CREAT|O_EXCL,0666)) < 0)
      sleep (1);                /* failed, try again */
    close (fd);                 /* got the lock */
     .
     .                          /* critical code here */
     .
    unlink ("lockfile");        /* release lock */

This form of locking makes fewer presumptions, but it still is guilty
of presumptions (2) and (4) above. Presumption (2) limits the ability
to have processes sharing a
Re: 6.1 and NFS
On Thu, Sep 21, 2006 at 02:57:32PM -0400, Michael Conlen wrote:
>
> On Sep 21, 2006, at 2:45 PM, Kris Kennaway wrote:
>
> >On Thu, Sep 21, 2006 at 02:42:44PM -0400, Michael Conlen wrote:
> >>
> >>On Sep 21, 2006, at 2:22 PM, Kris Kennaway wrote:
> >>
> >>>On Thu, Sep 21, 2006 at 02:21:08PM -0400, Michael Conlen wrote:
> >>>>I recall that FreeBSD 6.1 had some NFS & lockd issues that were a
> >>>>show stopper at one time for me however I'm having trouble finding
> >>>>information on the current state of NFS. Anyone have a pointer to
> >>>>information?
> >>>
> >>>rpc.lockd remains unreliable; avoid using it if practical.
> >>
> >>This is becoming a show stopper for us moving forward with FreeBSD
> >>and may require us moving to a different OS (Linux or Solaris, each
> >>with significant downsides). Do you have a pointer on where I can
> >>track the issue so as to make a decision at some point in the future?
> >
> >There are a number of PRs I filed, but those aren't all of the
> >problems. It will require fairly major work to fix - the best hope
> >would be if someone was funded to work on it.
>
> Do you have an estimate of what kind of time is necessary to solve
> the problem?

No idea, sorry.

Kris
Re: 6.1 and NFS
On Sep 21, 2006, at 2:45 PM, Kris Kennaway wrote:

> On Thu, Sep 21, 2006 at 02:42:44PM -0400, Michael Conlen wrote:
>>
>> On Sep 21, 2006, at 2:22 PM, Kris Kennaway wrote:
>>
>>> On Thu, Sep 21, 2006 at 02:21:08PM -0400, Michael Conlen wrote:
>>>> I recall that FreeBSD 6.1 had some NFS & lockd issues that were a
>>>> show stopper at one time for me however I'm having trouble finding
>>>> information on the current state of NFS. Anyone have a pointer to
>>>> information?
>>>
>>> rpc.lockd remains unreliable; avoid using it if practical.
>>
>> This is becoming a show stopper for us moving forward with FreeBSD
>> and may require us moving to a different OS (Linux or Solaris, each
>> with significant downsides). Do you have a pointer on where I can
>> track the issue so as to make a decision at some point in the future?
>
> There are a number of PRs I filed, but those aren't all of the
> problems. It will require fairly major work to fix - the best hope
> would be if someone was funded to work on it.

Do you have an estimate of what kind of time is necessary to solve the
problem?

--
Michael Conlen
Re: 6.1 and NFS
On Thu, Sep 21, 2006 at 02:42:44PM -0400, Michael Conlen wrote:
>
> On Sep 21, 2006, at 2:22 PM, Kris Kennaway wrote:
>
> >On Thu, Sep 21, 2006 at 02:21:08PM -0400, Michael Conlen wrote:
> >>I recall that FreeBSD 6.1 had some NFS & lockd issues that were a
> >>show stopper at one time for me however I'm having trouble finding
> >>information on the current state of NFS. Anyone have a pointer to
> >>information?
> >
> >rpc.lockd remains unreliable; avoid using it if practical.
>
> This is becoming a show stopper for us moving forward with FreeBSD
> and may require us moving to a different OS (Linux or Solaris, each
> with significant downsides). Do you have a pointer on where I can
> track the issue so as to make a decision at some point in the future?

There are a number of PRs I filed, but those aren't all of the
problems. It will require fairly major work to fix - the best hope
would be if someone was funded to work on it.

Kris
Re: 6.1 and NFS
On Sep 21, 2006, at 2:22 PM, Kris Kennaway wrote:

> On Thu, Sep 21, 2006 at 02:21:08PM -0400, Michael Conlen wrote:
>> I recall that FreeBSD 6.1 had some NFS & lockd issues that were a
>> show stopper at one time for me however I'm having trouble finding
>> information on the current state of NFS. Anyone have a pointer to
>> information?
>
> rpc.lockd remains unreliable; avoid using it if practical.

This is becoming a show stopper for us moving forward with FreeBSD and
may require us moving to a different OS (Linux or Solaris, each with
significant downsides). Do you have a pointer on where I can track the
issue so as to make a decision at some point in the future?

--
Michael Conlen
Re: 6.1 and NFS
On Thu, Sep 21, 2006 at 02:21:08PM -0400, Michael Conlen wrote:
> I recall that FreeBSD 6.1 had some NFS & lockd issues that were a
> show stopper at one time for me however I'm having trouble finding
> information on the current state of NFS. Anyone have a pointer to
> information?

rpc.lockd remains unreliable; avoid using it if practical.

Kris
6.1 and NFS
I recall that FreeBSD 6.1 had some NFS & lockd issues that were a show
stopper for me at one time; however, I'm having trouble finding
information on the current state of NFS. Does anyone have a pointer to
information?

--
Michael Conlen