Re: bug in pw, -STABLE [patch]

2002-06-24 Thread Matthew D. Fuller

On Sun, Jun 23, 2002 at 05:14:58PM -0700 I heard the voice of
Paul Herman, and lo! it spake thus:
 
 Clearly, at the root of our disagreement is what we both perceive
 the problem to be.

Oh, certainly; that's what makes it fun   :)


 I don't see problems in the current implementation, aside from bugs
 that lead to unexpected behavior, i.e. passwd file corruption.
 You see the problem as a deficiency in the implementation itself,
 and wish to protect the user from shooting themselves in the foot.

Well, we're in violent agreement on the first one.  I'm just using that
as an opportunity to smack down the second and kill two birds with one
stone.


 Not only do I think that's impossible[*], I choose to fight for my
 right to shoot myself in the foot as quickly and efficiently as
 possible, but that's where we'll disagree, and I'll just leave it
 at that and wish you a good night's sleep.

Wouldn't work; the determined foot-screwer would simply cd /etc  mv
aliases somethingelse  ln master.passwd aliases  vi aliases.  Since
we don't have mandatory file locking, it's neither possible nor my
intention to prevent people from doing things intentionally; I'm just
trying to remove the ways they can do it accidentally using the tools we
provide.  I'm all for leaving options for people to intentionally de-toe,
or convince the system that they know what they're doing while they shoot
caterpillars between their toes.

Your approach will (I think; I haven't tested, so it's tough to be sure)
solve the problem that sparked this, which is that pw(8) has a race
condition allowing multiple invocations to step up each other's toes.
However, it doesn't do anything about the larger problem of maintaining
consistency in the passwd subsystem as a whole, which is where I'm
aiming.  I also think my approach (once documented, at any rate) would
jump out a bit more at people writing programs that adjust the auth
information.  And, additionally, we took the opportunity to take one MORE
step back from the problem, and implement the pid_*() functions which
abstract the implementation of this sort of locking, making is easy to
apply in other places.


Besides, this is a codeocracy; I changed more lines of code than you did,
to say nothing of a MANPAGE!  My solution MUST be better!   8-}


 [*] Patch to vi refusing to edit filenames containing
 master.passwd withheld by request.  ;-)

rm -f /usr/bin/vi  ln -s /usr/local/bin/pico /usr/bin/vi

That'll shake people up enough that they won't edit anything.  I know it
would have ME waking up screaming...


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-24 Thread Matthew D. Fuller

On Sun, Jun 23, 2002 at 05:07:40PM -0700 I heard the voice of
Terry Lambert, and lo! it spake thus:
 
 The problem with your proposed patch is that it breaks the
 ability to allow authentication against the database while
 it is undergoing modification, which may be a prolonged period.

Would it?

For starters, this locks master.passwd, not spwd.db, so any well-behaved
program that used getpw*() like a good little monkey wouldn't even notice
it.  In fact, with Paul's change and the rename() alteration, it'd be
BETTER for programs that tried to parse master.passwd, because as it
currently stands, there are times when it's incomplete (because of the
line-by-line copy, which caused us both a few moments of illness when we
noticed it).


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-24 Thread Paul Herman

On Mon, 24 Jun 2002, Geoffrey C. Speicher wrote:

 So we either need to have a compelling solution or get a
 committer to step in and make up our minds for us.

I think the best thing to do is file a PR for this.

-Paul.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-24 Thread Geoffrey C. Speicher

On Mon, 24 Jun 2002, Paul Herman wrote:

 On Mon, 24 Jun 2002, Geoffrey C. Speicher wrote:

  So we either need to have a compelling solution or get a
  committer to step in and make up our minds for us.

 I think the best thing to do is file a PR for this.

With or without a patch?

Geoff


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-24 Thread Terry Lambert

Matthew D. Fuller wrote:
 Terry Lambert, and lo! it spake thus:
 
  The problem with your proposed patch is that it breaks the
  ability to allow authentication against the database while
  it is undergoing modification, which may be a prolonged period.
 
 Would it?
 
 For starters, this locks master.passwd, not spwd.db, so any well-behaved
 program that used getpw*() like a good little monkey wouldn't even notice
 it.  In fact, with Paul's change and the rename() alteration, it'd be
 BETTER for programs that tried to parse master.passwd, because as it
 currently stands, there are times when it's incomplete (because of the
 line-by-line copy, which caused us both a few moments of illness when we
 noticed it).

Whatever locking protocol you come up with needs to be partcipated
in by everyone.  Historically, passwd file manipulations have used
link count locking.

The real issue here is that there are a number of incremental file
update utilities for managmenet of very large installations, and a
large number of historical programs from places like comp.unix.sources
that use file locks, rather than advisory range locks.

Any locking protocol using advisory locking must lock out readers
while there are writers.

What you are really locking (or should be) is directory entries
here, which means using the same exclusion space for the locking
as you use for the update operation, which is a lock on the parent
directory (e.g. as is used internally to the FS rename operation).

--- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-24 Thread Geoffrey C. Speicher

On Mon, 24 Jun 2002, Paul Herman wrote:

 On Mon, 24 Jun 2002, Geoffrey C. Speicher wrote:

  So we either need to have a compelling solution or get a
  committer to step in and make up our minds for us.

 I think the best thing to do is file a PR for this.

More than one person has already beaten us to the punch:

  bin/23501, closed on Fri Jun 22 08:43:09 PDT 2001, describes the
  exact same problem with pw(8), and contains a patch that was verified to
  fix the problem, but it was apparently never committed.

  bin/38676, submitted on Tue May 28 19:40:08 PDT 2002, requests that the
  original patch be committed, but has gotten no feedback.

Next?

Geoff


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-24 Thread Matthew D. Fuller

On Mon, Jun 24, 2002 at 09:56:31AM -0700 I heard the voice of
Paul Herman, and lo! it spake thus:
 On Mon, 24 Jun 2002, Geoffrey C. Speicher wrote:
 
  So we either need to have a compelling solution or get a
  committer to step in and make up our minds for us.
 
 I think the best thing to do is file a PR for this.

I'll probably do so (referencing also the previous ones) once I finish
testing this; the downside is that I may not get a chance to plunk a few
straight solid hours into it until this weekend (darn that paying work
stuff!), so we may have to find new realms of the subject to flam^Wdebate
until then, just to keep the ol' adrenaline flowing.

Regardless of the outcome of the passwd issue, I think the pid_*()
functions in libutil will be useful for other things that do those kind
of functions.  Just glancing at my /var/run, I can see cron, inetd,
mountd, moused, and syslogd, just among the stuff in our base system that
isn't contrib/-ish (like named, sshd, ntpd, etc).


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Locking the passwd subsystem (was Re: bug in pw, -STABLE [patch])

2002-06-23 Thread Matthew D. Fuller

[ Moved to -hackers since this isn't really a STABLE-specific issue, but
rather a general thing.  Bcc's to -stable for continuity's sake. ]

This discussion has gone on, on and off, for months now, so I'm going to
recap a bit.  For the people in it, we've mostly paged out the details by
now, and for the people not in it, this will be necessary catchup.

pw(8) can sometimes corrupt the passwd file when it's running.  This is
due to the fact that, while it munges things, it locks
/etc/master.passwd.  This fails to really work right between invocations,
because as part of its dirty work, pw(8) moves master.passwd away and
creates a new one from scratch.  So, an external lockfile is needed.
There's no real common locking among the other consumers (chpass(1),
passwd(1), vipw(8), pwd_mkdb(8), etc) either, so we might as well make a
big common locking mechanism out of it.

And while we're doing that, it would be nifty to centralize some
functions for doing locking like that, too, instead of coding it from
scratch in every application that needs it.  The patch on this mail
accomplishes this part of the objective; once we can all settle on that,
we can easily handle locking all the consumers neatly using these
functions.


And now, for the carrying on.

On Sun, Jun 23, 2002 at 09:07:15AM -0400 I heard the voice of
Geoffrey C. Speicher, and lo! it spake thus:
 
 Actually, now that I think about it, even a daemon wouldn't guarantee
 better results in every such scenario.  At first it would seem that the
 daemon would get a chance to clean up the pid file long before another
 process wanted it (during step 4).  However, if the timing is just right
 then p1 can die the moment that its pid would be reused, and chances are
 that the daemon wouldn't catch it in time.

Well, a daemon would have a BETTER chance, though nowhere near a
guarantee.  The theory being that the daemon would check every so often
(5 minutes sounds good to me), whereas with the library, you potentially
could not check for days, depending on how often you run pw(8).  Even on
a heavily used system, 5 minutes wouldn't likely be enough to reuse a
PID, unless you had random PID assignment or some such.


  Now, this is a problem.  There's a race condition here.  It's a very
  small window, to be sure, but I'm not quite sure how to close it.  After
 
 I think you must've figured it out judging by your post that just came
 through as I was writing that previous paragraph.  :)

I think so.  See comments in the code.


(Re: Oliver's statements about the general evil of PID files)
 The beauty of encapsulating the pid file operations into a library is
 that
 the implementation can be changed to create a socket instead of, or (more
 likely) in addition to, a pid file.  In fact, this may be just the thing
 we need to close that pid gets reused hole. 
 
 Since Matt already has the functions mostly implemented, I'll defer to
 him.

Well, there's nothing necessarily stopping us from adding more stuff
later, of course.  Personally, I've set my sights a bit lower; for the
moment, I'm more concerned with Let's not corrupt the passwd database
than with This must never block unless totally absolutely completely
necessary and should always know where its cheese is, which is what the
socket-theory is pointing at.  One giant code upheaval at a time   :)

One possibility would be to make procfs(5) show the process start time
(rather than the current time) for the ctime of the /proc/pid directory.
Then, one could look at the PID file, and if its mtime (not ctime; see in
below code where we could use a pre-existing file) is older than the
ctime of the /proc/pid directory, we'd know the PID had changed.  Getting
the actual process start time would require libkvm (and thus setgid
kmem), or spawning ps(1) and parsing output, otherwise.


ANYWAY:
Attached is a patch (relative to src/) for my first run through creating
the functions to handle PID-file creation, deletion, and
checking-on-creation.  It consists of two functions:
- pid_begin(), which creates a PID file if one doesn't exist, and checks
  to see if the listed process is still around (taking over the PID file
  if it's not) if it does.
- pid_end(), which deletes the PID file.

Code is reasonably well commented.  Manpage included.  Shaken, not
stirred.  Slippery when wet.  This has gone through a compile test, but
hasn't been live-tested.

This will provide the basis then to move forward and lock all the
programs that access the passwd database (for writing, of course; no
reason to lock readers) in a common way.


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet


Index: lib/libutil/Makefile
===
RCS file: /usr/cvs/src/lib/libutil/Makefile,v
retrieving 

Re: Locking the passwd subsystem (was Re: bug in pw, -STABLE [patch])

2002-06-23 Thread Matthew D. Fuller

On Sun, Jun 23, 2002 at 11:17:35AM -0400 I heard the voice of
Geoffrey C. Speicher, and lo! it spake thus:

 5-minute window.  Still unliklely, but not quite as unlikely as the
 non-daemon scenario.

And it all goes out when window when kern.randompid=1, so it's kinda
moot.


 Looks good from here.  I've attached some minor patches (also relative to
 src/) to pid_util.{3,c}.  The first corrects some minor typos in the man
 page and the second is to make pid_begin() work as advertised when read()
 fails (set errno and return -1 rather than return errno).

Yes, I have a brain.  No, really I do.  I just don't have enough coffee
:)


 (I haven't tested any of this yet either.)

I just slapped together a few quick test programs.  Here's an updated
patch, incorporating your fixes, as well as a bugfix and an almost-bugfix
my testing showed up (hint: just because you ftruncate() to 0-length,
doesn't mean your offset becomes 0 too, dangit), and eliminated an extra
variable.


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet


Index: lib/libutil/Makefile
===
RCS file: /usr/cvs/src/lib/libutil/Makefile,v
retrieving revision 1.46
diff -u -r1.46 Makefile
--- lib/libutil/Makefile8 May 2002 00:50:07 -   1.46
+++ lib/libutil/Makefile23 Jun 2002 13:31:01 -
@@ -8,7 +8,7 @@
 CFLAGS+=-DINET6
 SRCS=  _secure_path.c auth.c fparseln.c login.c login_auth.c \
login_cap.c login_class.c login_crypt.c login_ok.c login_times.c \
-   login_tty.c logout.c logwtmp.c property.c pty.c \
+   login_tty.c logout.c logwtmp.c pid_util.c property.c pty.c \
pw_util.c realhostname.c stub.c \
trimdomain.c uucplock.c
 INCS=  libutil.h login_cap.h
@@ -16,7 +16,7 @@
 MAN+=  login.3 login_auth.3 login_tty.3 logout.3 logwtmp.3 pty.3 \
login_cap.3 login_class.3 login_times.3 login_ok.3 \
_secure_path.3 uucplock.3 property.3 auth.3 realhostname.3 \
-   realhostname_sa.3 trimdomain.3 fparseln.3
+   realhostname_sa.3 trimdomain.3 fparseln.3 pid_util.3
 MAN+=  login.conf.5 auth.conf.5
 MLINKS+= property.3 properties_read.3  property.3 properties_free.3
 MLINKS+= property.3 property_find.3
@@ -39,5 +39,6 @@
 MLINKS+=login_auth.3 auth_checknologin.3 login_auth.3 auth_cat.3
 MLINKS+=uucplock.3 uu_lock.3 uucplock.3 uu_lock_txfr.3 \
uucplock.3 uu_unlock.3 uucplock.3 uu_lockerr.3
+MLINKS+=pid_util.3 pid_begin.3 pid_end.3
 
 .include bsd.lib.mk
Index: lib/libutil/libutil.h
===
RCS file: /usr/cvs/src/lib/libutil/libutil.h,v
retrieving revision 1.37
diff -u -r1.37 libutil.h
--- lib/libutil/libutil.h   8 May 2002 00:50:07 -   1.37
+++ lib/libutil/libutil.h   23 Jun 2002 12:35:30 -
@@ -82,6 +82,8 @@
 struct sockaddr;
 intrealhostname_sa(char *host, size_t hsize, struct sockaddr *addr,
 int addrlen);
+int pid_begin(const char *pidfile, mode_t mode, int flags);
+int pid_end(const char *pidfile);
 #ifdef _STDIO_H_   /* avoid adding new includes */
 char   *fparseln(FILE *, size_t *, size_t *, const char[3], int);
 #endif
@@ -128,5 +130,8 @@
 /* pw_scan() */
 #define PWSCAN_MASTER  0x01
 #define PWSCAN_WARN0x02
+
+/* pid_begin() */
+#define PID_NOBLOCK0x01
 
 #endif /* !_LIBUTIL_H_ */
Index: lib/libutil/pid_util.c
===
--- /dev/null   Sun Jun 23 11:44:00 2002
+++ lib/libutil/pid_util.c  Sun Jun 23 11:50:59 2002
@@ -0,0 +1,178 @@
+/*-
+ * Copyright (c) 2002 Matthew D. Fuller [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF 

Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Paul Herman


[ pulled over to -hackers as well ]

On Sun, 23 Jun 2002, Matthew D. Fuller wrote:

 On Sun, Jun 23, 2002 at 09:00:52AM -0700 I heard the voice of
 Paul Herman, and lo! it spake thus:
  On Sun, 23 Jun 2002, Geoffrey C. Speicher wrote:
 
  How so?  I'm not suggesting unlink(2)ing /etc/master.passwd or
  /etc/spwd.db at all.

 No, but pw(8) does; making it not do so would require reasonably
 extensive rewriting.

Indeed it does do this.  However, that seems to be the crux of the
problem.  The current pw(8) behavior was (probably) not only a
convenient way to update files in /etc, but an attempt to narrow
(not eliminate) the window for race conditions.  If this is
fundamentaly flawed, *this* should be fixed, and not have a larger
band aid put over it.

I can't imagine it would be too extensive of a rewrite.  The temp
file code could be kept, and in fileupd.c:fileupdate() instead of
rename(/etc/master.passwd.new, /etc/master.passwd), it just
copies the contents of the .new file into the original file (that
has the O_EXLOCK.)

-Paul.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Paul Herman

On Sun, 23 Jun 2002, Matthew D. Fuller wrote:

 On Sun, Jun 23, 2002 at 09:59:10AM -0700 I heard the voice of
 Paul Herman, and lo! it spake thus:
 
  I can't imagine it would be too extensive of a rewrite.  The temp
  file code could be kept, and in fileupd.c:fileupdate() instead of
  rename(/etc/master.passwd.new, /etc/master.passwd), it just
  copies the contents of the .new file into the original file (that
  has the O_EXLOCK.)

 Yes, but that would open up the problem that cause the rename(2) to be
 used in the first place; i.e., rename() is atomic; read(2)/write(2)'ing a
 bunch of data isn't, so a system crash at the wrong time would leave you
 with a partial master.passwd.

Fine, then lock them both, and use rename().  Patch attached.

In fact, if you look at fileupdate(), you see that it already gains
an exclusive lock on the temp file, but not the original
/etc/master.passwd (if you will.)  I think this is a bug, because
the original is getting modified (at least in name space), so that
should be locked while pw(8) is operating.

What do you think?

-Paul.

Index: fileupd.c
===
RCS file: /u02/ncvs/src/usr.sbin/pw/fileupd.c,v
retrieving revision 1.9
diff -u -r1.9 fileupd.c
--- fileupd.c   26 Oct 1999 04:27:13 -  1.9
+++ fileupd.c   23 Jun 2002 18:30:16 -
@@ -76,7 +76,7 @@
if (pfxlen = 1)
rc = EINVAL;
else {
-   intinfd = open(filename, O_RDWR | O_CREAT, fmode);
+   intinfd = open(filename, O_RDWR | O_CREAT | O_EXLOCK | O_NONBLOCK, 
+fmode);

if (infd == -1)
rc = errno;


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: Locking the passwd subsystem (was Re: bug in pw, -STABLE [patch])

2002-06-23 Thread Matthew D. Fuller

And here's a run at Stage 2.  This adapts a subset of programs to use the
pid_*() locking supplied.  The attached patch updates: pwd.h, libutil
(libutil.h and pw_util.c), chpass (chpass.c), pw (pw.c), pwd_mkdb
(Makefile, pwd_mkdb.8, pwd_mkdb.c), vipw (vipw.c).  This does NOT include
passwd(1), since that does its dirty work in PAM, which I definately
don't feel like delving into.

Note that this patch MAY be a little off when applied over the previous
patch; I'm working on them in seperate trees, to try and keep the stages
discrete.  But a little manual fiddling should get them reconciled in
short order.


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet


Index: include/pwd.h
===
RCS file: /usr/cvs/src/include/pwd.h,v
retrieving revision 1.12
diff -u -r1.12 pwd.h
--- include/pwd.h   9 Jun 2002 19:39:18 -   1.12
+++ include/pwd.h   23 Jun 2002 18:16:21 -
@@ -66,6 +66,9 @@
 #define_PATH_MASTERPASSWD  /etc/master.passwd
 #define_MASTERPASSWD   master.passwd
 
+#define_PATH_PWDLOCK   /var/run/pwd.lock
+#define_MODE_PWDLOCK   (S_IRUSR | S_IWUSR)
+
 #define_PATH_MP_DB /etc/pwd.db
 #define_MP_DB  pwd.db
 #define_PATH_SMP_DB/etc/spwd.db
Index: lib/libutil/libutil.h
===
RCS file: /usr/cvs/src/lib/libutil/libutil.h,v
retrieving revision 1.37
diff -u -r1.37 libutil.h
--- lib/libutil/libutil.h   8 May 2002 00:50:07 -   1.37
+++ lib/libutil/libutil.h   23 Jun 2002 18:38:18 -
@@ -95,6 +95,8 @@
 intpw_init(const char *_dir, const char *_master);
 char   *pw_make(struct passwd *_pw);
 intpw_mkdb(const char *_user);
+intpw_globlock(int);
+intpw_globunlock(void);
 intpw_lock(void);
 struct passwd *pw_scan(const char *_line, int _flags);
 const char *pw_tempname(void);
@@ -124,6 +126,9 @@
 #defineFPARSELN_UNESCCOMM  0x04
 #defineFPARSELN_UNESCREST  0x08
 #defineFPARSELN_UNESCALL   0x0f
+
+/* pw_globlock() */
+#definePW_NOBLOCK  PID_NOBLOCK
 
 /* pw_scan() */
 #define PWSCAN_MASTER  0x01
Index: lib/libutil/pw_util.c
===
RCS file: /usr/cvs/src/lib/libutil/pw_util.c,v
retrieving revision 1.25
diff -u -r1.25 pw_util.c
--- lib/libutil/pw_util.c   8 May 2002 14:52:32 -   1.25
+++ lib/libutil/pw_util.c   23 Jun 2002 18:46:19 -
@@ -79,6 +79,7 @@
 static char passwd_dir[PATH_MAX];
 static char tempname[PATH_MAX];
 static int initialized;
+static int globlocked = 0;
 
 void
 pw_cont(int sig)
@@ -163,6 +164,28 @@
 }
 
 /*
+ * Lock the password subsystem globally.
+ */
+int
+pw_globlock(int flags)
+{
+   int retval;
+
+   if( (retval=pid_begin(_PATH_PWDLOCK, _MODE_PWDLOCK, flags)) == 0 );
+   globlocked=1;
+   return(retval);
+}
+
+/*
+ * Unlock the global lock
+ */
+int pw_globunlock(void)
+{
+   
+   return(pid_end(_PATH_PWDLOCK));
+}
+
+/*
  * Lock the master password file.
  */
 int
@@ -258,10 +281,10 @@
case 0:
/* child */
if (user == NULL)
-   execl(_PATH_PWD_MKDB, pwd_mkdb, -p,
+   execl(_PATH_PWD_MKDB, pwd_mkdb, -p, -n,
-d, passwd_dir, tempname, NULL);
else
-   execl(_PATH_PWD_MKDB, pwd_mkdb, -p,
+   execl(_PATH_PWD_MKDB, pwd_mkdb, -p, -n,
-d, passwd_dir, -u, user, tempname, NULL);
_exit(1);
default:
@@ -353,6 +376,11 @@
}
if (lockfd != -1)
close(lockfd);
+   if (globlocked != 0)
+   {
+   globlocked=0;
+   pid_end(_PATH_PWDLOCK);
+   }
errno = serrno;
 }
 
Index: usr.bin/chpass/chpass.c
===
RCS file: /usr/cvs/src/usr.bin/chpass/chpass.c,v
retrieving revision 1.23
diff -u -r1.23 chpass.c
--- usr.bin/chpass/chpass.c 8 May 2002 00:54:28 -   1.23
+++ usr.bin/chpass/chpass.c 23 Jun 2002 18:42:25 -
@@ -258,6 +258,8 @@
case _PWF_FILES:
if (pw_init(NULL, NULL))
err(1, pw_init());
+   if (pw_globlock(PW_NOBLOCK)  0)
+   err(1, pw_globlock());
if ((pfd = pw_lock()) == -1) {
pw_fini();
err(1, pw_lock());
Index: usr.sbin/pw/pw.c
===
RCS file: /usr/cvs/src/usr.sbin/pw/pw.c,v
retrieving 

Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Matthew D. Fuller

On Sun, Jun 23, 2002 at 11:32:54AM -0700 I heard the voice of
Paul Herman, and lo! it spake thus:
 
 In fact, if you look at fileupdate(), you see that it already gains
 an exclusive lock on the temp file, but not the original
 /etc/master.passwd (if you will.)  I think this is a bug, because
 the original is getting modified (at least in name space), so that
 should be locked while pw(8) is operating.

I'm not sure.  Hm.

I think that MAY prevent some of the corruption cases.  On the other
hand, we're really addressing a broader spectrum of issues here.  The was
pw(8) manipulates the passwd database is rather different from how
chpass(1) or vipw(8) do it; this unifies it all (well, starts the
process).  Also, looking down, it seems that pw(8) will ALWAYS try to do
a line-by-line copy of the temp file back into the main file, which would
be Bad Bad Bad (tm); it does this because using rename() all the time
loses the flock() lock.  However, if we use this global locking, we can
dispense with that, and remove some code complexity, to say nothing of a
giant waste of time copying line-by-line.

I'm trying to kill several birds with as small a stone as I can; I've had
my eye on this whole subsystem for a while, and I'd really like to
re-center it all around pw(8).  vipw(8) kinda has to be its own beast,
but chpass(1) and adduser(8)/rmuser(8) are prime candidates to be
reworked to use pw(8) for their dirty work.


Let's try this: updated patch for pw(8) including my global locking, the
addition of flock()'ing both old and new files (it's not quite necessary,
since it's all under a global lock, but it never hurts to double-check)
as in your patch, and removing the line-by-line copy and using rename()
all the time.  (I suggest tabstop=4 or even =2; that file is *DEEP*)


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet


Index: fileupd.c
===
RCS file: /usr/cvs/src/usr.sbin/pw/fileupd.c,v
retrieving revision 1.9
diff -u -r1.9 fileupd.c
--- fileupd.c   26 Oct 1999 04:27:13 -  1.9
+++ fileupd.c   23 Jun 2002 19:22:16 -
@@ -76,7 +76,8 @@
if (pfxlen = 1)
rc = EINVAL;
else {
-   intinfd = open(filename, O_RDWR | O_CREAT, fmode);
+   intinfd = open(filename,
+   O_RDWR | O_CREAT | O_EXLOCK | O_NONBLOCK, fmode);
 
if (infd == -1)
rc = errno;
@@ -92,7 +93,8 @@
 
strcpy(file, filename);
strcat(file, .new);
-   outfd = open(file, O_RDWR | O_CREAT | O_TRUNC | 
O_EXLOCK, fmode);
+   outfd = open(file,
+   O_RDWR | O_CREAT | O_TRUNC | O_EXLOCK | 
+O_NONBLOCK, fmode);
if (outfd == -1)
rc = errno;
else {
@@ -169,27 +171,24 @@
rc = errno; /* 
Failed to update */
else {
/*
-* Copy data back into 
the
+* Move data back into 
+the
 * original file and 
truncate
 */
rewind(infp);
rewind(outfp);
-   while (fgets(line, 
linesize, outfp) != NULL)
-   fputs(line, 
infp);
 
/*
-* If there was a 
problem with copying
-* we will just rename 
'file.new' 
-* to 'file'.
-* This is a gross 
hack, but we may have
-* corrupted the 
original file
-* Unfortunately, it 
will lose the inode
-* and hence the lock.
+* Use rename() to 
+move the new file 

Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Paul Herman

On Sun, 23 Jun 2002, Matthew D. Fuller wrote:

 On Sun, Jun 23, 2002 at 11:32:54AM -0700 I heard the voice of
 Paul Herman, and lo! it spake thus:
 
  In fact, if you look at fileupdate(), you see that it already gains
  an exclusive lock on the temp file, but not the original
  /etc/master.passwd (if you will.)  I think this is a bug, because
  the original is getting modified (at least in name space), so that
  should be locked while pw(8) is operating.

 I think that MAY prevent some of the corruption cases.  On the other
 hand, we're really addressing a broader spectrum of issues here.  The was
 pw(8) manipulates the passwd database is rather different from how
 chpass(1) or vipw(8) do it; this unifies it all (well, starts the
 process).  Also, looking down, it seems that pw(8) will ALWAYS try to do
 a line-by-line copy of the temp file back into the main file, which would
 be Bad Bad Bad (tm);

I think you might have your infp confused with your outfp.  It's
not writing to the original live file, it's just writing the new
temp file.  That part of the code is OK.

As for the giant lock, it would be like closing down the entire
airport after someone finds a knife in the snack bar, and in my
opinion an this is an unwise and brutal thing to do.

Not to mention stale /var/run/ files that might be lying around...

I do appreciate your work, but please don't commit this until the
real issue is solved, and the dust settles.  There still may be
some use for it... plenty of time until the next release cycle.

Geoff:  Does the patch for fileupd.c in the last mail hinder the
file corruption you are seeing?

 (I suggest tabstop=4 or even =2; that file is *DEEP*)

Yss!  :-)

-Paul.



To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Matthew D. Fuller

On Sun, Jun 23, 2002 at 01:19:02PM -0700 I heard the voice of
Paul Herman, and lo! it spake thus:
 
 I think you might have your infp confused with your outfp.  It's
 not writing to the original live file, it's just writing the new
 temp file.  That part of the code is OK.

I'm talking about down around line 177 thru (unpatched), where it's
copying back to the primary file.  As it stands now (I hadn't realized in
my prior reading) it's line-by-line'ing it:

  /*
   * Copy data back into the
   * original file and truncate
   */
  rewind(infp);
  rewind(outfp);
  while (fgets(line, linesize, outfp) != NULL)
  fputs(line, infp);

  /*
   * If there was a problem with copying
   * we will just rename 'file.new'
   * to 'file'.
   * This is a gross hack, but we may have
   * corrupted the original file
   * Unfortunately, it will lose the inode
   * and hence the lock.
   */
  if (fflush(infp) == EOF || ferror(infp))
  rename(file, filename);
  else
  ftruncate(infd, ftell(infp));


which is Very Bad (tm) in that it's not very resilient against system
crashes.  The rename() solution is much safer.


 As for the giant lock, it would be like closing down the entire
 airport after someone finds a knife in the snack bar, and in my
 opinion an this is an unwise and brutal thing to do.

No, it's more like closing the snack bar while you clean it, rather than
closing the first steam tray while cleaning that, then the second steam
tray, then the third booth on the left side, then...  That way, when you
open the snack bar, you know that the whole thing is clean (or, depending
on your skill and work ethic, at least equally dirty ;).

In many passwd-manipulations, the file can't be safely modified in-place,
so it has to be done out-of-place, then atomically shoved in, which
requires a rename() or the ilk; the downside is that you can't flock()
the file in question and handle that easily.  One also has to consider
the impact with other programs, potentially not written by us, that
modify things; a wrapper lock on changes to the auth information is
really the only way to ensure consistency; otherwise, you can't REALLY be
sure that the passwd and group files are in sync, to say nothing of
preventing crashes from mangling things.


 Not to mention stale /var/run/ files that might be lying around...

See the code I put in pid_begin() to handle just that case, that being
the reason the function was written in the first place (otherwise, I
could dispose of a lot of mess, and just use an empty file as a mutex for
it).


 I do appreciate your work, but please don't commit this until the
 real issue is solved, and the dust settles.  There still may be
 some use for it... plenty of time until the next release cycle.

Commit?  How would I do that?   :P


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Paul Herman

On Sun, 23 Jun 2002, Matthew D. Fuller wrote:

 I'm talking about down around line 177 thru (unpatched), where it's
 copying back to the primary file.
 [...]
 which is Very Bad (tm) in that it's not very resilient against
 system crashes.  The rename() solution is much safer.

Ah, now I see, thanks.  No contention there, we are in total
agreement that this is a Bad Thing, and that rename() is
sufficient.

 No, it's more like closing the snack bar while you clean it,
 rather than closing the first steam tray while cleaning that,
 then the second steam tray, then the third booth on the left
 side, then...

OK, you can change the analogy, but you're still closing the snack
bar, and you turn away customers.  Most 24 hour snack bars clean
each table at a time as each one frees up.

Moral of that story:  This would mean for pw(8) that I can't update
the system passwords and the passwords in all my jails at the same
time.  There is no reason to enforce that policy.

Back to the situation at hand.  You make a few points that should
be clarified, I'll summarize them in QA format:

Q: rename() is good, but you can't flock() the file in question
   and handle it easily, can you?

A: Indeed, you can.  flocks() are even preserved across renames.
   Consider:
   lockf -k foo sleep 60 
   mv foo bar
   lockf -t 1 -k bar true
   The second lockf will error.

Q: What about other programs not written by us that modify
   things?

A: Agreed, this is a problem.  However, no system in place will
   prevent at any time anyone with the proper credentials from
   doing:
 echo junk  /etc/master.passwd
   At least if the programs in the base agree on a system
   (be it flock() or /var/run lock) it will work for the base
   system.  This doesn't speak for or against any one method.
   Yes, a lock on changes to the auth information is the only
   way to ensure consistency, which is what this is all about,
   but HOW to accomplish this issue at hand.

 Commit?  How would I do that?   :P

Sorry, it wasn't as much directed at you as it was more of a
general plea.  Kind of like this one:

Why use lockfiles, when we can lock files?  Let's lock files and
keep the snack bar open!  :-)

-Paul.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Matthew D. Fuller

OK, this is the end for me today.  I'm fairly sure, somehow, it's time to
get some sleep   ;p

On Sun, Jun 23, 2002 at 03:51:46PM -0700 I heard the voice of
Paul Herman, and lo! it spake thus:
 
 Moral of that story:  This would mean for pw(8) that I can't update
 the system passwords and the passwords in all my jails at the same
 time.  There is no reason to enforce that policy.

No, it means that (with configuration) you can only update them once at a
time in each jail.

I'm all in favor of adding options, either command line, or pw.conf (or
both) to put the lockfile somewhere else; -V ${dir} already makes pw(8)
read ${dir}/pw.conf instead of the default /etc/pw.conf, so we're 90% of
the way there already.  Heck, I wouldn't be too worried about adjusting
the lockfile location based on -V automagically.  I'm also not opposed to
adding an option to not bother with this locking; I already added one to
pwd_mkdb(8) (primarily because it's called from libutil's pw_mkdb(3),
which is called by vipw(8)...  deadlocks suck).  I just haven't done
either NOW; nor have I yet added options to 'block on the lock and keep
trying' (or decided to make blocking the default and have a switch for
non-blocking) - this is experimental implementation stage, after all.

I don't claim this is perfect for every case; I'm just taking aim at the
common case, where it's currently far too easy to accidentally screw
yourself in the foot.


 Q: rename() is good, but you can't flock() the file in question
and handle it easily, can you?
 
 A: Indeed, you can.  flocks() are even preserved across renames.

Yes, I've seen that.


 A: Agreed, this is a problem.  However, no system in place will
prevent at any time anyone with the proper credentials from
doing:

I was aiming more at:
 User A adds user with pw(8)
  * pw(8) gets together stuff, internally adds user and group
  * pw(8) builds homedirs, skel files, changes ownership
  * pw(8) opens and locks master.passwd (and assorted temp files) and
starts rewriting
- User B adds group
  % User B's pw(8) (or other) invocation adds group with GID that User
A's pw(8) had already assigned and chmod'd for user above
  * User A's pw(8) either errors out leaving job half done, or adds group
with duplicate GID, or (not coded) backs out everything it's done and
starts over, or (ditto) backs out everything it's done and errors
back to User A, who screams and bitches and dumps User B into the
East River after sundown.

Similar situations can be thought up.  My POV on this is that more than 1
file has to be coordinated here, so why not just make 1 file responsible
for the coordination?  I don't care THAT much if it's called
/var/run/pwd.lock or /etc/passwd.lock or /kernel (I suggest testing the
latter on a scratch system).  It just seems to me the simplest and least
error-prone way of doing it.


Yes, a lock on changes to the auth information is the only
way to ensure consistency, which is what this is all about,
but HOW to accomplish this issue at hand.

Now, mind you; were this a high-traffic subsystem, where one would
normally have 2 or 3 or 4 requests (changes) outstanding, with a
constant stream coming in keeping the queue full, I'd be out there
screaming for fine-grained locking of individual pieces of it.  But
really; how many systems are there (that are using stock passwd and
getpw* and pw/pwd_mkdb/etc) that are adding users at even a significant
fraction of the speed the system can process them?  A coarse lock is
easier on the programmer (especially being as we have quickie functions
that do the job for them), the admin (easier to see an extra lock file,
then a hung process holding a flock()), and pretty much everything else.


 Why use lockfiles, when we can lock files?  Let's lock files and
 keep the snack bar open!  :-)

We are locking a file   :)
We're just locking one file for the whole subsystem, so the pieces can be
assured of being in sync.


So, where's a snack bar with no CVS repository or text editor or diff(1);
I need a vacation!



-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Terry Lambert

Paul Herman wrote:
 Fine, then lock them both, and use rename().  Patch attached.
 
 In fact, if you look at fileupdate(), you see that it already gains
 an exclusive lock on the temp file, but not the original
 /etc/master.passwd (if you will.)  I think this is a bug, because
 the original is getting modified (at least in name space), so that
 should be locked while pw(8) is operating.
 
 What do you think?

Don't use an exclusive open.  Use a standard lock file lock
mechanism.

The problem with your proposed patch is that it breaks the
ability to allow authentication against the database while
it is undergoing modification, which may be a prolonged period.

The correct way to deal with this is to make a hard link against
the file, verify that the link count is exactly 2, and if it's
not, back off and fail.

You are only permitted to write the file if you are the one that
made the second link.

Since both rename and link are atomic operations in the directory,
you are safe.

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Paul Herman

On Sun, 23 Jun 2002, Matthew D. Fuller wrote:

 I don't claim this is perfect for every case; I'm just taking aim
 at the common case, where it's currently far too easy to
 accidentally screw yourself in the foot.

...and have the screw (or bullet) delivered as precisely, quickly
and efficiently as possible.

Clearly, at the root of our disagreement is what we both perceive
the problem to be.

I don't see problems in the current implementation, aside from bugs
that lead to unexpected behavior, i.e. passwd file corruption.
You see the problem as a deficiency in the implementation itself,
and wish to protect the user from shooting themselves in the foot.

Not only do I think that's impossible[*], I choose to fight for my
right to shoot myself in the foot as quickly and efficiently as
possible, but that's where we'll disagree, and I'll just leave it
at that and wish you a good night's sleep.

-Paul.

[*] Patch to vi refusing to edit filenames containing
master.passwd withheld by request.  ;-)


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread .

Matthew D. Fuller writes:
...
  Why use lockfiles, when we can lock files?  Let's lock files and
  keep the snack bar open!  :-)
 
 We are locking a file   :)
 We're just locking one file for the whole subsystem, so the pieces can be
 assured of being in sync.
Time to implement hierarchic lock.
hlock dir lead to fail hlock
every file in that dir with ref count == 1
??

Extremly unefficient

-- 
@BABOLO  http://links.ru/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-06-23 Thread Geoffrey C. Speicher

On Sun, 23 Jun 2002, Paul Herman wrote:

 You see the problem as a deficiency in the implementation itself,
 and wish to protect the user from shooting themselves in the foot.

 Not only do I think that's impossible[*], I choose to fight for my
 right to shoot myself in the foot as quickly and efficiently as
 possible, but that's where we'll disagree, and I'll just leave it
 at that and wish you a good night's sleep.

Not to be snooty, but there will always be plenty of ways for you to shoot
yourself in the foot if you want to play that way.  For regular users of
pw(8) who don't want to get shot, it needs to be fixed when used within
its intended bounds.

As far as I'm concerned, it comes down to what is most feasible to
implement correctly without an unreasonably large effort.  In reality, it
comes down to how a committer would like to see it work, since the
committers are the ones who ultimately have to deal with and are
responsible for the change.

So we either need to have a compelling solution or get a committer to step
in and make up our minds for us.

Geoff


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-05-18 Thread Geoffrey C. Speicher

[I'm bringing the discussion back to -hackers, since it was omitted in
your original reply, Paul.  To get everyone up to speed, Paul suggested
that calling unlink() _before_ close() should solve the race condition
mentioned in my original message.  However, that still leads to
corruption, and we're trying to figure out why.]

On Fri, 17 May 2002, Paul Herman wrote:

  The file isn't actually unlinked until p1 closes the fd
 
 Sure it is.  Try a little C:
 
 fd = open(watchme, O_CREAT|O_EXLOCK);
 sleep(10);
 unlink(watchme);
 sleep(10);
 close(fd);
 
 You'll see the file disappear after 10 seconds, not 20.

Hmm.  Rechecking the man page for unlink(2) I see that I worded the above
incorrectly.  The file is unlinked when you unlink(), but not actually
removed until close().

 Also, open() looks like it blocks until it can get the lock

But when does open() block in the following scenario?

 p1:open()  - open  lock file
 p2:open()  - BLOCK
 p1:unlink()- remove link only, do not remove file
 p2:open()  - WAKEUP
 p1:close() - close fd

Does it block before it opens a fd, or after then but before it gains an
exclusive lock?  In other words, did p2 open a new file after it woke up,
or did it open the same file that p1 had opened?  If the latter is true
then we could extend the above:

 p3:open()  - open  lock file (since p1 unlinked p1  p2's copy)
 p2/p3:update master.passwd   - BOOM

If the former is true, then I'm pretty much out of ideas.  I don't think
the corruption is due to something else, since simply removing the call to
unlink() does in fact fix the problem.

Geoff


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: bug in pw, -STABLE [patch]

2002-05-18 Thread Paul Herman

On Sat, 18 May 2002, Geoffrey C. Speicher wrote:

   The file isn't actually unlinked until p1 closes the fd
 

 Hmm.  Rechecking the man page for unlink(2) I see that I worded
 the above incorrectly.  The file is unlinked when you unlink(),
 but not actually removed until close().

Well, unlink and remove are the same in my vocabulary.  The file is
removed when you unlink it, but the will lock remain until you
close it.

 But when does open() block in the following scenario?

  p1:open()- open  lock file
  p2:open()- BLOCK
  p1:unlink()  - remove link only, do not remove file
  p2:open()- WAKEUP
  p1:close()   - close fd

p2 blocks until p1 closes the fd.  The problem in this example is,
by the time p2 gains the lock, the file in non-existant in
/var/lock, which allows p3 to grab a new 2nd lock, which gives you
the corruption.

You should open() with O_NONBLOCK in order to solve your problem,
but the calling process will fail.  In that case, pw should exit
with a sysexits(3) error to notify the caller.  Having pw loop for
a few seconds while trying to obtain the lock (like lockf(1) does)
might be nice.

-Paul.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message