Re: fd code multithreaded race?

2010-08-04 Thread Antti Kantee
On Wed Aug 04 2010 at 13:21:07 +, Andrew Doran wrote:
 On Sat, Jul 31, 2010 at 08:31:19PM +0300, Antti Kantee wrote:
  Hi,
  
  I'm looking at a KASSERT which is triggering quite rarely for me (in
  terms of iterations):
  
  panic: kernel diagnostic assertion dt-dt_ff[i]-ff_refcnt == 0 failed: 
  file 
  /usr/allsrc/src/sys/rump/librump/rumpkern/../../../kern/kern_descrip.c, 
  line 856
  
  Upon closer examination, it seems that this can trigger while another
  thread is in fd_getfile() between upping the refcount, testing for
  ff_file, and fd_putfile().  Removing the KASSERT seems to restore correct
 
 You're right there, the KASSERT() is wrong, it should be removed.

Thanks, I'll do that.

  operation, but I didn't read the code far enough to see where the race
  is actually handled and what stops the code from using the wrong file.
 
 FYI the fdfile_t (per-descriptor records) are stable for the lifetime of the
 process, what each record descibes can and does of course change, and how
 those records are pointed to does change (fdtab_t).
  
 There isn't really a concept of wrong file, as in, the app gets
 what it asked for.  It is free to ask for the wrong thing, and it's free
 to ask for the right thing at the wrong time, etc - that's its problem.
 
 Unless you're alluding to another bug?

Not really.  I just started thinking about how applications can make
sure they use the right file descriptor.  It seems using close() to
notify other threads of a file descriptor being closed is racy.

So something naiive like this:

t1: lock
t1: get fd1
t1: unlock
/* t1 wants to do a syscall with fd1 but is preempted */
t2: lock
t2: close fd1
t2: unlock
t3: lock
t3: open, result fd1
t3: unlock
t1: syscall fd1 ...

will give you the wrong result.  Essentially there is no interlock from
the application lookup to the kernel backing object lookup.

So I guess if you want things to work correctly, instead of close()
you need to dup2() to a zombie/deadfs fd and wait for all threads to
check in before you can close it.  (i assume dup2 is atomic)

Never realized file descriptors and threads were so tricky ;)


Re: fd code multithreaded race?

2010-08-04 Thread der Mouse
 So something naiive like this: [...] will give you the wrong result.

True.

Essentially, you have a shared resource (the fd-number to thing table)
which you are using without any locking.  Of course there are races!

Some threading setups allow different threads to have independent file
descriptor tables, which would avoid the issue you sketch (but, of
course, introduce other issues).

 Never realized file descriptors and threads were so tricky ;)

There's nothing special about file descriptors here.  You have
basically the same issue with any other piece of state which is shared
by all threads.  To pick two more examples the kernel maintains,
working directory and umask.  These are a shared resource; like any
shared resource, accessing them from multiple threads requires care,
and usually locking of some sort.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: weird ktrace timestamps?

2010-08-04 Thread der Mouse
 There are two interesting things here: the timestamps in the file do
 not warp backwards by almost two seconds - but they do warp
 backwards, by 3353 ns.
 Try a different time counter.  See kern.timecounter.

kern.timecounter.choice = clockinterrupt(q=0, f=100 Hz) TSC(q=-100, 
f=1795636350 Hz) ACPI-Fast(q=1000, f=3579545 Hz) i8254(q=100, f=1193182 Hz) 
dummy(q=-100, f=100 Hz)
kern.timecounter.hardware = ACPI-Fast
kern.timecounter.timestepwarnings = 0

I switched to TSC and got the same syndrome (well, as far as kdump
output goes; I didn't dig into ktrace.out).  Switching to
clockinterrupt made the syndrome go away, but all the timestamp deltas
printed by kdump -R were zero (there probably were a very few that
weren't, but I didn't see any of them in a quick eyeball skim).

 Is this UP or MP?

MP.

cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Core 2 (Merom) (686-class), 1795.62 MHz, id 0x6fd
cpu0: features bfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR
cpu0: features bfebfbffPGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX
cpu0: features bfebfbffFXSR,SSE,SSE2,SS,HTT,TM,SBF
cpu0: features2 e3bdSSE3,MONITOR,DS-CPL,VMX,EST,TM2,xTPR
cpu0: Intel(R) Core(TM)2 Duo CPU T7100  @ 1.80GHz
cpu0: I-cache 32 KB 64B/line 8-way, D-cache 32 KB 64B/line 8-way
cpu0: L2 cache 2 MB 64B/line 8-way
cpu0: using thermal monitor 1
cpu0: Enhanced SpeedStep (1420 mV) 2000 MHz
cpu0: unknown Enhanced SpeedStep CPU.
cpu0: using only highest and lowest power states.
cpu0: Enhanced SpeedStep frequencies available (MHz): 2000 1200
cpu0: calibrating local timer
cpu0: apic clock running at 199 MHz
cpu0: 64 page colors
cpu1 at mainbus0: apid 1 (application processor)
cpu1: starting
cpu1: Intel Core 2 (Merom) (686-class), 1795.43 MHz, id 0x6fd
cpu1: features bfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR
cpu1: features bfebfbffPGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX
cpu1: features bfebfbffFXSR,SSE,SSE2,SS,HTT,TM,SBF
cpu1: features2 e3bdSSE3,MONITOR,DS-CPL,VMX,EST,TM2,xTPR
cpu1: Intel(R) Core(TM)2 Duo CPU T7100  @ 1.80GHz
cpu1: I-cache 32 KB 64B/line 8-way, D-cache 32 KB 64B/line 8-way
cpu1: L2 cache 2 MB 64B/line 8-way
cpu1: using thermal monitor 1

I did consider the possibility that MP was relevant, but it strikes me
as highly unlikely that it would migrate between processors exactly
there every time (and, apparently, _only_ there).

Is it likely enough that it's something else tied to MPness that it's
worth trying a test under a UP kernel?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-04 Thread Paul Goyette

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
* as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is certainly a 
much larger task than dealing with the immediate problem - modules which 
want to load other modules.


So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described
   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?

2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this ultimate solution?  I think
   it would be a long time before we could find and resolve all of the
   issues that might be created in the various threaded situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those situations, and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.

3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: fd code multithreaded race?

2010-08-04 Thread Sad Clouds
On Wed, 4 Aug 2010 17:06:59 +0300
Antti Kantee po...@cs.hut.fi wrote:
 
 Never realized file descriptors and threads were so tricky ;)

I'm just finishing writing multi-threaded logging subsystem for one of
my projects. I think I faced a similar issue with threads and shared
file descriptors. Threads can log data to the same file, or different
files on the filesystem. They can log data to one file, or many
different files simultaneously.

I implemented a new type, which is basically a structure that contains
mutex, file descriptor, file path string and reference counter. Any
thread doing I/O on a particular descriptor has to lock its mutex.
Since, logging data can remain in buffers, waiting to be flushed,
threads increment descriptors' reference counters, when they flush
buffers they decrement those counters. This ensures that descriptors
are not freed from memory, until reference counter drops to 0. If the
value of the descriptor == -1, thread holding the lock calls open()
with the file path as the first argument.

My code uses a hash table to lookup descriptors, however I don't use
the value of an open file descriptor (which can change between
multiple open()/close() calls), but the address of the structure that
describes a particular file descriptor.