[zfs-code] ARC deadlock.

2007-08-09 Thread Pawel Jakub Dawidek
On Fri, May 18, 2007 at 08:22:26AM -0600, Mark Maybee wrote:
> Yup, Jürgen is correct.  The problem here is that we are blocked in
> arc_data_buf_alloc() while holding a hash_lock.  This is bug 6457639.
> One possibility, for this specific bug might be to drop the lock before
> the allocate and then redo the read lookup (in case there is a race)
> with the necessary buffer already in hand.

Any updates on this? We don't see those deadlocks when the ZIL is disabled,
and we're thinking about disabling the ZIL by default for 7.0-RELEASE if we
can't find a fix for this. It is somehow quite easy to trigger for some
workloads.

> Jürgen Keil wrote:
> >>Kris Kennaway  found a deadlock,
> >>which I think is not FreeBSD-specific.
> >>
> >>When we are running low in memory and kmem_alloc(KM_SLEEP) is called,
> >>the thread waits for the memory to be reclaimed, right?
> >
> >>In such a situation the arc_reclaim_thread thread is woken up.
> >>
> >>Ok. I've two threads waiting for the memory to be freed:
> >>
> >>First one, and this one is not really problematic:
> >...
> >>And second one, which holds
> >>arc_buf_t->b_hdr->hash_lock:
> >
> >
> >Bug 6457639 might be related, 
> >http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6457639
> >
> >In this case I also found the arc deadlocking because of KM_SLEEP
> >allocations, while having parts of the buf hash table locked.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
pjd at FreeBSD.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


[zfs-code] ARC deadlock.

2007-08-09 Thread Mark Maybee
Sorry Pawel, we have not done much with this bug.  It is not a high
priority bug for us, since we don't see it under Solaris, and there have
been lots of other demands on our time and resources lately.  Have you
tried exploring the fix I suggested?  We are always happy to accept
fixes from the community :-).

-Mark




[zfs-code] ARC deadlock.

2007-08-09 Thread Neil Perrin
Pawel,

I don't understand the connection with the ZIL; the referenced
bug has no ZIL path. I think that deserves more investigation.

Also, please don't ship ZFS with the ZIL disabled. That would
break POSIX compliance, as fsync, O_DSYNC, and friends would no
longer work. Applications that rely on them may generate corrupt
files. Users will also complain when performance decreases once
the ZIL is re-enabled!

Thanks: Neil.




[zfs-code] [crypto-discuss] ZFS boot and swrand (Was Re: [zfs-discuss] ZFS boot: 3 smaller glitches with console, )

2007-08-09 Thread Krishna Yenduri
Darren J Moffat wrote:
> Yannick Robert wrote:
>   
>> Hello
>>
>> It seems I have the same problem after a ZFS boot installation (following
>> the setup for a snv_69 release at
>> http://www.opensolaris.org/os/community/zfs/boot/zfsboot-manual/ ). The
>> outputs from the requested commands are similar to the outputs posted by
>> dev2006.
>>
>> Reading this page, I found no solution concerning the /dev/random problem.
>> Is there a procedure somewhere to repair my install?
>> 

 To answer Yannick's question, the /dev/random warning message does not
indicate any problem with the install and can be ignored.

> ...
>
> Unlike UFS when we do a ZFS boot we do use the in kernel interface to 
> /dev/random (random_get_bytes) before svc://system/cryptosvc has run.
>   

 To be exact, the API used by the ZFS kernel module is
random_get_pseudo_bytes().

> The message you are seeing is from KCF saying that it has a random pool 
> but nothing providing entropy to it.  This is because swrand hasn't yet 
> registered with kcf.
>   

 We had a similar issue with SCTP, where it uses the kernel API
 random_get_pseudo_bytes() before swrand could register.

 The solution we had there was to load swrand directly. From
uts/sparc/ip/Makefile:

	#
	# Depends on md5 and swrand (for SCTP). SCTP needs to depend on
	# swrand as it needs random numbers early on during boot before
	# kCF subsystem can load swrand.
	#
	LDFLAGS += -dy -Nmisc/md5 -Ncrypto/swrand -Nmisc/hook -Nmisc/neti


 I think we can do a similar thing here. The zfs (or is it zfs-root?)
kernel module can have crypto/swrand as a dependency. I see that
uts/sparc/zfs/Makefile lists drv/random as a dependency. This is not
needed, because the API is in modstubs now and is no longer implemented
in drv/random. That can be replaced with crypto/swrand.

 swrand does not need any crypto signature verification, so it can
safely be loaded early on during boot.
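 A sketch of what that Makefile change might look like, following the SCTP
example above (untested; the exact lines would need to match the real
uts/sparc/zfs/Makefile):

	#
	# Depend on crypto/swrand instead of drv/random, so that
	# random_get_pseudo_bytes() has an entropy provider registered
	# with KCF early in boot, before svc://system/cryptosvc runs.
	#
	LDFLAGS += -dy -Ncrypto/swrand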

> Now this was all done prior to newboot and SMF, and part of the goal of 
> why KCF works this way with software providers was to ensure no boot-time 
> performance regressions, by doing load on demand rather than forcing the 
> loading of all modules at boot time.
 
Yes. This requirement added a lot of complexity to KCF.

> With newboot on x86, and soon 
> on SPARC, the swrand module will be in the boot archive anyway.
>   

 That would be great. It is cleaner and will remove the need for ad hoc
solutions like the one above.

-Krishna