Re: Lost file-system story

2011-12-14 Thread David Holland
On Thu, Dec 15, 2011 at 12:48:51AM +0400, Aleksej Saushev wrote:
 > >> There have of course also been some pretty serious bugs in various fsck
 > >> implementations across the years and vendors.
 > >
 > > I'd be suspicious of fsck failing on a regularly mounted disk with
 > > corruption that can't otherwise be tracked to outside influences (bad
 > > ram, bad disk cache, etc). I've seen some bizarre things happen on ram
 > > errors over the years for instance.
 > 
 > I once got an infinite sequence of nested subdirectories on new
 > hardware and "stable" FreeBSD 5.3. Something like http://xkcd.com/981/
 > fsck refused to work there.

At one point some time back when pounding on rename, I got a test
volume into a state where if you ran fsck -fy it would "fix" a ton of
stuff, run to completion, and mark the fs clean. Which was great,
except that if you did it again, it would do the same thing. Over and
over. I'm glad it was a test volume...

-- 
David A. Holland
dholl...@netbsd.org


Re: [RFC] getgroups2 system call

2011-12-14 Thread Emmanuel Dreyfus
Eric Haszlakiewicz  wrote:

> I know approximately nothing about the FUSE protocol, but would it be
> feasible to keep a fixed length header with a flag that says extra groups
> can be found in the payload of the message, either before or after the regular
> payload?

We could have a secondary group trailer, I thought about this
possibility, but that is going to cause a lot of changes in filesystems,
since you typically compute the payload length to be packet length minus
header length.
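
As a rough illustration of the length arithmetic in question (a sketch
only: the header layout, the flag and every name below are hypothetical
and are not the real FUSE wire format), a trailer of secondary groups
would change the usual "packet length minus header length" computation
like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire header; NOT the real FUSE header layout. */
struct demo_hdr {
    uint32_t len;      /* total packet length, header included */
    uint32_t flags;    /* DEMO_F_GROUP_TRAILER if a gid trailer follows */
    uint32_t ngroups;  /* number of gids in the trailer */
    uint32_t pad;      /* keeps the header at 16 bytes */
};

#define DEMO_F_GROUP_TRAILER 0x1u
#define DEMO_BAD_LEN ((size_t)-1)

/*
 * Payload length = packet length - header length - optional trailer.
 * Returns DEMO_BAD_LEN if the lengths in the header are inconsistent.
 */
size_t demo_payload_len(const struct demo_hdr *h)
{
    size_t body, trailer = 0;

    assert(h != NULL);
    if (h->len < sizeof(*h))
        return DEMO_BAD_LEN;
    body = h->len - sizeof(*h);

    if (h->flags & DEMO_F_GROUP_TRAILER) {
        /* Bound ngroups first so the multiplication cannot overflow. */
        if (h->ngroups > body / sizeof(uint32_t))
            return DEMO_BAD_LEN;
        trailer = h->ngroups * sizeof(uint32_t);
    }
    return body - trailer;
}
```

Every filesystem that currently subtracts only the header size would
need the extra branch, which is the cost described above.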

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: Lost file-system story

2011-12-14 Thread Greg A. Woods
At Wed, 14 Dec 2011 07:50:37 + (UTC), mlel...@serpens.de (Michael van Elst) 
wrote:
Subject: Re: Lost file-system story
> 
> wo...@planix.ca ("Greg A. Woods") writes:
> 
> >easy, if not even easier, to do a "mount -u -r"
> 
> Does this work again?

Not that I know of, and PR#30525 concurs, as does the commit mentioned
in that PR to prevent it from falsely appearing to work, a change which
remains in netbsd-5 and -current to date.  See my discussion of this
issue earlier in this thread.

-- 
Greg A. Woods
Planix, Inc.

   +1 250 762-7675    http://www.planix.com/


Re: [RFC] getgroups2 system call

2011-12-14 Thread Michael van Elst
On Wed, Dec 14, 2011 at 03:22:19PM -0600, Eric Haszlakiewicz wrote:
> On Wed, Dec 14, 2011 at 07:57:43AM +, Michael van Elst wrote:
> > mm_li...@pulsar-zone.net (Matthew Mondor) writes:
> > 
> > >What does NFS do in this case?  I seem to remember that it also imposes
> > >a sane size limit, possibly even below NGROUPS_MAX, is it really the
> > >case?  If so, would this also be acceptable?
> > 
> > NFS (or rather the underlying SunRPC) passes an array of 16 gids, which is
> > a common problem when you try to use groups for fine grained access control.
> 
> Based on what I've read, it's only NFSv3 that works like that.  With
> NFSv4 the access control can be based on what groups the server thinks the
> user is in, so there are no group ids being passed.

http://nfsworld.blogspot.com/2005/03/whats-deal-on-16-group-id-limitation.html
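
For reference, the limit comes from the AUTH_SYS credential of ONC RPC,
whose XDR definition (RFC 5531) declares the supplementary gid array
with a maximum of 16 entries. A minimal sketch of what happens to a
longer group list follows; the function name and the choice to keep the
first 16 entries are assumptions made for illustration, not taken from
any particular client:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AUTHSYS_MAX_GIDS 16  /* bound of the gids array in AUTH_SYS */

/*
 * Copy at most AUTHSYS_MAX_GIDS gids into out[] and report how many
 * did not fit. Returns the number of gids kept.
 */
size_t authsys_clip_gids(const uint32_t *gids, size_t ngids,
                         uint32_t out[AUTHSYS_MAX_GIDS], size_t *dropped)
{
    size_t i, kept;

    assert(gids != NULL && out != NULL && dropped != NULL);
    kept = ngids < AUTHSYS_MAX_GIDS ? ngids : AUTHSYS_MAX_GIDS;
    for (i = 0; i < kept; i++)
        out[i] = gids[i];
    *dropped = ngids - kept;
    return kept;
}
```

A user in 22 groups, as in the example elsewhere in this thread, would
have 6 of them silently ignored by a server that trusts only the gids
in the credential.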

Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: ehci1: missed microframe, TT reset not implemented, hub might be inoperational

2011-12-14 Thread Christos Zoulas
In article ,
Donald Allen   wrote:
>I am seeing the subject message
>
>ehci1: missed microframe, TT reset not implemented, hub might be inoperational
>
>on occasion on my Lenovo S10 workstation. It seems to coincide with
>the mouse not working (usb mouse, connected through a Raritan KVM). I
>have had to reboot to get the mouse working -- restarting X doesn't do
>it -- so a real problem.
>
>I'm running the GENERIC 5.1 kernel at the moment, but I note that
>someone else complained about this on 'current-users' last August,
>running the then-current kernel on an HP ProLiant MicroServer N36L.
>
>Anyone have any thoughts about this?
>
>/Don
>

The TT reset looks a bit involved to implement, but Linux does it. The *BSDs
report the condition but do not perform the reset.

christos



Re: [RFC] getgroups2 system call

2011-12-14 Thread Eric Haszlakiewicz
On Wed, Dec 14, 2011 at 07:57:43AM +, Michael van Elst wrote:
> mm_li...@pulsar-zone.net (Matthew Mondor) writes:
> 
> >What does NFS do in this case?  I seem to remember that it also imposes
> >a sane size limit, possibly even below NGROUPS_MAX, is it really the
> >case?  If so, would this also be acceptable?
> 
> NFS (or rather the underlying SunRPC) passes an array of 16 gids, which is
> a common problem when you try to use groups for fine grained access control.

Based on what I've read, it's only NFSv3 that works like that.  With
NFSv4 the access control can be based on what groups the server thinks the
user is in, so there are no group ids being passed.

eric


Re: [RFC] getgroups2 system call

2011-12-14 Thread Eric Haszlakiewicz
On Wed, Dec 14, 2011 at 07:04:06AM +0100, Emmanuel Dreyfus wrote:
> I explored the option of modifying the FUSE protocol, and that is
> tough. We can easily negotiate an extended FUSE header that contains
> secondary groups, and I already submitted a patch that does exactly
> that, but then we face two conflicting requirements:
> 
> - a fixed length header is highly desirable for performance
> optimization. For instance glusterfs fetches the header and the data
> using readv(2) with an iovec that has two slots. That way it gets write
> data aligned on a page boundary.
> 
> - a fixed length header means an array of secondary groups with
> NGROUPS_MAX slots, but Linux's NGROUPS_MAX is 65536, which means an
> insane waste of space. Therefore we need an array of secondary groups
> that is not bigger than the used slots.
> 
> As a tradeoff between the two requirements, I proposed that the
> filesystem could request a minimum size for secondary group array. That
> way, the header would be of fixed length most of the time, except when
> there are many groups (something that can only happen on Linux: NetBSD's
> NGROUPS_MAX is much more reasonable). Big amount of secondary groups

Ours is 16, which seems more like ridiculously limiting, rather than
"reasonable".  For instance, on one of the (Linux) machines at work
I'm in 22 different groups, and other people are in many more.  
Whatever solution is found for the FUSE problem, IMO it should be able
to efficiently handle at least a few dozen groups.

I know approximately nothing about the FUSE protocol, but would it be
feasible to keep a fixed length header with a flag that says extra groups
can be found in the payload of the message, either before or after the regular
payload?
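
To make the alignment argument quoted above concrete, here is a minimal
sketch of the two-slot readv(2) pattern. It only works because the
header size is known before the read. The struct and function names are
hypothetical and the header is not the real FUSE layout:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical fixed-length header (16 bytes). */
struct sketch_hdr {
    uint32_t len;
    uint32_t opcode;
    uint64_t unique;
};

/* Allocate a page-aligned payload buffer; release it with free(). */
void *sketch_alloc_payload(size_t cap)
{
    void *p = NULL;
    long pagesz = sysconf(_SC_PAGESIZE);

    assert(pagesz > 0);
    if (posix_memalign(&p, (size_t)pagesz, cap) != 0)
        return NULL;
    return p;
}

/*
 * One readv(2) call: slot 0 receives the fixed header, slot 1 receives
 * the payload directly into the page-aligned buffer, with no copy.
 * Returns the number of payload bytes read, or -1 on error/short read.
 */
ssize_t sketch_read_packet(int fd, struct sketch_hdr *hdr,
                           void *payload, size_t cap)
{
    struct iovec iov[2];
    ssize_t n;

    iov[0].iov_base = hdr;
    iov[0].iov_len = sizeof(*hdr);
    iov[1].iov_base = payload;
    iov[1].iov_len = cap;

    n = readv(fd, iov, 2);
    if (n < (ssize_t)sizeof(*hdr))
        return -1;
    return n - (ssize_t)sizeof(*hdr);
}
```

With a variable-length header the size of slot 0 is not known up front,
so the payload either lands unaligned or has to be copied afterwards.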

eric


Re: Lost file-system story

2011-12-14 Thread Aleksej Saushev
James Chacon  writes:

> On Tue, Dec 13, 2011 at 4:09 PM, Greg A. Woods  wrote:
>> At Wed, 14 Dec 2011 09:06:23 +1030, Brett Lymn  
>> wrote:
>> Subject: Re: Lost file-system story
>>>
>>> On Tue, Dec 13, 2011 at 01:38:57PM +0100, Joerg Sonnenberger wrote:
>>> >
>>> > fsck is supposed to handle *all* corruptions to the file system that can
>>> > occur as part of normal file system operation in the kernel. It is doing
>>> > best effort for others. It's a bug if it doesn't do the former and a
>>> > potential missing feature for the latter.
>>>
>>> There are a lot of slips twixt cup and lip.  If you are really unlucky
>>> you can get an outage at just the wrong time that will cause the
>>> filesystem to be hosed so badly that fsck cannot recover it.  Sure, fsck
>>> can run to completion but all you have is most of your FS in lost+found
>>> which you have to be really really desperate to sort through.  I have
>>> been working with UNIX for over 20years now and I have only seen this
>>> happen once and it was with a commercial UNIX.
>>
>> I've seen that happen more than once unfortunately.  SunOS-4 once I think.
>>
>> I agree 100% with Joerg here though.
>>
>> I'm pretty sure at least some of the times I've seen fsck do more damage
>> than good it was due to a kernel bug or more breaking assumptions about
>> ordered operations.
>>
>> There have of course also been some pretty serious bugs in various fsck
>> implementations across the years and vendors.
>
> I'd be suspicious of fsck failing on a regularly mounted disk with
> corruption that can't otherwise be tracked to outside influences (bad
> ram, bad disk cache, etc). I've seen some bizarre things happen on ram
> errors over the years for instance.

I once got an infinite sequence of nested subdirectories on new
hardware and "stable" FreeBSD 5.3. Something like http://xkcd.com/981/
fsck refused to work there.


-- 
HE CE3OH...



Re: [RFC] getgroups2 system call

2011-12-14 Thread Christos Zoulas
On Dec 14,  6:05am, m...@netbsd.org (Emmanuel Dreyfus) wrote:
-- Subject: Re: [RFC] getgroups2 system call

| Christos Zoulas  wrote:
| 
| > Don't you need a getuid2(pid_t pid)? 
| 
| uid, gid and pid are passed in the FUSE header, so we already have them.
| 
| > Why don't you add separate fuse messages to send and retrieve this
| > information? Then the kernel can notify if these have changed...
| 
| That adds a lot of state in the kernel (you need to update creds on
| process termination and setgroups(2) calls), which makes the thing even
| harder to port. And on the performance front, new messages add latency.
| 
| At this point, I think I will fetch secondary groups through sysctl,
| this seems to be the point of least resistance.

I concur :-)

christos


Re: Audio drivers- Difference between start_output and trigger_output.

2011-12-14 Thread Iain Hibbert
On Wed, 14 Dec 2011, Nat Sloss wrote:

> Is it possible to rewrite btsco so that it uses trigger_output/input?

> Any thoughts appreciated.

I don't think trigger_input/output is very useful for btsco(4), as it just
forwards the output it receives rather than producing/consuming audio
in real time. In fact, if the pad(4) driver had been available at the time
I wrote btsco(4) then I think I would have used that directly from
bthset(1) instead of an in-kernel implementation.

iain


Re: Audio drivers- Difference between start_output and trigger_output.

2011-12-14 Thread Valeriy E. Ushakov
Disclaimer: it's almost 10 years since I last touched our audio
framework :)

On Wed, Dec 14, 2011 at 23:23:31 +1100, Nat Sloss wrote:

> I have read audio(9) and have looked at several audio drivers.  I was having 
> trouble with btsco and have found it uses start output instead of trigger 
> output.
> 
> So my question is what is the difference between start output and trigger 
> output? 
>
> Is block,blksize in start output equivalent to start,blksize in trigger 
> output?

There's some minimal description in the audio(9) manpage.

A driver must supply either start_output or trigger_output.  If
trigger_output is provided it is used instead of start_output.  Check
sys/dev/audio.c - I don't like giving RTFS for an answer, but in this
case it might be faster and easier to read the 3 code snippets that refer
to trigger_output and start_output than to define the semantics formally.

Informally, you should use trigger_output if you can efficiently
arrange for your DMA engine to process the next chunk of data while
your interrupt routine calls back to audio_pint for even more data.

In this case audio(9) begins playback with trigger_output and in
audio_pint it just throws more data into the ringbuffer.  I guess it's
classic parallel producer/consumer.

If you use start_output instead then audio(9) begins playback with
start_output and in audio_pint it also needs to call start_output for
the new chunk.  There's no parallelism.


E.g. for cs4231 we use trigger_output to program the first chunk into
address/count registers and second chunk into next address/next count
registers.  When the first chunk is done hardware automatically loads
"next" registers into "main" registers and proceeds with playback
while interrupt routine reprograms "next" registers with the third
chunk and asks audio_pint to supply more data now that the first chunk
data has been consumed.  Thus for trigger_output you need start, end
and blocksize.

If you can't arrange for such parallelism, you have to use
start_output, which takes a single block described by block+blocksize, and
audio_pint needs to explicitly call start_output again to continue
playback.
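
The difference can be shown with a toy user-space model. This is only a
sketch of the call pattern described above, not driver code: nothing
here is the audio(9) interface, all names are made up, and "interrupts"
are just loop iterations. It counts how often the audio layer has to
call into the driver to keep playback going.

```c
#include <assert.h>
#include <stddef.h>

struct model_stats {
    size_t driver_calls;   /* calls from the audio layer into the driver */
    size_t blocks_played;  /* blocks the "hardware" has consumed */
};

/*
 * start_output style: each call hands the hardware exactly one block
 * (block + blksize). When that block is done, the interrupt path has
 * to call into the driver again for the next one.
 */
void model_play_start(struct model_stats *st, size_t nblocks)
{
    size_t i;

    assert(st != NULL);
    for (i = 0; i < nblocks; i++) {
        st->driver_calls++;    /* stands in for start_output() */
        st->blocks_played++;   /* block finishes, "interrupt" fires */
    }
}

/*
 * trigger_output style: the driver is told once about the whole ring
 * (start, end, blksize). The hardware then walks the ring on its own;
 * each "interrupt" only tells the audio layer to refill the block that
 * was just consumed. Returns the final block position in the ring.
 */
size_t model_play_trigger(struct model_stats *st, size_t ring_blocks,
                          size_t nblocks)
{
    size_t i, pos = 0;

    assert(st != NULL && ring_blocks > 0);
    st->driver_calls++;        /* stands in for trigger_output(), once */
    for (i = 0; i < nblocks; i++) {
        st->blocks_played++;
        pos = (pos + 1) % ring_blocks;  /* hardware wraps by itself */
    }
    return pos;
}
```

For 8 blocks the first model makes 8 driver calls and the second makes
one, which is the producer/consumer overlap described earlier.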


I hope I didn't make too many mistakes in this :)

-uwe


Re: [RFC] getgroups2 system call

2011-12-14 Thread Thor Lancelot Simon
On Wed, Dec 14, 2011 at 02:00:49PM +, Emmanuel Dreyfus wrote:
> On Wed, Dec 14, 2011 at 08:55:35AM -0500, Thor Lancelot Simon wrote:
> > So, um, whoever "considers" it that way -- they understand there are
> > security implications to not doing it some other way?
> 
> Not quite. But BTW, what are the security implications? The only case
> I can think of is a thread doing a file operation while another one
> does a setgroups(2). Usual filesystem semantics require the operation 
> to be evaluated against older groups, but it may be evaluated with newer
> ones.

I suspect the same condition is possible with nonblocking I/O.  But
the most obvious problem is that this can cause a program that tries
to drop privileges before doing a file operation to do so _after_ doing
the file operation.  There are probably several other similar issues.
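
A toy model of the ordering problem, as a sketch only (all types and
function names are invented for illustration; no real kernel or FUSE
interface is shown). The request is checked once against the groups
captured when it was issued, and once against groups looked up later,
after the process has changed them:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAXGROUPS 8

struct toy_creds { unsigned groups[TOY_MAXGROUPS]; size_t ngroups; };
struct toy_proc  { struct toy_creds creds; };
struct toy_req   { unsigned file_gid; struct toy_creds at_issue; };

static bool toy_in_group(const struct toy_creds *c, unsigned gid)
{
    size_t i;

    for (i = 0; i < c->ngroups; i++)
        if (c->groups[i] == gid)
            return true;
    return false;
}

/* Stand-in for setgroups(2): replace the whole group list. */
void toy_setgroups(struct toy_proc *p, const unsigned *g, size_t n)
{
    size_t i;

    assert(p != NULL && n <= TOY_MAXGROUPS);
    for (i = 0; i < TOY_MAXGROUPS; i++)
        p->creds.groups[i] = i < n ? g[i] : 0;
    p->creds.ngroups = n;
}

/* Issue a request; the credentials are captured at this moment. */
struct toy_req toy_issue(const struct toy_proc *p, unsigned file_gid)
{
    struct toy_req r;

    r.file_gid = file_gid;
    r.at_issue = p->creds;
    return r;
}

/* Synchronous semantics: use the groups captured with the request. */
bool toy_check_at_issue(const struct toy_req *r)
{
    return toy_in_group(&r->at_issue, r->file_gid);
}

/* Asynchronous fetch: use whatever groups the process has right now. */
bool toy_check_late(const struct toy_req *r, const struct toy_proc *p)
{
    return toy_in_group(&p->creds, r->file_gid);
}
```

If the process issues a request while in group 100 and then drops that
group, the two checks disagree. Whichever answer the filesystem ends up
using, the operation is no longer ordered against the privilege change
the way the program intended.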

Thor


Re: [RFC] getgroups2 system call

2011-12-14 Thread Emmanuel Dreyfus
On Wed, Dec 14, 2011 at 08:55:35AM -0500, Thor Lancelot Simon wrote:
> So, um, whoever "considers" it that way -- they understand there are
> security implications to not doing it some other way?

Not quite. But BTW, what are the security implications? The only case
I can think of is a thread doing a file operation while another one
does a setgroups(2). Usual filesystem semantics require the operation 
to be evaluated against older groups, but it may be evaluated with newer
ones.
-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [RFC] getgroups2 system call

2011-12-14 Thread Thor Lancelot Simon
On Wed, Dec 14, 2011 at 07:04:06AM +0100, Emmanuel Dreyfus wrote:
> 
> desired. That last proposal has been considered "a series of hacks to
> make it conform to the requirements", therefore I am left with fetching
> secondary groups asynchronously through sysctl.

So, um, whoever "considers" it that way -- they understand there are
security implications to not doing it some other way?

Thor


ehci1: missed microframe, TT reset not implemented, hub might be inoperational

2011-12-14 Thread Donald Allen
I am seeing the subject message

ehci1: missed microframe, TT reset not implemented, hub might be inoperational

on occasion on my Lenovo S10 workstation. It seems to coincide with
the mouse not working (usb mouse, connected through a Raritan KVM). I
have had to reboot to get the mouse working -- restarting X doesn't do
it -- so a real problem.

I'm running the GENERIC 5.1 kernel at the moment, but I note that
someone else complained about this on 'current-users' last August,
running the then-current kernel on an HP ProLiant MicroServer N36L.

Anyone have any thoughts about this?

/Don


Re: [RFC] getgroups2 system call

2011-12-14 Thread Emmanuel Dreyfus
David Holland  wrote:

> So they want to do it right, but this causes trouble on Linux? Why is
> this *our* problem?

Our problem is to have NetBSD support FUSE filesystems, and this
sometimes requires changes to libfuse.
 
> Also, has fuse really been around this long without ever having had
> support for group lists? Are you really the first person who's ever
> tried to use it for serious work?

On Linux that information is obtained from /proc, at the expense of
opening and reading it on each file operation, which is not good for
performance. On NetBSD this causes vnode deadlock problems (Manuel
experienced it, he can explain further).
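
For readers unfamiliar with the Linux side: the per-process status file
under /proc contains a "Groups:" line listing the supplementary gids.
A minimal sketch of parsing that line from an already-read buffer is
below; the function name is hypothetical and this is not libfuse's
actual code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Parse the "Groups:" line out of the text of a Linux /proc status
 * file. Stores at most cap gids in out[]. Returns the total number of
 * gids on the line (which may exceed cap), or -1 if the line is
 * missing or malformed.
 */
int parse_status_groups(const char *status, gid_t *out, size_t cap)
{
    const char *p = status;
    int n = 0;

    assert(status != NULL);

    /* Walk line by line until one starts with "Groups:". */
    while (strncmp(p, "Groups:", 7) != 0) {
        p = strchr(p, '\n');
        if (p == NULL)
            return -1;
        p++;
    }
    p += 7;

    for (;;) {
        char *end;
        unsigned long v;

        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '\n' || *p == '\0')
            break;              /* end of the Groups: line */
        v = strtoul(p, &end, 10);
        if (end == p)
            return -1;          /* not a number */
        if ((size_t)n < cap)
            out[n] = (gid_t)v;
        n++;
        p = end;
    }
    return n;
}
```

The parsing itself is cheap; the cost mentioned above is the open and
read of the file on every file operation.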

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Audio drivers- Difference between start_output and trigger_output.

2011-12-14 Thread Nat Sloss
Hi.

I have read audio(9) and have looked at several audio drivers.  I was having 
trouble with btsco and have found it uses start output instead of trigger 
output.

So my question is what is the difference between start output and trigger 
output? 

Is block,blksize in start output equivalent to start,blksize in trigger 
output?

Is it possible to rewrite btsco so that it uses trigger_output/input?

Any thoughts appreciated.


Regards,

Nat.


Re: [RFC] getgroups2 system call

2011-12-14 Thread David Holland
On Wed, Dec 14, 2011 at 10:51:20AM +, Emmanuel Dreyfus wrote:
 > On Wed, Dec 14, 2011 at 09:09:59AM +, YAMAMOTO Takashi wrote:
 > > in my understanding, fuse_getgroups needs to talk with perfused,
 > > not kernel.  so i suggested creating a side channel between
 > > fuse_getgroups and perfused.
 > 
 > There is a proposal from fuse-devel mailing list to add FUSE message to
 > send credentials, but that seems overly complicated: [...]
 > 
 > I am not ready to implement such a complicated scheme.

That's not what he suggested.

-- 
David A. Holland
dholl...@netbsd.org


Re: [RFC] getgroups2 system call

2011-12-14 Thread David Holland
On Wed, Dec 14, 2011 at 08:19:42AM +, Emmanuel Dreyfus wrote:
 > On Wed, Dec 14, 2011 at 01:10:19AM -0500, Matthew Mondor wrote:
 > > What does NFS do in this case?  I seem to remember that it also imposes
 > > a sane size limit, possibly even below NGROUPS_MAX, is it really the
 > > case?  If so, would this also be acceptable?
 > 
 > FUSE people want the filesystem to accept POSIX_NGROUPS, otherwise they 
 > call that a bug.

So they want to do it right, but this causes trouble on Linux? Why is
this *our* problem?

Also, has fuse really been around this long without ever having had
support for group lists? Are you really the first person who's ever
tried to use it for serious work?

-- 
David A. Holland
dholl...@netbsd.org


Re: [RFC] getgroups2 system call

2011-12-14 Thread Emmanuel Dreyfus
On Wed, Dec 14, 2011 at 09:09:59AM +, YAMAMOTO Takashi wrote:
> in my understanding, fuse_getgroups needs to talk with perfused, not kernel.
> so i suggested creating a side channel between fuse_getgroups and perfused.

There is a proposal from the fuse-devel mailing list to add a FUSE message
to send credentials, but that seems overly complicated: the FUSE client
would have to send the secondary group list every time a new process uses
FUSE, and every time it uses setgroups(2). Since perfused is not
explicitly notified of setgroups(2) calls, it will have to store secondary
group lists in perfused for each process, and compare current creds to the
ones stored for every request.

Additionally, a destroy message must be sent when a process terminates so
that the secondary group lists are deleted from the filesystem. Since
perfused does not know when a process terminates, this suggests it will
have a TTL on each secondary group list, and send a destroy cred message
on timeout.

I am not ready to implement such a complicated scheme.
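
To give an idea of the bookkeeping involved, here is a rough sketch of
such a per-process cache with a TTL. Everything in it is hypothetical
(names, table size, the 30 second TTL); it is not perfused code and no
FUSE message is actually sent:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

#define CC_SLOTS     64
#define CC_MAXGROUPS 16
#define CC_TTL       30   /* seconds, arbitrary */

struct cc_entry {
    int    used;
    pid_t  pid;
    time_t expires;
    size_t ngroups;
    gid_t  groups[CC_MAXGROUPS];
};

struct cc_cache { struct cc_entry slot[CC_SLOTS]; };

/*
 * Called for every request. Returns 1 if the group list is new or has
 * changed (a credential message would have to be sent), 0 if it is
 * unchanged, -1 if the table is full or the list is too long.
 */
int cc_update(struct cc_cache *c, pid_t pid,
              const gid_t *g, size_t n, time_t now)
{
    struct cc_entry *e = NULL, *freeslot = NULL;
    size_t i;

    assert(c != NULL && g != NULL);
    if (n > CC_MAXGROUPS)
        return -1;
    for (i = 0; i < CC_SLOTS; i++) {
        if (c->slot[i].used && c->slot[i].pid == pid) {
            e = &c->slot[i];
            break;
        }
        if (!c->slot[i].used && freeslot == NULL)
            freeslot = &c->slot[i];
    }
    if (e != NULL && e->ngroups == n &&
        memcmp(e->groups, g, n * sizeof(*g)) == 0) {
        e->expires = now + CC_TTL;   /* still in use, refresh the TTL */
        return 0;
    }
    if (e == NULL)
        e = freeslot;
    if (e == NULL)
        return -1;
    e->used = 1;
    e->pid = pid;
    e->ngroups = n;
    e->expires = now + CC_TTL;
    memcpy(e->groups, g, n * sizeof(*g));
    return 1;
}

/*
 * Drop entries whose TTL has run out. Each dropped entry corresponds
 * to one "destroy cred" message. Returns the number dropped.
 */
size_t cc_expire(struct cc_cache *c, time_t now)
{
    size_t i, dropped = 0;

    assert(c != NULL);
    for (i = 0; i < CC_SLOTS; i++) {
        if (c->slot[i].used && c->slot[i].expires <= now) {
            c->slot[i].used = 0;
            dropped++;
        }
    }
    return dropped;
}
```

Even this toy version needs a comparison on every request and a
periodic sweep, and it still cannot tell a terminated process from an
idle one.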

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [RFC] getgroups2 system call

2011-12-14 Thread YAMAMOTO Takashi
hi,

> On Wed, Dec 14, 2011 at 07:45:11AM +, YAMAMOTO Takashi wrote:
>> do you mean to implement fuse_getgroups for NetBSD with the sysctl?
>> if you are adding a #ifdef NetBSD block to the fuse, can't it use
>> a NetBSD-specific sidechannel to get the info from an appropriate
>> puffs-supplied uucred?
> 
> FUSE filesystems are not PUFFS servers, they do not have access to
> PUFFS-supplied uucred. perfused is the PUFFS server. It has the data,
> but in order to pass it to the FUSE filesystem, the FUSE protocol
> needs to be extended. As I explained in another post, I have not been
> able to get consensus from FUSE maintainers on this extension. Therefore
> I have to fall back to asynchronous fetch of secondary groups through
> a system call.

i'm unclear why you want a system call.
in my understanding, fuse_getgroups needs to talk with perfused, not kernel.
so i suggested creating a side channel between fuse_getgroups and perfused.

YAMAMOTO Takashi

> 
> -- 
> Emmanuel Dreyfus
> m...@netbsd.org


Re: [RFC] getgroups2 system call

2011-12-14 Thread Emmanuel Dreyfus
On Wed, Dec 14, 2011 at 07:45:11AM +, YAMAMOTO Takashi wrote:
> do you mean to implement fuse_getgroups for NetBSD with the sysctl?
> if you are adding a #ifdef NetBSD block to the fuse, can't it use
> a NetBSD-specific sidechannel to get the info from an appropriate
> puffs-supplied uucred?

FUSE filesystems are not PUFFS servers, they do not have access to
PUFFS-supplied uucred. perfused is the PUFFS server. It has the data,
but in order to pass it to the FUSE filesystem, the FUSE protocol
needs to be extended. As I explained in another post, I have not been
able to get consensus from FUSE maintainers on this extension. Therefore
I have to fall back to asynchronous fetch of secondary groups through
a system call.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [RFC] getgroups2 system call

2011-12-14 Thread Emmanuel Dreyfus
On Wed, Dec 14, 2011 at 01:10:19AM -0500, Matthew Mondor wrote:
> What does NFS do in this case?  I seem to remember that it also imposes
> a sane size limit, possibly even below NGROUPS_MAX, is it really the
> case?  If so, would this also be acceptable?

FUSE people want the filesystem to accept POSIX_NGROUPS, otherwise they 
call that a bug.

-- 
Emmanuel Dreyfus
m...@netbsd.org