Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-06 Thread Mauro Carvalho Chehab
On 05-01-2012 17:09, Mauro Carvalho Chehab wrote:
> On 05-01-2012 15:47, Linus Torvalds wrote:
 
>> Can you test the patch with some media capture apps (preferably with
>> the obvious fix for the problem that Paulo already pointed out -
>> although that won't actually matter until some block driver starts
>> using ENOIOCTLCMD there, so even the unfixed patch should mostly work
>> for testing)?
> 
> Sure. I'm currently traveling, so I have just my "first aids kit" of devices
> but they should be enough for testing it. I'll return you as soon as I finish
> compiling the kernel on this slow 4 years-old notebook and run some
> tests with the usual applications.

I did a quick test today: didn't notice any regressions with the usual
applications.

Regards,
Mauro
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-06 Thread Linus Torvalds
On Fri, Jan 6, 2012 at 8:19 AM, Arnd Bergmann  wrote:
>
> For a simpler solution, we could keep the old warning message, but change
> the logic from
>
> (.compat_ioctl == NULL || .compat_ioctl() == -ENOIOCTL) && (no generic 
> handler)
>
> to
>
> (.ioctl != NULL && .compat_ioctl == NULL) && (no generic handler).
>
> and then fix the warnings we see by adding appropriate .compat_ioctl
> functions.

So I have no problem with that. What I did have problems with was the
net/socket.c kind of workarounds (which I noticed mainly because they
had that whole EINVAL confusion built into them). Those were why I
decided that the warning has to go.

But if you can re-introduce the warning without workarounds like
those, I have no problem with us reintroducing the warning, even if it
is technically not really kosher and will afaik still print that
warning for the case of a compat-ioctl transform that *would* have
been valid, just not for that particular file descriptor.

But the printout is small and not all that annoying, and explicitly
limited in number, so I guess I don't mind a few false positives if it
really can help find stale translation entries.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-06 Thread Arnd Bergmann
On Thursday 05 January 2012, Linus Torvalds wrote:

> And finally, ENOIOCTLCMD is a way to say ENOTTY in a sane manner, and
> will now be turned into ENOTTY for the user space return (not EINVAL -
> I have no idea where that idiocy came from, but it's clearly confused,
> although it's also clearly very old).

To give some background here: While I did a lot of changes to the
compat ioctl code over the years, I never touched the EINVAL/ENOTTY
logic out of fear of breaking things further. Your patch is probably
the right answer here and we should have done it long ago.

> This fixes the core files I noticed. It removes the insane
> complaints from the compat_ioctl() (which would complain if you
> returned ENOIOCTLCMD after an ioctl translation - the reason that is
> totally insane is that somebody might use an ioctl on the wrong kind
> of file descriptor, so even if it was translated perfectly fine,
> ENOIOCTLCMD is a perfectly fine error return and shouldn't cause
> warnings - and that allows us to remove stupid crap from the socket
> ioctl code).

Now this behavior was entirely intentional. ENOIOCTLCMD was introduced
explicitly so we could tell the difference between compat_ioctl handlers
saying "I don't know how to translate this but common code might know"
(ENOIOCTLCMD) and "I know that this is not a valid command code here"
(ENOTTY plus whatever a broken driver might return).

The distinction is used in two places:

1. To let device drivers return early from fops->compat_ioctl when
they want to avoid the overhead from searching the COMPATIBLE_IOCTL()
table and from the (formerly large) switch statement in do_ioctl_trans(),
or when they want to make sure that the common translation is not
called. We have some really tricky bits in the socket code where
the crazy SIOCDEVPRIVATE needs to be handled specially by some drivers
while the majority can use the default siocdevprivate_ioctl() function.

2. To return from do_ioctl_trans() when we know that there is no
translation for the command number, so that the we can write a warning
to the console (which you have now removed). This was initially more
useful back when do_ioctl_trans() knew about practically all command
numbers, but we have gradually moved all handlers into device drivers
where they belong, and that has caused the problem that we get warnings
whenever a user attempts an ioctl on a device that does not handle the
command even when all the correct devices that support the command also
have a compat handler. The common way we handle those is to add an
IGNORE_IOCTL() entry for a command that repeatedly shows up in logs
that way and is known to be handled correctly.

I used to have patches to completely remove the remains of do_ioctl_trans()
that can be resurrected if that helps, but it's mostly an independent
issue.

While I agree that the existing compat_ioctl_error() method was somewhat
broken, I'd like to keep having a way to get some indication from
drivers that are missing ioctl translations. Ideally, we would move
all of fs/compat_ioctl.c into drivers and then just warn when we register
some file_operations on a 64 bit machine that have an ioctl but no
compat_ioctl function, but that would be a large amount of work.

For a simpler solution, we could keep the old warning message, but change
the logic from

(.compat_ioctl == NULL || .compat_ioctl() == -ENOIOCTL) && (no generic handler)

to

(.ioctl != NULL && .compat_ioctl == NULL) && (no generic handler).

and then fix the warnings we see by adding appropriate .compat_ioctl
functions.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-05 Thread Mauro Carvalho Chehab
On 05-01-2012 15:47, Linus Torvalds wrote:
> On Thu, Jan 5, 2012 at 9:37 AM, Mauro Carvalho Chehab
>  wrote:
>>
>> For the media drivers, we've already fixed it, at the V4L side:
>> -EINVAL doesn't mean that an ioctl is not supported anymore.
>> I think that such fix went into Kernel 3.1.
> 
> Ok, I'm happy to hear that the thing should be fixed. My grepping
> still found a fair amount of EINVAL returns both in code and in the
> Documentation subdirectory for media ioctls, but it really was just
> grepping with a few lines of context, so I didn't look closer at the
> semantics. I was just looking for certain patterns (ie grepping for
> "EINVAL" near ioctl or ENOIOCTLCMD etc) that I thought might indicate
> problem spots, and the media subdirectory had a lot of them.

Yeah, there are lots of EINVAL there, as the API is fairly complex
(about 80 ioctl's for V4L, plus a similar set for DVB),  but there's
an ioctl dispatcher inside the V4L core that handles the ENOTTY case,
at drivers/media/video/v4l2-ioctl.c.

You'll see some -EINVAL things there, but they're due to errors
on parameters (the semantics there is somewhat complex, to avoid
returning -ENOTTY where a -EINVAL should be returned, instead).

In summary, the code there is:

static long __video_do_ioctl(struct file *file,
unsigned int cmd, void *arg)
{
...
long ret = -ENOTTY;
...
switch (cmd) {
/*
 * several ioctl callbacks here. if they're not
 * implemented, break (e. g. -ENOTTY will be returned).
 */
...
}
...
return ret;

The context there is too big for noticing it with a few lines of
context, and too complex as well, as some ioctl's may be implemented 
by more than one callback, depending on what's needed, and some 
others have a default implementation there. This is somewhat similar 
to file ops callbacks.

> Can you test the patch with some media capture apps (preferably with
> the obvious fix for the problem that Paulo already pointed out -
> although that won't actually matter until some block driver starts
> using ENOIOCTLCMD there, so even the unfixed patch should mostly work
> for testing)?

Sure. I'm currently traveling, so I have just my "first aids kit" of devices
but they should be enough for testing it. I'll return you as soon as I finish
compiling the kernel on this slow 4 years-old notebook and run some
tests with the usual applications.

Regards,
Mauro
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-05 Thread Linus Torvalds
On Thu, Jan 5, 2012 at 9:37 AM, Mauro Carvalho Chehab
 wrote:
>
> For the media drivers, we've already fixed it, at the V4L side:
> -EINVAL doesn't mean that an ioctl is not supported anymore.
> I think that such fix went into Kernel 3.1.

Ok, I'm happy to hear that the thing should be fixed. My grepping
still found a fair amount of EINVAL returns both in code and in the
Documentation subdirectory for media ioctls, but it really was just
grepping with a few lines of context, so I didn't look closer at the
semantics. I was just looking for certain patterns (ie grepping for
"EINVAL" near ioctl or ENOIOCTLCMD etc) that I thought might indicate
problem spots, and the media subdirectory had a lot of them.

Can you test the patch with some media capture apps (preferably with
the obvious fix for the problem that Paulo already pointed out -
although that won't actually matter until some block driver starts
using ENOIOCTLCMD there, so even the unfixed patch should mostly work
for testing)?

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-05 Thread Linus Torvalds
On Thu, Jan 5, 2012 at 9:26 AM, Paolo Bonzini  wrote:
> On 01/05/2012 06:02 PM, Linus Torvalds wrote:
>>
>> +       return  ret == -EINVAL ||
>> +               ret == -ENOTTY ||
>> +               ret == ENOIOCTLCMD;
>
>
> Missing minus before ENOIOCTLCMD.

Oops, thanks, fixed.

Also, I do realize that the patch results in a warning about
"compat_ioctl_error()" no longer being used. I've removed it in my
tree, but I do wonder if we could perhaps have some kind of better
check, so maybe it is useful if somebody can come up with a saner way
to do it. Or at least a way that doesn't cause the kind of crazy code
that net/socket.c had.

And I notice that not only net/socket.c had workarounds for the bogus
warning, but fs/compat_ioctl.c itself does too: it's why we have those
IGNORE_IOCTL() entries.

So *maybe* we can reinstate that compat_ioctl_error() check, and just
remove the net/socket.c stuff, and make sure that all the ioctls that
net/socket.c had hacks for are mentioned as IGNORE_IOCTL's. Dunno.

Anybody have strong opinions either way? Has that printout helped
compat ioctl debugging a lot lately and we really want to maintain it?
Otherwise I'm inclined to remove it (we can always reinstate it later,
it's not like removal is necessarily final).

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-05 Thread Mauro Carvalho Chehab
On 05-01-2012 15:02, Linus Torvalds wrote:
> On Thu, Jan 5, 2012 at 8:16 AM, Linus Torvalds
>  wrote:
>>
>> Just fix the *obvious* breakage in BLKROSET. It's clearly what the
>> code *intends* to do, it just didn't check for ENOIOCTLCMD.
> 
> So it seems from quick grepping that the block layer isn't actually
> all that confused apart from that one BLK[SG]ETRO issue, although I'm
> sure there are crazy drivers that think that EINVAL is the right error
> return.
> 
> The *really* confused layer seems to be the damn media drivers. The
> confusion there seems to go very deep indeed, where for some crazy
> reason the media people seem to have made it part of the semantics
> that "if a driver doesn't support a particular ioctl, it returns
> EINVAL".
> 
> Added, linux-media and Mauro to the Cc, because I'm about to commit
> something like the attached patch to see if anything breaks. We may
> have to revert it if things get too nasty, but we should have done
> this years and years ago, so let's hope not.

For the media drivers, we've already fixed it, at the V4L side:
-EINVAL doesn't mean that an ioctl is not supported anymore.
I think that such fix went into Kernel 3.1.

There are still two different behaviors there:
at the V4L API: the return code is -ENOTTY;
at the DVB API: the return code is currrently -EOPNOTSUPP.

Yeah, I know that DVB is returning the wrong code. Fixing it is on
my todo list, although with low priority, as the behavior inside the
DVB part is consistent.

On both DVB and V4L, -EINVAL now means only invalid parameters.

Regards,
Mauro

> Basic rules: ENOTTY means "I don't recognize this ioctl". Yes, the
> name is odd, and yes, it's for historical Unix reasons. ioctl's were
> originally a way to control mainly terminal settings - think termios -
> so when you did an ioctl on a file, you'd get "I'm not a tty, dummy".
> File flags were controlled with fcntl().
> 
> In contrast, EINVAL means "there is something wrong with the
> arguments", which very much implies "I do recognize the ioctl".
> 
> And finally, ENOIOCTLCMD is a way to say ENOTTY in a sane manner, and
> will now be turned into ENOTTY for the user space return (not EINVAL -
> I have no idea where that idiocy came from, but it's clearly confused,
> although it's also clearly very old).
> 
> This fixes the core files I noticed. It removes the *insane*
> complaints from the compat_ioctl() (which would complain if you
> returned ENOIOCTLCMD after an ioctl translation - the reason that is
> totally insane is that somebody might use an ioctl on the wrong kind
> of file descriptor, so even if it was translated perfectly fine,
> ENOIOCTLCMD is a perfectly fine error return and shouldn't cause
> warnings - and that allows us to remove stupid crap from the socket
> ioctl code).
> 
> Does this break things and need to be reverted? Maybe. There could be
> user code that tests *explicitly* for EINVAL and considers that the
> "wrong ioctl number", even though it's the wrong error return.
> 
> And we may have those kinds of mistakes inside the kernel too. We did
> in the block layer BLKSETRO code, for example, as pointed out by
> Paulo. That one is fixed here, but there may be others.
> 
> I didn't change any media layers, since there it's clearly an endemic
> problem, and there it seems to be used as a "we pass media ioctls down
> and drivers should by definition recognize them, so if they don't, we
> assume the driver is limited and doesn't support those particular
> settings and return EINVAL".
> 
> But in general, any code like this is WRONG:
> 
>switch (cmd) {
>case MYIOCTL:
>   .. do something ..
>default:
>   return -EINVAL;
>}
> 
> while something like this is CORRECT:
> 
>switch (cmd) {
>case MYIOCT:
>   if (arg)
>  return -EINVAL;
>   ...
> 
>case OTHERIOCT:
>   /* I recognize this one, but I don't support it */
>   return -EINVAL;
> 
>default:
>   return -ENOIOCTLCMD; // Or -ENOTTY - see below about the difference
>}
> 
> where right now we do have some magic differences between ENOIOCTLCMD
> and ENOTTY (the compat layer will try to do a translated ioctl *only*
> if it gets ENOIOCTLCMD, iirc, so ENOTTY basically means "this is my
> final answer").
> 
> I'll try it out on my own setup here to see what problems I can
> trigger, but I thought I'd send it out first just as (a) a heads-up
> and (b) to let others try it out and see.. See the block/ioctl.c code
> for an example of the kinds of things we may need even just inside the
> kernel (and the kinds of things that could cause problems for
> user-space that makes a difference between EINVAL and ENOTTY).
> 
>  Linus

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-05 Thread Linus Torvalds
On Thu, Jan 5, 2012 at 9:02 AM, Linus Torvalds
 wrote:
>
> Added, linux-media and Mauro to the Cc, because I'm about to commit
> something like the attached patch to see if anything breaks. We may
> have to revert it if things get too nasty, but we should have done
> this years and years ago, so let's hope not.

Ok, so "It works for me". I'll delay committing it in case somebody
has some quick obvious fixes or comments (like noticing other cases
like the blk_ioctl.c one), but on the whole I think I'll commit it
later today, just so that it will be as early as possible in the merge
window in case there is ENOTTY/EINVAL confusion.

The good news is that no user space can *ever* care about
ENOTTY/EINVAL in the "generic case", since different drivers have
returned different error returns for years. So user space that doesn't
know exactly what it is dealing with will pretty much by definition
not be affected. Except perhaps in a good way - if it uses "perror()"
or "strerror()" or similar, it will now give a much better error
string of "Inappropriate ioctl for device".

However, some applications don't work with "generic devices", but
instead work with a very specific device or perhaps a very specific
subset.

So the exception would be user space apps that know exactly which
driver they are talking about, and that particular driver used to
always return EINVAL before, and now the ENOIOCTLCMD -> ENOTTY fix
means that it returns the proper ENOTTY - and the application has
never seen it, and never tested against it, and breaks.

I don't *think* this happens outside of the media drivers, but we'll
see. It may be that we will have to make certain drivers return EINVAL
explicitly rather than ENOIOCTLCMD and add a comment about why. Sad,
if so, so we'll have little choice. Let's see what the breakage (if
any - cross your fingers) looks like.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken ioctl error returns (was Re: [PATCH 2/3] block: fail SCSI passthrough ioctls on partition devices)

2012-01-05 Thread Paolo Bonzini

On 01/05/2012 06:02 PM, Linus Torvalds wrote:

+   return  ret == -EINVAL ||
+   ret == -ENOTTY ||
+   ret == ENOIOCTLCMD;


Missing minus before ENOIOCTLCMD.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html