Re: [PATCH v3 3/5] tracing: Update synth command errors

2020-12-09 Thread Steven Rostedt
On Wed, 9 Dec 2020 22:51:14 +0900
Masami Hiramatsu  wrote:

> This makes sense. Anyway, what I considered were
> - synthetic_events interface doesn't provide syntax error reports
> - synthetic_events interface is not self-reproducing*.
> 
> *) I meant
> 
> $ cat synthetic_events > saved_events
> $ cat saved_events > synthetic_events
> 
>   should work. But this does *NOT* mean
> 
> $ cat user-input > synthetic_events
> $ cat synthetic_events > saved_events
> $ diff user-input saved_events # no diff
> 
> So input and output can be different, but the output can be input again.

Totally agree.

Thanks,

-- Steve


Re: [PATCH v3 3/5] tracing: Update synth command errors

2020-12-09 Thread Masami Hiramatsu
On Tue, 8 Dec 2020 12:53:40 -0500
Steven Rostedt  wrote:

> On Tue, 08 Dec 2020 11:34:41 -0600
> Tom Zanussi  wrote:
> 
> > Unfortunately, you're correct, if you have a script that creates a
> > synthetic event without semicolons, this patchset will break it, as I
> > myself found out and fixed in patch 4 ([PATCH v3 4/5] selftests/ftrace:
> > Add synthetic event field separators) [4].
> > 
> > So whereas before this would work, even though it shouldn't have in the
> > first place:
> > 
> >   # echo 'wakeup_latency  u64 lat pid_t pid char comm[16]' > synthetic_events
> > 
> > it now has to be:
> > 
> >   # echo 'wakeup_latency  u64 lat; pid_t pid; char comm[16]' > synthetic_events
> > 
> > So yeah, this patchset fixes a set of parsing bugs for things that
> > shouldn't have been accepted as valid, but shouldn't break things that
> > are obviously valid.
> > 
> > If it's too late to fix them, though, I guess we'll just have to live
> > with them, or some other option?
> 
> 
> I would suggest allowing the old interface to work (with no new features, for
> backward compatibility), but for new things like "char comm[16]" we require
> semicolons.
> 
> One method to do this is to add a check at the start of reading the string,
> to see if it has semicolons. If it does not, we create a new string with
> them, but make sure that the string does not include new changes.
> 
>   strncpy_from_user(buffer, user_buff, sizeof(buffer));
> 
>   if (!strstr(buffer, ";")) {
>           if (!audit_old_buffer(buffer))
>                   goto error;
>           insert_colons(buffer);
>   }
> 
> 
> That is, if the buffer does not have semicolons, then check if it is a
> valid "old format", and if not, we error out. Otherwise, we insert the
> semicolons into the buffer, and process that as if the user had put them in.
> 
> For example:
> 
>   echo 'wakeup_latency u64 lat pid_t pid' > synthetic_events
> 
> would change the buffer to:
> 
>   "wakeup_latency u64 lat; pid_t pid;"
> 
> And then put it through the normal processing. I think it's OK that if the
> user were to cat out the synthetic events, it would see the semicolons even
> if it did not add them, as I don't think that will break userspace.
> 
> Does that make sense?

This makes sense. Anyway, what I considered were
- synthetic_events interface doesn't provide syntax error reports
- synthetic_events interface is not self-reproducing*.

*) I meant

$ cat synthetic_events > saved_events
$ cat saved_events > synthetic_events

  should work. But this does *NOT* mean

$ cat user-input > synthetic_events
$ cat synthetic_events > saved_events
$ diff user-input saved_events # no diff

So input and output can be different, but the output can be input again.
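
As a minimal sketch of that property (not the kernel's code): the
hypothetical canonicalize() below stands in for whatever normal form
"cat synthetic_events" prints. The normal form chosen here (single spaces
between words, no space before a semicolon) is an assumption made only for
illustration. The user's input may differ from the output, but feeding the
output back in leaves it unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>	/* size_t; strcmp() for callers comparing results */

/*
 * Hypothetical normal form: collapse runs of whitespace to one space and
 * drop any space in front of ';'. Returns 0 on success, -1 if out is too
 * small (out contents are then unspecified).
 */
int canonicalize(const char *in, char *out, size_t outsz)
{
    size_t len = 0;
    int pending_space = 0;

    assert(in && out && outsz > 0);

    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;

        if (isspace(c)) {
            /* Remember the gap, but never emit leading whitespace. */
            pending_space = (len > 0);
            continue;
        }
        /* One space between words, none directly before ';'. */
        if (pending_space && c != ';') {
            if (len + 1 >= outsz)
                return -1;
            out[len++] = ' ';
        }
        /* A ';' is always followed by a space if more text comes. */
        pending_space = (c == ';');
        if (len + 1 >= outsz)
            return -1;
        out[len++] = (char)c;
    }
    out[len] = '\0';
    return 0;
}
```

With the input "ev   u64 lat ;pid_t   pid" the first pass gives
"ev u64 lat; pid_t pid", which is not the same as the input, and a second
pass over that result gives the identical string.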

Thank you,

-- 
Masami Hiramatsu 


Re: [PATCH v3 3/5] tracing: Update synth command errors

2020-12-08 Thread Tom Zanussi
Hi Steve,

On Tue, 2020-12-08 at 12:53 -0500, Steven Rostedt wrote:
> On Tue, 08 Dec 2020 11:34:41 -0600
> Tom Zanussi  wrote:
> 
> > Unfortunately, you're correct, if you have a script that creates a
> > synthetic event without semicolons, this patchset will break it, as I
> > myself found out and fixed in patch 4 ([PATCH v3 4/5] selftests/ftrace:
> > Add synthetic event field separators) [4].
> > 
> > So whereas before this would work, even though it shouldn't have in the
> > first place:
> > 
> >   # echo 'wakeup_latency  u64 lat pid_t pid char comm[16]' > synthetic_events
> > 
> > it now has to be:
> > 
> >   # echo 'wakeup_latency  u64 lat; pid_t pid; char comm[16]' > synthetic_events
> > 
> > So yeah, this patchset fixes a set of parsing bugs for things that
> > shouldn't have been accepted as valid, but shouldn't break things that
> > are obviously valid.
> > 
> > If it's too late to fix them, though, I guess we'll just have to live
> > with them, or some other option?
> 
> 
> I would suggest allowing the old interface to work (with no new features, for
> backward compatibility), but for new things like "char comm[16]" we require
> semicolons.
> 
> One method to do this is to add a check at the start of reading the string,
> to see if it has semicolons. If it does not, we create a new string with
> them, but make sure that the string does not include new changes.
> 
>   strncpy_from_user(buffer, user_buff, sizeof(buffer));
> 
>   if (!strstr(buffer, ";")) {
>           if (!audit_old_buffer(buffer))
>                   goto error;
>           insert_colons(buffer);
>   }
> 
> 
> That is, if the buffer does not have semicolons, then check if it is a
> valid "old format", and if not, we error out. Otherwise, we insert the
> semicolons into the buffer, and process that as if the user had put them in.
> 
> For example:
> 
>   echo 'wakeup_latency u64 lat pid_t pid' > synthetic_events
> 
> would change the buffer to:
> 
>   "wakeup_latency u64 lat; pid_t pid;"
> 
> And then put it through the normal processing. I think it's OK that if the
> user were to cat out the synthetic events, it would see the semicolons even
> if it did not add them, as I don't think that will break userspace.
> 
> Does that make sense?
> 

Yeah, that should work, I'll try adding that.

Thanks,

Tom

> -- Steve



Re: [PATCH v3 3/5] tracing: Update synth command errors

2020-12-08 Thread Steven Rostedt
On Tue, 08 Dec 2020 11:34:41 -0600
Tom Zanussi  wrote:

> Unfortunately, you're correct, if you have a script that creates a
> synthetic event without semicolons, this patchset will break it, as I
> myself found out and fixed in patch 4 ([PATCH v3 4/5] selftests/ftrace:
> Add synthetic event field separators) [4].
> 
> So whereas before this would work, even though it shouldn't have in the
> first place:
> 
>   # echo 'wakeup_latency  u64 lat pid_t pid char comm[16]' > synthetic_events
> 
> it now has to be:
> 
>   # echo 'wakeup_latency  u64 lat; pid_t pid; char comm[16]' > synthetic_events
> 
> So yeah, this patchset fixes a set of parsing bugs for things that
> shouldn't have been accepted as valid, but shouldn't break things that
> are obviously valid.
> 
> If it's too late to fix them, though, I guess we'll just have to live
> with them, or some other option?


I would suggest allowing the old interface to work (with no new features, for
backward compatibility), but for new things like "char comm[16]" we require
semicolons.

One method to do this is to add a check at the start of reading the string,
to see if it has semicolons. If it does not, we create a new string with
them, but make sure that the string does not include new changes.

  strncpy_from_user(buffer, user_buff, sizeof(buffer));

  if (!strstr(buffer, ";")) {
          if (!audit_old_buffer(buffer))
                  goto error;
          insert_colons(buffer);
  }


That is, if the buffer does not have semicolons, then check if it is a
valid "old format", and if not, we error out. Otherwise, we insert the
semicolons into the buffer, and process that as if the user had put them in.

For example:

echo 'wakeup_latency u64 lat pid_t pid' > synthetic_events

would change the buffer to:

"wakeup_latency u64 lat; pid_t pid;"

And then put it through the normal processing. I think it's OK that if the
user were to cat out the synthetic events, it would see the semicolons even
if it did not add them, as I don't think that will break userspace.
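
A minimal userspace sketch of the audit-and-insert step described above,
for illustration only and not the kernel implementation. The function name
insert_semicolons(), the fixed-size scratch buffer, and the token limit are
all hypothetical. It assumes the old format is an event name followed by
"type field" pairs, where a type may carry an "unsigned" qualifier; anything
else is rejected rather than guessed at.

```c
#include <assert.h>
#include <string.h>

#define MAX_TOKENS 64
#define WS " \t\n"

/* Append s to out (current length *len); -1 if it would not fit. */
static int append(char *out, size_t outsz, size_t *len, const char *s)
{
    size_t n = strlen(s);

    if (*len + n + 1 > outsz)
        return -1;
    memcpy(out + *len, s, n + 1);
    *len += n;
    return 0;
}

/*
 * Hypothetical helper: rewrite an old-style command into the semicolon
 * form. Returns 0 on success, -1 on malformed input or lack of space
 * (out may then hold a partial result and must be ignored).
 */
int insert_semicolons(const char *in, char *out, size_t outsz)
{
    char copy[256];
    char *tok[MAX_TOKENS];
    char *p;
    size_t len = 0;
    int n = 0, i = 1;

    assert(in && out && outsz > 0);
    out[0] = '\0';

    /* New-style input already has separators: pass it through. */
    if (strchr(in, ';'))
        return append(out, outsz, &len, in);

    if (strlen(in) >= sizeof(copy))
        return -1;
    strcpy(copy, in);

    /* Split the scratch copy on whitespace. */
    for (p = copy; ; ) {
        p += strspn(p, WS);
        if (!*p)
            break;
        if (n == MAX_TOKENS)
            return -1;
        tok[n++] = p;
        p += strcspn(p, WS);
        if (*p)
            *p++ = '\0';
    }

    /* "Audit": an event name plus at least one "type field" pair. */
    if (n < 3 || append(out, outsz, &len, tok[0]))
        return -1;

    while (i < n) {
        if (append(out, outsz, &len, " "))
            return -1;
        if (strcmp(tok[i], "unsigned") == 0) {
            if (append(out, outsz, &len, "unsigned "))
                return -1;
            i++;
        }
        if (i + 1 >= n)
            return -1;	/* dangling type or field name */
        if (append(out, outsz, &len, tok[i]) ||
            append(out, outsz, &len, " ") ||
            append(out, outsz, &len, tok[i + 1]) ||
            append(out, outsz, &len, ";"))
            return -1;
        i += 2;
    }
    return 0;
}
```

Called on "wakeup_latency u64 lat pid_t pid" this fills the output with
"wakeup_latency u64 lat; pid_t pid;", while an input such as
"ev u64 lat pid_t" (a type with no field name) is refused.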

Does that make sense?

-- Steve


Re: [PATCH v3 3/5] tracing: Update synth command errors

2020-12-08 Thread Tom Zanussi
Hi Steve,

On Mon, 2020-12-07 at 20:13 -0500, Steven Rostedt wrote:
> On Mon, 26 Oct 2020 10:06:11 -0500
> Tom Zanussi  wrote:
> 
> > Since array types are handled differently, errors referencing them
> > also need to be handled differently.  Add and use a new
> > INVALID_ARRAY_SPEC error.  Also add INVALID_CMD and INVALID_DYN_CMD to
> > catch and display the correct form for badly-formed commands, which
> > can also be used in place of CMD_INCOMPLETE, which is removed, and
> > remove CMD_TOO_LONG, since it's no longer used.
> > 
> > Signed-off-by: Tom Zanussi 
> > ---
> 
> Unfortunately, this patch series breaks user space.
> 
> I already have scripts that do the histograms, and I'm sure others may
> have that too, and if we change how synthetic events are created, it
> will break them.
> 
> What's the rationale for the new delimiters?
> 

The overall problem this is trying to fix is that it was probably a
mistake to try to shoehorn the synthetic event parsing into what was
available from trace_run_command() and trace_parse_run_command(),
which use argv_split() to do the command splitting, and which only
split on whitespace, whereas the synthetic events have a bit of a
higher-level structure, which is 'event field; field; field;...'.

So this patchset tries to remedy that. The first patch
(tracing/dynevent: Delegate parsing to create function) is from Masami,
and makes it possible to share code between kprobe/uprobe and synthetic
events, and to allow synthetic events to have their own higher-level
parsing, which the next 2 patches do.

The history in more detail:

Initially the problem was to fix the errors mentioned by Masami in
[1]. 

Things like:

  # echo myevent char str[];; int v >> synthetic_events

which was identified as INVALID_TYPE where it should just be a void arg
and

  # echo mye;vent char str[] >> synthetic_events

which was identified as BAD_NAME where it should have been an invalid
command, etc.

I suggested that the way to fix them was to consider semicolon as
additional whitespace and the result was the patchset containing [2],
which also explains the reasons for wanting to enforce semicolon
grouping.

Masami pointed out that it really wasn't correct to do it that way,
and the commands should be split out first at the higher level by
semicolon and then further processed [3].
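
A rough userspace sketch of that two-level idea, under my own assumptions
and not taken from the patches: split on ';' first, then look at the
whitespace-separated words inside each piece. The helper names
split_on_semicolons() and count_words() are hypothetical. Empty pieces in
the middle are kept so the caller can decide how to report them.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing whitespace in place. */
static char *trim(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/*
 * Level 1: split cmd in place on ';'. Trimmed pieces go into seg[].
 * Returns the number of pieces, or -1 if there are more than max.
 * Nothing but whitespace after the final ';' is not counted as a piece.
 */
int split_on_semicolons(char *cmd, char *seg[], int max)
{
    char *p = cmd;
    int n = 0;

    assert(cmd && seg && max > 0);

    for (;;) {
        char *semi = strchr(p, ';');
        char *s;

        if (semi)
            *semi = '\0';
        s = trim(p);
        if (!semi && *s == '\0' && n > 0)
            break;		/* nothing after the last ';' */
        if (n == max)
            return -1;
        seg[n++] = s;
        if (!semi)
            break;
        p = semi + 1;
    }
    return n;
}

/* Level 2: count whitespace-separated words in one piece. */
int count_words(const char *s)
{
    int n = 0;

    while (*s) {
        while (isspace((unsigned char)*s))
            s++;
        if (!*s)
            break;
        n++;
        while (*s && !isspace((unsigned char)*s))
            s++;
    }
    return n;
}
```

For 'myevent char str[];; int v' this yields three pieces with an empty one
in the middle, and for 'mye;vent char str[]' the first piece has a single
word (a name with no field), so both cases can be told apart before any
type or name checking happens.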

Unfortunately, you're correct, if you have a script that creates a
synthetic event without semicolons, this patchset will break it, as I
myself found out and fixed in patch 4 ([PATCH v3 4/5] selftests/ftrace:
Add synthetic event field separators) [4].

So whereas before this would work, even though it shouldn't have in the
first place:

  # echo 'wakeup_latency  u64 lat pid_t pid char comm[16]' > synthetic_events

it now has to be:

  # echo 'wakeup_latency  u64 lat; pid_t pid; char comm[16]' > synthetic_events

So yeah, this patchset fixes a set of parsing bugs for things that
shouldn't have been accepted as valid, but shouldn't break things that
are obviously valid.

If it's too late to fix them, though, I guess we'll just have to live
with them, or some other option?

Tom

[1] https://lore.kernel.org/lkml/20201014110636.139df7be275d40a23b523...@kernel.org/
[2] https://lore.kernel.org/lkml/e29c3ae1fc46892ec792d6f6f910f75d0e12584c.1602883818.git.zanu...@kernel.org/
[3] https://lore.kernel.org/lkml/20201018232011.38e5da51f5cd8e73e6f52...@kernel.org/
[4] https://lore.kernel.org/lkml/75a2816b4001e04e7d60bcc87aa91477ad5d90b3.1603723933.git.zanu...@kernel.org/



> -- Steve



Re: [PATCH v3 3/5] tracing: Update synth command errors

2020-12-07 Thread Steven Rostedt
On Mon, 26 Oct 2020 10:06:11 -0500
Tom Zanussi  wrote:

> Since array types are handled differently, errors referencing them
> also need to be handled differently.  Add and use a new
> INVALID_ARRAY_SPEC error.  Also add INVALID_CMD and INVALID_DYN_CMD to
> catch and display the correct form for badly-formed commands, which
> can also be used in place of CMD_INCOMPLETE, which is removed, and
> remove CMD_TOO_LONG, since it's no longer used.
> 
> Signed-off-by: Tom Zanussi 
> ---

Unfortunately, this patch series breaks user space.

I already have scripts that do the histograms, and I'm sure others may
have that too, and if we change how synthetic events are created, it
will break them.

What's the rationale for the new delimiters?

-- Steve