Re: [PATCH 1/2 v3] revision.c: args starting with "-" might be a revision

2017-02-13 Thread Siddharth Kannan
Hey Junio,
> 
> See the 3-patch series I just sent out.  I didn't think it through
> very carefully (especially the error message the other caller
> produces), but the whole thing _smells_ correct to me.

Okay, got it! I will write-up those changes, and make sure nothing bad
happens. (Also, the one other function that calls handle_revision_opt,
parse_revision_opt needs to be fixed for any changes in
handle_revision_opt.)

I will do all of this in the next week (Unfortunately, exams!) and
submit a new version of this patch (Also, I need to update tests, add
documentation, and remove code that does this shorthand stuff for
other commands as per Matthieu's comments)

--
Best Regards,

Siddharth Kannan.


Re: [PATCH 1/2 v3] revision.c: args starting with "-" might be a revision

2017-02-12 Thread Junio C Hamano
Siddharth Kannan  writes:

> This "changing the order" gave me the idea to change the flow. I tried to
> implement the above steps without touching the function handle_revision_opt. 
> By
> inserting the handle_revision_arg call just before calling 
> handle_revision_opt.

Changing the order is changing the order of the function calls,
i.e. changing the flow.  So at the idea level we are on the same
page.

I was shooting for not having to duplicate calls to
handle_revision_arg().  

>> But I think the resulting code flow is much closer to the
>> above ideal.
>
> (about Junio's version of the patch): Yes, I agree with you on this. It's like
> the ideal, but the argv has already been populated, so the only remaining step
> is "left++".
>> 
>> Such a change to handle_revision_opt() unfortunately affects other
>> callers of the function, so it may not be worth it.

See the 3-patch series I just sent out.  I didn't think it through
very carefully (especially the error message the other caller
produces), but the whole thing _smells_ correct to me.


Re: [PATCH 1/2 v3] revision.c: args starting with "-" might be a revision

2017-02-12 Thread Siddharth Kannan
On Sat, Feb 11, 2017 at 01:08:09PM -0800, Junio C Hamano wrote:
> Siddharth Kannan  writes:
> 
> > Um, I am sorry, but I feel that decrementing left, and incrementing it 
> > again is
> > also confusing.
> 
> Yes, but it is no more confusing than your original "left--".
> ...
> 
>  * If it is an option known to us, handle it and go to the next arg.
> 
>  * If it is an option that we do not understand, stuff it in
>argv[left++] and go to the next arg.
> 
> Because the second step currently is implemented by calling
> handle_opt(), which not just tells if it is an option we understand
> or not, but also mucks with argv[left++], you need to undo it once
> you start making it possible for a valid "rev" to begin with a dash.
> That is what your left-- was, and that is what "decrement and then
> increment when it turns out it was an unknown option after all" is.

So, our problem here is that the function handle_revision_opt is opaquely also
incrementing "left", which we need to decrement somehow.

Or: we could change the flow of the code so that this incrementing
will happen only when we have decided that the argument is not a
revision.
> 
>  * If it is an option known to us, handle it and go to the next arg.
> 
>  * If it is a rev, handle it, and note that fact in got_rev_arg
>(this change of order enables you to allow a rev that begins with
>a dash, which would have been misrecognised as a possible unknown
>option).
> 
>  * If it looks like an option (i.e. "begins with a dash"), then we
>already know that it is not something we understand, because the
>first step would have caught it already.  Stuff it in
>argv[left++] and go to the next arg.
> 
>  * If it is not a rev and we haven't seen dashdash, verify that it
>and everything that follows it are pathnames (which is an inexact
>but a cheap way to avoid ambiguity), make all them the prune_data
>and conclude.

This "changing the order" gave me the idea to change the flow. I tried to
implement the above steps without touching the function handle_revision_opt. By
inserting the handle_revision_arg call just before calling handle_revision_opt.

The decrementing then incrementing or "left--" things have now been removed.
(But there is still one thing which doesn't look good)

diff --git a/revision.c b/revision.c
index b37dbec..8c0acea 100644
--- a/revision.c
+++ b/revision.c
@@ -2203,11 +2203,11 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
if (seen_dashdash)
revarg_opt |= REVARG_CANNOT_BE_FILENAME;
read_from_stdin = 0;
+
for (left = i = 1; i < argc; i++) {
const char *arg = argv[i];
+   int opts;
if (*arg == '-') {
-   int opts;
-
opts = handle_revision_pseudo_opt(submodule,
revs, argc - i, argv + i,
);
@@ -2226,7 +2226,11 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
read_revisions_from_stdin(revs, _data);
continue;
}
+   }
 
+   if (!handle_revision_arg(arg, revs, flags, revarg_opt))
+   got_rev_arg = 1;
+   else if (*arg == '-') {
opts = handle_revision_opt(revs, argc - i, argv + i, 
, argv);
if (opts > 0) {
i += opts - 1;
@@ -2234,11 +2238,7 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
}
if (opts < 0)
exit(128);
-   continue;
-   }
-
-
-   if (handle_revision_arg(arg, revs, flags, revarg_opt)) {
+   } else {
int j;
if (seen_dashdash || *arg == '^')
die("bad revision '%s'", arg);
@@ -2255,8 +2255,6 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
append_prune_data(_data, argv + i);
break;
}
-   else
-   got_rev_arg = 1;
}
 
if (prune_data.nr) {


The "if (*arg =='-')" line is repeated. On analysing the resulting
revision.c:setup_revisions function, I feel that the codepath is still as
easily followable as it was earlier, and there is definitely no confusion
because of a mysterious decrement. Also, the repeated condition doesn't make it
any harder (it looks like a useful check because we already know that every
option would start with a "-"). But that's only my opinion, and you definitely
know better.

now the flow is very close to the ideal flow that we prefer:

1. If it is a 

Re: [PATCH 1/2 v3] revision.c: args starting with "-" might be a revision

2017-02-11 Thread Junio C Hamano
Junio C Hamano  writes:

> Such a change to handle_revision_opt() unfortunately affects other
> callers of the function, so it may not be worth it, and I think
> "decrement and then increment, because this codepath wants to check
> to see something that may ordinarily be clasified as an unknown
> option if it is a rev" is an ugly workaround, just like your left--
> was.  But I think the resulting code flow is much closer to the
> above ideal.

Having re-analysed the codepath like so, I realize that the new
variable I introduced was misnamed.  Its purpose is to let the
"if arg begins with dash, do this" block communicate that what the
later part of the code is told to inspect in "arg" may be an option
that we do not recognise.  So I shouldn't have called it maybe_rev;
the message from the former to the latter is "this may be an unknown
option" and I should have called it "maybe_unknown_opt".



Re: [PATCH 1/2 v3] revision.c: args starting with "-" might be a revision

2017-02-11 Thread Junio C Hamano
Siddharth Kannan  writes:

> On Fri, Feb 10, 2017 at 03:35:47PM -0800, Junio C Hamano wrote:
>
>> I am wondering if writing it like the following is easier to
>> understand.  I had a hard time figuring out what you are trying to
>> do, partly because "args" is quite a misnomer---implying "how many
>> arguments did we see" that is similar to opts that does mean "how
>> many options did handle_revision_opts() see?"
>
> Um, okay, I see that "args" is very confusing. Would it help if this variable
> was called "arg_not_rev"?

Not really.  If we absolutely need to have one variable that is
meant to escape the "if it begins with a dash" block and to affect
what comes next, I think the variable should mean "we know we saw a
revision and you do not have to call it again".  IOW the code that
needs to do "handle_rev_arg_called && arg_not_rev" is just being
silly.  At that point in the codeflow, I do not see why the code
needs to take two bits of information and combine them; the one that
sets these two variables should have done the work for it.

And that would make the if statement slightly easier to read
compared to the original.  I am however not suggesting to do that;
read on.

> Because the value that is returned from
> handle_revision_arg is 1 when it is not a revision, and 0 when it is a
> revision.

The function follows the convention to return 0 for success, -1 for
error/unexpected, by the way.

> Um, I am sorry, but I feel that decrementing left, and incrementing it again 
> is
> also confusing.

Yes, but it is no more confusing than your original "left--".

If we want to make the flow of logic easier to follow, we need to
step back and view what the codepath _wants_ to do at the higher
level, which is:

 * If it is an option known to us, handle it and go to the next arg.

 * If it is an option that we do not understand, stuff it in
   argv[left++] and go to the next arg.

 * If it is a rev, handle it, and note that fact in got_rev_arg.

 * If it is not a rev and we haven't seen dashdash, verify that it
   and everything that follows it are pathnames (which is an inexact
   but a cheap way to avoid ambiguity), make all them the prune_data
   and conclude.

Because the second step currently is implemented by calling
handle_opt(), which not just tells if it is an option we understand
or not, but also mucks with argv[left++], you need to undo it once
you start making it possible for a valid "rev" to begin with a dash.
That is what your left-- was, and that is what "decrement and then
increment when it turns out it was an unknown option after all" is.

The first step to a saner flow _could_ be to stop passing the unkv
and unkc to handle_revision_opt() and instead make the caller
responsible for doing that.  That would express what your patch
wanted to do in the most natural way, i.e.

 * If it is an option known to us, handle it and go to the next arg.

 * If it is a rev, handle it, and note that fact in got_rev_arg
   (this change of order enables you to allow a rev that begins with
   a dash, which would have been misrecognised as a possible unknown
   option).

 * If it looks like an option (i.e. "begins with a dash"), then we
   already know that it is not something we understand, because the
   first step would have caught it already.  Stuff it in
   argv[left++] and go to the next arg.

 * If it is not a rev and we haven't seen dashdash, verify that it
   and everything that follows it are pathnames (which is an inexact
   but a cheap way to avoid ambiguity), make all them the prune_data
   and conclude.

Such a change to handle_revision_opt() unfortunately affects other
callers of the function, so it may not be worth it, but I think
"decrement and then increment, because this codepath wants to check
to see something that may ordinarily be clasified as an unknown
option if it is a rev" is an ugly workaround, just like your left--
was.  But I think the resulting code flow is much closer to the
above ideal.



Re: [PATCH 1/2 v3] revision.c: args starting with "-" might be a revision

2017-02-11 Thread Siddharth Kannan
Hey Junio,
On Fri, Feb 10, 2017 at 03:35:47PM -0800, Junio C Hamano wrote:
> So the difference is just "--left" (by the way, our codebase seem to
> prefer "left--" when there is no difference between pre- or post-
> decrement/increment) that adjusts the slot in argv[] where the next
> unknown argument is stuffed to.

Understood, I will use post decrement.

> I am wondering if writing it like the following is easier to
> understand.  I had a hard time figuring out what you are trying to
> do, partly because "args" is quite a misnomer---implying "how many
> arguments did we see" that is similar to opts that does mean "how
> many options did handle_revision_opts() see?"  

Um, okay, I see that "args" is very confusing. Would it help if this variable
was called "arg_not_rev"? Because the value that is returned from
handle_revision_arg is 1 when it is not a revision, and 0 when it is a
revision. The changed block of code would look like this:

---
 revision.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index b37dbec..4131ad5 100644
--- a/revision.c
+++ b/revision.c
@@ -2205,6 +2205,7 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
read_from_stdin = 0;
for (left = i = 1; i < argc; i++) {
const char *arg = argv[i];
+   int handle_rev_arg_called = 0, arg_not_rev;
if (*arg == '-') {
int opts;
@@ -2234,11 +2235,18 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
}
if (opts < 0)
exit(128);
-   continue;
+
+   arg_not_rev = handle_revision_arg(arg, revs, flags, 
revarg_opt);
+   handle_rev_arg_called = 1;
+   if (arg_not_rev)
+   continue; /* arg is neither an option nor a 
revision */
+   else
+   left--; /* arg is a revision! */
}
 
 
-   if (handle_revision_arg(arg, revs, flags, revarg_opt)) {
+   if ((handle_rev_arg_called && arg_not_rev) ||
+   handle_revision_arg(arg, revs, flags, revarg_opt)) {

> The rewrite below may avoid such a confusion.  I dunno.

Um, I am sorry, but I feel that decrementing left, and incrementing it again is
also confusing. I think that with a better name for the return value from
handle_revision_arg, the earlier confusion should be resolved.

I base this on my previous experience following the codepath. It was easy for
me to understand with the previous code that "continue" will be executed from
within the first if block whenever arg begins with a "-" and it is determined
that it is not an option. 

going by that, now, "continue" will be executed whenever it's not an option and
_also_ not an argument. Otherwise, the further parts of the code will execute
as before, and there are no continue statements there. I hope this argument
makes sense.

Also worth noting, The two `if` lines look better now:

1. If arg is not a revision, go to the next arg (because we have already
determined that it is not an option)

2. If handle_rev_arg was called AND the argument was not a revision, OR
if handle_revision_arg says that arg is not a rev, execute the following block.

Perhaps, someone else could please have a look at the changes in the block
above and the block below and give some feedback on which one is easier to
understand and the reason that they feel so. Thanks a lot!

> 
>  revision.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/revision.c b/revision.c
> index b37dbec378..e238430948 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2204,6 +2204,7 @@ int setup_revisions(int argc, const char **argv, struct 
> rev_info *revs, struct s
>   revarg_opt |= REVARG_CANNOT_BE_FILENAME;
>   read_from_stdin = 0;
>   for (left = i = 1; i < argc; i++) {
> + int maybe_rev = 0;
>   const char *arg = argv[i];
>   if (*arg == '-') {
>   int opts;
> @@ -2234,11 +2235,16 @@ int setup_revisions(int argc, const char **argv, 
> struct rev_info *revs, struct s
>   }
>   if (opts < 0)
>   exit(128);
> - continue;
> + maybe_rev = 1;
> + left--; /* tentatively cancel "unknown opt" */
>   }
>  
> -
> - if (handle_revision_arg(arg, revs, flags, revarg_opt)) {
> + if (!handle_revision_arg(arg, revs, flags, revarg_opt)) {
> + got_rev_arg = 1;
> + } else if (maybe_rev) {
> + left++; /* it turns out that it was "unknown opt" */
> + continue;
> + } else {
>   int j;
>   

Re: [PATCH 1/2 v3] revision.c: args starting with "-" might be a revision

2017-02-10 Thread Junio C Hamano
Siddharth Kannan  writes:

> @@ -2234,11 +2235,18 @@ int setup_revisions(int argc, const char **argv, 
> struct rev_info *revs, struct s
>   }
>   if (opts < 0)
>   exit(128);
> - continue;
> +
> + args = handle_revision_arg(arg, revs, flags, 
> revarg_opt);
> + handle_rev_arg_called = 1;
> + if (args)
> + continue;
> + else
> + --left;
>   }
>  
>  
> - if (handle_revision_arg(arg, revs, flags, revarg_opt)) {
> + if ((handle_rev_arg_called && args) ||
> + handle_revision_arg(arg, revs, flags, 
> revarg_opt)) {

Naively I would have expected that removing the "continue" at the
end and letting the control go to the existing

if (handle_revision_arg(arg, revs, flags, revarg_opt)) {

would be all that is needed.  The latter half of the patch is an
artifact of having ane xtra "handle_revision_arg() calls inside the
"if it begins with dash" block to avoid calling it twice.

So the difference is just "--left" (by the way, our codebase seem to
prefer "left--" when there is no difference between pre- or post-
decrement/increment) that adjusts the slot in argv[] where the next
unknown argument is stuffed to.

The adjustment is needed as the call to handle_revision_opt() that
is before the pre-context of this hunk stuffed the unknown thing
that begins with "-" into argv[left++]; if that thing turns out to
be a valid rev, then you would need to take it back, because after
all, that is not an unknown command line argument.

I am wondering if writing it like the following is easier to
understand.  I had a hard time figuring out what you are trying to
do, partly because "args" is quite a misnomer---implying "how many
arguments did we see" that is similar to opts that does mean "how
many options did handle_revision_opts() see?"  The variable means
means "yes we saw a valid rev" when it is zero.  The rewrite
below may avoid such a confusion.  I dunno.

 revision.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/revision.c b/revision.c
index b37dbec378..e238430948 100644
--- a/revision.c
+++ b/revision.c
@@ -2204,6 +2204,7 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
revarg_opt |= REVARG_CANNOT_BE_FILENAME;
read_from_stdin = 0;
for (left = i = 1; i < argc; i++) {
+   int maybe_rev = 0;
const char *arg = argv[i];
if (*arg == '-') {
int opts;
@@ -2234,11 +2235,16 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
}
if (opts < 0)
exit(128);
-   continue;
+   maybe_rev = 1;
+   left--; /* tentatively cancel "unknown opt" */
}
 
-
-   if (handle_revision_arg(arg, revs, flags, revarg_opt)) {
+   if (!handle_revision_arg(arg, revs, flags, revarg_opt)) {
+   got_rev_arg = 1;
+   } else if (maybe_rev) {
+   left++; /* it turns out that it was "unknown opt" */
+   continue;
+   } else {
int j;
if (seen_dashdash || *arg == '^')
die("bad revision '%s'", arg);
@@ -2255,8 +2261,6 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
append_prune_data(_data, argv + i);
break;
}
-   else
-   got_rev_arg = 1;
}
 
if (prune_data.nr) {