On Thu, 4 Oct 2018 at 20:56, Johannes Schindelin
<johannes.schinde...@gmx.de> wrote:
>
> Hi Ananya,
>
> On Thu, 4 Oct 2018, Ananya Krishna Maram wrote:
>
> > On Thu, 4 Oct 2018 at 19:56, Johannes Schindelin
> > <johannes.schinde...@gmx.de> wrote:
> > >
> > > Hi Ananya,
> > >
> > > thank you for taking the time to write this patch!
> > >
> > > On Thu, 4 Oct 2018, Ananya Krishna Maram wrote:
> > >
> > > > the forward slash character should be escaped with backslash. Fix
> > > > Unescaped forward slash error in Python regex statements.
> > > >
> > > > Signed-off-by: Ananya Krishna Maram<ananyakittu1...@gmail.com>
> > >
> > > That explains pretty well what you did, but I wonder why the forward slash
> > > needs to be escaped? I would understand if we enclosed the pattern in
> > > `/<regex>/`, as it is done e.g. in Javascript, but we do not...
> >
> > You are correct, the code would execute either ways. But when I came across
> > this line, I didn't get it's meaning instantly because as per standards,
> > forward slash has to be escaped. In fact when open source code is written
> > according to standards, the code will be reachable to more people.
>
> I am afraid that I do not follow... Regular expressions have quite a few
> special characters, but forward slashes are not among them.
>
> Meaning: if we had specified the regular expression thusly (in any
> language that supports them to be specified in this way):
>
>         /|[-+*/<>%&^|=!]=|//=?|<<=?|>>=?|\\*\\*=?/
>
> then I would agree that this is a bug, and needs to be fixed. But we
> specify it as a regular C string:
>
>         "|[-+*/<>%&^|=!]=|//=?|<<=?|>>=?|\\*\\*=?"
>
> In this context, the backslash has an additional, nested meaning: it
> escapes special characters in a C string, too. So writing
>
>         "\\"
>
> will actually result in a string consisting of a single backslash. And
>
>         "\n"
>
> would specify a string consisting of a single line feed character.
> Some C compilers ignore incorrectly-escaped characters, i.e. "\/" would
> simply expand to the forward slash.
>
> However, you wanted to escape the forward slash in the regular expression.
> To do that, you would have to write
>
>         "\\/"
>
> i.e. specifying, in a C string, a backslash character, followed by a
> forward slash character.
>
> However, the regular expression would then be interpreting the backslash
> character as escape character in its own right, seeing the forward slash,
> and replacing this sequence by a forward slash.
>
> But it does not need to be escaped, when you specify the regular
> expression the way we do. And the way we specified it is really the
> standard when specifying regular expressions in C code, i.e. *without* the
> suggested backslash.

Aha! This makes total sense. I was thinking from a general regular
expression point of view. But I should be thinking from a C point of view
and how C might interpret this newly submitted string. This explanation is
very clear. Thanks for taking time to reply to my patch. From next time on,
I will try to think from the Git project's point of view.

Thanks,
Ananya.

> Ciao,
> Johannes
>
> >
> > Thanks,
> > Ananya.
> >
> > > Thanks,
> > > Johannes
> > >
> > > > ---
> > > >  userdiff.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/userdiff.c b/userdiff.c
> > > > index f565f6731..f4ff9b9e5 100644
> > > > --- a/userdiff.c
> > > > +++ b/userdiff.c
> > > > @@ -123,7 +123,7 @@ PATTERNS("python", "^[ \t]*((class|def)[ \t].*)$",
> > > >        /* -- */
> > > >        "[a-zA-Z_][a-zA-Z0-9_]*"
> > > >        "|[-+0-9.e]+[jJlL]?|0[xX]?[0-9a-fA-F]+[lL]?"
> > > > -      "|[-+*/<>%&^|=!]=|//=?|<<=?|>>=?|\\*\\*=?"),
> > > > +      "|[-+*\/<>%&^|=!]=|\/\/=?|<<=?|>>=?|\\*\\*=?"),
> > > >        /* -- */
> > > >  PATTERNS("ruby", "^[ \t]*((class|module|def)[ \t].*)$",
> > > >        /* -- */
> > > > --
> > > > 2.17.1