Re: extending the blame callback

2019-01-18 Thread Julian Foad
Stefan Kueng wrote: > On 14.01.2019 14:25, Branko Čibej wrote: > > I started on that then decided that svn_client_blame6 is far too narrow > > scope for that. In order to do this right, we have to introduce a line > > splitter callback to svn_subst_stream_translated. > > > > My idea was to do some

Re: extending the blame callback

2019-01-14 Thread Stefan Kueng
On 14.01.2019 14:25, Branko Čibej wrote: On 14.01.2019 13:36, Julian Foad wrote: Stefan, thanks for taking account of the feedback and updating the doc string in r1851197. I took a look and thought to rewrite the part about encoding and line splitting like this: * Character Encoding and

Re: extending the blame callback

2019-01-14 Thread Branko Čibej
On 14.01.2019 15:37, Julian Foad wrote: > Branko Čibej wrote: >> On 14.01.2019 13:36, Julian Foad wrote: > [...] > > Thanks, Brane; you gave a good initial summary of the scope of the problem. > As you said on IRC there's even more to it, such as the relationship between > line splitting and keyw

Re: extending the blame callback

2019-01-14 Thread Julian Foad
Branko Čibej wrote: > On 14.01.2019 13:36, Julian Foad wrote: [...] Thanks, Brane; you gave a good initial summary of the scope of the problem. As you said on IRC there's even more to it, such as the relationship between line splitting and keywords, meaning that all has to be addressed together.

Re: extending the blame callback

2019-01-14 Thread Branko Čibej
On 14.01.2019 13:36, Julian Foad wrote: > Stefan, thanks for taking account of the feedback and updating the doc string > in r1851197. > > I took a look and thought to rewrite the part about encoding and line > splitting like this: > > * Character Encoding and Line Splitting: > * > * It is up

Re: extending the blame callback

2019-01-14 Thread Julian Foad
Stefan, thanks for taking account of the feedback and updating the doc string in r1851197. I took a look and thought to rewrite the part about encoding and line splitting like this: * Character Encoding and Line Splitting: * * It is up to the client to determine the character encoding. The @

Re: extending the blame callback

2019-01-12 Thread Branko Čibej
On 11.01.2019 17:11, Stefan Kueng wrote: > > > On 10.01.2019 23:46, Branko Čibej wrote: >> On 10.01.2019 19:13, Stefan Kueng wrote: >>> >>> >>> On 10.01.2019 06:58, Branko Čibej wrote: On 10.01.2019 04:58, Branko Čibej wrote: > On 07.01.2019 20:57, Stefan Kueng wrote: >> @@ -758,6 +759

Re: extending the blame callback

2019-01-11 Thread Stefan Kueng
On 10.01.2019 23:46, Branko Čibej wrote: On 10.01.2019 19:13, Stefan Kueng wrote: On 10.01.2019 06:58, Branko Čibej wrote: On 10.01.2019 04:58, Branko Čibej wrote: On 07.01.2019 20:57, Stefan Kueng wrote: @@ -758,6 +759,33 @@    * will be true if the reason there is no blame information

Re: extending the blame callback

2019-01-10 Thread Branko Čibej
On 10.01.2019 19:13, Stefan Kueng wrote: > > > On 10.01.2019 06:58, Branko Čibej wrote: >> On 10.01.2019 04:58, Branko Čibej wrote: >>> On 07.01.2019 20:57, Stefan Kueng wrote: @@ -758,6 +759,33 @@    * will be true if the reason there is no blame information is that the line   

Re: extending the blame callback

2019-01-10 Thread Stefan Kueng
On 10.01.2019 06:58, Branko Čibej wrote: On 10.01.2019 04:58, Branko Čibej wrote: On 07.01.2019 20:57, Stefan Kueng wrote: @@ -758,6 +759,33 @@ * will be true if the reason there is no blame information is that the line * was modified locally. In all other cases @a local_change will be

Re: extending the blame callback

2019-01-09 Thread Branko Čibej
On 10.01.2019 04:58, Branko Čibej wrote: > On 07.01.2019 20:57, Stefan Kueng wrote: >> @@ -758,6 +759,33 @@ >> * will be true if the reason there is no blame information is that the line >> * was modified locally. In all other cases @a local_change will be false. >> * >> + * @note the line is

Re: extending the blame callback

2019-01-09 Thread Branko Čibej
On 07.01.2019 20:57, Stefan Kueng wrote: > @@ -758,6 +759,33 @@ > * will be true if the reason there is no blame information is that the line > * was modified locally. In all other cases @a local_change will be false. > * > + * @note the line is split on LF characters. Clients must be aware o

Re: extending the blame callback

2019-01-07 Thread Peter Samuelson
[Daniel Shahaf] > The current patch's docstring implies the LF byte is necessarily part > of a line terminator, which is true for UTF-8/16/32 but not > necessarily true in arbitrary encodings. Nitpick: It is true in UTF-8, but not -16 or -32. There are about 70 characters in the BMP which, in U

Re: extending the blame callback

2019-01-07 Thread Stefan Kueng
On 07.01.2019 19:58, Daniel Shahaf wrote: Stefan Kueng wrote on Mon, 07 Jan 2019 19:30 +0100: On 06.01.2019 21:09, Daniel Shahaf wrote: Stefan Kueng wrote on Sun, Jan 06, 2019 at 20:40:28 +0100: @@ -758,6 +759,33 @@ * will be true if the reason there is no blame information is that the l

Re: extending the blame callback

2019-01-07 Thread Daniel Shahaf
Stefan Kueng wrote on Mon, 07 Jan 2019 19:30 +0100: > On 06.01.2019 21:09, Daniel Shahaf wrote: > > Stefan Kueng wrote on Sun, Jan 06, 2019 at 20:40:28 +0100: > >> @@ -758,6 +759,33 @@ > >>* will be true if the reason there is no blame information is that the > >> line > >>* was modified l

Re: extending the blame callback

2019-01-07 Thread Stefan Kueng
On 07.01.2019 19:56, Branko Čibej wrote: On Mon, 7 Jan 2019, 19:30 Stefan Kueng wrote: I think for proper C, all variables need to be declared at the start of the function. Unless of course svn allows newer C versions? Not "the start of the function"

Re: extending the blame callback

2019-01-07 Thread Branko Čibej
On Mon, 7 Jan 2019, 19:30 Stefan Kueng > I think for proper C, all variables need to be declared at the start of > the function. Unless of course svn allows newer C versions? > Not "the start of the function" but "before any statements in the enclosing block." It has been that way since the begin

Re: extending the blame callback

2019-01-07 Thread Stefan Kueng
the attached patch has some corrections to the doc strings. If there are no objections, I'll commit this tomorrow. Also see inline comments: On 06.01.2019 21:09, Daniel Shahaf wrote: Stefan Kueng wrote on Sun, Jan 06, 2019 at 20:40:28 +0100: +++ subversion/include/svn_client.h (working cop

Re: extending the blame callback

2019-01-07 Thread Julian Foad
I reviewed the patch 'blame6_5.patch'. It all looks fine, functionally. I think it provides a good upgrade for any user of the API, even though I don't really think using it to process UTF-16 data is a very good idea. The only part of the patch I'd like to see changed is the comment about "the

Re: extending the blame callback

2019-01-06 Thread Branko Čibej
On 07.01.2019 06:17, Daniel Shahaf wrote: > Branko Čibej wrote on Mon, 07 Jan 2019 05:55 +0100: >> On 06.01.2019 19:54, Daniel Shahaf wrote: >>> Branko Čibej wrote on Sun, 06 Jan 2019 19:37 +0100: A simple check would be: * if 0x0a is on an odd offset, and the next byte is 0x00, th

Re: extending the blame callback

2019-01-06 Thread Daniel Shahaf
Branko Čibej wrote on Mon, 07 Jan 2019 05:55 +0100: > On 06.01.2019 19:54, Daniel Shahaf wrote: > > Branko Čibej wrote on Sun, 06 Jan 2019 19:37 +0100: > >> A simple check would be: > >> > >> * if 0x0a is on an odd offset, and the next byte is 0x00, then it's a > >> UTF-16-LE linefeed; > >>

Re: extending the blame callback

2019-01-06 Thread Branko Čibej
On 06.01.2019 20:10, Stefan Kueng wrote: > > > On 06.01.2019 19:37, Branko Čibej wrote: > >> Windows default is UTF-16-LE, at least on x86(_64) and other >> little-endian architectures. I'm not sure what they do on ARM but I'd be >> surprised if Windows doesn't put it in little-endian mode, given t

Re: extending the blame callback

2019-01-06 Thread Branko Čibej
On 06.01.2019 19:54, Daniel Shahaf wrote: > Branko Čibej wrote on Sun, 06 Jan 2019 19:37 +0100: >> A simple check would be: >> >> * if 0x0a is on an odd offset, and the next byte is 0x00, then it's a >> UTF-16-LE linefeed; >> * else if 0x0a is on an even offset, and the _previous_ byte is 0

Re: extending the blame callback

2019-01-06 Thread Daniel Shahaf
Stefan Kueng wrote on Sun, Jan 06, 2019 at 20:40:28 +0100: > +++ subversion/include/svn_client.h (working copy) > @@ -736,10 +736,11 @@ > * @{ > */ > > -/** Callback type used by svn_client_blame5() to notify the caller > +/** Callback type used by svn_client_blame6() to notify the caller >

Re: extending the blame callback

2019-01-06 Thread Stefan Kueng
On 06.01.2019 19:54, Daniel Shahaf wrote: The encoding may also be set explicitly via a svn:mime-type="text/foo; charset=utf-16-le" property. (We even parse that in mod_dav_svn, I think?) Setting the mime-type to text/.. with the utf16 charset helps only that the --force flag doesn't need t

Re: extending the blame callback

2019-01-06 Thread Stefan Kueng
On 06.01.2019 19:37, Branko Čibej wrote: Windows default is UTF-16-LE, at least on x86(_64) and other little-endian architectures. I'm not sure what they do on ARM but I'd be surprised if Windows doesn't put it in little-endian mode, given that decades of legacy software assume little-endian.

Re: extending the blame callback

2019-01-06 Thread Daniel Shahaf
Branko Čibej wrote on Sun, 06 Jan 2019 19:37 +0100: > A simple check would be: > > * if 0x0a is on an odd offset, and the next byte is 0x00, then it's a > UTF-16-LE linefeed; > * else if 0x0a is on an even offset, and the _previous_ byte is 0x00, > then it's a UTF-16-BE linefeed; Woul

Re: extending the blame callback

2019-01-06 Thread Branko Čibej
On 06.01.2019 17:16, Stefan Kueng wrote: > > > On 06.01.2019 15:05, Branko Čibej wrote: > >> Sorry about getting into this late, but as Julian said: >> >>> * we already have a (char*, len) wrapper, called svn_string_t, so I >>> would expect it would be neatest to use that; >> >> but the patch has:

Re: extending the blame callback

2019-01-06 Thread Stefan Kueng
On 06.01.2019 15:05, Branko Čibej wrote: Sorry about getting into this late, but as Julian said: * we already have a (char*, len) wrapper, called svn_string_t, so I would expect it would be neatest to use that; but the patch has: @@ -758,6 +758,28 @@ ·*·will·be·true·if·the·reason·there·

Re: extending the blame callback

2019-01-06 Thread Stefan Kueng
On 06.01.2019 15:22, Julian Foad wrote: Branko Čibej wrote: + const svn_string_t line = {sb->data, sb->len}; In plain old C, we can't write that, but we have a helper function: svn_stringbuf__morph_into_string(). but that destroys the original stringbuf, but that one is sti

Re: extending the blame callback

2019-01-06 Thread Branko Čibej
On 06.01.2019 15:22, Julian Foad wrote: > Branko Čibej wrote: >> + const svn_string_t line = {sb->data, sb->len}; > In plain old C, we can't write that, but we have a helper function: > svn_stringbuf__morph_into_string(). I've noticed that clang is a bit deficient when it comes to pe

Re: extending the blame callback

2019-01-06 Thread Julian Foad
Branko Čibej wrote: > + const svn_string_t line = {sb->data, sb->len}; In plain old C, we can't write that, but we have a helper function: svn_stringbuf__morph_into_string(). -- - Julian

Re: extending the blame callback

2019-01-06 Thread Branko Čibej
On 06.01.2019 15:05, Branko Čibej wrote: > Until we have better support for other Unicode representations, we > should detect that null byte and include it as part of the reported > blame line. That might also imply detecting that the file encoding is not UTF-16-BE instead ... which could very lik

Re: extending the blame callback

2019-01-06 Thread Branko Čibej
On 06.01.2019 14:10, Stefan Kueng wrote: > > > On 05.01.2019 22:35, Daniel Shahaf wrote: >> Stefan Kueng wrote on Sat, 05 Jan 2019 21:15 +0100: >>> here's a patch using svn_stringbuf_t for review. >> >> Change "Provided for backwards compatibility with the 1.6 API" to >> "Provided for backwards com

Re: extending the blame callback

2019-01-06 Thread Stefan Kueng
On 05.01.2019 22:35, Daniel Shahaf wrote: Stefan Kueng wrote on Sat, 05 Jan 2019 21:15 +0100: here's a patch using svn_stringbuf_t for review. Change "Provided for backwards compatibility with the 1.6 API" to "Provided for backwards compatibility with the 1.11 API" in svn_client_blame5()'s d

Re: extending the blame callback

2019-01-05 Thread Daniel Shahaf
Stefan Kueng wrote on Sat, 05 Jan 2019 21:15 +0100: > here's a patch using svn_stringbuf_t for review. Change "Provided for backwards compatibility with the 1.6 API" to "Provided for backwards compatibility with the 1.11 API" in svn_client_blame5()'s docstring.

Re: extending the blame callback

2019-01-05 Thread Stefan Kueng
On 05.01.2019 17:53, Julian Foad wrote: Stefan Kueng wrote: When running blame on an utf16 text file, the lines can not be used since the blame callback only passes a 'char *' parameter which means it ends at the first zero char. But actually, svn knows if the line has more content. +1 on do

Re: extending the blame callback

2019-01-05 Thread Stefan Kueng
On 05.01.2019 18:05, Mark Phippard wrote: On Jan 5, 2019, at 10:58 AM, Stefan Kueng wrote: So, with this patch I'm able to do a blame on utf16 files, so I'd like to commit this if there are no objections. I have no objections, just a question. I thought SVN treated utf16 files as bina

Re: extending the blame callback

2019-01-05 Thread Stefan Kueng
On 05.01.2019 17:53, Julian Foad wrote: Stefan Kueng wrote: When running blame on an utf16 text file, the lines can not be used since the blame callback only passes a 'char *' parameter which means it ends at the first zero char. But actually, svn knows if the line has more content. +1 on d

Re: extending the blame callback

2019-01-05 Thread Mark Phippard
> On Jan 5, 2019, at 10:58 AM, Stefan Kueng wrote: > > So, with this patch I'm able to do a blame on utf16 files, so I'd like to > commit this if there are no objections. > I have no objections, just a question. I thought SVN treated utf16 files as binary. How does blame work at all? I th

Re: extending the blame callback

2019-01-05 Thread Julian Foad
Stefan Kueng wrote: > When running blame on an utf16 text file, the lines can not be used > since the blame callback only passes a 'char *' parameter which means it > ends at the first zero char. But actually, svn knows if the line has > more content. +1 on doing something to fix the problem.

extending the blame callback

2019-01-05 Thread Stefan Kueng
Hi, When running blame on an utf16 text file, the lines can not be used since the blame callback only passes a 'char *' parameter which means it ends at the first zero char. But actually, svn knows if the line has more content. So I'd like to propose the following patch which extends the blame