On 07.01.2019 06:17, Daniel Shahaf wrote:
> Branko Čibej wrote on Mon, 07 Jan 2019 05:55 +0100:
>> On 06.01.2019 19:54, Daniel Shahaf wrote:
>>> Branko Čibej wrote on Sun, 06 Jan 2019 19:37 +0100:
>>>> A simple check would be:
>>>>
>>>>   * if 0x0a is on an odd offset, and the next byte is 0x00, then it's a
>>>>     UTF-16-LE linefeed;
>>>>   * else if 0x0a is on an even offset, and the _previous_ byte is 0x00,
>>>>     then it's a UTF-16-BE linefeed;
>>> What would happen if it were an ASCII/UTF-8 file that happened to
>>> have a literal NUL byte next to an LF byte?  I have seen/used
>>> some of those.
>> Yes, well, in that case the NUL just might randomly "move" from one line
>> to the next, depending on changes in the file. Nothing we can do
>> short-term will fix that ... and as far as I'm concerned that's not a
>> valid text file, so it won't disturb me too much if blame results are a
>> bit fuzzy in such cases.
> You don't actually give any reason why we shouldn't support text files
> with embedded NULs.


I didn't say we shouldn't. Clearly we do, if inadvertently, or we
wouldn't be discussing Stefan's patch in this thread.
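
For reference, the check quoted at the top of this message could be
sketched roughly as below. This is only an illustration, not code from
Stefan's patch or from the difflib: the function name classify_lf and
its return strings are made up here, and it assumes offsets are counted
from 1, which is the reading under which the odd/even rule lines up
with UTF-16 code units that start at the beginning of the file.

```python
def classify_lf(data: bytes, i: int) -> str:
    """Guess what kind of linefeed the 0x0a byte at 0-based index i is.

    Hypothetical helper: applies only the odd/even offset rule quoted
    in this thread, with offsets counted from 1 (an assumption).
    """
    if data[i] != 0x0A:
        raise ValueError("byte at index i is not 0x0a")

    offset = i + 1  # 1-based offset of the 0x0a byte

    # Odd offset and the next byte is NUL: looks like UTF-16-LE (0a 00).
    if offset % 2 == 1 and i + 1 < len(data) and data[i + 1] == 0x00:
        return "utf-16-le"

    # Even offset and the previous byte is NUL: looks like UTF-16-BE (00 0a).
    if offset % 2 == 0 and i >= 1 and data[i - 1] == 0x00:
        return "utf-16-be"

    # Otherwise treat it as an ordinary single-byte linefeed.
    return "single-byte"


if __name__ == "__main__":
    print(classify_lf("a\n".encode("utf-16-le"), 2))
    print(classify_lf("a\n".encode("utf-16-be"), 3))
    # An ASCII file with a NUL right after the LF is indistinguishable
    # from UTF-16-LE under this rule.
    print(classify_lf(b"ab\n\x00cd", 2))
```

The last call shows the ambiguity raised earlier in the thread: an ASCII
or UTF-8 file with a NUL next to an LF at the matching offset parity is
reported as UTF-16, so the NUL can end up attached to either
neighbouring line.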


>>>   (We even parse that in mod_dav_svn, I think?)
>> Even if we do, it's irrelevant to this discussion. We set the value of
>> the Content-Type header based on svn:mime-type for a simple GET request,
>> but don't interpret it otherwise on the server side.
> The relevance is that the property may be already set in users' repositories.

It's still irrelevant to *this* discussion. I don't think anyone is
proposing to propagate pieces of svn:mime-type deep into the difflib as
part of this patch.

It will be relevant to some future discussion where we talk about
supporting (and detecting) Unicode representations as text throughout
our code.

-- Brane
