Re: SVN Blame Returns Corrupt Data

Branko Čibej Fri, 11 Oct 2013 08:00:23 -0700

On 11.10.2013 16:55, Bob Archer wrote:
>> On 11.10.2013 15:58, Bob Archer wrote:
>>>> On Thu, Oct 10, 2013 at 5:49 PM, Bob Archer <[email protected]> wrote:
>>>> I assume he was asking how to "fix" the blame. Cause, sure, he could
>>>> open the file, convert it back to UTF-8 with CRLF line endings... and
>>>> commit it... of course, now blame is going to show him on every line,
>>>> since he just changed every line.
>>>>
>>>> That's exactly what I meant.  You're correct with how the blame is
>>>> handled.  I committed the UTF-8 copy to a test branch, diff'd, and it
>>>> showed every line as being changed.  Unfortunately it looks like this is
>>>> our best option.
>>> Yep, we have done the same thing. As a matter of fact, over the past few
>>> days I rescripted all our database scripts to be UTF-8, since merging them
>>> just doesn't work correctly when they are UTF-16 even if you remove the
>>> binary mime type.
>>>> On Thu, Oct 10, 2013 at 7:07 PM, Ben Reser <[email protected]> wrote:
>>>> At present blame is not UTF-16 aware.
>>> It's not just blame that isn't... the diff engine, or whatever detects file
>>> types, always considers UTF-16 files to be binary. If you "add" a UTF-16 file
>>> you see that svn adds the application/octet-stream mime type.  There is an
>>> issue in the bug database about this from when I reported/complained about
>>> it... however it hasn't been addressed. I'm still surprised that svn can't
>>> yet support UTF-16 text files as text wrt adding, diffing, blaming, etc.
>>
>> It's quite simple: no-one has written the necessary code. While I can
>> understand it's an interesting feature for Windows users, most Subversion
>> developers have other things to do. This is a volunteer project and most
>> of us do not use Windows, so you can hardly expect anyone to spend several
>> weeks on solving a problem that has a perfectly simple workaround. Since
>> UTF-8 and UTF-16 can be interchanged without data loss, there are other,
>> much more important things to do in Subversion.
> I appreciate all that you said. I didn't expect that UTF-16 was so uncommon 
> in non-Windows OSes. A large number of dev tools that I work with on Windows,
> especially the Microsoft tools, default to creating UTF-16 files.
>
> I disagree with your "can be converted without data loss". If you need UTF-16 
> then you need it. Also, if you are working in an international team and you 
> have developers with other language OSes which have different code pages then
> what you see when you look at a UTF-8 file might be different than what I see.


I don't follow. Both UTF-16 and UTF-8 are complete representations of
the Unicode character set. Exactly the same code point sequences can be
represented in both encodings. You can convert from UTF-16 to UTF-8 and
back and get exactly the same sequence of bytes, provided you keep the
same byte order and BOM.
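
A minimal Python sketch of that round trip, for illustration only. The
helper names and the sample string are made up for this example, and it
assumes the input is well-formed little-endian UTF-16 with a BOM, which
is what Windows tools typically write:

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Decode BOM-prefixed UTF-16 and re-encode it as UTF-8 (no BOM)."""
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    return data.decode("utf-16").encode("utf-8")


def utf8_to_utf16le_with_bom(data: bytes) -> bytes:
    """Re-encode UTF-8 as little-endian UTF-16 with a leading BOM."""
    return codecs.BOM_UTF16_LE + data.decode("utf-8").encode("utf-16-le")


# Hypothetical sample line: BMP characters, a non-BMP character that needs
# a surrogate pair in UTF-16, and a CRLF line ending.
text = "SELECT 'Čibej €' -- 𝄞\r\n"
original = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

as_utf8 = utf16_to_utf8(original)
restored = utf8_to_utf16le_with_bom(as_utf8)

# The bytes after the round trip are identical to the input.
assert restored == original
```

The result is byte-exact only because the sketch puts back the same byte
order and BOM that the original had. Ill-formed input, such as an
unpaired surrogate, would make the decode step raise an error rather
than convert silently.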

-- Brane


-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. [email protected]
