> On 11.10.2013 16:55, Bob Archer wrote:
> >> On 11.10.2013 15:58, Bob Archer wrote:
> >>>> On Thu, Oct 10, 2013 at 5:49 PM, Bob Archer <bob.arc...@amsi.com> wrote:
> >>>> I assume he was asking how to "fix" the blame. Cause, sure, he
> >>>> could open the file, convert it back to UTF-8 with CRLF line
> >>>> endings... and commit it... of course, now blame is going to show
> >>>> him on every line, since he just changed every line.
> >>>>
> >>>> That's exactly what I meant. You're correct with how the blame is
> >>>> handled. I committed the UTF-8 copy to a test branch, diff'd, and
> >>>> it showed every line as being changed. Unfortunately it looks like
> >>>> this is our best option.
> >>>
> >>> Yep, we have done the same thing. As a matter of fact, I just over
> >>> the past few days rescripted all our database scripts to be UTF-8
> >>> since merging them just doesn't work correctly when they are UTF-16
> >>> even if you remove the binary mime type.
> >>>
> >>>> On Thu, Oct 10, 2013 at 7:07 PM, Ben Reser <b...@reser.org> wrote:
> >>>> At current blame is not UTF-16 aware.
> >>>
> >>> It's not just blame that isn't... the diff engine, or whatever
> >>> detects file types always considers UTF-16 files to be binary. If
> >>> you "add" a UTF-16 file you see that svn adds the
> >>> application/octet-stream mime type. There is an issue in the bug
> >>> database about this from when I reported/complained about it...
> >>> however it hasn't been addressed. I'm surprised still at this time
> >>> that svn still can't support UTF-16 text files as text wrt adding,
> >>> diffing, blaming, etc.
> >>
> >> It's quite simple: no-one has written the necessary code. While I can
> >> understand it's an interesting feature for Windows users, most
> >> Subversion developers have other things to do. This being a volunteer
> >> project, and most of us do not use Windows, you can hardly expect
> >> anyone to spend several weeks on solving a problem that has a
> >> perfectly simple workaround. Since UTF-8 and UTF-16 can be
> >> interchanged without data loss, there are other, much more important
> >> things to do in Subversion.
> >
> > I appreciate all that you said. I didn't expect that UTF-16 was so
> > uncommon in non-Windows OSes. A large number of dev tools that I work
> > with on Windows, especially the Microsoft tools default to creating
> > UTF-16 files.
> >
> > I disagree with your "can be converted without data loss". If you need
> > UTF-16 then you need it. Also, if you are working in an international
> > team and you have developers with other language OSes which have
> > different code pages then what you see when you look at a UTF-8 file
> > might be different than what I see.
>
> I don't follow. Both UTF-16 and UTF-8 are complete representations of the
> Unicode character set. Exactly the same code sequences can be represented
> in both encodings. You can convert from UTF-16 to UTF-8 and back and get
> exactly the same sequence of bytes.
>
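
For what it's worth, the round-trip claim quoted above is easy to check. Here is a minimal sketch in Python, assuming the UTF-16 data is well-formed, little-endian, and carries no BOM (a file with a BOM or in big-endian order would need the matching codec). The sample string and the helper names are made up for illustration:

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 (little-endian, no BOM) bytes as UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


def utf8_to_utf16(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as UTF-16 (little-endian, no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")


# Hypothetical sample: ASCII, accented Latin, CJK, and a character
# outside the Basic Multilingual Plane, plus a CRLF line ending.
sample = "SELECT 'na\u00efve caf\u00e9', '\u65e5\u672c\u8a9e', '\U0001F600';\r\n"

original = sample.encode("utf-16-le")
as_utf8 = utf16_to_utf8(original)
round_tripped = utf8_to_utf16(as_utf8)

# Compare the bytes after the round trip with the bytes we started with.
print(round_tripped == original)
```

This only shows that the two encodings carry the same characters. It says nothing about how svn itself treats either encoding.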
Ok, I have to backpedal here a bit. You are correct: UTF-8 is a Unicode encoding and can store all characters. It's not a UTF-8 vs. UTF-16 issue (Friday senior moment).

What I recall being told by one of the Subversion developers was that Subversion only supported the ASCII character set, and that while UTF-8 was compatible with ASCII, Subversion didn't truly support Unicode files. However, this blog entry seems to dispute that:

http://rhubbarb.wordpress.com/2012/04/28/svn-unicode/

Would adding that mime-type to this file fix the blame issues this user is seeing?

Bob