On Thu, Sep 13, 2012 at 4:08 PM, Richard Hipp <d...@sqlite.org> wrote:
>
> On Thu, Sep 13, 2012 at 3:45 PM, Richard Hipp <d...@sqlite.org> wrote:
>>
>> On Thu, Sep 13, 2012 at 3:43 PM, Kevin Greiner <grein...@gmail.com> wrote:
>>>
>>> I'm using fossil 1.23 on Windows 7. I'm attempting to store text files
>>> generated by Microsoft SQL Server 2012 in fossil so I can easily track their
>>> changes over time.
>>>
>>> The problem is that fossil thinks these generated text files are binary
>>> data, which prevents me from viewing the files via the web ui and generating
>>> diffs.
>>>
>>> When I look at these files in a hex editor, I see this: ff fe 53 00 45 00
>>> 54 00 20 00 41 00. A text editor shows "SET A".
>>>
>>> I've looked in the email list archive, where DRH specifies that a null
>>> character or a line longer than 8192 chars makes fossil treat a file as
>>> binary. The entire file is 1074 bytes, so it's not the length. Is fossil
>>> reading the 00 bytes as nulls?
>>>
>>> Any idea why fossil thinks these files are binary? And, more importantly,
>>> what encoding can I specify to prevent this? I've tried various permutations
>>> of ASCII, UTF8, UTF7 to no effect.
>>
>> The file itself appears to be in utf16le. The "diff" facilities in Fossil
>> currently only know how to deal with utf8.
>
> It would be an interesting project to enhance Fossil so that it could
> support UTF16 in addition to UTF8. What would be needed is an algorithm to
> detect when a file is UTF16. (The BOM at the beginning of Kevin's example
> ought to be a big hint.) Then automatically call a convert routine to
> generate UTF8 prior to passing the content into the diff logic, or into the
> wiki engine, or prior to display on the UTF8 webpage, etc.
>
> Basically, we need a routine that converts an in-memory buffer from UTF16 to
> UTF8, and leaves anything that isn't UTF16 unchanged. Then we need to call
> that routine in a few strategic places inside of Fossil.
>
> Volunteers to write that routine? I'll help identify the places where it
> needs to be called.
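
As a starting point, here is a minimal sketch of what such a routine might look like. It is not Fossil code: the names utf16_to_utf8 and read16 are made up for illustration, and so is the convention of returning NULL to mean "not UTF16, leave the buffer unchanged". It relies only on the BOM for detection, so UTF16 files without a BOM are treated as not UTF16.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Read one 16-bit code unit in the given byte order (le!=0: little-endian). */
static unsigned long read16(const unsigned char *p, int le){
  return le ? ((unsigned long)p[0] | ((unsigned long)p[1]<<8))
            : (((unsigned long)p[0]<<8) | (unsigned long)p[1]);
}

/*
** Hypothetical helper, not part of Fossil. If z[0..n-1] starts with a
** UTF-16 BOM, return a malloc'd, NUL-terminated UTF-8 copy (BOM dropped)
** and store its length in *pnOut. Otherwise return NULL, which the caller
** should read as "leave the buffer unchanged". NULL is also returned if
** malloc fails. The caller frees the result.
**
** Caveat: a UTF-32LE BOM (ff fe 00 00) also starts with ff fe; this
** sketch does not try to tell the two apart.
*/
unsigned char *utf16_to_utf8(const unsigned char *z, size_t n, size_t *pnOut){
  int le;
  size_t i, j = 0;
  unsigned char *out;

  if( n<2 || (n&1)!=0 ) return NULL;        /* too short or odd length */
  if( z[0]==0xff && z[1]==0xfe ){
    le = 1;                                 /* UTF-16LE BOM */
  }else if( z[0]==0xfe && z[1]==0xff ){
    le = 0;                                 /* UTF-16BE BOM */
  }else{
    return NULL;                            /* no BOM: not handled here */
  }

  /* One code unit needs at most 3 UTF-8 bytes; a surrogate pair needs 4. */
  out = malloc((n/2)*3 + 1);
  if( out==NULL ) return NULL;

  for(i=2; i<n; i+=2){
    unsigned long c = read16(&z[i], le);
    if( c>=0xd800 && c<=0xdbff && i+3<n ){
      /* High surrogate: combine with a following low surrogate. */
      unsigned long c2 = read16(&z[i+2], le);
      if( c2>=0xdc00 && c2<=0xdfff ){
        c = 0x10000 + ((c-0xd800)<<10) + (c2-0xdc00);
        i += 2;
      }
    }
    if( c>=0xd800 && c<=0xdfff ) c = 0xfffd;  /* unpaired surrogate */
    if( c<0x80 ){
      out[j++] = (unsigned char)c;
    }else if( c<0x800 ){
      out[j++] = (unsigned char)(0xc0 | (c>>6));
      out[j++] = (unsigned char)(0x80 | (c&0x3f));
    }else if( c<0x10000 ){
      out[j++] = (unsigned char)(0xe0 | (c>>12));
      out[j++] = (unsigned char)(0x80 | ((c>>6)&0x3f));
      out[j++] = (unsigned char)(0x80 | (c&0x3f));
    }else{
      out[j++] = (unsigned char)(0xf0 | (c>>18));
      out[j++] = (unsigned char)(0x80 | ((c>>12)&0x3f));
      out[j++] = (unsigned char)(0x80 | ((c>>6)&0x3f));
      out[j++] = (unsigned char)(0x80 | (c&0x3f));
    }
  }
  out[j] = 0;
  *pnOut = j;
  return out;
}
```

Fed the twelve bytes from Kevin's hex dump, this should produce the five bytes of "SET A". A caller would keep using its original buffer whenever NULL comes back, so plain UTF8 and genuinely binary content pass through untouched.
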

I guess this could help in writing such a routine: http://unicode.org/faq/utf_bom.html

--
Martin G.