On Thu, Sep 13, 2012 at 4:08 PM, Richard Hipp <d...@sqlite.org> wrote:
>
>
> On Thu, Sep 13, 2012 at 3:45 PM, Richard Hipp <d...@sqlite.org> wrote:
>>
>>
>>
>> On Thu, Sep 13, 2012 at 3:43 PM, Kevin Greiner <grein...@gmail.com> wrote:
>>>
>>>
>>> I'm using fossil 1.23 on Windows 7. I'm attempting to store text files
>>> generated by Microsoft SQL Server 2012 in fossil so I can easily track their
>>> changes over time.
>>>
>>> The problem is that fossil thinks these generated text files are binary
>>> data which prevents me from viewing the files via the web ui and generating
>>> diffs.
>>>
>>> When I look at these files in a hex editor, I see this: ff fe 53 00 45 00
>>> 54 00 20 00 41 00. A text editor shows "SET A".
>>>
>>> I've looked in the email list archive, where DRH specifies that fossil
>>> treats a file as binary if it contains a null character or a line longer
>>> than 8192 chars. The entire file is 1074 bytes so it's not the length. Is
>>> fossil reading the 00 bytes as nulls?
>>>
>>> Any idea why fossil thinks these files are binary? And, more importantly,
>>> what encoding I can specify to prevent this? I've tried various permutations
>>> of ASCII, UTF8, UTF7 to no effect.
>>
>>
>> The file itself appears to be in utf16le.  The "diff" facilities in Fossil
>> currently only know how to deal with utf8.
>
>
> It would be an interesting project to enhance Fossil so that it could
> support UTF16 in addition to UTF8.  What would be needed is an algorithm to
> detect when a file was UTF16.  (The BOM at the beginning of Kevin's example
> ought to be a big hint.)  Then automatically call a convert routine to
> generate UTF8 prior to passing the content into the diff logic, or into the
> wiki engine, or prior to display on the UTF8 webpage, etc.
>
> Basically, we need a routine that converts an in-memory buffer from UTF16 to
> UTF8, and leaves anything that isn't UTF16 unchanged.  Then we need to call
> that routine in a few strategic places inside of Fossil.
>
> Volunteers to write that routine?  I'll help identify the places where it
> needs to be called.
>

I guess this could help in writing such a routine:
http://unicode.org/faq/utf_bom.html
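
For illustration only, here is a rough sketch of the kind of routine
described above: it looks for a UTF-16 BOM and, if one is present, converts
the buffer to UTF-8; anything else is copied through unchanged. The names
utf16_to_utf8() and put_utf8() are hypothetical and are not part of Fossil.
The sketch assumes the BOM is the only detection signal, replaces unpaired
surrogates with U+FFFD, drops a trailing odd byte, and does not try to
distinguish UTF-32LE (whose BOM also starts with ff fe).

```c
#include <stdlib.h>
#include <string.h>

/* Write code point cp to out as UTF-8. Returns the number of bytes written. */
static size_t put_utf8(unsigned char *out, unsigned long cp){
  if( cp<0x80 ){
    out[0] = (unsigned char)cp;
    return 1;
  }
  if( cp<0x800 ){
    out[0] = (unsigned char)(0xC0 | (cp>>6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
  }
  if( cp<0x10000 ){
    out[0] = (unsigned char)(0xE0 | (cp>>12));
    out[1] = (unsigned char)(0x80 | ((cp>>6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
  }
  out[0] = (unsigned char)(0xF0 | (cp>>18));
  out[1] = (unsigned char)(0x80 | ((cp>>12) & 0x3F));
  out[2] = (unsigned char)(0x80 | ((cp>>6) & 0x3F));
  out[3] = (unsigned char)(0x80 | (cp & 0x3F));
  return 4;
}

/* Read one 16-bit code unit at p; le selects little-endian byte order. */
static unsigned long get_unit(const unsigned char *p, int le){
  return le ? (unsigned long)(p[0] | (p[1]<<8))
            : (unsigned long)((p[0]<<8) | p[1]);
}

/*
** Hypothetical helper (not a Fossil API). If in[0..n) begins with a UTF-16
** BOM, return a malloc'd, NUL-terminated UTF-8 copy with the BOM removed.
** Otherwise return a malloc'd byte-for-byte copy of the input. *pnOut
** receives the output length. Returns NULL if malloc fails.
*/
unsigned char *utf16_to_utf8(const unsigned char *in, size_t n, size_t *pnOut){
  int le;
  size_t i, k = 0;
  unsigned char *out;

  if( n>=2 && in[0]==0xFF && in[1]==0xFE ){
    le = 1;                                 /* UTF-16LE BOM */
  }else if( n>=2 && in[0]==0xFE && in[1]==0xFF ){
    le = 0;                                 /* UTF-16BE BOM */
  }else{
    out = malloc(n+1);                      /* no BOM: leave content as-is */
    if( out==0 ) return 0;
    if( n>0 ) memcpy(out, in, n);
    out[n] = 0;
    *pnOut = n;
    return out;
  }

  /* A 2-byte unit yields at most 3 UTF-8 bytes; a 4-byte pair yields 4. */
  out = malloc((n/2)*3 + 1);
  if( out==0 ) return 0;
  for(i=2; i+1<n; i+=2){
    unsigned long cp = get_unit(&in[i], le);
    if( cp>=0xD800 && cp<=0xDBFF ){         /* high surrogate */
      unsigned long lo = (i+3<n) ? get_unit(&in[i+2], le) : 0;
      if( lo>=0xDC00 && lo<=0xDFFF ){
        cp = 0x10000 + ((cp-0xD800)<<10) + (lo-0xDC00);
        i += 2;                             /* consume the low surrogate */
      }else{
        cp = 0xFFFD;                        /* unpaired high surrogate */
      }
    }else if( cp>=0xDC00 && cp<=0xDFFF ){
      cp = 0xFFFD;                          /* stray low surrogate */
    }
    k += put_utf8(out+k, cp);
  }
  out[k] = 0;
  *pnOut = k;
  return out;
}
```

Fed the bytes from Kevin's hex dump (ff fe 53 00 45 00 54 00 20 00 41 00),
this should produce the five UTF-8 bytes "SET A", which contain no 00 bytes
for the binary check to trip over. The caller is expected to free() the
returned buffer.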

-- 
Martin G.