Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-15 Thread Scott Robison
On Fri, Sep 14, 2012 at 11:34 PM, Csaba Kos csaba@gmail.com wrote: I am a fossil novice myself, but I don't think there is such functionality built-in currently. I was talking about tagging encoding as well as end of line handling, but mainly I was giving myself an out in case I was

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-15 Thread Michal Suchanek
On 15 September 2012 04:20, Csaba Kos csaba@gmail.com wrote: I think now would be a good time to discuss the possibility of a more generic text conversion framework, i.e. not only UTF16 to UTF8 but also SHIFT-JIS to UTF8, and so on. Also CR+NL to NL conversion could be handled by such

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-15 Thread Csaba Kos
On Sat, Sep 15, 2012 at 6:00 PM, Michal Suchanek hramr...@gmail.com wrote: On 15 September 2012 04:20, Csaba Kos csaba@gmail.com wrote: I think now would be a good time to discuss the possibility of a more generic text conversion framework, i.e. not only UTF16 to UTF8 but also SHIFT-JIS

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-14 Thread David Given
Richard Hipp wrote: You assume correctly. Good to know --- just checking!

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-14 Thread Richard Hipp
On Fri, Sep 14, 2012 at 1:29 AM, Scott Robison sc...@scottrobison.us wrote: So I've spent some time writing a small and I think portable routine to detect if a buffer is a valid UTF-16 (either little or big endian). It rejects buffers if they contain an odd number of bytes or contain any of

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-14 Thread Csaba Kos
On Thu, Sep 13, 2012 at 2:08 PM, Richard Hipp wrote: It would be an interesting project to enhance Fossil so that it could support UTF16 in addition to UTF8. What would be needed is an algorithm to detect when a file was UTF16. (The BOM at the beginning of Kevin's example ought to be a big

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-14 Thread Scott Robison
On Fri, Sep 14, 2012 at 5:46 AM, Richard Hipp d...@sqlite.org wrote: Detection of embedded non-printing characters, especially U+, would be nice. Should we insist on a BOM at the beginning of the file? I don't think a BOM should be mandatory, as it is not required by Unicode. Another

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-14 Thread Scott Robison
On Fri, Sep 14, 2012 at 8:20 PM, Csaba Kos csaba@gmail.com wrote: I think now would be a good time to discuss the possibility of a more generic text conversion framework, i.e. not only UTF16 to UTF8 but also SHIFT-JIS to UTF8, and so on. Also CR+NL to NL conversion could be handled by

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-14 Thread Csaba Kos
On Sat, Sep 15, 2012 at 1:21 PM, Scott Robison sc...@scottrobison.us wrote: One thing I thought of yesterday but dismissed (and am now rethinking as a result of your email) is maybe there should be a bit of meta-data that can be attached to files to explicitly set their encoding. Having built

[fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-13 Thread Richard Hipp
On Thu, Sep 13, 2012 at 3:45 PM, Richard Hipp d...@sqlite.org wrote: On Thu, Sep 13, 2012 at 3:43 PM, Kevin Greiner grein...@gmail.com wrote: I'm using fossil 1.23 on Windows 7. I'm attempting to store text files generated by Microsoft SQL Server 2012 in fossil so I can easily track their

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-13 Thread Scott Robison
I'd like to assist with that contribution. Assuming someone hasn't already done it by the time I click send. :) SDR On Thu, Sep 13, 2012 at 2:08 PM, Richard Hipp d...@sqlite.org wrote: On Thu, Sep 13, 2012 at 3:45 PM, Richard Hipp d...@sqlite.org wrote: On Thu, Sep 13, 2012 at 3:43 PM,

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-13 Thread Martin Gagnon
On Thu, Sep 13, 2012 at 4:08 PM, Richard Hipp d...@sqlite.org wrote: On Thu, Sep 13, 2012 at 3:45 PM, Richard Hipp d...@sqlite.org wrote: On Thu, Sep 13, 2012 at 3:43 PM, Kevin Greiner grein...@gmail.com wrote: I'm using fossil 1.23 on Windows 7. I'm attempting to store text files

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-13 Thread David Given
On 13/09/12 21:08, Richard Hipp wrote: [...] Basically, we need a routine that converts an in-memory buffer from UTF16 to UTF8, and leaves anything that isn't UTF16 unchanged. Then we need to call that routine in a few strategic places inside of Fossil. Could you clarify what you mean by

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-13 Thread Scott Robison
I assumed (dangerous though it may be) that "leaves anything that isn't UTF-16 unchanged" meant "don't convert any buffer to UTF-8 if the origination buffer is not UTF-16." SDR On Thu, Sep 13, 2012 at 5:04 PM, David Given d...@cowlark.com wrote: On 13/09/12 21:08, Richard Hipp wrote: [...]

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-13 Thread Richard Hipp
You assume correctly. The use of iconv won't do, though, since everything also needs to work on Unix. There are small, portable conversion routines in SQLite that you can copy. D. Richard Hipp - d...@sqlite.org Sent from phone - pardon brevity On Sep 13, 2012 7:44 PM, Scott Robison
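
The behaviour agreed on in this exchange (convert a buffer only when it really is UTF-16, otherwise hand it back untouched) might look roughly like the sketch below. It is illustrative only and is not Fossil or SQLite code: the function name `utf16_to_utf8_or_unchanged` is hypothetical, it is written in Python rather than C for brevity, and it assumes a BOM is required to recognise UTF-16, which is a simplification the thread itself leaves open.

```python
import codecs


def utf16_to_utf8_or_unchanged(buf: bytes) -> bytes:
    """Return buf re-encoded as UTF-8 if it is BOM-prefixed, well-formed
    UTF-16; otherwise return buf unchanged.

    Assumption (not from the thread): a BOM is required for detection.
    """
    if buf.startswith(codecs.BOM_UTF16_LE):
        encoding = "utf-16-le"
    elif buf.startswith(codecs.BOM_UTF16_BE):
        encoding = "utf-16-be"
    else:
        # No BOM: treat as "not UTF-16" and leave the bytes alone.
        return buf
    try:
        # Strict decoding rejects odd-length input and unpaired surrogates.
        text = buf[2:].decode(encoding)
    except UnicodeDecodeError:
        # Looked like UTF-16 but is malformed: leave the bytes alone.
        return buf
    return text.encode("utf-8")
```

For example, a little-endian buffer holding `SELECT 1;` preceded by a BOM comes back as the plain UTF-8 bytes, while an ASCII buffer or a malformed one is returned byte for byte.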

Re: [fossil-users] Fossil enhancement idea. Was: trouble handling text files from SQL Server 2012

2012-09-13 Thread Scott Robison
So I've spent some time writing a small and I think portable routine to detect if a buffer is a valid UTF-16 (either little or big endian). It rejects buffers if they contain an odd number of bytes or contain any of the 66 non-character code-points or have invalid surrogate usage. While this seems
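
The three rejection rules described in this message (odd byte count, the 66 noncharacter code points, invalid surrogate usage) can be sketched as follows. This is not the routine from the post, which is not shown in the archive; the names `is_noncharacter` and `is_valid_utf16` are hypothetical, and Python is used here only to keep the sketch short.

```python
import struct


def is_noncharacter(cp: int) -> bool:
    # The 66 noncharacters: U+FDD0..U+FDEF (32 code points) plus the
    # last two code points of each of the 17 planes (34 code points).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_valid_utf16(buf: bytes, big_endian: bool = False) -> bool:
    """Check buf against the three rules for one byte order."""
    if len(buf) % 2 != 0:
        return False  # rule 1: odd number of bytes
    fmt = (">" if big_endian else "<") + "%dH" % (len(buf) // 2)
    units = struct.unpack(fmt, buf)
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate must be followed by a low surrogate.
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                return False
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        elif 0xDC00 <= u <= 0xDFFF:
            return False  # low surrogate with no preceding high surrogate
        else:
            cp = u
            i += 1
        if is_noncharacter(cp):
            return False  # rule 2: noncharacter code point
    return True
```

One side effect of the noncharacter rule: a BOM (U+FEFF) read in the wrong byte order appears as U+FFFE, which is a noncharacter, so a BOM-prefixed buffer is rejected when checked with the wrong endianness.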