On Tue, Jun 27, 2017 at 4:18 AM, Richard Hipp wrote:
> The CSV import feature of the SQLite command-line shell expects to
> find UTF-8. It does not understand other encodings, and I have no
> plans to add converters for alternative encodings any time soon.
>
> The latest version
Thank you.
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> on behalf of
Richard Hipp <d...@sqlite.org>
Sent: Tuesday, June 27, 2017 5:18:51 AM
To: SQLite mailing list
Subject: Re: [sqlite] UTF8-BOM not disregarded in CSV import
The CSV import feature of the SQLite command-line shell expects to
find UTF-8. It does not understand other encodings, and I have no
plans to add converters for alternative encodings any time soon.
The latest version of trunk skips over a UTF-8 BOM at the beginning of
the input file.
--
D.
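
For readers who want to see what "skips over a UTF-8 BOM at the beginning of the input file" amounts to, here is a minimal sketch in Python. It is not the actual change made on trunk (the shell is written in C), and the function name `skip_utf8_bom` is hypothetical; it only illustrates dropping the three bytes EF BB BF when they are the very first bytes of the input.

```python
import codecs


def skip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present.

    Hypothetical helper for illustration; only a BOM at offset 0 is removed.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


print(skip_utf8_bom(b"\xef\xbb\xbfid,name\n"))
```

Only the first occurrence at the start is removed, so a U+FEFF that appears later in the data is left alone.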
Hello,
On 2017-06-26 17:26, Scott Robison wrote:
+1
FAQ quote:
Q: When a BOM is used, is it only in 16-bit Unicode text?
A: No, a BOM can be used as a signature no matter how the Unicode
text is transformed: UTF-16, UTF-8, or UTF-32.
Q: How I should deal with BOMs?
A: Here are some
On 2017-06-26 15:01, jose isaias cabrera wrote:
I have made a decision to always include the BOM in all my text files
whether they are UTF8, UTF16 or UTF32 little or big endian. I think
all of us should also.
I'm sorry if I introduced ambiguity, but I had described SQLite's and
SQLite
On Jun 26, 2017 9:02 AM, "Simon Slavin" wrote:
There is no convention for "This software understands both UTF-16BE and
UTF-16LE but nothing else.". If it handles any BOMs, it should handle all
five. However, it can handle them by identifying, for example, UTF-32BE
and
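
As a rough illustration of recognising all five BOMs, the following Python sketch checks the start of a byte string against the signatures for UTF-8, UTF-16BE, UTF-16LE, UTF-32BE and UTF-32LE. The function name `detect_bom` and the returned encoding labels are assumptions made for this example; what a program does after identification (convert, or reject with an error) is left to the caller.

```python
import codecs

# UTF-32 is tested before UTF-16 because the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found.

    Hypothetical helper for illustration only.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


print(detect_bom(b"\xff\xfea\x00"))
```

Note that FF FE 00 00 is inherently ambiguous: it could also be a UTF-16LE BOM followed by U+0000. This sketch assumes the UTF-32LE reading.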
I didn’t mean to imply you had to scan the whole content for a BOM, but rather
for illegal characters in the absence of a BOM.
On 6/26/17, 10:02 AM, "sqlite-users on behalf of Simon Slavin"
wrote:
Folks, I’m
On Jun 26, 2017 4:05 AM, "Rowan Worth" wrote:
On 26 June 2017 at 16:55, Scott Robison wrote:
> Byte Order Mark isn't perfectly descriptive when used with UTF-8. Neither
> is dialing a cell phone. Language evolves.
>
It's not descriptive in the
Folks, I’m sorry to interrupt but I’ve just woken up to 11 posts in this thread
and I see a lot of inaccurate 'facts' posted here. Rather than pick up on
statements in individual posts (which would unfairly pick on some people as
being less accurate than others) I’d like to post facts straight
Just occurred to me: another problem with the BOM is that some people who are
*not* writing UTF-8 are cargo-culting the BOM in anyway. So you may have to
scan the whole file to see if it’s really UTF-8 anyway.
You’re better off just assuming UTF-8 everywhere, generating an error (and
backing
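
A small Python sketch of the "assume UTF-8 and generate an error" approach follows. The function name `decode_utf8_or_fail` is hypothetical and the error wording is an assumption; the point is only that a strict decode has to look at every byte, so a leading BOM does not save the reader from validating the rest of the file.

```python
def decode_utf8_or_fail(data: bytes) -> str:
    """Decode data as strict UTF-8, raising ValueError on the first bad byte.

    Hypothetical helper for illustration only.
    """
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"input is not valid UTF-8 at byte offset {exc.start}"
        ) from exc


try:
    # A UTF-8 BOM followed by Latin-1 text: the BOM is present but misleading.
    decode_utf8_or_fail(b"\xef\xbb\xbfcaf\xe9")
except ValueError as err:
    print(err)
```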
At the bottom...
-----Original Message-----
From: Eric Grange
Sent: Monday, June 26, 2017 3:09 AM
To: SQLite mailing list
Subject: Re: [sqlite] UTF8-BOM not disregarded in CSV import
Alas, there is no end in sight to the pain for the Unicode decision to not
make the BOM compulsory for UTF-8
On 6/26/17, 2:09 AM, "sqlite-users on behalf of Eric Grange"
wrote:
> Alas, there is no end in sight to the pain for the Unicode decision to not
> make the BOM compulsory for UTF-8.
It’s not actually providing any
On 6/26/17 3:09 AM, Eric Grange wrote:
Alas, there is no end in sight to the pain for the Unicode decision to not
make the BOM compulsory for UTF-8.
Making it optional or non-necessary basically made every single text file
ambiguous, with non-trivial heuristics and implicit conventions required
>Easily solved by never including a superfluous BOM in UTF-8 text
And that easy option has worked beautifully for 20 years... not.
Yes, BOM is a misnomer, yes it "wastes" 3 bytes, but in the real world
"text files" have a variety of encodings.
No BOM = you have to fire a whole suite of
On 26 June 2017 at 16:55, Scott Robison wrote:
> Byte Order Mark isn't perfectly descriptive when used with UTF-8. Neither
> is dialing a cell phone. Language evolves.
>
It's not descriptive in the slightest because UTF-8's byte order is
*specified by the encoding*.
On Jun 25, 2017 1:16 PM, "Cezary H. Noweta" wrote:
Certainly, there are no objections to extend an import's functionality
in such a way that it ignores the initial 0xFEFF. However, an import
should allow ZWNBSP as the first character, in its basic form, to be
conforming to
On Jun 26, 2017 1:47 AM, "Rowan Worth" wrote:
On 26 June 2017 at 15:09, Eric Grange wrote:
> Alas, there is no end in sight to the pain for the Unicode decision to not
> make the BOM compulsory for UTF-8.
>
UTF-8 is byte oriented. The very concept of byte
On 26 June 2017 at 15:09, Eric Grange wrote:
> Alas, there is no end in sight to the pain for the Unicode decision to not
> make the BOM compulsory for UTF-8.
>
UTF-8 is byte oriented. The very concept of byte order is nonsense in this
context as there is no multi-byte
On Sun, Jun 25, 2017 at 12:16 PM, Cezary H. Noweta
wrote:
> Hello,
>
>
> The standard says: ``Only UTF-16/32 (even not UTF-16/32LE/BE) encoding
> forms can contain BOM''. Let's conform to this.
>
>
I concur with that.
Since UTF-8 is only bytes; what would a BOM even change?
Alas, there is no end in sight to the pain for the Unicode decision to not
make the BOM compulsory for UTF-8.
Making it optional or non-necessary basically made every single text file
ambiguous, with non-trivial heuristics and implicit conventions required
instead, resulting in character
Hello,
On 2017-06-23 22:12, Mahmoud Al-Qudsi wrote:
I think you and I are on the same page here, Clemens? I abhor the
BOM, but the question is whether or not SQLite will cater to the fact
that the bigger names in the industry appear hell-bent on shoving it
in users’ documents by default.
” commands, perhaps leeway
can be shown in breaking with standards for the sake of compatibility and
sanity?
Mahmoud
From: Clemens Ladisch
Sent: Friday, June 23, 2017 2:25 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] UTF8-BOM not disregarded in CSV import
Mahmoud Al-Qudsi wrote
Mahmoud Al-Qudsi wrote:
> with `.import ……`, SQLite3 includes a BOM (UTF-8) as part of the first
> column of the first record.
The Unicode Standard 9.0 says in section 3.10:
| When represented in UTF-8, the byte order mark turns into the byte
| sequence <EF BB BF>. Its usage at the beginning of a UTF-8
Hello all,
Let me start off with my apologies if this is a documented issue; I did search
the fossil tickets but did not find anything for “BOM”.
As of SQLite 3.19.3, under `.mode csv` and with `.import ……`, SQLite3 includes
a BOM (UTF-8) as part of the first column of the first record.
IMHO,
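
The symptom described here can be reproduced by analogy outside SQLite. The sketch below uses Python's `csv` module, not the SQLite shell, and the sample data and the helper name `first_header` are made up for illustration: decoding as plain "utf-8" leaves U+FEFF glued to the first field, while Python's "utf-8-sig" codec drops it.

```python
import csv
import io

# Made-up sample: a UTF-8 BOM followed by a two-line CSV file.
raw = b"\xef\xbb\xbfid,name\n1,alice\n"


def first_header(data: bytes, encoding: str) -> str:
    """Decode data with the given codec and return the first CSV field."""
    text = data.decode(encoding)
    rows = list(csv.reader(io.StringIO(text, newline="")))
    return rows[0][0]


print(repr(first_header(raw, "utf-8")))      # BOM kept as part of the field
print(repr(first_header(raw, "utf-8-sig")))  # BOM removed by the codec
```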