Re: [sqlite] [OT] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Peter da Silva
I always saw byte as something that was relevant for systems that could address 
objects smaller than words... “byte addressed” machines. The term was mnemonic 
for something bigger than a bit and smaller than a word. It was usually 8 bits 
=but there were 36-bit machines that were byte addressable 9 bits at a time. 
The DECsystem 10 guys also referred to the other subdivisions of their 36 bit 
words as bytes, sometimes, they could be 6, 7, 8, or 9 bits long. I think they 
had special instructions for operating on them, but they weren’t directly 
addressable.

There was also a “nibble”, smaller than a “byte”, which was always 4 bits (one 
hex digit). I don’t think any of the octal people used the word for their three 
bit digits.
 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [OT] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Simon Slavin


On 29 Jun 2017, at 8:01pm, Warren Young  wrote:

> We wouldn’t have needed the term “octet” if “byte” always meant “8 bits”.

The terms "octet" and "decade" were prevalent across Europe in the 1970s.  
Certainly we used them both when I was first learning about computers.  My 
impression back then was that the term "byte" was the American word for "octet".

The web in general seems to agree with you, not me.  It seems that a word was 
made of bytes, and bytes were made of nybbles, and nybbles were made of bits, 
and that how many of which went into what depended on which platform you were 
talking about.  This contradicts my computing teacher who taught me the "by 
eight" definition of "byte".

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [OT] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Warren Young
On Jun 29, 2017, at 11:18 AM, Simon Slavin  wrote:
> 
> On 29 Jun 2017, at 5:39pm, Warren Young  wrote:
> 
>> Before roughly the mid 1970s, the size of a byte was whatever the computer 
>> or communications system designer said it was.
> 
> You mean that size of a word.

That, too.  Again I give the example of a 12-bit PDP-8 storing 6-bit packed 
ASCII text.  The word size is 12, and the byte size is 6.

The same machine could instead store 7-bit ASCII from the ASR-33 in its 12-bit 
words, and we could then speak of 7-bit bytes and 12-bit words.  This, too, was 
a thing in the PDP-8 world, though rarer, since the core memory field size was 
4k words, and the base machine config only had the one field, so 5 wasted bits 
per character was a painful hit.

> The word "byte" means "by eight”.

I failed to find that in an English corpus search.[1]  A search for “by eight” 
turns up hundreds of results (apparently limited to 600 by the search engine) 
but none of the matches is near “byte.”  A search for “by-eight” turns up only 
one result, also irrelevant.

I suspect the earliest print reference to that definition would be much later 
than the actual coinage of the word in 1956 by Werner Buchholz, making it a 
back-formation.  I’d expect to find that definition in print only after the 
microcomputer revolution that nailed the 8-bit byte into place.

Further counter-citations:

   https://stackoverflow.com/questions/13615764/
   https://en.wikipedia.org/wiki/Byte#History
   https://en.wikipedia.org/wiki/Talk:Byte#Byte_.3D_By-Eight.3F
   https://english.stackexchange.com/questions/121127/etymology-of-byte

I wish I could find a copy of 

   Buchholz, W., January 1981:
   "Origin of the Word 'Byte.'" 
   IEEE Annals of the History of Computing, 3, 1: p. 72 

that is not behind a paywall, as Buchholz is the man who coined the word for 
the IBM 7030 “Stretch,” which had a variable byte size.  It used 8-bit bytes 
for I/O, but it had variable-width bytes internally.

We wouldn’t have needed the term “octet” if “byte” always meant “8 bits”.


[1]: http://corpus.byu.edu/coca/

> With each bit of storage costing around 100,000 times what they do now

A bit of trivia I dropped during editing from the prior post: a 5 MB RK05 disk 
drive cost about the same as a luxury car.  (About US $40,000 today after CPI 
adjustment.)

Cadillac with all the options or RK05?  Let me think…RK05!
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [OT] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread John McKown
On Thu, Jun 29, 2017 at 12:18 PM, Simon Slavin  wrote:

> A couple of minor comments.
>
> On 29 Jun 2017, at 5:39pm, Warren Young  wrote:
>
> > Before roughly the mid 1970s, the size of a byte was whatever the
> computer or communications system designer said it was.
>
> You mean that size of a word.  The word "byte" means "by eight".  It did
> not always mean 7 bits of data and one parity bit, but it was always 8 bits
> in total.
>
> > A common example would be a Teletype Model 33 ASR hardwired by DEC for
> transmitting 7-bit ASCII on 8-bit wide paper tapes with mark parity
>
> Thank you for mentioning that.  First computer terminal I ever used.  I
> think I still have some of the paper tape somewhere.
>
> > The 8-bit byte standard — and its even multiples — is relatively recent
> in computing history.  You can point to early examples like the 32-bit IBM
> 360 and later ones like the 16-bit Data General Nova and DEC PDP-11, but I
> believe it was the flood of 8-bit microcomputers in the mid to late 1970s
> that finally and firmly associated “byte” with “8 bits”.
>
> Again, the word you want is "word".  There were architectures with all
> sorts of weird word sizes.  "byte" always meant "by eight" and was a
> synonym for "octet".
>
> As Warren wrote, words did not always encode text as 8 bits per
> character.  Computers with 16-bit word sizes might encode ASCII as three
> 5-bit characters plus a parity bit, or use two 16-bit words for five 6-bit
> characters plus 2 meta-bits.  With each bit of storage costing around
> 100,000 times what they do now, and taking 10,000 times the time to move
> across your communications network, there was a wide variety of ingenious
> ways to save a bit here and a bit there.
>
> Simon.
>
>
​In today's world, you are completely correct. However, according to
Wikipedia (https://en.wikipedia.org/wiki/Byte_addressing), there was at
least one machine (Honeywell) which had a 36 bit word which was divided
into 9 bit "bytes" (i.e. an address pointed to a 9 bit "byte").​


-- 
Veni, Vidi, VISA: I came, I saw, I did a little shopping.

Maranatha! <><
John McKown
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [OT] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Simon Slavin


On 29 Jun 2017, at 6:18pm, Simon Slavin  wrote:

> Computers with 16-bit word sizes might encode ASCII as three 5-bit characters

Where I wrote "ASCII" I should have written "text".

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [OT] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Simon Slavin
A couple of minor comments.

On 29 Jun 2017, at 5:39pm, Warren Young  wrote:

> Before roughly the mid 1970s, the size of a byte was whatever the computer or 
> communications system designer said it was.

You mean that size of a word.  The word "byte" means "by eight".  It did not 
always mean 7 bits of data and one parity bit, but it was always 8 bits in 
total.

> A common example would be a Teletype Model 33 ASR hardwired by DEC for 
> transmitting 7-bit ASCII on 8-bit wide paper tapes with mark parity

Thank you for mentioning that.  First computer terminal I ever used.  I think I 
still have some of the paper tape somewhere.

> The 8-bit byte standard — and its even multiples — is relatively recent in 
> computing history.  You can point to early examples like the 32-bit IBM 360 
> and later ones like the 16-bit Data General Nova and DEC PDP-11, but I 
> believe it was the flood of 8-bit microcomputers in the mid to late 1970s 
> that finally and firmly associated “byte” with “8 bits”.

Again, the word you want is "word".  There were architectures with all sorts of 
weird word sizes.  "byte" always meant "by eight" and was a synonym for "octet".

As Warren wrote, words did not always encode text as 8 bits per character.  
Computers with 16-bit word sizes might encode ASCII as three 5-bit characters 
plus a parity bit, or use two 16-bit words for five 6-bit characters plus 2 
meta-bits.  With each bit of storage costing around 100,000 times what they do 
now, and taking 10,000 times the time to move across your communications 
network, there was a wide variety of ingenious ways to save a bit here and a 
bit there.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Jens Alfke

> On Jun 29, 2017, at 12:13 AM, Hick Gunter  wrote:
> 
> Double quotes is specifically for building identifiers that "look strange" 
> (i.e. embedded spaces, keywords, ...) which IMHO should be avoided because it 
> tends to clutter up the statement.

I agree that if you’re generating the schema by hand you should avoid creating 
names that require quoting.

However, if tables/columns/indexes are being generated dynamically, it can be 
very convenient to name them based on the external item that uses them, and 
that name might involve “strange” characters.

In my case, I create indexes on the fly based on JSON paths like 
“address[0].zipcode”, so I use the path as part of the index name. This is 
simpler than escaping the punctuation, using a digest of the path, or making up 
an identifier that then has to be stored somewhere else. It also makes the 
schema a lot easier to understand when looking at generated statements or 
poking around in the sqlite3 tool.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [OT] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Warren Young
On Jun 27, 2017, at 3:02 PM, Keith Medcalf  wrote:
> 
>> The whole point of
>> specifying a format as 7 bits is that the 8th bit is ignored, or
>> perhaps used in an implementation-defined manner, regardless of whether
>> the 8th bit in a char is available or not.
> 
> ASCII was designed back in the days of low reliability serial communications 
> -- you know, back when data was sent using 7 bit data + 1 parity bits + 2 
> stop bits -- to increase the reliability of the communications.  A "byte" was 
> also 9 bits.  8 bits of data and a parity bit.

Before roughly the mid 1970s, the size of a byte was whatever the computer or 
communications system designer said it was.  Even within a single computer + 
serial comm system, the definitions could differ.  For this reason, we also 
have the term “octet,” which unambiguously means an 8-bit unit of data.

The 9-bit byte is largely a DEC-ism, since their pre-PDP-11 machines used a 
word size that was an integer multiple of 6 or 12.  DEC had 12-bit machines, 
18-bit machines, and 36-bit machines.  There was even a plan for a 24-bit 
design at one point.

A common example would be a Teletype Model 33 ASR hardwired by DEC for 
transmitting 7-bit ASCII on 8-bit wide paper tapes with mark parity, fed by a 
12-bit PDP-8 pulling that text off an RK05 cartridge disk from a file encoded 
in a 6-bit packed ASCII format.

6-bit packed ASCII schemes were common at the time: to efficiently store plain 
text in the native 12-, 18-, or 36-bit words, programmers would drop most of 
the control characters and punctuation, as well as either dropping or 
shift-encoding lowercase.

That isn’t an innovation from the DEC world, either: Émile Baudot came up with 
basically the same idea in his eponymous 5-bit telegraph code in 1870.  You 
could well say that Baudot code uses 5-bit bytes.  (This is also where the data 
communications unit “baud” comes from.)

The 8-bit byte standard — and its even multiples — is relatively recent in 
computing history.  You can point to early examples like the 32-bit IBM 360 and 
later ones like the 16-bit Data General Nova and DEC PDP-11, but I believe it 
was the flood of 8-bit microcomputers in the mid to late 1970s that finally and 
firmly associated “byte” with “8 bits”.

> Nowadays we use 8 bits for data with no parity

True parity bits (as opposed to mark/space parity) can only detect a 1-bit 
error.  We dropped parity checks when the data rates rose and SNR levels fell 
to the point that single-bit errors were a frequent occurrence, making parity 
checks practically useless.

> no error correction

The wonder, to my mind, is that it’s still an argument whether to use ECC RAM 
in any but the lowest-end machines.  You should have the option to put ECC RAM 
into any machine down to about the $500 level by simply paying a ~25% premium 
on the option cost for non-ECC RAM, but artificial market segmentation has kept 
ECC a feature of the server and high-end PC worlds only.

This sort of penny-pinching should have gone out of style in the 1990s, for the 
same reason Ethernet and USB use smarter error correction than did RS-232.

We should have flowed from parity RAM at the high end to ECC RAM at the high 
end to ECC everywhere by now.

> and no timing bits.

Timing bits aren’t needed when you have clock recovery hardware, which like 
ECC, is a superior technology that should be universal once transistors become 
sufficiently cheap.

Clock recovery becomes necessary once SNR levels get to the point they are now, 
where separate clock lines don’t really help any more.  You’d have to apply 
clock recovery type techniques to the clock line if you had it, so you might as 
well apply it to the data and leave the clock line out.

> Cuz when things screw up we want them to REALLY screw up ... and remain 
> undetectable.

Thus the move toward strongly checksummed filesystems like ZFS, btrfs, HAMMER, 
APFS, and ReFS.

Like ECC, this is a battle that should be over by now, but we’re going to see 
HFS+, NTFS, and extfs hang on for a long time yet because $REASONS.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Tim Streater
On 29 Jun 2017 at 08:01, Eric Grange  wrote:

>> The sender, however, could be lying, and this needs to be considered
>
> This is an orthogonal problem: if the sender is sending you data that is
> not what it should be, then he could just as well be sending you
> well-encoded and well-formed but invalid data, or malware, or
> confidential/personal data you are not legally allowed to store, or, or,
> or... the list never ends.
>
> And generally speaking, if your code tries too hard to find a possible
> interpretation for invalid of malformed input, then you are far more likely
> to just end up with processed garbage, which will make it even harder to
> figure out down the road where the garbage in your database originated from
> (incorrect input? bug in the heuristics? etc.)

It will end up in the user's database. No heuristics are involved; I can do no 
more than believe what the sender tells me. The IDE I am using does at lest 
allow, in its base64-decode, that I request lossy conversion in the case of bad 
input.

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Cannot delete a Database file?

2017-06-29 Thread Warren Young
On Jun 28, 2017, at 8:53 AM, R Smith  wrote:
> 
> 
> On 2017/06/28 4:33 PM, Simon Slavin wrote:
>> That’s two wrong things I’ve posted today. Must try harder. Sorry about 
>> that. 
> 
> I don't think that's a problem…

It’s a well-known phenomenon, called by some Cunningham’s Law: “The best way to 
get the right answer on the internet is not to ask a question; it's to post the 
wrong answer.”

Also: https://www.xkcd.com/386/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] User Authentication Extension is broken in 3.19 branch

2017-06-29 Thread Warren Young
On Jun 28, 2017, at 1:18 AM, Thien, Christoph  
wrote:
> 
> The last release with working user authentication is 3.15.2.
> Release 3.16.0 breaks user authentication.
> 
> There had been a lot of changes between those releases.

The version control system that SQLite is hosted on (Fossil) has a very useful 
feature for situations like this called “bisect.”  To start using it, create a 
Fossil checkout of SQLite,[1] then from within that checkout directory say:

$ fossil update version-3.15.2
$ fossil bisect reset  # not necessary on first run-thru
$ fossil bisect good
$ fossil update version-3.16.0
$ fossil bisect bad

Fossil will then check out a version roughly halfway between those two.  Build 
that, try it, and then say “bisect bad” or “good” depending on whether your 
auth feature works or not.  In a few minutes, you will find the specific 
checkin that broke this feature.  Inspect the diffs for that checkin to see why 
it broke and what the fix is.

The fix will probably be obvious from the diffs.  The only trick is finding the 
particular checkin to examine, which is what bisect does for you.

[1]: https://www.sqlite.org/getthecode.html#clone

> At the moment we do not have a patch.

Given that this is an unsupported feature, I think it’s up to you to provide 
that patch.

> Would be great if some developer with knowledge about internals could fix it.

The SQLite core developers aren’t likely to do that, because “unsupported” 
means they don’t mess with that code any more.

Anyone else would have to have a reason to fix it, and given that it’s been 
broken for 3 major versions, I think that’s a fair indicator that very few 
people care about this feature working.

You’re one of the few people in the world with the motive to fix it.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Peter da Silva
On 6/29/17, 5:20 AM, "sqlite-users on behalf of R Smith" 
 
wrote:
> SQLite isn't helping the confusion in this case, because it allows 
> double-quotes to be regarded as string values IF an identifier with that name 
> doesn't exist. This is of course all good and well until you misspell a 
> column name...

Shades of REXX.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Peter da Silva
On 6/29/17, 1:22 AM, "sqlite-users on behalf of Robert M. Münch" 
 wrote:
> Hi, sorry, should have mentioned that this doesn't work in my case, because 
> we are building the column placeholders dynamically. So, we would have to 
> handle putting the necessary column names in there all the time, which is not 
> feasible.

I have been generating SQL dynamically on a number of projects over the past 
10+ years, and have found that generating INSERT with column names in is (a) 
not really that much extra work, and (b) eliminates a whole class of bugs 
involving schema changes or even schema regeneration. It’s genuinely worth 
taking the time to do it right.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread R Smith


On 2017/06/29 8:15 AM, Robert M. Münch wrote:

On 27 Jun 2017, at 22:11, David Raymond wrote:


Single quotes should be used for strings, so DEFAULT '-'

I thought it doesn't matter if I use " or ' for strings. What's the difference?


I had this misconception at some point too.  Double quotes are for 
specifying Identifiers (The names of stuff) in SQLite so as to not 
confuse a possible column name (aka identifier) with say an internal 
keyword or a string value.


SQLite isn't helping the confusion in this case, because it allows 
double-quotes to be regarded as string values IF an identifier with that 
name doesn't exist. This is of course all good and well until you 
misspell a column name...


To demonstrate the difference and possible pitfalls:

  -- SQLite version 3.17.0  [ Release: 2017-02-13 ]  on SQLitespeed 
version 2.0.2.4.
  -- 



CREATE TABLE QTest(
  "ID" INT,
  Value1 TEXT,
  Value2 TEXT
);

INSERT INTO QTest(ID, Value1) VALUES
 (1, 'Gorilla Conflict')
,(2, 'Moroccan Coffee Beans')
,(3, "Monaco Raceway")
;
-- All these were regarded as strings.

UPDATE QTest SET Value2 = 'Value1' || ' in the Jungle.';

SELECT ID, Value2 FROM QTest;

  --  ID  | Value2
  --  | -
  --   1  | Value1 in the Jungle.
  --   2  | Value1 in the Jungle.
  --   3  | Value1 in the Jungle.

-- Here we expected that outcome because using single quotes means
-- that can be nothing other than the string 'Value1'


UPDATE QTest SET Value2 = "Value1" || " in the Jungle.";

SELECT ID, Value2 FROM QTest;

  --  ID  | Value2
  --  | 
  --   1  | Gorilla Conflict in the Jungle.
  --   2  | Moroccan Coffee Beans in the Jungle.
  --   3  | Monaco Raceway in the Jungle.

-- Here "value1" is correctly regarded as an identifier but
-- " in the Jungle" is regarded as a string, even though it's
-- in double-quotes. This time the weirdness helped us


UPDATE QTest SET Value2 = "Valeu1" || " in the Jungle.";

SELECT ID, Value2 FROM QTest;

  --  ID  | Value2
  --  | -
  --   1  | Valeu1 in the Jungle.
  --   2  | Valeu1 in the Jungle.
  --   3  | Valeu1 in the Jungle.

-- Here our troubles start. It's exactly the same format as before,
-- but because of the spelling mistake, our needed-to-be-an-identifier
-- value1 simply got transformed to a string, no error.
-- (In fact, this entire script runs without errors, you can copy-paste
-- it into your own DB script mechanism)


UPDATE QTest SET Value2 = Value1 || ' in the Jungle.';

SELECT ID, Value2 FROM QTest;

  --  ID  | Value2
  --  | 
  --   1  | Gorilla Conflict in the Jungle.
  --   2  | Moroccan Coffee Beans in the Jungle.
  --   3  | Monaco Raceway in the Jungle.

-- This is the most correct way to do it. The same spelling mistake
-- here would error out.

-- The only time it is needed (or really a good idea) to use double
-- quotes, is when an identifier name either contains weird
-- characters or the identifier is the same as an internal Keyword.
-- These are not valid identifiers:  *1 A~;  ,  JOIN
-- But these definitely are valid:  "*1 A~;" , "JOIN"
-- In SQLite, even this is valid: CREATE TABLE " "(" " INT);
-- Yes - that is a table with the name SPACE and a column named SPACE.



DROP TABLE QTest;-- Cleanup

  --   Script Stats: Total Script Execution Time: 0d 00h 00m and 
00.030s
  -- Total Script Query Time: 0d 00h 00m and 
00.001s

  -- Total Database Rows Changed: 15
  -- Total Virtual-Machine Steps: 289
  -- Last executed Item Index:11
  -- Last Script Error:
  -- 




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Hick Gunter
Double quotes is specifically for building identifiers that "look strange" 
(i.e. embedded spaces, keywords, ...) which IMHO should be avoided because it 
tends to clutter up the statement.

Single quotes is for building strings.

Integer is a keyword, "integer" is an identifier and 'integer' a string.

asql> create temp table test (id integer primary key, "integer" integer default 
'integer');
asql> .desc test
+-++++
|   Name  |Datatype|Size| Hidden |
+-++++
| id  | integer|UNKNOWN ||
| integer | integer|UNKNOWN ||
+-++++
Field count: 2
asql> insert into test (id) values (1);
rows inserted
-
1
asql> select * from test;
id  integer
--  --
1   integer

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Robert M. Münch
Gesendet: Donnerstag, 29. Juni 2017 08:16
An: SQLite mailing list 
Betreff: Re: [sqlite] INSERT ... VALUES / want to "skip" default values

On 27 Jun 2017, at 22:11, David Raymond wrote:

> Single quotes should be used for strings, so DEFAULT '-'

I thought it doesn't matter if I use " or ' for strings. What's the difference?


> So there is no method to do something like...
>
> INSERT INTO test VALUES ('field a', DEFAULT, 'field c', 'field d');

That's what I want to do.

> PS: Simon: Specifying NULL will just put a NULL value in there, it won't use 
> the default.

I tried NULL and as you said, that doesn't work because NULL is put in.

--

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Eric Grange
> The sender, however, could be lying, and this needs to be considered

This is an orthogonal problem: if the sender is sending you data that is
not what it should be, then he could just as well be sending you
well-encoded and well-formed but invalid data, or malware, or
confidential/personal data you are not legally allowed to store, or, or,
or... the list never ends.

And generally speaking, if your code tries too hard to find a possible
interpretation for invalid of malformed input, then you are far more likely
to just end up with processed garbage, which will make it even harder to
figure out down the road where the garbage in your database originated from
(incorrect input? bug in the heuristics? etc.)




On Wed, Jun 28, 2017 at 10:40 PM, Tim Streater  wrote:

> On 28 Jun 2017 at 14:20, Rowan Worth  wrote:
>
> > On 27 June 2017 at 18:42, Eric Grange  wrote:
> >
> >> So while in theory all the scenarios you describe are interesting, in
> >> practice seeing an utf-8 BOM provides an extremely
> >> high likeliness that a file will indeed be utf-8. Not always, but a
> memory
> >> chip could also be hit by a cosmic ray.
> >>
> >> Conversely the absence of an utf-8 BOM means a high probability of
> >> "something undetermined": ANSI or BOMless utf-8,
> >> or something more oddball (in which I lump utf-16 btw)... and the need
> for
> >> heuristics to kick in.
> >>
> >
> > I think we are largely in agreement here (esp. wrt utf-16 being an
> oddball
> > interchange format).
> >
> > It doesn't answer my question though, ie. what advantage the BOM tag
> > provides compared to assuming utf-8 from the outset. Yes if you see a
> utf-8
> > BOM you have immediate confidence that the data is utf-8 encoded, but
> what
> > have you lost if you start with [fake] confidence and treat the data as
> > utf-8 until proven otherwise?
>
> 1) Whether the data contained in a file is to be considered UTF-8 or not
> is an item of metadata about the file. As such, it has no business being
> part of the file itself. BOMs should therefore be deprecated.
>
> 2) I may receive data as part of an email, with a header such as:
>
>Content-type: text/plain; charset="utf-8"
>Content-Transfer-Encoding:  base64
>
> then I interpret that to mean that the attendant data, after decoding from
> base64, is it to be expected to be utf-8. The sender, however, could be
> lying, and this needs to be considered. Just because a header, or file
> metadata, or indeed a BOM, says some data or other is legal utf-8, this
> does not mean that it actually is.
>
>
> --
> Cheers  --  Tim
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Clemens Ladisch
Robert M. Münch wrote:
> Is this behaviour standard or a SQLite variant?

Autoincrementing is an SQLite variant.
Default values are standard SQL.

It should be noted that standard SQL (above Entry SQL level) allows
DEFAULT in row value constructors.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Robert M. Münch
On 28 Jun 2017, at 14:51, Simon Slavin wrote:

> Really ?  In that case I withdraw my previous answer.  I thought that NULLs 
> were converted to the default value for a column (which is usually NULL but 
> can be overridden with a DEFAULT clause).

I had exactly the same understanding. BTW: Is this behaviour standard or a 
SQLite variant?

-- 

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch


signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Robert M. Münch
On 28 Jun 2017, at 9:49, Maks Verver wrote:

> I'm surprised nobody mentioned that you can specify the columns to be
> inserted in the query:
>
>   INSERT INTO test(a, c, d) VALUES (1, 2 3);
>
> (Note that `b` is missing it `a, c, d`. It will take the default value,
> which will be NULL, unless a different default was specified explicitly in
> the CREATE TABLE statement.)

Hi, sorry, should have mentioned that this doesn't work in my case, because we 
are building the column placeholders dynamically. So, we would have to handle 
putting the necessary column names in there all the time, which is not feasible.

-- 

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch


signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Robert M. Münch
On 27 Jun 2017, at 22:24, David Raymond wrote:

> If you have to provide 4 values then the way you can use null to do that is 
> to add in a trigger to set the default, since NULL _is_ a value and _is_ 
> legal for that field.

Ha, that's a very good idea. I didn't have triggers in the radar. Great, I 
think that's solving my problem. Thanks a lot!

-- 

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch


signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT ... VALUES / want to "skip" default values

2017-06-29 Thread Robert M. Münch
On 27 Jun 2017, at 22:11, David Raymond wrote:

> Single quotes should be used for strings, so DEFAULT '-'

I thought it doesn't matter if I use " or ' for strings. What's the difference?


> So there is no method to do something like...
>
> INSERT INTO test VALUES ('field a', DEFAULT, 'field c', 'field d');

That's what I want to do.

> PS: Simon: Specifying NULL will just put a NULL value in there, it won't use 
> the default.

I tried NULL and as you said, that doesn't work because NULL is put in.

-- 

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch


signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users