Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Keith Medcalf

>>> Your data is from MongoDB :)  Note they do have an extended JSON to
>>> deal with types like ObjectId, binary and dates:
>> Yes, it is. But I control the document structure and it does not have
>> any types that can't be converted easily after a pull over the net.
>
>Does that mean you can make it individual fields instead of a list?  As
>Petite Abeille pointed out you need to do SQL things the SQL way or you
>will end up in a world of hurt.

You need to use a RELATIONAL datastore in a RELATIONAL way.  This applies 
whether your access method is SQL, ISAM, HIDAM, BDAM or VSAM.  If the 
underlying datastore is not relational, you may also be able to use one (or 
more) of the mentioned access methods.  However, then you will be able to store 
non-relational data in it and manupulate it easily.  SQLite is a RELATIONAL 
datastore that uses the SQL access method.

Please do not confuse the access method and the datastore.  The datastore 
defines the properties of the filing cabinets, and the access method the 
language spoken by the filing clerk.

Any filing clerk can be taught to use many underlying storage cabinet systems, 
just as any given storage cabinet system may be accessed by any number of 
clerks speaking different languages.




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Keith Medcalf

>> In summary: the context of a GUID defines its "scope of required
>> uniqueness," and a 16-byte GUID is essentially globally unique so long
>> as
>> it has no collisions within its context(s). (i.e. who cares if SHA1s
>> collide, so long as it's not in the same repo?)
>
>You might be interested in UUIDs, which define the scope and provide
>methods for presenting hashed, sequential and random GUIDs within that
>scope:
>
>
>
>Libraries are available for many languages to generate UUIDs under many
>of these schemes.
>
>
>On 22 Apr 2014, at 6:57pm, RSmith  wrote:
>
>> Just to add  - I cannot imagine why you would have dashes to start with
>
>Take, for example, this GUID:
>
>550e8400-e29b-41d4-a716-44de5544ac00
>
>A program (or a TABLE) frequently handles many GUIDs which differ only in
>one of the fields.  For instance a program may have to represent a GUID
>externally (on the screen and output files) with the complete GUID, but
>it could store items internally (perhaps in a database) using only the
>last 12 hex digits, or as an integer up to 2^48.  Since it know that all
>people GUIDs start
>
>550e8400-e29b-41d4-a716-
>
>whereas all vehicles GUIDs start
>
>550e8404-31e6-41d4-a716-

You don't ever really need a GUID at all.  Simply use an "integer primary key" 
(an integer starting at 1) and simply pretend that it is being added to the 
applicable base GUID of your random choosing.  Everything will still be unique 
and you will have saved yourself a crap load of storage space, index space, and 
conserved countless CPU cycles so that they can be spent on something more 
productive productive.

I have never seen a need to actually use a GUID for anything, it is a 
ridiculous concept.




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 22/04/14 15:00, Neville Dastur wrote:
> On 22 Apr 2014, at 21:58, Roger Binns  wrote:
>> Your data is from MongoDB :)  Note they do have an extended JSON to
>> deal with types like ObjectId, binary and dates:
> Yes, it is. But I control the document structure and it does not have
> any types that can’t be converted easily after a pull over the net.

Does that mean you can make it individual fields instead of a list?  As
Petite Abeille pointed out you need to do SQL things the SQL way or you
will end up in a world of hurt.

>> How much data will there be overall and how performant does all this
>> have to be?
> 10,000s or records

Doesn't sound like much.

>> That still requires scanning every record as you aren't doing a match
>> from the start of the value.
> By this, you mean searching like this is “expensive”

You are searching within a column so an index can't be used - (eg looking
for || anywhere within that column).  That means every record has to
be examined.  Hence my queries about data size and performance requirements.

You will need to normalize if you want SQLite's performance and use
indexes so that columns contain exactly one value.

>> If you need something for not too much data then just be simple -
>> have a single column which is the JSON text of the whole document.
>> You can still use LIKE to find records of interest, deserialise back
>> to JSON and do the final filtering/querying/sorting there.
> 
> Not sure how this helps find say  in array2 alone. Note I have
> changed example so that there is a field value that is the same in
> array1 and 2 { "_id" : ObjectId(“xx"), "name" :
> “Description”, “array1" : [ “”,”” ], “array2" : [
> “”,””,”” ], "tags" :
> [“searchword1”,”searchword2”,”searchword3"] }

You have a SQL table containing one column  which is that entire document
as a single JSON string.  In SQL you do SELECT json FROM table WHERE json
LIKE '%%'.  This will match the above as well as any record where 
is in the _id, description etc.

Then in your app code you deserialise the json and check the object for
your exact query.  ie you use SQL to make a first pass looking for
candidates, and use your own app specific code to narrow that down.

>> If you need a "proper" and convenient solution with SQLite then user 
>> defined functions and virtual tables are a great way of hiding what
>> you do under the hood wrt to how the data is actually stored.  You'll
>> have to write them to understand your data shape and queries.
> Not really an option as I don’t have that control over the sqlite lib
> compiled into the mobile app.

Really?  It is trivial on iOS, and not too hard on Android (need to use
the NDK).

There is no easy way out of this.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlNXEIoACgkQmOOfHg372QQWxgCbBREj/ZCFTmdbFf5fLQLAqkAt
NTUAn0TfMt3T9faOVbfgfGpzIoeg9ndY
=rkUB
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] CTEs and unions

2014-04-22 Thread Andy Goth

On 4/22/2014 5:55 PM, Andy Goth wrote:

On 4/22/2014 5:16 PM, Dominique Devienne wrote:

sqlite> with cte(a) as (select 1)
...> select * from cte
...> union all
...> select * from cte;
Error: no such table: cte


All these queries work for me without error.


http://www.sqlite.org/cgi/src/info/67bfd59d9087a987

This commit fixed your problem, which was written up here:

http://www.sqlite.org/cgi/src/info/31a19d11b97088296a

Try upgrading to 3.8.4 or newer.

--
Andy Goth | 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] CTEs and unions

2014-04-22 Thread Andy Goth

On 4/22/2014 5:16 PM, Dominique Devienne wrote:

sqlite> with cte(a) as (select 1)
...> select * from cte;
a
1

sqlite> with cte(a) as (select 1)
...> select * from cte
...> union all
...> select * from cte;
Error: no such table: cte

sqlite> with cte(a) as (select 1),
...>  cpy(b) as (select a from cte)
...> select a from cte
...> union all
...> select b from cpy;
Error: no such table: cte

sqlite> with recursive
...>   cte(a) as (select 1),
...>   cpy(b) as (select a from cte)
...> select a from cte
...> union all
...> select b from cpy;
Error: no such table: cte


All these queries work for me without error.

--
Andy Goth | 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] CTEs and unions

2014-04-22 Thread Petite Abeille

On Apr 23, 2014, at 12:16 AM, Dominique Devienne  wrote:

> is there no way to reuse a CTE several times?

Hrm… of course you can… that’s the entire point of *Common* Table Expression:

with
DataSet
as
(
  select 1 as value
)
select  *
fromDataSet
union all
select  *
fromDataSet;


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Petite Abeille

On Apr 22, 2014, at 2:24 PM, Neville Dastur  wrote:

> So wondering is anyone that has done this sort of thing and worked out the 
> best way?

Yes. Normalize your data. And that’s that:

http://www.schemamania.org/sql/#lists

Quoting a few words:

"Questions are frequently asked about table designs that are hopelessly wrong. 
The solution to the question is not to write the query, but to re-write the 
table, after which the query will practically write itself.

Perhaps the most egregious example is a column whose value is a list or, in SQL 
terms, a repeating group. The elements in the list are perhaps comma-separated, 
and some poor schlep has the task of selecting or joining on the the nth 
element in the list.”

Don’t be *that* schlep. 


But, if you like pain and suffering, medieval style, I got a hack for you 
involving virtual tables, full text search, contentless tables, and shadows 
even!


It goes a bit like this:

sqlite> .head on
sqlite> select * from json_text where array1 match '';
array1|array2|tags
,|,,|searchword1,searchword2,searchword3


So, what’s json_text?

 A virtual, contentless, FTS4 table:

create virtual table json_text using fts4
(
  content='json',
  array1 text,
  array2 text,
  tags   text
);


Where does it get its content?

>From a regular table:

create table json
(
  _idtext,
  name   text,
  array1 text,
  array2 text,
  tags   text
);

insert
intojson
(
  _id,
  name,
  array1,
  array2,
  tags
)
values  (
  'xx',
  'Description',
  ',',
  ',,',
  'searchword1,searchword2,searchword3'
);

insert
intojson
(
  _id,
  name,
  array1,
  array2,
  tags
)
values  (
  'yyy',
  'Description',
  ',',
  ',,',
  'tag1,tag2,tag3'
);


See the doc for details about the full setup:

https://www.sqlite.org/fts3.html#section_6_2_1


What about the shadows you may ask?

Here you go:

create virtual table if not exists json_term using fts4aux( json_text );

sqlite> select * from json_term order by 1 limit 10;
term|col|documents|occurrences
|*|1|1
|0|1|1
|*|1|1
|0|1|1
|*|1|1
|1|1|1
|*|1|1
|1|1|1
|*|1|1
|1|1|1

All the text fields, nicely tokenized, and for your to use and abuse.

More info:

https://www.sqlite.org/fts3.html#fts4aux


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] CTEs and unions

2014-04-22 Thread Dominique Devienne
so we can't use CTEs to avoid stuttering in queries?

This is a contrived example of course, but is there no way to reuse a
CTE several times? In this case, to get two rows of one column, both
1s, without repeating the query?

https://sqlite.org/lang_with.html shows a CTE with two named queries,
the second using the first one, soI tried to "copy" my first named
query to work around the errors I was getting, but still no luck. What
am I missing? I'm having a hard time wrapping my head around CTEs...

I was trying to use a CTE to split multi-column rows into 2 rows per
initial row, first half of columns in first new row, then second half
of columns in another new row, w/o repeating the (longuish) query
yield the rows to split in the first place. Can CTEs do that?

Thanks, --DD

C:\Users\DDevienne>sqlite3
SQLite version 3.8.3.1 2014-02-11 14:52:19
Enter ".help" for instructions
Enter SQL statements terminated with a ";"

sqlite> with cte(a) as (select 1)
   ...> select * from cte;
a
1

sqlite> with cte(a) as (select 1)
   ...> select * from cte
   ...> union all
   ...> select * from cte;
Error: no such table: cte

sqlite> with cte(a) as (select 1),
   ...>  cpy(b) as (select a from cte)
   ...> select a from cte
   ...> union all
   ...> select b from cpy;
Error: no such table: cte

sqlite> with recursive
   ...>   cte(a) as (select 1),
   ...>   cpy(b) as (select a from cte)
   ...> select a from cte
   ...> union all
   ...> select b from cpy;
Error: no such table: cte
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Neville Dastur

On 22 Apr 2014, at 21:58, Roger Binns  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 22/04/14 05:24, Neville Dastur wrote:
>> I am looking for some advice on storing and searching data that comes
>> from an external JSON source and needs to be stored on device in a
>> Sqlite3 database.
> 
> Your data is from MongoDB :)  Note they do have an extended JSON to deal
> with types like ObjectId, binary and dates:
Yes, it is. But I control the document structure and it does not have any types 
that can’t be converted easily after a pull over the net.

> 
> http://docs.mongodb.org/manual/reference/mongodb-extended-json/#bson-data-types-and-associated-representations
> 
>> On device I need to be able to retrieve the entire “document” and
>> display all the array values,
> 
> How much data will there be overall and how performant does all this have
> to be?
10,000s or records

> 
>> The options i see are to store the array fields as pipe separated
>> values and then use LIKE “|%|” to search items.
> 
> That still requires scanning every record as you aren't doing a match from
> the start of the value.
By this, you mean searching like this is “expensive”

> 
> If you need performance then you need to normalize the data, or use a
> database that supports lists like postgres.
Not an option as the sqlite bit is on smartphones.

> 
> If you need something for not too much data then just be simple - have a
> single column which is the JSON text of the whole document.  You can still
> use LIKE to find records of interest, deserialise back to JSON and do the
> final filtering/querying/sorting there.
Not sure how this helps find say  in array2 alone. Note I have changed 
example so that there is a field value that is the same in array1 and 2
{
   "_id" : ObjectId(“xx"),
   "name" : “Description”,
   “array1" : [ “”,”” ],
   “array2" : [ “”,””,”” ],
   "tags" : [“searchword1”,”searchword2”,”searchword3"]
}

> 
> If you need a "proper" and convenient solution with SQLite then user
> defined functions and virtual tables are a great way of hiding what you do
> under the hood wrt to how the data is actually stored.  You'll have to
> write them to understand your data shape and queries.
Not really an option as I don’t have that control over the sqlite lib compiled 
into the mobile app.

> 
> There was a now defunct project UnQL that was mixing together JSON like
> data, SQL like queries and bit of SQLite.
I am aware. And I recall Richard being involved with this.

Neville
--
Surgeons Net Education: http://www.surgeons.org.uk
Clinical Software Solutions: http://www.clinsoftsolutions.com
Find our free and paid apps on the iTunes Apple store and Android Google Play 
store
LinkedIn: http://www.linkedin.com/profile/view?id=49617062

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 22/04/14 05:24, Neville Dastur wrote:
> I am looking for some advice on storing and searching data that comes
> from an external JSON source and needs to be stored on device in a
> Sqlite3 database.

Your data is from MongoDB :)  Note they do have an extended JSON to deal
with types like ObjectId, binary and dates:

http://docs.mongodb.org/manual/reference/mongodb-extended-json/#bson-data-types-and-associated-representations

> On device I need to be able to retrieve the entire “document” and
> display all the array values,

How much data will there be overall and how performant does all this have
to be?

> The options i see are to store the array fields as pipe separated
> values and then use LIKE “|%|” to search items.

That still requires scanning every record as you aren't doing a match from
the start of the value.

If you need performance then you need to normalize the data, or use a
database that supports lists like postgres.

If you need something for not too much data then just be simple - have a
single column which is the JSON text of the whole document.  You can still
use LIKE to find records of interest, deserialise back to JSON and do the
final filtering/querying/sorting there.

If you need a "proper" and convenient solution with SQLite then user
defined functions and virtual tables are a great way of hiding what you do
under the hood wrt to how the data is actually stored.  You'll have to
write them to understand your data shape and queries.

There was a now defunct project UnQL that was mixing together JSON like
data, SQL like queries and bit of SQLite.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlNW19UACgkQmOOfHg372QTNvgCgltU2vNzzVuRfTuPdc2+59VXS
h9sAnjya0jcj32aC3nYDA4Myv5x4tyxi
=fiQU
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Peter Aronson
SQLite seemed to provide good randomness in my (admittedly informal) tests.
 
Peter

From: jose isaias cabrera 
>To: Peter Aronson ; General Discussion of SQLite Database 
> 
>Sent: Tuesday, April 22, 2014 1:06 PM
>Subject: Re: [sqlite] BLOBs and NULLs
>
>
>
>"Peter Aronson" wrote...
>
>
>> If you want to use sqlite3_randomness to generate a Version 4 UUID 
>> according to RFC4122, the following code will can be used:
>>
>> unsigned char uuid_data[16];
>>
>> /* We'll generate a version 4 UUID as per RFC4122. Start by generating
>> 128 bits of randomness (we will use 122 of them). */
>> sqlite3_randomness (16,uuid_data);
>>
>> /* Set the two most significant bits (bits 6 and 7) of the
>> clock_seq_hi_and_reserved field to zero and one, respectively. */
>> uuid_data[8] &= 0x3f;
>> uuid_data[8] |= 0x80;
>> /* Set the four most significant bits (bits 12 through 15) of the
>> time_hi_and_version field to the 4-bit version number from
>> Section 4.1.3 (which is 4). */
>> uuid_data[6] &= 0x0f;
>> uuid_data[6] |= 0x40;
>>
>> This assumes that sqlite3_randomness generates sufficiently good random 
>> numbers, but it appears to in my tests.
>
>Are you saying that sqlite3 does not appear to provide "good random numbers" 
>in your tests, or that it appears to, in your tests?
>
>
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread jose isaias cabrera


"Peter Aronson" wrote...


If you want to use sqlite3_randomness to generate a Version 4 UUID 
according to RFC4122, the following code will can be used:


unsigned char uuid_data[16];

/* We'll generate a version 4 UUID as per RFC4122. Start by generating
128 bits of randomness (we will use 122 of them). */
sqlite3_randomness (16,uuid_data);

/* Set the two most significant bits (bits 6 and 7) of the
clock_seq_hi_and_reserved field to zero and one, respectively. */
uuid_data[8] &= 0x3f;
uuid_data[8] |= 0x80;
/* Set the four most significant bits (bits 12 through 15) of the
time_hi_and_version field to the 4-bit version number from
Section 4.1.3 (which is 4). */
uuid_data[6] &= 0x0f;
uuid_data[6] |= 0x40;

This assumes that sqlite3_randomness generates sufficiently good random 
numbers, but it appears to in my tests.


Are you saying that sqlite3 does not appear to provide "good random numbers" 
in your tests, or that it appears to, in your tests?


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Peter Aronson
If you want to use sqlite3_randomness to generate a Version 4 UUID according to 
RFC4122, the following code will can be used:
 
  unsigned char  uuid_data[16];

  /* We'll generate a version 4 UUID as per RFC4122.  Start by generating
 128 bits of randomness (we will use 122 of them). */
  sqlite3_randomness (16,uuid_data);
  
  /* Set the two most significant bits (bits 6 and 7) of the 
 clock_seq_hi_and_reserved field to zero and one, respectively. */
  uuid_data[8] &= 0x3f;
  uuid_data[8] |= 0x80;
  /* Set the four most significant bits (bits 12 through 15) of the
 time_hi_and_version field to the 4-bit version number from
 Section 4.1.3 (which is 4). */
  uuid_data[6] &= 0x0f;
  uuid_data[6] |= 0x40;

This assumes that sqlite3_randomness generates sufficiently good random 
numbers, but it appears to in my tests.
 
Peter
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
>> That's why I wrote "our galaxy", not the "whole universe" ;) --DD
>
>
> Hehe, my bad... but that only changes a few orders of magnitude, there's only 
> a few billion galaxies :D

OK, you got me! After reading
http://www.universetoday.com/36302/atoms-in-the-universe/, 1e38 is not
even enough for our Sun's atoms, so I was way wrong indeed.

I stand humbly corrected. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Stephan Beal
On Tue, Apr 22, 2014 at 8:55 PM, Dominique Devienne wrote:

> > than using string-format data (be sure to use SQLITE_TRANSIENT when
> binding
> > the memory, too).
>

Sorry - i meant SQLITE_STATIC. If your memory will outlive the step() call
then use that, _NOT_ SQLITE_TRANSIENT, to avoid that sqlite makes a copy of
the memory.

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread RSmith


On 2014/04/22 20:52, Dominique Devienne wrote:

On Tue, Apr 22, 2014 at 8:46 PM, RSmith  wrote:

On 2014/04/22 20:06, Dominique Devienne wrote:

Regarding the uniqueness argument made by DRH, it's actually very hard
to generate 2 random-based GUIDS, given that a 128-bit is a very very
large number. It is said that 128-bit is large enough to store the
estimated number of atoms in our galaxy.//
current estimates

are between 78 and 82 orders of magnitude (that's 1.0E+82) of atoms in the
known universe in decimal, which would be around ~1.0 x 2^270 and obviously
require around 270 bits to store.

That's why I wrote "our galaxy", not the "whole universe" ;) --DD


Hehe, my bad... but that only changes a few orders of magnitude, there's only a 
few billion galaxies :D

You'd still need over 200 bits for just our galaxy!

Atoms are pretty small it seems...


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 8:47 PM, Stephan Beal  wrote:
> On Tue, Apr 22, 2014 at 8:25 PM, Dominique Devienne 
> wrote:
>
>> Yet I don't see the point of a BIGINT either. A blob can effectively
>> act as a arbitrary sized integer already, albeit one stored in base
>> 256 and on which you cannot do arithmetic, but that's OK and enough to
>> use it as a PK / FK.
>>
>
> A blob can store raw binary data, i.e. raw integers from memory. Just be
> sure to encode/decode them if you want their stored representations to be
> platform-portable (big vs little endian). You can bind a blob using
> (, sizeof(myInt)) if you really want to, it just won't be
> platform-portable without settling on an encoding. If the goal is only
> performance, though, it might (without encoding) be (marginally) faster
> than using string-format data (be sure to use SQLITE_TRANSIENT when binding
> the memory, too).

Thanks for the advise. No, I'm not binding endian-specific native C
integer variables, but a uchar[16], so there are no endianness issues.

I mentally assimilate GUIDs are integers, but the code deals with raw
memory. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Simon Slavin

On 22 Apr 2014, at 4:55pm, Dominique Devienne  wrote:

> Simply because of the extra space needed to store it. 36 bytes vs 16
> bytes. That's 20 wasted bytes for the PK, and everytime that PK is
> references in other tables' FKs too. Times millions of rows, it adds
> up, for nothing. The GUID is no less "genuine" as you put it, just
> because it's stored as a 16-bytes blob rather than the canonical
> 36-char text preferred by humans.

And if you use GUIDs a lot, make two functions for SQLite which convert to and 
from text form.  Loading external functions is something SQLite really is 
excellent at.


On 22 Apr 2014, at 5:57pm, Stephan Beal  wrote:

> http://www.w3.org/DesignIssues/Axioms.html#nonunique
> 
> In summary: the context of a GUID defines its "scope of required
> uniqueness," and a 16-byte GUID is essentially globally unique so long as
> it has no collisions within its context(s). (i.e. who cares if SHA1s
> collide, so long as it's not in the same repo?)

You might be interested in UUIDs, which define the scope and provide methods 
for presenting hashed, sequential and random GUIDs within that scope:



Libraries are available for many languages to generate UUIDs under many of 
these schemes.


On 22 Apr 2014, at 6:57pm, RSmith  wrote:

> Just to add  - I cannot imagine why you would have dashes to start with

Take, for example, this GUID:

550e8400-e29b-41d4-a716-44de5544ac00

A program (or a TABLE) frequently handles many GUIDs which differ only in one 
of the fields.  For instance a program may have to represent a GUID externally 
(on the screen and output files) with the complete GUID, but it could store 
items internally (perhaps in a database) using only the last 12 hex digits, or 
as an integer up to 2^48.  Since it know that all people GUIDs start

550e8400-e29b-41d4-a716-

whereas all vehicles GUIDs start

550e8404-31e6-41d4-a716-

there is no need to store them all in a table where the column tells you 
definitively whether you're talking about a person or a vehicle.

When you're doing a lot of this sort of thing having the hyphens visible speeds 
up human time in figuring out what they're looking at.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 8:46 PM, RSmith  wrote:
> On 2014/04/22 20:06, Dominique Devienne wrote:
>> Regarding the uniqueness argument made by DRH, it's actually very hard
>> to generate 2 random-based GUIDS, given that a 128-bit is a very very
>> large number. It is said that 128-bit is large enough to store the
>> estimated number of atoms in our galaxy.//
>
>> current estimates
> are between 78 and 82 orders of magnitude (that's 1.0E+82) of atoms in the
> known universe in decimal, which would be around ~1.0 x 2^270 and obviously
> require around 270 bits to store.

That's why I wrote "our galaxy", not the "whole universe" ;) --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 8:36 PM, Dominique Devienne  wrote:
> On Tue, Apr 22, 2014 at 8:16 PM, Richard Hipp  wrote:
>> On Tue, Apr 22, 2014 at 2:06 PM, Dominique Devienne 
>> wrote:
>>
>>> Regarding the uniqueness argument made by DRH, it's actually very hard
>>> to generate 2 random-based GUIDS [that collide], given that a 128-bit is a
>>> very very large number.
>>
>> This is called the "Birthday Paradox".  Ask Google for more information.
>
> Thanks for that Richard. Live and learn ;)

Actually, that Wikipedia article has the number of GUIDs necessary to
achieve a given probability of collisions in
http://en.wikipedia.org/wiki/Birthday_problem#Probability_table and
even goes to mention

"For comparison, 10e−18 to 10e−15 is the uncorrectable bit error rate
of a typical hard disk.[6] In theory, 128-bit hash functions, such as
MD5, should stay within that range until about 820 billion documents,
even if its possible outputs are many more"

So even generating a trillion 128-bit GUID, the probability of a
collision is still astonishingly small, in the same order as hard disk
error rates :) That's good enough for me! --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Stephan Beal
On Tue, Apr 22, 2014 at 8:25 PM, Dominique Devienne wrote:

> Yet I don't see the point of a BIGINT either. A blob can effectively
> act as a arbitrary sized integer already, albeit one stored in base
> 256 and on which you cannot do arithmetic, but that's OK and enough to
> use it as a PK / FK.
>

A blob can store raw binary data, i.e. raw integers from memory. Just be
sure to encode/decode them if you want their stored representations to be
platform-portable (big vs little endian). You can bind a blob using
(, sizeof(myInt)) if you really want to, it just won't be
platform-portable without settling on an encoding. If the goal is only
performance, though, it might (without encoding) be (marginally) faster
than using string-format data (be sure to use SQLITE_TRANSIENT when binding
the memory, too).

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread RSmith


On 2014/04/22 20:06, Dominique Devienne wrote:


Regarding the uniqueness argument made by DRH, it's actually very hard
to generate 2 random-based GUIDS, given that a 128-bit is a very very
large number. It is said that 128-bit is large enough to store the
estimated number of atoms in our galaxy.//


I'm all with you on the idea and you can find the paradox described on the net (it's an actual thing), but my quick interlude here 
is specific to the statement above about being able to store the estimated number of atoms in the Universe. I think you might be 
thinking of SHA 256, because that's closer to how many will be needed to accurately store it - current estimates are between 78 and 
82 orders of magnitude (that's 1.0E+82) of atoms in the known universe in decimal, which would be around ~1.0 x 2^270 and obviously 
require around 270 bits to store. Of course Stars are manufacturing (and killing) atoms all the time, so an estimate will have to do!


in case anyone is interested in the actual math:
https://www.wolframalpha.com/input/?i=10^82+%3D+2^x

Or in the physics:
http://www.universetoday.com/36302/atoms-in-the-universe/



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Richard Hipp
On Tue, Apr 22, 2014 at 2:36 PM, Dominique Devienne wrote:

> Said Google tells me 2^128 - 1 = 3.4028237e+38
>
> and that sqrt(2^128 - 1) = 1.8446744e+19
>
> You've confused a 128-bit with a 64-bit integer in your 4 billion
> approximation, no?
>

Yes.  For a moment there, I was taking 2^64 was 4 billion.  It's 2^32 that
is 4 billion.  2^64 is larger.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 8:16 PM, Richard Hipp  wrote:
> On Tue, Apr 22, 2014 at 2:06 PM, Dominique Devienne 
> wrote:
>
>> Regarding the uniqueness argument made by DRH, it's actually very hard
>> to generate 2 random-based GUIDS [that collide], given that a 128-bit is a
>> very very large number.
>
> This is called the "Birthday Paradox".  Ask Google for more information.

Thanks for that Richard. Live and learn ;)

> To a good approximation, if there are N possible values, you need to
> generate sqrt(N) of them before you have a 50/50 chance of getting a
> collision.  (Wikipedia has the exact formula, if you are interested, but
> the approximation is usually good enough.)
>
> So for a 128-bit GUID, you'd expect to get a collision after generating 4
> billion of them, or so.

Said Google tells me 2^128 - 1 = 3.4028237e+38

and that sqrt(2^128 - 1) = 1.8446744e+19

You've confused a 128-bit with a 64-bit integer in your 4 billion
approximation, no?

> The above assumes you have a good source of randomness.  The "rand()"
> function in your favorite programming language is often not quite that
> good.  But random() in SQLite does a decent job, at last on Unix where it
> can be seeded using /dev/random.

FWIW, I used the Boost's
http://www.boost.org/doc/libs/1_47_0/doc/html/boost/random/mt19937.html
for the RNG. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Drago, William @ MWG - NARDAEAST
I can generate the GUID as a 16 byte hex string, so the dashes are no problem. 
I'm working with VEE (similar to LabView) and .Net, so a lot of the mundane 
stuff is done for me with nifty little methods like ToArray(), ToString(), etc..


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of RSmith
Sent: Tuesday, April 22, 2014 1:57 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] BLOBs and NULLs


On 2014/04/22 19:12, Richard Hipp wrote:
> On Tue, Apr 22, 2014 at 1:07 PM, Drago, William @ MWG - NARDAEAST <
> william.dr...@l-3com.com> wrote:
>
>> Does blob ignore them if they are included?
>>
> No.  That would be a syntax error.  The dashes in (strict) GUIDs are
> an arbitrary construct (perhaps originally designed to promote
> readability - as if anybody ever actually reads a GUID).  If you want
> to store the GUID as a BLOB then you'll have to strip the dashes
> yourselve (in your
> application) and do the hex-to-binary conversion yourself.

Just to add  - I cannot imagine why you would have dashes to start with, do you 
have a GUID generator that pumps out TEXT or STRING values? Usually they should 
be accompanied (in most libraries) by a function that pops out the actual 
16/20/32-byte BLOB as a stream or array of bytes/integers/whatever, and it 
usually involves some cost to convert it to strings with dashes etc. - a nice 
optimisation if those are no longer needed.

If you DO only get them as string though, the web is full of little bits of 
conversion code, and we will be glad to supply you with some too if you say 
which environment, function, etc.










___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely for the use of the addressee and may contain information 
that is privileged or confidential. Any disclosure, use or distribution of the 
information contained herein is prohibited. In the event this e-mail contains 
technical data within the definition of the International Traffic in Arms 
Regulations or Export Administration Regulations, it is subject to the export 
control laws of the U.S.Government. The recipient should check this e-mail and 
any attachments for the presence of viruses as L-3 does not accept any 
liability associated with the transmission of this e-mail. If you have received 
this communication in error, please notify the sender by reply e-mail and 
immediately delete this message and any attachments.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 6:48 PM, Richard Hipp  wrote:
> On Tue, Apr 22, 2014 at 12:37 PM, Neville Dastur
>> I would hazard a guess that most mobile apps that use an internal DB, use
>> sqlite. With inconsistent mobile network coverage, having pure client side
>> PK generation is a must and GUIDs solve that problem well.

Exactly.

>> Has the time not come to support a 128bit INT that can then be used for GUID?

Yet I don't see the point of a BIGINT either. A blob can effectively
act as a arbitrary sized integer already, albeit one stored in base
256 and on which you cannot do arithmetic, but that's OK and enough to
use it as a PK / FK.

> "GUID" means different things to different people.  There are some "GUID"
> standards out there that people use.  But I take a more flexible approach
> and say that a "GUID" is any "Globally Unique IDentifier".  This
> generalized definition of "GUID" is not necessarily 128 bits (though I
> would argue 128 bits should be the bare minimum.)

Many people, including me, do refer to that one standard Simon already
linked to. And it is 128-bit.

> Fossil generates some of its "GUID"s using the SHA1 hash algorithm.  Other
> GUIDs (for example for ticket IDs) are generated using:
>
> SELECT lower(hex(randomblob(20)));
>
> You can increase the 20 to make the GUIDs as "globally unique" as you
> want.  The GUIDs discussed previously in this thread seem use 16 instead of
> 20 and thus are less unique.

But again, random GUIDs and (secure) hashes like SHA1 are different beats.

To compute your SHA1, you take an arbitrary large "content", and
process it to generate a hopefully unique but definitely not random
"number". The same content must generate the same "number, every time.
And changing just one byte of the content must generate an entirely
different hash.

While a random generated GUID "only" needs to be unique, generated out
of thin air, and strive to never generate the same "number" twice.

> So a 128bit int really isn't going to help here because as soon as you have
> one, you'll need a 160bit int.  And so forth...

It is enough for all practical purposes, with a good RNG.

> Better to simply use a BLOB which can have arbitrary length.  You aren't
> going to be adding and subtracting the GUIDs, so no need to call them
> integers.

I'm with you there.

> The average length of the derived, locally-unique identifiers in Fossil is
> about two bytes.  That is more compact than 16 or 20 bytes, regardless of
> whether you call it a BLOB or an INT.  So having a BIGINT capability
> doesn't really help you there either.

Sure. As you say, different applications have different requirements.

We load entities from different completely unrelated datastores, and
by using GUID PKs, we never run in collisions, unlike locally derived
local ids.

We can even have cross-datastore FKs using GUIDs, albeit not enforced
like intra-datastore FKs can be of course.

The decentralized nature of GUIDs, and it's practical uniqueness, do
make it a good choice for PK/FK IMHO, while remaining short enough a
key. Sure, 2 bytes beat 16 bytes (although with 2 bytes you're likely
on the low end; we can have in the millions of rows in our DBs).
Again, my $0.02. I was more reacting to James' anti-GUID paragraph ;)
--DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Richard Hipp
On Tue, Apr 22, 2014 at 2:06 PM, Dominique Devienne wrote:

> Regarding the uniqueness argument made by DRH, it's actually very hard
> to generate 2 random-based GUIDS [that collide], given that a 128-bit is a
> very very
> large number.
>

This is called the "Birthday Paradox".  Ask Google for more information.

To a good approximation, if there are N possible values, you need to
generate sqrt(N) of them before you have a 50/50 chance of getting a
collision.  (Wikipedia has the exact formula, if you are interested, but
the approximation is usually good enough.)

So for a 128-bit GUID, you'd expect to get a collision after generating 4
billion of them, or so.

The above assumes you have a good source of randomness.  The "rand()"
function in your favorite programming language is often not quite that
good.  But random() in SQLite does a decent job, at last on Unix where it
can be seeded using /dev/random.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 6:57 PM, Stephan Beal  wrote:
> On Tue, Apr 22, 2014 at 6:48 PM, Richard Hipp  wrote:
>> Fossil generates some of its "GUID"s using the SHA1 hash algorithm.  Other
>> GUIDs (for example for ticket IDs) are generated using:
>>
>> SELECT lower(hex(randomblob(20)));
>>
>> You can increase the 20 to make the GUIDs as "globally unique" as you
>> want.  The GUIDs discussed previously in this thread seem use 16 instead of
>> 20 and thus are less unique.
>>
>
> That reminds me of a specific snippet from this article:
>
> http://www.w3.org/DesignIssues/Axioms.html#nonunique
>
> In summary: the context of a GUID defines its "scope of required
> uniqueness," and a 16-byte GUID is essentially globally unique so long as
> it has no collisions within its context(s). (i.e. who cares if SHA1s
> collide, so long as it's not in the same repo?)

First, SHA1 hashes and GUID, although they look the same (size
notwithstanding), are not the same. Hashes like SHA1 derive their
value from actual content (at a point in time), so they are in fact
better than randomly generated GUIDs. But not every applications can
easily compute content hashes (using SHA1, SHA256, or whatever other
secure hashing algo) for their content. And for mutable entities,
content hashes would be definition also mutate (ignoring very unlikely
collisions), unlike GUIDs which are arbitrary and immutable "by
design", which makes them suitable as PKs of mutate entities.

Regarding the uniqueness argument made by DRH, it's actually very hard
to generate 2 random-based GUIDS, given that a 128-bit is a very very
large number. It is said that 128-bit is large enough to store the
estimated number of atoms in our galaxy. It's good enough for my own
uses. Being of the curious type, I wrote a little test to generate a
large number of GUIDs (using boost::uuid), then sort them, then look
for the longest prefix (byte-wise, not char wise). To keep things
simple, I did that in memory, so could only generate 1/2 a billion,
and the longest common prefix I found was 7 bytes, out of the 16
bytes. Intuitively, I suspect one must generate increasingly large
number of GUIDs to increase the common prefix length by 1 byte each
time, but I didn't verify this intuition.

So yes, in theory, one will eventually run out of bits using a 128-bit
(integer) GUID, but in practice I don't think it hardly matters.My
$0.02. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread RSmith


On 2014/04/22 19:12, Richard Hipp wrote:

On Tue, Apr 22, 2014 at 1:07 PM, Drago, William @ MWG - NARDAEAST <
william.dr...@l-3com.com> wrote:


Does blob ignore them if they are included?


No.  That would be a syntax error.  The dashes in (strict) GUIDs are an
arbitrary construct (perhaps originally designed to promote readability -
as if anybody ever actually reads a GUID).  If you want to store the GUID
as a BLOB then you'll have to strip the dashes yourselve (in your
application) and do the hex-to-binary conversion yourself.


Just to add  - I cannot imagine why you would have dashes to start with, do you have a GUID generator that pumps out TEXT or STRING 
values? Usually they should be accompanied (in most libraries) by a function that pops out the actual 16/20/32-byte BLOB as a stream 
or array of bytes/integers/whatever, and it usually involves some cost to convert it to strings with dashes etc. - a nice 
optimisation if those are no longer needed.


If you DO only get them as string though, the web is full of little bits of conversion code, and we will be glad to supply you with 
some too if you say which environment, function, etc.











___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Richard Hipp
On Tue, Apr 22, 2014 at 1:07 PM, Drago, William @ MWG - NARDAEAST <
william.dr...@l-3com.com> wrote:

> Does blob ignore them if they are included?
>

No.  That would be a syntax error.  The dashes in (strict) GUIDs are an
arbitrary construct (perhaps originally designed to promote readability -
as if anybody ever actually reads a GUID).  If you want to store the GUID
as a BLOB then you'll have to strip the dashes yourselve (in your
application) and do the hex-to-binary conversion yourself.


>
> -Bill
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:
> sqlite-users-boun...@sqlite.org] On Behalf Of Richard Hipp
> Sent: Tuesday, April 22, 2014 12:56 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] BLOBs and NULLs
>
> On Tue, Apr 22, 2014 at 12:55 PM, Drago, William @ MWG - NARDAEAST <
> william.dr...@l-3com.com> wrote:
>
> > Cool. So it's treating each 2 digit pair as a single byte hex value,
> > but what does blob do with the dashes?
> >
>
> Since the dashes carry no information, you could leave them out.
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any
> attachments are solely for the use of the addressee and may contain
> information that is privileged or confidential. Any disclosure, use or
> distribution of the information contained herein is prohibited. In the
> event this e-mail contains technical data within the definition of the
> International Traffic in Arms Regulations or Export Administration
> Regulations, it is subject to the export control laws of the
> U.S.Government. The recipient should check this e-mail and any attachments
> for the presence of viruses as L-3 does not accept any liability associated
> with the transmission of this e-mail. If you have received this
> communication in error, please notify the sender by reply e-mail and
> immediately delete this message and any attachments.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Drago, William @ MWG - NARDAEAST
Does blob ignore them if they are included?

-Bill

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Richard Hipp
Sent: Tuesday, April 22, 2014 12:56 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] BLOBs and NULLs

On Tue, Apr 22, 2014 at 12:55 PM, Drago, William @ MWG - NARDAEAST < 
william.dr...@l-3com.com> wrote:

> Cool. So it's treating each 2 digit pair as a single byte hex value,
> but what does blob do with the dashes?
>

Since the dashes carry no information, you could leave them out.

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely for the use of the addressee and may contain information 
that is privileged or confidential. Any disclosure, use or distribution of the 
information contained herein is prohibited. In the event this e-mail contains 
technical data within the definition of the International Traffic in Arms 
Regulations or Export Administration Regulations, it is subject to the export 
control laws of the U.S.Government. The recipient should check this e-mail and 
any attachments for the presence of viruses as L-3 does not accept any 
liability associated with the transmission of this e-mail. If you have received 
this communication in error, please notify the sender by reply e-mail and 
immediately delete this message and any attachments.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Richard Hipp
On Tue, Apr 22, 2014 at 12:55 PM, Drago, William @ MWG - NARDAEAST <
william.dr...@l-3com.com> wrote:

> Cool. So it's treating each 2 digit pair as a single byte hex value, but
> what does blob do with the dashes?
>

Since the dashes carry no information, you could leave them out.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] cannot set connection while a datareader is active

2014-04-22 Thread Stefano Ravagni

Il 21/04/2014 22.03, rava ha scritto:

I know, you have all code wich is involved in the error... for the parte 
involved, no other function was called... i call executereader only for one 
datareader and two Times directly to the command... do you found the other 
datareader in the sent code? If exist it have to be in the sent code, but i 
don't see nothing..  you could?



Joe Mistachkin  ha scritto:



Stefano Ravagni wrote:

Thanks for answer Joe, but as just sayed i have only one datareader,


That is not possible.  There must be at least two different data reader
instances because the references in the watch window did not match.
Every time you call ExecuteReader, a brand new data reader is returned.
Keeping track of these returned data readers is very important because
they must all be properly disposed at some point.


i'm sure of that, and this is demostrated because with others provider
i have not error.


Perhaps the other providers do not have a restriction on resetting the
underlying connection for a command when data readers are active.

--
Joe Mistachkin

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Joe, i finally solve the error following your advices...

We search for a Datareader object, but we could not find There are 
(as i sayed) only one datareader object as declared object but 
thinking about your words when you saw ("Every time you call 
ExecuteReader, a brand new data reader is returned.") i check for a call 
to ExecuteReader WITHOUT association for DATI object (Datareader).


I found this line in a function wich load a Checklist ...

frmAutomazione.SorgenteAutomazioni.DataSource = objCmd.ExecuteReader

This line goes out of what i've searched forbut one time i change 
this code, all goes good!!!


So, i'm sorry for my insistence, but finally with your help i became 
able to find the problem and adjust it !!!


Thank you Thank you Thank you Thank you very much !!!

I would to make a little donation to your workexist a way ??


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Drago, William @ MWG - NARDAEAST
Cool. So it's treating each 2 digit pair as a single byte hex value, but what 
does blob do with the dashes?


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Dominique Devienne
Sent: Tuesday, April 22, 2014 11:55 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] BLOBs and NULLs

On Tue, Apr 22, 2014 at 5:35 PM, Drago, William @ MWG - NARDAEAST 
 wrote:
>>I myself prefer create table foo (guid blob primary key [NOT NULL], ...).
>
> If a genuine GUID looks like this: 37af1247-2e77-4880-8f46-48803ae2cd0a, then 
> why blob and not text?

Simply because of the extra space needed to store it. 36 bytes vs 16 bytes. 
That's 20 wasted bytes for the PK, and everytime that PK is references in other 
tables' FKs too. Times millions of rows, it adds up, for nothing. The GUID is 
no less "genuine" as you put it, just because it's stored as a 16-bytes blob 
rather than the canonical 36-char text preferred by humans. The native code 
guids, e.g.
boost::uuid, also use 16 bytes, so conversions to/from the DB
(binding/defining) would required bytes-to-text, and text-to-byte conversions, 
again for no gain. If we ever show a GUID to the user, which is rather rare 
(and often a bug), sure, we pretty-print it as dash-separated hex, but 
otherwise the most compact "native"
representation is used everywhere else. For ad-hoc queries using the 
shell-tool, SQLite's x'ff' blob-literal comes in handy too, and
quote() [1] is used for blob-to-hex conversions in selects (nothing needed in 
blob-to-blob joins). --DD

[1] 
http://stackoverflow.com/questions/1039461/how-to-display-blob-value-using-xabc-binary-string-literal-syntax
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely for the use of the addressee and may contain information 
that is privileged or confidential. Any disclosure, use or distribution of the 
information contained herein is prohibited. In the event this e-mail contains 
technical data within the definition of the International Traffic in Arms 
Regulations or Export Administration Regulations, it is subject to the export 
control laws of the U.S.Government. The recipient should check this e-mail and 
any attachments for the presence of viruses as L-3 does not accept any 
liability associated with the transmission of this e-mail. If you have received 
this communication in error, please notify the sender by reply e-mail and 
immediately delete this message and any attachments.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Richard Hipp
On Tue, Apr 22, 2014 at 12:37 PM, Neville Dastur
wrote:

>
> On 22 Apr 2014, at 17:33, Richard Hipp  wrote:
>
> > The usual solution here is to have a table that maps GUIDs into small
> > locally-unique integers:
> >
> >CREATE TABLE guid_id(id INTEGER PRIMARY KEY, guid TEXT UNIQUE);
> >
> > Use the small integer "id" value for internal foreign keys and whatnot.
> > And use the guid_id table to map GUIDs to id when moving data in from and
> > out to the rest of the world.
>
> Sorry, but does this not just over complicate the problem.
>


I guess it depends on the problem.

The same idea is used in the schema for Fossil (http://www.fossil-scm.org/)
and it works quite well there.   But every problem is different.  It's an
engineering judgement.


> I would hazard a guess that most mobile apps that use an internal DB, use
> sqlite. With inconsistent mobile network coverage, having pure client side
> PK generation is a must and GUIDs solve that problem well. Has the time not
> come to support a 128bit INT that can then be used for GUID?
>

"GUID" means different things to different people.  There are some "GUID"
standards out there that people use.  But I take a more flexible approach
and say that a "GUID" is any "Globally Unique IDentifier".  This
generalized definition of "GUID" is not necessarily 128 bits (though I
would argue 128 bits should be the bare minimum.)

Fossil generates some of its "GUID"s using the SHA1 hash algorithm.  Other
GUIDs (for example for ticket IDs) are generated using:

SELECT lower(hex(randomblob(20)));

You can increase the 20 to make the GUIDs as "globally unique" as you
want.  The GUIDs discussed previously in this thread seem use 16 instead of
20 and thus are less unique.

So a 128bit int really isn't going to help here because as soon as you have
one, you'll need a 160bit int.  And so forth...

Better to simply use a BLOB which can have arbitrary length.  You aren't
going to be adding and subtracting the GUIDs, so no need to call them
integers.

The average length of the derived, locally-unique identifiers in Fossil is
about two bytes.  That is more compact than 16 or 20 bytes, regardless of
whether you call it a BLOB or an INT.  So having a BIGINT capability
doesn't really help you there either.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Neville Dastur

On 22 Apr 2014, at 17:33, Richard Hipp  wrote:

> The usual solution here is to have a table that maps GUIDs into small
> locally-unique integers:
> 
>CREATE TABLE guid_id(id INTEGER PRIMARY KEY, guid TEXT UNIQUE);
> 
> Use the small integer "id" value for internal foreign keys and whatnot.
> And use the guid_id table to map GUIDs to id when moving data in from and
> out to the rest of the world.

Sorry, but does this not just over complicate the problem.

I would hazard a guess that most mobile apps that use an internal DB, use 
sqlite. With inconsistent mobile network coverage, having pure client side PK 
generation is a must and GUIDs solve that problem well. Has the time not come 
to support a 128bit INT that can then be used for GUID?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Richard Hipp
The usual solution here is to have a table that maps GUIDs into small
locally-unique integers:

CREATE TABLE guid_id(id INTEGER PRIMARY KEY, guid TEXT UNIQUE);

Use the small integer "id" value for internal foreign keys and whatnot.
And use the guid_id table to map GUIDs to id when moving data in from and
out to the rest of the world.


On Tue, Apr 22, 2014 at 11:55 AM, Dominique Devienne wrote:

> On Tue, Apr 22, 2014 at 5:35 PM, Drago, William @ MWG - NARDAEAST
>  wrote:
> >>I myself prefer create table foo (guid blob primary key [NOT NULL], ...).
> >
> > If a genuine GUID looks like this: 37af1247-2e77-4880-8f46-48803ae2cd0a,
> then why blob and not text?
>
> Simply because of the extra space needed to store it. 36 bytes vs 16
> bytes. That's 20 wasted bytes for the PK, and everytime that PK is
> references in other tables' FKs too. Times millions of rows, it adds
> up, for nothing. The GUID is no less "genuine" as you put it, just
> because it's stored as a 16-bytes blob rather than the canonical
> 36-char text preferred by humans. The native code guids, e.g.
> boost::uuid, also use 16 bytes, so conversions to/from the DB
> (binding/defining) would required bytes-to-text, and text-to-byte
> conversions, again for no gain. If we ever show a GUID to the user,
> which is rather rare (and often a bug), sure, we pretty-print it as
> dash-separated hex, but otherwise the most compact "native"
> representation is used everywhere else. For ad-hoc queries using the
> shell-tool, SQLite's x'ff' blob-literal comes in handy too, and
> quote() [1] is used for blob-to-hex conversions in selects (nothing
> needed in blob-to-blob joins). --DD
>
> [1]
> http://stackoverflow.com/questions/1039461/how-to-display-blob-value-using-xabc-binary-string-literal-syntax
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 5:35 PM, Drago, William @ MWG - NARDAEAST
 wrote:
>>I myself prefer create table foo (guid blob primary key [NOT NULL], ...).
>
> If a genuine GUID looks like this: 37af1247-2e77-4880-8f46-48803ae2cd0a, then 
> why blob and not text?

Simply because of the extra space needed to store it. 36 bytes vs 16
bytes. That's 20 wasted bytes for the PK, and everytime that PK is
references in other tables' FKs too. Times millions of rows, it adds
up, for nothing. The GUID is no less "genuine" as you put it, just
because it's stored as a 16-bytes blob rather than the canonical
36-char text preferred by humans. The native code guids, e.g.
boost::uuid, also use 16 bytes, so conversions to/from the DB
(binding/defining) would required bytes-to-text, and text-to-byte
conversions, again for no gain. If we ever show a GUID to the user,
which is rather rare (and often a bug), sure, we pretty-print it as
dash-separated hex, but otherwise the most compact "native"
representation is used everywhere else. For ad-hoc queries using the
shell-tool, SQLite's x'ff' blob-literal comes in handy too, and
quote() [1] is used for blob-to-hex conversions in selects (nothing
needed in blob-to-blob joins). --DD

[1] 
http://stackoverflow.com/questions/1039461/how-to-display-blob-value-using-xabc-binary-string-literal-syntax
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Drago, William @ MWG - NARDAEAST
>I myself prefer create table foo (guid blob primary key [NOT NULL], ...).

If a genuine GUID looks like this: 37af1247-2e77-4880-8f46-48803ae2cd0a, then 
why blob and not text?

-Bill



-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Dominique Devienne
Sent: Tuesday, April 22, 2014 5:07 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] BLOBs and NULLs

On Mon, Apr 21, 2014 at 4:58 PM, James K. Lowden  
wrote:
> On Mon, 21 Apr 2014 13:30:15 +
> "Drago, William @ MWG - NARDAEAST"  wrote:
>
>> Should I split this table up into smaller tables to eliminate the
>> NULLs (e.g. use one table each for IL, Phase, RL, Isolation)?

> Your database design is a model of the real world.  The rules it
> enforces should reflect those of the world it models. The tuple (IL,
> Phase, RL, Isolation) doesn't have much meaning, does it, in the sense
> that *together* they say something special about whatever (GUID, Path)
> represent?  From your description, each individual tuple (e.g., (GUID,
> Path, IL)) is meaningful, but the presence of, say, IL without Phase
> is not.  Each is a separate, freestanding fact, justifying its own table.

Pushed to its extreme, this sounds like an EAV* model, which seems surprising 
coming from you James ;)

* http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model

> The part I like least about your design (1) the use of a GUID and, in
> particular (2) the name "GUID" for the column.  If you are generating
> this GUID, don't; use an integer.  If you're not generating it -- if
> it comes to you from another source and therefore identifies something
> in "the real world" in some sense, OK.  Either way, use the name of
> the column to reflect the thing identified, not the datatype of the
> identifier.

First, about your GUID versus integer point (an often contentious debate), I'm 
on the (opposite) GUID side myself. And the reason is to increase concurrency 
and decrease contention. When you depend on a server-enforced AUTO-INCREMENTED 
key, you must go to the central server to get that unique id, which is a 
round-trip (in client-server scenarii), and sometimes you must also insert a 
row, before you potentially have all the facts (columns). A GUID on the other 
hand can be randomly generated client-side, completely independently of the 
server, and still be unique (since 128 bit, a very very large integer, and a 
good RNG). If you need to "merge" later tables or databases created 
independently, you won't have GUID conflicts, unlike AUTO-INCREMENTED integer 
PKs (the integers are used in FKs elsewhere, so any merge operation must map 
the old AUTO-INC PKs to the new AUTO-INC ones, and updates all FKs). So IMHO, a 
GUID PK is very appropropriate in many scenarii, as long as one remembers it i
 s only a Surrogate Key, i.e. a convenient fixed-sized, globally unique (and 
thus often immutable) single value to use in FKs (especially convenient in the 
face of complex composite natural keys), and not a Natural Key.

Second, the datatype is blob (or RAW(16) in Oracle), not GUID, and there's thus 
nothing wrong to call it what it is. Many people in the SQL community seem to 
like stuttering, and will call it foo_id or foo_uid or foo_guid for a foo 
table, but I myself prefer create table foo (guid blob primary key [NOT NULL], 
...).

It's mostly a matter of opinion and taste, but there are benefits to using 
GUIDs as SKs in some circumstances in my book. My $0.02. --DD

PS: For a long time, SQLite depended on that integer PK, whether one wanted it 
or not (and I wasn't aware saying PRIMARY KEY for a non-integer PK did not mean 
it's standard-required NOT NULL until reading a recent thread. I wish there was 
a single opt-in pragma to disable all non-standard behavior). Now there's 
WITOUT ROWID tables, but from reading this list, it seems like it forces you to 
a B-tree, instead of a B*-tree  (store rows in leafs only) like ROWID-based 
tables, which I believe means more expensive inserts via more B-tree 
rebalancing, which is especially bad with large (blob-using) rows, given SQLite 
lack of out-of-row blob storage.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely for the use of the addressee and may contain information 
that is privileged or confidential. Any disclosure, use or distribution of the 
information contained herein is prohibited. In the event this e-mail contains 
technical data within the definition of the International Traffic in Arms 
Regulations or Export Administration Regulations, it is subject to the export 
control laws of the U.S.Government. The recipient should check this e-mail and 
any attachments for the presence 

Re: [sqlite] cannot set connection while a datareader is active

2014-04-22 Thread Stefano Ravagni

Il 21/04/2014 21.54, Joe Mistachkin ha scritto:

Stefano Ravagni wrote:

Thanks for answer Joe, but as just sayed i have only one datareader,


That is not possible.  There must be at least two different data reader
instances because the references in the watch window did not match.
Every time you call ExecuteReader, a brand new data reader is returned.
Keeping track of these returned data readers is very important because
they must all be properly disposed at some point.


i'm sure of that, and this is demostrated because with others provider
i have not error.


Perhaps the other providers do not have a restriction on resetting the
underlying connection for a command when data readers are active.

--
Joe Mistachkin

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



Hello Joe, i forgot to tell you what to do for see the errorwhen 
checklist appear, click on ANNULLA button...after somes cicles it will 
appear...
I know what you say, but you have all code wich is involved in the 
error... for the parte involved, no other function was called... i call 
executereader only for one datareader and two only times directly to the 
command... do you found the other datareader in the sent code? If exist 
it have to be in the sent code, but i don't see nothing.. you could?



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] question for auto increament

2014-04-22 Thread RSmith

Hi Yang,

Most designers are lacking a bit inside the dev environments. You could use the sqlite command-line tool available from the 
downloads page at

http://sqlite.org/

or indeed any of the DB Management GUIs out there for SQLite such as these:
http://www.sqliteexpert.com/
http://www.rifin.co.za/software/sqlc/

Note: Only the Command-line facility is officially supported, for support with the others you will need to contact their own 
lists/proprietors/etc.



Alternately, you can simply directly execute some SQL to create your tables via 
the API, such as:

CREATE TABLE SomeTableName (
  IDKey INTEGER PRIMARY KEY AUTOINCREMENT   /* Item ID */,
  ColX TEXT NOT NULL /* Some column */,
ColY TEXT NOT NULL /* Some other column */,
ColZ TEXT NOT NULL /* Some more columns etc. */
);

Which will do what you need.

Good luck!


On 2014/04/21 22:13, Yang Hong wrote:

Hello, all:

  


I am using VS2013 + sqlite.net for 4.5.1 with 32 bit. I have question for a
primary key with auto increment. I can't figure out how to do it in C#. when
I use sqlite datasource designer in VS2013. It has option to add identity
checking, however, this feature doesn't work even I check this option. Do
you have anybody to help me out?

  


Regards,

  


yh

  


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Most efficient storage for arrays

2014-04-22 Thread Neville Dastur
Hi all

I am looking for some advice on storing and searching data that comes from an 
external JSON source and needs to be stored on device in a Sqlite3 database.

The JSON records are formatted as (simplified):
{
"_id" : ObjectId(“xx"),
"name" : “Description”,
“array1" : [ “”,”” ],
“array2" : [ “”,””,”” ],
"tags" : [“searchword1”,”searchword2”,”searchword3"]
}

On device I need to be able to retrieve the entire “document” and display all 
the array values, but I also need to be able to search the arrays and tags. For 
the array items I will need to be able to search on items just in array1 and 
for other search types on items in array1 or array2

The options i see are to store the array fields as pipe separated values and 
then use LIKE “|%|” to search items.

Alternatively, again store the array as pipe separated, but then also have 
other tables with columns id, array_item linked back to the main table. Then 
search using joins. However I can see this getting complicated when it comes to 
the two different search strategies detailed above. Might be mitigated by 
having just one extra table e.g. search_helper with id, parent_array, 
array_item then parent_array could be used to sub-filter.

So wondering is anyone that has done this sort of thing and worked out the best 
way?

Thanks

Neville
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] question for auto increament

2014-04-22 Thread Yang Hong
Hello, all:

 

I am using VS2013 + sqlite.net for 4.5.1 with 32 bit. I have question for a
primary key with auto increment. I can't figure out how to do it in C#. when
I use sqlite datasource designer in VS2013. It has option to add identity
checking, however, this feature doesn't work even I check this option. Do
you have anybody to help me out?

 

Regards,

 

yh

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Tue, Apr 22, 2014 at 12:05 PM, Simon Slavin  wrote:
> On 22 Apr 2014, at 10:07am, Dominique Devienne  wrote:
> Store them as 32 hex digits, or 32 hex digits with the minus signs in, or as 
> a 32-bit-length integer, I don't care, but have them conform to V1 or V4 
> Algorithm generation, or something else mentioned on that page.

There's really no point wasting space to store a stringified 36-char
GUID, rather than storing that same guid as 16-bytes. Which algo
generated the GUID is embedded in the GUID itself, if it's standard
conforming, and is irrelevant to relational integrity anyway. All
RDBMSs can blob-to-hex if you want to see a dash-less hex GUID stored
as a blob or RAW, and you can write something in whatever (C, PL/SQL,
TSQL, pure SQL etc...) to splice the dashes in if you really want to.
A guid is a guid, irrespective of its actual representation. And
there's nothing wrong calling it as such. Store them as hex-encoded
(or octal, or base-64, ...) text if that pleases you, but that doesn't
make it any more or any less a GUID. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Simon Slavin

On 22 Apr 2014, at 10:07am, Dominique Devienne  wrote:

> using GUIDs

Don't particularly mind if anyone is using GUIDs, but if anyone is using 
calling something GUID can you please make sure it's a real GUID ?  They look 
like this:



Store them as 32 hex digits, or 32 hex digits with the minus signs in, or as a 
32-bit-length integer, I don't care, but have them conform to V1 or V4 
Algorithm generation, or something else mentioned on that page.

If you don't want to do that, that's fine: generate whatever keys you want.  
Just don't call them GUIDs.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs and NULLs

2014-04-22 Thread Dominique Devienne
On Mon, Apr 21, 2014 at 4:58 PM, James K. Lowden
 wrote:
> On Mon, 21 Apr 2014 13:30:15 +
> "Drago, William @ MWG - NARDAEAST"  wrote:
>
>> Should I split this table up into smaller tables to eliminate the
>> NULLs (e.g. use one table each for IL, Phase, RL, Isolation)?

> Your database design is a model of the real world.  The rules it
> enforces should reflect those of the world it models. The tuple (IL,
> Phase, RL, Isolation) doesn't have much meaning, does it, in the sense
> that *together* they say something special about whatever (GUID, Path)
> represent?  From your description, each individual tuple (e.g., (GUID,
> Path, IL)) is meaningful, but the presence of, say, IL without Phase is
> not.  Each is a separate, freestanding fact, justifying its own table.

Pushed to its extreme, this sounds like an EAV* model, which seems
surprising coming from you James ;)

* http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model

> The part I like least about your design (1) the use of a GUID and, in
> particular (2) the name "GUID" for the column.  If you are generating
> this GUID, don't; use an integer.  If you're not generating it -- if it
> comes to you from another source and therefore identifies something in
> "the real world" in some sense, OK.  Either way, use the name of the
> column to reflect the thing identified, not the datatype of the
> identifier.

First, about your GUID versus integer point (an often contentious
debate), I'm on the (opposite) GUID side myself. And the reason is to
increase concurrency and decrease contention. When you depend on a
server-enforced AUTO-INCREMENTED key, you must go to the central
server to get that unique id, which is a round-trip (in client-server
scenarii), and sometimes you must also insert a row, before you
potentially have all the facts (columns). A GUID on the other hand can
be randomly generated client-side, completely independently of the
server, and still be unique (since 128 bit, a very very large integer,
and a good RNG). If you need to "merge" later tables or databases
created independently, you won't have GUID conflicts, unlike
AUTO-INCREMENTED integer PKs (the integers are used in FKs elsewhere,
so any merge operation must map the old AUTO-INC PKs to the new
AUTO-INC ones, and updates all FKs). So IMHO, a GUID PK is very
appropropriate in many scenarii, as long as one remembers it is only a
Surrogate Key, i.e. a convenient fixed-sized, globally unique (and
thus often immutable) single value to use in FKs (especially
convenient in the face of complex composite natural keys), and not a
Natural Key.

Second, the datatype is blob (or RAW(16) in Oracle), not GUID, and
there's thus nothing wrong to call it what it is. Many people in the
SQL community seem to like stuttering, and will call it foo_id or
foo_uid or foo_guid for a foo table, but I myself prefer create table
foo (guid blob primary key [NOT NULL], ...).

It's mostly a matter of opinion and taste, but there are benefits to
using GUIDs as SKs in some circumstances in my book. My $0.02. --DD

PS: For a long time, SQLite depended on that integer PK, whether one
wanted it or not (and I wasn't aware saying PRIMARY KEY for a
non-integer PK did not mean it's standard-required NOT NULL until
reading a recent thread. I wish there was a single opt-in pragma to
disable all non-standard behavior). Now there's WITOUT ROWID tables,
but from reading this list, it seems like it forces you to a B-tree,
instead of a B*-tree  (store rows in leafs only) like ROWID-based
tables, which I believe means more expensive inserts via more B-tree
rebalancing, which is especially bad with large (blob-using) rows,
given SQLite lack of out-of-row blob storage.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users