Re: [sqlite] Most efficient storage for arrays

2014-04-23 Thread mm.w
Ok, you can store a JSON like structure, or plist, coding is a creative
discipline, not a nerd stuff.


On Wed, Apr 23, 2014 at 12:16 PM, mm.w <0xcafef...@gmail.com> wrote:

> Ok, you can store a JSON like structure, by reproducing the graf""
> adapted to SQL I wrote that (for fun, not only writing code for my day to
> day job) [mostly that what core data does with all this mysterious ids,
> that's just a transpose vertical to horizontal],  there are
> primitive-tables named by serializable types and the tree is represented by
> addresses, like any recursive expat-like parser will do.
>
>
> On Tue, Apr 22, 2014 at 10:13 PM, Keith Medcalf wrote:
>
>>
>> >>> Your data is from MongoDB :)  Note they do have an extended JSON to
>> >>> deal with types like ObjectId, binary and dates:
>> >> Yes, it is. But I control the document structure and it does not have
>> >> any types that can't be converted easily after a pull over the net.
>> >
>> >Does that mean you can make it individual fields instead of a list?  As
>> >Petite Abeille pointed out you need to do SQL things the SQL way or you
>> >will end up in a world of hurt.
>>
>> You need to use a RELATIONAL datastore in a RELATIONAL way.  This applies
>> whether your access method is SQL, ISAM, HIDAM, BDAM or VSAM.  If the
>> underlying datastore is not relational, you may also be able to use one (or
>> more) of the mentioned access methods.  However, then you will be able to
>> store non-relational data in it and manupulate it easily.  SQLite is a
>> RELATIONAL datastore that uses the SQL access method.
>>
>> Please do not confuse the access method and the datastore.  The datastore
>> defines the properties of the filing cabinets, and the access method the
>> language spoken by the filing clerk.
>>
>> Any filing clerk can be taught to use many underlying storage cabinet
>> systems, just as any given storage cabinet system may be accessed by any
>> number of clerks speaking different languages.
>>
>>
>>
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-23 Thread mm.w
Ok, you can store a JSON like structure, by reproducing the graf"" adapted
to SQL I wrote that (for fun, not only writing code for my day to day job)
[mostly that what core data does with all this mysterious ids, that's just
a transpose vertical to horizontal],  there are primitive-tables named by
serializable types and the tree is represented by addresses, like any
recursive expat-like parser will do.


On Tue, Apr 22, 2014 at 10:13 PM, Keith Medcalf  wrote:

>
> >>> Your data is from MongoDB :)  Note they do have an extended JSON to
> >>> deal with types like ObjectId, binary and dates:
> >> Yes, it is. But I control the document structure and it does not have
> >> any types that can't be converted easily after a pull over the net.
> >
> >Does that mean you can make it individual fields instead of a list?  As
> >Petite Abeille pointed out you need to do SQL things the SQL way or you
> >will end up in a world of hurt.
>
> You need to use a RELATIONAL datastore in a RELATIONAL way.  This applies
> whether your access method is SQL, ISAM, HIDAM, BDAM or VSAM.  If the
> underlying datastore is not relational, you may also be able to use one (or
> more) of the mentioned access methods.  However, then you will be able to
> store non-relational data in it and manupulate it easily.  SQLite is a
> RELATIONAL datastore that uses the SQL access method.
>
> Please do not confuse the access method and the datastore.  The datastore
> defines the properties of the filing cabinets, and the access method the
> language spoken by the filing clerk.
>
> Any filing clerk can be taught to use many underlying storage cabinet
> systems, just as any given storage cabinet system may be accessed by any
> number of clerks speaking different languages.
>
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Keith Medcalf

>>> Your data is from MongoDB :)  Note they do have an extended JSON to
>>> deal with types like ObjectId, binary and dates:
>> Yes, it is. But I control the document structure and it does not have
>> any types that can't be converted easily after a pull over the net.
>
>Does that mean you can make it individual fields instead of a list?  As
>Petite Abeille pointed out you need to do SQL things the SQL way or you
>will end up in a world of hurt.

You need to use a RELATIONAL datastore in a RELATIONAL way.  This applies 
whether your access method is SQL, ISAM, HIDAM, BDAM or VSAM.  If the 
underlying datastore is not relational, you may also be able to use one (or 
more) of the mentioned access methods.  However, then you will be able to store 
non-relational data in it and manupulate it easily.  SQLite is a RELATIONAL 
datastore that uses the SQL access method.

Please do not confuse the access method and the datastore.  The datastore 
defines the properties of the filing cabinets, and the access method the 
language spoken by the filing clerk.

Any filing clerk can be taught to use many underlying storage cabinet systems, 
just as any given storage cabinet system may be accessed by any number of 
clerks speaking different languages.




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 22/04/14 15:00, Neville Dastur wrote:
> On 22 Apr 2014, at 21:58, Roger Binns  wrote:
>> Your data is from MongoDB :)  Note they do have an extended JSON to
>> deal with types like ObjectId, binary and dates:
> Yes, it is. But I control the document structure and it does not have
> any types that can’t be converted easily after a pull over the net.

Does that mean you can make it individual fields instead of a list?  As
Petite Abeille pointed out you need to do SQL things the SQL way or you
will end up in a world of hurt.

>> How much data will there be overall and how performant does all this
>> have to be?
> 10,000s or records

Doesn't sound like much.

>> That still requires scanning every record as you aren't doing a match
>> from the start of the value.
> By this, you mean searching like this is “expensive”

You are searching within a column so an index can't be used - (eg looking
for || anywhere within that column).  That means every record has to
be examined.  Hence my queries about data size and performance requirements.

You will need to normalize if you want SQLite's performance and use
indexes so that columns contain exactly one value.

>> If you need something for not too much data then just be simple -
>> have a single column which is the JSON text of the whole document.
>> You can still use LIKE to find records of interest, deserialise back
>> to JSON and do the final filtering/querying/sorting there.
> 
> Not sure how this helps find say  in array2 alone. Note I have
> changed example so that there is a field value that is the same in
> array1 and 2 { "_id" : ObjectId(“xx"), "name" :
> “Description”, “array1" : [ “”,”” ], “array2" : [
> “”,””,”” ], "tags" :
> [“searchword1”,”searchword2”,”searchword3"] }

You have a SQL table containing one column  which is that entire document
as a single JSON string.  In SQL you do SELECT json FROM table WHERE json
LIKE '%%'.  This will match the above as well as any record where 
is in the _id, description etc.

Then in your app code you deserialise the json and check the object for
your exact query.  ie you use SQL to make a first pass looking for
candidates, and use your own app specific code to narrow that down.

>> If you need a "proper" and convenient solution with SQLite then user 
>> defined functions and virtual tables are a great way of hiding what
>> you do under the hood wrt to how the data is actually stored.  You'll
>> have to write them to understand your data shape and queries.
> Not really an option as I don’t have that control over the sqlite lib
> compiled into the mobile app.

Really?  It is trivial on iOS, and not too hard on Android (need to use
the NDK).

There is no easy way out of this.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlNXEIoACgkQmOOfHg372QQWxgCbBREj/ZCFTmdbFf5fLQLAqkAt
NTUAn0TfMt3T9faOVbfgfGpzIoeg9ndY
=rkUB
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Petite Abeille

On Apr 22, 2014, at 2:24 PM, Neville Dastur  wrote:

> So wondering is anyone that has done this sort of thing and worked out the 
> best way?

Yes. Normalize your data. And that’s that:

http://www.schemamania.org/sql/#lists

Quoting a few words:

"Questions are frequently asked about table designs that are hopelessly wrong. 
The solution to the question is not to write the query, but to re-write the 
table, after which the query will practically write itself.

Perhaps the most egregious example is a column whose value is a list or, in SQL 
terms, a repeating group. The elements in the list are perhaps comma-separated, 
and some poor schlep has the task of selecting or joining on the the nth 
element in the list.”

Don’t be *that* schlep. 


But, if you like pain and suffering, medieval style, I got a hack for you 
involving virtual tables, full text search, contentless tables, and shadows 
even!


It goes a bit like this:

sqlite> .head on
sqlite> select * from json_text where array1 match '';
array1|array2|tags
,|,,|searchword1,searchword2,searchword3


So, what’s json_text?

 A virtual, contentless, FTS4 table:

create virtual table json_text using fts4
(
  content='json',
  array1 text,
  array2 text,
  tags   text
);


Where does it get its content?

>From a regular table:

create table json
(
  _idtext,
  name   text,
  array1 text,
  array2 text,
  tags   text
);

insert
intojson
(
  _id,
  name,
  array1,
  array2,
  tags
)
values  (
  'xx',
  'Description',
  ',',
  ',,',
  'searchword1,searchword2,searchword3'
);

insert
intojson
(
  _id,
  name,
  array1,
  array2,
  tags
)
values  (
  'yyy',
  'Description',
  ',',
  ',,',
  'tag1,tag2,tag3'
);


See the doc for details about the full setup:

https://www.sqlite.org/fts3.html#section_6_2_1


What about the shadows you may ask?

Here you go:

create virtual table if not exists json_term using fts4aux( json_text );

sqlite> select * from json_term order by 1 limit 10;
term|col|documents|occurrences
|*|1|1
|0|1|1
|*|1|1
|0|1|1
|*|1|1
|1|1|1
|*|1|1
|1|1|1
|*|1|1
|1|1|1

All the text fields, nicely tokenized, and for your to use and abuse.

More info:

https://www.sqlite.org/fts3.html#fts4aux


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Neville Dastur

On 22 Apr 2014, at 21:58, Roger Binns  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 22/04/14 05:24, Neville Dastur wrote:
>> I am looking for some advice on storing and searching data that comes
>> from an external JSON source and needs to be stored on device in a
>> Sqlite3 database.
> 
> Your data is from MongoDB :)  Note they do have an extended JSON to deal
> with types like ObjectId, binary and dates:
Yes, it is. But I control the document structure and it does not have any types 
that can’t be converted easily after a pull over the net.

> 
> http://docs.mongodb.org/manual/reference/mongodb-extended-json/#bson-data-types-and-associated-representations
> 
>> On device I need to be able to retrieve the entire “document” and
>> display all the array values,
> 
> How much data will there be overall and how performant does all this have
> to be?
10,000s or records

> 
>> The options i see are to store the array fields as pipe separated
>> values and then use LIKE “|%|” to search items.
> 
> That still requires scanning every record as you aren't doing a match from
> the start of the value.
By this, you mean searching like this is “expensive”

> 
> If you need performance then you need to normalize the data, or use a
> database that supports lists like postgres.
Not an option as the sqlite bit is on smartphones.

> 
> If you need something for not too much data then just be simple - have a
> single column which is the JSON text of the whole document.  You can still
> use LIKE to find records of interest, deserialise back to JSON and do the
> final filtering/querying/sorting there.
Not sure how this helps find say  in array2 alone. Note I have changed 
example so that there is a field value that is the same in array1 and 2
{
   "_id" : ObjectId(“xx"),
   "name" : “Description”,
   “array1" : [ “”,”” ],
   “array2" : [ “”,””,”” ],
   "tags" : [“searchword1”,”searchword2”,”searchword3"]
}

> 
> If you need a "proper" and convenient solution with SQLite then user
> defined functions and virtual tables are a great way of hiding what you do
> under the hood wrt to how the data is actually stored.  You'll have to
> write them to understand your data shape and queries.
Not really an option as I don’t have that control over the sqlite lib compiled 
into the mobile app.

> 
> There was a now defunct project UnQL that was mixing together JSON like
> data, SQL like queries and bit of SQLite.
I am aware. And I recall Richard being involved with this.

Neville
--
Surgeons Net Education: http://www.surgeons.org.uk
Clinical Software Solutions: http://www.clinsoftsolutions.com
Find our free and paid apps on the iTunes Apple store and Android Google Play 
store
LinkedIn: http://www.linkedin.com/profile/view?id=49617062

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient storage for arrays

2014-04-22 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 22/04/14 05:24, Neville Dastur wrote:
> I am looking for some advice on storing and searching data that comes
> from an external JSON source and needs to be stored on device in a
> Sqlite3 database.

Your data is from MongoDB :)  Note they do have an extended JSON to deal
with types like ObjectId, binary and dates:

http://docs.mongodb.org/manual/reference/mongodb-extended-json/#bson-data-types-and-associated-representations

> On device I need to be able to retrieve the entire “document” and
> display all the array values,

How much data will there be overall and how performant does all this have
to be?

> The options i see are to store the array fields as pipe separated
> values and then use LIKE “|%|” to search items.

That still requires scanning every record as you aren't doing a match from
the start of the value.

If you need performance then you need to normalize the data, or use a
database that supports lists like postgres.

If you need something for not too much data then just be simple - have a
single column which is the JSON text of the whole document.  You can still
use LIKE to find records of interest, deserialise back to JSON and do the
final filtering/querying/sorting there.

If you need a "proper" and convenient solution with SQLite then user
defined functions and virtual tables are a great way of hiding what you do
under the hood wrt to how the data is actually stored.  You'll have to
write them to understand your data shape and queries.

There was a now defunct project UnQL that was mixing together JSON like
data, SQL like queries and bit of SQLite.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlNW19UACgkQmOOfHg372QTNvgCgltU2vNzzVuRfTuPdc2+59VXS
h9sAnjya0jcj32aC3nYDA4Myv5x4tyxi
=fiQU
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Most efficient storage for arrays

2014-04-22 Thread Neville Dastur
Hi all

I am looking for some advice on storing and searching data that comes from an 
external JSON source and needs to be stored on device in a Sqlite3 database.

The JSON records are formatted as (simplified):
{
"_id" : ObjectId(“xx"),
"name" : “Description”,
“array1" : [ “”,”” ],
“array2" : [ “”,””,”” ],
"tags" : [“searchword1”,”searchword2”,”searchword3"]
}

On device I need to be able to retrieve the entire “document” and display all 
the array values, but I also need to be able to search the arrays and tags. For 
the array items I will need to be able to search on items just in array1 and 
for other search types on items in array1 or array2

The options i see are to store the array fields as pipe separated values and 
then use LIKE “|%|” to search items.

Alternatively, again store the array as pipe separated, but then also have 
other tables with columns id, array_item linked back to the main table. Then 
search using joins. However I can see this getting complicated when it comes to 
the two different search strategies detailed above. Might be mitigated by 
having just one extra table e.g. search_helper with id, parent_array, 
array_item then parent_array could be used to sub-filter.

So wondering is anyone that has done this sort of thing and worked out the best 
way?

Thanks

Neville
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users