Re: [sqlite] Most efficient storage for arrays
Ok, you can store a JSON like structure, or plist, coding is a creative discipline, not a nerd stuff. On Wed, Apr 23, 2014 at 12:16 PM, mm.w <0xcafef...@gmail.com> wrote: > Ok, you can store a JSON like structure, by reproducing the graf"" > adapted to SQL I wrote that (for fun, not only writing code for my day to > day job) [mostly that what core data does with all this mysterious ids, > that's just a transpose vertical to horizontal], there are > primitive-tables named by serializable types and the tree is represented by > addresses, like any recursive expat-like parser will do. > > > On Tue, Apr 22, 2014 at 10:13 PM, Keith Medcalf wrote: > >> >> >>> Your data is from MongoDB :) Note they do have an extended JSON to >> >>> deal with types like ObjectId, binary and dates: >> >> Yes, it is. But I control the document structure and it does not have >> >> any types that can't be converted easily after a pull over the net. >> > >> >Does that mean you can make it individual fields instead of a list? As >> >Petite Abeille pointed out you need to do SQL things the SQL way or you >> >will end up in a world of hurt. >> >> You need to use a RELATIONAL datastore in a RELATIONAL way. This applies >> whether your access method is SQL, ISAM, HIDAM, BDAM or VSAM. If the >> underlying datastore is not relational, you may also be able to use one (or >> more) of the mentioned access methods. However, then you will be able to >> store non-relational data in it and manupulate it easily. SQLite is a >> RELATIONAL datastore that uses the SQL access method. >> >> Please do not confuse the access method and the datastore. The datastore >> defines the properties of the filing cabinets, and the access method the >> language spoken by the filing clerk. >> >> Any filing clerk can be taught to use many underlying storage cabinet >> systems, just as any given storage cabinet system may be accessed by any >> number of clerks speaking different languages. >> >> >> >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Most efficient storage for arrays
Ok, you can store a JSON like structure, by reproducing the graf"" adapted to SQL I wrote that (for fun, not only writing code for my day to day job) [mostly that what core data does with all this mysterious ids, that's just a transpose vertical to horizontal], there are primitive-tables named by serializable types and the tree is represented by addresses, like any recursive expat-like parser will do. On Tue, Apr 22, 2014 at 10:13 PM, Keith Medcalf wrote: > > >>> Your data is from MongoDB :) Note they do have an extended JSON to > >>> deal with types like ObjectId, binary and dates: > >> Yes, it is. But I control the document structure and it does not have > >> any types that can't be converted easily after a pull over the net. > > > >Does that mean you can make it individual fields instead of a list? As > >Petite Abeille pointed out you need to do SQL things the SQL way or you > >will end up in a world of hurt. > > You need to use a RELATIONAL datastore in a RELATIONAL way. This applies > whether your access method is SQL, ISAM, HIDAM, BDAM or VSAM. If the > underlying datastore is not relational, you may also be able to use one (or > more) of the mentioned access methods. However, then you will be able to > store non-relational data in it and manupulate it easily. SQLite is a > RELATIONAL datastore that uses the SQL access method. > > Please do not confuse the access method and the datastore. The datastore > defines the properties of the filing cabinets, and the access method the > language spoken by the filing clerk. > > Any filing clerk can be taught to use many underlying storage cabinet > systems, just as any given storage cabinet system may be accessed by any > number of clerks speaking different languages. > > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Most efficient storage for arrays
>>> Your data is from MongoDB :) Note they do have an extended JSON to >>> deal with types like ObjectId, binary and dates: >> Yes, it is. But I control the document structure and it does not have >> any types that can't be converted easily after a pull over the net. > >Does that mean you can make it individual fields instead of a list? As >Petite Abeille pointed out you need to do SQL things the SQL way or you >will end up in a world of hurt. You need to use a RELATIONAL datastore in a RELATIONAL way. This applies whether your access method is SQL, ISAM, HIDAM, BDAM or VSAM. If the underlying datastore is not relational, you may also be able to use one (or more) of the mentioned access methods. However, then you will be able to store non-relational data in it and manupulate it easily. SQLite is a RELATIONAL datastore that uses the SQL access method. Please do not confuse the access method and the datastore. The datastore defines the properties of the filing cabinets, and the access method the language spoken by the filing clerk. Any filing clerk can be taught to use many underlying storage cabinet systems, just as any given storage cabinet system may be accessed by any number of clerks speaking different languages. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Most efficient storage for arrays
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 22/04/14 15:00, Neville Dastur wrote: > On 22 Apr 2014, at 21:58, Roger Binns wrote: >> Your data is from MongoDB :) Note they do have an extended JSON to >> deal with types like ObjectId, binary and dates: > Yes, it is. But I control the document structure and it does not have > any types that can’t be converted easily after a pull over the net. Does that mean you can make it individual fields instead of a list? As Petite Abeille pointed out you need to do SQL things the SQL way or you will end up in a world of hurt. >> How much data will there be overall and how performant does all this >> have to be? > 10,000s or records Doesn't sound like much. >> That still requires scanning every record as you aren't doing a match >> from the start of the value. > By this, you mean searching like this is “expensive” You are searching within a column so an index can't be used - (eg looking for || anywhere within that column). That means every record has to be examined. Hence my queries about data size and performance requirements. You will need to normalize if you want SQLite's performance and use indexes so that columns contain exactly one value. >> If you need something for not too much data then just be simple - >> have a single column which is the JSON text of the whole document. >> You can still use LIKE to find records of interest, deserialise back >> to JSON and do the final filtering/querying/sorting there. > > Not sure how this helps find say in array2 alone. Note I have > changed example so that there is a field value that is the same in > array1 and 2 { "_id" : ObjectId(“xx"), "name" : > “Description”, “array1" : [ “”,”” ], “array2" : [ > “”,””,”” ], "tags" : > [“searchword1”,”searchword2”,”searchword3"] } You have a SQL table containing one column which is that entire document as a single JSON string. In SQL you do SELECT json FROM table WHERE json LIKE '%%'. This will match the above as well as any record where is in the _id, description etc. Then in your app code you deserialise the json and check the object for your exact query. ie you use SQL to make a first pass looking for candidates, and use your own app specific code to narrow that down. >> If you need a "proper" and convenient solution with SQLite then user >> defined functions and virtual tables are a great way of hiding what >> you do under the hood wrt to how the data is actually stored. You'll >> have to write them to understand your data shape and queries. > Not really an option as I don’t have that control over the sqlite lib > compiled into the mobile app. Really? It is trivial on iOS, and not too hard on Android (need to use the NDK). There is no easy way out of this. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1 iEYEARECAAYFAlNXEIoACgkQmOOfHg372QQWxgCbBREj/ZCFTmdbFf5fLQLAqkAt NTUAn0TfMt3T9faOVbfgfGpzIoeg9ndY =rkUB -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Most efficient storage for arrays
On Apr 22, 2014, at 2:24 PM, Neville Dastur wrote: > So wondering is anyone that has done this sort of thing and worked out the > best way? Yes. Normalize your data. And that’s that: http://www.schemamania.org/sql/#lists Quoting a few words: "Questions are frequently asked about table designs that are hopelessly wrong. The solution to the question is not to write the query, but to re-write the table, after which the query will practically write itself. Perhaps the most egregious example is a column whose value is a list or, in SQL terms, a repeating group. The elements in the list are perhaps comma-separated, and some poor schlep has the task of selecting or joining on the the nth element in the list.” Don’t be *that* schlep. But, if you like pain and suffering, medieval style, I got a hack for you involving virtual tables, full text search, contentless tables, and shadows even! It goes a bit like this: sqlite> .head on sqlite> select * from json_text where array1 match ''; array1|array2|tags ,|,,|searchword1,searchword2,searchword3 So, what’s json_text? A virtual, contentless, FTS4 table: create virtual table json_text using fts4 ( content='json', array1 text, array2 text, tags text ); Where does it get its content? >From a regular table: create table json ( _idtext, name text, array1 text, array2 text, tags text ); insert intojson ( _id, name, array1, array2, tags ) values ( 'xx', 'Description', ',', ',,', 'searchword1,searchword2,searchword3' ); insert intojson ( _id, name, array1, array2, tags ) values ( 'yyy', 'Description', ',', ',,', 'tag1,tag2,tag3' ); See the doc for details about the full setup: https://www.sqlite.org/fts3.html#section_6_2_1 What about the shadows you may ask? Here you go: create virtual table if not exists json_term using fts4aux( json_text ); sqlite> select * from json_term order by 1 limit 10; term|col|documents|occurrences |*|1|1 |0|1|1 |*|1|1 |0|1|1 |*|1|1 |1|1|1 |*|1|1 |1|1|1 |*|1|1 |1|1|1 All the text fields, nicely tokenized, and for your to use and abuse. More info: https://www.sqlite.org/fts3.html#fts4aux ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Most efficient storage for arrays
On 22 Apr 2014, at 21:58, Roger Binns wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > On 22/04/14 05:24, Neville Dastur wrote: >> I am looking for some advice on storing and searching data that comes >> from an external JSON source and needs to be stored on device in a >> Sqlite3 database. > > Your data is from MongoDB :) Note they do have an extended JSON to deal > with types like ObjectId, binary and dates: Yes, it is. But I control the document structure and it does not have any types that can’t be converted easily after a pull over the net. > > http://docs.mongodb.org/manual/reference/mongodb-extended-json/#bson-data-types-and-associated-representations > >> On device I need to be able to retrieve the entire “document” and >> display all the array values, > > How much data will there be overall and how performant does all this have > to be? 10,000s or records > >> The options i see are to store the array fields as pipe separated >> values and then use LIKE “|%|” to search items. > > That still requires scanning every record as you aren't doing a match from > the start of the value. By this, you mean searching like this is “expensive” > > If you need performance then you need to normalize the data, or use a > database that supports lists like postgres. Not an option as the sqlite bit is on smartphones. > > If you need something for not too much data then just be simple - have a > single column which is the JSON text of the whole document. You can still > use LIKE to find records of interest, deserialise back to JSON and do the > final filtering/querying/sorting there. Not sure how this helps find say in array2 alone. Note I have changed example so that there is a field value that is the same in array1 and 2 { "_id" : ObjectId(“xx"), "name" : “Description”, “array1" : [ “”,”” ], “array2" : [ “”,””,”” ], "tags" : [“searchword1”,”searchword2”,”searchword3"] } > > If you need a "proper" and convenient solution with SQLite then user > defined functions and virtual tables are a great way of hiding what you do > under the hood wrt to how the data is actually stored. You'll have to > write them to understand your data shape and queries. Not really an option as I don’t have that control over the sqlite lib compiled into the mobile app. > > There was a now defunct project UnQL that was mixing together JSON like > data, SQL like queries and bit of SQLite. I am aware. And I recall Richard being involved with this. Neville -- Surgeons Net Education: http://www.surgeons.org.uk Clinical Software Solutions: http://www.clinsoftsolutions.com Find our free and paid apps on the iTunes Apple store and Android Google Play store LinkedIn: http://www.linkedin.com/profile/view?id=49617062 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Most efficient storage for arrays
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 22/04/14 05:24, Neville Dastur wrote: > I am looking for some advice on storing and searching data that comes > from an external JSON source and needs to be stored on device in a > Sqlite3 database. Your data is from MongoDB :) Note they do have an extended JSON to deal with types like ObjectId, binary and dates: http://docs.mongodb.org/manual/reference/mongodb-extended-json/#bson-data-types-and-associated-representations > On device I need to be able to retrieve the entire “document” and > display all the array values, How much data will there be overall and how performant does all this have to be? > The options i see are to store the array fields as pipe separated > values and then use LIKE “|%|” to search items. That still requires scanning every record as you aren't doing a match from the start of the value. If you need performance then you need to normalize the data, or use a database that supports lists like postgres. If you need something for not too much data then just be simple - have a single column which is the JSON text of the whole document. You can still use LIKE to find records of interest, deserialise back to JSON and do the final filtering/querying/sorting there. If you need a "proper" and convenient solution with SQLite then user defined functions and virtual tables are a great way of hiding what you do under the hood wrt to how the data is actually stored. You'll have to write them to understand your data shape and queries. There was a now defunct project UnQL that was mixing together JSON like data, SQL like queries and bit of SQLite. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1 iEYEARECAAYFAlNW19UACgkQmOOfHg372QTNvgCgltU2vNzzVuRfTuPdc2+59VXS h9sAnjya0jcj32aC3nYDA4Myv5x4tyxi =fiQU -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Most efficient storage for arrays
Hi all I am looking for some advice on storing and searching data that comes from an external JSON source and needs to be stored on device in a Sqlite3 database. The JSON records are formatted as (simplified): { "_id" : ObjectId(“xx"), "name" : “Description”, “array1" : [ “”,”” ], “array2" : [ “”,””,”” ], "tags" : [“searchword1”,”searchword2”,”searchword3"] } On device I need to be able to retrieve the entire “document” and display all the array values, but I also need to be able to search the arrays and tags. For the array items I will need to be able to search on items just in array1 and for other search types on items in array1 or array2 The options i see are to store the array fields as pipe separated values and then use LIKE “|%|” to search items. Alternatively, again store the array as pipe separated, but then also have other tables with columns id, array_item linked back to the main table. Then search using joins. However I can see this getting complicated when it comes to the two different search strategies detailed above. Might be mitigated by having just one extra table e.g. search_helper with id, parent_array, array_item then parent_array could be used to sub-filter. So wondering is anyone that has done this sort of thing and worked out the best way? Thanks Neville ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users