On Mar 26, 2019, at 8:35 AM, Wout Mertens <wout.mert...@gmail.com> wrote: > > what would be the best way to answer these, given `CREATE TABLE foo(id > TEXT, json JSON);` and json is always a json object: > > - all rows with a given key bar > - SELECT * FROM foo WHERE json_extract(json, '$.bar') IS NOT NULL; > - all rows where there are only any of the given keys a,b in the object > - SELECT * FROM foo WHERE json_remove(json,'$.a','$.b') ='{}'; > - all rows where there are all the given keys a,b and no others in the > object > - SELECT * FROM foo WHERE json_remove(json,'$.a','$.b') ='{}' and > json_extract(json, '$.a') IS NOT NULL and json_extract(json, > '$.b') IS NOT > NULL;
The biggest problem with these isn’t the use of JSON per se, it’s that each one is unindexed, so they’ll all be a full table scan. That makes your use of SQLite here scarcely better than storing all the data in a single JSON object, stored in a simple text file, querying it with something like jq. If we’re talking about hundreds or a few thousand records, the parsing cost probably doesn’t matter, but if we’re talking millions or billions of records, it’s a pretty expensive way to go. You should be able to create indices on expressions like “json_extract(json, '$.bar')”, but you’ll want to test it to find out whether the space it takes is worth the speed increase. You could also think about partially denormalizing the data: on record insertion, pre-parse the JSON and extract the key fields as SQLite fields. Even better would be to simply choose not to store all of your data in JSON form: keep the key fields in normal SQLite fields. Then you can do indexed lookups on the actual records you require and do JSON extraction only on those records for each query. > these queries seem pretty onerous to me SQLite’s JSON extension is pretty quick. A recent test I did here did a full table scan with JSON parsing on each of ~10000 records in about 80ms. On the one hand, that’s impressively fast, but on the other, it’s up in human-scale time, so I ended up caching the query results for a few minutes to avoid making this expensive query too often. Keep in mind that there’s nothing you can’t encode in plain old SQLite, in principle. JSON is a luxury, not an absolute requirement. Transforming the data to be more relational than hierarchical could be a very worthwhile investment, if you have enough records or a low enough latency budget. _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users