David Bicking wrote: > I have a data file with JSON data, and I want to get the data into an > sqlite database so I can run queries against it. It is not a large file, > only about 600 records in the main table. I've never worked with JSON > before, and really don't want to write my own parser to extract he data.
JSON and SQL have two fundamentally different data models. It is trivial to map from SQL to JSON, but going the other way may take careful thought. > Is there a tool out there that will more or less automate the task for > me? Hopefully free, as no one is paying me to do this. (The other > volunteers have maybe a dozen records in total and are doing their > reports by hand. ) The automation is at a lower level than you seem to realize. JSON I/O is a solved problem, but actually doing anything with that data, such as transforming it into a relational database form, is up to you. > A cursory look at the data: there is a top level "event" table. "Group" > and "venue" tables that are one-to-one with the event table, and a > "hosts" tables with many hosts to the event table. JSON doesn't have tables, nor does JSON pre-declare a schema. JSON intermixes the schema with the data, and the schema is allowed to vary over the course of the document. Here's some code I wrote to process JSON: https://wiki.tcl-lang.org/page/Alternative+JSON JSON has arrays, objects, strings, numbers, and literals. An array is a list of values of any type. An object is a mapping from strings to values of any type. The strings really ought to be unique, or results will be unpredictable. Strings and numbers are pretty much what you expect. Literals are false, true, or null. For example, here's JSON that will drop into SQL fairly easily: [{"name": "Andy", id: 42}, {"name": "David", id: 999}] Just do: CREATE TABLE TableNameGoesHere(name, id); INSERT INTO TableNameGoesHere VALUES ('Andy', 42), ('David', 999); But JSON allows anything, so there is no way to automatically extract a useful schema: ["Andy", 42, "David", 999, {"event": [1, 2, "345"]}, {"group": {}}] Here's what you need to do. Look through your data in detail and devise an SQL schema that captures it, or at least the subset you care about. Then take any JSON library (I recommend the one I wrote, since it's in Tcl and the Tcl binding for SQLite is the easiest binding) and use it to write a program that reads your JSON and inserts it into your SQL database. You will also have to think about whether you blow away the SQL database every time or you instead try to merge in new data, and if the old data needs to be kept, deleted, or archived. I don't think this is necessarily a difficult task, but it is definitely not one that can be performed without a serious analysis which cannot be automated. > Oh, to complicate things, the source data is not valid JSON, as the > upstream source did not escape quote marks within the text. So is there > a tool that can clean up the quotes, hopefully there won't be many bad > quotes... maybe three or four in the 600 record. Then it's not really JSON and can't be processed with true-blue JSON tools. Everything you're saying is calling for a custom tool. Custom tools do not have to be hard or expensive though. But in the case of quotes not being escaped, that all by itself could be difficult to automate, since heuristics will need to be applied to figure out whether any given quote mark is a string delimiter. Human intervention might be required. ------------------------------------------- CONFIDENTIALITY NOTICE: This email and any attachments are for the sole use of the intended recipient and may contain material that is proprietary, confidential, privileged or otherwise legally protected or restricted under applicable government laws. Any review, disclosure, distributing or other use without expressed permission of the sender is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies without reading, printing, or saving.. _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users