I am trying to avoid reading through millions of records but would like to also avoid building a postgreSQL table by making less but an also significant number of calls to check column existence in the information_schema. Drill seems perfect for implicity discovering this. Right now, I am dumpy to file. It would be great to have a way to avoid declaring types when I do not know the name of every column in the data. The data is acquired from a source that only provides key value pairs when the values are not null.
Also, thanks. ________________________________________ From: MattK [m...@hybriddba.com] Sent: Wednesday, May 25, 2016 10:51 AM To: user@drill.apache.org Subject: Re: Apache Drill, Query PostgreSQL text or Jsonb as if it were from a json storage type? Would the PostgreSQL function jsonb_to_recordset(jsonb) help in this case? It would return to Drill a table instead of a set of JSON objects, but you would have to declare the types in the call. On 25 May 2016, at 12:26, Andrew Evans wrote: > Drill Members, > > I have an intriguing problem where I have hundreds of thousands or > even millions of records stored in jsonb format in uneven JSon objects > in a PostgreSQL database. I would like to be able to implicitly grab > column names and data types using your tool since neither PostreSQL or > Pentaho have this function. My current tool for this task is Scala but > I am not dedicated to writing and keeping up with the JSon format > specs; etc. in real time. > > Is it possible by conversion or otherwise to read the jsonb or a text > string from PostgreSQL as if it were being queried from a "json" type > storage instead of a "jdbc" type storage? If so, could I pull in > different columns from PostgreSQL without some sort of key (with the > original query)? Is there the ability to do some thing like SELECT > pkey, split\_to\_columns(convert\_to(field,'JSON')) FROM > postgres.mytable? > > For context, I have posted an example record below: > > pkey , data > 1423234, "{"loadCaseResult": {"case": {"type": "Adoption (Family)", > "judge": null, "style": "Confidential", "caseId": "12", "events": > {"event": [{"date": "08/06/2014", "type": "Request: Action", > "comment": "900125-Request for Action \\n Filed/Issued > By"}}} > > Thank you for your time, > > Andrew Evans > Java/Scala/Python Programmer at Hygenics Data,LLC