I am trying to avoid reading through millions of records but would like to also 
avoid building a postgreSQL table by making less but an also significant number 
of calls to check column existence in the information_schema. Drill seems 
perfect for implicity discovering this. Right now, I am dumpy to file. It would 
be great to have a way to avoid declaring types when I do not know the name of 
every column in the data. The data is acquired from a source that only provides 
key value pairs when the values are not null.

Also, thanks.  
________________________________________
From: MattK [m...@hybriddba.com]
Sent: Wednesday, May 25, 2016 10:51 AM
To: user@drill.apache.org
Subject: Re: Apache Drill, Query PostgreSQL text or Jsonb as if it were from a 
json storage type?

Would the PostgreSQL function jsonb_to_recordset(jsonb) help in this
case?

It would return to Drill a table instead of a set of JSON objects, but
you would have to declare the types in the call.

On 25 May 2016, at 12:26, Andrew Evans wrote:

> Drill Members,
>
> I have an intriguing problem where I have hundreds of thousands or
> even millions of records stored in jsonb format in uneven JSon objects
> in a PostgreSQL database. I would like to be able to implicitly grab
> column names and data types using your tool since neither PostreSQL or
> Pentaho have this function. My current tool for this task is Scala but
> I am not dedicated to writing and keeping up with the JSon format
> specs; etc. in real time.
>
> Is it possible by conversion or otherwise to read the jsonb or a text
> string from PostgreSQL as if it were being queried from a "json" type
> storage instead of a "jdbc" type storage? If so, could I pull in
> different columns from PostgreSQL without some sort of key (with the
> original query)? Is there the ability to do some thing like SELECT
> pkey, split\_to\_columns(convert\_to(field,'JSON')) FROM
> postgres.mytable?
>
> For context, I have posted an example record below:
>
> pkey       ,  data
> 1423234, "{"loadCaseResult": {"case": {"type": "Adoption (Family)",
> "judge": null, "style": "Confidential", "caseId": "12", "events":
> {"event": [{"date": "08/06/2014", "type": "Request: Action",
> "comment": "900125-Request for Action                \\n Filed/Issued
> By"}}}
>
> Thank you for your time,
>
> Andrew Evans
> Java/Scala/Python Programmer at Hygenics Data,LLC

Reply via email to