Hi Vimal,

Drill does not support JSON Path. Instead, Drill attempts to read your JSON
into records which you can then manipulate in SQL. Drill supports JSON
structure to some degree: nested records, arrays of a single type (with no
nulls.) More recent versions do provide a way to ignore some part of the
JSON, such as ignoring the message body of a REST response, to focus just
on the "payload" portion of the response.

It is possible to add JSON Path by creating a new format plugin. The
simplest JSON path implementations load your data into memory to simplify
path queries. Since Drill queries data at scale, if your files are large,
loading the entire file can be problematic. Instead, you'd want a streaming
path solution, or load a single object at a time and apply path rules to
that.

Another consideration is that using path rules on each record can be
expensive and is a cost paid for every query, which will slow performance.
If you will query the data multiple times, you may find it more effective
to perform an ETL from the JSON format into a Parquet as a separate step,
then query the Parquet format to get good query performance.

Thanks,

- Paul


On Mon, Sep 14, 2020 at 7:29 AM Vimal Jain <vkj...@gmail.com> wrote:

> Hi There,
> I am using Drill 1.17.
> I have complex json data in AWS S3 to query.
> I am looking to see if there is already a support in drill to query using
> JSON Path ( similar to XPath for XML  , for ref -
> https://goessner.net/articles/JsonPath/ )
> If it's not possible by default , is there we can make use of it ? ( like
> UDF etc ? )
>
> *Thanks and Regards,*
> *Vimal Jain*
>

Reply via email to