[jira] [Created] (DRILL-4710) Document Drill's JSON processing rules

Paul Rogers (JIRA) Mon, 06 Jun 2016 11:02:38 -0700

Paul Rogers created DRILL-4710:
----------------------------------

             Summary: Document Drill's JSON processing rules
                 Key: DRILL-4710
                 URL: https://issues.apache.org/jira/browse/DRILL-4710
             Project: Apache Drill
          Issue Type: Improvement
          Components: Documentation
            Reporter: Paul Rogers
            Priority: Minor



One of Drill's key benefits is the ability to query JSON-formatted data, and 
much great work has gone into that support. But unless you happen to be a Drill 
developer, the details of exactly how Drill handles various JSON formats can be 
hard to find.

We should document how Drill handles various JSON scenarios, starting with the 
two query styles:

* SELECT * (schema inferred)
* SELECT a, b, c (schema implied by query)

And various JSON structures:

* Top-level structure: a list of maps. (Can we handle an array of maps? A list 
of scalars?)
* Changes of the top-level map structure across rows.
** New field appears later in the file. (Was {a: 1, b: "s"}, now is {a: 1, b: 
"s", c: 10}.)
** Fields disappear later in the file
** Fields change type
** Start of file has many nulls for a field, later in file has non-null values.
* How Drill handles array fields
** Array field is null: { a: [10, 20]}, { a: null }
** Array contains nulls: { a: [10, null, 20] }
** Array contains single scalar type (number or string)
** Array contains multiple scalar types (number and string)
** Array contains structured types (array, map)
* How Drill handles nested maps
** Explicit select: a, b.c, b.d from {a: 1, b: { c: "s", d: 10 }}
** Implicit select: *
** How data is delivered to Drill client
** How data is delivered to JDBC/ODBC clients
* Size issues
** Very large records (what is max size?)
** Very large strings
** Very large arrays

Along with any other detailed information not covered by the above list.
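
To make the scenarios above concrete, here is a minimal Python sketch that 
writes one small test file per scenario, with one JSON object per line. The 
scenario names, the write_scenarios and example_queries helpers, and the use of 
the dfs storage plugin to read a local path are illustrative assumptions, not 
part of Drill. Note that the records are written as strict JSON (quoted keys), 
unlike the shorthand used in the list.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical scenario names mapped to records taken from the list above.
SCENARIOS = {
    "field_appears": [{"a": 1, "b": "s"}, {"a": 1, "b": "s", "c": 10}],
    "field_disappears": [{"a": 1, "b": "s", "c": 10}, {"a": 1, "b": "s"}],
    "field_changes_type": [{"a": 1}, {"a": "one"}],
    "leading_nulls": [{"a": None}] * 3 + [{"a": 10}],
    "null_array": [{"a": [10, 20]}, {"a": None}],
    "array_with_nulls": [{"a": [10, None, 20]}],
    "mixed_scalar_array": [{"a": [10, "s"]}],
    "nested_map": [{"a": 1, "b": {"c": "s", "d": 10}}],
}


def write_scenarios(out_dir):
    """Write each scenario as a file of one JSON object per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, records in SCENARIOS.items():
        path = out / f"{name}.json"
        path.write_text("".join(json.dumps(r) + "\n" for r in records))
        paths[name] = path
    return paths


def example_queries(path):
    """Build the two query styles as SQL text (assumes the dfs plugin can read path)."""
    table = f"dfs.`{path}`"
    return {
        # Implicit select: schema inferred from the data.
        "implicit": f"SELECT * FROM {table}",
        # Explicit select: nested fields are reached through a table alias.
        "explicit": f"SELECT t.a, t.b.c, t.b.d FROM {table} t",
    }


if __name__ == "__main__":
    paths = write_scenarios(tempfile.mkdtemp())
    for name, path in paths.items():
        print(name, path)
    for style, sql in example_queries(paths["nested_map"]).items():
        print(style, sql)
```

Running each query against each file and recording the result (or the error) 
would give the raw material for the documentation requested here. The sketch 
only generates inputs and query text; it does not say how Drill behaves.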



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
