[ 
https://issues.apache.org/jira/browse/DRILL-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293362#comment-16293362
 ] 

Paul Rogers edited comment on DRILL-6035 at 12/16/17 1:19 AM:
--------------------------------------------------------------

h4. JSON Projection Pushdown

The JSON reader supports "projection push-down." The rules are simple in 
concept, but complex in the details.

The project list comes from the query. In its simplest form, it is the list of 
columns following the {{SELECT}} keyword:

{code}
SELECT a, b.c, d[0] FROM ...
{code}
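
The three items in that project list have three different shapes: a plain column, a map member, and an array element. As a rough illustration only (this is not Drill's parser; the names {{classify}} and {{parse_project_list}} and the shape labels are hypothetical), the shapes can be told apart like this:

```python
import re

# Hypothetical pattern: a name, optionally followed by ".member" or "[index]".
_ITEM = re.compile(r"^(?P<name>\w+)(?:\.(?P<member>\w+)|\[(?P<index>\d+)\])?$")


def classify(item):
    """Return (shape, column, detail) for one project-list item."""
    # Back-ticks only quote names, so drop them before matching.
    match = _ITEM.match(item.replace("`", "").strip())
    if match is None:
        raise ValueError(f"unsupported projection item: {item!r}")
    if match.group("member") is not None:
        return ("map_member", match.group("name"), match.group("member"))
    if match.group("index") is not None:
        return ("array_element", match.group("name"), int(match.group("index")))
    return ("column", match.group("name"), None)


def parse_project_list(select_list):
    """Split the text after SELECT on commas and classify each item."""
    return [classify(part) for part in select_list.split(",")]


shapes = parse_project_list("a, b.c, d[0]")
```

The sketch handles only the one-level forms used in this comment; nested paths such as {{a.b.c}} or {{a[0].b}} are deliberately left out.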

|| Projection || JSON Value of "a" || Drill Result ||
| `a` | Scalar | Projects `a` |
| | Array | Projects all elements of `a` |
| | Object | Projects all members of `a` |
| | Missing | Creates a {{Nullable INT}} (Drill 1.12) or {{Nullable VARCHAR}} 
(Drill 1.13) column |
| | {{null}} | As above |
| `a`.`b` | Scalar | Error (`a` must be an object) |
| | Scalar array | Error (`a` must be a map or an array of maps) |
| | Object that contains `b` | Projects just `b` from object `a` |
| | Object that does not contain `b` | Projects a nullable column `b` within 
map `a` |
| | Object array that contains `b` | Projects just `b` from the objects within 
array `a` |
| | Object array that does not contain `b` | Projects a nullable column `b` 
within the array of maps |
| | Missing | Projects a map `a` that contains a nullable column `b` |
| | {{null}} | As above |
| `a\[0]` | Scalar | Error (`a` must be an array) |
| | Scalar array | Projects just `a\[0]` as a scalar (the reader projects the 
entire array; a project operator pulls out the `a\[0]` element) |
| | Object | Error (`a` must be an array) |
| | Object array | Projects just object (map) `a\[0]` as described above |
| | {{null}} | The JSON reader creates an array of null values; a project operator pulls out `a\[0]` |
| | Missing | As above |
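
The table can also be read as a decision procedure over one JSON record. The Python sketch below restates the Drill 1.13 rows in that form; it models the table and is not Drill code. Every name in it is hypothetical, and two choices are assumptions the table does not settle: an empty array is treated as a scalar array, and an object array "contains `b`" if any of its objects does.

```python
class ProjectionError(ValueError):
    """Stands in for the errors listed in the table; not a Drill class."""


def value_kind(record, name):
    """Classify the JSON value of `name` using the table's categories."""
    if name not in record:
        return "missing"
    value = record[name]
    if value is None:
        return "null"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Assumption: only a non-empty array of objects is an object array.
        if value and all(isinstance(v, dict) for v in value):
            return "object array"
        return "scalar array"
    return "scalar"


def project_column(record, name):
    """Rows for projecting `a`."""
    kind = value_kind(record, name)
    if kind in ("missing", "null"):
        return "nullable VARCHAR column"  # Nullable INT in Drill 1.12
    if kind in ("scalar array", "object array"):
        return "all elements"
    return "all members" if kind == "object" else "scalar"


def project_member(record, name, member):
    """Rows for projecting `a`.`b`."""
    kind = value_kind(record, name)
    if kind in ("scalar", "scalar array"):
        raise ProjectionError(f"{name} must be a map or an array of maps")
    if kind in ("missing", "null"):
        return "map with nullable member"
    maps = [record[name]] if kind == "object" else record[name]
    where = "map" if kind == "object" else "array of maps"
    # Assumption: one object holding the member is enough to "contain" it.
    if any(member in m for m in maps):
        return f"member from {where}"
    return f"nullable member in {where}"


def project_element(record, name):
    """Rows for projecting `a\\[0]`."""
    kind = value_kind(record, name)
    if kind in ("scalar", "object"):
        raise ProjectionError(f"{name} must be an array")
    if kind in ("missing", "null"):
        return "element of an array of nulls"
    return "scalar element" if kind == "scalar array" else "map element"
```

For example, {{project_member(\{"a": \{"c": 1\}\}, "a", "b")}} lands on the "Object that does not contain `b`" row, while {{project_element(\{"a": 5\}, "a")}} raises the error from the "Scalar" row.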

Notes:

* The rules above are for Drill 1.13. Drill 1.12 and earlier behave differently 
and require investigation.
* The rules for null values are subtle. The type of the null is inferred from 
the project list in the case of a map (`a`.`b`) or an array (`a\[0]`). Previous 
sections described null handling for the {{SELECT *}} and {{SELECT `a`}} cases.
* The rules for projecting map columns apply to both arrays and single maps. 
(In Drill 1.12 and earlier, the two cases appear to have behaved differently.)


> Specify Drill's JSON behavior
> -----------------------------
>
>                 Key: DRILL-6035
>                 URL: https://issues.apache.org/jira/browse/DRILL-6035
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Assignee: Pritesh Maker
>
> Drill supports JSON as its native data format. However, experience suggests 
> that Drill may have limitations in the JSON that Drill supports. This ticket 
> asks to clarify Drill's expected behavior on various kinds of JSON.
> Topics to be addressed:
> * Relational vs. non-relational structures
> * JSON structures used in practice and how they map to Drill
> * Support for varying data types
> * Support for missing values, especially across files
> These topics are complex, hence the request to provide a detailed 
> specification that clarifies what Drill does and does not support (or what 
> it should and should not support).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
