[ 
https://issues.apache.org/jira/browse/DRILL-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

aditya menon updated DRILL-4102:
--------------------------------
    Description: 
I tried to analyse a JSON file that had the following (sample) structure:

{code:json}
{
    "Key1": {
      "htmltags": "<htmltag attr1='bravo' /><htmltag attr2='delta' /><htmltag attr3='charlie' />"
    },
    "Key2": {
      "htmltags": "<htmltag attr1='kilo' /><htmltag attr2='lima' /><htmltag attr3='mike' />"
    },
    "Key3": {
      "htmltags": "<htmltag attr1='november' /><htmltag attr2='foxtrot' /><htmltag attr3='sierra' />"
    }
}
{code}

(Apologies for the obfuscation, I am unable to publish the original dataset. 
But the structure is exactly the same. Note especially how the keys and other 
data points *differ* in some places, and remain identical in others.)

When I run {code:sql}SELECT * FROM DataFile.json{code}, I get a single row 
listed under three columns: `"<htmltag attr1='bravo' /><htmltag 
attr2='delta' /><htmltag attr3='charlie' />"` (i.e., only the entry 
`Key1.htmltags`).

Ideally, I should see three rows, each with entries from Key1..Key3, listed 
under the correct respective column.
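
A possible workaround, offered as a sketch rather than a confirmed fix: if the JSON reader treats each top-level object as one row and its keys as columns, then reshaping the file into one object per top-level key should produce the three rows expected above. The helper names `to_records` and `to_ndjson` and the added `key` field are hypothetical, chosen only for illustration, and the code assumes every top-level value is itself an object.

```python
import json

# Sample document with the same shape as the (obfuscated) data above.
sample = {
    "Key1": {"htmltags": "<htmltag attr1='bravo' /><htmltag attr2='delta' /><htmltag attr3='charlie' />"},
    "Key2": {"htmltags": "<htmltag attr1='kilo' /><htmltag attr2='lima' /><htmltag attr3='mike' />"},
    "Key3": {"htmltags": "<htmltag attr1='november' /><htmltag attr2='foxtrot' /><htmltag attr3='sierra' />"},
}


def to_records(document: dict) -> list:
    """Return one flat record per top-level key.

    Assumes each top-level value is a JSON object (a dict); the original
    key is preserved in a hypothetical "key" field.
    """
    return [{"key": name, **fields} for name, fields in document.items()]


def to_ndjson(document: dict) -> str:
    """Serialize the records as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(record) for record in to_records(document))


records = to_records(sample)
ndjson = to_ndjson(sample)
print(ndjson)
```

Saving the printed text as a separate `.json` file and querying that file instead would then be one way to check whether each former top-level key appears as its own row.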

  was:
I tried to analyse a JSON file that had the following (sample) structure:

{code:json}
{
    "Key1": {
      "htmltags": "<htmltag attr1='bravo' /><htmltag attr2='delta' /><htmltag attr3='charlie' />"
    },
    "Key2": {
      "htmltags": "<htmltag attr1='kilo' /><htmltag attr2='lima' /><htmltag attr3='mike' />"
    },
    "Key3": {
      "htmltags": "<htmltag attr1='november' /><htmltag attr2='foxtrot' /><htmltag attr3='sierra' />"
    }
}
{code}

(Apologies for the obfuscation, I am unable to publish the original dataset. 
But the structure is exactly the same. Note especially how the keys and other 
data points *differ* in some places, and remain identical in others.)

When I run a `SELECT * FROM DataFile.json` what I get is a single row listed 
under three columns: `"<htmltag attr1='bravo' /><htmltag attr2='delta' 
/><htmltag attr3='charlie' />"` [i.e., only the entry `Key1.htmltags`] .

Ideally, I should see three rows, each with entries from Key1..Key3, listed 
under the correct respective column.


> Only one row found in a JSON document that contains multiple items.
> -------------------------------------------------------------------
>
>                 Key: DRILL-4102
>                 URL: https://issues.apache.org/jira/browse/DRILL-4102
>             Project: Apache Drill
>          Issue Type: Bug
>         Environment: OS X, Drill embedded, v1.1.0 installed via HomeBrew
>            Reporter: aditya menon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
