[ https://issues.apache.org/jira/browse/DRILL-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Abhishek Girish updated DRILL-1750:
-----------------------------------
Attachment: 4.json, 3.json, 2.json, 1.json
> Querying directories with JSON files returns incomplete results
> ---------------------------------------------------------------
>
> Key: DRILL-1750
> URL: https://issues.apache.org/jira/browse/DRILL-1750
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - JSON
> Reporter: Abhishek Girish
> Assignee: Jacques Nadeau
> Attachments: 1.json, 2.json, 3.json, 4.json
>
>
> I observed that querying (select *) a directory of JSON files displays only
> the fields common to all of the files, whereas querying each file
> individually displays all of its fields. In some scenarios, querying the
> directory also crashes sqlline.
> The example below illustrates the issue:
> > select * from dfs.`/data/json/tmp/1.json`;
> +------------+------------+------------+
> | artist | track_id | title |
> +------------+------------+------------+
> | Jonathan King | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
> +------------+------------+------------+
> 1 row selected (1.305 seconds)
> > select * from dfs.`/data/json/tmp/2.json`;
> +------------+------------+------------+------------+
> | artist | timestamp | track_id | title |
> +------------+------------+------------+------------+
> | Supersuckers | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+------------+------------+
> 1 row selected (0.105 seconds)
> > select * from dfs.`/data/json/tmp/3.json`;
> +------------+------------+------------+
> | timestamp | track_id | title |
> +------------+------------+------------+
> | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+------------+
> 1 row selected (0.083 seconds)
> > select * from dfs.`/data/json/tmp/4.json`;
> +------------+------------+
> | track_id | title |
> +------------+------------+
> | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+
> 1 row selected (0.076 seconds)
> > select * from dfs.`/data/json/tmp`;
> +------------+------------+
> | track_id | title |
> +------------+------------+
> | TRAAAQN128F9353BA0 | Double Wide |
> | TRAAAQN128F9353BA0 | Double Wide |
> | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
> | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+
> 4 rows selected (0.121 seconds)
> A JVM crash occurs intermittently:
> > select * from dfs.`/data/json/tmp`;
> +------------+------------+------------+
> | timestamp | track_id | title |
> +------------+------------+------------+
> | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00007f3cb99be053, pid=13943, tid=139898808436480
> #
> # JRE version: OpenJDK Runtime Environment (7.0_65-b17) (build 1.7.0_65-mockbuild_2014_07_16_06_06-b00)
> # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # V [libjvm.so+0x932053]
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /tmp/jvm-13943/hs_error.log
> #
> # If you would like to submit a bug report, please include
> # instructions on how to reproduce the bug and visit:
> # http://icedtea.classpath.org/bugzilla
> #
> Aborted
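
For anyone trying to reproduce this without the attachments, here is a minimal Python sketch. It writes four single-record files whose fields match the query output in the report, then prints the fields common to all files next to the full set of fields. The record contents are reconstructed from that output, so the real attachments may differ in field order or formatting. The helper names (write_files, common_fields, all_fields) are hypothetical and not part of Drill.

```python
import json
import tempfile
from pathlib import Path

# Records reconstructed from the sqlline output in the report (an assumption;
# the actual attachments may differ in field order or formatting).
RECORDS = {
    "1.json": {"artist": "Jonathan King",
               "track_id": "TRAAAEA128F935A30D",
               "title": "I'll Slap Your Face (Entertainment USA Theme)"},
    "2.json": {"artist": "Supersuckers",
               "timestamp": "2011-08-01 20:30:17.991134",
               "track_id": "TRAAAQN128F9353BA0",
               "title": "Double Wide"},
    "3.json": {"timestamp": "2011-08-01 20:30:17.991134",
               "track_id": "TRAAAQN128F9353BA0",
               "title": "Double Wide"},
    "4.json": {"track_id": "TRAAAQN128F9353BA0",
               "title": "Double Wide"},
}


def write_files(directory):
    """Write each reconstructed record to its own JSON file."""
    for name, record in RECORDS.items():
        (Path(directory) / name).write_text(json.dumps(record) + "\n")


def field_sets(directory):
    """Return the set of top-level field names found in each JSON file."""
    paths = sorted(Path(directory).glob("*.json"))
    return [set(json.loads(p.read_text())) for p in paths]


def common_fields(directory):
    """Fields present in every file (what the directory query returned)."""
    return set.intersection(*field_sets(directory))


def all_fields(directory):
    """Fields present in at least one file (shown for comparison)."""
    return set.union(*field_sets(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_files(tmp)
        print("common:", sorted(common_fields(tmp)))
        print("all:   ", sorted(all_fields(tmp)))
```

Copying the generated files into a directory such as /data/json/tmp and running the queries from the report should then exercise the same code path, assuming the reconstructed records are close enough to the originals.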