[jira] [Commented] (DRILL-5033) Query on JSON that has null as value for each key
[ https://issues.apache.org/jira/browse/DRILL-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652985#comment-17652985 ]

ASF GitHub Bot commented on DRILL-5033:
---

cgivre commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367677557

   @unical1988
   You actually don't have to modify the code to get this data to read properly. As I mentioned on the user group, the easiest way would probably be to provide a schema. The good news is that you can do this at query time. Take a look here:
   https://drill.apache.org/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter

   An example query might be:

   ```sql
   select *
   from table(dfs.tmp.`file.json`(
     schema => 'inline=(col0 varchar, col1 date properties {`drill.format` = `yyyy-MM-dd`}) properties {`drill.strict` = `false`}'))
   ```

> Query on JSON that has null as value for each key
> -
>
>                 Key: DRILL-5033
>                 URL: https://issues.apache.org/jira/browse/DRILL-5033
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>   Affects Versions: 1.9.0
>            Reporter: Khurram Faraaz
>            Priority: Major
>
> Drill 1.9.0 git commit ID : 83513daf
> Drill returns same result with or without `store.json.all_text_mode`=true
> Note that each key in the JSON has null as its value.
> [root@cent01 null_eq_joins]# cat right_all_nulls.json
> {
>      "intKey" : null,
>      "bgintKey": null,
>      "strKey": null,
>      "boolKey": null,
>      "fltKey": null,
>      "dblKey": null,
>      "timKey": null,
>      "dtKey": null,
>      "tmstmpKey": null,
>      "intrvldyKey": null,
>      "intrvlyrKey": null
> }
> [root@cent01 null_eq_joins]#
> Querying the above JSON file results in null as query result.
>  -  We should see each of the keys in the JSON as a column in query result.
>  -  And in each column the value should be a null value.
> Current behavior does not look right.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
> +------+
> | *    |
> +------+
> | null |
> +------+
> 1 row selected (0.313 seconds)
> {noformat}
> Adding comment from [~julianhyde]
> IMHO it is similar but not the same as DRILL-1256. Worth logging an issue and
> let [~jnadeau] (or someone) put on the record what should be the behavior of
> an empty record (empty JSON map) when it is top-level (as in this case) or in
> a collection.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
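
Applied to the file from the issue, the query-time schema suggested in the comment above might look like the sketch below. It is untested and rests on several assumptions: that `right_all_nulls.json` is in the `dfs.tmp` workspace (as the sqlline prompt in the issue suggests), that the JSON reader in the Drill version in use honours a provided schema, and that the column types, which are guessed from the key names only, match what the reporter intended. The two interval keys are left out because nothing in the thread shows how they would be declared.

```sql
-- Hypothetical adaptation of the example query to right_all_nulls.json.
-- Column types are guesses based on the key names; they are not stated in the issue.
select *
from table(dfs.tmp.`right_all_nulls.json`(
  schema => 'inline=(`intKey` int, `bgintKey` bigint, `strKey` varchar, `boolKey` boolean, `fltKey` float, `dblKey` double, `timKey` time, `dtKey` date, `tmstmpKey` timestamp) properties {`drill.strict` = `false`}'))
```

If the schema is applied, each declared key should come back as its own nullable column rather than the single `*` column shown in the issue. How the undeclared interval keys are treated under `drill.strict` = `false` would need to be checked against the Drill schema documentation.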
[jira] [Commented] (DRILL-5033) Query on JSON that has null as value for each key
[ https://issues.apache.org/jira/browse/DRILL-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652983#comment-17652983 ]

ASF GitHub Bot commented on DRILL-5033:
---

unical1988 commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367674759

   @vvysotskyi My attempt to deal with this bug is just a quick workaround, since the solution, as stated by @cgivre, might simply be to set the schema of the dataset being queried from the start (which requires non-trivial updates to the code).
[jira] [Commented] (DRILL-5033) Query on JSON that has null as value for each key
[ https://issues.apache.org/jira/browse/DRILL-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652918#comment-17652918 ]

ASF GitHub Bot commented on DRILL-5033:
---

cgivre commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367526640

   I copied the JIRA into the PR description.
[jira] [Commented] (DRILL-5033) Query on JSON that has null as value for each key
[ https://issues.apache.org/jira/browse/DRILL-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652916#comment-17652916 ]

Charles Givre commented on DRILL-5033:
--

[https://github.com/apache/drill/pull/2731]
[jira] [Commented] (DRILL-8033) uptake POI 5.1.0
[ https://issues.apache.org/jira/browse/DRILL-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652900#comment-17652900 ]

ASF GitHub Bot commented on DRILL-8033:
---

unical1988 opened a new pull request, #2731:
URL: https://github.com/apache/drill/pull/2731

   # [DRILL-8033](https://issues.apache.org/jira/browse/DRILL-8033): PR Title

   (Please replace `PR Title` with actual PR Title)

   ## Description

   (Please describe the change. If more than one ticket is fixed, include a reference to those tickets.)

   ## Documentation

   (Please describe user-visible changes similar to what should appear in the Drill documentation.)

   ## Testing

   (Please describe how this PR has been tested.)

> uptake POI 5.1.0
> 
>
>                 Key: DRILL-8033
>                 URL: https://issues.apache.org/jira/browse/DRILL-8033
>             Project: Apache Drill
>          Issue Type: Task
>          Components: Server
>            Reporter: PJ Fanning
>            Priority: Major
>
> POI 5.1.0 is released. excel-streaming-reader 3.2.0 is an additional upgrade
> that you'll need for POI 5.1.0 compatibility. I would expect that you won't
> need to make code changes as part of the upgrade.