[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345564#comment-16345564 ]

Alexander Malashevsky commented on DRILL-5919:
----------------------------------------------

*VERSION_VERIFIED:* 1.13.0-SNAPSHOT

The functionality has been verified. The following verifications were performed:
- Verify that store.json.writer.allow_nan_inf and store.json.reader.allow_nan_inf are set to true by default - *PASSED*;
- Verify that NaN/Infinity/-Infinity values are read correctly from a .json file when store.json.reader.allow_nan_inf = true - *PASSED*;
- Verify that the corresponding error message appears when selecting NaN/Infinity/-Infinity values with store.json.reader.allow_nan_inf = false - *PASSED*;
- Verify that NaN/Infinity/-Infinity values are written as literals to a .json file when store.json.writer.allow_nan_inf = true - *PASSED*;
- Verify that NaN/Infinity/-Infinity values are written as string values to a .json file when store.json.writer.allow_nan_inf = false - *PASSED*;
- Verify that NaN/Infinity/-Infinity values are processed properly in queries with UNION, UNION ALL, JOIN, ORDER BY, WHERE, GROUP BY etc. - 2 issues found -> DRILL-6121, DRILL-6122
- Verify convert_toJSON/convert_fromJSON functions - *PASSED*;
- Verify that NaN/Infinity/-Infinity values are processed properly by Math functions - 1 issue found -> DRILL-6120

> Add non-numeric support for JSON processing
> -------------------------------------------
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - JSON
> Affects Versions: 1.11.0
> Reporter: Volodymyr Tkach
> Assignee: Volodymyr Tkach
> Priority: Major
> Labels: doc-impacting, ready-to-commit
> Fix For: 1.13.0
>
>
> Add session options that allow Drill to work with non-standard JSON
> number literals such as NaN, Infinity and -Infinity. The options are
> enabled by default (see item 1 below); the user can toggle them during a
> working session.
> *For documentation*
> 1. Added two session options {{store.json.reader.allow_nan_inf}} and
> {{store.json.writer.allow_nan_inf}} that allow reading/writing NaN and
> Infinity as numbers. By default these options are set to true.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}}
> functions with a second, optional parameter that enables reading/writing
> NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse
> NaN as a number.
> {noformat}
> 3. Added unit tests, including tests for math functions.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
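The writer behaviour verified above (bare literals when store.json.writer.allow_nan_inf is true, quoted strings when it is false) can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class NanInfWriterSketch and the method render are hypothetical names invented here, and the code is not Drill's actual JSON writer.

```java
/** Hypothetical illustration only; this is not Drill's JSON writer. */
public class NanInfWriterSketch {

    /**
     * Renders a double as a JSON value.
     * With allowNanInf = true, NaN and the infinities are emitted as bare
     * (non-standard) literals; with allowNanInf = false they are quoted so
     * that the output remains standard JSON.
     */
    public static String render(double value, boolean allowNanInf) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            // Double.toString yields "NaN", "Infinity" or "-Infinity".
            String literal = Double.toString(value);
            return allowNanInf ? literal : "\"" + literal + "\"";
        }
        return Double.toString(value);
    }

    public static void main(String[] args) {
        System.out.println(render(Double.NaN, true));                // NaN
        System.out.println(render(Double.NaN, false));               // "NaN"
        System.out.println(render(Double.NEGATIVE_INFINITY, false)); // "-Infinity"
        System.out.println(render(1.5, false));                      // 1.5
    }
}
```

The boolean parameter stands in for the session option; in Drill itself the option is set per session rather than passed per value.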
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324331#comment-16324331 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1026
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323741#comment-16323741 ]

Arina Ielchiieva commented on DRILL-5919:
-----------------------------------------

[~parthc] it does not. DRILL-6018 was created as a future enhancement, to be done after this Jira is merged.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322969#comment-16322969 ]

Parth Chandra commented on DRILL-5919:
--------------------------------------

Does this fix require DRILL-6018?
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298562#comment-16298562 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/1026

@vladimirtkach thanks for adding tests for math functions. New changes look good to me.
@paul-rogers please take a look.

> Add non-numeric support for JSON processing
> -------------------------------------------
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - JSON
> Affects Versions: 1.11.0
> Reporter: Volodymyr Tkach
> Assignee: Volodymyr Tkach
> Labels: doc-impacting
> Fix For: 1.13.0
>
>
> Add session options that allow Drill to work with non-standard JSON
> number literals such as NaN, Infinity and -Infinity. By default these
> options will be switched off; the user will be able to toggle them during
> a working session.
> *For documentation*
> 1. Added two session options {{store.json.reader.non_numeric_numbers}} and
> {{store.json.writer.non_numeric_numbers}} that allow reading/writing NaN
> and Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}}
> functions with a second, optional parameter that enables reading/writing
> NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse
> NaN as a number.
> {noformat}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283686#comment-16283686 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user vladimirtkach commented on the issue:

    https://github.com/apache/drill/pull/1026

@paul-rogers Made changes, please review.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265061#comment-16265061 ]

Volodymyr Tkach commented on DRILL-5919:
----------------------------------------

The following math functions fail with a NumberFormatException, because Calcite and Drill currently handle the FLOAT and DOUBLE types using the BigDecimal class, which does not support NaN and Infinity values. I am investigating the problem and looking for a different way to handle NaN and Infinity values.

*Query example:*
_select sin(cast('NaN' as float)) from (values(1))_

* div
* divide
* add
* multiply
* tanh
* sin
* asin
* cos
* cot
* acos
* sqrt
* ceil
* negative
* castFLOAT4
* abs
* floor
* exp
* subtract
* sinh
* cbrt
* mod
* degrees
* trunc
* trunc
* casthigh
* log
* log
* power
* atan
* tan
* radians
* cosh
* round
* round
* convertToNullableFLOAT8
* convertToNullableFLOAT4
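The BigDecimal limitation mentioned in the comment above can be reproduced with the JDK alone. The sketch below is not Drill or Calcite code, and the class name BigDecimalNanDemo and the method fitsInBigDecimal are made up for illustration. It only shows that the java.math.BigDecimal constructor rejects NaN and the infinities with a NumberFormatException, while plain double arithmetic accepts them.

```java
import java.math.BigDecimal;

/** Illustration of JDK behaviour only; not code from Drill or Calcite. */
public class BigDecimalNanDemo {

    /** Returns true if the given double can be converted to a BigDecimal. */
    public static boolean fitsInBigDecimal(double value) {
        try {
            new BigDecimal(value);
            return true;
        } catch (NumberFormatException e) {
            // BigDecimal(double) throws for NaN and for both infinities.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsInBigDecimal(1.5));                      // true
        System.out.println(fitsInBigDecimal(Double.NaN));               // false
        System.out.println(fitsInBigDecimal(Double.POSITIVE_INFINITY)); // false

        // Primitive doubles propagate NaN instead of throwing.
        System.out.println(Double.isNaN(Math.sin(Double.NaN)));         // true
        System.out.println(Double.parseDouble("-Infinity"));            // -Infinity
    }
}
```

Any code path that routes a FLOAT or DOUBLE value through BigDecimal will therefore fail on these inputs, which is consistent with the NumberFormatException reported for the functions listed above.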
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250437#comment-16250437 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1026

On the two functions... Maybe just have one function that handles the NaN/Infinity case. As noted earlier, no matter what we do, JSON without these symbols will work. So, we need only consider JSON with the symbols. Either:

* The NaN/Infinity cases always work, or
* The NaN/Infinity cases work sometimes, fail others, depending on some option or argument.

I would vote for the first case: it is simpler. I just can't see how anyone would use Drill to validate JSON and would want a query to fail if it contains NaN or Infinity.

Can you suggest a case where failing the query would be of help to the user?
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250434#comment-16250434 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150689236

--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
     store.format: "parquet",
     store.hive.optimize_scan_with_native_readers: false,
     store.json.all_text_mode: false,
+    store.json.writer.non_numeric_numbers: false,
+    store.json.reader.non_numeric_numbers: false,
--- End diff --

OK. So, if the rest of Drill either does not support (or we don't know if it supports) NaN and Inf, should we introduce the change here that potentially leads to failures elsewhere?

Do we know if JDBC and ODBC support these values? (I suppose they do as they are features of Java's float/double primitives...)
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250432#comment-16250432 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150688949

--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
     store.format: "parquet",
     store.hive.optimize_scan_with_native_readers: false,
     store.json.all_text_mode: false,
+    store.json.writer.non_numeric_numbers: false,
+    store.json.reader.non_numeric_numbers: false,
--- End diff --

See discussion below. If they are off, then queries that use NaN and Infinity will fail until the user turns them on. Queries that don't use NaN and Infinity won't care. So, what is the advantage to failing queries unnecessarily?
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250433#comment-16250433 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150689617

--- Diff: contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java ---
@@ -73,6 +73,7 @@
   private final MongoStoragePlugin plugin;
   private final boolean enableAllTextMode;
+  private final boolean enableNonNumericNumbers;
--- End diff --

The JSON parser we use does use the term "non-numeric numbers", but not sure we want to keep that naming in Drill itself.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250431#comment-16250431 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150688725

--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
     store.format: "parquet",
     store.hive.optimize_scan_with_native_readers: false,
     store.json.all_text_mode: false,
+    store.json.writer.non_numeric_numbers: false,
+    store.json.reader.non_numeric_numbers: false,
--- End diff --

As noted in DRILL-5949, options like this should be part of the plugin config (as in CSV), not session options. For now, let's leave them as session options and I'll migrate them to plugin config options along with the others.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250256#comment-16250256 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user vladimirtkach commented on the issue:

    https://github.com/apache/drill/pull/1026

What do you think about having two functions instead: convertFromJSON and convertFromJSON plus some suffix? The second one would be able to convert NaN and Infinity.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249365#comment-16249365 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user vladimirtkach commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150500913

--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
     store.format: "parquet",
     store.hive.optimize_scan_with_native_readers: false,
     store.json.all_text_mode: false,
+    store.json.writer.non_numeric_numbers: false,
+    store.json.reader.non_numeric_numbers: false,
--- End diff --

I think we should stick to the JSON standard and leave them switched off by default. If the user gets an exception, we can show the name of the option they need to switch on.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249362#comment-16249362 ]

ASF GitHub Bot commented on DRILL-5919:
---------------------------------------

Github user vladimirtkach commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150500204

--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
     store.format: "parquet",
     store.hive.optimize_scan_with_native_readers: false,
     store.json.all_text_mode: false,
+    store.json.writer.non_numeric_numbers: false,
+    store.json.reader.non_numeric_numbers: false,
--- End diff --

Thanks, allow_nan_inf is definitely better.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249361#comment-16249361 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user vladimirtkach commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150500059

    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -502,6 +502,8 @@ drill.exec.options: {
         store.format: "parquet",
         store.hive.optimize_scan_with_native_readers: false,
         store.json.all_text_mode: false,
    +    store.json.writer.non_numeric_numbers: false,
    +    store.json.reader.non_numeric_numbers: false,
    --- End diff --

    No, we don't, and we should test them. I think we have to create a separate JIRA for testing math functions, because the code changes in this PR don't affect the logic of any math function, and testing math functions will surface other issues not connected with this functionality (for example, the BigInteger constructor doesn't accept NaN, Infinity and others).

> Add non-numeric support for JSON processing
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - JSON
> Affects Versions: 1.11.0
> Reporter: Volodymyr Tkach
> Assignee: Volodymyr Tkach
> Labels: doc-impacting
> Fix For: Future
>
> Add session options to allow Drill to work with non-standard JSON number literals such as NaN, Infinity, -Infinity. By default these options will be switched off; the user will be able to toggle them during a working session.
> *For documentation*
> 1. Added two session options {{store.json.reader.non_numeric_numbers}} and {{store.json.writer.non_numeric_numbers}} that allow reading/writing NaN and Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} functions by adding a second optional parameter that enables reading/writing NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse NaN as a number.
> {noformat}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
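The remark about the BigInteger constructor can be checked with the JDK alone. The following is a minimal sketch, not code from the pull request; the class name `NonNumericTypes` and its helper methods are made up for illustration. It shows that `double` parsing accepts the non-numeric literals while the arbitrary-precision types reject them, which is why math functions were proposed as a separate testing task.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper class, not part of Drill.
class NonNumericTypes {

    /** True if the text parses as a double; "NaN" and "Infinity" are accepted. */
    static boolean parsesAsDouble(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** True if the text is accepted by the BigInteger(String) constructor. */
    static boolean parsesAsBigInteger(String text) {
        try {
            new BigInteger(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** True if the double can be converted with the BigDecimal(double) constructor. */
    static boolean convertsToBigDecimal(double value) {
        try {
            new BigDecimal(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String text : new String[] {"NaN", "Infinity", "-Infinity", "42"}) {
            System.out.println(text
                + " double=" + parsesAsDouble(text)
                + " BigInteger=" + parsesAsBigInteger(text));
        }
    }
}
```

Any function that routes values through `BigInteger` or `BigDecimal` would therefore need its own handling for these values, independent of the JSON reader and writer options.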
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248269#comment-16248269 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1026

    Further, is the extra option to `convert_fromJSON` really needed? Can't we just accept `NaN` and `Infinity` by default?

    Consider: if the option is off by default, users without `NaN` or `Infinity` data will see no difference, but users with this data will get an error and have to hunt down the option to make their data work.

    If the option is on by default, users without `NaN` or `Infinity` data will still see no difference, and users with this data will have their queries work by default.

    So there seems to be no harm in turning `NaN` and `Infinity` support on by default.
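The argument about defaults can be made concrete with a small sketch. It is an illustration under stated assumptions, not Drill's reader: `NumberTokenParser` and `parseNumber` are hypothetical names, and `Double.parseDouble` is used as a stand-in for real JSON number parsing (it accepts some forms that JSON does not).

```java
// Hypothetical stand-in for a JSON number-token parser; not Drill code.
class NumberTokenParser {

    /**
     * Parses a number token. When allowNanInf is false, the three
     * non-standard literals are rejected; ordinary numbers are unaffected.
     */
    static double parseNumber(String token, boolean allowNanInf) {
        boolean nonNumeric = token.equals("NaN")
            || token.equals("Infinity")
            || token.equals("-Infinity");
        if (nonNumeric && !allowNanInf) {
            throw new IllegalArgumentException("Non-standard JSON number literal: " + token);
        }
        return Double.parseDouble(token);
    }

    public static void main(String[] args) {
        // Ordinary data: the flag makes no difference.
        System.out.println(parseNumber("1.5", false) == parseNumber("1.5", true));
        // NaN data: works only when the flag is on.
        System.out.println(Double.isNaN(parseNumber("NaN", true)));
    }
}
```

In this sketch the flag only changes the outcome for inputs that contain the non-standard literals, which is the observation behind the suggestion to enable the option by default.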
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248210#comment-16248210 ]

Paul Rogers commented on DRILL-5919:
---

As a reference, this change has been ported to the revised JSON reader created within the "Batch Size Control" project. PR to be issued in Drill 1.13.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248189#comment-16248189 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150359510

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java ---
    @@ -91,4 +92,60 @@ public void eval(){
         }
       }

    +  @FunctionTemplate(name = "convert_fromJSON", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL, isRandom = true)
    +  public static class ConvertFromJsonVarcharNonNumerics implements DrillSimpleFunc{
    +
    +    @Param VarCharHolder in;
    +    @Param BitHolder enableNonNumeric;
    +    @Inject DrillBuf buffer;
    +    @Workspace org.apache.drill.exec.vector.complex.fn.JsonReader jsonReader;
    +
    +    @Output ComplexWriter writer;
    +
    +    public void setup(){
    +      jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false,/* do not read numbers as doubles */
    --- End diff --

    See above.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248191#comment-16248191 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150360262

    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -502,6 +502,8 @@ drill.exec.options: {
         store.format: "parquet",
         store.hive.optimize_scan_with_native_readers: false,
         store.json.all_text_mode: false,
    +    store.json.writer.non_numeric_numbers: false,
    +    store.json.reader.non_numeric_numbers: false,
    --- End diff --

    Any reason these are not enabled by default?
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248187#comment-16248187 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150359625

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertTo.java ---
    @@ -90,7 +91,71 @@ public void eval(){

           java.io.ByteArrayOutputStream stream = new java.io.ByteArrayOutputStream();
           try {
    -        org.apache.drill.exec.vector.complex.fn.JsonWriter jsonWriter = new org.apache.drill.exec.vector.complex.fn.JsonWriter(stream, true, true);
    +        org.apache.drill.exec.vector.complex.fn.JsonWriter jsonWriter = new org.apache.drill.exec.vector.complex.fn.JsonWriter(stream, true, true, false);
    +
    +        jsonWriter.write(input);
    +      } catch (Exception e) {
    +        throw new RuntimeException(e);
    +      }
    +
    +      byte [] bytea = stream.toByteArray();
    +
    +      out.buffer = buffer = buffer.reallocIfNeeded(bytea.length);
    +      out.buffer.setBytes(0, bytea);
    +      out.end = bytea.length;
    +    }
    +  }
    +
    +  @FunctionTemplate(names = { "convert_toJSON", "convert_toSIMPLEJSON" } , scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL)
    +  public static class ConvertToJsonNonNumeric implements DrillSimpleFunc{
    +
    +    @Param FieldReader input;
    +    @Param BitHolder nonNumeric;
    +    @Output VarBinaryHolder out;
    +    @Inject DrillBuf buffer;
    +
    +    public void setup(){
    +    }
    +
    +    public void eval(){
    +      out.start = 0;
    +
    +      java.io.ByteArrayOutputStream stream = new java.io.ByteArrayOutputStream();
    +      try {
    +        org.apache.drill.exec.vector.complex.fn.JsonWriter jsonWriter = new org.apache.drill.exec.vector.complex.fn.JsonWriter(stream, true, false,
    --- End diff --

    More copies.
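On the writer side, the verification comment on this issue reports that NaN and Infinity values are written as literals when the writer option is true and as string values when it is false. The sketch below illustrates that behaviour in isolation; `NonNumericJsonWriter` and `formatDouble` are hypothetical names and do not reflect how Drill's `JsonWriter` is implemented.

```java
// Hypothetical illustration of the writer-side behaviour; not Drill's JsonWriter.
class NonNumericJsonWriter {

    /**
     * Formats a double as a JSON value. Non-finite values are emitted as bare
     * literals when allowNanInf is true, and as quoted strings otherwise.
     */
    static String formatDouble(double value, boolean allowNanInf) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            // Double.toString yields "NaN", "Infinity" or "-Infinity".
            String token = Double.toString(value);
            return allowNanInf ? token : "\"" + token + "\"";
        }
        return Double.toString(value);
    }

    public static void main(String[] args) {
        System.out.println("{\"nan\":" + formatDouble(Double.NaN, true) + "}");
        System.out.println("{\"nan\":" + formatDouble(Double.NaN, false) + "}");
    }
}
```

With the flag off, the output remains valid standard JSON at the cost of turning the value into a string; with the flag on, the output round-trips through a reader that accepts the literals.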
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248192#comment-16248192 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150360475

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java ---
    @@ -0,0 +1,167 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements. See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to you under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License. You may obtain a copy of the License at
    +*
    +* http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +
    +package org.apache.drill.exec.vector.complex.writer;
    +
    +import com.google.common.collect.ImmutableMap;
    +import org.apache.commons.io.FileUtils;
    +import org.apache.drill.BaseTestQuery;
    +import org.apache.drill.common.exceptions.UserRemoteException;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.exec.record.RecordBatchLoader;
    +import org.apache.drill.exec.record.VectorWrapper;
    +import org.apache.drill.exec.rpc.user.QueryDataBatch;
    +import org.apache.drill.exec.vector.VarCharVector;
    +import org.junit.Test;
    +
    +import java.io.File;
    +import java.util.List;
    +
    +import static org.hamcrest.CoreMatchers.containsString;
    +import static org.junit.Assert.*;
    +
    +public class TestJsonNonNumerics extends BaseTestQuery {
    +
    +  @Test
    +  public void testNonNumericSelect() throws Exception {
    +    File file = new File(getTempDir("nan_test"), "nan_test.json");
    +    String json = "{\"nan\":NaN, \"inf\":Infinity}";
    +    String query = String.format("select * from dfs.`%s`",file.getAbsolutePath());
    +    try {
    +      FileUtils.writeStringToFile(file, json);
    +      test("alter session set `store.json.reader.non_numeric_numbers` = true");
    +      testBuilder()
    +        .sqlQuery(query)
    +        .unOrdered()
    +        .baselineColumns("nan", "inf")
    +        .baselineValues(Double.NaN, Double.POSITIVE_INFINITY)
    +        .build()
    +        .run();
    +    } finally {
    +      test("alter session reset `store.json.reader.non_numeric_numbers`");
    +      FileUtils.deleteQuietly(file);
    +    }
    +  }
    +
    +  @Test(expected = UserRemoteException.class)
    +  public void testNonNumericFailure() throws Exception {
    +    File file = new File(getTempDir("nan_test"), "nan_test.json");
    +    test("alter session set `store.json.reader.non_numeric_numbers` = false");
    +    String json = "{\"nan\":NaN, \"inf\":Infinity}";
    +    try {
    +      FileUtils.writeStringToFile(file, json);
    +      test("select * from dfs.`%s`;", file.getAbsolutePath());
    +    } catch (UserRemoteException e) {
    +      assertThat(e.getMessage(), containsString("Error parsing JSON"));
    +      throw e;
    +    } finally {
    +      test("alter session reset `store.json.reader.non_numeric_numbers`");
    +      FileUtils.deleteQuietly(file);
    +    }
    +  }
    +
    +  @Test
    +  public void testCreateTableNonNumerics() throws Exception {
    +    File file = new File(getTempDir("nan_test"), "nan_test.json");
    +    String json = "{\"nan\":NaN, \"inf\":Infinity}";
    +    String tableName = "ctas_test";
    +    try {
    +      FileUtils.writeStringToFile(file, json);
    +      test("alter session set `store.json.reader.non_numeric_numbers` = true");
    +      test("alter session set `store.json.writer.non_numeric_numbers` = true");
    +      test("alter session set `store.format`='json'");
    +      test("create table dfs_test.tmp.`%s` as select * from dfs.`%s`;", tableName, file.getAbsolutePath());
    +
    +      // ensuring that `NaN` and `Infinity` tokens ARE NOT enclosed with double quotes
    +      File resultFile = new File(new File(getDfsTestTmpSchemaLocation(),tableName),"0_0_0.json");
    +      String resultJson = FileUtils.readFileToString(resultFile);
    +      int nanIndex = resultJson.indexOf("NaN");
    +      assertFalse("`NaN` must not be enclosed with \"\" ", resultJson.charAt(nanIndex - 1) == '"');
    +      assertFalse("`NaN` must not be enclo
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248188#comment-16248188 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150361681

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java ---
    @@ -0,0 +1,167 @@
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248197#comment-16248197 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150360192

    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -502,6 +502,8 @@ drill.exec.options: {
         store.format: "parquet",
         store.hive.optimize_scan_with_native_readers: false,
         store.json.all_text_mode: false,
    +    store.json.writer.non_numeric_numbers: false,
    +    store.json.reader.non_numeric_numbers: false,
    --- End diff --

    `allow_nan_inf`?
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248196#comment-16248196 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150360424

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java ---
    @@ -0,0 +1,167 @@
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248190#comment-16248190 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150359353

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java ---
    @@ -50,7 +51,7 @@ private JsonConvertFrom(){}
         @Output ComplexWriter writer;

         public void setup(){
    -      jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false /* do not read numbers as doubles */);
    +      jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false, false /* do not read numbers as doubles */);
    --- End diff --

    Here, the comment refers to the second-to-last item. Consider this:

    ```
    false, // What is the first one?
    false, // What is the second one?
    false, // do not read numbers as doubles
    false // Do not allow NaN, INF
    ```
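One way to avoid a row of unlabeled `false` arguments altogether is to pass named options instead of positional booleans. The following is a minimal sketch of that idea, not something proposed in the thread: `JsonReaderOptions` is a hypothetical class, and it models only the two flags whose meaning is stated in the review comment.

```java
// Hypothetical options holder; Drill's JsonReader does not take this type.
final class JsonReaderOptions {
    private boolean readNumbersAsDouble; // defaults to false
    private boolean allowNanInf;         // defaults to false

    /** Sets whether all numbers are read as doubles; returns this for chaining. */
    JsonReaderOptions readNumbersAsDouble(boolean value) {
        this.readNumbersAsDouble = value;
        return this;
    }

    /** Sets whether NaN, Infinity and -Infinity are accepted; returns this for chaining. */
    JsonReaderOptions allowNanInf(boolean value) {
        this.allowNanInf = value;
        return this;
    }

    boolean readsNumbersAsDouble() {
        return readNumbersAsDouble;
    }

    boolean allowsNanInf() {
        return allowNanInf;
    }

    String describe() {
        return "readNumbersAsDouble=" + readNumbersAsDouble + ", allowNanInf=" + allowNanInf;
    }

    public static void main(String[] args) {
        // Each flag is named at the call site, so no trailing comment is needed.
        System.out.println(new JsonReaderOptions().allowNanInf(true).describe());
    }
}
```

The call site then documents itself, and adding a further flag later does not shift the position of existing arguments.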
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248194#comment-16248194 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150359565

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java ---
    @@ -91,4 +92,60 @@ public void eval(){
         }
       }

    +  @FunctionTemplate(name = "convert_fromJSON", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL, isRandom = true)
    +  public static class ConvertFromJsonVarcharNonNumerics implements DrillSimpleFunc{
    +
    +    @Param VarCharHolder in;
    +    @Param BitHolder enableNonNumeric;
    +    @Inject DrillBuf buffer;
    +    @Workspace org.apache.drill.exec.vector.complex.fn.JsonReader jsonReader;
    +
    +    @Output ComplexWriter writer;
    +
    +    public void setup(){
    +      jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false,/* do not read numbers as doubles */
    +          enableNonNumeric.value == 1);
    +    }
    +
    +    public void eval(){
    +      try {
    +        jsonReader.setSource(in.start, in.end, in.buffer);
    +        jsonReader.write(writer);
    +        buffer = jsonReader.getWorkBuf();
    +
    +      } catch (Exception e) {
    +        throw new org.apache.drill.common.exceptions.DrillRuntimeException("Error while converting from JSON. ", e);
    +      }
    +    }
    +  }
    +
    +  @FunctionTemplate(name = "convert_fromJSON", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL, isRandom = true)
    +  public static class ConvertFromJsonNonNumerics implements DrillSimpleFunc{
    +
    +    @Param VarBinaryHolder in;
    +    @Param BitHolder enableNonNumeric;
    +    @Inject DrillBuf buffer;
    +    @Workspace org.apache.drill.exec.vector.complex.fn.JsonReader jsonReader;
    +
    +    @Output ComplexWriter writer;
    +
    +    public void setup(){
    +      jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false, /* do not read numbers as doubles */
    --- End diff --

    See above. We really don't want all these duplicate copies.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248186#comment-16248186 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150359452

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java ---
    @@ -76,7 +77,7 @@ public void eval(){
         @Output ComplexWriter writer;
         public void setup(){
    -      jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false /* do not read numbers as doubles */);
    +      jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false, false /* do not read numbers as doubles */);
    --- End diff --

    See above. Why do we have two copies? Can we have a function that returns the reader using default configs?
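The review suggestion above (one function that returns a reader built with the default configs) could be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchJsonReader`, `withDefaults`, `withNanInf`, and the flag names `firstFlag` and `secondFlag` are hypothetical and are not Drill's API. The diff only tells us that the third boolean means "do not read numbers as doubles" and that the pull request appends a fourth one.

```java
/**
 * Hypothetical stand-in for a reader configured by several boolean flags.
 * All names here are illustrative assumptions, not Drill's actual classes.
 */
final class SketchJsonReader {
    final boolean firstFlag;            // meaning not stated in the diff
    final boolean secondFlag;           // meaning not stated in the diff
    final boolean readNumbersAsDouble;  // "do not read numbers as doubles" when false
    final boolean enableNanInf;         // the flag added by the pull request

    private SketchJsonReader(boolean firstFlag, boolean secondFlag,
                             boolean readNumbersAsDouble, boolean enableNanInf) {
        this.firstFlag = firstFlag;
        this.secondFlag = secondFlag;
        this.readNumbersAsDouble = readNumbersAsDouble;
        this.enableNanInf = enableNanInf;
    }

    /** The single place that spells out the default flag values. */
    static SketchJsonReader withDefaults() {
        return withNanInf(false);
    }

    /** Defaults for everything except NaN/Infinity handling. */
    static SketchJsonReader withNanInf(boolean enableNanInf) {
        return new SketchJsonReader(false, false, false, enableNanInf);
    }

    public static void main(String[] args) {
        System.out.println(SketchJsonReader.withDefaults().enableNanInf);
        System.out.println(SketchJsonReader.withNanInf(true).enableNanInf);
    }
}
```

With a factory like this, each function variant would call `withDefaults()` or `withNanInf(...)` rather than repeating the positional boolean list, so adding another flag later touches one constructor call instead of every copy.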
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248193#comment-16248193 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150360309

    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -502,6 +502,8 @@ drill.exec.options: {
         store.format: "parquet",
         store.hive.optimize_scan_with_native_readers: false,
         store.json.all_text_mode: false,
    +    store.json.writer.non_numeric_numbers: false,
    +    store.json.reader.non_numeric_numbers: false,
    --- End diff --

    Have we tested all Drill's floating point methods to ensure that they correctly handle NaN and INF?
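As background for the question above, the following sketch lists a few standard Java floating-point behaviours that any test of NaN/Infinity handling would likely need to account for. It uses only `java.lang` calls; the class name `NanInfChecks` and its helper methods are hypothetical, and nothing here claims to describe how Drill's own functions behave.

```java
/** Illustrative helper (hypothetical name) around plain Java double semantics. */
final class NanInfChecks {
    /** Under ==, NaN is unequal to every value, itself included. */
    static boolean equalsItself(double v) {
        return v == v;
    }

    /** Double.compare defines a total order in which NaN sorts above +Infinity. */
    static int totalOrder(double a, double b) {
        return Double.compare(a, b);
    }

    /** Narrowing casts saturate: NaN becomes 0, +Infinity becomes Long.MAX_VALUE. */
    static long toLong(double v) {
        return (long) v;
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        double inf = Double.POSITIVE_INFINITY;
        System.out.println(equalsItself(nan));
        System.out.println(totalOrder(nan, inf));
        System.out.println(toLong(nan) + " " + toLong(inf));
        System.out.println(inf - inf);           // Infinity minus Infinity is NaN
        System.out.println(Math.max(nan, 1.0));  // NaN propagates through Math.max
    }
}
```

The difference between `==` and `Double.compare` matters most for operators such as ORDER BY, GROUP BY, and JOIN, where the two notions of equality can give different groupings.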
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248195#comment-16248195 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r150359992

    --- Diff: contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java ---
    @@ -73,6 +73,7 @@
       private final MongoStoragePlugin plugin;
       private final boolean enableAllTextMode;
    +  private final boolean enableNonNumericNumbers;
    --- End diff --

    Would recommend: `enableNanInf`. "Non-numeric numbers" sounds like we might allow "foo" or "thirteen".
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245650#comment-16245650 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/1026

    Thanks, +1, LGTM.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245576#comment-16245576 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user vladimirtkach commented on the issue:

    https://github.com/apache/drill/pull/1026

    @arina-ielchiieva made code changes according to your comments.
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245416#comment-16245416 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r149903182

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java ---
    @@ -0,0 +1,167 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements. See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to you under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License. You may obtain a copy of the License at
    +*
    +* http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +
    +package org.apache.drill.exec.vector.complex.writer;
    +
    +import com.google.common.collect.ImmutableMap;
    +import org.apache.commons.io.FileUtils;
    +import org.apache.drill.BaseTestQuery;
    +import org.apache.drill.common.exceptions.UserRemoteException;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.exec.record.RecordBatchLoader;
    +import org.apache.drill.exec.record.VectorWrapper;
    +import org.apache.drill.exec.rpc.user.QueryDataBatch;
    +import org.apache.drill.exec.vector.VarCharVector;
    +import org.junit.Test;
    +
    +import java.io.File;
    +import java.util.List;
    +
    +import static org.hamcrest.CoreMatchers.containsString;
    +import static org.junit.Assert.*;
    +
    +public class TestJsonNonNumerics extends BaseTestQuery {
    +
    +  @Test
    +  public void testNonNumericSelect() throws Exception {
    +    File file = new File(getTempDir(""), "nan_test.json");
    +    String json = "{\"nan\":NaN, \"inf\":Infinity}";
    +    String query = String.format("select * from dfs.`%s`",file.getAbsolutePath());
    +    try {
    +      FileUtils.writeStringToFile(file, json);
    +      test("alter session set `store.json.reader.non_numeric_numbers` = true");
    +      testBuilder()
    +        .sqlQuery(query)
    +        .unOrdered()
    +        .baselineColumns("nan", "inf")
    +        .baselineValues(Double.NaN, Double.POSITIVE_INFINITY)
    +        .build()
    +        .run();
    +    } finally {
    +      test("alter session reset `store.json.reader.non_numeric_numbers`");
    +      FileUtils.deleteQuietly(file);
    +    }
    +  }
    +
    +  @Test(expected = UserRemoteException.class)
    +  public void testNonNumericFailure() throws Exception {
    +    File file = new File(getTempDir(""), "nan_test.json");
    +    test("alter session set `store.json.reader.non_numeric_numbers` = false");
    +    String json = "{\"nan\":NaN, \"inf\":Infinity}";
    +    try {
    +      FileUtils.writeStringToFile(file, json);
    +      test("select * from dfs.`%s`;", file.getAbsolutePath());
    +    } catch (UserRemoteException e) {
    +      assertThat(e.getMessage(), containsString("Error parsing JSON"));
    +      throw e;
    +    } finally {
    +      test("alter session reset `store.json.reader.non_numeric_numbers`");
    +      FileUtils.deleteQuietly(file);
    +    }
    +  }
    +
    +  @Test
    +  public void testCreateTableNonNumerics() throws Exception {
    +    File file = new File(getTempDir(""), "nan_test.json");
    +    String json = "{\"nan\":NaN, \"inf\":Infinity}";
    +    String tableName = "ctas_test";
    +    try {
    +      FileUtils.writeStringToFile(file, json);
    +      test("alter session set `store.json.reader.non_numeric_numbers` = true");
    +      test("alter session set `store.json.writer.non_numeric_numbers` = true");
    +      test("alter session set `store.format`='json'");
    +      test("create table dfs_test.tmp.`%s` as select * from dfs.`%s`;", tableName, file.getAbsolutePath());
    +
    +      // ensuring that `NaN` and `Infinity` tokens ARE NOT enclosed with double quotes
    +      File resultFile = new File(new File(getDfsTestTmpSchemaLocation(),tableName),"0_0_0.json");
    +      String resultJson = FileUtils.readFileToString(resultFile);
    +      int nanIndex = resultJson.indexOf("NaN");
    +      assertFalse("`NaN` must not be enclosed with \"\" ", resultJson.charAt(nanIndex - 1) == '"');
    +      assertFalse("`NaN` must not be enclosed with \"\" ", r
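The quoted test checks that the written `NaN` token is not wrapped in double quotes by inspecting the character just before the first match. A minimal sketch of the same idea as a reusable helper is shown below. `BareTokenCheck` and `isBare` are hypothetical names, not part of the pull request, and the helper makes two assumptions of its own: it looks at every occurrence of the token, and it treats a missing token as a failure instead of indexing before the start of the string.

```java
/** Hypothetical helper illustrating a quote check on written JSON text. */
final class BareTokenCheck {
    /**
     * Returns true when token occurs at least once in json and no occurrence
     * is directly preceded or directly followed by a double quote.
     */
    static boolean isBare(String json, String token) {
        boolean seen = false;
        int from = 0;
        while (true) {
            int at = json.indexOf(token, from);
            if (at < 0) {
                return seen;  // no (more) occurrences; false if none were found at all
            }
            int end = at + token.length();
            boolean quotedBefore = at > 0 && json.charAt(at - 1) == '"';
            boolean quotedAfter = end < json.length() && json.charAt(end) == '"';
            if (quotedBefore || quotedAfter) {
                return false;
            }
            seen = true;
            from = end;
        }
    }

    public static void main(String[] args) {
        System.out.println(isBare("{\"nan\":NaN, \"inf\":Infinity}", "NaN"));
        System.out.println(isBare("{\"nan\":\"NaN\"}", "NaN"));
    }
}
```

The match is case-sensitive, so lowercase keys such as `"nan"` and `"inf"` in the test data do not count as occurrences of `NaN` or `Infinity`.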
[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245417#comment-16245417 ]

ASF GitHub Bot commented on DRILL-5919:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1026#discussion_r149903705

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java ---
    @@ -0,0 +1,167 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements. See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to you under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License. You may obtain a copy of the License at
    +*
    +* http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +
    +package org.apache.drill.exec.vector.complex.writer;
    +
    +import com.google.common.collect.ImmutableMap;
    +import org.apache.commons.io.FileUtils;
    +import org.apache.drill.BaseTestQuery;
    +import org.apache.drill.common.exceptions.UserRemoteException;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.exec.record.RecordBatchLoader;
    +import org.apache.drill.exec.record.VectorWrapper;
    +import org.apache.drill.exec.rpc.user.QueryDataBatch;
    +import org.apache.drill.exec.vector.VarCharVector;
    +import org.junit.Test;
    +
    +import java.io.File;
    +import java.util.List;
    +
    +import static org.hamcrest.CoreMatchers.containsString;
    +import static org.junit.Assert.*;
    +
    +public class TestJsonNonNumerics extends BaseTestQuery {
    +
    +  @Test
    +  public void testNonNumericSelect() throws Exception {
    +    File file = new File(getTempDir(""), "nan_test.json");
    --- End diff --

    It's better to pass a directory name rather than an empty string, e.g. `getTempDir("test_nan")`.