[ https://issues.apache.org/jira/browse/DRILL-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367445#comment-16367445 ]
ASF GitHub Bot commented on DRILL-6154: --------------------------------------- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1123#discussion_r168777666 --- Diff: exec/java-exec/src/main/codegen/templates/AggrTypeFunctions1.java --- @@ -102,7 +102,20 @@ public void add() { </#if> nonNullCount.value = 1; <#if aggrtype.funcName == "min"> - value.value = Math.min(value.value, in.value); + <#if type.inputType?contains("Float4") || type.inputType?contains("Float8")> --- End diff -- 1. Please mind original indent. 2. Please add comment describing how we chose min / max for nan / inf values. 3. Please consider using if without nested if: `if - elseif - else`. > NaN, Infinity issues > -------------------- > > Key: DRILL-6154 > URL: https://issues.apache.org/jira/browse/DRILL-6154 > Project: Apache Drill > Issue Type: Bug > Reporter: Volodymyr Tkach > Assignee: Volodymyr Tkach > Priority: Major > Fix For: 1.13.0 > > > 1.Issue > *AFFECTED_VERSION:* drill-1.13.0-SNAPSHOT > *AFFECTED_FUNCTIONS:* > - *sqrt* > - *trunc* > *ISSUE_DESCRIPTION:* According to DRILL-5919/ new json number literals were > added: *NaN, Infinity, -Infinity*. The new data types must be processed > properly by existing functions. There are a few issues: > *1. SQRT function*. Run the following test query: > {code:java} > select sqrt(Nan) NaN, sqrt(Positive_Infinity) POS_INF, > sqrt(Negative_Infinity) NEG_INF from dfs.tmp.`PN_Inf_NaN.json`{code} > > - EXPECTED_RESULT: it was expected to get the following result: _NaN, > Infinity, NaN_ (expected result is based on java Math.sqrt() method) > - ACTUAL_RESULT: the test query returned: _NaN, Infinity, Infinity_ > *2. TRUNC function*. According to DRILL docs > ([https://drill.apache.org/docs/math-and-trig/):] _TRUNC(x, y) : Truncates x > to y decimal places. *Specifying y is optional. Default is 1*_. So, the > function must work properly without specifying *y* > However an error message appears. Run test_query: > {code:java} > select trunc(Nan) NaN, trunc(Positive_Infinity) POS_INF, > trunc(Negative_Infinity) NEG_INF from dfs.tmp.`PN_Inf_NaN.json`{code} > - EXPECTED_RESULT: it was expected to get the following result *NaN, NaN, > NaN* > - ACTUAL_RESULT: it appears the following error message: *Query Failed: An > Error Occurred org.apache.drill.common.exceptions.UserRemoteException: SYSTEM > ERROR: NumberFormatException Fragment 0:0 [Error Id: > 95e01fee-7433-4b0b-b913-32358b4a8f55 on node1:31010]* > Please investigate and fix, test file attached PN_Inf_NaN.json > 2. Issue > *AFFECTED_VERSION:* drill-1.13.0-SNAPSHOT > *AFFECTED_FUNCTIONALITY:* INNER JOIN > *ISSUE_DESCRIPTION:* There were added new Json data types in DRILL-5919: > *NaN, Infinity, -Infinity*. > During testing activities, it was detected a bit strange behavior of INNER > JOIN operator - different query results in almost the same queries. > *Query1* > {code:java} > select distinct t.name, tt.name from dfs.tmp.`ObjsX.json` t inner join > dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4 {code} > *Query2* > {code:java} > select distinct t.name from dfs.tmp.`ObjsX.json` t inner join > dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4 {code} > *Query1* differs from *Query2* by 1 columns only: > - In *Query1* - 2 columns are selected - t.name, tt.name > - In *Query2* - 1 column is selected - t.name > However *Query1*/*Query2* return completely different results: > - *Query1* returns > {code:java} > name name0 > object2 object2 > object2 object3 > object2 object4 > object3 object2 > object3 object3 > object3 object4 > object4 object2 > object4 object3 > object4 object4 > {code} > This result seems to be correct. > - *Query2* returns _*No result found*_, not expected: > *EXPECTED_RESULT:* > {code:java} > name > object2 > object3 > object4 > {code} > *ACTUAL_RESULT:(* > {code:java} > No result found{code} > *NB!:* the issue appears only if tables are _*JOINed by a column which > contains newly-added data types (NaN, Infinity, -Infinity)*_. The issue is > not reproducible is a user is JOINing tables by a column containing other > data types > 3. Issue > *AFFECTED_VERSION:* drill-1.13.0-SNAPSHOT > *AFFECTED_FUNCTIONALITY:* ORDER BY, DESC > *THIS ISSUE REFERS TO:DRILL-5919* > *ISSUE_DESCRIPTION:* 'ORDER BY/DESC' clause behaves in different ways when > sorting columns containing NaN values. In one case it considers NaN to be the > largest value, in another - the smallest one. > *Steps:* > - Select from the attached test file (orderBy.json, attached) > {code:java} > SELECT name, attr4 from dfs.tmp.`orderBy.json` order by name, attr4{code} > - Check the attached screen shot (orderByIssue.jpg): > *EXPECTED_RESULT:* It was expected the 'ORDER BY' clause to sort attr4 > columns data in the same way (most probably NaN should be the largest, see > *NB*) > *ACTUAL_RESULT:* attr4 column's values were sorted in different ways: for > 'obj1'/'obj3' NaN is the largest, for 'obj2'/'obj4' NaN is the smallest. > *NB:* Postgres as well as Java's sorting (Collection.sort() / Arrays.sort() > methods) treats NaN as the largest value > 4. Issue > *AFFECTED_VERSION:* drill-1.13.0-SNAPSHOT > *AFFECTED_FUNCTIONALITY:* max(column_name), min(column_name), > *ISSUE_DESCRIPTION:* min/max aggregation functions return the same result if > the selected column contains NaN value. > {code:java} > SELECT name, max(attr4), min(attr4) from dfs.tmp.`minMax.json` group by > name{code} > Result > {code:java} > name Min Max > obj1 NaN NaN > obj2 NaN NaN > obj3 NaN NaN > obj4 NaN NaN > {code} > As for me, this logic should be revised, current behavior is a bit confusing: > - Postgres considers NaN to be the largest value, so {{MAX(col_withNaN)}} > will return NaN, {{MIN(col_withNaN)}} - will return other value, so DRILL's > min/max logic can be adjusted to the Postgres' one, see postgres.jpg > - Or NAN can behave like NULL - DRILL's MIN/MAX functions ignore NULLs -- This message was sent by Atlassian JIRA (v7.6.3#76005)