[jira] [Commented] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920306#comment-16920306 ]

Zihao Ye commented on HIVE-13282:
---------------------------------

Hi [~mmccline],
I'm wondering why this patch is not merged yet. If there are new issues introduced by this fix, how could I reproduce it?
Thanks

> GroupBy and select operator encounter ArrayIndexOutOfBoundsException
> --------------------------------------------------------------------
>
>                 Key: HIVE-13282
>                 URL: https://issues.apache.org/jira/browse/HIVE-13282
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.2.1, 2.0.0, 2.1.0
>            Reporter: Vikram Dixit K
>            Priority: Blocker
>         Attachments: HIVE-13282.01.patch, HIVE-13282.02.patch,
>                      smb_fail_issue.patch, smb_groupby.q, smb_groupby.q.out
>
>
> The group by and select operators run into the ArrayIndexOutOfBoundsException
> when they incorrectly initialize themselves with tag 0 but the incoming tag
> id is different.
> {code}
> select count(*) from
> (select rt1.id from
> (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1
> join
> (select rt2.id from
> (select t2.key as id, t2.value as od from tab_part t2 group by key, value) rt2) vt2
> where vt1.id=vt2.id;
> {code}

--
This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880968#comment-16880968 ]

Zihao Ye commented on HIVE-21955:
---------------------------------

This is a quite serious problem which might lead to data loss in lots of general queries. Could someone help with reviewing the issue please?

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might
> result in loss of data
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-21955
>                 URL: https://issues.apache.org/jira/browse/HIVE-21955
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, storage-api
>    Affects Versions: 1.2.1
>            Reporter: Zihao Ye
>            Assignee: Zihao Ye
>            Priority: Critical
>              Labels: pushdown
>         Attachments: HIVE-21955.1.branch-1.patch, HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`,
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be *more than one
> non-leaf nodes which are exactly the same object* in the expression tree. If
> this happens, those non-leaf nodes will be visited more than once in
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
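The description above says that shared non-leaf nodes are visited more than once, but not how that corrupts the tree. The following is a minimal Python sketch of one way it could happen, assuming the traversal rewrites leaf indices in place. The `Node` class, `remap_naive`, `remap_once` and the remap table are hypothetical illustrations, not Hive's actual `buildLeafList` code.

```python
class Node:
    """Hypothetical expression-tree node; not Hive's ExpressionTree class."""

    def __init__(self, op, children=None, leaf=None):
        self.op = op                    # "and", "or", "not", or "leaf"
        self.children = children or []
        self.leaf = leaf                # leaf index, used only when op == "leaf"


def show(node):
    """Render the tree in the same prefix style as the sarg output."""
    if node.op == "leaf":
        return f"leaf-{node.leaf}"
    return "(" + " ".join([node.op] + [show(c) for c in node.children]) + ")"


def remap_naive(node, remap):
    """Rewrite leaf indices in place; a shared subtree is rewritten on every visit."""
    if node.op == "leaf":
        node.leaf = remap[node.leaf]
    for child in node.children:
        remap_naive(child, remap)


def remap_once(node, remap, seen=None):
    """Same rewrite, but each node object is processed at most once (by identity)."""
    seen = set() if seen is None else seen
    if id(node) in seen:
        return
    seen.add(id(node))
    if node.op == "leaf":
        node.leaf = remap[node.leaf]
    for child in node.children:
        remap_once(child, remap, seen)


def build():
    """Build a tree in which the same OR node object appears under two parents."""
    shared = Node("or", [Node("leaf", leaf=0), Node("leaf", leaf=1)])
    return Node("and", [shared, Node("or", [shared, Node("leaf", leaf=2)])])


REMAP = [1, 2, 0]  # assumed old-index -> new-index table

naive = build()
remap_naive(naive, REMAP)
print(show(naive))    # shared leaves were remapped twice

guarded = build()
remap_once(guarded, REMAP)
print(show(guarded))  # every leaf was remapped exactly once
```

In the naive traversal the leaves under the shared node end up with indices that were remapped twice, so they point at the wrong predicates. Tracking visited nodes by identity is one possible guard; copying shared subtrees before rewriting would be another.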
[jira] [Comment Edited] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878943#comment-16878943 ]

Zihao Ye edited comment on HIVE-21955 at 7/5/19 3:15 AM:
---------------------------------------------------------

The failed tests are mostly related to Java version issues rather than this patch.
And the newly added UT was passed. See https://builds.apache.org/job/PreCommit-HIVE-Build/17855/testReport/org.apache.hadoop.hive.ql.io.sarg/TestSearchArgumentImpl/

was (Author: zihao.ye):
The failed tests are mostly related to Java version issues rather than this patch.
[jira] [Commented] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878943#comment-16878943 ]

Zihao Ye commented on HIVE-21955:
---------------------------------

The failed tests are mostly related to Java version issues rather than this patch.
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Description:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, `convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized expression into a CNF expression with the unique leaves.
After an expression is converted to CNF, there might be *more than one non-leaf nodes which are exactly the same object* in the expression tree. If this happens, those non-leaf nodes will be visited more than once in `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
My version is 1.2.1, but it seems that the higher versions are also affected

  was:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, `convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized expression into a CNF expression with the unique leaves.
After an expression is converted to CNF, there might be *more than one non-leaf node which are exactly the same object* in the expression tree. If this happens, those non-leaf node will be visited more than once in `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
My version is 1.2.1, but it seems that the higher versions are also affected
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Description:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, `convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized expression into a CNF expression with the unique leaves.
After an expression is converted to CNF, there might be *more than one non-leaf node which are exactly the same object* in the expression tree. If this happens, those non-leaf node will be visited more than once in `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
My version is 1.2.1, but it seems that the higher versions are also affected

  was:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, `convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized expression into a CNF expression with the unique leaves.
After an expression is converted to CNF, there might be more than one non-leaf node which are exactly the same object in the expression tree. If this happens, those non-leaf node will be visited more than once in `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
My version is 1.2.1, but it seems that the higher versions are also affected
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Status: Open  (was: Patch Available)

Didn't attach a UT
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Attachment: HIVE-21955.1.branch-1.patch
        Status: Patch Available  (was: Open)

UT added
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
          Attachment: HIVE-21955.branch-1.patch
    Target Version/s: 1.2.1
              Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Component/s: storage-api
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Description:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, `convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized expression into a CNF expression with the unique leaves.
After an expression is converted to CNF, there might be more than one non-leaf node which are exactly the same object in the expression tree. If this happens, those non-leaf node will be visited more than once in `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
My version is 1.2.1, but it seems that the higher versions are also affected

  was:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, `convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized expression into a CNF expression with the unique leaves.
After an expression is converted to CNF, there might be more than one non-leaf node which are exactly the same object in the expression tree. If this happens, those non-leaf node will be visited more than once in `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Affects Version/s: 1.2.1
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Component/s:     (was: ORC)
[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21955:
----------------------------
    Labels: pushdown  (was: )
[jira] [Commented] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878391#comment-16878391 ]

Zihao Ye commented on HIVE-21955:
---------------------------------

An example:
select * from t where ((rfr_page = 'search' and uid is not null) or (rfr_page = 'index' and uid is not null));

Filter text is:
Filter text = (((rfr_page = 'search') and uid is not null) or ((rfr_page = 'index') and uid is not null))

However the sarg is:
sarg:
leaf-0 = (EQUALS rfr_page search),
leaf-1 = (EQUALS rfr_page index),
leaf-2 = (IS_NULL uid),
expr = (and (or leaf-0 leaf-1) (or (not leaf-2) leaf-1) (or leaf-0 (not leaf-1)) (or (not leaf-2) (not leaf-1)))

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might
> result in loss of data
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-21955
>                 URL: https://issues.apache.org/jira/browse/HIVE-21955
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, ORC
>            Reporter: Zihao Ye
>            Assignee: Zihao Ye
>            Priority: Critical
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`,
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one
> non-leaf node which are exactly the same object in the expression tree. If
> this happens, those non-leaf node will be visited more than once in
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
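The mismatch in the example above can be checked mechanically with a small truth table. The Python sketch below treats each predicate as a plain boolean, which is a simplification: it ignores the NULL/unknown cases that a real search argument evaluation has to handle. The function names are illustrative only.

```python
from itertools import product


def filter_expr(is_search, is_index, uid_not_null):
    """The filter text: (search and uid not null) or (index and uid not null)."""
    return (is_search and uid_not_null) or (is_index and uid_not_null)


def sarg_expr(is_search, is_index, uid_not_null):
    """The sarg from the comment, with leaf-2 = IS_NULL uid."""
    leaf0 = is_search
    leaf1 = is_index
    leaf2 = not uid_not_null
    return (
        (leaf0 or leaf1)
        and ((not leaf2) or leaf1)
        and (leaf0 or (not leaf1))
        and ((not leaf2) or (not leaf1))
    )


# Every (is_search, is_index, uid_not_null) combination where the two disagree.
mismatches = [
    combo
    for combo in product([False, True], repeat=3)
    if filter_expr(*combo) != sarg_expr(*combo)
]
print(mismatches)
```

Under this simplification the generated sarg reduces to `leaf-0 and (not leaf-2)`, so the only disagreement is the case where `rfr_page = 'index'` and `uid` is not null: the filter accepts such a row but the sarg rejects it, which is consistent with the data loss the issue title describes.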
[jira] [Assigned] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data
[ https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye reassigned HIVE-21955:
-------------------------------

    Assignee: Zihao Ye
[jira] [Updated] (HIVE-21486) FinalSelectOps is empty in lineage index if there is a script operator(transform)
[ https://issues.apache.org/jira/browse/HIVE-21486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-21486:
----------------------------
    Labels: lineage  (was: )

> FinalSelectOps is empty in lineage index if there is a script
> operator(transform)
> -------------------------------------------------------------
>
>                 Key: HIVE-21486
>                 URL: https://issues.apache.org/jira/browse/HIVE-21486
>             Project: Hive
>          Issue Type: Bug
>          Components: lineage
>    Affects Versions: 2.1.1, 2.3.4
>            Reporter: Zihao Ye
>            Priority: Major
>              Labels: lineage
>
> SQL pattern:
> create table t1 as select transform(c1) using '/bin/python script.py' as (c2)
> from t2;
> The lineage dependencies are correct, but the SelectOperator is not added to
> the finalSelectOps in Lineage Index, so index.getDependencies(finalSelOp)
> returns null in this case.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20912) Output data might be duplicated while speculation is enabled
[ https://issues.apache.org/jira/browse/HIVE-20912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zihao Ye updated HIVE-20912:
----------------------------
    Priority: Critical  (was: Major)

> Output data might be duplicated while speculation is enabled
> ------------------------------------------------------------
>
>                 Key: HIVE-20912
>                 URL: https://issues.apache.org/jira/browse/HIVE-20912
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Operators
>    Affects Versions: 1.2.1
>         Environment: Hive 1.2.1
>                      Hadoop 2.7.3
>                      Tez 0.7.0
>            Reporter: Zihao Ye
>            Priority: Critical
>         Attachments: image-2018-11-14-17-48-59-826.png,
>                      image-2018-11-14-17-53-13-191.png, image-2018-11-14-17-53-50-171.png,
>                      image-2018-11-14-19-28-18-924.png
>
>
> The file merge stage had two tasks, which should have created two files, but
> three files were created.
> !image-2018-11-14-19-28-18-924.png!
> By tracing the log, we found that two task attempts (one of them a
> speculative attempt) happened to finish within the same second. Although the
> later one received a kill signal from the AM, its rename operation had
> already completed by then, which caused the data duplication.
> The rename operation is done in _AbstractFileMergeOperator.closeOp()_, and the
> final path name is determined by the task attempt id rather than the task
> id. In this case, the final paths ended with '00_0' and '00_1' rather
> than '00'. IMHO, if the final path name ended with the task id without the
> task attempt id, one task could generate at most one file, which could
> solve this issue. But I don't know the side effects of changing the final
> path name.
> This issue also affects other operators related to file renaming, like
> JoinOperator and FileSinkOperator.
> !image-2018-11-14-17-53-13-191.png!
> !image-2018-11-14-17-53-50-171.png!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
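The naming proposal in the description above can be illustrated with a toy model. In this Python sketch a dict stands in for the output directory and assigning to a key stands in for the final rename. The attempt list, the second task id "01", and the function names are hypothetical, and the overwrite-on-collision behaviour is an assumption: a real filesystem rename onto an existing path may fail or behave differently.

```python
# Hypothetical task attempts: (task id, attempt number, payload).
# Task "00" has a speculative second attempt that also completes its rename.
ATTEMPTS = [
    ("00", 0, "rows of task 00"),
    ("00", 1, "rows of task 00"),  # speculative duplicate of the same data
    ("01", 0, "rows of task 01"),
]


def name_by_attempt(task_id, attempt):
    """Final name includes the attempt id, as the issue reports."""
    return f"{task_id}_{attempt}"


def name_by_task(task_id, attempt):
    """Final name uses only the task id, as the issue proposes."""
    return task_id


def commit(attempts, namer):
    """Simulate the final renames; the dict keys are the resulting file names."""
    output_dir = {}
    for task_id, attempt, data in attempts:
        output_dir[namer(task_id, attempt)] = data
    return output_dir


print(sorted(commit(ATTEMPTS, name_by_attempt)))  # three files for two tasks
print(sorted(commit(ATTEMPTS, name_by_task)))     # at most one file per task
```

With attempt-based names both attempts of task "00" leave a file behind, which matches the three-files-for-two-tasks symptom. With task-based names the two attempts collide on one name, so the duplication cannot occur, although the sketch says nothing about the side effects the reporter was unsure about.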