[jira] [Commented] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException

2019-09-01 Thread Zihao Ye (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920306#comment-16920306
 ] 

Zihao Ye commented on HIVE-13282:
-

Hi [~mmccline], I'm wondering why this patch has not been merged yet. If the 
fix introduces new issues, how can I reproduce them? Thanks

> GroupBy and select operator encounter ArrayIndexOutOfBoundsException
> 
>
> Key: HIVE-13282
> URL: https://issues.apache.org/jira/browse/HIVE-13282
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1, 2.0.0, 2.1.0
>Reporter: Vikram Dixit K
>Priority: Blocker
> Attachments: HIVE-13282.01.patch, HIVE-13282.02.patch, 
> smb_fail_issue.patch, smb_groupby.q, smb_groupby.q.out
>
>
> The group by and select operators run into the ArrayIndexOutOfBoundsException 
> when they incorrectly initialize themselves with tag 0 but the incoming tag 
> id is different.
> {code}
> select count(*) from
> (select rt1.id from
> (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1
> join
> (select rt2.id from
> (select t2.key as id, t2.value as od from tab_part t2 group by key, value) rt2) vt2
> where vt1.id=vt2.id;
> {code}
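
The failure mode described above can be illustrated with a small sketch. It is not Hive's operator code: `TaggedOperator`, its `per_tag_state` list, and the `process` method are hypothetical, and Python's `IndexError` stands in for Java's `ArrayIndexOutOfBoundsException`. The only assumption taken from the report is that an operator sizes its per-tag state for tag 0 and then receives a row carrying a different tag.

```python
class TaggedOperator:
    """Hypothetical operator that keeps one state slot per input tag."""

    def __init__(self, initialized_tag: int = 0):
        # State is sized only for the tag seen at initialization time.
        self.per_tag_state = [0] * (initialized_tag + 1)

    def process(self, row, tag: int) -> None:
        # Indexing with a tag the operator was not initialized for fails.
        self.per_tag_state[tag] += 1


op = TaggedOperator(initialized_tag=0)
op.process({"id": 1}, tag=0)        # fine: slot 0 exists

try:
    op.process({"id": 2}, tag=1)    # incoming tag differs from the initialized one
    failed = False
except IndexError:
    failed = True
```

In this sketch the second call fails because the state list has a single slot; sizing the state from the incoming tag rather than a fixed 0 would avoid the out-of-range access.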



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-09 Thread Zihao Ye (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880968#comment-16880968
 ] 

Zihao Ye commented on HIVE-21955:
-

This is quite a serious problem that might lead to data loss in many common 
queries. Could someone please help review the issue?

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.1.branch-1.patch, HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be *more than one 
> non-leaf nodes which are exactly the same object* in the expression tree. If 
> this happens, those non-leaf nodes will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected
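
A minimal sketch of why a shared non-leaf node is dangerous for an in-place pass. This is not Hive's `buildLeafList`: the `Node` class, the two `renumber_*` functions, and the `mapping` table are hypothetical, and the sketch only assumes that the pass rewrites leaf ids in place while walking the tree. Under that assumption, a subtree referenced from two parents is rewritten twice unless visited nodes are tracked.

```python
class Node:
    """Minimal expression node: an operator with children, or a leaf id."""

    def __init__(self, op, children=None, leaf=None):
        self.op = op
        self.children = children or []
        self.leaf = leaf


def renumber_naive(node, mapping):
    """Rewrite leaf ids in place, revisiting shared nodes every time."""
    if node.op == "leaf":
        node.leaf = mapping[node.leaf]
    for child in node.children:
        renumber_naive(child, mapping)


def renumber_once(node, mapping, seen=None):
    """Same pass, but each node object is rewritten at most once."""
    seen = set() if seen is None else seen
    if id(node) in seen:
        return
    seen.add(id(node))
    if node.op == "leaf":
        node.leaf = mapping[node.leaf]
    for child in node.children:
        renumber_once(child, mapping, seen)


def build():
    # The (not leaf 1) subtree is one object referenced from two clauses.
    shared = Node("not", [Node("leaf", leaf=1)])
    tree = Node("and", [
        Node("or", [Node("leaf", leaf=0), shared]),
        Node("or", [Node("leaf", leaf=2), shared]),
    ])
    return tree, shared


mapping = {0: 0, 1: 2, 2: 1}   # hypothetical old-id -> new-id table

tree_a, shared_a = build()
renumber_naive(tree_a, mapping)
naive_leaf = shared_a.children[0].leaf    # remapped twice: 1 -> 2 -> 1

tree_b, shared_b = build()
renumber_once(tree_b, mapping)
guarded_leaf = shared_b.children[0].leaf  # remapped once: 1 -> 2
```

Tracking visited objects by identity is one possible guard; copying shared subtrees before the pass would be another. Which approach the attached patch takes is not stated here.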



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878943#comment-16878943
 ] 

Zihao Ye edited comment on HIVE-21955 at 7/5/19 3:15 AM:
-

The failed tests are mostly related to Java version issues rather than this 
patch, and the newly added unit test passed. See 
https://builds.apache.org/job/PreCommit-HIVE-Build/17855/testReport/org.apache.hadoop.hive.ql.io.sarg/TestSearchArgumentImpl/


was (Author: zihao.ye):
The failed tests are mostly related to Java version issues rather than this 
patch.

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.1.branch-1.patch, HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be *more than one 
> non-leaf nodes which are exactly the same object* in the expression tree. If 
> this happens, those non-leaf nodes will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Commented] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878943#comment-16878943
 ] 

Zihao Ye commented on HIVE-21955:
-

The failed tests are mostly related to Java version issues rather than this 
patch.

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.1.branch-1.patch, HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be *more than one 
> non-leaf nodes which are exactly the same object* in the expression tree. If 
> this happens, those non-leaf nodes will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Description: 
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
`convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized 
expression into a CNF expression with the unique leaves.

After an expression is converted to CNF, there might be *more than one non-leaf 
nodes which are exactly the same object* in the expression tree. If this 
happens, those non-leaf nodes will be visited more than once in `buildLeafList` 
function. As a result, a wrong ExpressionTree is generated.

My version is 1.2.1, but it seems that the higher versions are also affected

  was:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
`convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized 
expression into a CNF expression with the unique leaves.

After an expression is converted to CNF, there might be *more than one non-leaf 
node which are exactly the same object* in the expression tree. If this 
happens, those non-leaf node will be visited more than once in `buildLeafList` 
function. As a result, a wrong ExpressionTree is generated.

My version is 1.2.1, but it seems that the higher versions are also affected


> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.1.branch-1.patch, HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be *more than one 
> non-leaf nodes which are exactly the same object* in the expression tree. If 
> this happens, those non-leaf nodes will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Description: 
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
`convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized 
expression into a CNF expression with the unique leaves.

After an expression is converted to CNF, there might be *more than one non-leaf 
node which are exactly the same object* in the expression tree. If this 
happens, those non-leaf node will be visited more than once in `buildLeafList` 
function. As a result, a wrong ExpressionTree is generated.

My version is 1.2.1, but it seems that the higher versions are also affected

  was:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
`convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized 
expression into a CNF expression with the unique leaves.

After an expression is converted to CNF, there might be more than one non-leaf 
node which are exactly the same object in the expression tree. If this happens, 
those non-leaf node will be visited more than once in `buildLeafList` function. 
As a result, a wrong ExpressionTree is generated.

My version is 1.2.1, but it seems that the higher versions are also affected


> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.1.branch-1.patch, HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be *more than one 
> non-leaf node which are exactly the same object* in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Status: Open  (was: Patch Available)

Didn't attach a UT

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Attachment: HIVE-21955.1.branch-1.patch
Status: Patch Available  (was: Open)

UT added

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.1.branch-1.patch, HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

  Attachment: HIVE-21955.branch-1.patch
Target Version/s: 1.2.1
  Status: Patch Available  (was: Open)

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
> Attachments: HIVE-21955.branch-1.patch
>
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Component/s: storage-api

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, storage-api
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Description: 
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
`convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized 
expression into a CNF expression with the unique leaves.

After an expression is converted to CNF, there might be more than one non-leaf 
node which are exactly the same object in the expression tree. If this happens, 
those non-leaf node will be visited more than once in `buildLeafList` function. 
As a result, a wrong ExpressionTree is generated.

My version is 1.2.1, but it seems that the higher versions are also affected

  was:
ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
`convertToCNF`, `flatten` and `buildLeafList` in order to form a non-normalized 
expression into a CNF expression with the unique leaves.

After an expression is converted to CNF, there might be more than one non-leaf 
node which are exactly the same object in the expression tree. If this happens, 
those non-leaf node will be visited more than once in `buildLeafList` function. 
As a result, a wrong ExpressionTree is generated.


> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.
> My version is 1.2.1, but it seems that the higher versions are also affected





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Affects Version/s: 1.2.1

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Component/s: (was: ORC)

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.





[jira] [Updated] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21955:

Labels: pushdown  (was: )

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>  Labels: pushdown
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.





[jira] [Commented] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878391#comment-16878391
 ] 

Zihao Ye commented on HIVE-21955:
-

An example:

select * from t where ((rfr_page = 'search' and uid is not null) or (rfr_page = 
'index' and uid is not null));

Filter text is:

Filter text = (((rfr_page = 'search') and uid is not null) or ((rfr_page = 
'index') and uid is not null))

However, the generated sarg is:

sarg: leaf-0 = (EQUALS rfr_page search), leaf-1 = (EQUALS rfr_page index), 
leaf-2 = (IS_NULL uid), expr = (and (or leaf-0 leaf-1) (or (not leaf-2) leaf-1) 
(or leaf-0 (not leaf-1)) (or (not leaf-2) (not leaf-1)))
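
To make the discrepancy concrete, the filter and the reported sarg can be compared with a truth table. This is only a sketch: it treats each leaf as a plain boolean (ignoring the extra truth values that sarg evaluation can carry), and the function names are made up for illustration. `leaf0`, `leaf1`, and `leaf2` follow the leaf numbering printed above, so "uid is not null" is `not leaf2`.

```python
from itertools import product


def filter_expr(leaf0, leaf1, leaf2):
    # (rfr_page = 'search' and uid is not null)
    #   or (rfr_page = 'index' and uid is not null)
    return (leaf0 and not leaf2) or (leaf1 and not leaf2)


def reported_sarg(leaf0, leaf1, leaf2):
    # The four clauses of the sarg expression quoted above.
    return (
        (leaf0 or leaf1)
        and ((not leaf2) or leaf1)
        and (leaf0 or (not leaf1))
        and ((not leaf2) or (not leaf1))
    )


# Collect every assignment on which the two expressions disagree.
mismatches = [
    (l0, l1, l2)
    for l0, l1, l2 in product([False, True], repeat=3)
    if filter_expr(l0, l1, l2) != reported_sarg(l0, l1, l2)
]
```

Under these assumptions the two expressions disagree for `leaf0 = False, leaf1 = True, leaf2 = False`, that is, rows where rfr_page = 'index' and uid is not null: the filter accepts them but the reported sarg rejects them. The reported clauses reduce to `leaf0 and not leaf2`, which is consistent with `(not leaf-1)` appearing in the two places where a CNF equivalent to the filter would have `(not leaf-2)`.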

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.





[jira] [Assigned] (HIVE-21955) SearchArgumentImpl generates wrong ExpressionTree in some cases which might result in loss of data

2019-07-04 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye reassigned HIVE-21955:
---

Assignee: Zihao Ye

> SearchArgumentImpl generates wrong ExpressionTree in some cases which might 
> result in loss of data 
> ---
>
> Key: HIVE-21955
> URL: https://issues.apache.org/jira/browse/HIVE-21955
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC
>Reporter: Zihao Ye
>Assignee: Zihao Ye
>Priority: Critical
>
> ExpressionBuilder applies `pushDownNot`, `foldMaybe`, `flatten`, 
> `convertToCNF`, `flatten` and `buildLeafList` in order to form a 
> non-normalized expression into a CNF expression with the unique leaves.
> After an expression is converted to CNF, there might be more than one 
> non-leaf node which are exactly the same object in the expression tree. If 
> this happens, those non-leaf node will be visited more than once in 
> `buildLeafList` function. As a result, a wrong ExpressionTree is generated.





[jira] [Updated] (HIVE-21486) FinalSelectOps is empty in lineage index if there is a script operator(transform)

2019-03-21 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-21486:

Labels: lineage  (was: )

> FinalSelectOps is empty in lineage index if there is a script 
> operator(transform)
> -
>
> Key: HIVE-21486
> URL: https://issues.apache.org/jira/browse/HIVE-21486
> Project: Hive
>  Issue Type: Bug
>  Components: lineage
>Affects Versions: 2.1.1, 2.3.4
>Reporter: Zihao Ye
>Priority: Major
>  Labels: lineage
>
> SQL pattern:
> create table t1 as select transform(c1) using '/bin/python script.py' as (c2) 
> from t2;
> The lineage dependencies are correct, but the SelectOperator is not added to 
> finalSelectOps in the lineage Index, so index.getDependencies(finalSelOp) 
> returns null in this case.





[jira] [Updated] (HIVE-20912) Output data might be duplicated while speculation is enabled

2018-11-14 Thread Zihao Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihao Ye updated HIVE-20912:

Priority: Critical  (was: Major)

> Output data might be duplicated while speculation is enabled
> 
>
> Key: HIVE-20912
> URL: https://issues.apache.org/jira/browse/HIVE-20912
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators
>Affects Versions: 1.2.1
> Environment: Hive 1.2.1
> Hadoop 2.7.3
> Tez 0.7.0
>Reporter: Zihao Ye
>Priority: Critical
> Attachments: image-2018-11-14-17-48-59-826.png, 
> image-2018-11-14-17-53-13-191.png, image-2018-11-14-17-53-50-171.png, 
> image-2018-11-14-19-28-18-924.png
>
>
> The file merge stage had two tasks, which should have created two files, but 
> three files were created.
> !image-2018-11-14-19-28-18-924.png!
> By tracing the log, we found that two task attempts (one of them speculative) 
> happened to finish within the same second. Although the later one received a 
> kill signal from the AM, its rename operation had already completed by then, 
> which caused the data duplication.
> The rename is done in _AbstractFileMergeOperator.closeOp()_, where the final 
> path name is determined by the task attempt id rather than the task id. In 
> this case, the final paths ended with '00_0' and '00_1' rather than '00'. 
> IMHO, if the final path name ended with the task id without the task attempt 
> id, one task could generate at most one file, which could solve this issue. 
> But I don't know the side effects of changing the final path name.
> This issue also affects other operators that rename files, such as 
> JoinOperator and FileSinkOperator.
> !image-2018-11-14-17-53-13-191.png!
> !image-2018-11-14-17-53-50-171.png!
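
The naming problem can be sketched with a toy simulation. Nothing here is Hive code: the function names are hypothetical, the second task id '01' is invented for illustration, and the sketch assumes that a rename fails when the destination name already exists, which may not hold on every filesystem.

```python
def name_with_attempt(task_id: str, attempt: int) -> str:
    # Naming described in the report: the attempt number is part of the name.
    return f"{task_id}_{attempt}"


def name_task_only(task_id: str, attempt: int) -> str:
    # Proposed naming: every attempt of a task targets the same name.
    return task_id


def commit_all(attempts, namer):
    """Simulate each finished attempt renaming its output into the final dir.

    Assumes a rename is rejected when the destination already exists.
    """
    final_dir = {}
    for task_id, attempt in attempts:
        name = namer(task_id, attempt)
        if name not in final_dir:
            final_dir[name] = (task_id, attempt)
    return sorted(final_dir)


# Two tasks; task '00' also has a speculative attempt that finished.
finished = [("00", 0), ("00", 1), ("01", 0)]

files_with_attempt = commit_all(finished, name_with_attempt)
files_task_only = commit_all(finished, name_task_only)
```

With attempt-based names the two tasks leave three files behind, matching the symptom in the report; with task-only names the speculative attempt's rename collides and only two files remain. As the report notes, the side effects of changing the naming scheme are unknown, so this illustrates the idea rather than a vetted fix.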


