[jira] [Created] (HIVE-21408) Disable synthetic join predicates for non-equi joins for unintended cases

2019-03-07 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-21408:
-

 Summary: Disable synthetic join predicates for non-equi joins for 
unintended cases
 Key: HIVE-21408
 URL: https://issues.apache.org/jira/browse/HIVE-21408
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


With support for synthetic join predicates on non-equi joins, it is important 
to make sure those predicates are used only for their intended purpose. Currently, 
DPP and semi join reduction are not supposed to use them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 70031: HIVE-21167

2019-02-21 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70031/
---

(Updated Feb. 22, 2019, 7:19 a.m.)


Review request for hive, Jason Dere and Vaibhav Gumashta.


Changes
---

Added the union test, which exposed an issue that is now fixed.
Created the follow-up JIRA to show the bucketing version in explain extended:
https://issues.apache.org/jira/browse/HIVE-21304


Bugs: HIVE-21167
https://issues.apache.org/jira/browse/HIVE-21167


Repository: hive-git


Description
---

Bucketing: Bucketing version 1 is incorrectly partitioning data


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 4b10e8974e 
  ql/src/test/queries/clientpositive/murmur_hash_migration.q 2b8da9f683 
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out 
5a2cd47381 
  ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out 
5343628252 


Diff: https://reviews.apache.org/r/70031/diff/2/

Changes: https://reviews.apache.org/r/70031/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-21304) Show Bucketing version for ReduceSinkOp in explain extended plan

2019-02-21 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-21304:
-

 Summary: Show Bucketing version for ReduceSinkOp in explain 
extended plan
 Key: HIVE-21304
 URL: https://issues.apache.org/jira/browse/HIVE-21304
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


Show Bucketing version for ReduceSinkOp in explain extended plan.

This helps identify which hashing algorithm is being used by ReduceSinkOp.

 

cc [~vgarg]





Re: Review Request 70031: HIVE-21167

2019-02-21 Thread Deepak Jaiswal


> On Feb. 21, 2019, 6:29 p.m., Vineet Garg wrote:
> > ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out
> > Line 1332 (original), 1332 (patched)
> > <https://reviews.apache.org/r/70031/diff/1/?file=2126091#file2126091line1332>
> >
> > Do you know the reason this size changed? This seems strange.

The size of one file went down by 2 and another went up by 2. It looks like 
this bug was hitting the test case.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70031/#review213034
---


On Feb. 21, 2019, 8:59 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70031/
> ---
> 
> (Updated Feb. 21, 2019, 8:59 a.m.)
> 
> 
> Review request for hive, Jason Dere and Vaibhav Gumashta.
> 
> 
> Bugs: HIVE-21167
> https://issues.apache.org/jira/browse/HIVE-21167
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Bucketing: Bucketing version 1 is incorrectly partitioning data
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 4b10e8974e 
>   ql/src/test/queries/clientpositive/murmur_hash_migration.q 2b8da9f683 
>   
> ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out 
> 5a2cd47381 
>   ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out 
> 5343628252 
> 
> 
> Diff: https://reviews.apache.org/r/70031/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Review Request 70031: HIVE-21167

2019-02-21 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70031/
---

Review request for hive, Jason Dere and Vaibhav Gumashta.


Bugs: HIVE-21167
https://issues.apache.org/jira/browse/HIVE-21167


Repository: hive-git


Description
---

Bucketing: Bucketing version 1 is incorrectly partitioning data


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 4b10e8974e 
  ql/src/test/queries/clientpositive/murmur_hash_migration.q 2b8da9f683 
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out 
5a2cd47381 
  ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out 
5343628252 


Diff: https://reviews.apache.org/r/70031/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 69903: HIVE-21214

2019-02-05 Thread Deepak Jaiswal


> On Feb. 5, 2019, 11:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
> > Lines 1876 (patched)
> > <https://reviews.apache.org/r/69903/diff/1/?file=2123940#file2123940line1876>
> >
> > nit: add the filenames to the error message

will do.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69903/#review212580
---


On Feb. 5, 2019, 10:10 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69903/
> ---
> 
> (Updated Feb. 5, 2019, 10:10 p.m.)
> 
> 
> Review request for hive and Jason Dere.
> 
> 
> Bugs: HIVE-21214
> https://issues.apache.org/jira/browse/HIVE-21214
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> MoveTask : Use attemptId instead of file size for deduplication of files 
> compareTempOrDuplicateFiles()
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43811 
> 
> 
> Diff: https://reviews.apache.org/r/69903/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69903: HIVE-21214

2019-02-05 Thread Deepak Jaiswal


> On Feb. 5, 2019, 11:53 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
> > Line 1829 (original), 1838 (patched)
> > <https://reviews.apache.org/r/69903/diff/1/?file=2123940#file2123940line1838>
> >
> > No "if" - this dedup strategy does not work with speculative execution 
> > enabled.

Based on my understanding, there are two scenarios:

1. Speculative execution succeeds with attempt ID 1; the original attempt ID 
is 0. The logic picks the speculative attempt, regardless of the original's 
outcome. This works fine.
2. Speculative execution fails and throws an exception.

Let me know if I am getting it wrong.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69903/#review212581
-------





Review Request 69903: HIVE-21214

2019-02-05 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69903/
---

Review request for hive and Jason Dere.


Bugs: HIVE-21214
https://issues.apache.org/jira/browse/HIVE-21214


Repository: hive-git


Description
---

MoveTask : Use attemptId instead of file size for deduplication of files 
compareTempOrDuplicateFiles()


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43811 


Diff: https://reviews.apache.org/r/69903/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-21214) MoveTask : Use attemptId instead of file size for deduplication of files compareTempOrDuplicateFiles()

2019-02-05 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-21214:
-

 Summary: MoveTask : Use attemptId instead of file size for 
deduplication of files compareTempOrDuplicateFiles()
 Key: HIVE-21214
 URL: https://issues.apache.org/jira/browse/HIVE-21214
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


For a given task, if there is more than one attempt, the deduplication logic 
kicks in.
{noformat}
Utilities.compareTempOrDuplicateFiles(){noformat}
The logic uses file size and picks the file with the largest size. This is 
very fragile.

Ideally, it should pick the successful attempt's file.

However, a simpler solution is to pick the newest attempt and also check that 
the newest attempt's file is the largest.

If not, throw an exception.
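
The proposed rule can be sketched roughly as follows. This is only an 
illustration under stated assumptions, not the code in 
Utilities.compareTempOrDuplicateFiles(): AttemptDedupSketch, AttemptFile and 
pickAttempt are hypothetical names, attempt IDs and file sizes are passed in 
directly instead of being read from the file system, and a tie in size is 
assumed to be acceptable.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; the names here are illustrative, not Hive's API. */
final class AttemptDedupSketch {

  /** One output file produced by one attempt of a task. */
  static final class AttemptFile {
    final int attemptId;
    final long sizeBytes;

    AttemptFile(int attemptId, long sizeBytes) {
      this.attemptId = attemptId;
      this.sizeBytes = sizeBytes;
    }
  }

  /**
   * Picks the file of the newest attempt and fails if any other attempt
   * produced a strictly larger file (assumption: equal sizes are fine).
   */
  static AttemptFile pickAttempt(List<AttemptFile> attempts) {
    if (attempts.isEmpty()) {
      throw new IllegalArgumentException("no attempts to choose from");
    }
    // Newest attempt = highest attempt ID.
    AttemptFile newest = attempts.stream()
        .max(Comparator.comparingInt((AttemptFile f) -> f.attemptId))
        .get();
    // Sanity check: the newest attempt's file must also be the largest.
    for (AttemptFile f : attempts) {
      if (f.sizeBytes > newest.sizeBytes) {
        throw new IllegalStateException(
            "attempt " + f.attemptId + " (" + f.sizeBytes
                + " bytes) is larger than newest attempt "
                + newest.attemptId + " (" + newest.sizeBytes + " bytes)");
      }
    }
    return newest;
  }
}
```

With attempts (id 0, 100 bytes) and (id 1, 120 bytes) the sketch returns 
attempt 1; with (id 0, 200 bytes) and (id 1, 50 bytes) it throws instead of 
silently keeping either file.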

 

cc [~gopalv] [~thejas] [~jdere] [~ekoifman]





[jira] [Created] (HIVE-21196) Support semijoin reduction on multiple column join

2019-01-31 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-21196:
-

 Summary: Support semijoin reduction on multiple column join
 Key: HIVE-21196
 URL: https://issues.apache.org/jira/browse/HIVE-21196
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


Currently, a query involving a join on multiple columns creates a separate semi 
join edge for each key, each of which in turn creates its own bloom filter, 
as shown below:

EXPLAIN select count(*) from srcpart_date_n7 join srcpart_small_n3 on 
(srcpart_date_n7.key = srcpart_small_n3.key1 and srcpart_date_n7.value = 
srcpart_small_n3.value1)
{code:java}

Map 1 <- Reducer 5 (BROADCAST_EDGE)
Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
Reducer 5 <- Map 4 (CUSTOM_SIMPLE_EDGE)
 A masked pattern was here 
  Vertices:
Map 1 
Map Operator Tree:
TableScan
  alias: srcpart_date_n7
  filterExpr: (key is not null and value is not null and (key 
BETWEEN DynamicValue(RS_7_srcpart_small_n3_key1_min) AND 
DynamicValue(RS_7_srcpart_small_n3_key1_max) and in_bloom_filter(key, 
DynamicValue(RS_7_srcpart_small_n3_key1_bloom_filter (type: boolean)
  Statistics: Num rows: 2000 Data size: 356000 Basic stats: 
COMPLETE Column stats: COMPLETE
  Filter Operator
predicate: ((key BETWEEN 
DynamicValue(RS_7_srcpart_small_n3_key1_min) AND 
DynamicValue(RS_7_srcpart_small_n3_key1_max) and in_bloom_filter(key, 
DynamicValue(RS_7_srcpart_small_n3_key1_bloom_filter))) and key is not null and 
value is not null) (type: boolean)
Statistics: Num rows: 2000 Data size: 356000 Basic stats: 
COMPLETE Column stats: COMPLETE
Select Operator
  expressions: key (type: string), value (type: string)
  outputColumnNames: _col0, _col1
  Statistics: Num rows: 2000 Data size: 356000 Basic stats: 
COMPLETE Column stats: COMPLETE
  Reduce Output Operator
key expressions: _col0 (type: string), _col1 (type: 
string)
sort order: ++
Map-reduce partition columns: _col0 (type: string), 
_col1 (type: string)
Statistics: Num rows: 2000 Data size: 356000 Basic 
stats: COMPLETE Column stats: COMPLETE
Execution mode: vectorized, llap
LLAP IO: all inputs
Map 4 
Map Operator Tree:
TableScan
  alias: srcpart_small_n3
  filterExpr: (key1 is not null and value1 is not null) (type: 
boolean)
  Statistics: Num rows: 20 Data size: 3560 Basic stats: PARTIAL 
Column stats: PARTIAL
  Filter Operator
predicate: (key1 is not null and value1 is not null) (type: 
boolean)
Statistics: Num rows: 20 Data size: 3560 Basic stats: 
PARTIAL Column stats: PARTIAL
Select Operator
  expressions: key1 (type: string), value1 (type: string)
  outputColumnNames: _col0, _col1
  Statistics: Num rows: 20 Data size: 3560 Basic stats: 
PARTIAL Column stats: PARTIAL
  Reduce Output Operator
key expressions: _col0 (type: string), _col1 (type: 
string)
sort order: ++
Map-reduce partition columns: _col0 (type: string), 
_col1 (type: string)
Statistics: Num rows: 20 Data size: 3560 Basic stats: 
PARTIAL Column stats: PARTIAL
  Select Operator
expressions: _col0 (type: string)
outputColumnNames: _col0
Statistics: Num rows: 20 Data size: 3560 Basic stats: 
PARTIAL Column stats: PARTIAL
Group By Operator
  aggregations: min(_col0), max(_col0), 
bloom_filter(_col0, expectedEntries=20)
  mode: hash
  outputColumnNames: _col0, _col1, _col2
  Statistics: Num rows: 1 Data size: 730 Basic stats: 
PARTIAL Column stats: PARTIAL
  Reduce Output Operator
sort order: 
Statistics: Num rows: 1 Data size: 730 Basic stats: 
PARTIAL Column stats: PARTIAL
value expressions: _col0 (type: string), _col1 
(type: string), _col2 (type: binary)
Execution mode: vectorized, llap
LLAP IO: all inputs
Reducer 2 
Execution mode: llap
Reduce Operator Tree:
  Merge Join Operator
c

Re: Review Request 69663: HIVE-16976

2019-01-11 Thread Deepak Jaiswal


> On Jan. 11, 2019, 6:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 35 (patched)
> > <https://reviews.apache.org/r/69663/diff/2/?file=2118652#file2118652line35>
> >
> > This is not needed?

Yes. That is correct. I will remove it before committing.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211896
-------


On Jan. 9, 2019, 5:50 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 9, 2019, 5:50 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7f069eaa7 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out f900a01be4 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 00bc6cec55 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 1cf281afbd 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out 6255abdd70 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-09 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/
---

(Updated Jan. 9, 2019, 5:50 p.m.)


Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
and Jason Dere.


Changes
---

Updated the patch with review comments.


Bugs: HIVE-16976
https://issues.apache.org/jira/browse/HIVE-16976


Repository: hive-git


Description
---

DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN

The patch supports predicates on non-equi joins and provides an interface for 
storage handler to decide if it can use this optimization.
Work to integrate this with DPP and semijoin will be done in separate JIRA.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7f069eaa7 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
2ebb149354 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
 a1401aac72 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
676dfc9421 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
e97e44796f 
  ql/src/test/results/clientpositive/llap/cross_prod_1.q.out f900a01be4 
  ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
de74af6dff 
  ql/src/test/results/clientpositive/llap/semijoin.q.out 00bc6cec55 
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
  ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 1cf281afbd 
  ql/src/test/results/clientpositive/llap/subquery_select.q.out 6255abdd70 


Diff: https://reviews.apache.org/r/69663/diff/2/

Changes: https://reviews.apache.org/r/69663/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 284 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line288>
> >
> > We should add a call to extended version here as we did above for 
> > equality predicates. The only required change seems to be in 
> > _addParentReduceSink_ called from _createDerivatives_, which would receive 
> > the comparison operator from here. All the rest should already work as 
> > expected.
> > 
> > I believe this could be addressed in this JIRA since it is not a lot of 
> > code. However, if it is not addressed, please create follow-up and leave a 
> > TODO.
> 
> Deepak Jaiswal wrote:
> Will add the extended version. Thanks for bringing this up.

The existing logic for extension works for equality. I am planning to do this 
later. HIVE-21098 tracks it.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
-------


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



[jira] [Created] (HIVE-21098) DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN needs extension

2019-01-07 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-21098:
-

 Summary: DPP: SyntheticJoinPredicate transitivity for < > and 
BETWEEN needs extension
 Key: HIVE-21098
 URL: https://issues.apache.org/jira/browse/HIVE-21098
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


SyntheticJoinPredicates are supported for equality, in both regular and 
extended formats.

A similar extended format is needed for non-equi joins too.

 

See HIVE-16976





Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line334>
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For example,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?
> 
> Deepak Jaiswal wrote:
> That would require verifying the function text, which is done in the switch 
> case anyway.
> For synthetic join predicates on non-equi joins to work properly, the switch 
> case must return a valid inversion text; otherwise the function is not supported.
> That is why I used "1" to make sure it goes through the switch case. This 
> avoids duplicating similar logic.
> 
> Jesús Camacho Rodríguez wrote:
> OK, I was getting confused by the semantics of the srcPos parameter (an 
> 'invert' boolean would have been clearer).
> Tbh, I think it is better to create two methods: one internal in 
> SyntheticJoinPredicate that would return whether a function is supported or 
> not, and a utility method in FunctionRegistry that would return the inverse 
> of a given function. Overhead is negligible and the semantics will be clearly 
> distinct.

Having two functions could create a maintenance headache. As mentioned above, 
the function will go to FunctionRegistry. There is already a comment,
return null; // helps identify unsupported functions

I can expand the comment to make things clearer.
Leaving the function as it is keeps things short and sweet and involves much 
less maintenance.
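
For readers following the thread, the two designs under discussion can be 
sketched side by side. This is a reconstruction from the comments above, not 
the code in the patch: the class name, isSupported and invert are hypothetical, 
and the operator strings are assumed to be the plain comparison symbols.

```java
/** Hypothetical sketch reconstructed from the discussion; not the patch itself. */
final class ComparisonInversionSketch {

  /**
   * Inverse of a comparison when its two operands are swapped,
   * e.g. "a < b" becomes "b > a". Returns null for anything else.
   */
  static String invert(String funcText) {
    switch (funcText) {
      case "<":  return ">";
      case "<=": return ">=";
      case ">":  return "<";
      case ">=": return "<=";
      default:   return null; // helps identify unsupported functions
    }
  }

  /** Two-method proposal: an explicit support check next to invert(). */
  static boolean isSupported(String funcText) {
    return invert(funcText) != null;
  }

  /**
   * Single-method shape described in the thread: position 0 returns the
   * text unchanged (and unvalidated); any other position goes through the
   * switch, so a null result marks an unsupported function.
   */
  static String getFuncText(String funcText, int srcPos) {
    return srcPos == 0 ? funcText : invert(funcText);
  }
}
```

In this sketch the check getFuncText(text, 1) == null and the check 
!isSupported(text) are equivalent; the disagreement in the thread is about 
which of the two reads more clearly, not about behavior.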


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---





Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 10:40 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 254 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line258>
> >
> > srcPos and targetPos do not seem to refer to function, but rather the 
> > inputs being joined. In addition, they do not change between loop 
> > iterations. Below, they are used to retrieve the child expression from the 
> > function, which does not seem correct.
> 
> Jesús Camacho Rodríguez wrote:
> OK, seeing your comment above, I understood better the code here. You may 
> be inverting the source and target, that is why you access the function 
> expression using them. Could you leave a comment explaining it?
> My comment above about the value change for srcPos and targetPos between 
> iterations still seems valid: the check could be done beforehand to skip the 
> loop in line 242.

Thanks for the tip. Yes, it makes sense to have the check before the loop 
begins as srcPos and targetPos do not change. We can skip the whole logic with 
this condition even before the if condition.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211745
-------





Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line183>
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.
> 
> Deepak Jaiswal wrote:
> The continue is removed so that it reaches the residualFilter logic, 
> otherwise it would skip everything and move on to next target.
> 
> Jesús Camacho Rodríguez wrote:
> You are right, I did not see the extra }. Could the comment '//if 
> (sourceKeys.size() < 1) continue;' below be removed then? No need to leave it 
> there.

Sure. I forgot to remove it.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line334>
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For example,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?

That would require verifying the function text, which is done in the switch 
case anyway.
For synthetic join predicates on non-equi joins to work properly, the switch 
case must return a valid inversion text; otherwise the function is not supported.
That is why I used "1" to make sure it goes through the switch case. This 
avoids duplicating similar logic.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---





Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > Can we add some tests for the new feature?

There is no test yet because the feature does nothing end to end: neither the DPP 
route nor the semijoin reduction route processes the predicate yet.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line183>
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.

The continue is removed so that it reaches the residualFilter logic, otherwise 
it would skip everything and move on to next target.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 284 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line288>
> >
> > We should add a call to extended version here as we did above for 
> > equality predicates. The only required change seems to be in 
> > _addParentReduceSink_ called from _createDerivatives_, which would receive 
> > the comparison operator from here. All the rest should already work as 
> > expected.
> > 
> > I believe this could be addressed in this JIRA since it is not a lot of 
> > code. However, if it is not addressed, please create follow-up and leave a 
> > TODO.

Will add the extended version. Thanks for bringing this up.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 318 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line322>
> >
> > return colExprMap.get(rsColName)

:|


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 328 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line332>
> >
> > Can we move this method as _invertFunction_ utility method to 
> > _FunctionRegistry.java_?
> > 
> > In addition, instead of relying on the function text, I believe it 
> > would be more robust to have the UDF as the input. In particular, we can 
> > use _funcDesc.getGenericUDF();_ when calling this method, then rely in e.g. 
> > _udf instanceof GenericUDFOPEqualOrGreaterThan_ for the checks.

Yes, I can move this.
I used the function text because it lets me use a switch case, which is also 
much faster.
Otherwise, once extended in the future, this could become a giant mess of if...else 
statements.
We can discuss this further.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > <https://reviews.apache.org/r/69663/diff/1/?file=2117432#file2117432line334>
> >
> > Should we still return null if function text is not recognized?

Yes, that helps recognize unsupported functions. For example,

ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) filter;
// filter should be of type <, >, <= or >=
if (getFuncText(funcDesc.getFuncText(), 1) == null) {
  // unsupported
  continue;
}

I am open to better ways, hence the TODO.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in se

Review Request 69663: HIVE-16976

2019-01-03 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/
---

Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
and Jason Dere.


Bugs: HIVE-16976
https://issues.apache.org/jira/browse/HIVE-16976


Repository: hive-git


Description
---

DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN

The patch supports predicates on non-equi joins and provides an interface for 
storage handler to decide if it can use this optimization.
Work to integrate this with DPP and semijoin will be done in separate JIRA.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
2ebb149354 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
 a1401aac72 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
676dfc9421 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
e97e44796f 
  ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
  ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
de74af6dff 
  ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
  ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
  ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 


Diff: https://reviews.apache.org/r/69663/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20868) SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

2018-11-05 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20868:
-

 Summary: SMB Join fails intermittently when TezDummyOperator has 
child op in getFinalOp in MapRecordProcessor
 Key: HIVE-20868
 URL: https://issues.apache.org/jira/browse/HIVE-20868
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] New PMC Member : Zoltan

2018-10-30 Thread Deepak Jaiswal
Congratulations Zoltan!

On 10/30/18, 10:08 PM, "Ashutosh Chauhan"  wrote:

 Hello Hive community,

I'm pleased to announce that Zoltan Haindrich has accepted the Apache
Hive PMC's invitation, and is our newest PMC member. Many thanks to
Zoltan for all of his hard work.

Please join me in congratulating Zoltan!

Thanks,
Ashutosh




Re: [ANNOUNCE] New committer: Nishant Bangarwa

2018-10-19 Thread Deepak Jaiswal
Congratulations!

On 10/19/18, 10:14 AM, "Vineet Garg"  wrote:

Congrats Nishant!

> On Oct 19, 2018, at 8:36 AM, Gunther Hagleitner 
 wrote:
> 
> Congrats Nishant!
> 
> Cheers,
> Gunther.
> 
> From: Andrew Sherman 
> Sent: Friday, October 19, 2018 8:34 AM
> To: dev@hive.apache.org
> Subject: Re: [ANNOUNCE] New committer: Nishant Bangarwa
> 
> Congratulations Nishant!
> 
> On Fri, Oct 19, 2018 at 4:29 AM Peter Vary 
> wrote:
> 
>> Congratulations Nishant!
>> 
>>> On Oct 19, 2018, at 07:42, Sankar Hariappan 
>> wrote:
>>> 
>>> Congrats Nishant!
>>> 
>>> Best regards
>>> Sankar
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 15/10/18, 12:45 PM, "Ashutosh Chauhan"  wrote:
>>> 
>>>> Apache Hive's Project Management Committee (PMC) has invited Nishant
>>>> Bangarwa to become a committer, and we are pleased to announce that he has
>>>> accepted.
>>>>
>>>> Nishant, welcome, thank you for your contributions, and we look forward to
>>>> your further interactions with the community!
>>>>
>>>> Ashutosh Chauhan (on behalf of the Apache Hive PMC)
>> 
>> 
> 
> 





Re: [ANNOUNCE] New committer: Janaki Lahorani

2018-10-09 Thread Deepak Jaiswal
Congratulations Janaki.

On 10/9/18, 8:52 AM, "Vihang Karajgaonkar"  wrote:

Congratulations Janaki!

On Tue, Oct 9, 2018 at 8:27 AM Andrew Sherman 

wrote:

> Congratulations Janaki!
>
> On Mon, Oct 8, 2018 at 10:05 PM Ashutosh Chauhan 
> wrote:
>
> > Apache Hive's Project Management Committee (PMC) has invited Janaki
> > Lahorani to become a committer, and we are pleased to announce that she
> has
> > accepted.
> > Janaki, welcome, thank you for your contributions, and we look forward 
to
> > your further interactions with the community!
> >
> > Ashutosh Chauhan (on behalf of the Apache Hive PMC)
> >
>




[jira] [Created] (HIVE-20641) load_data_using_job is failing

2018-09-26 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20641:
-

 Summary: load_data_using_job is failing
 Key: HIVE-20641
 URL: https://issues.apache.org/jira/browse/HIVE-20641
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


load_data_using_job is failing due to result diff.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 68848: HIVE-20540

2018-09-26 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68848/
---

Review request for hive and Gopal V.


Bugs: HIVE-20540
https://issues.apache.org/jira/browse/HIVE-20540


Repository: hive-git


Description
---

Vectorization : Support loading bucketed tables using sorted dynamic partition 
optimizer - II

Followup to HIVE-20510 with remaining issues

1. Avoid using Reflection.
2. In VectorizationContext, use correct place to setup the VectorExpression. It 
may be missed in certain cases.
3. In BucketNumExpression, make sure that a value is not overwritten before it 
is processed. Use a flag to achieve this.
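
A minimal sketch of the flag idea in item 3, assuming a single-slot holder: 
a boolean records whether the stored value is still waiting to be processed, 
and a second write before that happens is rejected. GuardedSlot and its 
methods are hypothetical names for illustration, not the BucketNumExpression 
code.

```java
/** Hypothetical single-slot holder; not the actual BucketNumExpression code. */
final class GuardedSlot {
  private int value;
  private boolean pending; // true while a value is waiting to be consumed

  /** Stores a value, refusing to overwrite one that has not been consumed yet. */
  void set(int newValue) {
    if (pending) {
      throw new IllegalStateException("previous value has not been processed yet");
    }
    value = newValue;
    pending = true;
  }

  /** Returns the stored value and clears the flag so the next set() is allowed. */
  int consume() {
    if (!pending) {
      throw new IllegalStateException("no value to process");
    }
    pending = false;
    return value;
  }
}
```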


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
55d2a16f03 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/BucketNumExpression.java
 d8c696c302 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 1a8395a71b 


Diff: https://reviews.apache.org/r/68848/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68772: HIVE-20593

2018-09-24 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68772/
---

(Updated Sept. 24, 2018, 6:56 a.m.)


Review request for hive and Eugene Koifman.


Changes
---

Implemented changes recommended.
Got green run on ptests.


Bugs: HIVE-20593
https://issues.apache.org/jira/browse/HIVE-20593


Repository: hive-git


Description
---

Load Data for partitioned ACID tables fails with bucketId out of range: -1

The tempTblObj is inherited from target table. However, the only table property 
which needs to be inherited is bucketing version. Properties like transactional 
etc should be ignored.
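
The intended behaviour can be sketched as follows, assuming table properties 
are a plain string map: only the bucketing version is copied to the temporary 
table, and everything else (for example the transactional flag) is dropped. 
TempTableProps, inheritFrom and the "bucketing_version" key are assumptions 
made for illustration; the real change lives in LoadSemanticAnalyzer.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not the code in LoadSemanticAnalyzer. */
final class TempTableProps {
  // Assumed property key; check the Hive metastore constants for the real name.
  static final String BUCKETING_VERSION = "bucketing_version";

  private TempTableProps() {
  }

  /** Builds the temp table's properties, inheriting only the bucketing version. */
  static Map<String, String> inheritFrom(Map<String, String> targetProps) {
    Map<String, String> tempProps = new HashMap<>();
    String version = targetProps.get(BUCKETING_VERSION);
    if (version != null) {
      tempProps.put(BUCKETING_VERSION, version);
    }
    return tempProps;
  }
}
```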


Diffs (updated)
-

  
data/files/load_data_job_acid/20180918230307-b382b8c7-271c-4025-be64-4a68f4db32e5_0_0
 PRE-CREATION 
  
data/files/load_data_job_acid/20180918230307-b382b8c7-271c-4025-be64-4a68f4db32e5_1_0
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
8d33cf5b23 
  ql/src/test/queries/clientpositive/load_data_using_job.q b760d9bc7e 
  ql/src/test/results/clientpositive/llap/load_data_using_job.q.out 21fd9334ea 


Diff: https://reviews.apache.org/r/68772/diff/2/

Changes: https://reviews.apache.org/r/68772/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 68772: HIVE-20593

2018-09-19 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68772/
---

Review request for hive and Eugene Koifman.


Bugs: HIVE-20593
https://issues.apache.org/jira/browse/HIVE-20593


Repository: hive-git


Description
---

Load Data for partitioned ACID tables fails with bucketId out of range: -1

The tempTblObj is inherited from target table. However, the only table property 
which needs to be inherited is bucketing version. Properties like transactional 
etc should be ignored.


Diffs
-

  
data/files/load_data_job_acid/20180918230307-b382b8c7-271c-4025-be64-4a68f4db32e5_0_0
 PRE-CREATION 
  
data/files/load_data_job_acid/20180918230307-b382b8c7-271c-4025-be64-4a68f4db32e5_1_0
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
8d33cf5b23 
  ql/src/test/queries/clientpositive/load_data_using_job.q b760d9bc7e 
  ql/src/test/results/clientpositive/llap/load_data_using_job.q.out 21fd9334ea 


Diff: https://reviews.apache.org/r/68772/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20593) Load Data for partitioned ACID tables fails with bucketId out of range: -1

2018-09-19 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20593:
-

 Summary: Load Data for partitioned ACID tables fails with bucketId 
out of range: -1
 Key: HIVE-20593
 URL: https://issues.apache.org/jira/browse/HIVE-20593
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


Load data for ACID tables is failing to load ORC files when it is converted to 
IAS job.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-11 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20540:
-

 Summary: Vectorization : Support loading bucketed tables using 
sorted dynamic partition optimizer - II
 Key: HIVE-20540
 URL: https://issues.apache.org/jira/browse/HIVE-20540
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


Followup to HIVE-20510 with remaining issues,

 

1. Avoid using Reflection.
2. In VectorizationContext, use correct place to setup the VectorExpression. It 
may be missed in certain cases.
3. In BucketNumExpression, make sure that a value is not overwritten before it 
is processed. Use a flag to achieve this.

cc [~gopalv]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jdbc tests failing randomly

2018-09-08 Thread Deepak Jaiswal
Hi All,

It seems the jdbc UTs are failing randomly. Any idea who might know about them?

https://builds.apache.org/job/PreCommit-HIVE-Build/13661/testReport

https://builds.apache.org/job/PreCommit-HIVE-Build/13658/testReport

Regards,
Deepak


Re: Review Request 68648: HIVE-20510

2018-09-08 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68648/
---

(Updated Sept. 8, 2018, 8:38 a.m.)


Review request for hive, Gopal V and Matt McCline.


Changes
---

Updated results for failing tests.


Bugs: HIVE-20510
https://issues.apache.org/jira/browse/HIVE-20510


Repository: hive-git


Description
---

Vectorization : Support loading bucketed tables using sorted dynamic partition 
optimizer.
Added a new VectorExpression BucketNumExpression to evaluate _bucket_number.
Made the loops as tight as possible.


Diffs (updated)
-

  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 74a9a56f07 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 ee02c36f03 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 8bf0a9c77d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java a2a9c8421e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
57f7c0108e 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/BucketNumExpression.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 5ab59c9c61 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
 51010aac85 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBucketNumber.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/dynpart_sort_opt_vectorization.q 
435cdaddd0 
  ql/src/test/results/clientpositive/dynpart_sort_optimization_acid2.q.out 
aea757205f 
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out 
22f0a31eb3 
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization.q.out 
21fc2c545a 
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out 
a0a5e0cf32 
  ql/src/test/results/clientpositive/show_functions.q.out 90608e2905 


Diff: https://reviews.apache.org/r/68648/diff/4/

Changes: https://reviews.apache.org/r/68648/diff/3-4/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68648: HIVE-20510

2018-09-07 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68648/
---

(Updated Sept. 7, 2018, 7:57 p.m.)


Review request for hive, Gopal V and Matt McCline.


Changes
---

Missed non-vectorized case and some result updates.


Bugs: HIVE-20510
https://issues.apache.org/jira/browse/HIVE-20510


Repository: hive-git


Description
---

Vectorization : Support loading bucketed tables using sorted dynamic partition 
optimizer.
Added a new VectorExpression BucketNumExpression to evaluate _bucket_number.
Made the loops as tight as possible.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 8bf0a9c77d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java a2a9c8421e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
57f7c0108e 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/BucketNumExpression.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 5ab59c9c61 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
 51010aac85 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBucketNumber.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/dynpart_sort_opt_vectorization.q 
435cdaddd0 
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out 
22f0a31eb3 
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization.q.out 
21fc2c545a 
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out 
a0a5e0cf32 


Diff: https://reviews.apache.org/r/68648/diff/3/

Changes: https://reviews.apache.org/r/68648/diff/2-3/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68648: HIVE-20510

2018-09-07 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68648/
---

(Updated Sept. 7, 2018, 6:42 p.m.)


Review request for hive, Gopal V and Matt McCline.


Changes
---

Addressed concerns from Matt's review.
Replaced the constant string _bucket_number with a UDF GenericUDFBucketNumber() 
to make sure _bucket_number could be used as a legitimate string in queries.


Bugs: HIVE-20510
https://issues.apache.org/jira/browse/HIVE-20510


Repository: hive-git


Description
---

Vectorization : Support loading bucketed tables using sorted dynamic partition 
optimizer.
Added a new VectorExpression BucketNumExpression to evaluate _bucket_number.
Made the loops as tight as possible.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 8bf0a9c77d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
57f7c0108e 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/BucketNumExpression.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 5ab59c9c61 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
 51010aac85 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBucketNumber.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/dynpart_sort_opt_vectorization.q 
435cdaddd0 
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out 
22f0a31eb3 


Diff: https://reviews.apache.org/r/68648/diff/2/

Changes: https://reviews.apache.org/r/68648/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 68648: HIVE-20510

2018-09-06 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68648/
---

Review request for hive, Gopal V and Matt McCline.


Bugs: HIVE-20510
https://issues.apache.org/jira/browse/HIVE-20510


Repository: hive-git


Description
---

Vectorization : Support loading bucketed tables using sorted dynamic partition 
optimizer.
Added a new VectorExpression BucketNumExpression to evaluate _bucket_number.
Made the loops as tight as possible.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
57f7c0108e 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/BucketNumExpression.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 5ab59c9c61 
  ql/src/test/queries/clientpositive/dynpart_sort_opt_vectorization.q 
435cdaddd0 
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out 
22f0a31eb3 


Diff: https://reviews.apache.org/r/68648/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20510) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer

2018-09-06 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20510:
-

 Summary: Vectorization : Support loading bucketed tables using 
sorted dynamic partition optimizer
 Key: HIVE-20510
 URL: https://issues.apache.org/jira/browse/HIVE-20510
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


sorted dynamic partition optimizer does not work on bucketed tables when 
vectorization is enabled.

 

cc [~mmccline]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20508) Hive does not support user names of type "user@realm"

2018-09-05 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20508:
-

 Summary: Hive does not support user names of type "user@realm"
 Key: HIVE-20508
 URL: https://issues.apache.org/jira/browse/HIVE-20508
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


Hive does not support user names of type "user@realm". This causes 
authentication problem for user names containing email ids in Kerberos 
environment.

 

cc [~thejas]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] New committer: Andrew Sherman

2018-09-03 Thread Deepak Jaiswal
Congratulations Andrew.

Deepak

On 9/3/18, 10:17 PM, "Zoltan Haindrich"  wrote:

Congratulations Andrew!

On 2 September 2018 04:49:00 CEST, Lefty Leverenz  
wrote:
>Congratulations Andrew!
>
>-- Lefty
>
>
>On Tue, Aug 28, 2018 at 11:36 AM Ashutosh Chauhan
>
>wrote:
>
>> Apache Hive's Project Management Committee (PMC) has invited Andrew
>Sherman
>> to become a committer, and we are pleased to announce that he has
>accepted.
>>
>> Andrew, welcome, thank you for your contributions, and we look
>forward to
>> your
>> further interactions with the community!
>>
>> Ashutosh Chauhan (on behalf of the Apache Hive PMC)
>>




Re: Review Request 68506: HIVE-20187

2018-08-25 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68506/
---

(Updated Aug. 25, 2018, 6:22 a.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Updated results.


Bugs: HIVE-20187
https://issues.apache.org/jira/browse/HIVE-20187


Repository: hive-git


Description
---

Incorrect query results in hive when hive.convert.join.bucket.mapjoin.tez is 
set to true
In some cases, Bucket mapjoin is incorrectly selected which leads to wrong 
results.


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 89db530f54 
  ql/src/test/queries/clientpositive/bucket_map_join_tez2.q adcf6962ab 
  ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out 4f042cee50 
  ql/src/test/results/clientpositive/llap/limit_pushdown.q.out 4fc1419acd 
  ql/src/test/results/clientpositive/llap/offset_limit_ppd_optimizer.q.out 
2e8d5f375f 
  ql/src/test/results/clientpositive/llap/tez_smb_main.q.out 9929989f0e 
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 
243cbc3428 


Diff: https://reviews.apache.org/r/68506/diff/2/

Changes: https://reviews.apache.org/r/68506/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 68506: HIVE-20187

2018-08-24 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68506/
---

Review request for hive and Gunther Hagleitner.


Bugs: HIVE-20187
https://issues.apache.org/jira/browse/HIVE-20187


Repository: hive-git


Description
---

Incorrect query results in hive when hive.convert.join.bucket.mapjoin.tez is 
set to true
In some cases, Bucket mapjoin is incorrectly selected which leads to wrong 
results.


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 89db530f54 
  ql/src/test/queries/clientpositive/bucket_map_join_tez2.q adcf6962ab 
  ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out 4f042cee50 
  ql/src/test/results/clientpositive/llap/tez_smb_main.q.out 9929989f0e 


Diff: https://reviews.apache.org/r/68506/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68476: HIVE-20433

2018-08-23 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68476/
---

(Updated Aug. 23, 2018, 8:05 p.m.)


Review request for hive, Ashutosh Chauhan and Gopal V.


Changes
---

Fixed ptest failures.


Bugs: HIVE-20433
https://issues.apache.org/jira/browse/HIVE-20433


Repository: hive-git


Description
---

Implicit String to Timestamp conversion is slow


Diffs (updated)
-

  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
 8a057d1dab 


Diff: https://reviews.apache.org/r/68476/diff/2/

Changes: https://reviews.apache.org/r/68476/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 68476: HIVE-20433

2018-08-22 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68476/
---

Review request for hive, Ashutosh Chauhan and Gopal V.


Bugs: HIVE-20433
https://issues.apache.org/jira/browse/HIVE-20433


Repository: hive-git


Description
---

Implicit String to Timestamp conversion is slow


Diffs (updated)
-

  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
 8a057d1dab 


Diff: https://reviews.apache.org/r/68476/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20433) Implicit String to Timestamp conversion is slow

2018-08-21 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20433:
-

 Summary: Implicit String to Timestamp conversion is slow
 Key: HIVE-20433
 URL: https://issues.apache.org/jira/browse/HIVE-20433
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


getTimestampFromString() is slow at casting dates. It throws twice before date 
conversion can happen.
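
One way to avoid exception-driven control flow here, shown as a sketch under 
stated assumptions rather than the actual fix: inspect the shape of the string 
first and go straight to the matching parser, so a plain date never pays for a 
failed timestamp parse. TimestampParsing, looksLikeDate and the two accepted 
formats are hypothetical; the real method in PrimitiveObjectInspectorUtils 
handles more cases.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch; not the code in PrimitiveObjectInspectorUtils. */
final class TimestampParsing {
  private static final DateTimeFormatter DATE_TIME =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

  private TimestampParsing() {
  }

  /** Cheap shape check for yyyy-MM-dd, so no exception is needed to detect a date. */
  static boolean looksLikeDate(String s) {
    if (s.length() != 10 || s.charAt(4) != '-' || s.charAt(7) != '-') {
      return false;
    }
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (i != 4 && i != 7 && (c < '0' || c > '9')) {
        return false;
      }
    }
    return true;
  }

  /** Parses either a date or a date-time; truly invalid input still throws. */
  static LocalDateTime parse(String raw) {
    String s = raw.trim();
    if (looksLikeDate(s)) {
      return LocalDate.parse(s).atStartOfDay();
    }
    return LocalDateTime.parse(s, DATE_TIME);
  }
}
```

Exceptions are still thrown for malformed input, but a well-formed date no 
longer triggers one on the way to being converted.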

 

cc [~gopalv] [~ashutoshc]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 68359: HIVE-20393

2018-08-15 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68359/
---

(Updated Aug. 15, 2018, 10:19 p.m.)


Review request for hive, Gopal V, Jesús Camacho Rodríguez, and Jason Dere.


Changes
---

Less reliant on NPE handling.


Bugs: HIVE-20393
https://issues.apache.org/jira/browse/HIVE-20393


Repository: hive-git


Description
---

See Jira.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java f316f09953 


Diff: https://reviews.apache.org/r/68359/diff/2/

Changes: https://reviews.apache.org/r/68359/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 68359: HIVE-20393

2018-08-15 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68359/
---

Review request for hive, Gopal V and Jesús Camacho Rodríguez.


Bugs: HIVE-20393
https://issues.apache.org/jira/browse/HIVE-20393


Repository: hive-git


Description
---

See Jira.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java f316f09953 


Diff: https://reviews.apache.org/r/68359/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20393) Semijoin Reduction : markSemiJoinForDPP behaves inconsistently

2018-08-15 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20393:
-

 Summary: Semijoin Reduction : markSemiJoinForDPP behaves 
inconsistently
 Key: HIVE-20393
 URL: https://issues.apache.org/jira/browse/HIVE-20393
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


markSemiJoinForDPP has multiple issues,

 
 * Uses map tsOps which is wrong as it disallows going through the same TS, which may 
have filters from more than one semijoin edge. This results in inconsistent 
plans for the same query, as semijoin edges may be processed in a different order each 
time.
 * Uses getColumnExpr() which is not as robust as extractColumn() thus 
resulting in NPEs.
 * The logic to mark an edge useful when NPE is hit may end up having bad edge.
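
The first issue can be illustrated with a toy model, which is only a sketch 
and not the TezCompiler code: if a visited set keyed by table scan decides 
which edges get examined, the outcome depends on the order in which the edges 
arrive. EdgeOrderDemo, Edge and both methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the ordering problem; not the markSemiJoinForDPP code. */
final class EdgeOrderDemo {

  /** A semijoin edge identified by name, plus the table scan it filters. */
  static final class Edge {
    final String name;
    final String tableScan;

    Edge(String name, String tableScan) {
      this.name = name;
      this.tableScan = tableScan;
    }
  }

  private EdgeOrderDemo() {
  }

  /** Order-dependent: only the first edge to reach a table scan is examined. */
  static List<String> examinedWithVisitedSet(List<Edge> edges) {
    Set<String> visited = new HashSet<>();
    List<String> examined = new ArrayList<>();
    for (Edge e : edges) {
      if (visited.add(e.tableScan)) {
        examined.add(e.name);
      }
    }
    return examined;
  }

  /** Every edge is examined, whichever table scan it targets. */
  static List<String> examinedIndependently(List<Edge> edges) {
    List<String> examined = new ArrayList<>();
    for (Edge e : edges) {
      examined.add(e.name);
    }
    return examined;
  }
}
```

With two edges A and B that both filter the same table scan, the visited-set 
variant examines only whichever edge comes first, while the independent 
variant always examines both.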

cc [~gopalv]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 68281: HIVE-20354

2018-08-10 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68281/
---

(Updated Aug. 10, 2018, 8:59 p.m.)


Review request for hive, Eugene Koifman and Jason Dere.


Changes
---

A new approach.


Bugs: HIVE-20354
https://issues.apache.org/jira/browse/HIVE-20354


Repository: hive-git


Description
---

Semijoin hints don't work with merge statements.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f4d12ae564 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a63aabed9f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 
8df290435d 
  ql/src/test/queries/clientpositive/semijoin_hint.q de176affd3 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 679916de07 


Diff: https://reviews.apache.org/r/68281/diff/4/

Changes: https://reviews.apache.org/r/68281/diff/3-4/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68281: HIVE-20354

2018-08-10 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68281/
---

(Updated Aug. 10, 2018, 6:48 a.m.)


Review request for hive, Eugene Koifman and Jason Dere.


Changes
---

Fixed the issue for tables such as "select_table"


Bugs: HIVE-20354
https://issues.apache.org/jira/browse/HIVE-20354


Repository: hive-git


Description
---

Semijoin hints don't work with merge statements.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f4d12ae564 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a63aabed9f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 
8df290435d 
  ql/src/test/queries/clientpositive/semijoin_hint.q de176affd3 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 679916de07 


Diff: https://reviews.apache.org/r/68281/diff/3/

Changes: https://reviews.apache.org/r/68281/diff/2-3/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68281: HIVE-20354

2018-08-09 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68281/
---

(Updated Aug. 9, 2018, 7:19 p.m.)


Review request for hive, Eugene Koifman and Jason Dere.


Changes
---

Implemented review comments.


Bugs: HIVE-20354
https://issues.apache.org/jira/browse/HIVE-20354


Repository: hive-git


Description
---

Semijoin hints don't work with merge statements.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f4d12ae564 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 463880587e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 
8df290435d 
  ql/src/test/queries/clientpositive/semijoin_hint.q de176affd3 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 679916de07 


Diff: https://reviews.apache.org/r/68281/diff/2/

Changes: https://reviews.apache.org/r/68281/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68281: HIVE-20354

2018-08-09 Thread Deepak Jaiswal


> On Aug. 9, 2018, 6:33 p.m., Gopal V wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java
> > Lines 1000 (patched)
> > <https://reviews.apache.org/r/68281/diff/1/?file=2070795#file2070795line1000>
> >
> > why not save it directly into setHintList()?

It has to be first processed before it can be set. Anyway I am going to abandon 
this approach in favor of what Eugene suggested.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68281/#review207047
---


On Aug. 9, 2018, 5:44 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68281/
> ---
> 
> (Updated Aug. 9, 2018, 5:44 p.m.)
> 
> 
> Review request for hive, Eugene Koifman and Jason Dere.
> 
> 
> Bugs: HIVE-20354
> https://issues.apache.org/jira/browse/HIVE-20354
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Semijoin hints don't work with merge statements.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f4d12ae564 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 463880587e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 
> 8df290435d 
>   ql/src/test/queries/clientpositive/semijoin_hint.q de176affd3 
>   ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 679916de07 
> 
> 
> Diff: https://reviews.apache.org/r/68281/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Review Request 68281: HIVE-20354

2018-08-09 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68281/
---

Review request for hive and Jason Dere.


Bugs: HIVE-20354
https://issues.apache.org/jira/browse/HIVE-20354


Repository: hive-git


Description
---

Semijoin hints don't work with merge statements.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f4d12ae564 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 463880587e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 
8df290435d 
  ql/src/test/queries/clientpositive/semijoin_hint.q de176affd3 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 679916de07 


Diff: https://reviews.apache.org/r/68281/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20354) Semijoin hints dont work with merge statements

2018-08-09 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20354:
-

 Summary: Semijoin hints dont work with merge statements
 Key: HIVE-20354
 URL: https://issues.apache.org/jira/browse/HIVE-20354
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


When a merge statement is rewritten, any comment in the query is ignored, and 
such a comment may include hints like semijoin.
If a hint is present, it should not be ignored.
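
The expected behaviour can be illustrated with a minimal sketch. This is not the Hive parser or UpdateDeleteSemanticAnalyzer code: the class, the regular expression, and the sample statement are all hypothetical, and it only shows the idea of pulling a /*+ ... */ hint out of the original text so a rewrite could carry it forward.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HintExtractionSketch {
  // Matches an optimizer-hint style comment: /*+ ... */ (hypothetical pattern).
  private static final Pattern HINT =
      Pattern.compile("/\\*\\+(.*?)\\*/", Pattern.DOTALL);

  /** Returns the hint text without the comment markers, or null if there is none. */
  static String extractHint(String sql) {
    Matcher m = HINT.matcher(sql);
    return m.find() ? m.group(1).trim() : null;
  }

  /** Re-attaches the hint from the original statement to a rewritten SELECT body. */
  static String selectWithHint(String originalSql, String selectBody) {
    String hint = extractHint(originalSql);
    return hint == null
        ? "SELECT " + selectBody
        : "SELECT /*+ " + hint + " */ " + selectBody;
  }

  public static void main(String[] args) {
    // Illustrative statement only; table and column names are made up.
    String merge = "MERGE /*+ semi(s, key, t, 5000) */ INTO t USING s ON t.key = s.key";
    System.out.println(selectWithHint(merge, "t.key FROM t JOIN s ON t.key = s.key"));
  }
}
```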



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated Aug. 1, 2018, 6:10 p.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

Left out a minor change from previous patch.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it, so it can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/5/

Changes: https://reviews.apache.org/r/68124/diff/4-5/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated Aug. 1, 2018, 6:05 p.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

Implemented review comments.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it, so it can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/4/

Changes: https://reviews.apache.org/r/68124/diff/3-4/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal


> On Aug. 1, 2018, 2:39 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
> > Lines 451 (patched)
> > <https://reviews.apache.org/r/68124/diff/3/?file=2065696#file2065696line451>
> >
> > We can remove this first block, it does not buy us much in terms of 
> > algorithm performance, and the method would have no restriction on start 
> > operator (plus more readable).
> 
> Deepak Jaiswal wrote:
> No. It won't work without it. It is not for performance, it is for 
> correctness. The start in our case is RS2; going up won't work as it will 
> stop when it encounters RS1.
> The more generic one is in SharedWorkOptimizer; this one, I am afraid, is 
> for this particular case.
> 
> Jesús Camacho Rodríguez wrote:
> The block can be part of the caller logic, so if you have the chain:
> SEL->GBY1->RS1->GBY2->RS2
> then you end up passing the SEL as the start operator.
> 
> 
> Then the method in OperatorUtils has no restriction and it is reusable: 
> given any operator, 1) output all the operators contained in the same work, 
> 2) gather all the terminal operators of that work, and 3) gather all the 
> semijoin branches of that work.

Thanks.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206717
---


On Aug. 1, 2018, 12:27 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 12:27 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it, so it can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal


> On Aug. 1, 2018, 2:39 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
> > Lines 451 (patched)
> > <https://reviews.apache.org/r/68124/diff/3/?file=2065696#file2065696line451>
> >
> > We can remove this first block, it does not buy us much in terms of 
> > algorithm performance, and the method would have no restriction on start 
> > operator (plus more readable).

No. It won't work without it. It is not for performance, it is for correctness. 
The start in our case is RS2; going up won't work as it will stop when it 
encounters RS1.
The more generic one is in SharedWorkOptimizer; this one, I am afraid, is for 
this particular case.


> On Aug. 1, 2018, 2:39 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
> > Lines 462 (patched)
> > <https://reviews.apache.org/r/68124/diff/3/?file=2065696#file2065696line462>
> >
> > Probably more useful to do the inverse, the private method void and the 
> > public method returns the operators in the work?

Aah, what was I thinking. I meant to do that only. Thanks for pointing this out.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206717
---


On Aug. 1, 2018, 12:27 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 12:27 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it, so it can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated Aug. 1, 2018, 12:27 a.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

Implemented review comments.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it, so it can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/3/

Changes: https://reviews.apache.org/r/68124/diff/2-3/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal


> On July 31, 2018, 11:38 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
> > Line 914 (original), 917 (patched)
> > <https://reviews.apache.org/r/68124/diff/2/?file=2065678#file2065678line983>
> >
> > Can be collapsed into single line in if condition.

I have been asked not to do that in other reviews before, so I kept it that way. 
Let's keep it this way.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206707
---


On July 31, 2018, 11:07 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated July 31, 2018, 11:07 p.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it, so it can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal


> On July 31, 2018, 11:38 p.m., Jesús Camacho Rodríguez wrote:
> >

Thanks I will work on the comments.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206707
---


On July 31, 2018, 11:07 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated July 31, 2018, 11:07 p.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it, so it can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated July 31, 2018, 11:07 p.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

New approach where a virtual edge is created from non-semijoin terminal 
operators in a task to semijoin terminal operators within the task.
This creates a cycle if there exists a task level cycle.
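
A minimal sketch of the virtual-edge idea follows. It is not the TezCompiler code: the graph class and all operator names are hypothetical, and the plan is a made-up case where the semijoin branch forks off before a map join, so the operator graph alone has no cycle even though the two tasks depend on each other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class VirtualEdgeCycleSketch {
  // Directed operator graph keyed by (hypothetical) operator name.
  private final Map<String, List<String>> edges = new HashMap<>();

  public void addEdge(String from, String to) {
    edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    edges.computeIfAbsent(to, k -> new ArrayList<>());
  }

  /** Depth-first search; a node met again while still on the stack means a cycle. */
  public boolean hasCycle() {
    Set<String> done = new HashSet<>();
    Set<String> onStack = new HashSet<>();
    for (String op : edges.keySet()) {
      if (visit(op, done, onStack)) {
        return true;
      }
    }
    return false;
  }

  private boolean visit(String op, Set<String> done, Set<String> onStack) {
    if (onStack.contains(op)) {
      return true;
    }
    if (!done.add(op)) {
      return false; // already fully explored
    }
    onStack.add(op);
    for (String child : edges.get(op)) {
      if (visit(child, done, onStack)) {
        return true;
      }
    }
    onStack.remove(op);
    return false;
  }

  public static VirtualEdgeCycleSketch buildPlan(boolean withVirtualEdge) {
    VirtualEdgeCycleSketch g = new VirtualEdgeCycleSketch();
    // Task 1: the semijoin branch forks off before the map join.
    g.addEdge("TS1", "FIL");
    g.addEdge("FIL", "MAPJOIN");
    g.addEdge("MAPJOIN", "RS_OUT");
    g.addEdge("FIL", "SEL_SJ");
    g.addEdge("SEL_SJ", "RS_SJ");
    // Task 2: scans the small table and broadcasts it into the map join.
    g.addEdge("TS2", "RS_BCAST");
    g.addEdge("RS_BCAST", "MAPJOIN");
    // Semijoin edge: Task 1 feeds a runtime filter into the scan of Task 2.
    g.addEdge("RS_SJ", "TS2");
    if (withVirtualEdge) {
      // Virtual edge: non-semijoin terminal -> semijoin terminal of the same task.
      g.addEdge("RS_OUT", "RS_SJ");
    }
    return g;
  }

  public static void main(String[] args) {
    System.out.println("without virtual edge: " + buildPlan(false).hasCycle());
    System.out.println("with virtual edge:    " + buildPlan(true).hasCycle());
  }
}
```

In this toy plan the cycle only becomes visible to an operator-level search once the virtual edge ties the two terminal operators of Task 1 together.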


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it, so it can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/2/

Changes: https://reviews.apache.org/r/68124/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 68124: HIVE-20252

2018-07-30 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it, so it can be ignored.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
011dadf495 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



Re: [ANNOUNCE] New committer: Slim Bouguerra

2018-07-30 Thread Deepak Jaiswal
Congrats Slim!

On 7/30/18, 4:03 PM, "Prasanth Jayachandran"  
wrote:

Congratulations Slim!

> On Jul 30, 2018, at 4:00 PM, Sergey Shelukhin  
wrote:
> 
> Congrats!
> 
> On 18/7/30, 12:53, "Gunther Hagleitner" 
> wrote:
> 
>> Congratulations!
>> 
>> Thanks,
>> Gunther.
>> 
>> From: Xuefu Zhang 
>> Sent: Monday, July 30, 2018 12:11 PM
>> To: dev@hive.apache.org
>> Subject: Re: [ANNOUNCE] New committer: Slim Bouguerra
>> 
>> congratulations!!!
>> 
>> On Mon, Jul 30, 2018 at 12:10 PM, Jesus Camacho Rodriguez <
>> jcamachorodrig...@hortonworks.com> wrote:
>> 
>>> Congrats Slim!
>>> 
>>> On 7/30/18, 10:53 AM, "Andrew Sherman" 
>>> wrote:
>>> 
>>>Congratulations Slim!
>>> 
>>>On Mon, Jul 30, 2018 at 12:46 AM Peter Vary wrote:
>>>
>>>> Congratulations Slim!
>>>>
>>>>> On Jul 30, 2018, at 02:00, Ashutosh Chauhan wrote:
>>>>>
>>>>> Apache Hive's Project Management Committee (PMC) has invited Slim
>>>>> Bouguerra to become a committer, and we are pleased to announce
>>>>> that he has accepted.
>>>>>
>>>>> Slim, welcome, thank you for your contributions, and we look
>>>>> forward to your further interactions with the community!
>>>>>
>>>>> Ashutosh Chauhan (on behalf of the Apache Hive PMC)
>>>>
>>> 
>>> 
>>> 
> 





Re: [ANNOUNCE] New PMC Member : Vineet Garg

2018-07-30 Thread Deepak Jaiswal
Congratulations Vineet!

On 7/30/18, 12:45 AM, "Peter Vary"  wrote:

Congratulations Vineet!

> On Jul 30, 2018, at 01:59, Ashutosh Chauhan  wrote:
> 
> On behalf of the Hive PMC I am delighted to announce Vineet Garg is joining
> Hive PMC.
> Thanks Vineet for all your contributions till now. Looking forward to many
> more.
> 
> Welcome, Vineet!
> 
> Thanks,
> Ashutosh





[jira] [Created] (HIVE-20252) Semijoin Reduction : Cycles due to semi join branch may remain undetected if small table side has a map join upstream.

2018-07-26 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20252:
-

 Summary: Semijoin Reduction : Cycles due to semi join branch may 
remain undetected if small table side has a map join upstream.
 Key: HIVE-20252
 URL: https://issues.apache.org/jira/browse/HIVE-20252
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


For example:

 
 # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
 # 
TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
 #                                                           
-SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
 # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
 # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
 # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
 # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
 #                       -SEL[131]-GBY[132]-EVENT[133]
 # 
TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
 # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
 #                       -SEL[139]-GBY[140]-EVENT[141]
 # 
TS[49]-FIL[103]-SEL[51]-MAPJOIN[160]-GBY[59]-RS[60]-GBY[61]-SEL[62]-RS[64]-MAPJOIN[163]
 # TS[52]-FIL[104]-SEL[54]-RS[56]-MAPJOIN[160]
 #                       -SEL[147]-GBY[148]-EVENT[149]
 # 
 # 
 # DPP information stored in the cache: \{TS[19]=[EVENT[141]], 
TS[9]=[EVENT[133]], TS[49]=[RS[155], EVENT[149]]}

 

The semi join branch in line 3 feeds into TS[49] in line 12, which feeds into 
MAPJOIN[163], going back to the parent of the semi join branch at line 2.


The logic to detect the cycle may fail because there is a MAPJOIN[160] at line 
12, which could cause the logic to look for the wrong TS. The logic to find the 
TS operator upstream must use findOperatorsUpstream() and examine each TS op 
for complete coverage.
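
To illustrate why every upstream TS has to be examined, here is a minimal sketch. It does not use Hive's Operator or OperatorUtils classes: Op and findTableScansUpstream are hypothetical stand-ins, and the sample plan is a simplified version of the TS[49]/TS[52] branches from the log above with the FIL, SEL and GBY operators left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UpstreamScanSketch {
  /** Hypothetical stand-in for a plan operator; not Hive's Operator class. */
  static final class Op {
    final String name;
    final boolean isTableScan;
    final List<Op> parents = new ArrayList<>();

    Op(String name, boolean isTableScan, Op... parents) {
      this.name = name;
      this.isTableScan = isTableScan;
      for (Op p : parents) {
        this.parents.add(p);
      }
    }
  }

  /** Walks every parent path and returns the names of all table scans reached. */
  static Set<String> findTableScansUpstream(Op start) {
    Set<String> scans = new LinkedHashSet<>();
    Set<Op> seen = new HashSet<>();
    Deque<Op> todo = new ArrayDeque<>();
    todo.push(start);
    while (!todo.isEmpty()) {
      Op op = todo.pop();
      if (!seen.add(op)) {
        continue; // reached before through another path
      }
      if (op.isTableScan) {
        scans.add(op.name);
      }
      for (Op parent : op.parents) {
        todo.push(parent);
      }
    }
    return scans;
  }

  /** Simplified plan: a map join with two inputs, each rooted at its own scan. */
  static Op samplePlan() {
    Op ts49 = new Op("TS[49]", true);
    Op ts52 = new Op("TS[52]", true);
    Op rs56 = new Op("RS[56]", false, ts52);
    Op mapJoin160 = new Op("MAPJOIN[160]", false, ts49, rs56);
    return new Op("RS[64]", false, mapJoin160);
  }

  public static void main(String[] args) {
    // Both scans are reported; following only one parent of the map join would miss one.
    System.out.println(findTableScansUpstream(samplePlan()));
  }
}
```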

 

cc [~jcamachorodriguez]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 68069: HIVE-20240

2018-07-26 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68069/
---

Review request for hive and Jason Dere.


Bugs: HIVE-20240
https://issues.apache.org/jira/browse/HIVE-20240


Repository: hive-git


Description
---

See Jira


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
 caec2c08e9 
  ql/src/test/queries/clientpositive/dynamic_semijoin_reduction_4.q a04ab666e0 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_4.q.out 
0feb362023 


Diff: https://reviews.apache.org/r/68069/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20240) Semijoin Reduction : Use local variable to check for external table condition

2018-07-25 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20240:
-

 Summary: Semijoin Reduction : Use local variable to check for 
external table condition
 Key: HIVE-20240
 URL: https://issues.apache.org/jira/browse/HIVE-20240
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


This condition,

 

semiJoin = semiJoin && 
!disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);

 

may set semiJoin to false when an external table is encountered, after which it 
remains false for all subsequent cases. It should only be disabled for that 
particular case.
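
The difference between the two shapes can be shown with a minimal sketch. This is not the DynamicPartitionPruningOptimization code: the class and method names are hypothetical, and a list of booleans stands in for "is this table scan on an external table".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StickyFlagSketch {
  /** Buggy shape: once one external table is seen, the shared flag stays false. */
  static List<Boolean> decideWithSharedFlag(boolean semiJoin, List<Boolean> isExternal) {
    List<Boolean> decisions = new ArrayList<>();
    for (boolean external : isExternal) {
      semiJoin = semiJoin && !external;
      decisions.add(semiJoin);
    }
    return decisions;
  }

  /** Fixed shape: a local variable scopes the decision to one table scan. */
  static List<Boolean> decideWithLocalFlag(boolean semiJoin, List<Boolean> isExternal) {
    List<Boolean> decisions = new ArrayList<>();
    for (boolean external : isExternal) {
      boolean semiJoinAttempted = semiJoin && !external;
      decisions.add(semiJoinAttempted);
    }
    return decisions;
  }

  public static void main(String[] args) {
    List<Boolean> scans = Arrays.asList(false, true, false);
    System.out.println(decideWithSharedFlag(true, scans)); // [true, false, false]
    System.out.println(decideWithLocalFlag(true, scans));  // [true, false, true]
  }
}
```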

 

cc [~jdere]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 67974: HIVE-20164

2018-07-23 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67974/
---

(Updated July 23, 2018, 6:06 p.m.)


Review request for hive, Gopal V and Jason Dere.


Changes
---

Made the data set smaller for easier verification.


Bugs: HIVE-20164
https://issues.apache.org/jira/browse/HIVE-20164


Repository: hive-git


Description
---

Murmur Hash : Make sure CTAS and IAS use correct bucketing version


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 654185d962 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1661aeccd7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e 
  ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67974/diff/4/

Changes: https://reviews.apache.org/r/67974/diff/3-4/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 67974: HIVE-20164

2018-07-23 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67974/
---

(Updated July 23, 2018, 5:48 p.m.)


Review request for hive, Gopal V and Jason Dere.


Changes
---

Sort results for easy verification and to avoid order change due to other 
possible changes.


Bugs: HIVE-20164
https://issues.apache.org/jira/browse/HIVE-20164


Repository: hive-git


Description
---

Murmur Hash : Make sure CTAS and IAS use correct bucketing version


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 654185d962 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1661aeccd7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e 
  ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67974/diff/3/

Changes: https://reviews.apache.org/r/67974/diff/2-3/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 67974: HIVE-20164

2018-07-20 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67974/
---

(Updated July 20, 2018, 11:10 p.m.)


Review request for hive, Gopal V and Jason Dere.


Changes
---

Implemented review comments.


Bugs: HIVE-20164
https://issues.apache.org/jira/browse/HIVE-20164


Repository: hive-git


Description
---

Murmur Hash : Make sure CTAS and IAS use correct bucketing version


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties d5a33bd8ca 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1661aeccd7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e 
  ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67974/diff/2/

Changes: https://reviews.apache.org/r/67974/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 67974: HIVE-20164

2018-07-19 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67974/
---

Review request for hive, Gopal V and Jason Dere.


Bugs: HIVE-20164
https://issues.apache.org/jira/browse/HIVE-20164


Repository: hive-git


Description
---

Murmur Hash : Make sure CTAS and IAS use correct bucketing version


Diffs
-

  itests/src/test/resources/testconfiguration.properties d08528f319 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1b433c7498 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e 
  ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67974/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



Re: [VOTE] Should we release storage-api 2.7.0 rc1?

2018-07-18 Thread Deepak Jaiswal
Thanks for testing out the RC and your vote.
With 3 +1s, the vote passes. I will work on the release now.

Regards,
Deepak

On 7/18/18, 9:14 AM, "Jesus Camacho Rodriguez"  wrote:

+1

Built from sources and ran tests.

-Jesús



On 7/16/18, 10:31 AM, "Ashutosh Chauhan"  wrote:

+1
Built from sources.
Ran unit tests.
Checksums and sigs matched up.

On Mon, Jul 16, 2018 at 8:58 AM Owen O'Malley 
wrote:

> +1
>
> built & ran tests
> checked checksums & signature
> tested with ORC
>
> On Thu, Jul 12, 2018 at 4:37 PM, Deepak Jaiswal
> wrote:
>
> > Hi,
> >
> > I have prepared the rc1 off of branch-3.1.
> > Artifacts:
> > Tag : https://github.com/apache/hive/releases/tag/storage-
> > release-2.7.0-rc1
> > Tar Ball : http://home.apache.org/~djaiswal/hive-storage-2.7.0/
> >
> > Regards,
> > Deepak
> >
> > On 7/10/18, 10:16 AM, "Deepak Jaiswal" 
> wrote:
> >
> > Thanks Owen for finding this out. I will work on the next RC 
once
> this
> > blocker is resolved.
> >
> > Regards,
> > Deepak
> >
> > On 7/10/18, 9:40 AM, "Owen O'Malley"  
wrote:
> >
> > Ok, Jesus and I tracked it down and I've filed
> > https://issues.apache.org/jira/browse/HIVE-20135 that is a
> > blocker on
> > storage-api 2.7.0.
> >
> > The impact was that orc 1.5 and master failed with the RC. 
orc
> 1.4
> > and
> > older were fine.
> >
> > .. Owen
> >
> > On Tue, Jul 10, 2018 at 8:17 AM, Owen O'Malley <
> > owen.omal...@gmail.com>
> > wrote:
> >
> > > I wanted to give an update on this. For now, I'm -1 
because the
> > ORC
> > > (branch-1.5) tests fail with this RC. I'll dig into what 
is
> > wrong, but it
> > > looks like something in the timezone changes broke 
backwards
> > compatibility.
> > >
> > > .. Owen
> > >
> > > On Mon, Jul 9, 2018 at 11:12 AM, Deepak Jaiswal <
> > djais...@hortonworks.com>
> > > wrote:
> > >
> > >> Thanks Alan.
> > >>
> > >> On 7/9/18, 10:17 AM, "Alan Gates" 
> wrote:
> > >>
> > >> +1.  Did a build with a clean maven repo, checked the
> > signature and
> > >> sha
> > >> hash, ran RAT.
> > >>
> > >> Alan.
> > >>
> > >> On Fri, Jul 6, 2018 at 2:21 PM Deepak Jaiswal <
> > >> djais...@hortonworks.com>
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > I would like to make a new release of the 
storage-api.
> It
> > contains
> > >> changes
> > >> > required for Hive 3.1 release.
> > >> >
> > >> > Artifacts:
> > >> > Tag :
> > >> > 
https://github.com/apache/hive/releases/tag/storage-
> > release-
> > >> 2.7.0-rc0
> > >> > Tar Ball : http://home.apache.org/~
> > djaiswal/hive-storage-2.7.0/
> > >> >
> > >> > Regards,
> > >> > Deepak
> > >> >
> > >>
> > >>
> > >>
> > >
> >
> >
> >
> >
> >
>







[jira] [Created] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-12 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20164:
-

 Summary: Murmur Hash : Make sure CTAS and IAS use correct 
bucketing version
 Key: HIVE-20164
 URL: https://issues.apache.org/jira/browse/HIVE-20164
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


With the migration to Murmur hash, CTAS and IAS from an old table version to a 
new table version do not work as intended: data is hashed using the old hash 
logic.
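
The intended invariant can be sketched as follows. This is not Hive code: the two hash methods are deliberately simple stand-ins rather than the real legacy or Murmur implementations, the version numbers 1 and 2 are used only as labels for "old" and "new", and every name is hypothetical. The point is only that rows written by CTAS or IAS should be bucketed with the hash that belongs to the destination table.

```java
public class BucketingVersionSketch {
  // Stand-ins for the two hash families; NOT Hive's real hash functions.
  static int legacyHash(String key) {
    return key.hashCode();
  }

  static int newHash(String key) {
    return 31 * key.hashCode() + 17;
  }

  static int hashFor(int bucketingVersion, String key) {
    return bucketingVersion == 2 ? newHash(key) : legacyHash(key);
  }

  /** Maps a key to a bucket using the hash that matches the given version. */
  static int bucketFor(int bucketingVersion, String key, int numBuckets) {
    return (hashFor(bucketingVersion, key) & Integer.MAX_VALUE) % numBuckets;
  }

  /**
   * Bucket for a row copied from a source table into a target table.
   * The source version is accepted only to make the point that it is ignored.
   */
  static int bucketOnInsert(int sourceVersion, int targetVersion, String key, int numBuckets) {
    return bucketFor(targetVersion, key, numBuckets);
  }

  public static void main(String[] args) {
    String key = "some-key";
    System.out.println("written to bucket:  " + bucketOnInsert(1, 2, key, 4));
    System.out.println("reader looks in:    " + bucketFor(2, key, 4));
  }
}
```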



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Should we release storage-api 2.7.0 rc1?

2018-07-12 Thread Deepak Jaiswal
Hi,

I have prepared the rc1 off of branch-3.1.
Artifacts:
Tag : https://github.com/apache/hive/releases/tag/storage-release-2.7.0-rc1
Tar Ball : http://home.apache.org/~djaiswal/hive-storage-2.7.0/

Regards,
Deepak

On 7/10/18, 10:16 AM, "Deepak Jaiswal"  wrote:

Thanks Owen for finding this out. I will work on the next RC once this 
blocker is resolved.

Regards,
Deepak

On 7/10/18, 9:40 AM, "Owen O'Malley"  wrote:

Ok, Jesus and I tracked it down and I've filed
https://issues.apache.org/jira/browse/HIVE-20135 that is a blocker on
storage-api 2.7.0.

The impact was that orc 1.5 and master failed with the RC. orc 1.4 and
older were fine.

.. Owen

On Tue, Jul 10, 2018 at 8:17 AM, Owen O'Malley 
wrote:

> I wanted to give an update on this. For now, I'm -1 because the ORC
> (branch-1.5) tests fail with this RC. I'll dig into what is wrong, 
but it
> looks like something in the timezone changes broke backwards 
compatibility.
>
> .. Owen
>
> On Mon, Jul 9, 2018 at 11:12 AM, Deepak Jaiswal
> wrote:
>
>> Thanks Alan.
>>
>> On 7/9/18, 10:17 AM, "Alan Gates"  wrote:
>>
>> +1.  Did a build with a clean maven repo, checked the signature 
and
>> sha
>> hash, ran RAT.
>>
>> Alan.
>>
>> On Fri, Jul 6, 2018 at 2:21 PM Deepak Jaiswal <
>> djais...@hortonworks.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I would like to make a new release of the storage-api. It 
contains
>> changes
>> > required for Hive 3.1 release.
>> >
>> > Artifacts:
>> > Tag :
>> > https://github.com/apache/hive/releases/tag/storage-release-
>> 2.7.0-rc0
>> > Tar Ball : http://home.apache.org/~djaiswal/hive-storage-2.7.0/
>> >
>> > Regards,
>> > Deepak
>> >
>>
>>
>>
>






[jira] [Created] (HIVE-20155) Semijoin Reduction : Put all the min-max filters before all the bloom filters

2018-07-12 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20155:
-

 Summary: Semijoin Reduction : Put all the min-max filters before 
all the bloom filters
 Key: HIVE-20155
 URL: https://issues.apache.org/jira/browse/HIVE-20155
 Project: Hive
  Issue Type: Task
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


If there is more than one semijoin reduction filter, apply all min-max filters 
before any of the bloom filters, as bloom filter lookup is expensive.
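
A minimal sketch of the proposed ordering, assuming the filters are available as a flat list. The RuntimeFilter type and its fields are hypothetical and are not Hive classes; the sketch only shows a stable reordering that puts the cheap range checks ahead of the bloom filter lookups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class FilterOrderSketch {
  // Declaration order doubles as evaluation priority: cheap checks first.
  enum Kind { MIN_MAX, BLOOM }

  /** Hypothetical description of one semijoin reduction filter. */
  static final class RuntimeFilter {
    final String column;
    final Kind kind;

    RuntimeFilter(String column, Kind kind) {
      this.column = column;
      this.kind = kind;
    }

    @Override
    public String toString() {
      return kind + "(" + column + ")";
    }
  }

  /** Returns a copy with all min-max filters ahead of all bloom filters. */
  static List<RuntimeFilter> cheapestFirst(List<RuntimeFilter> filters) {
    List<RuntimeFilter> ordered = new ArrayList<>(filters);
    // The sort is stable, so filters of the same kind keep their relative order.
    ordered.sort(Comparator.comparing((RuntimeFilter f) -> f.kind));
    return ordered;
  }

  static List<RuntimeFilter> sample() {
    return Arrays.asList(
        new RuntimeFilter("a", Kind.MIN_MAX),
        new RuntimeFilter("a", Kind.BLOOM),
        new RuntimeFilter("b", Kind.MIN_MAX),
        new RuntimeFilter("b", Kind.BLOOM));
  }

  public static void main(String[] args) {
    // prints [MIN_MAX(a), MIN_MAX(b), BLOOM(a), BLOOM(b)]
    System.out.println(cheapestFirst(sample()));
  }
}
```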

 

cc [~gopalv] [~jdere]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 67887: HIVE-20090

2018-07-11 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67887/#review205976
---



LGTM. I have some minor comments.


ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Lines 417 (patched)
<https://reviews.apache.org/r/67887/#comment288929>

Thanks!



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Lines 419 (patched)
<https://reviews.apache.org/r/67887/#comment288945>

Each of these functions checks whether semijoin reduction is enabled. I 
think it would be a bit more efficient if the check happened once at the 
beginning of this function and were removed from all the underlying functions.

if (!procCtx.conf.getBoolVar(ConfVars.TEZ_DYNAMIC_SEMIJOIN_REDUCTION) ||
procCtx.parseContext.getRsToSemiJoinBranchInfo().size() == 0) {
  return;
}



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Lines 1030 (patched)
<https://reviews.apache.org/r/67887/#comment288953>

This code is very similar to SemiJoinRemovalIfNoStatsProc. If possible, can 
we refactor it to avoid duplication?



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Lines 1064 (patched)
<https://reviews.apache.org/r/67887/#comment288954>

Is the first condition to handle cycles?



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Lines 1089 (patched)
<https://reviews.apache.org/r/67887/#comment288955>

Please move this line after the instanceof check.



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 309 (patched)
<https://reviews.apache.org/r/67887/#comment288948>

Extreme nit : Can you add a blank line before the numbered comments for 
better readability?


- Deepak Jaiswal


On July 11, 2018, 5:26 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67887/
> ---
> 
> (Updated July 11, 2018, 5:26 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Deepak Jaiswal, and Gopal V.
> 
> 
> Bugs: HIVE-20090
> https://issues.apache.org/jira/browse/HIVE-20090
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20090
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> 6ea68c35000a5dadb7a01db47bbd8183bff966da 
>   itests/src/test/resources/testconfiguration.properties 
> 9e012ce2f8f789bde3f95acc43052bf4446fccbc 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 
> dfd790853b2f73a465989374e78c01d282d16891 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> dec2d1ef38b748a5c9b40d06af491dd168d70b72 
>   ql/src/test/queries/clientpositive/dynamic_semijoin_reduction_sw2.q 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_sw2.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out 
> f87fe36e11a7c7e535678dbfaaced04f33bbb501 
>   ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out 
> 6987a96809e3c3300e1b76ea5df3069b3c1d162f 
>   ql/src/test/results/clientpositive/perf/tez/query1.q.out 
> 579940c66e25ebf5e7d0635aaedd0c0cc994f4e0 
>   ql/src/test/results/clientpositive/perf/tez/query16.q.out 
> 0b64c55b0f4ba036aeba4c49f478e9ee1409087c 
>   ql/src/test/results/clientpositive/perf/tez/query17.q.out 
> 2e5e254b2ddc3507f962cbc7691db51f1abafbca 
>   ql/src/test/results/clientpositive/perf/tez/query18.q.out 
> e8585275b4e51a55ce778dd154033fcdf859e617 
>   ql/src/test/results/clientpositive/perf/tez/query2.q.out 
> d24899ccf371ad42ef88cebc26cc671c097686da 
>   ql/src/test/results/clientpositive/perf/tez/query23.q.out 
> 6725bec30106bc3321c2869dfc304d0a4da82cf8 
>   ql/src/test/results/clientpositive/perf/tez/query24.q.out 
> 9fcec42c3ab29b898c9c947544a2e29dd08e95e8 
>   ql/src/test/results/clientpositive/perf/tez/query25.q.out 
> a885cf344b7e29dcf1b2d93d1914e7f9a8d4b921 
>   ql/src/test/results/clientpositive/perf/tez/query29.q.out 
> 46ff49d41a01591f075b2c48ae5a692640fd6eec 
>   ql/src/test/results/clientpositive/perf/tez/query31.q.out 
> c4d717d8680f6ac6f8f8b6ed01742384a84ddcf9 
>   ql/src/test/results/clientpositive/perf/tez/query32.q.out 
> 6be6f7aa6e6fc50bcedebe3f4d1b5fc00b52ee86 
>   ql/src/test/results/clientpositive/perf/tez/query39.q.out 
> 5966e243ea79b4b884950f34a5b7336e40f92889 
>   ql/src/test/results/clientpositive/perf/tez/query40.q.out 
> 2f116f12ebcba44b876508d0d0f0d827e3a8b28d 
>   ql/src/test/results

[jira] [Created] (HIVE-20142) Semijoin Reduction : Perform cost based removal after rule based removal.

2018-07-11 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20142:
-

 Summary: Semijoin Reduction : Perform cost based removal after rule 
based removal.
 Key: HIVE-20142
 URL: https://issues.apache.org/jira/browse/HIVE-20142
 Project: Hive
  Issue Type: Task
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


The semijoin reduction removal logic is spread out into multiple functions. 
Currently, the cost based removal logic is applied before the rule based(dumb) 
ones. 

Instead, apply the rule based removal logic and then apply the cost based 
removal.

 

cc [~jdere] [~jcamachorodriguez]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Should we release storage-api 2.7.0 rc0?

2018-07-10 Thread Deepak Jaiswal
Thanks Owen for finding this out. I will work on the next RC once this blocker 
is resolved.

Regards,
Deepak

On 7/10/18, 9:40 AM, "Owen O'Malley"  wrote:

Ok, Jesus and I tracked it down and I've filed
https://issues.apache.org/jira/browse/HIVE-20135 that is a blocker on
storage-api 2.7.0.

The impact was that orc 1.5 and master failed with the RC. orc 1.4 and
older were fine.

.. Owen

On Tue, Jul 10, 2018 at 8:17 AM, Owen O'Malley 
wrote:

> I wanted to give an update on this. For now, I'm -1 because the ORC
> (branch-1.5) tests fail with this RC. I'll dig into what is wrong, but it
> looks like something in the timezone changes broke backwards 
compatibility.
>
> .. Owen
>
> On Mon, Jul 9, 2018 at 11:12 AM, Deepak Jaiswal 
> wrote:
>
>> Thanks Alan.
>>
>> On 7/9/18, 10:17 AM, "Alan Gates"  wrote:
>>
>> +1.  Did a build with a clean maven repo, checked the signature and
>> sha
>> hash, ran RAT.
>>
>> Alan.
>>
>> On Fri, Jul 6, 2018 at 2:21 PM Deepak Jaiswal <
>> djais...@hortonworks.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I would like to make a new release of the storage-api. It contains
>> changes
>> > required for Hive 3.1 release.
>> >
>> > Artifacts:
>> > Tag :
>> > https://github.com/apache/hive/releases/tag/storage-release-
>> 2.7.0-rc0
>> > Tar Ball : http://home.apache.org/~djaiswal/hive-storage-2.7.0/
>> >
>> > Regards,
>> > Deepak
>> >
>>
>>
>>
>




Re: [VOTE] Should we release storage-api 2.7.0 rc0?

2018-07-09 Thread Deepak Jaiswal
Thanks Alan.

On 7/9/18, 10:17 AM, "Alan Gates"  wrote:

+1.  Did a build with a clean maven repo, checked the signature and sha
hash, ran RAT.

Alan.

On Fri, Jul 6, 2018 at 2:21 PM Deepak Jaiswal 
wrote:

> Hi,
>
> I would like to make a new release of the storage-api. It contains changes
> required for Hive 3.1 release.
>
> Artifacts:
> Tag :
> https://github.com/apache/hive/releases/tag/storage-release-2.7.0-rc0
> Tar Ball : http://home.apache.org/~djaiswal/hive-storage-2.7.0/
>
> Regards,
> Deepak
>




Re: Hive QA batches timing out

2018-07-09 Thread Deepak Jaiswal
Thanks Zoltan for the analysis. Perhaps we should disable the test in the 
meantime as it is blocking several people from committing.

I can go ahead and create a patch for it.

Regards,
Deepak

On 7/8/18, 11:33 PM, "Zoltan Haindrich"  wrote:

Hello

Thank you Deepak for taking a closer look! From what you've found, I've 
noticed that the runtime of TestReplicationScenariosAcidTables has jumped up 
to ~2000sec in the runs which have failed... It seems like this problem has 
been there for a long time now; I've found jira tickets in which this test was 
"timed out" and the HiveQA comment was dated April 03... so it's not entirely 
new...

The problem which prevents this test from completing successfully seems to be 
that it has difficulties closing down the metastore client, which goes on for 
a while...
I don't know if this is an acid/replication/metastore/? issue... but it seems 
intermittent. I have a hunch that somehow it might happen more reliably with 
this test... I've opened HIVE-20121 to investigate this...

2018-07-08T22:07:33,461 DEBUG [main] metastore.HiveMetaStoreClient: Unable 
to shutdown metastore client. Will try closing transport directly.
org.apache.thrift.transport.TTransportException: Cannot write to null 
outputStream

some links to more or less recent logs:

http://104.198.109.242/logs/PreCommit-HIVE-Build-12481/failed/240_UTBatch_itests__hive-unit_9_tests/maven-test.txt
the hive.log is ~200M:

http://104.198.109.242/logs/PreCommit-HIVE-Build-12481/failed/240_UTBatch_itests__hive-unit_9_tests/logs/hive.log


cheers,
Zoltan

On 07/08/2018 06:49 PM, Deepak Jaiswal wrote:
> I am seeing tests timing out in my latest ptest run,
> 
> https://builds.apache.org/job/PreCommit-HIVE-Build/12468/testReport
> https://builds.apache.org/job/PreCommit-HIVE-Build/12468/console
> 
> TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed 
out) (batchId=240)
> TestAutoPurgeTables - did not produce a TEST-*.xml file (likely timed 
out) (batchId=240)
> TestLocationQueries - did not produce a TEST-*.xml file (likely timed 
out) (batchId=240)
> TestReplicationScenariosAcidTables - did not produce a TEST-*.xml file 
(likely timed out) (batchId=240)
> TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file 
(likely timed out) (batchId=240)
> TestSparkStatistics - did not produce a TEST-*.xml file (likely timed 
out) (batchId=240)
> 
> 
>  From the Hive QA homepage, the last stable build was 12444 whereas the 
current run is 12473. I looked at some of the runs in between and it looks like 
most of the runs are failing due to the above batch of unit tests.
> 
> Regards,
> Deepak
> 





Hive QA batches timing out

2018-07-08 Thread Deepak Jaiswal
I am seeing tests timing out in my latest ptest run,

https://builds.apache.org/job/PreCommit-HIVE-Build/12468/testReport
https://builds.apache.org/job/PreCommit-HIVE-Build/12468/console

TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) 
(batchId=240)
TestAutoPurgeTables - did not produce a TEST-*.xml file (likely timed out) 
(batchId=240)
TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) 
(batchId=240)
TestReplicationScenariosAcidTables - did not produce a TEST-*.xml file (likely 
timed out) (batchId=240)
TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely 
timed out) (batchId=240)
TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) 
(batchId=240)


From the Hive QA homepage, the last stable build was 12444 whereas the current 
run is 12473. I looked at some of the runs in between and it looks like most of 
the runs are failing due to the above batch of unit tests.

Regards,
Deepak


[VOTE] Should we release storage-api 2.7.0 rc0?

2018-07-06 Thread Deepak Jaiswal
Hi,

I would like to make a new release of the storage-api. It contains changes 
required for Hive 3.1 release.

Artifacts:
Tag : https://github.com/apache/hive/releases/tag/storage-release-2.7.0-rc0
Tar Ball : http://home.apache.org/~djaiswal/hive-storage-2.7.0/

Regards,
Deepak


[jira] [Created] (HIVE-20100) OpTraits : Select Optraits should stop when a mismatch is detected

2018-07-05 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20100:
-

 Summary: OpTraits : Select Optraits should stop when a mismatch is 
detected
 Key: HIVE-20100
 URL: https://issues.apache.org/jira/browse/HIVE-20100
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


The select operator's optraits logic as stated in the comment is,

// For bucket columns
// If all the columns match to the parent, put them in the bucket cols
// else, add empty list.
// For sort columns
// Keep the subset of all the columns as long as order is maintained.

 

However, this is not happening due to a bug. The boolean "found" is never 
reset, so if a single match is found, the value remains true and allows the 
optraits to get populated with a partial list of columns for the bucket cols, 
which is incorrect.
This may lead to the creation of an SMB join, which should not happen.
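
A minimal sketch of the intended behaviour for bucket columns, assuming a simplified model in which columns are plain names (the method and parameter names here are hypothetical, and the real code works on operator column expressions rather than strings). The point is that the flag is reset for every parent column and the traversal stops at the first mismatch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SelectBucketCols {
    // parentBucketCols: bucket column names of the parent operator.
    // selectCols: column names the select operator passes through
    // (hypothetical simplification of the real column mapping).
    static List<String> bucketCols(List<String> parentBucketCols, List<String> selectCols) {
        List<String> result = new ArrayList<>();
        for (String parentCol : parentBucketCols) {
            boolean found = false; // reset for every parent column
            for (String selectCol : selectCols) {
                if (selectCol.equals(parentCol)) {
                    result.add(selectCol);
                    found = true;
                    break;
                }
            }
            if (!found) {
                // Mismatch: stop and report no bucket columns at all,
                // instead of a partial list.
                return Collections.emptyList();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "b" is missing from the select, so no bucket columns are kept.
        System.out.println(bucketCols(Arrays.asList("a", "b"), Arrays.asList("a", "c")));
        // Both bucket columns are present, so both are kept in parent order.
        System.out.println(bucketCols(Arrays.asList("a", "b"), Arrays.asList("b", "a", "c")));
    }
}
```

If the flag were declared outside the outer loop and never reset, the first call would return the partial list [a], which is the symptom described above.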



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 67800: HIVE-20039

2018-07-03 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67800/
---

(Updated July 3, 2018, 8:20 p.m.)


Review request for hive and Gopal V.


Changes
---

Added test to llap only runs.


Bugs: HIVE-20039
https://issues.apache.org/jira/browse/HIVE-20039


Repository: hive-git


Description
---

Bucket pruning: Left Outer Join on bucketed table gives wrong result.
The context was reused by all the predicates. Instead use TS op directly.


Diffs (updated)
-

  
data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_datajoin_1_s2_2018022300104_1/00_0
 PRE-CREATION 
  
data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_dw_stg_2018022300104_1/00_0
 PRE-CREATION 
  data/files/bucket_pruning/l3_clarity__l3_snap_number_2018022300104/00_0 
PRE-CREATION 
  data/files/bucket_pruning/l3_monthly_dw_dimplan/56_0 PRE-CREATION 
  itests/src/test/resources/testconfiguration.properties d02c0fe8ba 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java
 2debacacb5 
  ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67800/diff/3/

Changes: https://reviews.apache.org/r/67800/diff/2-3/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 67800: HIVE-20039

2018-07-03 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67800/
---

(Updated July 3, 2018, 6:49 a.m.)


Review request for hive and Gopal V.


Changes
---

Implemented recommended changes.
Added order by in queries for predictable results.


Bugs: HIVE-20039
https://issues.apache.org/jira/browse/HIVE-20039


Repository: hive-git


Description
---

Bucket pruning: Left Outer Join on bucketed table gives wrong result.
The context was reused by all the predicates. Instead use TS op directly.


Diffs (updated)
-

  
data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_datajoin_1_s2_2018022300104_1/00_0
 PRE-CREATION 
  
data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_dw_stg_2018022300104_1/00_0
 PRE-CREATION 
  data/files/bucket_pruning/l3_clarity__l3_snap_number_2018022300104/00_0 
PRE-CREATION 
  data/files/bucket_pruning/l3_monthly_dw_dimplan/56_0 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java
 2debacacb5 
  ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67800/diff/2/

Changes: https://reviews.apache.org/r/67800/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 67800: HIVE-20039

2018-07-02 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67800/#review205652
---




ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java
Lines 90 (patched)
<https://reviews.apache.org/r/67800/#comment288542>

Is it even possible to change the bucket count for a partition in a table?
As far as I can see, the bucket number is a table-wide property.


- Deepak Jaiswal


On July 3, 2018, 12:23 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67800/
> ---
> 
> (Updated July 3, 2018, 12:23 a.m.)
> 
> 
> Review request for hive and Gopal V.
> 
> 
> Bugs: HIVE-20039
> https://issues.apache.org/jira/browse/HIVE-20039
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Bucket pruning: Left Outer Join on bucketed table gives wrong result.
> The context was reused by all the predicates. Instead use TS op directly.
> 
> 
> Diffs
> -
> 
>   
> data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_datajoin_1_s2_2018022300104_1/00_0
>  PRE-CREATION 
>   
> data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_dw_stg_2018022300104_1/00_0
>  PRE-CREATION 
>   data/files/bucket_pruning/l3_clarity__l3_snap_number_2018022300104/00_0 
> PRE-CREATION 
>   data/files/bucket_pruning/l3_monthly_dw_dimplan/56_0 PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java
>  2debacacb5 
>   ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/67800/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Review Request 67800: HIVE-20039

2018-07-02 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67800/
---

Review request for hive and Gopal V.


Bugs: HIVE-20039
https://issues.apache.org/jira/browse/HIVE-20039


Repository: hive-git


Description
---

Bucket pruning: Left Outer Join on bucketed table gives wrong result.
The context was reused by all the predicates. Instead use TS op directly.


Diffs
-

  
data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_datajoin_1_s2_2018022300104_1/00_0
 PRE-CREATION 
  
data/files/bucket_pruning/l3_clarity__l3_monthly_dw_factplan_dw_stg_2018022300104_1/00_0
 PRE-CREATION 
  data/files/bucket_pruning/l3_clarity__l3_snap_number_2018022300104/00_0 
PRE-CREATION 
  data/files/bucket_pruning/l3_monthly_dw_dimplan/56_0 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java
 2debacacb5 
  ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67800/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[DISCUSS] Storage-API 2.7 release

2018-07-02 Thread Deepak Jaiswal
All,

The upcoming branch-3.1 will need changes from storage-api. I propose to create 
a new release of storage-api.
Please let me know your thoughts on this. I am working on the release candidate.

Regards,
Deepak


[jira] [Created] (HIVE-20039) Left Outer Join on bucketed table gives wrong result

2018-06-30 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20039:
-

 Summary: Left Outer Join on bucketed table gives wrong result
 Key: HIVE-20039
 URL: https://issues.apache.org/jira/browse/HIVE-20039
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.3.2, 3.0.0
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


Left outer join on a bucketed table gives wrong results in certain cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Unstable Hive QA

2018-06-28 Thread Deepak Jaiswal
:1.8.0_102]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_102]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 [junit-4.11.jar:?]
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 [junit-4.11.jar:?]
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 [junit-4.11.jar:?]
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 [junit-4.11.jar:?]
at 
org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92) 
[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) 
[junit-4.11.jar:?]
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
 [junit-4.11.jar:?]
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner.run(ParentRunner.java:309) 
[junit-4.11.jar:?]
at org.junit.runners.Suite.runChild(Suite.java:127) [junit-4.11.jar:?]
at org.junit.runners.Suite.runChild(Suite.java:26) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) 
[junit-4.11.jar:?]
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) 
[junit-4.11.jar:?]
at 
org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73) 
[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.run(ParentRunner.java:309) 
[junit-4.11.jar:?]
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
 [surefire-junit4-2.21.0.jar:2.21.0]
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
 [surefire-junit4-2.21.0.jar:2.21.0]
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
 [surefire-junit4-2.21.0.jar:2.21.0]
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) 
[surefire-junit4-2.21.0.jar:2.21.0]
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
 [surefire-booter-2.21.0.jar:2.21.0]
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
 [surefire-booter-2.21.0.jar:2.21.0]
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125) 
[surefire-booter-2.21.0.jar:2.21.0]
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413) 
[surefire-booter-2.21.0.jar:2.21.0]
2018-06-27T22:24:51,363  INFO [d00e737b-0dde-4230-ae29-d20498bf8332 main] 
ql.Context: New scratch dir is hdfs://localhost:37593/home/hiveptest


Vineet

On Jun 27, 2018, at 11:47 PM, Deepak Jaiswal 
<djais...@hortonworks.com> wrote:

Ptests have become really unstable.

The druid tests are failing randomly,
https://builds.apache.org/job/PreCommit-HIVE-Build/12203/testReport

Should we disable them?

Deepak

On 6/27/18, 10:13 AM, "Deepak Jaiswal"  wrote:

   Hi All,

   It seems we are going back to instability in Hive QA runs. In the past 
few days I saw many runs where the failures were completely independent. When 
those tests are run locally, they don’t fail which makes them harder to catch.

   On one side I think having green run to commit makes sense, however, on 
the other side, the development is unnecessarily blocked. Putting the randomly 
failing tests in disabled list is also not a good idea as it brings down the 
code coverage.
   Any suggestions?

   Regards,
   Deepak







Re: Unstable Hive QA

2018-06-28 Thread Deepak Jaiswal
Ptests have become really unstable.

The druid tests are failing randomly,
https://builds.apache.org/job/PreCommit-HIVE-Build/12203/testReport

Should we disable them?

Deepak

On 6/27/18, 10:13 AM, "Deepak Jaiswal"  wrote:

Hi All,

It seems we are going back to instability in Hive QA runs. In the past few 
days I saw many runs where the failures were completely independent. When those 
tests are run locally, they don’t fail which makes them harder to catch.

On one side I think having green run to commit makes sense, however, on the 
other side, the development is unnecessarily blocked. Putting the randomly 
failing tests in disabled list is also not a good idea as it brings down the 
code coverage.
Any suggestions?

Regards,
Deepak




Re: Review Request 67698: HIVE-19967

2018-06-27 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67698/
---

(Updated June 27, 2018, 6:42 p.m.)


Review request for hive, Gunther Hagleitner and Jason Dere.


Changes
---

Added a missed test from original patch to enable SMB.


Bugs: HIVE-19967
https://issues.apache.org/jira/browse/HIVE-19967


Repository: hive-git


Description
---

SMB Join : Need Optraits for PTFOperator ala GBY Op


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 9f25a9bad3 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/AnnotateWithOpTraits.java
 3c8e61d47b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 dbcbbfd1a6 
  ql/src/test/queries/clientpositive/llap_smb_ptf.q PRE-CREATION 
  ql/src/test/queries/clientpositive/tez_smb_reduce_side.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/llap_smb_ptf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/tez_smb_reduce_side.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67698/diff/2/

Changes: https://reviews.apache.org/r/67698/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20017) Logic to disable SMB/BMJ on external tables is too strict

2018-06-27 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20017:
-

 Summary: Logic to disable SMB/BMJ on external tables is too strict
 Key: HIVE-20017
 URL: https://issues.apache.org/jira/browse/HIVE-20017
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


The logic to disable SMB and BMJ on external tables is too strict as done in 
JIRA,

https://issues.apache.org/jira/browse/HIVE-19336

 

For SMB, if there is a group by, then the source table becomes irrelevant as 
the rows are bucketed and sorted by group by keys.

For BMJ, the small table(s) can be external, the check needs to be done only 
for big table.
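
One possible reading of the two relaxed checks described above, written as a sketch. The Input type and both method names are hypothetical and only model the conditions stated in this issue; they are not taken from the Hive code base.

```java
import java.util.Arrays;
import java.util.List;

public class ExternalTableJoinCheck {
    // Hypothetical description of one join input; not a Hive class.
    static final class Input {
        final boolean external;   // source table is an external table
        final boolean hasGroupBy; // rows are re-bucketed and sorted by group by keys
        final boolean bigTable;   // this input is the big table of the join

        Input(boolean external, boolean hasGroupBy, boolean bigTable) {
            this.external = external;
            this.hasGroupBy = hasGroupBy;
            this.bigTable = bigTable;
        }
    }

    // SMB: an external source only matters when no group by sits above it.
    static boolean smbAllowed(List<Input> inputs) {
        for (Input in : inputs) {
            if (in.external && !in.hasGroupBy) {
                return false;
            }
        }
        return true;
    }

    // BMJ: only the big table has to be non-external.
    static boolean bmjAllowed(List<Input> inputs) {
        for (Input in : inputs) {
            if (in.bigTable && in.external) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Input managedBig = new Input(false, false, true);
        Input externalSmall = new Input(true, false, false);
        System.out.println(bmjAllowed(Arrays.asList(managedBig, externalSmall)));
        System.out.println(smbAllowed(Arrays.asList(managedBig, externalSmall)));
    }
}
```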

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Unstable Hive QA

2018-06-27 Thread Deepak Jaiswal
Hi All,

It seems we are going back to instability in Hive QA runs. In the past few days 
I saw many runs where the failures were completely independent. When those 
tests are run locally, they don’t fail which makes them harder to catch.

On one side I think having green run to commit makes sense, however, on the 
other side, the development is unnecessarily blocked. Putting the randomly 
failing tests in disabled list is also not a good idea as it brings down the 
code coverage.
Any suggestions?

Regards,
Deepak


Re: Hive QA logs not accessible

2018-06-25 Thread Deepak Jaiswal
It was too soon. Looks like it is broken again. One of my runs,

https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-Build/12114/console

Regards,
Deepak

On 6/25/18, 1:57 PM, "Deepak Jaiswal"  wrote:

Hi Vihang,

It took a while but tests started to appear, so all is good now.

Regards,
Deepak

On 6/25/18, 12:24 PM, "Vihang Karajgaonkar"  
wrote:

I see there are 6 builds in the queue right now (which is unusually 
small).
What is the JIRA number where you submitted the patch?

On Mon, Jun 25, 2018 at 11:05 AM, Deepak Jaiswal 

wrote:

> Hi Vihang,
>
> I am looking for logs of failed test runs. Thanks for optimizing this 
for
> successful runs. However, I think there is a problem with Hive QA, the
> queue is gone and I submitted a patch more than 10 minutes ago and it
> hasn’t started or enqueued yet.
>
> https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-Build/
>
> Regards,
> Deepak
>
> On 6/25/18, 10:53 AM, "Vihang Karajgaonkar" 

> wrote:
>
> Are you looking for logs for successful tests? I had submitted a 
change
> recently which stops skips downloading logs for successful tests 
to
> shave
> off ~10 min time from each run. I found that the job was spending 
too
> much
> time copying over ~20G of logs from worker nodes to the server. 
Can you
> give the JIRA number so that I can take a look?
>
> On Mon, Jun 25, 2018 at 10:38 AM, Deepak Jaiswal <
> djais...@hortonworks.com>
> wrote:
>
> > The Hive QA logs are not accessible for yesterday night’s run. 
Also,
> I
> > don’t see any test running.
> > Is the disk full again?
> >
> > Regards,
> > Deepak
> >
>
>
>






Re: Hive QA logs not accessible

2018-06-25 Thread Deepak Jaiswal
Hi Vihang,

It took a while but tests started to appear, so all is good now.

Regards,
Deepak

On 6/25/18, 12:24 PM, "Vihang Karajgaonkar"  
wrote:

I see there are 6 builds in the queue right now (which is unusually small).
What is the JIRA number where you submitted the patch?

On Mon, Jun 25, 2018 at 11:05 AM, Deepak Jaiswal 
wrote:

> Hi Vihang,
>
> I am looking for logs of failed test runs. Thanks for optimizing this for
> successful runs. However, I think there is a problem with Hive QA, the
> queue is gone and I submitted a patch more than 10 minutes ago and it
> hasn’t started or enqueued yet.
>
> https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-Build/
>
> Regards,
> Deepak
>
> On 6/25/18, 10:53 AM, "Vihang Karajgaonkar" 
> wrote:
>
> Are you looking for logs for successful tests? I had submitted a change
> recently which skips downloading logs for successful tests to shave
> off ~10 min time from each run. I found that the job was spending too much
> time copying over ~20G of logs from worker nodes to the server. Can you
> give the JIRA number so that I can take a look?
>
> On Mon, Jun 25, 2018 at 10:38 AM, Deepak Jaiswal <
> djais...@hortonworks.com>
> wrote:
>
> > The Hive QA logs are not accessible for yesterday night’s run. Also,
> I
> > don’t see any test running.
> > Is the disk full again?
> >
> > Regards,
> > Deepak
> >
>
>
>




Re: Review Request 67710: HIVE-19481

2018-06-25 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67710/
---

(Updated June 25, 2018, 6:15 p.m.)


Review request for hive, Jason Dere and Sergey Shelukhin.


Changes
---

Updated results for failed tests.


Bugs: HIVE-19481
https://issues.apache.org/jira/browse/HIVE-19481


Repository: hive-git


Description
---

sample10.q returns wrong results.
Multiple issues were fixed:
1. Instead of using the old MR logic, which assumes there is 1 file for each 
bucket, look up buckets by name (non-managed tables).
2. Skip bucket pruning for managed tables.
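
To illustrate item 1, here is a sketch of looking up buckets by file name rather than by file position. It assumes bucket files are named with a numeric bucket id followed by an underscore and an attempt number (for example 000001_0, possibly with a suffix such as _copy_1); the class and method names are hypothetical and this is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketFilesByName {
    // Assumed naming convention: "<bucketId>_<attempt>", e.g. "000003_0".
    private static final Pattern BUCKET_FILE = Pattern.compile("^(\\d+)_\\d+");

    // Returns the bucket id encoded in the file name, or -1 if there is none.
    static int bucketId(String fileName) {
        Matcher m = BUCKET_FILE.matcher(fileName);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Groups files by bucket id, so a bucket may own zero, one, or many files.
    static Map<Integer, List<String>> groupByBucket(List<String> fileNames) {
        Map<Integer, List<String>> byBucket = new TreeMap<>();
        for (String name : fileNames) {
            int id = bucketId(name);
            if (id >= 0) {
                byBucket.computeIfAbsent(id, k -> new ArrayList<>()).add(name);
            }
        }
        return byBucket;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("000000_0", "000001_0", "000001_0_copy_1", "_SUCCESS");
        System.out.println(groupByBucket(files));
    }
}
```

Sampling a given bucket then reads every file mapped to that id, which no longer relies on there being exactly one file per bucket.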


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 517b413839 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 9dbd869d57 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java 8200e6a237 
  ql/src/test/queries/clientpositive/sample10_mm.q PRE-CREATION 
  ql/src/test/results/clientpositive/archive_excludeHadoop20.q.out e4b390c9cd 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_11.q.out 9f946e0b50 
  ql/src/test/results/clientpositive/llap/sample10.q.out ce3c2880a6 
  ql/src/test/results/clientpositive/llap/sample10_mm.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/masking_5.q.out 498fc117c7 
  ql/src/test/results/clientpositive/sample6.q.out 7f853e55c5 
  ql/src/test/results/clientpositive/sample7.q.out 0e2fc287d4 
  ql/src/test/results/clientpositive/sample9.q.out 0de49a698a 
  ql/src/test/results/clientpositive/smb_mapjoin_11.q.out a83f3e66c4 
  ql/src/test/results/clientpositive/spark/infer_bucket_sort_bucketed_table.q.out 8fab7ecbd0 
  ql/src/test/results/clientpositive/spark/sample10.q.out 555e5f43ec 
  ql/src/test/results/clientpositive/spark/sample2.q.out 8b73fdf874 
  ql/src/test/results/clientpositive/spark/sample4.q.out 3269b015ec 
  ql/src/test/results/clientpositive/spark/sample6.q.out 36532d7fbe 
  ql/src/test/results/clientpositive/spark/sample7.q.out d0b52bcdce 


Diff: https://reviews.apache.org/r/67710/diff/2/

Changes: https://reviews.apache.org/r/67710/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Re: Hive QA logs not accessible

2018-06-25 Thread Deepak Jaiswal
Hi Vihang,

I am looking for logs of failed test runs. Thanks for optimizing this for 
successful runs. However, I think there is a problem with Hive QA: the queue is 
gone, and a patch I submitted more than 10 minutes ago has neither started nor 
been enqueued.

https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-Build/

Regards,
Deepak

On 6/25/18, 10:53 AM, "Vihang Karajgaonkar" wrote:

Are you looking for logs for successful tests? I had submitted a change
recently which skips downloading logs for successful tests to shave
off ~10 min time from each run. I found that the job was spending too much
time copying over ~20G of logs from worker nodes to the server. Can you
give the JIRA number so that I can take a look?

On Mon, Jun 25, 2018 at 10:38 AM, Deepak Jaiswal wrote:

> The Hive QA logs are not accessible for yesterday night’s run. Also, I
> don’t see any test running.
> Is the disk full again?
>
> Regards,
> Deepak
>




Hive QA logs not accessible

2018-06-25 Thread Deepak Jaiswal
The Hive QA logs are not accessible for yesterday night’s run. Also, I don’t 
see any test running.
Is the disk full again?

Regards,
Deepak


Review Request 67710: HIVE-19481

2018-06-22 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67710/
---

Review request for hive, Jason Dere and Sergey Shelukhin.


Bugs: HIVE-19481
https://issues.apache.org/jira/browse/HIVE-19481


Repository: hive-git


Description
---

sample10.q returns wrong results.
Multiple issues were fixed:
1. For non-managed tables, look up buckets by name instead of using the old MR 
logic, which assumes there is one file per bucket.
2. Skip bucket pruning for managed tables.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 9dbd869d57 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java 8200e6a237 
  ql/src/test/queries/clientpositive/sample10_mm.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/sample10.q.out 1b95314980 
  ql/src/test/results/clientpositive/llap/sample10_mm.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/sample10.q.out ac28779591 


Diff: https://reviews.apache.org/r/67710/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-19972) Followup to HIVE-19928 : Fix the check for managed table

2018-06-22 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-19972:
-

 Summary: Followup to HIVE-19928 : Fix the check for managed table
 Key: HIVE-19972
 URL: https://issues.apache.org/jira/browse/HIVE-19972
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


The check for managed table should use ENUM comparison rather than string 
comparison.

The check in the patch will always return false, thus maintaining existing 
behavior.
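
As a rough illustration of why a string-style check on an enum can always 
return false, consider the sketch below. It is not the code from the patch. The 
enum and the method names are hypothetical stand-ins chosen for this example.

```java
// Hypothetical sketch; the enum and method names are not copied from Hive.
class ManagedTableCheckSketch {
    enum TableType { MANAGED_TABLE, EXTERNAL_TABLE }

    // String-style check: an enum constant is never equal to a String object,
    // so this returns false for every table type.
    static boolean isManagedByString(TableType type) {
        return type.equals("MANAGED_TABLE");
    }

    // Enum comparison: compares the constant itself and is also null-safe.
    static boolean isManagedByEnum(TableType type) {
        return type == TableType.MANAGED_TABLE;
    }

    public static void main(String[] args) {
        System.out.println(isManagedByString(TableType.MANAGED_TABLE)); // false
        System.out.println(isManagedByEnum(TableType.MANAGED_TABLE));   // true
        System.out.println(isManagedByEnum(TableType.EXTERNAL_TABLE));  // false
    }
}
```

The string-style version compiles without error because equals accepts any 
Object, which is why this kind of mistake is easy to miss in review.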





Review Request 67698: HIVE-19967

2018-06-22 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67698/
---

Review request for hive, Gunther Hagleitner and Jason Dere.


Bugs: HIVE-19967
https://issues.apache.org/jira/browse/HIVE-19967


Repository: hive-git


Description
---

SMB Join : Need Optraits for PTFOperator ala GBY Op


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/AnnotateWithOpTraits.java 3c8e61d47b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java dbcbbfd1a6 
  ql/src/test/queries/clientpositive/llap_smb_ptf.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/llap_smb_ptf.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/67698/diff/1/


Testing
---


Thanks,

Deepak Jaiswal


