Review Request 70486: HIVE-21610: Union operator can flow in the wrong stage causing NPE

2019-04-16 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70486/
---

Review request for hive, Zoltan Haindrich, Naveen Gangam, and Peter Vary.


Bugs: HIVE-21610
https://issues.apache.org/jira/browse/HIVE-21610


Repository: hive-git


Description
---

Because of HIVE-16227 it can happen that a UnionOperator will partially go into 
the wrong stage, because the currTask is changed, and the UnionOperator is 
reinitialized in GenMRFileSink1 with the wrong task.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
25c6b24f46771ba12327851285af15ffb9b65bcd 
  ql/src/test/queries/clientpositive/unionall_lateralview2.q PRE-CREATION 
  ql/src/test/results/clientpositive/unionall_lateralview2.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/70486/diff/1/


Testing
---


Thanks,

Antal Sinkovits



Review Request 69560: HIVE-21035: Race condition in SparkUtilities#getSparkSession

2018-12-12 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69560/
---

Review request for hive, Denys Kuzmenko, Peter Vary, and Adam Szita.


Repository: hive-git


Description
---

It can happen, that when in one given session, multiple queries are executed, 
that due to a race condition, multiple spark application master gets kicked off.
In this case, the one that started earlier, will not be killed, when the hive 
session closes, consuming resources.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
d384ed6db6f4630ee2ef309854236e8f12926688 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSparkUtilities.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/69560/diff/1/


Testing
---


Thanks,

Antal Sinkovits



Re: Review Request 68474: HIVE-20440

2018-11-07 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/
---

(Updated nov. 7, 2018, 2:38 du)


Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
Zhang.


Repository: hive-git


Description
---

I've modified the SmallTableCache to use guava cache, with soft references.
By using a value loader, I've also eliminated the synchronization on the 
intern-ed string of the path.


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCacheEviction.java
 PRE-CREATION 
  ql/pom.xml 8c3e55eaf4d0234a280b0936f6153d2f563bbe46 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
da1dd426c9155290e30fd1e3ae7f19a5479a8967 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
 9e65fd98d6e4451421641b1429ccf334fe9a9586 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
 54377428eafdb79e1bbdc8a182eafb46f8febd23 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
 0e4b8df036724bd83e85fc3cc70f534272dab4c4 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
 74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 24b8fea33815867ce544fd284437c4d02a21f1a3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
cf27e92bafdc63096ec0fa8c3106657bab52f370 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
3293100af96dc60408c53065fa89143ead98f818 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
 e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68474/diff/5/

Changes: https://reviews.apache.org/r/68474/diff/4-5/


Testing
---


Thanks,

Antal Sinkovits



Re: Review Request 68474: HIVE-20440

2018-11-07 Thread Antal Sinkovits via Review Board


> On okt. 16, 2018, 2:56 du, Sahil Takiar wrote:
> > Could we add some more E2E integration tests? I'm thinking they could at 
> > the granularity of a `MapJoinOperator`? For example, confirm that starting 
> > a new query actually evicts everything from the cache? We want to make sure 
> > we aren't accidentally leaking small tables.
> 
> Antal Sinkovits wrote:
> MapJoinOperator cannot be tested easily. There is a TestMapJoinOperator, 
> but the test code is really complex. And the eviction happens at the 
> HivePairFlatMapFunction level. For every Map/Reduce the cache is 
> reinitialized. If we are in a new query the cache gets evicted.

I've added a new test, to check this.


- Antal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/#review209628
---


On nov. 7, 2018, 2:38 du, Antal Sinkovits wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68474/
> ---
> 
> (Updated nov. 7, 2018, 2:38 du)
> 
> 
> Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
> Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> I've modified the SmallTableCache to use guava cache, with soft references.
> By using a value loader, I've also eliminated the synchronization on the 
> intern-ed string of the path.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCacheEviction.java
>  PRE-CREATION 
>   ql/pom.xml 8c3e55eaf4d0234a280b0936f6153d2f563bbe46 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
> da1dd426c9155290e30fd1e3ae7f19a5479a8967 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
>  9e65fd98d6e4451421641b1429ccf334fe9a9586 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
>  54377428eafdb79e1bbdc8a182eafb46f8febd23 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
>  0e4b8df036724bd83e85fc3cc70f534272dab4c4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
>  74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
>  24b8fea33815867ce544fd284437c4d02a21f1a3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> cf27e92bafdc63096ec0fa8c3106657bab52f370 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
> 3293100af96dc60408c53065fa89143ead98f818 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
>  e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68474/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Antal Sinkovits
> 
>



Re: Review Request 68474: HIVE-20440

2018-11-06 Thread Antal Sinkovits via Review Board


> On okt. 16, 2018, 2:50 du, Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java
> > Lines 131 (patched)
> > 
> >
> > why do we run the action just for the l2 cache?

L2 contains all the elements from L1, so running through L2 is enough.


- Antal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/#review209626
---


On nov. 6, 2018, 12:28 du, Antal Sinkovits wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68474/
> ---
> 
> (Updated nov. 6, 2018, 12:28 du)
> 
> 
> Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
> Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> I've modified the SmallTableCache to use guava cache, with soft references.
> By using a value loader, I've also eliminated the synchronization on the 
> intern-ed string of the path.
> 
> 
> Diffs
> -
> 
>   ql/pom.xml 8c3e55eaf4d0234a280b0936f6153d2f563bbe46 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
> da1dd426c9155290e30fd1e3ae7f19a5479a8967 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
>  9e65fd98d6e4451421641b1429ccf334fe9a9586 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
>  54377428eafdb79e1bbdc8a182eafb46f8febd23 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
>  0e4b8df036724bd83e85fc3cc70f534272dab4c4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
>  74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
>  24b8fea33815867ce544fd284437c4d02a21f1a3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> cf27e92bafdc63096ec0fa8c3106657bab52f370 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
> 3293100af96dc60408c53065fa89143ead98f818 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
>  e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68474/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Antal Sinkovits
> 
>



Re: Review Request 68474: HIVE-20440

2018-11-06 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/
---

(Updated nov. 6, 2018, 12:28 du)


Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
Zhang.


Repository: hive-git


Description
---

I've modified the SmallTableCache to use guava cache, with soft references.
By using a value loader, I've also eliminated the synchronization on the 
intern-ed string of the path.


Diffs (updated)
-

  ql/pom.xml 8c3e55eaf4d0234a280b0936f6153d2f563bbe46 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
da1dd426c9155290e30fd1e3ae7f19a5479a8967 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
 9e65fd98d6e4451421641b1429ccf334fe9a9586 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
 54377428eafdb79e1bbdc8a182eafb46f8febd23 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
 0e4b8df036724bd83e85fc3cc70f534272dab4c4 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
 74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 24b8fea33815867ce544fd284437c4d02a21f1a3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
cf27e92bafdc63096ec0fa8c3106657bab52f370 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
3293100af96dc60408c53065fa89143ead98f818 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
 e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68474/diff/4/

Changes: https://reviews.apache.org/r/68474/diff/3-4/


Testing
---


Thanks,

Antal Sinkovits



Re: Review Request 68474: HIVE-20440

2018-11-06 Thread Antal Sinkovits via Review Board


> On okt. 16, 2018, 2:56 du, Sahil Takiar wrote:
> > Could we add some more E2E integration tests? I'm thinking they could at 
> > the granularity of a `MapJoinOperator`? For example, confirm that starting 
> > a new query actually evicts everything from the cache? We want to make sure 
> > we aren't accidentally leaking small tables.

MapJoinOperator cannot be tested easily. There is a TestMapJoinOperator, but 
the test code is really complex. And the eviction happens at the 
HivePairFlatMapFunction level. For every Map/Reduce the cache is reinitialized. 
If we are in a new query the cache gets evicted.


- Antal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/#review209628
---


On nov. 6, 2018, 12:28 du, Antal Sinkovits wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68474/
> ---
> 
> (Updated nov. 6, 2018, 12:28 du)
> 
> 
> Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
> Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> I've modified the SmallTableCache to use guava cache, with soft references.
> By using a value loader, I've also eliminated the synchronization on the 
> intern-ed string of the path.
> 
> 
> Diffs
> -
> 
>   ql/pom.xml 8c3e55eaf4d0234a280b0936f6153d2f563bbe46 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
> da1dd426c9155290e30fd1e3ae7f19a5479a8967 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
>  9e65fd98d6e4451421641b1429ccf334fe9a9586 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
>  54377428eafdb79e1bbdc8a182eafb46f8febd23 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
>  0e4b8df036724bd83e85fc3cc70f534272dab4c4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
>  74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
>  24b8fea33815867ce544fd284437c4d02a21f1a3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> cf27e92bafdc63096ec0fa8c3106657bab52f370 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
> 3293100af96dc60408c53065fa89143ead98f818 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
>  e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68474/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Antal Sinkovits
> 
>



Re: Review Request 69107: HIVE-20512

2018-10-31 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69107/#review210230
---




ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMergeFileRecordHandler.java
Line 93 (original), 93 (patched)


I'm not sure, but shouldnt we call incrementRowNumber from here as well?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java
Lines 67 (patched)


I think it should be deamon. What do you think?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java
Line 62 (original), 83 (patched)


Probably these three rows can move to the MemoryInfoLogger class  (as 
static method), as from now on its his responsibility.

Something like MemoryInfoLogger.start() or schedule



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
Line 574 (original), 572 (patched)


Is this correct? In batch processing increasing it with only one?


- Antal Sinkovits


On okt. 26, 2018, 5:13 du, Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69107/
> ---
> 
> (Updated okt. 26, 2018, 5:13 du)
> 
> 
> Review request for hive, Antal Sinkovits, Sahil Takiar, and Vihang 
> Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve record and memory usage logging in SparkRecordHandler
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java 
> 88dd12c05ade417aca4cdaece4448d31d4e1d65f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMergeFileRecordHandler.java
>  8880bb604e088755dcfb0bcb39689702fab0cb77 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java 
> cb5bd7ada2d5ad4f1f654cf80ddaf4504be5d035 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
>  20e7ea0f4e8d4ff79dddeaab0406fc7350d22bd7 
> 
> 
> Diff: https://reviews.apache.org/r/69107/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



Re: Review Request 69107: HIVE-20512

2018-10-24 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69107/#review209950
---




ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java
Lines 56 (patched)


Any reason why we use Timer?

From the timer java docs: 

"Java 5.0 introduced the java.util.concurrent package and one of the 
concurrency utilities therein is the ScheduledThreadPoolExecutor which is a 
thread pool for repeatedly executing tasks at a given rate or delay. It is 
effectively a more versatile replacement for the Timer/TimerTask combination, 
as it allows multiple service threads, accepts various time units, and doesn't 
require subclassing TimerTask (just implement Runnable). Configuring 
ScheduledThreadPoolExecutor with one thread makes it equivalent to Timer."


- Antal Sinkovits


On okt. 20, 2018, 7:13 du, Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69107/
> ---
> 
> (Updated okt. 20, 2018, 7:13 du)
> 
> 
> Review request for hive, Antal Sinkovits, Sahil Takiar, and Vihang 
> Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve record and memory usage logging in SparkRecordHandler
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java 
> 88dd12c05ade417aca4cdaece4448d31d4e1d65f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMergeFileRecordHandler.java
>  8880bb604e088755dcfb0bcb39689702fab0cb77 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java 
> cb5bd7ada2d5ad4f1f654cf80ddaf4504be5d035 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
>  20e7ea0f4e8d4ff79dddeaab0406fc7350d22bd7 
> 
> 
> Diff: https://reviews.apache.org/r/69107/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-18 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209739
---


Ship it!




Ship It!

- Antal Sinkovits


On okt. 16, 2018, 4:49 du, denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated okt. 16, 2018, 4:49 du)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68474: HIVE-20440

2018-10-10 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/
---

(Updated okt. 10, 2018, 1:20 du)


Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
Zhang.


Summary (updated)
-

HIVE-20440


Repository: hive-git


Description
---

I've modified the SmallTableCache to use guava cache, with soft references.
By using a value loader, I've also eliminated the synchronization on the 
intern-ed string of the path.


Diffs (updated)
-

  ql/pom.xml d73deba440702ec39fc5610df28e0fe54baef025 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
da1dd426c9155290e30fd1e3ae7f19a5479a8967 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
 9e65fd98d6e4451421641b1429ccf334fe9a9586 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
 54377428eafdb79e1bbdc8a182eafb46f8febd23 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
 0e4b8df036724bd83e85fc3cc70f534272dab4c4 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
 74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 24b8fea33815867ce544fd284437c4d02a21f1a3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
cf27e92bafdc63096ec0fa8c3106657bab52f370 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
3293100af96dc60408c53065fa89143ead98f818 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
 e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68474/diff/3/

Changes: https://reviews.apache.org/r/68474/diff/2-3/


Testing
---


Thanks,

Antal Sinkovits



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-26 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review209017
---




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 3062 (patched)


Why is the default value -1? All the checks seems to go against >0. What 
happens when the value is 0?



ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java
Lines 12 (patched)


I think there is a typo here, and the it should be CompileLock.class.


- Antal Sinkovits


On szept. 25, 2018, 10:19 de, denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated szept. 25, 2018, 10:19 de)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/6/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68474: HIVE-20440: Create better cache eviction policy for SmallTableCache

2018-09-19 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/
---

(Updated szept. 19, 2018, 11:14 du)


Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
Zhang.


Repository: hive-git


Description (updated)
---

I've modified the SmallTableCache to use guava cache, with soft references.
By using a value loader, I've also eliminated the synchronization on the 
intern-ed string of the path.


Diffs (updated)
-

  ql/pom.xml d73deba440702ec39fc5610df28e0fe54baef025 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
cf27e92bafdc63096ec0fa8c3106657bab52f370 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
3293100af96dc60408c53065fa89143ead98f818 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68474/diff/2/

Changes: https://reviews.apache.org/r/68474/diff/1-2/


Testing
---


Thanks,

Antal Sinkovits



Re: Review Request 68744: Add Surrogate Keys function to Hive

2018-09-18 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68744/#review208733
---




ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSurrogateKey.java
Lines 116 (patched)


You have a potential NPE here, if the execution engine is not TEZ. I think 
it should support all execution engines (MR, spark, tez) or if its not 
possible, fail fast with a more reasonable exception.


- Antal Sinkovits


On szept. 18, 2018, 10:59 de, Miklos Gergely wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68744/
> ---
> 
> (Updated szept. 18, 2018, 10:59 de)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-20536
> https://issues.apache.org/jira/browse/HIVE-20536
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add new function that allows the generation of a surrogate key composed of 
> the write id, the task id, and an incremental row id.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 3f538b3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
> 3309b9b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 6d7e63e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSurrogateKey.java 
> PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSurrogateKey.java
>  PRE-CREATION 
>   ql/src/test/results/clientpositive/show_functions.q.out 8d41e78 
> 
> 
> Diff: https://reviews.apache.org/r/68744/diff/1/
> 
> 
> Testing
> ---
> 
> Added a new junit test for the function.
> Tested it in beeline by adding one row, adding multiple rows, adding mutliple 
> rows to multiple tables via multuple insert (all having their own 
> surrogate_key column)
> 
> 
> Thanks,
> 
> Miklos Gergely
> 
>



Review Request 68474: HIVE-20440: Create better cache eviction policy for SmallTableCache

2018-08-22 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/
---

Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
Zhang.


Repository: hive-git


Description
---

I've modified the SmallTableCache to use guava cache, with soft references. 
By using a value loader, I've also eliminated the synchronization on the 
intern-ed string of the path.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
cf27e92bafdc63096ec0fa8c3106657bab52f370 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
3293100af96dc60408c53065fa89143ead98f818 


Diff: https://reviews.apache.org/r/68474/diff/1/


Testing
---


Thanks,

Antal Sinkovits



Review Request 67250: HIVE-17317 - Make Dbcp configurable using hive properties in hive-site.xml

2018-05-22 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67250/
---

Review request for hive and Peter Vary.


Bugs: HIVE-17317
https://issues.apache.org/jira/browse/HIVE-17317


Repository: hive-git


Description
---

HIVE-17317 - Make Dbcp configurable using hive properties in hive-site.xml


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 931533a556 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestRestrictedList.java 
6270e14700 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 264fdb9db9 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/datasource/BoneCPDataSourceProvider.java
 4ff2bb77d3 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/datasource/DataSourceProviderFactory.java
 e3c18e3358 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java
 PRE-CREATION 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/datasource/HikariCPDataSourceProvider.java
 6ffc24a27a 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 21b98655e9 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/datasource/TestDataSourceProviderFactory.java
 2d45c29f97 


Diff: https://reviews.apache.org/r/67250/diff/1/


Testing
---


Thanks,

Antal Sinkovits