Re: Review Request 71645: HIVE-22292

2019-11-05 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/
---

(Updated Nov. 5, 2019, 8:41 a.m.)


Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-22292
https://issues.apache.org/jira/browse/HIVE-22292


Repository: hive-git


Description
---

Implement Hypothetical-Set Aggregate Functions
==
1. rank, dense_rank, percent_rank, cume_dist
2. Allow unlimited column references in the `WITHIN GROUP` clause
3. Refactor the implementation of the functions `percentile_cont` and 
`percentile_disc`: 
 - validate that only one parameter and one column reference are passed to these 
two functions. 
 - since the semantics of the `WITHIN GROUP` clause allow multiple column 
references, the parameter order had to be changed, which affects backward 
compatibility.
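
For context, a minimal sketch of the standard SQL syntax these functions follow 
(the table `t` and column `col1` are hypothetical, not taken from the patch):

{code:sql}
-- hypothetical-set aggregate: the rank the constant 10 would have in col1's ordering
SELECT rank(10) WITHIN GROUP (ORDER BY col1) FROM t;

-- inverse distribution function with the fraction as the direct argument
-- and the ordering column moved into the WITHIN GROUP clause
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY col1) FROM t;
{code}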


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
48645dc3f2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java a0b0e48f4c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0198c0f724 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
d0c155ff2d 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
992f5bfd21 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
64e9c8b7ca 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
 ad61410180 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
 c8d3c12c80 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
13e2f537cd 
  ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
dead3ec472 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
9d44ed87e9 
  ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q PRE-CREATION 
  ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
  ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 


Diff: https://reviews.apache.org/r/71645/diff/4/

Changes: https://reviews.apache.org/r/71645/diff/3-4/


Testing
---

A new q test was added for testing Hypothetical-Set Aggregate Functions: 
hypothetical_set_aggregates.q
Ran q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
udaf_percentile_disc.q
Ran unit test: TestParseWithinGroupClause.java


Thanks,

Krisztian Kasa



[jira] [Created] (HIVE-22458) Add more constraints on showing partitions

2019-11-05 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-22458:
--

 Summary: Add more constraints on showing partitions
 Key: HIVE-22458
 URL: https://issues.apache.org/jira/browse/HIVE-22458
 Project: Hive
  Issue Type: New Feature
Reporter: Zhihua Deng


When showing partitions of a table with thousands of partitions, all the 
partitions are returned and it's not easy to find the desired ones among them; 
this makes showing partitions hard to use. We can add where/order by/limit 
constraints to show partitions, like:

 show partitions table_name [partition_specs] where partition_field >= value 
order by partition_field desc limit n;
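
For example (the table and partition column names here are purely illustrative), 
the proposed syntax might be used like:

{code:sql}
SHOW PARTITIONS sales WHERE dt >= '2019-01-01' ORDER BY dt DESC LIMIT 10;
{code}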

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71711: HIVE-21954: QTest: support for running qtests on various metastore DBs

2019-11-05 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71711/#review218500
---




itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliAdapter.java
Lines 82-93 (patched)


these things should not be part of the "adapter", as they break the adapter 
contract



itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliAdapter.java
Lines 121 (patched)


firstTestNotYetRun - sounds pretty hairy :D
how far are the contents of this if's body from being beforeClass 
kind of stuff? - would that work?



itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreAccumuloCliDriver.java
Lines 54-57 (original)


right now I guess I don't see every door and corner around here... but an 
alternate approach might be to concentrate the common parts into some abstract 
class shared by all the drivers and the CliAdapter - might help clean up 
existing stuff as well...



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/MetastoreSchemaTool.java
Line 199 (original), 199 (patched)


why did the signature of this method change?


- Zoltan Haindrich


On Nov. 2, 2019, 2:34 p.m., Laszlo Bodor wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71711/
> ---
> 
> (Updated Nov. 2, 2019, 2:34 p.m.)
> 
> 
> Review request for hive, Zoltan Haindrich and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21954: QTest: support for running qtests on various metastore DBs
> 
> 
> Diffs
> -
> 
>   data/conf/perf-reg/spark/hive-site.xml 15ec63048e 
>   data/conf/perf-reg/tez/hive-site.xml 2951f30531 
>   data/scripts/q_test_init.sql df0582814a 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestLocationQueries.java
>  eb3b935f09 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
> 3e0cdac67c 
>   itests/qtest/pom.xml 364d07f9d9 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCoreBlobstoreCliDriver.java
>  50417e9378 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliAdapter.java 
> 574a67f2e3 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreAccumuloCliDriver.java
>  9a23ef855e 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java
>  c8239a731c 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java
>  d06acfb978 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCompareCliDriver.java
>  62ea96089a 
>   itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreDummy.java 
> 301b91e54e 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseCliDriver.java
>  40545d8d65 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseNegativeCliDriver.java
>  6094e6dffb 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreKuduCliDriver.java
>  71134e7b0a 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreKuduNegativeCliDriver.java
>  4f6988c9f3 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreNegativeCliDriver.java
>  bb9e65524d 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CorePerfCliDriver.java
>  59c71f544c 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java
>  PRE-CREATION 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestSystemProperties.java
>  f82d17e5b3 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 
> 9856a30381 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/ql/parse/CoreParseNegative.java
>  9a136e24f0 
>   pom.xml 6dbff132cd 
>   ql/src/test/queries/clientpositive/create_func1.q 2c6acfc291 
>   ql/src/test/queries/clientpositive/partition_params_postgres.q PRE-CREATION 
>   ql/src/test/results/clientpositive/create_func1.q.out 238d378cda 
>   ql/src/test/results/clientpositive/llap/sysdb.q.out af06f5050e 
>   ql/src/test/results/clientpositive/partition_params_postgres.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/show_functions.q.out 9db684579b 
>   standalone-metastore/DEV-README 9c261171fb 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java
>  49e19adf71 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/MetastoreSchema

[jira] [Created] (HIVE-22459) Hive datediff function provided inconsistent results when hive.fetch.task.conversion is set to none

2019-11-05 Thread Chiran Ravani (Jira)
Chiran Ravani created HIVE-22459:


 Summary: Hive datediff function provided inconsistent results when 
hive.fetch.task.conversion is set to none
 Key: HIVE-22459
 URL: https://issues.apache.org/jira/browse/HIVE-22459
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.0.0
Reporter: Chiran Ravani


Hive datediff function provides inconsistent results when 
hive.fetch.task.conversion is set to more.

Below is the output; in Hive 1.2 the results are consistent.

Note: The same query works well on Hive 3 when hive.fetch.task.conversion is set 
to none.
Steps to reproduce the problem:
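
A hypothetical setup inferred from the query and output below (the actual DDL is 
not part of the report):

{code:sql}
CREATE TABLE testdatediff (datetimecol STRING);
INSERT INTO testdatediff VALUES ('2019-09-09T10:45:49+02:00'), ('2019-07-24');
{code}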
{code}
0: jdbc:hive2://c1113-node2.squadron.support.> select datetimecol from 
testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183;
INFO : Compiling 
command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268): 
select datetimecol from testdatediff where datediff(cast(current_timestamp as 
string), datetimecol)<183
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:datetimecol, type:string, comment:null)], 
properties:null)
INFO : Completed compiling 
command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268); Time 
taken: 0.479 seconds
INFO : Executing 
command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268): 
select datetimecol from testdatediff where datediff(cast(current_timestamp as 
string), datetimecol)<183
INFO : Completed executing 
command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268); Time 
taken: 0.013 seconds
INFO : OK
+----------------------------+
|        datetimecol         |
+----------------------------+
| 2019-07-24                 |
+----------------------------+
1 row selected (0.797 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.>
{code}

After setting fetch task conversion to none:

{code}
0: jdbc:hive2://c1113-node2.squadron.support.> set 
hive.fetch.task.conversion=none;
No rows affected (0.017 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.> set hive.fetch.task.conversion;
+----------------------------------+
|               set                |
+----------------------------------+
| hive.fetch.task.conversion=none  |
+----------------------------------+
1 row selected (0.015 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.> select datetimecol from 
testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183;
INFO : Compiling 
command(queryId=hive_20191105103709_0c38e446-09cf-45dd-9553-365146f42452): 
select datetimecol from testdatediff where datediff(cast(current_timestamp as 
string), datetimecol)<183


+----------------------------+
|        datetimecol         |
+----------------------------+
| 2019-09-09T10:45:49+02:00  |
| 2019-07-24                 |
+----------------------------+
2 rows selected (5.327 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.>
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22460) LRFU cache policy leaks locked buffers upon purge()

2019-11-05 Thread Jira
Ádám Szita created HIVE-22460:
-

 Summary: LRFU cache policy leaks locked buffers upon purge()
 Key: HIVE-22460
 URL: https://issues.apache.org/jira/browse/HIVE-22460
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Ádám Szita
Assignee: Ádám Szita


The LRFU policy's purge() implementation carefully avoids removing buffers that 
are currently locked (i.e. in use by some IO thread). So far, so good.

However, it does not keep track of such buffers after the purge() method has 
finished: it always resets its heap and list, thereby forgetting about these 
buffers. It will never be able to evict them in the future, even after they get 
unlocked and become eligible for eviction.

This is problematic as:
 * Although these buffers might eventually be evicted by the BufferAllocator, by 
the time that happens we have wasted space and time.
 * Meta information about the buffers will remain in the CacheContentsTracker 
forever, wasting heap space too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71707: Performance degradation on single row inserts

2019-11-05 Thread Attila Magyar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71707/
---

(Updated Nov. 5, 2019, 3:32 p.m.)


Review request for hive, Ashutosh Chauhan, Peter Vary, and Slim Bouguerra.


Changes
---

Addressing Ashutosh's comments


Bugs: HIVE-22411
https://issues.apache.org/jira/browse/HIVE-22411


Repository: hive-git


Description
---

Executing single-row insert statements on a transactional table affects write 
performance on an S3 file system. Each insert creates a new delta directory. 
After each insert, Hive calculates statistics such as the number of files in the 
table and the total size of the table. In order to calculate these, it traverses 
the directory recursively. During the recursion, a separate listStatus call is 
executed for each path. In the end, the more delta directories you have, the more 
time it takes to calculate the statistics.

Therefore insertion time goes up linearly.
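
For illustration, a hypothetical sequence that triggers the pattern described 
above (the table name and values are made up):

{code:sql}
CREATE TABLE t (i INT) STORED AS ORC TBLPROPERTIES ('transactional'='true');
-- each single-row insert creates its own delta_* directory, and the
-- post-insert statistics gathering then walks all of them recursively
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (2);
INSERT INTO t VALUES (3);
{code}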


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
 38e843aeacf 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
 bf206fffc26 


Diff: https://reviews.apache.org/r/71707/diff/2/

Changes: https://reviews.apache.org/r/71707/diff/1-2/


Testing
---

measured and plotted insertion time


Thanks,

Attila Magyar



Re: Review Request 71707: Performance degradation on single row inserts

2019-11-05 Thread Panos Garefalakis via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71707/#review218505
---




standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
Lines 328 (patched)


Hey Attila, the solution looks good. However, as other file systems might 
face similar issues in the future using this recursive method (e.g. Azure Blob 
storage), wouldn't it make sense to have HDFS as the base case and handle the 
others separately? And maybe log a warning here when the filesystem is not 
supported?


- Panos Garefalakis


On Nov. 5, 2019, 3:32 p.m., Attila Magyar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71707/
> ---
> 
> (Updated Nov. 5, 2019, 3:32 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Peter Vary, and Slim Bouguerra.
> 
> 
> Bugs: HIVE-22411
> https://issues.apache.org/jira/browse/HIVE-22411
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Executing single-row insert statements on a transactional table affects write 
> performance on an S3 file system. Each insert creates a new delta directory. 
> After each insert, Hive calculates statistics such as the number of files in the 
> table and the total size of the table. In order to calculate these, it traverses 
> the directory recursively. During the recursion, a separate listStatus call is 
> executed for each path. In the end, the more delta directories you have, the 
> more time it takes to calculate the statistics.
> 
> Therefore insertion time goes up linearly.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
>  38e843aeacf 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
>  bf206fffc26 
> 
> 
> Diff: https://reviews.apache.org/r/71707/diff/2/
> 
> 
> Testing
> ---
> 
> measured and plotted insertion time
> 
> 
> Thanks,
> 
> Attila Magyar
> 
>



Re: Review Request 71707: Performance degradation on single row inserts

2019-11-05 Thread Attila Magyar


> On Nov. 5, 2019, 4:33 p.m., Panos Garefalakis wrote:
> > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
> > Lines 328 (patched)
> > 
> >
> > Hey Attila, the solution looks good. However, as other file systems might 
> > face similar issues in the future using this recursive method (e.g. Azure 
> > Blob storage), wouldn't it make sense to have HDFS as the base case and 
> > handle the others separately? And maybe log a warning here when the filesystem 
> > is not supported?

Hey Panos, I checked the Hadoop project and I found only one FS implementation 
with an optimized recursive listFiles(); the other implementations use the tree 
walking impl. from the base class. I think that's the more common case. Do you 
know where the source of this Azure Blob storage is? Is it open source at 
all?


- Attila


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71707/#review218505
---


On Nov. 5, 2019, 3:32 p.m., Attila Magyar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71707/
> ---
> 
> (Updated Nov. 5, 2019, 3:32 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Peter Vary, and Slim Bouguerra.
> 
> 
> Bugs: HIVE-22411
> https://issues.apache.org/jira/browse/HIVE-22411
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Executing single-row insert statements on a transactional table affects write 
> performance on an S3 file system. Each insert creates a new delta directory. 
> After each insert, Hive calculates statistics such as the number of files in the 
> table and the total size of the table. In order to calculate these, it traverses 
> the directory recursively. During the recursion, a separate listStatus call is 
> executed for each path. In the end, the more delta directories you have, the 
> more time it takes to calculate the statistics.
> 
> Therefore insertion time goes up linearly.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
>  38e843aeacf 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
>  bf206fffc26 
> 
> 
> Diff: https://reviews.apache.org/r/71707/diff/2/
> 
> 
> Testing
> ---
> 
> measured and plotted insertion time
> 
> 
> Thanks,
> 
> Attila Magyar
> 
>



Re: Review Request 71707: Performance degradation on single row inserts

2019-11-05 Thread Panos Garefalakis via Review Board


> On Nov. 5, 2019, 4:33 p.m., Panos Garefalakis wrote:
> > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
> > Lines 328 (patched)
> > 
> >
> > Hey Attila, the solution looks good. However, as other file systems might 
> > face similar issues in the future using this recursive method (e.g. Azure 
> > Blob storage), wouldn't it make sense to have HDFS as the base case and 
> > handle the others separately? And maybe log a warning here when the filesystem 
> > is not supported?
> 
> Attila Magyar wrote:
> Hey Panos, I checked the Hadoop project and I found only one FS 
> implementation with an optimized recursive listFiles(); the other implementations 
> use the tree walking impl. from the base class. I think that's the more 
> common case. Do you know where the source of this Azure Blob storage is? Is 
> it open source at all?

Hey Attila, I was referring to this: 
https://hadoop.apache.org/docs/current/hadoop-azure/index.html 
but I was also assuming that the recursive method you modified would be called 
for other filesystems as well - if that's not the case then my comment does not 
apply :)


- Panos


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71707/#review218505
---


On Nov. 5, 2019, 3:32 p.m., Attila Magyar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71707/
> ---
> 
> (Updated Nov. 5, 2019, 3:32 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Peter Vary, and Slim Bouguerra.
> 
> 
> Bugs: HIVE-22411
> https://issues.apache.org/jira/browse/HIVE-22411
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Executing single-row insert statements on a transactional table affects write 
> performance on an S3 file system. Each insert creates a new delta directory. 
> After each insert, Hive calculates statistics such as the number of files in the 
> table and the total size of the table. In order to calculate these, it traverses 
> the directory recursively. During the recursion, a separate listStatus call is 
> executed for each path. In the end, the more delta directories you have, the 
> more time it takes to calculate the statistics.
> 
> Therefore insertion time goes up linearly.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
>  38e843aeacf 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
>  bf206fffc26 
> 
> 
> Diff: https://reviews.apache.org/r/71707/diff/2/
> 
> 
> Testing
> ---
> 
> measured and plotted insertion time
> 
> 
> Thanks,
> 
> Attila Magyar
> 
>



[jira] [Created] (HIVE-22461) NPE Metastore Transformer

2019-11-05 Thread Yongzhi Chen (Jira)
Yongzhi Chen created HIVE-22461:
---

 Summary: NPE Metastore Transformer
 Key: HIVE-22461
 URL: https://issues.apache.org/jira/browse/HIVE-22461
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.1.2
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen


The stack trace looks like the following:
{noformat}
2019-10-08 18:09:12,198 INFO  
org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: 
[pool-6-thread-328]: Starting translation for processor 
Hiveserver2#3.1.2000.7.0.2.0...@vc0732.halxg.cloudera.com on list 1
2019-10-08 18:09:12,198 ERROR 
org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-328]: 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transform(MetastoreDefaultTransformer.java:99)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3391)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3352)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy28.get_table_req(Unknown Source)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_req.getResult(ThriftHiveMetastore.java:16633)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_req.getResult(ThriftHiveMetastore.java:16617)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

2019-10-08 18:09:12,199 ERROR org.apache.thrift.server.TThreadPoolServer: 
[pool-6-thread-328]: Error occurred during processing of message.
java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transform(MetastoreDefaultTransformer.java:99)
 ~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3391)
 ~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3352)
 ~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) ~[?:?]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_141]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 ~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 ~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at com.sun.proxy.$Proxy28.get_table_req(Unknown Source) ~[?:?]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_req.getResult(ThriftHiveMetastore.java:16633)
 ~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_req.getResult(ThriftHiveMetastore.java:16617)
 ~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[hive-exec-3.1.2000.7.0.2.0-59.jar:3.1.2000.7.0.2.0-59]
at org.apache.thrift.TBaseProcessor.proce

[jira] [Created] (HIVE-22462) Error Information Lost in GenericUDTFGetSplits

2019-11-05 Thread David Mollitor (Jira)
David Mollitor created HIVE-22462:
-

 Summary: Error Information Lost in GenericUDTFGetSplits
 Key: HIVE-22462
 URL: https://issues.apache.org/jira/browse/HIVE-22462
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22462.1.patch

I was recently looking at some logs from a failed unit test and saw:

 
{code:none}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create temp table: null
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits2.process(GenericUDTFGetSplits2.java:81)
	at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.jav
{code}

The error information was lost... only a useless 'null' string is written.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71707: Performance degradation on single row inserts

2019-11-05 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71707/#review218518
---




standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
Line 323 (original), 321 (patched)


can you please also make a similar change to 
common/src/java/org/apache/hadoop/hive/common/FileUtils.java::listStatusRecursively()
 so that method also benefits from this change.



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
Line 331 (original), 324 (patched)


you may use BlobStorageUtils::isBlobStorageFileSystem() here.



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
Lines 378 (patched)


BlobStorageUtils::isBlobStorageFileSystem() instead


- Ashutosh Chauhan


On Nov. 5, 2019, 3:32 p.m., Attila Magyar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71707/
> ---
> 
> (Updated Nov. 5, 2019, 3:32 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Peter Vary, and Slim Bouguerra.
> 
> 
> Bugs: HIVE-22411
> https://issues.apache.org/jira/browse/HIVE-22411
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Executing single-row insert statements on a transactional table affects write 
> performance on an S3 file system. Each insert creates a new delta directory. 
> After each insert, Hive calculates statistics such as the number of files in the 
> table and the total size of the table. In order to calculate these, it traverses 
> the directory recursively. During the recursion, a separate listStatus call is 
> executed for each path. In the end, the more delta directories you have, the 
> more time it takes to calculate the statistics.
> 
> Therefore insertion time goes up linearly.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
>  38e843aeacf 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
>  bf206fffc26 
> 
> 
> Diff: https://reviews.apache.org/r/71707/diff/2/
> 
> 
> Testing
> ---
> 
> measured and plotted insertion time
> 
> 
> Thanks,
> 
> Attila Magyar
> 
>



[jira] [Created] (HIVE-22463) Support Decimal64 column multiplication with decimal64 Column/Scalar

2019-11-05 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-22463:
---

 Summary: Support Decimal64 column multiplication with decimal64 
Column/Scalar
 Key: HIVE-22463
 URL: https://issues.apache.org/jira/browse/HIVE-22463
 Project: Hive
  Issue Type: Bug
Reporter: Ramesh Kumar Thangarajan


Support Decimal64 column multiplication with decimal64 Column/Scalar
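
For illustration, a hypothetical query shape this targets (Decimal64 
vectorization applies to decimals with precision at most 18; the table and 
column names are illustrative only):

{code:sql}
CREATE TABLE dec64 (c1 DECIMAL(10,2), c2 DECIMAL(10,2)) STORED AS ORC;
SET hive.vectorized.execution.enabled=true;
-- column * column and column * scalar multiplications
SELECT c1 * c2, c1 * 1.5 FROM dec64;
{code}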



--
This message was sent by Atlassian Jira
(v8.3.4#803005)