-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67485/#review204700
-----------------------------------------------------------




standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Line 2545 (original), 2538 (patched)
<https://reviews.apache.org/r/67485/#comment287344>

    My concern here is that we are removing the batch processing from this 
method. Although the memory footprint of this method is smaller now that we no 
longer retrieve fully loaded partition objects, I am worried that it may still 
cause OOMs for very large tables. Do you have any test results showing that 
this implementation is no worse than the existing one in terms of memory 
footprint?
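
    To illustrate the kind of batching I have in mind, here is a minimal 
sketch. PartitionBatcher, forEachBatch and the batch size are hypothetical 
names for illustration only, not existing HiveMetaStore code; the handler 
stands in for whatever fetches and processes the locations of one batch.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of HiveMetaStore: walks a large list of
// partition names in fixed-size batches so that only one batch of results
// needs to be held in memory at a time.
final class PartitionBatcher {
  private PartitionBatcher() {}

  // Hands consecutive sublists of at most batchSize names to the handler
  // and returns the number of batches processed.
  static int forEachBatch(List<String> names, int batchSize,
                          Consumer<List<String>> handler) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int batches = 0;
    for (int start = 0; start < names.size(); start += batchSize) {
      int end = Math.min(start + batchSize, names.size());
      // subList returns a view, so the names are not copied per batch
      handler.accept(names.subList(start, end));
      batches++;
    }
    return batches;
  }
}
```

    The handler could query the locations for just that batch and let them 
go out of scope before the next one, which is what bounds the footprint.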



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 2575 (patched)
<https://reviews.apache.org/r/67485/#comment287342>

    Can we avoid creating a new List here by changing the method signature to 
accept a Collection<String> instead of a List<String>?
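
    A rough sketch of what I mean, assuming the callee only iterates over 
the values. CollectionSignatureSketch and countNonEmpty are made-up names 
for illustration and are not the methods in this patch.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the method under review.
final class CollectionSignatureSketch {
  private CollectionSignatureSketch() {}

  // Only iterates over its argument, so Collection<String> is sufficient
  // and callers may pass a List, a Set, or a Map's values() view.
  static int countNonEmpty(Collection<String> locations) {
    int count = 0;
    for (String location : locations) {
      if (location != null && !location.isEmpty()) {
        count++;
      }
    }
    return count;
  }

  static int example() {
    Map<String, String> nameToLocation = new LinkedHashMap<>();
    nameToLocation.put("ds=2018-06-01", "/warehouse/t/ds=2018-06-01");
    nameToLocation.put("ds=2018-06-02", "");
    // values() is passed directly; no new ArrayList<>(...) copy is needed
    return countNonEmpty(nameToLocation.values());
  }
}
```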



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 2784-2785 (patched)
<https://reviews.apache.org/r/67485/#comment287346>

    This may be a future improvement task. We should think of ways to reduce 
the duplicate strings here. Most of the partition locations will share the 
same prefix in their paths. The same goes for the partition-key part of the 
partition name. For now, simply limiting the number of rows by fetching them 
in smaller batches might do the trick.
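
    One possible direction for the duplicate prefixes, as a sketch only: 
keep just the part of each partition location that follows the table 
location, and rebuild the full path on demand. LocationPrefixSketch, 
relativize and resolve are hypothetical names, and the check for an 
absolute location (leading "/" or a "://" scheme separator) is an 
assumption of this sketch, not existing ObjectStore behaviour.

```java
// Hypothetical helper, not part of ObjectStore.
final class LocationPrefixSketch {
  private LocationPrefixSketch() {}

  private static String withTrailingSlash(String path) {
    return path.endsWith("/") ? path : path + "/";
  }

  // Returns only the suffix below the table location when the partition
  // lives under it; otherwise returns the partition location unchanged.
  static String relativize(String tableLocation, String partitionLocation) {
    String prefix = withTrailingSlash(tableLocation);
    if (partitionLocation.startsWith(prefix)) {
      return partitionLocation.substring(prefix.length());
    }
    return partitionLocation;
  }

  // Rebuilds the full location. Values that look absolute (assumption:
  // they start with "/" or contain "://") are returned as stored.
  static String resolve(String tableLocation, String stored) {
    if (stored.startsWith("/") || stored.contains("://")) {
      return stored;
    }
    return withTrailingSlash(tableLocation) + stored;
  }
}
```

    With this, a partition under the table directory is held as something 
like "ds=1/hr=2" rather than the full URI, while partitions with an 
external location keep their full path.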


- Vihang Karajgaonkar


On June 7, 2018, 10:31 a.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67485/
> -----------------------------------------------------------
> 
> (Updated June 7, 2018, 10:31 a.m.)
> 
> 
> Review request for hive, Alexander Kolbasov and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-19783
>     https://issues.apache.org/jira/browse/HIVE-19783
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Added a new getPartitionLocations method to the RawStore interface.
> 
> Implemented getPartitionLocations in ObjectStore using JDOQL.
> Question: In CachedStore, should I call rawStore.getPartitionLocations 
> or reimplement it using getPartitions?
> 
> Modified dropPartitionsAndGetLocations:
> - Instead of querying all the data for every partition, query only the 
> locations using the new interface method
> - Removed the partKeys parameter, which became unnecessary
> 
> 
> Diffs
> -----
> 
>   itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java ff97522 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java b9f5fb8 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b3a8dd0 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java f350aa9 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java d9356b8 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java 8c3ada3 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java f98e8de 
> 
> 
> Diff: https://reviews.apache.org/r/67485/diff/2/
> 
> 
> Testing
> -------
> 
> Ran the TestTablesCreateDropAlterTruncate test (partitioned table creation 
> and drop)
> 
> 
> Thanks,
> 
> Peter Vary
> 
>
