> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1896 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line1898>
> >
> >     What does this method do?
> >     Moving files and updating partition data if the partition is already 
> > exists?
> >     Javadoc would be good anyway.
> >     Follow-up idea: update partition data with alter_partitions (multiple 
> > updates with 1 HMS call)?

Good point, javadoc added.

I noticed that as well, just didn't want to make this change even more 
complicated. There are other points also wich could be be further optimized.  I 
made a note to myself to open a new jira after this is merged in.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1909 (original), 1939 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line1944>
> >
> >     nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1939 (original), 1961 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line1974>
> >
> >     nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1948 (original), 1970 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line1983>
> >
> >     nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1966 (original), 1988 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2001>
> >
> >     nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1970-1971 (original), 1992-1993 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2005>
> >
> >     nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1978-1979 (original), 2000-2001 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2013>
> >
> >     nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2102-2103 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2115>
> >
> >     Concerned about this.
> >     When we call Hive.loadPartition we call 2 methods there:
> >     - loadPartitionInternal - 1 snapshot
> >     - addPartitionToMetastore - 1 snapshot
> >     Are we sure that these calls are:
> >     - Lightweight - so we can happily call them twice
> >     - Return the same value in both ocassions even if some other transacion 
> > is finished during this time?

Yes, I see your point here. 
By checking the implementation of AcidUtils.getTableSnapshot, it is not 
guaranteed that the two calls will have the same result. 
I made a modification to use the same instance of snapshot until the loading of 
partition is done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2083-2084 (original), 2129-2130 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2149>
> >
> >     We do not use batching for adding partitions?

What should be the size of the batch? I used BATCH_RETRIEVE_MAX configuration 
parameter for fetching data from metastore. Can I use this one as well, or 
there is another parameter?


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2492-2506 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2528>
> >
> >     Do we need this? Can't we use equals method of maps to compare instead?

Java arrays do not override Object.equals(). Hence if you compare them using 
equals() (which is what the equals methods of all the collection classes do), 
you get "instance equality", instead of "value equality".


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2511 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2547>
> >
> >     Note: We still might fail with OOM if too many new partitions are 
> > there. We store in memory all of the new and old partition specifications. 
> > Was this the same before?

Yes, in the previous implementation we stored the old/new partion spec in 
memory.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2604-2607 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2663>
> >
> >     Shouldn't we do this when uploading the files?
> >     I think the original intent was to show the progress of the loading of 
> > the partitions. We might want to keep this functionality.

yepp, done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2491-2492 (original), 2615 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2676>
> >
> >     nit: Please-please-please no unneccessary formatting changes... :)

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 2496 (original), 2619 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2681>
> >
> >     We need the PerfLogEnd for LOAD_DYNAMIC_PARTITIONS

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 2500 (original), 2621 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095949#file2095949line2685>
> >
> >     Why do we shallow the original exception. We should add as a root cause

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
> > Lines 910 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095950#file2095950line910>
> >
> >     Do we need this as public?

You're right, I missed that one.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
> > Lines 952 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095950#file2095950line952>
> >
> >     Do we need this as public?

Fixed.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
> > Lines 1015 (patched)
> > <https://reviews.apache.org/r/68975/diff/1/?file=2095950#file2095950line1017>
> >
> >     It might be a good idea to add this to the javadoc of the method too, 
> > since this is a serious assumption. Maybe move the javadoc to the 
> > interface, or copy there too? Let's talk about this.

It's not an assumption, it's a fact. The loadDynamicPartitions works in a way, 
that only allows to load partitions dynamically into a single table. I move the 
comment to the javadoc.


- Laszlo


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68975/#review209443
-----------------------------------------------------------


On Oct. 10, 2018, 2:33 p.m., Laszlo Pinter wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68975/
> -----------------------------------------------------------
> 
> (Updated Oct. 10, 2018, 2:33 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-20661: Dynamic partitions loading calls add partition for every 
> partition 1-by-1
> 
> 
> Diffs
> -----
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> 0ab77e84c6222d35bcec9ce95ed02014911ef144 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 4de038913a5c9a2c199f71702b8f70ca84d0856b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  a2b57fb646899c54b63be14a8cde9b8644a973aa 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  a2ec09f7157140fda751ab777649c698b3319a16 
> 
> 
> Diff: https://reviews.apache.org/r/68975/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>

Reply via email to