Re: Review Request 72532: HIVE-23495 AcidUtils.getAcidState cleanup

Karen Coppage via Review Board Wed, 10 Jun 2020 03:16:04 -0700


> On June 5, 2020, 2:04 p.m., Karen Coppage wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Line 1411 (original)
> > <https://reviews.apache.org/r/72532/diff/2/?file=2232788#file2232788line1418>
> >
> >     Are these originals not needed, or collected elsewhere?
> 
> Peter Varga wrote:
>     This one is bothering me, these lines were added when the snapshot way 
> was introduced, but I do not see why. When we calculated the AcidState 
> without the snapshot these files were not added to the originals list. It is 
> explicitly there few lines above, that if we have a base we consider every 
> original files as obsolete. The 
> TestTxnCommandsForMmTable#testInsertOverwriteForPartitionedMmTable breaks for 
> example if these files are added to the list. After an insert-overwrite to a 
> mm table and calling the major compaction, the compaction will create a new 
> base dir, not leaving the perfectly fine base dir generated by the insert 
> overwrite. I did not dig into the compaction to see why the original files 
> are triggering it, but I do not think these files needed in the original list.


Ok. I see that files in a base, even if in original format, should be 
considered base files and not original files. It's still not clear to me why 
this block was added but to me it makes sense to get rid of it.


- Karen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72532/#review220956
-----------------------------------------------------------


On June 8, 2020, 10:58 a.m., Peter Varga wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72532/
> -----------------------------------------------------------
> 
> (Updated June 8, 2020, 10:58 a.m.)
> 
> 
> Review request for hive, Karen Coppage, Marta Kuczora, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> since HIVE-21225 there are two redundant implementation of the 
> AcidUtils.getAcidState.
> 
> The previous implementation (without the recursive listing) can be removed.
> 
> Also the performance can be improved, by removing unnecessary fileStatus 
> calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 635ed3149c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ca234cfb37 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 1059cb227f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> 16c915959c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
>  598220b0c4 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 2a15913f9f 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 4e5d5b003b 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 7913295380 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> d83a50f555 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  5e11d8d2d8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java
>  1bdec7df2d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 75941b3f33 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 337f469d1a 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java f351f04b08 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
> e4440e9136 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 
> f63c40a7b5 
>   streaming/src/test/org/apache/hive/streaming/TestStreaming.java 3a3b267927 
> 
> 
> Diff: https://reviews.apache.org/r/72532/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Peter Varga
> 
>

Re: Review Request 72532: HIVE-23495 AcidUtils.getAcidState cleanup

Reply via email to