kasakrisz opened a new pull request #3050:
URL: https://github.com/apache/hive/pull/3050


   ### What changes were proposed in this pull request?
   Lineage entries are sorted before they are printed. The comparator used for 
sorting compares `Partition` objects by their string representation.
   This patch proposes comparing `Partition` objects by their partition values 
only.
   
   ### Why are the changes needed?
   More than one `Partition` object may represent the same partition in the 
lineage info map. This is very common, since each column has its own entry but 
refers to the same partition. In some cases more than one branch of the 
statement updates the same partition. The properties of the cached 
`Partition` objects may then differ:
   ```
   stats_part PARTITION(p=101).key ...]   -> Partition[(p=101)... 
transient_lastDdlTime=1645697627,...]
   stats_part PARTITION(p=101).value ...] -> Partition[(p=101)... 
transient_lastDdlTime=1645697627,...]
   stats_part PARTITION(p=101).key ...]   -> Partition[(p=101)... 
transient_lastDdlTime=1645697628,...]
   stats_part PARTITION(p=101).value ...] -> Partition[(p=101)... 
transient_lastDdlTime=1645697628,...]
   ```
   This difference changes the behavior of the comparator used for sorting 
lineage entries. The printed entries contain only the partition values, so 
these are the only values that should be used when comparing partitions in 
this case.
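
   The effect can be illustrated with a minimal, self-contained sketch. It is 
not the code from this patch: `Part`, `BY_STRING` and `BY_VALUES` are 
hypothetical names that stand in for Hive's `Partition` class and the lineage 
comparator, and only one volatile property is modeled.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PartitionOrderSketch {

    /** Hypothetical stand-in for a cached partition: its values plus one volatile property. */
    static final class Part {
        final List<String> values;
        final long lastDdlTime;

        Part(List<String> values, long lastDdlTime) {
            this.values = values;
            this.lastDdlTime = lastDdlTime;
        }

        @Override
        public String toString() {
            return "Partition[" + values + ", transient_lastDdlTime=" + lastDdlTime + "]";
        }
    }

    /** Ordering by the full string form, which includes the volatile property. */
    static final Comparator<Part> BY_STRING = Comparator.comparing(Part::toString);

    /** Ordering by partition values only (assumed here to be joined with "/"). */
    static final Comparator<Part> BY_VALUES =
        Comparator.comparing((Part p) -> String.join("/", p.values));

    public static void main(String[] args) {
        // Two cached objects for the same partition p=101, touched one second apart.
        Part first = new Part(Arrays.asList("101"), 1645697627L);
        Part second = new Part(Arrays.asList("101"), 1645697628L);

        // The string-based comparator tells the two copies apart.
        System.out.println(BY_STRING.compare(first, second) != 0);
        // The value-based comparator treats them as the same partition.
        System.out.println(BY_VALUES.compare(first, second) == 0);
    }
}
```

   With the string-based ordering, the relative position of two entries for 
the same partition depends on a timestamp that can vary between runs. With the 
value-based ordering, it does not.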
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Lineage order is more stable when the same statement is run several 
times.
   
   ### How was this patch tested?
   Run flaky-check: http://ci.hive.apache.org/job/hive-flaky-check/530/
   Parameters:
   ```
   -Dtest=TestMiniLlapLocalCliDriver -Dqfile=stats_part_multi_insert_acid.q -pl 
itests/qtest -Pitests
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


