> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > Thanks Marci,
> > Few querstions below - probably I just do not understand this part of the 
> > code enough.
> > 
> > Another question for the perf test: How many threads are you using?
> > 
> > Thanks,
> > Peter

Thanks Peti for the review. See my answers below, they are related to minor 
housekeeping changes that are not core to the optimization story.

The perf tests were run with 8 concurrent threads.


> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
> > Line 1102 (original)
> > <https://reviews.apache.org/r/72481/diff/1/?file=2230753#file2230753line1102>
> >
> >     Why did we remove this?

checkRetryable never throws MetaException (and hence the code never enters this 
catch block), so I removed it from its throws clause


> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
> > Line 1137 (original)
> > <https://reviews.apache.org/r/72481/diff/1/?file=2230753#file2230753line1137>
> >
> >     why did we remove this?

same as above


> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 1762-1765 (original), 1759-1762 (patched)
> > <https://reviews.apache.org/r/72481/diff/1/?file=2230754#file2230754line1764>
> >
> >     Is this a functionality or performance change?

neither really, just some readability refactor


> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 2021 (original), 2013 (patched)
> > <https://reviews.apache.org/r/72481/diff/1/?file=2230754#file2230754line2025>
> >
> >     Why is this change required?

just seems counterintuitive to use string concat if we're already using a 
stringbuilder anyway


> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 2067 (original), 2057 (patched)
> > <https://reviews.apache.org/r/72481/diff/1/?file=2230754#file2230754line2071>
> >
> >     Why is this change?

this was causing a checkstyle issue (line lenght too long)


> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 2079 (original), 2070 (patched)
> > <https://reviews.apache.org/r/72481/diff/1/?file=2230754#file2230754line2084>
> >
> >     Why is this change?

unnecessary call to Long.toString


> On May 8, 2020, 10:23 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 2090 (original), 2081 (patched)
> > <https://reviews.apache.org/r/72481/diff/1/?file=2230754#file2230754line2095>
> >
> >     Why is this change?

same as above


- Marton


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72481/#review220689
-----------------------------------------------------------


On May 7, 2020, 3:55 p.m., Marton Bod wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72481/
> -----------------------------------------------------------
> 
> (Updated May 7, 2020, 3:55 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Removed global mutex on writeId allocation, which means write ids can now be 
> allocated concurrently for different tables without blocking each other, 
> speeding up execution (perf test results below). Concurrent 
> allocateTableWriteIds() operations targeting the same table are still mutexed 
> by an S4U if the table is already present in next_write_id, otherwise a race 
> condition to insert the table into next_write_id is solved by retrying after 
> catching the duplicate key exception (the thread which commits later will be 
> the one to retry).
> 
> The situation is similar when allocateTableWriteIds() and 
> replTableWriteIdState() are running concurrently - if they target different 
> tables, they won't block each other anymore. If they target the same table, 
> and the table is already inserted into next_write_id, replTableWriteIdState() 
> returns early and allocateTableWriteIds() updates the next id. If the table 
> is not yet in next_write_id, they might attempt to insert the same row 
> concurrently, in which case who commits later will get a duplicate key 
> exception and retry the operation, just as above.
> 
> 
> Diffs
> -----
> 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 868da0c7a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  d59f863b11 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  cf41ef8aaf 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  1e177f4a7b 
> 
> 
> Diff: https://reviews.apache.org/r/72481/diff/1/
> 
> 
> Testing
> -------
> 
> Unit test in TestTxnHandler
> + Perf tests:
> dbType    sameTable variant  ms/op  error
> MYSQL     FALSE     original 46.93  3.041
> MYSQL     FALSE     patched  19.283 1.311
> MYSQL     TRUE      original 50.185 3.595
> MYSQL     TRUE      patched  32.254 2.164
> ORACLE    FALSE     original 57.609 4.461
> ORACLE    FALSE     patched  25.721 2.551
> ORACLE    TRUE      original 59.668 3.172
> ORACLE    TRUE      patched  39.061 2.548
> POSTGRES  FALSE     original 39.364 2.94 
> POSTGRES  FALSE     patched  18.518 1.038
> POSTGRES  TRUE      original 39.868 2.679
> POSTGRES  TRUE      patched  28.874 1.768
> SQLSERVER FALSE     original 45.252 1.643
> SQLSERVER FALSE     patched  24.583 1.529
> SQLSERVER TRUE      original 49.149 3.45 
> SQLSERVER TRUE      patched  32.918 1.654
> (sameTable=true means that all threads were trying to allocate ids for the 
> same db.table,
> false means they all targeted different tables)
> 
> 
> Thanks,
> 
> Marton Bod
> 
>

Reply via email to