[
https://issues.apache.org/jira/browse/AVRO-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991201#comment-12991201
]
Scott Carey commented on AVRO-743:
----------------------------------
With a larger heap, it completes but slowly. There is still a large
regression. Using the new Perf.java, here are full read results with and
without the AVRO-650 changes.
args: -nowrite -server -Xmx256m -Xms256m -XX:+UseParallelGC
-XX:+UseCompressedOops -XX:+DoEscapeAnalysis
Only the generic results are shown below: only the "one time use" reader tests
are affected, and the other generic tests serve as a good reference.
{code}
GenericRead: 1811 ms, 3.680 million entries/sec. 142.805 million bytes/sec
GenericNested_Read: 3015 ms, 2.211 million entries/sec. 85.801 million bytes/sec
GenericWithDefault_Read: 3253 ms, 2.049 million entries/sec. 79.532 million bytes/sec
GenericWithOutOfOrder_Read: 1855 ms, 3.594 million entries/sec. 139.472 million bytes/sec
GenericWithPromotion_Read: 1962 ms, 3.397 million entries/sec. 131.853 million bytes/sec
GenericOneTimeDecoderUse_Read: 1791 ms, 3.721 million entries/sec. 144.426 million bytes/sec
GenericOneTimeReaderUse_Read: 6989 ms, 0.954 million entries/sec. 37.014 million bytes/sec
GenericOneTimeUse_Read: 7373 ms, 0.904 million entries/sec. 35.088 million bytes/sec
{code}
If I revert AVRO-650, I get:
{code}
GenericRead: 1808 ms, 3.687 million entries/sec. 143.076 million bytes/sec
GenericNested_Read: 2872 ms, 2.321 million entries/sec. 90.062 million bytes/sec
GenericWithDefault_Read: 3389 ms, 1.967 million entries/sec. 76.340 million bytes/sec
GenericWithOutOfOrder_Read: 1805 ms, 3.693 million entries/sec. 143.319 million bytes/sec
GenericWithPromotion_Read: 1978 ms, 3.369 million entries/sec. 130.759 million bytes/sec
GenericOneTimeDecoderUse_Read: 1803 ms, 3.696 million entries/sec. 143.443 million bytes/sec
GenericOneTimeReaderUse_Read: 2289 ms, 2.912 million entries/sec. 113.024 million bytes/sec
GenericOneTimeUse_Read: 2299 ms, 2.899 million entries/sec. 112.501 million bytes/sec
{code}
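To put the two affected rows side by side, the elapsed times quoted in the two
runs can be divided directly. The small class below is only a convenience for
that arithmetic (its name and method are made up for this note, not part of
Perf.java); it shows both "one time use" reader tests taking a bit over 3x as
long with AVRO-650 applied.

```java
/** Illustrative helper, not part of Perf.java: compares the timings quoted above. */
final class OneTimeUseSlowdown {
    /** Ratio of elapsed time with AVRO-650 applied to elapsed time with it reverted. */
    static double slowdown(long withPatchMs, long revertedMs) {
        return (double) withPatchMs / revertedMs;
    }

    public static void main(String[] args) {
        // Timings in ms are copied from the two result listings in this comment.
        System.out.println("GenericOneTimeReaderUse_Read: " + slowdown(6989, 2289));
        System.out.println("GenericOneTimeUse_Read: " + slowdown(7373, 2299));
    }
}
```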
To keep rapidly created and discarded GenericDatumReaders from causing this
issue, I tried several things. One was to remove the resolver cached in
GenericDatumReader entirely and use only the global cache. This was
surprisingly fast, but slowed all Generic tests by 10% to 15%.
Any variation that creates a new ThreadLocal per instance of GenericDatumReader
was bad. An alternate attempt, which instead kept one global ThreadLocal
WeakReferenceCache with GenericDatumReaders as keys to track the relationship,
was faster, but still a large memory hog and performance problem.
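For readers who have not seen the pattern, here is a rough sketch of the last
variation: one static ThreadLocal holding a weak-keyed map from reader to
resolver, instead of one ThreadLocal per reader. It is not the code that was
benchmarked. ReaderSketch and ResolverSketch are hypothetical stand-ins,
schemas are plain strings, and java.util.WeakHashMap is assumed in place of
the WeakReferenceCache mentioned above.

```java
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical stand-in for the cached resolver; not an Avro class. */
final class ResolverSketch {
    final String actual;
    final String expected;

    ResolverSketch(String actual, String expected) {
        this.actual = actual;
        this.expected = expected;
    }
}

/** Hypothetical reader; schemas are plain strings to keep the sketch self-contained. */
final class ReaderSketch {
    // One ThreadLocal shared by every reader, rather than one per instance.
    // Each thread holds a weak-keyed map, so an entry can be dropped once its
    // reader is no longer strongly reachable. The value must not reference the
    // reader, or the weak key would never be cleared.
    private static final ThreadLocal<Map<ReaderSketch, ResolverSketch>> RESOLVERS =
        ThreadLocal.withInitial(WeakHashMap::new);

    private final String actual;
    private final String expected;

    ReaderSketch(String actual, String expected) {
        this.actual = actual;
        this.expected = expected;
    }

    /** Returns this thread's resolver for this reader, building it on first use. */
    ResolverSketch resolver() {
        return RESOLVERS.get()
            .computeIfAbsent(this, r -> new ResolverSketch(r.actual, r.expected));
    }
}
```

Note that a reader used only once still builds a resolver and adds a map entry
that lingers until the collector clears it, so this shape does not by itself
remove the cost seen in the one-time-use tests.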
This is still not 100% thread-safe, but it is no worse than before. Since we
allow mutating state in setSchema() and setExpected(), the only way to be
completely thread-safe is to synchronize those methods as well as every read
of the state they set. Performance dropped quite a bit when I did that. Longer
term we need to make these objects immutable, and use a builder pattern when
we don't know all the fields prior to construction.
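As a rough illustration of that longer-term direction, an immutable reader
with a builder might look like the sketch below. Nothing here is an existing
Avro API: ImmutableReaderSketch, its Builder, and the method names are
hypothetical, schemas are plain strings, and falling back to the actual
(writer's) schema when no expected schema is given is an assumption made for
this example.

```java
/** Hypothetical immutable reader: both schemas are fixed at construction. */
final class ImmutableReaderSketch {
    private final String actual;
    private final String expected;

    private ImmutableReaderSketch(Builder b) {
        this.actual = b.actual;
        this.expected = b.expected;
    }

    String actual() { return actual; }

    String expected() { return expected; }

    static Builder builder() { return new Builder(); }

    /** Mutable and single-threaded by design; only the built reader is shared. */
    static final class Builder {
        private String actual;
        private String expected;

        Builder actual(String schema) { this.actual = schema; return this; }

        Builder expected(String schema) { this.expected = schema; return this; }

        ImmutableReaderSketch build() {
            if (actual == null) {
                throw new IllegalStateException("actual schema is required");
            }
            // Assumption for this sketch: default expected to actual when unset.
            if (expected == null) {
                expected = actual;
            }
            return new ImmutableReaderSketch(this);
        }
    }
}
```

Because the fields are final and there are no setters, a built reader can be
shared across threads without synchronizing on every access; changing a schema
means building a new reader.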
> Java: Performance Regression and memory pressure with GenericDatumReader
> ------------------------------------------------------------------------
>
> Key: AVRO-743
> URL: https://issues.apache.org/jira/browse/AVRO-743
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.5.0
> Reporter: Scott Carey
> Priority: Critical
> Fix For: 1.5.0
>
>
> AVRO-650 introduced a large performance regression and memory bloat issue
> with GenericDatumReader.
> Performance plummets for some Perf.java tests (One test took 1 hour to finish
> on my laptop).
> Some minor changes I tried let it pass in less time, but still with an
> 80% performance degradation.
> This is associated with memory bloat related to ThreadLocals.
> More details provided in comments.