[jira] [Updated] (AVRO-1089) Avro-C - Penalty 30x to 50x for using resolved writer on arrays
[ https://issues.apache.org/jira/browse/AVRO-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Douglas Creager updated AVRO-1089:
----------------------------------

    Attachment: 0001-AVRO-1089.-Fix-performance-penalty-for-array-resolve.patch

Here's a one-liner patch that fixes this. The problem was that an internal array wasn't being cleared, so it grew not just with the size of each test case but also with the number of test cases. Iterating through that array was causing the slowdown.

All tests still pass; the running time for the resolved array tests is now comparable with that of the non-resolved array tests.

Avro-C - Penalty 30x to 50x for using resolved writer on arrays
---------------------------------------------------------------

    Key: AVRO-1089
    URL: https://issues.apache.org/jira/browse/AVRO-1089
    Project: Avro
    Issue Type: Bug
    Components: c
    Affects Versions: 1.6.3, 1.7.0
    Environment: Ubuntu Linux
    Reporter: Vivek Nadkarni
    Attachments: 0001-AVRO-1089.-Fix-performance-penalty-for-array-resolve.patch, AVRO-1089-performance.png
    Original Estimate: 48h
    Remaining Estimate: 48h

The new performance tests created in AVRO-1088 show that, for simple and nested arrays, using the resolved writer takes 30 to 50 times longer than using no schema resolution or using the resolved reader. For a simple array, using the resolved writer took ~30x longer than using the memory reader that assumed a matching schema. For the nested array, using the resolved writer took ~50x longer. These results suggest that there is a bug in the resolved writer. I do not have a proposed fix at this time.

Running simple array matched schemas
    25 tests per run
    Run 1
    Run 2
    Run 3
    Average time: 2.123s
    Tests/sec: 117739

Running simple array resolved writer
    1 tests per run
    Run 1
    Run 2
    Run 3
    Average time: 2.747s
    Tests/sec: 3641

Running nested array matched schemas
    25 tests per run
    Run 1
    Run 2
    Run 3
    Average time: 3.030s
    Tests/sec: 82508

Running nested array resolved writer
    1 tests per run
    Run 1
    Run 2
    Run 3
    Average time: 6.650s
    Tests/sec: 1504

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
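
The failure mode described in the comment above (an internal array that is never cleared, so each reset walks everything accumulated so far) can be illustrated with a minimal sketch. This is not the Avro-C source and not the attached patch: the type elem_array and the functions reset_buggy, reset_fixed, and simulate are hypothetical names invented for illustration, and the sketch assumes the simplest version of the bug, a reset that forgets to zero its element count.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a resolver's internal element array.
 * These names are illustrative only; they are not Avro-C types. */
typedef struct {
    int    *items;     /* backing storage */
    size_t  count;     /* number of elements considered live */
    size_t  capacity;  /* number of allocated slots */
    size_t  visited;   /* total elements walked by reset calls */
} elem_array;

/* Append one element, doubling the storage when full.
 * Returns 0 on success, -1 on allocation failure. */
static int elem_array_append(elem_array *arr, int value)
{
    if (arr->count == arr->capacity) {
        size_t new_cap = arr->capacity ? arr->capacity * 2 : 8;
        int *p = realloc(arr->items, new_cap * sizeof *p);
        if (p == NULL) {
            return -1;
        }
        arr->items = p;
        arr->capacity = new_cap;
    }
    arr->items[arr->count++] = value;
    return 0;
}

/* Assumed bug: every element is walked for cleanup, but the count is
 * never cleared, so the next reset walks the old elements again. */
static void reset_buggy(elem_array *arr)
{
    for (size_t i = 0; i < arr->count; i++) {
        arr->visited++;  /* stand-in for per-element cleanup work */
    }
}

/* Same walk, followed by the one-line fix assumed in this sketch. */
static void reset_fixed(elem_array *arr)
{
    for (size_t i = 0; i < arr->count; i++) {
        arr->visited++;
    }
    arr->count = 0;
}

/* Run n_tests test cases of n_elems elements each, resetting after
 * every case. Returns the total number of elements walked by the
 * resets, or 0 if an allocation fails. */
static size_t simulate(size_t n_tests, size_t n_elems, int fixed)
{
    elem_array arr = {0};
    for (size_t t = 0; t < n_tests; t++) {
        for (size_t i = 0; i < n_elems; i++) {
            if (elem_array_append(&arr, (int) i) != 0) {
                free(arr.items);
                return 0;
            }
        }
        if (fixed) {
            reset_fixed(&arr);
        } else {
            reset_buggy(&arr);
        }
    }
    size_t visited = arr.visited;
    free(arr.items);
    return visited;
}
```

With the fixed reset, the cleanup work is proportional to n_tests * n_elems. With the buggy reset it grows quadratically in the number of test cases, which matches the observation that the array grew with the number of test cases and not only with the size of each one.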
[jira] [Updated] (AVRO-1089) Avro-C - Penalty 30x to 50x for using resolved writer on arrays
[ https://issues.apache.org/jira/browse/AVRO-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1089:
---------------------------------

    Attachment: AVRO-1089-performance.png

This screenshot was generated using kcachegrind after running the performance test test_simple_array_resolved_writer(). The plot shows that the majority of the time (97%) is spent in the function avro_resolved_writer_free_elements(), which is called by avro_resolved_array_writer_reset(). This suggests that the bug lies in one of these two functions. Unfortunately, I still haven't identified the mechanism or a fix for this issue.

Avro-C - Penalty 30x to 50x for using resolved writer on arrays
---------------------------------------------------------------

    Key: AVRO-1089
    URL: https://issues.apache.org/jira/browse/AVRO-1089
    Project: Avro
    Issue Type: Bug
    Components: c
    Affects Versions: 1.6.3, 1.7.0
    Environment: Ubuntu Linux
    Reporter: Vivek Nadkarni
    Fix For: 1.7.0
    Attachments: AVRO-1089-performance.png
    Original Estimate: 48h
    Remaining Estimate: 48h
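
The "30x to 50x" figure in the issue description can be checked against the Tests/sec values reported for AVRO-1089. The sketch below is only a worked version of that arithmetic; the function names slowdown and print_slowdowns are hypothetical and are not part of the Avro-C performance tests.

```c
#include <stdio.h>

/* Slowdown factor: how many times lower the resolved-writer
 * throughput is compared with the matched-schema baseline. */
static double slowdown(double baseline_tests_per_sec,
                       double resolved_tests_per_sec)
{
    return baseline_tests_per_sec / resolved_tests_per_sec;
}

/* Print the factors for the Tests/sec values quoted in the issue. */
static void print_slowdowns(void)
{
    /* Simple array: matched schemas vs. resolved writer. */
    printf("simple array: %.1fx\n", slowdown(117739.0, 3641.0));
    /* Nested array: matched schemas vs. resolved writer. */
    printf("nested array: %.1fx\n", slowdown(82508.0, 1504.0));
}
```

The two ratios come out at roughly 32 and 55, in line with the ~30x and ~50x quoted in the report.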