[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020206#comment-13020206 ]

Simon Willnauer commented on LUCENE-2571:
-----------------------------------------

bq. Would you consider trying other MergePolicy objects on trunk? The BalancedSegment MP tries to avoid these long stoppages.

I think there is a misunderstanding on your side. The long stoppages on trunk are not due to merges at all. They are due to flushing the DocumentsWriter, which essentially means stop the world. This is why we cannot make any progress during a flush. Merges are NOT blocking indexing on trunk, no matter which MP you use. The Balanced MP is rather suited to RT environments, where it makes reopening the reader quicker.

You should maybe look at this blog entry for a more complete explanation: http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/

Indexing performance tests with realtime branch
-----------------------------------------------

Key: LUCENE-2571
URL: https://issues.apache.org/jira/browse/LUCENE-2571
Project: Lucene - Java
Issue Type: Task
Components: Index
Reporter: Michael Busch
Priority: Minor
Fix For: Realtime Branch
Attachments: wikimedium.realtime.Standard.nd10M_dps.png, wikimedium.realtime.Standard.nd10M_dps_addDocuments.png, wikimedium.realtime.Standard.nd10M_dps_addDocuments_flush.png, wikimedium.trunk.Standard.nd10M_dps.png, wikimedium.trunk.Standard.nd10M_dps_addDocuments.png

We should run indexing performance tests with the DWPT changes and compare to trunk. We need to test both single-threaded and multi-threaded performance.

NOTE: flush by RAM isn't implemented just yet, so either we wait with the tests or flush by doc count.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
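To make the stop-the-world point concrete, here is a toy back-of-the-envelope model in Java. It is not Lucene code: the class `FlushStallModel`, its method names, and every parameter are hypothetical and exist only for illustration. It assumes flush by doc count (as the issue description notes), a fixed cost per flush, and a disk that can absorb concurrent per-thread flushes without slowing down, which Earwin's comment in this thread calls into question.

```java
/** Toy model only; names and numbers are illustrative assumptions, not Lucene internals. */
final class FlushStallModel {

    /**
     * Wall-clock seconds when all threads share one buffer and every flush
     * stops all indexing threads (the trunk behaviour described above).
     */
    static double stopTheWorldSeconds(long docs, int threads, double docsPerSecPerThread,
                                      long docsPerFlush, double secondsPerFlush) {
        double indexing = docs / (threads * docsPerSecPerThread);
        long flushes = docs / docsPerFlush;           // counts full buffers only
        return indexing + flushes * secondsPerFlush;  // each flush stalls every thread
    }

    /**
     * Wall-clock seconds when each thread buffers docsPerFlush docs of its own and
     * flushes them itself while the other threads keep indexing (the DWPT idea).
     */
    static double perThreadFlushSeconds(long docs, int threads, double docsPerSecPerThread,
                                        long docsPerFlush, double secondsPerFlush) {
        long docsPerThread = docs / threads;
        double indexing = docsPerThread / docsPerSecPerThread;
        long flushesPerThread = docsPerThread / docsPerFlush; // only this thread stalls
        return indexing + flushesPerThread * secondsPerFlush;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 10M docs, 4 threads, 1000 docs/sec each,
        // a flush every 100k docs, 5 seconds per flush.
        System.out.println(stopTheWorldSeconds(10_000_000L, 4, 1000.0, 100_000L, 5.0));
        System.out.println(perThreadFlushSeconds(10_000_000L, 4, 1000.0, 100_000L, 5.0));
    }
}
```

With these made-up inputs the model gives 3000 s for the stop-the-world case and 2625 s for the per-thread case: the indexing time is identical, and the difference comes only from how many flush stalls sit on the critical path. Merges do not appear in the model at all, which matches the claim that they do not block indexing.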
[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020217#comment-13020217 ]

Earwin Burrfoot commented on LUCENE-2571:
-----------------------------------------

bq. Merges are NOT blocking indexing on trunk no matter which MP you use.

Well... merges tie up IO (especially if not on fancy SSDs/RAIDs), which in turn delays flushes: bigger delays for stop-the-world flushes, and a lower bandwidth cap (after which they are forced to stop the world) for parallel flushes.

So Lance's point is partially valid.
[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020230#comment-13020230 ]

Simon Willnauer commented on LUCENE-2571:
-----------------------------------------

bq. Well.. merges tie up IO (especially if not on fancy SSDs/RAIDs), which in turn lags flushes - bigger delays for stop the world flushes / lower bandwidth cap (after which they are forced to stop the world) for parallel flushes.

True, it will make a difference in certain situations, but not for this benchmark. RT does way more merges here since we are flushing way more segments. The time window I used here is one where we almost don't merge at all in the trunk run, so it should not make a difference.

I ran those benchmarks again with BalancedSegmentMergePolicy and it doesn't really make any difference. See below:

!wikimedium.trunk.Standard.nd10M_dps_BalancedSegmentMergePolicy.png!
[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019890#comment-13019890 ]

Simon Willnauer commented on LUCENE-2571:
-----------------------------------------

I ran batch indexing benchmarks, trunk vs. realtime branch, with addDocument and with updateDocument.

For addDocument I indexed 10M Wikipedia docs onto a spinning disk, reading from a separate SSD.

Here is the realtime graph:
!wikimedium.realtime.Standard.nd10M_dps_addDocuments.png!

vs. trunk:
!wikimedium.trunk.Standard.nd10M_dps_addDocuments.png!

This graph shows how DWPT is flushing to disk over time:
!wikimedium.realtime.Standard.nd10M_dps_addDocuments_flush.png!

For updateDocument I built a 10M-doc wiki index and then indexed the exact same documents again with updateDocument. Here are the results:

Realtime Branch:
!wikimedium.realtime.Standard.nd10M_dps.png!

trunk:
!wikimedium.trunk.Standard.nd10M_dps.png!
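For readers who want to produce a comparable throughput-over-time series from their own runs, the following is a minimal sketch in Java. It assumes the "dps" in the attachment names stands for documents per second, which the thread does not state explicitly, and it is not the benchmark code used here. The class `DocsPerSecond` and its method `perWindow` are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical helper; not part of Lucene or of the benchmark used in this issue. */
final class DocsPerSecond {

    /**
     * Buckets per-document completion times into fixed-width windows.
     *
     * @param completionMillis non-negative times (ms since benchmark start) at which
     *                         each document finished indexing, in any order
     * @param windowMillis     width of each window in milliseconds, must be positive
     * @return docs/sec for each consecutive window, starting at time 0
     */
    static double[] perWindow(long[] completionMillis, long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        if (completionMillis.length == 0) {
            return new double[0];
        }
        long last = Arrays.stream(completionMillis).max().getAsLong();
        int windows = (int) (last / windowMillis) + 1;

        // Count how many documents completed inside each window.
        long[] counts = new long[windows];
        for (long t : completionMillis) {
            if (t < 0) {
                throw new IllegalArgumentException("negative timestamp: " + t);
            }
            counts[(int) (t / windowMillis)]++;
        }

        // Convert counts to a rate in documents per second.
        double[] rates = new double[windows];
        for (int i = 0; i < windows; i++) {
            rates[i] = counts[i] * 1000.0 / windowMillis;
        }
        return rates;
    }
}
```

A window whose rate drops to zero is what a complete indexing stall looks like in such a series, so plotting the returned array for both branches gives a direct way to compare how often and for how long indexing stops.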
[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020147#comment-13020147 ]

Lance Norskog commented on LUCENE-2571:
---------------------------------------

Would you consider trying other MergePolicy objects on trunk? The BalancedSegment MP tries to avoid these long stoppages.