[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4509: -- Attachment: SOLR-4509.patch New patch. Some more cleanup. Also added closeExpiredConnections to the sweeper thread as the documentation recommends. Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Assignee: Mark Miller Priority: Minor Fix For: 5.0, Trunk Attachments: IsStaleTime.java, SOLR-4509-4_4_0.patch, SOLR-4509.patch, SOLR-4509.patch, SOLR-4509.patch, SOLR-4509.patch, baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4509: -- Attachment: SOLR-4509.patch This patch removes some nocommits around config and cleans a few things up a bit. Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Assignee: Mark Miller Priority: Minor Fix For: 5.0, Trunk Attachments: IsStaleTime.java, SOLR-4509-4_4_0.patch, SOLR-4509.patch, SOLR-4509.patch, SOLR-4509.patch, baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4509: -- Fix Version/s: Trunk 5.0 Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Assignee: Mark Miller Priority: Minor Fix For: 5.0, Trunk Attachments: IsStaleTime.java, SOLR-4509-4_4_0.patch, SOLR-4509.patch, baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4509: -- Attachment: SOLR-4509.patch Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Assignee: Mark Miller Priority: Minor Fix For: 5.0, Trunk Attachments: IsStaleTime.java, SOLR-4509-4_4_0.patch, SOLR-4509.patch, SOLR-4509.patch, baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Zezeski updated SOLR-4509: --- Attachment: SOLR-4509-4_4_0.patch Version of the patch that applies to 4.4.0. Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Priority: Minor Attachments: baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg, IsStaleTime.java, SOLR-4509-4_4_0.patch, SOLR-4509.patch By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Zezeski updated SOLR-4509: --- Attachment: baremetal-stale-nostale-throughput.svg baremetal-stale-nostale-throughput.dat baremetal-stale-nostale-med-latency.svg baremetal-stale-nostale-med-latency.dat I have some new results from a different cluster. The short story is that I still see improvement from removing the stale check, just not as dramatic as on my SmartOS cluster. Throughput improved by 108-120% and there was a 0-5ms delta in latency. What I take from this is that the benefits of removing the stale check will vary depending on # of nodes, hardware, query and load. In theory removing the stale check should never hurt as removing blocking syscalls should only help. But I totally understand if about being cautious with a change like this. Personally I'd like to see at least one other person confirm a non-negligible difference before I bother to make this patch more acceptable. Best to let this ticket stir a while I suppose. ## Cluster Specs Add nodes are running on baremetalcloud so this time they are truly different physical machines with no virtualization involved. * 8 nodes/shards * 1 x 2.66GHz Woodcrest E5150 (2 cores) * 2GB DDR2-667 * 73GB SAS 10k RPM * Ubuntu 12.04 * Oracle JDK: Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) * 512MB max heap * Example schema ## Bench Runner * 1 node * 2 x 2.66GHz Woodcrest E5150 * 8GB DDR2-667 * Using Basho Bench as load gen ## Queries All queries hit all shards. All queries were single term queries except for alpha which is conjunction. The numbers listed are the number of documents matching each term query. * alpha: 100K, 100K, 0 * lima: 1 * mike: 10 * november: 100 * oscar: 1K * papa: 10K * quebec: 100K Attached is the aggregate data (.dat) and corresponding plots (.svg) of that data. The data was aggregated from raw data collected by Basho Bench (and this raw data is actually the aggregate of all events at 10s intervals). E.g. the median latency is actually the mean of the median latencies calculated against all events in a given 10s period. What I'm saying is, it's rollup or a rollup so while there are 2 decimals of precision those numbers are not actually that precise. But this should be good for ballpark figures (if you're a stats geek please let me know if I'm committing a sin here). There is a big delta in latency for the mike benchmark but I'm chalking that up to an anomaly for the time being. Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Priority: Minor Attachments: baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg, IsStaleTime.java, SOLR-4509.patch By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Zezeski updated SOLR-4509: --- Attachment: IsStaleTime.java BTrace script to dynamically time the stale check Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Priority: Minor Attachments: IsStaleTime.java, SOLR-4509.patch By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Zezeski updated SOLR-4509: --- Attachment: SOLR-4509.patch Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Priority: Minor Attachments: SOLR-4509.patch By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org