[jira] [Commented] (SOLR-12882) Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
[ https://issues.apache.org/jira/browse/SOLR-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673298#comment-16673298 ]

Tim Underwood commented on SOLR-12882:
--------------------------------------

Thanks for merging!

> Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12882
>                 URL: https://issues.apache.org/jira/browse/SOLR-12882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Assignee: David Smiley
>            Priority: Minor
>             Fix For: 7.6
>
>         Attachments: start-2018-10-31_snapshot___Users_tim_Snapshots__-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png,
>                      start_-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The FacetFieldProcessorByHashDV.collectValFirstPhase method looks like this:
> {noformat}
> private void collectValFirstPhase(int segDoc, long val) throws IOException {
>   int slot = table.add(val); // this can trigger a rehash
>   // Our countAcc is virtual, so this is not needed:
>   // countAcc.incrementCount(slot, 1);
>   super.collectFirstPhase(segDoc, slot, slotNum -> {
>     Comparable value = calc.bitsToValue(val);
>     return new SlotContext(sf.getType().getFieldQuery(null, sf, calc.formatValue(value)));
>   });
> }
> {noformat}
>
> For each value that is iterated over, a lambda is allocated and passed as the
> slotContext argument to the super.collectFirstPhase method. The lambda can be
> factored out so that there is only a single instance per
> FacetFieldProcessorByHashDV instance.
>
> The only tradeoff is that the value needs to be looked up from the table in
> the lambda. However, looking the value up in the table is going to be less
> expensive than a memory allocation, and the slotContext lambda is only used
> in RelatednessAgg, not for any of the field faceting or metrics.
>
> {noformat}
> private void collectValFirstPhase(int segDoc, long val) throws IOException {
>   int slot = table.add(val); // this can trigger a rehash
>   // Our countAcc is virtual, so this is not needed:
>   // countAcc.incrementCount(slot, 1);
>   super.collectFirstPhase(segDoc, slot, slotContext);
> }
>
> /**
>  * SlotContext to use during all {@link SlotAcc} collection.
>  *
>  * This avoids a memory allocation for each invocation of collectValFirstPhase.
>  */
> private IntFunction slotContext = (slotNum) -> {
>   long val = table.vals[slotNum];
>   Comparable value = calc.bitsToValue(val);
>   return new SlotContext(sf.getType().getFieldQuery(null, sf, calc.formatValue(value)));
> };
> {noformat}
>
> FacetFieldProcessorByArray already follows this same pattern.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
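
The refactoring described in this issue can be sketched outside of Solr as follows. This is only an illustration under stated assumptions, not the committed patch: SharedLambdaSketch, its String-valued context function, the fixed-size vals array, and the needsContext flag are all hypothetical stand-ins for the processor, SlotContext, the hash table, and a consumer such as RelatednessAgg. On common JVMs a lambda that captures a local variable is typically instantiated each time the expression is evaluated, while a lambda held in a field is created once per owning object.

```java
import java.util.function.IntFunction;

// Hypothetical stand-in for the facet processor; none of these names are Solr classes.
class SharedLambdaSketch {
    // Stand-in for table.vals: the value stored at each slot.
    private final long[] vals = new long[16];
    private int size = 0;

    // Created once per instance. It captures only `this`, so the value
    // has to be read back from the table by slot number.
    private final IntFunction<String> slotContext = slotNum -> "val=" + vals[slotNum];

    // Counts how often a consumer actually asked for a context.
    int contextsBuilt = 0;

    // Stand-in for table.add(val); no hashing or resizing in this sketch.
    private int add(long val) {
        vals[size] = val;
        return size++;
    }

    // Stand-in for super.collectFirstPhase: only some consumers invoke the function.
    private String collect(int slot, IntFunction<String> ctx, boolean needsContext) {
        if (!needsContext) {
            return null;
        }
        contextsBuilt++;
        return ctx.apply(slot);
    }

    // Before: the lambda captures `val`, so every call evaluates a new capturing lambda.
    String collectPerCall(long val, boolean needsContext) {
        int slot = add(val);
        return collect(slot, slotNum -> "val=" + val, needsContext);
    }

    // After: the same field instance is passed on every call.
    String collectShared(long val, boolean needsContext) {
        int slot = add(val);
        return collect(slot, slotContext, needsContext);
    }

    public static void main(String[] args) {
        SharedLambdaSketch s = new SharedLambdaSketch();
        System.out.println(s.collectPerCall(42L, true));
        System.out.println(s.collectShared(42L, true));
        System.out.println(s.collectShared(7L, false));
    }
}
```

Both variants return the same context for the same value. The shared variant pays for one array read when the context is actually requested, and nothing at all when no consumer asks for it.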
[jira] [Commented] (SOLR-12882) Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
[ https://issues.apache.org/jira/browse/SOLR-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672041#comment-16672041 ]

ASF subversion and git services commented on SOLR-12882:

Commit 1320db356833a6e4823e29416c388593b027948b in lucene-solr's branch refs/heads/branch_7x from [~tpunder]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1320db3 ]

SOLR-12882: Eliminate excessive lambda allocation in json facet FacetFieldProcessorByHashDV.collectValFirstPhase

(cherry picked from commit cf445ba54998710466a7c6cb489d3162d20d127a)

> Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12882
>                 URL: https://issues.apache.org/jira/browse/SOLR-12882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: start-2018-10-31_snapshot___Users_tim_Snapshots__-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png,
>                      start_-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
[jira] [Commented] (SOLR-12882) Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
[ https://issues.apache.org/jira/browse/SOLR-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672030#comment-16672030 ]

ASF subversion and git services commented on SOLR-12882:

Commit cf445ba54998710466a7c6cb489d3162d20d127a in lucene-solr's branch refs/heads/master from [~tpunder]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cf445ba ]

SOLR-12882: Eliminate excessive lambda allocation in json facet FacetFieldProcessorByHashDV.collectValFirstPhase

> Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12882
>                 URL: https://issues.apache.org/jira/browse/SOLR-12882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: start-2018-10-31_snapshot___Users_tim_Snapshots__-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png,
>                      start_-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
[jira] [Commented] (SOLR-12882) Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
[ https://issues.apache.org/jira/browse/SOLR-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669586#comment-16669586 ]

Tim Underwood commented on SOLR-12882:
--------------------------------------

It's not a huge improvement, but I noticed it showing up in YourKit memory allocation profiling:

!start_-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png|width=600px!
https://issues.apache.org/jira/secure/attachment/12946329/start_-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png

!start-2018-10-31_snapshot___Users_tim_Snapshots__-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png|width=600px!
https://issues.apache.org/jira/secure/attachment/12946330/start-2018-10-31_snapshot___Users_tim_Snapshots__-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png

In my informal benchmarking (I should really set up JMH for this stuff), one of my facet-heavy queries went from ~270-275 requests/second to ~285-293 requests/second on my setup. So it's a very minor improvement.

> Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12882
>                 URL: https://issues.apache.org/jira/browse/SOLR-12882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Priority: Major
>         Attachments: start-2018-10-31_snapshot___Users_tim_Snapshots__-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png,
>                      start_-_YourKit_Java_Profiler_2017_02-b75_-_64-bit.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
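
For scale, the request rates quoted in this comment correspond to a gain of roughly 3.6% to 8.5%. The small Java sketch below shows the arithmetic; ThroughputGain and pctGain are hypothetical names, and the inputs are simply the ends of the informally measured ranges, not new measurements.

```java
import java.util.Locale;

// Hypothetical helper for the arithmetic only; not part of Solr or JMH.
class ThroughputGain {
    // Relative change from `before` to `after`, as a percentage of `before`.
    static double pctGain(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Most pessimistic pairing: best "before" rate vs. worst "after" rate.
        double low = pctGain(275.0, 285.0);
        // Most optimistic pairing: worst "before" rate vs. best "after" rate.
        double high = pctGain(270.0, 293.0);
        System.out.println(String.format(Locale.ROOT, "gain: %.1f%% to %.1f%%", low, high));
    }
}
```

Because the two ranges do not overlap, even the most pessimistic pairing shows a gain, although informal numbers like these are not a substitute for a proper benchmark harness.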
[jira] [Commented] (SOLR-12882) Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
[ https://issues.apache.org/jira/browse/SOLR-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668664#comment-16668664 ]

David Smiley commented on SOLR-12882:
-------------------------------------

+1

I wonder how much this helps? Did you do benchmarking?

> Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12882
>                 URL: https://issues.apache.org/jira/browse/SOLR-12882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
[jira] [Commented] (SOLR-12882) Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
[ https://issues.apache.org/jira/browse/SOLR-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1424#comment-1424 ]

Shawn Heisey commented on SOLR-12882:
-------------------------------------

From 2018-10-24 on the #solr-dev IRC channel:

{noformat}
12:22 < tpunder> I have a few Solr Issues I'd like to get reviewed/merged (SOLR-12875, SOLR-12878, SOLR-12882, SOLR-12880). What's the best way to go about doing that?
{noformat}

These issues look very compelling, especially SOLR-12878. We've been fighting facet performance regression for a while now. If I had even a sliver of understanding of the code you're working on, I would help you. You might want to ping the dev list to raise visibility.

> Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12882
>                 URL: https://issues.apache.org/jira/browse/SOLR-12882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
[jira] [Commented] (SOLR-12882) Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
[ https://issues.apache.org/jira/browse/SOLR-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654258#comment-16654258 ]

Tim Underwood commented on SOLR-12882:
--------------------------------------

Pull request here: https://github.com/apache/lucene-solr/pull/476

> Eliminate excessive lambda allocation in FacetFieldProcessorByHashDV.collectValFirstPhase
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12882
>                 URL: https://issues.apache.org/jira/browse/SOLR-12882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h