[jira] [Updated] (SOLR-1384) Allow fq or q to specify boolean query min must match like dismax's mm parameter
[ https://issues.apache.org/jira/browse/SOLR-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] alexey updated SOLR-1384: - Attachment: SOLR-1384.patch Please consider new patch. MinShouldMatch parameter value moved to SolrQueryParser according to Hoss Man's comment. > Allow fq or q to specify boolean query min must match like dismax's mm > parameter > > > Key: SOLR-1384 > URL: https://issues.apache.org/jira/browse/SOLR-1384 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.4 >Reporter: Preetam Rao >Priority: Minor > Fix For: 4.9, 6.0 > > Attachments: SOLR-1384.patch, SOLR-1384.patch > > > Dis max query provides "mm" parameter that can be set on the underlying > Lucene Boolean OR query using setMinimumNumberShouldMatch() method. > It will be great if we can have the same support on any fq or q that > specifies more than one term. This means we don't need to switch to dis max > query just for this one use case. > Example might look like this: > fq={!minMustMatch=75%}street:"917 Z st NW Washington DC" > Full supported syntax for the value allowed should be this: > http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html > This is the underlying lucene facility: > http://www.netlikon.de/docs/javadoc-lucene/lucene_1_9/org/apache/lucene/search/BooleanQuery.html#setMinimumNumberShouldMatch%28int%29 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
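The request above comes down to turning an mm-style value such as 75% into the integer that setMinimumNumberShouldMatch() expects. Below is a minimal sketch of that conversion. It assumes only the two simplest forms of the min-should-match syntax (a plain integer and a percentage, both rounded toward zero). The class and method names are hypothetical, and this is not the code from the attached patch.

```java
public class MinShouldMatchSketch {

    /**
     * Converts a simplified mm value into a clause count.
     * Only "N", "-N", "P%" and "-P%" are handled; the conditional
     * forms of the full syntax (e.g. "2<-25%") are deliberately left out.
     */
    public static int minShouldMatch(String spec, int optionalClauses) {
        String s = spec.trim();
        int result;
        if (s.endsWith("%")) {
            int percent = Integer.parseInt(s.substring(0, s.length() - 1).trim());
            // Integer division truncates toward zero for both signs.
            int portion = (optionalClauses * percent) / 100;
            // A negative percentage means "all clauses except this share".
            result = percent < 0 ? optionalClauses + portion : portion;
        } else {
            int n = Integer.parseInt(s);
            // A negative integer means "all clauses except this many".
            result = n < 0 ? optionalClauses + n : n;
        }
        // Clamp to a value the boolean query can actually satisfy.
        return Math.max(0, Math.min(optionalClauses, result));
    }

    public static void main(String[] args) {
        // Seven optional terms, as in: 917 Z st NW Washington DC
        System.out.println(minShouldMatch("75%", 7));  // 5
        System.out.println(minShouldMatch("-25%", 7)); // 6
        System.out.println(minShouldMatch("3", 7));    // 3
    }
}
```

The resulting integer would then be passed to the boolean query built from the fq or q clauses, which is the part the issue asks Solr to do on the user's behalf.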
[jira] [Comment Edited] (SOLR-6841) Visualize lucene segment info in admin
[ https://issues.apache.org/jira/browse/SOLR-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243932#comment-14243932 ]

alexey edited comment on SOLR-6841 at 12/12/14 10:12 AM:
---------------------------------------------------------

Work-in-progress UI screenshot attached to demonstrate the idea

was (Author: alexey_kozhemiakin):
Work-in-progress UI to demonstrate the idea

> Visualize lucene segment info in admin
> --------------------------------------
>
> Key: SOLR-6841
> URL: https://issues.apache.org/jira/browse/SOLR-6841
> Project: Solr
> Issue Type: Improvement
> Reporter: alexey
> Fix For: 5.0
>
> Attachments: i7^cimgpsh_orig.png
>
>
> We find it useful to tune the merge policy not blindly but by looking at segment
> size and fill ratio.
> We're working on a patch that adds a tab to the admin page with a McCandless-style
> segment visualization.
> A draft UI is attached (currently as part of admin.extra).
> Please share your ideas on whether it's OK to put the code in core admin.
> More details here:
> http://search-lucene.com/m/QTPa44cNJ1
> https://plus.google.com/+MichaelMcCandless/posts/MJVueTznYnD
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
[jira] [Updated] (SOLR-6841) Visualize lucene segment info in admin
[ https://issues.apache.org/jira/browse/SOLR-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

alexey updated SOLR-6841:
-------------------------
    Attachment: i7^cimgpsh_orig.png

Work-in-progress UI to demonstrate the idea

> Visualize lucene segment info in admin
> --------------------------------------
>
> Key: SOLR-6841
> URL: https://issues.apache.org/jira/browse/SOLR-6841
> Project: Solr
> Issue Type: Improvement
> Reporter: alexey
> Fix For: 5.0
>
> Attachments: i7^cimgpsh_orig.png
>
>
> We find it useful to tune the merge policy not blindly but by looking at segment
> size and fill ratio.
> We're working on a patch that adds a tab to the admin page with a McCandless-style
> segment visualization.
> A draft UI is attached (currently as part of admin.extra).
> Please share your ideas on whether it's OK to put the code in core admin.
> More details here:
> http://search-lucene.com/m/QTPa44cNJ1
> https://plus.google.com/+MichaelMcCandless/posts/MJVueTznYnD
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
[jira] [Updated] (SOLR-6841) Visualize lucene segment info in admin
[ https://issues.apache.org/jira/browse/SOLR-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

alexey updated SOLR-6841:
-------------------------
    Description:
We find it useful to tune the merge policy not blindly but by looking at segment size and fill ratio.
We're working on a patch that adds a tab to the admin page with a McCandless-style segment visualization.
A draft UI is attached (currently as part of admin.extra).
Please share your ideas on whether it's OK to put the code in core admin.
More details here:
http://search-lucene.com/m/QTPa44cNJ1
https://plus.google.com/+MichaelMcCandless/posts/MJVueTznYnD
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

  was:
We find it useful to tune the merge policy not blindly but by looking at segment size and fill ratio.
We're working on a patch that adds a tab to the admin page with a McCandless-style segment visualization.
A draft UI is attached (currently as part of admin.extra).
Please share your ideas.
More details here:
http://search-lucene.com/m/QTPa44cNJ1
https://plus.google.com/+MichaelMcCandless/posts/MJVueTznYnD
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

> Visualize lucene segment info in admin
> --------------------------------------
>
> Key: SOLR-6841
> URL: https://issues.apache.org/jira/browse/SOLR-6841
> Project: Solr
> Issue Type: Improvement
> Reporter: alexey
> Fix For: 5.0
>
>
> We find it useful to tune the merge policy not blindly but by looking at segment
> size and fill ratio.
> We're working on a patch that adds a tab to the admin page with a McCandless-style
> segment visualization.
> A draft UI is attached (currently as part of admin.extra).
> Please share your ideas on whether it's OK to put the code in core admin.
> More details here:
> http://search-lucene.com/m/QTPa44cNJ1
> https://plus.google.com/+MichaelMcCandless/posts/MJVueTznYnD
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
[jira] [Created] (SOLR-6841) Visualize lucene segment info in admin
alexey created SOLR-6841:
----------------------------

    Summary: Visualize lucene segment info in admin
    Key: SOLR-6841
    URL: https://issues.apache.org/jira/browse/SOLR-6841
    Project: Solr
    Issue Type: Improvement
    Reporter: alexey
    Fix For: 5.0

We find it useful to tune the merge policy not blindly but by looking at segment size and fill ratio.
We're working on a patch that adds a tab to the admin page with a McCandless-style segment visualization.
A draft UI is attached (currently as part of admin.extra).
Please share your ideas.
More details here:
http://search-lucene.com/m/QTPa44cNJ1
https://plus.google.com/+MichaelMcCandless/posts/MJVueTznYnD
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
[jira] [Commented] (SOLR-6601) Ignore TF IDF on query side
[ https://issues.apache.org/jira/browse/SOLR-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165226#comment-14165226 ]

alexey commented on SOLR-6601:
------------------------------

Yes, we want to control relevancy by boost functions and field weights. In some scenarios tf and idf do not carry a meaningful value.

> Ignore TF IDF on query side
> ---------------------------
>
> Key: SOLR-6601
> URL: https://issues.apache.org/jira/browse/SOLR-6601
> Project: Solr
> Issue Type: Improvement
> Reporter: alexey
> Priority: Trivial
>
> It's a typical request on user mailing lists and from customers: "how to
> ignore tf idf at query time".
> Let's put this naive code snippet into a contrib jar in order to avoid writing
> it multiple times.
> // Package as of Lucene 4.x; keep only the overloads that exist in your
> // Lucene version, since @Override fails to compile on the others.
> import org.apache.lucene.search.similarities.DefaultSimilarity;
>
> class IgnoreTfIdfSimilarity extends DefaultSimilarity {
>     @Override
>     public float tf(float freq) {
>         return 1.0f; // float literal; a bare 1.0 is a double and does not compile here
>     }
>     @Override
>     public float tf(int freq) {
>         return 1.0f;
>     }
>     // Note the signature of this method may now take longs:
>     // public float idf(long docFreq, long numDocs)
>     @Override
>     public float idf(int docFreq, int numDocs) {
>         return 1.0f;
>     }
> }
[jira] [Commented] (SOLR-6600) configurable relevance impact of phrases for edismax
[ https://issues.apache.org/jira/browse/SOLR-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163566#comment-14163566 ]

alexey commented on SOLR-6600:
------------------------------

Yes, Jan Høydahl, thanks for pointing this out. Looks like this issue is a duplicate of SOLR-6062.

> configurable relevance impact of phrases for edismax
> ----------------------------------------------------
>
> Key: SOLR-6600
> URL: https://issues.apache.org/jira/browse/SOLR-6600
> Project: Solr
> Issue Type: Improvement
> Components: query parsers
> Affects Versions: 4.9
> Reporter: alexey
> Labels: edismax
>
> Currently Solr has a tie breaker parameter which controls how relevance scores
> are aggregated for search hits.
> But scores for the phrase fields (pf, pf2, pf3) are always summed up.
> The goal of the patch is to wrap phrase clauses into a single dismax clause
> instead of multiple ones.
> Before patch:
> +(
> DisjunctionMaxQuery((Body:james | Title:james)~tie_breaker)
> DisjunctionMaxQuery((Body:kirk | Title:kirk)~tie_breaker)
> )
> DisjunctionMaxQuery((Body:"james kirk")~tie_breaker)
> DisjunctionMaxQuery((Title:"james kirk")~tie_breaker)
> After patch:
> +(
> DisjunctionMaxQuery((Body:james | Title:james)~tie_breaker)
> DisjunctionMaxQuery((Body:kirk | Title:kirk)~tie_breaker)
> )
> DisjunctionMaxQuery((Body:"james kirk" | Title:"james kirk")~tie_breaker)
[jira] [Updated] (SOLR-6600) configurable relevance impact of phrases for edismax
[ https://issues.apache.org/jira/browse/SOLR-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

alexey updated SOLR-6600:
-------------------------
    Issue Type: Improvement (was: Bug)

> configurable relevance impact of phrases for edismax
> ----------------------------------------------------
>
> Key: SOLR-6600
> URL: https://issues.apache.org/jira/browse/SOLR-6600
> Project: Solr
> Issue Type: Improvement
> Components: query parsers
> Affects Versions: 4.9
> Reporter: alexey
> Labels: edismax
>
> Currently Solr has a tie breaker parameter which controls how relevance scores
> are aggregated for search hits.
> But scores for the phrase fields (pf, pf2, pf3) are always summed up.
> The goal of the patch is to wrap phrase clauses into a single dismax clause
> instead of multiple ones.
> Before patch:
> +(
> DisjunctionMaxQuery((Body:james | Title:james)~tie_breaker)
> DisjunctionMaxQuery((Body:kirk | Title:kirk)~tie_breaker)
> )
> DisjunctionMaxQuery((Body:"james kirk")~tie_breaker)
> DisjunctionMaxQuery((Title:"james kirk")~tie_breaker)
> After patch:
> +(
> DisjunctionMaxQuery((Body:james | Title:james)~tie_breaker)
> DisjunctionMaxQuery((Body:kirk | Title:kirk)~tie_breaker)
> )
> DisjunctionMaxQuery((Body:"james kirk" | Title:"james kirk")~tie_breaker)
[jira] [Created] (SOLR-6601) Ignore TF IDF on query side
alexey created SOLR-6601:
----------------------------

    Summary: Ignore TF IDF on query side
    Key: SOLR-6601
    URL: https://issues.apache.org/jira/browse/SOLR-6601
    Project: Solr
    Issue Type: Improvement
    Reporter: alexey
    Priority: Trivial

It's a typical request on user mailing lists and from customers: "how to ignore tf idf at query time".
Let's put this naive code snippet into a contrib jar in order to avoid writing it multiple times.

// Package as of Lucene 4.x; keep only the overloads that exist in your
// Lucene version, since @Override fails to compile on the others.
import org.apache.lucene.search.similarities.DefaultSimilarity;

class IgnoreTfIdfSimilarity extends DefaultSimilarity {
    @Override
    public float tf(float freq) {
        return 1.0f; // float literal; a bare 1.0 is a double and does not compile here
    }
    @Override
    public float tf(int freq) {
        return 1.0f;
    }
    // Note the signature of this method may now take longs:
    // public float idf(long docFreq, long numDocs)
    @Override
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }
}
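To make the effect of the similarity above concrete, here is a small sketch comparing the classic tf and idf factors with the constant ones. It assumes the classic formulas tf = sqrt(freq) and idf = 1 + ln(numDocs / (docFreq + 1)), written out by hand. No Lucene classes are used, and all names are hypothetical.

```java
public class TfIdfSketch {

    /** Classic term-frequency factor, written out by hand (assumed formula). */
    public static float classicTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    /** Classic inverse-document-frequency factor (assumed formula). */
    public static float classicIdf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    /** Constant factors, which is what the proposed similarity returns. */
    public static float flatTf(float freq) {
        return 1.0f;
    }

    public static float flatIdf(long docFreq, long numDocs) {
        return 1.0f;
    }

    public static void main(String[] args) {
        // A term occurring 4 times in a document and present in 9 of 1000 documents.
        System.out.println("classic: " + classicTf(4f) * classicIdf(9, 1000));
        System.out.println("flat:    " + flatTf(4f) * flatIdf(9, 1000));
    }
}
```

With the flat factors every matching term contributes the same weight, so ranking is left to boost functions and field weights, which is the stated goal of the issue.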
[jira] [Created] (SOLR-6600) configurable relevance impact of phrases for edismax
alexey created SOLR-6600:
----------------------------

    Summary: configurable relevance impact of phrases for edismax
    Key: SOLR-6600
    URL: https://issues.apache.org/jira/browse/SOLR-6600
    Project: Solr
    Issue Type: Bug
    Components: query parsers
    Affects Versions: 4.9
    Reporter: alexey

Currently Solr has a tie breaker parameter which controls how relevance scores are aggregated for search hits.
But scores for the phrase fields (pf, pf2, pf3) are always summed up.
The goal of the patch is to wrap phrase clauses into a single dismax clause instead of multiple ones.

Before patch:
+(
DisjunctionMaxQuery((Body:james | Title:james)~tie_breaker)
DisjunctionMaxQuery((Body:kirk | Title:kirk)~tie_breaker)
)
DisjunctionMaxQuery((Body:"james kirk")~tie_breaker)
DisjunctionMaxQuery((Title:"james kirk")~tie_breaker)

After patch:
+(
DisjunctionMaxQuery((Body:james | Title:james)~tie_breaker)
DisjunctionMaxQuery((Body:kirk | Title:kirk)~tie_breaker)
)
DisjunctionMaxQuery((Body:"james kirk" | Title:"james kirk")~tie_breaker)
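The difference between the two query shapes above can be sketched numerically. This is an illustration only. It assumes a dismax clause scores as the best sub-score plus tie_breaker times the remaining sub-scores, that the outer boolean query simply adds its clauses, and that normalization factors are ignored. The class and method names are hypothetical.

```java
public class PhraseBoostSketch {

    /** Dismax-style combination: best score plus tie * the rest (scores assumed non-negative). */
    public static double disMax(double tie, double... scores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : scores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tie * (sum - max);
    }

    /** Before the patch: one dismax clause per phrase field, summed by the outer query. */
    public static double before(double tie, double bodyPhrase, double titlePhrase) {
        return disMax(tie, bodyPhrase) + disMax(tie, titlePhrase);
    }

    /** After the patch: both phrase fields inside a single dismax clause. */
    public static double after(double tie, double bodyPhrase, double titlePhrase) {
        return disMax(tie, bodyPhrase, titlePhrase);
    }

    public static void main(String[] args) {
        double body = 2.0;   // made-up phrase score on Body
        double title = 3.0;  // made-up phrase score on Title
        System.out.println(before(0.0, body, title)); // 5.0: tie breaker has no effect
        System.out.println(after(0.0, body, title));  // 3.0: only the best field counts
        System.out.println(after(0.5, body, title));  // 4.0
    }
}
```

In the "before" shape each dismax clause holds a single field, so the tie breaker never comes into play and the phrase scores are always added; in the "after" shape the tie breaker decides how much the weaker field contributes.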
[jira] [Updated] (SOLR-5725) Efficient facets without counts for enum method
[ https://issues.apache.org/jira/browse/SOLR-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

alexey updated SOLR-5725:
-------------------------
    Attachment: SOLR-5725.patch

Draft version of the patch attached. Tests and refined names of the method and params will follow soon. Patch checked on 4.6.

> Efficient facets without counts for enum method
> -----------------------------------------------
>
> Key: SOLR-5725
> URL: https://issues.apache.org/jira/browse/SOLR-5725
> Project: Solr
> Issue Type: Improvement
> Components: search
> Reporter: alexey
> Fix For: 5.0
>
> Attachments: SOLR-5725.patch
>
>
> Short version:
> This improves performance for facet.method=enum when it's enough to know that
> the facet count is greater than 0, for example when you dynamically populate
> filters on a search form. The new method checks whether two bitsets intersect
> instead of counting the intersection size.
> Long version:
> We have a dataset containing hundreds of millions of records; we facet by
> dozens of fields with many facet excludes, and the fields have a relatively
> small number of unique values, around thousands.
> Before executing a search, users work with an "advanced search" form. Our goal
> is to populate dozens of filters with values that are applicable given the other
> selected values, so basically this is a use case for facets with mincount=1,
> but without the need for actual counts.
> Our performance tests showed that facet.method=enum works much better than
> fc/fcs, probably due to a specific ratio of "docset"/"unique terms count".
> For example, average query execution time is 1500ms with method fc, 2600ms
> with fcs and 280ms with enum. Profiling indicated that the majority of the time
> for enum was spent on intersecting docsets.
> Here's a patch that introduces an extension to facet calculation for
> method=enum. Basically it uses docSetA.intersects(docSetB) instead of
> docSetA.intersectionSize(docSetB).
> As a result we were able to reduce our average query time from 280ms to 60ms.
[jira] [Created] (SOLR-5725) Efficient facets without counts for enum method
alexey created SOLR-5725:
----------------------------

    Summary: Efficient facets without counts for enum method
    Key: SOLR-5725
    URL: https://issues.apache.org/jira/browse/SOLR-5725
    Project: Solr
    Issue Type: Improvement
    Components: search
    Reporter: alexey
    Fix For: 5.0

Short version:
This improves performance for facet.method=enum when it's enough to know that the facet count is greater than 0, for example when you dynamically populate filters on a search form. The new method checks whether two bitsets intersect instead of counting the intersection size.

Long version:
We have a dataset containing hundreds of millions of records; we facet by dozens of fields with many facet excludes, and the fields have a relatively small number of unique values, around thousands.
Before executing a search, users work with an "advanced search" form. Our goal is to populate dozens of filters with values that are applicable given the other selected values, so basically this is a use case for facets with mincount=1, but without the need for actual counts.
Our performance tests showed that facet.method=enum works much better than fc/fcs, probably due to a specific ratio of "docset"/"unique terms count". For example, average query execution time is 1500ms with method fc, 2600ms with fcs and 280ms with enum. Profiling indicated that the majority of the time for enum was spent on intersecting docsets.
Here's a patch that introduces an extension to facet calculation for method=enum. Basically it uses docSetA.intersects(docSetB) instead of docSetA.intersectionSize(docSetB).
As a result we were able to reduce our average query time from 280ms to 60ms.
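The idea of asking "do these sets share any document?" instead of "how many documents do they share?" can be sketched with java.util.BitSet standing in for Solr's doc sets. This is only an illustration under that assumption; the class name is hypothetical and the code is not taken from the patch.

```java
import java.util.BitSet;

public class FacetExistsSketch {

    /** Counting variant: materializes the intersection and counts its bits. */
    public static int intersectionSize(BitSet base, BitSet term) {
        BitSet copy = (BitSet) base.clone(); // keep the caller's set untouched
        copy.and(term);
        return copy.cardinality();
    }

    /** Existence variant: only answers whether at least one bit is shared. */
    public static boolean intersects(BitSet base, BitSet term) {
        return base.intersects(term);
    }

    public static void main(String[] args) {
        BitSet base = new BitSet();   // documents matching the query and filters
        base.set(1);
        base.set(5);
        base.set(9);

        BitSet termA = new BitSet();  // documents containing facet value A
        termA.set(5);
        termA.set(20);

        BitSet termB = new BitSet();  // documents containing facet value B
        termB.set(2);
        termB.set(3);

        System.out.println(intersectionSize(base, termA)); // 1
        System.out.println(intersects(base, termA));       // true
        System.out.println(intersects(base, termB));       // false
    }
}
```

For populating a filter list with mincount=1 only the boolean answer is needed, and an existence check can return as soon as it finds one shared document rather than scanning both sets to the end.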
[jira] [Commented] (SOLR-5589) Disabled replication in config is ignored
[ https://issues.apache.org/jira/browse/SOLR-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891240#comment-13891240 ] alexey commented on SOLR-5589: -- I've attached new patch which illustrates and tests my initial concern. I've introduced new config param "replicationEnabled" to control enable\disable replication on master similar to enable\disable command. > Disabled replication in config is ignored > - > > Key: SOLR-5589 > URL: https://issues.apache.org/jira/browse/SOLR-5589 > Project: Solr > Issue Type: Bug > Components: replication (java) >Affects Versions: 4.5 >Reporter: alexey >Assignee: Shalin Shekhar Mangar > Fix For: 4.7 > > Attachments: SOLR-5589.patch, SOLR-5589.patch, SOLR-5589.patch, > SOLR-5589.patch, SOLR-5589.patch > > > When replication on master node is explicitly disabled in config, it is still > enabled after start. This is because when both master and slave > configurations are written with enabled=false, replication handler considers > this node is a master and enables it. With proposed patch handler will > consider this as master node but will disable replication on startup if it is > disabled in config (equivalent to disablereplication command). -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5589) Disabled replication in config is ignored
[ https://issues.apache.org/jira/browse/SOLR-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] alexey updated SOLR-5589: - Attachment: SOLR-5589.patch > Disabled replication in config is ignored > - > > Key: SOLR-5589 > URL: https://issues.apache.org/jira/browse/SOLR-5589 > Project: Solr > Issue Type: Bug > Components: replication (java) >Affects Versions: 4.5 >Reporter: alexey >Assignee: Shalin Shekhar Mangar > Fix For: 4.7 > > Attachments: SOLR-5589.patch, SOLR-5589.patch, SOLR-5589.patch, > SOLR-5589.patch, SOLR-5589.patch > > > When replication on master node is explicitly disabled in config, it is still > enabled after start. This is because when both master and slave > configurations are written with enabled=false, replication handler considers > this node is a master and enables it. With proposed patch handler will > consider this as master node but will disable replication on startup if it is > disabled in config (equivalent to disablereplication command). -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5589) Disabled replication in config is ignored
[ https://issues.apache.org/jira/browse/SOLR-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884259#comment-13884259 ]

alexey commented on SOLR-5589:
------------------------------

Vitaly, OK, it looks like my initial patch was not complete. Thank you for working on it.
What if the attribute were optional and used only for explicit disabling when someone needs it? A missing attribute is treated as true to provide backwards compatibility.
Ideally the following config should be enough to say that we want replication, but disabled at startup:
false commit schema.xml
So the patch could be changed to something like:
if (disabledExplicitly(slave) || disabledExplicitly(master)) {
    replicationEnabled.set(false);
}

> Disabled replication in config is ignored
> -----------------------------------------
>
> Key: SOLR-5589
> URL: https://issues.apache.org/jira/browse/SOLR-5589
> Project: Solr
> Issue Type: Bug
> Components: replication (java)
> Affects Versions: 4.5
> Reporter: alexey
> Assignee: Shalin Shekhar Mangar
> Fix For: 4.7
>
> Attachments: SOLR-5589.patch, SOLR-5589.patch, SOLR-5589.patch,
> SOLR-5589.patch
>
>
> When replication on the master node is explicitly disabled in config, it is still
> enabled after start. This is because when both master and slave
> configurations are written with enabled=false, the replication handler considers
> this node a master and enables it. With the proposed patch the handler will
> consider this a master node but will disable replication on startup if it is
> disabled in config (equivalent to the disablereplication command).
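The condition suggested in the comment above can be fleshed out as a small self-contained sketch. Everything here is an assumption made for illustration: the config sections are modeled as plain string maps, the attribute key is taken to be "enable", and the class and helper names are hypothetical. It is not the code from any of the attached patches.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

public class ReplicationFlagSketch {

    /** True only when the section exists and explicitly says enable=false. */
    public static boolean disabledExplicitly(Map<String, String> section) {
        if (section == null) {
            return false; // section not configured at all
        }
        String value = section.get("enable"); // assumed attribute name
        // A missing attribute counts as enabled, for backwards compatibility.
        return value != null && "false".equalsIgnoreCase(value.trim());
    }

    /** Decides the startup state of replication from the two config sections. */
    public static boolean replicationEnabledAtStartup(Map<String, String> master,
                                                      Map<String, String> slave) {
        AtomicBoolean replicationEnabled = new AtomicBoolean(true);
        if (disabledExplicitly(slave) || disabledExplicitly(master)) {
            replicationEnabled.set(false);
        }
        return replicationEnabled.get();
    }

    public static void main(String[] args) {
        Map<String, String> disabledMaster = Collections.singletonMap("enable", "false");
        Map<String, String> plainMaster = Collections.singletonMap("replicateAfter", "commit");

        System.out.println(replicationEnabledAtStartup(disabledMaster, null)); // false
        System.out.println(replicationEnabledAtStartup(plainMaster, null));    // true
    }
}
```

The point of the helper is the three-way distinction: a section that is absent, a section without the attribute, and a section that explicitly disables replication. Only the last one turns replication off at startup.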
[jira] [Updated] (SOLR-5589) Disabled replication in config is ignored
[ https://issues.apache.org/jira/browse/SOLR-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] alexey updated SOLR-5589: - Attachment: SOLR-5589.patch > Disabled replication in config is ignored > - > > Key: SOLR-5589 > URL: https://issues.apache.org/jira/browse/SOLR-5589 > Project: Solr > Issue Type: Bug > Components: replication (java) >Affects Versions: 4.5 >Reporter: alexey > Fix For: 4.6 > > Attachments: SOLR-5589.patch > > > When replication on master node is explicitly disabled in config, it is still > enabled after start. This is because when both master and slave > configurations are written with enabled=false, replication handler considers > this node is a master and enables it. With proposed patch handler will > consider this as master node but will disable replication on startup if it is > disabled in config (equivalent to disablereplication command). -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5589) Disabled replication in config is ignored
alexey created SOLR-5589: Summary: Disabled replication in config is ignored Key: SOLR-5589 URL: https://issues.apache.org/jira/browse/SOLR-5589 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 4.5 Reporter: alexey Fix For: 4.6 When replication on master node is explicitly disabled in config, it is still enabled after start. This is because when both master and slave configurations are written with enabled=false, replication handler considers this node is a master and enables it. With proposed patch handler will consider this as master node but will disable replication on startup if it is disabled in config (equivalent to disablereplication command). -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794018#comment-13794018 ] alexey commented on SOLR-2548: -- We observed 4x speedup when calculating 14 facets in 6 threads for 200mln index. Thanks everybody! https://twitter.com/AlexKozhemiakin/status/389688204309196800 > Multithreaded faceting > -- > > Key: SOLR-2548 > URL: https://issues.apache.org/jira/browse/SOLR-2548 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 3.1 >Reporter: Janne Majaranta >Assignee: Erick Erickson >Priority: Minor > Labels: facet > Fix For: 4.5, 5.0 > > Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, > SOLR-2548_multithreaded_faceting,_dsmiley.patch, > SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, > SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, > SOLR-2548.patch > > > Add multithreading support for faceting. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440345#comment-13440345 ] alexey commented on LUCENE-2899: Yes, please, it would be awesome if someone could make this last effort and commit this issue. Many thanks! > Add OpenNLP Analysis capabilities as a module > - > > Key: LUCENE-2899 > URL: https://issues.apache.org/jira/browse/LUCENE-2899 > Project: Lucene - Core > Issue Type: New Feature > Components: modules/analysis >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, > LUCENE-2899.patch, LUCENE-2899.patch, opennlp_trunk.patch > > > Now that OpenNLP is an ASF project and has a nice license, it would be nice > to have a submodule (under analysis) that exposed capabilities for it. Drew > Farris, Tom Morton and I have code that does: > * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it > would have to change slightly to buffer tokens) > * NamedEntity recognition as a TokenFilter > We are also planning a Tokenizer/TokenFilter that can put parts of speech as > either payloads (PartOfSpeechAttribute?) on a token or at the same position. > I'd propose it go under: > modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org