[jira] [Updated] (SOLR-8921) Potential NPE in pivot facet
[ https://issues.apache.org/jira/browse/SOLR-8921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8921:
-------------------------------
    Attachment: SOLR-8921.patch

Simplistic patch handling null queries without trying to guess why they're null. Seems to work, but there might be better solutions.

> Potential NPE in pivot facet
> ----------------------------
>
> Key: SOLR-8921
> URL: https://issues.apache.org/jira/browse/SOLR-8921
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.4.1
> Reporter: Steve Molloy
> Attachments: SOLR-8921.patch
>
> For some queries distributed over multiple collections, I've hit an NPE when SolrIndexSearcher tries to fetch results from cache. Basically, the query generated to compute the pivot on a document subset is null, causing the NPE on lookup.
>
> 2016-03-28 11:34:58.361 ERROR (qtp268141378-751) [c:otif_fr s:shard1 r:core_node1 x:otif_fr_shard1_replica1] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>     at org.apache.solr.util.ConcurrentLFUCache.get(ConcurrentLFUCache.java:92)
>     at org.apache.solr.search.LFUCache.get(LFUCache.java:153)
>     at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:940)
>     at org.apache.solr.search.SolrIndexSearcher.numDocs(SolrIndexSearcher.java:2098)
>     at org.apache.solr.handler.component.PivotFacetProcessor.getSubsetSize(PivotFacetProcessor.java:356)
>     at org.apache.solr.handler.component.PivotFacetProcessor.processSingle(PivotFacetProcessor.java:219)
>     at org.apache.solr.handler.component.PivotFacetProcessor.process(PivotFacetProcessor.java:167)
>     at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:263)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:273)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2073)
>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:457)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:223)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>     at org.eclipse.jetty.server.Server.handle(Server.java:499)
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>     at java.lang.Thread.run(Thread.java:745)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
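The top frame of the SOLR-8921 trace is ConcurrentHashMap.get, which rejects null keys. The following is a minimal, standalone sketch of that failure mode and of a null guard in front of the lookup. NullKeyCacheSketch and its methods are hypothetical stand-ins, not Solr classes, and returning 0 for a null query is an assumption made for illustration; the attached patch may handle the case differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a query-keyed cache; this is not Solr's LFUCache.
final class NullKeyCacheSketch {
    private final Map<String, Integer> cache = new ConcurrentHashMap<>();

    void put(String query, int size) {
        cache.put(query, size);
    }

    // Reproduces the failure mode: ConcurrentHashMap.get(null) throws NPE.
    static boolean lookupThrowsOnNullKey() {
        try {
            new ConcurrentHashMap<String, Integer>().get(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Guarded lookup. Treating a null query as an empty subset (0) is an
    // assumption of this sketch, not necessarily what the patch does.
    int subsetSize(String query) {
        if (query == null) {
            return 0;
        }
        Integer cached = cache.get(query);
        return cached != null ? cached : -1; // -1 marks a cache miss in this sketch
    }

    public static void main(String[] args) {
        NullKeyCacheSketch sketch = new NullKeyCacheSketch();
        sketch.put("category:books", 42);
        System.out.println("unguarded lookup throws: " + lookupThrowsOnNullKey());
        System.out.println("cached query: " + sketch.subsetSize("category:books"));
        System.out.println("null query: " + sketch.subsetSize(null));
    }
}
```

The guard only hides the symptom; it says nothing about why the pivot query ended up null in the first place, which matches the "without trying to guess" wording of the patch comment.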
[jira] [Created] (SOLR-8921) Potential NPE in pivot facet
Steve Molloy created SOLR-8921:
----------------------------------

Summary: Potential NPE in pivot facet
Key: SOLR-8921
URL: https://issues.apache.org/jira/browse/SOLR-8921
Project: Solr
Issue Type: Bug
Affects Versions: 5.4.1
Reporter: Steve Molloy

For some queries distributed over multiple collections, I've hit an NPE when SolrIndexSearcher tries to fetch results from cache. Basically, the query generated to compute the pivot on a document subset is null, causing the NPE on lookup.

2016-03-28 11:34:58.361 ERROR (qtp268141378-751) [c:otif_fr s:shard1 r:core_node1 x:otif_fr_shard1_replica1] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
    at org.apache.solr.util.ConcurrentLFUCache.get(ConcurrentLFUCache.java:92)
    at org.apache.solr.search.LFUCache.get(LFUCache.java:153)
    at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:940)
    at org.apache.solr.search.SolrIndexSearcher.numDocs(SolrIndexSearcher.java:2098)
    at org.apache.solr.handler.component.PivotFacetProcessor.getSubsetSize(PivotFacetProcessor.java:356)
    at org.apache.solr.handler.component.PivotFacetProcessor.processSingle(PivotFacetProcessor.java:219)
    at org.apache.solr.handler.component.PivotFacetProcessor.process(PivotFacetProcessor.java:167)
    at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:263)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:273)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2073)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:457)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:223)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
[jira] [Commented] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204298#comment-15204298 ]

Steve Molloy commented on SOLR-8393:
------------------------------------

h1. Sizing Component

The Solr SizeComponent is intended to compute resource usage information for a given Solr core. It will perform those computations based on current index schema, Solr configuration and document indexed in the core. It is not meant to be distributable, see the cluster sizing action of the collection admin API for more information about sizing distributed collections.

h2. Configuration

The SizeComponent, like any search component except for the base ones, must be defined in the solrconfig.xml file before it can be used. This is done in 2 parts.

1- Declare the component:

2- Use the component in some handler, using the default /select handler will make it easier to use:

... ... size

h2. Usage

Once you have configured the SizeComponent, it can be requested by enabling it in a standard query:

http://localhost:8983/solr/core/select?q=*:*&rows=0&wt=xml&size=true

h3. Parameters

||name||type||default||description||
|size|boolean|false|If set to true, sizing information will be included in response.|
|avgDocSize|long|0|Document size used to compute resource usage. If less than 1, the value will be computed using the content of currently indexed documents.|
|numDocs|long|0|Number of documents to use when computing resource usage. If less than 1, actual number of indexed documents will be used. This parameter will be ignored if estimationRatio is specified.|
|estimationRatio|double|0.0|Ratio used for resource usage estimations. If a value greater than 0.0 is specified, the current number of documents will be multiplied by this ratio in order to determine number of documents to be used when computing resource usage.|
|deletedDocs|long|-|If specified, will be used as number of deleted documents in the index when computing resource usage, otherwise, current number of deleted documents will be used instead.|
|filterCacheMax|long|-|Size of the filter cache to use for computing resource usage, if not specified, current filter cache size will be used.|
|queryResultCacheMax|long|-|Size of the query result cache to use for computing resource usage, if not specified, current query result cache size will be used.|
|documentCacheMax|long|-|Size of the document cache to use for computing resource usage, if not specified, current document cache size will be used.|
|queryResultMaxDocsCached|long|-|Maximum number of documents to cache per entry in query result cache to use for computing resource usage, if not specified, current maximum will be used.|

h3. Response

{code:xml}
0 109 *:* true true 0 xml 199.6 MB 33.35 MB 79.16 MB 2287 89.37 KB 152.94 KB 1,000 KB 44.68 MB 33.35 MB
{code}

||result field|| ||description||
|total-disk-size| |Estimation of total disk space used by the index according to parameters.|
|total-lucene-RAM| |Estimation of index RAM usage specifically for Lucene according to parameters.|
|total-solr-RAM| |Estimation of total index RAM usage for Solr (including Lucene) according to parameters.|
|estimated-num-docs| |Number of documents used for computing estimated values.|
|estimated-doc-size| |Average size of document used for computing estimated values.|
|solr-details|filterCache|Estimated maximum amount of RAM used for caching filters for the index, if cache was filled.|
| |queryResultCache|Estimated maximum amount of RAM used for caching query results for the index, if cache was filled.|
| |documentCache|Estimated maximum amount of RAM used for caching documents for the index, if cache was filled.|
| |luceneRam|Estimated amount of RAM used by Lucene for the index.|

h1. Cluster Sizing

The cluster sizing action of the collection handler is intended to estimate resource usage for a complete Solr cluster. It is based on the Size Component and will perform calls to it internally in order to merge the results and compute aggregated estimations. It does not require any specific configuration, but requires that the SizeComponent is declared and used by the /select handler so that the ClusterSizing action can perform requests to it.

h2. Usage

The cluster sizing action can be accessed through the collections handler:

http://localhost:8983/solr/admin/collections?action=clustersizing

h3. Parameters

All parameters from the SizeComponent, except for size parameter itself, can be passed to the cluster sizing action and will be relayed to the SizeComponent when estimating resource usage. Be
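The numDocs and estimationRatio rows of the parameter table in the SOLR-8393 comment describe a precedence rule that may be easier to follow as code. This is a sketch under assumptions: SizingParamsSketch and effectiveNumDocs are hypothetical names, and rounding the product to the nearest whole document is a choice made here, since the comment does not say how fractional results are handled.

```java
// Hypothetical helper illustrating the documented precedence between
// estimationRatio and numDocs; not taken from the attached patch.
final class SizingParamsSketch {

    // currentDocs: number of documents actually indexed in the core.
    // numDocs: request parameter, default 0 (values below 1 mean "use current").
    // estimationRatio: request parameter, default 0.0 (values above 0.0 win).
    static long effectiveNumDocs(long currentDocs, long numDocs, double estimationRatio) {
        if (estimationRatio > 0.0) {
            // numDocs is ignored when a ratio is given. Rounding is an assumption.
            return Math.round(currentDocs * estimationRatio);
        }
        return numDocs < 1 ? currentDocs : numDocs;
    }

    public static void main(String[] args) {
        System.out.println(effectiveNumDocs(1000, 0, 0.0));    // defaults: current count
        System.out.println(effectiveNumDocs(1000, 5000, 0.0)); // explicit numDocs
        System.out.println(effectiveNumDocs(1000, 5000, 2.5)); // ratio overrides numDocs
    }
}
```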
[jira] [Updated] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-7913:
-------------------------------
    Attachment: SOLR-7913.patch

Propagate content streams to shards so they can parse the request properly.

> Add stream.body support to MLT QParser
> --------------------------------------
>
> Key: SOLR-7913
> URL: https://issues.apache.org/jira/browse/SOLR-7913
> Project: Solr
> Issue Type: Improvement
> Reporter: Anshum Gupta
> Attachments: SOLR-7913.patch, SOLR-7913.patch
>
> Continuing from https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011.
> It'd be good to have stream.body be supported by the mlt qparser.
[jira] [Commented] (SOLR-8649) Fail fast on wrong ZK chroot
[ https://issues.apache.org/jira/browse/SOLR-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139447#comment-15139447 ]

Steve Molloy commented on SOLR-8649:
------------------------------------

Another issue about launching Solr without a chroot to a nonexistent zNode. The attached patch adds a parameter to automatically create the zNode if required.

> Fail fast on wrong ZK chroot
> ----------------------------
>
> Key: SOLR-8649
> URL: https://issues.apache.org/jira/browse/SOLR-8649
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Reporter: Shalin Shekhar Mangar
> Fix For: 5.5, Trunk
>
> A typical scenario is when a user sets up ZK with a chroot /solr, runs Solr and then restarts Solr without specifying the chroot. In the default legacyCloud mode, Solr will happily start and create all ZK nodes as well as collections found on the local cores.
> I've been bitten many times by this and so have more than a few Solr users. In a private discussion, Hoss gave the following idea:
> * We add a command to bin/solr to "prepare" ZooKeeper that accepts the zk host string
> * The command creates the chroot if it does not exist
> * Touches /my-chroot/solr.key file
> * Writes the complete zk host in the solr.in.sh or solr.in.cmd file
> Once we do this, Solr will complain and fail fast if the solr.key file is not found in the given chroot. We could also write a fixed string in the /my-chroot instead of creating a /my-chroot/solr.key file.
> We can do these things automatically when using embedded ZooKeeper.
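The fail-fast idea quoted in SOLR-8649 (refuse to start when the solr.key marker is missing under the given chroot) can be sketched without a ZooKeeper connection. Everything here is hypothetical: ChrootMarkerSketch, requireMarker, and the use of a plain Set of paths as a stand-in for the znode tree are assumptions for illustration, not Solr or ZooKeeper APIs.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch of the "fail fast on missing marker" check.
final class ChrootMarkerSketch {
    // Marker name taken from the issue description.
    static final String MARKER = "/solr.key";

    // znodes is an in-memory stand-in for the paths that exist in ZooKeeper.
    // chroot is "" when no chroot was given, or something like "/solr".
    static void requireMarker(Set<String> znodes, String chroot) {
        String path = chroot + MARKER;
        if (!znodes.contains(path)) {
            throw new IllegalStateException(
                "Marker " + path + " not found; refusing to start");
        }
    }

    public static void main(String[] args) {
        Set<String> znodes = Collections.singleton("/solr/solr.key");
        requireMarker(znodes, "/solr"); // correct chroot: passes silently
        try {
            requireMarker(znodes, ""); // chroot forgotten on restart
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call models the scenario from the description: the same ensemble, but the chroot omitted on restart, so the marker is looked up at the root and not found.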
[jira] [Updated] (SOLR-8604) RealTimeGet and MLT qParser support for collection parameter
[ https://issues.apache.org/jira/browse/SOLR-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8604:
-------------------------------
    Attachment: SOLR-8604.patch

Initial patch using the collection list in realtime get and forwarding it from MLT. Still SVN, haven't moved to Git yet...

> RealTimeGet and MLT qParser support for collection parameter
> ------------------------------------------------------------
>
> Key: SOLR-8604
> URL: https://issues.apache.org/jira/browse/SOLR-8604
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 5.3.1
> Reporter: Steve Molloy
> Attachments: SOLR-8604.patch
>
> The MLT query parser performs a realtime get to fetch a document by id if it is specified. As realtime get works only within a specific collection, so does MLT. If the collection parameter is supplied, it should be used both in MLT qparser and realtime get to fetch the document in any collection, and to use it for similarity in the case of MLT.
[jira] [Created] (SOLR-8604) RealTimeGet and MLT qParser support for collection parameter
Steve Molloy created SOLR-8604:
----------------------------------

Summary: RealTimeGet and MLT qParser support for collection parameter
Key: SOLR-8604
URL: https://issues.apache.org/jira/browse/SOLR-8604
Project: Solr
Issue Type: Improvement
Affects Versions: 5.3.1
Reporter: Steve Molloy

The MLT query parser performs a realtime get to fetch a document by id if it is specified. As realtime get works only within a specific collection, so does MLT. If the collection parameter is supplied, it should be used both in MLT qparser and realtime get to fetch the document in any collection, and to use it for similarity in the case of MLT.
[jira] [Updated] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8393:
-------------------------------
    Attachment: SOLR-8393.patch

Force distrib=false for individual requests

> Component for Solr resource usage planning
> ------------------------------------------
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Attachments: SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch
>
> One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.
> The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.
[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API
[ https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117342#comment-15117342 ]

Steve Molloy commented on SOLR-8589:
------------------------------------

Sure, just wanted to make sure they're linked. I'm already using the other approach, so laziness is pushing me towards keeping it, but really, one or the other works. :)

> Add aliases to the LIST action results in the Collections API
> -------------------------------------------------------------
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 5.4.1
> Reporter: Shawn Heisey
> Assignee: Shawn Heisey
> Priority: Minor
> Attachments: SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, it is not available as a typical query response, I believe it is only available via the http API for zookeeper.
> The results from the LIST action in the Collections API are well-situated to handle this. The current results are contained in a "collections" node, we can simply add an "aliases" node if there are any aliases defined.
[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API
[ https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117295#comment-15117295 ]

Steve Molloy commented on SOLR-8589:
------------------------------------

There's already a ticket about listing aliases.

> Add aliases to the LIST action results in the Collections API
> -------------------------------------------------------------
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 5.4.1
> Reporter: Shawn Heisey
> Assignee: Shawn Heisey
> Priority: Minor
> Attachments: SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, it is not available as a typical query response, I believe it is only available via the http API for zookeeper.
> The results from the LIST action in the Collections API are well-situated to handle this. The current results are contained in a "collections" node, we can simply add an "aliases" node if there are any aliases defined.
[jira] [Updated] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8393:
-------------------------------
    Attachment: SOLR-8393.patch

Adjust to work with small and empty indexes, also add num docs used for estimation in results.

> Component for Solr resource usage planning
> ------------------------------------------
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Attachments: SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch
>
> One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.
> The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.
[jira] [Updated] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8393:
-------------------------------
    Attachment: SOLR-8393.patch

Fix disappearing collections when using collection param (should not be able to modify clusterState's getCollections result...)

> Component for Solr resource usage planning
> ------------------------------------------
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Attachments: SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch
>
> One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.
> The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.
[jira] [Updated] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8393:
-------------------------------
    Attachment: SOLR-8393.patch

Updated patch with extra command on collection handler to get size of whole cluster. Also allow to specify average document size to use and estimation ratio.

> Component for Solr resource usage planning
> ------------------------------------------
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Attachments: SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch
>
> One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.
> The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.
[jira] [Updated] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8393:
-------------------------------
    Attachment: SOLR-8393.patch

Cleaned up version of the same patch.

> Component for Solr resource usage planning
> ------------------------------------------
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Attachments: SOLR-8393.patch, SOLR-8393.patch
>
> One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.
> The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.
[jira] [Updated] (SOLR-8394) Luke handler doesn't support FilterLeafReader
[ https://issues.apache.org/jira/browse/SOLR-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8394:
-------------------------------
    Attachment: SOLR-8394.patch

Simple patch to unwrap LeafReaders if they are FilterLeafReaders, then apply the same logic of exiting if the result is not a SegmentReader.

> Luke handler doesn't support FilterLeafReader
> ---------------------------------------------
>
> Key: SOLR-8394
> URL: https://issues.apache.org/jira/browse/SOLR-8394
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Attachments: SOLR-8394.patch
>
> When fetching index information, luke handler only looks at ramBytesUsed for SegmentReader leaves. If these readers are wrapped in FilterLeafReader, no RAM usage is returned.
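The unwrap-then-check logic described in the SOLR-8394 patch comment can be sketched with stand-in types so that it runs without Lucene on the classpath. Reader, SegmentLike, and FilterLike below are hypothetical substitutes for LeafReader, SegmentReader, and FilterLeafReader; the real classes have much larger APIs, and the actual patch may be structured differently.

```java
// Hypothetical stand-ins for Lucene reader types; not the real classes.
final class UnwrapSketch {
    interface Reader {}

    // Plays the role of a SegmentReader that can report its RAM usage.
    static final class SegmentLike implements Reader {
        final long ramBytes;
        SegmentLike(long ramBytes) { this.ramBytes = ramBytes; }
    }

    // Plays the role of a FilterLeafReader delegating to a wrapped reader.
    static final class FilterLike implements Reader {
        final Reader in;
        FilterLike(Reader in) { this.in = in; }
    }

    // Peel off any number of filter layers.
    static Reader unwrap(Reader reader) {
        while (reader instanceof FilterLike) {
            reader = ((FilterLike) reader).in;
        }
        return reader;
    }

    // Same exit rule as before the fix, applied after unwrapping:
    // anything that is not segment-like contributes nothing.
    static long ramBytesUsed(Reader leaf) {
        Reader core = unwrap(leaf);
        return core instanceof SegmentLike ? ((SegmentLike) core).ramBytes : 0L;
    }

    public static void main(String[] args) {
        Reader segment = new SegmentLike(2048L);
        Reader wrapped = new FilterLike(new FilterLike(segment));
        System.out.println(ramBytesUsed(segment));
        System.out.println(ramBytesUsed(wrapped));
    }
}
```

Without the unwrap step, the wrapped reader would fail the segment check and report no RAM usage, which is the behaviour the issue describes.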
[jira] [Created] (SOLR-8394) Luke handler doesn't support FilterLeafReader
Steve Molloy created SOLR-8394:
----------------------------------

Summary: Luke handler doesn't support FilterLeafReader
Key: SOLR-8394
URL: https://issues.apache.org/jira/browse/SOLR-8394
Project: Solr
Issue Type: Improvement
Reporter: Steve Molloy

When fetching index information, luke handler only looks at ramBytesUsed for SegmentReader leaves. If these readers are wrapped in FilterLeafReader, no RAM usage is returned.
[jira] [Updated] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-8393:
-------------------------------
    Attachment: SOLR-8393.patch

Patch based on replication's disk size estimate and adapted Luke's index RAM estimates. Solr RAM estimates tentatively derived from excel sheet provided in dev-tools.

> Component for Solr resource usage planning
> ------------------------------------------
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Attachments: SOLR-8393.patch
>
> One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.
> The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.
[jira] [Created] (SOLR-8393) Component for Solr resource usage planning
Steve Molloy created SOLR-8393:
----------------------------------

Summary: Component for Solr resource usage planning
Key: SOLR-8393
URL: https://issues.apache.org/jira/browse/SOLR-8393
Project: Solr
Issue Type: Improvement
Reporter: Steve Molloy

One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.

The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.
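The extrapolation idea in the SOLR-8393 description (scale a currently measured resource by the ratio of target to current document count) amounts to a linear projection. The sketch below assumes strictly linear growth, which is a simplification, and uses hypothetical names (ExtrapolationSketch, extrapolate). Returning the measured value unchanged for an empty index is also an assumption; the issue does not say how that case should behave.

```java
// Hypothetical linear projection of a measured resource; not from the patch.
final class ExtrapolationSketch {

    // currentBytes: resource measured today (disk or RAM), in bytes.
    // currentDocs: documents currently in the index.
    // targetDocs: document count to plan for.
    static long extrapolate(long currentBytes, long currentDocs, long targetDocs) {
        if (currentDocs <= 0) {
            // Nothing to derive a ratio from; this fallback is an assumption.
            return currentBytes;
        }
        double ratio = (double) targetDocs / currentDocs;
        return Math.round(currentBytes * ratio);
    }

    public static void main(String[] args) {
        // 1000 bytes for 100 docs today, planning for 400 docs.
        System.out.println(extrapolate(1000L, 100L, 400L));
        // Empty index: the measured value is returned unchanged.
        System.out.println(extrapolate(1000L, 0L, 400L));
    }
}
```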
[jira] [Updated] (SOLR-7642) Should launching Solr in cloud mode using a ZooKeeper chroot create the chroot znode if it doesn't exist?
[ https://issues.apache.org/jira/browse/SOLR-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-7642: --- Attachment: SOLR-7642.patch Here's a patch that creates the znode only if you add the createZkRoot system property. This way users can decide which behaviour they want. > Should launching Solr in cloud mode using a ZooKeeper chroot create the > chroot znode if it doesn't exist? > - > > Key: SOLR-7642 > URL: https://issues.apache.org/jira/browse/SOLR-7642 > Project: Solr > Issue Type: Improvement >Reporter: Timothy Potter >Priority: Minor > Attachments: SOLR-7642.patch > > > If you launch Solr for the first time in cloud mode using a ZooKeeper > connection string that includes a chroot leads to the following > initialization error: > {code} > ERROR - 2015-06-05 17:15:50.410; [ ] org.apache.solr.common.SolrException; > null:org.apache.solr.common.cloud.ZooKeeperException: A chroot was specified > in ZkHost but the znode doesn't exist. localhost:2181/lan > at > org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:113) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:339) > at > org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:140) > at > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:110) > at > org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:138) > at > org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:852) > at > org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:298) > at > org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1349) > at > org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1342) > at > org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:741) > at > org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:505) > {code} > The work-around for this is to use the 
scripts/cloud-scripts/zkcli.sh script > to create the chroot znode (bootstrap action does this). > I'm wondering if we shouldn't just create the znode if it doesn't exist? Or > is that some violation of using a chroot? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
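The behaviour the patch describes (create the chroot znode only when the createZkRoot property is set, otherwise keep failing as today) can be sketched roughly as below. This is an illustrative model only, not the code in SOLR-7642.patch: the function names split_zk_host and plan_startup are hypothetical, and the znode existence check is passed in as a plain boolean rather than queried from ZooKeeper.

```python
def split_zk_host(zk_host):
    """Split a ZooKeeper connection string into (servers, chroot).

    "localhost:2181/lan" -> ("localhost:2181", "/lan")
    A string without a path component has no chroot (None).
    """
    slash = zk_host.find("/")
    if slash == -1:
        return zk_host, None
    chroot = zk_host[slash:].rstrip("/")
    return zk_host[:slash], chroot or None


def plan_startup(zk_host, chroot_exists, create_zk_root=False):
    """Decide what startup should do (hypothetical helper, not Solr code).

    chroot_exists stands in for an existence check against ZooKeeper;
    create_zk_root stands in for the createZkRoot system property.
    """
    servers, chroot = split_zk_host(zk_host)
    if chroot is None or chroot_exists:
        return "connect"
    if create_zk_root:
        return "create " + chroot
    return "fail: chroot znode does not exist"


if __name__ == "__main__":
    print(plan_startup("localhost:2181/lan", chroot_exists=False))
    print(plan_startup("localhost:2181/lan", chroot_exists=False, create_zk_root=True))
```

Keeping the default at "fail" preserves the current behaviour for anyone who relies on a missing chroot being reported as an error.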
[jira] [Commented] (SOLR-8029) Modernize and standardize Solr APIs
[ https://issues.apache.org/jira/browse/SOLR-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987762#comment-14987762 ] Steve Molloy commented on SOLR-8029: I'm +1 for dedicated paths for each resource, in other words, longer paths with collections in them and no special keywords. I personally agree that not having keywords will make things easier to read than having shorter URLs with special keywords. > Modernize and standardize Solr APIs > --- > > Key: SOLR-8029 > URL: https://issues.apache.org/jira/browse/SOLR-8029 > Project: Solr > Issue Type: Improvement >Affects Versions: Trunk >Reporter: Noble Paul >Assignee: Noble Paul > Labels: API, EaseOfUse > Fix For: Trunk > > > Solr APIs have organically evolved and they are sometimes inconsistent with > each other or not in sync with the widely followed conventions of HTTP > protocol. Trying to make incremental changes to make them modern is like > applying band-aid. So, we have done a complete rethink of what the APIs > should be. The most notable aspects of the API are as follows: > The new set of APIs will be placed under a new path {{/solr2}}. The legacy > APIs will continue to work under the {{/solr}} path as they used to and they > will be eventually deprecated. > There are 3 types of requests in the new API > * {{/solr2//*}} : Operations on specific collections > * {{/solr2/_cluster/*}} : Cluster-wide operations which are not specific to > any collections. > * {{/solr2/_node/*}} : Operations on the node receiving the request. This is > the counter part of the core admin API > This will be released as part of a major release. Check the link given below > for the full specification. Your comments are welcome > [Solr API version 2 Specification | http://bit.ly/1JYsBMQ] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8105) Component Support for MLT Handler
[ https://issues.apache.org/jira/browse/SOLR-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935590#comment-14935590 ] Steve Molloy commented on SOLR-8105: And with patch for SOLR-7913, you can pass in text to your request if that is your use case for handler. > Component Support for MLT Handler > - > > Key: SOLR-8105 > URL: https://issues.apache.org/jira/browse/SOLR-8105 > Project: Solr > Issue Type: Improvement >Reporter: Dave Murray > > Would like component support for MLT Handler > Highlighting, Clustering, etc... > I currently have a workaround for highlighting (using interesting terms from > MLT), but clustering is an issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4968) The collection alias api should have a list cmd.
[ https://issues.apache.org/jira/browse/SOLR-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4968: --- Attachment: SOLR-4968.patch Same patch without the eclipse workspace part in the paths (not sure why eclipse added those) > The collection alias api should have a list cmd. > > > Key: SOLR-4968 > URL: https://issues.apache.org/jira/browse/SOLR-4968 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 4.9, Trunk > > Attachments: SOLR-4968.patch, SOLR-4968.patch, SOLR-4968.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-7913: --- Attachment: SOLR-7913.patch Patch allowing the use of stream.body in the mlt QParser. RequestUtil logic and TestRemoteStreaming still need to be aligned, since both would prevent the stream from getting to the QParser in their current state. Hacked the RequestUtil and ignored one of the TestRemoteStreaming tests for now; everything else passes on 5.3.1 code and the patch applies cleanly on trunk. > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta > Attachments: SOLR-7913.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8029) Modernize and standardize Solr APIs
[ https://issues.apache.org/jira/browse/SOLR-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745451#comment-14745451 ] Steve Molloy commented on SOLR-8029: bq. The wt param will still drive the encoding/decoding Are there any plans to support the HTTP Accept header at some point for setting the response type? > Modernize and standardize Solr APIs > --- > > Key: SOLR-8029 > URL: https://issues.apache.org/jira/browse/SOLR-8029 > Project: Solr > Issue Type: Improvement >Affects Versions: 6.0 >Reporter: Noble Paul >Assignee: Noble Paul > Labels: API, EaseOfUse > Fix For: 6.0 > > > Solr APIs have organically evolved and they are sometimes inconsistent with > each other or not in sync with the widely followed conventions of HTTP > protocol. Trying to make incremental changes to make them modern is like > applying band-aid. So, we have done a complete rethink of what the APIs > should be. The most notable aspects of the API are as follows: > The new set of APIs will be placed under a new path {{/solr2}}. The legacy > APIs will continue to work under the {{/solr}} path as they used to and they > will be eventually deprecated. > There are 3 types of requests in the new API > * {{/solr2//*}} : Operations on specific collections > * {{/solr2/_cluster/*}} : Cluster-wide operations which are not specific to > any collections. > * {{/solr2/_node/*}} : Operations on the node receiving the request. This is > the counter part of the core admin API > This will be released as part of a major release. Check the link given below > for the full specification. Your comments are welcome > [Solr API version 2 Specification | http://bit.ly/1JYsBMQ] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4968) The collection alias api should have a list cmd.
[ https://issues.apache.org/jira/browse/SOLR-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4968: --- Attachment: SOLR-4968.patch Adapted for 5.3.1, seems to apply well on trunk, but not tested. > The collection alias api should have a list cmd. > > > Key: SOLR-4968 > URL: https://issues.apache.org/jira/browse/SOLR-4968 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 4.9, Trunk > > Attachments: SOLR-4968.patch, SOLR-4968.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8029) Modernize and standardize Solr APIs
[ https://issues.apache.org/jira/browse/SOLR-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738875#comment-14738875 ] Steve Molloy commented on SOLR-8029: Yes, make the current version: /v1/{api} Make the new version: /v2/{api} And have /solr point to a configurable version, probably /v1 by default at first: /solr/collection/select => /v1/collection/select /v1/collection/select => Same as current /solr/collection/select /v2/collection/select => New API for collection operations. This way, existing clients get the existing behaviour. Clients that wish to migrate progressively can use both /solr pointing to /v1 and /v2 in new calls. Completely new clients can either use /v2 or configure Solr so /solr points to /v2 and use that, meaning: /solr/collection/select => /v2/collection/select /v1/collection/select => current API /v2/collection/select => new API. > Modernize and standardize Solr APIs > --- > > Key: SOLR-8029 > URL: https://issues.apache.org/jira/browse/SOLR-8029 > Project: Solr > Issue Type: Improvement >Affects Versions: 6.0 >Reporter: Noble Paul >Assignee: Noble Paul > Labels: API, EaseOfUse > Fix For: 6.0 > > > Solr APIs have organically evolved and they are sometimes inconsistent with > each other or not in sync with the widely followed conventions of HTTP > protocol. Trying to make incremental changes to make them modern is like > applying band-aid. So, we have done a complete rethink of what the APIs > should be. The most notable aspects of the API are as follows: > The new set of APIs will be placed under a new path {{/solr2}}. The legacy > APIs will continue to work under the {{/solr}} path as they used to and they > will be eventually deprecated. > There are 3 types of requests in the new API > * {{/solr2//*}} : Operations on specific collections > * {{/solr2/_cluster/*}} : Cluster-wide operations which are not specific to > any collections. 
> * {{/solr2/_node/*}} : Operations on the node receiving the request. This is > the counter part of the core admin API > This will be released as part of a major release. Check the link given below > for the full specification. Your comments are welcome > [Solr API version 2 Specification | http://bit.ly/1JYsBMQ] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
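The routing proposed in the comment above (versioned prefixes /v1 and /v2, with /solr acting as an alias for a configurable default version) could be modelled along these lines. It is only a sketch of the idea under discussion: the resolve function and its default_version argument are hypothetical names, not an existing Solr setting.

```python
SUPPORTED_VERSIONS = ("v1", "v2")


def resolve(path, default_version="v1"):
    """Map a request path to its versioned equivalent.

    /solr/... is rewritten to the configured default version;
    /v1/... and /v2/... are passed through unchanged.
    """
    if default_version not in SUPPORTED_VERSIONS:
        raise ValueError("unknown default version: " + default_version)
    parts = path.strip("/").split("/", 1)
    prefix = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    if prefix == "solr":
        prefix = default_version
    if prefix not in SUPPORTED_VERSIONS:
        raise ValueError("unknown API prefix: " + prefix)
    return "/" + prefix + ("/" + rest if rest else "")


if __name__ == "__main__":
    # Existing clients keep working while /solr points at v1.
    print(resolve("/solr/collection/select"))
    # After migration, the operator flips the default to v2.
    print(resolve("/solr/collection/select", default_version="v2"))
```

With this shape, adding a later v3 would only mean extending SUPPORTED_VERSIONS, which matches the point made in the thread about future API versions.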
[jira] [Commented] (SOLR-8029) Modernize and standardize Solr APIs
[ https://issues.apache.org/jira/browse/SOLR-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738857#comment-14738857 ] Steve Molloy commented on SOLR-8029: bq. Changing stuff abruptly will infuriate users. All the existing apps should work when they move to new Solr. If we fail to do that we will hamper adoption. We should give users a painless migration path. Agreed, this is why I propose to have the current API URL point to the URL with /v1 or /v2 in it. Making the choice of default version configurable would allow people to use the API they want as they were using it in the previous version, then start migrating slowly, at their own pace, to the new version by using the /v2 URL in client code using the new API. Once everything is updated, they could change the default version configured and not have to change their client code. With this approach, the same would apply if in some years we decide to have a v3 API for whatever reason. > Modernize and standardize Solr APIs > --- > > Key: SOLR-8029 > URL: https://issues.apache.org/jira/browse/SOLR-8029 > Project: Solr > Issue Type: Improvement >Affects Versions: 6.0 >Reporter: Noble Paul >Assignee: Noble Paul > Labels: API, EaseOfUse > Fix For: 6.0 > > > Solr APIs have organically evolved and they are sometimes inconsistent with > each other or not in sync with the widely followed conventions of HTTP > protocol. Trying to make incremental changes to make them modern is like > applying band-aid. So, we have done a complete rethink of what the APIs > should be. The most notable aspects of the API are as follows: > The new set of APIs will be placed under a new path {{/solr2}}. The legacy > APIs will continue to work under the {{/solr}} path as they used to and they > will be eventually deprecated. 
> There are 3 types of requests in the new API > * {{/solr2//*}} : Operations on specific collections > * {{/solr2/_cluster/*}} : Cluster-wide operations which are not specific to > any collections. > * {{/solr2/_node/*}} : Operations on the node receiving the request. This is > the counter part of the core admin API > This will be released as part of a major release. Check the link given below > for the full specification. Your comments are welcome > [Solr API version 2 Specification | http://bit.ly/1JYsBMQ] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8029) Modernize and standardize Solr APIs
[ https://issues.apache.org/jira/browse/SOLR-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738840#comment-14738840 ] Steve Molloy commented on SOLR-8029: Having version in URL is pretty common and makes sense to me. The old API could be made into v1, pointing the root /solr to v2 by default, with option to configure it to v1 for people needing to support backward compatibility with absolutely no impact on their existing client applications. > Modernize and standardize Solr APIs > --- > > Key: SOLR-8029 > URL: https://issues.apache.org/jira/browse/SOLR-8029 > Project: Solr > Issue Type: Improvement >Affects Versions: 6.0 >Reporter: Noble Paul >Assignee: Noble Paul > Labels: API, EaseOfUse > Fix For: 6.0 > > > Solr APIs have organically evolved and they are sometimes inconsistent with > each other or not in sync with the widely followed conventions of HTTP > protocol. Trying to make incremental changes to make them modern is like > applying band-aid. So, we have done a complete rethink of what the APIs > should be. The most notable aspects of the API are as follows: > The new set of APIs will be placed under a new path {{/solr2}}. The legacy > APIs will continue to work under the {{/solr}} path as they used to and they > will be eventually deprecated. > There are 3 types of requests in the new API > * {{/solr2//*}} : Operations on specific collections > * {{/solr2/_cluster/*}} : Cluster-wide operations which are not specific to > any collections. > * {{/solr2/_node/*}} : Operations on the node receiving the request. This is > the counter part of the core admin API > This will be released as part of a major release. Check the link given below > for the full specification. 
Your comments are welcome > [Solr API version 2 Specification | http://bit.ly/1JYsBMQ]
[jira] [Commented] (SOLR-5606) REST based Collections API
[ https://issues.apache.org/jira/browse/SOLR-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646453#comment-14646453 ] Steve Molloy commented on SOLR-5606: bq. The tool that you can use now is a browser. To some people, anything else is a special tool. Couldn't it be done through Solr Admin UI? I think the API and the admin UI have different target audience, one for devs and the other for administrators. > REST based Collections API > -- > > Key: SOLR-5606 > URL: https://issues.apache.org/jira/browse/SOLR-5606 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Jan Høydahl >Priority: Minor > Fix For: Trunk > > > For consistency reasons, the collections API (and other admin APIs) should be > REST based. Spinoff from SOLR-1523 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5606) REST based Collections API
[ https://issues.apache.org/jira/browse/SOLR-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646282#comment-14646282 ] Steve Molloy commented on SOLR-5606: Makes sense. Basically, what I'm trying to get at is that action to be performed should not be part of query parameters (or POST body which is pretty much the same). It should be represented by the HTTP method used as much as possible, POST to create, PUT to modify, GET to list/retrieve info, DELETE to delete. There are some operations that cannot fit this, but we should go that route when possible. > REST based Collections API > -- > > Key: SOLR-5606 > URL: https://issues.apache.org/jira/browse/SOLR-5606 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Jan Høydahl >Priority: Minor > Fix For: Trunk > > > For consistency reasons, the collections API (and other admin APIs) should be > REST based. Spinoff from SOLR-1523 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
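The convention argued for in the comment above, where the HTTP method rather than a query parameter carries the action, amounts to a small lookup. The table below restates the mapping from the comment; the function name action_for is hypothetical, and operations that do not fit the four verbs are simply reported as unsupported here.

```python
# Mapping taken from the comment: POST creates, PUT modifies,
# GET lists or retrieves, DELETE deletes.
METHOD_TO_ACTION = {
    "POST": "create",
    "PUT": "modify",
    "GET": "retrieve",
    "DELETE": "delete",
}


def action_for(method):
    """Return the action implied by an HTTP method, case-insensitively."""
    return METHOD_TO_ACTION.get(method.upper(), "unsupported")


if __name__ == "__main__":
    for verb in ("POST", "PUT", "GET", "DELETE", "PATCH"):
        print(verb, "->", action_for(verb))
```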
[jira] [Commented] (SOLR-5606) REST based Collections API
[ https://issues.apache.org/jira/browse/SOLR-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646266#comment-14646266 ] Steve Molloy commented on SOLR-5606: I would expect POST on /solr/admin/collections/gettingstarted/replicas with optional content {"node":"192.168.0.1_solr"}, node being chosen randomly if not specified. You could then list a collection's replicas using GET on /solr/admin/collections/gettingstarted/replicas, same for shards on /solr/admin/collections/gettingstarted/shards. In the case of shards, there's also the split shard to consider, which I guess could either be POST on /solr/admin/collections/gettingstarted/shards/shard1?action=split or a PUT on the same endpoint. Not sure what the best approach is here as you're effectively creating new resources (2 new shards) from an existing one. Maybe a dedicated shard action endpoint for shard actions. The other approach would be to consider replicas and shards as part of the collection metadata, implying that adding a replica would be done through a PUT on /solr/admin/collections/gettingstarted with content like {"name":"gettingstarted","replicas":[{"shard":"shard1","node":"192.168.0.1_solr"},{"shard":"shard1","node":"192.168.0.2_solr"}]}. This works in theory but brings complications such as what to do if an existing replica is not in the PUT sent. Would that mean delete it, or keep it and just create new ones for what the user sent? I think having sub-resources for replicas and shards would make it easier to maintain and understand. > REST based Collections API > -- > > Key: SOLR-5606 > URL: https://issues.apache.org/jira/browse/SOLR-5606 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Jan Høydahl >Priority: Minor > Fix For: Trunk > > > For consistency reasons, the collections API (and other admin APIs) should be > REST based. 
Spinoff from SOLR-1523
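The sub-resource approach from the comment above (POST on .../replicas with an optional node, GET on the same path to list) can be tried out with a toy in-memory model. Everything here is hypothetical scaffolding for the discussion: the Collection class, its handle method and the live_nodes list are not Solr classes, and no real replica is created.

```python
import random


class Collection:
    """Toy stand-in for a collection exposing a 'replicas' sub-resource."""

    def __init__(self, name, live_nodes):
        self.name = name
        self.live_nodes = list(live_nodes)
        self.replicas = []

    def handle(self, method, sub_resource, body=None):
        if sub_resource != "replicas":
            raise ValueError("unknown sub-resource: " + sub_resource)
        if method == "GET":
            # List the replicas currently known for this collection.
            return list(self.replicas)
        if method == "POST":
            # "node" is optional; pick a live node at random when absent.
            node = (body or {}).get("node") or random.choice(self.live_nodes)
            replica = {"node": node}
            self.replicas.append(replica)
            return replica
        raise ValueError("unsupported method: " + method)


if __name__ == "__main__":
    coll = Collection("gettingstarted", ["192.168.0.1_solr", "192.168.0.2_solr"])
    coll.handle("POST", "replicas", {"node": "192.168.0.1_solr"})
    coll.handle("POST", "replicas")
    print(coll.handle("GET", "replicas"))
```

Because each POST adds exactly one replica, the question raised about PUT on the whole collection (what to do with replicas missing from the payload) does not arise in this model.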
[jira] [Comment Edited] (SOLR-5606) REST based Collections API
[ https://issues.apache.org/jira/browse/SOLR-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645950#comment-14645950 ] Steve Molloy edited comment on SOLR-5606 at 7/29/15 12:11 PM: -- I guess others have already replied, but just to make sure my comment is not perceived as contradictory... I don't have an issue with /solr/admin/collections, /solr/admin/cores, etc. I think it actually makes a lot of sense. I wasn't debating about the actual endpoint but rather about the fact that the endpoint could work with multiple resource types with that "type" parameter. So last comment by [~noble.paul] makes perfect sense to me. Basically, /solr/admin/collections would manage collections, while /solr// would work with documents in it, so it makes sense to have them separate. was (Author: smolloy): I guess others have already replied, but just to make sure my comment is not perceived as contradictory... I don't have an issue with /solr/admin/collections, /solr/admin/cores, etc. I think it actually makes a lot of sense. I wasn't debating about the actual endpoint but rather about the fact that the endpoint could work with multiple resource types with that "type" parameter. So last comment by [~noble.paul] makes perfect sense to me. Basically, /solr/admin/collections would manage collections, while /solr/{collection}/{handler} would work with documents in it, so it makes sense to have them separate. > REST based Collections API > -- > > Key: SOLR-5606 > URL: https://issues.apache.org/jira/browse/SOLR-5606 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Jan Høydahl >Priority: Minor > Fix For: Trunk > > > For consistency reasons, the collections API (and other admin APIs) should be > REST based. Spinoff from SOLR-1523 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5606) REST based Collections API
[ https://issues.apache.org/jira/browse/SOLR-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645950#comment-14645950 ] Steve Molloy commented on SOLR-5606: I guess others have already replied, but just to make sure my comment is not perceived as contradictory... I don't have an issue with /solr/admin/collections, /solr/admin/cores, etc. I think it actually makes a lot of sense. I wasn't debating about the actual endpoint but rather about the fact that the endpoint could work with multiple resource types with that "type" parameter. So last comment by [~noble.paul] makes perfect sense to me. Basically, /solr/admin/collections would manage collections, while /solr/{collection}/{handler} would work with documents in it, so it makes sense to have them separate. > REST based Collections API > -- > > Key: SOLR-5606 > URL: https://issues.apache.org/jira/browse/SOLR-5606 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Jan Høydahl >Priority: Minor > Fix For: Trunk > > > For consistency reasons, the collections API (and other admin APIs) should be > REST based. Spinoff from SOLR-1523 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5606) REST based Collections API
[ https://issues.apache.org/jira/browse/SOLR-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645018#comment-14645018 ] Steve Molloy commented on SOLR-5606: I agree with pretty much all that is being said, with one big exception on the "type" parameter. If you want some REST-like API, you should not get different types of results depending on query parameters. So I'd definitely recommend keeping the /solr/collection approach, and have /solr/core for cores, /solr/schema, /solr/alias, etc... All this can live very well with bulk operations, you could either PUT a single schema to update it, or POST a list of collection information to create multiple collections in bulk, but on the right resource type. Not only is it cleaner to have the resource you want to interact with part of the URL used to access it, it would also make it much easier to integrate with standard client libraries for REST APIs. > REST based Collections API > -- > > Key: SOLR-5606 > URL: https://issues.apache.org/jira/browse/SOLR-5606 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Jan Høydahl >Priority: Minor > Fix For: Trunk > > > For consistency reasons, the collections API (and other admin APIs) should be > REST based. Spinoff from SOLR-1523 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383792#comment-14383792 ] Steve Molloy commented on SOLR-4212: Indeed, we need to find a way to reconcile, but as the JSON API will be experimental in 5.1, I don't think we should stop this new functionality from getting in. It's in line with all the other work under this umbrella, some of which is already in 5.0. Whether it ends up in Lucene, Solr facet component or facet module, the functionality is something needed and I don't think we should hold it up as it is still the official facet implementation after all... > Let facet queries hang off of pivots > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy >Assignee: Shalin Shekhar Mangar > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-6353-6686-4212.patch, > SOLR-6353-6686-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to use local parameters to specify facet query to apply > for pivot which matches a tag set on facet query. 
Parameter format would look > like: > facet.pivot={!query=r1}category,manufacturer > facet.query={!tag=r1}somequery > facet.query={!tag=r1}somedate:[NOW-1YEAR TO NOW] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
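For anyone wanting to try the parameter format quoted above from a client, the repeated facet.query parameters and the local-param braces need URL encoding. The sketch below only builds the query string; the q=*:* and facet=true entries are assumptions added to make a complete request, and the {!query=r1} local param is the syntax proposed in this issue, so whether a given Solr version accepts it depends on the patch being applied.

```python
from urllib.parse import urlencode, parse_qsl

# A list of pairs (not a dict) so that facet.query can appear twice.
params = [
    ("q", "*:*"),          # assumption: match-all main query
    ("facet", "true"),     # assumption: faceting switched on
    ("facet.pivot", "{!query=r1}category,manufacturer"),
    ("facet.query", "{!tag=r1}somequery"),
    ("facet.query", "{!tag=r1}somedate:[NOW-1YEAR TO NOW]"),
]

query_string = urlencode(params)

if __name__ == "__main__":
    print(query_string)
    # Decoding gives back the original pairs, in order.
    print(parse_qsl(query_string) == params)
```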
[jira] [Commented] (SOLR-6350) Percentiles in StatsComponent
[ https://issues.apache.org/jira/browse/SOLR-6350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377812#comment-14377812 ] Steve Molloy commented on SOLR-6350: Agree with the separate issue to track this, created SOLR-7296 for it. > Percentiles in StatsComponent > - > > Key: SOLR-6350 > URL: https://issues.apache.org/jira/browse/SOLR-6350 > Project: Solr > Issue Type: Sub-task >Reporter: Hoss Man > Attachments: SOLR-6350-Xu.patch, SOLR-6350-Xu.patch, > SOLR-6350-xu.patch, SOLR-6350-xu.patch, SOLR-6350.patch, SOLR-6350.patch, > SOLR-6350.patch, SOLR-6350.patch > > > Add an option to compute user specified percentiles when computing stats > Example... > {noformat} > stats.field={!percentiles='1,2,98,99,99.999'}price > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7214) JSON Facet API
[ https://issues.apache.org/jira/browse/SOLR-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377811#comment-14377811 ] Steve Molloy commented on SOLR-7214: I've created SOLR-7296 to deal with this multitude of implementations. Feel free to comment, contribute, insult, etc. :) > JSON Facet API > -- > > Key: SOLR-7214 > URL: https://issues.apache.org/jira/browse/SOLR-7214 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley > Attachments: SOLR-7214.patch > > > Overview is here: http://yonik.com/json-facet-api/ > The structured nature of nested sub-facets are more naturally expressed in a > nested structure like JSON rather than the flat structure that normal query > parameters provide. > Goals: > - First class JSON support > - Easier programmatic construction of complex nested facet commands > - Support a much more canonical response format that is easier for clients to > parse > - First class analytics support > - Support a cleaner way to do distributed faceting > - Support better integration with other search features -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-7296) Reconcile facetting implementations
Steve Molloy created SOLR-7296: -- Summary: Reconcile facetting implementations Key: SOLR-7296 URL: https://issues.apache.org/jira/browse/SOLR-7296 Project: Solr Issue Type: Task Components: faceting Reporter: Steve Molloy SOLR-7214 introduced a new way of controlling faceting, while the umbrella SOLR-6348 brings a lot of improvements in facet functionality, namely around pivots. Both make a lot of sense from a user perspective, but currently have completely different implementations. With the analytics components, this makes 3 implementations of the same logic, which are bound to behave differently as time goes by. We should reconcile all implementations to ease maintenance and offer consistent behaviour no matter how parameters are passed to the API. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7214) JSON Facet API
[ https://issues.apache.org/jira/browse/SOLR-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371242#comment-14371242 ] Steve Molloy commented on SOLR-7214: I think the underlying implementations should be shared. While I agree that SimpleFacets got to the point of being anything but simple... I don't think having completely separate implementations will help at all. Haven't looked at the new code yet, but how hard would it be to roll in pivot stats (and everything under SOLR-6350) into this implementation? I'm thinking that while this new way of passing parameters to faceting is good, we'll still need to support the old way to avoid any pains for users currently doing it the old way. And this should be perfectly fine as after all, we're talking about how to pass parameters, not what to do about them. So, whatever underlying implementation is more solid, easier to maintain, evolve, etc., we should use that and have all functionality work with it. If this new implementation supports all Solr needs, then let's simply have a layer that can parse parameters into a JSON format that will be provided to it. If it's the other way around, let's parse the JSON into parameters for the facet processing. Either way, we should decouple the way to provide parameters from the actual processing, and we should have a single way of performing that processing for facets... > JSON Facet API > -- > > Key: SOLR-7214 > URL: https://issues.apache.org/jira/browse/SOLR-7214 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley > Attachments: SOLR-7214.patch > > > Overview is here: http://yonik.com/json-facet-api/ > The structured nature of nested sub-facets are more naturally expressed in a > nested structure like JSON rather than the flat structure that normal query > parameters provide. 
> Goals: > - First class JSON support > - Easier programmatic construction of complex nested facet commands > - Support a much more canonical response format that is easier for clients to > parse > - First class analytics support > - Support a cleaner way to do distributed faceting > - Support better integration with other search features -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6663) decide rules & syntax for computing stats/ranges/queries only at certain levels of a pivot
[ https://issues.apache.org/jira/browse/SOLR-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368940#comment-14368940 ]

Steve Molloy commented on SOLR-6663:
------------------------------------

+1 This seems straightforward and flexible enough for every case I can think of. I also agree with the single tag name case to preserve current behaviour.

> decide rules & syntax for computing stats/ranges/queries only at certain
> levels of a pivot
> ------------------------------------------------------------------------
>
> Key: SOLR-6663
> URL: https://issues.apache.org/jira/browse/SOLR-6663
> Project: Solr
> Issue Type: Sub-task
> Reporter: Hoss Man
>
> [~smolloy] asked a great question in SOLR-6351...
> bq. One more question around this, which applies for SOLR-6353 and SOLR-4212
> as well. Should we have a syntax to apply stats/queries/ranges only at
> specific levels in the pivot hierarchy? It would reduce amount of computation
> and size of response for cases where you only need it at a specific level
> (usually last level I guess).
> I'm splitting this off into its own Sub-task for further discussion.
>
> For now, the "stats" localparam must be a single tag, and the "work around"
> is to add a common tag to all stats you want to use.
> ie, this will cause an error...
> {noformat}
> stats.field={!tag=tagA}price
> stats.field={!tag=tagB}popularity
> stats.field={!tag=tagB}clicks
> facet.pivot={!stats=tagA,tagB}xxx,yyy,zz
> {noformat}
> but this will work...
> {noformat}
> stats.field={!tag=tagA,tagPivot}price
> stats.field={!tag=tagB,tagPivot}popularity
> stats.field={!tag=tagB,tagPivot}clicks
> facet.pivot={!stats=tagPivot}xxx,yyy,zz
> {noformat}
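The common-tag workaround described in the issue can be pictured with a small sketch. This is not Solr code: the class and method names below are made up for illustration, and it only models the idea that a pivot's single "stats" tag selects every stats.field carrying that tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only, not Solr code: models a single pivot "stats" tag selecting stats.field entries. */
final class PivotStatsTagSketch {

    /** Returns the fields whose tag list contains the given tag, in declaration order. */
    static List<String> fieldsForTag(Map<String, List<String>> tagsByField, String pivotTag) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : tagsByField.entrySet()) {
            if (entry.getValue().contains(pivotTag)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    /** Mirrors the working request from the issue: every stats.field also carries tagPivot. */
    static Map<String, List<String>> exampleTags() {
        Map<String, List<String>> tags = new LinkedHashMap<>();
        tags.put("price", Arrays.asList("tagA", "tagPivot"));
        tags.put("popularity", Arrays.asList("tagB", "tagPivot"));
        tags.put("clicks", Arrays.asList("tagB", "tagPivot"));
        return tags;
    }
}
```

With these example tags, asking for "tagPivot" selects all three fields, while "tagB" alone selects only popularity and clicks, which is why the shared tag is needed when the localparam accepts a single tag.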
[jira] [Commented] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357519#comment-14357519 ]

Steve Molloy commented on SOLR-4212:
------------------------------------

[~shalinmangar] Thanks for updating the patch. I took a quick look and it seems good; I will try to find time to actually try it tomorrow.

> Let facet queries hang off of pivots
> ------------------------------------
>
> Key: SOLR-4212
> URL: https://issues.apache.org/jira/browse/SOLR-4212
> Project: Solr
> Issue Type: Sub-task
> Components: search
> Affects Versions: 4.0
> Reporter: Steve Molloy
> Assignee: Shalin Shekhar Mangar
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch,
> SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch,
> SOLR-4212.patch, SOLR-4212.patch, SOLR-6353-6686-4212.patch, patch-4212.txt
>
>
> Facet pivots provide hierarchical support for computing data used to populate
> a treemap or similar visualization. TreeMaps usually offer users extra
> information by applying an overlay color on top of the existing square sizes
> based on hierarchical counts. This second count is based on user choices,
> representing, usually with a gradient, the proportion of the square that fits
> the user's choices.
> The proposition is to use local parameters to specify the facet query to apply
> for a pivot, which matches a tag set on the facet query. Parameter format would look
> like:
> facet.pivot={!query=r1}category,manufacturer
> facet.query={!tag=r1}somequery
> facet.query={!tag=r1}somedate:[NOW-1YEAR TO NOW]
[jira] [Commented] (SOLR-7127) Add method to CloudSolrClient to create per-collection clients
[ https://issues.apache.org/jira/browse/SOLR-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1416#comment-1416 ]

Steve Molloy commented on SOLR-7127:
------------------------------------

bq. How about instead of returning a vanilla SolrClient, we return a CollectionSolrClient, which is an extension of CloudSolrClient with all the setters overridden to throw UnsupportedOperationException. That also makes it easier to track whether or not we should close resources, etc.

I think it would be better to extend SolrClient and expose only the methods that make sense, instead of extending CloudSolrClient and having a bunch of methods that simply throw exceptions. I like the idea of having a CollectionSolrClient (or whatever other name makes sense), but exposing a bunch of methods that simply throw exceptions is counter-intuitive for people using the API.

> Add method to CloudSolrClient to create per-collection clients
> --------------------------------------------------------------
>
> Key: SOLR-7127
> URL: https://issues.apache.org/jira/browse/SOLR-7127
> Project: Solr
> Issue Type: Improvement
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Minor
> Attachments: SOLR-7127.patch, SOLR-7127.patch
>
>
> CloudSolrClient isn't thread-safe if you're making requests to multiple
> collections, because defaultCollection is mutable. This can be a pain if
> you're trying to index into multiple collections from a single queue of
> documents.
> This issue adds a .getCollectionClient(String) method to CloudSolrClient that
> returns a child client directed at that collection. Under the hood it's
> another CloudSolrClient sharing its resources with the parent client, but
> with a separate default collection set. The method returns a SolrClient,
> however, so you can't then change the collection unless you explicitly cast
> it.
> Sort of related to what I wanted to do on SOLR-6894, but this is more
> focussed.
[jira] [Commented] (SOLR-7127) Add method to CloudSolrClient to create per-collection clients
[ https://issues.apache.org/jira/browse/SOLR-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329128#comment-14329128 ]

Steve Molloy commented on SOLR-7127:
------------------------------------

bq. Anyway, are we sure this whole parent / child / juggle lots of clients is really better than adding some new methods that allow passing the collection to use?

I agree. Having a single instance on which to call request(req, collection), which would set the collection param in the request before handing it to the normal request(req), sounds much simpler and reduces the number of live objects at the same time.

> Add method to CloudSolrClient to create per-collection clients
> --------------------------------------------------------------
>
> Key: SOLR-7127
> URL: https://issues.apache.org/jira/browse/SOLR-7127
> Project: Solr
> Issue Type: Improvement
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Minor
> Attachments: SOLR-7127.patch, SOLR-7127.patch
>
>
> CloudSolrClient isn't thread-safe if you're making requests to multiple
> collections, because defaultCollection is mutable. This can be a pain if
> you're trying to index into multiple collections from a single queue of
> documents.
> This issue adds a .getCollectionClient(String) method to CloudSolrClient that
> returns a child client directed at that collection. Under the hood it's
> another CloudSolrClient sharing its resources with the parent client, but
> with a separate default collection set. The method returns a SolrClient,
> however, so you can't then change the collection unless you explicitly cast
> it.
> Sort of related to what I wanted to do on SOLR-6894, but this is more
> focussed.
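A minimal sketch of the request(req, collection) idea from the comment above, using hypothetical stand-in types rather than the SolrJ classes. The Request type, the "collection" parameter handling and the echoing return value are all assumptions made only to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in types; none of this is the SolrJ API. */
final class CollectionRequestSketch {

    /** Minimal request: just a bag of parameters. */
    static final class Request {
        final Map<String, String> params = new HashMap<>();
    }

    /** Stand-in for the existing single-argument entry point; it only reports the target collection. */
    static String request(Request req) {
        return "sent to " + req.params.getOrDefault("collection", "<default>");
    }

    /** Overload taking the collection: set it on a copy of the request, then delegate. */
    static String request(Request req, String collection) {
        Request copy = new Request();
        copy.params.putAll(req.params);
        copy.params.put("collection", collection); // the caller's request is left untouched
        return request(copy);
    }
}
```

Copying the request before setting the parameter is a choice made in this sketch so that one request object can be reused from several threads; the comment itself only talks about setting the param before delegating.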
[jira] [Commented] (LUCENE-6247) artifacts are half a gigabyte
[ https://issues.apache.org/jira/browse/LUCENE-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327426#comment-14327426 ]

Steve Molloy commented on LUCENE-6247:
--------------------------------------

bq. I spun off separate issues as many suggested.

Are those intended for both Lucene & Solr? Should separate ones be entered for Solr as well? I think they should be aligned as much as possible.

> artifacts are half a gigabyte
> -----------------------------
>
> Key: LUCENE-6247
> URL: https://issues.apache.org/jira/browse/LUCENE-6247
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Priority: Blocker
> Fix For: 5.0
>
>
> This is a growing problem and now, it's spun out of control.
> The latest release artifacts are half a gigabyte. Sorry, I am against adding
> more retries to the smoke tester and continuing down the same path
> (LUCENE-6231).
> Instead I open this blocker issue to discuss fixing this. Whenever I tried to
> fix it before (e.g. removing zips), people complained at me "what are you
> trying to fix" and wouldn't let me minimize it in the slightest.
> Now the problem is clear, we have a blocker issue to discuss it.
[jira] [Commented] (LUCENE-6247) artifacts are half a gigabyte
[ https://issues.apache.org/jira/browse/LUCENE-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322996#comment-14322996 ]

Steve Molloy commented on LUCENE-6247:
--------------------------------------

+1 to everything in 5.1 (although zip may be easier for some than tgz, as long as there's only one).

That said, maybe more people would vote on details if it was split into separate tickets. And if some part makes more waves, at least the rest could be tackled and be done with. I don't see javadocs and double-packaging making many waves (apart from a potentially long discussion around zip vs tgz); 3rd party libs might cause more. If you split the ticket, you can get all but 3rd party in quickly and at least make progress on reducing size. Not perfect, but on the right path...

> artifacts are half a gigabyte
> -----------------------------
>
> Key: LUCENE-6247
> URL: https://issues.apache.org/jira/browse/LUCENE-6247
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Priority: Blocker
> Fix For: 5.0
>
>
> This is a growing problem and now, it's spun out of control.
> The latest release artifacts are half a gigabyte. Sorry, I am against adding
> more retries to the smoke tester and continuing down the same path
> (LUCENE-6231).
> Instead I open this blocker issue to discuss fixing this. Whenever I tried to
> fix it before (e.g. removing zips), people complained at me "what are you
> trying to fix" and wouldn't let me minimize it in the slightest.
> Now the problem is clear, we have a blocker issue to discuss it.
[jira] [Commented] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
[ https://issues.apache.org/jira/browse/SOLR-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316967#comment-14316967 ]

Steve Molloy commented on SOLR-6311:
------------------------------------

bq. Definitely not a bug. you have to remember the context of how distributed search was added

Thanks for the history, it makes it clearer why it was needed.

bq. But now is not then

Indeed, now distributed/SolrCloud is pretty much the norm... So anyhow, the patch with logic based on version makes sense to me, so +1.

> SearchHandler should use path when no qt or shard.qt parameter is specified
> ---------------------------------------------------------------------------
>
> Key: SOLR-6311
> URL: https://issues.apache.org/jira/browse/SOLR-6311
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.9
> Reporter: Steve Molloy
> Assignee: Timothy Potter
> Attachments: SOLR-6311.patch, SOLR-6311.patch
>
>
> When performing distributed searches, you have to specify shards.qt unless
> you're on the default /select path for your handler. As this is configurable,
> even the default search handler could be on another path. The shard requests
> should thus default to the path if no shards.qt was specified.
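The behaviour asked for in the issue description can be sketched as a simple fallback. The class and method below are hypothetical and are not taken from the attached patch; in particular, the version-based logic mentioned in the comment is not modelled here.

```java
import java.util.Map;

/** Illustrative only: picks the handler path used for shard sub-requests. */
final class ShardHandlerPathSketch {

    /** An explicit shards.qt wins; otherwise fall back to the path the original request came in on. */
    static String shardHandler(Map<String, String> params, String requestPath) {
        String shardsQt = params.get("shards.qt");
        if (shardsQt != null && !shardsQt.isEmpty()) {
            return shardsQt;
        }
        return requestPath;
    }
}
```

Under this rule a request arriving on a handler registered at a non-default path would send its shard requests to that same path unless shards.qt says otherwise.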
[jira] [Commented] (SOLR-6803) Pivot Performance
[ https://issues.apache.org/jira/browse/SOLR-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242545#comment-14242545 ]

Steve Molloy commented on SOLR-6803:
------------------------------------

The subset will be needed even if subField is null when computing stats, ranges or queries (see SOLR-6348). None of this is in branch 4.x though, and even in 5.x/trunk we may think of only calling it if any of those is needed (subField not null, or stats, ranges or queries requested). Have you tried with more than 2 levels? If so, are you seeing similar behaviour?

> Pivot Performance
> -----------------
>
> Key: SOLR-6803
> URL: https://issues.apache.org/jira/browse/SOLR-6803
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.10.2
> Reporter: Neil Ireson
> Priority: Minor
> Attachments: PivotPerformanceTest.java
>
>
> I found that my pivot search for terms per day was taking an age so I knocked
> up a quick test, using a collection of 1 million documents with a different
> number of random terms and times, to compare different ways of getting the
> counts.
> 1) Combined = combining the term and time in a single field.
> 2) Facet = for each term set the query to the term and then get the time
> facet
> 3) Pivot = use the term/time pivot facet.
> The following two tables present the results for version 4.9.1 vs 4.10.1, as
> an average of five runs.
> 4.9.1 (Processing time in ms)
> |Values (#) | Combined (ms)| Facet (ms)| Pivot (ms)|
> |100 | 22| 21| 52|
> |1000 | 178| 57| 115|
> |1 | 1363| 211| 310|
> |10| 2592| 1009| 978|
> |50| 3125| 3753| 2476|
> |100 | 3957| 6789| 3725|
> 4.10.1 (Processing time in ms)
> |Values (#) | Combined (ms)| Facet (ms)| Pivot (ms)|
> |100 | 21| 21| 75|
> |1000 | 188| 60| 265|
> |1 | 1438| 215| 1826|
> |10| 2768| 1073| 16594|
> |50| 3266| 3686| 99682|
> |100 | 4080| 6777| 208873|
> The results show that, as the number of pivot values increases (i.e. number
> of terms * number of times), pivot performance in 4.10.1 gets progressively
> worse.
> I tried to look at the code but there were a lot of changes in pivoting
> between 4.9 and 4.10, and so it is not clear to me what has caused the
> performance issues. However the results seem to indicate that if the pivot
> was simply a combined facet search, it could potentially produce better and
> more robust performance.
[jira] [Commented] (SOLR-6831) Make facet pivots respect timeout from SolrQueryTimeoutImpl
[ https://issues.apache.org/jira/browse/SOLR-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241360#comment-14241360 ]

Steve Molloy commented on SOLR-6831:
------------------------------------

For distributed pivot faceting requests, the patch for SOLR-6616 might also be needed if a lot of refinement shard requests are going on.

> Make facet pivots respect timeout from SolrQueryTimeoutImpl
> -----------------------------------------------------------
>
> Key: SOLR-6831
> URL: https://issues.apache.org/jira/browse/SOLR-6831
> Project: Solr
> Issue Type: Bug
> Reporter: Steve Molloy
> Attachments: SOLR-6831.patch
>
>
> SOLR-5986 allows most queries to timeout cleanly using the
> ExitableDirectoryReader. In the case of facet pivots though, the exception is
> caught in SolrIndexSearcher's getDocSetNC while building pivots, resulting in
> numerous lookups, most of which fail, but overall still running for a long
> time and eating up resources. It should respect the timeAllowed just like any
> other request.
[jira] [Updated] (SOLR-6616) Make shards.tolerant and timeAllowed work together
[ https://issues.apache.org/jira/browse/SOLR-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-6616:
-------------------------------
Attachment: SOLR-6616.patch

This patch only returns partialResults if shards.tolerant was true. It also adds some timeout checks during distributed processing to cancel pending tasks when timing out. This prevents components from continuing to send shard requests (for facet pivot refinement, for instance) even after timing out.

We may want to rename "shards.tolerant" to reflect the fact that it now also affects non-distributed requests. It should be specific to accepting partial results, whether the request is distributed or not.

> Make shards.tolerant and timeAllowed work together
> --------------------------------------------------
>
> Key: SOLR-6616
> URL: https://issues.apache.org/jira/browse/SOLR-6616
> Project: Solr
> Issue Type: Bug
> Reporter: Anshum Gupta
> Assignee: Anshum Gupta
> Attachments: SOLR-6616.patch
>
>
> From SOLR-5986:
> {quote}
> As of now, when timeAllowed is set, we never get back an exception but just
> partialResults in the response header is set to true in case of a shard
> failure. This translates to shards.tolerant being ignored in that case.
> On the code level, the TimeExceededException never reaches ShardHandler and
> so the Exception is never set (similarly for ExitingReaderException) and/or
> returned to the client.
> {quote}
[jira] [Updated] (SOLR-6831) Make facet pivots respect timeout from SolrQueryTimeoutImpl
[ https://issues.apache.org/jira/browse/SOLR-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-6831:
-------------------------------
Attachment: SOLR-6831.patch

First attempt at a patch. I went the route of checking shouldExit(), as changing the logic to let the exception propagate from SolrIndexSearcher required too many changes everywhere.

> Make facet pivots respect timeout from SolrQueryTimeoutImpl
> -----------------------------------------------------------
>
> Key: SOLR-6831
> URL: https://issues.apache.org/jira/browse/SOLR-6831
> Project: Solr
> Issue Type: Bug
> Reporter: Steve Molloy
> Attachments: SOLR-6831.patch
>
>
> SOLR-5986 allows most queries to timeout cleanly using the
> ExitableDirectoryReader. In the case of facet pivots though, the exception is
> caught in SolrIndexSearcher's getDocSetNC while building pivots, resulting in
> numerous lookups, most of which fail, but overall still running for a long
> time and eating up resources. It should respect the timeAllowed just like any
> other request.
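The "check shouldExit() instead of relying on the exception" approach can be illustrated with a generic loop. This is a sketch under stated assumptions, not the attached patch: the timeout signal is modelled as a plain BooleanSupplier, and the per-value work is a placeholder.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

/** Illustrative only: a pivot-like loop that polls a timeout signal before each unit of work. */
final class PivotTimeoutSketch {

    /** Returns how many values were processed before the timeout signal fired. */
    static int processUntilTimeout(List<String> pivotValues, BooleanSupplier shouldExit) {
        int processed = 0;
        for (String value : pivotValues) {
            if (shouldExit.getAsBoolean()) {
                break; // time allowed is used up: stop instead of issuing more lookups
            }
            processed++; // placeholder for the per-value doc set lookup
        }
        return processed;
    }
}
```

Polling before each value means the loop ends at the first check after the deadline, rather than running through every remaining value with lookups that would each fail.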
[jira] [Created] (SOLR-6831) Make facet pivots respect timeout from SolrQueryTimeoutImpl
Steve Molloy created SOLR-6831:
----------------------------------

Summary: Make facet pivots respect timeout from SolrQueryTimeoutImpl
Key: SOLR-6831
URL: https://issues.apache.org/jira/browse/SOLR-6831
Project: Solr
Issue Type: Bug
Reporter: Steve Molloy

SOLR-5986 allows most queries to timeout cleanly using the ExitableDirectoryReader. In the case of facet pivots though, the exception is caught in SolrIndexSearcher's getDocSetNC while building pivots, resulting in numerous lookups, most of which fail, but overall still running for a long time and eating up resources. It should respect the timeAllowed just like any other request.
[jira] [Commented] (SOLR-4792) stop shipping a war in 5.0
[ https://issues.apache.org/jira/browse/SOLR-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230170#comment-14230170 ]

Steve Molloy commented on SOLR-4792:
------------------------------------

bq. We have already discussed and had a vote.

I didn't want to imply it wasn't discussed or that there wasn't any valid reason. I already said I agree with the change, and we moved away from using the war directly as soon as the decision was made. But still, linking this ticket to what it's actually blocking would make it clear why it is needed and why it's worth breaking some current integrations, for people who check Jira but don't read the threads on the mailing list. Anyway, I just wanted to help, as we've been ready for the change for a while on our side. :)

> stop shipping a war in 5.0
> --------------------------
>
> Key: SOLR-4792
> URL: https://issues.apache.org/jira/browse/SOLR-4792
> Project: Solr
> Issue Type: Task
> Components: Build
> Reporter: Robert Muir
> Assignee: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: SOLR-4792.patch
>
>
> see the vote on the developer list.
> This is the first step: if we stop shipping a war then we are free to do
> anything we want.
[jira] [Commented] (SOLR-4792) stop shipping a war in 5.0
[ https://issues.apache.org/jira/browse/SOLR-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229966#comment-14229966 ]

Steve Molloy commented on SOLR-4792:
------------------------------------

I think all that is missing on this ticket is a link to the stuff that removing the WAR is expected to fix. I mean, I think I agree with the approach, but only because I can kind of guess at what issues we're trying to solve by reading different discussions and such. There have been talks about deadlocks, connections, and all sorts of optimizations that are currently not possible because Solr is a webapp shipped as a WAR that must work on any webapp container. So, where are the entries for these optimizations that need to be made? If you have half a dozen of those that all depend on not being a war that can run on any container, then people will stop asking why we're removing a war and we can get back to actually implementing improvements. :)

> stop shipping a war in 5.0
> --------------------------
>
> Key: SOLR-4792
> URL: https://issues.apache.org/jira/browse/SOLR-4792
> Project: Solr
> Issue Type: Task
> Components: Build
> Reporter: Robert Muir
> Assignee: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: SOLR-4792.patch
>
>
> see the vote on the developer list.
> This is the first step: if we stop shipping a war then we are free to do
> anything we want.
[jira] [Updated] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-4212:
-------------------------------
Attachment: SOLR-4212.patch

Small adjustments to fit latest changes in SOLR-6351.

> Let facet queries hang off of pivots
> ------------------------------------
>
> Key: SOLR-4212
> URL: https://issues.apache.org/jira/browse/SOLR-4212
> Project: Solr
> Issue Type: Sub-task
> Components: search
> Affects Versions: 4.0
> Reporter: Steve Molloy
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch,
> SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch,
> SOLR-4212.patch, patch-4212.txt
>
>
> Facet pivot provide hierarchical support for computing data used to populate
> a treemap or similar visualization. TreeMaps usually offer users extra
> information by applying an overlay color on top of the existing square sizes
> based on hierarchical counts. This second count is based on user choices,
> representing, usually with gradient, the proportion of the square that fits
> the user's choices.
> The proposition is to use local parameters to specify facet query to apply
> for pivot which matches a tag set on facet query. Parameter format would look
> like:
> facet.pivot={!query=r1}category,manufacturer
> facet.query={!tag=r1}somequery
> facet.query={!tag=r1}somedate:[NOW-1YEAR TO NOW]
[jira] [Commented] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets
[ https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181313#comment-14181313 ]

Steve Molloy commented on SOLR-3583:
------------------------------------

Patch has to be applied, but you may want to look at issues mentioned by Hoss which include a different approach for this functionality.

> Percentiles for facets, pivot facets, and distributed pivot facets
> ------------------------------------------------------------------
>
> Key: SOLR-3583
> URL: https://issues.apache.org/jira/browse/SOLR-3583
> Project: Solr
> Issue Type: Improvement
> Reporter: Chris Russell
> Priority: Minor
> Labels: newbie, patch
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch,
> SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch,
> SOLR-3583.patch
>
>
> Built on top of SOLR-2894, this patch adds percentiles and averages to
> facets, pivot facets, and distributed pivot facets by making use of range
> facet internals.
[jira] [Updated] (SOLR-5480) Make MoreLikeThisHandler distributable
[ https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-5480:
-------------------------------
Attachment: SOLR-5480.patch

Fix code for a failing unit test.

> Make MoreLikeThisHandler distributable
> --------------------------------------
>
> Key: SOLR-5480
> URL: https://issues.apache.org/jira/browse/SOLR-5480
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Assignee: Noble Paul
> Attachments: MoreLikeThisHandlerTestST.txt, SOLR-5480.patch,
> SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch,
> SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch,
> SOLR-5480.patch, SOLR-5480.patch
>
>
> The MoreLikeThis component, when used in the standard search handler, supports
> distributed searches. But the MoreLikeThisHandler itself doesn't, which
> prevents, say, passing in text to perform the query. I'll start looking
> into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone
> has some work done already and wants to share, or wants to contribute, any help
> will be welcomed.
[jira] [Updated] (SOLR-5480) Make MoreLikeThisHandler distributable
[ https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-5480:
-------------------------------
Attachment: SOLR-5480.patch

Patch adapted to trunk with a distributed test, refocussed on MLT in distributed mode and away from the MLT query parser, which has a ticket of its own.

> Make MoreLikeThisHandler distributable
> --------------------------------------
>
> Key: SOLR-5480
> URL: https://issues.apache.org/jira/browse/SOLR-5480
> Project: Solr
> Issue Type: Improvement
> Reporter: Steve Molloy
> Assignee: Noble Paul
> Attachments: MoreLikeThisHandlerTestST.txt, SOLR-5480.patch,
> SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch,
> SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch,
> SOLR-5480.patch
>
>
> The MoreLikeThis component, when used in the standard search handler, supports
> distributed searches. But the MoreLikeThisHandler itself doesn't, which
> prevents, say, passing in text to perform the query. I'll start looking
> into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone
> has some work done already and wants to share, or wants to contribute, any help
> will be welcomed.
[jira] [Updated] (SOLR-6612) maxScore included in distributed search results even if score not requested
[ https://issues.apache.org/jira/browse/SOLR-6612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Molloy updated SOLR-6612:
-------------------------------
Attachment: SOLR-6612.patch

Modified returnFields to only return scores if needed and to ensure maxScore is unset if not needed.

> maxScore included in distributed search results even if score not requested
> ---------------------------------------------------------------------------
>
> Key: SOLR-6612
> URL: https://issues.apache.org/jira/browse/SOLR-6612
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.10.1
> Reporter: Steve Molloy
> Priority: Minor
> Attachments: SOLR-6612.patch
>
>
> When performing a search on a single core, maxScore is only included in the
> response if scores were specifically requested (fl=*,score). In distributed
> searches however, maxScore is always part of the results whether or not the scores
> were requested. The behaviour should be the same whether the search is
> distributed or not.
[jira] [Created] (SOLR-6612) maxScore included in distributed search results even if score not requested
Steve Molloy created SOLR-6612:
----------------------------------

Summary: maxScore included in distributed search results even if score not requested
Key: SOLR-6612
URL: https://issues.apache.org/jira/browse/SOLR-6612
Project: Solr
Issue Type: Bug
Affects Versions: 4.10.1
Reporter: Steve Molloy
Priority: Minor

When performing a search on a single core, maxScore is only included in the response if scores were specifically requested (fl=*,score). In distributed searches however, maxScore is always part of the results whether or not the scores were requested. The behaviour should be the same whether the search is distributed or not.
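The expected behaviour can be stated as a small rule: emit maxScore only when the field list asks for scores. The sketch below is an assumption-laden illustration, not Solr's implementation; its fl parsing is deliberately simplified and ignores aliases, functions and transformers.

```java
import java.util.Arrays;

/** Illustrative only: decides whether maxScore belongs in a merged response. */
final class MaxScoreSketch {

    /** True when the fl parameter lists the "score" pseudo-field (simplified parsing). */
    static boolean wantsScore(String fl) {
        if (fl == null) {
            return false;
        }
        return Arrays.stream(fl.split("[,\\s]+")).anyMatch("score"::equals);
    }

    /** Returns maxScore only when scores were requested; null means "leave it out of the response". */
    static Float maxScoreForResponse(String fl, float mergedMaxScore) {
        return wantsScore(fl) ? Float.valueOf(mergedMaxScore) : null;
    }
}
```

Applying the same check in both the single-core and the distributed merge path is what would make the two responses consistent.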
[jira] [Commented] (SOLR-6353) Let Range Facets Hang off of Pivots
[ https://issues.apache.org/jira/browse/SOLR-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163473#comment-14163473 ]

Steve Molloy commented on SOLR-6353:
------------------------------------

See the latest patch on SOLR-4212, which covers both facet queries and facet ranges. I did not tackle the refactoring of SimpleFacets, but I have a working solution that should at least provide a basis on which to build.

> Let Range Facets Hang off of Pivots
> -----------------------------------
>
> Key: SOLR-6353
> URL: https://issues.apache.org/jira/browse/SOLR-6353
> Project: Solr
> Issue Type: Sub-task
> Reporter: Hoss Man
>
> Conceptually very similar to the previous sibling issues about hanging stats
> off pivots & ranges: using a "tag" on {{facet.range}} requests, we make it
> possible to hang a range off the nodes of Pivots.
> Example...
> {noformat}
> facet.pivot={!range=r1}category,manufacturer
> facet.range={!tag=r1}price
> {noformat}
> ...with the request above, in addition to computing range facets over the
> price field for the entire result set, the PivotFacet component will also
> include all of those ranges for every node of the tree it builds up when
> generating a pivot of the fields "category,manufacturer"
> This should easily be combinable with the other sibling tasks to hang stats
> off ranges which hang off pivots. (see parent issue for example)
[jira] [Updated] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: SOLR-4212.patch Following the approach under SOLR-6348, facet queries and facet ranges are now included in facet pivots if their tag matches the localparam facet.query or facet.range of the facet.pivot request parameter. Added distributed unit tests for both as well as Solrj test additions. This also covers SOLR-6353. > Let facet queries hang off of pivots > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, > patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to use local parameters to specify facet query to apply > for pivot which matches a tag set on facet query. Parameter format would look > like: > facet.pivot={!query=r1}category,manufacturer > facet.query={!tag=r1}somequery > facet.query={!tag=r1}somedate:[NOW-1YEAR TO NOW]
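For readers following the tag-matching idea in the message above, here is a minimal sketch of how facet queries could be selected for a pivot. It is not taken from the attached patch. The function names are hypothetical, the local-param name `query` follows the example in the issue description, the parser only handles simple space-separated `key=value` pairs (no quoting), and accepting a comma-separated list of tags is an assumption.

```python
import re

# Matches a leading local-params block such as "{!tag=r1}" followed by the body.
LOCAL_PARAMS = re.compile(r"^\{!([^}]*)\}(.*)$")


def split_local_params(value):
    """Split '{!k=v k2=v2}body' into ({'k': 'v', 'k2': 'v2'}, 'body').

    Simplification: values containing spaces or quotes are not supported.
    """
    match = LOCAL_PARAMS.match(value)
    if not match:
        return {}, value
    params = dict(
        item.split("=", 1) for item in match.group(1).split() if "=" in item
    )
    return params, match.group(2)


def queries_for_pivot(pivot, facet_queries):
    """Return the bodies of the facet.query params whose tag the pivot references."""
    pivot_params, _fields = split_local_params(pivot)
    # Assumption: several tags may be listed, separated by commas.
    wanted = {tag for tag in pivot_params.get("query", "").split(",") if tag}
    matched = []
    for raw in facet_queries:
        params, body = split_local_params(raw)
        if params.get("tag") in wanted:
            matched.append(body)
    return matched
```

With the parameters from the description, `queries_for_pivot("{!query=r1}category,manufacturer", [...])` would pick up both `facet.query` entries tagged `r1` and ignore any query carrying a different tag.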
[jira] [Commented] (SOLR-6351) Let Stats Hang off of Pivots (via 'tag')
[ https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160413#comment-14160413 ] Steve Molloy commented on SOLR-6351: One more question around this, which applies for SOLR-6353 and SOLR-4212 as well. Should we have a syntax to apply stats/queries/ranges only at specific levels in the pivot hierarchy? It would reduce the amount of computation and the size of the response for cases where you only need it at a specific level (usually the last level I guess). Something like: facet.pivot={!stats=s1,s2}field1,field2 We could use * for all levels, or something like: facet.pivot={!stats=,,s3}field1,field2,field3 to only apply at 3rd level. > Let Stats Hang off of Pivots (via 'tag') > > > Key: SOLR-6351 > URL: https://issues.apache.org/jira/browse/SOLR-6351 > Project: Solr > Issue Type: Sub-task >Reporter: Hoss Man > Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, > SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch > > > The goal here is basically flip the notion of "stats.facet" on it's head, so > that instead of asking the stats component to also do some faceting > (something that's never worked well with the variety of field types and has > never worked in distributed mode) we instead ask the PivotFacet code to > compute some stats X for each leaf in a pivot. We'll do this with the > existing {{stats.field}} params, but we'll leverage the {{tag}} local param > of the {{stats.field}} instances to be able to associate which stats we want > hanging off of which {{facet.pivot}} > Example... 
> {noformat} > facet.pivot={!stats=s1}category,manufacturer > stats.field={!key=avg_price tag=s1 mean=true}price > stats.field={!tag=s1 min=true max=true}user_rating > {noformat} > ...with the request above, in addition to computing the min/max user_rating > and mean price (labeled "avg_price") over the entire result set, the > PivotFacet component will also include those stats for every node of the tree > it builds up when generating a pivot of the fields "category,manufacturer" -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
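The per-level syntax floated in the comment above (for example `{!stats=,,s3}`) is only a proposal, so the following is a sketch of one possible reading rather than anything Solr implements: the tag in position N applies to pivot level N, and an empty position means no stats at that level. The function name is hypothetical, and the `*` shorthand is left out because its exact meaning is not spelled out in the comment.

```python
def tags_per_level(stats_param, fields):
    """Pair each pivot field with the tag found at the same position.

    stats_param: the value of the hypothetical per-level 'stats' local param,
                 e.g. ",,s3".
    fields:      the pivot fields in order, e.g. ["field1", "field2", "field3"].
    """
    slots = stats_param.split(",")
    # Pad with empty slots so a short list leaves the deeper levels without tags.
    slots += [""] * (len(fields) - len(slots))
    return {field: ([slot] if slot else []) for field, slot in zip(fields, slots)}
```

Under this reading, `tags_per_level(",,s3", ["field1", "field2", "field3"])` attaches `s3` to `field3` only, which is the "only apply at 3rd level" case from the comment.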
[jira] [Updated] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Description: Facet pivot provide hierarchical support for computing data used to populate a treemap or similar visualization. TreeMaps usually offer users extra information by applying an overlay color on top of the existing square sizes based on hierarchical counts. This second count is based on user choices, representing, usually with gradient, the proportion of the square that fits the user's choices. The proposition is to use local parameters to specify facet query to apply for pivot which matches a tag set on facet query. Parameter format would look like: facet.pivot={!query=r1}category,manufacturer facet.query={!tag=r1}somequery facet.query={!tag=r1}somedate:[NOW-1YEAR TO NOW] was: Facet pivot provide hierarchical support for computing data used to populate a treemap or similar visualization. TreeMaps usually offer users extra information by applying an overlay color on top of the existing square sizes based on hierarchical counts. This second count is based on user choices, representing, usually with gradient, the proportion of the square that fits the user's choices. The proposition is to use local parameters to specify facet query to apply for pivot which matches a tag set on facet query. > Let facet queries hang off of pivots > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. 
TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to use local parameters to specify facet query to apply > for pivot which matches a tag set on facet query. Parameter format would look > like: > facet.pivot={!query=r1}category,manufacturer > facet.query={!tag=r1}somequery > facet.query={!tag=r1}somedate:[NOW-1YEAR TO NOW] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Description: Facet pivot provide hierarchical support for computing data used to populate a treemap or similar visualization. TreeMaps usually offer users extra information by applying an overlay color on top of the existing square sizes based on hierarchical counts. This second count is based on user choices, representing, usually with gradient, the proportion of the square that fits the user's choices. The proposition is to use local parameters to specify facet query to apply for pivot which matches a tag set on facet query. was: Facet pivot provide hierarchical support for computing data used to populate a treemap or similar visualization. TreeMaps usually offer users extra information by applying an overlay color on top of the existing square sizes based on hierarchical counts. This second count is based on user choices, representing, usually with gradient, the proportion of the square that fits the user's choices. The proposition is to add a facet.pivot.q parameter that would allow to specify one or more queries (per field) that would be intersected with DocSet used to calculate pivot count, stored in separate qcounts list, each entry keyed by the query. > Let facet queries hang off of pivots > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. 
This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to use local parameters to specify facet query to apply > for pivot which matches a tag set on facet query. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Issue Type: Sub-task (was: Improvement) Parent: SOLR-6348 > Let facet queries hang off of pivots > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Sub-task > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify one or more queries (per field) that would be intersected with DocSet > used to calculate pivot count, stored in separate qcounts list, each entry > keyed by the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Let facet queries hang off of pivots
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Summary: Let facet queries hang off of pivots (was: Support for facet pivot query for filtered count) > Let facet queries hang off of pivots > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify one or more queries (per field) that would be intersected with DocSet > used to calculate pivot count, stored in separate qcounts list, each entry > keyed by the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6351) Let Stats Hang off of Pivots (via 'tag')
[ https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-6351: --- Attachment: SOLR-6351.patch Addressing some comments. Remove unused for-loop and CommonParams.STATS. Didn't touch the notSupprted test methods, will let Vitaliy a chance to speak for their usefulness. Also reverted the hasValues logic to replace it with checking if current pivot has positive count. Although it does produce some stats entries with Infinity minimum/maximum and NaN mean. This is what I was asking about before, I think I misunderstood the answer, but it still seems error-prone to have such entries... Finally, I updated some of the outputs to use NamedList instead of maps so that solrj binary works better. Did have to sort fields in QueryResponse to get tests to pass. Not sure this is the best way, but would sometimes get them out of order if I didn't. > Let Stats Hang off of Pivots (via 'tag') > > > Key: SOLR-6351 > URL: https://issues.apache.org/jira/browse/SOLR-6351 > Project: Solr > Issue Type: Sub-task >Reporter: Hoss Man > Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, > SOLR-6351.patch, SOLR-6351.patch > > > he goal here is basically flip the notion of "stats.facet" on it's head, so > that instead of asking the stats component to also do some faceting > (something that's never worked well with the variety of field types and has > never worked in distributed mode) we instead ask the PivotFacet code to > compute some stats X for each leaf in a pivot. We'll do this with the > existing {{stats.field}} params, but we'll leverage the {{tag}} local param > of the {{stats.field}} instances to be able to associate which stats we want > hanging off of which {{facet.pivot}} > Example... 
> {noformat} > facet.pivot={!stats=s1}category,manufacturer > stats.field={!key=avg_price tag=s1 mean=true}price > stats.field={!tag=s1 min=true max=true}user_rating > {noformat} > ...with the request above, in addition to computing the min/max user_rating > and mean price (labeled "avg_price") over the entire result set, the > PivotFacet component will also include those stats for every node of the tree > it builds up when generating a pivot of the fields "category,manufacturer" -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6351) Let Stats Hang off of Pivots (via 'tag')
[ https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-6351: --- Attachment: SOLR-6351.patch Augmented previous 3 patches, added logic to not include stats entry if it's empty, fixed distributed logic by actually merging stats from shards. Currently have unit tests failing in solrj that I need to look at. > Let Stats Hang off of Pivots (via 'tag') > > > Key: SOLR-6351 > URL: https://issues.apache.org/jira/browse/SOLR-6351 > Project: Solr > Issue Type: Sub-task >Reporter: Hoss Man > Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, > SOLR-6351.patch > > > he goal here is basically flip the notion of "stats.facet" on it's head, so > that instead of asking the stats component to also do some faceting > (something that's never worked well with the variety of field types and has > never worked in distributed mode) we instead ask the PivotFacet code to > compute some stats X for each leaf in a pivot. We'll do this with the > existing {{stats.field}} params, but we'll leverage the {{tag}} local param > of the {{stats.field}} instances to be able to associate which stats we want > hanging off of which {{facet.pivot}} > Example... > {noformat} > facet.pivot={!stats=s1}category,manufacturer > stats.field={!key=avg_price tag=s1 mean=true}price > stats.field={!tag=s1 min=true max=true}user_rating > {noformat} > ...with the request above, in addition to computing the min/max user_rating > and mean price (labeled "avg_price") over the entire result set, the > PivotFacet component will also include those stats for every node of the tree > it builds up when generating a pivot of the fields "category,manufacturer" -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158374#comment-14158374 ] Steve Molloy commented on SOLR-4212: That's what I'm starting to realize by looking into SOLR-6351... :) It makes a lot of sense, I'll try to adapt and see if I can get facet ranges (SOLR-6353) covered at the same time, they should be similar with your proposed approach. > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify one or more queries (per field) that would be intersected with DocSet > used to calculate pivot count, stored in separate qcounts list, each entry > keyed by the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6351) Let Stats Hang off of Pivots (via 'tag')
[ https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157990#comment-14157990 ] Steve Molloy commented on SOLR-6351: Ok, applied locally and see that most is combined. One thing though, what is expected behavior for pivots where count is 0? Currently, you'll get the full entry with NaN, infinity and such in it. Should it be null or empty instead? Or should it even show up at all? > Let Stats Hang off of Pivots (via 'tag') > > > Key: SOLR-6351 > URL: https://issues.apache.org/jira/browse/SOLR-6351 > Project: Solr > Issue Type: Sub-task >Reporter: Hoss Man > Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch > > > he goal here is basically flip the notion of "stats.facet" on it's head, so > that instead of asking the stats component to also do some faceting > (something that's never worked well with the variety of field types and has > never worked in distributed mode) we instead ask the PivotFacet code to > compute some stats X for each leaf in a pivot. We'll do this with the > existing {{stats.field}} params, but we'll leverage the {{tag}} local param > of the {{stats.field}} instances to be able to associate which stats we want > hanging off of which {{facet.pivot}} > Example... > {noformat} > facet.pivot={!stats=s1}category,manufacturer > stats.field={!key=avg_price tag=s1 mean=true}price > stats.field={!tag=s1 min=true max=true}user_rating > {noformat} > ...with the request above, in addition to computing the min/max user_rating > and mean price (labeled "avg_price") over the entire result set, the > PivotFacet component will also include those stats for every node of the tree > it builds up when generating a pivot of the fields "category,manufacturer" -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
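As an illustration of the zero-count question raised above, the sketch below shows how a running min/max/mean accumulator ends up with Infinity and NaN when it never sees a value, and one way to leave such an entry out. This is an assumption about how such accumulators are commonly initialised, not Solr's actual stats code, and the function names are hypothetical.

```python
import math


def summarize(values):
    """Accumulate count, min, max and mean in a single pass."""
    minimum = math.inf    # identity element for min
    maximum = -math.inf   # identity element for max
    total = 0.0
    count = 0
    for value in values:
        minimum = min(minimum, value)
        maximum = max(maximum, value)
        total += value
        count += 1
    # Python raises on 0.0 / 0, so emulate the NaN that IEEE 754 floating-point
    # division (as used by Java doubles) yields for 0.0 / 0.
    mean = total / count if count else math.nan
    return {"count": count, "min": minimum, "max": maximum, "mean": mean}


def stats_entry(values):
    """Return the stats for a pivot node, or None when there is nothing to report."""
    summary = summarize(values)
    return summary if summary["count"] > 0 else None
```

Returning `None` corresponds to the "should it even show up at all" option; a caller could just as well map it to an empty entry if that turns out to be the preferred behaviour.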
[jira] [Commented] (SOLR-6351) Let Stats Hang off of Pivots (via 'tag')
[ https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157129#comment-14157129 ] Steve Molloy commented on SOLR-6351: [~vzhovtiuk] Does your patch contain the changes from mine? There were some NPEs, as Hoss mentioned, which I think I got fixed. I like the addition of tests though, so I hope to get the best of both. I don't mind providing a patch combining both patches, just want to avoid us posting at about the same time again. :) > Let Stats Hang off of Pivots (via 'tag') > > > Key: SOLR-6351 > URL: https://issues.apache.org/jira/browse/SOLR-6351 > Project: Solr > Issue Type: Sub-task >Reporter: Hoss Man > Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch > > > The goal here is basically flip the notion of "stats.facet" on it's head, so > that instead of asking the stats component to also do some faceting > (something that's never worked well with the variety of field types and has > never worked in distributed mode) we instead ask the PivotFacet code to > compute some stats X for each leaf in a pivot. We'll do this with the > existing {{stats.field}} params, but we'll leverage the {{tag}} local param > of the {{stats.field}} instances to be able to associate which stats we want > hanging off of which {{facet.pivot}} > Example... > {noformat} > facet.pivot={!stats=s1}category,manufacturer > stats.field={!key=avg_price tag=s1 mean=true}price > stats.field={!tag=s1 min=true max=true}user_rating > {noformat} > ...with the request above, in addition to computing the min/max user_rating > and mean price (labeled "avg_price") over the entire result set, the > PivotFacet component will also include those stats for every node of the tree > it builds up when generating a pivot of the fields "category,manufacturer"
[jira] [Updated] (SOLR-6351) Let Stats Hang off of Pivots (via 'tag')
[ https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-6351: --- Attachment: SOLR-6351.patch Adapted patch a bit to avoid NPE when no stats are asked, modified PivotFacetValue to propagate info and other small tweaks. Tests seem to be happy and got some requests to work, so I am too... :) > Let Stats Hang off of Pivots (via 'tag') > > > Key: SOLR-6351 > URL: https://issues.apache.org/jira/browse/SOLR-6351 > Project: Solr > Issue Type: Sub-task >Reporter: Hoss Man > Attachments: SOLR-6351.patch, SOLR-6351.patch > > > he goal here is basically flip the notion of "stats.facet" on it's head, so > that instead of asking the stats component to also do some faceting > (something that's never worked well with the variety of field types and has > never worked in distributed mode) we instead ask the PivotFacet code to > compute some stats X for each leaf in a pivot. We'll do this with the > existing {{stats.field}} params, but we'll leverage the {{tag}} local param > of the {{stats.field}} instances to be able to associate which stats we want > hanging off of which {{facet.pivot}} > Example... > {noformat} > facet.pivot={!stats=s1}category,manufacturer > stats.field={!key=avg_price tag=s1 mean=true}price > stats.field={!tag=s1 min=true max=true}user_rating > {noformat} > ...with the request above, in addition to computing the min/max user_rating > and mean price (labeled "avg_price") over the entire result set, the > PivotFacet component will also include those stats for every node of the tree > it builds up when generating a pivot of the fields "category,manufacturer" -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: SOLR-4212-multiple-q.patch Add test and build map of parsed queries once > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, > SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify one or more queries (per field) that would be intersected with DocSet > used to calculate pivot count, stored in separate qcounts list, each entry > keyed by the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: SOLR-4212-multiple-q.patch Add support for multiple queries per field. > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212-multiple-q.patch, SOLR-4212.patch, > SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify one or more queries (per field) that would be intersected with DocSet > used to calculate pivot count, stored in separate qcounts list, each entry > keyed by the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Description: Facet pivot provide hierarchical support for computing data used to populate a treemap or similar visualization. TreeMaps usually offer users extra information by applying an overlay color on top of the existing square sizes based on hierarchical counts. This second count is based on user choices, representing, usually with gradient, the proportion of the square that fits the user's choices. The proposition is to add a facet.pivot.q parameter that would allow to specify one or more queries (per field) that would be intersected with DocSet used to calculate pivot count, stored in separate qcounts list, each entry keyed by the query. was: Facet pivot provide hierarchical support for computing data used to populate a treemap or similar visualization. TreeMaps usually offer users extra information by applying an overlay color on top of the existing square sizes based on hierarchical counts. This second count is based on user choices, representing, usually with gradient, the proportion of the square that fits the user's choices. The proposition is to add a facet.pivot.q parameter that would allow to specify a query (per field) that would be intersected with DocSet used to calculate pivot count, stored in separate q-count. > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, Trunk > > Attachments: SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, > patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. 
TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify one or more queries (per field) that would be intersected with DocSet > used to calculate pivot count, stored in separate qcounts list, each entry > keyed by the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6297) Distributed spellcheck with WordBreakSpellchecker can lose suggestions
[ https://issues.apache.org/jira/browse/SOLR-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134412#comment-14134412 ] Steve Molloy commented on SOLR-6297: Sorry for not replying sooner, but yes, I applied the patch to our codebase and it seems to fix the issue. Thanks. > Distributed spellcheck with WordBreakSpellchecker can lose suggestions > -- > > Key: SOLR-6297 > URL: https://issues.apache.org/jira/browse/SOLR-6297 > Project: Solr > Issue Type: Bug >Affects Versions: 4.9 >Reporter: Steve Molloy >Assignee: James Dyer > Fix For: 4.11 > > Attachments: SOLR-6297.patch, SOLR-6297.patch, SOLR-6297.patch, > SOLR-6297.patch > > > When performing a spellcheck request in distributed environment with the > WordBreakSpellChecker configured, the shard response merging logic can lose > some suggestions. Basically, the merging logic ensures that all shards marked > the query as not being correctly spelled, which is good, but also expects all > shards to return some suggestions, which isn't necessarily the case. So if > shard 1 returns 10 suggestions but shard 2 returns none, the final result > will contain no suggestions because the term has suggestions from only 1 of 2 > shards. > This isn't the case with the DirectSolrSpellChecker which works properly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
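The merging problem described in the issue above can be pictured with a deliberately simplified model. It is not the real spellcheck merging code: each shard response is reduced to a dict from the misspelled term to its list of suggestions, the correctly-spelled check and suggestion frequencies are ignored, and both function names are hypothetical.

```python
def merge_strict(shards, term):
    """Keep suggestions only if every shard returned at least one.

    This mirrors the behaviour reported in the issue: a single shard with no
    suggestions wipes out the suggestions from all the others.
    """
    per_shard = [shard.get(term, []) for shard in shards]
    if not all(per_shard):
        return []
    return sorted({s for suggestions in per_shard for s in suggestions})


def merge_lenient(shards, term):
    """Keep the union of whatever suggestions the shards returned."""
    per_shard = [shard.get(term, []) for shard in shards]
    return sorted({s for suggestions in per_shard for s in suggestions})
```

With one shard returning suggestions and another returning none, `merge_strict` yields an empty list while `merge_lenient` keeps the first shard's suggestions, which is the outcome the reporter expected.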
[jira] [Commented] (SOLR-5480) Make MoreLikeThisHandler distributable
[ https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126057#comment-14126057 ] Steve Molloy commented on SOLR-5480: I often run into this with patches taken from Jira... The slightest change before creating the patch seems to have a significant impact on whether or not the patch can apply cleanly. I usually have to resort to applying whatever matches while excluding any conflicts, which I then apply manually. > Make MoreLikeThisHandler distributable > -- > > Key: SOLR-5480 > URL: https://issues.apache.org/jira/browse/SOLR-5480 > Project: Solr > Issue Type: Improvement >Reporter: Steve Molloy >Assignee: Noble Paul > Attachments: MoreLikeThisHandlerTestST.txt, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch > > > The MoreLikeThis component, when used in the standard search handler supports > distributed searches. But the MoreLikeThisHandler itself doesn't, which > prevents from say, passing in text to perform the query. I'll start looking > into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone > has some work done already and want to share, or want to contribute, any help > will be welcomed.
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: SOLR-4212.patch (right attachment this time) Adapted patch for 4.10 code and ensured all tests passed. > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, 5.0 > > Attachments: SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, > patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify a query (per field) that would be intersected with DocSet used to > calculate pivot count, stored in separate q-count.
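For readers following the SOLR-4212 proposal, the "q-count" described above can be pictured as the size of an intersection between the documents behind a pivot bucket and the documents matching the extra query. The sketch below is only an illustration under that assumption: it uses plain java.util sets of document ids instead of Solr's DocSet, and the class and method names are hypothetical, not taken from the patch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: plain sets of doc ids stand in for Solr's DocSet.
final class PivotQCountSketch {

    /**
     * Counts the documents of one pivot bucket that also match the extra query.
     * pivotDocs: ids of the documents counted for the pivot value.
     * queryDocs: ids of the documents matching the facet.pivot.q query.
     */
    static int qcount(Set<Integer> pivotDocs, Set<Integer> queryDocs) {
        // Copy first so the caller's set is left untouched.
        Set<Integer> intersection = new HashSet<>(pivotDocs);
        intersection.retainAll(queryDocs);
        return intersection.size();
    }
}
```

The regular pivot count would be pivotDocs.size(); the ratio of the two numbers is what a treemap overlay color could be derived from.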
[jira] [Issue Comment Deleted] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Comment: was deleted (was: Adapted patch for 4.10 code and validated that tests passed.) > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, 5.0 > > Attachments: SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify a query (per field) that would be intersected with DocSet used to > calculate pivot count, stored in separate q-count. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: (was: SOLR-4212.patch) > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, 5.0 > > Attachments: SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify a query (per field) that would be intersected with DocSet used to > calculate pivot count, stored in separate q-count. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: SOLR-4212.patch Adapted patch for 4.10 code and validated that tests passed. > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, 5.0 > > Attachments: SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, > patch-4212.txt > > > Facet pivot provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with gradient, the proportion of the square that fits > the user's choices. > The proposition is to add a facet.pivot.q parameter that would allow to > specify a query (per field) that would be intersected with DocSet used to > calculate pivot count, stored in separate q-count. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5480) Make MoreLikeThisHandler distributable
[ https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125933#comment-14125933 ] Steve Molloy commented on SOLR-5480: [~claudenm] Something is wrong with this stack trace. You actually have compilation errors pointing to the MLT handler not implementing an abstract method of RequestHandlerBase, which is implemented in SearchHandler, which the MLT handler extends after applying the patch. Caused by: java.lang.Error: Unresolved compilation problems: The type MoreLikeThisHandler must implement the inherited abstract method RequestHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse) After applying the latest patch, is your MoreLikeThisHandler declaration like: public class MoreLikeThisHandler extends SearchHandler I also see you are in Eclipse (from the paths). Are you running the tests from the command line or within Eclipse? (trying to see where things may differ) > Make MoreLikeThisHandler distributable > -- > > Key: SOLR-5480 > URL: https://issues.apache.org/jira/browse/SOLR-5480 > Project: Solr > Issue Type: Improvement >Reporter: Steve Molloy >Assignee: Noble Paul > Attachments: MoreLikeThisHandlerTestST.txt, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch > > > The MoreLikeThis component, when used in the standard search handler supports > distributed searches. But the MoreLikeThisHandler itself doesn't, which > prevents from say, passing in text to perform the query. I'll start looking > into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone > has some work done already and want to share, or want to contribute, any help > will be welcomed.
[jira] [Commented] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets
[ https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125887#comment-14125887 ] Steve Molloy commented on SOLR-3583: [~hossman] I kind of agree with all your comments, although I still needed this functionality to be working today and see no progress on issues you pointed to. Anything we can do to help speed things up on the Stats component to support this? (specifically, we need distributed pivot faceted stats for average and sum for numeric fields). > Percentiles for facets, pivot facets, and distributed pivot facets > -- > > Key: SOLR-3583 > URL: https://issues.apache.org/jira/browse/SOLR-3583 > Project: Solr > Issue Type: Improvement >Reporter: Chris Russell >Priority: Minor > Labels: newbie, patch > Fix For: 4.9, 5.0 > > Attachments: SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, > SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, > SOLR-3583.patch > > > Built on top of SOLR-2894, this patch adds percentiles and averages to > facets, pivot facets, and distributed pivot facets by making use of range > facet internals. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets
[ https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-3583: --- Attachment: SOLR-3583.patch Adapted patch for 4.10 tag which now includes SOLR-2894 OOB. Ran all unit tests successfully. > Percentiles for facets, pivot facets, and distributed pivot facets > -- > > Key: SOLR-3583 > URL: https://issues.apache.org/jira/browse/SOLR-3583 > Project: Solr > Issue Type: Improvement >Reporter: Chris Russell >Priority: Minor > Labels: newbie, patch > Fix For: 4.9, 5.0 > > Attachments: SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, > SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, > SOLR-3583.patch > > > Built on top of SOLR-2894, this patch adds percentiles and averages to > facets, pivot facets, and distributed pivot facets by making use of range > facet internals. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5480) Make MoreLikeThisHandler distributable
[ https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125647#comment-14125647 ] Steve Molloy commented on SOLR-5480: [~claudenm] I just attached a version of the patch adapted for the 4.10 release. Tests are passing in my environment (Ubuntu 14.04, Oracle JDK 1.7.0_67). I will perform some more integration tests in our setup and will let you know if I see any issue. What were the failures you were seeing? Do you have logs/stack traces? > Make MoreLikeThisHandler distributable > -- > > Key: SOLR-5480 > URL: https://issues.apache.org/jira/browse/SOLR-5480 > Project: Solr > Issue Type: Improvement >Reporter: Steve Molloy >Assignee: Noble Paul > Attachments: SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch > > > The MoreLikeThis component, when used in the standard search handler supports > distributed searches. But the MoreLikeThisHandler itself doesn't, which > prevents from say, passing in text to perform the query. I'll start looking > into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone > has some work done already and want to share, or want to contribute, any help > will be welcomed.
[jira] [Updated] (SOLR-5480) Make MoreLikeThisHandler distributable
[ https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-5480: --- Attachment: SOLR-5480.patch Patch adapted for 4.10, unit tests pass. > Make MoreLikeThisHandler distributable > -- > > Key: SOLR-5480 > URL: https://issues.apache.org/jira/browse/SOLR-5480 > Project: Solr > Issue Type: Improvement >Reporter: Steve Molloy >Assignee: Noble Paul > Attachments: SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch > > > The MoreLikeThis component, when used in the standard search handler supports > distributed searches. But the MoreLikeThisHandler itself doesn't, which > prevents from say, passing in text to perform the query. I'll start looking > into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone > has some work done already and want to share, or want to contribute, any help > will be welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
[ https://issues.apache.org/jira/browse/SOLR-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089286#comment-14089286 ] Steve Molloy commented on SOLR-6311: I haven't had much time to look into an alternate solution yet, but on the point of default parameters: what is then the impact of having default parameters on the /select handler itself? Are there any limitations on the parameters that should be set as defaults in the /select handler? Even the example config comes with defaults and suggests that more can be added. > SearchHandler should use path when no qt or shard.qt parameter is specified > --- > > Key: SOLR-6311 > URL: https://issues.apache.org/jira/browse/SOLR-6311 > Project: Solr > Issue Type: Bug >Affects Versions: 4.9 >Reporter: Steve Molloy > Attachments: SOLR-6311.patch > > > When performing distributed searches, you have to specify shards.qt unless > you're on the default /select path for your handler. As this is configurable, > even the default search handler could be on another path. The shard requests > should thus default to the path if no shards.qt was specified. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
[ https://issues.apache.org/jira/browse/SOLR-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086237#comment-14086237 ] Steve Molloy commented on SOLR-6311: Thanks Yonik, hadn't thought of that. I'll think about alternatives for this as well. Our main usage of different handlers is for using different sets of components (the suggester in its own handler for autocomplete being an example), in which case it seems wrong to force the request to contain shards.qt for distributed searches when we're trying to hide the fact that it's distributed by using collections. > SearchHandler should use path when no qt or shard.qt parameter is specified > --- > > Key: SOLR-6311 > URL: https://issues.apache.org/jira/browse/SOLR-6311 > Project: Solr > Issue Type: Bug >Affects Versions: 4.9 >Reporter: Steve Molloy > Attachments: SOLR-6311.patch > > > When performing distributed searches, you have to specify shards.qt unless > you're on the default /select path for your handler. As this is configurable, > even the default search handler could be on another path. The shard requests > should thus default to the path if no shards.qt was specified.
[jira] [Updated] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
[ https://issues.apache.org/jira/browse/SOLR-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-6311: --- Attachment: SOLR-6311.patch This patch will use shards.qt if specified, default to qt if not, then default to path if both were omitted. > SearchHandler should use path when no qt or shard.qt parameter is specified > --- > > Key: SOLR-6311 > URL: https://issues.apache.org/jira/browse/SOLR-6311 > Project: Solr > Issue Type: Bug >Affects Versions: 4.9 >Reporter: Steve Molloy > Attachments: SOLR-6311.patch > > > When performing distributed searches, you have to specify shards.qt unless > you're on the default /select path for your handler. As this is configurable, > even the default search handler could be on another path. The shard requests > should thus default to the path if no shards.qt was specified. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
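The fallback order described for the SOLR-6311 patch (shards.qt, then qt, then the request path) could be sketched roughly as below. This is not the patch itself: the class and method names are hypothetical, and a plain Map stands in for Solr's request parameters, so it only illustrates the order of precedence.

```java
import java.util.Map;

// Hypothetical helper, not Solr's API: shows only the order of precedence.
final class ShardHandlerPathSketch {

    /**
     * Picks the handler that shard requests should be sent to.
     * params: request parameters (a plain map used here as a stand-in).
     * requestPath: path the original request came in on, e.g. "/suggest".
     */
    static String resolveShardHandler(Map<String, String> params, String requestPath) {
        String shardsQt = params.get("shards.qt");
        if (shardsQt != null && !shardsQt.isEmpty()) {
            return shardsQt;            // explicit shards.qt wins
        }
        String qt = params.get("qt");
        if (qt != null && !qt.isEmpty()) {
            return qt;                  // otherwise fall back to qt
        }
        return requestPath;             // both omitted: use the request path
    }
}
```

With an empty parameter map and a request on "/suggest", the sketch returns "/suggest", which is the behaviour the issue asks for when a handler is not on the default /select path.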
[jira] [Created] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
Steve Molloy created SOLR-6311: -- Summary: SearchHandler should use path when no qt or shard.qt parameter is specified Key: SOLR-6311 URL: https://issues.apache.org/jira/browse/SOLR-6311 Project: Solr Issue Type: Bug Affects Versions: 4.9 Reporter: Steve Molloy When performing distributed searches, you have to specify shards.qt unless you're on the default /select path for your handler. As this is configurable, even the default search handler could be on another path. The shard requests should thus default to the path if no shards.qt was specified. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6297) Distributed spellcheck with WordBreakSpellchecker can lose suggestions
Steve Molloy created SOLR-6297: -- Summary: Distributed spellcheck with WordBreakSpellchecker can lose suggestions Key: SOLR-6297 URL: https://issues.apache.org/jira/browse/SOLR-6297 Project: Solr Issue Type: Bug Affects Versions: 4.9 Reporter: Steve Molloy When performing a spellcheck request in distributed environment with the WordBreakSpellChecker configured, the shard response merging logic can lose some suggestions. Basically, the merging logic ensures that all shards marked the query as not being correctly spelled, which is good, but also expects all shards to return some suggestions, which isn't necessarily the case. So if shard 1 returns 10 suggestions but shard 2 returns none, the final result will contain no suggestions because the term has suggestions from only 1 of 2 shards. This isn't the case with the DirectSolrSpellChecker which works properly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
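The merging problem reported in SOLR-6297 can be illustrated with a deliberately simplified model. The sketch below is not the actual spellcheck merge code: the class and method names are hypothetical, and each shard's response is reduced to a list of suggestion strings. It contrasts a strict merge, which discards everything when any shard has no suggestions, with a lenient merge that keeps whatever the shards returned.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of shard merging; not Solr's implementation.
final class SpellcheckMergeSketch {

    /** Strict merge: returns nothing unless every shard supplied suggestions. */
    static List<String> mergeStrict(List<List<String>> perShard) {
        for (List<String> shardSuggestions : perShard) {
            if (shardSuggestions.isEmpty()) {
                return new ArrayList<>();   // one empty shard loses everything
            }
        }
        return mergeLenient(perShard);
    }

    /** Lenient merge: union of all suggestions, first-seen order, no duplicates. */
    static List<String> mergeLenient(List<List<String>> perShard) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> shardSuggestions : perShard) {
            merged.addAll(shardSuggestions);
        }
        return new ArrayList<>(merged);
    }
}
```

In the scenario from the report, where shard 1 returns suggestions and shard 2 returns none, mergeStrict yields an empty list while mergeLenient keeps shard 1's suggestions.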
[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser
[ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074360#comment-14074360 ] Steve Molloy commented on SOLR-6248: In this case it cannot replace the current MoreLikeThisHandler implementation which can analyze incoming text (as opposed to searching for a matching document in the index) in order to find similar documents in the index. Being able to query by unique field and returning similar documents is already covered by the MoreLikeThisComponent if you use rows=1 to get a single document and its set of similar ones. The use case that forces the MoreLikeThisHandler currently (at least that I know of) is really this on-the-fly analysis of text that is nowhere in the index. > MoreLikeThis Query Parser > - > > Key: SOLR-6248 > URL: https://issues.apache.org/jira/browse/SOLR-6248 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta > Attachments: SOLR-6248.patch > > > MLT Component doesn't let people highlight/paginate and the handler comes > with an cost of maintaining another piece in the config. Also, any changes to > the default (number of results to be fetched etc.) /select handler need to be > copied/synced with this handler too. > Having an MLT QParser would let users get back docs based on a query for them > to paginate, highlight etc. It would also give them the flexibility to use > this anywhere i.e. q,fq,bq etc. > A bit of history about MLT (thanks to Hoss) > MLT Handler pre-dates the existence of QParsers and was meant to take an > arbitrary query as input, find docs that match that > query, club them together to find interesting terms, and then use those > terms as if they were my main query to generate a main result set. > This result would then be used as the set to facet, highlight etc. 
> The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList\(y) > The MLT component on the other hand solved a very different purpose of > augmenting the main result set. It is used to get similar docs for each of > the doc in the main result set. > DocSet\(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m) > The new approach: > All of this can be done better and cleaner (and makes more sense too) using > an MLT QParser. > An important thing to handle here is the case where the user doesn't have > TermVectors, in which case, it does what happens right now i.e. parsing > stored fields. > Also, in case the user doesn't have a field (to be used for MLT) indexed, the > field would need to be a TextField with an index analyzer defined. This > analyzer will then be used to extract terms for MLT. > In case of SolrCloud mode, '/get-termvectors' can be used after looking at > the schema (if TermVectors are enabled for the field). If not, a /get call > can be used to fetch the field and parse it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser
[ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073275#comment-14073275 ] Steve Molloy commented on SOLR-6248: I'd like to give this a spin, but looking at the attached patch, it's unclear how to pass in text. The parsers seem to be looking at "id" parameter, I haven't seen any reference to stream.body. What parameter would be used to pass in text to be analyzed and for which to return similar documents? > MoreLikeThis Query Parser > - > > Key: SOLR-6248 > URL: https://issues.apache.org/jira/browse/SOLR-6248 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta > Attachments: SOLR-6248.patch > > > MLT Component doesn't let people highlight/paginate and the handler comes > with an cost of maintaining another piece in the config. Also, any changes to > the default (number of results to be fetched etc.) /select handler need to be > copied/synced with this handler too. > Having an MLT QParser would let users get back docs based on a query for them > to paginate, highlight etc. It would also give them the flexibility to use > this anywhere i.e. q,fq,bq etc. > A bit of history about MLT (thanks to Hoss) > MLT Handler pre-dates the existence of QParsers and was meant to take an > arbitrary query as input, find docs that match that > query, club them together to find interesting terms, and then use those > terms as if they were my main query to generate a main result set. > This result would then be used as the set to facet, highlight etc. > The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList\(y) > The MLT component on the other hand solved a very different purpose of > augmenting the main result set. It is used to get similar docs for each of > the doc in the main result set. > DocSet\(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m) > The new approach: > All of this can be done better and cleaner (and makes more sense too) using > an MLT QParser. 
> An important thing to handle here is the case where the user doesn't have > TermVectors, in which case, it does what happens right now i.e. parsing > stored fields. > Also, in case the user doesn't have a field (to be used for MLT) indexed, the > field would need to be a TextField with an index analyzer defined. This > analyzer will then be used to extract terms for MLT. > In case of SolrCloud mode, '/get-termvectors' can be used after looking at > the schema (if TermVectors are enabled for the field). If not, a /get call > can be used to fetch the field and parse it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser
[ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071645#comment-14071645 ] Steve Molloy commented on SOLR-6248: I meant passing in text as parameter as opposed to finding it in the index. With current MLT handler (not component), you can pass it in as body or stream.body to get documents similar to the text you pass in. In our case, we use it to find documents in one collection similar to a document found in another, or to some text directly provided by user. So, I know that at some point the SearchHandler started rejecting search requests with stream body, which would prevent this unless it could be achieved in another way. That's why I'm asking. :) > MoreLikeThis Query Parser > - > > Key: SOLR-6248 > URL: https://issues.apache.org/jira/browse/SOLR-6248 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta > > MLT Component doesn't let people highlight/paginate and the handler comes > with an cost of maintaining another piece in the config. Also, any changes to > the default (number of results to be fetched etc.) /select handler need to be > copied/synced with this handler too. > Having an MLT QParser would let users get back docs based on a query for them > to paginate, highlight etc. It would also give them the flexibility to use > this anywhere i.e. q,fq,bq etc. > A bit of history about MLT (thanks to Hoss) > MLT Handler pre-dates the existence of QParsers and was meant to take an > arbitrary query as input, find docs that match that > query, club them together to find interesting terms, and then use those > terms as if they were my main query to generate a main result set. > This result would then be used as the set to facet, highlight etc. > The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList\(y) > The MLT component on the other hand solved a very different purpose of > augmenting the main result set. 
It is used to get similar docs for each of > the doc in the main result set. > DocSet\(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m) > The new approach: > All of this can be done better and cleaner (and makes more sense too) using > an MLT QParser. > An important thing to handle here is the case where the user doesn't have > TermVectors, in which case, it does what happens right now i.e. parsing > stored fields. > Also, in case the user doesn't have a field (to be used for MLT) indexed, the > field would need to be a TextField with an index analyzer defined. This > analyzer will then be used to extract terms for MLT. > In case of SolrCloud mode, '/get-termvectors' can be used after looking at > the schema (if TermVectors are enabled for the field). If not, a /get call > can be used to fetch the field and parse it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser
[ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068501#comment-14068501 ] Steve Molloy commented on SOLR-6248: Would that approach also support sending in text that isn't in the index? This is the main reason we're using the MLT handler, which we need to be distributed (thus SOLR-5480). But if we can have a single approach for both, I agree that not maintaining 2 configurations (and 2 handlers in the code) would be much better. Let me know if I can help out. > MoreLikeThis Query Parser > - > > Key: SOLR-6248 > URL: https://issues.apache.org/jira/browse/SOLR-6248 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta > > MLT Component doesn't let people highlight/paginate and the handler comes > with an cost of maintaining another piece in the config. Also, any changes to > the default (number of results to be fetched etc.) /select handler need to be > copied/synced with this handler too. > Having an MLT QParser would let users get back docs based on a query for them > to paginate, highlight etc. It would also give them the flexibility to use > this anywhere i.e. q,fq,bq etc. > A bit of history about MLT (thanks to Hoss) > MLT Handler pre-dates the existence of QParsers and was meant to take an > arbitrary query as input, find docs that match that > query, club them together to find interesting terms, and then use those > terms as if they were my main query to generate a main result set. > This result would then be used as the set to facet, highlight etc. > The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y) > The MLT component on the other hand solved a very different purpose of > augmenting the main result set. It is used to get similar docs for each of > the doc in the main result set. 
> DocSet\(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m) > The new approach: > All of this can be done better and cleaner (and makes more sense too) using > an MLT QParser. > An important thing to handle here is the case where the user doesn't have > TermVectors, in which case, it does what happens right now i.e. parsing > stored fields. > Also, in case the user doesn't have a field (to be used for MLT) indexed, the > field would need to be a TextField with an index analyzer defined. This > analyzer will then be used to extract terms for MLT. > In case of SolrCloud mode, '/get-termvectors' can be used after looking at > the schema (if TermVectors are enabled for the field). If not, a /get call > can be used to fetch the field and parse it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056463#comment-14056463 ] Steve Molloy commented on SOLR-2894: Quick note on PivotFacetHelper's retrieve method. I understand the desire for good performance and more than agree with it. But with some entries being optional (statistics and qcount from SOLR-3583 and SOLR-4212 for instance), the start index causes the lookup to start after the proper position, thus not finding entries that are there. I don't have a better solution than starting from 0 currently, but I'm sure there's something that can be done to keep at least some of the speed improvement while still being able to support optional entries. Maybe force all optional entries to the end of the list, look up the required ones (field, value, count) by index, and start at the first optional spot for the rest? > Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Hoss Man > Fix For: 4.9, 5.0 > > Attachments: SOLR-2894-mincount-minification.patch, > SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894_cloud_test.patch, dateToObject.patch, pivot_mincount_problem.sh > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. 
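One way to keep some of the speed of a start-index lookup while tolerating optional entries is sketched below. This is only an illustration under stated assumptions, not the actual PivotFacetHelper code: the class HintedLookup and its indexOf method are hypothetical, and a plain List of entry names stands in for the pivot entry list. The start index is treated as a hint, and the skipped prefix is scanned only if the forward scan misses.

```java
import java.util.List;

/** Hypothetical hinted lookup; a stand-in for illustration, not PivotFacetHelper. */
final class HintedLookup {
    private HintedLookup() {}

    /**
     * Returns the index of name in names, or -1 if absent.
     * Scans forward from hintIndex first (the fast path when the hint is right),
     * then falls back to the entries before the hint, so an entry that sits
     * earlier than expected because optional entries are missing is still found.
     */
    static int indexOf(List<String> names, String name, int hintIndex) {
        int size = names.size();
        // Clamp the hint so an out-of-range value cannot cause an exception.
        int start = Math.max(0, Math.min(hintIndex, size));
        for (int i = start; i < size; i++) {
            if (name.equals(names.get(i))) {
                return i;
            }
        }
        // Fallback: the hint overshot, so check the part that was skipped.
        for (int i = 0; i < start; i++) {
            if (name.equals(names.get(i))) {
                return i;
            }
        }
        return -1;
    }
}
```

With names [field, value, count, pivot] and a hint of 5 (the position "pivot" would have if two optional entries were present), the fallback still finds "pivot" at index 3. When the hint is correct, the cost is the same as the plain start-index lookup.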
[jira] [Commented] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets
[ https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056453#comment-14056453 ] Steve Molloy commented on SOLR-3583: Noted. The issue is with entries that may or may not be in the response, like statistics or qcount in my case, which change the positions. I'll comment on SOLR-2894. Thanks for the explanation. > Percentiles for facets, pivot facets, and distributed pivot facets > -- > > Key: SOLR-3583 > URL: https://issues.apache.org/jira/browse/SOLR-3583 > Project: Solr > Issue Type: Improvement >Reporter: Chris Russell >Priority: Minor > Labels: newbie, patch > Fix For: 4.9, 5.0 > > Attachments: SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, > SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch > > > Built on top of SOLR-2894, this patch adds percentiles and averages to > facets, pivot facets, and distributed pivot facets by making use of range > facet internals.
[jira] [Commented] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets
[ https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056327#comment-14056327 ] Steve Molloy commented on SOLR-3583: Found my issue: it was in PivotFacetHelper's retrieve method, which specifies a start index that, for some reason in my case, was after the entry. I've also applied SOLR-4212, so maybe that's the reason; I will look into it. But how much performance improvement is there with specifying the start index for that lookup? It seems error-prone, as the list is getting populated elsewhere and no real control over the order is imposed. > Percentiles for facets, pivot facets, and distributed pivot facets > -- > > Key: SOLR-3583 > URL: https://issues.apache.org/jira/browse/SOLR-3583 > Project: Solr > Issue Type: Improvement >Reporter: Chris Russell >Priority: Minor > Labels: newbie, patch > Fix For: 4.9, 5.0 > > Attachments: SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, > SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch > > > Built on top of SOLR-2894, this patch adds percentiles and averages to > facets, pivot facets, and distributed pivot facets by making use of range > facet internals.
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: (was: mylyn-context.zip) > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, 5.0 > > Attachments: SOLR-4212.patch, SOLR-4212.patch, patch-4212.txt > > > Facet pivots provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with a gradient, the proportion of the square that fits > the user's choices. > The proposal is to add a facet.pivot.q parameter that would allow > specifying a query (per field) that would be intersected with the DocSet used to > calculate the pivot count, with the result stored in a separate q-count.
[jira] [Updated] (SOLR-4212) Support for facet pivot query for filtered count
[ https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Molloy updated SOLR-4212: --- Attachment: mylyn-context.zip > Support for facet pivot query for filtered count > > > Key: SOLR-4212 > URL: https://issues.apache.org/jira/browse/SOLR-4212 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Steve Molloy > Fix For: 4.9, 5.0 > > Attachments: SOLR-4212.patch, SOLR-4212.patch, mylyn-context.zip, > patch-4212.txt > > > Facet pivots provide hierarchical support for computing data used to populate > a treemap or similar visualization. TreeMaps usually offer users extra > information by applying an overlay color on top of the existing square sizes > based on hierarchical counts. This second count is based on user choices, > representing, usually with a gradient, the proportion of the square that fits > the user's choices. > The proposal is to add a facet.pivot.q parameter that would allow > specifying a query (per field) that would be intersected with the DocSet used to > calculate the pivot count, with the result stored in a separate q-count.
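The q-count described in SOLR-4212 amounts to the size of an intersection between two document sets. A minimal sketch of that arithmetic follows, assuming java.util.BitSet as a stand-in for Solr's DocSet; the class PivotQCount and its method are hypothetical and are not taken from the attached patches.

```java
import java.util.BitSet;

/** Hypothetical illustration of a pivot q-count; BitSet stands in for a Solr DocSet. */
final class PivotQCount {
    private PivotQCount() {}

    /**
     * Counts the documents that are both in the pivot bucket and matched by the
     * facet.pivot.q query. Neither input is modified.
     */
    static int qCount(BitSet pivotDocs, BitSet queryDocs) {
        BitSet both = (BitSet) pivotDocs.clone(); // copy so the caller's set stays intact
        both.and(queryDocs);                      // set intersection
        return both.cardinality();
    }
}
```

For a pivot bucket holding docs {1, 2, 3, 5} and a query matching {2, 5, 8}, the pivot count is 4 and the q-count is 2, so a treemap overlay could shade half of that square.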