[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671200#comment-13671200 ] Dawid Weiss commented on SOLR-4787:
---
Oh, one more thing -- Colt is no longer maintained and there were a number of bugs in it. These were fixed when Colt was ported to Apache Mahout; those classes are now part of Mahout Math. I'd still recommend using fastutil or HPPC since these will be faster (by an inch, but always).

> Join Contrib
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 4.2.1
> Reporter: Joel Bernstein
> Priority: Minor
> Fix For: 4.2.1
>
> Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch
>
> This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.2.1 tag. Because of changes in the FieldCache API this patch will only build with Solr 4.2 or above.
>
> *PostFilterJoinQParserPlugin aka "pjoin"*
>
> The pjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin, but the implementation differs in a couple of important ways.
>
> The first difference is that the pjoin is designed to work with integer join keys only. So, in order to use pjoin, integer join keys must be included in both the "to" and "from" cores.
>
> The second difference is that the pjoin builds memory structures that are used to quickly connect the join keys. It also uses a custom SolrCache named "join" to hold intermediate DocSets which are needed to build the join memory structures. So, the pjoin will need more memory than the JoinQParserPlugin to perform the join.
>
> The main advantage of the pjoin is that it can scale to join millions of keys between cores. Because it's a PostFilter, it only needs to join records that match the main query.
>
> The syntax of the pjoin is the same as the JoinQParserPlugin except that the plugin is referenced by the string "pjoin" rather than "join".
>
> fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
>
> The example filter query above will search the fromCore (collection2) for "user:customer1". This query will generate a list of values from the "from" field that will be used to filter the main query. Only records from the main query where the "to" field is present in the "from" list will be included in the results.
>
> The solrconfig.xml in the main query core must contain the reference to the pjoin:
>
> <queryParser name="pjoin" class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
>
> And the join contrib jars must be registered in the solrconfig.xml.
>
> The solrconfig.xml in the fromCore must have the "join" SolrCache configured:
>
> <cache name="join" class="solr.LRUCache" size="4096" initialSize="1024"/>
>
> *JoinValueSourceParserPlugin aka vjoin*
>
> The second implementation is the JoinValueSourceParserPlugin aka "vjoin". This implements a ValueSource function query that can return values from a second core based on join keys. This allows relevance data to be stored in a separate core and then joined in the main query.
>
> The vjoin is called using the "vjoin" function query. For example:
>
> bf=vjoin(fromCore, fromKey, fromVal, toKey)
>
> This example shows "vjoin" being called by the edismax boost function parameter. This example will return the "fromVal" from the "fromCore". The "fromKey" and "toKey" are used to link the records from the main query to the records in the "fromCore".
>
> As with the "pjoin", both the fromKey and toKey must be integers. Also like the pjoin, the "join" SolrCache is used to hold the join memory structures.
>
> To configure the vjoin you must register the ValueSource plugin in the solrconfig.xml as follows:
>
> <valueSourceParser name="vjoin" class="org.apache.solr.joins.JoinValueSourceParserPlugin"/>
>
> *JoinValueSourceParserPlugin2 aka vjoin2 aka Personalized ValueSource Join*
>
> vjoin2 supports "personalized" ValueSource joins. The syntax is similar to vjoin but adds an extra parameter so a query can be specified to join a specific record set from the fromCore. This is designed to allow customer-specific relevance information to be added to the fromCore and then joined at query time.
>
> Syntax:
>
> bf=vjoin2(fromCore,fromKey,fromVal,toKey,query)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
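The fastutil/HPPC recommendation above is about primitive-keyed collections: the pjoin's join memory structures map millions of integer keys, and a boxed java.util.HashMap<Integer, Integer> pays an object per entry there. A minimal sketch of the kind of structure those libraries provide (illustrative only -- not Solr, fastutil, or HPPC code; it assumes non-negative keys so -1 can mark empty slots):

```java
import java.util.Arrays;

// Sketch of a primitive int->int open-addressing hash map, the style of
// structure fastutil/HPPC provide. Illustrative only; assumes keys >= 0
// so that -1 can serve as the empty-slot marker.
class IntIntMap {
    private int[] keys;
    private int[] vals;
    private int size;

    IntIntMap() {
        keys = new int[16];              // capacity kept a power of two for masking
        vals = new int[16];
        Arrays.fill(keys, -1);
    }

    // Linear probing: walk from the hashed slot until we hit the key or a hole.
    private int indexOf(int key) {
        int mask = keys.length - 1;
        int i = (key * 0x9E3779B1) & mask;   // cheap multiplicative hash
        while (keys[i] != -1 && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    void put(int key, int value) {
        if (size * 2 >= keys.length) {       // keep load factor under 0.5
            grow();
        }
        int i = indexOf(key);
        if (keys[i] == -1) {
            keys[i] = key;
            size++;
        }
        vals[i] = value;
    }

    // Returns `missing` when the key is absent; no boxed Integer anywhere.
    int get(int key, int missing) {
        int i = indexOf(key);
        return keys[i] == key ? vals[i] : missing;
    }

    private void grow() {
        int[] oldKeys = keys, oldVals = vals;
        keys = new int[oldKeys.length * 2];
        vals = new int[oldKeys.length * 2];
        Arrays.fill(keys, -1);
        size = 0;                            // put() below re-counts entries
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != -1) {
                put(oldKeys[i], oldVals[i]);
            }
        }
    }
}
```

Real implementations (e.g. fastutil's Int2IntOpenHashMap) add removal, negative-key support, and tuned rehash strategies; the point is the flat int[] storage with no per-entry objects, which is what makes integer join keys cheap at the millions-of-keys scale described above.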
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671199#comment-13671199 ] Dawid Weiss commented on SOLR-4787:
---
Pull a class or two in source code form from fastutil or from HPPC. These are nearly identical these days; fastutil has support for Java collections interfaces (HPPC has its own API not stemming from JUC). Both of these are equally fast.
[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671195#comment-13671195 ] Shawn Heisey commented on SOLR-4715:

After a little bit of thought, I'm thinking the reason the ResponseParser object is final is so that there are no thread visibility problems, because it can't ever change. The following ideas would require removing that final modifier and adding an object for a shared RequestWriter.

For CloudSolrServer: if no requests have been processed yet, then the LBHttpSolrServer object will not yet have any internal HttpSolrServer objects, so passing through setParser and setRequestWriter calls should be perfectly safe. We can block these methods once the first request gets processed, or we can just pass them through and rely on the following:

For LBHttpSolrServer, we can do one of three things with setParser and setRequestWriter if there are any ServerWrapper objects (and therefore HttpSolrServer objects):

1) Throw an exception.
2) Ignore the request.
3) Make the requested change on all HttpSolrServer objects.

> CloudSolrServer does not provide support for setting underlying server properties
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.3
> Reporter: Hardik Upadhyay
> Assignee: Shawn Heisey
> Labels: solr, solrj
>
> CloudSolrServer (and LBHttpSolrServer) do not allow the user to set underlying HttpSolrServer and HttpClient settings.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
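Option 3 above can be sketched as follows. These are simplified stand-in types, not the real LBHttpSolrServer/HttpSolrServer API; the sketch just shows a shared, no-longer-final setting being pushed through to every inner server, existing and future:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch of option 3: a load balancer whose parser setting is applied to
// all inner servers created so far AND inherited by servers created later.
// Simplified stand-ins; not the actual SolrJ classes or signatures.
class LoadBalancer {
    static class InnerServer {
        volatile String parser;

        InnerServer(String parser) {
            this.parser = parser;
        }
    }

    private final List<InnerServer> servers = new CopyOnWriteArrayList<>();
    private volatile String parser = "javabin";   // shared default, illustrative

    // Inner servers are created lazily, as the load balancer does on first
    // request; each one inherits the shared setting at creation time.
    InnerServer addServer() {
        InnerServer s = new InnerServer(parser);
        servers.add(s);
        return s;
    }

    // Option 3: change the shared setting and apply it to all existing
    // servers, so a setParser() call after the first request still takes
    // effect everywhere instead of throwing or being ignored.
    void setParser(String parser) {
        this.parser = parser;
        for (InnerServer s : servers) {
            s.parser = parser;
        }
    }
}
```

The volatile fields and CopyOnWriteArrayList stand in for the thread-visibility guarantee the final modifier was providing: once final is removed, some other publication mechanism has to take its place.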
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671182#comment-13671182 ] Shawn Heisey commented on SOLR-4816:

[~joel.bernstein] I was looking into how you switched to the binary writer so I could develop a patch for SOLR-4715. You've got it creating new writer and parser objects for every HttpSolrServer. Shouldn't there be one instance of each? The existing LBHttpSolrServer class shares one parser object for all of the inner HttpSolrServer objects.

I'm struggling a bit on my patch, but if I can find a way to do it, there is some overlap with this issue.

> Add document routing to CloudSolrServer
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 4.3
> Reporter: Joel Bernstein
> Assignee: Mark Miller
> Priority: Minor
> Fix For: 5.0, 4.4
>
> Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
>
> 1) Document routing: Updates are routed directly to the correct shard leader, eliminating document routing at the server.
> 2) Parallel update execution: Updates for each shard are executed in a separate thread so parallel indexing can occur across the cluster.
> 3) Javabin transport: Update requests are sent via javabin transport.
>
> These enhancements should allow for near-linear scalability on indexing throughput.
> Usage:
>
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
>
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField("id", "0");
> doc1.addField("a_t", "hello1");
>
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField("id", "2");
> doc2.addField("a_t", "hello2");
>
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
>
> NamedList response = cloudClient.request(request); // Returns a backwards-compatible condensed response.
>
> // To get a more detailed response, downcast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;
> NamedList responses = rr.getRouteResponse();
[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671181#comment-13671181 ] Shawn Heisey commented on SOLR-4715:

I've run into a challenge in creating a patch for this issue. The response parser object in LBHttpSolrServer is final. If there's a really good reason for this object to be final, then creating setParser and setRequestWriter methods could be really challenging.
[jira] [Commented] (LUCENE-4569) Allow customization of column stride field and norms via indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671164#comment-13671164 ] John Wang commented on LUCENE-4569:
---
Hey Simon: Was wondering if you had a chance to look at this.

Thanks

-John

> Allow customization of column stride field and norms via indexing chain
>
> Key: LUCENE-4569
> URL: https://issues.apache.org/jira/browse/LUCENE-4569
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 4.0
> Reporter: John Wang
> Assignee: Simon Willnauer
> Attachments: patch.diff
>
> We are building an in-memory indexing format and managing our own segments. We are doing this by implementing a custom IndexingChain. We would like to support column-stride fields and norms without having to wire in a codec (since we are managing our postings differently).
>
> The suggested change is consistent with the API support for passing in a custom InvertedDocConsumer.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671104#comment-13671104 ] David Smiley commented on SOLR-4787:

I suggest either [FastUtil|http://fastutil.dsi.unimi.it] or the similar [HPPC|http://labs.carrotsearch.com/hppc.html] (by [~dawidweiss] here at the ASF). For a single class it may make sense to copy it in source form. That kinda makes me cringe, but for just one source file, and for something that is externally tested and unlikely to have an unknown bug, I think it's fine.
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #866: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/866/

1 tests failed.

REGRESSION: org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
shard1 is not consistent. Got 305 from http://127.0.0.1:25787/collection1lastClient and got 5 from http://127.0.0.1:25791/collection1

Stack Trace:
java.lang.AssertionError: shard1 is not consistent. Got 305 from http://127.0.0.1:25787/collection1lastClient and got 5 from http://127.0.0.1:25791/collection1
        at __randomizedtesting.SeedInfo.seed([5C02485FF3EE2B29:DDE4C64784B14B15]:0)
        at org.junit.Assert.fail(Assert.java:93)
        at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:963)
        at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:238)

Build Log:
[...truncated 24265 lines...]
[jira] [Commented] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory
[ https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671056#comment-13671056 ] Hoss Man commented on SOLR-4858:

bq. It seems that this test started failing after the following commit:

Hmmm... git bisect? What confuses me is that r1457641 seems to have been undone by r1457647? ... but I guess maybe r1457641 broke it, and then subsequent commits kept it broken even when r1457647 reverted that specific change? (It's totally possible that the other changes in r1457647 are problematic here, since SOLR-4604 in general is about updateLog and core reload.)

> updateLog + core reload + deleteByQuery = leaked directory
>
> Key: SOLR-4858
> URL: https://issues.apache.org/jira/browse/SOLR-4858
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.2.1
> Reporter: Hoss Man
> Fix For: 4.3.1
>
> Attachments: SOLR-4858.patch, SOLR-4858.patch, SOLR-4858.patch
>
> I haven't been able to make sense of this yet, but trying to track down another bug led me to discover that the following combination leads to problems...
>
> * updateLog enabled
> * do a core reload
> * do a delete by query \*:\*
>
> ...leave out any one of the three, and everything works fine.
[jira] [Updated] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory
[ https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4858:
---
Attachment: SOLR-4858.patch

A much larger patch...

I initially found this bug because of a weird failure in a test I have in a dependent project, and it took me longer than I would have liked to reproduce in a Solr test because I didn't realize it was caused by using the updateLog, and I didn't realize how few Solr tests take advantage of the updateLog. So with that in mind, it seemed to me like we should probably increase the test coverage of the updateLog to see if there are any more situations that tickle bugs besides this odd edge case of reload+deleteByQuery.

So in this updated patch...

* same TestReloadAndDeleteDocs as before
* the test solrconfig.xml now defaults to using the updateLog
* SolrTestCaseJ4 uses randomization to occasionally disable the update log with a sys property
* there is currently a nocommit in SolrTestCaseJ4 forcing the sys prop to always be true
* any tests using solrconfig.xml that have an explicit need to use/not-use updateLog override the sysprop explicitly
* a few schema files that did not have _version_ fields are updated to include them

...this still only scratches the surface of increasing the test coverage for the UpdateLog, but it already exposes a reproducible failure in AutoCommitTest with the same symptoms as my TestReloadAndDeleteDocs:

* ERROR Timeout waiting for all directory ref counts...
* searcher leak.
(I have not yet narrowed down which method in AutoCommitTest the dir factory ref count is lost in.)
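The sysprop-based randomization in the patch notes above might look like this sketch. The property name "tests.updatelog" and the 1-in-10 ratio are invented for illustration; this is not the actual SolrTestCaseJ4 logic:

```java
import java.util.Random;

// Sketch of randomized test configuration via a system property: an explicit
// sysprop wins (the "nocommit forcing the sys prop to always be true" case),
// otherwise a seeded Random occasionally disables the update log. The
// property name "tests.updatelog" is hypothetical.
class UpdateLogToggle {
    static boolean enableUpdateLog(long seed) {
        String forced = System.getProperty("tests.updatelog");
        if (forced != null) {
            return Boolean.parseBoolean(forced);   // explicit override wins
        }
        // Seeding from the test run's seed keeps a failing run reproducible.
        return new Random(seed).nextInt(10) != 0;  // occasionally disable it
    }
}
```

The key property of this pattern is that a test failure can always be replayed: the toggle is a pure function of the run's seed plus any explicit override, so the same seed reproduces the same configuration.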
[jira] [Commented] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory
[ https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671045#comment-13671045 ] Alexey Serba commented on SOLR-4858:

{noformat}
# this will cause a searcher leak because the directory failed to close
ant test -Dtestcase=TestReloadAndDeleteDocs -Dtests.method=testReloadAndDeleteDocsWithUpdateLog
{noformat}

It seems that this test started failing after the following commit:

{noformat}
0226c616297c84196753f0989b45471b59c7c09a is the first bad commit
commit 0226c616297c84196753f0989b45471b59c7c09a
Author: Mark Robert Miller
Date: Mon Mar 18 04:51:18 2013 +0000

    SOLR-4604: SolrCore is not using the UpdateHandler that is passed to it in SolrCore#reload.

    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1457641 13f79535-47bb-0310-9956-ffa450edef68
{noformat}

https://github.com/apache/lucene-solr/commit/0226c616297c84196753f0989b45471b59c7c09a
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671031#comment-13671031 ] Kranti Parisa commented on SOLR-4787:

I have been using the Trove lib as well. Along with Colt, the following look interesting too:

http://javolution.org/core-java/target/apidocs/javolution/util/FastMap.html
https://code.google.com/p/guava-libraries/
> The main advantage of the pjoin is that it can scale to join millions of keys > between cores. > Because it's a PostFilter, it only needs to join records that match the main > query. > The syntax of the pjoin is the same as the JoinQParserPlugin except that the > plugin is referenced by the string "pjoin" rather then "join". > fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1 > The example filter query above will search the fromCore (collection2) for > "user:customer1". This query will generate a list of values from the "from" > field that will be used to filter the main query. Only records from the main > query, where the "to" field is present in the "from" list will be included in > the results. > The solrconfig.xml in the main query core must contain the reference to the > pjoin. > class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/> > And the join contrib jars must be registed in the solrconfig.xml. > > The solrconfig.xml in the fromcore must have the "join" SolrCache configured. > class="solr.LRUCache" > size="4096" > initialSize="1024" > /> > *JoinValueSourceParserPlugin aka vjoin* > The second implementation is the JoinValueSourceParserPlugin aka "vjoin". > This implements a ValueSource function query that can return values from a > second core based on join keys. This allows relevance data to be stored in a > separate core and then joined in the main query. > The vjoin is called using the "vjoin" function query. For example: > bf=vjoin(fromCore, fromKey, fromVal, toKey) > This example shows "vjoin" being called by the edismax boost function > parameter. This example will return the "fromVal" from the "fromCore". The > "fromKey" and "toKey" are used to link the records from the main query to the > records in the "fromCore". > As with the "pjoin", both the fromKey and toKey must be integers. Also like > the pjoin, the "join" SolrCache is used to hold the join memory structures. 
> To configure the vjoin you must register the ValueSource plugin in the > solrconfig.xml as follows: > <valueSourceParser name="vjoin" class="org.apache.solr.joins.JoinValueSourceParserPlugin"/> > *JoinValueSourceParserPlugin2 aka vjoin2 aka Personalized ValueSource Join* > vjoin2 supports "personalized" ValueSource joins. The syntax is similar to > vjoin but adds an extra parameter so a query can be specified to join a > specific record set from the fromCore. This is designed to allow customer-specific relevance information to be added to the fromCore and then joined at > query time. > Syntax: > bf=vjoin2(fromCore,fromKey,fromVal,toKey,query) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
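Conceptually, the pjoin described above boils down to: run the fromCore query once, collect its integer "from" keys into a set, then admit only main-query documents whose "to" key is in that set. A minimal framework-free sketch of that membership test (all class and variable names here are hypothetical, not the patch's actual code):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the pjoin idea: collect the "from" keys once,
// then admit only main-query docs whose "to" key appears in that set.
public class PJoinSketch {
    public static void main(String[] args) {
        // Keys produced by the fromCore query (e.g. user:customer1).
        Set<Integer> fromKeys = new HashSet<>();
        fromKeys.add(7);
        fromKeys.add(42);

        // "to" keys of documents matched by the main query.
        int[] mainQueryToKeys = {3, 7, 42, 99};

        for (int toKey : mainQueryToKeys) {
            if (fromKeys.contains(toKey)) {   // the post-filter membership test
                System.out.println("keep doc with to key " + toKey);
            }
        }
    }
}
```

Because the filter runs as a PostFilter, this test is only applied to documents that already matched the main query, which is why the approach scales to millions of keys.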
[jira] [Commented] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory
[ https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670980#comment-13670980 ] Yonik Seeley commented on SOLR-4858: It appears any deleteByQuery will cause this (the MatchAllDocuments deleteByQuery actually has special handling in Solr, so it's an important distinction). > updateLog + core reload + deleteByQuery = leaked directory > -- > > Key: SOLR-4858 > URL: https://issues.apache.org/jira/browse/SOLR-4858 > Project: Solr > Issue Type: Bug >Affects Versions: 4.2.1 >Reporter: Hoss Man > Fix For: 4.3.1 > > Attachments: SOLR-4858.patch, SOLR-4858.patch > > > I haven't been able to make sense of this yet, but trying to track down > another bug led me to discover that the following combination leads to > problems... > * updateLog enabled > * do a core reload > * do a delete by query \*:\* > ...leave out any one of the three, and everything works fine.
[jira] [Updated] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory
[ https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4858: --- Attachment: SOLR-4858.patch Simplified test case. I removed the randomness and replaced it with two distinct methods testing the simple sequence of events with and without the update log enabled...
{noformat}
# this will pass
ant test -Dtestcase=TestReloadAndDeleteDocs -Dtests.method=testReloadAndDeleteDocsNoUpdateLog
# this will cause a searcher leak because the directory failed to close
ant test -Dtestcase=TestReloadAndDeleteDocs -Dtests.method=testReloadAndDeleteDocsWithUpdateLog
{noformat}
[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.6.0_45) - Build # 2845 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/2845/ Java: 64bit/jdk1.6.0_45 -XX:-UseCompressedOops -XX:+UseSerialGC 1 tests failed. REGRESSION: org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([1DBD509CF6934D19:F4E7EBA4680ADDB1]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:525) at org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir(TestArbitraryIndexDir.java:126) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=*[count(//doc)=1] xml response was: 01 request was:start=0&q=id:2&qt=standard&rows=20&versi
[jira] [Commented] (SOLR-4882) Restrict SolrResourceLoader to only classloader accessible files and instance dir
[ https://issues.apache.org/jira/browse/SOLR-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670798#comment-13670798 ] Hoss Man commented on SOLR-4882: bq. ... In Lucene 5.0 we should not support this anymore. FWIW: it's not hard to imagine situations where people have a legitimate desire for using absolute paths like this, i.e. loading synonyms or stop words from some central location outside of their solr home dir (e.g. /etc/solr-common/stopwords/en.txt, used by multiple solr instances, with diff solr home dirs, running on diff ports). With that in mind, I don't think it makes sense to completely remove this ability -- but it certainly makes sense to disable it by default and document the risks. bq. In 4.4 we should add a solrconfig.xml setting to enable the old behaviour, but disable it by default... Given the lifecycle of the resource loaders, it may not be easy to have this configuration per-core in solrconfig.xml. I'm also not sure if it's worth adding as a solr.xml config option given the complexities in how that file is persisted after core operations (and how many times we've screwed ourselves adding things to that file). Given that this is something (I think) we should generally discourage, and something that I don't think we should be shy about making "hard" to turn on, it might be enough just to say that the only way you can enable it is with an explicit (and scarily named) system property that affects the entire Solr instance? > Restrict SolrResourceLoader to only classloader accessible files and instance > dir > - > > Key: SOLR-4882 > URL: https://issues.apache.org/jira/browse/SOLR-4882 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.3 >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 5.0, 4.4 > > > SolrResourceLoader currently allows loading files from any > absolute/CWD-relative path, which is used as a fallback if the resource > cannot be looked up via the class loader. 
> We should limit this fallback to sub-dirs below the instanceDir passed into > the ctor. The CWD special case should be removed, too (the virtual CWD is > the instance's config or root dir). > The reason for this is security related. Some Solr components allow resource paths to be passed > in via REST parameters (e.g. XSL stylesheets, ...) and load them via the resource loader. Restricting the loader makes it possible to > prevent loading e.g. /etc/passwd as a stylesheet. > In 4.4 we should add a solrconfig.xml setting to enable the old behaviour, > but disable it by default, in case your existing installation requires files > from outside the instance dir which are not available via the URLClassLoader > used internally. In Lucene 5.0 we should not support this anymore.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670679#comment-13670679 ] Joel Bernstein commented on SOLR-4787: -- Colt looks promising and it's under the CERN license, which is very permissive. I'll test it out.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670678#comment-13670678 ] Joel Bernstein commented on SOLR-4787: -- I'd like to switch this to a hash join rather than using the binary search anyway. For longs it would be great to use a HashMap that works with primitive keys, like Trove. Trove is LGPL, I believe, so I don't think we can use it though. I'll look around and see if I can find another library that does what Trove does. Let me know if you know of another one or you've got an implementation lying around.
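The primitive-key structure Joel is after (what Trove, and later fastutil and HPPC, provide) avoids boxing by storing long keys directly in an array with open addressing. A toy sketch of the idea, under the assumptions that keys are never 0 (0 marks an empty slot) and that no resizing is needed -- an illustration, not a drop-in replacement:

```java
// Toy open-addressing hash set for primitive long keys, the kind of
// structure Trove/fastutil/HPPC provide without boxing.
// Assumes keys != 0 (0 marks an empty slot) and never resizes.
public class LongHashSetSketch {
    private final long[] slots;

    public LongHashSetSketch(int expectedKeys) {
        // Power-of-two capacity with slack, so we can mask instead of mod.
        int cap = Integer.highestOneBit(Math.max(16, expectedKeys * 2 - 1)) << 1;
        slots = new long[cap];
    }

    private int index(long key) {
        long h = key * 0x9E3779B97F4A7C15L;         // cheap bit mix
        return (int) (h ^ (h >>> 32)) & (slots.length - 1);
    }

    public void add(long key) {
        int i = index(key);
        while (slots[i] != 0 && slots[i] != key) {
            i = (i + 1) & (slots.length - 1);        // linear probing
        }
        slots[i] = key;
    }

    public boolean contains(long key) {
        int i = index(key);
        while (slots[i] != 0) {
            if (slots[i] == key) return true;
            i = (i + 1) & (slots.length - 1);
        }
        return false;                                // hit an empty slot
    }

    public static void main(String[] args) {
        LongHashSetSketch set = new LongHashSetSketch(8);
        set.add(123456789L);
        set.add(42L);
        System.out.println(set.contains(42L));       // true
        System.out.println(set.contains(7L));        // false
    }
}
```

A real implementation would additionally handle the zero key, resize on load factor, and expose a map variant for key-to-value joins.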
[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670668#comment-13670668 ] Shawn Heisey commented on SOLR-4715: I have tried some minimal testing with this code for setting the response parser and HttpClient params, and it appears to work. > CloudSolrServer does not provide support for setting underlying server > properties > - > > Key: SOLR-4715 > URL: https://issues.apache.org/jira/browse/SOLR-4715 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3 >Reporter: Hardik Upadhyay >Assignee: Shawn Heisey > Labels: solr, solrj > > CloudSolrServer (and LBHttpSolrServer) do not allow the user to set > underlying HttpSolrServer and HttpClient settings.
[jira] [Created] (SOLR-4882) Restrict SolrResourceLoader to only classloader accessible files and instance dir
Uwe Schindler created SOLR-4882: --- Summary: Restrict SolrResourceLoader to only classloader accessible files and instance dir Key: SOLR-4882 URL: https://issues.apache.org/jira/browse/SOLR-4882 Project: Solr Issue Type: Improvement Affects Versions: 4.3 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, 4.4 SolrResourceLoader currently allows loading files from any absolute/CWD-relative path, which is used as a fallback if the resource cannot be looked up via the class loader. We should limit this fallback to sub-dirs below the instanceDir passed into the ctor. The CWD special case should be removed, too (the virtual CWD is the instance's config or root dir). The reason for this is security related. Some Solr components allow resource paths to be passed in via REST parameters (e.g. XSL stylesheets, ...) and load them via the resource loader. Restricting the loader makes it possible to prevent loading e.g. /etc/passwd as a stylesheet. In 4.4 we should add a solrconfig.xml setting to enable the old behaviour, but disable it by default, in case your existing installation requires files from outside the instance dir which are not available via the URLClassLoader used internally. In Lucene 5.0 we should not support this anymore.
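The proposed restriction -- only fall back to filesystem paths that resolve below the instanceDir -- amounts to a normalized-path prefix check. A minimal sketch of that check (hypothetical helper method, not the actual SolrResourceLoader patch):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch of the proposed fallback rule: a resource name is only
// acceptable if, after normalization, it still lives under instanceDir.
// Illustration only, not the actual SolrResourceLoader code.
public class ResourcePathCheck {
    static boolean isUnderInstanceDir(Path instanceDir, String resource) {
        Path resolved = instanceDir.resolve(resource).normalize();
        return resolved.startsWith(instanceDir.normalize());
    }

    public static void main(String[] args) {
        Path instanceDir = Paths.get("/var/solr/collection1");
        // Relative path inside the instance dir: allowed.
        System.out.println(isUnderInstanceDir(instanceDir, "conf/stopwords.txt"));
        // Escapes via "..": rejected after normalization.
        System.out.println(isUnderInstanceDir(instanceDir, "../../../etc/passwd"));
        // Absolute path outside the instance dir: rejected.
        System.out.println(isUnderInstanceDir(instanceDir, "/etc/passwd"));
    }
}
```

Note that `Path.resolve` returns the argument unchanged when it is absolute, so the same prefix check also rejects absolute paths like /etc/passwd.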
[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670650#comment-13670650 ] Shawn Heisey commented on SOLR-4715: [~hupadhyay], the following code **MIGHT** allow you to change the response parser back to XML before this issue is implemented. I have not tested this, and I would be very curious about whether it works for you. It also changes a couple of HttpClient parameters, but you could remove those two lines.
{code}
import java.net.MalformedURLException;

import org.apache.http.client.HttpClient;
import org.apache.solr.client.solrj.ResponseParser;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.LBHttpSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.common.params.ModifiableSolrParams;

public class TestStuff {
  void test() throws MalformedURLException {
    String zkHost = "";
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, 1000);
    params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, 200);
    HttpClient client = HttpClientUtil.createClient(params);
    ResponseParser parser = new XMLResponseParser();
    LBHttpSolrServer lbServer = new LBHttpSolrServer(client, parser, "http://localhost/solr");
    lbServer.removeSolrServer("http://localhost/solr");
    SolrServer server = new CloudSolrServer(zkHost, lbServer);
  }
}
{code}
[jira] [Comment Edited] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot
[ https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670630#comment-13670630 ] David edited comment on SOLR-4805 at 5/30/13 7:25 PM: -- I get the same issue when I try to reload a single core. The only way for me to currently change my configs is to restart the container. was (Author: dboychuck): I get the same issue when I try to reload a single core. The only way for me to currently changing my configs is to restart the container. > Calling Collection RELOAD where collection has a single core, leaves > collection offline and unusable till reboot > > > Key: SOLR-4805 > URL: https://issues.apache.org/jira/browse/SOLR-4805 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.3 >Reporter: Jared Rodriguez >Assignee: Mark Miller > Fix For: 5.0, 4.4 > > > If you have a collection that is composed of a single core, then calling > reload on that collection leaves the core offline. This happens even if > nothing at all has changed about the collection or its config. This happens > whether you call reload via an HTTP GET or if you directly call reload via > the collections API. > I tried a collection with a single core that contains data, changed nothing > about the config in ZK, and called reload on the collection. The call > completes, but ZK flags that replica with "state":"down". > Try it where the single core contains no data and the same thing happens: > ZK config updates and broadcasts "state":"down" for the replica. > I did not try this in a multicore or replicated core environment.
[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot
[ https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670630#comment-13670630 ] David commented on SOLR-4805: - I get the same issue when I try to reload a single core. The only way for me to currently change my configs is to restart the container.
[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot
[ https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670627#comment-13670627 ] Mark Miller commented on SOLR-4805: --- I'll fix it for 4.4 - we should stop doing preRegister when doing a reload.
[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot
[ https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670623#comment-13670623 ] David commented on SOLR-4805: - Details here: http://lucene.472066.n3.nabble.com/Collections-API-Reload-killing-my-cloud-td4067141.html
[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot
[ https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670622#comment-13670622 ] David commented on SOLR-4805: - I'm having this same issue on a cloud of 6 servers.
[jira] [Resolved] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver
[ https://issues.apache.org/jira/browse/SOLR-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-4881. - Resolution: Fixed Committed to 4.3.1, 4.4 and trunk. Thanks Hoss for pointing out the inconsistency! > Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver > --- > > Key: SOLR-4881 > URL: https://issues.apache.org/jira/browse/SOLR-4881 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.3 >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 5.0, 4.4, 4.3.1 > > Attachments: SOLR-4881.patch > > > This was overlooked while committing SOLR-3895.
[jira] [Commented] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?
[ https://issues.apache.org/jira/browse/LUCENE-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670604#comment-13670604 ] Geoff Cooney commented on LUCENE-5024: -- Hi. I'm one of the users who reported/asked about this. Specifically, I was wondering if it's possible to deal with this by being explicit about the segments_N file being in the pre-committed state? That is, add one byte to the segments_N file representing a boolean "isCommitted". Then you could treat an index whose only segments_1 file is set to "isCommitted"=false as a non-existent index. > Can we reliably detect an incomplete first commit vs index corruption? > -- > > Key: LUCENE-5024 > URL: https://issues.apache.org/jira/browse/LUCENE-5024 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Reporter: Michael McCandless > Fix For: 5.0, 4.4 > > > Normally, if something bad happens (OS, JVM, hardware crashes) while > IndexWriter is committing, we will just fall back to the prior commit, > with no intervention necessary from the app. > But if that commit is the first commit, then on restart IndexWriter > will now throw CorruptIndexException, as of LUCENE-4738. > Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the > corrupt first commit, but that logic was dangerous and could result in > falsely believing no index is present when one is, e.g. when transient > IOExceptions are thrown due to file descriptor exhaustion. > But now two users have hit this change ... see "CorruptIndexException > when opening Index during first commit" and "Calling > IndexWriter.commit() immediately after creating the writer", both on > java-user. > It would be nice to get back to not marking an incomplete first commit > as corruption ... but we have to proceed carefully.
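The "isCommitted" byte suggested in the comment above can be sketched in isolation. This is an illustrative toy format only, not Lucene's actual segments_N encoding; the magic number, class, and method names are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Toy sketch of the idea: append a one-byte "isCommitted" flag so a reader
// can distinguish an incomplete first commit from real index corruption.
public class CommitFlagSketch {

    // Write a dummy header followed by the committed flag.
    static byte[] writeSegments(boolean isCommitted) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0x5E65); // hypothetical magic number
        out.writeBoolean(isCommitted);
        out.close();
        return bytes.toByteArray();
    }

    // Returns true only if the file parses AND the flag says committed;
    // a parseable file with isCommitted=false is treated as "no index yet".
    static boolean indexExists(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        if (in.readInt() != 0x5E65) {
            throw new IOException("corrupt header");
        }
        return in.readBoolean();
    }
}
```

The key property is that a half-written flag file either fails the header check (corruption) or reads as uncommitted (treated as no index), so neither case is silently misreported.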
[jira] [Created] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver
Uwe Schindler created SOLR-4881: --- Summary: Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver Key: SOLR-4881 URL: https://issues.apache.org/jira/browse/SOLR-4881 Project: Solr Issue Type: Bug Components: Schema and Analysis Affects Versions: 4.3 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, 4.4, 4.3.1 Attachments: SOLR-4881.patch This was overlooked while committing SOLR-3895 and SOLR-3614.
[jira] [Updated] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver
[ https://issues.apache.org/jira/browse/SOLR-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-4881: Attachment: SOLR-4881.patch Simple patch!
[jira] [Updated] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver
[ https://issues.apache.org/jira/browse/SOLR-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-4881: Description: This was overlooked while committing SOLR-3895. (was: This was overlooked while committing SOLR-3895 and SOLR-3614.)
[jira] [Commented] (SOLR-4693) Create a collections API to delete/cleanup a Slice
[ https://issues.apache.org/jira/browse/SOLR-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670552#comment-13670552 ] Shalin Shekhar Mangar commented on SOLR-4693: - Thanks Anshum. A few comments:
# Can we use "collection" instead of "name", just like we use in splitshard?
# The following code will throw an exception for a shard with no range (the custom hashing use-case). It also allows deletion of slices in the construction state, going against the error message.
{code}
// For now, only allow for deletions of Inactive slices or custom hashes (range==null).
// TODO: Add check for range gaps on Slice deletion
if (!slice.getState().equals(Slice.INACTIVE) && slice.getRange() != null) {
  throw new SolrException(ErrorCode.BAD_REQUEST,
      "The slice: " + slice.getName() + " is not currently " + slice.getState()
      + ". Only inactive (or custom-hashed) slices can be deleted.");
}
{code}
# The "deletecore" call to the overseer is redundant because it is also made by the CoreAdmin UNLOAD action.
# Can we re-use the code between "deletecollection" and "deleteshard"? The collectionCmd code checks for "live" state as well.
# In DeleteSliceTest, after setSliceAsInactive(), we should poll the slice state until it becomes inactive or until a timeout, instead of just waiting for 5000ms.
# DeleteSliceTest.waitAndConfirmSliceDeletion is wrong: it does not actually use the counter variable. Also, cloudClient.getZkStateReader().getClusterState() doesn't actually force a refresh of the cluster state.
# We should fail with an appropriate error message if there were nodes that could not be unloaded. Perhaps a separate "deletecore" call is appropriate here?
# Do we know what would happen if such a "zombie" node comes back up? We need to make sure it cleans up properly.
> Create a collections API to delete/cleanup a Slice > -- > > Key: SOLR-4693 > URL: https://issues.apache.org/jira/browse/SOLR-4693 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Anshum Gupta >Assignee: Shalin Shekhar Mangar > Attachments: SOLR-4693.patch, SOLR-4693.patch > > > Have a collections API that cleans up a given shard. > Among other places, this would be useful post the shard split call to manage > the parent/original slice.
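The "poll the slice state until it becomes inactive or until a timeout" suggestion in the review comments above amounts to a generic wait-for-state loop. A minimal sketch of that pattern, assuming nothing about Solr's test APIs (the class and method names here are illustrative only, with an AtomicReference standing in for the live cluster state):

```java
import java.util.concurrent.atomic.AtomicReference;

// Generic wait-for-state pattern: poll a state source until it reaches the
// expected value or a timeout elapses, instead of a fixed Thread.sleep(5000).
public class PollUntilSketch {

    // Returns true if `state` reached `expected` before the deadline.
    static boolean waitForState(AtomicReference<String> state, String expected,
                                long timeoutMs, long intervalMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (expected.equals(state.get())) {
                return true; // reached the desired state early
            }
            Thread.sleep(intervalMs);
        }
        // one final check at the deadline
        return expected.equals(state.get());
    }
}
```

Checking on each iteration lets the test return as soon as the state flips, while the deadline bounds the worst case, which is exactly what the fixed 5000ms sleep cannot do.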
[jira] [Created] (LUCENE-5026) PagedGrowableWriter
Adrien Grand created LUCENE-5026: Summary: PagedGrowableWriter Key: LUCENE-5026 URL: https://issues.apache.org/jira/browse/LUCENE-5026 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 5.0, 4.4 We already have packed data structures that support more than 2B values such as AppendingLongBuffer and MonotonicAppendingLongBuffer but none of them supports random write-access. We could write a PagedGrowableWriter for this, which would essentially wrap an array of GrowableWriters.
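The paging idea behind this issue can be sketched with plain long[] pages in place of Lucene's GrowableWriter: address by a long index, split it into a page number and an offset within the page. A toy version under those assumptions (this is not the actual PagedGrowableWriter implementation, just the paging arithmetic it would rely on):

```java
// Random write access over more than 2^31 values by wrapping an array of
// fixed-size pages. Pages are allocated lazily on first write.
public class PagedWriterSketch {
    private final int pageShift;   // page size = 1 << pageShift
    private final int pageMask;    // offset within a page = index & pageMask
    private final long[][] pages;

    PagedWriterSketch(long size, int pageShift) {
        this.pageShift = pageShift;
        this.pageMask = (1 << pageShift) - 1;
        // ceiling division: number of pages needed to cover `size` slots
        int numPages = (int) ((size + (1L << pageShift) - 1) >>> pageShift);
        this.pages = new long[numPages][];
    }

    void set(long index, long value) {
        int page = (int) (index >>> pageShift);
        if (pages[page] == null) {
            pages[page] = new long[1 << pageShift]; // allocate lazily
        }
        pages[page][(int) (index & pageMask)] = value;
    }

    long get(long index) {
        long[] page = pages[(int) (index >>> pageShift)];
        return page == null ? 0L : page[(int) (index & pageMask)];
    }
}
```

In the real issue each page would be a GrowableWriter (packed ints that widen as values grow) rather than a raw long[], but the long-index-to-(page, offset) split is the same.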
[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670497#comment-13670497 ] Shawn Heisey commented on SOLR-4715: My initial inclination is to *NOT* provide additional constructors, but to provide a number of getters and setters. In addition to methods for setting timeouts and common httpclient properties, I would include getHttpClient and possibly something with a name like getHttpSolrServer or getInnerSolrServer. For CloudSolrServer, most of these new methods would just send/request the same information to/from LBHttpSolrServer. Should I change my approach? I haven't written any code yet. > CloudSolrServer does not provide support for setting underlying server > properties > - > > Key: SOLR-4715 > URL: https://issues.apache.org/jira/browse/SOLR-4715 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3 >Reporter: Hardik Upadhyay >Assignee: Shawn Heisey > Labels: solr, solrj > > CloudSolrServer (and LBHttpSolrServer) do not allow the user to set > underlying HttpSolrServer and HttpClient settings.
[jira] [Updated] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-4715: --- Description: CloudSolrServer (and LBHttpSolrServer) do not allow the user to set underlying HttpSolrServer and HttpClient settings. (was: CloudSolrServer as well as LBHttpSolrServer does not allow to set XMLResponseWriter)
[jira] [Updated] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-4715: --- Summary: CloudSolrServer does not provide support for setting underlying server properties (was: CloudSolrServer does not provide support for setting XmlResponseWriter)
[jira] [Updated] (SOLR-4715) CloudSolrServer does not provide support for setting XmlResponseWriter
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-4715: --- Affects Version/s: 4.3
[jira] [Assigned] (SOLR-4715) CloudSolrServer does not provide support for setting XmlResponseWriter
[ https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey reassigned SOLR-4715: -- Assignee: Shawn Heisey
[jira] [Created] (SOLR-4880) ClientUtils#toSolrInputDocument(SolrDocument d) creates shallow copy for multivalued fields
Ivan Hrytsyuk created SOLR-4880: --- Summary: ClientUtils#toSolrInputDocument(SolrDocument d) creates shallow copy for multivalued fields Key: SOLR-4880 URL: https://issues.apache.org/jira/browse/SOLR-4880 Project: Solr Issue Type: Bug Components: clients - java Reporter: Ivan Hrytsyuk Fix For: 3.6 Multivalued fields are represented in SolrDocument as java.util.Collection. ClientUtils#toSolrInputDocument(SolrDocument d) creates a shallow copy of those collections in the resulting SolrInputDocument. That means changes to the resulting instance (i.e. adding/removing records) affect the original instance as well, which is bad. *Expected Behaviour*: A deep copy of the collections should be created; changes to the resulting instance shouldn't affect the original instance. *Possible Implementation*:
{code:java}
public static SolrInputDocument toSolrInputDocument(final SolrDocument solrDocument) {
  final Map<String, SolrInputField> fields = new LinkedHashMap<String, SolrInputField>();
  return toSolrInputDocument(solrDocument, fields);
}

public static SolrInputDocument toSolrInputDocument(final SolrDocument solrDocument,
    final Map<String, SolrInputField> fields) {
  final SolrInputDocument result = new SolrInputDocument(fields);
  for (final Map.Entry<String, Object> entry : solrDocument.entrySet()) {
    if (entry.getValue() instanceof Collection) {
      result.setField(entry.getKey(), new ArrayList<Object>((Collection<?>) entry.getValue()));
    } else {
      result.setField(entry.getKey(), entry.getValue());
    }
  }
  return result;
}
{code}
*Note*: I believe the same issue applies to ClientUtils#toSolrDocument(SolrInputDocument d).
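The shallow-vs-deep distinction in this report can be demonstrated without any Solr classes, using plain maps in place of SolrDocument/SolrInputDocument. The class and method names below are made up for the demo; only the copying behavior mirrors the issue.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Demonstrates the shallow-copy problem: a map copy shares its collection
// values with the source, so mutating one document mutates the other.
public class DeepCopySketch {

    // Shallow copy: the map is new, but collection values are shared.
    static Map<String, Object> shallowCopy(Map<String, Object> src) {
        return new LinkedHashMap<String, Object>(src);
    }

    // Deep copy as proposed in the issue: collection values are duplicated,
    // so mutating the copy no longer affects the original.
    static Map<String, Object> deepCopy(Map<String, Object> src) {
        Map<String, Object> result = new LinkedHashMap<String, Object>();
        for (Map.Entry<String, Object> e : src.entrySet()) {
            Object v = e.getValue();
            result.put(e.getKey(), v instanceof Collection
                    ? new ArrayList<Object>((Collection<?>) v) : v);
        }
        return result;
    }
}
```

Adding a value to a multivalued field in the shallow copy changes the original document too; with the deep copy the original stays untouched, which is the behavior the report asks for.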
[jira] [Commented] (LUCENE-5023) Only reader that contains fields can be added into readerContext
[ https://issues.apache.org/jira/browse/LUCENE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670465#comment-13670465 ] Uwe Schindler commented on LUCENE-5023: --- The buggy code in SolrIndexSearcher was removed. It will be released with 4.3.1 or 4.4. > Only reader that contains fields can be added into readerContext > > > Key: LUCENE-5023 > URL: https://issues.apache.org/jira/browse/LUCENE-5023 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 4.2 >Reporter: Bao Yang Yang >Assignee: Uwe Schindler >Priority: Critical > Original Estimate: 1h > Remaining Estimate: 1h > > When a Solr core contains only segments and no indexes at all, the > atomicReader that returns no fields should not be added into leaves in the > CompositeReaderContext.build() method. Otherwise, in > SolrIndexSearcher.getDocSetNC(Query query, DocSet filter), when the line > fields.terms(t.field()) executes, a NullPointerException will occur since the > fields variable is null.
[jira] [Resolved] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()
[ https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-4877. - Resolution: Fixed > SolrIndexSearcher#getDocSetNC should check for null return in > AtomicReader#fields() > --- > > Key: SOLR-4877 > URL: https://issues.apache.org/jira/browse/SOLR-4877 > Project: Solr > Issue Type: Bug >Affects Versions: 4.2, 4.3 >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 5.0, 4.4, 4.3.1 > > Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch > > > In LUCENE-5023 it was reported that composite reader contexts should not > contain null fields() readers. But this is wrong, as a null-fields() reader > may contain documents, just no fields. > fields() and terms() are documented to return null, so DocSets should check > for null (like all queries do in Lucene). It seems that DocSetNC does not > correctly check for null.
[jira] [Updated] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()
[ https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-4877: Fix Version/s: 4.3.1 4.4 5.0
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670456#comment-13670456 ] Shawn Heisey edited comment on SOLR-4816 at 5/30/13 4:17 PM: - bq. I still think these extras will need to be off by default until 5. +1. Even in version 5, it should still be possible to turn them off. Advanced features (threading in particular) have a tendency to cause subtle bugs, and it's difficult to know if they are bugs in the underlying code or bugs in the advanced feature. Being able to turn them off will greatly help with debugging. IMHO, most tests that use CloudSolrServer should randomly turn things like threading on or off, change the writer and parser, etc. Which reminds me, I need to file an issue and work on a patch for Cloud/LBHttpSolrServer implementations that includes many of the getters/setters from HttpSolrServer. was (Author: elyograg): bq. I still think these extras will need to be off by default until 5. +1. Even in version 5, it should still be possible to turn them off. Advanced features (threading in particular) have a tendency to cause subtle bugs, and it's difficult to know if they are bugs in the underlying code or bugs in the advanced feature. Being able to turn them off will greatly help with debugging. IMHO, most tests that use CloudSolrServer should randomly turn things like threading on or off, change the writer and parser, etc. Which reminds me, I need to file an issue and work on a patch for {Cloud,LBHttp}SolrServer implementations that includes many of the getters/setters from HttpSolrServer. 
> Add document routing to CloudSolrServer > --- > > Key: SOLR-4816 > URL: https://issues.apache.org/jira/browse/SOLR-4816 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 4.3 >Reporter: Joel Bernstein >Assignee: Mark Miller >Priority: Minor > Fix For: 5.0, 4.4 > > Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch > > > This issue adds the following enhancements to CloudSolrServer's update logic: > 1) Document routing: Updates are routed directly to the correct shard leader > eliminating document routing at the server. > 2) Parallel update execution: Updates for each shard are executed in a > separate thread so parallel indexing can occur across the cluster. > 3) Javabin transport: Update requests are sent via javabin transport. > These enhancements should allow for near linear scalability on indexing > throughput. > Usage: > CloudSolrServer cloudClient = new CloudSolrServer(zkAddress); > SolrInputDocument doc1 = new SolrInputDocument(); > doc1.addField(id, "0"); > doc1.addField("a_t", "hello1"); > SolrInputDocument doc2 = new SolrInputDocument(); > doc2.addField(id, "2"); > doc2.addField("a_t", "hello2"); > UpdateRequest request = new UpdateRequest(); > request.add(doc1); > request.add(doc2); > request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false); > NamedList response = cloudClient.request(request); // Returns a backwards > compatible condensed response. 
> //To get more detailed response down cast to RouteResponse: > CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response; > NamedList responses = rr.getRouteResponse();
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670457#comment-13670457 ] Joel Bernstein commented on SOLR-4816: -- The initial response behaves very much like a response when document routing is done on the server. On the server, the Solr instance sends off the docs to the shards to be indexed and then returns a single unified response. This does basically the same thing, but lets you downcast to get more info if you want to.
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670456#comment-13670456 ]

Shawn Heisey commented on SOLR-4816:
--

bq. I still think these extras will need to be off by default until 5.

+1. Even in version 5 it should still be possible to turn them off. Advanced features (threading in particular) have a tendency to cause subtle bugs, and it's difficult to know whether they are bugs in the underlying code or bugs in the advanced feature. Being able to turn them off will greatly help with debugging.

IMHO, most tests that use CloudSolrServer should randomly turn things like threading on or off, change the writer and parser, etc. Which reminds me: I need to file an issue and work on a patch for {Cloud,LBHttp}SolrServer implementations that includes many of the getters/setters from HttpSolrServer.
[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-4816:
--

Attachment: SOLR-4816.patch

Added Routable.java
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670446#comment-13670446 ]

Joel Bernstein commented on SOLR-4816:
--

OK, adding it now.
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670442#comment-13670442 ]

Mark Miller commented on SOLR-4816:
--

I don't see Routable in the current patch.
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670439#comment-13670439 ]

Mark Miller commented on SOLR-4816:
--

I'll do a review shortly.

bq. It does this all by default, no switches needed.
bq. exception that condenses the info from each shard into a single response

How can that be backwards compatible if people are parsing the response? I'm not convinced you can do all this by default and stay back-compat, but I'll look at the latest patch. And batching will again have a slight change in runtime behavior. I still think these extras will need to be off by default until 5.
[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-4816:
--

Fix Version/s: 4.4
               5.0
[jira] [Assigned] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller reassigned SOLR-4816:
--

Assignee: Mark Miller
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670431#comment-13670431 ]

Joel Bernstein edited comment on SOLR-4816 at 5/30/13 4:01 PM:
--

Latest patch is a version of CloudSolrServer that:
1) Does document routing
2) Sends requests to each shard in a separate thread
3) Uses javabin transport
4) Is backwards compatible with both the response and exception.

It does this all by default; no switches needed. This is accomplished by returning a response or throwing an exception that condenses the info from each shard into a single response or exception. To get the full info for the response or exception you can down-cast to either RouteResponse or RouteException, which gives you a detailed breakdown from each of the shards.

Will update the ticket name and description accordingly.

was (Author: joel.bernstein):

Latest patch is a version of CloudSolrServer that:
1) Does document routing
2) Sends requests to each shard in a separate thread
3) Uses javabin transport
4) Is backwards compatible with both the response and exception.

It does this all by default; no switches needed. I accomplish this by returning a response or throwing an exception that condenses the info from each shard into a single response or exception. To get the full info for the response or exception you can down-cast to either RouteResponse or RouteException, which gives you a detailed breakdown from each of the shards.

Will update the ticket name and description accordingly.
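[Editorial note] The "one thread per shard, then condense into a single response" pattern described in this comment can be sketched as follows. The class and method names here are hypothetical stand-ins, not SolrJ API; per-shard work is faked as a string so the sketch stays self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of parallel per-shard update execution: submit each shard's batch
// on its own thread, wait for all results, and condense them into one
// summary string (standing in for the condensed NamedList response).
public class ParallelUpdateSketch {
    static String condense(Map<String, List<String>> batches) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(batches.size());
        Map<String, Future<String>> futures = new ConcurrentHashMap<>();
        // One task per shard; each "indexes" its batch concurrently.
        batches.forEach((shard, docs) ->
                futures.put(shard, pool.submit(() -> shard + " indexed " + docs.size())));
        // Gather per-shard results into a single condensed summary.
        StringBuilder condensed = new StringBuilder();
        for (Map.Entry<String, Future<String>> e : futures.entrySet()) {
            condensed.append(e.getValue().get()).append("; ");
        }
        pool.shutdown();
        return condensed.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(condense(Map.of(
                "shard1", List.of("doc0"),
                "shard2", List.of("doc2"))));
    }
}
```

A real client would keep the per-shard results around as well, which is what the RouteResponse down-cast exposes in the patch.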
[jira] [Commented] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()
[ https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670437#comment-13670437 ]

Robert Muir commented on SOLR-4877:
--

+1

> SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()
> ---
>
> Key: SOLR-4877
> URL: https://issues.apache.org/jira/browse/SOLR-4877
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.2, 4.3
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch
>
> In LUCENE-5023 it was reported that composite reader contexts should not contain null fields() readers. But this is wrong, as a null-fields() reader may contain documents, just no fields.
> fields() and terms() are documented to return null, so DocSets should check for null (like all queries do in Lucene). It seems that DocSetNC does not correctly check for null.
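[Editorial note] The null contract described in this issue can be illustrated with a toy example. The Fields/Terms types below are minimal stand-ins, not the real Lucene classes; the point is only the guard pattern the patch adds.

```java
// Stand-in types illustrating the "documented to return null" contract:
// a reader's fields() may legitimately be null (documents but no indexed
// fields), and terms(field) may be null (field absent in this segment),
// so callers must check before dereferencing.
public class NullFieldsSketch {
    interface Fields { Terms terms(String field); } // may return null
    interface Terms { long size(); }

    static long termCount(Fields fields, String field) {
        if (fields == null) {
            return 0; // reader has documents but no fields at all
        }
        Terms terms = fields.terms(field);
        if (terms == null) {
            return 0; // field not present in this segment
        }
        return terms.size();
    }

    public static void main(String[] args) {
        // A null Fields is handled gracefully instead of throwing NPE.
        System.out.println(termCount(null, "a_t"));
        System.out.println(termCount(field -> null, "a_t"));
    }
}
```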
[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-4816:
--

Description:
This issue adds the following enhancements to CloudSolrServer's update logic:
1) Document routing: Updates are routed directly to the correct shard leader, eliminating document routing at the server.
2) Parallel update execution: Updates for each shard are executed in a separate thread so parallel indexing can occur across the cluster.
3) Javabin transport: Update requests are sent via javabin transport.
These enhancements should allow for near-linear scalability of indexing throughput.

Usage:
CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("id", "0");
doc1.addField("a_t", "hello1");
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField("id", "2");
doc2.addField("a_t", "hello2");
UpdateRequest request = new UpdateRequest();
request.add(doc1);
request.add(doc2);
request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
NamedList response = cloudClient.request(request); // Returns a backwards-compatible condensed response.
// To get a more detailed response, down-cast to RouteResponse:
CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;
NamedList responses = rr.getRouteResponse();

was:
This issue adds the following enhancements to CloudSolrServer's update logic:
1) Document routing: Updates are routed directly to the correct shard leader, eliminating document routing at the server.
2) Parallel update execution: Updates for each shard are executed in a separate thread so parallel indexing can occur on each shard.
3) Javabin transport: The requests are sent via javabin transport.
These enhancements should allow for near-linear scalability of indexing throughput.

Usage:
CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("id", "0");
doc1.addField("a_t", "hello1");
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField("id", "2");
doc2.addField("a_t", "hello2");
UpdateRequest request = new UpdateRequest();
request.add(doc1);
request.add(doc2);
request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
NamedList response = cloudClient.request(request); // Returns a backwards-compatible condensed response.
// To get a more detailed response, down-cast to RouteResponse:
CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;
NamedList responses = rr.getRouteResponse();
[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-4816:
--

Description:
This issue adds the following enhancements to CloudSolrServer's update logic:
1) Document routing: Updates are routed directly to the correct shard leader, eliminating document routing at the server.
2) Parallel update execution: Updates for each shard are executed in a separate thread so parallel indexing can occur on each shard.
3) Javabin transport: The requests are sent via javabin transport.
These enhancements should allow for near-linear scalability of indexing throughput.

was:
This issue adds a new Solr Cloud client called the ConcurrentUpdateCloudSolrServer. This Solr Cloud client implements document routing in the client so that document routing overhead is eliminated on the Solr servers. Documents are batched up for each shard and then each batch is sent in its own thread. With this client, Solr Cloud indexing throughput should scale linearly with cluster size.

This client also has robust failover built in because the actual requests are made using the LBHttpSolrServer. The list of URLs used for the request to each shard begins with the leader and is followed by that shard's replicas, so the leader will be tried first and, if it fails, the replicas will be tried.

Sample usage:
ConcurrentUpdateCloudSolrServer client = new ConcurrentUpdateCloudSolrServer(zkHostAddress);
UpdateRequest request = new UpdateRequest();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 2);
doc.addField("manu", "BMW");
request.add(doc);
NamedList response = client.request(request);
NamedList exceptions = (NamedList) response.get("exceptions"); // contains any exceptions from the shards
NamedList responses = (NamedList) response.get("responses"); // contains the responses from shards without exceptions
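[Editorial note] The leader-first failover ordering described above can be sketched in a few lines. The class and URLs below are hypothetical; the point is only the ordering handed to a load-balancing client.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of leader-first failover ordering: build each shard's URL list
// with the leader first, followed by the replicas, so a load-balancing
// client tries the leader and only falls back to replicas on failure.
public class LeaderFirstUrls {
    static List<String> urlsFor(String leaderUrl, List<String> replicaUrls) {
        List<String> urls = new ArrayList<>();
        urls.add(leaderUrl);      // tried first
        urls.addAll(replicaUrls); // tried in order only if the leader fails
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(urlsFor("http://host1/solr/shard1",
                List.of("http://host2/solr/shard1", "http://host3/solr/shard1")));
    }
}
```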
> Usage: > CloudSolrServer cloudClient = new CloudSolrServer(zkAddress); > SolrInputDocument doc1 = new SolrInputDocument(); > doc1.addField("id", "0"); > doc1.addField("a_t", "hello1"); > SolrInputDocument doc2 = new SolrInputDocument(); > doc2.addField("id", "2"); > doc2.addField("a_t", "hello2"); > UpdateRequest request = new UpdateRequest(); > request.add(doc1); > request.add(doc2); > request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false); > NamedList response = cloudClient.request(request); // Returns a backwards > compatible condensed response. > // To get a more detailed response, down-cast to RouteResponse: > CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response; > NamedList responses = rr.getRouteResponse(); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira --
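The routing enhancement above boils down to: hash each document's unique key, pick the owning shard's leader, and batch documents per leader so each batch can be sent in its own thread. A minimal standalone sketch of that idea follows; the hash scheme and leader URLs here are illustrative assumptions, not Solr's actual murmur-based compositeId router.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of client-side document routing: hash the unique key
// to pick a shard leader, and group documents per leader for parallel sends.
class ClientSideRouter {
    private final List<String> shardLeaderUrls;

    ClientSideRouter(List<String> shardLeaderUrls) {
        this.shardLeaderUrls = shardLeaderUrls;
    }

    // Pick the leader URL for a document id (illustrative hash, not Solr's).
    String leaderFor(String docId) {
        int slot = (docId.hashCode() & 0x7fffffff) % shardLeaderUrls.size();
        return shardLeaderUrls.get(slot);
    }

    // Group document ids by target leader; each batch would then be indexed
    // by its own thread, which is what gives the near-linear scaling.
    Map<String, List<String>> route(List<String> docIds) {
        Map<String, List<String>> batches = new HashMap<>();
        for (String id : docIds) {
            batches.computeIfAbsent(leaderFor(id), k -> new ArrayList<>()).add(id);
        }
        return batches;
    }
}
```

The key property is that routing is deterministic: the same id always maps to the same leader, so the server no longer has to forward misrouted documents.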
[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-4816: - Summary: Add document routing to CloudSolrServer (was: ConcurrentUpdateCloudSolrServer) > Add document routing to CloudSolrServer > --- > > Key: SOLR-4816 > URL: https://issues.apache.org/jira/browse/SOLR-4816 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 4.3 >Reporter: Joel Bernstein >Priority: Minor > Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch > > > This issue adds a new Solr Cloud client called the > ConcurrentUpdateCloudSolrServer. This Solr Cloud client implements document > routing in the client so that document routing overhead is eliminated on the > Solr servers. Documents are batched up for each shard and then each batch is > sent in its own thread. > With this client, Solr Cloud indexing throughput should scale linearly with > cluster size. > This client also has robust failover built-in because the actual requests are > made using the LBHttpSolrServer. The list of URLs used for the request to > each shard begins with the leader and is followed by that shard's replicas. > So the leader will be tried first and if it fails it will try the replicas. 
> Sample usage: > ConcurrentUpdateCloudServer client = new > ConcurrentUpdateCloudSolrServer(zkHostAddress); > UpdateRequest request = new UpdateRequest(); > SolrInputDocument doc = new SolrInputDocument(); > doc.addField("id", 2); > doc.addField("manu","BMW"); > request.add(doc); > NamedList response = client.request(request); > NamedList exceptions = response.get("exceptions"); // contains any exceptions > from the shards > NamedList responses = response.get("responses"); // contains the responses > from shards without exception. > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4816) ConcurrentUpdateCloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-4816: - Attachment: SOLR-4816.patch Latest patch is a version of CloudSolrServer that: 1) Does document routing 2) Sends requests to each shard in a separate thread 3) Uses javabin transport 4) Is backwards compatible with both the response and exception. It does all this by default, no switches needed. I accomplish this by returning a response or throwing an exception that condenses the info from each shard into a single response or exception. To get the full info for the response or exception you can down-cast to either RouteResponse or RouteException, which gives you a detailed breakdown from each of the shards. Will update the ticket name and description accordingly. > ConcurrentUpdateCloudSolrServer > --- > > Key: SOLR-4816 > URL: https://issues.apache.org/jira/browse/SOLR-4816 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 4.3 >Reporter: Joel Bernstein >Priority: Minor > Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch > > > This issue adds a new Solr Cloud client called the > ConcurrentUpdateCloudSolrServer. This Solr Cloud client implements document > routing in the client so that document routing overhead is eliminated on the > Solr servers. Documents are batched up for each shard and then each batch is > sent in its own thread. > With this client, Solr Cloud indexing throughput should scale linearly with > cluster size. 
> This client also has robust failover built-in because the actual requests are > made using the LBHttpSolrServer. The list of urls used for the request to > each shard begins with the leader and is followed by that shard's replicas. > So the leader will be tried first and if it fails it will try the replicas. > Sample usage: > ConcurrentUpdateCloudServer client = new > ConcurrentUpdateCloudSolrServer(zkHostAddress); > UpdateRequest request = new UpdateRequest(); > SolrInputDocument doc = new SolrInputDocument(); > doc.addField("id", 2); > doc.addField("manu","BMW"); > request.add(doc); > NamedList response = client.request(request); > NamedList exceptions = response.get("exceptions"); // contains any exceptions > from the shards > NamedList responses = response.get("responses"); // contains the responses > from shards without exception. > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-4870) RecentUpdates.update() does not increment numUpdates counter inside loop
[ https://issues.apache.org/jira/browse/SOLR-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-4870. - Resolution: Fixed Fix Version/s: 4.3.1 Assignee: Shalin Shekhar Mangar Committed. trunk: r1487897 branch_4x: r1487899 lucene_solr_4_3: r1487900 > RecentUpdates.update() does not increment numUpdates counter inside loop > > > Key: SOLR-4870 > URL: https://issues.apache.org/jira/browse/SOLR-4870 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3 >Reporter: Shalin Shekhar Mangar >Assignee: Shalin Shekhar Mangar > Fix For: 4.3.1 > > > As reported by AlexeyK on solr-user: > http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-td4065549.html > {quote} > Speaking about the update log - i have noticed a strange behavior concerning > the replay. The replay is *supposed* to be done for a predefined number of > log entries, but actually it is always done for the whole last 2 tlogs. > RecentUpdates.update() reads log within while (numUpdates < > numRecordsToKeep), while numUpdates is never incremented, so it exits when > the reader reaches EOF. > {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
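The bug pattern AlexeyK reported is easy to reproduce in isolation: a loop bounded by a counter that is never incremented degenerates into a read-to-EOF loop. A minimal sketch of the buggy and fixed shapes (names here are illustrative, not Solr's actual RecentUpdates code):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of SOLR-4870: the loop bound uses numUpdates, but the counter is
// never incremented, so the loop only stops at end-of-log.
class RecentUpdatesSketch {
    // Buggy shape: ignores numRecordsToKeep and reads the whole log.
    static List<Integer> updateBuggy(Iterator<Integer> log, int numRecordsToKeep) {
        List<Integer> kept = new ArrayList<>();
        int numUpdates = 0;
        while (numUpdates < numRecordsToKeep && log.hasNext()) {
            kept.add(log.next());
            // bug: numUpdates++ is missing, so the bound never triggers
        }
        return kept;
    }

    // Fixed shape: increment the counter so the bound is respected.
    static List<Integer> updateFixed(Iterator<Integer> log, int numRecordsToKeep) {
        List<Integer> kept = new ArrayList<>();
        int numUpdates = 0;
        while (numUpdates < numRecordsToKeep && log.hasNext()) {
            kept.add(log.next());
            numUpdates++;
        }
        return kept;
    }
}
```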
[jira] [Commented] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?
[ https://issues.apache.org/jira/browse/LUCENE-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670416#comment-13670416 ] Robert Muir commented on LUCENE-5024: - The best solution here, I think, is removal of create-or-append. Really, index creation can be a one-time thing you must do separately before you can use the directory. This is typically how it's done; Lucene is weird and has this broken mechanism today instead. > Can we reliably detect an incomplete first commit vs index corruption? > -- > > Key: LUCENE-5024 > URL: https://issues.apache.org/jira/browse/LUCENE-5024 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Reporter: Michael McCandless > Fix For: 5.0, 4.4 > > > Normally, if something bad happens (OS, JVM, hardware crashes) while > IndexWriter is committing, we will just fallback to the prior commit > and no intervention necessary from the app. > But if that commit is the first commit, then on restart IndexWriter > will now throw CorruptIndexException, as of LUCENE-4738. > Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the > corrupt first commit, but that logic was dangerous and could result in > falsely believing no index is present when one is, e.g. when transient > IOExceptions are thrown due to file descriptor exhaustion. > But now two users have hit this change ... see "CorruptIndexException > when opening Index during first commit" and "Calling > IndexWriter.commit() immediately after creating the writer", both on > java-user. > It would be nice to get back to not marking an incomplete first commit > as corruption ... but we have to proceed carefully. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?
[ https://issues.apache.org/jira/browse/LUCENE-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670410#comment-13670410 ] Robert Muir commented on LUCENE-5024: - Even if we can, I'm not sure we should. Real users hit corruption issues too. Sorry to the two java-users for the inconvenience, but corruption/data loss is WAY worse. > Can we reliably detect an incomplete first commit vs index corruption? > -- > > Key: LUCENE-5024 > URL: https://issues.apache.org/jira/browse/LUCENE-5024 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Reporter: Michael McCandless > Fix For: 5.0, 4.4 > > > Normally, if something bad happens (OS, JVM, hardware crashes) while > IndexWriter is committing, we will just fallback to the prior commit > and no intervention necessary from the app. > But if that commit is the first commit, then on restart IndexWriter > will now throw CorruptIndexException, as of LUCENE-4738. > Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the > corrupt first commit, but that logic was dangerous and could result in > falsely believing no index is present when one is, e.g. when transient > IOExceptions are thrown due to file descriptor exhaustion. > But now two users have hit this change ... see "CorruptIndexException > when opening Index during first commit" and "Calling > IndexWriter.commit() immediately after creating the writer", both on > java-user. > It would be nice to get back to not marking an incomplete first commit > as corruption ... but we have to proceed carefully. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5025) Allow more than 2.1B "tail nodes" when building FST
Michael McCandless created LUCENE-5025: -- Summary: Allow more than 2.1B "tail nodes" when building FST Key: LUCENE-5025 URL: https://issues.apache.org/jira/browse/LUCENE-5025 Project: Lucene - Core Issue Type: Improvement Components: core/FSTs Reporter: Michael McCandless Fix For: 5.0, 4.4 We recently relaxed some of the limits for big FSTs, but there is one more limit I think we should fix. E.g. Aaron hit it in building the world's biggest FST: http://aaron.blog.archive.org/2013/05/29/worlds-biggest-fst/ The issue is NodeHash, which currently uses a GrowableWriter (packed ints impl that can grow both number of bits and number of values): it's indexed by int not long. This is a hash table that's used to share suffixes, so we need random get/put on a long index of long values, i.e. this is logically a long[]. I think one simple way to do this is to make a "paged" GrowableWriter... Along with this we'd need to fix the hash codes to be long not int. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
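The "paged" GrowableWriter idea above amounts to backing a logical long-indexed array with int-indexed pages, so get/put work past the 2.1B boundary. A minimal sketch under that assumption — a real packed-ints page would also grow its bits-per-value instead of using plain long[] pages:

```java
// Sketch of a long-indexed "paged" array: the logical long index is split
// into a page number (upper bits) and an offset within the page (lower
// bits), so each underlying array stays safely int-indexed.
class PagedLongArray {
    private static final int PAGE_BITS = 12;            // 4096 entries per page
    private static final int PAGE_SIZE = 1 << PAGE_BITS;
    private static final int PAGE_MASK = PAGE_SIZE - 1;
    private final long[][] pages;

    PagedLongArray(long size) {
        int pageCount = (int) ((size + PAGE_SIZE - 1) >>> PAGE_BITS);
        pages = new long[pageCount][PAGE_SIZE];
    }

    long get(long index) {
        return pages[(int) (index >>> PAGE_BITS)][(int) (index & PAGE_MASK)];
    }

    void set(long index, long value) {
        pages[(int) (index >>> PAGE_BITS)][(int) (index & PAGE_MASK)] = value;
    }
}
```

With this shape, NodeHash's table can grow past Integer.MAX_VALUE entries; as the issue notes, the hash codes would also need to become long.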
[jira] [Created] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?
Michael McCandless created LUCENE-5024: -- Summary: Can we reliably detect an incomplete first commit vs index corruption? Key: LUCENE-5024 URL: https://issues.apache.org/jira/browse/LUCENE-5024 Project: Lucene - Core Issue Type: Bug Components: core/index Reporter: Michael McCandless Fix For: 5.0, 4.4 Normally, if something bad happens (OS, JVM, hardware crashes) while IndexWriter is committing, we will just fallback to the prior commit and no intervention necessary from the app. But if that commit is the first commit, then on restart IndexWriter will now throw CorruptIndexException, as of LUCENE-4738. Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the corrupt first commit, but that logic was dangerous and could result in falsely believing no index is present when one is, e.g. when transient IOExceptions are thrown due to file descriptor exhaustion. But now two users have hit this change ... see "CorruptIndexException when opening Index during first commit" and "Calling IndexWriter.commit() immediately after creating the writer", both on java-user. It would be nice to get back to not marking an incomplete first commit as corruption ... but we have to proceed carefully. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670363#comment-13670363 ] Chris Russell commented on SOLR-2894: - I will take a look. > Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher > Fix For: 4.4 > > Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894-reworked.patch > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux () - Build # 5839 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/5839/ Java: No tests ran. Build Log: [...truncated 27 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux () - Build # 5902 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/5902/ Java: No tests ran. Build Log: [...truncated 25 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-4875) DIH XPathRecordReader cannot handle two ways to read same attribute together
[ https://issues.apache.org/jira/browse/SOLR-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-4875: --- Assignee: Noble Paul > DIH XPathRecordReader cannot handle two ways to read same attribute together > > > Key: SOLR-4875 > URL: https://issues.apache.org/jira/browse/SOLR-4875 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 4.3 >Reporter: Shalin Shekhar Mangar >Assignee: Noble Paul >Priority: Minor > Fix For: 4.4 > > Attachments: SOLR-4875.patch > > > From my comment on solr-user mailing list: > {quote} > I think there is a bug here. In my tests, xpath="/root/a/@y" works, > xpath="/root/a[@x='1']/@y" also works. But if you use them together the one > which is defined last returns null. I'll open an issue. > {quote} > http://lucene.472066.n3.nabble.com/Problem-with-xpath-expression-in-data-config-xml-td4066744.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
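For reference, the two expression forms from the report are both standard XPath and evaluate identically with the JDK's own XPath engine; the reported conflict is specific to DIH's streaming XPathRecordReader, not to XPath itself. A self-contained check using only JDK classes:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Evaluate an XPath expression against an XML string with the JDK engine.
// Both /root/a/@y and /root/a[@x='1']/@y resolve here; the SOLR-4875 bug is
// that DIH's simplified streaming reader cannot track both at once.
class XPathAttrDemo {
    static String eval(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```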
[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests
[ https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670287#comment-13670287 ] Mark Miller commented on SOLR-4470: --- bq. And if anyone wishes to set up a similar setup in their production they may borrow code from the test class, but it will be a manual step reinforcing that this is not a supported feature of the project as such. I would be much more okay with this - in this way, we are not responsible for the security this code provides - it's not shipping production solr code, it's code a user can plug in as a filter himself and be responsible for himself. > Support for basic http auth in internal solr requests > - > > Key: SOLR-4470 > URL: https://issues.apache.org/jira/browse/SOLR-4470 > Project: Solr > Issue Type: New Feature > Components: clients - java, multicore, replication (java), SolrCloud >Affects Versions: 4.0 >Reporter: Per Steffensen >Assignee: Jan Høydahl > Labels: authentication, https, solrclient, solrcloud, ssl > Fix For: 4.4 > > Attachments: SOLR-4470_branch_4x_r1452629.patch, > SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, > SOLR-4470.patch > > > We want to protect any HTTP-resource (url). We want to require credentials no > matter what kind of HTTP-request you make to a Solr-node. > It can fairly easily be achieved as described on > http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes > also make "internal" requests to other Solr-nodes, and for it to work > credentials need to be provided here also. > Ideally we would like to "forward" credentials from a particular request to > all the "internal" sub-requests it triggers. E.g. for search and update > requests. > But there are also "internal" requests > * that are only indirectly/asynchronously triggered from "outside" requests (e.g. 
> shard creation/deletion/etc based on calls to the "Collection API") > * that do not in any way have relation to an "outside" "super"-request (e.g. > replica syncing stuff) > We would like to aim at a solution where "original" credentials are > "forwarded" when a request directly/synchronously triggers a subrequest, and > falls back to a configured "internal credentials" for the > asynchronous/non-rooted requests. > In our solution we would aim at only supporting basic http auth, but we would > like to make a "framework" around it, so that not too much refactoring is > needed if you later want to add support for other kinds of auth (e.g. digest) > We will work on a solution but create this JIRA issue early in order to get > input/comments from the community as early as possible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
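On the wire, "forwarding" basic-auth credentials simply means each internal sub-request must carry an Authorization header derived from the original (or the configured fallback) credentials. This sketch shows only the header construction; the plumbing that attaches it to Solr's internal HttpClient requests is what the patch itself provides:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Build an RFC 7617 Basic Authorization header value from a user/password
// pair. In the forwarding scheme discussed above, this header would be
// copied from the original request onto each synchronous sub-request, or
// built from configured "internal credentials" for asynchronous ones.
class BasicAuthHeader {
    static String authorization(String user, String password) {
        String token = Base64.getEncoder().encodeToString(
                (user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return "Basic " + token;
    }
}
```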
[jira] [Commented] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()
[ https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670273#comment-13670273 ] Uwe Schindler commented on SOLR-4877: - I will commit the "nospecialcase" patch if nobody objects. > SolrIndexSearcher#getDocSetNC should check for null return in > AtomicReader#fields() > --- > > Key: SOLR-4877 > URL: https://issues.apache.org/jira/browse/SOLR-4877 > Project: Solr > Issue Type: Bug >Affects Versions: 4.2, 4.3 >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch > > > In LUCENE-5023 it was reported that composite reader contexts should not > contain null fields() readers. But this is wrong, as a null-fields() reader > may contain documents, just no fields. > fields() and terms() is documented to return null, so DocSets should check > for null (like all queries do in Lucene). It seems that DocSetNC does not > correctly check for null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()
[ https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-4877: Description: In LUCENE-5023 it was reported that composite reader contexts should not contain null fields() readers. But this is wrong, as a null-fields() reader may contain documents, just no fields. fields() and terms() is documented to return null, so DocSets should check for null (like all queries do in Lucene). It seems that DocSetNC does not correctly check for null. was: In LUCENE-5023 it was reported that composite reader contexts should not contain null fields() readers. But this is wrong, as a null-fields() reader may contain documents,m just no fields. fields() is documented to contain null fields, so DocSets should check for null (like all fields do in Lucene). It seems that DocSetNC does not correctly check for null. > SolrIndexSearcher#getDocSetNC should check for null return in > AtomicReader#fields() > --- > > Key: SOLR-4877 > URL: https://issues.apache.org/jira/browse/SOLR-4877 > Project: Solr > Issue Type: Bug >Affects Versions: 4.2, 4.3 >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch > > > In LUCENE-5023 it was reported that composite reader contexts should not > contain null fields() readers. But this is wrong, as a null-fields() reader > may contain documents, just no fields. > fields() and terms() is documented to return null, so DocSets should check > for null (like all queries do in Lucene). It seems that DocSetNC does not > correctly check for null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5023) Only reader that contains fields can be added into readerContext
[ https://issues.apache.org/jira/browse/LUCENE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5023: -- Labels: (was: patch) > Only reader that contains fields can be added into readerContext > > > Key: LUCENE-5023 > URL: https://issues.apache.org/jira/browse/LUCENE-5023 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 4.2 >Reporter: Bao Yang Yang >Assignee: Uwe Schindler >Priority: Critical > Original Estimate: 1h > Remaining Estimate: 1h > > When there is only a segments file in the Solr core, which means no indexes at all, in the > CompositeReaderContext.build() method, the atomicReader that has no fields > returned should not be added into leaves. Otherwise, in > SolrIndexSearcher.getDocSetNC(Query query, DocSet filter), when executing the line > fields.terms(t.field()), a NullPointerException will occur since the fields > variable is null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-3076) Solr should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670262#comment-13670262 ] Alan Woodward edited comment on SOLR-3076 at 5/30/13 11:30 AM: --- bq. change numFound=1 to numFound=0 The failure is elsewhere, in the \*:\* query - it's expecting to find 9 docs, but actually finds 8. But I guess this is the same change. was (Author: romseygeek): bq. change numFound=1 to numFound=0 The failure is elsewhere, in the *:* query - it's expecting to find 9 docs, but actually finds 8. But I guess this is the same change. > Solr should support block joins > --- > > Key: SOLR-3076 > URL: https://issues.apache.org/jira/browse/SOLR-3076 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll > Fix For: 5.0, 4.4 > > Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, > bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, > child-bjqparser.patch, dih-3076.patch, dih-config.xml, > parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at > 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, > SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, > SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, > solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, > tochild-bjq-filtered-search-fix.patch > > > Lucene has the ability to do block joins, we should add it to Solr. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3076) Solr should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670262#comment-13670262 ] Alan Woodward commented on SOLR-3076: - bq. change numFound=1 to numFound=0 The failure is elsewhere, in the *:* query - it's expecting to find 9 docs, but actually finds 8. But I guess this is the same change. > Solr should support block joins > --- > > Key: SOLR-3076 > URL: https://issues.apache.org/jira/browse/SOLR-3076 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll > Fix For: 5.0, 4.4 > > Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, > bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, > child-bjqparser.patch, dih-3076.patch, dih-config.xml, > parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at > 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, > SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, > SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, > solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, > tochild-bjq-filtered-search-fix.patch > > > Lucene has the ability to do block joins, we should add it to Solr. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3076) Solr should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670256#comment-13670256 ] Vadim Kirilchuk commented on SOLR-3076: --- Thanks, Alan! bq. There are a bunch of test failures in the analyzing suggester suite, which is a bit odd. I will try to take a look at the weekend. bq. There's also a single test failure. Right, inconsistent behavior was fixed at some point (if you look at the test it has comment about this), so the proper way is to change numFound=1 to numFound=0. > Solr should support block joins > --- > > Key: SOLR-3076 > URL: https://issues.apache.org/jira/browse/SOLR-3076 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll > Fix For: 5.0, 4.4 > > Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, > bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, > child-bjqparser.patch, dih-3076.patch, dih-config.xml, > parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at > 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, > SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, > SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, > solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, > tochild-bjq-filtered-search-fix.patch > > > Lucene has the ability to do block joins, we should add it to Solr. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-4879) Indexing a field of type solr.SpatialRecursivePrefixTreeFieldType fails when at least two vertexes are more than 180 degrees apart
Øystein Torget created SOLR-4879: Summary: Indexing a field of type solr.SpatialRecursivePrefixTreeFieldType fails when at least two vertexes are more than 180 degrees apart Key: SOLR-4879 URL: https://issues.apache.org/jira/browse/SOLR-4879 Project: Solr Issue Type: Bug Environment: Linux, Solr 4.0.0, Solr 4.3.0 Reporter: Øystein Torget When trying to index a field of the type solr.SpatialRecursivePrefixTreeFieldType the indexing will fail if two vertexes are more than 180 longitudinal degrees apart. For instance, this polygon will fail: POLYGON((-161 49, 0 49, 20 49, 20 89.1, 0 89.1, -161 89.2,-161 49)) but this will not: POLYGON((-160 49, 0 49, 20 49, 20 89.1, 0 89.1, -160 89.2,-160 49)) This contradicts the documentation found here: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 The documentation states that each vertex must be less than 180 longitudinal degrees apart from the previous vertex. Relevant parts from the schema.xml file: -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
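Comparing the two polygons is instructive: every pair of *successive* vertices is well under 180° apart in both, so the documented per-edge rule is satisfied in each case. What differs is the total longitudinal spread — the failing polygon's vertices span 181° (-161 to 20) while the passing one spans exactly 180° (-160 to 20). A quick standalone check of that spread (assuming, as the report's data suggests, that the overall spread rather than the per-edge distance is what trips the failure):

```java
// Compute the widest longitude gap between any two vertices of a ring.
// Applied to the polygons in the report: the failing one spans 181 degrees,
// the passing one exactly 180.
class LongitudeSpanCheck {
    static double maxPairwiseLonSpan(double[] lons) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double lon : lons) {
            min = Math.min(min, lon);
            max = Math.max(max, lon);
        }
        return max - min;
    }
}
```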
[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests
[ https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670239#comment-13670239 ] Jan Høydahl commented on SOLR-4470:
---
I am currently porting the patch to trunk. There are several new APIs which need instrumentation. At the same time, I am moving RegExpAuthorizationFilter into test-framework and adding plugin support in solr.xml for plugging in your own internalRequestFactory and subRequestFactory. I will upload a new patch once ready, and probably also commit to the "security" branch. Next, I will explore [~ryantxu]'s proposal for enforcing invariant params through TestHttpSolrServer.

> Support for basic http auth in internal solr requests
> -----------------------------------------------------
>
> Key: SOLR-4470
> URL: https://issues.apache.org/jira/browse/SOLR-4470
> Project: Solr
> Issue Type: New Feature
> Components: clients - java, multicore, replication (java), SolrCloud
> Affects Versions: 4.0
> Reporter: Per Steffensen
> Assignee: Jan Høydahl
> Labels: authentication, https, solrclient, solrcloud, ssl
> Fix For: 4.4
>
> Attachments: SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, SOLR-4470.patch
>
> We want to protect any HTTP resource (URL). We want to require credentials no matter what kind of HTTP request you make to a Solr node.
> This can fairly easily be achieved as described on http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr nodes also make "internal" requests to other Solr nodes, and for those to work, credentials need to be provided there as well.
> Ideally we would like to "forward" credentials from a particular request to all the "internal" sub-requests it triggers, e.g. for search and update requests.
> But there are also "internal" requests
> * that are only indirectly/asynchronously triggered by "outside" requests (e.g. shard creation/deletion/etc. based on calls to the "Collection API")
> * that do not in any way relate to an "outside" "super"-request (e.g. replica syncing)
> We would like to aim at a solution where the "original" credentials are "forwarded" when a request directly/synchronously triggers a sub-request, with a fallback to configured "internal" credentials for the asynchronous/non-rooted requests.
> In our solution we aim at only supporting basic HTTP auth, but we would like to build a "framework" around it, so that not too much refactoring is needed if you later want to add support for other kinds of auth (e.g. digest).
> We will work on a solution, but are creating this JIRA issue early in order to get input/comments from the community as early as possible.
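For context on the mechanics being discussed: a Basic auth credential is just a base64-encoded "user:password" pair carried in an Authorization header (RFC 7617), and the forward-or-fallback behavior described above is a simple resolution rule. A minimal sketch, with hypothetical function names; this is not the patch's actual API:

```python
import base64

def basic_auth_header(user, password):
    """Build an HTTP Basic Authorization header value (RFC 7617)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

def credentials_for_subrequest(forwarded, internal):
    # Forward the originating request's credentials when the sub-request is
    # triggered directly/synchronously; fall back to the configured "internal"
    # credentials for asynchronous/non-rooted requests. Hypothetical logic
    # mirroring the description above.
    return forwarded if forwarded is not None else internal

print(basic_auth_header("user", "pass"))  # Basic dXNlcjpwYXNz
```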
[jira] [Updated] (SOLR-3076) Solr should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward updated SOLR-3076:

Attachment: SOLR-3076.patch

This patch updates the 12/10/12 patch to trunk. There are a bunch of test failures in the analyzing suggester suite, which is a bit odd. There's also a single test failure in AddBlockUpdateTest.testExceptionThrown, which I think is actually an error in the test (it seems to expect that a document with fieldtype errors in a subdoc would be added, instead of the entire block being rejected).

I have a client who's keen to get this into trunk/4x soon. Would be good to get some momentum behind it.

> Solr should support block joins
> -------------------------------
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
> Issue Type: New Feature
> Reporter: Grant Ingersoll
> Fix For: 5.0, 4.4
>
> Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, child-bjqparser.patch, dih-3076.patch, dih-config.xml, parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, tochild-bjq-filtered-search-fix.patch
>
> Lucene has the ability to do block joins, we should add it to Solr.
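As background on what the block join being discussed does: parent and child documents are indexed contiguously as one block, with the parent last in doc-ID order, so a child-to-parent join reduces to scanning each block for matching children. A language-neutral conceptual sketch over plain dicts (Lucene's actual implementation is ToParentBlockJoinQuery over doc-ID ranges, not this):

```python
def block_join_to_parent(blocks, child_matches):
    """Return parents having at least one child matching the predicate.

    Each block is a list of child docs followed by the parent doc,
    mirroring Lucene's block layout (children first, parent last).
    """
    hits = []
    for block in blocks:
        children, parent = block[:-1], block[-1]
        if any(child_matches(c) for c in children):
            hits.append(parent)
    return hits

# Hypothetical product/SKU data, a common block-join example shape.
blocks = [
    [{"sku": "red-S"}, {"sku": "red-M"}, {"product": "shirt"}],
    [{"sku": "blue-L"}, {"product": "hat"}],
]
parents = block_join_to_parent(blocks, lambda c: c["sku"].startswith("red"))
print([p["product"] for p in parents])  # ['shirt']
```

Because block membership is positional, the join needs no key lookups at query time, which is what makes block joins fast relative to key-based joins.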
[jira] [Resolved] (LUCENE-5016) Sampling can break FacetResult labeling
[ https://issues.apache.org/jira/browse/LUCENE-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5016.

Resolution: Fixed
Fix Version/s: 4.4, 5.0
Lucene Fields: New,Patch Available (was: New)

Committed to trunk and 4x. Thanks Rob for reporting this!

> Sampling can break FacetResult labeling
> ---------------------------------------
>
> Key: LUCENE-5016
> URL: https://issues.apache.org/jira/browse/LUCENE-5016
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/facet
> Affects Versions: 4.3
> Reporter: Rob Audenaerde
> Assignee: Shai Erera
> Priority: Minor
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-5016.patch, test-labels.zip
>
> When sampling FacetResults, the TopKInEachNodeHandler is used to get the FacetResults.
> This is my case: a FacetResult is returned (which matches a FacetRequest) from the StandardFacetAccumulator. The facet has 0 results. The labelling of the root node seems incorrect. I know, from the StandardFacetAccumulator, that the root node has a label, so I can use that one.
> Currently the recursivelyLabel method uses taxonomyReader.getPath() to retrieve the label. I think we can skip that for the root node when there are no children (and gain a little performance on the way too?)
[jira] [Commented] (LUCENE-5016) Sampling can break FacetResult labeling
[ https://issues.apache.org/jira/browse/LUCENE-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670200#comment-13670200 ] Gilad Barkai commented on LUCENE-5016:
---
Patch looks good. +1 for commit.

> Sampling can break FacetResult labeling
> ---------------------------------------
>
> Key: LUCENE-5016
> URL: https://issues.apache.org/jira/browse/LUCENE-5016
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/facet
> Affects Versions: 4.3
> Reporter: Rob Audenaerde
> Assignee: Shai Erera
> Priority: Minor
> Attachments: LUCENE-5016.patch, test-labels.zip
>
> When sampling FacetResults, the TopKInEachNodeHandler is used to get the FacetResults.
> This is my case: a FacetResult is returned (which matches a FacetRequest) from the StandardFacetAccumulator. The facet has 0 results. The labelling of the root node seems incorrect. I know, from the StandardFacetAccumulator, that the root node has a label, so I can use that one.
> Currently the recursivelyLabel method uses taxonomyReader.getPath() to retrieve the label. I think we can skip that for the root node when there are no children (and gain a little performance on the way too?)
[jira] [Commented] (SOLR-4580) Support for protecting content in ZK
[ https://issues.apache.org/jira/browse/SOLR-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670184#comment-13670184 ] Per Steffensen commented on SOLR-4580:
---
Documentation: https://wiki.apache.org/solr/Per%20Steffensen/ZooKeeper%20protecting%20content

> Support for protecting content in ZK
> ------------------------------------
>
> Key: SOLR-4580
> URL: https://issues.apache.org/jira/browse/SOLR-4580
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Affects Versions: 4.2
> Reporter: Per Steffensen
> Assignee: Per Steffensen
> Labels: security, solr, zookeeper
> Attachments: SOLR-4580_branch_4x_r1482255.patch
>
> We want to protect content in ZooKeeper.
> In order to run a CloudSolrServer in "client-space", you will have to open access to ZooKeeper from client-space. If you do not trust persons or systems in client-space, you want to protect ZooKeeper against evilness from client-space, e.g.:
> * Changing configuration
> * Trying to mess up the system by manipulating clusterstate
> * Adding a delete-collection job to be carried out by the Overseer
> * etc.
> Even if you do not open ZooKeeper access to someone outside your "secure zone", you might want to protect ZooKeeper content from being manipulated by e.g.:
> * Malware that found its way into the secure zone
> * Other systems also using ZooKeeper
> * etc.
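For reference on one mechanism available for this kind of protection: ZooKeeper's built-in "digest" ACL scheme identifies a user as user:base64(sha1(user:password)). A small sketch of how such an ACL id is derived (illustrative only; the user name and password are made up, and whether the patch here uses the digest scheme is not stated in this thread):

```python
import base64
import hashlib

def zk_digest_acl_id(user, password):
    """Derive a ZooKeeper 'digest'-scheme ACL id: user:base64(sha1(user:password))."""
    digest = hashlib.sha1(f"{user}:{password}".encode("utf-8")).digest()
    return f"{user}:{base64.b64encode(digest).decode('ascii')}"

# Hypothetical credentials for illustration.
print(zk_digest_acl_id("solr", "s3cret"))
```

An id derived this way can be placed in a znode's ACL so that only clients that have authenticated with the matching user:password can read or write the protected content.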