[jira] [Commented] (SOLR-4787) Join Contrib

2013-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671200#comment-13671200
 ] 

Dawid Weiss commented on SOLR-4787:
---

Oh, one more thing -- Colt is no longer maintained and there were a number of 
bugs in it. These were fixed when Colt was ported to Apache Mahout; those 
classes are now part of Mahout Math.

I'd still recommend using fastutil or HPPC, since these will be faster (by a 
small margin, but consistently).

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.2.1
>
> Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.2.1 tag. Because of changes 
> in the FieldCache API this patch will only build with Solr 4.2 or above.
> *PostFilterJoinQParserPlugin aka "pjoin"*
> The pjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the pjoin is designed to work with integer join keys 
> only. So, in order to use pjoin, integer join keys must be included in both 
> the to and from core.
> The second difference is that the pjoin builds memory structures that are 
> used to quickly connect the join keys. It also uses a custom SolrCache named 
> "join" to hold intermediate DocSets which are needed to build the join memory 
> structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
> perform the join.
> The main advantage of the pjoin is that it can scale to join millions of keys 
> between cores.
> Because it's a PostFilter, it only needs to join records that match the main 
> query.
> The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
> plugin is referenced by the string "pjoin" rather than "join".
> fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
> The example filter query above will search the fromCore (collection2) for 
> "user:customer1". This query will generate a list of values from the "from" 
> field that will be used to filter the main query. Only records from the main 
> query, where the "to" field is present in the "from" list will be included in 
> the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> pjoin.
> <queryParser name="pjoin" class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
> 
> The solrconfig.xml in the fromCore must have the "join" SolrCache configured.
> <cache name="join"
>   class="solr.LRUCache"
>   size="4096"
>   initialSize="1024"/>
> *JoinValueSourceParserPlugin aka vjoin*
> The second implementation is the JoinValueSourceParserPlugin aka "vjoin". 
> This implements a ValueSource function query that can return values from a 
> second core based on join keys. This allows relevance data to be stored in a 
> separate core and then joined in the main query.
> The vjoin is called using the "vjoin" function query. For example:
> bf=vjoin(fromCore, fromKey, fromVal, toKey)
> This example shows "vjoin" being called by the edismax boost function 
> parameter. This example will return the "fromVal" from the "fromCore". The 
> "fromKey" and "toKey" are used to link the records from the main query to the 
> records in the "fromCore".
> As with the "pjoin", both the fromKey and toKey must be integers. Also like 
> the pjoin, the "join" SolrCache is used to hold the join memory structures.
> To configure the vjoin you must register the ValueSource plugin in the 
> solrconfig.xml as follows:
> <valueSourceParser name="vjoin" class="org.apache.solr.joins.JoinValueSourceParserPlugin"/>
> *JoinValueSourceParserPlugin2 aka vjoin2 aka Personalized ValueSource Join*
> vjoin2 supports "personalized" ValueSource joins. The syntax is similar to 
> vjoin but adds an extra parameter so a query can be specified to join a 
> specific record set from the fromCore. This is designed to allow customer 
> specific relevance information to be added to the fromCore and then joined at 
> query time.
> Syntax:
> bf=vjoin2(fromCore,fromKey,fromVal,toKey,query)
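To make the pjoin mechanics described above concrete, here is a toy sketch (my own illustration, not the patch's actual code): search the fromCore, collect its integer join keys into a fast membership structure, then post-filter the main query's documents by key membership.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy sketch of the pjoin idea: collect "from" keys, then post-filter.
class PjoinSketch {

    // Stand-in for searching the fromCore: the values of the "from" field
    // for every document matching e.g. user:customer1.
    static Set<Integer> collectFromKeys(int[] fromFieldValues) {
        Set<Integer> keys = new HashSet<>();
        for (int v : fromFieldValues) {
            keys.add(v);
        }
        return keys;
    }

    // Post-filter step: only main-query docs whose "to" key appears in the
    // collected "from" keys survive. Because this runs as a PostFilter, it
    // only ever sees documents that already matched the main query.
    static List<Integer> postFilter(List<Integer> mainQueryToKeys, Set<Integer> fromKeys) {
        List<Integer> hits = new ArrayList<>();
        for (int toKey : mainQueryToKeys) {
            if (fromKeys.contains(toKey)) {
                hits.add(toKey);
            }
        }
        return hits;
    }
}
```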

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (SOLR-4787) Join Contrib

2013-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671199#comment-13671199
 ] 

Dawid Weiss commented on SOLR-4787:
---

Pull a class or two in source code form from fastutil or from HPPC. These are 
nearly identical these days; fastutil supports the Java collections 
interfaces (HPPC has its own API, not stemming from JUC). Both are 
equally fast.
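The technique behind these primitive-collections libraries can be sketched in a few lines. This is only an illustration of the open-addressing, primitive-array approach that fastutil and HPPC maps use (the real libraries add resizing, removal, and tuned hash mixing); the point is that keys and values live in flat `int[]` arrays with no `Integer` boxing.

```java
// Minimal open-addressing int->int hash map with linear probing.
// Assumes a power-of-two capacity and fewer entries than slots.
class IntIntMapSketch {
    private final int[] keys;
    private final int[] values;
    private final boolean[] used;
    private final int mask;

    IntIntMapSketch(int capacityPowerOfTwo) {
        keys = new int[capacityPowerOfTwo];
        values = new int[capacityPowerOfTwo];
        used = new boolean[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    void put(int key, int value) {
        // Cheap multiplicative hash mix, then linear probing on collision.
        int slot = (key * 0x9E3779B9) & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        used[slot] = true;
        keys[slot] = key;
        values[slot] = value;
    }

    int getOrDefault(int key, int def) {
        int slot = (key * 0x9E3779B9) & mask;
        while (used[slot]) {
            if (keys[slot] == key) {
                return values[slot];
            }
            slot = (slot + 1) & mask;
        }
        return def;
    }
}
```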

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.2.1
>
> Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch
>
>


--

[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671195#comment-13671195
 ] 

Shawn Heisey commented on SOLR-4715:


After a little bit of thought, I'm thinking the reason the ResponseParser 
object is final is so that there are no thread visibility problems, because it 
can't ever change.  The following ideas would require removing that final 
modifier and adding an object for a shared RequestWriter.

For CloudSolrServer:  If no requests have been processed yet, then the 
LBHttpSolrServer object will not yet have any internal HttpSolrServer objects, 
so passing through setParser and setRequestWriter calls should be perfectly 
safe.  We can block these methods once the first request gets processed, or we 
can just pass them through and rely on the following:

For LBHttpSolrServer, we can do one of three things with setParser and 
setRequestWriter if there are any ServerWrapper objects (and therefore 
HttpSolrServer objects):  1) Throw an exception.  2) Ignore the request.  3) 
Make the requested change on all HttpSolrServer objects.
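Option 3 above can be sketched as follows. The names here are hypothetical illustrations, not SolrJ's actual internals: the idea is that a late `setParser` call is propagated to every already-created inner server and recorded so that servers created afterwards inherit it.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for an inner HttpSolrServer.
interface InnerServer {
    void setParser(String parserName);
}

// Sketch of option 3: apply the change to all wrapped servers, and
// remember it for servers added later.
class LoadBalancerSketch {
    private final List<InnerServer> wrappers = new CopyOnWriteArrayList<>();
    private volatile String parserName; // volatile: visible to request threads

    void addServer(InnerServer server) {
        if (parserName != null) {
            server.setParser(parserName); // late-added servers inherit the setting
        }
        wrappers.add(server);
    }

    void setParser(String name) {
        parserName = name;
        for (InnerServer server : wrappers) {
            server.setParser(name); // option 3: propagate to every existing server
        }
    }
}
```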


> CloudSolrServer does not provide support for setting underlying server 
> properties
> -
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Hardik Upadhyay
>Assignee: Shawn Heisey
>  Labels: solr, solrj
>
> CloudSolrServer (and LBHttpSolrServer) do not allow the user to set 
> underlying HttpSolrServer and HttpClient settings.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671182#comment-13671182
 ] 

Shawn Heisey commented on SOLR-4816:


[~joel.bernstein] I was looking into how you switched to the binary writer so I 
could develop a patch for SOLR-4715.  You've got it creating new writer and 
parser objects for every HttpSolrServer.  Shouldn't there be one instance of 
each?  The existing LBHttpSolrServer class shares one parser object for all of 
the inner HttpSolrServer objects.

I'm struggling a bit on my patch, but if I can find a way to do it, there is 
some overlap with this issue.


> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 5.0, 4.4
>
> Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
> 1) Document routing: Updates are routed directly to the correct shard leader 
> eliminating document routing at the server.
> 2) Parallel update execution: Updates for each shard are executed in a 
> separate thread so parallel indexing can occur across the cluster.
> 3) Javabin transport: Update requests are sent via javabin transport.
> These enhancements should allow for near linear scalability on indexing 
> throughput.
> Usage:
> {code}
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField("id", "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField("id", "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList response = cloudClient.request(request); // returns a backwards-compatible condensed response
> // To get the more detailed response, downcast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;
> NamedList responses = rr.getRouteResponse();
> {code}
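The document-routing step described in the issue can be sketched like this. It is a toy illustration, not CloudSolrServer's actual router (Solr's composite-id router uses a murmur-based hash of the id): each document is bucketed by a hash of its id, so every shard's batch can then be sent straight to that shard's leader, in parallel.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of client-side document routing: hash each doc id to a shard
// and build one update batch per shard.
class RouterSketch {
    static Map<Integer, List<String>> route(List<String> docIds, int numShards) {
        Map<Integer, List<String>> batches = new HashMap<>();
        for (String id : docIds) {
            // floorMod keeps the bucket non-negative for negative hash codes
            int shard = Math.floorMod(id.hashCode(), numShards);
            batches.computeIfAbsent(shard, k -> new ArrayList<>()).add(id);
        }
        return batches; // each batch goes directly to its shard leader
    }
}
```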





[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671181#comment-13671181
 ] 

Shawn Heisey commented on SOLR-4715:


I've run into a challenge in creating a patch for this issue.  The response 
parser object in LBHttpSolrServer is final.  If there's a really good reason 
for this object to be final, then creating setParser and setRequestWriter 
methods could be really challenging.

> CloudSolrServer does not provide support for setting underlying server 
> properties
> -
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Hardik Upadhyay
>Assignee: Shawn Heisey
>  Labels: solr, solrj
>
> CloudSolrServer (and LBHttpSolrServer) do not allow the user to set 
> underlying HttpSolrServer and HttpClient settings.





[jira] [Commented] (LUCENE-4569) Allow customization of column stride field and norms via indexing chain

2013-05-30 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671164#comment-13671164
 ] 

John Wang commented on LUCENE-4569:
---

Hey Simon:

Was wondering if you had a chance to look at this.

Thanks

-John

> Allow customization of column stride field and norms via indexing chain
> ---
>
> Key: LUCENE-4569
> URL: https://issues.apache.org/jira/browse/LUCENE-4569
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 4.0
>Reporter: John Wang
>Assignee: Simon Willnauer
> Attachments: patch.diff
>
>
> We are building an in-memory indexing format and managing our own segments. 
> We are doing this by implementing a custom IndexingChain. We would like to 
> support column-stride-fields and norms without having to wire in a codec 
> (since we are managing our postings differently)
> Suggested change is consistent with the api support for passing in a custom 
> InvertedDocConsumer.





[jira] [Commented] (SOLR-4787) Join Contrib

2013-05-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671104#comment-13671104
 ] 

David Smiley commented on SOLR-4787:


I suggest either [FastUtil|http://fastutil.dsi.unimi.it], or the similar 
[HPPC|http://labs.carrotsearch.com/hppc.html] (by [~dawidweiss] here at the 
ASF).  

For a single class it may make sense to copy it in source form.  That kinda 
makes me cringe, but for just one source file, and for something that is 
externally tested and unlikely to have an unknown bug, I think it's fine.

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.2.1
>
> Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch
>
>


[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #866: POMs out of sync

2013-05-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/866/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
shard1 is not consistent.  Got 305 from 
http://127.0.0.1:25787/collection1lastClient and got 5 from 
http://127.0.0.1:25791/collection1

Stack Trace:
java.lang.AssertionError: shard1 is not consistent.  Got 305 from 
http://127.0.0.1:25787/collection1lastClient and got 5 from 
http://127.0.0.1:25791/collection1
at 
__randomizedtesting.SeedInfo.seed([5C02485FF3EE2B29:DDE4C64784B14B15]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:963)
at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:238)




Build Log:
[...truncated 24265 lines...]




[jira] [Commented] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory

2013-05-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671056#comment-13671056
 ] 

Hoss Man commented on SOLR-4858:


bq. It seems that this test started failing after the following commit:

Hmmm... git bisect?

What confuses me is that r1457641 seems to have been undone by r1457647 ? .. 
but i guess maybe r1457641 broke it, and then subsequent commits kept it broken 
even when r1457647 reverted that specific change? (totally possible that the 
other changes in r1457647 are problematic here since SOLR-4604 in general is 
about updateLog and core reload.)

> updateLog + core reload + deleteByQuery = leaked directory
> --
>
> Key: SOLR-4858
> URL: https://issues.apache.org/jira/browse/SOLR-4858
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2.1
>Reporter: Hoss Man
> Fix For: 4.3.1
>
> Attachments: SOLR-4858.patch, SOLR-4858.patch, SOLR-4858.patch
>
>
> I haven't been able to make sense of this yet, but trying to track down 
> another bug led me to discover that the following combination leads to 
> problems...
> * updateLog enabled
> * do a core reload
> * do a delete by query \*:\*
> ...leave out any one of the three, and everything works fine.





[jira] [Updated] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory

2013-05-30 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4858:
---

Attachment: SOLR-4858.patch

A much larger patch...

I initially found this bug because of a weird failure in a test i have in a 
dependent project, and it took me longer than i would have liked to reproduce it 
in a solr test because i didn't realize it was caused by using the updateLog, 
and i didn't realize how few solr tests take advantage of the updateLog.

so with that in mind, it seemed to me like we should probably increase the test 
coverage of the updateLog to see if there are any more situations that tickle 
bugs besides this odd edge case of reload+deleteByQuery.

so in this updated patch...

 * same TestReloadAndDeleteDocs as before
 * the test solrconfig.xml now defaults to using the updateLog
 * SolrTestCaseJ4 uses randomization to occasionally disable the update log 
with a sys property
 * there is currently a nocommit in SolrTestCaseJ4 forcing the sys prop to 
always be true
 * any tests using solrconfig.xml that have an explicit need to use/not-use 
updateLog override the sysprop explicitly
 * a few schema files that did not have _version_ fields are updated to include 
them

...this still only scratches the surface of increasing the test coverage for 
the UpdateLog, but it already exposes a reproducible failure in AutoCommitTest 
with the same symptoms as my TestReloadAndDeleteDocs...

 * ERROR Timeout waiting for all directory ref counts...
 * searcher leak.

(i have not yet narrowed down which method in AutoCommitTest the dir factory 
ref count is lost in)
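The randomization approach listed above can be sketched as follows. The property name and odds here are illustrative, not the actual SolrTestCaseJ4 code: the test base class occasionally flips a system property off, and the shared test solrconfig.xml reads that property to decide whether to enable the updateLog, so both code paths get exercised across test runs.

```java
import java.util.Random;

// Sketch of sys-property-based updateLog randomization for a test base class.
class UpdateLogRandomizer {
    static void randomizeUpdateLog(Random random) {
        // Mostly enabled, occasionally disabled, so tests cover both paths.
        boolean enable = random.nextInt(10) != 0;
        System.setProperty("enable.update.log", Boolean.toString(enable));
    }
}
```

Tests with an explicit need to use (or not use) the updateLog would then override the property themselves rather than rely on the randomized default.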

> updateLog + core reload + deleteByQuery = leaked directory
> --
>
> Key: SOLR-4858
> URL: https://issues.apache.org/jira/browse/SOLR-4858
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2.1
>Reporter: Hoss Man
> Fix For: 4.3.1
>
> Attachments: SOLR-4858.patch, SOLR-4858.patch, SOLR-4858.patch
>
>





[jira] [Commented] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory

2013-05-30 Thread Alexey Serba (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671045#comment-13671045
 ] 

Alexey Serba commented on SOLR-4858:


{noformat}
# this will cause a searcher leak because the directory failed to close
ant test -Dtestcase=TestReloadAndDeleteDocs 
-Dtests.method=testReloadAndDeleteDocsWithUpdateLog
{noformat}

It seems that this test started failing after the following commit:

{noformat}
0226c616297c84196753f0989b45471b59c7c09a is the first bad commit
commit 0226c616297c84196753f0989b45471b59c7c09a
Author: Mark Robert Miller 
Date:   Mon Mar 18 04:51:18 2013 +

SOLR-4604: SolrCore is not using the UpdateHandler that is passed to it in 
SolrCore#reload.

git-svn-id: 
https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1457641 
13f79535-47bb-0310-9956-ffa450edef68
{noformat}

https://github.com/apache/lucene-solr/commit/0226c616297c84196753f0989b45471b59c7c09a

> updateLog + core reload + deleteByQuery = leaked directory
> --
>
> Key: SOLR-4858
> URL: https://issues.apache.org/jira/browse/SOLR-4858
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2.1
>Reporter: Hoss Man
> Fix For: 4.3.1
>
> Attachments: SOLR-4858.patch, SOLR-4858.patch
>
>





[jira] [Commented] (SOLR-4787) Join Contrib

2013-05-30 Thread Kranti Parisa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671031#comment-13671031
 ] 

Kranti Parisa commented on SOLR-4787:
-

I have also been using the Trove lib. Along with Colt, the following look 
interesting too:
http://javolution.org/core-java/target/apidocs/javolution/util/FastMap.html
https://code.google.com/p/guava-libraries/

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.2.1
>
> Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.2.1 tag. Because of changes 
> in the FieldCache API this patch will only build with Solr 4.2 or above.
> *PostFilterJoinQParserPlugin aka "pjoin"*
> The pjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the pjoin is designed to work with integer join keys 
> only. So, in order to use pjoin, integer join keys must be included in both 
> the to and from core.
> The second difference is that the pjoin builds memory structures that are 
> used to quickly connect the join keys. It also uses a custom SolrCache named 
> "join" to hold intermediate DocSets which are needed to build the join memory 
> structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
> perform the join.
> The main advantage of the pjoin is that it can scale to join millions of keys 
> between cores.
> Because it's a PostFilter, it only needs to join records that match the main 
> query.
> The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
> plugin is referenced by the string "pjoin" rather than "join".
> fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
> The example filter query above will search the fromCore (collection2) for 
> "user:customer1". This query will generate a list of values from the "from" 
> field that will be used to filter the main query. Only records from the main 
> query, where the "to" field is present in the "from" list will be included in 
> the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> pjoin.
> <queryParser name="pjoin" class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
> 
> The solrconfig.xml in the from core must have the "join" SolrCache configured.
> <cache name="join"
>   class="solr.LRUCache"
>   size="4096"
>   initialSize="1024"
>   />
> *JoinValueSourceParserPlugin aka vjoin*
> The second implementation is the JoinValueSourceParserPlugin aka "vjoin". 
> This implements a ValueSource function query that can return values from a 
> second core based on join keys. This allows relevance data to be stored in a 
> separate core and then joined in the main query.
> The vjoin is called using the "vjoin" function query. For example:
> bf=vjoin(fromCore, fromKey, fromVal, toKey)
> This example shows "vjoin" being called by the edismax boost function 
> parameter. This example will return the "fromVal" from the "fromCore". The 
> "fromKey" and "toKey" are used to link the records from the main query to the 
> records in the "fromCore".
> As with the "pjoin", both the fromKey and toKey must be integers. Also like 
> the pjoin, the "join" SolrCache is used to hold the join memory structures.
> To configure the vjoin you must register the ValueSource plugin in the 
> solrconfig.xml as follows:
> <valueSourceParser name="vjoin" class="org.apache.solr.joins.JoinValueSourceParserPlugin" />
> *JoinValueSourceParserPlugin2 aka vjoin2 aka Personalized ValueSource Join*
> vjoin2 supports "personalized" ValueSource joins. The syntax is similar to 
> vjoin but adds an extra parameter so a query can be specified to join a 
> specific record set from the fromCore. This is designed to allow customer 
> specific relevance information to be added to the fromCore and then joined at 
> query time.
> Syntax:
> bf=vjoin2(fromCore,fromKey,fromVal,toKey,query)
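The pjoin mechanics described above (search the from core, collect its integer join keys, then post-filter main-query documents whose "to" key is in that set) can be sketched roughly as follows. This is a minimal illustration with made-up names, not the patch's actual code; one simple realization keeps the from keys in a sorted array and uses binary search for membership:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Toy sketch of the pjoin post-filter idea: the "from" core search yields a
 * set of integer join keys, and a main-query document is kept only if its
 * "to" key appears in that set. Names are illustrative, not Solr's.
 */
public class PostFilterJoinSketch {

    /** Membership test over the sorted "from" keys (binary search). */
    public static boolean joinMatches(int[] sortedFromKeys, int toKey) {
        return Arrays.binarySearch(sortedFromKeys, toKey) >= 0;
    }

    /** Keep only main-query doc ids whose join key is present in the from set. */
    public static List<Integer> postFilter(int[] mainDocIds, int[] mainJoinKeys,
                                           int[] sortedFromKeys) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < mainDocIds.length; i++) {
            if (joinMatches(sortedFromKeys, mainJoinKeys[i])) {
                kept.add(mainDocIds[i]);
            }
        }
        return kept;
    }
}
```

Because the filter runs after the main query, only documents that already match the main query pay the lookup cost, which is the reason the pjoin can scale to joins over millions of keys.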

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory

2013-05-30 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670980#comment-13670980
 ] 

Yonik Seeley commented on SOLR-4858:


It appears any deleteByQuery will cause this (the MatchAllDocuments 
deleteByQuery actually has special handling in Solr, so it's an important 
distinction).

> updateLog + core reload + deleteByQuery = leaked directory
> --
>
> Key: SOLR-4858
> URL: https://issues.apache.org/jira/browse/SOLR-4858
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2.1
>Reporter: Hoss Man
> Fix For: 4.3.1
>
> Attachments: SOLR-4858.patch, SOLR-4858.patch
>
>
> I haven't been able to make sense of this yet, but trying to track down 
> another bug led me to discover that the following combination leads to 
> problems...
> * updateLog enabled
> * do a core reload
> * do a delete by query \*:\*
> ...leave out any one of the three, and everything works fine.




[jira] [Updated] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory

2013-05-30 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4858:
---

Attachment: SOLR-4858.patch

simplified test case.

i removed the randomness and replaced it with two distinct methods testing the 
simple sequence of events with and without the update log enabled...

{noformat}
# this will pass

ant test -Dtestcase=TestReloadAndDeleteDocs 
-Dtests.method=testReloadAndDeleteDocsNoUpdateLog

# this will cause a searcher leak because the directory failed to close

ant test -Dtestcase=TestReloadAndDeleteDocs 
-Dtests.method=testReloadAndDeleteDocsWithUpdateLog
{noformat}



> updateLog + core reload + deleteByQuery = leaked directory
> --
>
> Key: SOLR-4858
> URL: https://issues.apache.org/jira/browse/SOLR-4858
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2.1
>Reporter: Hoss Man
> Fix For: 4.3.1
>
> Attachments: SOLR-4858.patch, SOLR-4858.patch
>
>




[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.6.0_45) - Build # 2845 - Still Failing!

2013-05-30 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/2845/
Java: 64bit/jdk1.6.0_45 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.
REGRESSION:  org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at 
__randomizedtesting.SeedInfo.seed([1DBD509CF6934D19:F4E7EBA4680ADDB1]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:525)
at 
org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir(TestArbitraryIndexDir.java:126)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=*[count(//doc)=1]
xml response was: 

01


request was:start=0&q=id:2&qt=standard&rows=20&versi

[jira] [Commented] (SOLR-4882) Restrict SolrResourceLoader to only classloader accessible files and instance dir

2013-05-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670798#comment-13670798
 ] 

Hoss Man commented on SOLR-4882:


bq. ... In Lucene 5.0 we should not support this anymore.

FWIW: it's not hard to imagine situations where people have legitimate desire 
for using absolute paths like this.  ie: loading synonyms or stop words from 
some central location outside of their solr home dir (eg: 
/etc/solr-common/stopwords/en.txt, used by multiple solr instances, with diff 
solr home dirs, running on diff ports).  

With that in mind, I don't think it makes sense to completely remove this 
ability -- but it certainly makes sense to disable it by default and document 
the risks.

bq. In 4.4 we should add a solrconfig.xml setting to enable the old behaviour, 
but disable it by default...

Given the lifecycle of the resource loaders, it may not be easy to have this 
configuration per-core in solrconfig.xml.  I'm also not sure if it's worth 
adding as a solr.xml config option given the complexities in how that file is 
persisted after core operations (and how many times we've screwed ourselves 
adding things to that file).

Given that this is something (i think) we should generally discourage, and 
something that i don't think we should be shy about making "hard" to turn on, 
it might be enough just to say that the only way you can enable it is with an 
explicit (and scary named) system property that affects the entire Solr 
instance?






> Restrict SolrResourceLoader to only classloader accessible files and instance 
> dir
> -
>
> Key: SOLR-4882
> URL: https://issues.apache.org/jira/browse/SOLR-4882
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 5.0, 4.4
>
>
> SolrResourceLoader currently allows loading files from any 
> absolute/CWD-relative path, which is used as a fallback if the resource 
> cannot be looked up via the class loader.
> We should limit this fallback to sub-dirs below the instanceDir passed into 
> the ctor. The CWD special case should be removed, too (the virtual CWD is 
> instance's config or root dir).
> The reason for this is security related. Some Solr components allow passing 
> in resource paths via REST parameters (e.g. XSL stylesheets, ...) and load 
> them via the resource loader. Restricting the loader makes it impossible to 
> load e.g. /etc/passwd as a stylesheet.
> In 4.4 we should add a solrconfig.xml setting to enable the old behaviour, 
> but disable it by default, for existing installations that require files 
> from outside the instance dir which are not available via the URLClassLoader 
> used internally. In Lucene 5.0 we should not support this anymore.




[jira] [Commented] (SOLR-4787) Join Contrib

2013-05-30 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670679#comment-13670679
 ] 

Joel Bernstein commented on SOLR-4787:
--

Colt looks promising and it's under the CERN license, which is very permissive. 
I'll test it out.

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.2.1
>
> Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch
>


[jira] [Commented] (SOLR-4787) Join Contrib

2013-05-30 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670678#comment-13670678
 ] 

Joel Bernstein commented on SOLR-4787:
--

I'd like to switch this to a hash join rather than using the binary search 
anyway. For longs it would be great to use a HashMap that works with primitive 
keys, like Trove. Trove is LGPL I believe, so I don't think we can use it though.

I'll look around and see if I can find another library that does what Trove 
does. 

Let me know if you know of another one or you've got an implementation lying 
around.
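For illustration, a primitive-keyed map along the lines of what Trove (or fastutil/HPPC) provides can be sketched as a small open-addressing table: long keys, int values, linear probing, no resizing. This is only a toy to show why primitive keys avoid Long boxing, not any library's actual implementation; -1 is reserved here as the "absent" value.

```java
/**
 * Toy open-addressing hash map with primitive long keys and int values.
 * Sketch only: fixed capacity (no resizing), a full table rejects new keys,
 * and -1 is reserved to mean "absent".
 */
public class PrimitiveLongIntMap {
    private final long[] keys;
    private final int[] vals;
    private final boolean[] used;
    private int size;

    public PrimitiveLongIntMap(int capacity) {
        // Round capacity up to a power of two so we can mask instead of mod.
        int cap = Integer.highestOneBit(Math.max(2, capacity - 1)) << 1;
        keys = new long[cap];
        vals = new int[cap];
        used = new boolean[cap];
    }

    private int slot(long key) {
        // Cheap multiplicative mix to spread sequential keys across the table.
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32)) & (keys.length - 1);
    }

    public void put(long key, int value) {
        if (size == keys.length) throw new IllegalStateException("map full");
        int i = slot(key);
        // Linear probing: walk until we find the key or an empty slot.
        while (used[i] && keys[i] != key) i = (i + 1) & (keys.length - 1);
        if (!used[i]) { used[i] = true; keys[i] = key; size++; }
        vals[i] = value;
    }

    /** Returns the value for key, or -1 if absent. */
    public int get(long key) {
        int i = slot(key);
        while (used[i]) {
            if (keys[i] == key) return vals[i];
            i = (i + 1) & (keys.length - 1);
        }
        return -1;
    }
}
```

The point of the exercise: `get` and `put` never allocate, whereas `HashMap<Long, Integer>` boxes every key and value and chases an Entry pointer per probe.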

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.2.1
>
> Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch
>


[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670668#comment-13670668
 ] 

Shawn Heisey commented on SOLR-4715:


I have tried some minimal testing with this code for setting the response 
parser and httpclient params, and it appears to work.

> CloudSolrServer does not provide support for setting underlying server 
> properties
> -
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Hardik Upadhyay
>Assignee: Shawn Heisey
>  Labels: solr, solrj
>
> CloudSolrServer (and LBHttpSolrServer) do not allow the user to set 
> underlying HttpSolrServer and HttpClient settings.




[jira] [Created] (SOLR-4882) Restrict SolrResourceLoader to only classloader accessible files and instance dir

2013-05-30 Thread Uwe Schindler (JIRA)
Uwe Schindler created SOLR-4882:
---

 Summary: Restrict SolrResourceLoader to only classloader 
accessible files and instance dir
 Key: SOLR-4882
 URL: https://issues.apache.org/jira/browse/SOLR-4882
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.3
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.4


SolrResourceLoader currently allows loading files from any 
absolute/CWD-relative path, which is used as a fallback if the resource cannot 
be looked up via the class loader.

We should limit this fallback to sub-dirs below the instanceDir passed into the 
ctor. The CWD special case should be removed, too (the virtual CWD is 
instance's config or root dir).

The reason for this is security related. Some Solr components allow passing in 
resource paths via REST parameters (e.g. XSL stylesheets, ...) and load them 
via the resource loader. Restricting the loader makes it impossible to load 
e.g. /etc/passwd as a stylesheet.
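The proposed containment rule can be sketched like this (an illustrative path-normalization check, not the actual SolrResourceLoader patch):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Illustrative check: a resource path is allowed only if, after
 * normalization, it still lies inside the core's instance directory.
 */
public class ResourcePathCheck {
    public static boolean isAllowed(String instanceDir, String resource) {
        Path base = Paths.get(instanceDir).toAbsolutePath().normalize();
        // Absolute resources resolve to themselves; relative ones resolve
        // against the instance dir. normalize() collapses "../" segments,
        // so traversal attempts are caught by the startsWith test below.
        Path target = base.resolve(resource).toAbsolutePath().normalize();
        return target.startsWith(base);
    }
}
```

With such a check, /etc/passwd and ../-style escapes resolve outside the instance dir and are rejected, while something like conf/stopwords.txt resolves inside it and passes.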

In 4.4 we should add a solrconfig.xml setting to enable the old behaviour, but 
disable it by default, for existing installations that require files from 
outside the instance dir which are not available via the URLClassLoader used 
internally. In Lucene 5.0 we should not support this anymore.




[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670650#comment-13670650
 ] 

Shawn Heisey commented on SOLR-4715:


[~hupadhyay], the following code **MIGHT** allow you to change the response 
parser back to XML before this issue is implemented.  I have not tested this, 
and I would be very curious about whether it works for you.  It also changes a 
couple of HttpClient parameters, but you could remove those two lines.

{code}
import java.net.MalformedURLException;

import org.apache.http.client.HttpClient;
import org.apache.solr.client.solrj.ResponseParser;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.LBHttpSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.common.params.ModifiableSolrParams;

public class TestStuff
{
    void test() throws MalformedURLException
    {
        String zkHost = "";
        // Tune the HttpClient used by the load-balanced server.
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, 1000);
        params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, 200);
        HttpClient client = HttpClientUtil.createClient(params);
        // Switch the response parser back to XML.
        ResponseParser parser = new XMLResponseParser();
        // Placeholder URL so the constructor has a server; removed again below.
        LBHttpSolrServer lbServer = new LBHttpSolrServer(client, parser,
            "http://localhost/solr");
        lbServer.removeSolrServer("http://localhost/solr");
        SolrServer server = new CloudSolrServer(zkHost, lbServer);
    }
}
{code}


> CloudSolrServer does not provide support for setting underlying server 
> properties
> -
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Hardik Upadhyay
>Assignee: Shawn Heisey
>  Labels: solr, solrj
>
> CloudSolrServer (and LBHttpSolrServer) do not allow the user to set 
> underlying HttpSolrServer and HttpClient settings.




[jira] [Comment Edited] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot

2013-05-30 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670630#comment-13670630
 ] 

David edited comment on SOLR-4805 at 5/30/13 7:25 PM:
--

I get the same issue when I try to reload a single core. The only way for me to 
currently change my configs is to restart the container.

  was (Author: dboychuck):
I get the same issue when I try to reload a single core. The only way for 
me to currently changing my configs is to restart the container.
  
> Calling Collection RELOAD where collection has a single core, leaves 
> collection offline and unusable till reboot
> 
>
> Key: SOLR-4805
> URL: https://issues.apache.org/jira/browse/SOLR-4805
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Jared Rodriguez
>Assignee: Mark Miller
> Fix For: 5.0, 4.4
>
>
> If you have a collection that is composed of a single core, then calling 
> reload on that collection leaves the core offline.  This happens even if 
> nothing at all has changed about the collection or its config.  This happens 
> whether you call reload via an http GET or if you directly call reload via 
> the collections api. 
> Tried a collection with a single core that contains data, changed nothing 
> about the config in ZK, and called reload on the collection.  The call 
> completes, but ZK flags that replica with "state":"down"
> Try it where the single core contains no data and the same thing happens, 
> ZK config updates and broadcasts "state":"down" for the replica.
> I did not try this in a multicore or replicated core environment.




[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot

2013-05-30 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670630#comment-13670630
 ] 

David commented on SOLR-4805:
-

I get the same issue when I try to reload a single core. The only way for me to 
currently change my configs is to restart the container.





[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot

2013-05-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670627#comment-13670627
 ] 

Mark Miller commented on SOLR-4805:
---

I'll fix it for 4.4 - we should stop doing preRegister when doing a reload.





[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot

2013-05-30 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670623#comment-13670623
 ] 

David commented on SOLR-4805:
-

Details here: 
http://lucene.472066.n3.nabble.com/Collections-API-Reload-killing-my-cloud-td4067141.html





[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot

2013-05-30 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670622#comment-13670622
 ] 

David commented on SOLR-4805:
-

I'm having this same issue on a cloud of 6 servers





[jira] [Resolved] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver

2013-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved SOLR-4881.
-

Resolution: Fixed

Committed to 4.3.1, 4.4 and trunk.

Thanks Hoss for pointing out the inconsistency!

> Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver
> ---
>
> Key: SOLR-4881
> URL: https://issues.apache.org/jira/browse/SOLR-4881
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 4.3
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 5.0, 4.4, 4.3.1
>
> Attachments: SOLR-4881.patch
>
>
> This was overlooked while committing SOLR-3895.




[jira] [Commented] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?

2013-05-30 Thread Geoff Cooney (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670604#comment-13670604
 ] 

Geoff Cooney commented on LUCENE-5024:
--

Hi.  I'm one of the users who reported/asked about this.

Specifically, I was wondering if it's possible to deal with this by being 
explicit about the segments_n file being in the pre-committed state. That is, 
add one byte to the segments_n file representing a boolean "isCommitted". Then 
you could treat an index whose only segments_1 file has "isCommitted"=false as 
a non-existent index.
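A rough sketch of that proposal, with hypothetical names (Lucene has no such flag today, so everything here is illustrative only): write a one-byte committed marker at the end of segments_N, flip it as the final commit step, and on open treat a lone segments_1 whose marker is unset as "no index" rather than as corruption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of the proposed "isCommitted" flag. None of these
 *  names exist in Lucene; the segments file is simulated with a plain file
 *  whose last byte records whether the commit completed. */
public class CommitFlagSketch {
    static final byte COMMITTED = 1, PRE_COMMIT = 0;

    // Append the flag while writing segments_N; a crash before the final
    // flip leaves it at PRE_COMMIT.
    static void writeSegments(Path file, byte[] segmentData, boolean committed) throws IOException {
        byte[] out = new byte[segmentData.length + 1];
        System.arraycopy(segmentData, 0, out, 0, segmentData.length);
        out[segmentData.length] = committed ? COMMITTED : PRE_COMMIT;
        Files.write(file, out);
    }

    // On open: a first-generation segments file left in the pre-commit state
    // is treated as "no index yet", not as a corrupt index.
    static boolean looksLikeIncompleteFirstCommit(Path file, int generation) throws IOException {
        byte[] data = Files.readAllBytes(file);
        return generation == 1 && data.length > 0 && data[data.length - 1] == PRE_COMMIT;
    }

    // Round-trip helper: write segments_1 with the given flag and check it.
    static boolean demo(boolean committed) throws IOException {
        Path tmp = Files.createTempFile("segments_1", "");
        writeSegments(tmp, new byte[] {42}, committed);
        boolean incomplete = looksLikeIncompleteFirstCommit(tmp, 1);
        Files.delete(tmp);
        return incomplete;
    }
}
```

The open-time check only applies to generation 1, matching the concern in this issue: for later generations there is always a prior commit to fall back to.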

> Can we reliably detect an incomplete first commit vs index corruption?
> --
>
> Key: LUCENE-5024
> URL: https://issues.apache.org/jira/browse/LUCENE-5024
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Reporter: Michael McCandless
> Fix For: 5.0, 4.4
>
>
> Normally, if something bad happens (OS, JVM, hardware crashes) while
> IndexWriter is committing, we will just fallback to the prior commit
> and no intervention necessary from the app.
> But if that commit is the first commit, then on restart IndexWriter
> will now throw CorruptIndexException, as of LUCENE-4738.
> Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the
> corrupt first commit, but that logic was dangerous and could result in
> falsely believing no index is present when one is, e.g. when transient
> IOExceptions are thrown due to file descriptor exhaustion.
> But now two users have hit this change ... see "CorruptIndexException
> when opening Index during first commit" and "Calling
> IndexWriter.commit() immediately after creating the writer", both on
> java-user.
> It would be nice to get back to not marking an incomplete first commit
> as corruption ... but we have to proceed carefully.




[jira] [Created] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver

2013-05-30 Thread Uwe Schindler (JIRA)
Uwe Schindler created SOLR-4881:
---

 Summary: Fix DocumentAnalysisRequestHandler to correctly use 
EmptyEntityResolver
 Key: SOLR-4881
 URL: https://issues.apache.org/jira/browse/SOLR-4881
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.3
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.4, 4.3.1
 Attachments: SOLR-4881.patch

This was overlooked while committing SOLR-3895 and SOLR-3614.




[jira] [Updated] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver

2013-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-4881:


Attachment: SOLR-4881.patch

Simple patch!





[jira] [Updated] (SOLR-4881) Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver

2013-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-4881:


Description: This was overlooked while committing SOLR-3895.  (was: This 
was overlooked while committing SOLR-3895 and SOLR-3614.)





[jira] [Commented] (SOLR-4693) Create a collections API to delete/cleanup a Slice

2013-05-30 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670552#comment-13670552
 ] 

Shalin Shekhar Mangar commented on SOLR-4693:
-

Thanks Anshum.

A few comments:
# Can we use "collection" instead of "name" just like we use in splitshard?
# The following code will throw an exception for a shard with no range (custom 
hashing use-case). Also, it allows deletion of slices in the construction 
state, contradicting its own error message.
{code}
// For now, only allow for deletions of Inactive slices or custom hashes (range==null).
// TODO: Add check for range gaps on Slice deletion
if (!slice.getState().equals(Slice.INACTIVE) && slice.getRange() != null) {
  throw new SolrException(ErrorCode.BAD_REQUEST,
      "The slice: " + slice.getName() + " is not currently "
          + slice.getState() + ". Only inactive (or custom-hashed) slices can be deleted.");
}
{code}
# The "deletecore" call to overseer is redundant because it is also made by the 
CoreAdmin UNLOAD action.
# Can we re-use the code between "deletecollection" and "deleteshard"? The 
collectionCmd code checks for "live" state as well.
# In DeleteSliceTest, after setSliceAsInactive(), we should poll the slice 
state until it becomes inactive or a timeout expires, instead of just waiting 
for 5000ms.
# DeleteSliceTest.waitAndConfirmSliceDeletion is wrong: it does not actually 
use the counter variable. Also, 
cloudClient.getZkStateReader().getClusterState() doesn't actually force a 
refresh of the cluster state.
# We should fail with an appropriate error message if there were nodes that 
could not be unloaded. Perhaps a separate "deletecore" call is appropriate here?
# Do we know what would happen if such a "zombie" node comes back up? We need 
to make sure it cleans up properly.
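
The polling suggestion above (re-check the slice state until it reaches the expected value or a deadline passes, instead of one fixed Thread.sleep) could look roughly like this. The Supplier stands in for re-fetching the slice state from ZkStateReader; all names are illustrative, not the test's real API:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Sketch of poll-until-state-or-timeout for a SolrCloud test. */
public class SlicePoll {
    public static boolean waitForState(Supplier<String> currentState,
                                       String expected,
                                       long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() < deadline) {
            if (expected.equals(currentState.get())) {
                return true;                        // reached the expected state early
            }
            Thread.sleep(100);                      // short poll interval, not one big sleep
        }
        return expected.equals(currentState.get()); // final check at the timeout
    }
}
```

A test that uses this finishes as soon as the slice actually becomes inactive, and only burns the full timeout in the failure case.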

> Create a collections API to delete/cleanup a Slice
> --
>
> Key: SOLR-4693
> URL: https://issues.apache.org/jira/browse/SOLR-4693
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Anshum Gupta
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-4693.patch, SOLR-4693.patch
>
>
> Have a collections API that cleans up a given shard.
> Among other places, this would be useful post the shard split call to manage 
> the parent/original slice.




[jira] [Created] (LUCENE-5026) PagedGrowableWriter

2013-05-30 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-5026:


 Summary: PagedGrowableWriter
 Key: LUCENE-5026
 URL: https://issues.apache.org/jira/browse/LUCENE-5026
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.4


We already have packed data structures that support more than 2B values, such as 
AppendingLongBuffer and MonotonicAppendingLongBuffer, but none of them supports 
random write access.

We could write a PagedGrowableWriter for this, which would essentially wrap an 
array of GrowableWriters.
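
The index routing such a PagedGrowableWriter would need can be sketched like this. Plain long[] pages stand in for the GrowableWriters, so this only shows the paging arithmetic (high bits pick the page, low bits the slot), not the grow-bits-on-demand packing:

```java
/** Minimal sketch of the paging idea: split a >2B value space into
 *  fixed-size pages and route get/set by the high/low bits of the index. */
public class PagedWriterSketch {
    private static final int PAGE_SHIFT = 10;                 // 1024 values per page
    private static final int PAGE_MASK = (1 << PAGE_SHIFT) - 1;
    private final long[][] pages;

    public PagedWriterSketch(long size) {
        // Round up so the last, partially used page exists.
        int pageCount = (int) ((size + PAGE_MASK) >>> PAGE_SHIFT);
        pages = new long[pageCount][1 << PAGE_SHIFT];
    }

    public void set(long index, long value) {                 // random write access
        pages[(int) (index >>> PAGE_SHIFT)][(int) (index & PAGE_MASK)] = value;
    }

    public long get(long index) {
        return pages[(int) (index >>> PAGE_SHIFT)][(int) (index & PAGE_MASK)];
    }
}
```

Because each page is addressed with an int, the structure as a whole can span long-sized indexes even though every underlying writer stays within int bounds.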




[jira] [Commented] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670497#comment-13670497
 ] 

Shawn Heisey commented on SOLR-4715:


My initial inclination is to *NOT* provide additional constructors, but to 
provide a number of getters and setters.  In addition to methods for setting 
timeouts and common httpclient properties, I would include getHttpClient and 
possibly something with a name like getHttpSolrServer or getInnerSolrServer.  
For CloudSolrServer, most of these new methods would just send/request the same 
information to/from LBHttpSolrServer.

Should I change my approach?  I haven't written any code yet.
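
As a sketch of that getter/setter delegation (all class and method names here are hypothetical stand-ins, not the real SolrJ API): the cloud-level client forwards settings to its wrapped load-balancing client rather than exposing new constructors, and also exposes the inner client directly for anything not covered.

```java
/** Stand-in for LBHttpSolrServer: owns the actual timeout setting. */
class InnerClient {
    private int connectTimeoutMs;
    void setConnectTimeout(int ms) { connectTimeoutMs = ms; }
    int getConnectTimeout() { return connectTimeoutMs; }
}

/** Stand-in for CloudSolrServer: delegates instead of adding constructors. */
class CloudClientSketch {
    private final InnerClient inner = new InnerClient();

    // Delegating setter: same information, forwarded to the wrapped client.
    void setConnectTimeout(int ms) { inner.setConnectTimeout(ms); }
    int getConnectTimeout() { return inner.getConnectTimeout(); }

    // Escape hatch for advanced settings not mirrored as setters.
    InnerClient getInnerClient() { return inner; }
}
```

The advantage over new constructors is that settings can be changed after construction and the constructor surface stays small.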


> CloudSolrServer does not provide support for setting underlying server 
> properties
> -
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Hardik Upadhyay
>Assignee: Shawn Heisey
>  Labels: solr, solrj
>
> CloudSolrServer (and LBHttpSolrServer) do not allow the user to set 
> underlying HttpSolrServer and HttpClient settings.




[jira] [Updated] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties

2013-05-30 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-4715:
---

Description: CloudSolrServer (and LBHttpSolrServer) do not allow the user 
to set underlying HttpSolrServer and HttpClient settings.  (was: 
CloudSolrServer as well as LBHttpSolrServer does not allow to set 
XMLResponseWriter)





[jira] [Updated] (SOLR-4715) CloudSolrServer does not provide support for setting underlying server properties

2013-05-30 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-4715:
---

Summary: CloudSolrServer does not provide support for setting underlying 
server properties  (was: CloudSolrServer does not provide support for setting 
XmlResponseWriter)

> CloudSolrServer does not provide support for setting underlying server 
> properties
> -
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Hardik Upadhyay
>Assignee: Shawn Heisey
>  Labels: solr, solrj
>
> CloudSolrServer as well as LBHttpSolrServer does not allow to set 
> XMLResponseWriter




[jira] [Updated] (SOLR-4715) CloudSolrServer does not provide support for setting XmlResponseWriter

2013-05-30 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-4715:
---

Affects Version/s: 4.3

> CloudSolrServer does not provide support for setting XmlResponseWriter
> --
>
> Key: SOLR-4715
> URL: https://issues.apache.org/jira/browse/SOLR-4715
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Hardik Upadhyay
>Assignee: Shawn Heisey
>  Labels: solr, solrj
>
> CloudSolrServer as well as LBHttpSolrServer does not allow to set 
> XMLResponseWriter




[jira] [Assigned] (SOLR-4715) CloudSolrServer does not provide support for setting XmlResponseWriter

2013-05-30 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey reassigned SOLR-4715:
--

Assignee: Shawn Heisey





[jira] [Created] (SOLR-4880) ClientUtils#toSolrInputDocument(SolrDocument d) creates shallow copy for multivalued fields

2013-05-30 Thread Ivan Hrytsyuk (JIRA)
Ivan Hrytsyuk created SOLR-4880:
---

 Summary: ClientUtils#toSolrInputDocument(SolrDocument d) creates 
shallow copy for multivalued fields
 Key: SOLR-4880
 URL: https://issues.apache.org/jira/browse/SOLR-4880
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Reporter: Ivan Hrytsyuk
 Fix For: 3.6


Multivalued fields are represented in SolrDocument as java.util.Collection.
ClientUtils#toSolrInputDocument(SolrDocument d) creates a shallow copy of these 
collections in the resulting SolrInputDocument.
That means changes to the resulting instance (e.g. adding/removing values) 
affect the original instance as well, which is bad.

*Expected Behaviour*: A deep copy of the collections should be created; changes 
to the resulting instance shouldn't affect the original instance.

*Possible Implementation*:
{code:java}
public static SolrInputDocument toSolrInputDocument(final SolrDocument solrDocument) {
    final Map<String, SolrInputField> fields = new LinkedHashMap<String, SolrInputField>();
    return toSolrInputDocument(solrDocument, fields);
}

public static SolrInputDocument toSolrInputDocument(final SolrDocument solrDocument,
        final Map<String, SolrInputField> fields) {
    final SolrInputDocument result = new SolrInputDocument(fields);
    for (final Map.Entry<String, Object> entry : solrDocument.entrySet()) {
        if (entry.getValue() instanceof Collection) {
            // Deep-copy multivalued fields so the copy is independent of the original.
            result.setField(entry.getKey(), new ArrayList<Object>((Collection<?>) entry.getValue()));
        } else {
            result.setField(entry.getKey(), entry.getValue());
        }
    }
    return result;
}
{code}

*Note*: I believe the same issue applies to 
ClientUtils#toSolrDocument(SolrInputDocument d).
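
A standalone illustration of the aliasing problem described above, using plain Maps and Lists in place of the SolrJ document classes (this is not the SolrJ API, just the bug in miniature):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Shows how a shallow copy shares the multivalued collection with the
 *  original, while a deep copy does not. */
public class ShallowCopyDemo {
    /** Returns the original's "tags" size after mutating a shallow copy,
     *  then after mutating a deep copy. */
    @SuppressWarnings("unchecked")
    public static int[] demo() {
        Map<String, Object> original = new LinkedHashMap<>();
        original.put("tags", new ArrayList<>(Arrays.asList("a", "b")));

        // Shallow copy: the copy shares the same List instance as the original.
        Map<String, Object> shallow = new LinkedHashMap<>(original);
        ((List<String>) shallow.get("tags")).add("c");
        int afterShallow = ((List<?>) original.get("tags")).size(); // original mutated to 3 elements

        // Deep copy of the collection, as the proposed fix does.
        Map<String, Object> deep = new LinkedHashMap<>();
        deep.put("tags", new ArrayList<>((List<String>) original.get("tags")));
        ((List<String>) deep.get("tags")).add("d");
        int afterDeep = ((List<?>) original.get("tags")).size();    // original unchanged, still 3

        return new int[] { afterShallow, afterDeep };
    }
}
```

The first mutation leaks back into the original document; the second, made through a deep-copied collection, does not.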






[jira] [Commented] (LUCENE-5023) Only reader that contains fields can be added into readerContext

2013-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670465#comment-13670465
 ] 

Uwe Schindler commented on LUCENE-5023:
---

The buggy code in SolrIndexSearcher has been removed. It will be released with 
4.3.1 or 4.4.

> Only reader that contains fields can be added into readerContext
> 
>
> Key: LUCENE-5023
> URL: https://issues.apache.org/jira/browse/LUCENE-5023
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.2
>Reporter: Bao Yang Yang
>Assignee: Uwe Schindler
>Priority: Critical
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When a Solr core contains only a segments file and no indexed fields, the 
> atomicReader that returns no fields should not be added to the leaves in 
> CompositeReaderContext.build(). Otherwise, in 
> SolrIndexSearcher.getDocSetNC(Query query, DocSet filter), executing the line 
> fields.terms(t.field()) throws a NullPointerException, since the fields 
> variable is null.




[jira] [Resolved] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()

2013-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved SOLR-4877.
-

Resolution: Fixed

> SolrIndexSearcher#getDocSetNC should check for null return in 
> AtomicReader#fields()
> ---
>
> Key: SOLR-4877
> URL: https://issues.apache.org/jira/browse/SOLR-4877
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2, 4.3
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 5.0, 4.4, 4.3.1
>
> Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch
>
>
> In LUCENE-5023 it was reported that composite reader contexts should not 
> contain null fields() readers. But this is wrong, as a null-fields() reader 
> may contain documents, just no fields.
> fields() and terms() is documented to return null, so DocSets should check 
> for null (like all queries do in Lucene). It seems that DocSetNC does not 
> correctly check for null.




[jira] [Updated] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()

2013-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-4877:


Fix Version/s: 4.3.1
   4.4
   5.0





[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670456#comment-13670456
 ] 

Shawn Heisey edited comment on SOLR-4816 at 5/30/13 4:17 PM:
-

bq. I still think these extras will need to be off by default until 5.

+1.  Even in version 5, it should still be possible to turn them off.  Advanced 
features (threading in particular) have a tendency to cause subtle bugs, and 
it's difficult to know if they are bugs in the underlying code or bugs in the 
advanced feature.  Being able to turn them off will greatly help with debugging.

IMHO, most tests that use CloudSolrServer should randomly turn things like 
threading on or off, change the writer and parser, etc.  Which reminds me, I 
need to file an issue and work on a patch for Cloud/LBHttpSolrServer 
implementations that includes many of the getters/setters from HttpSolrServer.


> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 5.0, 4.4
>
> Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
> 1) Document routing: Updates are routed directly to the correct shard leader 
> eliminating document routing at the server.
> 2) Parallel update execution: Updates for each shard are executed in a 
> separate thread so parallel indexing can occur across the cluster.
> 3) Javabin transport: Update requests are sent via javabin transport.
> These enhancements should allow for near linear scalability on indexing 
> throughput.
> Usage:
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField("id", "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField("id", "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList response = cloudClient.request(request); // Returns a backwards 
> compatible condensed response.
> //To get more detailed response down cast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;
> NamedList responses = rr.getRouteResponse(); 
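The shard-routing step in point 1 can be illustrated with a self-contained stand-in. Solr's actual compositeId router hashes the id with MurmurHash3 against per-shard hash ranges published in ZooKeeper, so only the shape of this sketch is comparable:

```java
// Illustration only: maps an id's hash into one of N equal buckets, which
// is the essence of client-side routing; the real router and its ranges
// come from the cluster state, not from hashCode().
public class ShardRouterSketch {
    static int route(String id, int numShards) {
        // floorMod keeps the bucket non-negative even for negative hashes.
        return Math.floorMod(id.hashCode(), numShards);
    }

    public static void main(String[] args) {
        System.out.println("doc 0 -> shard " + route("0", 4));
        System.out.println("doc 2 -> shard " + route("2", 4));
    }
}
```

Because every client computes the same bucket for the same id, updates can go straight to the right leader without a server-side forwarding hop.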

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670457#comment-13670457
 ] 

Joel Bernstein commented on SOLR-4816:
--

The initial response behaves very much like a response when document routing is 
done on the server. On the server the Solr instance sends off the docs to the 
shards to be indexed and then returns a single unified response.

This does basically the same thing, but lets you downcast to get more detail if 
you want it.






[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670456#comment-13670456
 ] 

Shawn Heisey commented on SOLR-4816:


bq. I still think these extras will need to be off by default until 5.

+1.  Even in version 5, it should still be possible to turn them off.  Advanced 
features (threading in particular) have a tendency to cause subtle bugs, and 
it's difficult to know if they are bugs in the underlying code or bugs in the 
advanced feature.  Being able to turn them off will greatly help with debugging.

IMHO, most tests that use CloudSolrServer should randomly turn things like 
threading on or off, change the writer and parser, etc.  Which reminds me, I 
need to file an issue and work on a patch for {Cloud,LBHttp}SolrServer 
implementations that includes many of the getters/setters from HttpSolrServer.





[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4816:
-

Attachment: SOLR-4816.patch

Added Routable.java




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670446#comment-13670446
 ] 

Joel Bernstein commented on SOLR-4816:
--

OK, adding it now.




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670442#comment-13670442
 ] 

Mark Miller commented on SOLR-4816:
---

I don't see Routable in the current patch.




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670439#comment-13670439
 ] 

Mark Miller commented on SOLR-4816:
---

I'll do a review shortly.

bq. It does this all by default, no switches needed.

bq. exception that condenses the info from each shard into a single response

How can that be backward compat if people are parsing the response? I'm not 
convinced you can do all this by default and be back compat, but I'll look at 
the latest patch.

And batching will again introduce a slight change in runtime behavior.

I still think these extras will need to be off by default until 5.




[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4816:
--

Fix Version/s: 4.4
   5.0




[jira] [Assigned] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-4816:
-

Assignee: Mark Miller




[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670431#comment-13670431
 ] 

Joel Bernstein edited comment on SOLR-4816 at 5/30/13 4:01 PM:
---

Latest patch is a version of CloudSolrServer that: 

1) Does document routing
2) Sends requests to each shard in a separate thread
3) Uses javabin transport
4) Is backwards compatible with both the response and exception.

It does this all by default, no switches needed.

This is accomplished by returning a response or throwing an exception that 
condenses the info from each shard into a single response or exception.

To get the full info for the response or exception you can downcast to either 
RouteResponse or RouteException, which gives you a detailed breakdown from each 
of the shards.

Will update the ticket name and description accordingly.
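The "condense per-shard info into a single response" idea can be sketched with plain collections. The map layout and the int status codes below are illustrative; the real RouteResponse carries full per-shard NamedLists:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: legacy callers read the condensed top-level view,
// while the full per-shard detail stays available for callers that downcast.
public class CondensedResponseSketch {
    static Map<String, Object> condense(Map<String, Integer> shardStatuses) {
        int worst = 0;
        for (int status : shardStatuses.values()) {
            worst = Math.max(worst, status); // any non-zero status surfaces
        }
        Map<String, Object> condensed = new LinkedHashMap<>();
        condensed.put("status", worst);          // what legacy callers read
        condensed.put("shards", shardStatuses);  // what a downcast would expose
        return condensed;
    }

    public static void main(String[] args) {
        System.out.println(condense(Map.of("shard1", 0, "shard2", 0)));
    }
}
```

This is why the change can stay backwards compatible by default: existing code sees one unified status, and only code that explicitly asks for the breakdown sees the per-shard data.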



  was (Author: joel.bernstein):
Latest patch is a version of CloudSolrServer that: 

1)Does document routing
2)Sends requests to each shard in a separate thread
3) Uses javabin transport 
4) Is backwards compatible with both the response and exception. 

It does this all by default, no switches needed.

I accomplish this by returning a response or throwing an exception that 
condenses the info from each shard into a single response or exception.

To get the full info for the response or exception you can downcast to either 
RouteResponse or RouteException, which gives you a detailed breakdown from each 
of the shards.

Will update the ticket name and description accordingly.


  



[jira] [Commented] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()

2013-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670437#comment-13670437
 ] 

Robert Muir commented on SOLR-4877:
---

+1

> SolrIndexSearcher#getDocSetNC should check for null return in 
> AtomicReader#fields()
> ---
>
> Key: SOLR-4877
> URL: https://issues.apache.org/jira/browse/SOLR-4877
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2, 4.3
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch
>
>
> In LUCENE-5023 it was reported that composite reader contexts should not 
> contain null fields() readers. But this is wrong, as a null-fields() reader 
> may contain documents, just no fields.
> fields() and terms() is documented to return null, so DocSets should check 
> for null (like all queries do in Lucene). It seems that DocSetNC does not 
> correctly check for null.
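The null-guard pattern the issue asks for can be shown with plain stand-ins. The map/set types below are illustrative substitutes for Lucene's Fields/Terms; the point is only that a documented-nullable return must be checked before use:

```java
import java.util.Map;
import java.util.Set;

// Illustrative stand-ins for AtomicReader#fields() and Fields#terms(),
// both of which are documented to return null; the real Lucene types differ.
public class NullFieldsGuard {
    static Set<Integer> matchingDocs(Map<String, Set<Integer>> fields, String field) {
        if (fields == null) return Set.of();  // reader with no fields at all
        Set<Integer> docs = fields.get(field);
        if (docs == null) return Set.of();    // field absent from this reader
        return docs;                          // otherwise, the real matches
    }
}
```

Returning an empty result instead of dereferencing null mirrors what the Lucene queries already do, which is what the patch brings DocSetNC in line with.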




[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4816:
-

Description: 
This issue adds the following enhancements to CloudSolrServer's update logic:

1) Document routing: Updates are routed directly to the correct shard leader 
eliminating document routing at the server.

2) Parallel update execution: Updates for each shard are executed in a separate 
thread so parallel indexing can occur across the cluster.

3) Javabin transport: Update requests are sent via javabin transport.

These enhancements should allow for near linear scalability on indexing 
throughput.

Usage:

CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("id", "0");
doc1.addField("a_t", "hello1");
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField("id", "2");
doc2.addField("a_t", "hello2");

UpdateRequest request = new UpdateRequest();
request.add(doc1);
request.add(doc2);
request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);

NamedList response = cloudClient.request(request); // Returns a backwards 
compatible condensed response.

//To get more detailed response down cast to RouteResponse:
CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;
NamedList responses = rr.getRouteResponse(); 

  was:
This issue adds the following enhancements to CloudSolrServer's update logic:

1) Document routing: Updates are routed directly to the correct shard leader 
eliminating document routing at the server.

2) Parallel update execution: Updates for each shard executed in a separate 
thread so parallel indexing can occur on each shard.

3) Javabin transport: The requests are sent via javabin transport.

These enhancements should allow for near linear scalability on indexing 
throughput.

Usage:

CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField(id, "0");
doc1.addField("a_t", "hello1");
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField(id, "2");
doc2.addField("a_t", "hello2");

UpdateRequest request = new UpdateRequest();
request.add(doc1);
request.add(doc2);
request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);

NamedList response = cloudClient.request(request); // Returns a backwards 
compatible condensed response.

//To get more detailed response down cast to RouteResponse:
CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;
NamedList responses = rr.getRouteResponse(); 


> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
> 1) Document routing: Updates are routed directly to the correct shard leader 
> eliminating document routing at the server.
> 2) Parallel update execution: Updates for each shard are executed in a 
> separate thread so parallel indexing can occur across the cluster.
> 3) Javabin transport: Update requests are sent via javabin transport.
> These enhancements should allow for near linear scalability on indexing 
> throughput.
> Usage:
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField("id", "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField("id", "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList response = cloudClient.request(request); // Returns a backwards 
> compatible condensed response.
> //To get more detailed response down cast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;
> NamedList responses = rr.getRouteResponse(); 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4816:
-

Description: 
This issue adds the following enhancements to CloudSolrServer's update logic:

1) Document routing: Updates are routed directly to the correct shard leader 
eliminating document routing at the server.

2) Parallel update execution: Updates for each shard executed in a separate 
thread so parallel indexing can occur on each shard.

3) Javabin transport: The requests are sent via javabin transport.

These enhancements should allow for near linear scalability on indexing 
throughput.

Usage:

CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField(id, "0");
doc1.addField("a_t", "hello1");
SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField(id, "2");
doc2.addField("a_t", "hello2");

UpdateRequest request = new UpdateRequest();
request.add(doc1);
request.add(doc2);
request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);

NamedList response = cloudClient.request(request); // Returns a backwards 
compatible condensed response.

//To get more detailed response down cast to RouteResponse:
CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;
NamedList responses = rr.getRouteResponse(); 

  was:
This issue adds a new Solr Cloud client called the 
ConcurrentUpdateCloudSolrServer. This Solr Cloud client implements document 
routing in the client so that document routing overhead is eliminated on the 
Solr servers. Documents are batched up for each shard and then each batch is 
sent in it's own thread. 

With this client, Solr Cloud indexing throughput should scale linearly with 
cluster size.

This client also has robust failover built-in because the actual requests are 
made using the LBHttpSolrServer. The list of urls used for the request to each 
shard begins with the leader and is followed by that shard's replicas. So the 
leader will be tried first and if it fails it will try the replicas.


Sample usage:

ConcurrentUpdateCloudServer client = new 
ConcurrentUpdateCloudSolrServer(zkHostAddress);
UpdateRequest request = new UpdateRequest();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 2);
doc.addField("manu","BMW");
request.add(doc);
NamedList response = client.request(request);
NamedList exceptions = response.get("exceptions"); // contains any exceptions 
from the shards
NamedList responses = response.get("responses"); // contains the responses from 
shards without exception.






> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
> 1) Document routing: Updates are routed directly to the correct shard leader 
> eliminating document routing at the server.
> 2) Parallel update execution: Updates for each shard executed in a separate 
> thread so parallel indexing can occur on each shard.
> 3) Javabin transport: The requests are sent via javabin transport.
> These enhancements should allow for near linear scalability on indexing 
> throughput.
> Usage:
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField(id, "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField(id, "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList response = cloudClient.request(request); // Returns a backwards 
> compatible condensed response.
> //To get more detailed response down cast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;
> NamedList responses = rr.getRouteResponse(); 


--

[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4816:
-

Summary: Add document routing to CloudSolrServer  (was: 
ConcurrentUpdateCloudSolrServer)

> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds a new Solr Cloud client called the 
> ConcurrentUpdateCloudSolrServer. This Solr Cloud client implements document 
> routing in the client so that document routing overhead is eliminated on the 
> Solr servers. Documents are batched up for each shard and then each batch is 
> sent in it's own thread. 
> With this client, Solr Cloud indexing throughput should scale linearly with 
> cluster size.
> This client also has robust failover built-in because the actual requests are 
> made using the LBHttpSolrServer. The list of urls used for the request to 
> each shard begins with the leader and is followed by that shard's replicas. 
> So the leader will be tried first and if it fails it will try the replicas.
> Sample usage:
> ConcurrentUpdateCloudServer client = new 
> ConcurrentUpdateCloudSolrServer(zkHostAddress);
> UpdateRequest request = new UpdateRequest();
> SolrInputDocument doc = new SolrInputDocument();
> doc.addField("id", 2);
> doc.addField("manu","BMW");
> request.add(doc);
> NamedList response = client.request(request);
> NamedList exceptions = response.get("exceptions"); // contains any exceptions 
> from the shards
> NamedList responses = response.get("responses"); // contains the responses 
> from shards without exception.
> 


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4816) ConcurrentUpdateCloudSolrServer

2013-05-30 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4816:
-

Attachment: SOLR-4816.patch

Latest patch is a version of CloudSolrServer that: 

1) Does document routing
2) Sends requests to each shard in a separate thread
3) Uses javabin transport 
4) Is backwards compatible with both the response and exception. 

It does all this by default; no switches needed.

I accomplish this by returning a response or throwing an exception that 
condenses the info from each shard into a single response or exception.

To get the full info for the response or exception you can down-cast to either 
RouteResponse or RouteException, which gives you a detailed breakdown from each 
of the shards.

Will update the ticket name and description accordingly.



> ConcurrentUpdateCloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds a new Solr Cloud client called the 
> ConcurrentUpdateCloudSolrServer. This Solr Cloud client implements document 
> routing in the client so that document routing overhead is eliminated on the 
> Solr servers. Documents are batched up for each shard and then each batch is 
> sent in it's own thread. 
> With this client, Solr Cloud indexing throughput should scale linearly with 
> cluster size.
> This client also has robust failover built-in because the actual requests are 
> made using the LBHttpSolrServer. The list of urls used for the request to 
> each shard begins with the leader and is followed by that shard's replicas. 
> So the leader will be tried first and if it fails it will try the replicas.
> Sample usage:
> ConcurrentUpdateCloudServer client = new 
> ConcurrentUpdateCloudSolrServer(zkHostAddress);
> UpdateRequest request = new UpdateRequest();
> SolrInputDocument doc = new SolrInputDocument();
> doc.addField("id", 2);
> doc.addField("manu","BMW");
> request.add(doc);
> NamedList response = client.request(request);
> NamedList exceptions = response.get("exceptions"); // contains any exceptions 
> from the shards
> NamedList responses = response.get("responses"); // contains the responses 
> from shards without exception.
> 




[jira] [Resolved] (SOLR-4870) RecentUpdates.update() does not increment numUpdates counter inside loop

2013-05-30 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-4870.
-

   Resolution: Fixed
Fix Version/s: 4.3.1
 Assignee: Shalin Shekhar Mangar

Committed.

trunk: r1487897
branch_4x: r1487899
lucene_solr_4_3: r1487900

> RecentUpdates.update() does not increment numUpdates counter inside loop
> 
>
> Key: SOLR-4870
> URL: https://issues.apache.org/jira/browse/SOLR-4870
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
> Fix For: 4.3.1
>
>
> As reported by AlexeyK on solr-user:
> http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-td4065549.html
> {quote}
> Speaking about the update log - i have noticed a strange behavior concerning
> the replay. The replay is *supposed* to be done for a predefined number of
> log entries, but actually it is always done for the whole last 2 tlogs.
> RecentUpdates.update() reads log within  while (numUpdates <
> numRecordsToKeep), while numUpdates is never incremented, so it exits when
> the reader reaches EOF.
> {quote}
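The bug pattern described above can be illustrated with a minimal bounded-read loop (BoundedReadSketch and its names are hypothetical; the real code lives in Solr's UpdateLog.RecentUpdates and reads transaction-log records, not strings):

```java
import java.util.Iterator;

// Minimal illustration of the bug: a bounded read loop whose counter is
// never incremented never hits its bound, so it only exits at end of
// input (EOF in the tlog reader) and replays the whole last tlogs.
public class BoundedReadSketch {
    static int readAtMost(Iterator<String> reader, int numRecordsToKeep) {
        int numUpdates = 0;
        while (numUpdates < numRecordsToKeep && reader.hasNext()) {
            reader.next();     // consume one log record
            numUpdates++;      // the fix: without this line the bound is dead code
        }
        return numUpdates;
    }
}
```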




[jira] [Commented] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?

2013-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670416#comment-13670416
 ] 

Robert Muir commented on LUCENE-5024:
-

The best solution here, I think, is removal of create-or-append.

Really, index creation can be a one-time thing you must do separately before 
you can use the directory. This is typically how it's done: Lucene is weird and 
has this broken mechanism today instead.

> Can we reliably detect an incomplete first commit vs index corruption?
> --
>
> Key: LUCENE-5024
> URL: https://issues.apache.org/jira/browse/LUCENE-5024
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Reporter: Michael McCandless
> Fix For: 5.0, 4.4
>
>
> Normally, if something bad happens (OS, JVM, hardware crashes) while
> IndexWriter is committing, we will just fallback to the prior commit
> and no intervention necessary from the app.
> But if that commit is the first commit, then on restart IndexWriter
> will now throw CorruptIndexException, as of LUCENE-4738.
> Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the
> corrupt first commit, but that logic was dangerous and could result in
> falsely believing no index is present when one is, e.g. when transient
> IOExceptions are thrown due to file descriptor exhaustion.
> But now two users have hit this change ... see "CorruptIndexException
> when opening Index during first commit" and "Calling
> IndexWriter.commit() immediately after creating the writer", both on
> java-user.
> It would be nice to get back to not marking an incomplete first commit
> as corruption ... but we have to proceed carefully.




[jira] [Commented] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?

2013-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670410#comment-13670410
 ] 

Robert Muir commented on LUCENE-5024:
-

even if we can, i'm not sure we should.

real users hit corruption issues too. sorry to the two java-users for the 
inconvenience, but corruption/dataloss is WAY worse.

> Can we reliably detect an incomplete first commit vs index corruption?
> --
>
> Key: LUCENE-5024
> URL: https://issues.apache.org/jira/browse/LUCENE-5024
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Reporter: Michael McCandless
> Fix For: 5.0, 4.4
>
>
> Normally, if something bad happens (OS, JVM, hardware crashes) while
> IndexWriter is committing, we will just fallback to the prior commit
> and no intervention necessary from the app.
> But if that commit is the first commit, then on restart IndexWriter
> will now throw CorruptIndexException, as of LUCENE-4738.
> Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the
> corrupt first commit, but that logic was dangerous and could result in
> falsely believing no index is present when one is, e.g. when transient
> IOExceptions are thrown due to file descriptor exhaustion.
> But now two users have hit this change ... see "CorruptIndexException
> when opening Index during first commit" and "Calling
> IndexWriter.commit() immediately after creating the writer", both on
> java-user.
> It would be nice to get back to not marking an incomplete first commit
> as corruption ... but we have to proceed carefully.




[jira] [Created] (LUCENE-5025) Allow more than 2.1B "tail nodes" when building FST

2013-05-30 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-5025:
--

 Summary: Allow more than 2.1B "tail nodes" when building FST
 Key: LUCENE-5025
 URL: https://issues.apache.org/jira/browse/LUCENE-5025
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Michael McCandless
 Fix For: 5.0, 4.4


We recently relaxed some of the limits for big FSTs, but there is
one more limit I think we should fix.  E.g. Aaron hit it in building
the world's biggest FST: 
http://aaron.blog.archive.org/2013/05/29/worlds-biggest-fst/

The issue is NodeHash, which currently uses a GrowableWriter (packed
ints impl that can grow both number of bits and number of values):
it's indexed by int not long.

This is a hash table that's used to share suffixes, so we need random
get/put on a long index of long values, i.e. this is logically a long[].

I think one simple way to do this is to make a "paged"
GrowableWriter...

Along with this we'd need to fix the hash codes to be long not
int.





[jira] [Created] (LUCENE-5024) Can we reliably detect an incomplete first commit vs index corruption?

2013-05-30 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-5024:
--

 Summary: Can we reliably detect an incomplete first commit vs 
index corruption?
 Key: LUCENE-5024
 URL: https://issues.apache.org/jira/browse/LUCENE-5024
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
 Fix For: 5.0, 4.4


Normally, if something bad happens (OS, JVM, hardware crashes) while
IndexWriter is committing, we will just fallback to the prior commit
and no intervention necessary from the app.

But if that commit is the first commit, then on restart IndexWriter
will now throw CorruptIndexException, as of LUCENE-4738.

Prior to LUCENE-4738, in LUCENE-2812, we used to try to detect the
corrupt first commit, but that logic was dangerous and could result in
falsely believing no index is present when one is, e.g. when transient
IOExceptions are thrown due to file descriptor exhaustion.

But now two users have hit this change ... see "CorruptIndexException
when opening Index during first commit" and "Calling
IndexWriter.commit() immediately after creating the writer", both on
java-user.

It would be nice to get back to not marking an incomplete first commit
as corruption ... but we have to proceed carefully.





[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2013-05-30 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670363#comment-13670363
 ] 

Chris Russell commented on SOLR-2894:
-

I will take a look.

> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.4
>
> Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894-reworked.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.




[JENKINS] Lucene-Solr-4.x-Linux () - Build # 5839 - Failure!

2013-05-30 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/5839/
Java: 

No tests ran.

Build Log:
[...truncated 27 lines...]



[JENKINS] Lucene-Solr-trunk-Linux () - Build # 5902 - Still Failing!

2013-05-30 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/5902/
Java: 

No tests ran.

Build Log:
[...truncated 25 lines...]



[jira] [Assigned] (SOLR-4875) DIH XPathRecordReader cannot handle two ways to read same attribute together

2013-05-30 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-4875:
---

Assignee: Noble Paul

> DIH XPathRecordReader cannot handle two ways to read same attribute together
> 
>
> Key: SOLR-4875
> URL: https://issues.apache.org/jira/browse/SOLR-4875
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Affects Versions: 4.3
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4875.patch
>
>
> From my comment on solr-user mailing list:
> {quote}
> I think there is a bug here. In my tests, xpath="/root/a/@y" works, 
> xpath="/root/a[@x='1']/@y" also works. But if you use them together the one 
> which is defined last returns null. I'll open an issue.
> {quote}
> http://lucene.472066.n3.nabble.com/Problem-with-xpath-expression-in-data-config-xml-td4066744.html




[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670287#comment-13670287
 ] 

Mark Miller commented on SOLR-4470:
---

bq.  And if anyone wishes to setup a similar setup in their production they may 
borrow code from the test class, but it will be a manual step reinforcing that 
this is not a supported feature of the project as such.

I would be much more okay with this - in this way, we are not responsible for 
the security this code provides. It's not shipping production Solr code; it's 
code a user can plug in as a filter himself and be responsible for himself.
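The "plug in a filter himself" approach boils down to checking the Basic auth header on each request. A minimal, hypothetical sketch of just the header check (Base64 decode and compare) is below; a real deployment would wire this into a servlet filter and compare against configured credentials rather than values passed in directly:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: validate an HTTP Basic "Authorization" header.
// BasicAuthSketch is illustrative only and not part of any patch here.
public class BasicAuthSketch {
    static boolean isAuthorized(String authHeader, String user, String password) {
        if (authHeader == null || !authHeader.startsWith("Basic ")) {
            return false; // missing or non-Basic header: reject
        }
        // Header payload is base64("user:password") per RFC 7617.
        String decoded = new String(
            Base64.getDecoder().decode(authHeader.substring("Basic ".length())),
            StandardCharsets.UTF_8);
        return decoded.equals(user + ":" + password);
    }
}
```

Keeping this logic in user-supplied filter code, as suggested above, is exactly what keeps the project out of the business of vouching for its security.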

> Support for basic http auth in internal solr requests
> -
>
> Key: SOLR-4470
> URL: https://issues.apache.org/jira/browse/SOLR-4470
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java, multicore, replication (java), SolrCloud
>Affects Versions: 4.0
>Reporter: Per Steffensen
>Assignee: Jan Høydahl
>  Labels: authentication, https, solrclient, solrcloud, ssl
> Fix For: 4.4
>
> Attachments: SOLR-4470_branch_4x_r1452629.patch, 
> SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
> SOLR-4470.patch
>
>
> We want to protect any HTTP-resource (url). We want to require credentials no 
> matter what kind of HTTP-request you make to a Solr-node.
> It can fairly easily be achieved as described on 
> http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
> also make "internal" requests to other Solr-nodes, and for it to work, 
> credentials need to be provided there also.
> Ideally we would like to "forward" credentials from a particular request to 
> all the "internal" sub-requests it triggers. E.g. for search and update 
> request.
> But there are also "internal" requests
> * that only indirectly/asynchronously triggered from "outside" requests (e.g. 
> shard creation/deletion/etc based on calls to the "Collection API")
> * that do not in any way have relation to an "outside" "super"-request (e.g. 
> replica synching stuff)
> We would like to aim at a solution where "original" credentials are 
> "forwarded" when a request directly/synchronously trigger a subrequest, and 
> fallback to a configured "internal credentials" for the 
> asynchronous/non-rooted requests.
> In our solution we would aim at only supporting basic http auth, but we would 
> like to make a "framework" around it, so that not too much refactoring is 
> needed if you later want to add support for other kinds of auth (e.g. digest)
> We will work at a solution but create this JIRA issue early in order to get 
> input/comments from the community as early as possible.




[jira] [Commented] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()

2013-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670273#comment-13670273
 ] 

Uwe Schindler commented on SOLR-4877:
-

I will commit the "nospecialcase" patch if nobody objects.

> SolrIndexSearcher#getDocSetNC should check for null return in 
> AtomicReader#fields()
> ---
>
> Key: SOLR-4877
> URL: https://issues.apache.org/jira/browse/SOLR-4877
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2, 4.3
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch
>
>
> In LUCENE-5023 it was reported that composite reader contexts should not 
> contain null fields() readers. But this is wrong, as a null-fields() reader 
> may contain documents, just no fields.
> fields() and terms() is documented to return null, so DocSets should check 
> for null (like all queries do in Lucene). It seems that DocSetNC does not 
> correctly check for null.
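The guard being discussed amounts to a null check before the term lookup. A hedged sketch with a hypothetical, simplified Fields interface (the real types are Lucene's AtomicReader.fields() and Terms):

```java
// Sketch of the missing guard: fields() is documented to return null
// when a segment has documents but no indexed fields, so callers must
// check for null before looking up terms (simplified hypothetical types).
public class NullFieldsSketch {
    interface Fields {
        Object terms(String field);
    }

    // Returns null (an empty result) instead of throwing an NPE when the
    // reader exposes no fields at all.
    static Object termsOrNull(Fields fields, String field) {
        if (fields == null) {
            return null; // segment has docs but no fields: nothing can match
        }
        return fields.terms(field);
    }
}
```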




[jira] [Updated] (SOLR-4877) SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()

2013-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-4877:


Description: 
In LUCENE-5023 it was reported that composite reader contexts should not 
contain null fields() readers. But this is wrong, as a null-fields() reader may 
contain documents, just no fields.

fields() and terms() is documented to return null, so DocSets should check for 
null (like all queries do in Lucene). It seems that DocSetNC does not correctly 
check for null.

  was:
In LUCENE-5023 it was reported that composite reader contexts should not 
contain null fields() readers. But this is wrong, as a null-fields() reader may 
contain documents,m just no fields.

fields() is documented to contain null fields, so DocSets should check for null 
(like all fields do in Lucene). It seems that DocSetNC does not correctly check 
for null.


> SolrIndexSearcher#getDocSetNC should check for null return in 
> AtomicReader#fields()
> ---
>
> Key: SOLR-4877
> URL: https://issues.apache.org/jira/browse/SOLR-4877
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2, 4.3
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Attachments: SOLR-4877-nospecialcase.patch, SOLR-4877.patch
>
>
> In LUCENE-5023 it was reported that composite reader contexts should not 
> contain null fields() readers. But this is wrong, as a null-fields() reader 
> may contain documents, just no fields.
> fields() and terms() are documented to possibly return null, so DocSets 
> should check for null (as all queries in Lucene do). It seems that DocSetNC 
> does not correctly check for null.
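For illustration, the defensive pattern the issue asks for can be sketched as 
follows. This is a simplified stand-in, not the actual Lucene API: the real 
code works against org.apache.lucene.index.AtomicReader, whose fields() and 
Fields#terms() are the calls that may return null; here a Map plays their role.

```java
import java.util.HashMap;
import java.util.Map;

public class NullFieldsCheck {
    // Stand-in for AtomicReader#fields(): documented to return null when a
    // segment contains documents but no indexed fields.
    static Map<String, String> fields(boolean segmentHasFields) {
        return segmentHasFields ? new HashMap<>() : null;
    }

    // The defensive pattern getDocSetNC needs: check both fields() and
    // terms() for null instead of dereferencing them directly.
    static boolean hasTerms(boolean segmentHasFields, String field) {
        Map<String, String> fields = fields(segmentHasFields);
        if (fields == null) {
            return false; // segment has no indexed fields: skip it, no NPE
        }
        String terms = fields.get(field); // stands in for fields.terms(field)
        if (terms == null) {
            return false; // this field is not indexed in this segment
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasTerms(false, "id")); // false, rather than NPE
    }
}
```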




[jira] [Updated] (LUCENE-5023) Only reader that contains fields can be added into readerContext

2013-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5023:
--

Labels:   (was: patch)

> Only reader that contains fields can be added into readerContext
> 
>
> Key: LUCENE-5023
> URL: https://issues.apache.org/jira/browse/LUCENE-5023
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.2
>Reporter: Bao Yang Yang
>Assignee: Uwe Schindler
>Priority: Critical
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When a Solr core contains only a segments file, i.e. no indexes at all, the 
> atomic reader that returns no fields should not be added to the leaves in 
> CompositeReaderContext.build(). Otherwise, in 
> SolrIndexSearcher.getDocSetNC(Query query, DocSet filter), executing 
> fields.terms(t.field()) throws a NullPointerException, since the fields 
> variable is null.




[jira] [Comment Edited] (SOLR-3076) Solr should support block joins

2013-05-30 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670262#comment-13670262
 ] 

Alan Woodward edited comment on SOLR-3076 at 5/30/13 11:30 AM:
---

bq. change numFound=1 to numFound=0

The failure is elsewhere, in the \*:\* query - it's expecting to find 9 docs, 
but actually finds 8.  But I guess this is the same change.

  was (Author: romseygeek):
bq. change numFound=1 to numFound=0

The failure is elsewhere, in the *:* query - it's expecting to find 9 docs, but 
actually finds 8.  But I guess this is the same change.
  
> Solr should support block joins
> ---
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
> Fix For: 5.0, 4.4
>
> Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, 
> bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, 
> child-bjqparser.patch, dih-3076.patch, dih-config.xml, 
> parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at 
> 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, 
> solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, 
> tochild-bjq-filtered-search-fix.patch
>
>
> Lucene has the ability to do block joins, we should add it to Solr.




[jira] [Commented] (SOLR-3076) Solr should support block joins

2013-05-30 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670262#comment-13670262
 ] 

Alan Woodward commented on SOLR-3076:
-

bq. change numFound=1 to numFound=0

The failure is elsewhere, in the *:* query - it's expecting to find 9 docs, but 
actually finds 8.  But I guess this is the same change.

> Solr should support block joins
> ---
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
> Fix For: 5.0, 4.4
>
> Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, 
> bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, 
> child-bjqparser.patch, dih-3076.patch, dih-config.xml, 
> parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at 
> 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, 
> solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, 
> tochild-bjq-filtered-search-fix.patch
>
>
> Lucene has the ability to do block joins, we should add it to Solr.




[jira] [Commented] (SOLR-3076) Solr should support block joins

2013-05-30 Thread Vadim Kirilchuk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670256#comment-13670256
 ] 

Vadim Kirilchuk commented on SOLR-3076:
---

Thanks, Alan! 

bq. There are a bunch of test failures in the analyzing suggester suite, which 
is a bit odd.
I will try to take a look at the weekend.

bq. There's also a single test failure. 
Right, the inconsistent behavior was fixed at some point (the test has a 
comment about this), so the proper fix is to change numFound=1 to numFound=0.


> Solr should support block joins
> ---
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
> Fix For: 5.0, 4.4
>
> Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, 
> bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, 
> child-bjqparser.patch, dih-3076.patch, dih-config.xml, 
> parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at 
> 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, 
> solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, 
> tochild-bjq-filtered-search-fix.patch
>
>
> Lucene has the ability to do block joins, we should add it to Solr.




[jira] [Created] (SOLR-4879) Indexing a field of type solr.SpatialRecursivePrefixTreeFieldType fails when at least two vertexes are more than 180 degrees apart

2013-05-30 Thread JIRA
Øystein Torget created SOLR-4879:


 Summary: Indexing a field of type 
solr.SpatialRecursivePrefixTreeFieldType fails when at least two vertexes are 
more than 180 degrees apart
 Key: SOLR-4879
 URL: https://issues.apache.org/jira/browse/SOLR-4879
 Project: Solr
  Issue Type: Bug
 Environment: Linux, Solr 4.0.0, Solr 4.3.0
Reporter: Øystein Torget


When trying to index a field of the type 
solr.SpatialRecursivePrefixTreeFieldType, indexing will fail if two vertices 
are more than 180 degrees of longitude apart.

For instance this polygon will fail: 

POLYGON((-161 49,  0 49,   20 49,   20 89.1,  0 89.1,   -161 89.2,-161 49))

but this one will not:

POLYGON((-160 49,  0 49,   20 49,   20 89.1,  0 89.1,   -160 89.2,-160 49))

This contradicts the documentation found here: 
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4

The documentation states that each vertex must be less than 180 degrees of 
longitude apart from the previous vertex.
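One observation that can be checked directly from the two rings: in both 
polygons every consecutive pair of vertices is well under 180 degrees apart, 
but the total longitudinal extent is 181 degrees for the failing polygon and 
exactly 180 for the working one. This suggests (an assumption on my part, not 
confirmed in the issue) that the effective limit is the polygon's overall 
width, not the vertex-to-vertex distance:

```java
public class LonExtent {
    // Total longitudinal extent (max - min) over a ring's longitudes.
    static double lonExtent(double[] lons) {
        double min = lons[0], max = lons[0];
        for (double lon : lons) {
            min = Math.min(min, lon);
            max = Math.max(max, lon);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Longitudes of the failing and working polygons from the report.
        double[] failing = {-161, 0, 20, 20, 0, -161, -161};
        double[] working = {-160, 0, 20, 20, 0, -160, -160};
        System.out.println(lonExtent(failing)); // 181.0
        System.out.println(lonExtent(working)); // 180.0
    }
}
```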

Relevant parts from the schema.xml file:









[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670239#comment-13670239
 ] 

Jan Høydahl commented on SOLR-4470:
---

I am currently porting the patch to trunk. There are several new APIs that 
need instrumentation.

At the same time, I am also moving RegExpAuthorizationFilter into 
test-framework and adding plugin support in solr.xml for plugging in your own 
internalRequestFactory and subRequestFactory.

Will upload a new patch once ready, and probably also commit to the "security" 
branch. Next, will explore [~ryantxu]'s proposal for enforcing invariant params 
through TestHttpSolrServer.

> Support for basic http auth in internal solr requests
> -
>
> Key: SOLR-4470
> URL: https://issues.apache.org/jira/browse/SOLR-4470
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java, multicore, replication (java), SolrCloud
>Affects Versions: 4.0
>Reporter: Per Steffensen
>Assignee: Jan Høydahl
>  Labels: authentication, https, solrclient, solrcloud, ssl
> Fix For: 4.4
>
> Attachments: SOLR-4470_branch_4x_r1452629.patch, 
> SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
> SOLR-4470.patch
>
>
> We want to protect any HTTP-resource (url). We want to require credentials no 
> matter what kind of HTTP-request you make to a Solr-node.
> It can fairly easily be achieved as described at 
> http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr nodes 
> also make "internal" requests to other Solr nodes, and for this to work, 
> credentials need to be provided there as well.
> Ideally we would like to "forward" credentials from a particular request to 
> all the "internal" sub-requests it triggers. E.g. for search and update 
> request.
> But there are also "internal" requests
> * that are only indirectly/asynchronously triggered by "outside" requests 
> (e.g. shard creation/deletion/etc. based on calls to the "Collection API")
> * that have no relation at all to an "outside" "super"-request (e.g. replica 
> syncing)
> We would like to aim at a solution where the "original" credentials are 
> "forwarded" when a request directly/synchronously triggers a sub-request, 
> falling back to configured "internal credentials" for the 
> asynchronous/non-rooted requests.
> In our solution we aim at supporting only basic HTTP auth, but we would 
> like to build a "framework" around it, so that not too much refactoring is 
> needed if support for other kinds of auth (e.g. digest) is added later.
> We will work on a solution, but we are creating this JIRA issue early in 
> order to get input/comments from the community as early as possible.
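For reference, the Basic auth credential that each internal sub-request would 
have to carry is just a standard Authorization header. The sketch below uses 
plain java.util.Base64, not the patch's actual API, and the user/password 
values are made up:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {
    // Builds the HTTP Authorization header value for Basic auth
    // (RFC 7617): "Basic " + base64(user + ":" + password).
    static String basicAuth(String user, String password) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return "Basic " + token;
    }

    public static void main(String[] args) {
        // A forwarding solution would set this on every internal
        // sub-request before sending it, e.g.:
        //   request.setHeader("Authorization", basicAuth(user, password));
        System.out.println(basicAuth("solr", "secret")); // Basic c29scjpzZWNyZXQ=
    }
}
```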




[jira] [Updated] (SOLR-3076) Solr should support block joins

2013-05-30 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated SOLR-3076:


Attachment: SOLR-3076.patch

This patch updates the 12/10/12 patch to trunk.  There are a bunch of test 
failures in the analyzing suggester suite, which is a bit odd.  There's also a 
single test failure in AddBlockUpdateTest.testExceptionThrown, which I think is 
actually an error in the test (it seems to expect that a document with 
fieldtype errors in a subdoc would be added, instead of the entire block being 
rejected).

I have a client who's keen to get this into trunk/4x soon.  Would be good to 
get some momentum behind it.

> Solr should support block joins
> ---
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
> Fix For: 5.0, 4.4
>
> Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, 
> bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, 
> child-bjqparser.patch, dih-3076.patch, dih-config.xml, 
> parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at 
> 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, 
> solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, 
> tochild-bjq-filtered-search-fix.patch
>
>
> Lucene has the ability to do block joins, we should add it to Solr.




[jira] [Resolved] (LUCENE-5016) Sampling can break FacetResult labeling

2013-05-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5016.


   Resolution: Fixed
Fix Version/s: 4.4, 5.0
Lucene Fields: New, Patch Available (was: New)

Committed to trunk and 4x. Thanks Rob for reporting this!

> Sampling can break FacetResult labeling 
> 
>
> Key: LUCENE-5016
> URL: https://issues.apache.org/jira/browse/LUCENE-5016
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.3
>Reporter: Rob Audenaerde
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-5016.patch, test-labels.zip
>
>
> When sampling FacetResults, the TopKInEachNodeHandler is used to get the 
> FacetResults.
> This is my case:
> A FacetResult is returned (which matches a FacetRequest) from the 
> StandardFacetAccumulator. The facet has 0 results. The labelling of the 
> root-node seems incorrect. I know, from the StandardFacetAccumulator, that 
> the rootnode has a label, so I can use that one.
> Currently the recursivelyLabel method uses the taxonomyReader.getPath() to 
> retrieve the label. I think we can skip that for the rootNode when there are 
> no children (and gain a little performance on the way too?)




[jira] [Commented] (LUCENE-5016) Sampling can break FacetResult labeling

2013-05-30 Thread Gilad Barkai (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670200#comment-13670200
 ] 

Gilad Barkai commented on LUCENE-5016:
--

Patch looks good.
+1 for commit 

> Sampling can break FacetResult labeling 
> 
>
> Key: LUCENE-5016
> URL: https://issues.apache.org/jira/browse/LUCENE-5016
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.3
>Reporter: Rob Audenaerde
>Assignee: Shai Erera
>Priority: Minor
> Attachments: LUCENE-5016.patch, test-labels.zip
>
>
> When sampling FacetResults, the TopKInEachNodeHandler is used to get the 
> FacetResults.
> This is my case:
> A FacetResult is returned (which matches a FacetRequest) from the 
> StandardFacetAccumulator. The facet has 0 results. The labelling of the 
> root-node seems incorrect. I know, from the StandardFacetAccumulator, that 
> the rootnode has a label, so I can use that one.
> Currently the recursivelyLabel method uses the taxonomyReader.getPath() to 
> retrieve the label. I think we can skip that for the rootNode when there are 
> no children (and gain a little performance on the way too?)




[jira] [Commented] (SOLR-4580) Support for protecting content in ZK

2013-05-30 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670184#comment-13670184
 ] 

Per Steffensen commented on SOLR-4580:
--

Documentation: 
https://wiki.apache.org/solr/Per%20Steffensen/ZooKeeper%20protecting%20content

> Support for protecting content in ZK
> 
>
> Key: SOLR-4580
> URL: https://issues.apache.org/jira/browse/SOLR-4580
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.2
>Reporter: Per Steffensen
>Assignee: Per Steffensen
>  Labels: security, solr, zookeeper
> Attachments: SOLR-4580_branch_4x_r1482255.patch
>
>
> We want to protect content in zookeeper. 
> In order to run a CloudSolrServer in "client-space" you will have to open 
> access to zookeeper from client-space.
> If you do not trust persons or systems in client-space you want to protect 
> zookeeper against evilness from client-space - e.g.
> * Changing configuration
> * Trying to mess up system by manipulating clusterstate
> * Add a delete-collection job to be carried out by the Overseer
> * etc
> Even if you do not open zookeeper access to anyone outside your "secure 
> zone", you might want to protect zookeeper content from being manipulated 
> by e.g.
> * Malware that found its way into secure zone
> * Other systems also using zookeeper
> * etc.
