[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-08-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748388#comment-13748388
 ] 

ASF subversion and git services commented on LUCENE-3069:
-

Commit 1516742 from [~billy] in branch 'dev/branches/lucene3069'
[ https://svn.apache.org/r1516742 ]

LUCENE-3069: API refactoring on Lucene40RW

> Lucene should have an entirely memory resident term dictionary
> --
>
> Key: LUCENE-3069
> URL: https://issues.apache.org/jira/browse/LUCENE-3069
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index, core/search
>Affects Versions: 4.0-ALPHA
>Reporter: Simon Willnauer
>Assignee: Han Jiang
>  Labels: gsoc2013
> Fix For: 5.0, 4.5
>
> Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch
>
>
> The FST-based TermDictionary has been a great improvement, yet it still uses a 
> delta-codec file for scanning to terms. Some environments have enough memory 
> available to keep the entire FST-based term dictionary in memory. We should add a 
> TermDictionary implementation that encodes all needed information for each 
> term into the FST (via a custom fst.Output) and builds an FST from the entire 
> term, not just the delta.
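
A minimal sketch of the idea using the Lucene 4.x FST API. The Long output here 
stands in for the custom fst.Output the issue describes, and the terms and 
values are illustrative assumptions, not the patch's actual code:

{noformat}
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.IntsRef;
import org.apache.lucene.util.fst.Builder;
import org.apache.lucene.util.fst.FST;
import org.apache.lucene.util.fst.PositiveIntOutputs;
import org.apache.lucene.util.fst.Util;

// (fragment; runs inside a method that throws IOException)
// Encode per-term data directly in the FST output, so a term lookup
// never has to scan a separate delta-coded terms file.
PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
Builder<Long> builder = new Builder<Long>(FST.INPUT_TYPE.BYTE1, outputs);
IntsRef scratch = new IntsRef();
// Terms must be added in sorted order.
builder.add(Util.toIntsRef(new BytesRef("apple"), scratch), 17L);
builder.add(Util.toIntsRef(new BytesRef("banana"), scratch), 42L);
FST<Long> fst = builder.finish();
Long metadata = Util.get(fst, new BytesRef("banana")); // 42
{noformat}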

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5186) Add CachingWrapperFilter.getFilter()

2013-08-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748412#comment-13748412
 ] 

ASF subversion and git services commented on LUCENE-5186:
-

Commit 1516773 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1516773 ]

LUCENE-5186: Added CachingWrapperFilter.getFilter.

> Add CachingWrapperFilter.getFilter()
> 
>
> Key: LUCENE-5186
> URL: https://issues.apache.org/jira/browse/LUCENE-5186
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Trejkaz
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-5186.patch
>
>
> There are a couple of use cases I can think of where being able to get the 
> underlying filter out of CachingWrapperFilter would be useful:
> 1. You might want to introspect the filter to figure out what's in it (the 
> use case we hit.)
> 2. You might want to serialise the filter since Lucene no longer supports 
> that itself.
> We currently work around this by subclassing, keeping another copy of the 
> underlying filter reference and implementing a trivial getter. That is an 
> easy workaround, but the trap is that a junior developer could unknowingly 
> create a CachingWrapperFilter, not knowing that the 
> BetterCachingWrapperFilter exists, and thereby introduce a filter which 
> cannot be introspected.
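
A minimal sketch of the subclassing workaround described above, assuming the 
Lucene 4.x Filter API (the class name BetterCachingWrapperFilter is the 
reporter's hypothetical):

{noformat}
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;

// Keep a second reference to the wrapped filter, since pre-4.5
// CachingWrapperFilter exposes no getter for it.
public class BetterCachingWrapperFilter extends CachingWrapperFilter {
  private final Filter filter;

  public BetterCachingWrapperFilter(Filter filter) {
    super(filter);
    this.filter = filter;
  }

  public Filter getFilter() {
    return filter;
  }
}
{noformat}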




[jira] [Commented] (LUCENE-5186) Add CachingWrapperFilter.getFilter()

2013-08-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748414#comment-13748414
 ] 

ASF subversion and git services commented on LUCENE-5186:
-

Commit 1516774 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1516774 ]

LUCENE-5186: Added CachingWrapperFilter.getFilter.

> Add CachingWrapperFilter.getFilter()
> 
>
> Key: LUCENE-5186
> URL: https://issues.apache.org/jira/browse/LUCENE-5186
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Trejkaz
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-5186.patch
>
>
> There are a couple of use cases I can think of where being able to get the 
> underlying filter out of CachingWrapperFilter would be useful:
> 1. You might want to introspect the filter to figure out what's in it (the 
> use case we hit.)
> 2. You might want to serialise the filter since Lucene no longer supports 
> that itself.
> We currently work around this by subclassing, keeping another copy of the 
> underlying filter reference and implementing a trivial getter. That is an 
> easy workaround, but the trap is that a junior developer could unknowingly 
> create a CachingWrapperFilter, not knowing that the 
> BetterCachingWrapperFilter exists, and thereby introduce a filter which 
> cannot be introspected.




[jira] [Resolved] (LUCENE-5186) Add CachingWrapperFilter.getFilter()

2013-08-23 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-5186.
--

   Resolution: Fixed
Fix Version/s: 4.5, 5.0

Committed, thanks!

> Add CachingWrapperFilter.getFilter()
> 
>
> Key: LUCENE-5186
> URL: https://issues.apache.org/jira/browse/LUCENE-5186
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Trejkaz
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 5.0, 4.5
>
> Attachments: LUCENE-5186.patch
>
>
> There are a couple of use cases I can think of where being able to get the 
> underlying filter out of CachingWrapperFilter would be useful:
> 1. You might want to introspect the filter to figure out what's in it (the 
> use case we hit.)
> 2. You might want to serialise the filter since Lucene no longer supports 
> that itself.
> We currently work around this by subclassing, keeping another copy of the 
> underlying filter reference and implementing a trivial getter. That is an 
> easy workaround, but the trap is that a junior developer could unknowingly 
> create a CachingWrapperFilter, not knowing that the 
> BetterCachingWrapperFilter exists, and thereby introduce a filter which 
> cannot be introspected.




Re: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_25) - Build # 3179 - Still Failing!

2013-08-23 Thread Martijn v Groningen
Thanks for fixing this test bug!


On 22 August 2013 16:11, Robert Muir  wrote:

> I committed a fix: slowwrapper bug
>
> On Thu, Aug 22, 2013 at 9:42 AM, Policeman Jenkins Server
>  wrote:
> > Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3179/
> > Java: 64bit/jdk1.7.0_25 -XX:+UseCompressedOops -XX:+UseG1GC
> >
> > 1 tests failed.
> > REGRESSION:
>  org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom
> >
> > Error Message:
> > CheckReader failed
> >
> > Stack Trace:
> > java.lang.RuntimeException: CheckReader failed
> > at
> __randomizedtesting.SeedInfo.seed([C17F45D490FFB13E:B33360DB219F074D]:0)
> > at
> org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:261)
> > at
> org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:240)
> > at
> org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1310)
> > at
> org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1286)
> > at
> org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom(DistinctValuesCollectorTest.java:253)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
> > at
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> > at
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
> > at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> > at
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> > at
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> > at
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> > at
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> > at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> > at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> > at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
> > at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
> > at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
> > at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> > at
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> > at
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> > at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> > at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> > at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> > at
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
> > at
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> > at
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> > at
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> > at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter

[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2013-08-23 Thread William Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748428#comment-13748428
 ] 

William Harris commented on SOLR-2894:
--

If the values of those particular fields are null, I do not set those fields on 
the document (it makes it slightly easier for me when parsing the response). 
I altered my indexing application to add those fields even when they have no 
value, and now it works, so I'm thinking the NPE is caused not by a null value 
but by the field not being set in all documents.
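
A sketch of the indexing workaround the commenter describes, assuming SolrJ's 
SolrInputDocument API (the field names and the empty-string default are 
illustrative assumptions):

{noformat}
import org.apache.solr.common.SolrInputDocument;

String category = null; // suppose this record has no category value
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "42");
// Always set the field, even when there is no real value, so that
// the field is present in every document.
doc.addField("category_s", category != null ? category : "");
{noformat}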

> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.5
>
> Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.




[jira] [Created] (LUCENE-5187) Make SlowCompositeReaderWrapper constructor private

2013-08-23 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-5187:


 Summary: Make SlowCompositeReaderWrapper constructor private
 Key: LUCENE-5187
 URL: https://issues.apache.org/jira/browse/LUCENE-5187
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor


I found a couple of places in the code base that duplicate the logic of 
SlowCompositeReaderWrapper.wrap. I think {{SlowCompositeReaderWrapper.wrap}} 
(vs. new SlowCompositeReaderWrapper) is what users need, so we should probably 
make the SlowCompositeReaderWrapper constructor private to enforce usage of 
{{wrap}}.
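
A sketch of the duplicated logic that {{wrap}} centralizes, assuming the 
Lucene 4.x reader API:

{noformat}
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.CompositeReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.SlowCompositeReaderWrapper;

// Preferred: wrap() only pays the wrapping cost when the reader
// is actually composite.
static AtomicReader toAtomic(IndexReader reader) {
  return SlowCompositeReaderWrapper.wrap(reader);
}

// The hand-rolled equivalent this issue wants to eliminate:
static AtomicReader toAtomicHandRolled(IndexReader reader) {
  return (reader instanceof CompositeReader)
      ? new SlowCompositeReaderWrapper((CompositeReader) reader)
      : (AtomicReader) reader;
}
{noformat}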












[jira] [Updated] (LUCENE-5187) Make SlowCompositeReaderWrapper constructor private

2013-08-23 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5187:
-

Attachment: LUCENE-5187.patch

Here is a patch.

> Make SlowCompositeReaderWrapper constructor private
> ---
>
> Key: LUCENE-5187
> URL: https://issues.apache.org/jira/browse/LUCENE-5187
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-5187.patch
>
>
> I found a couple of places in the code base that duplicate the logic of 
> SlowCompositeReaderWrapper.wrap. I think {{SlowCompositeReaderWrapper.wrap}} 
> (vs. new SlowCompositeReaderWrapper) is what users need, so we should probably 
> make the SlowCompositeReaderWrapper constructor private to enforce usage of 
> {{wrap}}.




[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Attachment: SOLR-4787.patch

New patch.

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 2 join implementations. 
> The initial patch was generated from the Solr 4.3 tag. Because of changes in 
> the FieldCache API this patch will only build with Solr 4.2 or above.
> *PostFilterJoinQParserPlugin aka "pjoin"*
> The pjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the pjoin is designed to work with integer join keys 
> only. So, in order to use pjoin, integer join keys must be included in both 
> the to and from core.
> The second difference is that the pjoin builds memory structures that are 
> used to quickly connect the join keys. It also uses a custom SolrCache named 
> "join" to hold intermediate DocSets which are needed to build the join memory 
> structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
> perform the join.
> The main advantage of the pjoin is that it can scale to join millions of keys 
> between cores.
> Because it's a PostFilter, it only needs to join records that match the main 
> query.
> The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
> plugin is referenced by the string "pjoin" rather than "join".
> fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
> The example filter query above will search the fromCore (collection2) for 
> "user:customer1". This query will generate a list of values from the "from" 
> field that will be used to filter the main query. Only records from the main 
> query, where the "to" field is present in the "from" list will be included in 
> the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> pjoin.
> <queryParser name="pjoin" class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
> 
> The solrconfig.xml in the from core must have the "join" SolrCache configured.
> <cache name="join" class="solr.LRUCache" size="4096" initialSize="1024"/>
> *ValueSourceJoinParserPlugin aka vjoin*
> The second implementation is the ValueSourceJoinParserPlugin aka "vjoin". 
> This implements a ValueSource function query that can return a value from a 
> second core based on join keys and limiting query. The limiting query can be 
> used to select a specific subset of data from the join core. This allows 
> customer specific relevance data to be stored in a separate core and then 
> joined in the main query.
> The vjoin is called using the "vjoin" function query. For example:
> bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
> This example shows "vjoin" being called by the edismax boost function 
> parameter. This example will return the "fromVal" from the "fromCore". The 
> "fromKey" and "toKey" are used to link the records from the main query to the 
> records in the "fromCore". The "query" is used to select a specific set of 
> records to join with in fromCore.
> Currently the fromKey and toKey must be longs but this will change in future 
> versions. Like the pjoin, the "join" SolrCache is used to hold the join 
> memory structures.
> To configure the vjoin you must register the ValueSource plugin in the 
> solrconfig.xml as follows:
> <valueSourceParser name="vjoin" class="org.apache.solr.joins.ValueSourceJoinParserPlugin"/>
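
A sketch of invoking the pjoin from SolrJ, assuming the 4.x HttpSolrServer API 
(the host and core names are assumptions; the fq is the example from the 
description above):

{noformat}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
SolrQuery query = new SolrQuery("*:*");
// Filter the main query by a search against the from core.
query.addFilterQuery("{!pjoin fromCore=collection2 from=id_i to=id_i}user:customer1");
QueryResponse rsp = solr.query(query); // throws SolrServerException
{noformat}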




[jira] [Commented] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2013-08-23 Thread Nim Lhug (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748480#comment-13748480
 ] 

Nim Lhug commented on SOLR-3583:


[~selah], did you give up on this patch? From what I can read in the comments, 
it seems to work reasonably well and just needs a bit of cleanup?

> Percentiles for facets, pivot facets, and distributed pivot facets
> --
>
> Key: SOLR-3583
> URL: https://issues.apache.org/jira/browse/SOLR-3583
> Project: Solr
>  Issue Type: Improvement
>Reporter: Chris Russell
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, 
> SOLR-3583.patch
>
>
> Built on top of SOLR-2894, this patch adds percentiles and averages to 
> facets, pivot facets, and distributed pivot facets by making use of range 
> facet internals.  




[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka "hjoin"*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both Lucene query and PostFilter implementations. A *"cost"* greater than 99 
will turn on the PostFilter. The PostFilter will typically outperform the Lucene 
query when the main query results have been narrowed down.

2) With the Lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example, *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins (see the nested example 
below).

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
hjoin.

<queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>

And the join contrib jars must be registered in the solrconfig.xml.


*BitSetJoinQParserPlugin aka "bjoin"*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on tens of millions of records from the 
fromIndex and hundreds of millions of records from the main query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation, which performs very well but is not sparse. So the BitSet 
memory footprint is dictated by the largest join key value: a join key of 
200,000,000 will need 25 MB of memory. For this reason the BitSet join does 
not support long join keys. To keep memory usage down, the join keys should 
also be packed at the low end, for example from 1 to 50,000,000.

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin, the solrconfig.xml in the main query core must contain 
the reference to the bjoin.

<queryParser name="bjoin" class="org.apache.solr.joins.BitSetJoinQParserPlugin"/>

*ValueSourceJoinParserPlugin aka vjoin*

The second implementation is the ValueSourceJoinParserPlug

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka "hjoin"*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both Lucene query and PostFilter implementations. A *"cost"* > 99 will turn 
on the PostFilter. The PostFilter will typically outperform the Lucene query 
when the main query results have been narrowed down.

2) With the Lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example, *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins.

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
hjoin.

<queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>

And the join contrib jars must be registered in the solrconfig.xml.


*BitSetJoinQParserPlugin aka "bjoin"*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on tens of millions of records from the 
fromIndex and hundreds of millions of records from the main query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation, which performs very well but is not sparse. So the BitSet 
memory footprint is dictated by the largest join key value: a join key of 
200,000,000 will need 25 MB of memory. For this reason the BitSet join does 
not support long join keys. To keep memory usage down, the join keys should 
also be packed at the low end, for example from 1 to 50,000,000.

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin, the solrconfig.xml in the main query core must contain 
the reference to the bjoin.

<queryParser name="bjoin" class="org.apache.solr.joins.BitSetJoinQParserPlugin"/>

*ValueSourceJoinParserPlugin aka vjoin*

The second implementation is the ValueSourceJoinParserPlugin 

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka "hjoin"*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both Lucene query and PostFilter implementations. A *"cost"* greater than 99 
will turn on the PostFilter. The PostFilter will typically outperform the Lucene 
query when the main query results have been narrowed down.

2) With the Lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example, *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins.

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
hjoin.

<queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>

And the join contrib jars must be registered in the solrconfig.xml.


*BitSetJoinQParserPlugin aka "bjoin"*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on tens of millions of records from the 
fromIndex and hundreds of millions of records from the main query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation, which performs very well but is not sparse. So the BitSet 
memory footprint is dictated by the largest join key value: a join key of 
200,000,000 will need 25 MB of memory. For this reason the BitSet join does 
not support long join keys. To keep memory usage down, the join keys should 
also be packed at the low end, for example from 1 to 50,000,000.

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin, the solrconfig.xml in the main query core must contain 
the reference to the bjoin.

<queryParser name="bjoin" class="org.apache.solr.joins.BitSetJoinQParserPlugin"/>

*ValueSourceJoinParserPlugin aka vjoin*

The second implementation is the ValueSourceJoinParserPlug

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka "hjoin"*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both Lucene query and PostFilter implementations. A *"cost"* > 99 will turn 
on the PostFilter. The PostFilter will typically outperform the Lucene query 
when the main query results have been narrowed down.

2) With the Lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example, *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins.

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
hjoin.

<queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>

And the join contrib jars must be registered in the solrconfig.xml.


*BitSetJoinQParserPlugin aka "bjoin"*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on tens of millions of records from the 
fromIndex and hundreds of millions of records from the main query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation, which performs very well but is not sparse. So the BitSet 
memory footprint is dictated by the largest join key value: a join key of 
200,000,000 will need 25 MB of memory. For this reason the BitSet join does 
not support long join keys. To keep memory usage down, the join keys should 
also be packed at the low end, for example from 1 to 50,000,000.

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin, the solrconfig.xml in the main query core must contain 
the reference to the bjoin.

<queryParser name="bjoin" class="org.apache.solr.joins.BitSetJoinQParserPlugin"/>

*ValueSourceJoinParserPlugin aka vjoin*

The second implementation is the ValueSourceJoinParserPl

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Attachment: (was: SOLR-4787.patch)

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.3 tag. Because of changes in 
> the FieldCache API this patch will only build with Solr 4.2 or above.
> *HashSetJoinQParserPlugin aka "hjoin"*
> The hjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the hjoin is designed to work with int and long join 
> keys only. So, in order to use hjoin, int or long join keys must be included 
> in both the to and from core.
> The second difference is that the hjoin builds memory structures that are 
> used to quickly connect the join keys. So, the hjoin will need more memory 
> than the JoinQParserPlugin to perform the join.
> The main advantage of the hjoin is that it can scale to join millions of keys 
> between cores and provide sub-second response time. The hjoin should work 
> well with up to two million results from the fromIndex and tens of millions 
> of results from the main query.
> The hjoin supports the following features:
> 1) Both Lucene query and PostFilter implementations. A *"cost"* > 99 will 
> turn on the PostFilter. The PostFilter will typically outperform the Lucene 
> query when the main query results have been narrowed down.
> 2) With the Lucene query implementation there is an option to build the 
> filter with threads. This can greatly improve the performance of the query if 
> the main query index is very large. The "threads" parameter turns on 
> threading. For example, *threads=6* will use 6 threads to build the filter. 
> This will set up a fixed threadpool with six threads to handle all hjoin 
> requests. Once the threadpool is created the hjoin will always use it to 
> build the filter. Threading does not come into play with the PostFilter.
> 3) The *size* local parameter can be used to set the initial size of the 
> hashset used to perform the join. If this is set above the number of results 
> from the fromIndex then you can avoid hashset resizing, which improves 
> performance.
> 4) Nested filter queries. The local parameter "fq" can be used to nest a 
> filter query within the join. The nested fq will filter the results of the 
> join query. This can point to another join to support nested joins.
> 5) Full caching support for the lucene query implementation. The filterCache 
> and queryResultCache should work properly even with deep nesting of joins. 
> Only the queryResultCache comes into play with the PostFilter implementation 
> because PostFilters are not cacheable in the filterCache.
> The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
> plugin is referenced by the string "hjoin" rather than "join".
> fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
> fq=$qq\}user:customer1&qq=group:5
> The example filter query above will search the fromIndex (collection2) for 
> "user:customer1" applying the local fq parameter to filter the results. The 
> lucene filter query will be built using 6 threads. This query will generate a 
> list of values from the "from" field that will be used to filter the main 
> query. Only records from the main query, where the "to" field is present in 
> the "from" list will be included in the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> hjoin.
> <queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
> *BitSetJoinQParserPlugin aka "bjoin"*
> The bjoin behaves exactly like the hjoin but uses a BitSet instead of a 
> HashSet to perform the underlying join. Because of this the bjoin is much 
> faster and can provide sub-second response times on tens of millions of 
> records from the fromIndex and hundreds of millions of records from the main 
> query.
> But there are limitati

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Attachment: SOLR-4787.patch

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.3 tag. Because of changes in 
> the FieldCache API this patch will only build with Solr 4.2 or above.
> *HashSetJoinQParserPlugin aka "hjoin"*
> The hjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the hjoin is designed to work with int and long join 
> keys only. So, in order to use hjoin, int or long join keys must be included 
> in both the to and from core.
> The second difference is that the hjoin builds memory structures that are 
> used to quickly connect the join keys. So, the hjoin will need more memory 
> than the JoinQParserPlugin to perform the join.
> The main advantage of the hjoin is that it can scale to join millions of keys 
> between cores and provide sub-second response time. The hjoin should work 
> well with up to two million results from the fromIndex and tens of millions 
> of results from the main query.
> The hjoin supports the following features:
> 1) Both Lucene query and PostFilter implementations. A *"cost"* > 99 will 
> turn on the PostFilter. The PostFilter will typically outperform the Lucene 
> query when the main query results have been narrowed down.
> 2) With the Lucene query implementation there is an option to build the 
> filter with threads. This can greatly improve the performance of the query if 
> the main query index is very large. The "threads" parameter turns on 
> threading. For example, *threads=6* will use 6 threads to build the filter. 
> This will set up a fixed threadpool with six threads to handle all hjoin 
> requests. Once the threadpool is created the hjoin will always use it to 
> build the filter. Threading does not come into play with the PostFilter.
> 3) The *size* local parameter can be used to set the initial size of the 
> hashset used to perform the join. If this is set above the number of results 
> from the fromIndex then you can avoid hashset resizing, which improves 
> performance.
> 4) Nested filter queries. The local parameter "fq" can be used to nest a 
> filter query within the join. The nested fq will filter the results of the 
> join query. This can point to another join to support nested joins.
> 5) Full caching support for the lucene query implementation. The filterCache 
> and queryResultCache should work properly even with deep nesting of joins. 
> Only the queryResultCache comes into play with the PostFilter implementation 
> because PostFilters are not cacheable in the filterCache.
> The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
> plugin is referenced by the string "hjoin" rather than "join".
> fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
> fq=$qq\}user:customer1&qq=group:5
> The example filter query above will search the fromIndex (collection2) for 
> "user:customer1" applying the local fq parameter to filter the results. The 
> lucene filter query will be built using 6 threads. This query will generate a 
> list of values from the "from" field that will be used to filter the main 
> query. Only records from the main query, where the "to" field is present in 
> the "from" list will be included in the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> hjoin.
> <queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
> *BitSetJoinQParserPlugin aka "bjoin"*
> The bjoin behaves exactly like the hjoin but uses a BitSet instead of a 
> HashSet to perform the underlying join. Because of this the bjoin is much 
> faster and can provide sub-second response times on tens of millions of 
> records from the fromIndex and hundreds of millions of records from the main 
> query.
> But there are li

Re: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_25) - Build # 3179 - Still Failing!

2013-08-23 Thread Robert Muir
The bug was not in the test; the test found a bug in the slow wrapper itself! :)

On Fri, Aug 23, 2013 at 5:43 AM, Martijn v Groningen
 wrote:
> Thanks for fixing this test bug!
>
>
> On 22 August 2013 16:11, Robert Muir  wrote:
>>
>> I committed a fix: slowwrapper bug
>>
>> On Thu, Aug 22, 2013 at 9:42 AM, Policeman Jenkins Server
>>  wrote:
>> > Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3179/
>> > Java: 64bit/jdk1.7.0_25 -XX:+UseCompressedOops -XX:+UseG1GC
>> >
>> > 1 tests failed.
>> > REGRESSION:
>> > org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom
>> >
>> > Error Message:
>> > CheckReader failed
>> >
>> > Stack Trace:
>> > java.lang.RuntimeException: CheckReader failed
>> > at
>> > __randomizedtesting.SeedInfo.seed([C17F45D490FFB13E:B33360DB219F074D]:0)
>> > at
>> > org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:261)
>> > at
>> > org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:240)
>> > at
>> > org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1310)
>> > at
>> > org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1286)
>> > at
>> > org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom(DistinctValuesCollectorTest.java:253)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at
>> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > at
>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:606)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
>> > at
>> > org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>> > at
>> > org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>> > at
>> > org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> > at
>> > com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> > at
>> > org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>> > at
>> > org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>> > at
>> > org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> > at
>> > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> > at
>> > com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>> > at
>> > com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>> > at
>> > com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>> > at
>> > com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>> > at
>> > org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> > at
>> > org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> > at
>> > com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> > at
>> > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> > at
>> > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> > at
>> > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> > at
>> > org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
>> > at
>> > org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> > at
>> > o

[jira] [Created] (SOLR-5185) Core Selector not compatible with text based web browsers (lynx/elinks).

2013-08-23 Thread Alasdair Campbell (JIRA)
Alasdair Campbell created SOLR-5185:
---

 Summary: Core Selector not compatible with text based web browsers 
(lynx/elinks).
 Key: SOLR-5185
 URL: https://issues.apache.org/jira/browse/SOLR-5185
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.4
 Environment: All
Reporter: Alasdair Campbell
Priority: Minor


The Core Selector part of the web gui is not compatible with text based web 
browsers (lynx/elinks), rendering the control panel useless.

Priority increases when Solr is installed on a remote server and you do not wish 
to allow access from outwith the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5187) Make SlowCompositeReaderWrapper constructor private

2013-08-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748521#comment-13748521
 ] 

Robert Muir commented on LUCENE-5187:
-

+1

In the case of the grouping tests, is it really safe to change it this way?

{noformat}
   // NOTE: intentional but temporary field cache insanity!
-  final FieldCache.Ints docIdToFieldId = FieldCache.DEFAULT.getInts(new SlowCompositeReaderWrapper(r), "id", false);
+  final FieldCache.Ints docIdToFieldId = FieldCache.DEFAULT.getInts(SlowCompositeReaderWrapper.wrap(r), "id", false);
{noformat}

If needed, maybe we should instead force compositeness by e.g. inserting an 
empty reader. But I'm not even sure what this test is doing.
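
(For reference, the {{wrap}} idiom being discussed is roughly the following - a 
minimal sketch assuming the Lucene 4.x reader API, not the actual patch:)

{noformat}
// Minimal sketch: return the reader unchanged when it is already atomic,
// and only wrap composite readers in the slow wrapper.
public static AtomicReader wrap(IndexReader reader) {
  if (reader instanceof CompositeReader) {
    return new SlowCompositeReaderWrapper((CompositeReader) reader);
  }
  return (AtomicReader) reader; // already atomic: nothing to wrap
}
{noformat}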

> Make SlowCompositeReaderWrapper constructor private
> ---
>
> Key: LUCENE-5187
> URL: https://issues.apache.org/jira/browse/LUCENE-5187
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-5187.patch
>
>
> I found a couple of places in the code base that duplicate the logic of 
> SlowCompositeReaderWrapper.wrap. I think {{SlowCompositeReaderWrapper.wrap}} 
> (vs. new SlowCompositeReaderWrapper) is what users need so we should probably 
> make SlowCompositeReaderWrapper constructor private to enforce usage of 
> {{wrap}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-2894) Implement distributed pivot faceting

2013-08-23 Thread William Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748428#comment-13748428
 ] 

William Harris edited comment on SOLR-2894 at 8/23/13 1:24 PM:
---

I thought the issue might be related to me not assigning values to those fields 
in every document, but I tried reindexing, giving them all values, and the error 
still occurs.
I tampered with the source a bit, and managed to trace the error to 
srsp.getSolrResponse().getResponse(). Hope that helps.

  was (Author: killscreen):
If the values of those particular fields are null, I do not set those 
fields on the document (it makes it slightly easier for me when parsing the 
response). I altered my indexing application to add those fields even when 
they have no value, and now it works, so I'm thinking the NPE is caused not by 
a null value, but by the field not being set in all documents.
  
> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.5
>
> Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-2894) Implement distributed pivot faceting

2013-08-23 Thread William Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748428#comment-13748428
 ] 

William Harris edited comment on SOLR-2894 at 8/23/13 1:29 PM:
---

I thought the issue might be related to me not assigning values to those fields 
in every document, but I tried reindexing, giving them all values, and the error 
still occurs.
I tampered with the source a bit, and managed to trace the error to 
srsp.getSolrResponse().getResponse(), meaning getSolrResponse() is returning 
null. Hope that helps.

  was (Author: killscreen):
I thought the issue might be related to me not assigning those fields 
values in every document, but I tried reindexing giving them all values and the 
error still occurs.
I tampered with the source a bit, and managed to trace the error to 
srsp.getSolrResponse().getResponse(). Hope that helps.
  
> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.5
>
> Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-08-23 Thread Han Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Jiang updated LUCENE-3069:
--

Attachment: LUCENE-3069.patch

Patch; it shows how the current codecs (Block/BlockTree + 
Lucene4X/Pulsing/Mock*) are changed according to our API refactoring. 
TestBackwardsCompatibility still fails, and I'll work on the impersonation 
later.

> Lucene should have an entirely memory resident term dictionary
> --
>
> Key: LUCENE-3069
> URL: https://issues.apache.org/jira/browse/LUCENE-3069
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index, core/search
>Affects Versions: 4.0-ALPHA
>Reporter: Simon Willnauer
>Assignee: Han Jiang
>  Labels: gsoc2013
> Fix For: 5.0, 4.5
>
> Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch
>
>
> FST based TermDictionary has been a great improvement yet it still uses a 
> delta codec file for scanning to terms. Some environments have enough memory 
> available to keep the entire FST based term dict in memory. We should add a 
> TermDictionary implementation that encodes all needed information for each 
> term into the FST (custom fst.Output) and builds a FST from the entire term 
> not just the delta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5123) invert the codec postings API

2013-08-23 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5123:
---

Attachment: LUCENE-5123.patch

New patch, cutting over SimpleText to the new inverted API.

I also had to cut over PerFieldPF (otherwise it would not be able to embed 
SimpleText), and a couple of tests.

Tests pass, at least once!

> invert the codec postings API
> -
>
> Key: LUCENE-5123
> URL: https://issues.apache.org/jira/browse/LUCENE-5123
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Robert Muir
>Assignee: Michael McCandless
> Attachments: LUCENE-5123.patch, LUCENE-5123.patch
>
>
> Currently FieldsConsumer/PostingsConsumer/etc is a "push" oriented api, e.g. 
> FreqProxTermsWriter streams the postings at flush, and the default merge() 
> takes the incoming codec api and filters out deleted docs and "pushes" via 
> same api (but that can be overridden).
> It could be cleaner if we allowed for a "pull" model instead (like 
> DocValues). For example, maybe FreqProxTermsWriter could expose a Terms of 
> itself and just passed this to the codec consumer.
> This would give the codec more flexibility to e.g. do multiple passes if it 
> wanted to do things like encode high-frequency terms more efficiently with a 
> bitset-like encoding or other things...
> A codec can try to do things like this to some extent today, but its very 
> difficult (look at buffering in Pulsing). We made this change with DV and it 
> made a lot of interesting optimizations easy to implement...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_25) - Build # 3179 - Still Failing!

2013-08-23 Thread Martijn v Groningen
oops :) I see that now after checking the commit!


On 23 August 2013 14:56, Robert Muir  wrote:

> the bug was not in the test, the test found a bug in slow wrapper itself!
> :)
>
> On Fri, Aug 23, 2013 at 5:43 AM, Martijn v Groningen
>  wrote:
> > Thanks for fixing this test bug!
> >
> >
> > On 22 August 2013 16:11, Robert Muir  wrote:
> >>
> >> I committed a fix: slowwrapper bug
> >>
> >> On Thu, Aug 22, 2013 at 9:42 AM, Policeman Jenkins Server
> >>  wrote:
> >> > Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3179/
> >> > Java: 64bit/jdk1.7.0_25 -XX:+UseCompressedOops -XX:+UseG1GC
> >> >
> >> > 1 tests failed.
> >> > REGRESSION:
> >> >
> org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom
> >> >
> >> > Error Message:
> >> > CheckReader failed
> >> >
> >> > Stack Trace:
> >> > java.lang.RuntimeException: CheckReader failed
> >> > at
> >> >
> __randomizedtesting.SeedInfo.seed([C17F45D490FFB13E:B33360DB219F074D]:0)
> >> > at
> >> > org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:261)
> >> > at
> >> > org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:240)
> >> > at
> >> >
> org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1310)
> >> > at
> >> >
> org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1286)
> >> > at
> >> >
> org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom(DistinctValuesCollectorTest.java:253)
> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> > at
> >> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >> > at
> >> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> > at java.lang.reflect.Method.invoke(Method.java:606)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
> >> > at
> >> >
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> >> > at
> >> >
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
> >> > at
> >> >
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> >> > at
> >> >
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> >> > at
> >> >
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> >> > at
> >> >
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
> >> > at
> >> >
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> >> > at
> >> >
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> >> > at
> >> >
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> >> >   

[jira] [Commented] (SOLR-5143) FullSolrCloudDistribCmdsTest fails often.

2013-08-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748550#comment-13748550
 ] 

ASF subversion and git services commented on SOLR-5143:
---

Commit 1516847 from [~yo...@apache.org] in branch 'dev/trunk'
[ https://svn.apache.org/r1516847 ]

SOLR-5143: tests - avoid too large of a tree for a single block

> FullSolrCloudDistribCmdsTest fails often.
> -
>
> Key: SOLR-5143
> URL: https://issues.apache.org/jira/browse/SOLR-5143
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
>
> I *think* this might have started happening after the block join commit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-08-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748556#comment-13748556
 ] 

Michael McCandless commented on LUCENE-3069:


Patch looks great on quick look!  I'll look more when I'm back
online...

One thing: I think e.g. BlockTreeTermsReader needs some back-compat
code, so it won't try to read longsSize on old indices?


> Lucene should have an entirely memory resident term dictionary
> --
>
> Key: LUCENE-3069
> URL: https://issues.apache.org/jira/browse/LUCENE-3069
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index, core/search
>Affects Versions: 4.0-ALPHA
>Reporter: Simon Willnauer
>Assignee: Han Jiang
>  Labels: gsoc2013
> Fix For: 5.0, 4.5
>
> Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch
>
>
> FST based TermDictionary has been a great improvement yet it still uses a 
> delta codec file for scanning to terms. Some environments have enough memory 
> available to keep the entire FST based term dict in memory. We should add a 
> TermDictionary implementation that encodes all needed information for each 
> term into the FST (custom fst.Output) and builds a FST from the entire term 
> not just the delta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka hjoin*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will turn 
on the PostFilter. The PostFilter will typically outperform the Lucene query 
when the main query results have been narrowed down.

2) With the lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins.

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
pjoin.



And the join contrib jars must be registered in the solrconfig.xml.
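
(The XML snippets here were stripped by the mail archiver. A hypothetical 
registration could look like the following; the lib dir and class name are 
placeholders, not taken from the patch.)

{noformat}
<!-- hypothetical sketch: pull in the join contrib jars and register the parser -->
<lib dir="../../contrib/join/lib" regex=".*\.jar" />
<queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin" />
{noformat}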

 
 

*BitSetJoinQParserPlugin aka bjoin*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on tens of millions of records from the 
fromIndex and hundreds of millions of records from the main query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation which performs very well but is not sparse. So the BitSet memory 
is dictated by the size of the join keys. So a join key of 200,000,000 will 
need 25 MB of memory. For this reason the BitSet join does not support long 
join keys. In order to keep memory usage down the join keys should also be 
packed at the low end, for example from 1 to 50,000,000. 
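
The arithmetic behind that figure, for reference: a non-sparse bitset allocates 
one bit per possible key value, so

{noformat}
200,000,000 bits / 8 bits per byte = 25,000,000 bytes = 25 MB
{noformat}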

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin the solrconfig.xml in the main query core must contain 
the reference to the bjoin.



*ValueSourceJoinParserPlugin aka vjoin*

The second implementation is the ValueSourceJoinParserPlug

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka hjoin*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will turn 
on the PostFilter. The PostFilter will typically outperform the Lucene query 
when the main query results have been narrowed down.

2) With the lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins.

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
pjoin.



And the join contrib jars must be registered in the solrconfig.xml.

 
 

*BitSetJoinQParserPlugin aka bjoin*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on tens of millions of records from the 
fromIndex and hundreds of millions of records from the main query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation which performs very well but is not sparse. So the BitSet memory 
is dictated by the size of the join keys. So a join key of 200,000,000 will 
need 25 MB of memory. For this reason the BitSet join does not support long 
join keys. In order to keep memory usage down the join keys should also be 
packed at the low end, for example from 1 to 50,000,000. 

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin the solrconfig.xml in the main query core must contain 
the reference to the bjoin.



*ValueSourceJoinParserPlugin aka vjoin*

The second implementation is the ValueSourceJoinParserPlugi

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka hjoin*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will turn 
on the PostFilter. The PostFilter will typically outperform the Lucene query 
when the main query results have been narrowed down.

2) With the lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins.

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
pjoin.



And the join contrib jars must be registered in the solrconfig.xml.

 
 

*BitSetJoinQParserPlugin aka bjoin*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on result sets of tens of millions of 
records from the fromIndex and hundreds of millions of records from the main 
query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation which performs very well but is not sparse. So the BitSet memory 
is dictated by the size of the join keys. So a join key of 200,000,000 will 
need 25 MB of memory. For this reason the BitSet join does not support long 
join keys. In order to keep memory usage down the join keys should also be 
packed at the low end, for example from 1 to 50,000,000. 

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin the solrconfig.xml in the main query core must contain 
the reference to the bjoin.



*ValueSourceJoinParserPlugin aka vjoin*

The second implementation is the ValueSourc

[jira] [Updated] (SOLR-4787) Join Contrib

2013-08-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4787:
-

Description: 
This contrib provides a place where different join implementations can be 
contributed to Solr. This contrib currently includes 3 join implementations. 
The initial patch was generated from the Solr 4.3 tag. Because of changes in 
the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka hjoin*

The hjoin provides a join implementation that filters results in one core based 
on the results of a search in another core. This is similar in functionality to 
the JoinQParserPlugin but the implementation differs in a couple of important 
ways.

The first way is that the hjoin is designed to work with int and long join keys 
only. So, in order to use hjoin, int or long join keys must be included in both 
the to and from core.

The second difference is that the hjoin builds memory structures that are used 
to quickly connect the join keys. So, the hjoin will need more memory than the 
JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys 
between cores and provide sub-second response time. The hjoin should work well 
with up to two million results from the fromIndex and tens of millions of 
results from the main query.

The hjoin supports the following features:

1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will turn 
on the PostFilter. The PostFilter will typically outperform the Lucene query 
when the main query results have been narrowed down.

2) With the lucene query implementation there is an option to build the filter 
with threads. This can greatly improve the performance of the query if the main 
query index is very large. The "threads" parameter turns on threading. For 
example *threads=6* will use 6 threads to build the filter. This will set up a 
fixed threadpool with six threads to handle all hjoin requests. Once the 
threadpool is created the hjoin will always use it to build the filter. 
Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the 
hashset used to perform the join. If this is set above the number of results 
from the fromIndex then you can avoid hashset resizing, which improves 
performance.

4) Nested filter queries. The local parameter "fq" can be used to nest a filter 
query within the join. The nested fq will filter the results of the join query. 
This can point to another join to support nested joins.

5) Full caching support for the lucene query implementation. The filterCache 
and queryResultCache should work properly even with deep nesting of joins. Only 
the queryResultCache comes into play with the PostFilter implementation because 
PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
plugin is referenced by the string "hjoin" rather than "join".

fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for 
"user:customer1" applying the local fq parameter to filter the results. The 
lucene filter query will be built using 6 threads. This query will generate a 
list of values from the "from" field that will be used to filter the main 
query. Only records from the main query, where the "to" field is present in the 
"from" list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the 
pjoin.



And the join contrib jars must be registered in the solrconfig.xml.

 
 

*BitSetJoinQParserPlugin aka bjoin*

The bjoin behaves exactly like the hjoin but uses a BitSet instead of a HashSet 
to perform the underlying join. Because of this the bjoin is much faster and 
can provide sub-second response times on result sets of tens of millions of 
records from the fromIndex and hundreds of millions of records from the main 
query.

But there are limitations to how the bjoin can be used. The bjoin treats the 
join keys as addresses in a BitSet and uses the Lucene OpenBitSet 
implementation which performs very well but is not sparse. So the BitSet memory 
is dictated by the size of the join keys. For example a bitset with a max join 
key of 200,000,000 will need 25 MB of memory. For this reason the BitSet join 
does not support long join keys. In order to keep memory usage down the join 
keys should also be packed at the low end, for example from 1 to 50,000,000. 

Below is a sample bjoin:

fq=\{!bjoin fromIndex=collection2 from=id_i to=id_i threads=6 
fq=$qq\}user:customer1&qq=group:5

To register the bjoin the solrconfig.xml in the main query core must contain 
the reference to the bjoin.



*ValueSourceJoinParserPlugin aka vjoin*

The second imple

[jira] [Commented] (SOLR-5143) FullSolrCloudDistribCmdsTest fails often.

2013-08-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748575#comment-13748575
 ] 

ASF subversion and git services commented on SOLR-5143:
---

Commit 1516859 from [~yo...@apache.org] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1516859 ]

SOLR-5143: tests - avoid too large of a tree for a single block

> FullSolrCloudDistribCmdsTest fails often.
> -
>
> Key: SOLR-5143
> URL: https://issues.apache.org/jira/browse/SOLR-5143
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
>
> I *think* this might have started happening after the block join commit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-08-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748579#comment-13748579
 ] 

ASF subversion and git services commented on LUCENE-3069:
-

Commit 1516860 from [~billy] in branch 'dev/branches/lucene3069'
[ https://svn.apache.org/r1516860 ]

LUCENE-3069: merge 'temp' codes back

> Lucene should have an entirely memory resident term dictionary
> --
>
> Key: LUCENE-3069
> URL: https://issues.apache.org/jira/browse/LUCENE-3069
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index, core/search
>Affects Versions: 4.0-ALPHA
>Reporter: Simon Willnauer
>Assignee: Han Jiang
>  Labels: gsoc2013
> Fix For: 5.0, 4.5
>
> Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch
>
>
> FST based TermDictionary has been a great improvement yet it still uses a 
> delta codec file for scanning to terms. Some environments have enough memory 
> available to keep the entire FST based term dict in memory. We should add a 
> TermDictionary implementation that encodes all needed information for each 
> term into the FST (custom fst.Output) and builds a FST from the entire term 
> not just the delta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-08-23 Thread Han Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748582#comment-13748582
 ] 

Han Jiang commented on LUCENE-3069:
---

bq. Patch looks great on quick look! I'll look more when I'm back
bq. online...

OK! I'll commit it so that we can see later changes.

bq. One thing: I think e.g. BlockTreeTermsReader needs some back-compat
bq. code, so it won't try to read longsSize on old indices?

Yes, both Block* term dictionaries will have a new VERSION constant to mark the
change, and if the codec header shows a previous version, they will not read
the longsSize VInt.
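
Roughly like this, as a hypothetical sketch (the constant name and the
surrounding code are placeholders, not the actual patch):

{noformat}
// Hypothetical sketch: only read longsSize when the on-disk version is new enough.
final int longsSize;
if (version >= VERSION_META_ARRAY) { // placeholder constant for the new format version
  longsSize = in.readVInt();         // new indices store longsSize
} else {
  longsSize = 0;                     // old indices have no metadata longs to read
}
{noformat}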

> Lucene should have an entirely memory resident term dictionary
> --
>
> Key: LUCENE-3069
> URL: https://issues.apache.org/jira/browse/LUCENE-3069
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index, core/search
>Affects Versions: 4.0-ALPHA
>Reporter: Simon Willnauer
>Assignee: Han Jiang
>  Labels: gsoc2013
> Fix For: 5.0, 4.5
>
> Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
> LUCENE-3069.patch, LUCENE-3069.patch
>
>
> FST based TermDictionary has been a great improvement yet it still uses a 
> delta codec file for scanning to terms. Some environments have enough memory 
> available to keep the entire FST based term dict in memory. We should add a 
> TermDictionary implementation that encodes all needed information for each 
> term into the FST (custom fst.Output) and builds a FST from the entire term 
> not just the delta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5143) FullSolrCloudDistribCmdsTest fails often.

2013-08-23 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-5143.


Resolution: Fixed

> FullSolrCloudDistribCmdsTest fails often.
> -
>
> Key: SOLR-5143
> URL: https://issues.apache.org/jira/browse/SOLR-5143
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
>
> I *think* this might have started happening after the block join commit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Of resource loaders, CacheHeaderTest and being puzzled.

2013-08-23 Thread Erick Erickson
OK, I think I've figured out how to make this all work.

The take-away here for anyone else is that any behavior that depends on
initializing a test from the test-files directory will probably fail Real
Soon Now, especially in Solr 5.x. The pattern is that by starting a Jetty
with a home dir of just "solr/", you wind up getting the default solr.xml
from ConfigSolrXmlOld. But that will no longer work in Solr 5; we're
deprecating that functionality. You have to create a directory and copy all
the relevant files into it.

To help, there's a static method SolrTestCaseJ4.copyTestSolrHome that will
copy solr.xml to the root, and all the necessary files into the specified
collection's conf directory. Currently this includes things like
solrconfig.xml, schema.xml, and support files for the schema: currency.xml,
old_synonyms.txt and some others.

This turned out to be necessary, or at least quick, because there are some
interactions between that solrconfig.xml and stuff being tested for in some
cases. There are about 30 tests that fail; I'm not all the way through
fixing them, but this is a recurring pattern.

There _should_ be a better way to do this without having to copy all those
files, but I'm leaving that for later.

There's also SolrTestCaseJ4.copyMinConf, which copies the absolutely minimal
solrconfig.xml, schema.xml and solrconfig.snippet.randomindexingconfig.xml
files; you can use it if you don't need much in the way of configuration and
want to minimize the files copied around.
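
A rough usage sketch (hypothetical; the exact signatures may differ from what
is in SolrTestCaseJ4):

// Hypothetical sketch: build a throwaway solr home, copy the minimal configs
// into it, then point Jetty at it instead of the bare "solr/" home.
File home = new File(TEMP_DIR, "my-test-home");
SolrTestCaseJ4.copyMinConf(new File(home, "collection1")); // minimal solrconfig/schema
createJetty(home.getAbsolutePath(), null, null);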

I expect this to change somewhat as time passes, but it's "progress not
perfection"...

FWIW,
Erick



On Thu, Aug 22, 2013 at 8:34 PM, Erick Erickson wrote:

> I'm working on SOLR-4817 on trunk. The idea there is that if there is no
> path to solr.xml, we should fail. That part is easy to do.
>
> Where I'm having trouble is that this causes CacheHeaderTest to fail
> miserably. I've fixed other test failures by setting up a Jetty instance by
> creating a temporary directory like in other tests and populating it with a
> minimal set of config files.
>
> But CacheHeaderTest doesn't succeed if I do that. The glaring difference
> is this call:
> createJetty("solr/", null, null);
>
> If I create a temp dir that populates a directory (myHome) with solr.xml,
> collection1/conf/good stuff and call createJetty(myHome.getAbsolutePath(),
> null, null) then the test fails in one of several flavors.
>
> I'm having real trouble figuring out where the hell the configs are read
> from when the solrHome is just "solr/", and why it would behave
> differently than a full configuration with an absolute path.
>
> Any pointers appreciated. Otherwise I'll just try it again in the morning.
>
> I know this is incoherent, but I'm at my wits' end.
>
> Erick
>


[jira] [Created] (SOLR-5186) SolrZkClient can leak threads if it doesn't start correctly

2013-08-23 Thread Alan Woodward (JIRA)
Alan Woodward created SOLR-5186:
---

 Summary: SolrZkClient can leak threads if it doesn't start 
correctly
 Key: SOLR-5186
 URL: https://issues.apache.org/jira/browse/SOLR-5186
 Project: Solr
  Issue Type: Bug
Reporter: Alan Woodward
Assignee: Alan Woodward
Priority: Minor
 Attachments: SOLR-5186.patch

Noticed this while writing tests for the embedded ZooKeeper servers.  If the 
connection manager can't connect to a ZK server before the 
clientConnectTimeout, or there's an Exception thrown during 
ZkClientConnectionStrategy.connect(), then the client's SolrZooKeeper instance 
isn't shut down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5186) SolrZkClient can leak threads if it doesn't start correctly

2013-08-23 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated SOLR-5186:


Attachment: SOLR-5186.patch

Trivial patch

> SolrZkClient can leak threads if it doesn't start correctly
> ---
>
> Key: SOLR-5186
> URL: https://issues.apache.org/jira/browse/SOLR-5186
> Project: Solr
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Minor
> Attachments: SOLR-5186.patch
>
>
> Noticed this while writing tests for the embedded ZooKeeper servers.  If the 
> connection manager can't connect to a ZK server before the 
> clientConnectTimeout, or there's an Exception thrown during 
> ZkClientConnectionStrategy.connect(), then the client's SolrZooKeeper 
> instance isn't shut down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5185) Core Selector not compatible with text based web browsers (lynx/elinks).

2013-08-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748651#comment-13748651
 ] 

Shawn Heisey commented on SOLR-5185:


I can't speak for [~steffkes], who does most of our UI work, but I suspect that 
the current level of functionality would be very difficult to achieve in a way 
that's compatible with a text browser.

If you're trying to use a text-based browser, the chance of it running on 
Windows is pretty small.  If you have ssh access to the server from a graphical 
client, you can forward a local port to the Solr port on the remote system, and 
then access localhost:port/solr in a graphical browser on the machine making 
the ssh connection.  This works with PuTTY as well as virtually all commandline 
ssh clients.  If access requires a multi-hop ssh, you can do a port-forwarding 
chain.  It can be a little confusing to set up a port-forwarding chain, but it 
does work.
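
For example (a generic sketch; the host name and Solr's default 8983 port are 
assumptions, substitute your own):

{noformat}
ssh -L 8983:localhost:8983 user@solr-server
# then open http://localhost:8983/solr in a graphical browser on the local machine
{noformat}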


> Core Selector not compatible with text based web browsers (lynx/elinks).
> 
>
> Key: SOLR-5185
> URL: https://issues.apache.org/jira/browse/SOLR-5185
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 4.4
> Environment: All
>Reporter: Alasdair Campbell
>Priority: Minor
>  Labels: javascript
>
> The Core Selector part of the web gui is not compatible with text based web 
> browsers (lynx/elinks), rendering the control panel useless.
> Priority increases when Solr is installed on a remote server and you do not 
> wish to allow access from outwith the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5185) Core Selector not compatible with text based web browsers (lynx/elinks).

2013-08-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748651#comment-13748651
 ] 

Shawn Heisey edited comment on SOLR-5185 at 8/23/13 3:57 PM:
-

I can't speak for [~steffkes], who does most of our UI work, but I suspect that 
the current level of functionality would be very difficult to achieve in a way 
that's compatible with a text browser.

If you're trying to use a text-based browser, it's unlikely that you are 
running Solr on Windows.  If you have ssh access to the server from a graphical 
client, you can forward a local port to the Solr port on the remote system, and 
then access localhost:port/solr in a graphical browser on the machine making 
the ssh connection.  This works with PuTTY as well as virtually all commandline 
ssh clients.  If access requires a multi-hop ssh, you can do a port-forwarding 
chain.  It can be a little confusing to set up a port-forwarding chain, but it 
does work.


  was (Author: elyograg):
I can't speak for [~steffkes], who does most of our UI work, but I suspect 
that the current level of functionality would be very difficult to achieve in a 
way that's compatible with a text browser.

If you're trying to use a text-based browser, the chance of it running on 
Windows is pretty small.  If you have ssh access to the server from a graphical 
client, you can forward a local port to the Solr port on the remote system, and 
then access localhost:port/solr in a graphical browser on the machine making 
the ssh connection.  This works with PuTTY as well as virtually all commandline 
ssh clients.  If access requires a multi-hop ssh, you can do a port-forwarding 
chain.  It can be a little confusing to set up a port-forwarding chain, but it 
does work.

  
> Core Selector not compatible with text based web browsers (lynx/elinks).
> 
>
> Key: SOLR-5185
> URL: https://issues.apache.org/jira/browse/SOLR-5185
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 4.4
> Environment: All
>Reporter: Alasdair Campbell
>Priority: Minor
>  Labels: javascript
>
> The Core Selector part of the web gui is not compatible with text based web 
> browsers (lynx/elinks), rendering the control panel useless.
> Priority increases when Solr is installed on a remote server and you do not 
> wish to allow access from outwith the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5099) The core.properties not created during collection creation

2013-08-23 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-5099.
---

Resolution: Fixed

> The core.properties not created during collection creation
> --
>
> Key: SOLR-5099
> URL: https://issues.apache.org/jira/browse/SOLR-5099
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.5, 5.0
>Reporter: Herb Jiang
>Assignee: Mark Miller
>Priority: Critical
> Fix For: 4.5, 5.0
>
> Attachments: CorePropertiesLocator.java.patch
>
>
> When using the new solr.xml structure, the core auto-discovery mechanism 
> tries to find core.properties. 
> But I found that core.properties cannot be created when I dynamically create 
> a collection.
> The root issue is that CorePropertiesLocator tries to create the properties 
> file before the instanceDir is created. 
> The collection creation process completes and looks fine at runtime, but it 
> will cause issues later (cores are not auto-discovered after a server restart).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5185) Core Selector not compatible with text based web browsers (lynx/elinks).

2013-08-23 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748721#comment-13748721
 ] 

Stefan Matheis (steffkes) commented on SOLR-5185:
-

indeed as [~elyograg] said - i'd guess that this is just the first issue 
you're facing and *not* the only one (i.e. everything else, except that, works 
for you), right? because if you can't even select a core right now .. i 
believe you haven't yet seen the things that would be waiting for you beyond 
that? (:

> Core Selector not compatible with text based web browsers (lynx/elinks).
> 
>
> Key: SOLR-5185
> URL: https://issues.apache.org/jira/browse/SOLR-5185
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 4.4
> Environment: All
>Reporter: Alasdair Campbell
>Priority: Minor
>  Labels: javascript
>
> The Core Selector part of the web gui is not compatible with text based web 
> browsers (lynx/elinks), rendering the control panel useless.
> The priority increases when Solr is installed on a remote server and you do 
> not wish to allow Solr access from outwith the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5185) Core Selector not compatible with text based web browsers (lynx/elinks).

2013-08-23 Thread Alasdair Campbell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748738#comment-13748738
 ] 

Alasdair Campbell commented on SOLR-5185:
-

Thanks for your reply [~elyograg], I guess that will do - feel free to close 
the bug.

Thanks for your input [~steffkes], however I have run it locally without 
problems (how else would I have known about the core-selector?). The issue was 
that when testing on a production server I had firewalled the port in use, 
rendering it inaccessible remotely without using Shawn's technique.  Thanks 
for the input anyway.

> Core Selector not compatible with text based web browsers (lynx/elinks).
> 
>
> Key: SOLR-5185
> URL: https://issues.apache.org/jira/browse/SOLR-5185
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 4.4
> Environment: All
>Reporter: Alasdair Campbell
>Priority: Minor
>  Labels: javascript
>
> The Core Selector part of the web gui is not compatible with text based web 
> browsers (lynx/elinks), rendering the control panel useless.
> The priority increases when Solr is installed on a remote server and you do 
> not wish to allow Solr access from outwith the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5187) Make SlowCompositeReaderWrapper constructor private

2013-08-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748761#comment-13748761
 ] 

Adrien Grand commented on LUCENE-5187:
--

Hmm, my understanding was that this test just needs a top-level fieldcache 
instance for testing purposes, but I may be wrong...

> Make SlowCompositeReaderWrapper constructor private
> ---
>
> Key: LUCENE-5187
> URL: https://issues.apache.org/jira/browse/LUCENE-5187
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-5187.patch
>
>
> I found a couple of places in the code base that duplicate the logic of 
> SlowCompositeReaderWrapper.wrap. I think {{SlowCompositeReaderWrapper.wrap}} 
> (vs. new SlowCompositeReaderWrapper) is what users need, so we should 
> probably make the SlowCompositeReaderWrapper constructor private to enforce 
> usage of {{wrap}}.
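
For reference, the usage the issue wants to enforce looks like this (a sketch 
against the Lucene 4.x API; the Directory argument is an assumption):

{code}
import java.io.IOException;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.SlowCompositeReaderWrapper;
import org.apache.lucene.store.Directory;

public class WrapExample {
  // Sketch: prefer the static factory over the constructor. wrap() returns
  // an AtomicReader input as-is and only wraps composite readers - exactly
  // the logic that callers currently duplicate with instanceof checks.
  static AtomicReader atomicView(Directory dir) throws IOException {
    IndexReader reader = DirectoryReader.open(dir);
    return SlowCompositeReaderWrapper.wrap(reader);
  }
}
{code}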

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5187) Make SlowCompositeReaderWrapper constructor private

2013-08-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748774#comment-13748774
 ] 

Robert Muir commented on LUCENE-5187:
-

My concern was the combination of the "intentional" comment and the fact that 
it used the ctor (which always forces wrapping).

I guess we can just see if it fails, and deal with it if so (by forcing 
compositeness).

So I would just commit the patch for now!

> Make SlowCompositeReaderWrapper constructor private
> ---
>
> Key: LUCENE-5187
> URL: https://issues.apache.org/jira/browse/LUCENE-5187
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-5187.patch
>
>
> I found a couple of places in the code base that duplicate the logic of 
> SlowCompositeReaderWrapper.wrap. I think {{SlowCompositeReaderWrapper.wrap}} 
> (vs. new SlowCompositeReaderWrapper) is what users need, so we should 
> probably make the SlowCompositeReaderWrapper constructor private to enforce 
> usage of {{wrap}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.

2013-08-23 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-4817:


Assignee: Erick Erickson  (was: Mark Miller)

> Solr should not fall back to the back compat built in solr.xml in SolrCloud 
> mode.
> -
>
> Key: SOLR-4817
> URL: https://issues.apache.org/jira/browse/SOLR-4817
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.5, 5.0
>
>
> A hard error is much more useful, and this built-in solr.xml is not very 
> good for SolrCloud - with the old style solr.xml with cores in it, you won't 
> have persistence, and with the new style it's not really ideal either.
> I think failing on this instead makes solr.home problems easier to debug - 
> but just in SolrCloud mode for now, due to back compat. We might want to 
> pull the whole internal solr.xml for 5.0.
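
To make the proposal concrete, a hypothetical sketch of the intended behavior 
(the {{zkHost}} check, method name, and message are assumptions, not the 
actual patch):

{code}
import java.io.File;

import org.apache.solr.common.SolrException;

public class SolrXmlLocatorSketch {
  // Sketch: in SolrCloud mode, fail hard when solr.xml is missing instead
  // of silently falling back to the built-in default.
  static File locateSolrXml(String solrHome, String zkHost) {
    File solrXml = new File(solrHome, "solr.xml");
    if (!solrXml.exists()) {
      if (zkHost != null) { // SolrCloud mode
        throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
            "solr.xml not found in " + solrHome);
      }
      return null; // standalone 4.x back-compat: use the built-in solr.xml
    }
    return solrXml;
  }
}
{code}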

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5187) Make SlowCompositeReaderWrapper constructor private

2013-08-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748791#comment-13748791
 ] 

Uwe Schindler commented on LUCENE-5187:
---

Strong +1

I have wanted to do that for a long time, but some tests made me afraid.

> Make SlowCompositeReaderWrapper constructor private
> ---
>
> Key: LUCENE-5187
> URL: https://issues.apache.org/jira/browse/LUCENE-5187
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-5187.patch
>
>
> I found a couple of places in the code base that duplicate the logic of 
> SlowCompositeReaderWrapper.wrap. I think {{SlowCompositeReaderWrapper.wrap}} 
> (vs. new SlowCompositeReaderWrapper) is what users need, so we should 
> probably make the SlowCompositeReaderWrapper constructor private to enforce 
> usage of {{wrap}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.

2013-08-23 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-4817:
-

Attachment: SOLR-4817.patch

Patch for trunk only; it'll take a bit of work for back-compat in 4x.

[~romseygeek][~markrmil...@gmail.com] I had to do a bit of violence to the 
persist test and a couple of ZK tests, just in case you want to take a quick 
glance at them.

All tests passed a bit ago; I'm trying it again now. If that works, I'll see 
what it would take to make it work with 4x and, if I'm lucky, get it committed 
over the weekend.

> Solr should not fall back to the back compat built in solr.xml in SolrCloud 
> mode.
> -
>
> Key: SOLR-4817
> URL: https://issues.apache.org/jira/browse/SOLR-4817
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4817.patch
>
>
> A hard error is much more useful, and this built-in solr.xml is not very 
> good for SolrCloud - with the old style solr.xml with cores in it, you won't 
> have persistence, and with the new style it's not really ideal either.
> I think failing on this instead makes solr.home problems easier to debug - 
> but just in SolrCloud mode for now, due to back compat. We might want to 
> pull the whole internal solr.xml for 5.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.

2013-08-23 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748845#comment-13748845
 ] 

Alan Woodward commented on SOLR-4817:
-

I may be missing something, but this looks like it removes the 'fall back to 
built in solr.xml' code entirely, rather than just in cloud mode?

> Solr should not fall back to the back compat built in solr.xml in SolrCloud 
> mode.
> -
>
> Key: SOLR-4817
> URL: https://issues.apache.org/jira/browse/SOLR-4817
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4817.patch
>
>
> A hard error is much more useful, and this built-in solr.xml is not very 
> good for SolrCloud - with the old style solr.xml with cores in it, you won't 
> have persistence, and with the new style it's not really ideal either.
> I think failing on this instead makes solr.home problems easier to debug - 
> but just in SolrCloud mode for now, due to back compat. We might want to 
> pull the whole internal solr.xml for 5.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.

2013-08-23 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748891#comment-13748891
 ] 

Erick Erickson commented on SOLR-4817:
--

[~romseygeek] Yes, it does remove the fallback entirely, but this is for trunk 
only. The 4x patch will keep the fallback, which is what's behind my comment 
about this being trunk-only and back-compat needing more work for 4x.

> Solr should not fall back to the back compat built in solr.xml in SolrCloud 
> mode.
> -
>
> Key: SOLR-4817
> URL: https://issues.apache.org/jira/browse/SOLR-4817
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4817.patch
>
>
> A hard error is much more useful, and this built-in solr.xml is not very 
> good for SolrCloud - with the old style solr.xml with cores in it, you won't 
> have persistence, and with the new style it's not really ideal either.
> I think failing on this instead makes solr.home problems easier to debug - 
> but just in SolrCloud mode for now, due to back compat. We might want to 
> pull the whole internal solr.xml for 5.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.

2013-08-23 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748927#comment-13748927
 ] 

Erick Erickson commented on SOLR-4817:
--

Also note that I'm still dealing with a couple of failing tests; they don't 
appear to show up if other tests fail...

Nothing difficult so far, just a bit of slogging.

> Solr should not fall back to the back compat built in solr.xml in SolrCloud 
> mode.
> -
>
> Key: SOLR-4817
> URL: https://issues.apache.org/jira/browse/SOLR-4817
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4817.patch
>
>
> A hard error is much more useful, and this built-in solr.xml is not very 
> good for SolrCloud - with the old style solr.xml with cores in it, you won't 
> have persistence, and with the new style it's not really ideal either.
> I think failing on this instead makes solr.home problems easier to debug - 
> but just in SolrCloud mode for now, due to back compat. We might want to 
> pull the whole internal solr.xml for 5.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2013-08-23 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749009#comment-13749009
 ] 

Chris Russell commented on SOLR-3583:
-

Andrew Muldowney and I work together.  He has been working on this code most 
recently.  But yes, it is pretty solid at the moment.
Andrew, anything to add?

> Percentiles for facets, pivot facets, and distributed pivot facets
> --
>
> Key: SOLR-3583
> URL: https://issues.apache.org/jira/browse/SOLR-3583
> Project: Solr
>  Issue Type: Improvement
>Reporter: Chris Russell
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-3583.patch, SOLR-3583.patch, SOLR-3583.patch, 
> SOLR-3583.patch
>
>
> Built on top of SOLR-2894, this patch adds percentiles and averages to 
> facets, pivot facets, and distributed pivot facets by making use of range 
> facet internals.  
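
To illustrate the general technique of deriving percentiles from range facet 
internals (a hypothetical sketch of the idea, not the patch's code): once 
values are bucketed into equal-width ranges, a percentile can be estimated by 
walking the cumulative counts and interpolating within the bucket that crosses 
the target rank.

{code}
public class PercentileSketch {
  // Estimate the pct-th percentile from equal-width bucket counts by walking
  // cumulative counts and interpolating linearly inside the crossing bucket.
  static double percentile(long[] counts, double min, double width, double pct) {
    long total = 0;
    for (long c : counts) {
      total += c;
    }
    long target = (long) Math.ceil(pct / 100.0 * total);
    long seen = 0;
    for (int i = 0; i < counts.length; i++) {
      if (seen + counts[i] >= target) {
        double frac =
            counts[i] == 0 ? 0.0 : (target - seen) / (double) counts[i];
        return min + (i + frac) * width;
      }
      seen += counts[i];
    }
    return min + counts.length * width; // every value fell below the target
  }
}
{code}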

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-SmokeRelease-4.x - Build # 102 - Still Failing

2013-08-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/102/

No tests ran.

Build Log:
[...truncated 34212 lines...]
prepare-release-no-sign:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease
 [copy] Copying 416 files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/lucene
 [copy] Copying 194 files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/solr
 [exec] JAVA6_HOME is /home/hudson/tools/java/latest1.6
 [exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7
 [exec] NOTE: output encoding is US-ASCII
 [exec] 
 [exec] Load release URL 
"file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/"...
 [exec] 
 [exec] Test Lucene...
 [exec]   test basics...
 [exec]   get KEYS
 [exec] 0.1 MB in 0.01 sec (11.1 MB/sec)
 [exec]   check changes HTML...
 [exec]   download lucene-4.5.0-src.tgz...
 [exec] 27.1 MB in 0.04 sec (646.8 MB/sec)
 [exec] verify md5/sha1 digests
 [exec]   download lucene-4.5.0.tgz...
 [exec] 49.0 MB in 0.07 sec (661.0 MB/sec)
 [exec] verify md5/sha1 digests
 [exec]   download lucene-4.5.0.zip...
 [exec] 58.8 MB in 0.11 sec (519.2 MB/sec)
 [exec] verify md5/sha1 digests
 [exec]   unpack lucene-4.5.0.tgz...
 [exec] verify JAR/WAR metadata...
 [exec] test demo with 1.6...
 [exec]   got 5717 hits for query "lucene"
 [exec] test demo with 1.7...
 [exec]   got 5717 hits for query "lucene"
 [exec] check Lucene's javadoc JAR
 [exec] 
 [exec] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeReleaseTmp/unpack/lucene-4.5.0/docs/core/org/apache/lucene/util/AttributeSource.html
 [exec]   broken details HTML: Method Detail: addAttributeImpl: closing "" does not match opening ""
 [exec]   broken details HTML: Method Detail: getAttribute: closing "" does not match opening ""
 [exec] Traceback (most recent call last):
 [exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1450, in <module>
 [exec]     main()
 [exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1394, in main
 [exec]     smokeTest(baseURL, svnRevision, version, tmpDir, isSigned, testArgs)
 [exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1431, in smokeTest
 [exec]     unpackAndVerify('lucene', tmpDir, artifact, svnRevision, version, testArgs)
 [exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 607, in unpackAndVerify
 [exec]     verifyUnpacked(project, artifact, unpackPath, svnRevision, version, testArgs)
 [exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 786, in verifyUnpacked
 [exec]     checkJavadocpath('%s/docs' % unpackPath)
 [exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 904, in checkJavadocpath
 [exec]     raise RuntimeError('missing javadocs package summaries!')
 [exec] RuntimeError: missing javadocs package summaries!

BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/build.xml:314:
 exec returned: 1

Total time: 19 minutes 50 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-5184) CountFacetRequest does not seem to sum the totals of the subResult values.

2013-08-23 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5184.


Resolution: Not A Problem

This is documented in OrdinalPolicy's enums. Perhaps it could have also been 
documented in a CHANGES entry for 4.2.0, but that's history now.
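
For anyone who hits this: the behavior is controlled at indexing time by the 
OrdinalPolicy, which decides whether ancestor ordinals are indexed and 
therefore whether a parent's count equals the sum of its children's counts. A 
sketch, assuming the 4.4 facet API (how the resulting params are used when 
indexing is left out):

{code}
import org.apache.lucene.facet.params.CategoryListParams;
import org.apache.lucene.facet.params.CategoryListParams.OrdinalPolicy;
import org.apache.lucene.facet.params.FacetIndexingParams;

public class OrdinalPolicySketch {
  // Sketch: index every ancestor ordinal so counts aggregate up the
  // hierarchy; pass the returned params to FacetFields when indexing.
  static FacetIndexingParams allParentsParams() {
    CategoryListParams clp = new CategoryListParams() {
      @Override
      public OrdinalPolicy getOrdinalPolicy(String dimension) {
        return OrdinalPolicy.ALL_PARENTS;
      }
    };
    return new FacetIndexingParams(clp);
  }
}
{code}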

> CountFacetRequest does not seem to sum the totals of the subResult values.
> --
>
> Key: LUCENE-5184
> URL: https://issues.apache.org/jira/browse/LUCENE-5184
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.4
> Environment: Windows 7, Java 1.6 (64 bit), Eclipse
>Reporter: Karl Nicholas
>  Labels: CountFacetRequest, facet
> Attachments: FacetTest.java, LUCENE-5184.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> CountFacetRequest does not seem to sum the totals of the subResult values 
> when the query searches in a facet hierarchy. It seemed to be better behaved 
> in version 4.0, and the behavior changed when I updated to version 4.4, 
> though I did have to change code as well. I am using facets to create a 
> hierarchy. Will attempt to upload sample code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org