[JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+147) - Build # 18653 - Still Unstable!

2016-12-29 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18653/
Java: 64bit/jdk-9-ea+147 -XX:+UseCompressedOops -XX:+UseParallelGC

3 tests failed.
FAILED:  org.apache.solr.cloud.CollectionsAPISolrJTest.testCreateAndDeleteAlias

Error Message:
Error from server at https://127.0.0.1:44067/solr: create the collection time 
out:180s

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://127.0.0.1:44067/solr: create the collection time out:180s
at 
__randomizedtesting.SeedInfo.seed([C78FE6B86EB3B01C:96CA142CE0C4187]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:627)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:439)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:391)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1344)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1095)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1037)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
at 
org.apache.solr.cloud.CollectionsAPISolrJTest.testCreateAndDeleteAlias(CollectionsAPISolrJTest.java:128)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:538)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-9668) Support cursor paging in SolrEntityProcessor

2016-12-29 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15787112#comment-15787112
 ] 

Mikhail Khludnev commented on SOLR-9668:


Not really. I just want to confirm that the configuration approach is fine. 

> Support cursor paging in SolrEntityProcessor
> 
>
> Key: SOLR-9668
> URL: https://issues.apache.org/jira/browse/SOLR-9668
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Reporter: Yegor Kozlov
>Assignee: Mikhail Khludnev
>Priority: Minor
>  Labels: dataimportHandler
> Fix For: master (7.0)
>
> Attachments: SOLR-9668.patch, SOLR-9668.patch
>
>
> SolrEntityProcessor paginates using the start and rows parameters, which can 
> be very inefficient at large offsets. In fact, the current implementation is 
> impractical for importing large amounts of data (10M+ documents) because the 
> data import rate degrades from 1000 docs/second to 10 docs/second and the 
> import gets stuck.
> This patch introduces support for cursor paging, which offers more or less 
> predictable performance. In my tests the time to fetch the 1st and 1000th 
> pages was about the same, and the data import rate was stable throughout the 
> entire import.
> To enable cursor paging a user needs to:
>  * add a {{cursorMark='true'}} (!) attribute to the entity configuration;
>  * add a "sort" attribute to the entity configuration (see the note about sorting at 
> https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results );
>  * remove the {{timeout}} attribute.
> {code}
> <dataConfig>
>   <document>
>     <entity processor="SolrEntityProcessor"
>             query="*:*"
>             rows="1000"
>             cursorMark='true'
>             sort="id asc"
>             url="http://localhost:8983/solr/collection1">
>     </entity>
>   </document>
> </dataConfig>
> {code}
> If the {{cursorMark}} attribute is missing or is not {{'true'}} then the 
> default start/rows pagination is used.
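The cost asymmetry between start/rows paging and cursor paging described in this issue can be sketched with a toy simulation. This is an illustrative Python model only, not Solr code; the "work" counts are a simplifying assumption that the server touches start+rows entries per offset-based page:

```python
import bisect

# Toy "index": document sort keys in order (stands in for a Solr index).
docs = list(range(100_000))
PAGE = 1000

def page_by_offset(start):
    # start/rows paging: the server must walk past `start` entries every time,
    # so the cost of each page grows with its offset.
    work = start + PAGE
    return docs[start:start + PAGE], work

def page_by_cursor(cursor):
    # cursorMark paging: resume directly after the last sort value seen,
    # so the cost of each page stays constant.
    i = bisect.bisect_right(docs, cursor) if cursor is not None else 0
    page = docs[i:i + PAGE]
    next_cursor = page[-1] if page else cursor
    return page, next_cursor, PAGE

# Fetching the 100th page each way:
_, offset_work = page_by_offset(99 * PAGE)
page, cur, cursor_work = page_by_cursor(99 * PAGE - 1)
print(offset_work, cursor_work)   # 100000 vs 1000
```

This mirrors the reported symptom: by deep pages, offset paging does 100x the work per page while cursor paging stays flat.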



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-9907) in solr 6.2 select query with edismax and bf=rord(datecreated) is not working

2016-12-29 Thread pramod kishore (JIRA)
pramod kishore created SOLR-9907:


 Summary: in solr 6.2 select query with edismax and 
bf=rord(datecreated) is not working 
 Key: SOLR-9907
 URL: https://issues.apache.org/jira/browse/SOLR-9907
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCloud
 Environment: x86_64 GNU/Linux
Reporter: pramod kishore


We have a SolrCloud cluster with 3 shards and 3 replicas running Solr 6.2. A select 
query with edismax and bf=rord(datecreated), where datecreated is a date field, 
gives an error.
Error Details:
--

"error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  
"root-error-class","org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException"],
"msg":"org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request:[http://xyz:8983/solr/client_sku_shard3_replica3, 
http://xyz:8983/solr/client_sku_shard2_replica2, 
http://xyz:8983/solr/client_sku_shard2_replica1, 
http://xyz:8983/solr/client_sku_shard2_replica3, 
http://xyz:8983/solr/client_sku_shard1_replica1, 
http://xyz:8983/solr/client_sku_shard3_replica2, 
http://yxz:8983/solr/client_sku_shard1_replica2, 
http://xyz:8983/solr/client_sku_shard3_replica1, 
http://xyz:8983/solr/client_sku_shard1_replica3];,
"trace":"org.apache.solr.common.SolrException: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request:[http://xyz:8983/solr/client_sku_shard3_replica3, 
http://xyz:8983/solr/client_sku_shard2_replica2, 
http://xyz:8983/solr/client_sku_shard2_replica1, 
http://xyz:8983/solr/client_sku_shard2_replica3, 
http://xyz:8983/solr/client_sku_shard1_replica1, 
http://xyz:8983/solr/client_sku_shard3_replica2, 
http://yxz:8983/solr/client_sku_shard1_replica2, 
http://xyz:8983/solr/client_sku_shard3_replica1, 
http://xyz:8983/solr/client_sku_shard1_replica3]\n\tat 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:415)\n\tat
 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)\n\tat
 org.apache.solr.core.SolrCore.execute(SolrCore.java:2089)\n\tat 
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:652)\n\tat 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:459)\n\tat 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)\n\tat 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
 org.eclipse.jetty.server.Server.handle(Server.java:518)\n\tat 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)\n\tat 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)\n\tat
 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
 org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)\n\tat
 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)\n\tat
 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)\n\tat
 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)\n\tat
 java.lang.Thread.run(Thread.java:745)\nCaused by: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request:[http://xyz:8983/solr/client_sku_shard3_replica3, 
http://xyz:8983/solr/client_sku_shard2_replica2, 
http://xyz:8983/solr/client_sku_shard2_replica1, 

[jira] [Commented] (LUCENE-7603) Support Graph Token Streams in QueryBuilder

2016-12-29 Thread Matt Weber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786980#comment-15786980
 ] 

Matt Weber commented on LUCENE-7603:


[~dsmiley] Thank you for the review!  I was able to come up with a way to 
preserve position increment gaps.  Can you please take another look?  

[~mikemccand] Can you please have another look as well?


> Support Graph Token Streams in QueryBuilder
> ---
>
> Key: LUCENE-7603
> URL: https://issues.apache.org/jira/browse/LUCENE-7603
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser, core/search
>Reporter: Matt Weber
>
> With [LUCENE-6664|https://issues.apache.org/jira/browse/LUCENE-6664] we can 
> use multi-term synonyms at query time.  A "graph token stream" will be created, 
> which is nothing more than using the position length attribute on 
> stacked tokens to indicate how many positions a token should span.  Currently 
> the position length attribute on tokens is ignored during query parsing.  
> This issue will add support for handling these graph token streams inside the 
> QueryBuilder utility class used by query parsers.
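What a graph token stream encodes can be illustrated with a tiny model. This is a hedged Python sketch, not Lucene's actual TokenStream/PositionLengthAttribute API, and the synonym "wifi => wi fi" is an assumed example:

```python
from dataclasses import dataclass

@dataclass
class Token:
    term: str
    position: int     # start position in the stream
    pos_length: int   # how many positions the token spans

# Analysis of "wifi network" with the multi-term synonym wifi => wi fi:
# "wifi" is stacked at position 0 and spans 2 positions, covering "wi fi".
tokens = [
    Token("wifi", 0, 2),     # spans positions 0..1
    Token("wi", 0, 1),
    Token("fi", 1, 1),
    Token("network", 2, 1),
]

def paths(tokens, pos, end):
    """Enumerate every term sequence (path) through the token graph."""
    if pos == end:
        return [[]]
    out = []
    for t in tokens:
        if t.position == pos:
            for rest in paths(tokens, pos + t.pos_length, end):
                out.append([t.term] + rest)
    return out

print(paths(tokens, 0, 3))
# Both readings survive: [['wifi', 'network'], ['wi', 'fi', 'network']]
```

A query builder that honors position length can turn each path into its own phrase/boolean clause, whereas one that ignores it (the status quo this issue fixes) conflates the stacked tokens.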



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_112) - Build # 18652 - Unstable!

2016-12-29 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18652/
Java: 32bit/jdk1.8.0_112 -server -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.metrics.reporters.SolrGangliaReporterTest.testReporter

Error Message:


Stack Trace:
java.util.ConcurrentModificationException
at 
__randomizedtesting.SeedInfo.seed([C0410EB9B52157DE:9FA5238EDE2DC49B]:0)
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
at java.util.ArrayList$Itr.next(ArrayList.java:851)
at 
org.apache.solr.metrics.reporters.SolrGangliaReporterTest.testReporter(SolrGangliaReporterTest.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 12495 lines...]
   [junit4] Suite: org.apache.solr.metrics.reporters.SolrGangliaReporterTest
   [junit4]   2> Creating dataDir: 

[jira] [Commented] (SOLR-7466) Allow optional leading wildcards in complexphrase

2016-12-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786862#comment-15786862
 ] 

Yonik Seeley commented on SOLR-7466:


bq. Is there any veto to make leading wildcards always on in complexphrase?

Seems fine to me.  This is limited to the complexphrase qparser to begin with, 
and what makes sense for a default is to return what is requested without 
second guessing.


> Allow optional leading wildcards in complexphrase
> -
>
> Key: SOLR-7466
> URL: https://issues.apache.org/jira/browse/SOLR-7466
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 4.8
>Reporter: Andy hardin
>Assignee: Mikhail Khludnev
>  Labels: complexPhrase, query-parser, wildcards
> Attachments: SOLR-7466.patch
>
>
> Currently ComplexPhraseQParser (SOLR-1604) allows trailing wildcards on terms 
> in a phrase, but does not allow leading wildcards.  I would like the option 
> to be able to search for terms with both trailing and leading wildcards.  
> For example with:
> {!complexphrase allowLeadingWildcard=true} "j* *th"
> would match "John Smith", "Jim Smith", but not "John Schmitt"
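The per-term matching semantics of the "j* *th" example can be sketched as follows. This is a hedged Python illustration using glob-to-regex translation; ComplexPhraseQueryParser additionally enforces term positions and proximity against the index, which this toy check ignores:

```python
import fnmatch
import re

def wildcard_match(pattern, term):
    # Translate a Solr-style wildcard term (* and ?) into a regex and test it.
    return re.fullmatch(fnmatch.translate(pattern), term, re.IGNORECASE) is not None

def phrase_matches(patterns, words):
    # Each pattern must match the word at the same position in the phrase.
    return len(patterns) == len(words) and all(
        wildcard_match(p, w) for p, w in zip(patterns, words))

query = ["j*", "*th"]                               # the "j* *th" example
print(phrase_matches(query, ["John", "Smith"]))     # True
print(phrase_matches(query, ["Jim", "Smith"]))      # True
print(phrase_matches(query, ["John", "Schmitt"]))   # False: no trailing "th"
```

The second pattern, "*th", is the leading wildcard this issue proposes to allow.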



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7466) Allow optional leading wildcards in complexphrase

2016-12-29 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786764#comment-15786764
 ] 

Erick Erickson commented on SOLR-7466:
--

Mikhail:

I haven't looked at the code, so take this with a large grain of salt. 

My only concern is whether having this on by default would cause a full terms scan. 
If having this always on by default means a user can use a leading wildcard 
without specifying ReverseWildcardFilterFactory in the analysis chain, then it 
seems trappy.

That said, I'll defer to your familiarity with the, you know, actual code.
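For context, the reversed-term trick behind ReverseWildcardFilterFactory can be sketched like this. It is a simplified Python model of a sorted term dictionary, not Solr's implementation; the sample terms are invented:

```python
import bisect

terms = sorted(["schmitt", "smith", "smoky", "south", "table", "tenth"])
rev = sorted(t[::-1] for t in terms)   # what the reverse filter indexes

def suffix_search_scan(suffix):
    # Without reversed terms: a leading wildcard forces a full scan
    # of the term dictionary.
    return sorted(t for t in terms if t.endswith(suffix))

def suffix_search_reversed(suffix):
    # With reversed terms: the leading wildcard becomes a cheap prefix
    # range lookup over the reversed dictionary.
    p = suffix[::-1]
    lo = bisect.bisect_left(rev, p)
    hi = bisect.bisect_left(rev, p + "\uffff")
    return sorted(r[::-1] for r in rev[lo:hi])

print(suffix_search_reversed("th"))   # ['smith', 'south', 'tenth']
assert suffix_search_scan("th") == suffix_search_reversed("th")
```

Without the reversed index, every `*th`-style query pays the full-scan cost, which is the trap being discussed.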

> Allow optional leading wildcards in complexphrase
> -
>
> Key: SOLR-7466
> URL: https://issues.apache.org/jira/browse/SOLR-7466
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 4.8
>Reporter: Andy hardin
>Assignee: Mikhail Khludnev
>  Labels: complexPhrase, query-parser, wildcards
> Attachments: SOLR-7466.patch
>
>
> Currently ComplexPhraseQParser (SOLR-1604) allows trailing wildcards on terms 
> in a phrase, but does not allow leading wildcards.  I would like the option 
> to be able to search for terms with both trailing and leading wildcards.  
> For example with:
> {!complexphrase allowLeadingWildcard=true} "j* *th"
> would match "John Smith", "Jim Smith", but not "John Schmitt"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9894) Tokenizer work randomly

2016-12-29 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786758#comment-15786758
 ] 

Erick Erickson commented on SOLR-9894:
--

We've mentioned several times that this involves a tokenizer that is _not_ 
supported by Apache Solr, specifically: 
org.wltea.pinyin.solr5.PinyinTokenFilterFactory. You have yet to show that the 
problem isn't in this custom class.

Plus, the class mentions Solr 5, yet you're logging this against Solr 6.

Unless and until you can show that this issue is a problem with Solr and not 
this non-solr tokenizer there is little that we can do. If you would like to 
retain consulting services to debug this custom code, please contact one of the 
many consulting services.

> Tokenizer work randomly
> ---
>
> Key: SOLR-9894
> URL: https://issues.apache.org/jira/browse/SOLR-9894
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query parsers
>Affects Versions: 6.2.1
> Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>Reporter: 王海涛
>Priority: Critical
>  Labels: patch
> Attachments: step1.png, step2.png, step3.png, step4.png
>
>
> my schema.xml has a fieldType as follows:
> <fieldType name="..." class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
>     <filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory" pinyinAll="true" 
> minTermLength="2"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true"/>
>   </analyzer>
> </fieldType>
> Attention:
>   the index tokenizer has useSmart set to false
>   the query tokenizer has useSmart set to true
> But when I send a query request with parameter q,
> the query tokenizer sometimes behaves as if useSmart were true
> and sometimes as if it were false.
> That is so terrible!
> I guess the problem may be caused by a tokenizer cache:
> when I query, the tokenizer should use true as the useSmart value,
> but it may have cached the wrong tokenizer result, created by the indexWriter, which 
> uses false as the useSmart value.
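If the reporter's cache theory were right, the bug pattern would look something like the following. This is a purely hypothetical Python sketch of a too-coarse cache key, not the actual PinyinTokenFilterFactory or IKTokenizer code; all names are invented for illustration:

```python
cache = {}

def get_analyzer_buggy(field, use_smart):
    # BUG: the cache key ignores use_smart, so whichever analyzer was
    # built first (here the index-time one) is silently reused for queries.
    return cache.setdefault(field, {"field": field, "useSmart": use_smart})

index_analyzer = get_analyzer_buggy("text", use_smart=False)  # built at index time
query_analyzer = get_analyzer_buggy("text", use_smart=True)   # query time reuses it
print(query_analyzer["useSmart"])   # False: the query side got the index analyzer

def get_analyzer_fixed(field, use_smart):
    # FIX: include every analyzer-affecting setting in the cache key.
    return cache.setdefault((field, use_smart),
                            {"field": field, "useSmart": use_smart})

print(get_analyzer_fixed("text", True)["useSmart"])   # True
```

Whether the third-party factory actually caches this way is exactly what the reporter would need to demonstrate.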



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-6.x - Build # 634 - Still Unstable

2016-12-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-6.x/634/

1 tests failed.
FAILED:  org.apache.solr.handler.TestReplicationHandler.doTestStressReplication

Error Message:
timed out waiting for collection1 startAt time to exceed: Fri Dec 30 01:52:30 
GMT 2016

Stack Trace:
java.lang.AssertionError: timed out waiting for collection1 startAt time to 
exceed: Fri Dec 30 01:52:30 GMT 2016
at 
__randomizedtesting.SeedInfo.seed([22639141FA9254C5:F9C89187FFBA3D76]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.handler.TestReplicationHandler.watchCoreStartAt(TestReplicationHandler.java:1510)
at 
org.apache.solr.handler.TestReplicationHandler.doTestStressReplication(TestReplicationHandler.java:860)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 11738 lines...]
   [junit4] Suite: org.apache.solr.handler.TestReplicationHandler
   [junit4]   2> Creating dataDir: 

[jira] [Resolved] (SOLR-9843) Fix up DocValuesNotIndexedTest failures

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-9843.
--
   Resolution: Fixed
Fix Version/s: 6.4
   trunk

OK, it's not January, but it's close enough. Given that the situation is 
"impossible", this appears to be a test artifact and/or something fixed by 
someone else.

Somehow I committed this to 6x _then_ trunk, but I doubt it matters.

> Fix up DocValuesNotIndexedTest failures
> ---
>
> Key: SOLR-9843
> URL: https://issues.apache.org/jira/browse/SOLR-9843
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: trunk, 6.4
>
> Attachments: SOLR-9843.patch, SOLR-9843.patch, fail.txt, 
> shard3_replica1.txt, shard_3_searchers.txt
>
>
> I'll have to do a few iterations on the Jenkins builds since I can't get this 
> to fail locally. Marking as "blocker" since I'll probably have to put some 
> extra code in that I want to be sure is removed before we cut any new 
> releases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9843) Fix up DocValuesNotIndexedTest failures

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786661#comment-15786661
 ] 

ASF subversion and git services commented on SOLR-9843:
---

Commit 3ccd15a7658ad2821e8a2d2916781265db6f3afe in lucene-solr's branch 
refs/heads/master from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3ccd15a ]

SOLR-9843 Fix up DocValuesNotIndexedTest failures
(cherry picked from commit f6a3557)


> Fix up DocValuesNotIndexedTest failures
> ---
>
> Key: SOLR-9843
> URL: https://issues.apache.org/jira/browse/SOLR-9843
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-9843.patch, SOLR-9843.patch, fail.txt, 
> shard3_replica1.txt, shard_3_searchers.txt
>
>
> I'll have to do a few iterations on the Jenkins builds since I can't get this 
> to fail locally. Marking as "blocker" since I'll probably have to put some 
> extra code in that I want to be sure is removed before we cut any new 
> releases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9843) Fix up DocValuesNotIndexedTest failures

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786656#comment-15786656
 ] 

ASF subversion and git services commented on SOLR-9843:
---

Commit f6a3557ee287868fc864182ff5d2023542e29d0c in lucene-solr's branch 
refs/heads/branch_6x from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f6a3557 ]

SOLR-9843 Fix up DocValuesNotIndexedTest failures








[jira] [Updated] (SOLR-9843) Fix up DocValuesNotIndexedTest failures

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-9843:
-
Attachment: SOLR-9843.patch

Removed extraneous logging.








[jira] [Commented] (SOLR-9668) Support cursor paging in SolrEntityProcessor

2016-12-29 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786646#comment-15786646
 ] 

Noble Paul commented on SOLR-9668:
--

I haven't looked at the patch. Do you have any concerns, Mikhail, that you want 
me to look at specifically?

> Support cursor paging in SolrEntityProcessor
> 
>
> Key: SOLR-9668
> URL: https://issues.apache.org/jira/browse/SOLR-9668
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Reporter: Yegor Kozlov
>Assignee: Mikhail Khludnev
>Priority: Minor
>  Labels: dataimportHandler
> Fix For: master (7.0)
>
> Attachments: SOLR-9668.patch, SOLR-9668.patch
>
>
> SolrEntityProcessor paginates using the start and rows parameters, which can 
> be very inefficient at large offsets. In fact, the current implementation is 
> impractical for importing large amounts of data (10M+ documents) because the 
> data import rate degrades from 1000 docs/second to 10 docs/second and the 
> import gets stuck.
> This patch introduces support for cursor paging, which offers more or less 
> predictable performance. In my tests the time to fetch the 1st and the 1000th 
> page was about the same, and the data import rate was stable throughout the 
> entire import.
> To enable cursor paging a user needs to:
>  * add a {{cursorMark='true'}} attribute to the entity configuration;
>  * add a {{sort}} attribute to the entity configuration (see the note about 
> sorting at https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results);
>  * remove the {{timeout}} attribute.
> {code}
> <dataConfig>
>   <document>
>     <entity processor="SolrEntityProcessor"
>             query="*:*"
>             rows="1000"
>             cursorMark='true'
>             sort="id asc"
>             url="http://localhost:8983/solr/collection1">
>     </entity>
>   </document>
> </dataConfig>
> {code}
> If the {{cursorMark}} attribute is missing or is not {{'true'}} then the 
> default start/rows pagination is used.
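The start/rows vs. cursor trade-off described above can be sketched outside Solr. This is a conceptual illustration only; the function names and the in-memory "index" are invented for the sketch, not SolrEntityProcessor code:

```python
import bisect

# Stand-in for an index sorted by a unique id field.
docs = sorted(range(10_000))

def fetch_page_offset(start, rows):
    # Offset paging must skip `start` entries on every request: O(start + rows),
    # so each successive deep page gets more expensive.
    return docs[start:start + rows]

def fetch_page_cursor(cursor, rows):
    # Cursor (keyset) paging resumes right after the last id seen:
    # O(log n + rows) with a sorted index, regardless of how deep we are.
    i = bisect.bisect_right(docs, cursor)
    page = docs[i:i + rows]
    next_cursor = page[-1] if page else cursor
    return page, next_cursor

# Both strategies return the same pages; only the cost profile differs.
page1, cur = fetch_page_cursor(-1, 3)
assert page1 == fetch_page_offset(0, 3) == [0, 1, 2]
page2, cur = fetch_page_cursor(cur, 3)
assert page2 == fetch_page_offset(3, 3) == [3, 4, 5]
```

This is why the import rate stays stable with cursorMark while start/rows degrades as the offset grows.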






[jira] [Resolved] (SOLR-4763) Performance issue when using group.facet=true

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-4763.
--
   Resolution: Fixed
Fix Version/s: (was: 6.0)
   (was: 5.5)
   6.4
   trunk

Actually, this was probably addressed prior to 6.4; the consensus seems to be to 
use JSON facets.

If JSON facets don't really answer the need, we can re-open this.
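For reference, the JSON Facet API equivalent of a per-group facet count uses a {{unique()}} aggregation inside each term bucket; the field names here ({{category}}, {{group_field}}) are placeholders, not from this issue:

```json
{
  "query": "*:*",
  "facet": {
    "categories": {
      "type": "terms",
      "field": "category",
      "facet": { "groupCount": "unique(group_field)" }
    }
  }
}
```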

> Performance issue when using group.facet=true
> -
>
> Key: SOLR-4763
> URL: https://issues.apache.org/jira/browse/SOLR-4763
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.2
>Reporter: Alexander Koval
>Assignee: Erick Erickson
> Fix For: trunk, 6.4
>
> Attachments: SOLR-4763.patch, SOLR-4763.patch, SOLR-4763.patch
>
>
> I do not know whether this is a bug or not, but calculating facets with 
> {{group.facet=true}} is too slow.
> I have a query that returns:
> {code}
> "matches": 730597,
> "ngroups": 24024,
> {code}
> 1. All queries with {{group.facet=true}}:
> {code}
> "QTime": 5171
> "facet": {
> "time": 4716
> {code}
> 2. Without {{group.facet}}:
> * First query:
> {code}
> "QTime": 3284
> "facet": {
> "time": 3104
> {code}
> * Next queries:
> {code}
> "QTime": 230,
> "facet": {
> "time": 76
> {code}
> So I think that with {{group.facet=true}} Solr doesn't use a cache to calculate 
> facets.
> Is it possible to improve the performance of facets when {{group.facet=true}}?






[jira] [Resolved] (SOLR-7036) Faster method for group.facet

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-7036.
--
   Resolution: Fixed
Fix Version/s: (was: 6.0)
   (was: 5.5)
   6.4
   trunk

Actually, this was probably addressed prior to 6.4; the consensus seems to be to 
use JSON facets.

If JSON facets don't really answer the need, we can re-open this.

> Faster method for group.facet
> -
>
> Key: SOLR-7036
> URL: https://issues.apache.org/jira/browse/SOLR-7036
> Project: Solr
>  Issue Type: Improvement
>  Components: faceting
>Affects Versions: 4.10.3
>Reporter: Jim Musil
>Assignee: Erick Erickson
> Fix For: trunk, 6.4
>
> Attachments: SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, 
> SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, SOLR-7036_zipped.zip, 
> jstack-output.txt, performance.txt, source_for_patch.zip
>
>
> This is a patch that speeds up the performance of requests made with 
> group.facet=true. The original code that collects and counts unique facet 
> values for each group does not use the same improved field cache methods that 
> have been added for normal faceting in recent versions.
> Specifically, this approach leverages the UninvertedField class which 
> provides a much faster way to look up docs that contain a term. I've also 
> added a simple grouping map so that when a term is found for a doc, it can 
> quickly look up the group to which it belongs.
> Group faceting was very slow for our data set, and when the number of docs or 
> terms was high, latency spiked into multi-second requests. This solution 
> provides better overall performance -- from an average of 54ms to 32ms. It 
> also dropped our slowest performing queries way down -- from 6012ms to 991ms.
> I also added a few tests.
> I added an additional parameter so that you can choose to use this method or 
> the original. Add group.facet.method=fc to use the improved method, or 
> group.facet.method=original, which is the default if not specified.
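The core idea above (look up each doc's group via a grouping map, then count distinct groups per facet term) can be sketched concisely. This is a conceptual illustration of group faceting, not the actual UninvertedField-based Solr code:

```python
# doc id -> (group field value, facet field value); sample data is invented.
docs = {
    1: ("g1", "red"), 2: ("g1", "red"), 3: ("g2", "red"), 4: ("g2", "blue"),
}

def group_facet_counts(docs):
    # For each facet term, collect the set of groups whose docs contain it,
    # then report the number of distinct groups (not the number of docs).
    groups_per_term = {}
    for group, term in docs.values():
        groups_per_term.setdefault(term, set()).add(group)
    return {term: len(groups) for term, groups in groups_per_term.items()}

# "red" occurs in three docs but only two groups, so group.facet reports 2.
assert group_facet_counts(docs) == {"red": 2, "blue": 1}
```

The speedup in the patch comes from replacing per-term document scans with the uninverted term-to-docs lookup plus this constant-time doc-to-group map.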






[jira] [Resolved] (SOLR-9891) Add mkroot command to bin/solr and bin/solr.cmd

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-9891.
--
   Resolution: Fixed
Fix Version/s: 6.4
   trunk

Also changed many of the references to zkcli in the ref guide to use bin/solr.

> Add mkroot command to bin/solr and bin/solr.cmd
> ---
>
> Key: SOLR-9891
> URL: https://issues.apache.org/jira/browse/SOLR-9891
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: trunk, 6.4
>
> Attachments: SOLR-9891.patch, SOLR-9891.patch
>
>
> This came to my attention just now. To use a different root in Solr, we say 
> this in the ref guide:
> IMPORTANT: If your ZooKeeper connection string uses a chroot, such as 
> localhost:2181/solr, then you need to bootstrap the /solr znode before 
> launching SolrCloud using the bin/solr script. To do this, you need to use 
> the zkcli.sh script shipped with Solr, such as:
> server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr -cmd 
> bootstrap -solrhome server/solr
> I think all this really does is create an empty /solr ZNode. We're trying to 
> move the common usages of the zkcli scripts to bin/solr so I tried making 
> this work.
> It's clumsy. If I try to copy up an empty directory to /solr, nothing happens. 
> I got it to work by copying file:README.txt to zk:/solr/nonsense and then 
> deleting zk:/solr/nonsense. Ugly.
> I don't want to get into reproducing the whole Unix shell file manipulation 
> commands with mkdir, touch, etc.
> I guess we already have special 'upconfig' and 'downconfig' commands, so 
> maybe a specific command for this like 'mkroot' would be OK. Do people have 
> opinions about this as opposed to 'mkdir'? I'm tending toward 'mkdir'.
> Or have the cp command handle empty directories, but mkroot/mkdir seems more 
> intuitive if not as generic.
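The before/after usage discussed above, sketched as shell commands. This assumes a ZooKeeper running at localhost:2181 and, for the second form, a Solr build that includes the new mkroot command this issue adds:

```shell
# Old way: zkcli bootstrap (creating the /solr chroot znode is a side effect).
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr \
    -cmd bootstrap -solrhome server/solr

# New way (this issue): create the empty chroot znode directly.
bin/solr zk mkroot /solr -z localhost:2181
```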






[jira] [Resolved] (SOLR-9895) Replace existing ref guide references to zkcli with bin/solr zk options where possible

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-9895.
--
   Resolution: Fixed
Fix Version/s: 6.4
   trunk

I'm leaving alone the bits about SSL, and the other security material that isn't 
simply copying a file, until the steps can be tested.

> Replace existing ref guide references to zkcli with bin/solr zk options where 
> possible
> --
>
> Key: SOLR-9895
> URL: https://issues.apache.org/jira/browse/SOLR-9895
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: trunk, 6.4
>
>
> I was looking through the CWiki for SOLR-9891 and noticed a fair number of 
> references to zkcli. I'd like to replace as many of those as possible and use 
> the bin/solr zk way of interacting with Zookeeper on the principle that 
> fewer tools == less confusion.
> Any help welcome!






[jira] [Commented] (SOLR-9891) Add mkroot command to bin/solr and bin/solr.cmd

2016-12-29 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786607#comment-15786607
 ] 

Erick Erickson commented on SOLR-9891:
--

Thanks Steve!









[jira] [Commented] (SOLR-9891) Add mkroot command to bin/solr and bin/solr.cmd

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786605#comment-15786605
 ] 

ASF subversion and git services commented on SOLR-9891:
---

Commit ed55658620e66b7a06a820219daf6435dfb070d6 in lucene-solr's branch 
refs/heads/branch_6x from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ed55658 ]

SOLR-9891: Add mkroot command to bin/solr and bin/solr.cmd
(cherry picked from commit cb266d5)








[jira] [Commented] (SOLR-9891) Add mkroot command to bin/solr and bin/solr.cmd

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786602#comment-15786602
 ] 

ASF subversion and git services commented on SOLR-9891:
---

Commit cb266d5fc775bd9d26ed7f0e68e9d0d12793f9b5 in lucene-solr's branch 
refs/heads/master from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cb266d5 ]

SOLR-9891: Add mkroot command to bin/solr and bin/solr.cmd








[jira] [Updated] (SOLR-9891) Add mkroot command to bin/solr and bin/solr.cmd

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-9891:
-
Attachment: SOLR-9891.patch

Final patch with CHANGES.txt







[jira] [Comment Edited] (SOLR-9894) Tokenizer work randomly

2016-12-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786574#comment-15786574
 ] 

王海涛 edited comment on SOLR-9894 at 12/30/16 1:24 AM:
-

Can anyone resolve this bug? I would really appreciate it, because this bug 
makes my company's search results very bad...


was (Author: wanghaitao):
Does anyone can resolve this bug? I will appreciate you, because this bug make 
my company search result so bad bad bad...

> Tokenizer work randomly
> ---
>
> Key: SOLR-9894
> URL: https://issues.apache.org/jira/browse/SOLR-9894
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query parsers
>Affects Versions: 6.2.1
> Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>Reporter: 王海涛
>Priority: Critical
>  Labels: patch
> Attachments: step1.png, step2.png, step3.png, step4.png
>
>
> My schema.xml has a fieldType as follows:
> {code}
> <fieldType name="..." class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
>     <filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory" pinyinAll="true"
>             minTermLength="2"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true"/>
>   </analyzer>
> </fieldType>
> {code}
> Note:
>   the index tokenizer's useSmart is false;
>   the query tokenizer's useSmart is true.
> But when I send a query request with the q parameter, the query tokenizer 
> sometimes has useSmart equal to true and sometimes equal to false. That is 
> terrible!
> I guess the problem may be caused by a tokenizer cache: when I query, the 
> tokenizer should use true as the useSmart value, but it had cached the wrong 
> tokenizer result, which was created by the indexWriter using false as the 
> useSmart value.






[jira] [Commented] (SOLR-9894) Tokenizer work randomly

2016-12-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786574#comment-15786574
 ] 

王海涛 commented on SOLR-9894:
---

Can anyone resolve this bug? I would appreciate it, because this bug makes my 
company's search results very bad...







[jira] [Resolved] (LUCENE-7564) AnalyzingInfixSuggester should close its IndexWriter by default at the end of build()

2016-12-29 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved LUCENE-7564.

Resolution: Fixed

> AnalyzingInfixSuggester should close its IndexWriter by default at the end of 
> build()
> -
>
> Key: LUCENE-7564
> URL: https://issues.apache.org/jira/browse/LUCENE-7564
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Steve Rowe
>Assignee: Steve Rowe
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7564-fix-random-NRT-failures.patch, 
> LUCENE-7564-fix-random-NRT-failures.patch, LUCENE-7564.patch, 
> LUCENE-7564.patch
>
>
> From SOLR-6246, where AnalyzingInfixSuggester's write lock on its index is 
> causing trouble when reloading a Solr core:
> [~gsingers] wrote:
> bq. One suggestion that might minimize the impact: close the writer after 
> build
> [~varunthacker] wrote:
> {quote}
> This is what I am thinking -
> Create a Lucene issue in which {{AnalyzingInfixSuggester#build}} closes the 
> writer by default at the end.
> The {{add}} and {{update}} methods call {{ensureOpen}} and those who do 
> frequent real time updates directly via lucene won't see any slowdowns.
> [~mikemccand] - Would this approach have any major drawback from Lucene's 
> perspective? Else I can go ahead and tackle this in a Lucene issue
> {quote}
> [~mikemccand] wrote:
> bq. Fixing {{AnalyzingInfixSuggester}} to close the writer at the end of 
> build seems reasonable?






[jira] [Updated] (SOLR-9891) Add mkroot command to bin/solr and bin/solr.cmd

2016-12-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-9891:
-
Summary: Add mkroot command to bin/solr and bin/solr.cmd  (was: bin/solr 
cannot create an empty Znode which is useful for chroot)







[jira] [Updated] (LUCENE-7580) Spans tree scoring

2016-12-29 Thread Paul Elschot (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-7580:
-
Attachment: LUCENE-7580.patch

Patch of 29 Dec 2016.

Compared to the previous patch, this adds:

Limiting the max allowed slop to Integer.MAX_VALUE-1 in the SpanNearQuery 
constructor and in TestSpanSearchEquivalence. An actual slop of 
Integer.MAX_VALUE causes an overflow in distance+1 that is used in 
computeSlopFactor. Since the same limitation is already present for indexed 
positions, I would not expect this slop factor miscalculation to actually occur.

The negative slops that occur for overlapping spans are changed to 0 before 
passing them to computeSlopFactor in NearSpansDocScorer in the patch here.

The non-match distance passed to SpanNearQuery in the patch is verified to be 
at least the given slop.

A wrapper method SpansTreeScorer.wrap() is added that will wrap the span 
subqueries of a given query in a SpansTreeQuery. This works for span 
subqueries of BooleanQuery, DisjunctionMaxQuery and BoostQuery.

> Spans tree scoring
> --
>
> Key: LUCENE-7580
> URL: https://issues.apache.org/jira/browse/LUCENE-7580
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: master (7.0)
>Reporter: Paul Elschot
>Priority: Minor
> Fix For: 6.x
>
> Attachments: LUCENE-7580.patch, LUCENE-7580.patch, LUCENE-7580.patch
>
>
> Recurse the spans tree to compose a score based on the type of subqueries and 
> what matched






[jira] [Commented] (SOLR-9886) Add ability to turn off/on caches

2016-12-29 Thread Pushkar Raste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786323#comment-15786323
 ] 

Pushkar Raste commented on SOLR-9886:
-

[~noble.paul] My only concern with adding a legend to 
{{EditableSolrConfigAttributes.json}} is that, if we ever parse this file with a 
JSON parser, we will have to move the legend somewhere else.

> Add ability to turn off/on caches 
> --
>
> Key: SOLR-9886
> URL: https://issues.apache.org/jira/browse/SOLR-9886
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Assignee: Noble Paul
>Priority: Minor
> Attachments: EnableDisableCacheAttribute.patch, SOLR-9886.patch, 
> SOLR-9886.patch
>
>
> There is no elegant way to turn off caches (filterCache, queryResultCache, 
> etc.) from the solrconfig. When I tried setting size and initialSize to zero, 
> it resulted in caches of size 2. Here is the code that overrides a zero-sized 
> cache setting: 
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/FastLRUCache.java#L61-L73
> The only way to disable a cache right now is by removing its config from the 
> solrConfig, but we could simply provide an attribute to disable a cache, so 
> that we can override it using a system property. 
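The behavior described above can be sketched in miniature: a cache whose configured size is clamped to a minimum of 2 (so size=0 cannot disable it), plus the kind of enable/disable switch this issue proposes. This is an illustrative toy, not FastLRUCache:

```python
from collections import OrderedDict

class TinyLRUCache:
    """Toy LRU cache illustrating the size floor and an `enabled` switch."""

    def __init__(self, size, enabled=True):
        # Mirrors the floor described above: a configured size below 2
        # silently becomes 2, so size=0 does not turn the cache off.
        self.size = max(2, size)
        self.enabled = enabled
        self._data = OrderedDict()

    def put(self, key, value):
        if not self.enabled:
            return  # a disabled cache stores nothing
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.size:
            self._data.popitem(last=False)  # evict oldest entry

    def get(self, key):
        return self._data.get(key) if self.enabled else None

c = TinyLRUCache(size=0)          # asking for 0 still yields a cache of 2
c.put("a", 1); c.put("b", 2); c.put("c", 3)
assert c.size == 2 and c.get("a") is None and c.get("c") == 3

off = TinyLRUCache(size=100, enabled=False)
off.put("a", 1)
assert off.get("a") is None       # the enabled=False switch disables caching
```

An explicit enabled attribute, overridable via a system property, avoids having to delete the cache config entirely.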






[jira] [Updated] (SOLR-9886) Add ability to turn off/on caches

2016-12-29 Thread Pushkar Raste (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pushkar Raste updated SOLR-9886:

Attachment: SOLR-9886.patch

Updated patch with a test.

> Add ability to turn off/on caches 
> --
>
> Key: SOLR-9886
> URL: https://issues.apache.org/jira/browse/SOLR-9886
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Assignee: Noble Paul
>Priority: Minor
> Attachments: EnableDisableCacheAttribute.patch, SOLR-9886.patch, 
> SOLR-9886.patch
>
>
> There is no elegant way to turn off caches (filterCache, queryResultCache, 
> etc.) from the solrconfig. When I tried setting size and initialSize to zero, 
> it resulted in caches of size 2. Here is the code that overrides a zero 
> cache size: 
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/FastLRUCache.java#L61-L73
> The only way to disable a cache right now is to remove its config from the 
> solrConfig, but we could simply provide an attribute to disable a cache, so 
> that it can be overridden with a system property. 






[GitHub] lucene-solr pull request #132: SOLR-9886

2016-12-29 Thread praste
GitHub user praste opened a pull request:

https://github.com/apache/lucene-solr/pull/132

SOLR-9886



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/praste/lucene-solr CacheConfig

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #132


commit c5363f6dabb0a36cc41f174023eafdd443ed106f
Author: Pushkar Raste 
Date:   2016-12-23T16:41:28Z

Allow enable/disable cache

commit b8daed32047475d0eee330374d8d4ed5f2820897
Author: Pushkar Raste 
Date:   2016-12-29T18:20:56Z

Merge branch 'master' of https://github.com/apache/lucene-solr into 
CacheConfig

commit e394b405b9e6fb0024c4b3a8747ec500dd5ba3d4
Author: Pushkar Raste 
Date:   2016-12-29T22:46:42Z

Test case, bug fix, updated EdittableCofing.json

commit 59484c080fc3e3e97bb5a443588c409e381e1340
Author: Pushkar Raste 
Date:   2016-12-29T22:47:37Z

Adding missing files




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[jira] [Commented] (SOLR-9886) Add ability to turn off/on caches

2016-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786306#comment-15786306
 ] 

ASF GitHub Bot commented on SOLR-9886:
--

GitHub user praste opened a pull request:

https://github.com/apache/lucene-solr/pull/132

SOLR-9886



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/praste/lucene-solr CacheConfig

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #132


commit c5363f6dabb0a36cc41f174023eafdd443ed106f
Author: Pushkar Raste 
Date:   2016-12-23T16:41:28Z

Allow enable/disable cache

commit b8daed32047475d0eee330374d8d4ed5f2820897
Author: Pushkar Raste 
Date:   2016-12-29T18:20:56Z

Merge branch 'master' of https://github.com/apache/lucene-solr into 
CacheConfig

commit e394b405b9e6fb0024c4b3a8747ec500dd5ba3d4
Author: Pushkar Raste 
Date:   2016-12-29T22:46:42Z

Test case, bug fix, updated EdittableCofing.json

commit 59484c080fc3e3e97bb5a443588c409e381e1340
Author: Pushkar Raste 
Date:   2016-12-29T22:47:37Z

Adding missing files




> Add ability to turn off/on caches 
> --
>
> Key: SOLR-9886
> URL: https://issues.apache.org/jira/browse/SOLR-9886
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Assignee: Noble Paul
>Priority: Minor
> Attachments: EnableDisableCacheAttribute.patch, SOLR-9886.patch
>
>
> There is no elegant way to turn off caches (filterCache, queryResultCache, 
> etc.) from the solrconfig. When I tried setting size and initialSize to zero, 
> it resulted in caches of size 2. Here is the code that overrides a zero 
> cache size: 
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/FastLRUCache.java#L61-L73
> The only way to disable a cache right now is to remove its config from the 
> solrConfig, but we could simply provide an attribute to disable a cache, so 
> that it can be overridden with a system property. 






[jira] [Commented] (SOLR-9891) bin/solr cannot create an empty Znode which is useful for chroot

2016-12-29 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786303#comment-15786303
 ] 

Steve Rowe commented on SOLR-9891:
--

Works for me on Windows 10:

First, check ZK contents:
{noformat}
...\git\lucene-solr\solr>bin\solr zk ls -r / -z localhost:2181

Connecting to ZooKeeper at localhost:2181 ...
Getting listing for Zookeeper node / from ZooKeeper at localhost:2181 recurse: 
true
/
{noformat}

Then, {{zk mkroot solr}}:

{noformat}
...\git\lucene-solr\solr>bin\solr zk mkroot solr -z localhost:2181

Connecting to ZooKeeper at localhost:2181 ...
Creating Zookeeper path solr on ZooKeeper at localhost:2181
{noformat}

And then check contents again:

{noformat}
...\git\lucene-solr\solr>bin\solr zk ls -r / -z localhost:2181

Connecting to ZooKeeper at localhost:2181 ...
Getting listing for Zookeeper node / from ZooKeeper at localhost:2181 recurse: 
true
/
/solr
{noformat}

I also tried {{zk mkroot solr2/subfolder}}, {{zk mkroot /solr3}} (note the 
leading slash), and these worked too.

> bin/solr cannot create an empty Znode which is useful for chroot
> 
>
> Key: SOLR-9891
> URL: https://issues.apache.org/jira/browse/SOLR-9891
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: SOLR-9891.patch
>
>
> This came to my attention just now. To use a different root in Solr, we say 
> this in the ref guide:
> IMPORTANT: If your ZooKeeper connection string uses a chroot, such as 
> localhost:2181/solr, then you need to bootstrap the /solr znode before 
> launching SolrCloud using the bin/solr script. To do this, you need to use 
> the zkcli.sh script shipped with Solr, such as:
> server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr -cmd 
> bootstrap -solrhome server/solr
> I think all this really does is create an empty /solr ZNode. We're trying to 
> move the common usages of the zkcli scripts to bin/solr so I tried making 
> this work.
> It's clumsy. If I try to copy up an empty directory to /solr, nothing happens. 
> I got it to work by copying file:README.txt to zk:/solr/nonsense and then 
> deleting zk:/solr/nonsense. Ugly.
> I don't want to get into reproducing the whole Unix shell file manipulation 
> commands with mkdir, touch, etc.
> I guess we already have special 'upconfig' and 'downconfig' commands, so 
> maybe a specific command for this like 'mkroot' would be OK. Do people have 
> opinions about this as opposed to 'mkdir'? I'm tending to mkdir.
> Or have the cp command handle empty directories, but mkroot/mkdir seems more 
> intuitive if not as generic.






[jira] [Commented] (SOLR-9900) ReversedWildcardFilterFactory yields false positive hits for range query

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786233#comment-15786233
 ] 

ASF subversion and git services commented on SOLR-9900:
---

Commit ce6bdea6e3f3eca7a1058dd4fe2ff0af70d2e4c2 in lucene-solr's branch 
refs/heads/branch_6x from [~mkhludnev]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ce6bdea ]

SOLR-9900: fix false positives on range queries with 
ReversedWildcardFilterFactory


> ReversedWildcardFilterFactory yields false positive hits for range query
> 
>
> Key: SOLR-9900
> URL: https://issues.apache.org/jira/browse/SOLR-9900
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Mikhail Khludnev
>Priority: Minor
> Attachments: SOLR-1321-range-q-false-positive.patch, SOLR-9900.patch
>
>
> Range query yields false positives when ReversedWildcardFilterFactory is on. 
> I'm not sure whether it's worth bothering with.
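The false positive can be illustrated with plain string comparisons. This is a simplified, hypothetical sketch: it ignores the marker character the real filter prepends to reversed tokens, and only shows how a reversed term can fall inside a lexicographic range that its original would not:

```java
public class ReversedRangeDemo {
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Lexicographic range check, inclusive on both ends, like a term range query.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // With ReversedWildcardFilterFactory, "zebra" is also indexed reversed.
        String original = "zebra";
        String reversed = reverse(original);              // "arbez"
        // field:[a TO c] should not match "zebra"...
        System.out.println(inRange(original, "a", "c"));  // false
        // ...but the reversed term falls inside the range: a false positive.
        System.out.println(inRange(reversed, "a", "c"));  // true
    }
}
```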






[jira] [Commented] (LUCENE-7595) RAMUsageTester in test-framework and static field checker no longer works with Java 9

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786229#comment-15786229
 ] 

ASF subversion and git services commented on LUCENE-7595:
-

Commit 40a8b4edb4cfc7de5b62037fdcb389afa247573d in lucene-solr's branch 
refs/heads/branch_6x from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=40a8b4e ]

LUCENE-7595: Disable another test not compatible with RamUsageTester


> RAMUsageTester in test-framework and static field checker no longer works 
> with Java 9
> -
>
> Key: LUCENE-7595
> URL: https://issues.apache.org/jira/browse/LUCENE-7595
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/test
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>  Labels: Java9
> Fix For: 6.x, master (7.0), 6.4
>
> Attachments: LUCENE-7595.patch
>
>
> Lucene/Solr tests have a special rule that records memory usage in static 
> fields before and after test, so we can detect memory leaks. This check dives 
> into JDK classes (like java.lang.String to detect their size). As Java 9 
> build 148 completely forbids setAccessible on any runtime class, we have to 
> change or disable this check:
> - As a first step, I will only add the rule to LTC if we are on Java 8
> - As a second step we might investigate how to improve this
> [~rcmuir] had some ideas for the 2nd point:
> - Don't dive into classes from JDK modules and instead "estimate" the size 
> for some special cases (like Strings)
> - Disallow any static field in tests that is not final (constant) and points 
> to an Object except: Strings and native (wrapper) types.
> In addition we also have RAMUsageTester, that has similar problems and is 
> used to compare estimations of Lucene's calculations of 
> Codec/IndexWriter/IndexReader memory usage with reality. We should simply 
> disable those tests.
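The "estimate instead of reflect" idea for Strings could look roughly like the following. The header and alignment constants are assumptions for a 64-bit JVM with compressed oops, not values taken from RAMUsageTester:

```java
public class StringSizeEstimator {
    // Assumed layout constants for a 64-bit JVM with compressed oops;
    // illustrative only.
    static final int OBJECT_HEADER = 16;
    static final int ARRAY_HEADER = 16;
    static final int REF_SIZE = 4;

    // Round up to 8-byte object alignment.
    static long align(long bytes) {
        return (bytes + 7) & ~7L;
    }

    // Estimate a String's footprint without reflection: the String object
    // (header + int hash + reference to the backing array) plus the char[]
    // (header + 2 bytes per char), each aligned separately.
    static long estimateStringBytes(String s) {
        long stringObject = align(OBJECT_HEADER + 4 + REF_SIZE);
        long charArray = align(ARRAY_HEADER + 2L * s.length());
        return stringObject + charArray;
    }

    public static void main(String[] args) {
        System.out.println(estimateStringBytes(""));     // 40
        System.out.println(estimateStringBytes("solr")); // 48
    }
}
```

No setAccessible is needed, so this works on Java 9 b148 and later.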






[jira] [Commented] (LUCENE-7595) RAMUsageTester in test-framework and static field checker no longer works with Java 9

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786225#comment-15786225
 ] 

ASF subversion and git services commented on LUCENE-7595:
-

Commit d65c02e8cc14f03389c2426ea3d3ddd75e12b1ec in lucene-solr's branch 
refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d65c02e ]

LUCENE-7595: Disable another test not compatible with RamUsageTester


> RAMUsageTester in test-framework and static field checker no longer works 
> with Java 9
> -
>
> Key: LUCENE-7595
> URL: https://issues.apache.org/jira/browse/LUCENE-7595
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/test
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>  Labels: Java9
> Fix For: 6.x, master (7.0), 6.4
>
> Attachments: LUCENE-7595.patch
>
>
> Lucene/Solr tests have a special rule that records memory usage in static 
> fields before and after test, so we can detect memory leaks. This check dives 
> into JDK classes (like java.lang.String to detect their size). As Java 9 
> build 148 completely forbids setAccessible on any runtime class, we have to 
> change or disable this check:
> - As a first step, I will only add the rule to LTC if we are on Java 8
> - As a second step we might investigate how to improve this
> [~rcmuir] had some ideas for the 2nd point:
> - Don't dive into classes from JDK modules and instead "estimate" the size 
> for some special cases (like Strings)
> - Disallow any static field in tests that is not final (constant) and points 
> to an Object except: Strings and native (wrapper) types.
> In addition we also have RAMUsageTester, that has similar problems and is 
> used to compare estimations of Lucene's calculations of 
> Codec/IndexWriter/IndexReader memory usage with reality. We should simply 
> disable those tests.






[jira] [Commented] (SOLR-9900) ReversedWildcardFilterFactory yields false positive hits for range query

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786200#comment-15786200
 ] 

ASF subversion and git services commented on SOLR-9900:
---

Commit 5d042d3a49dfcf654b8bf8a96521d5404bfd3a7b in lucene-solr's branch 
refs/heads/master from [~mkhludnev]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5d042d3 ]

SOLR-9900: fix false positives on range queries with 
ReversedWildcardFilterFactory


> ReversedWildcardFilterFactory yields false positive hits for range query
> 
>
> Key: SOLR-9900
> URL: https://issues.apache.org/jira/browse/SOLR-9900
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Mikhail Khludnev
>Priority: Minor
> Attachments: SOLR-1321-range-q-false-positive.patch, SOLR-9900.patch
>
>
> Range query yields false positives when ReversedWildcardFilterFactory is on. 
> I'm not sure whether it's worth bothering with.






[jira] [Comment Edited] (SOLR-9668) Support cursor paging in SolrEntityProcessor

2016-12-29 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786130#comment-15786130
 ] 

Mikhail Khludnev edited comment on SOLR-9668 at 12/29/16 9:37 PM:
--

Are there any concerns? 



was (Author: mkhludnev):
is there any concerns? 


> Support cursor paging in SolrEntityProcessor
> 
>
> Key: SOLR-9668
> URL: https://issues.apache.org/jira/browse/SOLR-9668
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Reporter: Yegor Kozlov
>Assignee: Mikhail Khludnev
>Priority: Minor
>  Labels: dataimportHandler
> Fix For: master (7.0)
>
> Attachments: SOLR-9668.patch, SOLR-9668.patch
>
>
> SolrEntityProcessor paginates using the start and rows parameters, which can 
> be very inefficient at large offsets. In fact, the current implementation 
> makes it impracticable to import large amounts of data (10M+ documents): the 
> import rate degrades from 1000 docs/second to 10 docs/second and the import 
> gets stuck.
> This patch introduces support for cursor paging, which offers more or less 
> predictable performance. In my tests the time to fetch the 1st and the 1000th 
> page was about the same, and the import rate was stable throughout the 
> entire import.
> To enable cursor paging a user needs to:
>  * add a {{cursorMark='true'}} (!) attribute in the entity configuration;
>  * add a "sort" attribute in the entity configuration (see the note about 
> sorting at https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results);
>  * remove the {{timeout}} attribute.
> {code}
> <dataConfig>
>   <document>
>     <entity processor="SolrEntityProcessor"
>             query="*:*"
>             rows="1000"
>             cursorMark='true'
>             sort="id asc"
>             url="http://localhost:8983/solr/collection1"/>
>   </document>
> </dataConfig>
> {code}
> If the {{cursorMark}} attribute is missing or is not {{'true'}} then the 
> default start/rows pagination is used.
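The cursor iteration described above can be sketched in plain Java. This illustrates cursorMark-style paging over a unique sort key; it is not SolrEntityProcessor code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CursorPagingDemo {
    // Return up to 'rows' ids strictly after the cursor mark (the last id
    // of the previous page), mimicking cursorMark iteration over a unique
    // sort key. Each page costs the same regardless of how deep we are,
    // unlike start/rows offset paging.
    static List<Integer> fetchAfter(List<Integer> sortedIds, Integer cursor, int rows) {
        List<Integer> page = new ArrayList<>();
        for (Integer id : sortedIds) {
            if ((cursor == null || id > cursor) && page.size() < rows) {
                page.add(id);
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        Integer cursor = null;                    // plays the role of cursorMark=*
        int pages = 0;
        List<Integer> page = fetchAfter(ids, cursor, 3);
        while (!page.isEmpty()) {
            cursor = page.get(page.size() - 1);   // advance the mark
            pages++;
            page = fetchAfter(ids, cursor, 3);
        }
        System.out.println(pages);                // 4 pages: [0..2], [3..5], [6..8], [9]
    }
}
```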






[jira] [Commented] (SOLR-7466) Allow optional leading wildcards in complexphrase

2016-12-29 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786176#comment-15786176
 ] 

Mikhail Khludnev commented on SOLR-7466:


Is there any objection to making leading wildcards always on in complexphrase?

> Allow optional leading wildcards in complexphrase
> -
>
> Key: SOLR-7466
> URL: https://issues.apache.org/jira/browse/SOLR-7466
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 4.8
>Reporter: Andy hardin
>Assignee: Mikhail Khludnev
>  Labels: complexPhrase, query-parser, wildcards
> Attachments: SOLR-7466.patch
>
>
> Currently ComplexPhraseQParser (SOLR-1604) allows trailing wildcards on terms 
> in a phrase, but does not allow leading wildcards.  I would like the option 
> to be able to search for terms with both trailing and leading wildcards.  
> For example with:
> {!complexphrase allowLeadingWildcard=true} "j* *th"
> would match "John Smith", "Jim Smith", but not "John Schmitt"
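The intended matching semantics can be sketched by translating each wildcard term to an anchored regex. This only illustrates the semantics; it is not how ComplexPhraseQParser evaluates wildcards:

```java
import java.util.regex.Pattern;

public class WildcardPhraseDemo {
    // Translate a wildcard term (* = any run of characters) into an
    // anchored regex using \Q...\E quoting for the literal parts.
    static boolean termMatches(String word, String wildcard) {
        String regex = Pattern.quote(wildcard).replace("*", "\\E.*\\Q");
        return word.matches(regex);
    }

    // Position-by-position phrase match: every word must match the
    // wildcard term at the same position.
    static boolean phraseMatches(String text, String phrase) {
        String[] words = text.toLowerCase().split("\\s+");
        String[] terms = phrase.toLowerCase().split("\\s+");
        if (words.length != terms.length) return false;
        for (int i = 0; i < terms.length; i++) {
            if (!termMatches(words[i], terms[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(phraseMatches("John Smith", "j* *th"));   // true
        System.out.println(phraseMatches("Jim Smith", "j* *th"));    // true
        System.out.println(phraseMatches("John Schmitt", "j* *th")); // false
    }
}
```

"*th" has a leading wildcard, which is exactly what the issue asks ComplexPhraseQParser to accept.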






[jira] [Commented] (LUCENE-7564) AnalyzingInfixSuggester should close its IndexWriter by default at the end of build()

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786148#comment-15786148
 ] 

ASF subversion and git services commented on LUCENE-7564:
-

Commit 6b00ee5175d55d2f2a25ce6539dc12277022c898 in lucene-solr's branch 
refs/heads/master from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6b00ee5 ]

LUCENE-7564: add missing javadocs


> AnalyzingInfixSuggester should close its IndexWriter by default at the end of 
> build()
> -
>
> Key: LUCENE-7564
> URL: https://issues.apache.org/jira/browse/LUCENE-7564
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Steve Rowe
>Assignee: Steve Rowe
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7564-fix-random-NRT-failures.patch, 
> LUCENE-7564-fix-random-NRT-failures.patch, LUCENE-7564.patch, 
> LUCENE-7564.patch
>
>
> From SOLR-6246, where AnalyzingInfixSuggester's write lock on its index is 
> causing trouble when reloading a Solr core:
> [~gsingers] wrote:
> bq. One suggestion that might minimize the impact: close the writer after 
> build
> [~varunthacker] wrote:
> {quote}
> This is what I am thinking -
> Create a Lucene issue in which {{AnalyzingInfixSuggester#build}} closes the 
> writer by default at the end.
> The {{add}} and {{update}} methods call {{ensureOpen}} and those who do 
> frequent real time updates directly via lucene won't see any slowdowns.
> [~mikemccand] - Would this approach have any major drawback from Lucene's 
> perspective? Else I can go ahead and tackle this in a Lucene issue
> {quote}
> [~mikemccand] wrote:
> bq. Fixing {{AnalyzingInfixSuggester}} to close the writer at the end of 
> build seems reasonable?






[jira] [Commented] (LUCENE-7564) AnalyzingInfixSuggester should close its IndexWriter by default at the end of build()

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786146#comment-15786146
 ] 

ASF subversion and git services commented on LUCENE-7564:
-

Commit 73f068e50333902b3ea887100f063e61ebf1996b in lucene-solr's branch 
refs/heads/branch_6x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=73f068e ]

LUCENE-7564: Force single-threaded access to the AnalyzingInfixSuggester's 
SearcherManager when performing an acquire() or reassigning.  This fixes 
failures in AnalyzingInfixSuggester.testRandomNRT().


> AnalyzingInfixSuggester should close its IndexWriter by default at the end of 
> build()
> -
>
> Key: LUCENE-7564
> URL: https://issues.apache.org/jira/browse/LUCENE-7564
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Steve Rowe
>Assignee: Steve Rowe
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7564-fix-random-NRT-failures.patch, 
> LUCENE-7564-fix-random-NRT-failures.patch, LUCENE-7564.patch, 
> LUCENE-7564.patch
>
>
> From SOLR-6246, where AnalyzingInfixSuggester's write lock on its index is 
> causing trouble when reloading a Solr core:
> [~gsingers] wrote:
> bq. One suggestion that might minimize the impact: close the writer after 
> build
> [~varunthacker] wrote:
> {quote}
> This is what I am thinking -
> Create a Lucene issue in which {{AnalyzingInfixSuggester#build}} closes the 
> writer by default at the end.
> The {{add}} and {{update}} methods call {{ensureOpen}} and those who do 
> frequent real time updates directly via lucene won't see any slowdowns.
> [~mikemccand] - Would this approach have any major drawback from Lucene's 
> perspective? Else I can go ahead and tackle this in a Lucene issue
> {quote}
> [~mikemccand] wrote:
> bq. Fixing {{AnalyzingInfixSuggester}} to close the writer at the end of 
> build seems reasonable?






[jira] [Commented] (LUCENE-7564) AnalyzingInfixSuggester should close its IndexWriter by default at the end of build()

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786147#comment-15786147
 ] 

ASF subversion and git services commented on LUCENE-7564:
-

Commit 266ca264077671f40a381c4768c8c6a86275b268 in lucene-solr's branch 
refs/heads/branch_6x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=266ca26 ]

LUCENE-7564: add missing javadocs


> AnalyzingInfixSuggester should close its IndexWriter by default at the end of 
> build()
> -
>
> Key: LUCENE-7564
> URL: https://issues.apache.org/jira/browse/LUCENE-7564
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Steve Rowe
>Assignee: Steve Rowe
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7564-fix-random-NRT-failures.patch, 
> LUCENE-7564-fix-random-NRT-failures.patch, LUCENE-7564.patch, 
> LUCENE-7564.patch
>
>
> From SOLR-6246, where AnalyzingInfixSuggester's write lock on its index is 
> causing trouble when reloading a Solr core:
> [~gsingers] wrote:
> bq. One suggestion that might minimize the impact: close the writer after 
> build
> [~varunthacker] wrote:
> {quote}
> This is what I am thinking -
> Create a Lucene issue in which {{AnalyzingInfixSuggester#build}} closes the 
> writer by default at the end.
> The {{add}} and {{update}} methods call {{ensureOpen}} and those who do 
> frequent real time updates directly via lucene won't see any slowdowns.
> [~mikemccand] - Would this approach have any major drawback from Lucene's 
> perspective? Else I can go ahead and tackle this in a Lucene issue
> {quote}
> [~mikemccand] wrote:
> bq. Fixing {{AnalyzingInfixSuggester}} to close the writer at the end of 
> build seems reasonable?






[jira] [Commented] (SOLR-7739) Lucene Classification Integration - UpdateRequestProcessor

2016-12-29 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786133#comment-15786133
 ] 

Cassandra Targett commented on SOLR-7739:
-

I'm really far behind on this [~alessandro.benedetti], my apologies. I added a 
reference to this URP to the Ref Guide page 
https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors. 
None of the update processors are described very well there, so it would be 
strange to suddenly have a long section for just one, although we'd like 
someday to give full descriptions for all of them. For now, though, I linked to 
the Wiki page, and will work through some edits on that soon.

> Lucene Classification Integration - UpdateRequestProcessor
> --
>
> Key: SOLR-7739
> URL: https://issues.apache.org/jira/browse/SOLR-7739
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Alessandro Benedetti
>Assignee: Tommaso Teofili
>Priority: Minor
>  Labels: classification, index-time, update.chain, 
> updateProperties
> Fix For: 6.1, master (7.0)
>
> Attachments: SOLR-7739.1.patch, SOLR-7739.patch, SOLR-7739.patch, 
> SOLR-7739.patch
>
>
> It would be nice to have an UpdateRequestProcessor to interact with the 
> Lucene Classification Module and provide an easy way of auto classifying Solr 
> Documents on indexing.
> Documentation will be provided with the patch
> A first design will be provided soon.






[jira] [Commented] (SOLR-9668) Support cursor paging in SolrEntityProcessor

2016-12-29 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786130#comment-15786130
 ] 

Mikhail Khludnev commented on SOLR-9668:


Are there any concerns? 


> Support cursor paging in SolrEntityProcessor
> 
>
> Key: SOLR-9668
> URL: https://issues.apache.org/jira/browse/SOLR-9668
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Reporter: Yegor Kozlov
>Assignee: Mikhail Khludnev
>Priority: Minor
>  Labels: dataimportHandler
> Fix For: master (7.0)
>
> Attachments: SOLR-9668.patch, SOLR-9668.patch
>
>
> SolrEntityProcessor paginates using the start and rows parameters, which can 
> be very inefficient at large offsets. In fact, the current implementation 
> makes it impracticable to import large amounts of data (10M+ documents): the 
> import rate degrades from 1000 docs/second to 10 docs/second and the import 
> gets stuck.
> This patch introduces support for cursor paging, which offers more or less 
> predictable performance. In my tests the time to fetch the 1st and the 1000th 
> page was about the same, and the import rate was stable throughout the 
> entire import.
> To enable cursor paging a user needs to:
>  * add a {{cursorMark='true'}} (!) attribute in the entity configuration;
>  * add a "sort" attribute in the entity configuration (see the note about 
> sorting at https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results);
>  * remove the {{timeout}} attribute.
> {code}
> <dataConfig>
>   <document>
>     <entity processor="SolrEntityProcessor"
>             query="*:*"
>             rows="1000"
>             cursorMark='true'
>             sort="id asc"
>             url="http://localhost:8983/solr/collection1"/>
>   </document>
> </dataConfig>
> {code}
> If the {{cursorMark}} attribute is missing or is not {{'true'}} then the 
> default start/rows pagination is used.






[jira] [Updated] (LUCENE-7564) AnalyzingInfixSuggester should close its IndexWriter by default at the end of build()

2016-12-29 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-7564:
---
Attachment: LUCENE-7564-fix-random-NRT-failures.patch

Attaching a revised version of the patch to fix the random NRT failures.

The first version of the patch actually made the failure *more* likely on my 
system.  I saw two failures, each within a few hundred iterations.  
(Separately, I did finally get the test to fail on the unpatched code, after 
roughly 10k iterations.)

This version of the patch is a superset of the first version of the patch.  It 
adds synchronized sections around pulling a searcher from, and reassigning, the 
{{searcherMgr}}.  My theory is that when one thread is executing 
{{ensureOpen()}} and closes the {{searcherMgr}}, but then before the 
{{searcherMgr}} is reassigned in this thread, another thread attempts an 
{{acquire()}} on the now-closed {{searcherMgr}}.  The {{synchronized}} sections 
added in this version of the patch cause {{acquire()}} calls to wait until 
{{searcherMgr}} has finished being reassigned.  Since {{searcherMgr.release()}} 
is tolerant of being called after {{close()}} has been called, I didn't include 
the whole acquire/release cycle in the {{synchronized}} sections.

I saw no failures with this patch after 2000 beasting iterations.  I also 
beasted {{BlendedInfixSuggesterTest}} for 2000 iterations, and saw no failures 
there either.

I think this is ready to go.  I'll commit shortly.
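The race and its fix can be sketched schematically. Manager below is a stand-in for Lucene's SearcherManager, not the real class; the point is that acquire() and the close-then-reassign sequence synchronize on the same lock:

```java
public class SearcherSwapDemo {
    // Stand-in for SearcherManager: acquire() must not run after close().
    static class Manager {
        volatile boolean closed = false;
        void acquire() {
            if (closed) throw new IllegalStateException("acquired a closed manager");
        }
        void close() { closed = true; }
    }

    private final Object lock = new Object();
    private volatile Manager mgr = new Manager();

    // Close-then-reassign: without the synchronized block, another thread
    // could acquire() between close() and the reassignment and hit the
    // closed manager -- the race described above.
    void reopen() {
        synchronized (lock) {
            mgr.close();
            mgr = new Manager();
        }
    }

    // acquire() waits until any in-progress reopen has finished reassigning.
    void search() {
        synchronized (lock) {
            mgr.acquire();
        }
    }

    boolean isOpen() { return !mgr.closed; }

    public static void main(String[] args) throws InterruptedException {
        SearcherSwapDemo demo = new SearcherSwapDemo();
        Thread t = new Thread(() -> { for (int i = 0; i < 10_000; i++) demo.reopen(); });
        t.start();
        for (int i = 0; i < 10_000; i++) demo.search();  // never sees a closed manager
        t.join();
        System.out.println("ok");
    }
}
```

Removing the synchronized blocks reintroduces the window in which acquire() can observe the closed manager.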

> AnalyzingInfixSuggester should close its IndexWriter by default at the end of 
> build()
> -
>
> Key: LUCENE-7564
> URL: https://issues.apache.org/jira/browse/LUCENE-7564
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Steve Rowe
>Assignee: Steve Rowe
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7564-fix-random-NRT-failures.patch, 
> LUCENE-7564-fix-random-NRT-failures.patch, LUCENE-7564.patch, 
> LUCENE-7564.patch
>
>
> From SOLR-6246, where AnalyzingInfixSuggester's write lock on its index is 
> causing trouble when reloading a Solr core:
> [~gsingers] wrote:
> bq. One suggestion that might minimize the impact: close the writer after 
> build
> [~varunthacker] wrote:
> {quote}
> This is what I am thinking -
> Create a Lucene issue in which {{AnalyzingInfixSuggester#build}} closes the 
> writer by default at the end.
> The {{add}} and {{update}} methods call {{ensureOpen}} and those who do 
> frequent real time updates directly via lucene won't see any slowdowns.
> [~mikemccand] - Would this approach have any major drawback from Lucene's 
> perspective? Else I can go ahead and tackle this in a Lucene issue
> {quote}
> [~mikemccand] wrote:
> bq. Fixing {{AnalyzingInfixSuggester}} to close the writer at the end of 
> build seems reasonable?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-EA] Lucene-Solr-6.x-Linux (64bit/jdk-9-ea+147) - Build # 2542 - Still Unstable!

2016-12-29 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2542/
Java: 64bit/jdk-9-ea+147 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.handler.TestSolrConfigHandlerCloud.test

Error Message:
Could not get expected value  'null' for path 'response/params/y/p' full 
output: {   "responseHeader":{ "status":0, "QTime":0},   "response":{   
  "znodeVersion":3, "params":{   "x":{ "a":"A val", 
"b":"B val", "":{"v":0}},   "y":{ "p":"P val", 
"q":"Q val", "":{"v":2},  from server:  
https://127.0.0.1:37308/collection1

Stack Trace:
java.lang.AssertionError: Could not get expected value  'null' for path 
'response/params/y/p' full output: {
  "responseHeader":{
"status":0,
"QTime":0},
  "response":{
"znodeVersion":3,
"params":{
  "x":{
"a":"A val",
"b":"B val",
"":{"v":0}},
  "y":{
"p":"P val",
"q":"Q val",
"":{"v":2},  from server:  https://127.0.0.1:37308/collection1
at 
__randomizedtesting.SeedInfo.seed([EACB5C2DE0687FA2:629F63F74E94125A]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.core.TestSolrConfigHandler.testForResponseElement(TestSolrConfigHandler.java:535)
at 
org.apache.solr.handler.TestSolrConfigHandlerCloud.testReqParams(TestSolrConfigHandlerCloud.java:273)
at 
org.apache.solr.handler.TestSolrConfigHandlerCloud.test(TestSolrConfigHandlerCloud.java:68)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:538)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 

[jira] [Commented] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786072#comment-15786072
 ] 

ASF subversion and git services commented on SOLR-9905:
---

Commit 0d830a7656e9b741970286dce5cb56d60df004f4 in lucene-solr's branch 
refs/heads/branch_6x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0d830a7 ]

SOLR-9905: Update CHANGES.txt


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-9905.patch
>
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="nullCount desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.
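
The "eat the tuples, emit only a count" idea can be sketched as follows (a hedged stand-in, not Solr's actual {{NullStream}} implementation; {{CountingNullStream}} and the {{String}}-as-tuple representation are invented for illustration):

```java
import java.util.Iterator;
import java.util.List;

// Wraps an underlying tuple stream, discards every tuple, and reports only
// how many were consumed -- so the raw cost of producing the stream can be
// measured in isolation from any reduce operation.
class CountingNullStream {
    private final Iterator<String> underlying; // stand-in for the wrapped stream

    CountingNullStream(List<String> tuples) {
        this.underlying = tuples.iterator();
    }

    long eatTuples() {
        long count = 0;
        while (underlying.hasNext()) {
            underlying.next(); // tuple is dropped on the floor
            count++;
        }
        return count;
    }
}
```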



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786071#comment-15786071
 ] 

ASF subversion and git services commented on SOLR-9905:
---

Commit 1a1b3af78d1ce147f5be3da09edc27729578d744 in lucene-solr's branch 
refs/heads/branch_6x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1a1b3af ]

SOLR-9905: Add NullStream to isolate the performance of the ExportWriter


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-9905.patch
>
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="nullCount desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-7607) LeafFieldComparator.setScorer() should throw IOException

2016-12-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-7607.
---
   Resolution: Fixed
Fix Version/s: 6.4
   master (7.0)

> LeafFieldComparator.setScorer() should throw IOException
> 
>
> Key: LUCENE-7607
> URL: https://issues.apache.org/jira/browse/LUCENE-7607
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Minor
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7607.patch, LUCENE-7607.patch
>
>
> Spinoff of LUCENE-5325.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5325) Move ValueSource and FunctionValues under core/

2016-12-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-5325.
---
   Resolution: Fixed
 Assignee: Alan Woodward
Fix Version/s: 6.4
   master (7.0)

I pushed the fix for 6.x and added a test there and on master.  Thanks for the 
reviews!  I'll open up some follow-up issues now.

> Move ValueSource and FunctionValues under core/
> ---
>
> Key: LUCENE-5325
> URL: https://issues.apache.org/jira/browse/LUCENE-5325
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Shai Erera
>Assignee: Alan Woodward
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-5325-6x-matchingbits.patch, LUCENE-5325-6x.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch
>
>
> Spinoff from LUCENE-5298: ValueSource and FunctionValues are abstract APIs 
> which exist under the queries/ module. That causes any module which wants to 
> depend on these APIs (but not necessarily on any of their actual 
> implementations!), to depend on the queries/ module. If we move these APIs 
> under core/, we can eliminate these dependencies and add some mock impls for 
> testing purposes.
> Quoting Robert from LUCENE-5298:
> {quote}
> we should eliminate the suggest/ dependencies on expressions and queries, the 
> expressions/ on queries, the grouping/ dependency on queries, the spatial/ 
> dependency on queries, its a mess.
> {quote}
> To add to that list, facet/ should not depend on queries too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5325) Move ValueSource and FunctionValues under core/

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786058#comment-15786058
 ] 

ASF subversion and git services commented on LUCENE-5325:
-

Commit a4335c0e9f01275c7d6e807813d9818b6e59d76e in lucene-solr's branch 
refs/heads/master from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a4335c0 ]

LUCENE-5325: Add test for missing values in sorts


> Move ValueSource and FunctionValues under core/
> ---
>
> Key: LUCENE-5325
> URL: https://issues.apache.org/jira/browse/LUCENE-5325
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Shai Erera
> Attachments: LUCENE-5325-6x-matchingbits.patch, LUCENE-5325-6x.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch
>
>
> Spinoff from LUCENE-5298: ValueSource and FunctionValues are abstract APIs 
> which exist under the queries/ module. That causes any module which wants to 
> depend on these APIs (but not necessarily on any of their actual 
> implementations!), to depend on the queries/ module. If we move these APIs 
> under core/, we can eliminate these dependencies and add some mock impls for 
> testing purposes.
> Quoting Robert from LUCENE-5298:
> {quote}
> we should eliminate the suggest/ dependencies on expressions and queries, the 
> expressions/ on queries, the grouping/ dependency on queries, the spatial/ 
> dependency on queries, its a mess.
> {quote}
> To add to that list, facet/ should not depend on queries too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5325) Move ValueSource and FunctionValues under core/

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786057#comment-15786057
 ] 

ASF subversion and git services commented on LUCENE-5325:
-

Commit 4e5a62140f4e90fc41fe91350c7787c8455f2887 in lucene-solr's branch 
refs/heads/branch_6x from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4e5a621 ]

LUCENE-5325: Check for matching bits in NumericDocValues to XValues converter


> Move ValueSource and FunctionValues under core/
> ---
>
> Key: LUCENE-5325
> URL: https://issues.apache.org/jira/browse/LUCENE-5325
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Shai Erera
> Attachments: LUCENE-5325-6x-matchingbits.patch, LUCENE-5325-6x.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch
>
>
> Spinoff from LUCENE-5298: ValueSource and FunctionValues are abstract APIs 
> which exist under the queries/ module. That causes any module which wants to 
> depend on these APIs (but not necessarily on any of their actual 
> implementations!), to depend on the queries/ module. If we move these APIs 
> under core/, we can eliminate these dependencies and add some mock impls for 
> testing purposes.
> Quoting Robert from LUCENE-5298:
> {quote}
> we should eliminate the suggest/ dependencies on expressions and queries, the 
> expressions/ on queries, the grouping/ dependency on queries, the spatial/ 
> dependency on queries, its a mess.
> {quote}
> To add to that list, facet/ should not depend on queries too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785977#comment-15785977
 ] 

ASF subversion and git services commented on SOLR-9905:
---

Commit 00723827ff5ad5c129d3d8487d2c64469ea03239 in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0072382 ]

SOLR-9905: Update CHANGES.txt


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-9905.patch
>
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="nullCount desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785970#comment-15785970
 ] 

ASF subversion and git services commented on SOLR-9905:
---

Commit 7dcb557ab73da7fb7af0e8f698895e28dde4bbca in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7dcb557 ]

SOLR-9905: Add NullStream to isolate the performance of the ExportWriter


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-9905.patch
>
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="nullCount desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_112) - Build # 18648 - Still Unstable!

2016-12-29 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18648/
Java: 32bit/jdk1.8.0_112 -server -XX:+UseG1GC

4 tests failed.
FAILED:  
org.apache.solr.cloud.CollectionsAPISolrJTest.testCreateCollectionWithPropertyParam

Error Message:
Error from server at https://127.0.0.1:44780/solr: create the collection time 
out:180s

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://127.0.0.1:44780/solr: create the collection time out:180s
at 
__randomizedtesting.SeedInfo.seed([EC2BBF99B998A360:EAB0CD683D466339]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:627)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:439)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:391)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1344)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1095)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1037)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
at 
org.apache.solr.cloud.CollectionsAPISolrJTest.testCreateCollectionWithPropertyParam(CollectionsAPISolrJTest.java:189)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (LUCENE-7595) RAMUsageTester in test-framework and static field checker no longer works with Java 9

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785955#comment-15785955
 ] 

ASF subversion and git services commented on LUCENE-7595:
-

Commit 80512ec412c20517341ddd50c78baf5270fcdc2f in lucene-solr's branch 
refs/heads/branch_6x from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=80512ec ]

LUCENE-7595: Fix bug with RamUsageTester incorrectly handling Iterables outside 
Java Runtime


> RAMUsageTester in test-framework and static field checker no longer works 
> with Java 9
> -
>
> Key: LUCENE-7595
> URL: https://issues.apache.org/jira/browse/LUCENE-7595
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/test
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>  Labels: Java9
> Fix For: 6.x, master (7.0), 6.4
>
> Attachments: LUCENE-7595.patch
>
>
> Lucene/Solr tests have a special rule that records memory usage in static 
> fields before and after test, so we can detect memory leaks. This check dives 
> into JDK classes (like java.lang.String to detect their size). As Java 9 
> build 148 completely forbids setAccessible on any runtime class, we have to 
> change or disable this check:
> - As a first step, I will only add the rule to LTC if we are running Java 8
> - As a second step we might investigate how to improve this
> [~rcmuir] had some ideas for the 2nd point:
> - Don't dive into classes from JDK modules and instead "estimate" the size 
> for some special cases (like Strings)
> - Disallow any static field in tests that is not final (constant) and points 
> to an Object except: Strings and native (wrapper) types.
> In addition we also have RAMUsageTester, that has similar problems and is 
> used to compare estimations of Lucene's calculations of 
> Codec/IndexWriter/IndexReader memory usage with reality. We should simply 
> disable those tests.
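
The failure mode described here can be probed with a small sketch (hypothetical helper, not part of RAMUsageTester): on Java 9+ without {{--add-opens}}, {{setAccessible}} on a field of a JDK runtime class throws {{InaccessibleObjectException}}, while a field of the caller's own class can still be opened.

```java
import java.lang.reflect.Field;

class AccessProbe {
    // Returns true if the named declared field could be made accessible
    // (always true for the caller's own classes; false for JDK runtime
    // classes on Java 9+ defaults, where setAccessible throws).
    static boolean canOpen(Class<?> clazz, String fieldName) {
        try {
            Field f = clazz.getDeclaredField(fieldName);
            f.setAccessible(true);
            return true;
        } catch (NoSuchFieldException | RuntimeException e) {
            // InaccessibleObjectException is a RuntimeException subclass,
            // so catching RuntimeException keeps this compilable on Java 8.
            return false;
        }
    }
}
```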



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7595) RAMUsageTester in test-framework and static field checker no longer works with Java 9

2016-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785952#comment-15785952
 ] 

ASF subversion and git services commented on LUCENE-7595:
-

Commit db9190db9372ae88a7392a7186397441ce070a96 in lucene-solr's branch 
refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=db9190d ]

LUCENE-7595: Fix bug with RamUsageTester incorrectly handling Iterables outside 
Java Runtime


> RAMUsageTester in test-framework and static field checker no longer works 
> with Java 9
> -
>
> Key: LUCENE-7595
> URL: https://issues.apache.org/jira/browse/LUCENE-7595
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/test
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>  Labels: Java9
> Fix For: 6.x, master (7.0), 6.4
>
> Attachments: LUCENE-7595.patch
>
>
> Lucene/Solr tests have a special rule that records memory usage in static 
> fields before and after test, so we can detect memory leaks. This check dives 
> into JDK classes (like java.lang.String to detect their size). As Java 9 
> build 148 completely forbids setAccessible on any runtime class, we have to 
> change or disable this check:
> - As a first step, I will only add the rule to LTC if we are running Java 8
> - As a second step we might investigate how to improve this
> [~rcmuir] had some ideas for the 2nd point:
> - Don't dive into classes from JDK modules and instead "estimate" the size 
> for some special cases (like Strings)
> - Disallow any static field in tests that is not final (constant) and points 
> to an Object except: Strings and native (wrapper) types.
> In addition we also have RAMUsageTester, that has similar problems and is 
> used to compare estimations of Lucene's calculations of 
> Codec/IndexWriter/IndexReader memory usage with reality. We should simply 
> disable those tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7603) Support Graph Token Streams in QueryBuilder

2016-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785941#comment-15785941
 ] 

ASF GitHub Bot commented on LUCENE-7603:


Github user mattweber commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/129#discussion_r94171262
  
--- Diff: 
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java
 ---
@@ -0,0 +1,294 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util.graph;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.BytesTermAttribute;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import 
org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionLengthAttribute;
+import org.apache.lucene.analysis.tokenattributes.TermToBytesRefAttribute;
+import org.apache.lucene.util.BytesRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.automaton.Automaton;
+import org.apache.lucene.util.automaton.FiniteStringsIterator;
+import org.apache.lucene.util.automaton.Operations;
+import org.apache.lucene.util.automaton.Transition;
+
+import static org.apache.lucene.util.automaton.Operations.DEFAULT_MAX_DETERMINIZED_STATES;
+
+/**
+ * Creates a list of {@link TokenStream} where each stream is the tokens that make up
+ * a finite string in the graph token stream.  To do this, the graph token stream is
+ * converted to an {@link Automaton} and from there we use a {@link FiniteStringsIterator}
+ * to collect the various token streams for each finite string.
+ */
+public class GraphTokenStreamFiniteStrings {
+  /* TODO:
+     Most of this is a combination of code from TermAutomatonQuery and
+     TokenStreamToTermAutomatonQuery. Would be good to make this so it could be shared. */
+  private final Automaton.Builder builder;
+  Automaton det;
+  private final Map<BytesRef, Integer> termToID = new HashMap<>();
+  private final Map<Integer, BytesRef> idToTerm = new HashMap<>();
+  private int anyTermID = -1;
+
+  public GraphTokenStreamFiniteStrings() {
+    this.builder = new Automaton.Builder();
+  }
+
+  private static class BytesRefArrayTokenStream extends TokenStream {
+    private final BytesTermAttribute termAtt = addAttribute(BytesTermAttribute.class);
+    private final BytesRef[] terms;
+    private int offset;
+
+    BytesRefArrayTokenStream(BytesRef[] terms) {
+      this.terms = terms;
+      offset = 0;
+    }
+
+    @Override
+    public boolean incrementToken() throws IOException {
+      if (offset < terms.length) {
+        clearAttributes();
+        termAtt.setBytesRef(terms[offset]);
+        offset = offset + 1;
+        return true;
+      }
+
+      return false;
+    }
+  }
+
+  /**
+   * Gets the list of finite string token streams from the given input graph token stream.
+   */
+  public List<TokenStream> getTokenStreams(final TokenStream in) throws IOException {
+    // build the automaton
+    build(in);
+
+    List<TokenStream> tokenStreams = new ArrayList<>();
+    final FiniteStringsIterator finiteStrings = new FiniteStringsIterator(det);
+    for (IntsRef string; (string = finiteStrings.next()) != null; ) {
+      final BytesRef[] tokens = new BytesRef[string.length];
+      for (int idx = string.offset, len = string.offset + string.length; idx < len; idx++) {
+        tokens[idx - string.offset] = idToTerm.get(string.ints[idx]);
+      }
+
+      tokenStreams.add(new BytesRefArrayTokenStream(tokens));
+    }
+
+    return tokenStreams;
+  }
+
+ 
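For readers following the review: the core idea in this class is "enumerate every token path through the graph". A minimal plain-Java sketch of that enumeration (this is not the Lucene API; {{Edge}}, {{collect}} and {{finiteStrings}} are illustrative names, and Lucene does this over a determinized {{Automaton}} with a {{FiniteStringsIterator}} rather than a Map-based DAG):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Conceptual sketch only: enumerate every "finite string" (token path)
 * through a small acyclic token graph, the way GraphTokenStreamFiniteStrings
 * walks the finite strings of its determinized automaton.
 */
public class FiniteStringsSketch {

  /** Edge in the token graph: a term plus the node it leads to. */
  record Edge(String term, int to) {}

  /** Depth-first enumeration of all token paths from 'node' to the end. */
  static void collect(Map<Integer, List<Edge>> graph, int node,
                      List<String> path, List<List<String>> out) {
    List<Edge> edges = graph.get(node);
    if (edges == null || edges.isEmpty()) {   // no outgoing edges: end state
      out.add(new ArrayList<>(path));
      return;
    }
    for (Edge e : edges) {
      path.add(e.term);
      collect(graph, e.to, path, out);
      path.remove(path.size() - 1);           // backtrack
    }
  }

  public static List<List<String>> finiteStrings(Map<Integer, List<Edge>> graph) {
    List<List<String>> out = new ArrayList<>();
    collect(graph, 0, new ArrayList<>(), out);
    return out;
  }

  public static void main(String[] args) {
    // "wi fi network" with the multi-token synonym "wifi" spanning two positions:
    // 0 -[wi]-> 1 -[fi]-> 2 -[network]-> 3, plus 0 -[wifi]-> 2.
    Map<Integer, List<Edge>> graph = Map.of(
        0, List.of(new Edge("wi", 1), new Edge("wifi", 2)),
        1, List.of(new Edge("fi", 2)),
        2, List.of(new Edge("network", 3)));
    System.out.println(finiteStrings(graph));
    // two finite strings: [wi, fi, network] and [wifi, network]
  }
}
```

Each resulting path plays the role of one of the per-finite-string token streams the class above hands back.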

[GitHub] lucene-solr pull request #129: LUCENE-7603: Support Graph Token Streams in Q...

2016-12-29 Thread mattweber
Github user mattweber commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/129#discussion_r94171262
  
--- Diff: 
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java
 ---
@@ -0,0 +1,294 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util.graph;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.BytesTermAttribute;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionLengthAttribute;
+import org.apache.lucene.analysis.tokenattributes.TermToBytesRefAttribute;
+import org.apache.lucene.util.BytesRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.automaton.Automaton;
+import org.apache.lucene.util.automaton.FiniteStringsIterator;
+import org.apache.lucene.util.automaton.Operations;
+import org.apache.lucene.util.automaton.Transition;
+
+import static org.apache.lucene.util.automaton.Operations.DEFAULT_MAX_DETERMINIZED_STATES;
+
+/**
+ * Creates a list of {@link TokenStream} where each stream is the tokens that make up
+ * a finite string in the graph token stream.  To do this, the graph token stream is
+ * converted to an {@link Automaton} and from there we use a {@link FiniteStringsIterator}
+ * to collect the various token streams for each finite string.
+ */
+public class GraphTokenStreamFiniteStrings {
+  /* TODO:
+     Most of this is a combination of code from TermAutomatonQuery and
+     TokenStreamToTermAutomatonQuery. Would be good to make this so it could be shared. */
+  private final Automaton.Builder builder;
+  Automaton det;
+  private final Map<BytesRef, Integer> termToID = new HashMap<>();
+  private final Map<Integer, BytesRef> idToTerm = new HashMap<>();
+  private int anyTermID = -1;
+
+  public GraphTokenStreamFiniteStrings() {
+    this.builder = new Automaton.Builder();
+  }
+
+  private static class BytesRefArrayTokenStream extends TokenStream {
+    private final BytesTermAttribute termAtt = addAttribute(BytesTermAttribute.class);
+    private final BytesRef[] terms;
+    private int offset;
+
+    BytesRefArrayTokenStream(BytesRef[] terms) {
+      this.terms = terms;
+      offset = 0;
+    }
+
+    @Override
+    public boolean incrementToken() throws IOException {
+      if (offset < terms.length) {
+        clearAttributes();
+        termAtt.setBytesRef(terms[offset]);
+        offset = offset + 1;
+        return true;
+      }
+
+      return false;
+    }
+  }
+
+  /**
+   * Gets the list of finite string token streams from the given input graph token stream.
+   */
+  public List<TokenStream> getTokenStreams(final TokenStream in) throws IOException {
+    // build the automaton
+    build(in);
+
+    List<TokenStream> tokenStreams = new ArrayList<>();
+    final FiniteStringsIterator finiteStrings = new FiniteStringsIterator(det);
+    for (IntsRef string; (string = finiteStrings.next()) != null; ) {
+      final BytesRef[] tokens = new BytesRef[string.length];
+      for (int idx = string.offset, len = string.offset + string.length; idx < len; idx++) {
+        tokens[idx - string.offset] = idToTerm.get(string.ints[idx]);
+      }
+
+      tokenStreams.add(new BytesRefArrayTokenStream(tokens));
+    }
+
+    return tokenStreams;
+  }
+
+  private void build(final TokenStream in) throws IOException {
+    if (det != null) {
+      throw new IllegalStateException("Automaton already built");
+    }
+
+    final TermToBytesRefAttribute termBytesAtt = 

[jira] [Comment Edited] (SOLR-9898) Documentation for metrics collection and /admin/metrics

2016-12-29 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785883#comment-15785883
 ] 

Cassandra Targett edited comment on SOLR-9898 at 12/29/16 7:00 PM:
---

I've started a page in the "drafts" area of the Solr Ref Guide: 
https://cwiki.apache.org/confluence/display/solr/Metrics+Reporting.

_edit_: the name can be changed if there's a better one.


was (Author: ctargett):
I've started a page in the "drafts" area of the Solr Ref Guide: 
https://cwiki.apache.org/confluence/display/solr/Metrics+Reporting.

> Documentation for metrics collection and /admin/metrics
> ---
>
> Key: SOLR-9898
> URL: https://issues.apache.org/jira/browse/SOLR-9898
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Affects Versions: master (7.0), 6.4
>Reporter: Andrzej Bialecki 
>Assignee: Cassandra Targett
>
> Draft documentation follows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-9898) Documentation for metrics collection and /admin/metrics

2016-12-29 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785883#comment-15785883
 ] 

Cassandra Targett commented on SOLR-9898:
-

I've started a page in the "drafts" area of the Solr Ref Guide: 
https://cwiki.apache.org/confluence/display/solr/Metrics+Reporting.

> Documentation for metrics collection and /admin/metrics
> ---
>
> Key: SOLR-9898
> URL: https://issues.apache.org/jira/browse/SOLR-9898
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Affects Versions: master (7.0), 6.4
>Reporter: Andrzej Bialecki 
>Assignee: Cassandra Targett
>
> Draft documentation follows.






[jira] [Commented] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785873#comment-15785873
 ] 

Mike Drob commented on LUCENE-7608:
---

Neat feature! I think you missed a {{m...@elastic.co}} and 
{{nkn...@apache.org}}.

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Assigned] (SOLR-9898) Documentation for metrics collection and /admin/metrics

2016-12-29 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett reassigned SOLR-9898:
---

Assignee: Cassandra Targett

> Documentation for metrics collection and /admin/metrics
> ---
>
> Key: SOLR-9898
> URL: https://issues.apache.org/jira/browse/SOLR-9898
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Affects Versions: master (7.0), 6.4
>Reporter: Andrzej Bialecki 
>Assignee: Cassandra Targett
>
> Draft documentation follows.






[jira] [Updated] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-9905:
-
Description: 
The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

The NullStream simply iterates its internal stream and eats the tuples. It 
returns a single Tuple from each worker with the number of Tuples processed. 
The idea is to iterate the stream without additional overhead so the 
performance of the underlying stream can be isolated.

Sample syntax:
{code}
parallel(collection2, workers=7, sort="nullCount desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

In the example above the NullStream is sent to 7 workers. Each worker will 
iterate the search() expression and the NullStream will eat the tuples so the 
raw performance of the search() can be understood.

  was:
The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

The NullStream simply iterates its internal stream and eats the tuples. It 
returns a single Tuple from each worker with the number of Tuples processed. 
The idea is to iterate the stream without additional overhead so the 
performance of the underlying stream can be isolated.

Sample syntax:
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

In the example above the NullStream is sent to 7 workers. Each worker will 
iterate the search() expression and the NullStream will eat the tuples so the 
raw performance of the search() can be understood.


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-9905.patch
>
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="nullCount desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.
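For anyone reading along, the decorator idea described above can be sketched in plain Java (a conceptual sketch, not Solr's TupleStream API; {{drain}} and the Map-based tuples are illustrative):

```java
import java.util.Iterator;
import java.util.Map;

/**
 * Conceptual sketch only: a "null" decorator that eats every tuple from its
 * inner stream and emits a single count tuple, so the cost of producing the
 * tuples can be measured in isolation from any reduce-side work.
 */
public class NullStreamSketch {

  /** Drain the inner stream and return one tuple holding the count. */
  public static Map<String, Object> drain(Iterator<Map<String, Object>> inner) {
    long count = 0;
    while (inner.hasNext()) {   // iterate without doing any per-tuple work
      inner.next();
      count++;
    }
    return Map.of("nullCount", count);
  }

  public static void main(String[] args) {
    // Stand-in for the search() stream: 1000 synthetic id tuples.
    Iterator<Map<String, Object>> search =
        java.util.stream.LongStream.range(0, 1000)
            .mapToObj(i -> Map.<String, Object>of("id", i))
            .iterator();
    System.out.println(drain(search));   // prints {nullCount=1000}
  }
}
```

In the parallel() expression above, each of the 7 workers would return one such count tuple, and the {{sort="nullCount desc"}} merges them.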






[jira] [Updated] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-9905:
-
Attachment: SOLR-9905.patch

> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-9905.patch
>
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="count desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.






[jira] [Updated] (LUCENE-7609) Refactor expressions module to use DoubleValuesSource

2016-12-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-7609:
--
Attachment: LUCENE-7609.patch

Patch.  The dependency on queries is kept for the moment, and the ValueSource 
versions of methods deprecated; these will be removed in a subsequent patch.

> Refactor expressions module to use DoubleValuesSource
> -
>
> Key: LUCENE-7609
> URL: https://issues.apache.org/jira/browse/LUCENE-7609
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
> Attachments: LUCENE-7609.patch
>
>
> With DoubleValuesSource in core, we can refactor the expressions module to 
> use these instead of ValueSource, and remove the dependency of expressions on 
> the queries module in master.






[jira] [Created] (LUCENE-7609) Refactor expressions module to use DoubleValuesSource

2016-12-29 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-7609:
-

 Summary: Refactor expressions module to use DoubleValuesSource
 Key: LUCENE-7609
 URL: https://issues.apache.org/jira/browse/LUCENE-7609
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward


With DoubleValuesSource in core, we can refactor the expressions module to use 
these instead of ValueSource, and remove the dependency of expressions on the 
queries module in master.






[jira] [Updated] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2016-12-29 Thread Pushkar Raste (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pushkar Raste updated SOLR-9906:

Attachment: SOLR-PeerSyncVsReplicationTest.diff

Here is a patch. 

I have also fixed bugs I came across in the tests.

> Use better check to validate if node recovered via PeerSync or Replication
> --
>
> Key: SOLR-9906
> URL: https://issues.apache.org/jira/browse/SOLR-9906
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Priority: Minor
> Attachments: SOLR-PeerSyncVsReplicationTest.diff
>
>
> Tests {{LeaderFailureAfterFreshStartTest}} and {{PeerSyncReplicationTest}} 
> currently rely on the number of requests made to the leader's replication 
> handler to check whether a node recovered via PeerSync or replication. This 
> check is not very reliable and we have seen failures in the past. 
> While tinkering with different ways to write a better test I found 
> [SOLR-9859|SOLR-9859]. Now that SOLR-9859 is fixed, here is an idea for a 
> better way to distinguish recovery via PeerSync vs. replication: 
> * For {{PeerSyncReplicationTest}}, if the node successfully recovers via 
> PeerSync, then the file {{replication.properties}} should not exist
> * For {{LeaderFailureAfterFreshStartTest}}, if the freshly replicated node 
> does not go into replication recovery after the leader failure, the contents 
> of {{replication.properties}} should not change 
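The proposed check boils down to a file-existence test on the replica's data directory. A minimal sketch of the idea (paths and names here are illustrative, not Solr's actual test helpers):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Sketch of the test idea: a PeerSync-only recovery leaves no
 * replication.properties in the replica's data directory, while a full
 * replication recovery creates (or rewrites) it.
 */
public class RecoveryCheckSketch {

  /** True if the node appears to have recovered via full replication. */
  static boolean recoveredViaReplication(Path dataDir) {
    return Files.exists(dataDir.resolve("replication.properties"));
  }

  public static void main(String[] args) throws IOException {
    Path dataDir = Files.createTempDirectory("replica-data");
    System.out.println(recoveredViaReplication(dataDir));  // false: PeerSync case
    Files.createFile(dataDir.resolve("replication.properties"));
    System.out.println(recoveredViaReplication(dataDir));  // true: replication case
  }
}
```

For the leader-failure case, comparing the file's contents (or last-modified stamp) before and after the failure would distinguish "unchanged" from "re-replicated".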






[jira] [Created] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2016-12-29 Thread Pushkar Raste (JIRA)
Pushkar Raste created SOLR-9906:
---

 Summary: Use better check to validate if node recovered via 
PeerSync or Replication
 Key: SOLR-9906
 URL: https://issues.apache.org/jira/browse/SOLR-9906
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Pushkar Raste
Priority: Minor


Tests {{LeaderFailureAfterFreshStartTest}} and {{PeerSyncReplicationTest}} 
currently rely on the number of requests made to the leader's replication 
handler to check whether a node recovered via PeerSync or replication. This 
check is not very reliable and we have seen failures in the past. 

While tinkering with different ways to write a better test I found 
[SOLR-9859|SOLR-9859]. Now that SOLR-9859 is fixed, here is an idea for a 
better way to distinguish recovery via PeerSync vs. replication: 

* For {{PeerSyncReplicationTest}}, if the node successfully recovers via 
PeerSync, then the file {{replication.properties}} should not exist

* For {{LeaderFailureAfterFreshStartTest}}, if the freshly replicated node does 
not go into replication recovery after the leader failure, the contents of 
{{replication.properties}} should not change 







[GitHub] lucene-solr pull request #131: Fix peer sync replcation test check

2016-12-29 Thread praste
GitHub user praste opened a pull request:

https://github.com/apache/lucene-solr/pull/131

Fix peer sync replcation test check



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/praste/lucene-solr 
fixPeerSyncReplcationTestCheck

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/131.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #131


commit 6073e39c99807305b19d77f6cc7b3a44b799ade4
Author: Pushkar Raste 
Date:   2016-12-08T22:36:36Z

More reliable check for PeerSync vs Replication

commit 3d340c502ee850ae1945891b88c83ae7ba7b4c42
Author: Pushkar Raste 
Date:   2016-12-09T17:32:25Z

Merge branch 'master' of https://github.com/apache/lucene-solr into 
fixPeerSyncReplcationTestCheck

commit 875151ec3b8fd9def06d5b99c4bdb02f705905ab
Author: Pushkar Raste 
Date:   2016-12-09T18:56:07Z

Merge branch 'master' of https://github.com/apache/lucene-solr into 
fixPeerSyncReplcationTestCheck

commit b47fd6acd30f5b9cbf7846d3e9edcef7c0182286
Author: Pushkar Raste 
Date:   2016-12-14T17:38:05Z

Fixing replication test

commit ac0b5db54cc72d21d289bb9cb9d97d1c474b1b3f
Author: Pushkar Raste 
Date:   2016-12-14T20:59:26Z

Merge branch 'master' of https://github.com/apache/lucene-solr into 
fixPeerSyncReplcationTestCheck

commit 9e75e1d2f4bcaf9811a2deb727aa8b7bfeda1f4a
Author: Pushkar Raste 
Date:   2016-12-16T17:14:31Z

Better way to identify replication vs peersync for the 
PeerSyncReplicationTest

commit a7596aac37ca2bdf4be0f2493404164afde98c31
Author: Pushkar Raste 
Date:   2016-12-16T17:14:48Z

Merge branch 'master' of https://github.com/apache/lucene-solr into 
fixPeerSyncReplcationTestCheck

commit b80c65ae4fd43463f661cca7d50f5d54ff6fea93
Author: Pushkar Raste 
Date:   2016-12-29T17:43:35Z

Merging master

commit 2f1c5a8f9b2187bbda4673e66e909e5f223e1432
Author: Pushkar Raste 
Date:   2016-12-29T18:05:47Z

Use better replication check




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[JENKINS-EA] Lucene-Solr-6.x-Linux (32bit/jdk-9-ea+147) - Build # 2541 - Unstable!

2016-12-29 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2541/
Java: 32bit/jdk-9-ea+147 -client -XX:+UseG1GC

11 tests failed.
FAILED:  
org.apache.lucene.codecs.asserting.TestAssertingNormsFormat.testRamBytesUsed

Error Message:
Actual RAM usage 344, but got 6857, -1893.3139534883721% error

Stack Trace:
java.lang.AssertionError: Actual RAM usage 344, but got 6857, 
-1893.3139534883721% error
at 
__randomizedtesting.SeedInfo.seed([18F35E00A2FC613F:EA504C4068837E69]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.lucene.index.BaseIndexFileFormatTestCase.testRamBytesUsed(BaseIndexFileFormatTestCase.java:279)
at 
org.apache.lucene.index.BaseNormsFormatTestCase.testRamBytesUsed(BaseNormsFormatTestCase.java:46)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:538)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.base/java.lang.Thread.run(Thread.java:844)


FAILED:  
org.apache.lucene.codecs.blocktreeords.TestOrdsBlockTree.testRamBytesUsed

Error Message:
Actual RAM usage 376, but got 2063, -448.67021276595744% error

Stack Trace:
java.lang.AssertionError: Actual RAM usage 376, but got 2063, 
-448.67021276595744% error
at 
__randomizedtesting.SeedInfo.seed([DE9F86D6A81BE32F:2C3C94966264FC79]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 

[jira] [Commented] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785686#comment-15785686
 ] 

David Smiley commented on LUCENE-7608:
--

LOL of course; thanks ;-)

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Commented] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785682#comment-15785682
 ] 

Mark Miller commented on LUCENE-7608:
-

Also, I generally favored using the most recent username as the canonical name. 
Individuals can change this for themselves if they want something else.

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Commented] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785679#comment-15785679
 ] 

Mark Miller commented on LUCENE-7608:
-

I have another check to do, but I think that covers the issues I see.

The file is placed in the git checkout root folder.
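For the "how is this used?" question above: git reads {{.mailmap}} from the repository root and applies it in {{git log}} and {{git shortlog}} output, mapping alternate author names and emails onto one canonical identity. A minimal example (placeholder entries, not taken from the attached file):

```
# Canonical Name <canonical-email>   Commit Name <commit-email>
Jane Doe <jane@apache.org>           Jane Doe <jane@oldmail.example>
Jane Doe <jane@apache.org>           jdoe <jdoe@users.noreply.github.com>
```

With entries like these, commits recorded under either old identity show up as "Jane Doe <jane@apache.org>" in {{git shortlog -se}}.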

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Commented] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785664#comment-15785664
 ] 

Mark Miller commented on LUCENE-7608:
-

https://www.google.com/search?q=git+mailmap
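
In short: git's .mailmap maps the name/email recorded on a commit to a single canonical identity, one mapping per line, and git shortlog, log, and blame apply it automatically when the file sits at the repository root. A minimal illustrative example (the names and addresses below are hypothetical, not taken from this issue's attachment):

```
# Canonical Name <canonical@email>   Name/email as recorded in commits
Jane Doe <jane@example.com>          jdoe <jdoe@old-host.example>
Jane Doe <jane@example.com>          Jane Doe <jane.doe@example.org>
```

With this in place, `git shortlog -sne` reports one "Jane Doe" entry instead of three.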

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Commented] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785656#comment-15785656
 ] 

David Smiley commented on LUCENE-7608:
--

How is this used?

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Updated] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-7608:

Attachment: (was: .mailmap)

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Updated] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-7608:

Attachment: .mailmap

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[jira] [Updated] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-7608:

Attachment: .mailmap

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap, .mailmap
>
>







[JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+147) - Build # 18647 - Still Unstable!

2016-12-29 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18647/
Java: 64bit/jdk-9-ea+147 -XX:+UseCompressedOops -XX:+UseParallelGC

10 tests failed.
FAILED:  
org.apache.lucene.codecs.blockterms.TestVarGapFixedIntervalPostingsFormat.testRamBytesUsed

Error Message:
Actual RAM usage 240, but got 2570, -970.83334% error

Stack Trace:
java.lang.AssertionError: Actual RAM usage 240, but got 2570, 
-970.83334% error
at 
__randomizedtesting.SeedInfo.seed([E7C8151E5DAC77AC:156B075E97D368FA]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.lucene.index.BaseIndexFileFormatTestCase.testRamBytesUsed(BaseIndexFileFormatTestCase.java:279)
at 
org.apache.lucene.index.BasePostingsFormatTestCase.testRamBytesUsed(BasePostingsFormatTestCase.java:85)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:538)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.base/java.lang.Thread.run(Thread.java:844)
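
For reference, the percentage in these failure messages is consistent with the value returned by ramBytesUsed() being compared against the measured usage, relative to the measured value. The formula below is inferred from the numbers in the message, not quoted from the test source:

```python
# Reproduce the "-970.83334% error" figure from
# "Actual RAM usage 240, but got 2570": the error is expressed
# relative to the measured (actual) usage.
actual_bytes = 240      # measured RAM usage
reported_bytes = 2570   # value reported by ramBytesUsed()
error_pct = 100.0 * (actual_bytes - reported_bytes) / actual_bytes
print(f"{error_pct:.5f}% error")  # -970.83333% error
```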


FAILED:  
org.apache.lucene.codecs.asserting.TestAssertingNormsFormat.testRamBytesUsed

Error Message:
Actual RAM usage 384, but got 7433, -1835.67708% error

Stack Trace:
java.lang.AssertionError: Actual RAM usage 384, but got 7433, 
-1835.67708% error
at 
__randomizedtesting.SeedInfo.seed([BB53F54EB1E051FB:49F0E70E7B9F4EAD]:0)
at org.junit.Assert.fail(Assert.java:93)
at 

[jira] [Updated] (SOLR-9836) Add more graceful recovery steps when failing to create SolrCore

2016-12-29 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated SOLR-9836:

Attachment: SOLR-9836.patch

version 4:
* Rebased patch onto master, incorporating changes from SOLR-9859.
* Addressed failing tests.

> Add more graceful recovery steps when failing to create SolrCore
> 
>
> Key: SOLR-9836
> URL: https://issues.apache.org/jira/browse/SOLR-9836
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Mike Drob
> Attachments: SOLR-9836.patch, SOLR-9836.patch, SOLR-9836.patch, 
> SOLR-9836.patch
>
>
> I have seen several cases where there is a zero-length segments_n file. We 
> haven't identified the root cause of these issues (possibly a poorly timed 
> crash during replication?) but if there is another node available then Solr 
> should be able to recover from this situation. Currently, we log and give up 
> on loading that core, leaving the user to manually intervene.






[jira] [Commented] (SOLR-9891) bin/solr cannot create an empty Znode which is useful for chroot

2016-12-29 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785599#comment-15785599
 ] 

Erick Erickson commented on SOLR-9891:
--

I implemented this with 'mkroot'. Works on my machine, but still needs someone 
to take a few minutes and try it on Windows before I can commit it.

> bin/solr cannot create an empty Znode which is useful for chroot
> 
>
> Key: SOLR-9891
> URL: https://issues.apache.org/jira/browse/SOLR-9891
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: SOLR-9891.patch
>
>
> This came to my attention just now. To use a different root in Solr, we say 
> this in the ref guide:
> IMPORTANT: If your ZooKeeper connection string uses a chroot, such as 
> localhost:2181/solr, then you need to bootstrap the /solr znode before 
> launching SolrCloud using the bin/solr script. To do this, you need to use 
> the zkcli.sh script shipped with Solr, such as:
> server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr -cmd 
> bootstrap -solrhome server/solr
> I think all this really does is create an empty /solr ZNode. We're trying to 
> move the common usages of the zkcli scripts to bin/solr so I tried making 
> this work.
> It's clumsy. If I try to copy up an empty directory to /solr nothing happens. 
> I got it to work by copying file:README.txt to zk:/solr/nonsense then delete 
> zk:/solr/nonsense. Ugly.
> I don't want to get into reproducing the whole Unix shell file manipulation 
> commands with mkdir, touch, etc.
> I guess we already have special 'upconfig' and 'downconfig' commands, so 
> maybe a specific command for this like 'mkroot' would be OK. Do people have 
> opinions about this as opposed to 'mkdir'? I'm tending to mkdir.
> Or have the cp command handle empty directories, but mkroot/mkdir seems more 
> intuitive if not as generic.






[jira] [Commented] (LUCENE-7603) Support Graph Token Streams in QueryBuilder

2016-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785544#comment-15785544
 ] 

ASF GitHub Bot commented on LUCENE-7603:


Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/129#discussion_r94148633
  
--- Diff: 
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java
 ---
@@ -0,0 +1,294 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util.graph;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.BytesTermAttribute;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import 
org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionLengthAttribute;
+import org.apache.lucene.analysis.tokenattributes.TermToBytesRefAttribute;
+import org.apache.lucene.util.BytesRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.automaton.Automaton;
+import org.apache.lucene.util.automaton.FiniteStringsIterator;
+import org.apache.lucene.util.automaton.Operations;
+import org.apache.lucene.util.automaton.Transition;
+
+import static 
org.apache.lucene.util.automaton.Operations.DEFAULT_MAX_DETERMINIZED_STATES;
+
+/**
+ * Creates a list of {@link TokenStream} where each stream is the tokens 
that make up a finite string in graph token stream.  To do this,
+ * the graph token stream is converted to an {@link Automaton} and from 
there we use a {@link FiniteStringsIterator} to collect the various
+ * token streams for each finite string.
+ */
+public class GraphTokenStreamFiniteStrings {
+  /* TODO:
+ Most of this is a combination of code from TermAutomatonQuery and 
TokenStreamToTermAutomatonQuery. Would be
+ good to make this so it could be shared. */
+  private final Automaton.Builder builder;
+  Automaton det;
+  private final Map<BytesRef, Integer> termToID = new HashMap<>();
+  private final Map<Integer, BytesRef> idToTerm = new HashMap<>();
+  private int anyTermID = -1;
+
+  public GraphTokenStreamFiniteStrings() {
+this.builder = new Automaton.Builder();
+  }
+
+  private static class BytesRefArrayTokenStream extends TokenStream {
+private final BytesTermAttribute termAtt = 
addAttribute(BytesTermAttribute.class);
+private final BytesRef[] terms;
+private int offset;
+
+BytesRefArrayTokenStream(BytesRef[] terms) {
+  this.terms = terms;
+  offset = 0;
+}
+
+@Override
+public boolean incrementToken() throws IOException {
+  if (offset < terms.length) {
+clearAttributes();
+termAtt.setBytesRef(terms[offset]);
+offset = offset + 1;
+return true;
+  }
+
+  return false;
+}
+  }
+
+  /**
+   * Gets the list of finite string token streams from the given input 
graph token stream.
+   */
+  public List<TokenStream> getTokenStreams(final TokenStream in) throws 
IOException {
+// build automaton
+build(in);
+
+List<TokenStream> tokenStreams = new ArrayList<>();
+final FiniteStringsIterator finiteStrings = new 
FiniteStringsIterator(det);
+for (IntsRef string; (string = finiteStrings.next()) != null; ) {
+  final BytesRef[] tokens = new BytesRef[string.length];
+  for (int idx = string.offset, len = string.offset + string.length; 
idx < len; idx++) {
+tokens[idx - string.offset] = idToTerm.get(string.ints[idx]);
+  }
+
+  tokenStreams.add(new BytesRefArrayTokenStream(tokens));
+}
+
+return tokenStreams;
+  }
+
+  

[GitHub] lucene-solr pull request #129: LUCENE-7603: Support Graph Token Streams in Q...

2016-12-29 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/129#discussion_r94148633
  
--- Diff: 
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java
 ---
@@ -0,0 +1,294 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util.graph;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.BytesTermAttribute;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import 
org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionLengthAttribute;
+import org.apache.lucene.analysis.tokenattributes.TermToBytesRefAttribute;
+import org.apache.lucene.util.BytesRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.automaton.Automaton;
+import org.apache.lucene.util.automaton.FiniteStringsIterator;
+import org.apache.lucene.util.automaton.Operations;
+import org.apache.lucene.util.automaton.Transition;
+
+import static 
org.apache.lucene.util.automaton.Operations.DEFAULT_MAX_DETERMINIZED_STATES;
+
+/**
+ * Creates a list of {@link TokenStream} where each stream is the tokens 
that make up a finite string in graph token stream.  To do this,
+ * the graph token stream is converted to an {@link Automaton} and from 
there we use a {@link FiniteStringsIterator} to collect the various
+ * token streams for each finite string.
+ */
+public class GraphTokenStreamFiniteStrings {
+  /* TODO:
+ Most of this is a combination of code from TermAutomatonQuery and 
TokenStreamToTermAutomatonQuery. Would be
+ good to make this so it could be shared. */
+  private final Automaton.Builder builder;
+  Automaton det;
+  private final Map<BytesRef, Integer> termToID = new HashMap<>();
+  private final Map<Integer, BytesRef> idToTerm = new HashMap<>();
+  private int anyTermID = -1;
+
+  public GraphTokenStreamFiniteStrings() {
+this.builder = new Automaton.Builder();
+  }
+
+  private static class BytesRefArrayTokenStream extends TokenStream {
+private final BytesTermAttribute termAtt = 
addAttribute(BytesTermAttribute.class);
+private final BytesRef[] terms;
+private int offset;
+
+BytesRefArrayTokenStream(BytesRef[] terms) {
+  this.terms = terms;
+  offset = 0;
+}
+
+@Override
+public boolean incrementToken() throws IOException {
+  if (offset < terms.length) {
+clearAttributes();
+termAtt.setBytesRef(terms[offset]);
+offset = offset + 1;
+return true;
+  }
+
+  return false;
+}
+  }
+
+  /**
+   * Gets the list of finite string token streams from the given input 
graph token stream.
+   */
+  public List<TokenStream> getTokenStreams(final TokenStream in) throws 
IOException {
+// build automaton
+build(in);
+
+List<TokenStream> tokenStreams = new ArrayList<>();
+final FiniteStringsIterator finiteStrings = new 
FiniteStringsIterator(det);
+for (IntsRef string; (string = finiteStrings.next()) != null; ) {
+  final BytesRef[] tokens = new BytesRef[string.length];
+  for (int idx = string.offset, len = string.offset + string.length; 
idx < len; idx++) {
+tokens[idx - string.offset] = idToTerm.get(string.ints[idx]);
+  }
+
+  tokenStreams.add(new BytesRefArrayTokenStream(tokens));
+}
+
+return tokenStreams;
+  }
+
+  private void build(final TokenStream in) throws IOException {
+if (det != null) {
+  throw new IllegalStateException("Automaton already built");
+}
+
+final TermToBytesRefAttribute termBytesAtt = 
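
The core idea of the class in this diff can be sketched outside Java: treat the token graph as a DAG and enumerate every path (finite string) through it; each path becomes one flat token stream. The following is an illustrative Python sketch, not the Lucene implementation:

```python
# Hypothetical illustration: enumerate all finite strings (paths)
# through a small token graph, e.g. "wifi" analyzed as either the
# single token "wifi" or the pair "wi" + "fi".
def finite_strings(graph, node=0, path=()):
    """graph: {node: [(token, next_node), ...]}; yields token tuples."""
    edges = graph.get(node)
    if not edges:               # sink node: one complete finite string
        yield path
        return
    for token, nxt in edges:
        yield from finite_strings(graph, nxt, path + (token,))

# "wifi network", where "wifi" may also be "wi fi":
g = {0: [("wifi", 2), ("wi", 1)], 1: [("fi", 2)], 2: [("network", 3)], 3: []}
print(sorted(finite_strings(g)))
# [('wi', 'fi', 'network'), ('wifi', 'network')]
```

Each resulting tuple corresponds to one of the token streams that getTokenStreams() hands back to the query builder.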

[jira] [Updated] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-9905:
-
Description: 
The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

The NullStream simply iterates its internal stream and eats the tuples. It 
returns a single Tuple from each worker with the number of Tuples processed. 
The idea is to iterate the stream without additional overhead so the 
performance of the underlying stream can be isolated.

Sample syntax:
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

In the example above the NullStream is sent to 7 workers. Each worker will 
iterate the search() expression and the NullStream will eat the tuples so the 
raw performance of the search() can be understood.
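
The contract described above is small enough to sketch in a few lines. This is conceptual Python, not Solr's actual Java TupleStream API:

```python
import time

def null_stream(inner):
    """Drain every tuple from the inner stream and return one summary
    tuple, so the producer's raw throughput can be measured without
    any downstream reduce work. Conceptual sketch only."""
    start = time.time()
    count = sum(1 for _ in inner)  # eat the tuples
    return {"nullCount": count, "timer": time.time() - start}

# Stand-in for the search(collection1, qt="/export", ...) source:
summary = null_stream({"id": i} for i in range(100_000))
print(summary["nullCount"])  # 100000
```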

> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="count desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.






[jira] [Issue Comment Deleted] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-9905:
-
Comment: was deleted

(was: The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

The NullStream simply iterates its internal stream and eats the tuples. It 
returns a single Tuple from each worker with the number of Tuples processed. 
The idea is to iterate the stream without additional overhead so the 
performance of the underlying stream can be isolated.

Sample syntax:
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

In the example above the NullStream is sent to 7 workers. Each worker will 
iterate the search() expression and the NullStream will eat the tuples so the 
raw performance of the search() can be understood.)

> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>
> The NullStream is a utility function to test the raw performance of the 
> ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
> streaming MapReduce operations. The NullStream will allow developers to test 
> the performance of the shuffling (Sorting, Partitioning, Exporting) in 
> isolation from the reduce operation (Rollup, Join, Group, etc..). 
> The NullStream simply iterates its internal stream and eats the tuples. It 
> returns a single Tuple from each worker with the number of Tuples processed. 
> The idea is to iterate the stream without additional overhead so the 
> performance of the underlying stream can be isolated.
> Sample syntax:
> {code}
> parallel(collection2, workers=7, sort="count desc", 
>   null(search(collection1, 
>q=*:*, 
>fl="id", 
>sort="id desc", 
>qt="/export", 
>wt="javabin", 
>partitionKeys=id)))
> {code}
> In the example above the NullStream is sent to 7 workers. Each worker will 
> iterate the search() expression and the NullStream will eat the tuples so the 
> raw performance of the search() can be understood.






[jira] [Comment Edited] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785517#comment-15785517
 ] 

Joel Bernstein edited comment on SOLR-9905 at 12/29/16 3:38 PM:


The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

The NullStream simply iterates its internal stream and eats the tuples. It 
returns a single Tuple from each worker with the number of Tuples processed. 
The idea is to iterate the stream without additional overhead so the 
performance of the underlying stream can be isolated.

Sample syntax:
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

In the example above the NullStream is sent to 7 workers. Each worker will 
iterate the search() expression and the NullStream will eat the tuples so the 
raw performance of the search() can be understood.


was (Author: joel.bernstein):
The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

The NullStream simply iterates its internal stream and eats the tuples. It 
returns a single Tuple from each worker with the number of Tuples processed. 
The idea is to iterate the stream without additional overhead so the 
performance of the underlying stream can be isolated.

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>







[jira] [Comment Edited] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785517#comment-15785517
 ] 

Joel Bernstein edited comment on SOLR-9905 at 12/29/16 3:30 PM:


The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

The NullStream simply iterates its internal stream and eats the tuples. It 
returns a single Tuple from each worker with the number of Tuples processed. 
The idea is to iterate the stream without additional overhead so the 
performance of the underlying stream can be isolated.

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}



was (Author: joel.bernstein):
The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc..). 

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785517#comment-15785517
 ] 

Joel Bernstein edited comment on SOLR-9905 at 12/29/16 3:25 PM:


The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
streaming MapReduce operations. The NullStream will allow developers to test 
the performance of the shuffling (Sorting, Partitioning, Exporting) in 
isolation from the reduce operation (Rollup, Join, Group, etc.). 

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}



was (Author: joel.bernstein):
The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
MapReduce operations. The NullStream will allow developers to test the 
performance of the shuffling (Sorting, Partitioning, Exporting) in isolation 
from the reduce operation (Rollup, Join, Group, etc.). 

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>







[jira] [Comment Edited] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785517#comment-15785517
 ] 

Joel Bernstein edited comment on SOLR-9905 at 12/29/16 3:25 PM:


The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in 
MapReduce operations. The NullStream will allow developers to test the 
performance of the shuffling (Sorting, Partitioning, Exporting) in isolation 
from the reduce operation (Rollup, Join, Group, etc.). 

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}



was (Author: joel.bernstein):
The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in slow 
running MapReduce operations. The NullStream will allow developers to test the 
performance of the shuffling (Sorting, Partitioning, Exporting) in isolation 
from the reduce operation (Rollup, Join, Group, etc.). 

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>







[jira] [Commented] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785517#comment-15785517
 ] 

Joel Bernstein commented on SOLR-9905:
--

The NullStream is a utility function to test the raw performance of the 
ExportWriter. This is a nice utility to have to diagnose bottlenecks in slow 
running MapReduce operations. The NullStream will allow developers to test the 
performance of the shuffling (Sorting, Partitioning, Exporting) in isolation 
from the reduce operation (Rollup, Join, Group, etc.). 

{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}


> Add NullStream to isolate the performance of the ExportWriter
> -
>
> Key: SOLR-9905
> URL: https://issues.apache.org/jira/browse/SOLR-9905
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>







[jira] [Created] (SOLR-9905) Add NullStream to isolate the performance of the ExportWriter

2016-12-29 Thread Joel Bernstein (JIRA)
Joel Bernstein created SOLR-9905:


 Summary: Add NullStream to isolate the performance of the 
ExportWriter
 Key: SOLR-9905
 URL: https://issues.apache.org/jira/browse/SOLR-9905
 Project: Solr
  Issue Type: New Feature
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein









[jira] [Comment Edited] (SOLR-9636) Add support for javabin for /stream, /sql internode communication

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785486#comment-15785486
 ] 

Joel Bernstein edited comment on SOLR-9636 at 12/29/16 3:15 PM:


I added a new NullStream to test the performance of exporting and sorting on a 
high cardinality field. High cardinality exporting/sorting is an important 
real-world use case for supporting distributed joins on primary keys. The 
query looks like this:
 
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

Notice the new *null* function which eats the tuples and returns a count to 
verify the number of tuples processed.

The test query is sorting on the id field which has a unique value in each 
record. Again performance was impressive:

* With json: 1,210,000 Tuples per second.
* With javabin: 1,350,000 Tuples per second.

So the ExportWriter doesn't slow down sorting on a high cardinality field.

Going forward, the NullStream will be useful for testing the raw performance of 
the ExportWriter in isolation. This will help developers diagnose where the 
bottleneck is if distributed joins aren't performing as expected.

For example, if a join is slow but the same export using the NullStream is 
fast, then you can be sure that the bottleneck is not in the ExportWriter and 
is likely in the Join stream.
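The json/javabin gap above comes down to per-tuple serialization cost on the wire. As a loose, self-contained illustration (this is NOT Solr's actual JavaBin codec; {{WireFormatSketch}}, {{encodeJson}}, and {{encodeBinary}} are invented for the sketch), the same single-field tuple costs fewer bytes, and less parsing, as a compact binary record than as JSON text:

```java
import java.nio.charset.StandardCharsets;

// Rough illustration of the json-vs-binary wire format trade-off for tuple
// shuffling. This is NOT Solr's JavaBin codec; the encoders here are
// hypothetical stand-ins showing why a compact binary record is cheaper
// than JSON text for the same {"id": ...} tuple.
public class WireFormatSketch {

    // JSON text encoding of a single-field tuple.
    public static byte[] encodeJson(String id) {
        return ("{\"id\":\"" + id + "\"}").getBytes(StandardCharsets.UTF_8);
    }

    // Minimal binary encoding: a one-byte length prefix plus the raw value
    // (assumes values shorter than 128 bytes).
    public static byte[] encodeBinary(String id) {
        byte[] value = id.getBytes(StandardCharsets.UTF_8);
        byte[] record = new byte[value.length + 1];
        record[0] = (byte) value.length;
        System.arraycopy(value, 0, record, 1, value.length);
        return record;
    }

    public static void main(String[] args) {
        String id = "doc-1234567";
        System.out.println("json bytes:   " + encodeJson(id).length);   // 20
        System.out.println("binary bytes: " + encodeBinary(id).length); // 12
    }
}
```

The field names and punctuation that JSON repeats for every tuple are pure overhead at shuffle volumes of a million-plus tuples per second, which is consistent with the throughput difference reported above.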





was (Author: joel.bernstein):
I added a new NullStream to test the performance of exporting and sorting on a 
high cardinality field. This is a much more real-world scenario for supporting 
distributed joins on primary keys. The query looks like this:
 
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

Notice the new *null* function which eats the tuples and returns a count to 
verify the number of tuples processed.

The test query is sorting on the id field which has a unique value in each 
record. Again performance was impressive:

* With json: 1,210,000 Tuples per second.
* With javabin: 1,350,000 Tuples per second.

So the ExportWriter doesn't slow down sorting on a high cardinality field.

Going forward, the NullStream will be useful for testing the raw performance of 
the ExportWriter in isolation. This will help developers diagnose where the 
bottleneck is if distributed joins aren't performing as expected.

For example, if a join is slow but the same export using the NullStream is 
fast, then you can be sure that the bottleneck is not in the ExportWriter and 
is likely in the Join stream.




> Add support for javabin for /stream, /sql internode communication
> -
>
> Key: SOLR-9636
> URL: https://issues.apache.org/jira/browse/SOLR-9636
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: master (7.0), 6.4
>
> Attachments: SOLR-9636.patch
>
>
> currently it uses json, which is verbose and slow






[jira] [Comment Edited] (SOLR-9636) Add support for javabin for /stream, /sql internode communication

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785486#comment-15785486
 ] 

Joel Bernstein edited comment on SOLR-9636 at 12/29/16 3:14 PM:


I added a new NullStream to test the performance of exporting and sorting on a 
high cardinality field. This is a much more real-world scenario for supporting 
distributed joins on primary keys. The query looks like this:
 
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

Notice the new *null* function which eats the tuples and returns a count to 
verify the number of tuples processed.

The test query is sorting on the id field which has a unique value in each 
record. Again performance was impressive:

* With json: 1,210,000 Tuples per second.
* With javabin: 1,350,000 Tuples per second.

So the ExportWriter doesn't slow down sorting on a high cardinality field.

Going forward, the NullStream will be useful for testing the raw performance of 
the ExportWriter in isolation. This will help developers diagnose where the 
bottleneck is if distributed joins aren't performing as expected.

For example, if a join is slow but the same export using the NullStream is 
fast, then you can be sure that the bottleneck is not in the ExportWriter and 
is likely in the Join stream.





was (Author: joel.bernstein):
I added a new NullStream to test the performance of exporting and sorting on a 
high cardinality field. This is a much more real-world scenario for supporting 
distributed joins on primary keys. The query looks like this:
 
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

Notice the new *null* function which eats the tuples and returns a count to 
verify the number of tuples processed.

The test query is sorting on the id field which has a unique value in each 
record. Again performance was impressive:

* With json: 1,210,000 Tuples per second.
* With javabin: 1,350,000 Tuples per second.

So the ExportWriter doesn't slow down sorting on a high cardinality field.





> Add support for javabin for /stream, /sql internode communication
> -
>
> Key: SOLR-9636
> URL: https://issues.apache.org/jira/browse/SOLR-9636
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: master (7.0), 6.4
>
> Attachments: SOLR-9636.patch
>
>
> currently it uses json, which is verbose and slow






[jira] [Commented] (SOLR-9636) Add support for javabin for /stream, /sql internode communication

2016-12-29 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785486#comment-15785486
 ] 

Joel Bernstein commented on SOLR-9636:
--

I added a new NullStream to test the performance of exporting and sorting on a 
high cardinality field. This is a much more real-world scenario for supporting 
distributed joins on primary keys. The query looks like this:
 
{code}
parallel(collection2, workers=7, sort="count desc", 
  null(search(collection1, 
   q=*:*, 
   fl="id", 
   sort="id desc", 
   qt="/export", 
   wt="javabin", 
   partitionKeys=id)))
{code}

Notice the new *null* function which eats the tuples and returns a count to 
verify the number of tuples processed.

The test query is sorting on the id field which has a unique value in each 
record. Again performance was impressive:

* With json: 1,210,000 Tuples per second.
* With javabin: 1,350,000 Tuples per second.

So the ExportWriter doesn't slow down sorting on a high cardinality field.





> Add support for javabin for /stream, /sql internode communication
> -
>
> Key: SOLR-9636
> URL: https://issues.apache.org/jira/browse/SOLR-9636
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: master (7.0), 6.4
>
> Attachments: SOLR-9636.patch
>
>
> currently it uses json, which is verbose and slow






[jira] [Commented] (LUCENE-7602) Fix compiler warnings for ant clean compile

2016-12-29 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785476#comment-15785476
 ] 

Paul Elschot commented on LUCENE-7602:
--

After taking a closer look at the other issues:

How about renaming the ContextMap here to ValueSourceContext or to VSContext?

> Fix compiler warnings for ant clean compile
> ---
>
> Key: LUCENE-7602
> URL: https://issues.apache.org/jira/browse/LUCENE-7602
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Paul Elschot
>Priority: Minor
>  Labels: build
> Fix For: trunk
>
> Attachments: LUCENE-7602-ContextMap-lucene.patch, 
> LUCENE-7602-ContextMap-solr.patch, LUCENE-7602.patch, LUCENE-7602.patch, 
> LUCENE-7602.patch
>
>







[jira] [Commented] (LUCENE-5325) Move ValueSource and FunctionValues under core/

2016-12-29 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785447#comment-15785447
 ] 

David Smiley commented on LUCENE-5325:
--

A test for that would be good.

> Move ValueSource and FunctionValues under core/
> ---
>
> Key: LUCENE-5325
> URL: https://issues.apache.org/jira/browse/LUCENE-5325
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Shai Erera
> Attachments: LUCENE-5325-6x-matchingbits.patch, LUCENE-5325-6x.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch
>
>
> Spinoff from LUCENE-5298: ValueSource and FunctionValues are abstract APIs 
> which exist under the queries/ module. That causes any module which wants to 
> depend on these APIs (but not necessarily on any of their actual 
> implementations!), to depend on the queries/ module. If we move these APIs 
> under core/, we can eliminate these dependencies and add some mock impls for 
> testing purposes.
> Quoting Robert from LUCENE-5298:
> {quote}
> we should eliminate the suggest/ dependencies on expressions and queries, the 
> expressions/ on queries, the grouping/ dependency on queries, the spatial/ 
> dependency on queries, its a mess.
> {quote}
> To add to that list, facet/ should not depend on queries too.






[jira] [Updated] (LUCENE-5325) Move ValueSource and FunctionValues under core/

2016-12-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-5325:
--
Attachment: LUCENE-5325-6x-matchingbits.patch

bq. shouldn't it check docsWithField?

Hm, you're quite right.  Here's a patch fixing that - will commit shortly.

> Move ValueSource and FunctionValues under core/
> ---
>
> Key: LUCENE-5325
> URL: https://issues.apache.org/jira/browse/LUCENE-5325
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Shai Erera
> Attachments: LUCENE-5325-6x-matchingbits.patch, LUCENE-5325-6x.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch, 
> LUCENE-5325.patch, LUCENE-5325.patch, LUCENE-5325.patch
>
>
> Spinoff from LUCENE-5298: ValueSource and FunctionValues are abstract APIs 
> which exist under the queries/ module. That causes any module which wants to 
> depend on these APIs (but not necessarily on any of their actual 
> implementations!), to depend on the queries/ module. If we move these APIs 
> under core/, we can eliminate these dependencies and add some mock impls for 
> testing purposes.
> Quoting Robert from LUCENE-5298:
> {quote}
> we should eliminate the suggest/ dependencies on expressions and queries, the 
> expressions/ on queries, the grouping/ dependency on queries, the spatial/ 
> dependency on queries, its a mess.
> {quote}
> To add to that list, facet/ should not depend on queries too.






[jira] [Commented] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15785390#comment-15785390
 ] 

Adrien Grand commented on LUCENE-7608:
--

+1

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap
>
>







[jira] [Updated] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-7608:

Attachment: .mailmap

> Add a git .mailmap file to dedupe authors.
> --
>
> Key: LUCENE-7608
> URL: https://issues.apache.org/jira/browse/LUCENE-7608
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Attachments: .mailmap
>
>







[jira] [Created] (LUCENE-7608) Add a git .mailmap file to dedupe authors.

2016-12-29 Thread Mark Miller (JIRA)
Mark Miller created LUCENE-7608:
---

 Summary: Add a git .mailmap file to dedupe authors.
 Key: LUCENE-7608
 URL: https://issues.apache.org/jira/browse/LUCENE-7608
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Attachments: .mailmap







