[JENKINS] Lucene-Solr-NightlyTests-master - Build # 1431 - Still Unstable

2017-12-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1431/

6 tests failed.
FAILED:  org.apache.lucene.spatial3d.TestGeo3DPoint.testRandomBig

Error Message:
Test abandoned because suite timeout was reached.

Stack Trace:
java.lang.Exception: Test abandoned because suite timeout was reached.
at __randomizedtesting.SeedInfo.seed([707A7846BA6CB37C]:0)


FAILED:  junit.framework.TestSuite.org.apache.lucene.spatial3d.TestGeo3DPoint

Error Message:
Suite timeout exceeded (>= 720 msec).

Stack Trace:
java.lang.Exception: Suite timeout exceeded (>= 720 msec).
at __randomizedtesting.SeedInfo.seed([707A7846BA6CB37C]:0)


FAILED:  org.apache.solr.TestDistributedSearch.test

Error Message:
Expected to find shardAddress in the up shard info: {error=org.apache.solr.client.solrj.SolrServerException: Time allowed to handle this request exceeded,trace=org.apache.solr.client.solrj.SolrServerException: Time allowed to handle this request exceeded
  at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:460)
  at org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:273)
  at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:175)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
  at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
,time=12}

Stack Trace:
java.lang.AssertionError: Expected to find shardAddress in the up shard info: {error=org.apache.solr.client.solrj.SolrServerException: Time allowed to handle this request exceeded,trace=org.apache.solr.client.solrj.SolrServerException: Time allowed to handle this request exceeded
  at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:460)
  at org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:273)
  at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:175)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
  at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
,time=12}
  at __randomizedtesting.SeedInfo.seed([F6964D5E7675648B:7EC27284D8890973]:0)
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.assertTrue(Assert.java:43)
  at org.apache.solr.TestDistributedSearch.comparePartialResponses(TestDistributedSearch.java:1191)
  at org.apache.solr.TestDistributedSearch.queryPartialResults(TestDistributedSearch.java:1132)
  at org.apache.solr.TestDistributedSearch.test(TestDistributedSearch.java:992)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
  at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsRepeatStatement.callStatement(BaseDistributedSearchTestCase.java:1019)
  at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate

[jira] [Commented] (SOLR-11616) Backup failing on a constantly changing index with NoSuchFileException

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275395#comment-16275395
 ] 

ASF subversion and git services commented on SOLR-11616:


Commit 62b35006780009758376d9e22b2c1a08e25b83a6 in lucene-solr's branch 
refs/heads/branch_7x from [~varunthacker]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=62b3500 ]

SOLR-11616: Snapshot the segments more robustly such that segments created 
during a backup do not fail the operation

(cherry picked from commit 864ce90)


> Backup failing on a constantly changing index with NoSuchFileException
> --
>
> Key: SOLR-11616
> URL: https://issues.apache.org/jira/browse/SOLR-11616
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Assignee: Varun Thacker
> Attachments: SOLR-11616.patch, SOLR-11616.patch, solr-6.3.log, 
> solr-7.1.log
>
>
> As reported by several users on SOLR-9120, Solr backups fail with 
> NoSuchFileException on a constantly changing index.
> Users linked SOLR-9120 to the root cause as the stack trace is the same, but 
> the fix proposed there won't stop backups from failing.
> We need to implement a similar fix in {{SnapShooter#createSnapshot}} to fix 
> the problem.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11616) Backup failing on a constantly changing index with NoSuchFileException

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275394#comment-16275394
 ] 

ASF subversion and git services commented on SOLR-11616:


Commit 864ce90d2cd9bfae66506f38823278738afe6c4a in lucene-solr's branch 
refs/heads/master from [~varunthacker]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=864ce90 ]

SOLR-11616: Snapshot the segments more robustly such that segments created 
during a backup do not fail the operation


> Backup failing on a constantly changing index with NoSuchFileException
> --
>
> Key: SOLR-11616
> URL: https://issues.apache.org/jira/browse/SOLR-11616
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Assignee: Varun Thacker
> Attachments: SOLR-11616.patch, SOLR-11616.patch, solr-6.3.log, 
> solr-7.1.log
>
>
> As reported by several users on SOLR-9120, Solr backups fail with 
> NoSuchFileException on a constantly changing index.
> Users linked SOLR-9120 to the root cause as the stack trace is the same, but 
> the fix proposed there won't stop backups from failing.
> We need to implement a similar fix in {{SnapShooter#createSnapshot}} to fix 
> the problem.






Solr Ref Guide not building

2017-12-01 Thread Doug Turnbull
Hello!

I'm trying to update the Solr Ref Guide with my change for SOLR-11662. I
believe I've installed the required dependencies and double-checked the
README in solr-ref-guide. Unfortunately, running ant build-site, I
immediately get this error, seemingly on the first adoc file encountered:


 [exec] jekyll 3.5.0 | Error:  No header received back.
 [exec]   Conversion error: Jekyll::AsciiDoc::Converter encountered an
error while converting 'about-filters.adoc':

I feel like I must be doing something stupid (I'll assume user error on my
part), but if there's anything obvious I'm doing wrong, please let me know.

A more complete log can be found here:
https://gist.github.com/softwaredoug/36fe87f0d63403e7be22d5a2ff8af073

Thanks for any help

-Doug
-- 
Consultant, OpenSource Connections. Contact info at
http://o19s.com/about-us/doug-turnbull/; Free/Busy (http://bit.ly/dougs_cal)


[jira] [Commented] (SOLR-11662) Make overlapping query term scoring configurable per field type

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275350#comment-16275350
 ] 

ASF GitHub Bot commented on SOLR-11662:
---

Github user softwaredoug commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154483628
  
--- Diff: solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java ---
@@ -539,6 +591,27 @@ protected Query newRegexpQuery(Term regexp) {
     return query;
   }
 
+  @Override
+  protected Query newSynonymQuery(Term terms[]) {
+    switch (synonymQueryStyle) {
+      case PICK_BEST:
+        List<Query> currPosnClauses = new ArrayList<>(terms.length);
+        for (Term term : terms) {
+          currPosnClauses.add(newTermQuery(term));
+        }
+        DisjunctionMaxQuery dm = new DisjunctionMaxQuery(currPosnClauses, 0.0f);
+        return dm;
+      case AS_DISTINCT_TERMS:
+        BooleanQuery.Builder builder = new BooleanQuery.Builder();
+        for (Term term : terms) {
+          builder.add(newTermQuery(term), BooleanClause.Occur.SHOULD);
+        }
+        return builder.build();
+      default:
--- End diff --

I don't think synonymQueryStyle should ever be null (should default to 
AS_SAME_TERM)
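The defaulting the reviewer describes can be sketched independently of Solr's actual parser code; the class, method, and enum names below are illustrative only:

```java
// Sketch of the reviewer's point: resolve the configured style to a non-null
// enum value up front, so the switch in newSynonymQuery never sees null and
// the default branch simply means AS_SAME_TERM behavior.
public class StyleDefault {
  enum SynonymQueryStyle { AS_SAME_TERM, PICK_BEST, AS_DISTINCT_TERMS }

  static SynonymQueryStyle parseStyle(String configured) {
    if (configured == null) {
      return SynonymQueryStyle.AS_SAME_TERM;  // default when nothing configured
    }
    // accept lowercase config values like "pick_best"
    return SynonymQueryStyle.valueOf(configured.trim().toUpperCase());
  }

  public static void main(String[] args) {
    System.out.println(parseStyle(null));        // AS_SAME_TERM
    System.out.println(parseStyle("pick_best")); // PICK_BEST
  }
}
```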


> Make overlapping query term scoring configurable per field type
> ---
>
> Key: SOLR-11662
> URL: https://issues.apache.org/jira/browse/SOLR-11662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Doug Turnbull
> Fix For: 7.2, master (8.0)
>
>
> This patch customizes the query-time behavior when query terms overlap 
> positions. Right now the only option is SynonymQuery. This is a fantastic 
> default and an improvement on past versions. However, there are use cases 
> where terms overlap positions but don't carry exact synonymy relationships. 
> Often the synonym mechanism (or other analyzers) is actually used to model 
> hypernym/hyponym relationships, so the individual term scores matter, with 
> terms of higher specificity (hyponyms) scoring higher than terms of lower 
> specificity (hypernyms).
> This patch adds the fieldType setting scoreOverlaps, as in:
> {code:java}
> <fieldType ... class="solr.TextField" positionIncrementGap="100" multiValued="true">
> {code}
> Valid values for scoreOverlaps are:
> *as_one_term*
> The default, for most synonym use cases. Uses SynonymQuery, which treats all 
> terms as if they're exactly equivalent, with document frequency blended from 
> the underlying terms.
> *pick_best*
> For a given document, score using the best-scoring synonym (i.e. a dismax over 
> the generated terms). Useful when synonyms are not exactly equivalent but 
> instead model hypernym/hyponym relationships, so that term scores reflect that 
> quality. E.g. given this query-time expansion:
> tabby => tabby, cat, animal
> searching "text" generates the dismax (text:tabby | text:cat | text:animal)
> *as_distinct_terms*
> (The pre-6.0 behavior.) A compromise between pick_best and as_one_term. 
> Appropriate when synonyms reflect a hypernym/hyponym relationship but scores 
> should stack, so documents with more of tabby, cat, or animal score better, 
> with a bias towards the term with the highest specificity. Terms are turned 
> into a boolean OR query, with document frequencies not blended. E.g. given 
> this query-time expansion:
> tabby => tabby, cat, animal
> searching "text" generates the boolean query (text:tabby text:cat text:animal)
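As a rough illustration of the three shapes (using hypothetical helper names, not Solr's parser code), here is how the generated query strings for the tabby expansion would look:

```java
// Illustrative only: builds the *string* form of the query each scoreOverlaps
// value would produce for the expansion tabby => tabby, cat, animal.
// The enum and shape() helper are hypothetical; the real logic lives in
// SolrQueryParserBase.newSynonymQuery.
import java.util.List;
import java.util.stream.Collectors;

public class OverlapShapes {
  enum SynonymQueryStyle { AS_ONE_TERM, PICK_BEST, AS_DISTINCT_TERMS }

  static String shape(String field, List<String> terms, SynonymQueryStyle style) {
    switch (style) {
      case PICK_BEST:         // dismax over the generated term queries
        return terms.stream().map(t -> field + ":" + t)
                    .collect(Collectors.joining(" | ", "(", ")"));
      case AS_DISTINCT_TERMS: // boolean OR; document frequencies not blended
        return terms.stream().map(t -> field + ":" + t)
                    .collect(Collectors.joining(" ", "(", ")"));
      default:                // AS_ONE_TERM: a single SynonymQuery
        return "Synonym(" + terms.stream().map(t -> field + ":" + t)
                    .collect(Collectors.joining(" ")) + ")";
    }
  }

  public static void main(String[] args) {
    List<String> syns = List.of("tabby", "cat", "animal");
    System.out.println(shape("text", syns, SynonymQueryStyle.PICK_BEST));
    System.out.println(shape("text", syns, SynonymQueryStyle.AS_DISTINCT_TERMS));
    System.out.println(shape("text", syns, SynonymQueryStyle.AS_ONE_TERM));
  }
}
```

Running it prints the three forms described above: the dismax, the boolean OR, and the single blended SynonymQuery.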






[GitHub] lucene-solr pull request #275: SOLR-11662: Configurable query when terms ove...

2017-12-01 Thread softwaredoug
Github user softwaredoug commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154483649
  
--- Diff: solr/core/src/test/org/apache/solr/search/TestSolrQueryParser.java ---
@@ -1057,7 +1057,35 @@ public void testShingleQueries() throws Exception {
         , "/response/numFound==1"
     );
   }
-  
+
+
+  public void testSynonymQueryStyle() throws Exception {
+    ModifiableSolrParams edismaxParams = params("qf", "t_pick_best_foo");
+
+    QParser qParser = QParser.getParser("tabby", "edismax", req(edismaxParams));
--- End diff --

whoops, good catch


---




[jira] [Commented] (SOLR-11662) Make overlapping query term scoring configurable per field type

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275351#comment-16275351
 ] 

ASF GitHub Bot commented on SOLR-11662:
---

Github user softwaredoug commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154483649
  
--- Diff: solr/core/src/test/org/apache/solr/search/TestSolrQueryParser.java ---
@@ -1057,7 +1057,35 @@ public void testShingleQueries() throws Exception {
         , "/response/numFound==1"
     );
   }
-  
+
+
+  public void testSynonymQueryStyle() throws Exception {
+    ModifiableSolrParams edismaxParams = params("qf", "t_pick_best_foo");
+
+    QParser qParser = QParser.getParser("tabby", "edismax", req(edismaxParams));
--- End diff --

whoops, good catch


> Make overlapping query term scoring configurable per field type
> ---
>
> Key: SOLR-11662
> URL: https://issues.apache.org/jira/browse/SOLR-11662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Doug Turnbull
> Fix For: 7.2, master (8.0)
>
>
> This patch customizes the query-time behavior when query terms overlap 
> positions. Right now the only option is SynonymQuery. This is a fantastic 
> default and an improvement on past versions. However, there are use cases 
> where terms overlap positions but don't carry exact synonymy relationships. 
> Often the synonym mechanism (or other analyzers) is actually used to model 
> hypernym/hyponym relationships, so the individual term scores matter, with 
> terms of higher specificity (hyponyms) scoring higher than terms of lower 
> specificity (hypernyms).
> This patch adds the fieldType setting scoreOverlaps, as in:
> {code:java}
> <fieldType ... class="solr.TextField" positionIncrementGap="100" multiValued="true">
> {code}
> Valid values for scoreOverlaps are:
> *as_one_term*
> The default, for most synonym use cases. Uses SynonymQuery, which treats all 
> terms as if they're exactly equivalent, with document frequency blended from 
> the underlying terms.
> *pick_best*
> For a given document, score using the best-scoring synonym (i.e. a dismax over 
> the generated terms). Useful when synonyms are not exactly equivalent but 
> instead model hypernym/hyponym relationships, so that term scores reflect that 
> quality. E.g. given this query-time expansion:
> tabby => tabby, cat, animal
> searching "text" generates the dismax (text:tabby | text:cat | text:animal)
> *as_distinct_terms*
> (The pre-6.0 behavior.) A compromise between pick_best and as_one_term. 
> Appropriate when synonyms reflect a hypernym/hyponym relationship but scores 
> should stack, so documents with more of tabby, cat, or animal score better, 
> with a bias towards the term with the highest specificity. Terms are turned 
> into a boolean OR query, with document frequencies not blended. E.g. given 
> this query-time expansion:
> tabby => tabby, cat, animal
> searching "text" generates the boolean query (text:tabby text:cat text:animal)






[GitHub] lucene-solr pull request #275: SOLR-11662: Configurable query when terms ove...

2017-12-01 Thread softwaredoug
Github user softwaredoug commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154483628
  
--- Diff: solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java ---
@@ -539,6 +591,27 @@ protected Query newRegexpQuery(Term regexp) {
     return query;
   }
 
+  @Override
+  protected Query newSynonymQuery(Term terms[]) {
+    switch (synonymQueryStyle) {
+      case PICK_BEST:
+        List<Query> currPosnClauses = new ArrayList<>(terms.length);
+        for (Term term : terms) {
+          currPosnClauses.add(newTermQuery(term));
+        }
+        DisjunctionMaxQuery dm = new DisjunctionMaxQuery(currPosnClauses, 0.0f);
+        return dm;
+      case AS_DISTINCT_TERMS:
+        BooleanQuery.Builder builder = new BooleanQuery.Builder();
+        for (Term term : terms) {
+          builder.add(newTermQuery(term), BooleanClause.Occur.SHOULD);
+        }
+        return builder.build();
+      default:
--- End diff --

I don't think synonymQueryStyle should ever be null (should default to 
AS_SAME_TERM)


---




[jira] [Updated] (SOLR-11703) Solr Should Send Log Notifications if Ulimits are Incorrect

2017-12-01 Thread Kevin Cowan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Cowan updated SOLR-11703:
---
Attachment: SOLR-11703.patch

> Solr Should Send Log Notifications if Ulimits are Incorrect
> ---
>
> Key: SOLR-11703
> URL: https://issues.apache.org/jira/browse/SOLR-11703
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Kevin Cowan
>Assignee: Erick Erickson
> Attachments: SOLR-11703.patch
>
>
> On most Linux instances, the default for 'open files' is set to something 
> entirely too low, e.g. 1024. We have a large number of support tickets that 
> wind up with us having the client increase this number programmatically. 
> It would make sense, and save a great deal of support time, if the Solr 
> startup script checked these values and either altered them or at least 
> alerted the user that they are set too low, which could cause trouble. I 
> am associating just one of many tickets where this is the result.
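The kind of startup check being requested could look something like the sketch below. The threshold, function name, and message wording are illustrative, not the attached patch:

```shell
#!/bin/bash
# Hypothetical helper for bin/solr: warn when the 'open files' limit is too low.
# The recommended value here is illustrative only.
RECOMMENDED_OPEN_FILES=65000

check_open_files() {
  # Takes the current limit as an argument for testability;
  # the real script would pass in "$(ulimit -n)".
  local current="$1"
  if [ "$current" = "unlimited" ]; then
    echo "OK"
  elif [ "$current" -lt "$RECOMMENDED_OPEN_FILES" ]; then
    echo "WARN: open file limit is $current, recommended at least $RECOMMENDED_OPEN_FILES"
  else
    echo "OK"
  fi
}

check_open_files "$(ulimit -n)"
```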






[jira] [Commented] (SOLR-11412) Documentation changes for SOLR-11003: Bi-directional CDCR support

2017-12-01 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275196#comment-16275196
 ] 

Varun Thacker commented on SOLR-11412:
--

In general I think we could do with some more cleaning of the docs. No user is 
going to read this much info before setting up CDCR.

+1 to commit this with or without the changes I suggested. If it's committed 
without my changes then I can create a patch with everything that I felt could 
be folded in, to make reviewing easier.

> Documentation changes for SOLR-11003: Bi-directional CDCR support
> -
>
> Key: SOLR-11412
> URL: https://issues.apache.org/jira/browse/SOLR-11412
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: CDCR, documentation
>Reporter: Amrit Sarkar
>Assignee: Varun Thacker
> Attachments: CDCR_bidir.png, SOLR-11412-split.patch, 
> SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, 
> SOLR-11412.patch
>
>
> Since SOLR-11003 (bi-directional CDCR scenario support) is reaching its 
> conclusion, the relevant changes in documentation need to be made.






[jira] [Commented] (SOLR-11412) Documentation changes for SOLR-11003: Bi-directional CDCR support

2017-12-01 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275194#comment-16275194
 ] 

Varun Thacker commented on SOLR-11412:
--

cdcr-api.adoc

I think we can remove "Enable Buffer Response", "Disable Buffer Response", 
"CDCR Start Response", and "CDCR Stop Response"; the one-line explanation 
doesn't add any value in my opinion.
For "OPS Response" maybe all we should say is: "Provides the average number of 
operations as a sum and broken down by adds/deletes."
For "ERRORS Response" we should say: "Provides the number of consecutive errors 
encountered by the replicator thread, the number of bad requests or internal 
errors since the start of the replication process, and a list of the last 
errors encountered ordered by timestamp." The first part of the description 
seems unnecessary to me.

Also, both OPS and ERRORS should really be exposed via the metrics API. I'll 
file a separate Jira for this.

CDCR Architecture Page:

"The data changes can be replicated in near real-time (with a small delay) or 
could be scheduled to be sent at longer intervals to the Target data center" : 
"The data changes can be replicated at a configurable amount of time"

Should Source and Target start with a capital letter?

"Since this is a full copy of the entire index, network bandwidth should be 
considered." : What value does this line add to the user? 

"CDCR can "bootstrap" the collection to the Target data center. Since this is a 
full copy of the entire index, network bandwidth should be considered. Of 
course both Source and Target collections may be empty to start." - Remove this 
part?  
The fifth paragraph ( "Replication supports both..." ) basically explains this 
in a better fashion

"The directional nature of the implementation implies a "push" model from the 
Source collection to the Target collection. Therefore, the Source configuration 
must be able to "see" the ZooKeeper ensemble in the Target cluster. The 
ZooKeeper ensemble is provided configured in the Source’s solrconfig.xml file."

I feel we can remove this entire para and just add a line to the 3rd para where 
we mention it's pushed based. Here's a stab at an updated 3rd para

"Each shard leader in the Source data center is responsible for replicating its 
updates to the corresponding leader in the Target data center. This is a push 
model and the source data center must be able to connect to the target 
ZooKeeper. Shard leaders in the Target data center will replicate the changes 
to their own replicas as normal SolrCloud updates."

"CDCR can be configured to replicate from one collection to a second collection 
within the same cluster. That is a specialized scenario not covered in this 
Guide." : Does this point have any value? I'm +0 in removing it 

>From "Figure 1. Uni-Directional Data Flow" :  "Since leaders may ... 
>Firewalls, ACL rules, etc., must be configured to allow this." I feel like we 
>have the ACL part could be mentioned in the architecture overview and removed 
>from here.

"With bi-directional updates, indexing and querying " : I guess the only hard 
requirement is indexing. Querying doesn't have any impact in this design i.e 
it's the same as uni directional

"Updates sent from Source data center to Target is not propagated back to 
Source when bi-directional updates are configured" : This is what point 7 talks 
about so maybe remove this


CDCR Configuration : 

 "" : We recommend everyone to disable buffering. Let's remove 
this comment


 From the "Initial Startup" section 

 "Sync the index directories from the Source collection to Target collection 
across to the corresponding shard nodes. rsync works well for this" till the 
end of the section : Seems like a lot of info or notes which are already known?

 "ZooKeeper Settings"

800 is a typo? We say we want to set 200 but use 

"Cross Data Center Replication Operations" : Should talk about how to update a 
schema. I'll add some docs after this commit

> Documentation changes for SOLR-11003: Bi-directional CDCR support
> -
>
> Key: SOLR-11412
> URL: https://issues.apache.org/jira/browse/SOLR-11412
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: CDCR, documentation
>Reporter: Amrit Sarkar
>Assignee: Varun Thacker
> Attachments: CDCR_bidir.png, SOLR-11412-split.patch, 
> SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, 
> SOLR-11412.patch
>
>
> Since SOLR-11003 (bi-directional CDCR scenario support) is reaching its 
> conclusion, the relevant changes in documentation need to be made.




---

[JENKINS] Lucene-Solr-NightlyTests-7.x - Build # 96 - Failure

2017-12-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/96/

11 tests failed.
FAILED:  org.apache.lucene.index.TestNumericDocValuesUpdates.testTonsOfUpdates

Error Message:
Problem reading index from RawDirectoryWrapper(RAMDirectory@2b6ec28c lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@3ee93668) (resource=RawDirectoryWrapper(RAMDirectory@2b6ec28c lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@3ee93668))

Stack Trace:
org.apache.lucene.index.CorruptIndexException: Problem reading index from RawDirectoryWrapper(RAMDirectory@2b6ec28c lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@3ee93668) (resource=RawDirectoryWrapper(RAMDirectory@2b6ec28c lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@3ee93668))
  at __randomizedtesting.SeedInfo.seed([3A5E8054B8A8DAA9:427B5E5F5A88F54B]:0)
  at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:140)
  at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:78)
  at org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:688)
  at org.apache.lucene.index.IndexWriter$ReaderPool.writeSomeDocValuesUpdates(IndexWriter.java:703)
  at org.apache.lucene.index.FrozenBufferedUpdates.apply(FrozenBufferedUpdates.java:331)
  at org.apache.lucene.index.DocumentsWriter$ResolveUpdatesEvent.process(DocumentsWriter.java:738)
  at org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5082)
  at org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5070)
  at org.apache.lucene.index.IndexWriter.updateDocValues(IndexWriter.java:1879)
  at org.apache.lucene.index.TestNumericDocValuesUpdates.testTonsOfUpdates(TestNumericDocValuesUpdates.java:1557)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.util.TestRuleMarkFailure$1.e

[jira] [Commented] (SOLR-11692) SolrDispatchFilter.closeShield passes the shielded response object back to jetty making the stream unclose able

2017-12-01 Thread Jeff Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275163#comment-16275163
 ] 

Jeff Miller commented on SOLR-11692:


[~dsmiley] sent PR https://github.com/apache/lucene-solr/pull/281

> SolrDispatchFilter.closeShield passes the shielded response object back to 
> jetty making the stream unclose able
> ---
>
> Key: SOLR-11692
> URL: https://issues.apache.org/jira/browse/SOLR-11692
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Response Writers
>Affects Versions: 7.1
> Environment: Linux/Mac tested
>Reporter: Jeff Miller
>Priority: Minor
>  Labels: dispatchlayer, jetty, newbie, streams
> Attachments: SOLR-11692.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> In test mode we trigger closeShield code in SolrDispatchFilter; however, there 
> are code paths where we pass through the objects to the DefaultHandler, which 
> then can no longer close the response.
> Example stack trace:
> java.lang.AssertionError: Attempted close of response output stream.
> at org.apache.solr.servlet.SolrDispatchFilter$2$1.close(SolrDispatchFilter.java:528)
> at org.eclipse.jetty.server.Dispatcher.commitResponse(Dispatcher.java:315)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:279)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:103)
> at org.eclipse.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:566)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:385)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
> at searchserver.filter.SfdcDispatchFilter.doFilter(SfdcDispatchFilter.java:204)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> at org.eclipse.jetty.server.Server.handle(Server.java:370)
> at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
> at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
> at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
> at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
> at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
> at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> at java.lang.Thread.run(Thread.java:745)
> Related JIRA: SOLR-8933
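The close-shielding described above can be sketched in plain JDK code. This is an illustrative stand-in, not Solr's actual implementation (which lives in SolrDispatchFilter): a wrapper stream delegates writes but fails loudly on close, so a downstream handler that closes a stream it does not own trips an assertion immediately in test mode. The class name `CloseShieldDemo` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CloseShieldDemo {

  /** Delegates everything except close(), which trips an assertion. */
  static final class CloseShieldOutputStream extends OutputStream {
    private final OutputStream delegate;

    CloseShieldOutputStream(OutputStream delegate) {
      this.delegate = delegate;
    }

    @Override public void write(int b) throws IOException { delegate.write(b); }
    @Override public void flush() throws IOException { delegate.flush(); }

    @Override public void close() {
      // Mirrors the AssertionError seen in the stack trace above.
      throw new AssertionError("Attempted close of response output stream.");
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream real = new ByteArrayOutputStream();
    OutputStream shielded = new CloseShieldOutputStream(real);
    shielded.write('x');          // writes pass through to the real stream
    boolean closeRejected = false;
    try {
      shielded.close();           // a handler closing the shielded stream...
    } catch (AssertionError expected) {
      closeRejected = true;       // ...is caught immediately in test mode
    }
    System.out.println(real.size() == 1 && closeRejected); // prints "true"
  }
}
```

The bug in the issue is the inverse direction: the shielded object was handed back to Jetty, which legitimately needs to close the real response, so the shield fired on a close that should have been allowed.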



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #281: Cleaning up class casts and making sure sheil...

2017-12-01 Thread millerjeff0
GitHub user millerjeff0 opened a pull request:

https://github.com/apache/lucene-solr/pull/281

Cleaning up class casts and making sure shielded response isn't sent …

…up the filter chain

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/millerjeff0/lucene-solr SOLR-11692

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/281.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #281


commit 4affbece0268a20b9d6fc84699644dc364538f0c
Author: Jeff 
Date:   2017-12-01T23:16:07Z

Cleaning up class casts and making sure shielded response isn't sent up the 
filter chain




---




[jira] [Resolved] (SOLR-11256) Provide default for ConcurrentUpdateSolrClient's "queueSize" param

2017-12-01 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta resolved SOLR-11256.
-
Resolution: Fixed

Thanks [~gerlowskija]

> Provide default for ConcurrentUpdateSolrClient's "queueSize" param
> --
>
> Key: SOLR-11256
> URL: https://issues.apache.org/jira/browse/SOLR-11256
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: master (8.0)
>Reporter: Jason Gerlowski
>Assignee: Anshum Gupta
>Priority: Minor
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11256.patch, SOLR-11256.patch
>
>
> A user on the mailing list recently pointed out that if it's not specified 
> explicitly as a Builder option, ConcurrentUpdateSolrClient will default to 
> using a queueSize of 0.  This value gets passed in to the underlying queue 
> data structure which throws an IllegalArgumentException, with an error 
> message that isn't obvious to those unfamiliar with the internals.
> We should provide a better default than the uninitialized-variable default of 
> 0.  Almost all occurrences in the code fall between 5 and 10, so a queueSize 
> in that range should be uncontroversial.
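The failure mode and the shape of the fix can be sketched with plain JDK types. `LinkedBlockingQueue` really does reject a capacity of 0 with an `IllegalArgumentException` that says nothing about client builders; giving the builder field a non-zero default avoids the trap. `ClientBuilder` below is a hypothetical stand-in for ConcurrentUpdateSolrClient.Builder, not the real SolrJ class.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueSizeDefaultDemo {

  static final class ClientBuilder {
    private int queueSize = 10; // the fix: a sane default instead of 0

    ClientBuilder withQueueSize(int queueSize) {
      this.queueSize = queueSize;
      return this;
    }

    BlockingQueue<Runnable> buildQueue() {
      return new LinkedBlockingQueue<>(queueSize);
    }
  }

  public static void main(String[] args) {
    // Before the fix: an uninitialized queueSize of 0 blows up obscurely.
    try {
      new LinkedBlockingQueue<Runnable>(0);
    } catch (IllegalArgumentException e) {
      System.out.println("capacity 0 rejected");
    }
    // After the fix: omitting the option quietly uses the default.
    BlockingQueue<Runnable> q = new ClientBuilder().buildQueue();
    System.out.println(q.remainingCapacity()); // prints 10
  }
}
```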






[jira] [Commented] (SOLR-11256) Provide default for ConcurrentUpdateSolrClient's "queueSize" param

2017-12-01 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275112#comment-16275112
 ] 

Anshum Gupta commented on SOLR-11256:
-

Sorry, I missed the JIRA# in my commit message on master and so it didn't get 
logged here.
Here's the commit from master: 8c855fa2870ad7ef3cc8450977f6e34b6d902d6b


> Provide default for ConcurrentUpdateSolrClient's "queueSize" param
> --
>
> Key: SOLR-11256
> URL: https://issues.apache.org/jira/browse/SOLR-11256
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: master (8.0)
>Reporter: Jason Gerlowski
>Assignee: Anshum Gupta
>Priority: Minor
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11256.patch, SOLR-11256.patch
>
>
> A user on the mailing list recently pointed out that if it's not specified 
> explicitly as a Builder option, ConcurrentUpdateSolrClient will default to 
> using a queueSize of 0.  This value gets passed in to the underlying queue 
> data structure which throws an IllegalArgumentException, with an error 
> message that isn't obvious to those unfamiliar with the internals.
> We should provide a better default than the uninitialized-variable default of 
> 0.  Almost all occurrences in the code fall between 5 and 10, so a queueSize 
> in that range should be uncontroversial.






[jira] [Updated] (SOLR-11256) Provide default for ConcurrentUpdateSolrClient's "queueSize" param

2017-12-01 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-11256:

Fix Version/s: master (8.0)
   7.2

> Provide default for ConcurrentUpdateSolrClient's "queueSize" param
> --
>
> Key: SOLR-11256
> URL: https://issues.apache.org/jira/browse/SOLR-11256
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: master (8.0)
>Reporter: Jason Gerlowski
>Assignee: Anshum Gupta
>Priority: Minor
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11256.patch, SOLR-11256.patch
>
>
> A user on the mailing list recently pointed out that if it's not specified 
> explicitly as a Builder option, ConcurrentUpdateSolrClient will default to 
> using a queueSize of 0.  This value gets passed in to the underlying queue 
> data structure which throws an IllegalArgumentException, with an error 
> message that isn't obvious to those unfamiliar with the internals.
> We should provide a better default than the uninitialized-variable default of 
> 0.  Almost all occurrences in the code fall between 5 and 10, so a queueSize 
> in that range should be uncontroversial.






[jira] [Commented] (SOLR-11256) Provide default for ConcurrentUpdateSolrClient's "queueSize" param

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275108#comment-16275108
 ] 

ASF subversion and git services commented on SOLR-11256:


Commit d9047125f97b682f7046b649b20cf54afba0225d in lucene-solr's branch 
refs/heads/branch_7x from [~anshumg]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d904712 ]

SOLR-11256: The queue size for ConcurrentUpdateSolrClient should default to 10 
instead of throwing an IllegalArgumentException


> Provide default for ConcurrentUpdateSolrClient's "queueSize" param
> --
>
> Key: SOLR-11256
> URL: https://issues.apache.org/jira/browse/SOLR-11256
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: master (8.0)
>Reporter: Jason Gerlowski
>Assignee: Anshum Gupta
>Priority: Minor
> Attachments: SOLR-11256.patch, SOLR-11256.patch
>
>
> A user on the mailing list recently pointed out that if it's not specified 
> explicitly as a Builder option, ConcurrentUpdateSolrClient will default to 
> using a queueSize of 0.  This value gets passed in to the underlying queue 
> data structure which throws an IllegalArgumentException, with an error 
> message that isn't obvious to those unfamiliar with the internals.
> We should provide a better default than the uninitialized-variable default of 
> 0.  Almost all occurrences in the code fall between 5 and 10, so a queueSize 
> in that range should be uncontroversial.






[jira] [Commented] (SOLR-11412) Documentation changes for SOLR-11003: Bi-directional CDCR support

2017-12-01 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275103#comment-16275103
 ] 

Varun Thacker commented on SOLR-11412:
--

Hi Cassandra,

I am going through the patch right now.

> Documentation changes for SOLR-11003: Bi-directional CDCR support
> -
>
> Key: SOLR-11412
> URL: https://issues.apache.org/jira/browse/SOLR-11412
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: CDCR, documentation
>Reporter: Amrit Sarkar
>Assignee: Varun Thacker
> Attachments: CDCR_bidir.png, SOLR-11412-split.patch, 
> SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, 
> SOLR-11412.patch
>
>
> As SOLR-11003 (bi-directional CDCR scenario support) is reaching its 
> conclusion, the relevant documentation changes need to be made.






[jira] [Commented] (SOLR-11703) Solr Should Send Log Notifications if Ulimits are Incorrect

2017-12-01 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275092#comment-16275092
 ] 

Varun Thacker commented on SOLR-11703:
--

Hi Erick,

Both Noble and David suggest on SOLR-9560 that a warning is a better idea, and 
everyone here also agrees with that approach. Why not just modify the 
description of SOLR-9560 to mention that the intent is to log a warning, 
instead of tracking it on a separate Jira?

> Solr Should Send Log Notifications if Ulimits are Incorrect
> ---
>
> Key: SOLR-11703
> URL: https://issues.apache.org/jira/browse/SOLR-11703
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Kevin Cowan
>Assignee: Erick Erickson
>
> On most Linux instances, the default for 'open files' is set to something 
> entirely too low, e.g. 1024. We have a large number of support tickets that 
> wind up with us having the client increase this number programmatically. 
> It would make sense and save a great deal of support time if the Solr startup 
> script checked these values and either altered them, or at least alerted the 
> user to the fact that they are set too low, which could cause trouble. I am 
> associating just one of many tickets where this is the result.






[jira] [Resolved] (LUCENE-8073) TestBasicModelIn.testRandomScoring failure

2017-12-01 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-8073.

Resolution: Duplicate

Woops thanks [~rcmuir]!

> TestBasicModelIn.testRandomScoring failure
> --
>
> Key: LUCENE-8073
> URL: https://issues.apache.org/jira/browse/LUCENE-8073
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Michael McCandless
>
> I hit this while beasting for another issue:
> {noformat}
>[junit4] Started J0 PID(12925@localhost).
>[junit4] Suite: org.apache.lucene.search.similarities.TestBasicModelIn
>[junit4]   1> 2.90165171E9 = score(DFRSimilarity, doc=0, 
> freq=1.5160105E9), computed from:
>[junit4]   1>   1.93443456E8 = boost
>[junit4]   1>   1.6061459E22 = NormalizationH1, computed from:
>[junit4]   1> 1.5160105E9 = tf
>[junit4]   1> 2.00029978E9 = avgFieldLength
>[junit4]   1> 49176.0 = len
>[junit4]   1>   2.4092188E23 = BasicModelIn, computed from:
>[junit4]   1> 49151.0 = numberOfDocuments
>[junit4]   1> 1.0 = docFreq
>[junit4]   1>   6.226085E-23 = AfterEffectL, computed from:
>[junit4]   1> 1.6061459E22 = tfn
>[junit4]   1>
>[junit4]   1> 2.90165197E9 = score(DFRSimilarity, doc=0, 
> freq=1.5160105E9), computed from:
>[junit4]   1>   1.93443456E8 = boost
>[junit4]   1>   1.4826518E22 = NormalizationH1, computed from:
>[junit4]   1> 1.5160105E9 = tf
>[junit4]   1> 2.00029978E9 = avgFieldLength
>[junit4]   1> 53272.0 = len
>[junit4]   1>   2.2239777E23 = BasicModelIn, computed from:
>[junit4]   1> 49151.0 = numberOfDocuments
>[junit4]   1> 1.0 = docFreq
>[junit4]   1>   6.7446724E-23 = AfterEffectL, computed from:
>[junit4]   1> 1.4826518E22 = tfn
>[junit4]   1>
>[junit4]   1> DFR I(n)L1
>[junit4]   1> 
> field="field",maxDoc=49151,docCount=49151,sumTotalTermFreq=98316735360683,sumDocFreq=49151
>[junit4]   1> term="term",docFreq=1,totalTermFreq=1516010534
>[junit4]   1> norm=133 (doc length ~ 53272)
>[junit4]   1> freq=1.5160105E9
>[junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestBasicModelIn 
> -Dtests.method=testRandomScoring -Dtests.seed=4EBB7FC4E5233EEF 
> -Dtests.locale=da-DK -Dtests.timezone=Africa/Banjul -Dtests.asserts=true -\
> Dtests.file.encoding=ISO-8859-1
>[junit4] FAILURE 1.54s | TestBasicModelIn.testRandomScoring <<<
>[junit4]> Throwable #1: java.lang.AssertionError: 
> score(1.5160105E9,132)=2.90165171E9 < score(1.5160105E9,133)=2.90165197E9
>[junit4]>at __randomizedtesting.SeedInfo.seed([4EBB7FC4E5233EEF:C5242676FF54D8E5]:0)
>[junit4]>at org.apache.lucene.search.similarities.BaseSimilarityTestCase.doTestScoring(BaseSimilarityTestCase.java:423)
>[junit4]>at org.apache.lucene.search.similarities.BaseSimilarityTestCase.testRandomScoring(BaseSimilarityTestCase.java:355)
>[junit4]>at java.lang.Thread.run(Thread.java:745)
>[junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {field=FST50}, docValues:{}, maxPointsInLeafNode=216, 
> maxMBSortInHeap=7.270276664622743, 
> sim=Asserting(org.apache.lucene.search.similarities.Assert\
> ingSimilarity@58a83126), locale=da-DK, timezone=Africa/Banjul
>[junit4]   2> NOTE: Linux 4.4.0-75-generic amd64/Oracle Corporation 
> 1.8.0_121 (64-bit)/cpus=8,threads=1,free=395373056,total=513277952
>[junit4]   2> NOTE: All tests run in this JVM: [TestBasicModelIn]
>[junit4] Completed [1/1 (1!)] in 2.06s, 1 test, 1 failure <<< FAILURES!
>[junit4]
>[junit4]
>[junit4] Tests with failures [seed: 4EBB7FC4E5233EEF]:
>[junit4]   - 
> org.apache.lucene.search.similarities.TestBasicModelIn.testRandomScoring
> {noformat}






[jira] [Commented] (SOLR-11412) Documentation changes for SOLR-11003: Bi-directional CDCR support

2017-12-01 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275082#comment-16275082
 ] 

Cassandra Targett commented on SOLR-11412:
--

[~sarkaramr...@gmail.com], [~varunthacker] - Do either of you have any comments 
about the latest patch?

> Documentation changes for SOLR-11003: Bi-directional CDCR support
> -
>
> Key: SOLR-11412
> URL: https://issues.apache.org/jira/browse/SOLR-11412
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: CDCR, documentation
>Reporter: Amrit Sarkar
>Assignee: Varun Thacker
> Attachments: CDCR_bidir.png, SOLR-11412-split.patch, 
> SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, 
> SOLR-11412.patch
>
>
> As SOLR-11003 (bi-directional CDCR scenario support) is reaching its 
> conclusion, the relevant documentation changes need to be made.






[jira] [Commented] (SOLR-11646) Ref Guide: Update API examples to include v2 style examples

2017-12-01 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275075#comment-16275075
 ] 

Cassandra Targett commented on SOLR-11646:
--

The commits in the earlier comments add V2 API examples to a chunk of pages, 
specifically:

|| Page || Old API endpoint ||
|adding-custom-plugins-in-solrcloud-mode.adoc | //config |
|basic-authentication-plugin.adoc | /admin/authentication |
|blob-store-api.adoc | /admin/collections |
|config-sets.adoc | /admin/cores |
|content-streams.adoc | //config |
|getting-started-with-solrcloud.adoc | //config |
|major-changes-in-solr-7.adoc | //config |
|realtime-get.adoc  | //get or //get |
|requestdispatcher-in-solrconfig.adoc | //config |
|running-solr-on-hdfs.adoc | /admin/collections |
|transforming-and-indexing-custom-json.adoc | //config |

The model for these is to use the tab layout approach we added in SOLR-11584. 
In the HTML pages, these will appear as tabs for users to click on to select 
the example; in the PDF these will appear sequentially with the same labels 
that appear in the HTML as the tab labels. A good example of how this 
looks/works is this page: 
https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-master/javadoc/transforming-and-indexing-custom-json.html

In addition to the above list, I made a few other changes to other pages, 
notably {{about-this-guide.adoc}}, to which I added a section to describe what 
"V1" and "V2" mean, with a link to further context for the V2 approach. Other 
changes were mostly typos I noticed as I went along.

Steve Rowe pointed out to me privately that calling the "old" way of doing API 
calls "V1" is really a misnomer, but when I started it I couldn't figure out 
what to call them that was accurate and would be meaningful to people, so in 
that context it might be the best we can do. He also thought it may be 
confusing to call the "new" way "V2" since the path to the new endpoints 
doesn't include "v2", but again, that's what we call it so it's the most apt 
name I've come up with so far. If someone has a compelling suggestion for an 
alternative I could pretty easily fix the labels so this doesn't have to be set 
in stone yet.

I discovered a few pages on my original list that don't really need to be updated:

|| Page || Old API endpoint || Notes ||
|configuring-logging.adoc | /admin/info/* | Doesn't accept POST requests to modify |
|filter-descriptions.adoc | //schema | POST 'schema/analysis' endpoint not supported in v2 |
|managed-resources.adoc | //schema | POST 'schema/analysis' endpoint not supported in v2 |
|collections-core-admin.adoc | /admin/collections | False positive example |
|learning-to-rank.adoc | //schema | Doesn't support v2 yet |

I probably (quite likely) will not have time to finish the rest of the pages 
before the 7.2 Guide needs to be published, but it wasn't ever my intention to do 
them all at once anyway (I said it would be a series of commits). I left myself 
all the hardest ones yet to do. The remaining list of pages to review/update:

|| Page || Old API endpoint ||
|blob-store-api.adoc | /.system/blob
|collections-api.adoc | /admin/collections |
|config-api.adoc | //config |
|configsets-api.adoc | /admin/configs |
|configuring-solrconfig-xml.adoc | /admin/collections |
|coreadmin-api.adoc | /admin/cores |
|enabling-ssl.adoc | /admin/collections |
|implicit-requesthandlers.adoc | //config |
|making-and-restoring-backups.adoc | /admin/cores |
|other-parsers.adoc | //update |
|request-parameters-api.adoc | //config |
|rule-based-authorization-plugin.adoc | /admin/authorization |
|rule-based-authorization-plugin.adoc | /admin/collections |
|schema-api.adoc | //schema |
|schemaless-mode.adoc | //config |
|schemaless-mode.adoc | //schema |
|schemaless-mode.adoc | //update |
|solr-tutorial.adoc | //config |
|solr-tutorial.adoc | //schema |
|solr-tutorial.adoc | /admin/collections |
|solrcloud-autoscaling-api.adoc | /autoscaling/* |
|solrcloud-autoscaling-auto-add-replicas.adoc | /admin/collections |
|solrcloud-autoscaling-fault-tolerance.adoc | /autoscaling/* |
|solrcloud-autoscaling-overview.adoc | /admin/collections |
|solrcloud-autoscaling-overview.adoc | /autoscaling/* |
|updating-parts-of-documents.adoc | //update |
|uploading-data-with-index-handlers.adoc | //update |

Again, lots of duplicate pages there because I wanted to list out the endpoints 
that need to be changed. I also missed something in blob-store-api.adoc that I 
figured out after I'd started my commit, but I'll swing back and do another 
batch when I have a stretch of time to work on it.

> Ref Guide: Update API examples to include v2 style examples
> ---
>
> Key: SOLR-11646
> URL: https://issues.apache.org/jira/browse/SOLR-11646
> Project: Solr
>  Issue Type: Improvement
>  Security L

[jira] [Commented] (SOLR-11662) Make overlapping query term scoring configurable per field type

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275057#comment-16275057
 ] 

ASF GitHub Bot commented on SOLR-11662:
---

Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154456181
  
--- Diff: 
solr/core/src/test/org/apache/solr/search/TestSolrQueryParser.java ---
@@ -1057,7 +1057,35 @@ public void testShingleQueries() throws Exception {
 , "/response/numFound==1"
 );
   }
-  
+
+
+  public void testSynonymQueryStyle() throws Exception {
+ModifiableSolrParams edismaxParams = params("qf", "t_pick_best_foo");
+
+QParser qParser = QParser.getParser("tabby", "edismax", 
req(edismaxParams));
--- End diff --

Why not the default/lucene query parser?  That's what TestSolrQueryParser 
tests.


> Make overlapping query term scoring configurable per field type
> ---
>
> Key: SOLR-11662
> URL: https://issues.apache.org/jira/browse/SOLR-11662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Doug Turnbull
> Fix For: 7.2, master (8.0)
>
>
> This patch customizes the query-time behavior when query terms overlap 
> positions. Right now the only option is SynonymQuery. This is a fantastic 
> default & improvement on past versions. However, there are use cases where 
> terms overlap positions but don't carry exact synonymy relationships. Often 
> synonyms are actually used to model hypernym/hyponym relationships using 
> synonyms (or other analyzers). So the individual term scores matter, with 
> terms with higher specificity (hyponym) scoring higher than terms with lower 
> specificity (hypernym).
> This patch adds the fieldType setting scoreOverlaps, as in:
> {code:java}
>class="solr.TextField" positionIncrementGap="100" multiValued="true">
> {code}
> Valid values for scoreOverlaps are:
> *as_one_term*
> Default; fits most synonym use cases. Uses SynonymQuery, which treats all 
> terms as if they're exactly equivalent, with document frequencies from the 
> underlying terms blended.
> *pick_best*
> For a given document, score using the best-scoring synonym (i.e. dismax over 
> the generated terms).
> Useful when synonyms are not exactly equivalent; instead they are used to 
> model hypernym/hyponym relationships, such as expanding to synonyms where 
> term scores will reflect that quality.
> E.g. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text", generates the dismax (text:tabby | text:cat | text:animal)
> *as_distinct_terms*
> (The pre 6.0 behavior.)
> Compromise between pick_best and as_one_term.
> Appropriate when synonyms reflect a hypernym/hyponym relationship but scores 
> should stack, so the more tabby, cat, or animal a document contains the 
> better, with a bias towards the term with the highest specificity.
> Terms are turned into a boolean OR query, with document frequencies not blended.
> E.g. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text", generates the boolean query (text:tabby  text:cat 
> text:animal)






[jira] [Commented] (SOLR-11662) Make overlapping query term scoring configurable per field type

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275060#comment-16275060
 ] 

ASF GitHub Bot commented on SOLR-11662:
---

Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154456952
  
--- Diff: 
solr/core/src/test/org/apache/solr/search/TestSolrQueryParser.java ---
@@ -1057,7 +1057,35 @@ public void testShingleQueries() throws Exception {
 , "/response/numFound==1"
 );
   }
-  
+
+
+  public void testSynonymQueryStyle() throws Exception {
+ModifiableSolrParams edismaxParams = params("qf", "t_pick_best_foo");
--- End diff --

Just a minor point here but you needn't have a SolrParams based variable; 
you could simply inline it at each invocation.  This makes it easier to read 
each test request.  If you were trying to share some common params across test 
invocations then I could understand.


> Make overlapping query term scoring configurable per field type
> ---
>
> Key: SOLR-11662
> URL: https://issues.apache.org/jira/browse/SOLR-11662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Doug Turnbull
> Fix For: 7.2, master (8.0)
>
>
> This patch customizes the query-time behavior when query terms overlap 
> positions. Right now the only option is SynonymQuery. This is a fantastic 
> default & improvement on past versions. However, there are use cases where 
> terms overlap positions but don't carry exact synonymy relationships. Often 
> synonyms are actually used to model hypernym/hyponym relationships using 
> synonyms (or other analyzers). So the individual term scores matter, with 
> terms with higher specificity (hyponym) scoring higher than terms with lower 
> specificity (hypernym).
> This patch adds the fieldType setting scoreOverlaps, as in:
> {code:java}
>class="solr.TextField" positionIncrementGap="100" multiValued="true">
> {code}
> Valid values for scoreOverlaps are:
> *as_one_term*
> Default; fits most synonym use cases. Uses SynonymQuery, which treats all 
> terms as if they're exactly equivalent, with document frequencies from the 
> underlying terms blended.
> *pick_best*
> For a given document, score using the best-scoring synonym (i.e. dismax over 
> the generated terms).
> Useful when synonyms are not exactly equivalent; instead they are used to 
> model hypernym/hyponym relationships, such as expanding to synonyms where 
> term scores will reflect that quality.
> E.g. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text", generates the dismax (text:tabby | text:cat | text:animal)
> *as_distinct_terms*
> (The pre 6.0 behavior.)
> Compromise between pick_best and as_one_term.
> Appropriate when synonyms reflect a hypernym/hyponym relationship but scores 
> should stack, so the more tabby, cat, or animal a document contains the 
> better, with a bias towards the term with the highest specificity.
> Terms are turned into a boolean OR query, with document frequencies not blended.
> E.g. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text", generates the boolean query (text:tabby  text:cat 
> text:animal)






[jira] [Commented] (SOLR-11662) Make overlapping query term scoring configurable per field type

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275058#comment-16275058
 ] 

ASF GitHub Bot commented on SOLR-11662:
---

Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154457666
  
--- Diff: 
solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java ---
@@ -539,6 +591,27 @@ protected Query newRegexpQuery(Term regexp) {
 return query;
   }
 
+  @Override
+  protected Query newSynonymQuery(Term terms[]) {
+    switch (synonymQueryStyle) {
+      case PICK_BEST:
+        List<Query> currPosnClauses = new ArrayList<>(terms.length);
+        for (Term term : terms) {
+          currPosnClauses.add(newTermQuery(term));
+        }
+        DisjunctionMaxQuery dm = new DisjunctionMaxQuery(currPosnClauses, 0.0f);
+        return dm;
+      case AS_DISTINCT_TERMS:
+        BooleanQuery.Builder builder = new BooleanQuery.Builder();
+        for (Term term : terms) {
+          builder.add(newTermQuery(term), BooleanClause.Occur.SHOULD);
+        }
+        return builder.build();
+      default:
--- End diff --

What I meant to say in my previous review here is that you would have a 
case statement for AS_SAME_TERM and then to satisfy Java, add a default that 
throws an assertion error.  This way we see all 3 enum vals with their own 
case, which I think is easier to understand/maintain.  Oh, are you doing 
this to handle "null"?  Hmm. Maybe put the case immediately before your current 
"default"?  Or prevent null in the first place?  Either I guess... nulls are 
unfortunate; I like to avoid them.  Notice TextField has primitives for some of 
its other settings; it'd be nice if likewise we had a non-null value for 
TextField.synonymQueryStyle.
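The shape being suggested can be sketched with plain-Java stand-ins (hypothetical names; the real method returns Lucene Query objects rather than strings):

```java
// Sketch of the review suggestion: one case per enum constant, plus a
// default that throws, so a future enum value can't be silently ignored.
enum Style { AS_SAME_TERM, PICK_BEST, AS_DISTINCT_TERMS }

class StyleSwitch {
    static String queryFor(Style style) {
        switch (style) {
            case PICK_BEST:         return "dismax";
            case AS_DISTINCT_TERMS: return "boolean OR";
            case AS_SAME_TERM:      return "SynonymQuery";
            default: throw new AssertionError("unhandled style: " + style);
        }
    }
}
```

With all three constants handled explicitly, adding a fourth constant later fails fast at the throwing default instead of silently falling into one branch.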


> Make overlapping query term scoring configurable per field type
> ---
>
> Key: SOLR-11662
> URL: https://issues.apache.org/jira/browse/SOLR-11662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Doug Turnbull
> Fix For: 7.2, master (8.0)
>
>
> This patch customizes the query-time behavior when query terms overlap 
> positions. Right now the only option is SynonymQuery. This is a fantastic 
> default & improvement on past versions. However, there are use cases where 
> terms overlap positions but don't carry exact synonymy relationships. Often 
> synonyms are actually used to model hypernym/hyponym relationships using 
> synonyms (or other analyzers). So the individual term scores matter, with 
> terms with higher specificity (hyponym) scoring higher than terms with lower 
> specificity (hypernym).
> This patch adds the fieldType setting scoreOverlaps, as in:
> {code:java}
> <fieldType scoreOverlaps="pick_best" class="solr.TextField" positionIncrementGap="100" multiValued="true">
> {code}
> Valid values for scoreOverlaps are:
> *as_one_term*
> Default, most synonym use cases. Uses SynonymQuery.
> Treats all terms as if they're exactly equivalent, with document frequency 
> from underlying terms blended.
> *pick_best*
> For a given document, score using the best-scoring synonym (i.e. dismax over 
> generated terms). 
> Useful when synonyms are not exactly equivalent, but are instead used to model 
> hypernym/hyponym relationships, such as expansions where term 
> scores should reflect that quality.
> I.e. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text" generates the dismax (text:tabby | text:cat | text:animal)
> *as_distinct_terms*
> (The pre-6.0 behavior.)
> Compromise between pick_best and as_one_term.
> Appropriate when synonyms reflect a hypernym/hyponym relationship, but lets 
> scores stack, so documents with more of tabby, cat, or animal score better, with a 
> bias towards the term with the highest specificity.
> Terms are turned into a boolean OR query, with document frequencies not blended.
> I.e. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text" generates the boolean query (text:tabby text:cat 
> text:animal)






[jira] [Commented] (SOLR-11662) Make overlapping query term scoring configurable per field type

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275059#comment-16275059
 ] 

ASF GitHub Bot commented on SOLR-11662:
---

Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154458145
  
--- Diff: 
solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java ---
@@ -78,6 +81,39 @@
   static final int MOD_NOT = 10;
   static final int MOD_REQ = 11;
 
+  protected SynonymQueryStyle synonymQueryStyle = AS_SAME_TERM;
+
+  /**
+   *  Query strategy when analyzed query terms overlap the same position
+   *  (i.e. synonyms); consider if pants and khakis are query-time synonyms
+   *
+   *  <li>{@link #AS_SAME_TERM}</li>
+   *  <li>{@link #PICK_BEST}</li>
+   *  <li>{@link #AS_DISTINCT_TERMS}</li>
+   */
+  public static enum SynonymQueryStyle {
--- End diff --

I like the new name, and thanks for improving the javadocs.  BTW that "li" 
HTML list is missing the "ul" wrapper.  Or better IMO: simply drop this 
list; it has no value I think.


> Make overlapping query term scoring configurable per field type
> ---
>
> Key: SOLR-11662
> URL: https://issues.apache.org/jira/browse/SOLR-11662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Doug Turnbull
> Fix For: 7.2, master (8.0)
>
>
> This patch customizes the query-time behavior when query terms overlap 
> positions. Right now the only option is SynonymQuery. This is a fantastic 
> default & improvement on past versions. However, there are use cases where 
> terms overlap positions but don't carry exact synonymy relationships. Often 
> synonyms are actually used to model hypernym/hyponym relationships using 
> synonyms (or other analyzers). So the individual term scores matter, with 
> terms with higher specificity (hyponym) scoring higher than terms with lower 
> specificity (hypernym).
> This patch adds the fieldType setting scoreOverlaps, as in:
> {code:java}
> <fieldType scoreOverlaps="pick_best" class="solr.TextField" positionIncrementGap="100" multiValued="true">
> {code}
> Valid values for scoreOverlaps are:
> *as_one_term*
> Default, most synonym use cases. Uses SynonymQuery.
> Treats all terms as if they're exactly equivalent, with document frequency 
> from underlying terms blended.
> *pick_best*
> For a given document, score using the best-scoring synonym (i.e. dismax over 
> generated terms). 
> Useful when synonyms are not exactly equivalent, but are instead used to model 
> hypernym/hyponym relationships, such as expansions where term 
> scores should reflect that quality.
> I.e. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text" generates the dismax (text:tabby | text:cat | text:animal)
> *as_distinct_terms*
> (The pre-6.0 behavior.)
> Compromise between pick_best and as_one_term.
> Appropriate when synonyms reflect a hypernym/hyponym relationship, but lets 
> scores stack, so documents with more of tabby, cat, or animal score better, with a 
> bias towards the term with the highest specificity.
> Terms are turned into a boolean OR query, with document frequencies not blended.
> I.e. this query-time expansion
> tabby => tabby, cat, animal
> Searching "text" generates the boolean query (text:tabby text:cat 
> text:animal)






[GitHub] lucene-solr pull request #275: SOLR-11662: Configurable query when terms ove...

2017-12-01 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154457666
  
--- Diff: 
solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java ---
@@ -539,6 +591,27 @@ protected Query newRegexpQuery(Term regexp) {
 return query;
   }
 
+  @Override
+  protected Query newSynonymQuery(Term terms[]) {
+    switch (synonymQueryStyle) {
+      case PICK_BEST:
+        List<Query> currPosnClauses = new ArrayList<>(terms.length);
+        for (Term term : terms) {
+          currPosnClauses.add(newTermQuery(term));
+        }
+        DisjunctionMaxQuery dm = new DisjunctionMaxQuery(currPosnClauses, 0.0f);
+        return dm;
+      case AS_DISTINCT_TERMS:
+        BooleanQuery.Builder builder = new BooleanQuery.Builder();
+        for (Term term : terms) {
+          builder.add(newTermQuery(term), BooleanClause.Occur.SHOULD);
+        }
+        return builder.build();
+      default:
--- End diff --

What I meant to say in my previous review here is that you would have a 
case statement for AS_SAME_TERM and then to satisfy Java, add a default that 
throws an assertion error.  This way we see all 3 enum vals with their own 
case, which I think is easier to understand/maintain.  Oh, are you doing 
this to handle "null"?  Hmm. Maybe put the case immediately before your current 
"default"?  Or prevent null in the first place?  Either I guess... nulls are 
unfortunate; I like to avoid them.  Notice TextField has primitives for some of 
its other settings; it'd be nice if likewise we had a non-null value for 
TextField.synonymQueryStyle.


---




[GitHub] lucene-solr pull request #275: SOLR-11662: Configurable query when terms ove...

2017-12-01 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154458145
  
--- Diff: 
solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java ---
@@ -78,6 +81,39 @@
   static final int MOD_NOT = 10;
   static final int MOD_REQ = 11;
 
+  protected SynonymQueryStyle synonymQueryStyle = AS_SAME_TERM;
+
+  /**
+   *  Query strategy when analyzed query terms overlap the same position
+   *  (i.e. synonyms); consider if pants and khakis are query-time synonyms
+   *
+   *  <li>{@link #AS_SAME_TERM}</li>
+   *  <li>{@link #PICK_BEST}</li>
+   *  <li>{@link #AS_DISTINCT_TERMS}</li>
+   */
+  public static enum SynonymQueryStyle {
--- End diff --

I like the new name, and thanks for improving the javadocs.  BTW that "li" 
HTML list is missing the "ul" wrapper.  Or better IMO: simply drop this 
list; it has no value I think.


---




[GitHub] lucene-solr pull request #275: SOLR-11662: Configurable query when terms ove...

2017-12-01 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154456952
  
--- Diff: 
solr/core/src/test/org/apache/solr/search/TestSolrQueryParser.java ---
@@ -1057,7 +1057,35 @@ public void testShingleQueries() throws Exception {
 , "/response/numFound==1"
 );
   }
-  
+
+
+  public void testSynonymQueryStyle() throws Exception {
+ModifiableSolrParams edismaxParams = params("qf", "t_pick_best_foo");
--- End diff --

Just a minor point here but you needn't have a SolrParams based variable; 
you could simply inline it at each invocation.  This makes it easier to read 
each test request.  If you were trying to share some common params across test 
invocations then I could understand.


---




[GitHub] lucene-solr pull request #275: SOLR-11662: Configurable query when terms ove...

2017-12-01 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/275#discussion_r154456181
  
--- Diff: 
solr/core/src/test/org/apache/solr/search/TestSolrQueryParser.java ---
@@ -1057,7 +1057,35 @@ public void testShingleQueries() throws Exception {
 , "/response/numFound==1"
 );
   }
-  
+
+
+  public void testSynonymQueryStyle() throws Exception {
+ModifiableSolrParams edismaxParams = params("qf", "t_pick_best_foo");
+
+QParser qParser = QParser.getParser("tabby", "edismax", 
req(edismaxParams));
--- End diff --

Why not the default/lucene query parser?  That's what TestSolrQueryParser 
tests.


---




[jira] [Commented] (SOLR-11646) Ref Guide: Update API examples to include v2 style examples

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275052#comment-16275052
 ] 

ASF subversion and git services commented on SOLR-11646:


Commit 4fafbccdae412e73a85d1e2e63b0c40ff923a058 in lucene-solr's branch 
refs/heads/branch_7x from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4fafbcc ]

SOLR-11646: Add v2 API examples to several pages; add context to 
about-this-guide.adoc; tweak CSS to make tabs more obvious


> Ref Guide: Update API examples to include v2 style examples
> ---
>
> Key: SOLR-11646
> URL: https://issues.apache.org/jira/browse/SOLR-11646
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation, v2 API
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>
> The Ref Guide currently only has a single page with what might be generously 
> called an overview of the v2 API added in 6.5 
> (https://lucene.apache.org/solr/guide/v2-api.html) but most of the actual 
> APIs that support the v2 approach do not show an example of using it with the 
> v2 style. A few v2-style APIs are already used as examples, but there's 
> nothing consistent.
> With this issue I'll add API input/output examples throughout the Guide. Just 
> in terms of process, my intention is to have a series of commits to the pages 
> as I work through them so we make incremental progress. I'll start by adding 
> a list of pages/APIs to this issue so the scope of the work is clear.
> Once this is done we can figure out what to do with the V2 API page itself - 
> perhaps it gets archived and replaced with another page that describes Solr's 
> APIs overall; perhaps by then we figure out something else to do with it.






[jira] [Commented] (SOLR-11646) Ref Guide: Update API examples to include v2 style examples

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275050#comment-16275050
 ] 

ASF subversion and git services commented on SOLR-11646:


Commit f2dd3c5f853c36b64607a84c0f6e9572319643db in lucene-solr's branch 
refs/heads/master from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f2dd3c5 ]

SOLR-11646: Add v2 API examples to several pages; add context to 
about-this-guide.adoc; tweak CSS to make tabs more obvious


> Ref Guide: Update API examples to include v2 style examples
> ---
>
> Key: SOLR-11646
> URL: https://issues.apache.org/jira/browse/SOLR-11646
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation, v2 API
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>
> The Ref Guide currently only has a single page with what might be generously 
> called an overview of the v2 API added in 6.5 
> (https://lucene.apache.org/solr/guide/v2-api.html) but most of the actual 
> APIs that support the v2 approach do not show an example of using it with the 
> v2 style. A few v2-style APIs are already used as examples, but there's 
> nothing consistent.
> With this issue I'll add API input/output examples throughout the Guide. Just 
> in terms of process, my intention is to have a series of commits to the pages 
> as I work through them so we make incremental progress. I'll start by adding 
> a list of pages/APIs to this issue so the scope of the work is clear.
> Once this is done we can figure out what to do with the V2 API page itself - 
> perhaps it gets archived and replaced with another page that describes Solr's 
> APIs overall; perhaps by then we figure out something else to do with it.






[jira] [Commented] (SOLR-11692) SolrDispatchFilter.closeShield passes the shielded response object back to jetty making the stream unclose able

2017-12-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275021#comment-16275021
 ] 

David Smiley commented on SOLR-11692:
-

+1 I like it Jeff.

The only change I suggest making is a little bit of maintenance here regarding 
the annoying casting of ServletRequest to HttpServletRequest in several places 
in this method.  You've added another spot.  Notice the first line of this 
method ensures we have the HTTP version.  Perhaps we can rename the parameter 
slightly and then cast to the current name in a local variable... then change 
various methods here to expect/return HttpServletRequest.  What do you think?  
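
The refactor being suggested can be sketched with simplified stand-in interfaces (javax.servlet types are not reproduced here; names like `DispatchSketch` are hypothetical):

```java
// Sketch of the suggestion: accept the general type in the parameter,
// verify and cast it once to the HTTP type in a local variable, and have
// every downstream helper expect HttpServletRequest directly.
// The two interfaces below are simplified stand-ins for the real
// javax.servlet types.
interface ServletRequest {}
interface HttpServletRequest extends ServletRequest {
    String getMethod();
}

class DispatchSketch {
    static String doFilter(ServletRequest _request) {
        if (!(_request instanceof HttpServletRequest)) {
            throw new IllegalStateException("non-HTTP request");
        }
        HttpServletRequest request = (HttpServletRequest) _request; // cast once
        return handle(request); // helpers below take the HTTP type
    }

    static String handle(HttpServletRequest request) {
        return request.getMethod();
    }
}
```

Casting once up front removes the repeated `(HttpServletRequest)` casts scattered through the method body.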

> SolrDispatchFilter.closeShield passes the shielded response object back to 
> jetty making the stream unclose able
> ---
>
> Key: SOLR-11692
> URL: https://issues.apache.org/jira/browse/SOLR-11692
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Response Writers
>Affects Versions: 7.1
> Environment: Linux/Mac tested
>Reporter: Jeff Miller
>Priority: Minor
>  Labels: dispatchlayer, jetty, newbie, streams
> Attachments: SOLR-11692.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> In test mode we trigger closeShield code in SolrDispatchFilter, however there 
> are code paths where we passthrough the objects to the DefaultHandler which 
> can no longer close the response.
> Example stack trace:
> java.lang.AssertionError: Attempted close of response output stream.
> at 
> org.apache.solr.servlet.SolrDispatchFilter$2$1.close(SolrDispatchFilter.java:528)
> at org.eclipse.jetty.server.Dispatcher.commitResponse(Dispatcher.java:315)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:279)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:103)
> at org.eclipse.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:566)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:385)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
> at 
> searchserver.filter.SfdcDispatchFilter.doFilter(SfdcDispatchFilter.java:204)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> at org.eclipse.jetty.server.Server.handle(Server.java:370)
> at 
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> at 
> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
> at 
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
> at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
> at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> at 
> org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
> at 
> org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
> at 
> org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> at 
>

[jira] [Comment Edited] (SOLR-11622) Bundled mime4j library not sufficient for Tika requirement

2017-12-01 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274994#comment-16274994
 ] 

Tim Allison edited comment on SOLR-11622 at 12/1/17 9:17 PM:
-

There's still a clash with jdom triggered by rss files and rometools

{noformat}
Exception in thread "Thread-21" java.lang.NoClassDefFoundError: 
org/jdom2/input/JDOMParseException
at com.rometools.rome.io.SyndFeedInput.(SyndFeedInput.java:63)
at com.rometools.rome.io.SyndFeedInput.(SyndFeedInput.java:51)
{noformat}

I'm confirming that should be bumped to 2.0.4.




was (Author: talli...@mitre.org):
There's still a clash with jdom triggered by rss files and rometools

{noformat]
Exception in thread "Thread-21" java.lang.NoClassDefFoundError: 
org/jdom2/input/JDOMParseException
at com.rometools.rome.io.SyndFeedInput.(SyndFeedInput.java:63)
at com.rometools.rome.io.SyndFeedInput.(SyndFeedInput.java:51)
{noformat}

I'm confirming that should be bumped to 2.0.4.



> Bundled mime4j library not sufficient for Tika requirement
> --
>
> Key: SOLR-11622
> URL: https://issues.apache.org/jira/browse/SOLR-11622
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build
>Affects Versions: 7.1, 6.6.2
>Reporter: Karim Malhas
>Assignee: Karthik Ramachandran
>Priority: Minor
>  Labels: build
> Attachments: SOLR-11622.patch
>
>
> The version 7.2 of Apache James Mime4j bundled with the Solr binary releases 
> does not match what is required by Apache Tika for parsing rfc2822 messages. 
> The master branch for james-mime4j seems to contain the missing Builder class
> [https://github.com/apache/james-mime4j/blob/master/core/src/main/java/org/apache/james/mime4j/stream/MimeConfig.java
> ]
> This prevents import of rfc2822 formatted messages. For example like so:
> {{./bin/post -c dovecot -type 'message/rfc822' 'testdata/email_01.txt'
> }}
> And results in the following stacktrace:
> java.lang.NoClassDefFoundError: 
> org/apache/james/mime4j/stream/MimeConfig$Builder
> at 
> org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:63)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> at 
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
> at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   

[jira] [Commented] (SOLR-11622) Bundled mime4j library not sufficient for Tika requirement

2017-12-01 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274994#comment-16274994
 ] 

Tim Allison commented on SOLR-11622:


There's still a clash with jdom triggered by rss files and rometools

{noformat}
Exception in thread "Thread-21" java.lang.NoClassDefFoundError: 
org/jdom2/input/JDOMParseException
at com.rometools.rome.io.SyndFeedInput.(SyndFeedInput.java:63)
at com.rometools.rome.io.SyndFeedInput.(SyndFeedInput.java:51)
{noformat}

I'm confirming that should be bumped to 2.0.4.



> Bundled mime4j library not sufficient for Tika requirement
> --
>
> Key: SOLR-11622
> URL: https://issues.apache.org/jira/browse/SOLR-11622
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build
>Affects Versions: 7.1, 6.6.2
>Reporter: Karim Malhas
>Assignee: Karthik Ramachandran
>Priority: Minor
>  Labels: build
> Attachments: SOLR-11622.patch
>
>
> The version 7.2 of Apache James Mime4j bundled with the Solr binary releases 
> does not match what is required by Apache Tika for parsing rfc2822 messages. 
> The master branch for james-mime4j seems to contain the missing Builder class
> [https://github.com/apache/james-mime4j/blob/master/core/src/main/java/org/apache/james/mime4j/stream/MimeConfig.java
> ]
> This prevents import of rfc2822 formatted messages. For example like so:
> {{./bin/post -c dovecot -type 'message/rfc822' 'testdata/email_01.txt'
> }}
> And results in the following stacktrace:
> java.lang.NoClassDefFoundError: 
> org/apache/james/mime4j/stream/MimeConfig$Builder
> at 
> org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:63)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> at 
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
> at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at 
> org.

[jira] [Commented] (SOLR-9560) Solr should check max open files and other ulimits and refuse to start if they are set too low

2017-12-01 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274958#comment-16274958
 ] 

Erick Erickson commented on SOLR-9560:
--

[~shalinmangar] Hmm, on a quick search I don't see a good way to get the ulimit 
information in Java. So are you suggesting that the script do the ulimit (or 
whatever) and pass the variables to the Java executable?

If that's the case, I don't see how that would work with an API. What am I 
missing here?
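
For the open-files limit specifically there is a partial, JDK-internal answer via `com.sun.management` (Unix JVMs only); the other limits in the issue description (max memory size, virtual memory) have no standard Java API, so a script-side check may indeed be needed. A hedged sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Partial in-JVM check: on Unix JVMs the OS MXBean exposes the maximum
// open file descriptor count (the `ulimit -n` value). This is a
// com.sun.management extension, not a standard API, and covers only one
// of the limits being discussed.
class UlimitCheck {
    static long maxOpenFiles() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1; // unknown platform (e.g. Windows)
    }

    static boolean openFilesLimitOk(long minimum) {
        long max = maxOpenFiles();
        return max < 0 || max >= minimum; // unknown doesn't block startup
    }
}
```

This would let a startup check refuse to proceed when `openFilesLimitOk(32768)` is false, while still leaving the memory limits to the shell script.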

> Solr should check max open files and other ulimits and refuse to start if 
> they are set too low
> --
>
> Key: SOLR-9560
> URL: https://issues.apache.org/jira/browse/SOLR-9560
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Shalin Shekhar Mangar
>  Labels: newdev
> Fix For: 6.7, 7.0
>
> Attachments: SOLR-9560.patch
>
>
> Solr should check max open files and other ulimits and refuse to start if 
> they are set too low. Specifically:
> # max open files should be at least 32768
> # max memory size and virtual memory should both be unlimited






[jira] [Updated] (SOLR-11256) Provide default for ConcurrentUpdateSolrClient's "queueSize" param

2017-12-01 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-11256:

Attachment: SOLR-11256.patch

Thanks [~gerlowskija].
Here's an updated patch with a test to make sure that the only required 
parameter for the CUSC builder is the baseSolrUrl.
I'll commit this once the tests pass.

> Provide default for ConcurrentUpdateSolrClient's "queueSize" param
> --
>
> Key: SOLR-11256
> URL: https://issues.apache.org/jira/browse/SOLR-11256
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: master (8.0)
>Reporter: Jason Gerlowski
>Assignee: Anshum Gupta
>Priority: Minor
> Attachments: SOLR-11256.patch, SOLR-11256.patch
>
>
> A user on the mailing list recently pointed out that if it's not specified 
> explicitly as a Builder option, ConcurrentUpdateSolrClient will default to 
> using a queueSize of 0. This value gets passed to the underlying queue 
> data structure, which throws an IllegalArgumentException with an error 
> message that isn't obvious to those unfamiliar with the internals.
> We should provide a better default than the uninitialized-variable default of 
> 0.  Almost all occurrences in the code fall between 5 and 10, so a queueSize 
> in that range should be uncontroversial.
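The failure mode described can be reproduced with the JDK queues directly (the exact queue class used inside ConcurrentUpdateSolrClient may differ; this only illustrates why a capacity of 0 blows up):

```java
import java.util.concurrent.LinkedBlockingQueue;

public class QueueSizeDemo {

    // Returns the failure description produced by constructing a bounded
    // queue with the given capacity, or null if construction succeeds.
    static String tryCapacity(int capacity) {
        try {
            new LinkedBlockingQueue<Object>(capacity);
            return null;
        } catch (IllegalArgumentException e) {
            // JDK bounded queues require capacity > 0.
            return "IllegalArgumentException for capacity " + capacity;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCapacity(0));   // fails, like an unset queueSize
        System.out.println(tryCapacity(10));  // succeeds, prints null
    }
}
```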






[jira] [Assigned] (SOLR-11256) Provide default for ConcurrentUpdateSolrClient's "queueSize" param

2017-12-01 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta reassigned SOLR-11256:
---

Assignee: Anshum Gupta

> Provide default for ConcurrentUpdateSolrClient's "queueSize" param
> --
>
> Key: SOLR-11256
> URL: https://issues.apache.org/jira/browse/SOLR-11256
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: master (8.0)
>Reporter: Jason Gerlowski
>Assignee: Anshum Gupta
>Priority: Minor
> Attachments: SOLR-11256.patch
>
>
> A user on the mailing list recently pointed out that if it's not specified 
> explicitly as a Builder option, ConcurrentUpdateSolrClient will default to 
> using a queueSize of 0. This value gets passed to the underlying queue 
> data structure, which throws an IllegalArgumentException with an error 
> message that isn't obvious to those unfamiliar with the internals.
> We should provide a better default than the uninitialized-variable default of 
> 0.  Almost all occurrences in the code fall between 5 and 10, so a queueSize 
> in that range should be uncontroversial.






[jira] [Reopened] (SOLR-11703) Solr Should Send Log Notifications if Ulimits are Incorrect

2017-12-01 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reopened SOLR-11703:
---

Reopening as SOLR-9560 hasn't gone anywhere. Let's reconcile the two.

> Solr Should Send Log Notifications if Ulimits are Incorrect
> ---
>
> Key: SOLR-11703
> URL: https://issues.apache.org/jira/browse/SOLR-11703
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Kevin Cowan
>Assignee: Erick Erickson
>
> On most Linux instances, the default for 'open files' is set to something 
> entirely too low, e.g. 1024. We have a large number of support tickets that 
> wind up with us having the client increase this number... programmatically. 
> It would make sense and save a great deal of support time if the Solr 
> startup script checked these values and either altered them or at least 
> alerted the user that they are set too low, which could cause trouble. I 
> am associating just one of many tickets where this is the result.






[jira] [Assigned] (SOLR-11703) Solr Should Send Log Notifications if Ulimits are Incorrect

2017-12-01 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-11703:
-

Assignee: Erick Erickson

> Solr Should Send Log Notifications if Ulimits are Incorrect
> ---
>
> Key: SOLR-11703
> URL: https://issues.apache.org/jira/browse/SOLR-11703
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Kevin Cowan
>Assignee: Erick Erickson
>
> On most Linux instances, the default for 'open files' is set to something 
> entirely too low, e.g. 1024. We have a large number of support tickets that 
> wind up with us having the client increase this number... programmatically. 
> It would make sense and save a great deal of support time if the Solr 
> startup script checked these values and either altered them or at least 
> alerted the user that they are set too low, which could cause trouble. I 
> am associating just one of many tickets where this is the result.






[jira] [Commented] (SOLR-11508) core.properties should be stored $solr.data.home/$core.name

2017-12-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274857#comment-16274857
 ] 

David Smiley commented on SOLR-11508:
-

Very well written Marc!

> core.properties should be stored $solr.data.home/$core.name
> ---
>
> Key: SOLR-11508
> URL: https://issues.apache.org/jira/browse/SOLR-11508
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Marc Morissette
>
> Since Solr 7, it is possible to store Solr cores in separate disk locations 
> using solr.data.home (see SOLR-6671). This is very useful when running Solr 
> in Docker, where data must be stored in a directory independent of the rest 
> of the container.
> Unfortunately, while core data is stored in 
> {{$\{solr.data.home}/$\{core.name}/index/...}}, core.properties is stored in 
> {{$\{solr.solr.home}/$\{core.name}/core.properties}}.
> Reading SOLR-6671 comments, I think this was the expected behaviour but I 
> don't think it is the correct one.
> In addition to being inelegant and counterintuitive, this has the drawback of 
> stripping a core of its metadata and breaking core discovery when a Solr 
> installation is redeployed, whether in Docker or not.
> core.properties is mostly metadata and although it contains some 
> configuration, this configuration is specific to the core it accompanies. I 
> believe it should be stored in solr.data.home, with the rest of the data it 
> describes.






[jira] [Commented] (SOLR-11508) core.properties should be stored $solr.data.home/$core.name

2017-12-01 Thread Marc Morissette (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274784#comment-16274784
 ] 

Marc Morissette commented on SOLR-11508:


[~elyograg], unfortunately what you propose is not really compatible with 
Docker. In Docker, configuration remains part of the image and users customize 
that configuration by either extending base images, mapping configuration files 
during deployment or configuring environment variables. Data must go in a 
separate directory, ideally one that can be empty without adverse effects. 
SOLR_HOME is thus not a good solution because it contains configsets and 
solr.xml.

SOLR_DATA_HOME is a good solution for people who use Solr in standalone mode 
and I will readily admit my patch addresses this use case poorly. I did not 
completely understand this variable's purpose at first and thought it was 
somehow "wrong" but it's not. I'm not arguing any change to it anymore.

In Cloud mode however, we deal with collections. Cores are more of an 
implementation detail. In Cloud Mode, I'd argue individual core.properties are 
closer to segment descriptors in their purpose which is why it makes more sense 
to keep them with the rest of the data. This is why I believe coreRootDirectory 
is the best way to separate configuration from data in Cloud mode.

To summarize, after reading everyone's viewpoint, I believe all 3 configuration 
variables are necessary as they address different use cases. [~dsmiley] and I 
are simply arguing for an easier way to configure coreRootDirectory. If no one 
sees an objection to that, I'll change the description of this bug as it's 
getting pretty stale and I'll find some time to work on a new patch to address 
that.
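If it helps ground the discussion: coreRootDirectory is already a solr.xml option, so the change being argued for is mainly about making it easier to set. A hypothetical fragment (the use of solr.data.home as the default here is only an illustration of the intent, not existing behavior):

```xml
<!-- Hypothetical solr.xml fragment: point core discovery (and therefore
     each core's core.properties) at the data volume rather than SOLR_HOME. -->
<solr>
  <str name="coreRootDirectory">${solr.data.home:}</str>
</solr>
```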

> core.properties should be stored $solr.data.home/$core.name
> ---
>
> Key: SOLR-11508
> URL: https://issues.apache.org/jira/browse/SOLR-11508
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Marc Morissette
>
> Since Solr 7, it is possible to store Solr cores in separate disk locations 
> using solr.data.home (see SOLR-6671). This is very useful when running Solr 
> in Docker, where data must be stored in a directory independent of the rest 
> of the container.
> Unfortunately, while core data is stored in 
> {{$\{solr.data.home}/$\{core.name}/index/...}}, core.properties is stored in 
> {{$\{solr.solr.home}/$\{core.name}/core.properties}}.
> Reading SOLR-6671 comments, I think this was the expected behaviour but I 
> don't think it is the correct one.
> In addition to being inelegant and counterintuitive, this has the drawback of 
> stripping a core of its metadata and breaking core discovery when a Solr 
> installation is redeployed, whether in Docker or not.
> core.properties is mostly metadata and although it contains some 
> configuration, this configuration is specific to the core it accompanies. I 
> believe it should be stored in solr.data.home, with the rest of the data it 
> describes.






[jira] [Commented] (LUCENE-8059) Fold early termination support into TopFieldCollector

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274665#comment-16274665
 ] 

ASF subversion and git services commented on LUCENE-8059:
-

Commit 75520e8f264c9c398d21b237de4c7d4ac5cdbcc6 in lucene-solr's branch 
refs/heads/branch_7x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=75520e8 ]

LUCENE-8059: Mark EarlyTerminatingSortingCollector as deprecated.


> Fold early termination support into TopFieldCollector
> -
>
> Key: LUCENE-8059
> URL: https://issues.apache.org/jira/browse/LUCENE-8059
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-8059.patch
>
>
> We should make early termination of requests easier to use.
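For context on what is being folded in: the now-deprecated EarlyTerminatingSortingCollector relies on the index being sorted in the same order as the query sort. A conceptual (non-Lucene) sketch of why collection can then stop early:

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyTerminationSketch {

    // Conceptual sketch, not the Lucene API: when documents are stored in
    // the same order as the requested sort, collection can stop after
    // numHits documents because every later doc sorts worse.
    static List<Integer> collectTopN(int[] sortedDocValues, int numHits) {
        List<Integer> top = new ArrayList<>();
        for (int v : sortedDocValues) {
            top.add(v);
            if (top.size() >= numHits) {
                break; // early termination: remaining docs cannot compete
            }
        }
        return top;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 5, 8, 13};   // index already sorted by the query sort
        System.out.println(collectTopN(values, 3)); // [1, 2, 3]
    }
}
```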






[jira] [Commented] (LUCENE-8074) TestBooleanMinShouldMatch.testRandomQueries failure

2017-12-01 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274651#comment-16274651
 ] 

Adrien Grand commented on LUCENE-8074:
--

Thanks Mike, I'll look (probably early next week).

> TestBooleanMinShouldMatch.testRandomQueries failure
> ---
>
> Key: LUCENE-8074
> URL: https://issues.apache.org/jira/browse/LUCENE-8074
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Michael McCandless
>
> I hit this while beasting for another issue; it seems to reproduce:
> {noformat}
>[junit4]  says Привет! Master seed: E99EA9D958298BBA
>[junit4] Executing 1 suite with 1 JVM.
>[junit4]
>[junit4] Started J0 PID(19504@localhost).
>[junit4] Suite: org.apache.lucene.search.TestBooleanMinShouldMatch
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestBooleanMinShouldMatch -Dtests.method=testRandomQueries 
> -Dtests.seed=E99EA9D958298BBA -Dtests.locale=pt-BR 
> -Dtests.timezone=Africa/Dar_es_Salaam -Dtest\
> s.asserts=true -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 0.74s | TestBooleanMinShouldMatch.testRandomQueries <<<
>[junit4]> Throwable #1: java.lang.AssertionError: Doc 0 scores don't 
> match
>[junit4]> TopDocs totalHits=3 top=3
>[junit4]>0) doc=0score=3.1725373
>[junit4]>1) doc=6score=0.84062046
>[junit4]>2) doc=4score=0.80648094
>[junit4]> TopDocs totalHits=1 top=1
>[junit4]>0) doc=0score=3.172537
>[junit4]> for query:(data:Y +data:3 data:4 data:4 data:1 -data:Y)~3 
> expected:<3.172537088394165> but was:<3.172537326812744>
>[junit4]>at 
> __randomizedtesting.SeedInfo.seed([E99EA9D958298BBA:B7B5193560F3A624]:0)
>[junit4]>at 
> org.apache.lucene.search.TestBooleanMinShouldMatch.assertSubsetOfSameScores(TestBooleanMinShouldMatch.java:379)
>[junit4]>at 
> org.apache.lucene.search.TestBooleanMinShouldMatch.testRandomQueries(TestBooleanMinShouldMatch.java:354)
>[junit4]>at java.lang.Thread.run(Thread.java:745)
>[junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {all=PostingsFormat(name=LuceneFixedGap), 
> data=PostingsFormat(name=LuceneVarGapFixedInterval), 
> id=Lucene50(blocksize=128)}, docValues:{}, maxPoints\
> InLeafNode=83, maxMBSortInHeap=7.095630354403455, 
> sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@5665ff50),
>  locale=pt-BR, timezone=Africa/Dar_es_Salaam
>[junit4]   2> NOTE: Linux 4.4.0-75-generic amd64/Oracle Corporation 
> 1.8.0_121 (64-bit)/cpus=8,threads=1,free=444068192,total=514850816
>[junit4]   2> NOTE: All tests run in this JVM: [TestBooleanMinShouldMatch]
>[junit4] Completed [1/1 (1!)] in 1.35s, 1 test, 1 failure <<< FAILURES!
>[junit4]
>[junit4]
>[junit4] Tests with failures [seed: E99EA9D958298BBA]:
>[junit4]   - 
> org.apache.lucene.search.TestBooleanMinShouldMatch.testRandomQueries
>  {noformat}
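The last-ulp mismatch in the report (3.172537088394165 vs 3.172537326812744) is characteristic of the same float clause scores being accumulated in different orders on the two code paths. A minimal illustration, unrelated to the Lucene scoring code itself:

```java
public class FloatOrderDemo {

    // Sum left-to-right in float precision.
    static float sum(float... xs) {
        float s = 0f;
        for (float x : xs) {
            s += x;
        }
        return s;
    }

    public static void main(String[] args) {
        // Same three values, different order: 1e8f absorbs the 1f when it is
        // added first, so the totals differ even though real arithmetic agrees.
        float a = sum(1e8f, 1f, -1e8f); // (1e8 + 1) rounds back to 1e8 -> 0.0
        float b = sum(1e8f, -1e8f, 1f); // cancellation happens first -> 1.0
        System.out.println(a + " vs " + b);
    }
}
```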






[jira] [Commented] (LUCENE-8073) TestBasicModelIn.testRandomScoring failure

2017-12-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274649#comment-16274649
 ] 

Robert Muir commented on LUCENE-8073:
-

Dup of LUCENE-8015

> TestBasicModelIn.testRandomScoring failure
> --
>
> Key: LUCENE-8073
> URL: https://issues.apache.org/jira/browse/LUCENE-8073
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Michael McCandless
>
> I hit this while beasting for another issue:
> {noformat}
>[junit4] Started J0 PID(12925@localhost).
>[junit4] Suite: org.apache.lucene.search.similarities.TestBasicModelIn
>[junit4]   1> 2.90165171E9 = score(DFRSimilarity, doc=0, 
> freq=1.5160105E9), computed from:
>[junit4]   1>   1.93443456E8 = boost
>[junit4]   1>   1.6061459E22 = NormalizationH1, computed from:
>[junit4]   1> 1.5160105E9 = tf
>[junit4]   1> 2.00029978E9 = avgFieldLength
>[junit4]   1> 49176.0 = len
>[junit4]   1>   2.4092188E23 = BasicModelIn, computed from:
>[junit4]   1> 49151.0 = numberOfDocuments
>[junit4]   1> 1.0 = docFreq
>[junit4]   1>   6.226085E-23 = AfterEffectL, computed from:
>[junit4]   1> 1.6061459E22 = tfn
>[junit4]   1>
>[junit4]   1> 2.90165197E9 = score(DFRSimilarity, doc=0, 
> freq=1.5160105E9), computed from:
>[junit4]   1>   1.93443456E8 = boost
>[junit4]   1>   1.4826518E22 = NormalizationH1, computed from:
>[junit4]   1> 1.5160105E9 = tf
>[junit4]   1> 2.00029978E9 = avgFieldLength
>[junit4]   1> 53272.0 = len
>[junit4]   1>   2.2239777E23 = BasicModelIn, computed from:
>[junit4]   1> 49151.0 = numberOfDocuments
>[junit4]   1> 1.0 = docFreq
>[junit4]   1>   6.7446724E-23 = AfterEffectL, computed from:
>[junit4]   1> 1.4826518E22 = tfn
>[junit4]   1>
>[junit4]   1> DFR I(n)L1
>[junit4]   1> 
> field="field",maxDoc=49151,docCount=49151,sumTotalTermFreq=98316735360683,sumDocFreq=49151
>[junit4]   1> term="term",docFreq=1,totalTermFreq=1516010534
>[junit4]   1> norm=133 (doc length ~ 53272)
>[junit4]   1> freq=1.5160105E9
>[junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestBasicModelIn 
> -Dtests.method=testRandomScoring -Dtests.seed=4EBB7FC4E5233EEF 
> -Dtests.locale=da-DK -Dtests.timezone=Africa/Banjul -Dtests.asserts=true -\
> Dtests.file.encoding=ISO-8859-1
>[junit4] FAILURE 1.54s | TestBasicModelIn.testRandomScoring <<<
>[junit4]> Throwable #1: java.lang.AssertionError: 
> score(1.5160105E9,132)=2.90165171E9 < score(1.5160105E9,133)=2.90165197E9
>[junit4]>at 
> __randomizedtesting.SeedInfo.seed([4EBB7FC4E5233EEF:C5242676FF54D8E5]:0)
>[junit4]>at 
> org.apache.lucene.search.similarities.BaseSimilarityTestCase.doTestScoring(BaseSimilarityTestCase.java:423)
>[junit4]>at 
> org.apache.lucene.search.similarities.BaseSimilarityTestCase.testRandomScoring(BaseSimilarityTestCase.java:355)
>[junit4]>at java.lang.Thread.run(Thread.java:745)
>[junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {field=FST50}, docValues:{}, maxPointsInLeafNode=216, 
> maxMBSortInHeap=7.270276664622743, 
> sim=Asserting(org.apache.lucene.search.similarities.Assert\
> ingSimilarity@58a83126), locale=da-DK, timezone=Africa/Banjul
>[junit4]   2> NOTE: Linux 4.4.0-75-generic amd64/Oracle Corporation 
> 1.8.0_121 (64-bit)/cpus=8,threads=1,free=395373056,total=513277952
>[junit4]   2> NOTE: All tests run in this JVM: [TestBasicModelIn]
>[junit4] Completed [1/1 (1!)] in 2.06s, 1 test, 1 failure <<< FAILURES!
>[junit4]
>[junit4]
>[junit4] Tests with failures [seed: 4EBB7FC4E5233EEF]:
>[junit4]   - 
> org.apache.lucene.search.similarities.TestBasicModelIn.testRandomScoring
> {noformat}






[jira] [Commented] (SOLR-11622) Bundled mime4j library not sufficient for Tika requirement

2017-12-01 Thread Karthik Ramachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274648#comment-16274648
 ] 

Karthik Ramachandran commented on SOLR-11622:
-

I just now read the bug report; we are currently using 6.6.2 and we don't see 
any issue with ppt extraction.
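One quick way to check whether the mime4j jar on a given classpath actually contains the class Tika needs is to probe for it reflectively (the class name is taken from the reported stack trace; this is a diagnostic sketch, not part of Solr):

```java
public class Mime4jProbe {

    // Reports whether a class is visible on the current classpath,
    // without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, Mime4jProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The nested Builder class whose absence triggers the
        // NoClassDefFoundError described in this issue.
        String needed = "org.apache.james.mime4j.stream.MimeConfig$Builder";
        System.out.println(needed + " present: " + isPresent(needed));
    }
}
```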

> Bundled mime4j library not sufficient for Tika requirement
> --
>
> Key: SOLR-11622
> URL: https://issues.apache.org/jira/browse/SOLR-11622
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build
>Affects Versions: 7.1, 6.6.2
>Reporter: Karim Malhas
>Assignee: Karthik Ramachandran
>Priority: Minor
>  Labels: build
> Attachments: SOLR-11622.patch
>
>
> Version 7.2 of Apache James Mime4j, as bundled with the Solr binary 
> releases, does not match what Apache Tika requires for parsing rfc2822 
> messages. 
> The master branch for james-mime4j seems to contain the missing Builder class
> [https://github.com/apache/james-mime4j/blob/master/core/src/main/java/org/apache/james/mime4j/stream/MimeConfig.java
> ]
> This prevents import of rfc2822 formatted messages. For example like so:
> {{./bin/post -c dovecot -type 'message/rfc822' 'testdata/email_01.txt'
> }}
> And results in the following stacktrace:
> java.lang.NoClassDefFoundError: 
> org/apache/james/mime4j/stream/MimeConfig$Builder
> at 
> org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:63)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> at 
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
> at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> at 
> org.e

[jira] [Comment Edited] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Ignacio Vera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274624#comment-16274624
 ] 

Ignacio Vera edited comment on LUCENE-8071 at 12/1/17 5:06 PM:
---

Thanks for committing [~karl wright],

Fair to say that the shape only supports planets which are slightly elongated 
like WGS84 (~abs(flattening)<0.05). 

I attach the test I am using to check the shape in case you think it is useful. 
I will move this shape to the spatial4j wrapper if it is ok with you.



was (Author: ivera):
Thanks for committing [~karl wright],

Fair to say that the shape only supports planets which are slightly elongated 
like WGS84 (~abs(flattening)<0.1). 

I attach the test I am using to check the shape in case you think it is useful. 
I will move this shape to the spatial4j wrapper if it is ok with you.


> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Fix For: 6.7, master (8.0), 7.2
>
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch, 
> testPointsWithIn.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






Re:Lucene/Solr 7.2

2017-12-01 Thread Christine Poerschke (BLOOMBERG/ QUEEN VIC)
I'd like to see https://issues.apache.org/jira/browse/SOLR-9137 included in the 
release. Hoping to commit it on/by Tuesday.

Christine

From: dev@lucene.apache.org  At: 12/01/17 10:11:59  To: dev@lucene.apache.org
Subject: Lucene/Solr 7.2

Hello,

It's been more than 6 weeks since we released 7.1 and we have accumulated a 
good set of changes, so I think we should release Lucene/Solr 7.2.0.

There is one change that I would like to have before building a RC: 
LUCENE-8043[1], which looks like it is almost ready to be merged. Please let me 
know if there are any other changes that should make it to the release.

I volunteer to be the release manager. I'm currently thinking of building the 
first release candidate next Wednesday, December 6th.

[1] https://issues.apache.org/jira/browse/LUCENE-8043


[jira] [Updated] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8071:
-
Attachment: (was: testPointsWithIn.patch)

> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Fix For: 6.7, master (8.0), 7.2
>
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch, 
> testPointsWithIn.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Updated] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8071:
-
Attachment: testPointsWithIn.patch

> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Fix For: 6.7, master (8.0), 7.2
>
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch, 
> testPointsWithIn.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






Re: Lucene/Solr 7.2

2017-12-01 Thread Erick Erickson
David:

Well, I can make LUCENE-8048 committable in about 30 seconds ;) On a
more serious note, I've already delayed 2 weeks to give people time to
respond but nobody has. I'll put some time into digging at why the
test failures happened and if they're a test artifact rather than a
fundamental problem I'll be more inclined to push forward on it. We'll
see what I can find in the next day or two.


On Fri, Dec 1, 2017 at 7:59 AM, David Smiley  wrote:
> Doug's issue SOLR-11698 needs my final code review (probably final any way)
> and I plan to commit that as late as Monday if it goes well.
>
> Erick... IMO:
> * LUCENE-8048 probably needs some "bake" time IMO, plus it's not clear if
> it's committable yet (waiting for other input).
> * SOLR-11687 definitely include
>
>
> On Fri, Dec 1, 2017 at 10:41 AM Erick Erickson 
> wrote:
>>
>> SOLR-11687 and LUCENE-8048 are ones I'd like to consider getting in to
>> 7.2, should they have longer to bake though? Any opinions?
>>
>> On Fri, Dec 1, 2017 at 7:26 AM, Joel Bernstein  wrote:
>> > +1. I have a couple of tickets that I should have wrapped up by Monday.
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com




[jira] [Updated] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8071:
-
Attachment: testPointsWithIn.patch

Thanks for committing [~karl wright],

Fair to say that the shape only supports planets which are slightly elongated 
like WGS84 (~abs(flattening)<0.1). 

I attach the test I am using to check the shape in case you think it is useful. 
I will move this shape to the spatial4j wrapper if it is ok with you.
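For reference, the flattening bound can be checked against WGS84's defining constants (a small arithmetic sketch; f = (a - b) / a for an oblate spheroid):

```java
public class FlatteningDemo {

    // Flattening f = (a - b) / a, where a and b are the semi-major and
    // semi-minor axes of the spheroid.
    static double flattening(double a, double b) {
        return (a - b) / a;
    }

    public static void main(String[] args) {
        double a = 6378137.0;       // WGS84 semi-major axis, meters
        double b = 6356752.314245;  // WGS84 semi-minor axis, meters
        double f = flattening(a, b);
        // ~0.00335, comfortably inside the ~0.1 bound mentioned above.
        System.out.println("WGS84 flattening = " + f);
    }
}
```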


> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Fix For: 6.7, master (8.0), 7.2
>
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch, 
> testPointsWithIn.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation in which the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors, but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Comment Edited] (SOLR-11692) SolrDispatchFilter.closeShield passes the shielded response object back to jetty making the stream unclose able

2017-12-01 Thread Jeff Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273647#comment-16273647
 ] 

Jeff Miller edited comment on SOLR-11692 at 12/1/17 4:48 PM:
-

[~markrmil...@gmail.com] Can you comment on this patch? The idea is that we apply 
the close shield to the request/response only within the context of 
SolrDispatchFilter; if we have to pass them up the chain or forward them, we 
pass the originals.
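For reference, a minimal sketch of the close-shield idea (class name and structure assumed for illustration; Solr's real implementation wraps the full ServletRequest/ServletResponse, not just a stream):

```java
import java.io.IOException;
import java.io.OutputStream;

// Minimal sketch of a close-shielding stream: writes pass through, but
// close() is swallowed so code inside the filter cannot close the
// container-owned stream. The original, un-shielded object is what gets
// handed back to Jetty when the request is forwarded or passed up the chain.
class CloseShieldOutputStream extends OutputStream {
    private final OutputStream delegate;

    CloseShieldOutputStream(OutputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
        delegate.write(b);
    }

    @Override
    public void close() {
        // Intentionally do nothing: the container still owns the
        // underlying stream and will close it itself.
    }
}
```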




was (Author: millerjeff0):
[~markrmil...@gmail.com] Can you comment on this patch? The idea being we wrap 
the closeshield for the request/response only in the context of 
SolrDispatchFilter and if we have to pass it up to chain or forward it we pass 
the original 

diff --git a/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java 
b/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java
index fa7eb56..dd27820 100644
--- a/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java
+++ b/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java
@@ -352,8 +352,7 @@ public class SolrDispatchFilter extends BaseSolrFilter {
 request = wrappedRequest.get();
   }
 
-  request = closeShield(request, retry);
-  response = closeShield(response, retry);
+
   
   if (cores.getAuthenticationPlugin() != null) {
 log.debug("User principal: {}", ((HttpServletRequest) request).getUserPrincipal());
@@ -376,7 +375,9 @@ public class SolrDispatchFilter extends BaseSolrFilter {
 }
   }
 
-  HttpSolrCall call = getHttpSolrCall((HttpServletRequest) request, (HttpServletResponse) response, retry);
+  ServletRequest shieldedRequest = closeShield(request, retry);
+  ServletResponse shieldedResponse = closeShield(response, retry);
+  HttpSolrCall call = getHttpSolrCall((HttpServletRequest) shieldedRequest, (HttpServletResponse) shieldedResponse, retry);
   ExecutorUtil.setServerThreadFlag(Boolean.TRUE);
   try {
 Action result = call.call();

> SolrDispatchFilter.closeShield passes the shielded response object back to 
> jetty making the stream unclose able
> ---
>
> Key: SOLR-11692
> URL: https://issues.apache.org/jira/browse/SOLR-11692
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Response Writers
>Affects Versions: 7.1
> Environment: Linux/Mac tested
>Reporter: Jeff Miller
>Priority: Minor
>  Labels: dispatchlayer, jetty, newbie, streams
> Attachments: SOLR-11692.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> In test mode we trigger the closeShield code in SolrDispatchFilter; however, 
> there are code paths where we pass the objects through to the DefaultHandler, 
> which can then no longer close the response.
> Example stack trace:
> java.lang.AssertionError: Attempted close of response output stream.
> at 
> org.apache.solr.servlet.SolrDispatchFilter$2$1.close(SolrDispatchFilter.java:528)
> at org.eclipse.jetty.server.Dispatcher.commitResponse(Dispatcher.java:315)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:279)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:103)
> at org.eclipse.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:566)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:385)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
> at 
> searchserver.filter.SfdcDispatchFilter.doFilter(SfdcDispatchFilter.java:204)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(Contex

[jira] [Updated] (SOLR-11692) SolrDispatchFilter.closeShield passes the shielded response object back to jetty making the stream unclose able

2017-12-01 Thread Jeff Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Miller updated SOLR-11692:
---
Attachment: SOLR-11692.patch

Thanks [~dsmiley], here is a patch

> SolrDispatchFilter.closeShield passes the shielded response object back to 
> jetty making the stream unclose able
> ---
>
> Key: SOLR-11692
> URL: https://issues.apache.org/jira/browse/SOLR-11692
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Response Writers
>Affects Versions: 7.1
> Environment: Linux/Mac tested
>Reporter: Jeff Miller
>Priority: Minor
>  Labels: dispatchlayer, jetty, newbie, streams
> Attachments: SOLR-11692.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> In test mode we trigger the closeShield code in SolrDispatchFilter; however, 
> there are code paths where we pass the objects through to the DefaultHandler, 
> which can then no longer close the response.
> Example stack trace:
> java.lang.AssertionError: Attempted close of response output stream.
> at 
> org.apache.solr.servlet.SolrDispatchFilter$2$1.close(SolrDispatchFilter.java:528)
> at org.eclipse.jetty.server.Dispatcher.commitResponse(Dispatcher.java:315)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:279)
> at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:103)
> at org.eclipse.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:566)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:385)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
> at 
> searchserver.filter.SfdcDispatchFilter.doFilter(SfdcDispatchFilter.java:204)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> at org.eclipse.jetty.server.Server.handle(Server.java:370)
> at 
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> at 
> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
> at 
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
> at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
> at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> at 
> org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
> at 
> org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
> at 
> org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> at java.lang.Thread.run(Thread.java:745)
> Related JIRA: SOLR-8933






[jira] [Created] (LUCENE-8074) TestBooleanMinShouldMatch.testRandomQueries failure

2017-12-01 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-8074:
--

 Summary: TestBooleanMinShouldMatch.testRandomQueries failure
 Key: LUCENE-8074
 URL: https://issues.apache.org/jira/browse/LUCENE-8074
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless


I hit this while beasting for another issue; it seems to reproduce:

{noformat}
   [junit4]  says Привет! Master seed: E99EA9D958298BBA
   [junit4] Executing 1 suite with 1 JVM.
   [junit4]
   [junit4] Started J0 PID(19504@localhost).
   [junit4] Suite: org.apache.lucene.search.TestBooleanMinShouldMatch
   [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=TestBooleanMinShouldMatch -Dtests.method=testRandomQueries -Dtests.seed=E99EA9D958298BBA -Dtests.locale=pt-BR -Dtests.timezone=Africa/Dar_es_Salaam -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 0.74s | TestBooleanMinShouldMatch.testRandomQueries <<<
   [junit4]> Throwable #1: java.lang.AssertionError: Doc 0 scores don't 
match
   [junit4]> TopDocs totalHits=3 top=3
   [junit4]>0) doc=0score=3.1725373
   [junit4]>1) doc=6score=0.84062046
   [junit4]>2) doc=4score=0.80648094
   [junit4]> TopDocs totalHits=1 top=1
   [junit4]>0) doc=0score=3.172537
   [junit4]> for query:(data:Y +data:3 data:4 data:4 data:1 -data:Y)~3 
expected:<3.172537088394165> but was:<3.172537326812744>
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([E99EA9D958298BBA:B7B5193560F3A624]:0)
   [junit4]>at 
org.apache.lucene.search.TestBooleanMinShouldMatch.assertSubsetOfSameScores(TestBooleanMinShouldMatch.java:379)
   [junit4]>at 
org.apache.lucene.search.TestBooleanMinShouldMatch.testRandomQueries(TestBooleanMinShouldMatch.java:354)
   [junit4]>at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {all=PostingsFormat(name=LuceneFixedGap), data=PostingsFormat(name=LuceneVarGapFixedInterval), id=Lucene50(blocksize=128)}, docValues:{}, maxPointsInLeafNode=83, maxMBSortInHeap=7.095630354403455, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@5665ff50), locale=pt-BR, timezone=Africa/Dar_es_Salaam
   [junit4]   2> NOTE: Linux 4.4.0-75-generic amd64/Oracle Corporation 
1.8.0_121 (64-bit)/cpus=8,threads=1,free=444068192,total=514850816
   [junit4]   2> NOTE: All tests run in this JVM: [TestBooleanMinShouldMatch]
   [junit4] Completed [1/1 (1!)] in 1.35s, 1 test, 1 failure <<< FAILURES!
   [junit4]
   [junit4]
   [junit4] Tests with failures [seed: E99EA9D958298BBA]:
   [junit4]   - 
org.apache.lucene.search.TestBooleanMinShouldMatch.testRandomQueries
 {noformat}
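The tiny score discrepancy (3.172537088... vs 3.172537326...) is characteristic of float addition not being associative: two evaluation orders of the same clause scores can disagree in the last few bits. A self-contained illustration (values are illustrative, not taken from the test):

```java
public class FloatAssociativity {
    public static void main(String[] args) {
        float big = 1e20f;
        // Same three terms, two grouping orders:
        float left  = (big + -big) + 3.14f; // cancellation first, keeps 3.14
        float right = big + (-big + 3.14f); // 3.14 vanishes inside big's ulp
        System.out.println(left + " vs " + right);
    }
}
```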






[jira] [Updated] (SOLR-11711) Improve mincount & limit usage in pivot & field facets

2017-12-01 Thread Houston Putman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman updated SOLR-11711:
--
Summary: Improve mincount & limit usage in pivot & field facets  (was: 
Improve memory usage of pivot facets)

> Improve mincount & limit usage in pivot & field facets
> --
>
> Key: SOLR-11711
> URL: https://issues.apache.org/jira/browse/SOLR-11711
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: faceting
>Affects Versions: master (8.0)
>Reporter: Houston Putman
>  Labels: pull-request-available
> Fix For: 5.6, 6.7, 7.2
>
>
> Currently while sending pivot facet requests to each shard, the 
> {{facet.pivot.mincount}} is set to {{0}} if the facet is sorted by count with 
> a specified limit > 0. However with a mincount of 0, the pivot facet will use 
> exponentially more wasted memory for every pivot field added. This is because 
> there will be a total of {{limit^(# of pivots)}} pivot values created in 
> memory, even though the vast majority of them will have counts of 0, and are 
> therefore useless.
> Imagine the scenario of a pivot facet with 3 levels, and 
> {{facet.limit=1000}}. There will be a billion pivot values created, and there 
> will almost definitely be nowhere near a billion pivot values with counts > 0.
> This is likely due to the reasoning mentioned in [this comment in the original 
> distributed pivot facet 
> ticket|https://issues.apache.org/jira/browse/SOLR-2894?focusedCommentId=13979898].
>  Basically it was thought that the refinement code would need to know that a 
> count was 0 for a shard so that a refinement request wasn't sent to that 
> shard. However, this is checked in the code, [in this part of the refinement 
> candidate 
> checking|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/core/src/java/org/apache/solr/handler/component/PivotFacetField.java#L275].
>  Therefore, if the {{pivot.mincount}} were set to 1, the non-existent values 
> would either:
> * Not be known, because the {{facet.limit}} was smaller than the number of 
> facet values with positive counts. This isn't an issue, because they wouldn't 
> have been returned with {{pivot.mincount}} set to 0.
> * Be known, because the {{facet.limit}} would be larger than the number 
> of facet values returned; therefore, this conditional would return false 
> (since we are only talking about pivot facets sorted by count).
> The solution is to use the same pivot mincount as would be used if no limit 
> was specified. 
> This also relates to a similar problem in field faceting that was "fixed" in 
> [SOLR-8988|https://issues.apache.org/jira/browse/SOLR-8988#13324]. The 
> solution was to add a flag, {{facet.distrib.mco}}, which would enable not 
> choosing a mincount of 0 when unnecessary. Since this flag can only increase 
> performance and doesn't break any queries, I have removed it as an option and 
> changed the code to use the feature always. 
> There was one code change necessary to fix the MCO option, since the 
> refinement candidate selection logic had a bug. The bug only occurred with a 
> minCount > 0 and limit > 0 specified. When a shard replied with fewer values 
> than the limit requested, the code would assume the next maximum count on that 
> shard was the {{mincount}}, whereas it would actually be {{mincount-1}} 
> (because a facet value with a count of mincount would have been returned). 
> Therefore the MCO flag didn't cause any errors, but with a mincount of 1 the 
> refinement logic always assumed that the shard had more values with a count of 1.






[jira] [Created] (LUCENE-8073) TestBasicModelIn.testRandomScoring failure

2017-12-01 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-8073:
--

 Summary: TestBasicModelIn.testRandomScoring failure
 Key: LUCENE-8073
 URL: https://issues.apache.org/jira/browse/LUCENE-8073
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless


I hit this while beasting for another issue:

{noformat}
   [junit4] Started J0 PID(12925@localhost).
   [junit4] Suite: org.apache.lucene.search.similarities.TestBasicModelIn
   [junit4]   1> 2.90165171E9 = score(DFRSimilarity, doc=0, freq=1.5160105E9), 
computed from:
   [junit4]   1>   1.93443456E8 = boost
   [junit4]   1>   1.6061459E22 = NormalizationH1, computed from:
   [junit4]   1> 1.5160105E9 = tf
   [junit4]   1> 2.00029978E9 = avgFieldLength
   [junit4]   1> 49176.0 = len
   [junit4]   1>   2.4092188E23 = BasicModelIn, computed from:
   [junit4]   1> 49151.0 = numberOfDocuments
   [junit4]   1> 1.0 = docFreq
   [junit4]   1>   6.226085E-23 = AfterEffectL, computed from:
   [junit4]   1> 1.6061459E22 = tfn
   [junit4]   1>
   [junit4]   1> 2.90165197E9 = score(DFRSimilarity, doc=0, freq=1.5160105E9), 
computed from:
   [junit4]   1>   1.93443456E8 = boost
   [junit4]   1>   1.4826518E22 = NormalizationH1, computed from:
   [junit4]   1> 1.5160105E9 = tf
   [junit4]   1> 2.00029978E9 = avgFieldLength
   [junit4]   1> 53272.0 = len
   [junit4]   1>   2.2239777E23 = BasicModelIn, computed from:
   [junit4]   1> 49151.0 = numberOfDocuments
   [junit4]   1> 1.0 = docFreq
   [junit4]   1>   6.7446724E-23 = AfterEffectL, computed from:
   [junit4]   1> 1.4826518E22 = tfn
   [junit4]   1>
   [junit4]   1> DFR I(n)L1
   [junit4]   1> 
field="field",maxDoc=49151,docCount=49151,sumTotalTermFreq=98316735360683,sumDocFreq=49151
   [junit4]   1> term="term",docFreq=1,totalTermFreq=1516010534
   [junit4]   1> norm=133 (doc length ~ 53272)
   [junit4]   1> freq=1.5160105E9
   [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=TestBasicModelIn -Dtests.method=testRandomScoring -Dtests.seed=4EBB7FC4E5233EEF -Dtests.locale=da-DK -Dtests.timezone=Africa/Banjul -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] FAILURE 1.54s | TestBasicModelIn.testRandomScoring <<<
   [junit4]> Throwable #1: java.lang.AssertionError: 
score(1.5160105E9,132)=2.90165171E9 < score(1.5160105E9,133)=2.90165197E9
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([4EBB7FC4E5233EEF:C5242676FF54D8E5]:0)
   [junit4]>at 
org.apache.lucene.search.similarities.BaseSimilarityTestCase.doTestScoring(BaseSimilarityTestCase.java:423)
   [junit4]>at 
org.apache.lucene.search.similarities.BaseSimilarityTestCase.testRandomScoring(BaseSimilarityTestCase.java:355)
   [junit4]>at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {field=FST50}, docValues:{}, maxPointsInLeafNode=216, maxMBSortInHeap=7.270276664622743, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@58a83126), locale=da-DK, timezone=Africa/Banjul
   [junit4]   2> NOTE: Linux 4.4.0-75-generic amd64/Oracle Corporation 
1.8.0_121 (64-bit)/cpus=8,threads=1,free=395373056,total=513277952
   [junit4]   2> NOTE: All tests run in this JVM: [TestBasicModelIn]
   [junit4] Completed [1/1 (1!)] in 2.06s, 1 test, 1 failure <<< FAILURES!
   [junit4]
   [junit4]
   [junit4] Tests with failures [seed: 4EBB7FC4E5233EEF]:
   [junit4]   - 
org.apache.lucene.search.similarities.TestBasicModelIn.testRandomScoring
{noformat}
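The failed assertion (score(1.5160105E9,132) < score(1.5160105E9,133)) checks a general similarity invariant: holding term frequency fixed, a longer document must never score higher. A toy sketch of that property check (the similarity formula here is illustrative, not Lucene's DFR implementation):

```java
public class MonotonicityCheck {
    // Toy similarity that should be non-increasing in document length.
    static float score(float freq, int len) {
        return freq / (freq + len);
    }

    public static void main(String[] args) {
        float freq = 1.5160105E9f; // the extreme freq from the failing test
        for (int len = 1; len < 1000; len++) {
            if (score(freq, len + 1) > score(freq, len)) {
                throw new AssertionError("longer doc scored higher at len=" + len);
            }
        }
        System.out.println("non-increasing in length");
    }
}
```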






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-12-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274607#comment-16274607
 ] 

Michael McCandless commented on LUCENE-8043:


Thanks [~simonw]; I love the new assert, and the patch looks correct to me.

I beasted all Lucene tests 33 times and hit this failure, twice:

{noformat}
ant test -Dtestcase=TestIndexWriter -Dtestmethod=testThreadInterruptDeadlock 
-Dtests.seed=55197CA38E8C827B

java.lang.AssertionError: pendingNumDocs 0 != 11 totalMaxDoc
at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1277)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1319)
at 
org.apache.lucene.index.TestIndexWriter$IndexerThreadInterrupt.run(TestIndexWriter.java:902)
{noformat}

But it does not reproduce for me.

I hit two other unrelated failures; they look like Similarity issues ... I'll 
open separate issues for those.

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, LUCENE-8043.patch, 
> LUCENE-8043.patch, YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Updated] (SOLR-11711) Improve memory usage of pivot facets

2017-12-01 Thread Houston Putman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman updated SOLR-11711:
--
Description: 
Currently while sending pivot facet requests to each shard, the 
{{facet.pivot.mincount}} is set to {{0}} if the facet is sorted by count with a 
specified limit > 0. However with a mincount of 0, the pivot facet will use 
exponentially more wasted memory for every pivot field added. This is because 
there will be a total of {{limit^(# of pivots)}} pivot values created in 
memory, even though the vast majority of them will have counts of 0, and are 
therefore useless.

Imagine the scenario of a pivot facet with 3 levels, and {{facet.limit=1000}}. 
There will be a billion pivot values created, and there will almost definitely 
be nowhere near a billion pivot values with counts > 0.
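The arithmetic behind that claim, as a quick sketch (class name illustrative):

```java
// Worst-case in-memory pivot values with mincount=0: every level can fan
// out into up to facet.limit children, so the total is limit^levels.
public class PivotBlowup {
    public static void main(String[] args) {
        long limit = 1000;
        int levels = 3;
        long total = 1;
        for (int i = 0; i < levels; i++) {
            total *= limit;
        }
        System.out.println(total); // 1000000000 — a billion pivot values
    }
}
```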

This is likely due to the reasoning mentioned in [this comment in the original 
distributed pivot facet 
ticket|https://issues.apache.org/jira/browse/SOLR-2894?focusedCommentId=13979898].
 Basically it was thought that the refinement code would need to know that a 
count was 0 for a shard so that a refinement request wasn't sent to that shard. 
However, this is checked in the code, [in this part of the refinement candidate 
checking|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/core/src/java/org/apache/solr/handler/component/PivotFacetField.java#L275].
 Therefore, if the {{pivot.mincount}} were set to 1, the non-existent values 
would either:
* Not be known, because the {{facet.limit}} was smaller than the number of 
facet values with positive counts. This isn't an issue, because they wouldn't 
have been returned with {{pivot.mincount}} set to 0.
* Be known, because the {{facet.limit}} would be larger than the number 
of facet values returned; therefore, this conditional would return false (since 
we are only talking about pivot facets sorted by count).

The solution is to use the same pivot mincount as would be used if no limit 
was specified. 

This also relates to a similar problem in field faceting that was "fixed" in 
[SOLR-8988|https://issues.apache.org/jira/browse/SOLR-8988#13324]. The solution 
was to add a flag, {{facet.distrib.mco}}, which would enable not choosing a 
mincount of 0 when unnecessary. Since this flag can only increase performance 
and doesn't break any queries, I have removed it as an option and changed the 
code to use the feature always. 
There was one code change necessary to fix the MCO option, since the refinement 
candidate selection logic had a bug. The bug only occurred with a minCount > 0 
and limit > 0 specified. When a shard replied with fewer values than the limit 
requested, the code would assume the next maximum count on that shard was the 
{{mincount}}, whereas it would actually be {{mincount-1}} (because a facet 
value with a count of mincount would have been returned). Therefore the MCO 
flag didn't cause any errors, but with a mincount of 1 the refinement logic 
always assumed that the shard had more values with a count of 1.
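A worked example of the off-by-one described above (all numbers illustrative):

```java
// A shard was asked for values with mincount=5 and limit=10 but returned
// only 7. Every value with count >= mincount was therefore returned, so
// the largest count an *unreturned* value can have is mincount - 1, not
// mincount as the buggy refinement logic assumed.
public class RefinementBound {
    public static void main(String[] args) {
        int mincount = 5;
        int limit = 10;
        int returned = 7;
        int maxUnreturnedCount =
            (returned < limit) ? mincount - 1 : Integer.MAX_VALUE;
        System.out.println(maxUnreturnedCount); // 4
    }
}
```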

  was:
Currently while sending pivot facet requests to each shard, the 
{{facet.pivot.mincount}} is set to {{0}} if the facet is sorted by count with a 
specified limit > 0. However with a mincount of 0, the pivot facet will use 
exponentially more wasted memory for every pivot field added. This is because 
there will be a total of {{limit^(# of pivots)}} pivot values created in 
memory, even though the vast majority of them will have counts of 0, and are 
therefore useless.

Imagine the scenario of a pivot facet with 3 levels, and {{facet.limit=1000}}. 
There will be a billion pivot values created, and there will almost definitely 
be nowhere near a billion pivot values with counts > 0.

This likely due to the reasoning mentioned in [this comment in the original 
distributed pivot facet 
ticket|https://issues.apache.org/jira/browse/SOLR-2894?focusedCommentId=13979898].
 Basically it was thought that the refinement code would need to know that a 
count was 0 for a shard so that a refinement request wasn't sent to that shard. 
However this is checked in the code, [in this part of the refinement candidate 
checking|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/core/src/java/org/apache/solr/handler/component/PivotFacetField.java#L275].
 Therefore if the {{pivot.mincount}} was set to 1, the non-existent values 
would either:
* Not be known, because the {{facet.limit}} was smaller than the number of 
facet values with positive counts. This isn't an issue, because they wouldn't 
have been returned with {{pivot.mincount}} set to 0.
* Would be known, because the {{facet.limit}} would be larger than the number 
of facet values returned. therefore this conditional would return false (since 
we are only talking about pivot facets sorted by count).

The solution, is to use the same pivot mincount as would be used if no limi

[jira] [Commented] (SOLR-11508) core.properties should be stored $solr.data.home/$core.name

2017-12-01 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274581#comment-16274581
 ] 

Shawn Heisey commented on SOLR-11508:
-

[~dsmiley], You want to get rid of core discovery?  I'm showing my age when I 
say that for me, the "new" shine of that feature still hasn't quite worn off 
(working since 4.4, required since 5.0). What do you want to do instead?  Core 
discovery is used even in cloud mode, though I think when "zookeeper as truth" 
is fully realized, SolrCloud probably won't use it any more, and I support that 
idea.  For standalone mode, I think it's important, unless you have something 
in mind to replace it that most of us can agree is better.

The absolute easiest way to move everything other than code, so that everything 
in the program directory can be read-only, is to set the solr home.  Technically 
speaking, especially with capabilities added later, this is "just" the location 
of solr.xml, assuming that file isn't found in zookeeper, though it also gets 
used as a starting location for core config and data.

IMHO, coreRootDirectory and solr.data.home are expert options for setting up a 
directory structure with more separation than what Solr does by default, and 
the documentation should state this.  I'm absolutely fine with having these 
features -- they give users a lot of control over Solr's index directory 
structure.

For most users, the solr home is all they'll need, and I think the 
documentation should say that too. The expert options should NOT be in the 
stock config files, except perhaps as commented out examples of what the user 
*CAN* do, if they're interested in greater control and responsibility.

I think that the names we've got for these options, although technically 
correct, are likely confusing to novices.  Here's a proposal, intended only for 
the master branch:

{quote}
Eliminate either solr.data.home or solr.solr.home, and use whichever is left 
for functionality currently handled by solr.solr.home.

Use solr.index.home for functionality currently handled by solr.data.home.  
This would be documented as an expert option.

Get rid of coreRootDirectory entirely -- defining the solr home and 
solr.index.home effectively gets the same results, which are config and data in 
separate places ... and I don't see much value in separating solr.xml from the 
rest of the config.  For SolrCloud, it could be argued that solr.index.home is 
unnecessary, though some might want core.properties to be in a different place 
than the indexes.  That need goes away if "zookeeper as truth" eliminates core 
discovery in cloud mode.

I don't think we should backport these ideas to 7.x -- it's a fairly major 
change that would confuse users who already understand what's there, and seems 
better to do in a major release.
{quote}

I do agree that the script should support environment variables for configuring 
all these options, even the expert ones, and that this should happen in 7.x.


> core.properties should be stored $solr.data.home/$core.name
> ---
>
> Key: SOLR-11508
> URL: https://issues.apache.org/jira/browse/SOLR-11508
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Marc Morissette
>
> Since Solr 7, it is possible to store Solr cores in separate disk locations 
> using solr.data.home (see SOLR-6671). This is very useful where running Solr 
> in Docker where data must be stored in a directory which is independent from 
> the rest of the container.
> Unfortunately, while core data is stored in 
> {{$\{solr.data.home}/$\{core.name}/index/...}}, core.properties is stored in 
> {{$\{solr.solr.home}/$\{core.name}/core.properties}}.
> Reading SOLR-6671 comments, I think this was the expected behaviour but I 
> don't think it is the correct one.
> In addition to being inelegant and counterintuitive, this has the drawback of 
> stripping a core of its metadata and breaking core discovery when a Solr 
> installation is redeployed, whether in Docker or not.
> core.properties is mostly metadata and although it contains some 
> configuration, this configuration is specific to the core it accompanies. I 
> believe it should be stored in solr.data.home, with the rest of the data it 
> describes.






[jira] [Commented] (SOLR-11622) Bundled mime4j library not sufficient for Tika requirement

2017-12-01 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274578#comment-16274578
 ] 

Tim Allison commented on SOLR-11622:


Taking a look now.  I want to run all of Tika's unit test docs through it to 
make sure I didn't botch anything else...

You saw the POI bug in SOLR-11693?

> Bundled mime4j library not sufficient for Tika requirement
> --
>
> Key: SOLR-11622
> URL: https://issues.apache.org/jira/browse/SOLR-11622
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build
>Affects Versions: 7.1, 6.6.2
>Reporter: Karim Malhas
>Assignee: Karthik Ramachandran
>Priority: Minor
>  Labels: build
> Attachments: SOLR-11622.patch
>
>
> Version 7.2 of Apache James Mime4j, as bundled with the Solr binary releases, 
> does not provide what Apache Tika requires for parsing rfc2822 messages. 
> The master branch for james-mime4j seems to contain the missing Builder class
> [https://github.com/apache/james-mime4j/blob/master/core/src/main/java/org/apache/james/mime4j/stream/MimeConfig.java
> ]
> This prevents the import of rfc2822-formatted messages, for example:
> {{./bin/post -c dovecot -type 'message/rfc822' 'testdata/email_01.txt'
> }}
> It results in the following stack trace:
> java.lang.NoClassDefFoundError: 
> org/apache/james/mime4j/stream/MimeConfig$Builder
> at 
> org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:63)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> at 
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
> at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.jav

[jira] [Commented] (SOLR-11336) DocBasedVersionConstraintsProcessor should be more extensible

2017-12-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274574#comment-16274574
 ] 

David Smiley commented on SOLR-11336:
-

bq. As the next step I want to extend it to process multiple versions at once - 
this fair to do as part of this?

+1

> DocBasedVersionConstraintsProcessor should be more extensible
> -
>
> Key: SOLR-11336
> URL: https://issues.apache.org/jira/browse/SOLR-11336
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Braun
>Priority: Minor
> Attachments: SOLR-11336.patch
>
>
> DocBasedVersionConstraintsProcessor supports allowing document updates only 
> if the new version is greater than the old. However, to extend or change any 
> behavior, even in minor ways, the entire class must be copied and slightly 
> modified rather than subclassed with only the method in question overridden. 
> It would be nice if DocBasedVersionConstraintsProcessor stood on its own as a 
> non-private class. In addition, certain methods (such as pieces of 
> isVersionNewEnough) should be broken out into separate methods so that a 
> subclass can override what it means for a new version to be accepted 
> (allowing equal versions through? Accepting a lower rather than a greater 
> number?). 






[JENKINS] Lucene-Solr-Tests-master - Build # 2201 - Still Failing

2017-12-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/2201/

3 tests failed.
FAILED:  org.apache.solr.security.TestPKIAuthenticationPlugin.test

Error Message:


Stack Trace:
java.lang.NullPointerException
at 
__randomizedtesting.SeedInfo.seed([5968E87503AD728D:D13CD7AFAD511F75]:0)
at 
org.apache.solr.security.TestPKIAuthenticationPlugin.test(TestPKIAuthenticationPlugin.java:102)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  
org.apache.solr.cloud.autoscaling.TriggerIntegrationTest.testTriggerThrottling

Error Message:
Both triggers should have fired by now

Stack Trace:
java.lang.AssertionError: Both triggers should have fired by now
at 
__randomizedtesting.SeedInfo.seed([5968E87503AD728D:A24A4050D107911F]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.autoscaling.TriggerIntegrationTest.testTriggerThrottling(TriggerIntegrationTest.java:205)
at sun.reflect.NativeMetho

Re: Lucene/Solr 7.2

2017-12-01 Thread David Smiley
Doug's issue SOLR-11698 needs my final code review (probably final anyway)
and I plan to commit that as late as Monday if it goes well.

Erick... IMO:
* LUCENE-8048 probably needs some "bake" time, plus it's not clear whether
it's committable yet (waiting for other input).
* SOLR-11687: definitely include.


On Fri, Dec 1, 2017 at 10:41 AM Erick Erickson 
wrote:

> SOLR-11687 and LUCENE-8048 are ones I'd like to consider getting in to
> 7.2, should they have longer to bake though? Any opinions?
>
> On Fri, Dec 1, 2017 at 7:26 AM, Joel Bernstein  wrote:
> > +1. I have a couple of tickets that I should have wrapped up by Monday.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


[jira] [Resolved] (SOLR-11049) Solr in cloud mode silently fails uploading a big LTR model

2017-12-01 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved SOLR-11049.

   Resolution: Workaround
Fix Version/s: master (8.0)
   7.2

Thanks for reporting this issue and the {{-Djute.maxbuffer}} workaround!

Closing this ticket with reference to the workaround and to SOLR-11250, which 
in 7.2 adds a {{DefaultWrapperModel}} class for loading large and/or externally 
stored LTRScoringModel definitions.

> Solr in cloud mode silently fails uploading a big LTR model
> ---
>
> Key: SOLR-11049
> URL: https://issues.apache.org/jira/browse/SOLR-11049
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - LTR
> Environment: tested with Solr 6.6 and integrated ZooKeeper
>Reporter: Stefan Langenmaier
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11049.patch
>
>
> Hi,
> I'm using Solr in cloud mode and have a MultipleAdditiveTreesModel about 
> 3 MB in size. When I upload the model with
> {noformat}
> curl -v -XPUT 'http://localhost:8983/solr/tmdb/schema/model-store' 
> --data-binary @/big-tree.model -H 'Content-type:application/json'
> {noformat}
> I get the following response
> {code:html}
> {
>   "responseHeader":{
> "status":0,
> "QTime":24318}
> }
> {code}
> This looks rather slow, but there is no error. However, when I check the 
> config the model is not visible, and when I run a query that uses the model I 
> get the following error:
> {code:html}
> "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","org.apache.solr.common.SolrException"],
> "msg":"cannot find model bigTreeModel",
> "code":400}
> {code}
> When I upload the model to a Solr instance where I have increased the 
> ZooKeeper znode size limit with
> {noformat}
> -Djute.maxbuffer=0x1ff
> {noformat}
> the same model upload succeeds much faster:
> {code:html}
> {
>   "responseHeader":{
> "status":0,
> "QTime":689}
> }
> {code}
> The model is visible in the configuration and queries that use it run without 
> error.
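For reference, a hedged sketch of wiring the {{-Djute.maxbuffer}} workaround
into the Solr JVM. Passing it through SOLR_OPTS is an assumption based on the
standard bin/solr launcher (normally set in solr.in.sh), and the 4 MB value is
purely illustrative; it must exceed the serialized model size, and the
ZooKeeper server JVMs need the same system property:

```shell
# Sketch: raise ZooKeeper's znode size limit for the Solr JVM.
# 4 MB is an illustrative value; choose one larger than your model.
JUTE_MAXBUFFER=$((4 * 1024 * 1024))
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=$JUTE_MAXBUFFER"
echo "$SOLR_OPTS"
# With the standard launcher this would typically go in solr.in.sh, then:
# bin/solr restart -c
```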






[jira] [Resolved] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved LUCENE-8071.
-
   Resolution: Fixed
Fix Version/s: 7.2
   master (8.0)
   6.7

> GeoExactCircle should create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Fix For: 6.7, master (8.0), 7.2
>
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Commented] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274544#comment-16274544
 ] 

ASF subversion and git services commented on LUCENE-8071:
-

Commit 80930b97ccbd5932f08727f4cdb15208a015c064 in lucene-solr's branch 
refs/heads/branch_7x from [~kwri...@metacarta.com]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=80930b9 ]

LUCENE-8071: Handle large concave circles properly.  Committed on behalf of 
Ignacio Vera.


> GeoExactCircle should create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Commented] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274542#comment-16274542
 ] 

ASF subversion and git services commented on LUCENE-8071:
-

Commit 1f1d7a326de28326a841229ace3519babba462f2 in lucene-solr's branch 
refs/heads/branch_6x from [~kwri...@metacarta.com]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1f1d7a3 ]

LUCENE-8071: Handle large concave circles properly.  Committed on behalf of 
Ignacio Vera.


> GeoExactCircle should create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Resolved] (SOLR-11250) Add new LTR model which loads the model definition from the external resource

2017-12-01 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved SOLR-11250.

   Resolution: Fixed
Fix Version/s: master (8.0)
   7.2

Thanks [~yuyano] and [~shalinmangar]!

> Add new LTR model which loads the model definition from the external resource
> -
>
> Key: SOLR-11250
> URL: https://issues.apache.org/jira/browse/SOLR-11250
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - LTR
>Reporter: Yuki Yano
>Assignee: Christine Poerschke
>Priority: Minor
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11250.patch, SOLR-11250.patch, SOLR-11250.patch, 
> SOLR-11250.patch, SOLR-11250_master.patch, SOLR-11250_master_v2.patch, 
> SOLR-11250_master_v3.patch, SOLR-11250_master_v4.patch
>
>
> *example of committed change's usage:*
> {code}
> {
>   "class" : "org.apache.solr.ltr.model.DefaultWrapperModel",
>   "name" : "myWrapperModelName",
>   "params" : {
> "resource" : "models/myModel.json"
>   }
> }
> {code}
> 
> *original summary:*
> We add a new model which contains only the location of the external model and 
> loads it during initialization.
> With this approach, large models which are difficult to upload to ZooKeeper 
> become available.
> The new model works as a wrapper around existing models and delegates APIs to 
> them.
> We add two classes by this patch:
> * {{ExternalModel}} : a base class for models with external resources.
> * {{URIExternalModel}} : an implementation of {{ExternalModel}} which loads 
> the external model from a specified URI (e.g. file:, http:, etc.).
> For example, if you have a model on the local disk at 
> "file:///var/models/myModel.json", the definition of {{URIExternalModel}} 
> will look like the following:
> {code}
> {
>   "class" : "org.apache.solr.ltr.model.URIExternalModel",
>   "name" : "myURIExternalModel",
>   "features" : [],
>   "params" : {
> "uri" : "file:///var/models/myModel.json"
>   }
> }
> {code}
> If you use LTR with {{model=myURIExternalModel}}, the model of 
> {{myModel.json}} will be used for scoring documents.
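The committed {{DefaultWrapperModel}} shown above can be registered like any
other model. A hedged sketch, mirroring the curl usage pattern from SOLR-11049;
the collection name "tmdb" and the resource path are assumptions for
illustration (the upload itself is commented out since it needs a live Solr):

```shell
# Write a DefaultWrapperModel definition pointing at the externally stored model.
MODEL_JSON=$(mktemp)
cat > "$MODEL_JSON" <<'EOF'
{
  "class" : "org.apache.solr.ltr.model.DefaultWrapperModel",
  "name" : "myWrapperModelName",
  "params" : { "resource" : "models/myModel.json" }
}
EOF
# Against a live Solr you would then upload it to the model-store:
# curl -XPUT 'http://localhost:8983/solr/tmdb/schema/model-store' \
#   --data-binary @"$MODEL_JSON" -H 'Content-type:application/json'
cat "$MODEL_JSON"
```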






[jira] [Resolved] (SOLR-11291) Adding Solr Core Reporter

2017-12-01 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved SOLR-11291.

   Resolution: Fixed
Fix Version/s: master (8.0)
   7.2

Thanks Omar!

> Adding Solr Core Reporter
> -
>
> Key: SOLR-11291
> URL: https://issues.apache.org/jira/browse/SOLR-11291
> Project: Solr
>  Issue Type: New Feature
>  Components: metrics
>Reporter: Omar Abdelnabi
>Assignee: Christine Poerschke
>Priority: Minor
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11291.patch, SOLR-11291.patch, SOLR-11291.patch, 
> SOLR-11291.patch
>
>
> Adds a new reporter, SolrCoreReporter, which allows metrics to be reported on 
> a per-core basis.
> Also modifies the SolrMetricManager and SolrCoreMetricManager to take 
> advantage of this new reporter.
> Adds a test/example that uses the SolrCoreReporter. Also adds randomization 
> to SolrCloudReportersTest.






[jira] [Commented] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274538#comment-16274538
 ] 

ASF subversion and git services commented on LUCENE-8071:
-

Commit 6c3869f8b1932cef7e13ebc91fe3b04532215ea5 in lucene-solr's branch 
refs/heads/master from [~kwri...@metacarta.com]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6c3869f ]

LUCENE-8071: Handle large concave circles properly.  Committed on behalf of 
Ignacio Vera.


> GeoExactCircle should create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Commented] (SOLR-11291) Adding Solr Core Reporter

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274536#comment-16274536
 ] 

ASF subversion and git services commented on SOLR-11291:


Commit a8fbff4d1b8aef343b87a78d0be0fb711a46e53b in lucene-solr's branch 
refs/heads/branch_7x from [~cpoerschke]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a8fbff4 ]

SOLR-11291: Factor out abstract metrics/SolrCore[Container]Reporter classes. 
(Omar Abdelnabi, Christine Poerschke)


> Adding Solr Core Reporter
> -
>
> Key: SOLR-11291
> URL: https://issues.apache.org/jira/browse/SOLR-11291
> Project: Solr
>  Issue Type: New Feature
>  Components: metrics
>Reporter: Omar Abdelnabi
>Assignee: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-11291.patch, SOLR-11291.patch, SOLR-11291.patch, 
> SOLR-11291.patch
>
>
> Adds a new reporter, SolrCoreReporter, which allows metrics to be reported on 
> a per-core basis.
> Also modifies the SolrMetricManager and SolrCoreMetricManager to take 
> advantage of this new reporter.
> Adds a test/example that uses the SolrCoreReporter. Also adds randomization 
> to SolrCloudReportersTest.






[jira] [Commented] (SOLR-11250) Add new LTR model which loads the model definition from the external resource

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274535#comment-16274535
 ] 

ASF subversion and git services commented on SOLR-11250:


Commit 1cbe4db460fe4bff2364419e3d9c83fade78ed9c in lucene-solr's branch 
refs/heads/branch_7x from [~cpoerschke]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1cbe4db ]

SOLR-11250: A new DefaultWrapperModel class for loading of large and/or 
externally stored LTRScoringModel definitions. (Yuki Yano, shalin, Christine 
Poerschke)


> Add new LTR model which loads the model definition from the external resource
> -
>
> Key: SOLR-11250
> URL: https://issues.apache.org/jira/browse/SOLR-11250
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - LTR
>Reporter: Yuki Yano
>Assignee: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-11250.patch, SOLR-11250.patch, SOLR-11250.patch, 
> SOLR-11250.patch, SOLR-11250_master.patch, SOLR-11250_master_v2.patch, 
> SOLR-11250_master_v3.patch, SOLR-11250_master_v4.patch
>
>
> *example of committed change's usage:*
> {code}
> {
>   "class" : "org.apache.solr.ltr.model.DefaultWrapperModel",
>   "name" : "myWrapperModelName",
>   "params" : {
> "resource" : "models/myModel.json"
>   }
> }
> {code}
> 
> *original summary:*
> We add a new model which contains only the location of the external model and 
> loads it during initialization.
> With this approach, large models which are difficult to upload to ZooKeeper 
> become available.
> The new model works as a wrapper around existing models and delegates APIs to 
> them.
> We add two classes by this patch:
> * {{ExternalModel}} : a base class for models with external resources.
> * {{URIExternalModel}} : an implementation of {{ExternalModel}} which loads 
> the external model from a specified URI (e.g. file:, http:, etc.).
> For example, if you have a model on the local disk at 
> "file:///var/models/myModel.json", the definition of {{URIExternalModel}} 
> will look like the following:
> {code}
> {
>   "class" : "org.apache.solr.ltr.model.URIExternalModel",
>   "name" : "myURIExternalModel",
>   "features" : [],
>   "params" : {
> "uri" : "file:///var/models/myModel.json"
>   }
> }
> {code}
> If you use LTR with {{model=myURIExternalModel}}, the model of 
> {{myModel.json}} will be used for scoring documents.






[jira] [Updated] (SOLR-11714) AddReplicaSuggester endless loop

2017-12-01 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated SOLR-11714:
-
Description: 
{{SearchRateTrigger}} events are processed by {{ComputePlanAction}} and 
depending on the condition either a MoveReplicaSuggester or AddReplicaSuggester 
is selected.

When {{AddReplicaSuggester}} is selected there's currently a bug in master, due 
to an API change (Hint.COLL_SHARD should be used instead of Hint.COLL). 
However, after fixing that bug {{ComputePlanAction}} goes into an endless loop 
because the suggester endlessly keeps creating new operations.

Please see the patch that fixes the Hint.COLL_SHARD issue and modifies the unit 
test to illustrate this failure.

  was:
{{SearchRateTrigger}} events are processed by {{ComputePlanAction}} and 
depending on the condition either a MoveReplicaSuggester or AddReplicaSuggester 
is selected.

When {{AddReplicaSuggester}} is selected there's currently a bug in master, due 
to an API change (Hint.COLL_SHARD should be used instead of Hint.COLL). 
However, after fixing that bug {{ComputePlanAction}} goes into an endless loop 
because the suggester endlessly keeps creating new operations.

Please see patch that fixes the Hint.COLL_SHARD issue and modifies the unit 
test to illustrate this failure.


> AddReplicaSuggester endless loop
> 
>
> Key: SOLR-11714
> URL: https://issues.apache.org/jira/browse/SOLR-11714
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 7.2, master (8.0)
>Reporter: Andrzej Bialecki 
>Assignee: Noble Paul
> Attachments: SOLR-11714.diff
>
>
> {{SearchRateTrigger}} events are processed by {{ComputePlanAction}} and 
> depending on the condition either a MoveReplicaSuggester or 
> AddReplicaSuggester is selected.
> When {{AddReplicaSuggester}} is selected there's currently a bug in master, 
> due to an API change (Hint.COLL_SHARD should be used instead of Hint.COLL). 
> However, after fixing that bug {{ComputePlanAction}} goes into an endless 
> loop because the suggester endlessly keeps creating new operations.
> Please see the patch that fixes the Hint.COLL_SHARD issue and modifies the 
> unit test to illustrate this failure.






[jira] [Updated] (SOLR-11714) AddReplicaSuggester endless loop

2017-12-01 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated SOLR-11714:
-
Attachment: SOLR-11714.diff

> AddReplicaSuggester endless loop
> 
>
> Key: SOLR-11714
> URL: https://issues.apache.org/jira/browse/SOLR-11714
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 7.2, master (8.0)
>Reporter: Andrzej Bialecki 
>Assignee: Noble Paul
> Attachments: SOLR-11714.diff
>
>
> {{SearchRateTrigger}} events are processed by {{ComputePlanAction}} and 
> depending on the condition either a MoveReplicaSuggester or 
> AddReplicaSuggester is selected.
> When {{AddReplicaSuggester}} is selected there's currently a bug in master, 
> due to an API change (Hint.COLL_SHARD should be used instead of Hint.COLL). 
> However, after fixing that bug {{ComputePlanAction}} goes into an endless 
> loop because the suggester endlessly keeps creating new operations.
> Please see the patch that fixes the Hint.COLL_SHARD issue and modifies the
> unit test to illustrate this failure.






[jira] [Created] (SOLR-11714) AddReplicaSuggester endless loop

2017-12-01 Thread Andrzej Bialecki (JIRA)
Andrzej Bialecki  created SOLR-11714:


 Summary: AddReplicaSuggester endless loop
 Key: SOLR-11714
 URL: https://issues.apache.org/jira/browse/SOLR-11714
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: AutoScaling
Affects Versions: 7.2, master (8.0)
Reporter: Andrzej Bialecki 
Assignee: Noble Paul


{{SearchRateTrigger}} events are processed by {{ComputePlanAction}} and 
depending on the condition either a MoveReplicaSuggester or AddReplicaSuggester 
is selected.

When {{AddReplicaSuggester}} is selected there's currently a bug in master, due 
to an API change (Hint.COLL_SHARD should be used instead of Hint.COLL). 
However, after fixing that bug {{ComputePlanAction}} goes into an endless loop 
because the suggester endlessly keeps creating new operations.

Please see the patch that fixes the Hint.COLL_SHARD issue and modifies the
unit test to illustrate this failure.
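To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. The names (SuggesterLoopDemo, computePlan, Suggester) are hypothetical and are not the actual Solr autoscaling API; the point is only that a plan-computation loop terminates when its suggester converges to null, and needs a guard if it cannot rely on that:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only -- hypothetical names, not the actual Solr
// autoscaling classes. A ComputePlanAction-style loop asks a suggester for
// operations until it returns null. A suggester that keeps proposing a new
// operation for the same state makes the loop endless, so the loop needs
// either a converging suggester or an explicit guard.
public class SuggesterLoopDemo {
    interface Suggester {
        String suggest(); // null means "nothing more to do"
    }

    static List<String> computePlan(Suggester s, int maxOps) {
        List<String> ops = new ArrayList<>();
        String op;
        while ((op = s.suggest()) != null) {
            ops.add(op);
            if (ops.size() >= maxOps) {
                break; // safety guard against a non-converging suggester
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        // A converging suggester: proposes two operations, then stops.
        int[] remaining = {2};
        Suggester s = () -> remaining[0]-- > 0 ? "ADDREPLICA" : null;
        System.out.println(computePlan(s, 100).size()); // prints 2
    }
}
```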






Re: Lucene/Solr 7.2

2017-12-01 Thread Erick Erickson
SOLR-11687 and LUCENE-8048 are ones I'd like to consider getting into
7.2; should they have longer to bake, though? Any opinions?

On Fri, Dec 1, 2017 at 7:26 AM, Joel Bernstein  wrote:
> +1. I have a couple of tickets that I should have wrapped up by Monday.




[jira] [Assigned] (LUCENE-8048) Filesystems do not guarantee order of directories updates

2017-12-01 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned LUCENE-8048:
--

Assignee: Erick Erickson

> Filesystems do not guarantee order of directories updates
> -
>
> Key: LUCENE-8048
> URL: https://issues.apache.org/jira/browse/LUCENE-8048
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Nikolay Martynov
>Assignee: Erick Erickson
> Attachments: LUCENE-8048.patch, Screen Shot 2017-11-22 at 12.34.51 
> PM.png
>
>
> Currently when index is written to disk the following sequence of events is 
> taking place:
> * write segment file
> * sync segment file
> * write segment file
> * sync segment file
> ...
> * write list of segments
> * sync list of segments
> * rename list of segments
> * sync index directory
> This sequence leaves a potential window in which the system can crash after 
> 'rename list of segments' but before 'sync index directory'; depending on the 
> exact filesystem implementation, this may lead to the 'list of segments' being 
> visible in the directory while some of the segments are not.
> Solution to this is to sync index directory after all segments have been 
> written. [This 
> commit|https://github.com/mar-kolya/lucene-solr/commit/58e05dd1f633ab9b02d9e6374c7fab59689ae71c]
>  shows the idea implemented. I'm fairly certain that I didn't find all the 
> places where this may potentially be happening.
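For illustration, here is a minimal sketch of the "sync index directory" step under POSIX semantics. It mirrors the general approach (open the directory and force its metadata to stable storage so a preceding rename becomes durable), but it is not the linked patch, and the helper name is made up; note that opening a directory for read works on Linux but not on Windows:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not the actual patch: after renaming the new
// segments file into place, fsync the index directory itself so the
// rename is durable before readers can observe it (POSIX semantics).
public class DirectoryFsyncDemo {
    static void fsyncDirectory(Path dir) throws IOException {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true); // flush directory metadata (the rename) to disk
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fsync-demo");
        fsyncDirectory(dir); // completes without error on POSIX filesystems
        System.out.println("synced " + dir);
    }
}
```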






Re: Lucene/Solr 7.2

2017-12-01 Thread Joel Bernstein
+1. I have a couple of tickets that I should have wrapped up by Monday.


[jira] [Commented] (SOLR-11336) DocBasedVersionConstraintsProcessor should be more extensible

2017-12-01 Thread Michael Braun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274508#comment-16274508
 ] 

Michael Braun commented on SOLR-11336:
--

As the next step I want to extend it to process multiple versions at once; 
is this fair to do as part of this issue?

> DocBasedVersionConstraintsProcessor should be more extensible
> -
>
> Key: SOLR-11336
> URL: https://issues.apache.org/jira/browse/SOLR-11336
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Braun
>Priority: Minor
> Attachments: SOLR-11336.patch
>
>
> DocBasedVersionConstraintsProcessor supports allowing document updates only 
> if the new version is greater than the old. However, if any behavior wants to 
> be extended / changed in minor ways, the entire class will need to be copied 
> and slightly modified rather than extending and changing the method in 
> question. 
> It would be nice if DocBasedVersionConstraintsProcessor stood on its own as a 
> non-private class. In addition, certain methods (such as pieces of 
> isVersionNewEnough) should be broken out into separate methods, so that 
> someone can extend the processor class and override what it means for a new 
> version to be accepted (allowing equal versions through? What if the new 
> version is a lower, not greater, number?). 
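As an illustration of the kind of refactor being requested (simplified, hypothetical names; this is not the actual DocBasedVersionConstraintsProcessor code), the version check could live in a protected method that a subclass overrides:

```java
// Hypothetical sketch: pulling the acceptance rule into its own protected
// method lets a subclass change the rule -- e.g. accept equal versions --
// without copying the whole processor class.
public class VersionConstraintsDemo {
    static class VersionProcessor {
        protected boolean isVersionNewEnough(long oldVersion, long newVersion) {
            return newVersion > oldVersion; // default: strictly greater wins
        }
    }

    static class AllowEqualVersions extends VersionProcessor {
        @Override
        protected boolean isVersionNewEnough(long oldVersion, long newVersion) {
            return newVersion >= oldVersion; // also accept the same version
        }
    }

    public static void main(String[] args) {
        System.out.println(new VersionProcessor().isVersionNewEnough(5, 5));   // false
        System.out.println(new AllowEqualVersions().isVersionNewEnough(5, 5)); // true
    }
}
```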






[jira] [Commented] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274502#comment-16274502
 ] 

Karl Wright commented on LUCENE-8071:
-

Hi [~ivera], this would have been addressed by the backing planes too.
Essentially, when the circle becomes concave, the logic has to change for 
computing "within".  That is the fundamental difference.

I think the fix is reasonable although I am not certain there aren't other 
cases besides the initial split where this same logic would be needed.


> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Assigned] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned LUCENE-8071:
---

Assignee: Karl Wright

> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Karl Wright
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Commented] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274493#comment-16274493
 ] 

Robert Muir commented on LUCENE-8072:
-

As far as changes to double precision go, we should be careful here too. Really, 
the test improvements for LUCENE-8015 need to be applied before we make any 
alterations to formulas, because the current tests are too inefficient.

Similarity has to deal with crazy values for a variety of reasons in Lucene, and 
our first challenge is to get all of our scoring behaving properly with the 
monotonicity we need for optimizations. Extra precision in various places may 
or may not help that; anyway, let's avoid playing whack-a-mole :)

> Improve accuracy of similarity scores
> -
>
> Key: LUCENE-8072
> URL: https://issues.apache.org/jira/browse/LUCENE-8072
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-8072.patch
>
>
> I noticed two things we could do to improve the accuracy of our scores:
>  - use {{Math.log1p(x)}} instead of {{Math.log(1+x)}}, especially when x is 
> expected to be small
>  - use doubles for intermediate values that are used to compute norms in 
> BM25Similarity
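A quick numeric check of the first point: for x below the double rounding threshold around 1.0, the sum 1 + x rounds to exactly 1.0, so Math.log(1 + x) collapses to 0.0 while Math.log1p(x) preserves the leading term:

```java
// Demonstrates the precision loss that motivates Math.log1p: for tiny x,
// 1 + x rounds to 1.0 in double precision (half-ulp at 1.0 is ~1.11e-16),
// so log(1 + x) returns 0.0 while log1p(x) is correct to first order.
public class Log1pDemo {
    public static void main(String[] args) {
        double x = 1e-16;                 // below the rounding threshold at 1.0
        double naive = Math.log(1 + x);   // 0.0 -- all information about x lost
        double accurate = Math.log1p(x);  // ~1e-16 -- keeps the leading term
        System.out.println(naive + " vs " + accurate);
    }
}
```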






[jira] [Commented] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274490#comment-16274490
 ] 

Robert Muir commented on LUCENE-8072:
-

Also, I don't see the benefit to relevance. I am fine with taking a perf hit 
if it really helps, but I think we need to do this carefully on a case-by-case 
basis, not blindly across the board.

For example, in the BM25 case it does not help to do this in the IDF: such tiny 
IDFs only occur in the stopword case, so additional precision does not matter. 
It also does not help "behavior", since Math.log is already required to be 
semi-monotonic.

> Improve accuracy of similarity scores
> -
>
> Key: LUCENE-8072
> URL: https://issues.apache.org/jira/browse/LUCENE-8072
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-8072.patch
>
>
> I noticed two things we could do to improve the accuracy of our scores:
>  - use {{Math.log1p(x)}} instead of {{Math.log(1+x)}}, especially when x is 
> expected to be small
>  - use doubles for intermediate values that are used to compute norms in 
> BM25Similarity






[jira] [Commented] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274486#comment-16274486
 ] 

Robert Muir commented on LUCENE-8072:
-

There is a cost to log1p, I'm not sure we should do this per-hit.

> Improve accuracy of similarity scores
> -
>
> Key: LUCENE-8072
> URL: https://issues.apache.org/jira/browse/LUCENE-8072
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-8072.patch
>
>
> I noticed two things we could do to improve the accuracy of our scores:
>  - use {{Math.log1p(x)}} instead of {{Math.log(1+x)}}, especially when x is 
> expected to be small
>  - use doubles for intermediate values that are used to compute norms in 
> BM25Similarity






[jira] [Commented] (LUCENE-8068) Allow IndexWriter to write a single DWPT to disk

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274485#comment-16274485
 ] 

ASF subversion and git services commented on LUCENE-8068:
-

Commit d2554218c711bccd4b8a56b4d18b64a8f691c170 in lucene-solr's branch 
refs/heads/branch_7x from [~cpoerschke]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d255421 ]

LUCENE-8068: remove now-unused import

(to fix 'ant precommit' failing)


> Allow IndexWriter to write a single DWPT to disk
> 
>
> Key: LUCENE-8068
> URL: https://issues.apache.org/jira/browse/LUCENE-8068
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-8068.patch, LUCENE-8068.patch, LUCENE-8068.patch, 
> LUCENE-8068.patch
>
>
> Today IW can only flush a DWPT to disk if an external resource calls 
> flush() or refreshes an NRT reader, or if a DWPT is selected as flush 
> pending. Yet the latter has the problem that it always ties up an indexing 
> thread, and if flush / NRT refresh is called, a whole bunch of indexing 
> threads is tied up. If IW could offer a simple `flushNextBuffer()` method 
> that synchronously flushes the next pending or biggest active buffer to 
> disk, memory could be controlled in a more fine-grained fashion from 
> outside of the IW. This is for instance useful if more than one IW (shards) 
> must be maintained in a single JVM / system. 






Re: Lucene/Solr 7.2

2017-12-01 Thread Adrien Grand
It is not too time-consuming. The whole process takes a lot of time, but
most of the time is actually spent waiting (for other people to vote, for
javadocs to upload, for mirrors to replicate, etc.). You can have a look at
https://wiki.apache.org/lucene-java/ReleaseTodo to see what kind of things
need to be done as part of the release process. If you'd like to give it a
try but you're not sure how much time you can allocate to it, it might be
easier to start with a bugfix release, which usually has fewer chances of
being respun since the set of changes is usually more contained.

Le ven. 1 déc. 2017 à 15:48, David Smiley  a
écrit :

> Adrien,
>
> Thanks for doing this release.  Just curious, how much time would you
> estimate is involved in being the RM?  I've never done it before.  I'm not
> offering to take over this release but maybe I'll do a future one.
>
> ~ David
>
>
> On Fri, Dec 1, 2017 at 5:11 AM Adrien Grand  wrote:
>
>> Hello,
>>
>> It's been more than 6 weeks since we released 7.1 and we accumulated a
>> good set of changes, so I think we should release Lucene/Solr 7.2.0.
>>
>> There is one change that I would like to have before building a RC:
>> LUCENE-8043[1], which looks like it is almost ready to be merged. Please
>> let me know if there are any other changes that should make it to the
>> release.
>>
>> I volunteer to be the release manager. I'm currently thinking of building
>> the first release candidate next wednesday, December 6th.
>>
>> [1] https://issues.apache.org/jira/browse/LUCENE-8043
>>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>


Re: Lucene/Solr 7.2

2017-12-01 Thread David Smiley
Adrien,

Thanks for doing this release.  Just curious, how much time would you
estimate is involved in being the RM?  I've never done it before.  I'm not
offering to take over this release but maybe I'll do a future one.

~ David

On Fri, Dec 1, 2017 at 5:11 AM Adrien Grand  wrote:

> Hello,
>
> It's been more than 6 weeks since we released 7.1 and we accumulated a
> good set of changes, so I think we should release Lucene/Solr 7.2.0.
>
> There is one change that I would like to have before building a RC:
> LUCENE-8043[1], which looks like it is almost ready to be merged. Please
> let me know if there are any other changes that should make it to the
> release.
>
> I volunteer to be the release manager. I'm currently thinking of building
> the first release candidate next wednesday, December 6th.
>
> [1] https://issues.apache.org/jira/browse/LUCENE-8043
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


[jira] [Updated] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8071:
-
Attachment: (was: LUCENE-8071.patch)

> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Updated] (LUCENE-8071) GeoExactCircle should create circles with right number of planes

2017-12-01 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8071:
-
Attachment: LUCENE-8071.patch

> GeoExactCircle should  create circles with right number of planes
> -
>
> Key: LUCENE-8071
> URL: https://issues.apache.org/jira/browse/LUCENE-8071
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
> Attachments: LUCENE-8071-test.patch, LUCENE-8071.patch
>
>
> Hi [~kwri...@metacarta.com],
> There is still a situation when the test can fail. It happens when the planet 
> model is a SPHERE and the radius is slightly lower than PI. The circle is 
> created with two sectors but the circle plane is too big and the shape is 
> bogus.
> I will attach a test and a proposed solution. (I hope this is the last issue 
> of this saga)






[jira] [Updated] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-8072:
-
Attachment: LUCENE-8072.patch

Here is a patch that applies these ideas to various similarities.

> Improve accuracy of similarity scores
> -
>
> Key: LUCENE-8072
> URL: https://issues.apache.org/jira/browse/LUCENE-8072
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-8072.patch
>
>
> I noticed two things we could do to improve the accuracy of our scores:
>  - use {{Math.log1p(x)}} instead of {{Math.log(1+x)}}, especially when x is 
> expected to be small
>  - use doubles for intermediate values that are used to compute norms in 
> BM25Similarity






[jira] [Created] (SOLR-11713) CdcrUpdateLogTest.testSubReader() failure

2017-12-01 Thread Steve Rowe (JIRA)
Steve Rowe created SOLR-11713:
-

 Summary: CdcrUpdateLogTest.testSubReader() failure
 Key: SOLR-11713
 URL: https://issues.apache.org/jira/browse/SOLR-11713
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Steve Rowe


Reproduces for me, from 
[https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1430/]:

{noformat}
Checking out Revision ebdaa44182cf4e017efc418134821291dc40ea46 
(refs/remotes/origin/master)
[...]
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=CdcrUpdateLogTest 
-Dtests.method=testSubReader -Dtests.seed=1A5FD357C74335A5 -Dtests.multiplier=2 
-Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.locale=vi -Dtests.timezone=America/Toronto -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII
   [junit4] FAILURE 6.59s J1 | CdcrUpdateLogTest.testSubReader <<<
   [junit4]> Throwable #1: java.lang.AssertionError
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([1A5FD357C74335A5:57875934B794E477]:0)
   [junit4]>at 
org.apache.solr.update.CdcrUpdateLogTest.testSubReader(CdcrUpdateLogTest.java:583)
   [junit4]>at java.lang.Thread.run(Thread.java:748)
[...]
   [junit4]   2> NOTE: test params are: 
codec=DummyCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=DUMMY,
 chunkSize=2, maxDocsPerChunk=982, blockSize=6), 
termVectorsFormat=CompressingTermVectorsFormat(compressionMode=DUMMY, 
chunkSize=2, blockSize=6)), 
sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@1e1386ea),
 locale=vi, timezone=America/Toronto
   [junit4]   2> NOTE: Linux 3.13.0-88-generic amd64/Oracle Corporation 
1.8.0_144 (64-bit)/cpus=4,threads=1,free=211037008,total=384827392
{noformat}






[jira] [Commented] (LUCENE-8068) Allow IndexWriter to write a single DWPT to disk

2017-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274462#comment-16274462
 ] 

ASF subversion and git services commented on LUCENE-8068:
-

Commit 832a975bc4aaa12f7f96443bd1b2b4b6be65a48c in lucene-solr's branch 
refs/heads/master from [~cpoerschke]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=832a975 ]

LUCENE-8068: remove now-unused import

(to fix 'ant precommit' failing)


> Allow IndexWriter to write a single DWPT to disk
> 
>
> Key: LUCENE-8068
> URL: https://issues.apache.org/jira/browse/LUCENE-8068
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-8068.patch, LUCENE-8068.patch, LUCENE-8068.patch, 
> LUCENE-8068.patch
>
>
> Today IW can only flush a DWPT to disk if an external resource calls 
> flush() or refreshes an NRT reader, or if a DWPT is selected as flush 
> pending. Yet the latter has the problem that it always ties up an indexing 
> thread, and if flush / NRT refresh is called, a whole bunch of indexing 
> threads is tied up. If IW could offer a simple `flushNextBuffer()` method 
> that synchronously flushes the next pending or biggest active buffer to 
> disk, memory could be controlled in a more fine-grained fashion from 
> outside of the IW. This is for instance useful if more than one IW (shards) 
> must be maintained in a single JVM / system. 






[jira] [Updated] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-8072:
-
Description: 
I noticed two things we could do to improve the accuracy of our scores:
 - use {{Math.log1p(x)}} instead of {{Math.log(1+x)}}, especially when x is 
expected to be small
 - use doubles for intermediate values that are used to compute norms in 
BM25Similarity

  was:
I noticed two things we could do to improve the accuracy of our scores:
 - use Math.log1p(x) instead of Math.log(1+x), especially when x is expected to 
be small
 - use doubles for intermediate values that are used to compute norms in 
BM25Similarity


> Improve accuracy of similarity scores
> -
>
> Key: LUCENE-8072
> URL: https://issues.apache.org/jira/browse/LUCENE-8072
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
>
> I noticed two things we could do to improve the accuracy of our scores:
>  - use {{Math.log1p(x)}} instead of {{Math.log(1+x)}}, especially when x is 
> expected to be small
>  - use doubles for intermediate values that are used to compute norms in 
> BM25Similarity






[jira] [Created] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-8072:


 Summary: Improve accuracy of similarity scores
 Key: LUCENE-8072
 URL: https://issues.apache.org/jira/browse/LUCENE-8072
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand
Priority: Minor


I noticed two things we could do to improve the accuracy of our scores:
 - use Math.log1p(x) instead of Math.log(1+x), especially when x is expected to 
be small
 - use doubles for intermediate values that are used to compute norms in 
BM25Similarity






[jira] [Commented] (SOLR-11508) core.properties should be stored $solr.data.home/$core.name

2017-12-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274437#comment-16274437
 ] 

David Smiley commented on SOLR-11508:
-

bq. ... keep code R/O ...

It's not enough to keep code R/O unless your situation is a predetermined 
static set of core names (and not SolrCloud).  Solr writes new cores, and that 
means a core instance dir with a core.properties.

bq. ... Core discovery ...

Whether core discovery stays or leaves, I see that as immaterial. Cores need to 
live _somewhere_, and Solr puts them in a configurable coreRootDirectory. 
Notice it doesn't have the word "discovery" in it.

RE Docker: I already explained that a custom SOLR_DATA_HOME (to e.g. a volume) 
doesn't solve the docker/volume problem since Solr (in either mode) will write 
core.properties to the coreRootDirectory which is not in SOLR_DATA_HOME.

> core.properties should be stored $solr.data.home/$core.name
> ---
>
> Key: SOLR-11508
> URL: https://issues.apache.org/jira/browse/SOLR-11508
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Marc Morissette
>
> Since Solr 7, it is possible to store Solr cores in separate disk locations 
> using solr.data.home (see SOLR-6671). This is very useful where running Solr 
> in Docker where data must be stored in a directory which is independent from 
> the rest of the container.
> Unfortunately, while core data is stored in 
> {{$\{solr.data.home}/$\{core.name}/index/...}}, core.properties is stored in 
> {{$\{solr.solr.home}/$\{core.name}/core.properties}}.
> Reading SOLR-6671 comments, I think this was the expected behaviour but I 
> don't think it is the correct one.
> In addition to being inelegant and counterintuitive, this has the drawback of 
> stripping a core of its metadata and breaking core discovery when a Solr 
> installation is redeployed, whether in Docker or not.
> core.properties is mostly metadata and although it contains some 
> configuration, this configuration is specific to the core it accompanies. I 
> believe it should be stored in solr.data.home, with the rest of the data it 
> describes.






[jira] [Commented] (SOLR-11597) Implement RankNet.

2017-12-01 Thread Michael A. Alcorn (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274414#comment-16274414
 ] 

Michael A. Alcorn commented on SOLR-11597:
--

Hi, [~cpoerschke]. I don't think those changes would cause too much overhead 
(but I don't know for sure!), and they definitely seem to make things more 
readable. I'm used to thinking of neural network calculations in terms of dot 
products, but your changes would probably make things more maintainable for 
people less familiar with them.

> Implement RankNet.
> --
>
> Key: SOLR-11597
> URL: https://issues.apache.org/jira/browse/SOLR-11597
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - LTR
>Reporter: Michael A. Alcorn
>
> Implement RankNet as described in [this 
> tutorial|https://github.com/airalcorn2/Solr-LTR].






[jira] [Commented] (SOLR-11508) core.properties should be stored $solr.data.home/$core.name

2017-12-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274384#comment-16274384
 ] 

Jan Høydahl commented on SOLR-11508:


As the one who committed the {{SOLR_DATA_HOME}} feature in the first place: the 
intent of that feature was to have *one* place to define which path (mount 
point) the index should be located on, so you can keep code/config/data 
separate and keep code R/O. You don't need to mess with each individual 
core.properties file or solrconfig.xml file, and it works both for a 
standalone node and for SolrCloud.

Core discovery is a remnant from the past that I hope will go away completely, 
so +1 for deprecating it along with {{coreRootDirectory}} and getting rid of it.

Now, until core discovery dies, I see two viewpoints as to whether 
core.properties belongs with the index or in SOLR_HOME. I always thought of it 
as config and explicitly tested for having these files in SOLR_HOME. Also, 
since core discovery is only(?) used in standalone mode, you will already have 
your core configs in SOLR_HOME, and the discovery mechanism detects a core 
directory by the presence of the {{core.properties}} file, so it makes sense in 
my head that the two are co-located. Then, if you wish to move data to a 
separate disk, you change {{SOLR_DATA_HOME}} and copy only the data folders to 
the new disk. If you want data and config together in an external location, all 
you need to change is SOLR_HOME, not SOLR_DATA_HOME.
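As an illustration of that split (the paths below are hypothetical, not taken from the issue), the layout might look like:

```shell
# Hypothetical layout illustrating the SOLR_HOME / SOLR_DATA_HOME split.
# Config (and, today, core.properties) stays under SOLR_HOME:
#   /opt/solr-home/mycore/core.properties
#   /opt/solr-home/mycore/conf/solrconfig.xml
# Index data lives under SOLR_DATA_HOME:
#   /var/solr-data/mycore/index/
#
# Typically set in solr.in.sh:
SOLR_HOME=/opt/solr-home
SOLR_DATA_HOME=/var/solr-data
```

With this layout, relocating only the data means changing SOLR_DATA_HOME; relocating config and data together means changing SOLR_HOME.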

As for Docker, I also believe that the official Docker image should use the 
official {{install_solr_service.sh}} script and set a custom SOLR_DATA_HOME; 
see SOLR-10906 for the plan to have the installer script support modifying the 
data home.

> core.properties should be stored $solr.data.home/$core.name
> ---
>
> Key: SOLR-11508
> URL: https://issues.apache.org/jira/browse/SOLR-11508
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Marc Morissette
>
> Since Solr 7, it is possible to store Solr cores in separate disk locations 
> using solr.data.home (see SOLR-6671). This is very useful when running Solr 
> in Docker, where data must be stored in a directory that is independent from 
> the rest of the container.
> Unfortunately, while core data is stored in 
> {{$\{solr.data.home}/$\{core.name}/index/...}}, core.properties is stored in 
> {{$\{solr.solr.home}/$\{core.name}/core.properties}}.
> Reading the SOLR-6671 comments, I think this was the expected behaviour, but 
> I don't think it is the correct one.
> In addition to being inelegant and counterintuitive, this has the drawback of 
> stripping a core of its metadata and breaking core discovery when a Solr 
> installation is redeployed, whether in Docker or not.
> core.properties is mostly metadata and although it contains some 
> configuration, this configuration is specific to the core it accompanies. I 
> believe it should be stored in solr.data.home, with the rest of the data it 
> describes.






[jira] [Commented] (LUCENE-8011) Improve similarity explanations

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274309#comment-16274309
 ] 

ASF GitHub Bot commented on LUCENE-8011:


Github user mayya-sharipova commented on the issue:

https://github.com/apache/lucene-solr/pull/280
  
@jpountz thank you Adrien, I will work on these classes as well


> Improve similarity explanations
> ---
>
> Key: LUCENE-8011
> URL: https://issues.apache.org/jira/browse/LUCENE-8011
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
>  Labels: newdev
>
> LUCENE-7997 improves the BM25 and Classic similarity explanations, e.g.:
> {noformat}
> product of:
>   2.2 = scaling factor, k1 + 1
>   9.388654 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
> 1.0 = n, number of documents containing term
> 17927.0 = N, total number of documents with field
>   0.9987758 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) 
> from:
> 979.0 = freq, occurrences of term within document
> 1.2 = k1, term saturation parameter
> 0.75 = b, length normalization parameter
> 1.0 = dl, length of field
> 1.0 = avgdl, average length of field
> {noformat}
> Previously it was pretty cryptic and used confusing terminology like 
> docCount/docFreq without explanation: 
> {noformat}
> product of:
>   0.016547536 = idf, computed as log(1 + (docCount - docFreq + 0.5) / 
> (docFreq + 0.5)) from:
> 449.0 = docFreq
> 456.0 = docCount
>   2.1920826 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b 
> * fieldLength / avgFieldLength)) from:
> 113659.0 = freq=113658
> 1.2 = parameter k1
> 0.75 = parameter b
> 2300.5593 = avgFieldLength
> 1048600.0 = fieldLength
> {noformat}
> We should fix the other similarities in the same way; their explanations 
> should be just as practical.
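The improved explanation quoted above can be checked numerically; a small Python sketch recomputing the BM25 factors from the values in the first {noformat} block (it mirrors the formulas printed in the explanation, not Lucene's actual implementation):

```python
import math

# Values quoted from the improved BM25 explanation above.
n, N = 1.0, 17927.0     # docs containing term / total docs with field
freq = 979.0            # occurrences of term within document
k1, b = 1.2, 0.75       # term saturation / length normalization parameters
dl, avgdl = 1.0, 1.0    # field length / average field length

# idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
idf = math.log(1 + (N - n + 0.5) / (n + 0.5))

# tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl))
tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))

# final score: scaling factor (k1 + 1) times idf times tf
score = (k1 + 1) * idf * tf
```

Running this reproduces the 9.388654 idf and 0.9987758 tf shown in the explanation, which is exactly the kind of transparency the new format provides.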





