[GitHub] [lucene-solr] atris commented on issue #491: Update for multiple term's suggestion scoring

2019-09-18 Thread GitBox
atris commented on issue #491: Update for multiple term's suggestion scoring
URL: https://github.com/apache/lucene-solr/pull/491#issuecomment-532976319
 
 
   Hi @UtsavVanodiya7 ,
   
   Thanks for raising the PR. I took a brief look and it is a bit hard to grok 
the exact improvements that you are suggesting.
   
   Could you file a JIRA and share your proposal in the description of the 
same? Also, please highlight the limitations in the status quo that you would 
like to fix, the use cases and demonstrable benefits.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #242: a little error about TopDocs

2019-09-18 Thread GitBox
atris commented on issue #242: a little error about TopDocs
URL: https://github.com/apache/lucene-solr/pull/242#issuecomment-532975296
 
 
   Closing this PR -- status quo is correct





[GitHub] [lucene-solr] atris closed pull request #242: a little error about TopDocs

2019-09-18 Thread GitBox
atris closed pull request #242: a little error about TopDocs
URL: https://github.com/apache/lucene-solr/pull/242
 
 
   





[GitHub] [lucene-solr] atris commented on issue #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
atris commented on issue #815: LUCENE-8213: Introduce Asynchronous Caching in 
LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#issuecomment-532948963
 
 
   @mikemccand Thanks, fixed. Interestingly, moving the asynchronous load check 
to `cacheAsynchronously` also removed the need for the new exception. Please 
see the latest and share your comments.





[GitHub] [lucene-solr] dsmiley commented on a change in pull request #884: LUCENE-8980: optimise SegmentTermsEnum.seekExact performance

2019-09-18 Thread GitBox
dsmiley commented on a change in pull request #884: LUCENE-8980: optimise 
SegmentTermsEnum.seekExact performance
URL: https://github.com/apache/lucene-solr/pull/884#discussion_r325967263
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/codecs/blocktree/SegmentTermsEnum.java
 ##
 @@ -321,6 +321,10 @@ public boolean seekExact(BytesRef target) throws 
IOException {
   throw new IllegalStateException("terms index was not loaded");
 }
 
+if (fr.size() > 0 && (target.compareTo(fr.getMin()) < 0 || 
target.compareTo(fr.getMax()) > 0)) {
 
 Review comment:
   +1
   Interestingly, while working on UniformSplit PostingsFormat we identified 
the value in this -- 
`org.apache.lucene.codecs.uniformsplit.BlockReader#seekExact(org.apache.lucene.util.BytesRef)`
 
   If you tend to add data very sequentially in terms of IDs (common for many 
apps, log data, and especially during full indexing), this can be very 
noticeable since most segments will rule out the ID completely.
   
   Still, to be sure, we should run https://github.com/mikemccand/luceneutil/
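The bounds check in the diff above can be illustrated in isolation. The following is a simplified, self-contained sketch only: plain Strings stand in for Lucene's BytesRef terms, and the `Segment` class is hypothetical, not Lucene's API (minTerm/maxTerm per segment and field are the real metrics being discussed).

```java
import java.util.Arrays;
import java.util.List;

// Simplified sketch of the seekExact short-circuit: a segment whose
// [minTerm, maxTerm] range excludes the target cannot contain it, so
// no terms-dictionary I/O is needed for that segment at all.
public class SeekExactSketch {
    static final class Segment {
        final String minTerm, maxTerm;
        Segment(String min, String max) { this.minTerm = min; this.maxTerm = max; }

        // Returns false immediately when the target cannot occur here.
        boolean mightContain(String target) {
            return target.compareTo(minTerm) >= 0 && target.compareTo(maxTerm) <= 0;
        }
    }

    public static void main(String[] args) {
        // Sequentially assigned IDs: each segment covers a disjoint ID
        // range, so most lookups are ruled out by the bounds check alone.
        List<Segment> segments = Arrays.asList(
            new Segment("id0000", "id0999"),
            new Segment("id1000", "id1999"),
            new Segment("id2000", "id2999"));
        String target = "id1500";
        long candidates = segments.stream()
            .filter(s -> s.mightContain(target)).count();
        System.out.println("segments needing a real seek: " + candidates); // prints 1
    }
}
```

This is why the comment notes the gain is largest for sequential-ID workloads: the segment ranges barely overlap, so nearly every segment is skipped.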





[GitHub] [lucene-solr] atris commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
atris commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325964538
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -262,6 +284,15 @@ DocIdSet get(Query key, IndexReader.CacheHelper 
cacheHelper) {
 assert lock.isHeldByCurrentThread();
 assert key instanceof BoostQuery == false;
 assert key instanceof ConstantScoreQuery == false;
+
+/*
+ * If the current query is already being asynchronously cached,
+ * do not trigger another cache operation
+ */
+if (inFlightAsyncLoadQueries.contains(key)) {
 
 Review comment:
   Good idea, fixed, thanks
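The in-flight set under discussion prevents a query from being cached twice concurrently. A minimal self-contained sketch of the pattern follows (class and method names are hypothetical, not LRUQueryCache's actual fields): relying on `Set.add`'s boolean return makes the membership test and the registration a single atomic step, avoiding a separate check-then-act race.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of in-flight de-duplication for asynchronous cache loads.
public class InFlightDedup {
    // Concurrent set tracking keys whose async load is in progress.
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    // Returns true if the caller won the right to load this key;
    // false means another thread is already loading it.
    boolean tryStartLoad(String key) {
        return inFlight.add(key);
    }

    // Must be called when the load completes (or fails).
    void finishLoad(String key) {
        inFlight.remove(key);
    }

    public static void main(String[] args) {
        InFlightDedup cache = new InFlightDedup();
        System.out.println(cache.tryStartLoad("q1")); // true: first loader
        System.out.println(cache.tryStartLoad("q1")); // false: duplicate suppressed
        cache.finishLoad("q1");
        System.out.println(cache.tryStartLoad("q1")); // true again after completion
    }
}
```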





[jira] [Commented] (SOLR-7353) Duplicated child/grand-child docs in a block-join structure should be removed by the shard hosting the docs not by the query controller

2019-09-18 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932984#comment-16932984
 ] 

David Smiley commented on SOLR-7353:


Since all documents (parent/child) require a unique key, I think it should 
simply be invalid for any child document to have an ID that is not unique with 
any other (either in the same nested structure or anywhere else in the index).  
We don't check this but perhaps we should when indexing a doc.
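The index-time check proposed above, walking a document plus its nested children and rejecting any duplicate ID, can be sketched as follows. This is illustrative only: `NestedDoc` is a hypothetical stand-in, not Solr's actual `SolrInputDocument` representation, and real validation would also need to consult IDs already in the index.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of an index-time uniqueness check over a block of nested docs.
public class ChildIdUniquenessCheck {
    static final class NestedDoc {
        final String id;
        final List<NestedDoc> children;
        NestedDoc(String id, NestedDoc... children) {
            this.id = id;
            this.children = Arrays.asList(children);
        }
    }

    // Throws if any ID appears twice anywhere in the nested structure.
    static void validateUniqueIds(NestedDoc root) {
        Set<String> seen = new HashSet<>();
        Deque<NestedDoc> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            NestedDoc doc = stack.pop();
            if (!seen.add(doc.id)) {
                throw new IllegalArgumentException("duplicate id: " + doc.id);
            }
            doc.children.forEach(stack::push);
        }
    }

    public static void main(String[] args) {
        NestedDoc ok = new NestedDoc("1", new NestedDoc("1-1"), new NestedDoc("1-2"));
        validateUniqueIds(ok); // passes silently

        // Mirrors the dup'd grandchildren from the issue description.
        NestedDoc bad = new NestedDoc("2",
            new NestedDoc("2-1", new NestedDoc("2-1-1"), new NestedDoc("2-1-1")));
        try {
            validateUniqueIds(bad);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints: duplicate id: 2-1-1
        }
    }
}
```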

> Duplicated child/grand-child docs in a block-join structure should be removed 
> by the shard hosting the docs not by the query controller
> ---
>
> Key: SOLR-7353
> URL: https://issues.apache.org/jira/browse/SOLR-7353
> Project: Solr
>  Issue Type: Improvement
>Reporter: Timothy Potter
>Priority: Minor
>
> I've indexed the following 8 docs into a 2-shard collection (Solr 4.8'ish - 
> internal custom branch roughly based on 4.8) ... notice that the 3 
> grand-children of 2-1 have dup'd keys:
> {code}
> [
>   {
> "id":"1",
> "name":"parent",
> "_childDocuments_":[
>   {
> "id":"1-1",
> "name":"child"
>   },
>   {
> "id":"1-2",
> "name":"child"
>   }
> ]
>   },
>   {
> "id":"2",
> "name":"parent",
> "_childDocuments_":[
>   {
> "id":"2-1",
> "name":"child",
> "_childDocuments_":[
>   {
> "id":"2-1-1",
> "name":"grandchild"
>   },
>   {
> "id":"2-1-1",
> "name":"grandchild2"
>   },
>   {
> "id":"2-1-1",
> "name":"grandchild3"
>   }
> ]
>   }
> ]
>   }
> ]
> {code}
> When I query this collection, using:
> http://localhost:8984/solr/blockjoin2_shard2_replica1/select?q=*%3A*&wt=json&indent=true&shards.info=true&rows=10
> I get:
> {code}
> {
>   "responseHeader":{
> "status":0,
> "QTime":9,
> "params":{
>   "indent":"true",
>   "q":"*:*",
>   "shards.info":"true",
>   "wt":"json",
>   "rows":"10"}},
>   "shards.info":{
> 
> "http://localhost:8984/solr/blockjoin2_shard1_replica1/|http://localhost:8985/solr/blockjoin2_shard1_replica2/":{
>   "numFound":3,
>   "maxScore":1.0,
>   "shardAddress":"http://localhost:8984/solr/blockjoin2_shard1_replica1",
>   "time":4},
> 
> "http://localhost:8984/solr/blockjoin2_shard2_replica1/|http://localhost:8985/solr/blockjoin2_shard2_replica2/":{
>   "numFound":5,
>   "maxScore":1.0,
>   "shardAddress":"http://localhost:8985/solr/blockjoin2_shard2_replica2",
>   "time":4}},
>   "response":{"numFound":6,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"1-1",
> "name":"child"},
>   {
> "id":"1-2",
> "name":"child"},
>   {
> "id":"1",
> "name":"parent",
> "_version_":1495272401329455104},
>   {
> "id":"2-1-1",
> "name":"grandchild"},
>   {
> "id":"2-1",
> "name":"child"},
>   {
> "id":"2",
> "name":"parent",
> "_version_":1495272401361960960}]
>   }}
> {code}
> So Solr has de-duped the results.
> If I execute this query against the shard that has the dupes (distrib=false):
> http://localhost:8984/solr/blockjoin2_shard2_replica1/select?q=*%3A*&wt=json&indent=true&shards.info=true&rows=10&distrib=false
> Then the dupes are returned:
> {code}
> {
>   "responseHeader":{
> "status":0,
> "QTime":0,
> "params":{
>   "indent":"true",
>   "q":"*:*",
>   "shards.info":"true",
>   "distrib":"false",
>   "wt":"json",
>   "rows":"10"}},
>   "response":{"numFound":5,"start":0,"docs":[
>   {
> "id":"2-1-1",
> "name":"grandchild"},
>   {
> "id":"2-1-1",
> "name":"grandchild2"},
>   {
> "id":"2-1-1",
> "name":"grandchild3"},
>   {
> "id":"2-1",
> "name":"child"},
>   {
> "id":"2",
> "name":"parent",
> "_version_":1495272401361960960}]
>   }}
> {code}
> Shouldn't the distrib and non-distrib (direct to shard) queries produce 
> consistent results wrt this block?
> Of course we shouldn't index dupes, but I don't think it's the query 
> controller's job to de-dupe and change numDocs, esp. based on the value of 
> the rows parameter. Other users have reported this problem on the mailing 
> list:
> >>>
> We've seen this as well. Before we understood the cause, it seemed very
> bizarre that hitting different nodes would yield different numFound, as
> well as using different rows=N (since the proxying node only de-dupes the
> documents that are returned in the response).
> I think "consistency" and "correctness" should be clearly delineated. Of
> course we'd 

[jira] [Resolved] (SOLR-6596) Atomic update and adding child doc not working together

2019-09-18 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-6596.

Fix Version/s: 8.1
   Resolution: Fixed

> Atomic update and adding child doc not working together
> ---
>
> Key: SOLR-6596
> URL: https://issues.apache.org/jira/browse/SOLR-6596
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.10
>Reporter: Ishan Chattopadhyaya
>Priority: Minor
> Fix For: 8.1
>
> Attachments: SOLR-6596-tests.patch, SOLR-6596-tests.patch
>
>
> Was able to reproduce the issue reported here:
> http://qnalist.com/questions/5175783/solrj-bug-related-to-solrj-4-10-for-having-both-incremental-partial-update-and-child-document-on-the-same-solr-document
> The following two failing tests:
> 1. 
> a) Add a document 'parent'. Commit.
> b) Make atomic update to 'parent' doc
> c) Add a child doc to 'parent'. Commit.
> Expected 2 documents, Actual 1 document
> 2. 
> a) Add a document with id 'parent'
> b) Add another document with id 'parent' with a child 'child'. Commit
> Expected 2 documents (the overwritten parent document and the child), Actual 
> 3 documents (two documents with the id 'parent').



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Closed] (SOLR-6596) Atomic update and adding child doc not working together

2019-09-18 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley closed SOLR-6596.
--
Assignee: David Smiley

> Atomic update and adding child doc not working together
> ---
>
> Key: SOLR-6596
> URL: https://issues.apache.org/jira/browse/SOLR-6596
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.10
>Reporter: Ishan Chattopadhyaya
>Assignee: David Smiley
>Priority: Minor
> Fix For: 8.1
>
> Attachments: SOLR-6596-tests.patch, SOLR-6596-tests.patch
>
>
> Was able to reproduce the issue reported here:
> http://qnalist.com/questions/5175783/solrj-bug-related-to-solrj-4-10-for-having-both-incremental-partial-update-and-child-document-on-the-same-solr-document
> The following two failing tests:
> 1. 
> a) Add a document 'parent'. Commit.
> b) Make atomic update to 'parent' doc
> c) Add a child doc to 'parent'. Commit.
> Expected 2 documents, Actual 1 document
> 2. 
> a) Add a document with id 'parent'
> b) Add another document with id 'parent' with a child 'child'. Commit
> Expected 2 documents (the overwritten parent document and the child), Actual 
> 3 documents (two documents with the id 'parent').






[jira] [Updated] (SOLR-6596) Atomic update and adding child doc not working together

2019-09-18 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-6596:
---
Attachment: SOLR-6596-tests.patch
Status: Open  (was: Open)

I applied the patch and updated it to the master branch, plus made some small 
changes to show that it can be made to work thanks to SOLR-12638.  The _root_ 
field needs to be stored (or docValues) – unfortunately.  And child documents 
need to be added as named child documents instead of the classic "anonymous" 
style.  An atomic update doc even needs to add the child document in this way.  
The test shows this now.  The JavaBin-mode tests work here, although the 
XML mode does not due to SOLR-12677.  I'm going to close this as fixed.

> Atomic update and adding child doc not working together
> ---
>
> Key: SOLR-6596
> URL: https://issues.apache.org/jira/browse/SOLR-6596
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.10
>Reporter: Ishan Chattopadhyaya
>Priority: Minor
> Attachments: SOLR-6596-tests.patch, SOLR-6596-tests.patch
>
>
> Was able to reproduce the issue reported here:
> http://qnalist.com/questions/5175783/solrj-bug-related-to-solrj-4-10-for-having-both-incremental-partial-update-and-child-document-on-the-same-solr-document
> The following two failing tests:
> 1. 
> a) Add a document 'parent'. Commit.
> b) Make atomic update to 'parent' doc
> c) Add a child doc to 'parent'. Commit.
> Expected 2 documents, Actual 1 document
> 2. 
> a) Add a document with id 'parent'
> b) Add another document with id 'parent' with a child 'child'. Commit
> Expected 2 documents (the overwritten parent document and the child), Actual 
> 3 documents (two documents with the id 'parent').






[jira] [Comment Edited] (LUCENE-8980) Optimise SegmentTermsEnum.seekExact performance

2019-09-18 Thread Guoqiang Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932967#comment-16932967
 ] 

Guoqiang Jiang edited comment on LUCENE-8980 at 9/19/19 1:25 AM:
-

Please help to take a look, thanks:)


was (Author: jgq2008303393):
Please help to take a look, thanks.

> Optimise SegmentTermsEnum.seekExact performance
> ---
>
> Key: LUCENE-8980
> URL: https://issues.apache.org/jira/browse/LUCENE-8980
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 8.2
>Reporter: Guoqiang Jiang
>Priority: Major
>  Labels: performance
> Fix For: master (9.0)
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> *Description*
> In Elasticsearch, each document has an _id field that uniquely identifies it, 
> which is indexed so that documents can be looked up from Lucene. When users 
> write Elasticsearch with self-generated _id values, even if the conflict rate 
> is very low, Elasticsearch has to check _id uniqueness through the Lucene API for 
> each document, which results in poor write performance.
>  
> *Solution*
> 1. Choose a better _id generator before writing ES
> Different _id formats have a great impact on write performance. We have 
> verified this in production cluster. Users can refer to the following blog 
> and choose a better _id generator.
> [http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html]
> 2. Optimise with min/maxTerm metrics in Lucene
> As Lucene stores min/maxTerm metrics for each segment and field, we can use 
> those metrics to optimise the performance of the Lucene lookup API. When calling 
> SegmentTermsEnum.seekExact() to look up a term in one segment, we can check 
> whether the term falls within the range of minTerm and maxTerm, so that we skip 
> useless segments as early as possible.
>  
> *Tests*
> I ran some write benchmarks using _id values in UUID v1 format; the 
> results are as follows:
> ||Branch||Write speed after 4h||CPU cost||Overall improvement||Write speed after 8h||CPU cost||Overall improvement||
> |Original Lucene|29.9w/s|68.4%|N/A|26.7w/s|66.6%|N/A|
> |Optimised Lucene|34.5w/s (+15.4%)|63.8% (-6.7%)|+22.1%|31.5w/s (+18.0%)|61.5% (-7.7%)|+25.7%|
> As shown above, after 8 hours of continuous writing, write speed improves by 
> 18.0%, CPU cost decreases by 7.7%, and overall performance improves by 25.7%. 
> The Elasticsearch GET API and ids query would get similar performance 
> improvements.
> Note that the benchmark needs to run continuously for several hours, 
> because the performance improvement is not obvious when the 
> data is completely cached or the number of segments is too small.






[jira] [Commented] (LUCENE-8980) Optimise SegmentTermsEnum.seekExact performance

2019-09-18 Thread Guoqiang Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932967#comment-16932967
 ] 

Guoqiang Jiang commented on LUCENE-8980:


Please help to take a look, thanks.

> Optimise SegmentTermsEnum.seekExact performance
> ---
>
> Key: LUCENE-8980
> URL: https://issues.apache.org/jira/browse/LUCENE-8980
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 8.2
>Reporter: Guoqiang Jiang
>Priority: Major
>  Labels: performance
> Fix For: master (9.0)
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> *Description*
> In Elasticsearch, each document has an _id field that uniquely identifies it, 
> which is indexed so that documents can be looked up from Lucene. When users 
> write Elasticsearch with self-generated _id values, even if the conflict rate 
> is very low, Elasticsearch has to check _id uniqueness through the Lucene API for 
> each document, which results in poor write performance.
>  
> *Solution*
> 1. Choose a better _id generator before writing ES
> Different _id formats have a great impact on write performance. We have 
> verified this in production cluster. Users can refer to the following blog 
> and choose a better _id generator.
> [http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html]
> 2. Optimise with min/maxTerm metrics in Lucene
> As Lucene stores min/maxTerm metrics for each segment and field, we can use 
> those metrics to optimise the performance of the Lucene lookup API. When calling 
> SegmentTermsEnum.seekExact() to look up a term in one segment, we can check 
> whether the term falls within the range of minTerm and maxTerm, so that we skip 
> useless segments as early as possible.
>  
> *Tests*
> I ran some write benchmarks using _id values in UUID v1 format; the 
> results are as follows:
> ||Branch||Write speed after 4h||CPU cost||Overall improvement||Write speed after 8h||CPU cost||Overall improvement||
> |Original Lucene|29.9w/s|68.4%|N/A|26.7w/s|66.6%|N/A|
> |Optimised Lucene|34.5w/s (+15.4%)|63.8% (-6.7%)|+22.1%|31.5w/s (+18.0%)|61.5% (-7.7%)|+25.7%|
> As shown above, after 8 hours of continuous writing, write speed improves by 
> 18.0%, CPU cost decreases by 7.7%, and overall performance improves by 25.7%. 
> The Elasticsearch GET API and ids query would get similar performance 
> improvements.
> Note that the benchmark needs to run continuously for several hours, 
> because the performance improvement is not obvious when the 
> data is completely cached or the number of segments is too small.






[jira] [Commented] (SOLR-13778) Windows JDK SSL Test Failure trend: SSLException: Software caused connection abort: recv failed

2019-09-18 Thread Hoss Man (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932942#comment-16932942
 ] 

Hoss Man commented on SOLR-13778:
-

Here's a full example of what one of these stack traces tends to look like...
{noformat}
...
   [junit4]> Caused by: javax.net.ssl.SSLException: Software caused 
connection abort: recv failed
   [junit4]>at 
java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
   [junit4]>at 
java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:320)
   [junit4]>at 
java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:263)
   [junit4]>at 
java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:258)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1342)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:844)
   [junit4]>at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
   [junit4]>at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
   [junit4]>at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
   [junit4]>at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
   [junit4]>at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
   [junit4]>at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
   [junit4]>at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
   [junit4]>at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
   [junit4]>at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
   [junit4]>at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
   [junit4]>at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
   [junit4]>at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
   [junit4]>at 
org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
   [junit4]>at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
   [junit4]>at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
   [junit4]>at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
   [junit4]>at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
   [junit4]>at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:564)
   [junit4]>... 46 more
   [junit4]>Suppressed: java.net.SocketException: Software caused 
connection abort: socket write error
   [junit4]>at 
java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
   [junit4]>at 
java.base/java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:110)
   [junit4]>at 
java.base/java.net.SocketOutputStream.write(SocketOutputStream.java:150)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketOutputRecord.encodeAlert(SSLSocketOutputRecord.java:81)
   [junit4]>at 
java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:351)
   [junit4]>... 68 more
   [junit4]> Caused by: java.net.SocketException: Software caused 
connection abort: recv failed
   [junit4]>at 
java.base/java.net.SocketInputStream.socketRead0(Native Method)
   [junit4]>at 
java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
   [junit4]>at 
java.base/java.net.SocketInputStream.read(SocketInputStream.java:168)
   [junit4]>at 
java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:448)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1132)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:828)
   [junit4]>... 64 more
{noformat}
Although it's not obvious from the public view of my reports, grepping all the 
available logs (available 
[jira] [Created] (SOLR-13778) Windows JDK SSL Test Failure trend: SSLException: Software caused connection abort: recv failed

2019-09-18 Thread Hoss Man (Jira)
Hoss Man created SOLR-13778:
---

 Summary: Windows JDK SSL Test Failure trend: SSLException: 
Software caused connection abort: recv failed
 Key: SOLR-13778
 URL: https://issues.apache.org/jira/browse/SOLR-13778
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man


Now that Uwe's jenkins build has been correctly reporting its build results 
for my [automated 
reports|http://fucit.org/solr-jenkins-reports/failure-report.html] to pick up, 
I've noticed a pattern of failures that indicates a definite problem with using 
SSL on Windows (even with java 11.0.4).

The symptomatic stack traces all contain...
{noformat}
...
   [junit4]> Caused by: javax.net.ssl.SSLException: Software caused 
connection abort: recv failed
   [junit4]>at 
java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
...
   [junit4]> Caused by: java.net.SocketException: Software caused 
connection abort: recv failed
   [junit4]>at 
java.base/java.net.SocketInputStream.socketRead0(Native Method)
...
{noformat}
I suspect this may be related to 
[https://bugs.openjdk.java.net/browse/JDK-8209333] but i have no concrete 
evidence to back this up.

I'll post some details of my analysis in comments...






[jira] [Updated] (SOLR-13777) contrib/ltr (and maybe others?) does not have test logging configured correctly

2019-09-18 Thread Hoss Man (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-13777:

Summary: contrib/ltr (and maybe others?) does not have test logging 
configured correctly  (was: contrib/ltr does not have test logging configured 
correctly)

> contrib/ltr (and maybe others?) does not have test logging configured 
> correctly
> ---
>
> Key: SOLR-13777
> URL: https://issues.apache.org/jira/browse/SOLR-13777
> Project: Solr
>  Issue Type: Test
>  Security Level: Public (Default Security Level. Issues are Public)
>Reporter: Hoss Man
>Priority: Major
>
> I'm not 100% certain what magical properties/files are needed to get test 
> logging set up consistently across all solr modules, but it doesn't seem to be 
> in place for contrib/ltr.
> Consider this trivial example...
> {noformat}
> hossman@slate:~/lucene/dev/solr/contrib/ltr [j11] [master] $ ant test 
> -Dtestcase=TestManagedFeatureStore -Dtests.showSuccess=true
> ...
> -test:
>[junit4]  says 你好! Master seed: CEC8592E171B132F
>[junit4] Executing 1 suite with 1 JVM.
>[junit4] 
>[junit4] Started J0 PID(18975@slate).
>[junit4] Suite: org.apache.solr.ltr.store.rest.TestManagedFeatureStore
>[junit4]   2> ERROR StatusLogger No Log4j 2 configuration file found. 
> Using default configuration (logging only errors to the console), or user 
> programmatically provided configurations. Set system property 'log4j2.debug' 
> to show Log4j 2 internal initialization logging. See 
> https://logging.apache.org/log4j/2.x/manual/configuration.html for 
> instructions on how to configure Log4j 2
>[junit4] OK  2.37s | 
> TestManagedFeatureStore.testMissingFeatureReturnsNull
>[junit4] OK  0.66s | 
> TestManagedFeatureStore.testDefaultFeatureStoreName
>[junit4] OK  0.48s | TestManagedFeatureStore.getInstanceTest
>[junit4] OK  0.34s | TestManagedFeatureStore.testFeatureStoreAdd
>[junit4] OK  0.35s | TestManagedFeatureStore.getInvalidInstanceTest
>[junit4] OK  0.35s | TestManagedFeatureStore.testFeatureStoreGet
>[junit4] Completed [1/1] in 5.71s, 6 tests
> ...
> {noformat}
> Notably this means that when tests in this module fail as part of jenkins 
> builds, the only thing that gets logged is the final exception -- not any of 
> the normal test scaffolding (like what randomized SSL/clientAuth values were 
> used) or logging leading up to that failure.






[jira] [Created] (SOLR-13777) contrib/ltr does not have test logging configured correctly

2019-09-18 Thread Hoss Man (Jira)
Hoss Man created SOLR-13777:
---

 Summary: contrib/ltr does not have test logging configured 
correctly
 Key: SOLR-13777
 URL: https://issues.apache.org/jira/browse/SOLR-13777
 Project: Solr
  Issue Type: Test
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man


I'm not 100% certain what magical properties/files are needed to get test 
logging set up consistently across all solr modules, but it doesn't seem to be 
in place for contrib/ltr.

Consider this trivial example...

{noformat}
hossman@slate:~/lucene/dev/solr/contrib/ltr [j11] [master] $ ant test 
-Dtestcase=TestManagedFeatureStore -Dtests.showSuccess=true
...
-test:
   [junit4]  says 你好! Master seed: CEC8592E171B132F
   [junit4] Executing 1 suite with 1 JVM.
   [junit4] 
   [junit4] Started J0 PID(18975@slate).
   [junit4] Suite: org.apache.solr.ltr.store.rest.TestManagedFeatureStore
   [junit4]   2> ERROR StatusLogger No Log4j 2 configuration file found. Using 
default configuration (logging only errors to the console), or user 
programmatically provided configurations. Set system property 'log4j2.debug' to 
show Log4j 2 internal initialization logging. See 
https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions 
on how to configure Log4j 2
   [junit4] OK  2.37s | 
TestManagedFeatureStore.testMissingFeatureReturnsNull
   [junit4] OK  0.66s | TestManagedFeatureStore.testDefaultFeatureStoreName
   [junit4] OK  0.48s | TestManagedFeatureStore.getInstanceTest
   [junit4] OK  0.34s | TestManagedFeatureStore.testFeatureStoreAdd
   [junit4] OK  0.35s | TestManagedFeatureStore.getInvalidInstanceTest
   [junit4] OK  0.35s | TestManagedFeatureStore.testFeatureStoreGet
   [junit4] Completed [1/1] in 5.71s, 6 tests
...
{noformat}

Notably this means that when tests in this module fail as part of jenkins 
builds, the only thing that gets logged is the final exception -- not any of 
the normal test scaffolding (like what randomized SSL/clientAuth values were 
used) or logging leading up to that failure.






[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325915728
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -262,6 +284,15 @@ DocIdSet get(Query key, IndexReader.CacheHelper 
cacheHelper) {
 assert lock.isHeldByCurrentThread();
 assert key instanceof BoostQuery == false;
 assert key instanceof ConstantScoreQuery == false;
+
+/*
+ * If the current query is already being asynchronously cached,
+ * do not trigger another cache operation
+ */
+if (inFlightAsyncLoadQueries.contains(key)) {
 
 Review comment:
   Hmm can we fix the concurrency issue here, e.g. just call 
`inFlightAsyncLoadQueries.add` and if the returned value is `false` (because 
the query was already in the set) then throw the exception?  Or, move this 
check down into the `cacheAsynchronously` method where we are doing the `add`?
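The atomic "add and check the return value" pattern the reviewer suggests can be sketched as follows. This is an illustrative sketch only; the `InFlightRegistry` class and its names are hypothetical, not the actual LRUQueryCache fields:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the suggested fix: Set.add already reports whether the element
// was present, so the membership check and the insertion happen in a single
// atomic step instead of a racy contains-then-add sequence.
public class InFlightRegistry {
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    /** Returns true if the caller won the right to load this key;
     *  a concurrent duplicate request sees false. */
    public boolean tryBeginLoad(String key) {
        return inFlight.add(key);
    }

    /** Must be called when the load completes (or fails). */
    public void endLoad(String key) {
        inFlight.remove(key);
    }
}
```

A caller would then treat a false return as "already in flight" and either skip the cache load or throw, with no separate contains() check to race against.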


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325915208
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +924,20 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+// NOTE: Potentially, two threads can trigger a load for the same query 
concurrently as the check for presence of the query
+// done upstream and the lock is not he
 
 Review comment:
   Hmm this sentence abruptly ended?





[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325844374
 
 

 ##
 File path: lucene/CHANGES.txt
 ##
 @@ -56,6 +56,9 @@ Improvements
 * LUCENE-8937: Avoid agressive stemming on numbers in the FrenchMinimalStemmer.
   (Adrien Gallou via Tomoko Uchida)
 
+* LUCENE-8213: LRUQueryCache#doCache can now use IndexSearcher's Executor(if 
present)
 
 Review comment:
   Space before `(if present`?





[jira] [Commented] (SOLR-13763) Improve the tracking of "freedisk" in autoscaling simulations

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932852#comment-16932852
 ] 

ASF subversion and git services commented on SOLR-13763:


Commit d50085f1cb771ede941567878fd84c48498b9577 in lucene-solr's branch 
refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d50085f ]

SOLR-13763: Ignore freedisk changes in a live simulator created from snapshot.


> Improve the tracking of "freedisk" in autoscaling simulations
> -
>
> Key: SOLR-13763
> URL: https://issues.apache.org/jira/browse/SOLR-13763
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.3
>
>
> The "freedisk" node metric is tracked closely when adding / removing / moving 
> replicas but it's not tracked for simulated updates, even though the 
> corresponding simulated replica sizes are.
> This causes some inconsistencies in "freedisk" calculation and reporting, 
> which may affect the results of simulations.






[jira] [Commented] (SOLR-13763) Improve the tracking of "freedisk" in autoscaling simulations

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932849#comment-16932849
 ] 

ASF subversion and git services commented on SOLR-13763:


Commit 9e449ad0bcc8c95bc4a4164362c4f652a92f3910 in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9e449ad ]

SOLR-13763: Ignore freedisk changes in a live simulator created from snapshot.


> Improve the tracking of "freedisk" in autoscaling simulations
> -
>
> Key: SOLR-13763
> URL: https://issues.apache.org/jira/browse/SOLR-13763
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.3
>
>
> The "freedisk" node metric is tracked closely when adding / removing / moving 
> replicas but it's not tracked for simulated updates, even though the 
> corresponding simulated replica sizes are.
> This causes some inconsistencies in "freedisk" calculation and reporting, 
> which may affect the results of simulations.






[jira] [Updated] (SOLR-9640) Support PKI authentication and SSL in standalone-mode master/slave auth with local security.json

2019-09-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-9640:
--
Fix Version/s: (was: 8.1)
   (was: master (9.0))

> Support PKI authentication and SSL in standalone-mode master/slave auth with 
> local security.json
> 
>
> Key: SOLR-9640
> URL: https://issues.apache.org/jira/browse/SOLR-9640
> Project: Solr
>  Issue Type: New Feature
>  Components: Authentication, security
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>  Labels: authentication, pki
> Attachments: SOLR-9640.patch, SOLR-9640.patch, SOLR-9640.patch, 
> SOLR-9640.patch, SOLR-9640.patch, SOLR-9640.patch, SOLR-9640.patch, 
> SOLR-9640.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While working with SOLR-9481 I managed to secure Solr standalone on a 
> single-node server. However, when adding 
> {{=localhost:8081/solr/foo,localhost:8082/solr/foo}} to the request, I 
> get 401 error. This issue will fix PKI auth to work for standalone, which 
> should automatically make both sharding and master/slave index replication 
> work.






[jira] [Updated] (SOLR-13773) Add env-var options to prometheus-exporter start script.

2019-09-18 Thread Anshum Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-13773:

Fix Version/s: 8.3
   master (9.0)

> Add env-var options to prometheus-exporter start script.
> 
>
> Key: SOLR-13773
> URL: https://issues.apache.org/jira/browse/SOLR-13773
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: master (9.0), 8.3
>Reporter: Houston Putman
>Priority: Minor
>  Labels: metric-collector
> Fix For: master (9.0), 8.3
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The startup options for the Prometheus Exporter are pretty sparse compared to 
> what the Solr start script offers.
> I've added some options that mirror what Solr offers, such as
>  * SOLR_HEAP
>  * SOLR_JAVA_MEM
>  * GC_TUNE
> Having just the memory settings available would let us start the prometheus 
> exporter with more than 500 MB of heap, which right now isn't possible as the 
> max heap is [hard coded 
> here|https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/bin/solr-exporter#L107].






[GitHub] [lucene-solr] anshumg merged pull request #890: SOLR-13773: Prometheus Exporter GC and Heap options (#887)

2019-09-18 Thread GitBox
anshumg merged pull request #890: SOLR-13773: Prometheus Exporter GC and Heap 
options (#887)
URL: https://github.com/apache/lucene-solr/pull/890
 
 
   





[jira] [Commented] (SOLR-13773) Add env-var options to prometheus-exporter start script.

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932816#comment-16932816
 ] 

ASF subversion and git services commented on SOLR-13773:


Commit b9633e0f26052155a18fcbdd08b7fa3a30a88e37 in lucene-solr's branch 
refs/heads/branch_8x from Anshum Gupta
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b9633e0 ]

SOLR-13773: Prometheus Exporter GC and Heap options (#887) (#890)

* SOLR-13773: Prometheus Exporter GC and Heap options

* Adding info to the ref-guide.

> Add env-var options to prometheus-exporter start script.
> 
>
> Key: SOLR-13773
> URL: https://issues.apache.org/jira/browse/SOLR-13773
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: master (9.0), 8.3
>Reporter: Houston Putman
>Priority: Minor
>  Labels: metric-collector
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The startup options for the Prometheus Exporter are pretty sparse compared to 
> what the Solr start script offers.
> I've added some options that mirror what Solr offers, such as
>  * SOLR_HEAP
>  * SOLR_JAVA_MEM
>  * GC_TUNE
> Having just the memory settings available would let us start the prometheus 
> exporter with more than 500 MB of heap, which right now isn't possible as the 
> max heap is [hard coded 
> here|https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/bin/solr-exporter#L107].






[jira] [Commented] (SOLR-13773) Add env-var options to prometheus-exporter start script.

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932815#comment-16932815
 ] 

ASF subversion and git services commented on SOLR-13773:


Commit b9633e0f26052155a18fcbdd08b7fa3a30a88e37 in lucene-solr's branch 
refs/heads/branch_8x from Anshum Gupta
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b9633e0 ]

SOLR-13773: Prometheus Exporter GC and Heap options (#887) (#890)

* SOLR-13773: Prometheus Exporter GC and Heap options

* Adding info to the ref-guide.

> Add env-var options to prometheus-exporter start script.
> 
>
> Key: SOLR-13773
> URL: https://issues.apache.org/jira/browse/SOLR-13773
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: master (9.0), 8.3
>Reporter: Houston Putman
>Priority: Minor
>  Labels: metric-collector
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The startup options for the Prometheus Exporter are pretty sparse compared to 
> what the Solr start script offers.
> I've added some options that mirror what Solr offers, such as
>  * SOLR_HEAP
>  * SOLR_JAVA_MEM
>  * GC_TUNE
> Having just the memory settings available would let us start the prometheus 
> exporter with more than 500 MB of heap, which right now isn't possible as the 
> max heap is [hard coded 
> here|https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/bin/solr-exporter#L107].






[GitHub] [lucene-solr] anshumg opened a new pull request #890: SOLR-13773: Prometheus Exporter GC and Heap options (#887)

2019-09-18 Thread GitBox
anshumg opened a new pull request #890: SOLR-13773: Prometheus Exporter GC and 
Heap options (#887)
URL: https://github.com/apache/lucene-solr/pull/890
 
 
   Cherrypick from master (c7f84873280d7d60ebdfff72e0c72fb60cf24e69)
   * SOLR-13773: Prometheus Exporter GC and Heap options
   * Adding info to the ref-guide.
   





[jira] [Commented] (SOLR-13773) Add env-var options to prometheus-exporter start script.

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932799#comment-16932799
 ] 

ASF subversion and git services commented on SOLR-13773:


Commit c7f84873280d7d60ebdfff72e0c72fb60cf24e69 in lucene-solr's branch 
refs/heads/master from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c7f8487 ]

SOLR-13773: Prometheus Exporter GC and Heap options (#887)

* SOLR-13773: Prometheus Exporter GC and Heap options

* Adding info to the ref-guide.


> Add env-var options to prometheus-exporter start script.
> 
>
> Key: SOLR-13773
> URL: https://issues.apache.org/jira/browse/SOLR-13773
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: master (9.0), 8.3
>Reporter: Houston Putman
>Priority: Minor
>  Labels: metric-collector
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The startup options for the Prometheus Exporter are pretty sparse compared to 
> what the Solr start script offers.
> I've added some options that mirror what Solr offers, such as
>  * SOLR_HEAP
>  * SOLR_JAVA_MEM
>  * GC_TUNE
> Having just the memory settings available would let us start the prometheus 
> exporter with more than 500 MB of heap, which right now isn't possible as the 
> max heap is [hard coded 
> here|https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/bin/solr-exporter#L107].






[jira] [Commented] (SOLR-13773) Add env-var options to prometheus-exporter start script.

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932800#comment-16932800
 ] 

ASF subversion and git services commented on SOLR-13773:


Commit c7f84873280d7d60ebdfff72e0c72fb60cf24e69 in lucene-solr's branch 
refs/heads/master from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c7f8487 ]

SOLR-13773: Prometheus Exporter GC and Heap options (#887)

* SOLR-13773: Prometheus Exporter GC and Heap options

* Adding info to the ref-guide.


> Add env-var options to prometheus-exporter start script.
> 
>
> Key: SOLR-13773
> URL: https://issues.apache.org/jira/browse/SOLR-13773
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: master (9.0), 8.3
>Reporter: Houston Putman
>Priority: Minor
>  Labels: metric-collector
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The startup options for the Prometheus Exporter are pretty sparse compared to 
> what the Solr start script offers.
> I've added some options that mirror what Solr offers, such as
>  * SOLR_HEAP
>  * SOLR_JAVA_MEM
>  * GC_TUNE
> Having just the memory settings available would let us start the prometheus 
> exporter with more than 500 Mb of heap, which right now isn't possible as the 
> max heap is [hard coded 
> here|https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/bin/solr-exporter#L107].






[GitHub] [lucene-solr] anshumg commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap options

2019-09-18 Thread GitBox
anshumg commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap 
options
URL: https://github.com/apache/lucene-solr/pull/887#issuecomment-532854058
 
 
   Thanks @HoustonPutman. I'll also cherry-pick this into 8x.





[jira] [Updated] (SOLR-13776) Allow derivatives to be computed for derivatives

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13776:
--
Description: The *derivative* Stream Evaluator operates over functions 
(spline, lerp, akima, loess, polyfit, oscillate etc...), but does not work with 
the function returned by the derivative function itself. This means a spline 
needs to be wrapped around the derivative to take a second derivative. This 
ticket will allow the derivative function to be applied directly to the output of the 
derivative function.  (was: The *derivative* Stream Evaluator operates over 
functions (spline, lerp, akima, loess, polyfit, oscillate etc...), but does not 
work with the function created by the derivative function. This means a spline 
needs to be wrapped around the derivative to take a second derivative. This 
ticket will allow the derivative function to be applied directly to the output of the 
derivative function.)

> Allow derivatives to be computed for derivatives
> 
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Minor
>
> The *derivative* Stream Evaluator operates over functions (spline, lerp, 
> akima, loess, polyfit, oscillate etc...), but does not work with the function 
> returned by the derivative function itself. This means a spline needs to be 
> wrapped around the derivative to take a second derivative. This ticket will 
> allow the derivative function to applied directly to output of the derivative 
> function.






[jira] [Updated] (SOLR-13776) Allow derivatives to be computed from derivatives

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13776:
--
Description: The *derivative* Stream Evaluator operates over functions 
(spline, lerp, akima, loess, polyfit, oscillate etc...), but does not work with 
the function created by the derivative function. This means a spline needs to 
be wrapped around the derivative to take a second derivative. This ticket will 
allow the derivative function to be applied directly to the output of the derivative 
function.  (was: Currently the *derivative* Stream Evaluator operates over 
functions (spline, lerp, akima, loess, polyfit, oscillate etc...). It would be 
useful if the *derivative* function could operate directly over a vector.)

> Allow derivatives to be computed from derivatives
> -
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Minor
>
> The *derivative* Stream Evaluator operates over functions (spline, lerp, 
> akima, loess, polyfit, oscillate etc...), but does not work with the function 
> created by the derivative function. This means a spline needs to be wrapped 
> around the derivative to take a second derivative. This ticket will allow the 
> derivative function to be applied directly to the output of the derivative function.






[jira] [Updated] (SOLR-13776) Allow derivatives to be computed for derivatives

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13776:
--
Summary: Allow derivatives to be computed for derivatives  (was: Allow 
derivatives to be computed from derivatives)

> Allow derivatives to be computed for derivatives
> 
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Minor
>
> The *derivative* Stream Evaluator operates over functions (spline, lerp, 
> akima, loess, polyfit, oscillate etc...), but does not work with the function 
> created by the derivative function. This means a spline needs to be wrapped 
> around the derivative to take a second derivative. This ticket will allow the 
> derivative function to be applied directly to the output of the derivative function.






[jira] [Updated] (SOLR-13776) Allow derivatives to be computed from derivatives

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13776:
--
Summary: Allow derivatives to be computed from derivatives  (was: Allow 
derivatives to be computed directly from vectors)

> Allow derivatives to be computed from derivatives
> -
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Minor
>
> Currently the *derivative* Stream Evaluator operates over functions (spline, 
> lerp, akima, loess, polyfit, oscillate etc...). It would be useful if the 
> *derivative* function could operate directly over a vector.






[GitHub] [lucene-solr] HoustonPutman commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap options

2019-09-18 Thread GitBox
HoustonPutman commented on issue #887: SOLR-13773: Prometheus Exporter GC and 
Heap options
URL: https://github.com/apache/lucene-solr/pull/887#issuecomment-532828998
 
 
   Awesome, thanks Anshum. Edited the ref guide and added a change log entry. 
Let me know if you want me to reword anything.





[jira] [Commented] (SOLR-13763) Improve the tracking of "freedisk" in autoscaling simulations

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932741#comment-16932741
 ] 

ASF subversion and git services commented on SOLR-13763:


Commit 84bf86f99976a8ecbc46236ab12b578d76678e20 in lucene-solr's branch 
refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=84bf86f ]

SOLR-13763: Improve the tracking of "freedisk" in autoscaling simulations.


> Improve the tracking of "freedisk" in autoscaling simulations
> -
>
> Key: SOLR-13763
> URL: https://issues.apache.org/jira/browse/SOLR-13763
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.3
>
>
> The "freedisk" node metric is tracked closely when adding / removing / moving 
> replicas but it's not tracked for simulated updates, even though the 
> corresponding simulated replica sizes are.
> This causes some inconsistencies in "freedisk" calculation and reporting, 
> which may affect the results of simulations.






[jira] [Resolved] (SOLR-13763) Improve the tracking of "freedisk" in autoscaling simulations

2019-09-18 Thread Andrzej Bialecki (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  resolved SOLR-13763.
--
Resolution: Fixed

> Improve the tracking of "freedisk" in autoscaling simulations
> -
>
> Key: SOLR-13763
> URL: https://issues.apache.org/jira/browse/SOLR-13763
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.3
>
>
> The "freedisk" node metric is tracked closely when adding / removing / moving 
> replicas but it's not tracked for simulated updates, even though the 
> corresponding simulated replica sizes are.
> This causes some inconsistencies in "freedisk" calculation and reporting, 
> which may affect the results of simulations.






[jira] [Updated] (LUCENE-8977) Handle punctuation characters in KoreanTokenizer

2019-09-18 Thread Namgyu Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namgyu Kim updated LUCENE-8977:
---
Issue Type: Improvement  (was: Bug)

> Handle punctuation characters in KoreanTokenizer
> 
>
> Key: LUCENE-8977
> URL: https://issues.apache.org/jira/browse/LUCENE-8977
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Namgyu Kim
>Priority: Minor
>
> As we discussed on LUCENE-8966, KoreanTokenizer now always divides a run of 
> consecutive punctuation marks into the first character and the rest.
>  (사이즈 => [사이즈] [.] [...])
>  But KoreanTokenizer doesn't divide when first character is punctuation.
>  (...사이즈 => [...] [사이즈])
> It looks like the result comes from the Viterbi path, but users may find the 
> following cases strange:
>  ("사이즈" means "size" in Korean)
> ||Case #1||Case #2||
> |Input : "...사이즈..."|Input : "...4..4사이즈"|
> |Result : [...] [사이즈] [.] [..]|Result : [...] [4] [.] [.] [4] [사이즈]|
> From what I checked, Nori has punctuation characters (like . and ,) in its 
> dictionary but Kuromoji does not.
>  ("サイズ" means "size" in Japanese)
> ||Case #1||Case #2||
> |Input : "...サイズ..."|Input : "...4..4サイズ"|
> |Result : [...] [サイズ] [...]|Result : [...] [4] [..] [4] [サイズ]|
> There are some ways to resolve it, like hard-coding punctuation handling, but 
> none of them seems good.
>  So I think we need to discuss it.






[GitHub] [lucene-solr] anshumg commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap options

2019-09-18 Thread GitBox
anshumg commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap 
options
URL: https://github.com/apache/lucene-solr/pull/887#issuecomment-532789014
 
 
   Looks good to me. Can you also add the ref guide changes here, and I'll 
merge this in right after. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13763) Improve the tracking of "freedisk" in autoscaling simulations

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932689#comment-16932689
 ] 

ASF subversion and git services commented on SOLR-13763:


Commit 6a8cfddf305718b21d64eebb144d18b330dfdb24 in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6a8cfdd ]

SOLR-13763: Improve the tracking of "freedisk" in autoscaling simulations.


> Improve the tracking of "freedisk" in autoscaling simulations
> -
>
> Key: SOLR-13763
> URL: https://issues.apache.org/jira/browse/SOLR-13763
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.3
>
>
> The "freedisk" node metric is tracked closely when adding / removing / moving 
> replicas but it's not tracked for simulated updates, even though the 
> corresponding simulated replica sizes are.
> This causes some inconsistencies in "freedisk" calculation and reporting, 
> which may affect the results of simulations.






[jira] [Commented] (SOLR-13349) High CPU usage in Solr due to Java 8 bug

2019-09-18 Thread Shawn Heisey (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932680#comment-16932680
 ] 

Shawn Heisey commented on SOLR-13349:
-

bq. Have there been any reports or known issues with Java 8 and Solr build 7.2.0?

Java 8 is the minimum version needed since Solr 6.0.0.  

The problem documented in this Jira issue was introduced in 7.7.0, so 7.2.0 
should be fine.  I cannot guarantee that there are no OTHER problems in 7.2.0; 
it's relatively old, and we fix a lot of bugs in each release.

Jira is not a support portal.  Questions like this belong on the mailing list 
or the IRC channel.

https://lucene.apache.org/solr/community.html#mailing-lists-irc


> High CPU usage in Solr due to Java 8 bug
> 
>
> Key: SOLR-13349
> URL: https://issues.apache.org/jira/browse/SOLR-13349
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 7.7, 8.0, master (9.0)
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Fix For: 7.7.2, 8.1, master (9.0)
>
> Attachments: SOLR-13349.patch
>
>
> We've had sporadic reports of high CPU usage in Solr 7. Lukas Weiss reported 
> a Java 8 bug that appears to be the root cause (I have not personally 
> verified), see: [https://bugs.openjdk.java.net/browse/JDK-8129861] e-mail 
> reproduced below:
> CommitTracker makes this call:
> Executors.newScheduledThreadPool(0, new 
> DefaultSolrThreadFactory("commitScheduler"));
> The supposition is that calling this with 1 will fix this (untested)
> [~ichattopadhyaya] This affects 6.6 and IIUC you're spinning a new version. 
> We'll need to verify and include this fix.
> [~jpountz] You're right. I first thought "naaah, it wouldn't be that far 
> back" but your question made me check, thanks!
> AFAICT, this is the only place in Solr/Lucene that uses zero.
> Using Java 9+ is another work-around.
> Anyone picking this up should port to 7.7 as well.
> e-mail from the user's list (many thanks to Lukas and Adam).
> Apologies, I can’t figure out how to reply to the Solr mailing list.
>  I just ran across the same high CPU usage issue. I believe it’s caused by 
>  this commit, which was introduced in Solr 7.7.0 
>  
> [https://github.com/apache/lucene-solr/commit/eb652b84edf441d8369f5188cdd5e3ae2b151434#diff-e54b251d166135a1afb7938cfe152bb5]
>  There is a bug in JDK versions <=8 where using 0 threads in the 
>  ScheduledThreadPool causes high CPU usage: 
>  [https://bugs.openjdk.java.net/browse/JDK-8129861]
>  Oddly, the latest version 
>  of solr/core/src/java/org/apache/solr/update/CommitTracker.java on 
>  master still uses 0 executors as the default. Presumably most everyone is 
>  using JDK 9 or greater which has the bug fixed, so they don’t experience 
>  the bug.
>  Feel free to relay this back to the mailing list.
>  Thanks,
>  Adam Guthrie
>  
>  
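The one-character fix described above (a core pool size of 1 instead of 0) can be sketched with plain JDK classes; `DefaultSolrThreadFactory` is Solr's own factory, approximated here with a lambda:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledThreadPoolExecutor;

public class CommitSchedulerFix {
    // On JDK 8, a ScheduledThreadPoolExecutor created with corePoolSize 0
    // busy-spins (JDK-8129861); a core pool size of 1 sidesteps the bug.
    public static ScheduledThreadPoolExecutor newCommitScheduler() {
        return (ScheduledThreadPoolExecutor) Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r, "commitScheduler");
            t.setDaemon(true);
            return t;
        });
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor exec = newCommitScheduler();
        System.out.println(exec.getCorePoolSize()); // 1 rather than the buggy 0
        exec.shutdown();
    }
}
```

On JDK 9+ the original corePoolSize of 0 is also fine, which is why the bug only surfaces on Java 8 deployments.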






[GitHub] [lucene-solr] bruno-roustant opened a new pull request #889: LUCENE-8983: Add PhraseWildcardQuery to control multi-terms expansions in a phrase

2019-09-18 Thread GitBox
bruno-roustant opened a new pull request #889: LUCENE-8983: Add 
PhraseWildcardQuery to control multi-terms expansions in a phrase
URL: https://github.com/apache/lucene-solr/pull/889
 
 
   A generalized version of PhraseQuery, built with one or more MultiTermQuery 
clauses that provide term expansions for the multi-terms (one of the expanded 
terms must match).
   Its main advantage is to control the total number of expansions across all 
MultiTermQuery and across all segments.
   
This query is similar to MultiPhraseQuery, but it handles, controls and 
optimizes the multi-term expansions.


   This query is equivalent to building an ordered SpanNearQuery with a list of 
SpanTermQuery and SpanMultiTermQueryWrapper, but it optimizes the multi-term 
expansions and the segment accesses.
   It first resolves the single terms, stopping early if one does not match. 
Then it expands each multi-term sequentially, stopping immediately if one does 
not match. It detects the segments that do not match so it can skip them in the 
next expansions, which often avoids expanding the other multi-terms on some or 
even all segments. And finally it controls the total number of expansions.





[jira] [Created] (LUCENE-8983) PhraseWildcardQuery - new query to control and optimize wildcard expansions in phrase

2019-09-18 Thread Bruno Roustant (Jira)
Bruno Roustant created LUCENE-8983:
--

 Summary: PhraseWildcardQuery - new query to control and optimize 
wildcard expansions in phrase
 Key: LUCENE-8983
 URL: https://issues.apache.org/jira/browse/LUCENE-8983
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Bruno Roustant


A generalized version of PhraseQuery, built with one or more MultiTermQuery 
clauses that provide term expansions for the multi-terms (one of the expanded 
terms must match).

Its main advantage is to control the total number of expansions across all 
MultiTermQuery and across all segments.


 This query is similar to MultiPhraseQuery, but it handles, controls and 
optimizes the multi-term expansions.
 
 This query is equivalent to building an ordered SpanNearQuery with a list of 
SpanTermQuery and SpanMultiTermQueryWrapper, but it optimizes the multi-term 
expansions and the segment accesses.
 It first resolves the single terms, stopping early if one does not match. Then 
it expands each multi-term sequentially, stopping immediately if one does not 
match. It detects the segments that do not match so it can skip them in the 
next expansions, which often avoids expanding the other multi-terms on some or 
even all segments. And finally it controls the total number of expansions.






[GitHub] [lucene-solr] dsmiley commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap options

2019-09-18 Thread GitBox
dsmiley commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap 
options
URL: https://github.com/apache/lucene-solr/pull/887#issuecomment-532752421
 
 
   :+1: Looks good to me.
   
   @elyograg I don't think every process out there should disambiguate its use 
of env vars.  It's an impossible task resulting in verbose env vars (ex: 
SOLR_PROMETHEUS_JAVA_HEAP?).  Besides, production nowadays is increasingly 
container-based, rendering that pointless.





[GitHub] [lucene-solr] HoustonPutman commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap options

2019-09-18 Thread GitBox
HoustonPutman commented on issue #887: SOLR-13773: Prometheus Exporter GC and 
Heap options
URL: https://github.com/apache/lucene-solr/pull/887#issuecomment-532745320
 
 
   Thanks for the input. I can definitely change it to not use SOLR*, and 
understand how that could cause confusion. 
   
   @elyograg I definitely like the suggestion, but I'll keep that as a separate 
PR since I wouldn't want this to affect backwards compatibility. (If people 
were previously using the generic JAVA/JVM names for non-Solr things, assuming 
that Solr wouldn't read them)
   
   @anshumg The precommit errors look entirely unrelated to my stuff, as it has 
issues with Java code, but I'll look into it more.





[jira] [Commented] (SOLR-13180) ClassCastExceptions in o.a.s.s.facet.FacetModule for valid JSON inputs that are not objects

2019-09-18 Thread Munendra S N (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932557#comment-16932557
 ] 

Munendra S N commented on SOLR-13180:
-

[~janhoy]
I think the current patch can be committed. Is there anything pending to be 
done here?

> ClassCastExceptions in o.a.s.s.facet.FacetModule for valid JSON inputs that 
> are not objects
> ---
>
> Key: SOLR-13180
> URL: https://issues.apache.org/jira/browse/SOLR-13180
> Project: Solr
>  Issue Type: Bug
>  Components: Facet Module
>Affects Versions: 7.5, master (9.0)
> Environment: Running on Unix, using a git checkout close to master.
> h2. Steps to reproduce
>  * Build commit ea2c8ba of Solr as described in the section below.
>  * Build the films collection as described below.
>  * Start the server using the command {{./bin/solr start -f -p 8983 -s 
> /tmp/home}}
>  * Request the URL above.
> h2. Compiling the server
> {noformat}
> git clone https://github.com/apache/lucene-solr
> cd lucene-solr
> git checkout ea2c8ba
> ant compile
> cd solr
> ant server
> {noformat}
> h2. Building the collection
> We followed Exercise 2 from the quick start tutorial 
> ([http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2]) - 
> for reference, I have attached a copy of the database.
> {noformat}
> mkdir -p /tmp/home
> echo '' > 
> /tmp/home/solr.xml
> {noformat}
> In one terminal start a Solr instance in foreground:
> {noformat}
> ./bin/solr start -f -p 8983 -s /tmp/home
> {noformat}
> In another terminal, create a collection of movies, with no shards and no 
> replication:
> {noformat}
> bin/solr create -c films
> curl -X POST -H 'Content-type:application/json' --data-binary 
> '\\{"add-field": {"name":"name", "type":"text_general", "multiValued":false, 
> "stored":true}}' http://localhost:8983/solr/films/schema
> curl -X POST -H 'Content-type:application/json' --data-binary 
> '\{"add-copy-field" : {"source":"*","dest":"_text_"}}' 
> [http://localhost:8983/solr/films/schema]
> ./bin/post -c films example/films/films.json
> {noformat}
>Reporter: Johannes Kloos
>Priority: Minor
>  Labels: diffblue, newdev
> Attachments: SOLR-13180.patch, home.zip
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Requesting the following URL gives a 500 error due to a ClassCastException in 
> o.a.s.s.f.FacetModule: [http://localhost:8983/solr/films/select?json=0]
> The request fails with an uncaught ClassCastException, with the 
> stacktrace shown here:
> java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.Map
> at org.apache.solr.search.facet.FacetModule.prepare(FacetModule.java:78)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:272)
>  
> The cause of this bug is similar to #13178: line 78 in FacetModule reads
> {{jsonFacet = (Map) json.get("facet")}}
> and assumes that the JSON value contained in "facet" is a JSON object, while 
> we only guarantee that it is a JSON value.
> Line 92 seems to contain another situation like this, but I do not have a 
> test case handy for that specific case.
> This bug was found using [Diffblue Microservices 
> Testing|http://www.diffblue.com/labs]. Find more information on this [test 
> campaign|https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results].
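A defensive variant of the cast on line 78 could look like the sketch below (class and method names are illustrative, not Solr's actual code):

```java
import java.util.Collections;
import java.util.Map;

public class FacetParamCheck {
    // Only treat the "facet" entry as a facet request when it is actually a
    // JSON object; any other JSON value yields a clear client error instead
    // of an uncaught ClassCastException deep inside the facet module.
    @SuppressWarnings("unchecked")
    public static Map<String, Object> facetMap(Map<String, Object> json) {
        Object facet = json.get("facet");
        if (facet == null) {
            return Collections.emptyMap();
        }
        if (!(facet instanceof Map)) {
            throw new IllegalArgumentException(
                "Expected a JSON object for 'facet', got "
                    + facet.getClass().getSimpleName());
        }
        return (Map<String, Object>) facet;
    }

    public static void main(String[] args) {
        // A request like json=0 would reach facetMap with a Long and now
        // fails with a descriptive message rather than a 500.
        System.out.println(facetMap(Map.of("facet", Map.of("f", "field"))));
    }
}
```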






[jira] [Commented] (SOLR-13272) Interval facet support for JSON faceting

2019-09-18 Thread Munendra S N (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932554#comment-16932554
 ] 

Munendra S N commented on SOLR-13272:
-

I'm planning to commit it this weekend

> Interval facet support for JSON faceting
> 
>
> Key: SOLR-13272
> URL: https://issues.apache.org/jira/browse/SOLR-13272
> Project: Solr
>  Issue Type: New Feature
>  Components: Facet Module
>Reporter: Apoorv Bhawsar
>Assignee: Munendra S N
>Priority: Major
> Attachments: SOLR-13272.patch, SOLR-13272.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Interval faceting is supported in the classic facet component but has no 
> support in JSON facet requests.
>  In cases of block join and aggregations, this would be helpful.
> Assuming request format -
> {code:java}
> json.facet={pubyear:{type : interval,field : 
> pubyear_i,intervals:[{key:"2000-2200",value:"[2000,2200]"}]}}
> {code}
>  
>  PR https://github.com/apache/lucene-solr/pull/597






[jira] [Commented] (SOLR-13725) TermsFacetMap.setLimit() unnecessarily rejects negative parameter value

2019-09-18 Thread Munendra S N (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932552#comment-16932552
 ] 

Munendra S N commented on SOLR-13725:
-

I'm planning to commit this over the weekend

> TermsFacetMap.setLimit() unnecessarily rejects negative parameter value
> ---
>
> Key: SOLR-13725
> URL: https://issues.apache.org/jira/browse/SOLR-13725
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 8.2
>Reporter: Richard Walker
>Assignee: Munendra S N
>Priority: Trivial
> Attachments: SOLR-13725.patch, SOLR-13725.patch
>
>
> SolrJ's {{TermsFacetMap.setLimit(int maximumBuckets)}} rejects a negative 
> parameter value with an IllegalArgumentException "Parameter 'maximumBuckets' 
> must be non-negative".
> But a negative value for the limit parameter is accepted by the Solr server, 
> and is meaningful: i.e., it means "no limit".
> The {{setLimit()}} method shouldn't reject a negative parameter value.
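The proposed relaxation can be illustrated with a small stand-in for the SolrJ class (the names here are hypothetical; the real TermsFacetMap lives in SolrJ and extends a JSON-map type):

```java
import java.util.HashMap;

public class TermsFacetMapSketch extends HashMap<String, Object> {
    // A negative limit is meaningful to the Solr server ("no limit"), so the
    // client-side setter no longer rejects it, mirroring the SOLR-13725 fix.
    public TermsFacetMapSketch setLimit(int maximumBuckets) {
        put("limit", maximumBuckets);
        return this;
    }

    public static void main(String[] args) {
        TermsFacetMapSketch facet = new TermsFacetMapSketch().setLimit(-1);
        System.out.println(facet.get("limit")); // -1, meaning "no limit"
    }
}
```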






[jira] [Commented] (SOLR-13349) High CPU usage in Solr due to Java 8 bug

2019-09-18 Thread Robert Ash (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932540#comment-16932540
 ] 

Robert Ash commented on SOLR-13349:
---

Have there been any reports or known issues with Java 8 and Solr build 7.2.0?

Thank you in advance

> High CPU usage in Solr due to Java 8 bug
> 
>
> Key: SOLR-13349
> URL: https://issues.apache.org/jira/browse/SOLR-13349
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 7.7, 8.0, master (9.0)
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Fix For: 7.7.2, 8.1, master (9.0)
>
> Attachments: SOLR-13349.patch
>
>
> We've had sporadic reports of high CPU usage in Solr 7. Lukas Weiss reported 
> a Java 8 bug that appears to be the root cause (I have not personally 
> verified), see: [https://bugs.openjdk.java.net/browse/JDK-8129861] e-mail 
> reproduced below:
> CommitTracker makes this call:
> Executors.newScheduledThreadPool(0, new 
> DefaultSolrThreadFactory("commitScheduler"));
> The supposition is that calling this with 1 will fix this (untested)
> [~ichattopadhyaya] This affects 6.6 and IIUC you're spinning a new version. 
> We'll need to verify and include this fix.
> [~jpountz] You're right. I first thought "naaah, it wouldn't be that far 
> back" but your question made me check, thanks!
> AFAICT, this is the only place in Solr/Lucene that uses zero.
> Using Java 9+ is another work-around.
> Anyone picking this up should port to 7.7 as well.
> e-mail from the user's list (many thanks to Lukas and Adam).
> Apologies, I can’t figure out how to reply to the Solr mailing list.
>  I just ran across the same high CPU usage issue. I believe it’s caused by 
>  this commit, which was introduced in Solr 7.7.0 
>  
> [https://github.com/apache/lucene-solr/commit/eb652b84edf441d8369f5188cdd5e3ae2b151434#diff-e54b251d166135a1afb7938cfe152bb5]
>  There is a bug in JDK versions <=8 where using 0 threads in the 
>  ScheduledThreadPool causes high CPU usage: 
>  [https://bugs.openjdk.java.net/browse/JDK-8129861]
>  Oddly, the latest version 
>  of solr/core/src/java/org/apache/solr/update/CommitTracker.java on 
>  master still uses 0 executors as the default. Presumably most everyone is 
>  using JDK 9 or greater which has the bug fixed, so they don’t experience 
>  the bug.
>  Feel free to relay this back to the mailing list.
>  Thanks,
>  Adam Guthrie
>  
>  






[jira] [Commented] (SOLR-8928) suggest.cfq does not work with DocumentExpressionDictionaryFactory/weightExpression

2019-09-18 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932510#comment-16932510
 ] 

Erick Erickson commented on SOLR-8928:
--

Unassigned, status==open, resolution==unresolved. I see no ambiguity here. It 
may well have been fixed in more recent versions of Solr; this was reported 
against 5.5.

> suggest.cfq does not work with 
> DocumentExpressionDictionaryFactory/weightExpression
> ---
>
> Key: SOLR-8928
> URL: https://issues.apache.org/jira/browse/SOLR-8928
> Project: Solr
>  Issue Type: Bug
>  Components: Suggester
>Affects Versions: 5.5
>Reporter: jmlucjav
>Priority: Major
>
> Using BlendedInfixLookupFactory, trying to use  
> DocumentExpressionDictionaryFactory/weightExpression with suggest.cfq does 
> not work. No docs get returned, even the ones that comply with the cfq.
> Moving to DocumentDictionaryFactory/weightField fixes this.






[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible

2019-09-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932509#comment-16932509
 ] 

Uwe Schindler commented on LUCENE-8982:
---

Here is the full JDK patch, including examples in the tests (how to align 
direct buffers and how to open channels): 
http://hg.openjdk.java.net/jdk10/master/rev/d72d7d55c765

> Make NativeUnixDirectory pure java now that direct IO is possible
> -
>
> Key: LUCENE-8982
> URL: https://issues.apache.org/jira/browse/LUCENE-8982
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/misc
>Reporter: Michael McCandless
>Priority: Major
>
> {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO 
> to write newly merged segments.  Direct IO bypasses the kernel's buffer cache 
> and write cache, making merge writes "invisible" to the kernel, though the 
> reads for merging the N segments are still going through the kernel.
> But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the 
> {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in 
> pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI 
> anymore.
> We should also run some more realistic benchmarks seeing if this option 
> really helps nodes that are doing concurrent indexing (merging) and searching.






[GitHub] [lucene-solr] jgq2008303393 edited a comment on issue #884: LUCENE-8980: optimise SegmentTermsEnum.seekExact performance

2019-09-18 Thread GitBox
jgq2008303393 edited a comment on issue #884: LUCENE-8980: optimise 
SegmentTermsEnum.seekExact performance
URL: https://github.com/apache/lucene-solr/pull/884#issuecomment-532538959
 
 
   Ping @jpountz @mikemccand.
   Please help to take a look, thanks.





[GitHub] [lucene-solr] jgq2008303393 edited a comment on issue #884: LUCENE-8980: optimise SegmentTermsEnum.seekExact performance

2019-09-18 Thread GitBox
jgq2008303393 edited a comment on issue #884: LUCENE-8980: optimise 
SegmentTermsEnum.seekExact performance
URL: https://github.com/apache/lucene-solr/pull/884#issuecomment-532538959
 
 
   Ping @jpountz @mikemccand.
   Please help to take a look, Thanks





[jira] [Commented] (SOLR-8928) suggest.cfq does not work with DocumentExpressionDictionaryFactory/weightExpression

2019-09-18 Thread Baskar (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932501#comment-16932501
 ] 

Baskar commented on SOLR-8928:
--

Could you please update this issue on whether it's resolved or not?

> suggest.cfq does not work with 
> DocumentExpressionDictionaryFactory/weightExpression
> ---
>
> Key: SOLR-8928
> URL: https://issues.apache.org/jira/browse/SOLR-8928
> Project: Solr
>  Issue Type: Bug
>  Components: Suggester
>Affects Versions: 5.5
>Reporter: jmlucjav
>Priority: Major
>
> Using BlendedInfixLookupFactory, trying to use  
> DocumentExpressionDictionaryFactory/weightExpression with suggest.cfq does 
> not work. No docs get returned, even the ones that comply with the cfq.
> Moving to DocumentDictionaryFactory/weightField fixes this.






[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible

2019-09-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932503#comment-16932503
 ] 

Uwe Schindler commented on LUCENE-8982:
---

We should rename the directory to something more generic. It's also supported 
on Windows!

In addition, the hardcoded alignment of 512 should be determined on directory 
creation (reading it from FileStore.getBlockSize()).

Code should also create the FileChannel directly, not via the forbidden 
java.io.File classes. That was just a workaround to bring the FileDescriptor 
into the game.
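Since JDK 10 both pieces are available in the standard library; a minimal sketch (illustrative only, not Lucene code, and assuming the block size is a power of two as on common filesystems) of reading the block size from the FileStore and producing an aligned direct buffer:

```java
import java.nio.ByteBuffer;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;

public class DirectIoAlignment {
    // Direct I/O needs buffers aligned to the storage block size. Instead of
    // hardcoding 512, read the size from the FileStore (JDK 10+) and use
    // ByteBuffer.alignedSlice (JDK 9+) to obtain an aligned view.
    public static ByteBuffer alignedBuffer(Path path, int capacity) throws Exception {
        FileStore store = Files.getFileStore(path);
        int align = Math.toIntExact(store.getBlockSize());
        // Over-allocate so the aligned slice still holds 'capacity' bytes.
        return ByteBuffer.allocateDirect(capacity + 2 * align).alignedSlice(align);
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("directio", ".bin");
        ByteBuffer buf = alignedBuffer(tmp, 1 << 16);
        // The channel itself would then be opened with
        // FileChannel.open(tmp, StandardOpenOption.WRITE,
        //                  com.sun.nio.file.ExtendedOpenOption.DIRECT);
        System.out.println(buf.capacity() >= (1 << 16));
        Files.delete(tmp);
    }
}
```

The actual O_DIRECT open is left in a comment because not every filesystem (tmpfs, for example) accepts it.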

> Make NativeUnixDirectory pure java now that direct IO is possible
> -
>
> Key: LUCENE-8982
> URL: https://issues.apache.org/jira/browse/LUCENE-8982
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/misc
>Reporter: Michael McCandless
>Priority: Major
>
> {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO 
> to write newly merged segments.  Direct IO bypasses the kernel's buffer cache 
> and write cache, making merge writes "invisible" to the kernel, though the 
> reads for merging the N segments are still going through the kernel.
> But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the 
> {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in 
> pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI 
> anymore.
> We should also run some more realistic benchmarks seeing if this option 
> really helps nodes that are doing concurrent indexing (merging) and searching.






[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932492#comment-16932492
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 2ee39219372e60b363545a1fc653cc4a45fc92b7 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2ee3921 ]

SOLR-13105: Improve numerical analysis docs 2


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.






[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932489#comment-16932489
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit f872f8d22408df5ab98a4221a903647ae82d928f in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f872f8d ]

SOLR-13105: Improve numerical analysis docs 1


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.






[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932487#comment-16932487
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 0a8311e6f7be0c78a756c447141c707cf7a955ca in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0a8311e ]

SOLR-13105: Improve numerical analysis docs


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.






[jira] [Updated] (SOLR-13776) Allow derivatives to be computed directly from vectors

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13776:
--
Description: Currently the *derivative* Stream Evaluator operates over 
functions (spline, lerp, akima, loess, polyfit, oscillate etc...). It would be 
useful if the *derivative* function could operate directly over a vector.  
(was: Currently the *derivative* Stream Evaluator operates over functions 
(spline, akima, loess, polyfit, oscillate etc...). It would be useful if the 
*derivative* function could operate directly over a vector.)

> Allow derivatives to be computed directly from vectors
> --
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Minor
>
> Currently the *derivative* Stream Evaluator operates over functions (spline, 
> lerp, akima, loess, polyfit, oscillate etc...). It would be useful if the 
> *derivative* function could operate directly over a vector.
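The requested behavior can be approximated with finite differences. Below is a minimal illustrative sketch in Python (a hypothetical helper, not the actual Solr Stream Evaluator) of what computing a derivative directly from a sampled vector might look like:

```python
def vector_derivative(y, dx=1.0):
    """Finite-difference derivative of a sampled vector: central
    differences in the interior, one-sided at the endpoints."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two samples")
    d = [0.0] * n
    d[0] = (y[1] - y[0]) / dx                    # forward difference
    d[-1] = (y[-1] - y[-2]) / dx                 # backward difference
    for i in range(1, n - 1):
        d[i] = (y[i + 1] - y[i - 1]) / (2 * dx)  # central difference
    return d
```

For y = x^2 sampled at x = 0..4 this returns [1, 2, 4, 6, 7], approximating 2x (with one-sided error at the endpoints).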






[jira] [Updated] (SOLR-13776) Allow derivatives to be computed directly from vectors

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13776:
--
Priority: Minor  (was: Major)

> Allow derivatives to be computed directly from vectors
> --
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Minor
>
> Currently the *derivative* Stream Evaluator operates over functions (spline, 
> akima, loess, polyfit, oscillate etc...). It would be useful if the 
> *derivative* function could operate directly over a vector.






[jira] [Updated] (SOLR-13776) Allow derivatives to be computed directly from vectors

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13776:
--
Description: Currently the *derivative* Stream Evaluator operates over 
functions (spline, akima, loess, polyfit, oscillate etc...). It would be useful 
if the *derivative* function could operate directly over a vector.  (was: 
Currently the derivative Stream Evaluator operates over functions (spline, 
akima, loess, polyfit, oscillate etc...). It would be useful if the derivative 
function could operate directly over a vector.)

> Allow derivatives to be computed directly from vectors
> --
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the *derivative* Stream Evaluator operates over functions (spline, 
> akima, loess, polyfit, oscillate etc...). It would be useful if the 
> *derivative* function could operate directly over a vector.






[jira] [Created] (SOLR-13776) Allow derivatives to be computed directly from vectors

2019-09-18 Thread Joel Bernstein (Jira)
Joel Bernstein created SOLR-13776:
-

 Summary: Allow derivatives to be computed directly from vectors
 Key: SOLR-13776
 URL: https://issues.apache.org/jira/browse/SOLR-13776
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein


Currently the derivative Stream Evaluator operates over functions (spline, 
akima, loess, polyfit, oscillate etc...). It would be useful if the derivative 
function could operate directly over a vector.






[jira] [Assigned] (SOLR-13776) Allow derivatives to be computed directly from vectors

2019-09-18 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-13776:
-

Assignee: Joel Bernstein

> Allow derivatives to be computed directly from vectors
> --
>
> Key: SOLR-13776
> URL: https://issues.apache.org/jira/browse/SOLR-13776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
>
> Currently the *derivative* Stream Evaluator operates over functions (spline, 
> akima, loess, polyfit, oscillate etc...). It would be useful if the 
> *derivative* function could operate directly over a vector.






[jira] [Created] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible

2019-09-18 Thread Michael McCandless (Jira)
Michael McCandless created LUCENE-8982:
--

 Summary: Make NativeUnixDirectory pure java now that direct IO is 
possible
 Key: LUCENE-8982
 URL: https://issues.apache.org/jira/browse/LUCENE-8982
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/misc
Reporter: Michael McCandless


{{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO 
to write newly merged segments.  Direct IO bypasses the kernel's buffer cache 
and write cache, making merge writes "invisible" to the kernel, though the 
reads for merging the N segments are still going through the kernel.

But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the 
{{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in pure 
java code, so we should now fix {{NativeUnixDirectory}} to not use JNI anymore.

We should also run some more realistic benchmarks seeing if this option really 
helps nodes that are doing concurrent indexing (merging) and searching.






[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932435#comment-16932435
 ] 

Jan Høydahl commented on SOLR-13741:


Btw: I just love IntelliJ's "right-click->git->Show history for selection" 
feature, it makes it so easy to inspect the history of a small code block! I 
just discovered the feature a few weeks ago!

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-13741.patch
>
>
> A while back i saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest.  When i started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits about things that seem suspect.  The summary of these 
> concerns is:
> * SolrException status codes that do not match what the existing test says 
> they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected 
> AuditEvent in the existing test -- which the current test seems to miss in 
> its assertions because it's picking up some extra events triggered by 
> previous requests earlier in the test that just happen to also match the 
> assertions.
> ...it's not clear to me if the test logic is correct and these are "code 
> bugs" or if the test is faulty.






[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932428#comment-16932428
 ] 

Jan Høydahl commented on SOLR-13741:


Thanks for taking a look at this. I see you have some great improvements on how 
to assert exceptions etc.

Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe 
there is a code bug, not a test bug. In HttpSolrCall#471 in the {{authorize()}} 
call, if authResponse == PROMPT, it will actually match both blocks and emit 
two audit events: 
[https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
 
{code:java}
if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && 
!(authResponse.statusCode == HttpStatus.SC_OK)) {...}
{code}
When code==401, it is also true that code!=200. Intuitively there should be 
both a sendError and a return before line #484 in the first if block?

The first if block was introduced back in 2015 as part of SOLR-7757. 
[~noble.paul] why does the if not return? It will *always* fall through and 
trigger the next if block!
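The fall-through can be reproduced in miniature. In this sketch (hypothetical names, not the HttpSolrCall code) a 401 status satisfies both consecutive, non-exclusive conditions and emits two events, matching the duplicate REJECTED + UNAUTHORIZED observation:

```python
def audit_events(status_code, prompt_code=401, accepted=(200, 202)):
    """Mimics two consecutive `if` blocks with no early return
    between them: a PROMPT (401) status matches the first block
    AND the not-OK block, producing two audit events."""
    events = []
    if status_code == prompt_code:
        events.append("REJECTED")       # first block: no return here
    if status_code not in accepted:
        events.append("UNAUTHORIZED")   # 401 also satisfies code != 200
    return events
```

With an early return after appending REJECTED, a 401 would produce a single event.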

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-13741.patch
>
>
> A while back i saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest.  When i started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits about things that seem suspect.  The summary of these 
> concerns is:
> * SolrException status codes that do not match what the existing test says 
> they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected 
> AuditEvent in the existing test -- which the current test seems to miss in 
> its assertions because it's picking up some extra events triggered by 
> previous requests earlier in the test that just happen to also match the 
> assertions.
> ...it's not clear to me if the test logic is correct and these are "code 
> bugs" or if the test is faulty.
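The ordering fix and the await-based waiting described in the issue can be sketched as follows (illustrative Python, not the actual test harness): all counters are updated before the event is published to a blocking queue, so a waiter that has received N events can never observe a stale count, and no sleep/poll loop is needed:

```python
import queue
import threading

class EventReceiver:
    """Toy audit-event receiver: publish-last ordering plus a
    blocking queue replaces the fragile sleep/poll pattern."""
    def __init__(self):
        self._events = queue.Queue()
        self._lock = threading.Lock()
        self.total_count = 0

    def on_event(self, event):
        with self._lock:
            self.total_count += 1  # update ALL other state first...
        self._events.put(event)    # ...then publish; waiters unblock here

    def wait_for(self, n, timeout=5.0):
        # blocks until n events have arrived (or the timeout expires)
        return [self._events.get(timeout=timeout) for _ in range(n)]
```

Because each increment happens-before the corresponding `put`, an assertion on `total_count` made after `wait_for(n)` returns is race-free.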






[jira] [Comment Edited] (SOLR-11214) GraphQuery not working for TrieField's that has only docValues

2019-09-18 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932404#comment-16932404
 ] 

Erick Erickson edited comment on SOLR-11214 at 9/18/19 1:04 PM:


[~varunthacker] You put some of the checking in for this, and it's restricted 
to points-based and string types. Is there any specific reason that Trie (and 
anything Numeric) isn't supported or is it just a case that we should check for 
that too? The code is in
{code:java}
GraphQueryParser.validateFields {code}
In fact, do we even care about what type the field is if it has docValues? 
SortableTextField would be rather weird to support, but would it work?

And is there a reason why the test for Point fields does not include whether 
it's indexed, while the test for string fields does?
{code}
public void validateFields(String field) throws SyntaxError {
if (req.getSchema().getField(field) == null) {
  throw new SyntaxError("field " + field + " not defined in schema");
}

if (req.getSchema().getField(field).getType().isPointField()) {
  if (req.getSchema().getField(field).hasDocValues()) {
return;
  } else {
throw new SyntaxError("point field " + field + " must have 
docValues=true");
  }
}
if (req.getSchema().getField(field).getType() instanceof StrField) {
  if ((req.getSchema().getField(field).hasDocValues() || 
req.getSchema().getField(field).indexed())) {
return;
  } else {
throw new SyntaxError("string field " + field + " must have 
indexed=true or docValues=true");
  }
}
throw new SyntaxError("FieldType for field=" + field + " not supported");
  }
{code}

As you can tell, I haven't looked at much else but that check...


was (Author: erickerickson):
[~varunthacker] You put some of the checking in for this, and it's restricted 
to points-based and string types. Is there any specific reason that Trie (and 
anything Numeric) isn't supported or is it just a case that we should check for 
that too. The code is in
{code:java}
GraphQueryParser.validateFields {code}
In fact, do we even care about what type the field is if it has docValues? 
SortableTextField would be rather weird to support, but would it work?

As you can tell, I haven't looked at much else but that check...

> GraphQuery not working for TrieField's that has only docValues
> --
>
> Key: SOLR-11214
> URL: https://issues.apache.org/jira/browse/SOLR-11214
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 6.6
>Reporter: Karthik Ramachandran
>Assignee: Karthik Ramachandran
>Priority: Major
>
> Graph traversal is not working for TrieField's with only docValues since the 
> construction of leaf or parent node queries uses only TermQuery.
> \\ \\
> {code:xml|title=managed-schema|borderStyle=solid}
> 
>  docValues="true" />
>  docValues="true" />
>  docValues="true" />
>  docValues="true" />
> id
> 
>  precisionStep="0" positionIncrementGap="0"/>
>  precisionStep="0" positionIncrementGap="0"/>
> 
> {code}
> {code}
> curl -XPOST -H 'Content-Type: application/json' 
> 'http://localhost:8983/solr/graph/update' --data-binary ' {
>  "add" : { "doc" : { "id" : "1", "name" : "Root1" } },
>  "add" : { "doc" : { "id" : "2", "name" : "Root2" } },
>  "add" : { "doc" : { "id" : "3", "name" : "Root3" } },
>  "add" : { "doc" : { "id" : "11", "parentid" : "1", "name" : "Root1 Child1" } 
> },
>  "add" : { "doc" : { "id" : "12", "parentid" : "1", "name" : "Root1 Child2" } 
> },
>  "add" : { "doc" : { "id" : "13", "parentid" : "1", "name" : "Root1 Child3" } 
> },
>  "add" : { "doc" : { "id" : "21", "parentid" : "2", "name" : "Root2 Child1" } 
> },
>  "add" : { "doc" : { "id" : "22", "parentid" : "2", "name" : "Root2 Child2" } 
> },
>  "add" : { "doc" : { "id" : "121", "parentid" : "12", "name" : "Root12 
> Child1" } },
>  "add" : { "doc" : { "id" : "122", "parentid" : "12", "name" : "Root12 
> Child2" } },
>  "add" : { "doc" : { "id" : "131", "parentid" : "13", "name" : "Root13 
> Child1" } },
>  "commit" : {}
> }'
> {code}
> {code}
> http://localhost:8983/solr/graph/select?q=*:*&fq={!graph from=parentid 
> to=id}id:1
> or
> http://localhost:8983/solr/graph/select?q=*:*&fq={!graph from=id 
> to=parentid}id:122
> {code}
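The expected traversal can be sketched with a toy breadth-first join over the example documents (illustrative Python, not the Solr implementation). `{!graph from=parentid to=id}id:1` repeatedly gathers the `to` values of the current frontier and matches them against the `from` field:

```python
def graph_query(docs, frm, to, root):
    """Toy frontier expansion mirroring {!graph from=... to=...}:
    start from docs matching `root`, then repeatedly collect the
    `to` values of the frontier and match them against `frm`."""
    frontier = [d for d in docs if root(d)]
    seen = {d["id"] for d in frontier}
    while frontier:
        vals = {d.get(to) for d in frontier}
        frontier = [d for d in docs
                    if d.get(frm) in vals and d["id"] not in seen]
        seen.update(d["id"] for d in frontier)
    return seen

# documents from the curl example above
docs = [{"id": "1"}, {"id": "2"}, {"id": "3"},
        {"id": "11", "parentid": "1"}, {"id": "12", "parentid": "1"},
        {"id": "13", "parentid": "1"}, {"id": "21", "parentid": "2"},
        {"id": "22", "parentid": "2"}, {"id": "121", "parentid": "12"},
        {"id": "122", "parentid": "12"}, {"id": "131", "parentid": "13"}]
```

Seeding with id:1 and walking parentid->id collects the whole Root1 subtree; reversing from/to and seeding with id:122 walks the ancestor chain. The bug report is that the real query builds these frontier matches with TermQuery only, which finds nothing when the join field is docValues-only.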






[jira] [Commented] (SOLR-11214) GraphQuery not working for TrieField's that has only docValues

2019-09-18 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932404#comment-16932404
 ] 

Erick Erickson commented on SOLR-11214:
---

[~varunthacker] You put some of the checking in for this, and it's restricted 
to points-based and string types. Is there any specific reason that Trie (and 
anything Numeric) isn't supported or is it just a case that we should check for 
that too? The code is in
{code:java}
GraphQueryParser.validateFields {code}
In fact, do we even care about what type the field is if it has docValues? 
SortableTextField would be rather weird to support, but would it work?

As you can tell, I haven't looked at much else but that check...

> GraphQuery not working for TrieField's that has only docValues
> --
>
> Key: SOLR-11214
> URL: https://issues.apache.org/jira/browse/SOLR-11214
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 6.6
>Reporter: Karthik Ramachandran
>Assignee: Karthik Ramachandran
>Priority: Major
>
> Graph traversal is not working for TrieField's with only docValues since the 
> construction of leaf or parent node queries uses only TermQuery.
> \\ \\
> {code:xml|title=managed-schema|borderStyle=solid}
> 
>  docValues="true" />
>  docValues="true" />
>  docValues="true" />
>  docValues="true" />
> id
> 
>  precisionStep="0" positionIncrementGap="0"/>
>  precisionStep="0" positionIncrementGap="0"/>
> 
> {code}
> {code}
> curl -XPOST -H 'Content-Type: application/json' 
> 'http://localhost:8983/solr/graph/update' --data-binary ' {
>  "add" : { "doc" : { "id" : "1", "name" : "Root1" } },
>  "add" : { "doc" : { "id" : "2", "name" : "Root2" } },
>  "add" : { "doc" : { "id" : "3", "name" : "Root3" } },
>  "add" : { "doc" : { "id" : "11", "parentid" : "1", "name" : "Root1 Child1" } 
> },
>  "add" : { "doc" : { "id" : "12", "parentid" : "1", "name" : "Root1 Child2" } 
> },
>  "add" : { "doc" : { "id" : "13", "parentid" : "1", "name" : "Root1 Child3" } 
> },
>  "add" : { "doc" : { "id" : "21", "parentid" : "2", "name" : "Root2 Child1" } 
> },
>  "add" : { "doc" : { "id" : "22", "parentid" : "2", "name" : "Root2 Child2" } 
> },
>  "add" : { "doc" : { "id" : "121", "parentid" : "12", "name" : "Root12 
> Child1" } },
>  "add" : { "doc" : { "id" : "122", "parentid" : "12", "name" : "Root12 
> Child2" } },
>  "add" : { "doc" : { "id" : "131", "parentid" : "13", "name" : "Root13 
> Child1" } },
>  "commit" : {}
> }'
> {code}
> {code}
> http://localhost:8983/solr/graph/select?q=*:*&fq={!graph from=parentid 
> to=id}id:1
> or
> http://localhost:8983/solr/graph/select?q=*:*&fq={!graph from=id 
> to=parentid}id:122
> {code}






[jira] [Comment Edited] (SOLR-13775) Update PR template to suggest Githubs: "Allow Edits From Maintainers" option

2019-09-18 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932385#comment-16932385
 ] 

Jason Gerlowski edited comment on SOLR-13775 at 9/18/19 12:36 PM:
--

As a first pass at this, I would put the checklist item:

* I have given Solr maintainers 
[access|https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork]
 to contribute to my PR branch.

near the top of the list, right under "I have reviewed the guidelines for How 
to Contribute...".

As a part of this JIRA, we should also add a similar notice to the 
How-to-Contribute docs, with a little more detail.




was (Author: gerlowskija):
As a first pass at this, I would put the checklist item:

* I have given Solr maintainers 
[access|https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork]
 to contribute to my PR branch.

near the top of the list, right under "I have reviewed the guidelines for How 
to Contribute...".



> Update PR template to suggest Githubs: "Allow Edits From Maintainers" option
> 
>
> Key: SOLR-13775
> URL: https://issues.apache.org/jira/browse/SOLR-13775
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: github
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
>
> As discussed recently on the mailing list, Github does have one big downside: 
> it's pretty onerous for two people to both contribute code to the same PR.  
> PRs typically come from personal forks of Solr, owned by an individual.  If 
> that individual doesn't think to give other collaborators edit-access, then 
> they must open secondary-PRs to get their changes into the primary PR.  These
> secondary PRs must be reviewed, merged etc.
> This makes collaboration much more difficult than it is in the patch world 
> for example, where I can (e.g.) help a contributor clean up their formatting 
> by just uploading a new patch.
> Unfortunately the best workaround at this point for this in github is to 
> prompt those opening PRs to grant access to Solr's upstream committers and 
> maintainers. We can do this by linking users to this 
> [option|https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork]
>  in the "PR checklist" that was added recently.
> Users don't have to provide this access if they don't want.  But hopefully 
> it'll make collaboration easier in many cases.






[jira] [Commented] (SOLR-13775) Update PR template to suggest Githubs: "Allow Edits From Maintainers" option

2019-09-18 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932385#comment-16932385
 ] 

Jason Gerlowski commented on SOLR-13775:


As a first pass at this, I would put the checklist item:

* I have given Solr maintainers 
[access|https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork]
 to contribute to my PR branch.

near the top of the list, right under "I have reviewed the guidelines for How 
to Contribute...".



> Update PR template to suggest Githubs: "Allow Edits From Maintainers" option
> 
>
> Key: SOLR-13775
> URL: https://issues.apache.org/jira/browse/SOLR-13775
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: github
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
>
> As discussed recently on the mailing list, Github does have one big downside: 
> it's pretty onerous for two people to both contribute code to the same PR.  
> PRs typically come from personal forks of Solr, owned by an individual.  If 
> that individual doesn't think to give other collaborators edit-access, then 
> they must open secondary-PRs to get their changes into the primary PR.  These
> secondary PRs must be reviewed, merged etc.
> This makes collaboration much more difficult than it is in the patch world 
> for example, where I can (e.g.) help a contributor clean up their formatting 
> by just uploading a new patch.
> Unfortunately the best workaround at this point for this in github is to 
> prompt those opening PRs to grant access to Solr's upstream committers and 
> maintainers. We can do this by linking users to this 
> [option|https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork]
>  in the "PR checklist" that was added recently.
> Users don't have to provide this access if they don't want.  But hopefully 
> it'll make collaboration easier in many cases.






[jira] [Updated] (SOLR-13775) Update PR template to suggest Githubs: "Allow Edits From Maintainers" option

2019-09-18 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-13775:
---
Summary: Update PR template to suggest Githubs: "Allow Edits From 
Maintainers" option  (was: Update PR template to suggest Github's: "Allow Edits 
From Maintainers" option)

> Update PR template to suggest Githubs: "Allow Edits From Maintainers" option
> 
>
> Key: SOLR-13775
> URL: https://issues.apache.org/jira/browse/SOLR-13775
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: github
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
>
> As discussed recently on the mailing list, Github does have one big downside: 
> it's pretty onerous for two people to both contribute code to the same PR.  
> PRs typically come from personal forks of Solr, owned by an individual.  If 
> that individual doesn't think to give other collaborators edit access, then 
> they must open secondary PRs to get their changes into the primary PR.  These 
> secondary PRs must be reviewed, merged, etc.
> This makes collaboration much more difficult than it is in the patch world 
> for example, where I can (e.g.) help a contributor clean up their formatting 
> by just uploading a new patch.
> Unfortunately the best workaround at this point for this in github is to 
> prompt those opening PRs to grant access to Solr's upstream committers and 
> maintainers. We can do this by linking users to this 
> [option|https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork]
>  in the "PR checklist" that was added recently.
> Users don't have to provide this access if they don't want to.  But hopefully 
> it'll make collaboration easier in many cases.






[jira] [Comment Edited] (LUCENE-8137) GraphTokenStreamFiniteStrings does not handle position inc > 1 in multi-word synoyms

2019-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932317#comment-16932317
 ] 

Jan Høydahl edited comment on LUCENE-8137 at 9/18/19 11:53 AM:
---

I think I'm hitting this issue as well. We have SF followed by SGF.

For input "a hero", SF removes "a" and "hero" gets position=2. But if "hero" 
has a synonym, then the gap is eaten by SGF and "hero" and its synonym will 
have position=1. Attaching a patch with a test that I believe reproduces the 
issue (although my TokenStream karma is not very high). In the test, the last 
assert fails with "Expected: 2, Actual: 1".

[^SGF_SF_interaction.patch]

If you believe it is not the same issue, then I can open a new JIRA.


was (Author: janhoy):
I think I'm hitting this issue as well. We have SF followed by SGF.

For input "a hero", SF removes "a" and "hero" gets position=2. But if "hero" 
has a synonym, then the gap is eaten by SGF and "hero" and its synonym will 
have position=1. Attaching a patch with a test that I believe reproduces the 
issue (although my TokenStream karma is not very high).

[^SGF_SF_interaction.patch]

If you believe it is not the same issue, then I can open a new JIRA.

> GraphTokenStreamFiniteStrings does not handle position inc > 1 in multi-word 
> synoyms
> 
>
> Key: LUCENE-8137
> URL: https://issues.apache.org/jira/browse/LUCENE-8137
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 7.2.1, 8.0
>Reporter: Jim Ferenczi
>Assignee: Jim Ferenczi
>Priority: Major
> Attachments: SGF_SF_interaction.patch
>
>
> The automaton built for graph queries that contain multiple multi-word 
> synonyms does not handle gaps if they appear in the middle of a multi-word 
> synonym. In such a case, the token next to the gap is considered part of the 
> multi-word synonym. 
> Stop words that appear before or after multi-word synonyms are handled 
> correctly in the current version but the synonym rule "part of speech, pos" 
> for instance does not create the expected query if "of" is removed by a 
> filter that is set after the synonym_graph.  One solution would be to reuse 
> TokenStreamToAutomaton (with minor changes to add the ability to create token 
> transitions rather than chars) which preserves gaps (as a transition) in the 
> produced automaton.






[jira] [Resolved] (LUCENE-8945) Allow to change the output file delimiter on Luke "export terms" feature

2019-09-18 Thread Tomoko Uchida (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida resolved LUCENE-8945.
---
Fix Version/s: 8.3
   master (9.0)
 Assignee: Tomoko Uchida
   Resolution: Fixed

> Allow to change the output file delimiter on Luke "export terms" feature
> 
>
> Key: LUCENE-8945
> URL: https://issues.apache.org/jira/browse/LUCENE-8945
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/luke
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0), 8.3
>
> Attachments: LUCENE-8945-final.patch, LUCENE-8945.patch, 
> LUCENE-8945.patch, delimiter_comma_exported_file.PNG, 
> delimiter_space_exported_file.PNG, delimiter_tab_exported_file.PNG, 
> luke_export_delimiter.png
>
>
> This is a follow-up issue for LUCENE-8764.
> The current delimiter is fixed to "," (comma), but terms can also include 
> commas, and they are not escaped. It would be better if the delimiter could be 
> changed/selected to a tab or whitespace when exporting.
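The comma-collision problem described in the issue can be shown with a minimal, stdlib-only Java sketch (not Luke's actual exporter code; `export` and `parse` are hypothetical helpers standing in for the write and read sides):

```java
import java.util.List;
import java.util.regex.Pattern;

public class DelimiterDemo {
    // Naive "export terms" output: join terms with the delimiter, no escaping.
    static String export(List<String> terms, String delimiter) {
        return String.join(delimiter, terms);
    }

    // Read the line back by splitting on the same delimiter.
    static String[] parse(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        List<String> terms = List.of("foo", "bar,baz"); // second term contains a comma
        // With a comma delimiter the embedded comma is ambiguous: 3 fields come back.
        System.out.println(parse(export(terms, ","), ",").length);  // 3
        // With a tab delimiter the round trip is unambiguous for these terms: 2 fields.
        System.out.println(parse(export(terms, "\t"), "\t").length); // 2
    }
}
```

Letting the user pick a delimiter that cannot occur in their terms (tab or space) sidesteps the need for an escaping scheme entirely.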






[jira] [Commented] (LUCENE-8945) Allow to change the output file delimiter on Luke "export terms" feature

2019-09-18 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932320#comment-16932320
 ] 

Tomoko Uchida commented on LUCENE-8945:
---

Seems that ASF bot is not working...

Committed to the master and 8x, with slight modification (moved Delimiter to 
private enum, it's used only in the factory anyway). Here is the final patch 
[^LUCENE-8945-final.patch]

[https://github.com/apache/lucene-solr/commit/369df12c2cc54e929bd25dd77424242ddd0fb047]

Thanks [~shahamish150294]!

> Allow to change the output file delimiter on Luke "export terms" feature
> 
>
> Key: LUCENE-8945
> URL: https://issues.apache.org/jira/browse/LUCENE-8945
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/luke
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-8945-final.patch, LUCENE-8945.patch, 
> LUCENE-8945.patch, delimiter_comma_exported_file.PNG, 
> delimiter_space_exported_file.PNG, delimiter_tab_exported_file.PNG, 
> luke_export_delimiter.png
>
>
> This is a follow-up issue for LUCENE-8764.
> The current delimiter is fixed to "," (comma), but terms can also include 
> commas, and they are not escaped. It would be better if the delimiter could be 
> changed/selected to a tab or whitespace when exporting.






[jira] [Updated] (LUCENE-8945) Allow to change the output file delimiter on Luke "export terms" feature

2019-09-18 Thread Tomoko Uchida (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-8945:
--
Attachment: LUCENE-8945-final.patch

> Allow to change the output file delimiter on Luke "export terms" feature
> 
>
> Key: LUCENE-8945
> URL: https://issues.apache.org/jira/browse/LUCENE-8945
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/luke
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-8945-final.patch, LUCENE-8945.patch, 
> LUCENE-8945.patch, delimiter_comma_exported_file.PNG, 
> delimiter_space_exported_file.PNG, delimiter_tab_exported_file.PNG, 
> luke_export_delimiter.png
>
>
> This is a follow-up issue for LUCENE-8764.
> The current delimiter is fixed to "," (comma), but terms can also include 
> commas, and they are not escaped. It would be better if the delimiter could be 
> changed/selected to a tab or whitespace when exporting.






[jira] [Commented] (LUCENE-8137) GraphTokenStreamFiniteStrings does not handle position inc > 1 in multi-word synoyms

2019-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932317#comment-16932317
 ] 

Jan Høydahl commented on LUCENE-8137:
-

I think I'm hitting this issue as well. We have SF followed by SGF.

For input "a hero", SF removes "a" and "hero" gets position=2. But if "hero" 
has a synonym, then the gap is eaten by SGF and "hero" and its synonym will 
have position=1. Attaching a patch with a test that I believe reproduces the 
issue (although my TokenStream karma is not very high).

[^SGF_SF_interaction.patch]

If you believe it is not the same issue, then I can open a new JIRA.
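The position arithmetic in this comment can be modelled with a small stdlib-only sketch (not Lucene's actual StopFilter/SynonymGraphFilter; `Token`, `removeStopWord`, and `eatGaps` are simplified stand-ins for the reported behaviour):

```java
import java.util.ArrayList;
import java.util.List;

public class PositionGapDemo {
    // A token carries its term and a position increment relative to the previous token.
    record Token(String term, int posInc) {}

    // Absolute position of the last token, accumulated from increments.
    static int lastPosition(List<Token> stream) {
        int pos = 0;
        for (Token t : stream) pos += t.posInc();
        return pos;
    }

    // StopFilter-like step: drop the stop word but fold its increment into
    // the next token, leaving a gap (posInc > 1) behind.
    static List<Token> removeStopWord(List<Token> in, String stopWord) {
        List<Token> out = new ArrayList<>();
        int carried = 0;
        for (Token t : in) {
            if (t.term().equals(stopWord)) {
                carried += t.posInc();
            } else {
                out.add(new Token(t.term(), t.posInc() + carried));
                carried = 0;
            }
        }
        return out;
    }

    // Gap-eating step: rewrites every increment to 1, which is what the
    // reported SGF behaviour effectively does to the gap.
    static List<Token> eatGaps(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) out.add(new Token(t.term(), 1));
        return out;
    }

    public static void main(String[] args) {
        List<Token> in = List.of(new Token("a", 1), new Token("hero", 1));
        List<Token> afterStop = removeStopWord(in, "a");
        System.out.println(lastPosition(afterStop));          // 2: "hero" keeps the gap
        System.out.println(lastPosition(eatGaps(afterStop))); // 1: gap eaten
    }
}
```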

> GraphTokenStreamFiniteStrings does not handle position inc > 1 in multi-word 
> synoyms
> 
>
> Key: LUCENE-8137
> URL: https://issues.apache.org/jira/browse/LUCENE-8137
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 7.2.1, 8.0
>Reporter: Jim Ferenczi
>Assignee: Jim Ferenczi
>Priority: Major
> Attachments: SGF_SF_interaction.patch
>
>
> The automaton built for graph queries that contain multiple multi-word 
> synonyms does not handle gaps if they appear in the middle of a multi-word 
> synonym. In such a case, the token next to the gap is considered part of the 
> multi-word synonym. 
> Stop words that appear before or after multi-word synonyms are handled 
> correctly in the current version but the synonym rule "part of speech, pos" 
> for instance does not create the expected query if "of" is removed by a 
> filter that is set after the synonym_graph.  One solution would be to reuse 
> TokenStreamToAutomaton (with minor changes to add the ability to create token 
> transitions rather than chars) which preserves gaps (as a transition) in the 
> produced automaton.






[jira] [Updated] (LUCENE-8137) GraphTokenStreamFiniteStrings does not handle position inc > 1 in multi-word synoyms

2019-09-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-8137:

Attachment: SGF_SF_interaction.patch

> GraphTokenStreamFiniteStrings does not handle position inc > 1 in multi-word 
> synoyms
> 
>
> Key: LUCENE-8137
> URL: https://issues.apache.org/jira/browse/LUCENE-8137
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 7.2.1, 8.0
>Reporter: Jim Ferenczi
>Assignee: Jim Ferenczi
>Priority: Major
> Attachments: SGF_SF_interaction.patch
>
>
> The automaton built for graph queries that contain multiple multi-word 
> synonyms does not handle gaps if they appear in the middle of a multi-word 
> synonym. In such a case, the token next to the gap is considered part of the 
> multi-word synonym. 
> Stop words that appear before or after multi-word synonyms are handled 
> correctly in the current version but the synonym rule "part of speech, pos" 
> for instance does not create the expected query if "of" is removed by a 
> filter that is set after the synonym_graph.  One solution would be to reuse 
> TokenStreamToAutomaton (with minor changes to add the ability to create token 
> transitions rather than chars) which preserves gaps (as a transition) in the 
> produced automaton.






[jira] [Assigned] (LUCENE-8978) "Max Bottom" Based Early Termination For Concurrent Search

2019-09-18 Thread Atri Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Atri Sharma reassigned LUCENE-8978:
---

Assignee: Atri Sharma

> "Max Bottom" Based Early Termination For Concurrent Search
> --
>
> Key: LUCENE-8978
> URL: https://issues.apache.org/jira/browse/LUCENE-8978
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Assignee: Atri Sharma
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> When running a search concurrently, collectors that have collected the 
> requested number of hits locally (i.e., their local priority queue is full) can 
> then globally publish their bottom hit's score, and other collectors can 
> use that score as a filter. If multiple collectors have full priority 
> queues, the maximum of all bottom scores will be taken as the global 
> bottom score.
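A minimal, stdlib-only sketch of such a shared bottom-score accumulator (a simplification for illustration, not Lucene's actual implementation; it assumes non-negative scores so that float bit patterns order the same way the values do):

```java
import java.util.concurrent.atomic.AtomicLong;

public class GlobalBottomScore {
    // Maximum of all published per-collector bottom scores, shared across threads.
    // Stored as the int bit pattern of a float widened to long; for non-negative
    // floats the bit patterns order identically to the values.
    private final AtomicLong maxBottomBits = new AtomicLong(Long.MIN_VALUE);

    // Called by a collector once its local priority queue is full.
    public void publishBottom(float bottomScore) {
        maxBottomBits.accumulateAndGet(Float.floatToIntBits(bottomScore), Math::max);
    }

    // Hits scoring below this value cannot enter any already-full queue,
    // so other collectors may skip work for them.
    public float globalBottom() {
        long bits = maxBottomBits.get();
        return bits == Long.MIN_VALUE ? Float.NEGATIVE_INFINITY
                                      : Float.intBitsToFloat((int) bits);
    }
}
```

A lock-free `accumulateAndGet` keeps publication cheap, so collectors can publish on every queue update without contending on a lock.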






[jira] [Created] (SOLR-13775) Update PR template to suggest Github's: "Allow Edits From Maintainers" option

2019-09-18 Thread Jason Gerlowski (Jira)
Jason Gerlowski created SOLR-13775:
--

 Summary: Update PR template to suggest Github's: "Allow Edits From 
Maintainers" option
 Key: SOLR-13775
 URL: https://issues.apache.org/jira/browse/SOLR-13775
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: github
Reporter: Jason Gerlowski
Assignee: Jason Gerlowski


As discussed recently on the mailing list, Github does have one big downside: 
it's pretty onerous for two people to both contribute code to the same PR.  
PRs typically come from personal forks of Solr, owned by an individual.  If 
that individual doesn't think to give other collaborators edit access, then 
they must open secondary PRs to get their changes into the primary PR.  These 
secondary PRs must be reviewed, merged, etc.

This makes collaboration much more difficult than it is in the patch world for 
example, where I can (e.g.) help a contributor clean up their formatting by 
just uploading a new patch.

Unfortunately the best workaround at this point for this in github is to prompt 
those opening PRs to grant access to Solr's upstream committers and 
maintainers. We can do this by linking users to this 
[option|https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork]
 in the "PR checklist" that was added recently.

Users don't have to provide this access if they don't want to.  But hopefully 
it'll make collaboration easier in many cases.






[jira] [Resolved] (LUCENE-8951) Create issues@ and builds@ lists and update notifications

2019-09-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved LUCENE-8951.
-
Resolution: Fixed

Resolving this. I'll open a discussion thread on dev@ about whether to include 
[Created] notifications there.

> Create issues@ and builds@ lists and update notifications
> -
>
> Key: LUCENE-8951
> URL: https://issues.apache.org/jira/browse/LUCENE-8951
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> Issue to plan and execute decision from dev mailing list 
> [https://lists.apache.org/thread.html/762d72a9045642dc488dc7a2fd0a525707e5fa5671ac0648a3604c9b@%3Cdev.lucene.apache.org%3E]
>  # Create mailing lists as an announce only list (/)
>  # Subscribe all emails that will be allowed to post (/)
>  # Update websites with info about the new lists (/)
>  # Announce to dev@ list that the change will happen (/)
>  # Modify Jira and Github bots to post to issues@ list instead of dev@ (/)
>  # Modify Jenkins (including Policeman and other) to post to builds@ (/)
>  # Announce to dev@ list that the change is effective (/)






[jira] [Updated] (LUCENE-8951) Create issues@ and builds@ lists and update notifications

2019-09-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-8951:

Description: 
Issue to plan and execute decision from dev mailing list 
[https://lists.apache.org/thread.html/762d72a9045642dc488dc7a2fd0a525707e5fa5671ac0648a3604c9b@%3Cdev.lucene.apache.org%3E]
 # Create mailing lists as an announce only list (/)
 # Subscribe all emails that will be allowed to post (/)
 # Update websites with info about the new lists (/)
 # Announce to dev@ list that the change will happen (/)
 # Modify Jira and Github bots to post to issues@ list instead of dev@ (/)
 # Modify Jenkins (including Policeman and other) to post to builds@ (/)
 # Announce to dev@ list that the change is effective (/)

  was:
Issue to plan and execute decision from dev mailing list 
[https://lists.apache.org/thread.html/762d72a9045642dc488dc7a2fd0a525707e5fa5671ac0648a3604c9b@%3Cdev.lucene.apache.org%3E]
 # Create mailing lists as an announce only list (/)
 # Subscribe all emails that will be allowed to post (/)
 # Update websites with info about the new lists (/)
 # Announce to dev@ list that the change will happen (/)
 # Modify Jira and Github bots to post to issues@ list instead of dev@ (/)
 # Modify Jenkins (including Policeman and other) to post to builds@ (/)
 # Announce to dev@ list that the change is effective


> Create issues@ and builds@ lists and update notifications
> -
>
> Key: LUCENE-8951
> URL: https://issues.apache.org/jira/browse/LUCENE-8951
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> Issue to plan and execute decision from dev mailing list 
> [https://lists.apache.org/thread.html/762d72a9045642dc488dc7a2fd0a525707e5fa5671ac0648a3604c9b@%3Cdev.lucene.apache.org%3E]
>  # Create mailing lists as an announce only list (/)
>  # Subscribe all emails that will be allowed to post (/)
>  # Update websites with info about the new lists (/)
>  # Announce to dev@ list that the change will happen (/)
>  # Modify Jira and Github bots to post to issues@ list instead of dev@ (/)
>  # Modify Jenkins (including Policeman and other) to post to builds@ (/)
>  # Announce to dev@ list that the change is effective (/)






[GitHub] [lucene-solr] janhoy commented on a change in pull request #888: SOLR-13774 add lucene/solr openjdk compatibility matrix to ref guide.

2019-09-18 Thread GitBox
janhoy commented on a change in pull request #888: SOLR-13774 add lucene/solr 
openjdk compatibility matrix to ref guide.
URL: https://github.com/apache/lucene-solr/pull/888#discussion_r325546598
 
 

 ##
 File path: solr/solr-ref-guide/src/solr-system-requirements.adoc
 ##
 @@ -93,3 +93,119 @@ The success rate in our automated tests is similar with 
all the Java versions te
 * This version has continuous testing with Java 9, 10, 11, 12 and the 
pre-release version of Java 13.
 * There were known issues with Kerberos with Java 9+ prior to Solr 8.1. If 
using 8.0, you should test in your environment.
 * Be sure to test with SSL/TLS and/or authorization enabled in your 
environment if you require either when using Java 9+.
+
+=== Lucene/Solr OpenJDK Compatibility
+The following compatibility matrix was generated by running an `ant test` 
command for each version of Solr and OpenJDK. The tests were run in a 
non-SSL/TLS environment without authorization enabled. A BUILD SUCCESSFUL 
message resulted in a "Y" and a BUILD FAILED message resulted in a "_N_".
+
+[cols="1,6*^" options="header"]
+|===
+|Lucene/Solr|OpenJDK 8|OpenJDK 9|OpenJDK 10|OpenJDK 11|OpenJDK 12|OpenJDK 13
+|*3.1.0* |_N_ |_N_ |_N_  |_N_  |_N_  |_N_
 
 Review comment:
   Agree. It's cool to see the output but doesn't provide much above what we 
already document for each version. And the Reference Guide is version-specific, 
so telling readers of the Solr 8.3 Ref Guide that Solr 4.10 won't work with 
Java 13 is not very interesting.
   
   I'm more interested in running tests against AdoptOpenJDK, Corretto, 
OracleOpenJDK etc, than against all versions of Oracles non-free options.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] anshumg commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap options

2019-09-18 Thread GitBox
anshumg commented on issue #887: SOLR-13773: Prometheus Exporter GC and Heap 
options
URL: https://github.com/apache/lucene-solr/pull/887#issuecomment-532557887
 
 
   Thanks @HoustonPutman .
   I agree with @dsmiley w/ the naming. SOLR* will confuse users here.
   
   Also, the precommit check seems to have failed, though glancing through it, 
it seems like it wasn't anything related to your changes. I've just retriggered 
the action, and if it fails again it will be worth making sure that it's 
unrelated (and where it broke).





[jira] [Commented] (LUCENE-6966) Contribution: Codec for index-level encryption

2019-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932154#comment-16932154
 ] 

Jan Høydahl commented on LUCENE-6966:
-

+1 for a simple Directory-based approach. Can anyone lobby for a 
contribution? I have clients asking for this as well.

> Contribution: Codec for index-level encryption
> --
>
> Key: LUCENE-6966
> URL: https://issues.apache.org/jira/browse/LUCENE-6966
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/other
>Reporter: Renaud Delbru
>Priority: Major
>  Labels: codec, contrib
> Attachments: Encryption Codec Documentation.pdf, LUCENE-6966-1.patch, 
> LUCENE-6966-2-docvalues.patch, LUCENE-6966-2.patch
>
>
> We would like to contribute a codec that enables the encryption of sensitive 
> data in the index that has been developed as part of an engagement with a 
> customer. We think that this could be of interest for the community.
> Below is a description of the project.
> h1. Introduction
> In comparison with approaches where all data is encrypted (e.g., file system 
> encryption, index output / directory encryption), encryption at a codec level 
> enables more fine-grained control on which block of data is encrypted. This 
> is more efficient since less data has to be encrypted. This also gives more 
> flexibility such as the ability to select which field to encrypt.
> Some of the requirements for this project were:
> * The performance impact of the encryption should be reasonable.
> * The user can choose which field to encrypt.
> * Key management: During the life cycle of the index, the user can provide a 
> new version of his encryption key. Multiple key versions should co-exist in 
> one index.
> h1. What is supported ?
> - Block tree terms index and dictionary
> - Compressed stored fields format
> - Compressed term vectors format
> - Doc values format (prototype based on an encrypted index output) - this 
> will be submitted as a separated patch
> - Index upgrader: command to upgrade all the index segments with the latest 
> key version available.
> h1. How it is implemented ?
> h2. Key Management
> One index segment is encrypted with a single key version. An index can have 
> multiple segments, each one encrypted using a different key version. The key 
> version for a segment is stored in the segment info.
> The provided codec is abstract, and a subclass is responsible for providing an 
> implementation of the cipher factory. The cipher factory is responsible for 
> creating a cipher instance based on a given key version.
> h2. Encryption Model
> The encryption model is based on AES/CBC with padding. Initialisation vector 
> (IV) is reused for performance reasons, but only on a per-format and 
> per-segment basis.
> While IV reuse is usually considered bad practice, the CBC mode is somewhat 
> resilient to IV reuse. The only "leak" of information that this could lead to 
> is being able to tell that two encrypted blocks of data start with the same 
> prefix. However, it is unlikely that two data blocks in an index segment will 
> start with the same data:
> - Stored Fields Format: Each encrypted data block is a compressed block 
> (~4kb) of one or more documents. It is unlikely that two compressed blocks 
> start with the same data prefix.
> - Term Vectors: Each encrypted data block is a compressed block (~4kb) of 
> terms and payloads from one or more documents. It is unlikely that two 
> compressed blocks start with the same data prefix.
> - Term Dictionary Index: The term dictionary index is encoded and encrypted 
> in one single data block.
> - Term Dictionary Data: Each data block of the term dictionary encodes a set 
> of suffixes. It is unlikely to have two dictionary data blocks sharing the 
> same prefix within the same segment.
> - DocValues: A DocValues file will be composed of multiple encrypted data 
> blocks. It is unlikely to have two data blocks sharing the same prefix within 
> the same segment (each one encodes a list of values associated with a 
> field).
> To the best of our knowledge, this model should be safe. However, it would be 
> good if someone with security expertise in the community could review and 
> validate it. 
> h1. Performance
> We report here a performance benchmark we did on an early prototype based on 
> Lucene 4.x. The benchmark was performed on the Wikipedia dataset where all 
> the fields (id, title, body, date) were encrypted. Only the block tree terms 
> and compressed stored fields format were tested at that time. 
> h2. Indexing
> The indexing throughput slightly decreased and is roughly 15% less than with 
> the base Lucene. 
> The merge time slightly increased by 35%.
> There was no significant difference in terms of index size.
> h2. Query 
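The "same prefix" leak under IV reuse described in the encryption model above can be demonstrated with a short JDK-only sketch (a toy illustration with a fixed all-zero key and IV, not the contributed codec):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class IvReuseDemo {
    // AES/CBC encryption under a fixed (reused) key and IV.
    static byte[] encrypt(byte[] plain, byte[] key, byte[] iv) {
        try {
            Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new IvParameterSpec(iv));
            return c.doFinal(plain);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // all-zero key/IV: demo only, never do this in production
        byte[] iv = new byte[16];
        // Two plaintexts sharing a 16-byte (one AES block) prefix.
        byte[] a = "block-prefix-00 then A...".getBytes(StandardCharsets.UTF_8);
        byte[] b = "block-prefix-00 then B...".getBytes(StandardCharsets.UTF_8);
        byte[] ca = encrypt(a, key, iv);
        byte[] cb = encrypt(b, key, iv);
        // CBC with a reused IV leaks exactly this: identical plaintext prefixes
        // yield an identical first ciphertext block, and nothing more.
        System.out.println(Arrays.equals(Arrays.copyOf(ca, 16), Arrays.copyOf(cb, 16))); // true
        System.out.println(Arrays.equals(ca, cb)); // false
    }
}
```

This matches the argument in the description: the scheme is safe only to the extent that two data blocks within a segment are unlikely to share a block-sized prefix.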

[GitHub] [lucene-solr] anshumg commented on a change in pull request #888: SOLR-13774 add lucene/solr openjdk compatibility matrix to ref guide.

2019-09-18 Thread GitBox
anshumg commented on a change in pull request #888: SOLR-13774 add lucene/solr 
openjdk compatibility matrix to ref guide.
URL: https://github.com/apache/lucene-solr/pull/888#discussion_r325517067
 
 

 ##
 File path: solr/solr-ref-guide/src/solr-system-requirements.adoc
 ##
 @@ -93,3 +93,119 @@ The success rate in our automated tests is similar with 
all the Java versions te
 * This version has continuous testing with Java 9, 10, 11, 12 and the 
pre-release version of Java 13.
 * There were known issues with Kerberos with Java 9+ prior to Solr 8.1. If 
using 8.0, you should test in your environment.
 * Be sure to test with SSL/TLS and/or authorization enabled in your 
environment if you require either when using Java 9+.
+
+=== Lucene/Solr OpenJDK Compatibility
+The following compatibility matrix was generated by running an `ant test` 
command for each version of Solr and OpenJDK. The tests were run in a 
non-SSL/TLS environment without authorization enabled. A BUILD SUCCESSFUL 
message resulted in a "Y" and a BUILD FAILED message resulted in a "_N_".
+
+[cols="1,6*^" options="header"]
+|===
+|Lucene/Solr|OpenJDK 8|OpenJDK 9|OpenJDK 10|OpenJDK 11|OpenJDK 12|OpenJDK 13
+|*3.1.0* |_N_ |_N_ |_N_  |_N_  |_N_  |_N_
 
 Review comment:
   I don't think it makes sense to add all versions in here. Also, `ant test` 
is not necessarily the best way to generate this - however, this seems very 
promising w.r.t. the successful test % :)





[jira] [Commented] (LUCENE-8945) Allow to change the output file delimiter on Luke "export terms" feature

2019-09-18 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932147#comment-16932147
 ] 

Tomoko Uchida commented on LUCENE-8945:
---

+1

I will commit it to the ASF repo shortly.

> Allow to change the output file delimiter on Luke "export terms" feature
> 
>
> Key: LUCENE-8945
> URL: https://issues.apache.org/jira/browse/LUCENE-8945
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/luke
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-8945.patch, LUCENE-8945.patch, 
> delimiter_comma_exported_file.PNG, delimiter_space_exported_file.PNG, 
> delimiter_tab_exported_file.PNG, luke_export_delimiter.png
>
>
> This is a follow-up issue for LUCENE-8764.
> The current delimiter is fixed to "," (comma), but terms can also include 
> commas, and they are not escaped. It would be better if the delimiter could be 
> changed/selected to a tab or whitespace when exporting.






[GitHub] [lucene-solr] jgq2008303393 edited a comment on issue #884: LUCENE-8980: optimise SegmentTermsEnum.seekExact performance

2019-09-18 Thread GitBox
jgq2008303393 edited a comment on issue #884: LUCENE-8980: optimise 
SegmentTermsEnum.seekExact performance
URL: https://github.com/apache/lucene-solr/pull/884#issuecomment-532538959
 
 
   Ping @jpountz @mikemccand.
   Please help to take a look




