[jira] [Commented] (LUCENE-9529) Larger stored fields block sizes mean we're more likely to disable optimized bulk merging

2020-09-16 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197386#comment-17197386
 ] 

Adrien Grand commented on LUCENE-9529:
--

I like this idea.

> Larger stored fields block sizes mean we're more likely to disable optimized 
> bulk merging
> -
>
> Key: LUCENE-9529
> URL: https://issues.apache.org/jira/browse/LUCENE-9529
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> Whenever possible when merging stored fields, Lucene tries to copy the 
> compressed data instead of decompressing the source segment and then 
> re-compressing it in the destination segment. A problem with this approach is 
> that if some blocks are incomplete (typically the last block of a segment), 
> they remain incomplete in the destination segment too, and if we do this 
> for too long we end up with a bad compression ratio. So Lucene keeps track of 
> these incomplete blocks, and makes sure to keep the ratio of incomplete blocks 
> below 1%.
> But as we increased the block size, it has become more likely to have a high 
> ratio of incomplete blocks. E.g. if you have a segment with 1MB of stored 
> fields, with 16kB blocks like before, you have 63 complete blocks and 1 
> incomplete block, or 1.6%. But now with ~512kB blocks, you have one complete 
> block and 1 incomplete block, i.e. 50%.
> I'm not sure how to fix it, or even whether it should be fixed, but wanted to 
> open an issue to track this.
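The block-size arithmetic in the description can be checked with a short sketch (illustrative only, not Lucene code; a segment slightly over 1MB is assumed so that the last block is partial):

```java
// Percentage of stored-fields blocks that are incomplete for a segment,
// given a block size. The trailing, partially filled block is "incomplete".
public class BlockRatio {
    static double incompleteRatio(long segmentBytes, long blockSize) {
        long complete = segmentBytes / blockSize;                  // full blocks
        long incomplete = (segmentBytes % blockSize == 0) ? 0 : 1; // trailing partial block
        return 100.0 * incomplete / (complete + incomplete);
    }

    public static void main(String[] args) {
        // ~1MB of stored fields with 16kB blocks: 63 complete + 1 incomplete, ~1.6%
        System.out.println(incompleteRatio(1_040_000, 16_384));
        // same segment with ~512kB blocks: 1 complete + 1 incomplete, 50%
        System.out.println(incompleteRatio(1_040_000, 524_288));
    }
}
```

With a fixed segment size, growing the block size shrinks the denominator while the single trailing partial block stays, which is exactly why the incomplete ratio jumps from ~1.6% to 50%.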



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] odidev opened a new pull request #1881: Upgrade zookeeper version to 3.6.2 to use recent version of netty

2020-09-16 Thread GitBox


odidev opened a new pull request #1881:
URL: https://github.com/apache/lucene-solr/pull/1881


   The updated netty version 4.1.50 includes both security fixes and AArch64 
   performance improvements.
   See the release notes for details: 
   https://netty.io/news/2020/05/13/4-1-50-Final.html
   
   Signed-off-by: odidev 
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197361#comment-17197361
 ] 

Noble Paul commented on SOLR-14151:
---

Thanks Daniel.

I could reproduce the problem. I shall commit a proper fix.

> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Daniel Worley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197334#comment-17197334
 ] 

Daniel Worley commented on SOLR-14151:
--

Apologies for the wrong diagnosis earlier.  I debugged some more and noticed 
SolrResourceLoader::findClass was never making use of the schemaLoader.  I 
added the following code to the top of that method, and the test passes using 
a packaged class for the encoder on a delimited payload filter.  Not sure if 
this has other repercussions or if there's something cleaner, but I thought I'd 
share the findings.

{code:java}
if (cname.contains(":") && schemaLoader != null) {
  return schemaLoader.findClass(cname, expectedType);
}
{code}
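The check keys off the ':' separator in package-prefixed class names; a standalone sketch of that dispatch (hypothetical helper names, not Solr's actual API):

```java
// Sketch: split a "pkg:fully.qualified.Class" name into its package and
// class parts; plain names (no ':') fall through to the default lookup.
public class NameRouter {
    static boolean isPackagePrefixed(String cname) {
        return cname.contains(":");
    }

    static String packagePart(String cname) {
        return cname.substring(0, cname.indexOf(':'));
    }

    static String classPart(String cname) {
        return cname.substring(cname.indexOf(':') + 1);
    }

    public static void main(String[] args) {
        // "schemapkg:my.pkg.MyEncoder" -> package "schemapkg", class "my.pkg.MyEncoder"
        System.out.println(packagePart("schemapkg:my.pkg.MyEncoder"));
        System.out.println(classPart("schemapkg:my.pkg.MyEncoder"));
    }
}
```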




[jira] [Comment Edited] (LUCENE-9529) Larger stored fields block sizes mean we're more likely to disable optimized bulk merging

2020-09-16 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197328#comment-17197328
 ] 

Robert Muir edited comment on LUCENE-9529 at 9/17/20, 2:23 AM:
---

The current code tracks the total number of chunks, and the total number of 
"dirty" (incomplete) chunks. 

Then we compute "tooDirty" like this:
{code}
/** 
 * Returns true if we should recompress this reader, even though we could
 * bulk merge compressed data.
 *
 * The last chunk written for a segment is typically incomplete, so without
 * recompressing, in some worst-case situations (e.g. frequent reopen with
 * tiny flushes), over time the compression ratio can degrade. This is a
 * safety switch.
 */
boolean tooDirty(CompressingStoredFieldsReader candidate) {
  // more than 1% dirty, or more than hard limit of 1024 dirty chunks
  return candidate.getNumDirtyChunks() > 1024 || 
         candidate.getNumDirtyChunks() * 100 > candidate.getNumChunks();
}
{code}

Maybe to be more fair, we could use a similar formula but track numDirtyDocs 
and compare with numDocs (we know this value already)? We could still keep a 
safety-switch such as 1024 dirty chunks to avoid some worst-case scenario, but 
just change the ratio at least. 


was (Author: rcmuir):
The current code tracks the total number of chunks, and the total number of 
"dirty" (incomplete) chunks. 

Then we compute "tooDirty" like this:
{code}
  /** 
   * Returns true if we should recompress this reader, even though we could 
bulk merge compressed data 
   * 
   * The last chunk written for a segment is typically incomplete, so without 
recompressing,
   * in some worst-case situations (e.g. frequent reopen with tiny flushes), 
over time the 
   * compression ratio can degrade. This is a safety switch.
   */
  boolean tooDirty(CompressingStoredFieldsReader candidate) {
// more than 1% dirty, or more than hard limit of 1024 dirty chunks
return candidate.getNumDirtyChunks() > 1024 || 
   candidate.getNumDirtyChunks() * 100 > candidate.getNumChunks();
  }
{noformat}

Maybe to be more fair, we could use a similar formula but track numDirtyDocs 
and compare with numDocs (we know this value already)? We could still keep a 
safety-switch such as 1024 dirty chunks to avoid some worst-case scenario, but 
just change the ratio at least. 




[jira] [Commented] (LUCENE-9529) Larger stored fields block sizes mean we're more likely to disable optimized bulk merging

2020-09-16 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197328#comment-17197328
 ] 

Robert Muir commented on LUCENE-9529:
-

The current code tracks the total number of chunks, and the total number of 
"dirty" (incomplete) chunks. 

Then we compute "tooDirty" like this:
{code}
/** 
 * Returns true if we should recompress this reader, even though we could
 * bulk merge compressed data.
 *
 * The last chunk written for a segment is typically incomplete, so without
 * recompressing, in some worst-case situations (e.g. frequent reopen with
 * tiny flushes), over time the compression ratio can degrade. This is a
 * safety switch.
 */
boolean tooDirty(CompressingStoredFieldsReader candidate) {
  // more than 1% dirty, or more than hard limit of 1024 dirty chunks
  return candidate.getNumDirtyChunks() > 1024 || 
         candidate.getNumDirtyChunks() * 100 > candidate.getNumChunks();
}
{code}

Maybe to be more fair, we could use a similar formula but track numDirtyDocs 
and compare with numDocs (we know this value already)? We could still keep a 
safety-switch such as 1024 dirty chunks to avoid some worst-case scenario, but 
just change the ratio at least. 
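The document-based variant suggested here could look like the following sketch (hypothetical counters passed in directly; not actual Lucene API):

```java
// Sketch of the proposed heuristic: keep the 1024-dirty-chunk hard safety
// limit, but base the 1% ratio on documents rather than chunks, so large
// blocks are not penalized just for being few in number.
public class TooDirtySketch {
    static boolean tooDirty(long numDirtyChunks, long numDirtyDocs, long numDocs) {
        // hard safety switch on dirty chunks, plus >1% dirty documents
        return numDirtyChunks > 1024 || numDirtyDocs * 100 > numDocs;
    }

    public static void main(String[] args) {
        System.out.println(tooDirty(2000, 0, 1_000_000)); // safety switch trips
        System.out.println(tooDirty(1, 5, 10_000));       // 0.05% dirty docs
    }
}
```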




[jira] [Comment Edited] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Daniel Worley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197297#comment-17197297
 ] 

Daniel Worley edited comment on SOLR-14151 at 9/17/20, 2:17 AM:


I've been testing these updates against some plugin code that uses a custom 
encoder for the DelimitedPayloadTokenFilterFactory.  It appears that 
Lucene-level code does not pick up loaded packages.  -I believe this is due to 
the SolrResourceLoader defaulting to a list of hardcoded packages when a list 
of subpackages is not provided, which will be the case with Lucene-level 
code.- edit: After looking more, I don't think that is related.

To reproduce you could add a dummy PayloadEncoder to your sample package then 
try using it on a DelimitedPayloadTokenFilterFactory in 
TestPackages::testSchemaPlugins, something like:

{{" 'filters' : [\\{ 'class':'solr.DelimitedPayloadTokenFilterFactory', 
'encoder':'schemapkg:my.pkg.MyEncoder' }]\n";}}


was (Author: worleydl):
I've been testing these updates against some plugin code that utilizes a custom 
encoder for the DelimitedPayloadTokenFilterFactory.  It appears that Lucene 
level code does not pick up on loaded packages.  I believe this is due to the 
SolrResourceLoader defaulting to a list of hardcoded packages when a list of 
subpackages is not provided which will be the case with Lucene level code.

To reproduce you could add a dummy PayloadEncoder to your sample package then 
try using it on a DelimitedPayloadTokenFilterFactory in 
TestPackages::testSchemaPlugins, something like:

{{" 'filters' : [\{ 'class':'solr.DelimitedPayloadTokenFilterFactory', 
'encoder':'schemapkg:my.pkg.MyEncoder' }]\n";}}




[jira] [Comment Edited] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197315#comment-17197315
 ] 

Noble Paul edited comment on SOLR-14151 at 9/17/20, 1:47 AM:
-

[~tflobbe]

These are object-leak failures, not test failures themselves: the tests are 
passing and the post-test cleanups are failing.

The resource leak only occurs with a combination of CoreContainer shutdown 
and core reload. Does it affect a real user? No: they are shutting down 
their server, so there will be no leaks.

We really need to get to the bottom of this. There are 3 options:

* fix the core reloading issues. Erick and I worked on this for several days 
and we still do not see an elegant solution
* avoid managed schema doing core reloads. There is a 
[PR|https://github.com/apache/lucene-solr/pull/1880]
* revert this

I do not wish to go with #3. This is an extremely important feature.


was (Author: noble.paul):
[~tflobbe]

These are just test failures  due to object leaks and not test failures 
themselves. The tests are passing and the post-test cleanups are failing

We really need to get to the bottom of this . There are 3 options

* fix core reloading issues. Me and Erick worked on this several days and we 
still do not see an elegant solution
* avoid managed schema doing core reloads. There is a 
[PR|https://github.com/apache/lucene-solr/pull/1880]
* revert this
I do not wish to go with # 3 . This is an extremely important feature.




[jira] [Comment Edited] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Daniel Worley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197318#comment-17197318
 ] 

Daniel Worley edited comment on SOLR-14151 at 9/17/20, 1:44 AM:


[~noble.paul] In TestPackages::testSchemaPlugins, edit the filters String to 
include a `solr.DelimitedPayloadTokenFilterFactory` but specify a 
[package]:[class] as the encoder; it should fail.  See my last comment for 
an example string you can use.

The issue was originally discovered installing the package from 
[https://github.com/o19s/payload-component] and setting up an analyzer chain 
with: `` which caused it to 
error out.


was (Author: worleydl):
[~noble.paul] Under TestPackages::testSchemaPlugins edit the filters String to 
include a `solr.DelimitedPayloadTokenFilterFactory` but try to specify a 
[package]:[class] as the encoder and it should fail.  See my last comment for 
an example string you can use.

The issue was originally discovered installing the plugin from 
https://github.com/o19s/payload-component and setting up an analyzer chain 
with: `` which caused it to 
error out.




[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Daniel Worley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197318#comment-17197318
 ] 

Daniel Worley commented on SOLR-14151:
--

[~noble.paul] Under TestPackages::testSchemaPlugins edit the filters String to 
include a `solr.DelimitedPayloadTokenFilterFactory` but try to specify a 
[package]:[class] as the encoder and it should fail.  See my last comment for 
an example string you can use.

The issue was originally discovered installing the plugin from 
https://github.com/o19s/payload-component and setting up an analyzer chain 
with: `` which caused it to 
error out.




[jira] [Comment Edited] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197315#comment-17197315
 ] 

Noble Paul edited comment on SOLR-14151 at 9/17/20, 1:37 AM:
-

[~tflobbe]

These are object-leak failures, not test failures themselves: the tests are 
passing and the post-test cleanups are failing.

We really need to get to the bottom of this. There are 3 options:

* fix the core reloading issues. Erick and I worked on this for several days 
and we still do not see an elegant solution
* avoid managed schema doing core reloads. There is a 
[PR|https://github.com/apache/lucene-solr/pull/1880]
* revert this

I do not wish to go with #3. This is an extremely important feature.


was (Author: noble.paul):
[~tflobbe]

These are just test failures  due to object leaks and not test failures 
themselves. The tests are passing and the post-test cleanups are failing

We really need to get to the bottom of this . There are 3 options

* fix core reloading issues
* avoid managed schema doing core reloads
* revert this
I do not wish to go with # 3 . This is an extremely important feature.




[GitHub] [lucene-solr] noblepaul opened a new pull request #1880: SOLR-14151: Do not reload core for schema changes

2020-09-16 Thread GitBox


noblepaul opened a new pull request #1880:
URL: https://github.com/apache/lucene-solr/pull/1880


   






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197315#comment-17197315
 ] 

Noble Paul commented on SOLR-14151:
---

[~tflobbe]

These are object-leak failures, not test failures themselves: the tests are 
passing and the post-test cleanups are failing.

We really need to get to the bottom of this. There are 3 options:

* fix core reloading issues
* avoid managed schema doing core reloads
* revert this

I do not wish to go with #3. This is an extremely important feature.




[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197313#comment-17197313
 ] 

Noble Paul commented on SOLR-14151:
---

[~worleydl] Can you please share the steps you used?






[jira] [Issue Comment Deleted] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14151:
--
Comment: was deleted

(was: Commit cbb1659640cd51be8b403eda8399c527af1c848e in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cbb1659 ]

Revert "Revert "SOLR-14151: Bug fixes (#1815)""

This reverts commit 27a14fe48139019a4c09ba072751e093fc5cb5f1.

Undoing accidental commit
)




[jira] [Issue Comment Deleted] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14151:
--
Comment: was deleted

(was: Commit 27a14fe48139019a4c09ba072751e093fc5cb5f1 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=27a14fe ]

Revert "SOLR-14151: Bug fixes (#1815)"

This reverts commit 95ab98c920833f286608846188d69302b478f80a.

revert the previous change
)




[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197311#comment-17197311
 ] 

ASF subversion and git services commented on SOLR-14151:


Commit cbb1659640cd51be8b403eda8399c527af1c848e in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cbb1659 ]

Revert "Revert "SOLR-14151: Bug fixes (#1815)""

This reverts commit 27a14fe48139019a4c09ba072751e093fc5cb5f1.

Undoing accidental commit


> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API






[jira] [Commented] (SOLR-14871) Use Annotations for v2 APIs in/cluster path

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197308#comment-17197308
 ] 

ASF subversion and git services commented on SOLR-14871:


Commit 298dcbf7aad00ff44434ceacc3a89f0449ec72bf in lucene-solr's branch 
refs/heads/branch_8x from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=298dcbf ]

SOLR-14871: Use Annotations for v2 APIs in /cluster path


> Use Annotations for v2 APIs in/cluster path
> ---
>
> Key: SOLR-14871
> URL: https://issues.apache.org/jira/browse/SOLR-14871
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have custom JSON specs for each of these APIs. With the annotation 
> framework, it can be made simple and readable, and we can eliminate a lot of 
> code.






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Daniel Worley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197297#comment-17197297
 ] 

Daniel Worley commented on SOLR-14151:
--

I've been testing these updates against some plugin code that uses a custom 
encoder for the DelimitedPayloadTokenFilterFactory. It appears that Lucene-level 
code does not pick up loaded packages. I believe this is because the 
SolrResourceLoader defaults to a list of hardcoded packages when a list of 
subpackages is not provided, which will be the case with Lucene-level code.

To reproduce, you could add a dummy PayloadEncoder to your sample package, then 
try using it on a DelimitedPayloadTokenFilterFactory in 
TestPackages::testSchemaPlugins, something like:

{{" 'filters' : [\{ 'class':'solr.DelimitedPayloadTokenFilterFactory', 
'encoder':'schemapkg:my.pkg.MyEncoder' }]\n";}}

> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API






[jira] [Commented] (SOLR-14824) Simplify set of Ref Guide build parameters

2020-09-16 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197285#comment-17197285
 ] 

Chris M. Hostetter commented on SOLR-14824:
---

Hmmm i don't really feel comfortable trying to tackle those changes 
here/now – let's pursue this idea more in SOLR-14870 – because AFAICT those 
productDir/buildDir variables return absolute file paths (which will then be 
turned into absolute 'file:///' URLs) but that would break the existing link 
validation logic we have in the ref-guide because it only checks links to 
relative paths ... but i'm assuming we could relativize those paths against the 
ref-guide's own buildDir? ... either way: let's not muck with them until we 
actually get the link checking working again.

(also, FWIW, trying to call {{toURI}} on 
{{project(':lucene').buildDir.toPath()}} or 
{{project(':lucene').buildDir.toPath().resolve('documentation')}} kept giving 
me a weird groovy error i couldn't make heads or tails of: {{No signature of 
method: build_16l26jli5s3vhg4xrd8039hq6.ext() is applicable for argument types: 
(build_16l26jli5s3vhg4xrd8039hq6$_run_closure6) values: 
[build_16l26jli5s3vhg4xrd8039hq6$_run_closure6@c2feddd]}}
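For what it's worth, the relativizing and URL-building discussed here are plain {{java.nio.file.Path}} operations, and the {{toUri()}} call that errors out in the Groovy script works fine from Java. A minimal sketch, using hypothetical directory layouts (the real paths would come from the Gradle projects):

```java
import java.nio.file.Path;

public class JavadocPaths {
    // Absolute documentation dir -> "file:///..." URI string, the kind of
    // value suggested for the localJavadocs properties.
    static String docUri(Path docDir) {
        return docDir.toUri().toASCIIString();
    }

    // Relativize an absolute javadoc dir against the ref-guide's own build
    // dir, so the link checker keeps seeing relative paths.
    static Path relativeToGuide(Path guideBuildDir, Path javadocDir) {
        return guideBuildDir.relativize(javadocDir);
    }

    public static void main(String[] args) {
        // Hypothetical checkout layout, for illustration only.
        Path javadocs = Path.of("/repo/lucene/build/documentation");
        Path guideBuild = Path.of("/repo/solr/solr-ref-guide/build");
        System.out.println(docUri(javadocs));
        System.out.println(relativeToGuide(guideBuild, javadocs));
    }
}
```

This only shows the mechanics; whether the relativized form should feed the link checker is the policy question discussed above.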

> Simplify set of Ref Guide build parameters
> --
>
> Key: SOLR-14824
> URL: https://issues.apache.org/jira/browse/SOLR-14824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Reporter: Cassandra Targett
>Assignee: Chris M. Hostetter
>Priority: Minor
> Attachments: SOLR-14824.patch, SOLR-14824.patch
>
>
> While trying to solve LUCENE-9495, I thought it might be a good idea to try 
> to simplify the set of variables and properties used during the Ref Guide 
> build.
> There are 3 areas to work on:
> 1. Remove the "barebones-html" build. With Gradle the build is self-contained 
> and {{gradlew check}} and {{gradle precommit}} could just build the full docs 
> and check them.
> 2. Remove some properties that only existed for a hypothetical need related 
> to the now-removed PDF.
> 3. Change remaining properties to be defined directly in build.gradle instead 
> of relying on ant properties functionality.






[jira] [Commented] (SOLR-14871) Use Annotations for v2 APIs in/cluster path

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197274#comment-17197274
 ] 

ASF subversion and git services commented on SOLR-14871:


Commit 5bc7fb286182eeca4e4d2c9ff9dfb98f6a399125 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5bc7fb2 ]

SOLR-14871: remove unused test


> Use Annotations for v2 APIs in/cluster path
> ---
>
> Key: SOLR-14871
> URL: https://issues.apache.org/jira/browse/SOLR-14871
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have custom JSON specs for each of these APIs. With the annotation 
> framework, it can be made simple and readable, and we can eliminate a lot of 
> code.






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197273#comment-17197273
 ] 

ASF subversion and git services commented on SOLR-14151:


Commit 27a14fe48139019a4c09ba072751e093fc5cb5f1 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=27a14fe ]

Revert "SOLR-14151: Bug fixes (#1815)"

This reverts commit 95ab98c920833f286608846188d69302b478f80a.

revert the previous change


> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API






[GitHub] [lucene-solr] murblanc closed pull request #1664: SOLR-11208: Usage SynchronousQueue in Executors prevent large scale operations

2020-09-16 Thread GitBox


murblanc closed pull request #1664:
URL: https://github.com/apache/lucene-solr/pull/1664


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Comment Edited] (SOLR-14824) Simplify set of Ref Guide build parameters

2020-09-16 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197265#comment-17197265
 ] 

Uwe Schindler edited comment on SOLR-14824 at 9/16/20, 10:36 PM:
-

Hi Hoss,

looks fine to me in general, but I'd change a bit those crazy settings: 

{{solrRootPath: '../../../../solr/',}}

This is a bit strange to read, it would be much cleaner to use gradle's ability 
to know where the files are without relying on the project directory layout, 
like {{project(':solr').projectDir}}

The same applies to the "localJavadocs" folders (see SOLR-14870). There you can 
also use something like: 
{{project(':lucene').buildDir.toPath().resolve('documentation').toURI().toAsciiString()}},
 which returns an URL/URI that can be used inside HTML as links and would 
resolve to the "site/global" javadocs of Lucene.


was (Author: thetaphi):
Hi Hoss,

looks fine to me in general, but I'd change a bit those crazy settings: 

{{solrRootPath: '../../../../solr/',}}

This is a bit strange to read, it would be much cleaner to use gradle's ability 
to know where the files are without relying on the project directory layout, 
like {{project(':solr').projectDir}}

The same applies to the "localJavadocs" folders (see SOLR-14870).

> Simplify set of Ref Guide build parameters
> --
>
> Key: SOLR-14824
> URL: https://issues.apache.org/jira/browse/SOLR-14824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Reporter: Cassandra Targett
>Assignee: Chris M. Hostetter
>Priority: Minor
> Attachments: SOLR-14824.patch, SOLR-14824.patch
>
>
> While trying to solve LUCENE-9495, I thought it might be a good idea to try 
> to simplify the set of variables and properties used during the Ref Guide 
> build.
> There are 3 areas to work on:
> 1. Remove the "barebones-html" build. With Gradle the build is self-contained 
> and {{gradlew check}} and {{gradle precommit}} could just build the full docs 
> and check them.
> 2. Remove some properties that only existed for a hypothetical need related 
> to the now-removed PDF.
> 3. Change remaining properties to be defined directly in build.gradle instead 
> of relying on ant properties functionality.






[jira] [Comment Edited] (SOLR-14824) Simplify set of Ref Guide build parameters

2020-09-16 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197265#comment-17197265
 ] 

Uwe Schindler edited comment on SOLR-14824 at 9/16/20, 10:34 PM:
-

Hi Hoss,

looks fine to me in general, but I'd change a bit those crazy settings: 

{{solrRootPath: '../../../../solr/',}}

This is a bit strange to read, it would be much cleaner to use gradle's ability 
to know where the files are without relying on the project directory layout, 
like {{project(':solr').projectDir}}

The same applies to the "localJavadocs" folders (see SOLR-14870).


was (Author: thetaphi):
Hi Hoss,

looks fine to me in general, but I'd change a bit those crazy settings: 

{{solrRootPath: '../../../../solr/',}}

This is a bit strange to read, it would be much cleaner to use gradle's ability 
to know where the files are without relying on the project directory layout, 
like {{$project(':solr').projectDir}}

The same applies to the "localJavadocs" folders (see SOLR-14870).

> Simplify set of Ref Guide build parameters
> --
>
> Key: SOLR-14824
> URL: https://issues.apache.org/jira/browse/SOLR-14824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Reporter: Cassandra Targett
>Assignee: Chris M. Hostetter
>Priority: Minor
> Attachments: SOLR-14824.patch, SOLR-14824.patch
>
>
> While trying to solve LUCENE-9495, I thought it might be a good idea to try 
> to simplify the set of variables and properties used during the Ref Guide 
> build.
> There are 3 areas to work on:
> 1. Remove the "barebones-html" build. With Gradle the build is self-contained 
> and {{gradlew check}} and {{gradle precommit}} could just build the full docs 
> and check them.
> 2. Remove some properties that only existed for a hypothetical need related 
> to the now-removed PDF.
> 3. Change remaining properties to be defined directly in build.gradle instead 
> of relying on ant properties functionality.






[jira] [Commented] (SOLR-14824) Simplify set of Ref Guide build parameters

2020-09-16 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197265#comment-17197265
 ] 

Uwe Schindler commented on SOLR-14824:
--

Hi Hoss,

looks fine to me in general, but I'd change a bit those crazy settings: 

{{solrRootPath: '../../../../solr/',}}

This is a bit strange to read, it would be much cleaner to use gradle's ability 
to know where the files are without relying on the project directory layout, 
like {{$project(':solr').projectDir}}

The same applies to the "localJavadocs" folders (see SOLR-14870).

> Simplify set of Ref Guide build parameters
> --
>
> Key: SOLR-14824
> URL: https://issues.apache.org/jira/browse/SOLR-14824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Reporter: Cassandra Targett
>Assignee: Chris M. Hostetter
>Priority: Minor
> Attachments: SOLR-14824.patch, SOLR-14824.patch
>
>
> While trying to solve LUCENE-9495, I thought it might be a good idea to try 
> to simplify the set of variables and properties used during the Ref Guide 
> build.
> There are 3 areas to work on:
> 1. Remove the "barebones-html" build. With Gradle the build is self-contained 
> and {{gradlew check}} and {{gradle precommit}} could just build the full docs 
> and check them.
> 2. Remove some properties that only existed for a hypothetical need related 
> to the now-removed PDF.
> 3. Change remaining properties to be defined directly in build.gradle instead 
> of relying on ant properties functionality.






[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-09-16 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197264#comment-17197264
 ] 

Chris M. Hostetter commented on SOLR-14870:
---

{quote}OK, sorry for being imprecise. ...
{quote}
not your fault – just me not understanding the gradle build enough yet to have a 
coherent conversation about it – and in particular misunderstanding (and having 
backwards) what you're calling "local" javadocs vs "global" javadocs and 
how/when/why things should (or should not) link to them.
{quote}The checklinks task in the refguide depends on the "global javadocs" / 
"documentation" task ...
{quote}
yup yup ... we're on the same page.
{quote}.. If I have some time tomorrow morning, I may be able to help you.
{quote}
No worries, lemme get SOLR-14824 committed (which will clean up some of the 
existing cruft in the ref-guide build.gradle) and then i'll take a stab at:
 * refactoring the {{buildSite}} task (and deps) to separate out the "build" 
logic from the "check" logic
 * refactoring the logic in the build/check tasks into custom ["Task 
Types"|https://docs.gradle.org/current/userguide/custom_tasks.html] with 
parameterized javadoc urls / validation flags
 * adding new "local" versions of build/check that use the (relative) 
path to the "global" javadocs and indicate that all links should be verified

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is currently linked to from the ref guide)
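At its core, the validation being described amounts to: for every relative href in a generated page, resolve it against the page's directory and check that the target file exists. A rough sketch of that idea in Java (this is not the actual ref-guide validator, just an illustration of the check):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RelativeLinkCheck {
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    // Returns the relative links in `html` that do not resolve to an existing
    // file next to `page`. Absolute URLs and bare fragments are skipped, which
    // is exactly why links into separately-hosted javadocs can escape a
    // relative-only check.
    static List<String> brokenLinks(Path page, String html) {
        List<String> broken = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String link = m.group(1);
            if (link.startsWith("http:") || link.startsWith("https:") || link.startsWith("#")) {
                continue;
            }
            String target = link.split("#", 2)[0]; // drop any anchor fragment
            if (!Files.exists(page.getParent().resolve(target))) {
                broken.add(link);
            }
        }
        return broken;
    }
}
```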






[jira] [Commented] (SOLR-14613) Provide a clean API for pluggable replica assignment implementations

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197261#comment-17197261
 ] 

ASF subversion and git services commented on SOLR-14613:


Commit c7d234cafd37d824ff6642b449d3cb333d1e4a9a in lucene-solr's branch 
refs/heads/master from Ilan Ginzburg
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c7d234c ]

SOLR-14613: Autoscaling replacement using placement plugins

Allow using placement plugins to compute replica placement on the cluster for 
Collection API calls.
This is the first code drop for the replacement of the Autoscaling feature.
Javadoc of sample plugin 
org.apache.solr.cluster.placement.plugins.SamplePluginAffinityReplicaPlacement 
details how to enable this replica placement strategy.
PR's #1684 then #1845

> Provide a clean API for pluggable replica assignment implementations
> 
>
> Key: SOLR-14613
> URL: https://issues.apache.org/jira/browse/SOLR-14613
> Project: Solr
>  Issue Type: Improvement
>  Components: AutoScaling
>Reporter: Andrzej Bialecki
>Assignee: Ilan Ginzburg
>Priority: Major
>  Time Spent: 40h 40m
>  Remaining Estimate: 0h
>
> As described in SIP-8 the current autoscaling Policy implementation has 
> several limitations that make it difficult to use for very large clusters and 
> very large collections. SIP-8 also mentions the possible migration path by 
> providing alternative implementations of the placement strategies that are 
> less complex but more efficient in these very large environments.
> We should review the existing APIs that the current autoscaling engine uses 
> ({{SolrCloudManager}} , {{AssignStrategy}} , {{Suggester}} and related 
> interfaces) to see if they provide a sufficient and minimal API for plugging 
> in alternative autoscaling placement strategies, and if necessary refactor 
> the existing APIs.
> Since these APIs are internal it should be possible to do this without 
> breaking back-compat.






[GitHub] [lucene-solr] murblanc merged pull request #1845: SOLR-14613: Autoscaling replacement using placement plugins

2020-09-16 Thread GitBox


murblanc merged pull request #1845:
URL: https://github.com/apache/lucene-solr/pull/1845


   









[jira] [Updated] (SOLR-14824) Simplify set of Ref Guide build parameters

2020-09-16 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14824:
--
Attachment: SOLR-14824.patch
  Assignee: Chris M. Hostetter
Status: Open  (was: Open)

{quote}Next Steps: ...
{quote}
Updated patch includes those outstanding items with a slightly tweaked property 
name.

Final syntax with patch applied (as documented in meta-docs/publish.adoc) ...
{noformat}
# build the "DRAFT" guide over and over as much as you want as things change...
./gradlew solr:solr-ref-guide:buildSite

# build the "real" site for final publishing...
./gradlew solr:solr-ref-guide:buildSite -PsolrGuideDraft=false
{noformat}

FWIW, based on the conversations in SOLR-14870 I was thinking that instead of a 
command-line property to control the "DRAFT" status, we should really just 
have "buildSite", which would always build the "real" (ie: never a DRAFT) 
site into build/html-site, while some new "buildLocalGuide" would always build 
the "DRAFT" pages (and use relative links to "local" javadocs for validation - 
see discussion in SOLR-14870) and the "default" gradle task / workflow should 
be to run the "buildLocalGuide" task – leaving "buildSite" as something only 
a committer would run when publishing the site.

BUT ... then i thought about the actual publishing workflow (as documented in 
meta-docs/publish.adoc) and the fact that we do currently "publish" DRAFT 
copies of the site to lucene.apache.org for people to review – up until the 
solr release is "final" at which point we publish the "non-draft" version of 
the guide.

So for now i left in the {{-PsolrGuideDraft=false}} property ... but it's 
something we may want to reconsider down the road?

[~ctargett] how does this patch look to you? is there anything you had in mind 
for simplification/clean up that i overlooked?

> Simplify set of Ref Guide build parameters
> --
>
> Key: SOLR-14824
> URL: https://issues.apache.org/jira/browse/SOLR-14824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Reporter: Cassandra Targett
>Assignee: Chris M. Hostetter
>Priority: Minor
> Attachments: SOLR-14824.patch, SOLR-14824.patch
>
>
> While trying to solve LUCENE-9495, I thought it might be a good idea to try 
> to simplify the set of variables and properties used during the Ref Guide 
> build.
> There are 3 areas to work on:
> 1. Remove the "barebones-html" build. With Gradle the build is self-contained 
> and {{gradlew check}} and {{gradle precommit}} could just build the full docs 
> and check them.
> 2. Remove some properties that only existed for a hypothetical need related 
> to the now-removed PDF.
> 3. Change remaining properties to be defined directly in build.gradle instead 
> of relying on ant properties functionality.






[jira] [Commented] (LUCENE-9528) Clean up obsolete and commented-out cruft from StandardSyntaxParser.jj

2020-09-16 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197256#comment-17197256
 ] 

Erick Erickson commented on LUCENE-9528:


This'll be an interesting test of whether the Gradle task that processes the jj 
file works; let me know if it doesn't...

I don't know how often the javacc task(s) are run. I do know that we've seen 
multiple instances of people hand-editing the _results_ of that task, it's very 
easy to overlook the comments at the top of the generated files that say DO NOT 
HAND EDIT. Particularly when you're trying to remove warnings, fix deprecations 
and the like.

Now I'm wondering if it makes sense to run this target as part of check. If 
people had hand-edited the output, it'd at least recreate the output with 
differences, then fail check because they weren't "git add"ed yet. It takes 
about 5 seconds to run that task.

It took me some time to make all of the hand-edits happen automatically, but 
the rule should be that if you hand-edit the results of that task, then 
something happens that gives you a clue you've done something wrong.

> Clean up obsolete and commented-out cruft from StandardSyntaxParser.jj
> --
>
> Key: LUCENE-9528
> URL: https://issues.apache.org/jira/browse/LUCENE-9528
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The indentation in that file is crazy. So are micro-optimizations which make 
> reading the syntax parser much more difficult than it actually is.






[jira] [Commented] (LUCENE-7882) Maybe expression compiler should cache recently compiled expressions?

2020-09-16 Thread Haoyu Zhai (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197237#comment-17197237
 ] 

Haoyu Zhai commented on LUCENE-7882:


Hi Uwe

Thank you for making this fantastic PR!

This issue turned out to be a bug in our service that would incorrectly 
recompile many expressions; it is now fixed, and we now see only ~500 
expressions compiled per benchmark run.

We've tested this PR by compiling with JDK 11 and running with JDK 15 (for 
various reasons it's not easy to compile our service with JDK 15 directly). But 
because of the fix mentioned above, we no longer compile enough expressions to 
observe a difference with or without the PR.
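The "small cache in front of the expressions compiler" idea from the issue description can be sketched with an access-ordered {{LinkedHashMap}}. Everything below (class names, the compile function) is illustrative only, not Lucene's actual {{JavascriptCompiler}} API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// LRU cache keyed by expression source text, so repeated compiles of the
// same expression reuse the previously compiled form.
public class ExpressionCache<V> {
    private final Map<String, V> cache;
    private final Function<String, V> compiler;

    public ExpressionCache(int maxSize, Function<String, V> compiler) {
        this.compiler = compiler;
        // true = access order, so iteration starts at the least-recently-used entry
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > maxSize; // evict the LRU entry beyond maxSize
            }
        };
    }

    public synchronized V get(String source) {
        V compiled = cache.get(source); // a hit refreshes the access order
        if (compiled == null) {
            compiled = compiler.apply(source);
            cache.put(source, compiled); // may evict the eldest entry
        }
        return compiled;
    }
}
```

With such a cache, after the first occurrence of an expression a lookup becomes a map hit instead of a recompile, which is the behavior the redline benchmark in the description was missing.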

> Maybe expression compiler should cache recently compiled expressions?
> -
>
> Key: LUCENE-7882
> URL: https://issues.apache.org/jira/browse/LUCENE-7882
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/expressions
>Reporter: Michael McCandless
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: demo.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I've been running search performance tests using a simple expression 
> ({{_score + ln(1000+unit_sales)}}) for sorting and hit this odd bottleneck:
> {noformat}
> "pool-1-thread-30" #70 prio=5 os_prio=0 tid=0x7eea7000a000 nid=0x1ea8a 
> waiting for monitor entry [0x7eea867dd000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.lucene.expressions.js.JavascriptCompiler$CompiledExpression.evaluate(_score
>  + ln(1000+unit_sales))
>   at 
> org.apache.lucene.expressions.ExpressionFunctionValues.doubleValue(ExpressionFunctionValues.java:49)
>   at 
> com.amazon.lucene.OrderedVELeafCollector.collectInternal(OrderedVELeafCollector.java:123)
>   at 
> com.amazon.lucene.OrderedVELeafCollector.collect(OrderedVELeafCollector.java:108)
>   at 
> org.apache.lucene.search.MultiCollectorManager$Collectors$LeafCollectors.collect(MultiCollectorManager.java:102)
>   at 
> org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:241)
>   at 
> org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:184)
>   at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
>   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:658)
>   at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:600)
>   at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:597)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I couldn't see any {{synchronized}} in the sources here, so I'm not sure 
> which object monitor it's blocked on.
> I was accidentally compiling a new expression for every query, and that 
> bottleneck would cause overall QPS to slow down drastically (~4X slower after 
> ~1 hour of redline tests), as if the JVM is getting slower and slower to 
> evaluate each expression the more expressions I had compiled.
> I tested JDK 9-ea and it also kept slowing down over time as the performance 
> test ran.
> Maybe we should put a small cache in front of the expressions compiler to 
> make it less trappy?  Or maybe we can get to the root cause of why the JVM 
> slows down more and more, the more expressions you compile?
> I won't have time to work on this in the near future so if anyone else feels 
> the itch, please scratch it!
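
A small cache in front of the compiler, as suggested above, could be as simple as memoizing on the expression source string. The sketch below is purely illustrative: the generic {{compile}} function stands in for the real expressions compiler (e.g. something like {{JavascriptCompiler.compile}}), and none of these names come from the actual Lucene sources.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Illustrative sketch only: memoize compiled expressions keyed by their source
// string, so each distinct expression is compiled at most once. The "compile"
// function here is a stand-in for the real expressions compiler.
public class CompiledExpressionCache<T> {
    private final ConcurrentMap<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compile;

    public CompiledExpressionCache(Function<String, T> compile) {
        this.compile = compile;
    }

    public T get(String source) {
        // computeIfAbsent invokes compile at most once per distinct key
        return cache.computeIfAbsent(source, compile);
    }
}
```

If expression sources are user-controlled, a bounded (e.g. LRU) map would be safer than this unbounded one.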



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-09-16 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197224#comment-17197224
 ] 

Uwe Schindler commented on SOLR-14870:
--

{quote}
bq. ... Maybe we have some communication problem here (terms and semantic). ...

Almost certainly, because very little of what you said makes sense to me – and 
most of that is just my lack of comprehension, only a little bit is a 
disagreement about what i _do_ understand, but i suspect you are right that 
it's just a terminology confusion.
{quote}

OK, sorry for being imprecise. The answer was meant for [~dweiss]. I have read 
your text, and maybe for clarification on your side I will (again) explain what's 
completely different in Gradle with regards to Javadocs:

The gradle build actually builds 2 different versions of the Javadocs:
- One called "local javadocs", which are per project: every Lucene / Solr module 
has its own "small" piece of Javadocs residing in the project's subfolder. Those 
Javadocs are isolated and intended to be packaged into Javadoc JAR files 
available to IDEs and other consumers from Maven Central. Those Javadocs never 
go to any website, and therefore they are also not the "target for link checks".
- The other "global javadocs" variant is part of "gradlew :documentation" or 
"gradlew :lucene:documentation" (if you only want Lucene). This task builds the 
Javadocs as they are published on the website. Those javadocs are then located 
at {{$\{project(':lucene').buildDir\}/documentation}} and are copied to the web 
page. They have exactly the structure published on the website. The final 
location may change, as I am planning to make the global javadocs a separate 
subproject of lucene/solr (to assemble those). But with the above logic it would 
just be a change in the above {{project()}} path, so you don't need to take 
care.

As you see from that, IMHO the ideal way to check the links would be the 
following:
- The checklinks task in the refguide depends on the "global javadocs" / 
"documentation" task (as it's Lucene only, right?). Ideally it should rely on 
its output (I can fix this to declare the documentation folder as its output); 
then the dependency would be on the output folder.
- When checklinks is running, it just passes the above project output, converted 
to a URI, to the checker task or the build-refguide task. This should be done 
always, as the output to be checked never depends on external resources.
- Ideally for checking purposes the gradle build should have 2 different tasks 
(one that builds the real refguide with absolute links, and one just for 
checking purposes to another output folder, using the above project-URL).
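
In a build script, the wiring described in the bullets above might look roughly like this. This is purely an illustrative sketch: the task name, project paths, and checker invocation are hypothetical, not taken from the actual lucene-solr build files.

```groovy
// Hypothetical sketch of the checklinks wiring; all names are illustrative.
configure(project(':solr:solr-ref-guide')) {
  task checkRefGuideLinks {
    // depend on the "global javadocs" output, not the per-module "local" ones
    dependsOn ':lucene:documentation'
    doLast {
      // pass the documentation output folder, converted to a URI, to the checker
      def docs = new File(project(':lucene').buildDir, 'documentation')
      println "validating ref-guide links against ${docs.toURI()}"
    }
  }
}
```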

Does this now sound doable? If I have some time tomorrow morning, I may be able 
to help you. 

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is currently linked to from the ref guide)






[GitHub] [lucene-solr] Hronom commented on pull request #1864: SOLR-14850 ExactStatsCache NullPointerException when shards.tolerant=true

2020-09-16 Thread GitBox


Hronom commented on pull request #1864:
URL: https://github.com/apache/lucene-solr/pull/1864#issuecomment-693648037


   
   @munendrasn do you need something specific?
   To reproduce the problem, use a subclass of `ExactStatsCache`, for example 
`ExactSharedStatsCache`, run any simple query with `shards.tolerant=true`, and 
take down 1 shard in your SolrCloud cluster. You will get the exception...



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9528) Clean up obsolete and commented-out cruft from StandardSyntaxParser.jj

2020-09-16 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197218#comment-17197218
 ] 

Dawid Weiss commented on LUCENE-9528:
-

I do have one question about syntax. We currently allow boost and fuzzy 
operators to come in any order:
{code}
term^3~2
{code}

The order above doesn't make any sense from a parser grammar point of view (and 
is in fact coded incorrectly, allowing two conflicting fuzzy operators). The 
boost applies to a sub-clause (be it a regexp, term query, or phrase query) and 
grammar-wise it shouldn't occur before the fuzzy operator.

I *can* implement this as it was, but it makes the parser grammar more difficult 
to read and is just plain unnatural (in my opinion). If there are no objections, 
I'd like to restrict query parsing to disallow the above ordering.

https://github.com/dweiss/lucene-solr/blob/LUCENE-9528/lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/standard/parser/StandardSyntaxParser.jj#L398-L426

> Clean up obsolete and commented-out cruft from StandardSyntaxParser.jj
> --
>
> Key: LUCENE-9528
> URL: https://issues.apache.org/jira/browse/LUCENE-9528
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The indentation in that file is crazy. So are micro-optimizations which make 
> reading the syntax parser much more difficult than it actually is.






[GitHub] [lucene-solr] dweiss opened a new pull request #1879: LUCENE-9528: cleanup of flexible query parser's grammar

2020-09-16 Thread GitBox


dweiss opened a new pull request #1879:
URL: https://github.com/apache/lucene-solr/pull/1879


   Piggybacks some cleanups to javacc generation scripts as well.






[jira] [Commented] (LUCENE-9445) Expose new case insensitive RegExpQuery support in QueryParser

2020-09-16 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197213#comment-17197213
 ] 

Dawid Weiss commented on LUCENE-9445:
-

I've just spent a day cleaning up the grammar for the flexible query parser, 
Mark... Do you think you can wait until I push it? LUCENE-9528 

> Expose new case insensitive RegExpQuery support in QueryParser
> --
>
> Key: LUCENE-9445
> URL: https://issues.apache.org/jira/browse/LUCENE-9445
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Mark Harwood
>Assignee: Mark Harwood
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> LUCENE-9386 added a case insensitive matching option to RegExpQuery.
> This proposal is to extend the QueryParser syntax to allow for an optional 
> `i` (case Insensitive) flag to appear on the end of regular expressions e.g. 
> /Foo/i
>  
> This is regex syntax supported by a number of programming languages.
>  






[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-09-16 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197211#comment-17197211
 ] 

Dawid Weiss commented on SOLR-14870:


Right. It really shouldn't be too difficult to put together... but it's gradle 
deep waters to start with, Chris... Don't get discouraged. 

If you're willing to do this then I'd start by peeking around the existing 
configuration scripts - they're self-contained (for the most part) and when you 
eyeball them you'll see how things get done (conversely, you may also see 
things that look like magic - feel free to ask then, please). 

Then hack around and see if you can make things work, file a PR and ping me (or 
Uwe) and we'll help out.

These 4 tasks you mention look exactly how I think it could look... but it's 
definitely an open problem and many equally good solutions exist.




[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197184#comment-17197184
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-14151:
--

Looking at the 7-day report from Hoss, the failures described here account for 
the top 5 most-failed tests, causing ~60% of the total failures.

> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197178#comment-17197178
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-14151:
--

Got it, reload is buggy, but the changes introduced by this Jira issue are 
triggering those failures way more often than before (looking at my inbox, I 
count at least 5 to 10 failures a day on these tests since the commits here). 
What do you suggest we do?




[jira] [Resolved] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread Erik Hatcher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher resolved SOLR-14799.
-
Resolution: Fixed

> JWT authentication plugin should not require subject, unless set as 
> principalClaim
> --
>
> Key: SOLR-14799
> URL: https://issues.apache.org/jira/browse/SOLR-14799
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erik Hatcher
>Assignee: Erik Hatcher
>Priority: Blocker
> Fix For: 8.7
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Some environments don't use "sub" (subject) claim with Solr, but rather rely 
> on a custom claim (such as "solrid") to be required.   This ticket is about 
> making subject claim optional, and only required when principalClaim=sub






[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197163#comment-17197163
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit a2a718f70d510cafaed3de7a396529df252342ee in lucene-solr's branch 
refs/heads/branch_8x from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a2a718f ]

SOLR-14799: JWT authentication plugin only requires sub claim when 
principalClaim=sub





[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197164#comment-17197164
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit b643de6f1747d67389460a7e9ea156fc809be040 in lucene-solr's branch 
refs/heads/branch_8x from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b643de6 ]

SOLR-14799: add CHANGES entry





[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197154#comment-17197154
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit a0404a75011e9dfce4920d66dc64b884c735dbf0 in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a0404a7 ]

SOLR-14799: add CHANGES entry





[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197156#comment-17197156
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit a0404a75011e9dfce4920d66dc64b884c735dbf0 in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a0404a7 ]

SOLR-14799: add CHANGES entry





[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197155#comment-17197155
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit 22022463d7c1437785249428dc37dc1b052c5fdb in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2202246 ]

SOLR-14799: JWT authentication plugin only requires sub claim when 
principalClaim=sub





[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197153#comment-17197153
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit 22022463d7c1437785249428dc37dc1b052c5fdb in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2202246 ]

SOLR-14799: JWT authentication plugin only requires sub claim when 
principalClaim=sub





[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-09-16 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197151#comment-17197151
 ] 

Chris M. Hostetter commented on SOLR-14870:
---

{quote}... Maybe we have some communication problem here (terms and semantic). 
...
{quote}
Almost certainly, because very little of what you said makes sense to me – and 
most of that is just my lack of comprehension, only a little bit is a 
disagreement about what i _do_ understand, but i suspect you are right that 
it's just a terminology confusion.

So here's some more detailed background / specifics / defined terms, solely 
from the point of view of the ref-guide build process – ignoring for a moment 
the changes being planned/discussed in SOLR-14824...

the ref-guide build (both ant and gradle) has 2 orthogonal concepts:
 * how we "build" the guide into HTML, via different tasks/targets
 ** "buildSite" uses jekyll
 ** "bareBonesAsciiDoctor" uses asciidoctor directly
 * the prefix used in the generated HTML, via the "local.javadocs" property
 ** if true, use prefixes like "../../docs/" *AND THOSE LINKS WILL LATER BE 
VALIDATED*
 ** if false, use prefixes like "https://lucene.apache.org/..." (which 
implicitly won't be validated)

The fact that those 2 concepts are orthogonal is really just a historical 
oddity of how we used to build a PDF, but couldn't do link validation in the 
PDF; and how hard it is/was to run "jekyll" from ant; and the fact that we 
wanted some automated validation that could be hooked into "precommit".
 * The "local.javadocs" property was never really meant for a human to set when 
running any targets/tasks directly in the ref-guide.
 * Likewise no human was ever expected to *READ* the "bare bones" ref-guide 
output.

Both of those concepts existed purely so that when "ant precommit" 
(conceptually) depended on "cd solr && ant documentation" that would then 
ensure that ref-guide was forcibly re-generated (because there was no caching 
of the ref-guide build outputs) so we could verify no malformed adoc files, and 
the generated HTML would use "../../etc..." style javadoc links which could be 
easily validated locally.

Uwe: Does that all make sense in terms of the terminology / purpose of the 
"local.javadocs" param I mentioned? We really only have one "site" output 
(built via jekyll) that we actually care about people viewing/publishing, and 
it (historically) has always used "https://..." style links for the javadocs.

(whether we want to change that down the road to be more configurable if/when 
we tie the ref-guide into top level "package/publish" actions seems like it's a 
bigger question for another jira?)

If we were to start over today (even in ant, with all of its problems running 
jekyll directly) where we no longer care about the PDF, and we have a "check" 
task/target inside each sub-project (not just a top level "precommit"), then 
(along the lines of what Dawid said) instead of 2x2 ways to build the guide, it 
would probably make sense to just have 2 ways to build the guide, w/o the 
project level "local.javadocs" property at all...
 * "buildSite" would use jekyll with all javadoc links pointing to 
"http://lucene.apache.org/..."
 ** when we later run the "link checker" on this, it would ignore the remote 
links javadoc links (just like it ignores any external link)
 * "bareBonesAsciiDoctor" would use asciidoctor directly, with all javadoc 
links pointing to "../../etc..."
 ** when we later run the "link checker" on this, it would validate those links

Which brings me to Dawid's comment, which I think really hits the nail on the 
head in terms of my primary concern as to how to do this in the "gradle-esque" 
way...
{quote}If docs are to be rendered with local links they should be rendered 
separately (different task, different outputs). This ensures caches 
(up-to-date) are working correctly. That global "local.javadocs" should be 
removed,
{quote}
Right. I'm starting to drink the gradle Kool-Aid now in terms of intermediate 
outputs and caching...
{quote}It should be two different tasks - one rendering links to local 
javadocs, the other to remote ones. 'buildSite' should be in fact called 
'checkSite(Local|Remote)' because that's what it does.
{quote}
Hmmm... but if I'm following you correctly, it should really be 4 tasks:
 * buildSite
 ** just handle adoc -> html conversion via jekyll into some "build/site" 
directory
 ** use "http://..." style javadoc links
 * buildBareBonesHtml
 ** just handle adoc -> html conversion via asciidoctor into some 
"build/bare-bones" directory
 ** "../../etc..." style javadoc links
 * checkSite
 ** depend on buildSite
 ** just do link checking on the "build/site" directory
 * checkBareBonesHtml
 ** depend on buildBareBonesHtml _and the "solr" and "lucene" level javadoc tasks_
 ** just do link checking on the "build/bare-bones" directory

Does that make sense?
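
To make the four tasks above concrete, they might be sketched roughly like this. This is purely illustrative: none of the task names, dependency paths, or output directories come from the actual build.gradle files.

```groovy
// Illustrative sketch of the four proposed tasks; all names are hypothetical.
task buildSite {
  // jekyll: adoc -> build/site, javadoc links use "http://lucene.apache.org/..."
}
task buildBareBonesHtml {
  // asciidoctor: adoc -> build/bare-bones, javadoc links use "../../..." prefixes
}
task checkSite {
  dependsOn 'buildSite'
  // link-check build/site; remote javadoc links are skipped like any external link
}
task checkBareBonesHtml {
  dependsOn 'buildBareBonesHtml', ':lucene:documentation', ':solr:documentation'
  // link-check build/bare-bones; local "../../" javadoc links are validated
}
```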

The one caveat 

[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197149#comment-17197149
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit 22022463d7c1437785249428dc37dc1b052c5fdb in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2202246 ]

SOLR-14799: JWT authentication plugin only requires sub claim when 
principalClaim=sub





[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197150#comment-17197150
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit a0404a75011e9dfce4920d66dc64b884c735dbf0 in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a0404a7 ]

SOLR-14799: add CHANGES entry








[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197148#comment-17197148
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit a0404a75011e9dfce4920d66dc64b884c735dbf0 in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a0404a7 ]

SOLR-14799: add CHANGES entry








[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197147#comment-17197147
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit 22022463d7c1437785249428dc37dc1b052c5fdb in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2202246 ]

SOLR-14799: JWT authentication plugin only requires sub claim when 
principalClaim=sub








[jira] [Assigned] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread Erik Hatcher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher reassigned SOLR-14799:
---

Assignee: Erik Hatcher







[jira] [Updated] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-16 Thread Michael McCandless (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-9444:
---
Attachment: LUCENE-9444.patch
Status: Patch Available  (was: Patch Available)

Hi [~goankur] – patch looks good!  I downloaded and tweaked a bit:
 * Smoothed wording of some of the javadocs
 * Fixed indent to two spaces (matching Lucene's style)
 * Removed catch {{IOException}} / rethrow {{RuntimeException}} – since these 
methods already throw {{IOException}} we can just throw directly?
 * Added a few assertions and factored out local variables

Net/net I think it is ready!  Can you have a look / iterate?  Thanks!

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  






[jira] [Created] (LUCENE-9529) Larger stored fields block sizes mean we're more likely to disable optimized bulk merging

2020-09-16 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-9529:


 Summary: Larger stored fields block sizes mean we're more likely 
to disable optimized bulk merging
 Key: LUCENE-9529
 URL: https://issues.apache.org/jira/browse/LUCENE-9529
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand


Whenever possible when merging stored fields, Lucene tries to copy the 
compressed data instead of decompressing the source segment to then 
re-compressing in the destination segment. A problem with this approach is that 
if some blocks are incomplete (typically the last block of a segment) then it 
remains incomplete in the destination segment too, and if we do it for too long 
we end up with a bad compression ratio. So Lucene keeps track of these 
incomplete blocks, and makes sure to keep a ratio of incomplete blocks below 1%.

But as we increased the block size, it has become more likely to have a high 
ratio of incomplete blocks. E.g. if you have a segment with 1MB of stored 
fields, with 16kB blocks like before, you have 63 complete blocks and 1 
incomplete block, or 1.6%. But now with ~512kB blocks, you have one complete 
block and 1 incomplete block, i.e. 50%.

I'm not sure how to fix it or even whether it should be fixed but wanted to 
open an issue to track this.
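The arithmetic above can be sketched with a tiny helper (illustrative only; this is not Lucene's actual incomplete-block accounting, and `lastBlockIncompleteRatio` is a hypothetical name):

```java
public class IncompleteBlockRatio {
    // Assume a segment's stored fields fill every block except the last,
    // which is flushed incomplete when the segment ends.
    static double lastBlockIncompleteRatio(long segmentBytes, long blockBytes) {
        long totalBlocks = (segmentBytes + blockBytes - 1) / blockBytes; // ceiling division
        return 1.0 / totalBlocks; // one incomplete block out of totalBlocks
    }

    public static void main(String[] args) {
        long oneMB = 1L << 20;
        // 16kB blocks: 64 blocks total, 1 incomplete -> ~1.6%
        System.out.println(lastBlockIncompleteRatio(oneMB, 16 * 1024));
        // ~512kB blocks: 2 blocks total, 1 incomplete -> 50%
        System.out.println(lastBlockIncompleteRatio(oneMB, 512 * 1024));
    }
}
```

The ratio only depends on how many blocks the segment spans, which is why growing the block size from 16kB to ~512kB pushes small segments over the 1% threshold.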






[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197101#comment-17197101
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit c63684f93b81b5c1f9e8d453349813deffc4ebfe in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c63684f ]

Revert "SOLR-14799: JWT authentication plugin only requires sub claim when 
principalClaim=sub"

This reverts commit bc0c9ffee31b47c2630126deea28d7ff3d829016.


> JWT authentication plugin should not require subject, unless set as 
> principalClaim
> --
>
> Key: SOLR-14799
> URL: https://issues.apache.org/jira/browse/SOLR-14799
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erik Hatcher
>Priority: Blocker
> Fix For: 8.7
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Some environments don't use "sub" (subject) claim with Solr, but rather rely 
> on a custom claim (such as "solrid") to be required.   This ticket is about 
> making subject claim optional, and only required when principalClaim=sub






[jira] [Commented] (SOLR-14799) JWT authentication plugin should not require subject, unless set as principalClaim

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197094#comment-17197094
 ] 

ASF subversion and git services commented on SOLR-14799:


Commit bc0c9ffee31b47c2630126deea28d7ff3d829016 in lucene-solr's branch 
refs/heads/master from Erik Hatcher
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bc0c9ff ]

SOLR-14799: JWT authentication plugin only requires sub claim when 
principalClaim=sub








[GitHub] [lucene-solr] rishisankar commented on a change in pull request #1770: SOLR-14763 SolrJ HTTP/2 Async API using CompletableFuture

2020-09-16 Thread GitBox


rishisankar commented on a change in pull request #1770:
URL: https://github.com/apache/lucene-solr/pull/1770#discussion_r489566851



##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java
##
@@ -260,18 +260,13 @@ public final T process(SolrClient client) throws SolrServerException, IOExceptio
   public final CompletableFuture<T> processAsynchronously(SolrClient client, String collection) {
     final long startNanos = System.nanoTime();
     final CompletableFuture<NamedList<Object>> internalFuture = client.requestAsync(this, collection);
-    final CompletableFuture<T> apiFuture = new CompletableFuture<>();
-
-    internalFuture.whenComplete((result, error) -> {
-      if (!internalFuture.isCompletedExceptionally()) {
-        T res = createResponse(client);
-        res.setResponse(result);
-        long endNanos = System.nanoTime();
-        res.setElapsedTime(TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos));
-        apiFuture.complete(res);
-      } else {
-        apiFuture.completeExceptionally(error);
-      }
+
+    final CompletableFuture<T> apiFuture = internalFuture.thenApply((result) -> {

Review comment:
   If internalFuture completes exceptionally, apiFuture will complete 
exceptionally with a `CompletionException` (with internalFuture's exception as 
its cause). That being said, I think it's better to not have a 
CompletionException wrapper around the actual exception, so I'll revert to 
using whenComplete instead of thenApply.
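The wrapping behavior described above can be demonstrated with a plain-JDK sketch (no SolrJ involved; `errorViaThenApply` and `errorViaManualComplete` are hypothetical names standing in for the two chaining styles):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class AsyncErrorWrapping {
    /** Error a dependent callback observes when the chain is built with thenApply. */
    static Throwable errorViaThenApply(RuntimeException failure) {
        CompletableFuture<String> internal = new CompletableFuture<>();
        CompletableFuture<Integer> api = internal.thenApply(String::length);
        internal.completeExceptionally(failure);
        Throwable[] seen = new Throwable[1];
        api.whenComplete((r, e) -> seen[0] = e); // runs synchronously: api already done
        return seen[0]; // CompletionException wrapping `failure`
    }

    /** Error observed when the api future is completed manually from whenComplete. */
    static Throwable errorViaManualComplete(RuntimeException failure) {
        CompletableFuture<String> internal = new CompletableFuture<>();
        CompletableFuture<Integer> api = new CompletableFuture<>();
        internal.whenComplete((result, error) -> {
            if (error != null) {
                api.completeExceptionally(error); // propagate the raw exception, no wrapper
            } else {
                api.complete(result.length());
            }
        });
        internal.completeExceptionally(failure);
        Throwable[] seen = new Throwable[1];
        api.whenComplete((r, e) -> seen[0] = e);
        return seen[0]; // `failure` itself, unwrapped
    }

    public static void main(String[] args) {
        RuntimeException boom = new IllegalStateException("boom");
        System.out.println(errorViaThenApply(boom) instanceof CompletionException); // true
        System.out.println(errorViaManualComplete(boom) == boom);                   // true
    }
}
```

This is the trade-off in the comment: `thenApply` is more concise, but any exception observed downstream arrives as a `CompletionException` wrapper, whereas the manual `whenComplete` path lets the original exception through untouched.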





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] rishisankar commented on a change in pull request #1770: SOLR-14763 SolrJ HTTP/2 Async API using CompletableFuture

2020-09-16 Thread GitBox


rishisankar commented on a change in pull request #1770:
URL: https://github.com/apache/lucene-solr/pull/1770#discussion_r489563730



##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java
##
@@ -260,18 +260,13 @@ public final T process(SolrClient client) throws SolrServerException, IOExceptio
   public final CompletableFuture<T> processAsynchronously(SolrClient client, String collection) {
     final long startNanos = System.nanoTime();
     final CompletableFuture<NamedList<Object>> internalFuture = client.requestAsync(this, collection);
-    final CompletableFuture<T> apiFuture = new CompletableFuture<>();
-
-    internalFuture.whenComplete((result, error) -> {
-      if (!internalFuture.isCompletedExceptionally()) {
-        T res = createResponse(client);
-        res.setResponse(result);
-        long endNanos = System.nanoTime();
-        res.setElapsedTime(TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos));
-        apiFuture.complete(res);
-      } else {
-        apiFuture.completeExceptionally(error);
-      }
+
+    final CompletableFuture<T> apiFuture = internalFuture.thenApply((result) -> {

Review comment:
   If internalFuture completes exceptionally, apiFuture will complete 
exceptionally with a `CompletionException` (with internalFuture's exception as 
its cause). That being said, I don't really like the `CompletionException` 
wrapper around the actual exception, so I'll revert to using whenComplete 
instead of thenApply.








[GitHub] [lucene-solr] ctargett commented on pull request #1869: SOLR-14866 autowidth tables in ref guide

2020-09-16 Thread GitBox


ctargett commented on pull request #1869:
URL: https://github.com/apache/lucene-solr/pull/1869#issuecomment-693491731


   I don't view the syntax of "leave a blank line after the first line to make 
it a header row" as simpler (which is what I think you mean), at least it isn't 
for me. Until I looked it up yesterday, I completely forgot it was an option.
   
   To be clear, I don't think it's worth the effort to try to standardize all 
tables to use the same syntax nor to spend time policing what form of the 
syntax people choose to use. The flexibility of the syntax is one of its 
strengths, and we could bikeshed all kinds of options but we all have better 
things to do. Like if you have that much time for Ref Guide edits, I could 
suggest several other things to work on instead. We should encourage people to 
review our guidelines and use them, but at the same time we should support 
those who are already comfortable doing things their particular way.
   
Re: the wrapping. It appears if we add `white-space: nowrap` (or 
`white-space: pre`, both prevent wrapping) to the CSS for the `<td>` (and maybe 
also `<th>`) element (`ref-guide.css` L#996 or thereabouts), that forces text 
without spaces not to break.






[jira] [Assigned] (SOLR-14859) [* TO *] queries on DateRange fields miss results

2020-09-16 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski reassigned SOLR-14859:
--

Assignee: Jason Gerlowski

> [* TO *] queries on DateRange fields miss results
> -
>
> Key: SOLR-14859
> URL: https://issues.apache.org/jira/browse/SOLR-14859
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query parsers
>Affects Versions: 8.5
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
> Attachments: SOLR-14859.patch, query-debug.png, reproduce.sh, 
> schema.png
>
>
> "exists" queries ({{[* TO *]}}) on DateRange fields return 0 results 
> regardless of docs in the index with values in that field.
> The issue appears to be that the query is converted into a 
> {{NormsFieldExistsQuery}}, even though DateRangeField uses omitNorms=true by 
> default.  Probably introduced by SOLR-11746's changes to these optimizable 
> range queries.
> I've attached a script to reproduce the issue (tested on Solr 8.6.2) and 
> screenshots showing showing schema and query-parsing info for the 
> reproduction.






[jira] [Commented] (SOLR-14828) reduce 'error' logging noise in BaseCloudSolrClient.requestWithRetryOnStaleState

2020-09-16 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197012#comment-17197012
 ] 

Christine Poerschke commented on SOLR-14828:


Thanks [~noble.paul] and [~anshum] for your input, I've updated the 
https://github.com/apache/lucene-solr/pull/1825 and added you both as extra 
reviewers too.

> reduce 'error' logging noise in 
> BaseCloudSolrClient.requestWithRetryOnStaleState
> 
>
> Key: SOLR-14828
> URL: https://issues.apache.org/jira/browse/SOLR-14828
> Project: Solr
>  Issue Type: Task
>  Components: SolrJ
>Reporter: Christine Poerschke
>Assignee: Christine Poerschke
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently -- e.g. 
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.2/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseCloudSolrClient.java#L960-L961
>  -- an error is logged even if request retrying will happen (and hopefully 
> succeed).
> This task proposes to 'info' or 'warn' rather than 'error' log if the request 
> will be retried.






[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1877: SOLR-13181: param macro expansion could throw

2020-09-16 Thread GitBox


cpoerschke commented on a change in pull request #1877:
URL: https://github.com/apache/lucene-solr/pull/1877#discussion_r489465208



##
File path: solr/core/src/java/org/apache/solr/request/macro/MacroExpander.java
##
@@ -136,11 +136,12 @@ private String _expand(String val) {
   }
   else if (idx < 0) {
 if (sb == null) return val;
-sb.append(val.substring(start));
+sb.append(val, start, val.length());
 return sb.toString();
   }
 
   // found unescaped "${"
+  int matchedStart = idx;

Review comment:
   ```suggestion
 final int matchedStart = idx;
   ```

##
File path: solr/core/src/java/org/apache/solr/request/macro/MacroExpander.java
##
@@ -154,14 +155,14 @@ else if (idx < 0) {
   }
 
   if (matchedStart > 0) {
-sb.append(val.substring(start, matchedStart));
+sb.append(val, start, matchedStart);
   }
 
   // update "start" to be at the end of ${...}
-  start = rbrace + 1;
+  idx = start = rbrace + 1;
 
-  // String inbetween = val.substring(idx, rbrace);
-  StrParser parser = new StrParser(val, idx, rbrace);
+  // String in-between = val.substring(idx, rbrace);

Review comment:
   How about removing rather than amending the `// String inbetween = 
val.substring(idx, rbrace);` commented out code line?

##
File path: solr/core/src/java/org/apache/solr/request/macro/MacroExpander.java
##
@@ -188,7 +189,7 @@ else if (failOnMissingParams) {
 
   } catch (SyntaxError syntaxError) {
 // append the part we would have skipped
-sb.append( val.substring(matchedStart, start) );
+sb.append(val, matchedStart, start);

Review comment:
observation: `foo.append(bar.substring(x,y));` is also found in other 
places in the code base; not sure how it might work implementation-wise, but it 
would be lovely if tooling would flag up that there's a more efficient 
alternative
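The alternative the review is pointing at is the `StringBuilder.append(CharSequence, int, int)` overload, which copies characters directly from the source instead of materializing an intermediate `String`; a minimal illustration:

```java
public class AppendRange {
    public static void main(String[] args) {
        String val = "prefix-${macro}-suffix";
        int start = 0;
        int matchedStart = val.indexOf("${"); // position of the unescaped "${"

        // Allocates a temporary String before copying it into the builder:
        StringBuilder viaSubstring = new StringBuilder();
        viaSubstring.append(val.substring(start, matchedStart));

        // Copies the char range straight out of the CharSequence, no temporary:
        StringBuilder viaRange = new StringBuilder();
        viaRange.append(val, start, matchedStart);

        System.out.println(viaSubstring.toString().equals(viaRange.toString())); // true
        System.out.println(viaRange); // prefix-
    }
}
```

Both produce identical output; the range overload simply avoids one short-lived allocation per macro occurrence, which is what the patch exploits.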

##
File path: solr/core/src/java/org/apache/solr/request/macro/MacroExpander.java
##
@@ -136,11 +136,12 @@ private String _expand(String val) {
   }
   else if (idx < 0) {
 if (sb == null) return val;
-sb.append(val.substring(start));
+sb.append(val, start, val.length());
 return sb.toString();
   }
 
   // found unescaped "${"
+  int matchedStart = idx;
   idx += macroStart.length();
 
   int rbrace = val.indexOf('}', idx);

Review comment:
   ```suggestion
 int rbrace = val.indexOf('}', matchedStart + macroStart.length());
   ```
   
   similar to line 165 below and then the `idx += macroStart.length();` 
assignment would not be needed since there would be no use of `idx` between 
that incrementing assignment and the `idx = start = rbrace + 1;` resetting 
assignment below.








[GitHub] [lucene-solr] jpountz commented on a change in pull request #1866: LUCENE-9523: Speed up query shapes for geometries that generate multiple points

2020-09-16 Thread GitBox


jpountz commented on a change in pull request #1866:
URL: https://github.com/apache/lucene-solr/pull/1866#discussion_r489452934



##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, 
final Weight weight, fin
 final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
 return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
   }
-  final DocIdSetBuilder docIdSetBuilder = new 
DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-  values.intersect(getSparseVisitor(query, docIdSetBuilder));
-  final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-  return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+  if (values.getDocCount() << 2 < values.size()) {
+// we use a dense structure so we can skip already visited documents
+final FixedBitSet result = new FixedBitSet(reader.maxDoc());

Review comment:
   ok, let's keep a FixedBitSet for now then
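The heuristic in the diff (`values.getDocCount() << 2 < values.size()`) switches to the dense structure when documents hold more than four values each on average; a sketch of that decision using a plain `java.util.BitSet` as a stand-in for Lucene's `FixedBitSet`:

```java
import java.util.BitSet;

public class DenseOrSparse {
    // Mirror of the patch's condition: prefer a dense per-document bitset
    // when the average number of values per document exceeds four, so that
    // already-matched docs can be skipped cheaply while visiting points.
    static boolean useDenseBitSet(int docCount, long valueCount) {
        return ((long) docCount << 2) < valueCount;
    }

    public static void main(String[] args) {
        System.out.println(useDenseBitSet(1_000, 10_000)); // heavily multi-valued -> true
        System.out.println(useDenseBitSet(1_000, 1_000));  // single-valued -> false

        // With a dense structure, marking and testing a visited doc is O(1):
        BitSet visited = new BitSet(1_000);
        visited.set(42);
        System.out.println(visited.get(42)); // true
    }
}
```

The benchmark numbers quoted later in the thread show why the dense `FixedBitSet` was kept: the sparse variant costs up to ~27% QPS on these shapes.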








[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1861: SOLR-10391: Add overwrite option to UPLOAD ConfigSet action

2020-09-16 Thread GitBox


HoustonPutman commented on a change in pull request #1861:
URL: https://github.com/apache/lucene-solr/pull/1861#discussion_r489447516



##
File path: 
solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java
##
@@ -170,21 +176,90 @@ private void handleConfigUploadRequest(SolrQueryRequest 
req, SolrQueryResponse r
 
 // Create a node for the configuration in zookeeper
 boolean trusted = getTrusted(req);
-zkClient.makePath(configPathInZk, ("{\"trusted\": " + 
Boolean.toString(trusted) + "}").
-getBytes(StandardCharsets.UTF_8), true);
+Set<String> filesToDelete = Collections.emptySet();
+if (overwritesExisting) {
+  if (!trusted) {
+ensureOverwritingUntrustedConfigSet(zkClient, configPathInZk);
+  }
+  if (req.getParams().getBool(ConfigSetParams.CLEANUP, false)) {
+filesToDelete = getAllConfigsetFiles(zkClient, configPathInZk);
+  }
+} else {
+  zkClient.makePath(configPathInZk, ("{\"trusted\": " + 
Boolean.toString(trusted) + "}").
+  getBytes(StandardCharsets.UTF_8), true);
+}
 
 ZipInputStream zis = new ZipInputStream(inputStream, 
StandardCharsets.UTF_8);
 ZipEntry zipEntry = null;
 while ((zipEntry = zis.getNextEntry()) != null) {
   String filePathInZk = configPathInZk + "/" + zipEntry.getName();
+  if (filePathInZk.endsWith("/")) {
+filesToDelete.remove(filePathInZk.substring(0, filePathInZk.length() 
-1));
+  } else {
+filesToDelete.remove(filePathInZk);
+  }
   if (zipEntry.isDirectory()) {
-zkClient.makePath(filePathInZk, true);
+zkClient.makePath(filePathInZk, false,  true);

Review comment:
   Not very likely, was just thinking about possibilities. It's probably 
not an issue 99.9% of times, so it likely doesn't need to be addressed.








[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1861: SOLR-10391: Add overwrite option to UPLOAD ConfigSet action

2020-09-16 Thread GitBox


HoustonPutman commented on a change in pull request #1861:
URL: https://github.com/apache/lucene-solr/pull/1861#discussion_r489445830



##
File path: 
solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java
##
@@ -170,21 +176,90 @@ private void handleConfigUploadRequest(SolrQueryRequest 
req, SolrQueryResponse r
 
 // Create a node for the configuration in zookeeper
 boolean trusted = getTrusted(req);
-zkClient.makePath(configPathInZk, ("{\"trusted\": " + 
Boolean.toString(trusted) + "}").
-getBytes(StandardCharsets.UTF_8), true);
+Set<String> filesToDelete = Collections.emptySet();
+if (overwritesExisting) {
+  if (!trusted) {
+ensureOverwritingUntrustedConfigSet(zkClient, configPathInZk);
+  }
+  if (req.getParams().getBool(ConfigSetParams.CLEANUP, false)) {
+filesToDelete = getAllConfigsetFiles(zkClient, configPathInZk);
+  }
+} else {

Review comment:
   Yeah, for example if you want to use plugins or something similar, I 
could see someone not using a trusted configSet at first, but then wanting to 
overwrite their configSet to be "trusted" to be able to use the plugins.
   
   I don't see why that shouldn't be possible, given that there's little 
difference security-wise between creating a new config set and overwriting w/ 
cleanup.








[GitHub] [lucene-solr] epugh commented on pull request #1869: SOLR-14866 autowidth tables in ref guide

2020-09-16 Thread GitBox


epugh commented on pull request #1869:
URL: https://github.com/apache/lucene-solr/pull/1869#issuecomment-693410287


   Thanks for pointing that out.  So, should we change it up in the 
asciidoc-syntax.adoc to remove the options="header", and then start 
progressively going with the simpler style?   I like how the simpler style 
removes a lot of thinking, and maybe we highlight to only use the more complex 
style when you need "specific control over the column width in tables".   There 
are a couple of places where the autowidth doesn't look good, places where, if 
we had a nowrap type setting we would use that in conjunction with autowidth.
   
   Again, happy to do the leg work, I know it grows the scope of this PR beyond 
removing those TODO's to revamping all tables.  






[GitHub] [lucene-solr] ctargett commented on pull request #1869: SOLR-14866 autowidth tables in ref guide

2020-09-16 Thread GitBox


ctargett commented on pull request #1869:
URL: https://github.com/apache/lucene-solr/pull/1869#issuecomment-693403436


   bq. Having consistent best practices encoded in the document
   
   The place that codifies our asciidoc best practices is in 
`meta-docs/asciidoc-syntax.adoc`, which is published with every Ref Guide as 
part of `how-to-contribute.adoc`. It in fact says to use the `options="header"` 
syntax for header rows, but to be honest I don't see any problem with someone 
using any valid syntax that ends up with the same HTML style. 






[GitHub] [lucene-solr] iverase commented on pull request #1866: LUCENE-9523: Speed up query shapes for geometries that generate multiple points

2020-09-16 Thread GitBox


iverase commented on pull request #1866:
URL: https://github.com/apache/lucene-solr/pull/1866#issuecomment-693389673


   > Do you think we could always use this new code path?
   
   I ran the Lucene benchmark for points, always using the new code path; there
is a small decrease in performance, but it is not very significant:
   
   ```
   ||Approach||Shape||M hits/sec  ||QPS||Hit count  ||
||Dev||Base ||Diff||Dev||Base||Diff||Dev||Base||Diff||
   |shapes|polyRussia|9.33|9.47|-2%|2.66|2.70|-2%|3508846|3508846| 0%|
   |shapes|polyMedium|3.05|3.17|-4%|37.34|38.87|-4%|2693559|2693559| 0%|
   |shapes|poly 10|40.82|41.41|-1%|25.81|26.18|-1%|355809475|355809475| 0%|
   |shapes|box|41.48|41.62|-0%|42.20|42.35|-0%|221118844|221118844| 0%|
   |shapes|distance|43.62|44.34|-2%|25.63|26.05|-2%|382961957|382961957| 0%|
   ```
   
   If I use a `SparseFixedBitSet` instead of a `FixedBitSet`, then the decrease
in QPS is quite notable:
   
   ```
   |shapes|polyRussia|8.80|9.51|-7%|2.51|2.71|-7%|3508846|3508846| 0%|
   |shapes|polyMedium|3.04|3.15|-4%|37.24|38.64|-4%|2693559|2693559| 0%|
   |shapes|poly 10|32.62|40.88|-20%|20.63|25.85|-20%|355809475|355809475| 0%|
   |shapes|box|31.05|42.37|-27%|31.59|43.12|-27%|221118844|221118844| 0%|
   |shapes|distance|34.06|43.94|-22%|20.01|25.82|-22%|382961957|382961957| 0%|
   ```
   
   I am not concerned about testability, as one code path is tested when
indexing only points and the other one should be triggered when indexing only polygons.
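The dense-vs-sparse heuristic discussed in this thread can be sketched in isolation. The class and method names below are illustrative, not the actual Lucene API; only the `docCount << 2 < size` condition comes from the patch:

```java
// Hypothetical standalone sketch of the heuristic from the PR: use the dense
// FixedBitSet path when, on average, each matching document carries four or
// more indexed points (docCount * 4 < pointCount), because then the same
// document is visited repeatedly and a dense bitset lets us skip
// already-visited documents cheaply. Otherwise fall back to the sparse
// DocIdSetBuilder path.
public class ScorerPathChoice {
  /** Returns true when the dense bitset scorer path should be used. */
  static boolean useDensePath(int docCount, long pointCount) {
    // Equivalent to the patch's `values.getDocCount() << 2 < values.size()`.
    return ((long) docCount << 2) < pointCount;
  }

  public static void main(String[] args) {
    System.out.println(useDensePath(1_000, 10_000)); // many points per doc: dense
    System.out.println(useDensePath(1_000, 1_500));  // few points per doc: sparse
  }
}
```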






[jira] [Resolved] (LUCENE-9525) Better handle small documents with the new Lucene87StoredFieldsFormat

2020-09-16 Thread Adrien Grand (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-9525.
--
Fix Version/s: 8.7
   Resolution: Fixed

> Better handle small documents with the new Lucene87StoredFieldsFormat
> -
>
> Key: LUCENE-9525
> URL: https://issues.apache.org/jira/browse/LUCENE-9525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Stored fields configure a maximum number of fields per block, whose goal is 
> to make sure that you don't decompress more than X documents to get access to 
> a single one. However this has interesting effects with the new format.
> For instance we use 4kB of dictionary and blocks of 60kB for at most 512 
> documents per block. So if your documents are very small, say 10 bytes, the 
> block will be 5120 bytes overall, and we'll first compress 4096 bytes 
> independently, and then 5120-4096=1024 bytes with 4096 bytes of dictionary. 
> In this case training the dictionary takes more time than actually 
> compressing the data, and it's not even sure it's worth it since only 1024 
> bytes out of the 5120 bytes of the block get compressed with a preset 
> dictionary.
> I'm considering adapting the dictionary size and the block size to the total 
> block size in order to better handle such cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Resolved] (LUCENE-9510) SortingStoredFieldsConsumer should use a format that has better random-access

2020-09-16 Thread Adrien Grand (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-9510.
--
Fix Version/s: 8.7
   Resolution: Fixed

> SortingStoredFieldsConsumer should use a format that has better random-access
> -
>
> Key: LUCENE-9510
> URL: https://issues.apache.org/jira/browse/LUCENE-9510
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: 8.7
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We noticed some indexing rate regressions in Elasticsearch after upgrading to 
> a new Lucene snapshot. This is due to the fact that 
> SortingStoredFieldsConsumer is using the default codec to write stored fields 
> on flush. Compression doesn't matter much for this case since these are 
> temporary files that get removed on flush after the segment is sorted anyway 
> so we could switch to a format that has faster random access.






[GitHub] [lucene-solr] iverase commented on a change in pull request #1866: LUCENE-9523: Speed up query shapes for geometries that generate multiple points

2020-09-16 Thread GitBox


iverase commented on a change in pull request #1866:
URL: https://github.com/apache/lucene-solr/pull/1866#discussion_r489412779



##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, 
final Weight weight, fin
 final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
 return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
   }
-  final DocIdSetBuilder docIdSetBuilder = new 
DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-  values.intersect(getSparseVisitor(query, docIdSetBuilder));
-  final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-  return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+  if (values.getDocCount() << 2 < values.size()) {
+// we use a dense structure so we can skip already visited documents
+final FixedBitSet result = new FixedBitSet(reader.maxDoc());

Review comment:
   Using a `SparseFixedBitSet`, the gain is not as significant, except for
complex queries:
   
   ```
   |point|intersects|0.00|0.00|-2%|347.46|356.09|-2%|2644|2644| 0%|
   |box|intersects|5.57|5.64|-1%|37.86|38.35|-1%|33081264|33081264| 0%|
   |distance|intersects|5.33|5.25| 2%|18.72|18.42| 2%|64062400|64062400| 0%|
   |poly 10|intersects|4.73|4.51| 5%|18.00|17.19| 5%|59064569|59064569| 0%|
   |polyMedium|intersects|0.43|0.34|24%|26.53|21.41|24%|528812|528812| 0%|
   |polyRussia|intersects|1.68|1.10|52%|6.87|4.51|52%|244848|244848| 0%|
   ```








[GitHub] [lucene-solr] iverase commented on a change in pull request #1866: LUCENE-9523: Speed up query shapes for geometries that generate multiple points

2020-09-16 Thread GitBox


iverase commented on a change in pull request #1866:
URL: https://github.com/apache/lucene-solr/pull/1866#discussion_r489408392



##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, 
final Weight weight, fin
 final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
 return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
   }
-  final DocIdSetBuilder docIdSetBuilder = new 
DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-  values.intersect(getSparseVisitor(query, docIdSetBuilder));
-  final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-  return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+  if (values.getDocCount() << 2 < values.size()) {
+// we use a dense structure so we can skip already visited documents
+final FixedBitSet result = new FixedBitSet(reader.maxDoc());
+final long[] cost = new long[]{reader.maxDoc()};

Review comment:
   Copy/paste error; it should be:
   
   ```
   final long[] cost = new long[]{0};
   ```








[GitHub] [lucene-solr] iverase commented on a change in pull request #1866: LUCENE-9523: Speed up query shapes for geometries that generate multiple points

2020-09-16 Thread GitBox


iverase commented on a change in pull request #1866:
URL: https://github.com/apache/lucene-solr/pull/1866#discussion_r489407818



##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, 
final Weight weight, fin
 final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
 return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
   }
-  final DocIdSetBuilder docIdSetBuilder = new 
DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-  values.intersect(getSparseVisitor(query, docIdSetBuilder));
-  final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-  return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+  if (values.getDocCount() << 2 < values.size()) {

Review comment:
   Indeed, that is safer.








[GitHub] [lucene-solr] arafalov commented on pull request #1869: SOLR-14866 autowidth tables in ref guide

2020-09-16 Thread GitBox


arafalov commented on pull request #1869:
URL: https://github.com/apache/lucene-solr/pull/1869#issuecomment-693380260


   Having consistent best practices encoded in the document would be a win
from my point of view. I also look a lot at other examples, just in case our
use case is different from generic adoc advice.






[GitHub] [lucene-solr] epugh edited a comment on pull request #1869: SOLR-14866 autowidth tables in ref guide

2020-09-16 Thread GitBox


epugh edited a comment on pull request #1869:
URL: https://github.com/apache/lucene-solr/pull/1869#issuecomment-693365546


   Thanks for the sleuthing @ctargett, I never thought about the case of no 
command.   I tested it this morning, and you are correct.
   
   So, I think one pattern or the other for table headings might be the way to 
go.  I know that I learn by looking at other examples in the ref guide, so if I 
see other tables using the options="header", then I'll do the same.  Likewise 
if I looked instead at the example in `aliases.adoc` first. 
   
   I updated `the-standard-query-parser.adoc` to remove the autowidth.spread, 
and I think it is cleaner and easier to read...
   
   Does this sound like the right approach?  






[GitHub] [lucene-solr] epugh commented on pull request #1869: SOLR-14866 autowidth tables in ref guide

2020-09-16 Thread GitBox


epugh commented on pull request #1869:
URL: https://github.com/apache/lucene-solr/pull/1869#issuecomment-693365546


   Thanks for the sleuthing @ctargett, I never thought about the case of no 
command.   I tested it this morning, and you are correct.
   
   So, I think one pattern or the other for table headings might be the way to 
go.  I know that I learn by looking at other examples in the ref guide, so if I 
see other tables using the options="header", then I'll do the same.  Likewise 
if I looked instead at the example in `aliases.adoc` first. 
   
   I updated `the-standard-query-parser.adoc` to remove the autowidth.spread, 
and I think it is cleaner...
   
   
   I'm happy to rework this PR to both remove all the TODOs and 






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196923#comment-17196923
 ] 

Noble Paul commented on SOLR-14151:
---

The failures are due to unclosed objects. In a node that is shut down, it's not a
problem. That was added as a defence against sloppy coding and object leaks.
So adding complexity and potential deadlocks is worse than the problem itself.

> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API






[jira] [Commented] (SOLR-14861) CoreContainer shutdown needs to be aware of other ongoing operations and wait until they're complete

2020-09-16 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196922#comment-17196922
 ] 

Erick Erickson commented on SOLR-14861:
---

Noodling on this a bit more while typing a long reply over on SOLR-14151:

I think we need a way for shutdown to somehow cause Solr to start refusing 
_all_ incoming requests, wait until all in-flight operations are complete, and 
_then_ start shutting down. The approach in the patch is too local, even if it 
would work. I'd love suggestions here. And this is exacerbated by the fact that 
the test framework calls CoreContainer.shutdown() directly.
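A minimal sketch of that "refuse all incoming requests, wait for in-flight operations, then shut down" idea, assuming a simple read/write-lock gate; this is not Solr code, just an illustration of the suggested mechanism:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Every operation takes the read lock; shutdown flips a flag (so new
// requests are refused) and then takes the write lock, which only succeeds
// once all in-flight readers have finished.
public class ShutdownGate {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private volatile boolean shutdown = false;

  /** Returns false (refusing the request) once shutdown has begun. */
  public boolean enter() {
    if (shutdown) return false;
    lock.readLock().lock();
    if (shutdown) {            // re-check: shutdown may have raced in
      lock.readLock().unlock();
      return false;
    }
    return true;
  }

  /** Every successful enter() must be paired with exit(). */
  public void exit() {
    lock.readLock().unlock();
  }

  /** Blocks until all in-flight operations have called exit(). */
  public void shutdown() {
    shutdown = true;           // stop admitting new operations
    lock.writeLock().lock();   // waits for all read locks to be released
    lock.writeLock().unlock();
  }

  public static void main(String[] args) {
    ShutdownGate gate = new ShutdownGate();
    System.out.println(gate.enter()); // true: operation admitted
    gate.exit();
    gate.shutdown();                  // nothing in flight, returns immediately
    System.out.println(gate.enter()); // false: new requests refused
  }
}
```

Unlike the AtomicInteger spin described in the issue, the write lock parks the shutdown thread instead of busy-waiting, though it shares the same deadlock risk if an operation re-enters the gate while holding it.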

> CoreContainer shutdown needs to be aware of other ongoing operations and wait 
> until they're complete
> 
>
> Key: SOLR-14861
> URL: https://issues.apache.org/jira/browse/SOLR-14861
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14861.patch
>
>
> Noble and I are trying to get to the bottom of the TestBulkSchemaConcurrent 
> failures and found what looks like a glaring gap in how 
> CoreContainer.shutdown operates. I don't know the impact on production since 
> we're shutting down anyway, but I think this is responsible for the errors in 
> TestBulkSchemaConcurrent and likely behind others, especially any other test 
> that fails intermittently that involves core reloads, including and 
> especially any tests that exercise managed schema.
> We have clear evidence of this sequence:
> 1> some CoreContainer.reloads come in and get _partway_ through, in 
> particular past the test at the top where CoreContainer.reload() throws an 
> AlreadyClosed exception if (isShutdown).
> 2> Some CoreContainer.shutdown() threads get some processing time before the 
> reloads in <1> are finished.
> 3> the threads in <1> pick back up and go wonky. I suspect that there are a 
> number of different things that could be going wrong here depending on how 
> far through CoreContainer.shutdown() gets that pop out in different ways.
> Since it's my shift (Noble has to sleep sometime), I put some crude locking 
> in just to test the idea; incrementing an AtomicInteger on entry to 
> CoreContainer.reload then decrementing it at the end, and spinning in 
> CoreContainer.shutdown() until the AtomicInteger was back to zero. With that 
> in place, 100 runs and no errors whereas before I could never get even 10 
> runs to finish without an error. This is not a proper fix at all, and the way 
> it's currently running there are still possible race conditions, just much 
> smaller windows. And I suspect it risks spinning forever. But it's enough to 
> make me believe I finally understand what's happening.
> I also suspect that reload is more sensitive than most operations on a core 
> due to the fact that it runs for a long time, but I assume other operations 
> have the same potential. Shouldn't CoreContainer.shutDown() wait until no 
> other operations are in flight?
> On a quick scan of CoreContainer, there are actually few places where we even 
> check for isShutdown, I suspect the places we do are ad-hoc that we've found 
> by trial-and-error when tests fail. We need a design rather than hit-or-miss 
> hacking.
> I think that isShutdown should be replaced with something more robust. What 
> that is IDK quite yet because I've been hammering at this long enough and I 
> need a break.
> This is consistent with another observation about this particular test. If 
> there's sleep at the end, it wouldn't fail; all the reloads get a chance to 
> finish before anything was shut down.
> An open question how much this matters to production systems. In the testing 
> case, bunches of these reloads are issued then we immediately end the test 
> and start shutting things down. It needs to be fixed if we're going to cut 
> down on test failures though. Besides, it's just wrong ;)
> Assigning to myself to track. I'd be perfectly happy, now that Noble and I 
> have done the hard work, for someone to swoop in and take the credit for 
> fixing it ;)
> gradlew beast -Ptests.dups=10 --tests TestBulkSchemaConcurrent
> always fails for me on current code without my hack...






[jira] [Commented] (SOLR-14151) Make schema components load from packages

2020-09-16 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196916#comment-17196916
 ] 

Erick Erickson commented on SOLR-14151:
---

[~tflobbe]  See SOLR-14861. Specifically, "is buggy" amounts to at least this
problem: when CoreContainer.shutdown is running, there's a variable "isShutdown"
in CoreContainer that's set, and we check for that in various other places,
specifically reload(), but there are a number of other places scattered all
through the code. The case Noble and I found was that CoreContainer.reload()
checks this variable at the top and gets past it.

Then some other thread calls shutdown before the reload is done, and the 
reloading thread is time-sliced out and the shutdown code executes for a while. 
Then that thread is time-sliced out and the reload picks up, but by now the 
state of the container is such that the reload can't continue.

The problem manifested itself as suite-level unreleased-object failures; the
actual test succeeded. That said, there are certainly other ways this kind of
thing could manifest itself. IDK whether the tests you mentioned have the same
problem or not, but it'd be likely if the failures are unreleased objects.

I attached a patch to that Jira that I started looking at (it's in horrible
shape, but if I ever pick that Jira up again I wanted to have it handy to
remember lessons learned about why this approach is probably bad) that tries to
use a reentrant lock to make sure no other CoreContainer operations are
in flight when we shut down or load. It led to a bunch of deadlocks.

Besides, that approach is all about CoreContainer operations; there are places
outside CoreContainer that check CoreContainer.isShutdown and potentially have
the same problem.

The particular scenario was that the test did something that caused a reload,
_then immediately terminated_, which started the shutdown process, so it's
somewhat artificial. Even just putting a delay at the end of the test before it
terminated the test class completely cured the problem for that particular
test. Of course that's not a fix, but it is evidence for the diagnosis.

So basically I punted. Introducing the locking in CoreContainer has a lot of
potential for deadlocks; besides, when I saw the other parts of the code that
tested CoreContainer.isShutdown, I realized it's more widespread. Beyond that,
I'm not sure how important this is in production when weighed against the
potential for deadlock; in this particular case it only manifested itself
because the test was shutting down so quickly.

I think we need a way for shutdown to somehow cause Solr to start refusing 
_all_ incoming requests, wait until all in-flight operations are complete, and 
then start shutting down. The approach in the patch is too local, even if it 
would work. I'd love suggestions here. And this is exacerbated by the fact that 
the test framework calls CoreContainer.shutdown() directly...

> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the SolrCore object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as that of the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using schema API






[jira] [Commented] (LUCENE-9510) SortingStoredFieldsConsumer should use a format that has better random-access

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196902#comment-17196902
 ] 

ASF subversion and git services commented on LUCENE-9510:
-

Commit c3bdc006a292392ec5dffd57298426bf731c9887 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c3bdc00 ]

LUCENE-9510: Don't compress temporary stored fields and term vectors when index 
sorting is enabled. (#1874)

When index sorting is enabled, stored fields and term vectors can't be
written on the fly like in the normal case, so they are written into
temporary files that then get resorted. For these temporary files,
disabling compression speeds up indexing significantly.

On a synthetic test that indexes stored fields and a doc value field
populated with random values that is used for index sorting, this
resulted in a 3x indexing speedup.
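A toy cost model (assumed numbers, not measured from Lucene) of why compression is pure overhead for these temporary files: each byte is written once, read back once during the resort, then the file is deleted, so the codec work never amortizes into storage savings:

```java
// Per-byte cost of a write-once, read-once temporary file. Compression adds
// compress + decompress work on top of the unavoidable write + read, with no
// lasting storage benefit since the file is deleted after the resort.
public class TempFileCost {
  static double costPerByte(boolean compress, double ioCost, double codecCost) {
    return compress ? 2 * ioCost + 2 * codecCost : 2 * ioCost;
  }

  public static void main(String[] args) {
    // With codec work comparable to I/O, skipping compression halves the cost.
    System.out.println(costPerByte(true, 1.0, 1.0));  // 4.0
    System.out.println(costPerByte(false, 1.0, 1.0)); // 2.0
  }
}
```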


> SortingStoredFieldsConsumer should use a format that has better random-access
> -
>
> Key: LUCENE-9510
> URL: https://issues.apache.org/jira/browse/LUCENE-9510
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We noticed some indexing rate regressions in Elasticsearch after upgrading to 
> a new Lucene snapshot. This is due to the fact that 
> SortingStoredFieldsConsumer is using the default codec to write stored fields 
> on flush. Compression doesn't matter much for this case since these are 
> temporary files that get removed on flush after the segment is sorted anyway 
> so we could switch to a format that has faster random access.






[jira] [Commented] (LUCENE-9510) SortingStoredFieldsConsumer should use a format that has better random-access

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196911#comment-17196911
 ] 

ASF subversion and git services commented on LUCENE-9510:
-

Commit c3bdc006a292392ec5dffd57298426bf731c9887 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c3bdc00 ]

LUCENE-9510: Don't compress temporary stored fields and term vectors when index 
sorting is enabled. (#1874)

When index sorting is enabled, stored fields and term vectors can't be
written on the fly like in the normal case, so they are written into
temporary files that then get resorted. For these temporary files,
disabling compression speeds up indexing significantly.

On a synthetic test that indexes stored fields and a doc value field
populated with random values that is used for index sorting, this
resulted in a 3x indexing speedup.


> SortingStoredFieldsConsumer should use a format that has better random-access
> -
>
> Key: LUCENE-9510
> URL: https://issues.apache.org/jira/browse/LUCENE-9510
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We noticed some indexing rate regressions in Elasticsearch after upgrading to 
> a new Lucene snapshot. This is due to the fact that 
> SortingStoredFieldsConsumer is using the default codec to write stored fields 
> on flush. Compression doesn't matter much for this case since these are 
> temporary files that get removed on flush after the segment is sorted anyway 
> so we could switch to a format that has faster random access.






[jira] [Commented] (LUCENE-9525) Better handle small documents with the new Lucene87StoredFieldsFormat

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196912#comment-17196912
 ] 

ASF subversion and git services commented on LUCENE-9525:
-

Commit 9cd3af50f8093ddf9c70c90fa7cc8e1103ecabb7 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9cd3af5 ]

LUCENE-9525: Better handle small documents with Lucene87StoredFieldsFormat. 
(#1876)

Instead of configuring a dictionary size and a block size, the format
now tries to have 10 sub blocks per bigger block, and adapts the size of
the dictionary and of the sub blocks to this overall block size.
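A hypothetical sketch of that adaptive sizing, assuming the preset dictionary gets one sub block's share of the total block; Lucene's exact formula may differ, this only illustrates the idea of scaling both sizes to the block instead of fixing them:

```java
// Given a total block size, target roughly 10 sub blocks and size the
// dictionary proportionally, so tiny blocks of small documents no longer
// spend most of their budget training a 4096-byte dictionary.
public class AdaptiveBlockSizing {
  static final int NUM_SUB_BLOCKS = 10;

  /** Dictionary size: one sub block's worth, i.e. 1/10th of the block. */
  static int dictLength(int totalBlockSize) {
    return totalBlockSize / NUM_SUB_BLOCKS;
  }

  /** Sub block size: the remainder split across the sub blocks. */
  static int subBlockLength(int totalBlockSize) {
    return (totalBlockSize - dictLength(totalBlockSize)) / NUM_SUB_BLOCKS;
  }

  public static void main(String[] args) {
    // For the 5120-byte block of tiny documents from the issue description,
    // the dictionary shrinks from a fixed 4096 bytes to 512 bytes.
    System.out.println(dictLength(5120));
    System.out.println(subBlockLength(5120));
  }
}
```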

> Better handle small documents with the new Lucene87StoredFieldsFormat
> -
>
> Key: LUCENE-9525
> URL: https://issues.apache.org/jira/browse/LUCENE-9525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Stored fields configure a maximum number of fields per block, whose goal is 
> to make sure that you don't decompress more than X documents to get access to 
> a single one. However this has interesting effects with the new format.
> For instance we use 4kB of dictionary and blocks of 60kB for at most 512 
> documents per block. So if your documents are very small, say 10 bytes, the 
> block will be 5120 bytes overall, and we'll first compress 4096 bytes 
> independently, and then 5120-4096=1024 bytes with 4096 bytes of dictionary. 
> In this case training the dictionary takes more time than actually 
> compressing the data, and it's not even sure it's worth it since only 1024 
> bytes out of the 5120 bytes of the block get compressed with a preset 
> dictionary.
> I'm considering adapting the dictionary size and the block size to the total 
> block size in order to better handle such cases.






[jira] [Commented] (LUCENE-9525) Better handle small documents with the new Lucene87StoredFieldsFormat

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196903#comment-17196903
 ] 

ASF subversion and git services commented on LUCENE-9525:
-

Commit 9cd3af50f8093ddf9c70c90fa7cc8e1103ecabb7 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9cd3af5 ]

LUCENE-9525: Better handle small documents with Lucene87StoredFieldsFormat. 
(#1876)

Instead of configuring a dictionary size and a block size, the format
now tries to have 10 sub blocks per bigger block, and adapts the size of
the dictionary and of the sub blocks to this overall block size.

> Better handle small documents with the new Lucene87StoredFieldsFormat
> -
>
> Key: LUCENE-9525
> URL: https://issues.apache.org/jira/browse/LUCENE-9525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Stored fields configure a maximum number of fields per block, whose goal is 
> to make sure that you don't decompress more than X documents to get access to 
> a single one. However this has interesting effects with the new format.
> For instance we use 4kB of dictionary and blocks of 60kB for at most 512 
> documents per block. So if your documents are very small, say 10 bytes, the 
> block will be 5120 bytes overall, and we'll first compress 4096 bytes 
> independently, and then 5120-4096=1024 bytes with 4096 bytes of dictionary. 
> In this case training the dictionary takes more time than actually 
> compressing the data, and it's not even clear it's worth it since only 1024 
> bytes out of the 5120 bytes of the block get compressed with a preset 
> dictionary.
> I'm considering adapting the dictionary size and the block size to the total 
> block size in order to better handle such cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Commented] (SOLR-14872) unable to restore solr backup

2020-09-16 Thread Anil Paladugu (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196885#comment-17196885
 ] 

Anil Paladugu commented on SOLR-14872:
--

Can you please give me a resolution for my problem?

> unable to restore solr backup 
> --
>
> Key: SOLR-14872
> URL: https://issues.apache.org/jira/browse/SOLR-14872
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Backup/Restore
>Affects Versions: 7.0.1
> Environment: prod
>Reporter: Anil Paladugu
>Priority: Blocker
>  Labels: restore
>
> Hey, I'm unable to restore a backup into a collection. I'm getting an error 
> like the one below:
> [http://localhost:8983/solr/admin/collections?action=RESTORE=CountryCodes=/opt/solr=concodes=2=2|https://qa_solr.voltusone.com/solr/admin/collections?action=RESTORE=CountryCodes=/opt/solr=concodes=2=2]
> result:
> { "responseHeader":\{ "status":0, "QTime":2}, "Operation restore caused 
> exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>  Solr cloud with available number of nodes:2 is insufficient for restoring a 
> collection with 2 shards, total replicas per shard 6 and maxShardsPerNode -1. 
> Consider increasing maxShardsPerNode value OR number of available nodes.", 
> "exception":\{ "msg":"Solr cloud with available number of nodes:2 is 
> insufficient for restoring a collection with 2 shards, total replicas per 
> shard 6 and maxShardsPerNode -1. Consider increasing maxShardsPerNode value 
> OR number of available nodes.", "rspCode":400}, "status":\{ "state":"failed", 
> "msg":"found [1000] in failed tasks"}}
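The error above is essentially a capacity check: all replicas of all shards must fit on the available nodes, with at most maxShardsPerNode cores per node. A back-of-the-envelope version (the method name and formula below are illustrative, not Solr's actual implementation):

```java
// Rough capacity check behind the restore error: a restore needs room for
// shards * replicasPerShard cores across the cluster. Illustrative only.
public class RestoreCapacity {

    static int nodesNeeded(int shards, int replicasPerShard, int maxShardsPerNode) {
        int totalCores = shards * replicasPerShard;
        // ceiling division: partially filled nodes still count
        return (totalCores + maxShardsPerNode - 1) / maxShardsPerNode;
    }

    public static void main(String[] args) {
        // the failing request: 2 shards with 6 total replicas per shard
        // means 12 cores; with only 2 nodes, maxShardsPerNode must be
        // raised to at least 6 for the restore to fit
        System.out.println(nodesNeeded(2, 6, 6));
        System.out.println(nodesNeeded(2, 6, 1));
    }
}
```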






[jira] [Resolved] (SOLR-14872) unable to restore solr backup

2020-09-16 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-14872.
---
Resolution: Invalid

Please raise questions like this on the user's list; we try to reserve JIRAs 
for known bugs/enhancements rather than usage questions. The JIRA system is not 
a support portal.

See: 
http://lucene.apache.org/solr/community.html#mailing-lists-irc there are links 
to both Lucene and Solr mailing lists there.

A _lot_ more people will see your question on that list and may be able to help 
more quickly.

You might want to review: 
https://wiki.apache.org/solr/UsingMailingLists

If it's determined that this really is a code issue or enhancement to Lucene or 
Solr and not a configuration/usage problem, we can raise a new JIRA or reopen 
this one.

> unable to restore solr backup 
> --
>
> Key: SOLR-14872
> URL: https://issues.apache.org/jira/browse/SOLR-14872
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Backup/Restore
>Affects Versions: 7.0.1
> Environment: prod
>Reporter: Anil Paladugu
>Priority: Blocker
>  Labels: restore
>
> Hey, I'm unable to restore a backup into a collection. I'm getting an error 
> like the one below:
> [http://localhost:8983/solr/admin/collections?action=RESTORE=CountryCodes=/opt/solr=concodes=2=2|https://qa_solr.voltusone.com/solr/admin/collections?action=RESTORE=CountryCodes=/opt/solr=concodes=2=2]
> result:
> { "responseHeader":\{ "status":0, "QTime":2}, "Operation restore caused 
> exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>  Solr cloud with available number of nodes:2 is insufficient for restoring a 
> collection with 2 shards, total replicas per shard 6 and maxShardsPerNode -1. 
> Consider increasing maxShardsPerNode value OR number of available nodes.", 
> "exception":\{ "msg":"Solr cloud with available number of nodes:2 is 
> insufficient for restoring a collection with 2 shards, total replicas per 
> shard 6 and maxShardsPerNode -1. Consider increasing maxShardsPerNode value 
> OR number of available nodes.", "rspCode":400}, "status":\{ "state":"failed", 
> "msg":"found [1000] in failed tasks"}}






[jira] [Commented] (LUCENE-9525) Better handle small documents with the new Lucene87StoredFieldsFormat

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196880#comment-17196880
 ] 

ASF subversion and git services commented on LUCENE-9525:
-

Commit ad71bee0161cd52dba73f866c897e88fde2639a4 in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ad71bee ]

LUCENE-9525: Better handle small documents with Lucene87StoredFieldsFormat. 
(#1876)

Instead of configuring a dictionary size and a block size, the format
now tries to have 10 sub blocks per bigger block, and adapts the size of
the dictionary and of the sub blocks to this overall block size.

> Better handle small documents with the new Lucene87StoredFieldsFormat
> -
>
> Key: LUCENE-9525
> URL: https://issues.apache.org/jira/browse/LUCENE-9525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Stored fields configure a maximum number of documents per block, whose goal is 
> to make sure that you don't decompress more than X documents to get access to 
> a single one. However this has interesting effects with the new format.
> For instance we use 4kB of dictionary and blocks of 60kB for at most 512 
> documents per block. So if your documents are very small, say 10 bytes, the 
> block will be 5120 bytes overall, and we'll first compress 4096 bytes 
> independently, and then 5120-4096=1024 bytes with 4096 bytes of dictionary. 
> In this case training the dictionary takes more time than actually 
> compressing the data, and it's not even clear it's worth it since only 1024 
> bytes out of the 5120 bytes of the block get compressed with a preset 
> dictionary.
> I'm considering adapting the dictionary size and the block size to the total 
> block size in order to better handle such cases.






[GitHub] [lucene-solr] jpountz merged pull request #1876: LUCENE-9525: Better handle small documents with Lucene87StoredFieldsFormat.

2020-09-16 Thread GitBox


jpountz merged pull request #1876:
URL: https://github.com/apache/lucene-solr/pull/1876


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (LUCENE-9510) SortingStoredFieldsConsumer should use a format that has better random-access

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196879#comment-17196879
 ] 

ASF subversion and git services commented on LUCENE-9510:
-

Commit 93094ef7e4470dd9f0ade3a3d8403548729a4609 in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=93094ef ]

LUCENE-9510: Don't compress temporary stored fields and term vectors when index 
sorting is enabled. (#1874)

When index sorting is enabled, stored fields and term vectors can't be
written on the fly like in the normal case, so they are written into
temporary files that then get resorted. For these temporary files,
disabling compression speeds up indexing significantly.

On a synthetic test that indexes stored fields and a doc value field
populated with random values that is used for index sorting, this
resulted in a 3x indexing speedup.

> SortingStoredFieldsConsumer should use a format that has better random-access
> -
>
> Key: LUCENE-9510
> URL: https://issues.apache.org/jira/browse/LUCENE-9510
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We noticed some indexing rate regressions in Elasticsearch after upgrading to 
> a new Lucene snapshot. This is due to the fact that 
> SortingStoredFieldsConsumer is using the default codec to write stored fields 
> on flush. Compression doesn't matter much for this case since these are 
> temporary files that get removed on flush after the segment is sorted anyway, 
> so we could switch to a format that has faster random access.






[GitHub] [lucene-solr] jpountz merged pull request #1874: LUCENE-9510: Don't compress temporary stored fields and term vectors when index sorting is enabled.

2020-09-16 Thread GitBox


jpountz merged pull request #1874:
URL: https://github.com/apache/lucene-solr/pull/1874


   






[GitHub] [lucene-solr] murblanc commented on pull request #1845: SOLR-14613: Autoscaling replacement using placement plugins

2020-09-16 Thread GitBox


murblanc commented on pull request #1845:
URL: https://github.com/apache/lucene-solr/pull/1845#issuecomment-693315053


   > > I need more guidance.
   > 
   > Add your APIs 
[here](https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/ClusterAPI.java)
   
   
   Thanks!






[GitHub] [lucene-solr] jpountz commented on a change in pull request #1866: LUCENE-9523: Speed up query shapes for geometries that generate multiple points

2020-09-16 Thread GitBox


jpountz commented on a change in pull request #1866:
URL: https://github.com/apache/lucene-solr/pull/1866#discussion_r489311556



##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, 
final Weight weight, fin
 final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
 return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
   }
-  final DocIdSetBuilder docIdSetBuilder = new 
DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-  values.intersect(getSparseVisitor(query, docIdSetBuilder));
-  final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-  return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+  if (values.getDocCount() << 2 < values.size()) {
+// we use a dense structure so we can skip already visited documents
+final FixedBitSet result = new FixedBitSet(reader.maxDoc());
+final long[] cost = new long[]{reader.maxDoc()};

Review comment:
   We just need one long?
   ```suggestion
   final long[] cost = new long[]{1};
   ```

##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, 
final Weight weight, fin
 final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
 return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
   }
-  final DocIdSetBuilder docIdSetBuilder = new 
DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-  values.intersect(getSparseVisitor(query, docIdSetBuilder));
-  final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-  return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+  if (values.getDocCount() << 2 < values.size()) {

Review comment:
   I think we should be careful with overflows, maybe divide instead of 
multiplying, ie.
   ```suggestion
 if (values.getDocCount() < (values.size() >>> 2)) {
   ```
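The overflow this suggestion guards against can be shown with concrete (made-up) numbers: `docCount << 2` is a 32-bit int shift, so it can wrap around before the value is widened to long for the comparison with `values.size()`:

```java
// Demonstrates why dividing size is safer than shifting docCount left:
// the shift happens in int arithmetic and wraps before the comparison
// against the long size. The counts below are hypothetical.
public class ShiftOverflow {
    public static void main(String[] args) {
        int docCount = 1 << 30;  // ~1.07 billion documents (hypothetical)
        long size = 2L << 30;    // 2^31 points, so 4 * docCount >= size

        // buggy form: (1 << 30) << 2 wraps to 0 in int, so the test
        // wrongly concludes that 4 * docCount is smaller than size
        System.out.println((docCount << 2) < size);

        // safe form from the suggestion: shift the long operand instead
        System.out.println(docCount < (size >>> 2));
    }
}
```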

##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, 
final Weight weight, fin
 final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
 return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
   }
-  final DocIdSetBuilder docIdSetBuilder = new 
DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-  values.intersect(getSparseVisitor(query, docIdSetBuilder));
-  final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-  return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+  if (values.getDocCount() << 2 < values.size()) {
+// we use a dense structure so we can skip already visited documents
+final FixedBitSet result = new FixedBitSet(reader.maxDoc());

Review comment:
   I wonder if we should use SparseFixedBitSet to avoid allocating so much 
memory at once.

##
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##
@@ -340,7 +351,8 @@ public Relation compare(byte[] minTriangle, byte[] 
maxTriangle) {
 };
   }
 
-  /** create a visitor that adds documents that match the query using a sparse 
bitset. (Used by INTERSECT) */
+  /** create a visitor that adds documents that match the query using a sparse 
bitset. (Used by INTERSECT
+   * when the number of points <= 4 * number of docs ) */

Review comment:
   ```suggestion
  * when the number of docs <= 4 * number of points ) */
   ```








[jira] [Commented] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196826#comment-17196826
 ] 

ASF subversion and git services commented on LUCENE-9486:
-

Commit 78b8a0ae39fd7fe1d349edd4f6b1b946df1fd759 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=78b8a0a ]

LUCENE-9486: Use ByteBuffersDataOutput to collect data like on master.


> Explore using preset dictionaries with LZ4 for stored fields
> 
>
> Key: LUCENE-9486
> URL: https://issues.apache.org/jira/browse/LUCENE-9486
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Follow-up of LUCENE-9447: using preset dictionaries with DEFLATE provided 
> very significant gains. Adding support for preset dictionaries with LZ4 would 
> be easy so let's give it a try?






[jira] [Commented] (LUCENE-9445) Expose new case insensitive RegExpQuery support in QueryParser

2020-09-16 Thread Mark Harwood (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196793#comment-17196793
 ] 

Mark Harwood commented on LUCENE-9445:
--

We will likely have to break existing parser behaviour to add this feature.

The existing parser impl allows regex clauses and other clauses to appear next 
to each other without a space e.g.

{{  /foo/bar}}

is interpreted as a regex query "foo" OR term query "bar".

If we want to introduce /Foo/i as syntax for a case insensitive regex then we 
will need to insist on a space between regexes and other search terms to 
cleanly separate any regex flags from other search terms. This will require 
throwing an error if we encounter any text other than "i" after the closing /. 
It will still be legal to have operators like ) for boolean logic or ^ for 
boosts immediately after the regex closing slash but any other tokens will 
cause an error. I have discussed with Jim Ferenczi the idea of a flag to 
control what happens in an error situation but we can't easily revert to 
parsing the offending string in a BWC way and it's not ideal to silently drop 
it either. Always throwing an error seems like the only viable option.

A repercussion of stricter parsing is that sloppy searches like a cut-and-paste 
URL or file path would now throw an error. For example, these would fail 
loudly:

{{  [http://foo.com/bar]}}
 {{  /mydrive/myfolder/myfile}}

 

The question is whether this new regex syntax warrants the breaking change it 
introduces or, more broadly, what the BWC policy is for making changes to the 
query parser.

> Expose new case insensitive RegExpQuery support in QueryParser
> --
>
> Key: LUCENE-9445
> URL: https://issues.apache.org/jira/browse/LUCENE-9445
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Mark Harwood
>Assignee: Mark Harwood
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> LUCENE-9386 added a case insensitive matching option to RegExpQuery.
> This proposal is to extend the QueryParser syntax to allow for an optional 
> `i` (case Insensitive) flag to appear on the end of regular expressions e.g. 
> /Foo/i
>  
> This is regex syntax supported by a number of programming languages.
>  
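As a sketch of the proposed token shape (this is not the Lucene query parser, and `java.util.regex` stands in for Lucene's own regex dialect), a `/body/flags` token could be split so that only a trailing `i` is accepted and any other trailing text throws, as discussed above:

```java
import java.util.regex.Pattern;

// Illustrative parser for a "/regex/" token with an optional trailing "i"
// (case-insensitive) flag; any other trailing text is rejected with an
// error, matching the stricter-parsing proposal in the comment.
public class SlashRegexToken {

    static Pattern parse(String token) {
        int close = token.lastIndexOf('/');
        if (token.length() < 2 || token.charAt(0) != '/' || close == 0) {
            throw new IllegalArgumentException("not a /regex/ token: " + token);
        }
        String body = token.substring(1, close);
        String flags = token.substring(close + 1);
        if (flags.isEmpty()) {
            return Pattern.compile(body);
        } else if (flags.equals("i")) {
            return Pattern.compile(body, Pattern.CASE_INSENSITIVE);
        }
        throw new IllegalArgumentException("unsupported regex flags: " + flags);
    }

    public static void main(String[] args) {
        System.out.println(parse("/Foo/i").matcher("foo").matches());
        System.out.println(parse("/Foo/").matcher("foo").matches());
    }
}
```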






[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-09-16 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196792#comment-17196792
 ] 

Dawid Weiss commented on SOLR-14870:


Right. I'm really not following those documentation bits and I can't dig into 
this now. Maybe later, given time.

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is currently linked to from the ref guide)






[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1845: SOLR-14613: Autoscaling replacement using placement plugins

2020-09-16 Thread GitBox


noblepaul edited a comment on pull request #1845:
URL: https://github.com/apache/lucene-solr/pull/1845#issuecomment-693216668


   >I need more guidance.
   
   
   Add your APIs 
[here](https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/ClusterAPI.java)
 






[GitHub] [lucene-solr] noblepaul merged pull request #1878: SOLR-14871 Use Annotations for v2 APIs in/cluster path

2020-09-16 Thread GitBox


noblepaul merged pull request #1878:
URL: https://github.com/apache/lucene-solr/pull/1878


   






[jira] [Commented] (SOLR-14871) Use Annotations for v2 APIs in/cluster path

2020-09-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196768#comment-17196768
 ] 

ASF subversion and git services commented on SOLR-14871:


Commit 7b8e72e5531f3678242e1106d528ec835ac33959 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7b8e72e ]

SOLR-14871 Use Annotations for v2 APIs in/cluster path (#1878)



> Use Annotations for v2 APIs in/cluster path
> ---
>
> Key: SOLR-14871
> URL: https://issues.apache.org/jira/browse/SOLR-14871
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have custom json specs for each of these APIs. With the annotation 
> framework , it can be made simple and readable and we can eliminate a lot of 
> code






[jira] [Comment Edited] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-09-16 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196766#comment-17196766
 ] 

Uwe Schindler edited comment on SOLR-14870 at 9/16/20, 8:05 AM:


Actually the links refguide -> javadocs should set up a link to the "global" 
javadocs. The local "javadocs" (as packaged into Maven Javadocs-JAR files) are 
not involved here. Maybe we have some communication problem here (terms and 
semantics).

About [~dweiss]'s comments:

bq. this should not be the case. If docs are to be rendered with local links 
they should be rendered separately (different task, different outputs). This 
ensures caches (up-to-date) are working correctly. That global "local.javadocs" 
should be removed, in other words.

I agree! The problem here is manifold:

- The global javadocs (which should be a separate subproject for packaging) 
should have this externally configurable link (so Jenkins or the release manager 
can adapt the links) -> see below, here the links are fine
- The local Javadocs (which are packaged into Maven javadocs-JARS and that are 
local to project folders) should not have any outside links, because they are 
always viewed in isolation (e.g., inside IDE). I'd just disable all 
inter-project links for them. Then there's also no need to check links. There 
should not even be a link between Lucene subprojects.

bq. Our "global" javadocs are rendered conditionally too (see 
documentation.gradle)

That's fine, as it's a constant for the whole build. When it changes, the inputs 
of renderJavadocs changes and all files are regenerated. You can easily test 
this by passing another javadoc URL with {{-P}} and the documentation is 
regenerated. I tested this several times when I implemented this.


was (Author: thetaphi):
Actually the links refguide -> javadocs should setup a link to the "global" 
javadocs. The local "javadocs" (as packaged into Maven Javadocs-JAR files) are 
not in the deal. Maybe we have some communication problem here (terms and 
semantic).

About [~dweiss]'s comments:

bq. this should not be the case. If docs are to be rendered with local links 
they should be rendered separately (different task, different outputs). This 
ensures caches (up-to-date) are working correctly. That global "local.javadocs" 
should be removed, in other words.

I agree! The problem here is manifold:

- The global javadocs (which should be a separate subproject for packaging) 
should have this externally configureable link (so Jenkins or release manager 
can adapt the links) -> see below, here the links are fine
- The local Javadocs (which are packaged into Maven javadocs-JARS and that are 
local to project folders) whoulc not have any outside links, because they are 
always viewed in isolation (e.g., inside IDE). I'd just disable all 
inter-project links for them. Then there's also no need to check links. There 
should not even be a link between Lucene subprojects.

bq. Our "global" javadocs are rendered conditionally too (see 
documentation.gradle)

That's fine, as its a constant for the whole build. When it changes, the inputs 
of renderJavadocs changes and all files are regenerated. You can easily test 
this by passing another javadoc URL with {{-P}} and the documentation is 
regenerated. I tested this several times when I implemented this.

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is currently linked to from the ref guide)





[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-09-16 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196766#comment-17196766
 ] 

Uwe Schindler commented on SOLR-14870:
--

Actually the links refguide -> javadocs should set up a link to the "global" 
javadocs. The local "javadocs" (as packaged into Maven Javadocs-JAR files) are 
not involved here. Maybe we have some communication problem here (terms and 
semantics).

About [~dweiss]'s comments:

bq. this should not be the case. If docs are to be rendered with local links 
they should be rendered separately (different task, different outputs). This 
ensures caches (up-to-date) are working correctly. That global "local.javadocs" 
should be removed, in other words.

I agree! The problem here is manifold:

- The global javadocs (which should be a separate subproject for packaging) 
should have this externally configurable link (so Jenkins or the release manager 
can adapt the links) -> see below, here the links are fine
- The local Javadocs (which are packaged into Maven javadocs-JARS and that are 
local to project folders) should not have any outside links, because they are 
always viewed in isolation (e.g., inside IDE). I'd just disable all 
inter-project links for them. Then there's also no need to check links. There 
should not even be a link between Lucene subprojects.

bq. Our "global" javadocs are rendered conditionally too (see 
documentation.gradle)

That's fine, as it's a constant for the whole build. When it changes, the inputs 
of renderJavadocs changes and all files are regenerated. You can easily test 
this by passing another javadoc URL with {{-P}} and the documentation is 
regenerated. I tested this several times when I implemented this.

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is currently linked to from the ref guide)





