[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2020-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035984#comment-17035984
 ] 

Jan Høydahl commented on LUCENE-8987:
-

Fixed broken links to the core system requirements page (missing core/ in the URL path)

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>
> INFRA just enabled [a new way of configuring website 
> build|https://s.apache.org/asfyaml] from a git branch, [see dev list 
> email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
>  It allows for automatic builds of both staging and production site, much 
> like the old CMS. We can choose to auto publish the html content of an 
> {{output/}} folder, or to have a bot build the site using 
> [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for 
> [http://lucene.apache.org|http://lucene.apache.org/] by creating a new 
> git repo {{lucene-site}}, copying the site over from svn, seeing if it can 
> be "Pelicanized" easily, and then testing staging. Benefits are that more people 
> will be able to edit the web site and we can take PRs from the public (with 
> GitHub preview of pages).
> Non-goals:
>  * Create a new web site or a new graphic design
>  * Change from Markdown to Asciidoc
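Editor's note: for readers unfamiliar with the mechanism referenced above, the site build is driven by a {{.asf.yaml}} file in the repository root. A rough, illustrative sketch follows; the key names and branch names here are assumptions, and the authoritative schema is the linked s.apache.org/asfyaml documentation:

```yaml
# Illustrative sketch only -- consult the asf.yaml docs for the real schema.
staging:
  profile: ~            # build every matching branch to a *.staged.apache.org site
  whoami: asf-staging   # hypothetical branch name used for the staging site
publish:
  whoami: asf-site      # hypothetical branch whose generated HTML becomes production
```

With a layout like this, pushing to the staging branch updates lucene.staged.apache.org while the production site only changes when the publish branch does.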



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14259) backport SOLR-14013 to Solr 7.7

2020-02-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035864#comment-17035864
 ] 

Noble Paul commented on SOLR-14259:
---

True. We will not have a 7.8 release

> backport SOLR-14013 to Solr 7.7
> ---
>
> Key: SOLR-14259
> URL: https://issues.apache.org/jira/browse/SOLR-14259
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 7.7.3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>







[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2020-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035813#comment-17035813
 ] 

Jan Høydahl commented on LUCENE-8987:
-

Ok, I tried to disable the plugin `md_inline_extension` 
([https://github.com/apache/lucene-site/commit/26bf54c2e14c6d134cebe3faa74d965eff31683d])
 and now the site builds. I diffed the output folder with and without the 
extension and saw no difference, so I don't think we rely on it for anything. 
[~adamwalz] do you know why it is there?

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>






[jira] [Created] (SOLR-14260) Make SchemaRegistryProvider pluggable in HttpClientUtil

2020-02-12 Thread Andy Throgmorton (Jira)
Andy Throgmorton created SOLR-14260:
---

 Summary: Make SchemaRegistryProvider pluggable in HttpClientUtil
 Key: SOLR-14260
 URL: https://issues.apache.org/jira/browse/SOLR-14260
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Reporter: Andy Throgmorton


HttpClientUtil.java defines and uses an abstract SchemaRegistryProvider for 
mapping a protocol to an Apache ConnectionSocketFactory. There is only one 
implementation of this abstract class (outside of test cases). Currently, it is 
not overridable at runtime.

This PR adds the ability to override the registry provider at runtime, using 
the class name value provided by "solr.schema.registry.provider", similar to 
how this class allows for choosing the HttpClientBuilderFactory at runtime.

We've implemented a custom mTLS solution in Solr (which uses a custom SSL 
context). This change helps us more easily configure Solr in a modular way, 
since we've implemented a custom SchemaRegistryProvider that configures Apache 
clients to use our SSL context.
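Editor's note: as a minimal sketch of the runtime-override pattern described above (the property name and class names below are hypothetical stand-ins, not the actual Solr code; the real patch defines its own in HttpClientUtil):

```java
// Sketch of loading a pluggable provider from a system property, with a
// fallback to the built-in default when no override is configured.
public class ProviderLoader {

  /** Stand-in for Solr's abstract SchemaRegistryProvider. */
  public static abstract class Provider {
    public abstract String name();
  }

  /** Fallback implementation used when no override is configured. */
  public static class DefaultProvider extends Provider {
    @Override public String name() { return "default"; }
  }

  /** Instantiate the class named by the system property, else fall back. */
  public static Provider load(String propertyName) {
    String className = System.getProperty(propertyName);
    if (className == null) {
      return new DefaultProvider();
    }
    try {
      // Reflectively construct the user-supplied class; it must have a
      // public no-arg constructor and extend Provider.
      return (Provider) Class.forName(className)
          .getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot load provider " + className, e);
    }
  }
}
```

This mirrors how HttpClientUtil already selects an HttpClientBuilderFactory by class name at startup.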






[jira] [Commented] (LUCENE-9034) Officially publish the new site

2020-02-12 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035776#comment-17035776
 ] 

Uwe Schindler commented on LUCENE-9034:
---

Yes, let's release the site now. I have some time tomorrow. I can contact infra 
on slack to manage the switch.

After that I will take care of cleaning up svn with them, to get rid of the (by 
then) outdated clone. I will possibly also shuffle the docs folders to their 
final location and test everything in production.

> Officially publish the new site
> ---
>
> Key: LUCENE-9034
> URL: https://issues.apache.org/jira/browse/LUCENE-9034
> Project: Lucene - Core
>  Issue Type: Sub-task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> Publishing the web site means creating a publish branch and adding the right 
> magic instructions to {{.asf.yml}} etc. This will then publish the new site 
> and disable old CMS.
> Before we do that we should
>  # Make sure all docs and release tools are updated for new site publishing 
> instructions
>  # Create a PR with latest changes in old CMS site since the export. This 
> will be the changes done during 8.3.0 release and possibly some news entries 
> related to security issues etc.
> After publishing we should ask INFRA to make old site svn read-only (and 
> perhaps do a commit that replaces svn content with a README.txt), so it is 
> obvious for everyone that we have migrated.






[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2020-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035773#comment-17035773
 ] 

Jan Høydahl commented on LUCENE-8987:
-

I pushed a change to the site but buildbot failed to build it, see 
[https://ci2.apache.org/#/builders/3/builds/366/steps/2/logs/stdio]

I don't know why this suddenly happens now and not before. I flagged it on 
INFRA Slack and hope they look into it. The really bad thing is that the site 
is published even if the Pelican build failed, leaving a non-working staging 
website :( 

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>






[jira] [Created] (LUCENE-9223) Add Apache license headers

2020-02-12 Thread Jira
Jan Høydahl created LUCENE-9223:
---

 Summary: Add Apache license headers
 Key: LUCENE-9223
 URL: https://issues.apache.org/jira/browse/LUCENE-9223
 Project: Lucene - Core
  Issue Type: Sub-task
Reporter: Jan Høydahl


All source files should probably have the license header. Now some have and 
others don't.






[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2020-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035762#comment-17035762
 ] 

Jan Høydahl commented on LUCENE-8987:
-

[~danmuzi] I just pushed a fix for the core/features.html page that you 
reported above as missing. I think we have addressed all your comments now. 
Really grateful for your review; let us know if you find other bugs in the new 
site before we push it to production.

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>






[jira] [Commented] (LUCENE-9034) Officially publish the new site

2020-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035738#comment-17035738
 ] 

Jan Høydahl commented on LUCENE-9034:
-

[~uschindler] and others watching: I can see no new or open issues regarding 
things that need fixing on the staged version of the site. Shall we proceed 
with the prod release and then tackle issues as they pop up? Better to do this 
now than to let another release go by before the switch.

> Officially publish the new site
> ---
>
> Key: LUCENE-9034
> URL: https://issues.apache.org/jira/browse/LUCENE-9034
> Project: Lucene - Core
>  Issue Type: Sub-task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>






[jira] [Resolved] (SOLR-14257) Keyword's not indexed or searchable

2020-02-12 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-14257.
---
Resolution: Not A Problem

What is your analysis chain? Have you looked at the admin/analysis page to see 
what's thrown away? Most likely the tokenizer you've specified (or are using by 
default) is throwing this away and you need to use a different tokenizer for 
the field.

 

That said, please raise questions like this on the user's list; we try to 
reserve JIRAs for known bugs/enhancements rather than usage questions.

See: 
http://lucene.apache.org/solr/community.html#mailing-lists-irc

A _lot_ more people will see your question on that list and may be able to help 
more quickly.

You might want to review: https://wiki.apache.org/solr/UsingMailingLists

If it's determined that this really is a code issue or enhancement to Solr and 
not a configuration/usage problem, we can raise a new JIRA or reopen this one.

 

> Keyword's not indexed or searchable
> ---
>
> Key: SOLR-14257
> URL: https://issues.apache.org/jira/browse/SOLR-14257
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Schema and Analysis
>Affects Versions: 7.6
>Reporter: Shae Bottum
>Priority: Major
>
> During indexing, if the value of your column is the literal char *, 
> Solr's tokenizer will pass over this value and not tokenize it. The value 
> then is not indexed and therefore not searchable. We need to make this 
> keyword searchable. I understand that to search it, you would need to add 
> quotes around the value * to ensure the asterisk is not treated as a 
> wildcard and return all documents. The use case is searching for the actual 
> value of an asterisk. 
>  
> The tokenizer works for "jo*n" or "j*n".
> The tokenizer does not work for "**". 
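Editor's note: a simplified model of the behavior described in the resolution above. This is not Lucene code; it just illustrates why a letter/digit-oriented tokenizer (like the defaults on text fields) emits no token for a bare asterisk, while a whitespace-style tokenizer preserves it:

```java
import java.util.ArrayList;
import java.util.List;

// Toy tokenizers modeling the two behaviors discussed in this issue.
public class TokenizerSketch {

  /** Keep only runs of letters/digits, dropping punctuation such as "*". */
  public static List<String> letterTokenize(String text) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    for (char c : text.toCharArray()) {
      if (Character.isLetterOrDigit(c)) {
        current.append(c);
      } else if (current.length() > 0) {
        // Non-alphanumeric char ends the current token; the char itself
        // is discarded, so a lone "*" yields no tokens at all.
        tokens.add(current.toString());
        current.setLength(0);
      }
    }
    if (current.length() > 0) tokens.add(current.toString());
    return tokens;
  }

  /** Split on whitespace only, keeping punctuation-only tokens like "*". */
  public static List<String> whitespaceTokenize(String text) {
    List<String> tokens = new ArrayList<>();
    for (String t : text.trim().split("\\s+")) {
      if (!t.isEmpty()) tokens.add(t);
    }
    return tokens;
  }
}
```

In this model, letterTokenize("*") produces no tokens (so nothing is indexed), while whitespaceTokenize("*") keeps the asterisk; switching the field to a whitespace-style tokenizer is the kind of fix the resolution suggests.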






[jira] [Resolved] (LUCENE-9048) Tutorial and docs section missing from the new website

2020-02-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved LUCENE-9048.
-
Resolution: Fixed

> Tutorial and docs section missing from the new website
> --
>
> Key: LUCENE-9048
> URL: https://issues.apache.org/jira/browse/LUCENE-9048
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> See [https://lucene.staged.apache.org/solr/resources.html#tutorials]
> The Tutorials and Documentation subsections are missing from this page






[jira] [Commented] (LUCENE-9048) Tutorial and docs section missing from the new website

2020-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035735#comment-17035735
 ] 

Jan Høydahl commented on LUCENE-9048:
-

This is already fixed.

> Tutorial and docs section missing from the new website
> --
>
> Key: LUCENE-9048
> URL: https://issues.apache.org/jira/browse/LUCENE-9048
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>






[jira] [Assigned] (LUCENE-9048) Tutorial and docs section missing from the new website

2020-02-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned LUCENE-9048:
---

Assignee: Jan Høydahl

> Tutorial and docs section missing from the new website
> --
>
> Key: LUCENE-9048
> URL: https://issues.apache.org/jira/browse/LUCENE-9048
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>






[jira] [Commented] (SOLR-14259) backport SOLR-14013 to Solr 7.7

2020-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035711#comment-17035711
 ] 

Jan Høydahl commented on SOLR-14259:


Since there will never be a 7.8 release, that should not be necessary. Should 
that branch be made read-only?

> backport SOLR-14013 to Solr 7.7
> ---
>
> Key: SOLR-14259
> URL: https://issues.apache.org/jira/browse/SOLR-14259
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 7.7.3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>







[jira] [Commented] (SOLR-14259) backport SOLR-14013 to Solr 7.7

2020-02-12 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035696#comment-17035696
 ] 

Houston Putman commented on SOLR-14259:
---

Just to confirm, this should be merged to 7_x as well?

> backport SOLR-14013 to Solr 7.7
> ---
>
> Key: SOLR-14259
> URL: https://issues.apache.org/jira/browse/SOLR-14259
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 7.7.3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>







[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-12 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035630#comment-17035630
 ] 

Adrien Grand commented on SOLR-14254:
-

I started a discussion on LUCENE-9222.

bq. If I understand you then wouldn't this mean introducing backwards 
incompatibilities that don't actually exist? 

Yes, this is correct. This might actually be a feature, as it makes all Lucene 
versions look the same, instead of some versions being compatible with the 
previous one and others not. It also avoids silent corruptions sneaking in, 
i.e., when a change is made that would cause API calls to return wrong results 
without triggering a CorruptIndexException.

> Index backcompat break between 8.3.1 and 8.4.1
> --
>
> Key: SOLR-14254
> URL: https://issues.apache.org/jira/browse/SOLR-14254
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jason Gerlowski
>Priority: Major
>
> I believe I found a backcompat break between 8.4.1 and 8.3.1.
> I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1.  On 8.4. 
> nodes, several collections had cores fail to come up with 
> {{CorruptIndexException}}:
> {code}
> 2020-02-10 20:58:26.136 ERROR 
> (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [ 
>   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup 
> => org.apache.sol
> r.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
> org.apache.solr.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
>  ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) 
> ~[?:?]
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>  ~[metrics-core-4.0.5.jar:4.0.5]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: 
> actual codec=Lucene50PostingsWriterDoc vs expected 
> codec=Lucene84PostingsWriterDoc 
> (resource=MMapIndexInput(path="/Users/jasongerlowski/run/solrdata/data/testbackcompat_shard1_replica_n1/data/index/_0_FST50_0.doc"))
> at 
> org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:208) 
> ~[?:?]
> at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198) 
> ~[?:?]
> at 
> org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255) ~[?:?]
> at 
> org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.<init>(Lucene84PostingsReader.java:82)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.memory.FSTPostingsFormat.fieldsProducer(FSTPostingsFormat.java:66)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:315)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:395)
>  ~[?:?]
> at 
> org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:114)
>  ~[?:?]
> at 
> org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:84) ~[?:?]
> at 
> 

[jira] [Created] (LUCENE-9222) Detect upgrades with non-default formats

2020-02-12 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-9222:


 Summary: Detect upgrades with non-default formats
 Key: LUCENE-9222
 URL: https://issues.apache.org/jira/browse/LUCENE-9222
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Adrien Grand


Lucene doesn't give any backward-compatibility guarantees with non-default 
formats, but doesn't try to detect such misuse either, and a couple of users 
have fallen into this trap over the years; see e.g. SOLR-14254.

What about dynamically creating the version number of the index format based on 
the current Lucene version, so that Lucene would fail with an 
IndexFormatTooOldException with non-default formats instead of a confusing 
CorruptIndexException? The change would consist of doing something like the 
following for all our non-default index formats:

{code}
diff --git 
a/lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTTermsWriter.java 
b/lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTTermsWriter.java
index fcc0d00a593..18b35760aec 100644
--- a/lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTTermsWriter.java
+++ b/lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTTermsWriter.java
@@ -41,6 +41,7 @@ import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.util.FixedBitSet;
 import org.apache.lucene.util.IOUtils;
 import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.Version;
 import org.apache.lucene.util.fst.FSTCompiler;
 import org.apache.lucene.util.fst.FST;
 import org.apache.lucene.util.fst.Util;
@@ -123,7 +124,7 @@ import org.apache.lucene.util.fst.Util;
 public class FSTTermsWriter extends FieldsConsumer {
   static final String TERMS_EXTENSION = "tfp";
   static final String TERMS_CODEC_NAME = "FSTTerms";
-  public static final int TERMS_VERSION_START = 2;
+  public static final int TERMS_VERSION_START = (Version.LATEST.major << 16) | 
(Version.LATEST.minor << 8) | Version.LATEST.bugfix;
   public static final int TERMS_VERSION_CURRENT = TERMS_VERSION_START;
   
   final PostingsWriterBase postingsWriter;
{code}
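Editor's note: to illustrate the encoding in the diff above (which assumes, as in Lucene's org.apache.lucene.util.Version, that the major/minor/bugfix components each fit in a byte), the packed value is strictly increasing across releases, so any cross-version use of the format becomes detectable:

```java
// Sketch of the (major << 16) | (minor << 8) | bugfix encoding proposed
// above: each Lucene release yields a distinct, ordered format version.
public class FormatVersion {

  /** Pack a release number into a single int, as in the diff. */
  public static int pack(int major, int minor, int bugfix) {
    return (major << 16) | (minor << 8) | bugfix;
  }

  /** Recover "major.minor.bugfix" from a packed version int. */
  public static String unpack(int version) {
    return ((version >> 16) & 0xff) + "."
        + ((version >> 8) & 0xff) + "."
        + (version & 0xff);
  }
}
```

For example, 8.4.1 packs to 525313; because the packed values are ordered, a reader built for one release can tell that an index written by any other release uses a different format version.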






[jira] [Commented] (LUCENE-9211) Adding compression to BinaryDocValues storage

2020-02-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035624#comment-17035624
 ] 

David Smiley commented on LUCENE-9211:
--

Thanks so much for running the benchmarks [~mharwood]!  When you say you 
modified "this line", the link did not work.  If you merely changed the default 
spatial.alg to use composite, then it's only indexing point data, which is not 
realistic for this spatial strategy.  Instead, LUCENE-5579 has a spatial.alg 
file that converts those points to random circles, and it'll be more 
interesting.  I just did a diff of that spatial.alg against the default one and 
they are pretty similar overall.

> Adding compression to BinaryDocValues storage
> -
>
> Key: LUCENE-9211
> URL: https://issues.apache.org/jira/browse/LUCENE-9211
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Reporter: Mark Harwood
>Assignee: Mark Harwood
>Priority: Minor
>  Labels: pull-request-available
>
> While SortedSetDocValues can be used today to store identical values in a 
> compact form, this is not effective for data with many unique values.
> The proposal is that BinaryDocValues should be stored in LZ4-compressed 
> blocks, which can dramatically reduce disk storage costs in many cases. 
> Blocks of a number of documents are stored as a single compressed blob, 
> along with metadata recording the offsets at which the original document 
> values can be found in the uncompressed content.
> There's a trade-off here between efficient compression (more docs-per-block = 
> better compression) and fast retrieval times (fewer docs-per-block = faster 
> read access for single values). A fixed block size of 32 docs seems like it 
> would be a reasonable compromise for most scenarios.
> A PR is up for review here [https://github.com/apache/lucene-solr/pull/1234]
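Editor's note: a toy sketch of the layout described above, using java.util.zip from the JDK as a stand-in for LZ4 (which is not in the standard library); the class and method names are illustrative, not those of the actual PR:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Values for a block of docs are concatenated and compressed as one blob,
// with per-doc offsets kept as metadata. Reading one value decompresses the
// whole block and slices it -- the read-speed vs. compression trade-off
// discussed above.
public class BlockedValues {
  private final byte[] compressed;
  private final int[] offsets; // offsets[i] = start of doc i; offsets[n] = total length

  public BlockedValues(List<String> values) {
    ByteArrayOutputStream concat = new ByteArrayOutputStream();
    offsets = new int[values.size() + 1];
    for (int i = 0; i < values.size(); i++) {
      offsets[i] = concat.size();
      byte[] b = values.get(i).getBytes(StandardCharsets.UTF_8);
      concat.write(b, 0, b.length);
    }
    offsets[values.size()] = concat.size();

    Deflater deflater = new Deflater(); // stand-in for LZ4
    deflater.setInput(concat.toByteArray());
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[256];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    compressed = out.toByteArray();
  }

  /** Decompress the whole block, then slice out the value for one doc. */
  public String get(int doc) {
    try {
      Inflater inflater = new Inflater();
      inflater.setInput(compressed);
      byte[] plain = new byte[offsets[offsets.length - 1]];
      int done = 0;
      while (done < plain.length && !inflater.finished()) {
        done += inflater.inflate(plain, done, plain.length - done);
      }
      return new String(plain, offsets[doc], offsets[doc + 1] - offsets[doc],
          StandardCharsets.UTF_8);
    } catch (DataFormatException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A smaller block size makes get() cheaper (less to decompress per lookup) while a larger one compresses better, which is exactly the trade-off behind the proposed fixed block of 32 docs.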






[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035619#comment-17035619
 ] 

David Smiley commented on SOLR-14254:
-

> rejecting upgrading indices that have non-default formats in Solr

How might that work?  My limited understanding is that "upgrading indices" 
happens transparently via merging, typically due to adding data.  But for the 
special postings format case, Lucene can't even properly read the data any more.

> changing non-default formats to use a version number that is computed using 
> the current Lucene version.

If I understand you then wouldn't this mean introducing backwards 
incompatibilities that don't actually exist?  Maybe I don't get the idea.

Even if some format names haven't changed despite actual format changes, maybe 
for the non-default formats this is what we should do; it's the simplest course 
of action that would be helpful IMO.



> Index backcompat break between 8.3.1 and 8.4.1
> --
>
> Key: SOLR-14254
> URL: https://issues.apache.org/jira/browse/SOLR-14254
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jason Gerlowski
>Priority: Major
>
> I believe I found a backcompat break between 8.4.1 and 8.3.1.
> I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1.  On 8.4. 
> nodes, several collections had cores fail to come up with 
> {{CorruptIndexException}}:
> {code}
> 2020-02-10 20:58:26.136 ERROR 
> (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [ 
>   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup 
> => org.apache.sol
> r.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
> org.apache.solr.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
>  ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) 
> ~[?:?]
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>  ~[metrics-core-4.0.5.jar:4.0.5]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: 
> actual codec=Lucene50PostingsWriterDoc vs expected 
> codec=Lucene84PostingsWriterDoc 
> (resource=MMapIndexInput(path="/Users/jasongerlowski/run/solrdata/data/testbackcompat_shard1_replica_n1/data/index/_0_FST50_0.doc"))
> at 
> org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:208) 
> ~[?:?]
> at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198) 
> ~[?:?]
> at 
> org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255) ~[?:?]
> at 
> org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.<init>(Lucene84PostingsReader.java:82)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.memory.FSTPostingsFormat.fieldsProducer(FSTPostingsFormat.java:66)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:315)
>  ~[?:?]
> at 
> 

[jira] [Resolved] (SOLR-14247) IndexSizeTriggerMixedBoundsTest does a lot of sleeping

2020-02-12 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter resolved SOLR-14247.
---
Resolution: Fixed

> IndexSizeTriggerMixedBoundsTest does a lot of sleeping
> --
>
> Key: SOLR-14247
> URL: https://issues.apache.org/jira/browse/SOLR-14247
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When I run tests locally, the slowest reported test is always 
> IndexSizeTriggerMixedBoundsTest  coming in at around 2 minutes.
> I took a look at the code and discovered that at least 80s of that is all 
> sleeps!
> There might need to be more synchronization and ordering added back in, but 
> when I removed all of the sleeps the test still passed locally for me, so I'm 
> not too sure what the point was or why we were slowing the system down so 
> much.
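Fixed sleeps like the ones described are usually replaceable with polling for the condition the sleep was waiting on; a generic sketch (the `wait_for` helper is hypothetical, not part of Solr's test framework):

```python
import time

def wait_for(condition, timeout=10.0, poll=0.05):
    """Return as soon as condition() is true; only a genuinely stuck
    run pays the full timeout, instead of every run paying a fixed sleep."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll)

# Example: wait on a flag set by (here, simulated) background work.
state = {"done": False}
state["done"] = True
assert wait_for(lambda: state["done"])
```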






[jira] [Commented] (SOLR-14247) IndexSizeTriggerMixedBoundsTest does a lot of sleeping

2020-02-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035615#comment-17035615
 ] 

ASF subversion and git services commented on SOLR-14247:


Commit f1fc3e7ba204d7211e9920639fb525d100614886 in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f1fc3e7 ]

SOLR-14247: Revert SolrTestCase Logger removal


> IndexSizeTriggerMixedBoundsTest does a lot of sleeping
> --
>
> Key: SOLR-14247
> URL: https://issues.apache.org/jira/browse/SOLR-14247
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When I run tests locally, the slowest reported test is always 
> IndexSizeTriggerMixedBoundsTest  coming in at around 2 minutes.
> I took a look at the code and discovered that at least 80s of that is all 
> sleeps!
> There might need to be more synchronization and ordering added back in, but 
> when I removed all of the sleeps the test still passed locally for me, so I'm 
> not too sure what the point was or why we were slowing the system down so 
> much.






[jira] [Resolved] (SOLR-14245) Validate Replica / ReplicaInfo on creation

2020-02-12 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter resolved SOLR-14245.
---
Resolution: Fixed

> Validate Replica / ReplicaInfo on creation
> --
>
> Key: SOLR-14245
> URL: https://issues.apache.org/jira/browse/SOLR-14245
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Minor
> Fix For: 8.5
>
>
> Replica / ReplicaInfo should be immutable and their fields should be 
> validated on creation.
> Some users reported that very rarely during a failed collection CREATE or 
> DELETE, or when the Overseer task queue becomes corrupted, Solr may write to 
> ZK incomplete replica infos (eg. node_name = null).
> This problem is difficult to reproduce but we should add safeguards anyway to 
> prevent writing such corrupted replica info to ZK.






[jira] [Updated] (SOLR-14259) backport SOLR-14013 to Solr 7.7

2020-02-12 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14259:
--
Fix Version/s: 7.7.3

> backport SOLR-14013 to Solr 7.7
> ---
>
> Key: SOLR-14259
> URL: https://issues.apache.org/jira/browse/SOLR-14259
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 7.7.3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>







[GitHub] [lucene-solr] noblepaul opened a new pull request #1254: SOLR-14259: trying to port to Solr 7.7

2020-02-12 Thread GitBox
noblepaul opened a new pull request #1254: SOLR-14259: trying to port to Solr 
7.7
URL: https://github.com/apache/lucene-solr/pull/1254
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[jira] [Reopened] (SOLR-14247) IndexSizeTriggerMixedBoundsTest does a lot of sleeping

2020-02-12 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter reopened SOLR-14247:
---

Why did this issue modify SolrTestCase.java ? ? ?

 

For reasons I don't understand, this issue removed the Logger from SolrTestCase 
– which (again for reasons I don't understand) seems to be causing suite-level 
thread leaks of Log4j AsyncLogger threads from any test that does not define 
its own loggers – i.e.: something about how we are using async logging means 
that any SolrCloudTestCase that doesn't initialize a logger anywhere will leak 
a logger thread – and evidently the SolrTestCase Logger was ensuring this 
didn't happen until it was removed by this jira...

 

As an example, starting with 71b869381ef0090a6e96eccbc9924ebdb4f57306 the 
trivial {{NamedListTest}} fails for me 100% of the time with leaked threads 
(regardless of seed) ...
{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=NamedListTest 
-Dtests.seed=F67D0AB0258C4521 -Dtests.slow=true -Dtests.badapples=true 
-Dtests.locale=yue-Hant -Dtests.timezone=Antarctica/South_Pole 
-Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.00s | NamedListTest (suite) <<<
   [junit4]> Throwable #1: 
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE 
scope at org.apache.solr.common.util.NamedListTest: 
   [junit4]>1) Thread[id=16, name=Log4j2-TF-1-AsyncLoggerConfig-1, 
state=TIMED_WAITING, group=TGRP-NamedListTest]
   [junit4]> at 
java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
   [junit4]> at 
java.base@11.0.4/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
   [junit4]> at 
java.base@11.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
   [junit4]> at 
app//com.lmax.disruptor.TimeoutBlockingWaitStrategy.waitFor(TimeoutBlockingWaitStrategy.java:38)
   [junit4]> at 
app//com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:56)
   [junit4]> at 
app//com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:159)
   [junit4]> at 
app//com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
   [junit4]> at 
java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([F67D0AB0258C4521]:0)Throwable #2: 
com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie 
threads that couldn't be terminated:
   [junit4]>1) Thread[id=16, name=Log4j2-TF-1-AsyncLoggerConfig-1, 
state=TIMED_WAITING, group=TGRP-NamedListTest]
   [junit4]> at 
java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
   [junit4]> at 
java.base@11.0.4/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
   [junit4]> at 
java.base@11.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
   [junit4]> at 
app//com.lmax.disruptor.TimeoutBlockingWaitStrategy.waitFor(TimeoutBlockingWaitStrategy.java:38)
   [junit4]> at 
app//com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:56)
   [junit4]> at 
app//com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:159)
   [junit4]> at 
app//com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
   [junit4]> at 
java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([F67D0AB0258C4521]:0)
   [junit4] Completed [1/1 (1!)] in 23.32s, 6 tests, 2 errors <<< FAILURES!
{noformat}
These failures do not happen w/ b21312f411bdfb069114846f31f45dcc6ec6ecb8 (the 
prior commit on the master branch) checked out.

 

> IndexSizeTriggerMixedBoundsTest does a lot of sleeping
> --
>
> Key: SOLR-14247
> URL: https://issues.apache.org/jira/browse/SOLR-14247
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When I run tests locally, the slowest reported test is always 
> IndexSizeTriggerMixedBoundsTest  coming in at around 2 minutes.
> I took a look at the code and discovered that at least 80s of that is all 
> sleeps!
> There might need to be more synchronization and ordering added back in, but 
> when I removed all of the sleeps the test still 

[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-02-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035589#comment-17035589
 ] 

Noble Paul commented on SOLR-14013:
---

I've opened SOLR-14259

 

> javabin performance regressions
> ---
>
> Key: SOLR-14013
> URL: https://issues.apache.org/jira/browse/SOLR-14013
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 7.7
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>Priority: Blocker
> Fix For: 8.4
>
> Attachments: SOLR-14013.patch, SOLR-14013.patch, TestQuerySpeed.java, 
> test.json
>
>
> As noted by [~rrockenbaugh] in SOLR-13963, javabin also recently became 
> orders of magnitude slower in certain cases since v7.7.  The cases identified 
> so far include large numbers of values in a field.






[jira] [Created] (SOLR-14259) backport SOLR-14013 to Solr 7.7

2020-02-12 Thread Noble Paul (Jira)
Noble Paul created SOLR-14259:
-

 Summary: backport SOLR-14013 to Solr 7.7
 Key: SOLR-14259
 URL: https://issues.apache.org/jira/browse/SOLR-14259
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul
Assignee: Noble Paul









[jira] [Commented] (SOLR-14245) Validate Replica / ReplicaInfo on creation

2020-02-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035581#comment-17035581
 ] 

ASF subversion and git services commented on SOLR-14245:


Commit 3dd484ba29db04e4b5d4181e4a042dcc448b34be in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3dd484b ]

SOLR-14245: Fix ReplicaListTransformerTest

Previous changes to this issue 'fixed' the way the test was creating mock 
Replica instances,
to ensure all properties were specified -- but these changes tickled a bug in 
the existing test
scaffolding that caused its "expectations" to be based on a regex check against 
only the base "url"
even though the test logic itself looked at the entire "core url"

The result is that there were reproducible failures if/when the randomly 
generated regex matched
".*1.*" because the existing test logic did not expect that to match the url or 
a Replica with
a core name of "core1" because it only considered the base url

(cherry picked from commit 49e20dbee4b7e74448928a48bfbb50da1018400f)
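The mismatch the commit message describes can be reduced to a few lines (the URLs here are made up for illustration, not taken from the test):

```python
import re

base_url = "http://host:8983/solr"   # what the expectation was built from
core_url = base_url + "/core1"       # what the test logic actually matched

pattern = re.compile(".*1.*")
expected_match = bool(pattern.match(base_url))  # no '1' anywhere in the base url
actual_match = bool(pattern.match(core_url))    # '1' appears in "core1"

# This disagreement is exactly the source of the reproducible failures.
assert expected_match != actual_match
```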


> Validate Replica / ReplicaInfo on creation
> --
>
> Key: SOLR-14245
> URL: https://issues.apache.org/jira/browse/SOLR-14245
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Minor
> Fix For: 8.5
>
>
> Replica / ReplicaInfo should be immutable and their fields should be 
> validated on creation.
> Some users reported that very rarely during a failed collection CREATE or 
> DELETE, or when the Overseer task queue becomes corrupted, Solr may write to 
> ZK incomplete replica infos (eg. node_name = null).
> This problem is difficult to reproduce but we should add safeguards anyway to 
> prevent writing such corrupted replica info to ZK.






[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-12 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035577#comment-17035577
 ] 

Adrien Grand commented on SOLR-14254:
-

I don't have any objections to changing the name but I don't think FST50 is 
inappropriate. We don't always rename index formats when we change them, for 
instance we kept the name Lucene60PointsFormat when we added selective indexing.

Maybe we could make the situation less trappy by rejecting upgrading indices 
that have non-default formats in Solr, or changing non-default formats to use a 
version number that is computed using the current Lucene version.


> Index backcompat break between 8.3.1 and 8.4.1
> --
>
> Key: SOLR-14254
> URL: https://issues.apache.org/jira/browse/SOLR-14254
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jason Gerlowski
>Priority: Major
>
> I believe I found a backcompat break between 8.4.1 and 8.3.1.
> I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1.  On 8.4. 
> nodes, several collections had cores fail to come up with 
> {{CorruptIndexException}}:
> {code}
> 2020-02-10 20:58:26.136 ERROR 
> (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [ 
>   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup 
> => org.apache.sol
> r.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
> org.apache.solr.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
>  ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) 
> ~[?:?]
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>  ~[metrics-core-4.0.5.jar:4.0.5]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: 
> actual codec=Lucene50PostingsWriterDoc vs expected 
> codec=Lucene84PostingsWriterDoc 
> (resource=MMapIndexInput(path="/Users/jasongerlowski/run/solrdata/data/testbackcompat_shard1_replica_n1/data/index/_0_FST50_0.doc"))
> at 
> org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:208) 
> ~[?:?]
> at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198) 
> ~[?:?]
> at 
> org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255) ~[?:?]
> at 
> org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.<init>(Lucene84PostingsReader.java:82)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.memory.FSTPostingsFormat.fieldsProducer(FSTPostingsFormat.java:66)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:315)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:395)
>  ~[?:?]
> at 
> org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:114)
>  ~[?:?]
> at 
> org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:84) ~[?:?]
> at 
> 

[GitHub] [lucene-solr] dsmiley commented on issue #357: [SOLR-12238] Synonym Queries boost

2020-02-12 Thread GitBox
dsmiley commented on issue #357: [SOLR-12238] Synonym Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-585341721
 
 
   Personally I'm fine with backporting this although QueryBuilder is not 
labelled as experimental so I wonder if we are "allowed" to change the API like 
we did in a minor release?  It's debatable.  Maybe it should be labelled 
experimental now.
   
   Please merge @romseygeek as I'll be on a vacation shortly.  Otherwise I 
could get to it the last week of this month.





[jira] [Commented] (SOLR-14245) Validate Replica / ReplicaInfo on creation

2020-02-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035568#comment-17035568
 ] 

ASF subversion and git services commented on SOLR-14245:


Commit 49e20dbee4b7e74448928a48bfbb50da1018400f in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=49e20db ]

SOLR-14245: Fix ReplicaListTransformerTest

Previous changes to this issue 'fixed' the way the test was creating mock 
Replica instances,
to ensure all properties were specified -- but these changes tickled a bug in 
the existing test
scaffolding that caused its "expectations" to be based on a regex check against 
only the base "url"
even though the test logic itself looked at the entire "core url"

The result is that there were reproducible failures if/when the randomly 
generated regex matched
".*1.*" because the existing test logic did not expect that to match the url or 
a Replica with
a core name of "core1" because it only considered the base url


> Validate Replica / ReplicaInfo on creation
> --
>
> Key: SOLR-14245
> URL: https://issues.apache.org/jira/browse/SOLR-14245
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Minor
> Fix For: 8.5
>
>
> Replica / ReplicaInfo should be immutable and their fields should be 
> validated on creation.
> Some users reported that very rarely during a failed collection CREATE or 
> DELETE, or when the Overseer task queue becomes corrupted, Solr may write to 
> ZK incomplete replica infos (eg. node_name = null).
> This problem is difficult to reproduce but we should add safeguards anyway to 
> prevent writing such corrupted replica info to ZK.






[GitHub] [lucene-solr] nknize opened a new pull request #1253: LUCENE-9150: Restore support for dynamic PlanetModel in spatial3d

2020-02-12 Thread GitBox
nknize opened a new pull request #1253: LUCENE-9150: Restore support for 
dynamic PlanetModel in spatial3d
URL: https://github.com/apache/lucene-solr/pull/1253
 
 
   This PR adds dynamic geographic datum support to Geo3D to make lucene a 
viable option for indexing/searching in different spatial reference systems 
(e.g., more accurately computing query shape relations to BKD's internal nodes 
using datum consistent with the spatial projection).





[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035554#comment-17035554
 ] 

David Smiley commented on SOLR-14254:
-

Surely "FST50" is not appropriate anymore; no?  If it changed to FST84 or 
whatever then the user would get a message about the postingsFormat FST50 not 
being found or something like that.  That's helpful! 

> Index backcompat break between 8.3.1 and 8.4.1
> --
>
> Key: SOLR-14254
> URL: https://issues.apache.org/jira/browse/SOLR-14254
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jason Gerlowski
>Priority: Major
>
> I believe I found a backcompat break between 8.4.1 and 8.3.1.
> I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1.  On 8.4. 
> nodes, several collections had cores fail to come up with 
> {{CorruptIndexException}}:
> {code}
> 2020-02-10 20:58:26.136 ERROR 
> (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [ 
>   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup 
> => org.apache.sol
> r.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
> org.apache.solr.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
>  ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) 
> ~[?:?]
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>  ~[metrics-core-4.0.5.jar:4.0.5]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: 
> actual codec=Lucene50PostingsWriterDoc vs expected 
> codec=Lucene84PostingsWriterDoc 
> (resource=MMapIndexInput(path="/Users/jasongerlowski/run/solrdata/data/testbackcompat_shard1_replica_n1/data/index/_0_FST50_0.doc"))
> at 
> org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:208) 
> ~[?:?]
> at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198) 
> ~[?:?]
> at 
> org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255) ~[?:?]
> at 
> org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.<init>(Lucene84PostingsReader.java:82)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.memory.FSTPostingsFormat.fieldsProducer(FSTPostingsFormat.java:66)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:315)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:395)
>  ~[?:?]
> at 
> org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:114)
>  ~[?:?]
> at 
> org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:84) ~[?:?]
> at 
> org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:177)
>  ~[?:?]
> at 
> org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:219)
>  ~[?:?]
> at 
> org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:109)
>  

[jira] [Updated] (LUCENE-9221) Lucene Logo Contest

2020-02-12 Thread Ryan Ernst (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst updated LUCENE-9221:
---
Description: 
The Lucene logo has served the project well for almost 20 years. However, it 
does sometimes show its age and misses modern nice-to-haves like invertible or 
grayscale variants.
  
 The PMC would like to have a contest to replace the current logo. This issue 
will serve as the submission mechanism for that contest. When the submission 
deadline closes, a community poll will be used to guide the PMC in the decision 
of which logo to choose. Keeping the current logo will be a possible outcome of 
this decision, if a majority likes the current logo more than any other 
proposal.
  
 The logo should adhere to the guidelines set forth by Apache for project logos 
([https://www.apache.org/foundation/marks/pmcs#graphics]), specifically that 
the full project name, "Apache Lucene", must appear in the logo (although the 
word "Apache" may be in a smaller font than "Lucene").
  
 The contest will last approximately one month. The submission deadline is 
*Monday, March 16, 2020*. Submissions should be attached in a single zip or tar 
archive, with the filename of the form \{{[user]-[proposal 
number].[extension]}}.

  was:
The Lucene logo has served the project well for almost 20 years. However, it 
does sometimes show its age and misses modern nice-to-haves like invertible or 
grayscale variants.
  
 The PMC would like to have a contest to replace the current logo. This issue 
will serve as the submission mechanism for that contest. When the submission 
deadline closes, a community poll will be used to guide the PMC in the decision 
of which logo to choose. Keeping the current logo will be a possible outcome of 
this decision, if a majority likes the current logo more than any other 
proposal.
  
 The logo should adhere to the guidelines set forth by Apache for project logos 
([https://www.apache.org/foundation/marks/pmcs#graphics]), specifically that 
the full project name, "Apache Lucene", must appear in the logo (although the 
word "Apache" may be in a smaller font than "Lucene").
  
 The contest will last approximately one month. The submission deadline is 
*Monday, March 16, 2020*. Submissions should be attached in a single zip or tar 
archive, with the filename of the form {{{user}-\{proposal 
number}.\{extension}.}}


> Lucene Logo Contest
> ---
>
> Key: LUCENE-9221
> URL: https://issues.apache.org/jira/browse/LUCENE-9221
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ryan Ernst
>Priority: Trivial
>
> The Lucene logo has served the project well for almost 20 years. However, it 
> does sometimes show its age and misses modern nice-to-haves like invertible 
> or grayscale variants.
>   
>  The PMC would like to have a contest to replace the current logo. This issue 
> will serve as the submission mechanism for that contest. When the submission 
> deadline closes, a community poll will be used to guide the PMC in the 
> decision of which logo to choose. Keeping the current logo will be a possible 
> outcome of this decision, if a majority likes the current logo more than any 
> other proposal.
>   
>  The logo should adhere to the guidelines set forth by Apache for project 
> logos ([https://www.apache.org/foundation/marks/pmcs#graphics]), specifically 
> that the full project name, "Apache Lucene", must appear in the logo 
> (although the word "Apache" may be in a smaller font than "Lucene").
>   
>  The contest will last approximately one month. The submission deadline is 
> *Monday, March 16, 2020*. Submissions should be attached in a single zip or 
> tar archive, with the filename of the form \{{[user]-[proposal 
> number].[extension]}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9221) Lucene Logo Contest

2020-02-12 Thread Ryan Ernst (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst updated LUCENE-9221:
---
Description: 
The Lucene logo has served the project well for almost 20 years. However, it 
does sometimes show its age and misses modern nice-to-haves like invertible or 
grayscale variants.
  
 The PMC would like to have a contest to replace the current logo. This issue 
will serve as the submission mechanism for that contest. When the submission 
deadline closes, a community poll will be used to guide the PMC in the decision 
of which logo to choose. Keeping the current logo will be a possible outcome of 
this decision, if a majority likes the current logo more than any other 
proposal.
  
 The logo should adhere to the guidelines set forth by Apache for project logos 
([https://www.apache.org/foundation/marks/pmcs#graphics]), specifically that 
the full project name, "Apache Lucene", must appear in the logo (although the 
word "Apache" may be in a smaller font than "Lucene").
  
 The contest will last approximately one month. The submission deadline is 
*Monday, March 16, 2020*. Submissions should be attached in a single zip or tar 
archive, with the filename of the form {{{user}-\{proposal 
number}.\{extension}.}}

  was:
The Lucene logo has served the project well for almost 20 years. However, it 
does sometimes show its age and misses modern nice-to-haves like invertible or 
grayscale variants.
 
The PMC would like to have a contest to replace the current logo. This issue 
will serve as the submission mechanism for that contest. When the submission 
deadline closes, a community poll will be used to guide the PMC in the decision 
of which logo to choose. Keeping the current logo will be a possible outcome of 
this decision, if a majority likes the current logo more than any other 
proposal.
 
The logo should adhere to the guidelines set forth by Apache for project logos 
([https://www.apache.org/foundation/marks/pmcs#graphics]), specifically that 
the full project name, "Apache Lucene", must appear in the logo (although the 
word "Apache" may be in a smaller font than "Lucene").
 
The contest will last approximately one month. The submission deadline is 
*Monday, March 16, 2020*. Submissions should be attached in a single zip or tar 
archive, with the filename of the form *{user}-\{proposal number}.\{extension}*.


> Lucene Logo Contest
> ---
>
> Key: LUCENE-9221
> URL: https://issues.apache.org/jira/browse/LUCENE-9221
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ryan Ernst
>Priority: Trivial
>
> The Lucene logo has served the project well for almost 20 years. However, it 
> does sometimes show its age and misses modern nice-to-haves like invertible 
> or grayscale variants.
>   
>  The PMC would like to have a contest to replace the current logo. This issue 
> will serve as the submission mechanism for that contest. When the submission 
> deadline closes, a community poll will be used to guide the PMC in the 
> decision of which logo to choose. Keeping the current logo will be a possible 
> outcome of this decision, if a majority likes the current logo more than any 
> other proposal.
>   
>  The logo should adhere to the guidelines set forth by Apache for project 
> logos ([https://www.apache.org/foundation/marks/pmcs#graphics]), specifically 
> that the full project name, "Apache Lucene", must appear in the logo 
> (although the word "Apache" may be in a smaller font than "Lucene").
>   
>  The contest will last approximately one month. The submission deadline is 
> *Monday, March 16, 2020*. Submissions should be attached in a single zip or 
> tar archive, with the filename of the form {{{user}-\{proposal 
> number}.\{extension}.}}






[jira] [Created] (LUCENE-9221) Lucene Logo Contest

2020-02-12 Thread Ryan Ernst (Jira)
Ryan Ernst created LUCENE-9221:
--

 Summary: Lucene Logo Contest
 Key: LUCENE-9221
 URL: https://issues.apache.org/jira/browse/LUCENE-9221
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ryan Ernst


The Lucene logo has served the project well for almost 20 years. However, it 
does sometimes show its age and misses modern nice-to-haves like invertible or 
grayscale variants.
 
The PMC would like to have a contest to replace the current logo. This issue 
will serve as the submission mechanism for that contest. When the submission 
deadline closes, a community poll will be used to guide the PMC in the decision 
of which logo to choose. Keeping the current logo will be a possible outcome of 
this decision, if a majority likes the current logo more than any other 
proposal.
 
The logo should adhere to the guidelines set forth by Apache for project logos 
([https://www.apache.org/foundation/marks/pmcs#graphics]), specifically that 
the full project name, "Apache Lucene", must appear in the logo (although the 
word "Apache" may be in a smaller font than "Lucene").
 
The contest will last approximately one month. The submission deadline is 
*Monday, March 16, 2020*. Submissions should be attached in a single zip or tar 
archive, with the filename of the form *{user}-\{proposal number}.\{extension}*.






[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-02-12 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035499#comment-17035499
 ] 

Houston Putman commented on SOLR-14013:
---

[~noble.paul], at the very least I think we should backport this to 7_7. If we 
leave the latest release of 7 in a state with a significant regression/bug in 
it, then we are basically asking people to either:
 * Know that 7.6 is the last stable release of Solr for anyone wanting to use 
multiValued fields in a sharded collection, or
 * Upgrade to Solr 8.4

In my opinion, neither of those is a good option, because users will always go 
with the most up-to-date version of Solr that works for their index, and 
upgrading to a new major version is a very tough process for a lot of people.

This isn't a bug that existed throughout the entirety of Solr 7; it was 
introduced in the last minor release. A lot of people are very comfortable with 
Solr 7 and trust it. People also trust that the last minor/patch version of 
something will be its most stable version. We should make sure that the latest 
release of our second-to-last major version (7) is stable and maintains the 
trust that users have in it and in Solr in general.

It is very little work to backport this, and probably not a whole lot of work 
to do another patch or minor release (7.8 or 7.7.3). With that work we would 
provide a significantly better user experience for our community.

> javabin performance regressions
> ---
>
> Key: SOLR-14013
> URL: https://issues.apache.org/jira/browse/SOLR-14013
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 7.7
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>Priority: Blocker
> Fix For: 8.4
>
> Attachments: SOLR-14013.patch, SOLR-14013.patch, TestQuerySpeed.java, 
> test.json
>
>
> As noted by [~rrockenbaugh] in SOLR-13963, javabin also recently became 
> orders of magnitude slower in certain cases since v7.7.  The cases identified 
> so far include large numbers of values in a field.






[jira] [Updated] (SOLR-14257) Keyword's not indexed or searchable

2020-02-12 Thread Shae Bottum (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shae Bottum updated SOLR-14257:
---
Description: 
During indexing, if the value of a column is the literal char *, Solr's 
tokenizer will pass over the value and not tokenize it. The value is then not 
indexed and therefore not searchable. We need to make this keyword searchable. 
I understand that to search for it, you would need to quote the value ("*") so 
the asterisk is not treated as a wildcard that matches everything. The use case 
is searching for the actual value of an asterisk. 

 

tokenizer works for "jo*n" or "j*n"

tokenizer does Not work for "**" 

  was:
During indexing, if the value of your column is the literal char  , solr's 
tokenizer will pass over this value and Not tokenize it. This value then is not 
indexed and therefore not searchable. Need to make this keyword searchable. I 
understand to search it, you would need to add quotes around the value * to 
ensure the asterisk is not treated as a wildcard and return all. The use case 
is searching for the actual value of an asterisk. 

 

tokenizer works for "jo*n" or "j*n"

tokenizer does Not work for "*" or ** "" or ** "***" etc etc.


> Keyword's not indexed or searchable
> ---
>
> Key: SOLR-14257
> URL: https://issues.apache.org/jira/browse/SOLR-14257
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Schema and Analysis
>Affects Versions: 7.6
>Reporter: Shae Bottum
>Priority: Major
>
> During indexing, if the value of a column is the literal char *, Solr's 
> tokenizer will pass over the value and not tokenize it. The value is then not 
> indexed and therefore not searchable. We need to make this keyword 
> searchable. I understand that to search for it, you would need to quote the 
> value ("*") so the asterisk is not treated as a wildcard that matches 
> everything. The use case is searching for the actual value of an asterisk. 
>  
> tokenizer works for "jo*n" or "j*n"
> tokenizer does Not work for "**" 






[jira] [Updated] (SOLR-14257) Keyword's not indexed or searchable

2020-02-12 Thread Shae Bottum (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shae Bottum updated SOLR-14257:
---
Description: 
During indexing, if the value of your column is the literal char  , solr's 
tokenizer will pass over this value and Not tokenize it. This value then is not 
indexed and therefore not searchable. Need to make this keyword searchable. I 
understand to search it, you would need to add quotes around the value * to 
ensure the asterisk is not treated as a wildcard and return all. The use case 
is searching for the actual value of an asterisk. 

 

tokenizer works for "jo*n" or "j*n"

tokenizer does Not work for "*" or ** "" or ** "***" etc etc.

  was:
During indexing, if the value of your column is the literal char  , solr's 
tokenizer will pass over this value and Not tokenize it. This value then is not 
indexed and therefore not searchable. Need to make this keyword searchable. I 
understand to search it, you would need to add quotes around the value * to 
ensure the asterisk is not treated as a wildcard and return all. The use case 
is searching for the actual value of an asterisk. 

 

tokenizer works for "jo*n" or "j*n"

tokenizer does Not work for "*" or "**" or "***" etc etc.


> Keyword's not indexed or searchable
> ---
>
> Key: SOLR-14257
> URL: https://issues.apache.org/jira/browse/SOLR-14257
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Schema and Analysis
>Affects Versions: 7.6
>Reporter: Shae Bottum
>Priority: Major
>
> During indexing, if the value of your column is the literal char  , 
> solr's tokenizer will pass over this value and Not tokenize it. This value 
> then is not indexed and therefore not searchable. Need to make this keyword 
> searchable. I understand to search it, you would need to add quotes around 
> the value * to ensure the asterisk is not treated as a wildcard and return 
> all. The use case is searching for the actual value of an asterisk. 
>  
> tokenizer works for "jo*n" or "j*n"
> tokenizer does Not work for "*" or ** "" or ** "***" etc etc.






[jira] [Created] (SOLR-14257) Keyword's not indexed or searchable

2020-02-12 Thread Shae Bottum (Jira)
Shae Bottum created SOLR-14257:
--

 Summary: Keyword's not indexed or searchable
 Key: SOLR-14257
 URL: https://issues.apache.org/jira/browse/SOLR-14257
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Schema and Analysis
Affects Versions: 7.6
Reporter: Shae Bottum


During indexing, if the value of a column is the literal char *, Solr's 
tokenizer will pass over the value and not tokenize it. The value is then not 
indexed and therefore not searchable. We need to make this keyword searchable. 
I understand that to search for it, you would need to quote the value ("*") so 
the asterisk is not treated as a wildcard that matches everything. The use case 
is searching for the actual value of an asterisk. 

 

tokenizer works for "jo*n" or "j*n"

tokenizer does Not work for "*" or "**" or "***" etc etc.






[jira] [Updated] (SOLR-14256) Remove HashDocSet

2020-02-12 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-14256:

Issue Type: Task  (was: Bug)

> Remove HashDocSet
> -
>
> Key: SOLR-14256
> URL: https://issues.apache.org/jira/browse/SOLR-14256
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: David Smiley
>Priority: Major
>
> This particular DocSet is only used in places where we need to convert 
> SortedIntDocSet in particular to a DocSet that is fast for random access.  
> Once such a conversion happens, it's only used to test some docs for presence 
> and it could be another interface.  DocSet has kind of a large-ish API 
> surface area to implement.  Since we only need to test docs, we could use 
> Bits interface (having only 2 methods) backed by an off-the-shelf primitive 
> long hash set on our classpath.  Perhaps a new method on DocSet: getBits() or 
> DocSetUtil.getBits(DocSet).
> In addition to removing complexity unto itself, this improvement is required 
> by SOLR-14185 because it wants to be able to produce a DocIdSetIterator slice 
> directly from the DocSet but HashDocSet can't do that without sorting first.






[jira] [Created] (SOLR-14256) Remove HashDocSet

2020-02-12 Thread David Smiley (Jira)
David Smiley created SOLR-14256:
---

 Summary: Remove HashDocSet
 Key: SOLR-14256
 URL: https://issues.apache.org/jira/browse/SOLR-14256
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: search
Reporter: David Smiley


This particular DocSet is only used in places where we need to convert 
SortedIntDocSet in particular to a DocSet that is fast for random access.  Once 
such a conversion happens, it's only used to test some docs for presence and it 
could be another interface.  DocSet has kind of a large-ish API surface area to 
implement.  Since we only need to test docs, we could use Bits interface 
(having only 2 methods) backed by an off-the-shelf primitive long hash set on 
our classpath.  Perhaps a new method on DocSet: getBits() or 
DocSetUtil.getBits(DocSet).

In addition to removing complexity unto itself, this improvement is required by 
SOLR-14185 because it wants to be able to produce a DocIdSetIterator slice 
directly from the DocSet but HashDocSet can't do that without sorting first.
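The two-method interface idea above can be sketched in a few lines. This is a hedged illustration: java.util.HashSet stands in for the off-the-shelf primitive long hash set the issue mentions, Bits mirrors only the shape of Lucene's org.apache.lucene.util.Bits (get and length), and none of these class names are actual Solr implementations.

```java
import java.util.HashSet;
import java.util.Set;

// Two-method membership view, mirroring the shape of Lucene's Bits interface.
interface Bits {
    boolean get(int index); // is this doc id present?
    int length();           // one past the highest addressable doc id
}

// Illustrative adapter: a sorted-int doc list exposed for fast random-access
// presence tests, the only capability the conversion sites actually need.
final class HashSetBits implements Bits {
    private final Set<Integer> docs = new HashSet<>();
    private final int maxDoc;

    HashSetBits(int[] sortedDocs, int maxDoc) {
        for (int d : sortedDocs) {
            docs.add(d);
        }
        this.maxDoc = maxDoc;
    }

    @Override public boolean get(int index) { return docs.contains(index); }
    @Override public int length() { return maxDoc; }
}
```

A DocSetUtil.getBits(DocSet)-style helper could return such a view without forcing a class like HashDocSet to implement the full DocSet surface.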






[jira] [Assigned] (SOLR-14216) Exclude HealthCheck from authentication

2020-02-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-14216:
--

Assignee: Jan Høydahl

> Exclude HealthCheck from authentication
> ---
>
> Key: SOLR-14216
> URL: https://issues.apache.org/jira/browse/SOLR-14216
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> The {{HealthCheckHandler}} on {{/api/node/health}} and 
> {{/solr/admin/info/health}} should by default not be subject to 
> authentication, but be open for all. This allows for load balancers and 
> various monitoring to probe Solr's health without having to support the auth 
> scheme in place. I can't see any reason we need auth on the health endpoint.
> It is possible to achieve the same by setting blockUnknown=false and 
> configuring three RBAC permissions: one for the v1 endpoint, one for the v2 
> endpoint, and one "all" catch-all at the end of the chain. But this is 
> cumbersome, so it would be better to have this out of the box.
> An alternative solution is to create a separate HttpServer for health check, 
> listening on a different port, just like embedded ZK and JMX.
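For reference, the cumbersome workaround described above would look roughly like the security.json below. This is a hedged sketch: the credential value is a placeholder, the health permission names are made up, and whether the v2 rule needs an /api prefix may differ by Solr version ("role": null makes a permission open to everyone; "all" is the predefined catch-all permission).

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "blockUnknown": false,
    "credentials": { "solr": "<hash> <salt>" }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      { "name": "health-v1", "path": "/admin/info/health", "role": null },
      { "name": "health-v2", "path": "/node/health", "role": null },
      { "name": "all", "role": "admin" }
    ],
    "user-role": { "solr": "admin" }
  }
}
```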






[jira] [Assigned] (SOLR-14250) Solr tries to read request body after error response is sent

2020-02-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-14250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-14250:
--

Assignee: Jan Høydahl

> Solr tries to read request body after error response is sent
> 
>
> Key: SOLR-14250
> URL: https://issues.apache.org/jira/browse/SOLR-14250
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If a client sends a {{HTTP POST}} request with header {{Expect: 
> 100-continue}} the normal flow is for Solr (Jetty) to first respond with a 
> {{HTTP 100 continue}} response, then the client will send the body which will 
> be processed and then a final response is sent by Solr.
> However, if such a request leads to an error (e.g. 404 or 401), then Solr 
> will skip the 100 response and instead send the error response directly. The 
> very last action of {{SolrDispatchFilter#doFilter}} is to call 
> {{consumeInputFully()}}. However, this should not be done in case an error 
> response has already been sent, else you'll provoke an exception in Jetty's 
> HTTP lib:
> {noformat}
> 2020-02-07 23:13:26.459 INFO  (qtp403547747-24) [   ] 
> o.a.s.s.SolrDispatchFilter Could not consume full client request => 
> java.io.IOException: Committed before 100 Continues
>   at 
> org.eclipse.jetty.http2.server.HttpChannelOverHTTP2.continue100(HttpChannelOverHTTP2.java:362)
> java.io.IOException: Committed before 100 Continues
>   at 
> org.eclipse.jetty.http2.server.HttpChannelOverHTTP2.continue100(HttpChannelOverHTTP2.java:362)
>  ~[http2-server-9.4.19.v20190610.jar:9.4.19.v20190610]
>   at org.eclipse.jetty.server.Request.getInputStream(Request.java:872) 
> ~[jetty-server-9.4.19.v20190610.jar:9.4.19.v20190610]
>   at 
> javax.servlet.ServletRequestWrapper.getInputStream(ServletRequestWrapper.java:185)
>  ~[javax.servlet-api-3.1.0.jar:3.1.0]
>   at 
> org.apache.solr.servlet.SolrDispatchFilter$1.getInputStream(SolrDispatchFilter.java:612)
>  ~[solr-core-8.4.1.jar:8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan 
> - 2020-01-10 13:40:28]
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.consumeInputFully(SolrDispatchFilter.java:454)
>  ~[solr-core-8.4.1.jar:8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan 
> - 2020-01-10 13:40:28]
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:445)
>  ~[solr-core-8.4.1.jar:8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan 
> - 2020-01-10 13:40:28]
> {noformat}
>  
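The fix direction suggested above can be sketched as a simple guard: skip draining the request body once the response has been committed. This is a hedged sketch, not the actual SolrDispatchFilter code; Response below is a stand-in for the javax.servlet isCommitted() check.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ConsumeGuard {
    // Stand-in for the committed check on javax.servlet.ServletResponse.
    interface Response { boolean isCommitted(); }

    // Drain the remaining request body so the connection can be reused,
    // but only while it is still safe to touch the stream.
    static void consumeInputFully(InputStream body, Response resp) throws IOException {
        if (resp.isCommitted()) {
            return; // reading now can provoke "Committed before 100 Continues"
        }
        byte[] buf = new byte[8192];
        while (body.read(buf) != -1) { /* discard */ }
    }

    public static void main(String[] args) throws IOException {
        InputStream body = new ByteArrayInputStream("payload".getBytes());
        consumeInputFully(body, () -> true);   // error already sent: leave body alone
        System.out.println(body.available());  // 7
        consumeInputFully(body, () -> false);  // normal flow: drain it
        System.out.println(body.available());  // 0
    }
}
```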






[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-12 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035394#comment-17035394
 ] 

Erick Erickson commented on LUCENE-9220:


It'd be interesting to see if the tests pass.

On a quick look, it would be fairly tedious. All of the class names have been 
changed to start with a lowercase letter, so all of the references in the code 
would need to be changed, and there have been some interface changes that would 
need to be hunted down one by one. I don't know how much work that would be.

It's probably a good idea to upgrade, but not something I'll have time for any 
time soon.

> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
>
> When working with Snowball-based stemmers, I realized that Lucene is 
> currently [using a pre-compiled version of 
> Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html],
>  which appears to be from 12 years ago: 
> https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b
> Snowball has just released v2.0 in 10/2019 with many improvements, new 
> supported languages ( Arabic, Indonesian…) and new features ( stringdef 
> notation for Unicode codepoints…). Details of the changes could be found 
> here: https://github.com/snowballstem/snowball/blob/master/NEWS. I think 
> these changes of Snowball could give a promising positive impact on Lucene.
> I wonder when Lucene should upgrade Snowball to the latest version ( v2.0).






[jira] [Commented] (LUCENE-9211) Adding compression to BinaryDocValues storage

2020-02-12 Thread juan camilo rodriguez duran (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035224#comment-17035224
 ] 

juan camilo rodriguez duran commented on LUCENE-9211:
-

[~mharwood] the main idea of my PR is just to make the code cleaner and more 
extensible; it is not supposed to introduce any regression or improvement to 
the current format. (Spoiler alert: I'm working on an extension to improve 
sorted and sorted-set doc values lookup using BytesRef.)

> Adding compression to BinaryDocValues storage
> -
>
> Key: LUCENE-9211
> URL: https://issues.apache.org/jira/browse/LUCENE-9211
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Reporter: Mark Harwood
>Assignee: Mark Harwood
>Priority: Minor
>  Labels: pull-request-available
>
> While SortedSetDocValues can be used today to store identical values in a 
> compact form this is not effective for data with many unique values.
> The proposal is that BinaryDocValues should be stored in LZ4 compressed 
> blocks which can dramatically reduce disk storage costs in many cases. The 
> proposal is blocks of a number of documents are stored as a single compressed 
> blob along with metadata that records offsets where the original document 
> values can be found in the uncompressed content.
> There's a trade-off here between efficient compression (more docs-per-block = 
> better compression) and fast retrieval times (fewer docs-per-block = faster 
> read access for single values). A fixed block size of 32 docs seems like it 
> would be a reasonable compromise for most scenarios.
> A PR is up for review here [https://github.com/apache/lucene-solr/pull/1234]
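The block layout described above can be sketched as follows. This is a hedged illustration of the write side only: java.util.zip.Deflater stands in for LZ4 (which is not on the JDK classpath), and none of the names correspond to the actual codec classes in the PR.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;
import java.util.zip.Deflater;

public class BlockedBinaryValues {
    static final int BLOCK_SIZE = 32; // docs per compressed block, as proposed

    // Concatenate one block of per-document values, record where each doc
    // starts in the uncompressed blob, and compress the whole blob at once.
    static byte[] compressBlock(List<byte[]> values, int[] offsets) {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        int off = 0;
        for (int i = 0; i < values.size(); i++) {
            offsets[i] = off; // metadata stored alongside the compressed blob
            raw.write(values.get(i), 0, values.get(i).length);
            off += values.get(i).length;
        }
        Deflater d = new Deflater();
        d.setInput(raw.toByteArray());
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        return out.toByteArray();
    }
}
```

Reading doc i then means decompressing the whole 32-doc blob and slicing at offsets[i], which is exactly the retrieval-time cost the fixed block size tries to keep bounded.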






[jira] [Commented] (SOLR-6853) solr.ManagedSynonymFilterFactory/ManagedStopwordFilterFactory: URLEncoding - Not able to delete Synonyms/Stopwords with special characters

2020-02-12 Thread Markus Kalkbrenner (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035219#comment-17035219
 ] 

Markus Kalkbrenner commented on SOLR-6853:
--

This issue has been reported for the solarium Solr PHP client, too:
https://github.com/solariumphp/solarium/pull/742

> solr.ManagedSynonymFilterFactory/ManagedStopwordFilterFactory: URLEncoding - 
> Not able to delete Synonyms/Stopwords with special characters
> --
>
> Key: SOLR-6853
> URL: https://issues.apache.org/jira/browse/SOLR-6853
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 4.10.2
> Environment: Solr 4.10.2 running @ Win7
>Reporter: Tomasz Sulkowski
>Priority: Major
>  Labels: ManagedStopwordFilterFactory, 
> ManagedSynonymFilterFactory, REST, SOLR
> Attachments: SOLR-6853.patch
>
>
> Hi Guys,
> We're using the SOLR Rest API in order to manage synonyms and stopwords with 
> solr.Managed*FilterFactory.
> {_emphasis_}The same applies to stopwords. I am going to explain the synonym 
> case only from this point on.{_emphasis_}
> Let us consider the following _schema_analysis_synonyms_en.json managedMap: {
> "xxx#xxx":["xxx#xxx"],
> "xxx%xxx":["xxx%xxx"],
> "xxx/xxx":["xxx/xxx"],
> "xxx:xxx":["xxx:xxx"],
> "xxx;xxx":["xxx;xxx"],
> "xx ":["xx "]
> }
> I can add such synonym to keyword relations using REST API. The problem is 
> that I cannot remove/list them as 
> http://localhost:8983/solr/collection1/schema/analysis/synonyms/en/
>  where  is one of the map's key throws 404, or 500 (in case of 
> xxx%25xxx):
> java.lang.NullPointerException at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:367)
>  at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
>  at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>  at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) 
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) 
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>  at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) 
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>  at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>  at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) 
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>  at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>  at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>  at org.eclipse.jetty.server.Server.handle(Server.java:368) at 
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
>  at 
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
>  at 
> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
>  at 
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
>  at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at 
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at 
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
>  at 
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
>  at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>  at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>  at java.lang.Thread.run(Unknown Source)
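The 404s above are consistent with the map keys' reserved characters not surviving in the URL path. As an illustrative sketch (hypothetical client-side code, not part of the attached patch), keys like these would need to be percent-encoded before being placed in the path segment:

```python
from urllib.parse import quote

# Managed-synonym map keys containing URL-special characters.
keys = ["xxx#xxx", "xxx%xxx", "xxx/xxx", "xxx:xxx", "xxx;xxx"]

# quote() with safe="" percent-encodes every reserved character,
# including '/' and '%', so each key stays a single path segment.
encoded = {k: quote(k, safe="") for k in keys}

for k, e in encoded.items():
    print(f"{k} -> /schema/analysis/synonyms/en/{e}")
```

Note that the report says even the pre-encoded form xxx%25xxx returned a 500, so client-side encoding alone may not be sufficient on the affected versions; the server side apparently needs the fix in the attached patch.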



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-12 Thread Xin-Chun Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035213#comment-17035213
 ] 

Xin-Chun Zhang commented on LUCENE-9136:


The index format of IVFFlat is organized as follows, 
!1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png!

In general, the number of centroids lies within the interval [4 * sqrt(N), 16 * 
sqrt(N)], where N is the data set size. We use 4 * sqrt(N) as the actual number 
of centroids, denoted by c, to balance accuracy against computational load. The 
full data set is used for training if its size is no larger than 200,000; 
otherwise, 128 * c points are selected after shuffling, in order to accelerate 
training.

Experiments have been conducted on a large data set (sift1M, 
[http://corpus-texmex.irisa.fr/]) to verify the implementation of IVFFlat. The 
base data set (sift_base.fvecs) contains 1,000,000 vectors with 128 dimensions. 
And 10,000 queries (sift_query.fvecs) are used for recall testing. The recall 
ratio is computed as

Recall = (number of retrieved vectors present in the ground truth) / (number of 
queries * TopK), where number of queries = 10,000 and TopK = 100. The results 
are as follows (single thread, single segment):

 
||nprobe||avg. search time (ms)||recall (%)||
|8|16.3827|44.24|
|16|16.5834|58.04|
|32|19.2031|71.55|
|64|24.7065|83.30|
|128|34.9165|92.03|
|256|60.5844|97.18|
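Reading the table against the recall definition above, a quick sanity check (the hit count is back-derived from the table row, purely for illustration):

```python
def recall(hits_in_ground_truth: int, num_queries: int, top_k: int) -> float:
    # Fraction of returned neighbors that appear in the ground truth,
    # per the definition above.
    return hits_in_ground_truth / (num_queries * top_k)

# The nprobe=128 row: 92.03% recall at 10,000 queries and TopK=100
# corresponds to roughly 920,300 ground-truth hits.
print(round(100 * recall(920_300, 10_000, 100), 2))  # 92.03
```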

The test code can be found in 
[https://github.com/irvingzhang/lucene-solr/blob/jira/LUCENE-9136/lucene/core/src/test/org/apache/lucene/util/KnnIvfAndGraphPerformTester.java].

 

 

 

 

> Introduce IVFFlat to Lucene for ANN similarity search
> -
>
> Key: LUCENE-9136
> URL: https://issues.apache.org/jira/browse/LUCENE-9136
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Xin-Chun Zhang
>Priority: Major
> Attachments: 1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png
>
>
> Representation learning (RL) has been an established discipline in the 
> machine learning space for decades but it draws tremendous attention lately 
> with the emergence of deep learning. The central problem of RL is to 
> determine an optimal representation of the input data. By embedding the data 
> into a high dimensional vector, the vector retrieval (VR) method is then 
> applied to search the relevant items.
> With the rapid development of RL over the past few years, the technique has 
> been used extensively in industry from online advertising to computer vision 
> and speech recognition. There exist many open source implementations of VR 
> algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various 
> choices for potential users. However, the aforementioned implementations are 
> all written in C++ with no plan to support a Java interface, making them hard 
> to integrate into Java projects for those who are not familiar with C/C++ 
> [[https://github.com/facebookresearch/faiss/issues/105]]. 
> The algorithms for vector retrieval can be roughly classified into four 
> categories,
>  # Tree-based algorithms, such as KD-tree;
>  # Hashing methods, such as LSH (Locality-Sensitive Hashing);
>  # Product-quantization-based algorithms, such as IVFFlat;
>  # Graph-based algorithms, such as HNSW, SSG, NSG;
> where IVFFlat and HNSW are the most popular among all the VR algorithms.
> IVFFlat is better for high-precision applications such as face recognition, 
> while HNSW performs better in general scenarios including recommendation and 
> personalized advertisement. *The recall ratio of IVFFlat could be gradually 
> increased by adjusting the query parameter (nprobe), while it's hard for HNSW 
> to improve its accuracy*. In theory, IVFFlat could achieve 100% recall ratio. 
> Recently, the implementation of HNSW (Hierarchical Navigable Small World, 
> LUCENE-9004) for Lucene, has made great progress. The issue draws attention 
> of those who are interested in Lucene or hope to use HNSW with Solr/Lucene. 
> As an alternative for solving ANN similarity search problems, IVFFlat is also 
> very popular with many users and supporters. Compared with HNSW, IVFFlat has 
> smaller index size but requires k-means clustering, while HNSW is faster in 
> query (no training required) but requires extra storage for saving graphs 
> [indexing 1M 
> vectors|https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors].
>  Another advantage is that IVFFlat can be faster and more accurate when GPU 
> parallel computing is enabled (currently not supported in Java). Both algorithms 
> have their merits and demerits. Since HNSW is now under development, it may 
> be better to provide both implementations (HNSW && IVFFlat) for potential 
> users who are faced with very different scenarios and want more choices.
[jira] [Created] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-12 Thread Nguyen Minh Gia Huy (Jira)
Nguyen Minh Gia Huy created LUCENE-9220:
---

 Summary: Upgrade Snowball version to 2.0
 Key: LUCENE-9220
 URL: https://issues.apache.org/jira/browse/LUCENE-9220
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Nguyen Minh Gia Huy


When working with Snowball-based stemmers, I realized that Lucene is currently 
[using a pre-compiled version of 
Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html]
 that appears to be about 12 years old: 
https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b

Snowball released v2.0 in October 2019 with many improvements, newly 
supported languages (Arabic, Indonesian, ...) and new features (stringdef 
notation for Unicode codepoints, ...). Details of the changes can be found here: 
https://github.com/snowballstem/snowball/blob/master/NEWS. I think these 
changes could have a positive impact on Lucene.

I wonder when Lucene should upgrade Snowball to the latest version (v2.0).






[jira] [Updated] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-12 Thread Xin-Chun Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xin-Chun Zhang updated LUCENE-9136:
---
Attachment: 1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png

> Introduce IVFFlat to Lucene for ANN similarity search
> -
>
> Key: LUCENE-9136
> URL: https://issues.apache.org/jira/browse/LUCENE-9136
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Xin-Chun Zhang
>Priority: Major
> Attachments: 1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png
>
>
> Representation learning (RL) has been an established discipline in the 
> machine learning space for decades but it draws tremendous attention lately 
> with the emergence of deep learning. The central problem of RL is to 
> determine an optimal representation of the input data. By embedding the data 
> into a high dimensional vector, the vector retrieval (VR) method is then 
> applied to search the relevant items.
> With the rapid development of RL over the past few years, the technique has 
> been used extensively in industry from online advertising to computer vision 
> and speech recognition. There exist many open source implementations of VR 
> algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various 
> choices for potential users. However, the aforementioned implementations are 
> all written in C++ with no plan to support a Java interface, making them hard 
> to integrate into Java projects for those who are not familiar with C/C++ 
> [[https://github.com/facebookresearch/faiss/issues/105]]. 
> The algorithms for vector retrieval can be roughly classified into four 
> categories,
>  # Tree-based algorithms, such as KD-tree;
>  # Hashing methods, such as LSH (Locality-Sensitive Hashing);
>  # Product-quantization-based algorithms, such as IVFFlat;
>  # Graph-based algorithms, such as HNSW, SSG, NSG;
> where IVFFlat and HNSW are the most popular among all the VR algorithms.
> IVFFlat is better for high-precision applications such as face recognition, 
> while HNSW performs better in general scenarios including recommendation and 
> personalized advertisement. *The recall ratio of IVFFlat could be gradually 
> increased by adjusting the query parameter (nprobe), while it's hard for HNSW 
> to improve its accuracy*. In theory, IVFFlat could achieve 100% recall ratio. 
> Recently, the implementation of HNSW (Hierarchical Navigable Small World, 
> LUCENE-9004) for Lucene, has made great progress. The issue draws attention 
> of those who are interested in Lucene or hope to use HNSW with Solr/Lucene. 
> As an alternative for solving ANN similarity search problems, IVFFlat is also 
> very popular with many users and supporters. Compared with HNSW, IVFFlat has 
> smaller index size but requires k-means clustering, while HNSW is faster in 
> query (no training required) but requires extra storage for saving graphs 
> [indexing 1M 
> vectors|https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors].
>  Another advantage is that IVFFlat can be faster and more accurate when GPU 
> parallel computing is enabled (currently not supported in Java). Both algorithms 
> have their merits and demerits. Since HNSW is now under development, it may 
> be better to provide both implementations (HNSW && IVFFlat) for potential 
> users who are faced with very different scenarios and want more choices.






[jira] [Comment Edited] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-12 Thread Xin-Chun Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019507#comment-17019507
 ] 

Xin-Chun Zhang edited comment on LUCENE-9136 at 2/12/20 9:33 AM:
-

I worked on this issue for about three to four days, and it now works fine for 
searching.

My personal dev branch is available on GitHub: 
[https://github.com/irvingzhang/lucene-solr/tree/jira/LUCENE-9136]. The index 
format of IVFFlat (only one meta file, with suffix .ifi) is shown in the class 
Lucene90IvfFlatIndexFormat. In my implementation, the clustering process is 
optimized when the number of vectors is sufficiently large (e.g. > 200,000 per 
segment): a subset selected after shuffling is used for training, thereby saving 
time and memory. The insertion performance of IVFFlat is better because no 
extra work is required on insertion, while HNSW needs to maintain its graph. 
However, IVFFlat consumes more time in flushing because of the k-means 
clustering.

My test cases show that the query performance of IVFFlat is better than HNSW's, 
even though HNSW uses a cache for graphs while IVFFlat has no cache. And its 
recall is pretty high (avg time < 10 ms and recall > 96% over a set of 5 random 
vectors with 100 dimensions). My test class for IVFFlat is under the directory 
[https://github.com/irvingzhang/lucene-solr/blob/jira/LUCENE-9136/lucene/core/src/test/org/apache/lucene/util/ivfflat/|https://github.com/irvingzhang/lucene-solr/blob/jira/LUCENE-9136/lucene/core/src/test/org/apache/lucene/util/ivfflat/TestKnnIvfFlat.java].
 Performance comparison between IVFFlat and HNSW is in the class 
TestKnnGraphAndIvfFlat.

The work is still in its early stage. There must be some bugs that need to be 
fixed, and I would like to hear more comments. Everyone is welcome to 
participate in this issue.


was (Author: irvingzhang):
I worked on this issue for about three to four days. And it now works fine for 
searching.

My personal dev branch is available in github 
[https://github.com/irvingzhang/lucene-solr/tree/jira/LUCENE-9136]. The index 
format (only one meta file with suffix .ifi) of IVFFlat is shown in the class 
Lucene90IvfFlatIndexFormat. In my implementation, the clustering process was 
optimized when the number of vectors is sufficient large (e.g. > 10,000,000 per 
segment). A subset after shuffling is selected for training, thereby saving 
time and memory. The insertion performance of IVFFlat is better due to no extra 
executions on insertion while HNSW need to maintain the graph. However, IVFFlat 
consumes more time in flushing because of the k-means clustering.

My test cases show that the query performance of IVFFlat is better than HNSW, 
even if HNSW uses a cache for graphs while IVFFlat has no cache. And its recall 
is pretty high (avg time < 10ms and recall>96% over a set of 5 random 
vectors with 100 dimensions). My test class for IVFFlat is under the directory 
[https://github.com/irvingzhang/lucene-solr/blob/jira/LUCENE-9136/lucene/core/src/test/org/apache/lucene/util/ivfflat/|https://github.com/irvingzhang/lucene-solr/blob/jira/LUCENE-9136/lucene/core/src/test/org/apache/lucene/util/ivfflat/TestKnnIvfFlat.java].
 Performance comparison between IVFFlat and HNSW is in the class 
TestKnnGraphAndIvfFlat.

The work is still in its early stage. There must be some bugs that need to be 
fixed and and I would like to hear more comments. Everyone is welcomed to 
participate in this issue.

> Introduce IVFFlat to Lucene for ANN similarity search
> -
>
> Key: LUCENE-9136
> URL: https://issues.apache.org/jira/browse/LUCENE-9136
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Xin-Chun Zhang
>Priority: Major
>
> Representation learning (RL) has been an established discipline in the 
> machine learning space for decades but it draws tremendous attention lately 
> with the emergence of deep learning. The central problem of RL is to 
> determine an optimal representation of the input data. By embedding the data 
> into a high dimensional vector, the vector retrieval (VR) method is then 
> applied to search the relevant items.
> With the rapid development of RL over the past few years, the technique has 
> been used extensively in industry from online advertising to computer vision 
> and speech recognition. There exist many open source implementations of VR 
> algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various 
> choices for potential users. However, the aforementioned implementations are 
> all written in C++ with no plan to support a Java interface, making them hard 
> to integrate into Java projects for those who are not familiar with C/C++ 
> [[https://github.com/facebookresearch/faiss/issues/105]]. 
> The algorithms for vector retrieval can be roughly 

[jira] [Created] (LUCENE-9219) Port ECJ-based linter to gradle

2020-02-12 Thread Dawid Weiss (Jira)
Dawid Weiss created LUCENE-9219:
---

 Summary: Port ECJ-based linter to gradle
 Key: LUCENE-9219
 URL: https://issues.apache.org/jira/browse/LUCENE-9219
 Project: Lucene - Core
  Issue Type: Sub-task
Reporter: Dawid Weiss









[GitHub] [lucene-solr] dweiss commented on issue #1242: LUCENE-9201: Port documentation-lint task to Gradle build

2020-02-12 Thread GitBox
dweiss commented on issue #1242: LUCENE-9201: Port documentation-lint task to 
Gradle build
URL: https://github.com/apache/lucene-solr/pull/1242#issuecomment-585103020
 
 
   Ok, sure thing. I'll create a sub-task on this issue and maybe try to push 
the ecj linter forward so that it is there as an example to copy from. Many 
things in gradle are not so obvious (although they are fairly clear once you 
soak in the basic concepts). If you have doubts or questions about how the code 
in the patch works please don't hesitate to ask.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mocobeta closed pull request #1242: LUCENE-9201: Port documentation-lint task to Gradle build

2020-02-12 Thread GitBox
mocobeta closed pull request #1242: LUCENE-9201: Port documentation-lint task 
to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1242
 
 
   





[GitHub] [lucene-solr] mocobeta commented on issue #1242: LUCENE-9201: Port documentation-lint task to Gradle build

2020-02-12 Thread GitBox
mocobeta commented on issue #1242: LUCENE-9201: Port documentation-lint task to 
Gradle build
URL: https://github.com/apache/lucene-solr/pull/1242#issuecomment-585101615
 
 
   @dweiss thank you for your comments and the patch in Jira. 
   
   Let me close this PR for now and open separate ones (in a week or so), to 
narrow down the scope. 
   - source code linting by ECJ (unused imports check)
   - documentation linting by the python checkers: this may be further split 
into 
 - "missing javadoc check" (defined at each sub-project) and 
 - "broken links check" (defined at the root project to check inter-project 
links)





[GitHub] [lucene-solr] iverase commented on issue #1249: LUCENE-9217: Add validation to XYGeometries

2020-02-12 Thread GitBox
iverase commented on issue #1249: LUCENE-9217: Add validation to XYGeometries
URL: https://github.com/apache/lucene-solr/pull/1249#issuecomment-585091231
 
 
   I have opened #1252 that should supersede this one. 





[GitHub] [lucene-solr] iverase opened a new pull request #1252: LUCENE-9218: XYGeoemtries should expose values as floats

2020-02-12 Thread GitBox
iverase opened a new pull request #1252: LUCENE-9218: XYGeoemtries should 
expose values as floats
URL: https://github.com/apache/lucene-solr/pull/1252
 
 
   Boxing the values to doubles happens when creating Component2D objects.





[jira] [Created] (LUCENE-9218) XYGeometries should use floats instead of doubles

2020-02-12 Thread Ignacio Vera (Jira)
Ignacio Vera created LUCENE-9218:


 Summary: XYGeometries should use floats instead of doubles
 Key: LUCENE-9218
 URL: https://issues.apache.org/jira/browse/LUCENE-9218
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ignacio Vera


XYGeometries (XYPolygon, XYLine, XYRectangle & XYPoint) are a bit 
counter-intuitive: most of them are initialised using floats, yet those values 
are returned as doubles. In addition, XYRectangle seems to work on doubles 
throughout.

This issue proposes harmonising those classes to work only on floats. As these 
classes were only just moved to core and have not been released, it should be 
OK to change their interfaces.
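The float/double mismatch matters because a value stored as a 32-bit float and then widened to double generally does not compare equal to the double the caller originally supplied. A small illustration (Python, using struct to emulate a Java float field round-trip; not code from the issue):

```python
import struct

def as_float(x: float) -> float:
    """Round-trip a Python double through 32-bit float storage,
    mimicking a Java float field widened back to double."""
    return struct.unpack("f", struct.pack("f", x))[0]

x = 0.1          # the double a caller might pass in
y = as_float(x)  # what a float-backed getter returning double would yield
print(x == y)    # False: the widened float is 0.10000000149011612
```

Returning the values as floats avoids exposing this widening artifact to callers.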


