[jira] [Commented] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-15 Thread Xin-Chun Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037727#comment-17037727
 ] 

Xin-Chun Zhang commented on LUCENE-9136:


Hi, [~jtibshirani], thanks for your suggestions!

??"I wonder if this clustering-based approach could fit more closely in the 
current search framework. In the current prototype, we keep all the cluster 
information on-heap. We could instead try storing each cluster as its own 
'term' with a postings list. The kNN query would then be modelled as an 'OR' 
over these terms."??

In the previous implementation 
([https://github.com/irvingzhang/lucene-solr/commit/eb5f79ea7a705595821f73f80a0c5752061869b2]),
 the cluster information is split into two files, meta (.ifi) and data (.ifd), 
as shown in the following figure. Each cluster is stored with its postings list 
in the data file (.ifd) rather than kept on-heap. A major concern with this 
layout is the cost of reading cluster data, since those reads happen very 
frequently during kNN search. I will test and check the performance. 

!image-2020-02-16-15-05-02-451.png!
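
For illustration only (this is not the code from the commit linked above), the idea of treating each cluster as a 'term' with a postings list, and running the kNN query as an 'OR' over the nearest clusters, could be sketched in Java roughly as follows. The class name IvfFlatSketch, the in-memory arrays standing in for the on-disk postings, and the squared Euclidean distance are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy IVFFlat search. Hypothetical sketch, not Lucene API. */
final class IvfFlatSketch {
  final float[][] centroids; // one centroid per cluster (the "terms")
  final int[][] postings;    // postings[c] = doc ids assigned to cluster c
  final float[][] vectors;   // raw ("flat") vectors, indexed by doc id

  IvfFlatSketch(float[][] centroids, int[][] postings, float[][] vectors) {
    this.centroids = centroids;
    this.postings = postings;
    this.vectors = vectors;
  }

  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Returns up to k doc ids, nearest first, found in the nprobe closest clusters. */
  int[] search(float[] query, int k, int nprobe) {
    // 1. Rank clusters by the distance from the query to their centroid.
    Integer[] order = new Integer[centroids.length];
    for (int c = 0; c < order.length; c++) {
      order[c] = c;
    }
    Arrays.sort(order,
        Comparator.comparingDouble((Integer c) -> squaredDistance(query, centroids[c])));

    // 2. "OR" over the postings lists of the nprobe nearest clusters.
    List<Integer> candidates = new ArrayList<>();
    for (int i = 0; i < Math.min(nprobe, order.length); i++) {
      for (int doc : postings[order[i]]) {
        candidates.add(doc);
      }
    }

    // 3. Score the candidates exactly and keep the best k.
    candidates.sort(
        Comparator.comparingDouble((Integer d) -> squaredDistance(query, vectors[d])));
    return candidates.stream().limit(k).mapToInt(Integer::intValue).toArray();
  }

  public static void main(String[] args) {
    float[][] centroids = {{0f, 0f}, {10f, 0f}};
    int[][] postings = {{0, 1}, {2, 3}};
    float[][] vectors = {{1f, 0f}, {4.5f, 0f}, {9f, 0f}, {7f, 0f}};
    IvfFlatSketch index = new IvfFlatSketch(centroids, postings, vectors);
    float[] query = {5.5f, 0f};
    // The true nearest neighbour (doc 1) lives in the second-closest cluster.
    System.out.println(Arrays.toString(index.search(query, 1, 1))); // [3]
    System.out.println(Arrays.toString(index.search(query, 1, 2))); // [1]
  }
}
```

In this toy data a larger nprobe recovers the neighbour that a single probe misses. When nprobe equals the number of clusters every vector is scored, so the result is exact, at the cost of a brute-force scan.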

??"Because of this concern, it could be nice to include benchmarks for index 
time (in addition to QPS)..."??

Many thanks! I will check the links you mentioned and consider optimizing the 
clustering cost. In addition, more benchmarks will be added soon.

> Introduce IVFFlat to Lucene for ANN similarity search
> -
>
> Key: LUCENE-9136
> URL: https://issues.apache.org/jira/browse/LUCENE-9136
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Xin-Chun Zhang
>Priority: Major
> Attachments: 1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png, 
> image-2020-02-16-15-05-02-451.png
>
>
> Representation learning (RL) has been an established discipline in the 
> machine learning space for decades, but it has drawn tremendous attention 
> lately with the emergence of deep learning. The central problem of RL is to 
> determine an optimal representation of the input data. Once the data is 
> embedded into high-dimensional vectors, a vector retrieval (VR) method is 
> applied to search for the relevant items.
> With the rapid development of RL over the past few years, the technique has 
> been used extensively in industry from online advertising to computer vision 
> and speech recognition. There exist many open source implementations of VR 
> algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various 
> choices for potential users. However, the aforementioned implementations are 
> all written in C++ with no plan to support a Java interface, making them hard 
> to integrate into Java projects or to use for those who are not familiar with 
> C/C++ [[https://github.com/facebookresearch/faiss/issues/105]]. 
> The algorithms for vector retrieval can be roughly classified into four 
> categories:
>  # Tree-based algorithms, such as KD-tree;
>  # Hashing methods, such as LSH (Locality-Sensitive Hashing);
>  # Quantization-based algorithms, such as IVFFlat;
>  # Graph-based algorithms, such as HNSW, SSG, NSG;
> where IVFFlat and HNSW are the most popular ones among all the VR algorithms.
> IVFFlat is better for high-precision applications such as face recognition, 
> while HNSW performs better in general scenarios including recommendation and 
> personalized advertisement. *The recall ratio of IVFFlat could be gradually 
> increased by adjusting the query parameter (nprobe), while it's hard for HNSW 
> to improve its accuracy*. In theory, IVFFlat could achieve 100% recall ratio. 
> Recently, the implementation of HNSW (Hierarchical Navigable Small World, 
> LUCENE-9004) for Lucene has made great progress. The issue has drawn the 
> attention of those who are interested in Lucene or hope to use HNSW with 
> Solr/Lucene. As an alternative for solving ANN similarity search problems, 
> IVFFlat is also very popular with many users and supporters. Compared with 
> HNSW, IVFFlat has a smaller index size but requires k-means clustering, while 
> HNSW is faster at query time (no training required) but requires extra 
> storage for saving graphs 
> [indexing 1M 
> vectors|[https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors]].
>  Another advantage is that IVFFlat can be faster and more accurate when GPU 
> parallel computing is enabled (currently not supported in Java). Both 
> algorithms have their merits and demerits. Since HNSW is now under 
> development, it may be better to provide both implementations (HNSW and 
> IVFFlat) for potential users who face very different scenarios and want more 
> choices.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Updated] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-15 Thread Xin-Chun Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xin-Chun Zhang updated LUCENE-9136:
---
Attachment: image-2020-02-16-15-05-02-451.png

> Introduce IVFFlat to Lucene for ANN similarity search
> -
>
> Key: LUCENE-9136
> URL: https://issues.apache.org/jira/browse/LUCENE-9136
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Xin-Chun Zhang
>Priority: Major
> Attachments: 1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png, 
> image-2020-02-16-15-05-02-451.png



[jira] [Updated] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-15 Thread Xin-Chun Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xin-Chun Zhang updated LUCENE-9136:
---
Attachment: (was: image-2020-02-16-14-36-54-478.png)

> Introduce IVFFlat to Lucene for ANN similarity search
> -
>
> Key: LUCENE-9136
> URL: https://issues.apache.org/jira/browse/LUCENE-9136
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Xin-Chun Zhang
>Priority: Major
> Attachments: 1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png, 
> image-2020-02-16-15-05-02-451.png



[jira] [Updated] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-15 Thread Xin-Chun Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xin-Chun Zhang updated LUCENE-9136:
---
Attachment: image-2020-02-16-14-36-54-478.png

> Introduce IVFFlat to Lucene for ANN similarity search
> -
>
> Key: LUCENE-9136
> URL: https://issues.apache.org/jira/browse/LUCENE-9136
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Xin-Chun Zhang
>Priority: Major
> Attachments: 1581409981369-9dea4099-4e41-4431-8f45-a3bb8cac46c0.png, 
> image-2020-02-16-14-36-54-478.png



[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037713#comment-17037713
 ] 

ASF subversion and git services commented on LUCENE-9220:
-

Commit 8ced733fc3a2b7db61b6d96e5399ae2a2918d3ba in lucene-solr's branch 
refs/heads/jira/LUCENE-9220 from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8ced733 ]

LUCENE-9220: teach gradle the new snowball-generated boilerplate, too


> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch, 
> snowball_53739a805cfa6c.patch, snowball_53739a805cfa6c.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When working with Snowball-based stemmers, I realized that Lucene is 
> currently [using a pre-compiled version of 
> Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html]
>  that seems to be from 12 years ago: 
> https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b
> Snowball just released v2.0 in 10/2019 with many improvements, newly 
> supported languages (Arabic, Indonesian…) and new features (stringdef 
> notation for Unicode codepoints…). Details of the changes can be found 
> here: https://github.com/snowballstem/snowball/blob/master/NEWS. I think 
> these changes in Snowball could have a positive impact on Lucene.
> I wonder when Lucene should upgrade Snowball to the latest version (v2.0).






[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037711#comment-17037711
 ] 

Robert Muir commented on LUCENE-9220:
-

Stemmers, stopwords, and tests are currently generated by the script. Since it 
requires cloning 3 other repos (snowball, snowball-data, snowball-website), I 
will aim to move the script's logic to be executed under gradle next. It will 
not be portable, nor will it be written in the groovy language.

> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch, 
> snowball_53739a805cfa6c.patch, snowball_53739a805cfa6c.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h



[jira] [Updated] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9220:

Attachment: snowball_53739a805cfa6c.patch

> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch, 
> snowball_53739a805cfa6c.patch, snowball_53739a805cfa6c.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h



[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037709#comment-17037709
 ] 

ASF subversion and git services commented on LUCENE-9220:
-

Commit 03aebecf98acab31c608dbcdbf8b5c038c3c02f7 in lucene-solr's branch 
refs/heads/jira/LUCENE-9220 from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=03aebec ]

LUCENE-9220: regenerate all snowball stopfiles


> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch, 
> snowball_53739a805cfa6c.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h



[jira] [Updated] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9220:

Attachment: snowball_53739a805cfa6c.patch

> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch, 
> snowball_53739a805cfa6c.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h



[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037700#comment-17037700
 ] 

ASF subversion and git services commented on LUCENE-9220:
-

Commit d9a285c857e632a32f7762c49d2ab8363ae8c876 in lucene-solr's branch 
refs/heads/jira/LUCENE-9220 from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d9a285c ]

LUCENE-9220: automate generation of (bsd license-only, sampled) test data


> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h



[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037687#comment-17037687
 ] 

Robert Muir commented on LUCENE-9220:
-

I'm working my way through automating the process (test data right now). I 
would like a better situation, e.g. maybe we have commit hash + patch + regen 
script in our repo tied into a gradle regenerate task. It may be limited to 
working only on Linux or similar, since we have to patch sources, invoke a 
makefile, compile C code, etc.

> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h



[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2020-02-15 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037681#comment-17037681
 ] 

Uwe Schindler commented on LUCENE-8987:
---

It looks like this does not work in markdown files:
{code}
./content/pages/solr/resources.md:* [Latest Release](/solr/{{ LUCENE_LATEST_RELEASE | replace(".", "_") }}/index.html)
{code}
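
For reference, the substitution this template line is meant to perform can be mimicked in plain Java. The sketch below only illustrates the intended result; the class and method names are hypothetical, and the version string 8.4.1 is an assumed sample value, not one taken from the site configuration.

```java
/** Hypothetical helper that mirrors the intent of replace(".", "_") in the template. */
final class LatestReleaseLink {
  static String solrLink(String latestRelease) {
    // String.replace does a literal (non-regex) replacement of every "." with "_".
    return "/solr/" + latestRelease.replace(".", "_") + "/index.html";
  }

  public static void main(String[] args) {
    // 8.4.1 is an assumed sample release number.
    System.out.println(solrLink("8.4.1")); // /solr/8_4_1/index.html
  }
}
```

If the Markdown page is rendered without template expansion, the link would instead keep the raw curly-brace expression, which matches the symptom reported here.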

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>
> INFRA just enabled [a new way of configuring website 
> build|https://s.apache.org/asfyaml] from a git branch, [see dev list 
> email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
>  It allows for automatic builds of both staging and production site, much 
> like the old CMS. We can choose to auto publish the html content of an 
> {{output/}} folder, or to have a bot build the site using 
> [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for 
> [http://lucene.apache.org|http://lucene.apache.org/] by creating a new 
> git repo {{lucene-site}}, copying over the site from svn, seeing if it can be 
> "Pelicanized" easily and then testing staging. Benefits are that more people 
> will be able to edit the web site and we can take PRs from the public (with 
> GitHub preview of pages).
> Non-goals:
>  * Create a new web site or a new graphic design
>  * Change from Markdown to Asciidoc






[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2020-02-15 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037679#comment-17037679
 ] 

Uwe Schindler commented on LUCENE-8987:
---

Small issue: https://lucene.apache.org/solr/resources.html#documentation
The link to the Solr Javadocs is missing the version number!

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>
> INFRA just enabled [a new way of configuring website 
> build|https://s.apache.org/asfyaml] from a git branch, [see dev list 
> email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
>  It allows for automatic builds of both staging and production site, much 
> like the old CMS. We can choose to auto publish the html content of an 
> {{output/}} folder, or to have a bot build the site using 
> [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for 
> [http://lucene.apache.org|http://lucene.apache.org/] by creating a new 
> git repo {{lucene-site}}, copying over the site from svn, seeing if it can be 
> "Pelicanized" easily and then test staging. Benefits are that more people 
> will be able to edit the web site and we can take PRs from the public (with 
> GitHub preview of pages).
> Non-goals:
>  * Create a new web site or a new graphic design
>  * Change from Markdown to Asciidoc






[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2020-02-15 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037678#comment-17037678
 ] 

Uwe Schindler commented on LUCENE-8987:
---

The new site is now live:
- https://lucene.apache.org/ (we should maybe just fix certificate warnings 
because of unencrypted content still there)
- The Subversion Git CMS tree was cleaned up (added file "MOVED_TO_GIT")
- The Subversion folder with the Javadocs and Refguide was kept alive: 
https://svn.apache.org/repos/infra/websites/production/lucene/content

To publish Javadocs and Refguide nothing has changed, only extpath.txt is gone. 
Just commit to subversion as you did before.

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>
> INFRA just enabled [a new way of configuring website 
> build|https://s.apache.org/asfyaml] from a git branch, [see dev list 
> email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
>  It allows for automatic builds of both staging and production site, much 
> like the old CMS. We can choose to auto publish the html content of an 
> {{output/}} folder, or to have a bot build the site using 
> [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for 
> [http://lucene.apache.org|http://lucene.apache.org/] by creating a new 
> git repo {{lucene-site}}, copying over the site from svn, seeing if it can be 
> "Pelicanized" easily and then test staging. Benefits are that more people 
> will be able to edit the web site and we can take PRs from the public (with 
> GitHub preview of pages).
> Non-goals:
>  * Create a new web site or a new graphic design
>  * Change from Markdown to Asciidoc






[jira] [Resolved] (LUCENE-9034) Officially publish the new site

2020-02-15 Thread Uwe Schindler (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-9034.
---
Resolution: Fixed

The new site is now live:
- https://lucene.apache.org/ (we should maybe just fix certificate warnings 
because of unencrypted content still there)
- The Subversion Git CMS tree was cleaned up (added file "MOVED_TO_GIT")
- The Subversion folder with the Javadocs and Refguide was kept alive: 
https://svn.apache.org/repos/infra/websites/production/lucene/content

To publish Javadocs and Refguide nothing has changed, only extpath.txt is gone. 
Just commit to subversion as you did before.

> Officially publish the new site
> ---
>
> Key: LUCENE-9034
> URL: https://issues.apache.org/jira/browse/LUCENE-9034
> Project: Lucene - Core
>  Issue Type: Sub-task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Publishing the web site means creating a publish branch and adding the right 
> magic instructions to {{.asf.yml}} etc. This will then publish the new site 
> and disable old CMS.
> Before we do that we should
>  # Make sure all docs and release tools are updated for new site publishing 
> instructions
>  # Create a PR with latest changes in old CMS site since the export. This 
> will be the changes done during 8.3.0 release and possibly some news entries 
> related to security issues etc.
> After publishing we should ask INFRA to make old site svn read-only (and 
> perhaps do a commit that replaces svn content with a README.txt), so it is 
> obvious for everyone that we have migrated.






[GitHub] [lucene-solr] rmuir opened a new pull request #1262: LUCENE-9220: regenerate all stemmers from snowball 2.0

2020-02-15 Thread GitBox
rmuir opened a new pull request #1262: LUCENE-9220: regenerate all stemmers 
from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262
 
 
   Instead of patching them after-the-fact (both manually and
   automatically over the years) we patch the generator.
   
   This is easier to maintain than patches/changes against generated code.
   See LUCENE-9220 for more information.
   
   There is a remaining nocommit, test data. Also need to hook in and test
   the new languages that are added here.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037669#comment-17037669
 ] 

ASF subversion and git services commented on LUCENE-9220:
-

Commit fc229b170197e37ffcbdb330e7657939979a7def in lucene-solr's branch 
refs/heads/jira/LUCENE-9220 from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fc229b1 ]

LUCENE-9220 regenerate all stemmers from snowball 2.0

Instead of patching them after-the-fact (both manually and
automatically over the years) we patch the generator.

This is easier to maintain than patches/changes against generated code.
See LUCENE-9220 for more information.

There is a remaining nocommit, test data. Also need to hook in and test
the new languages that are added here.


> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch
>
>
> When working with Snowball-based stemmers, I realized that Lucene is 
> currently [using a pre-compiled version of 
> Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html],
>  that seems to be from 12 years ago: 
> https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b
> Snowball has just released v2.0 in 10/2019 with many improvements, new 
> supported languages ( Arabic, Indonesian…) and new features ( stringdef 
> notation for Unicode codepoints…). Details of the changes could be found 
> here: https://github.com/snowballstem/snowball/blob/master/NEWS. I think 
> these changes of Snowball could give a promising positive impact on Lucene.
> I wonder when Lucene should upgrade Snowball to the latest version ( v2.0).






[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037670#comment-17037670
 ] 

ASF subversion and git services commented on LUCENE-9220:
-

Commit fc229b170197e37ffcbdb330e7657939979a7def in lucene-solr's branch 
refs/heads/jira/LUCENE-9220 from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fc229b1 ]

LUCENE-9220 regenerate all stemmers from snowball 2.0

Instead of patching them after-the-fact (both manually and
automatically over the years) we patch the generator.

This is easier to maintain than patches/changes against generated code.
See LUCENE-9220 for more information.

There is a remaining nocommit, test data. Also need to hook in and test
the new languages that are added here.


> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch
>
>
> When working with Snowball-based stemmers, I realized that Lucene is 
> currently [using a pre-compiled version of 
> Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html],
>  that seems to be from 12 years ago: 
> https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b
> Snowball has just released v2.0 in 10/2019 with many improvements, new 
> supported languages ( Arabic, Indonesian…) and new features ( stringdef 
> notation for Unicode codepoints…). Details of the changes could be found 
> here: https://github.com/snowballstem/snowball/blob/master/NEWS. I think 
> these changes of Snowball could give a promising positive impact on Lucene.
> I wonder when Lucene should upgrade Snowball to the latest version ( v2.0).






[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037666#comment-17037666
 ] 

Robert Muir commented on LUCENE-9220:
-

I've uploaded a patch of that same commit against 53739a805cfa6c of snowball. 
This way we can refer to it in documentation for now from this issue.

I will make a lucene PR for the lucene-side changes here.

> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch
>
>
> When working with Snowball-based stemmers, I realized that Lucene is 
> currently [using a pre-compiled version of 
> Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html],
>  that seems to be from 12 years ago: 
> https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b
> Snowball has just released v2.0 in 10/2019 with many improvements, new 
> supported languages ( Arabic, Indonesian…) and new features ( stringdef 
> notation for Unicode codepoints…). Details of the changes could be found 
> here: https://github.com/snowballstem/snowball/blob/master/NEWS. I think 
> these changes of Snowball could give a promising positive impact on Lucene.
> I wonder when Lucene should upgrade Snowball to the latest version ( v2.0).






[jira] [Updated] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9220:

Attachment: snowball_53739a805cfa6c.patch

> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
> Attachments: snowball_53739a805cfa6c.patch
>
>
> When working with Snowball-based stemmers, I realized that Lucene is 
> currently [using a pre-compiled version of 
> Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html],
>  that seems to be from 12 years ago: 
> https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b
> Snowball has just released v2.0 in 10/2019 with many improvements, new 
> supported languages ( Arabic, Indonesian…) and new features ( stringdef 
> notation for Unicode codepoints…). Details of the changes could be found 
> here: https://github.com/snowballstem/snowball/blob/master/NEWS. I think 
> these changes of Snowball could give a promising positive impact on Lucene.
> I wonder when Lucene should upgrade Snowball to the latest version ( v2.0).






[jira] [Commented] (LUCENE-9220) Upgrade Snowball version to 2.0

2020-02-15 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037665#comment-17037665
 ] 

Robert Muir commented on LUCENE-9220:
-

Of course tests won't pass! Otherwise this thing gets massively slower because 
it won't have our fixes to unnecessary reflection, string creation, etc.

Also lucene has armenian and estonian, neither of which are currently in the 
snowball repo (one is on the website though, and the other has a PR). So we 
have to generate and enable stemmers for those languages. We also support 
stemmers that are disabled by default (KP, german2, lovins), so we have to 
enable and generate those too.

Finally there is the mixed tab/space indentation, the lack of license headers, 
and the lack of javadocs; it all adds up to make 
it quite the pain in the ass.

I think instead of patching *generated* code we should patch snowball itself 
and try to send the fixes to them upstream. It seems reasonable they would want 
consistent whitespace, docs, licensing, better performance, etc.

For example, currently methodhandle patching will fail because the generated 
structure has changed, but it's a one-liner to fix this in their C-code 
generator, and easier to maintain that way, even as a patch.

I have made such changes here: 
https://github.com/rmuir/snowball/commit/2e1433394ef02ee248127c8e3485d9cbc395d577


> Upgrade Snowball version to 2.0
> ---
>
> Key: LUCENE-9220
> URL: https://issues.apache.org/jira/browse/LUCENE-9220
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Nguyen Minh Gia Huy
>Priority: Major
>
> When working with Snowball-based stemmers, I realized that Lucene is 
> currently [using a pre-compiled version of 
> Snowball|https://lucene.apache.org/core/8_4_1/analyzers-common/org/apache/lucene/analysis/snowball/package-summary.html],
>  that seems to be from 12 years ago: 
> https://github.com/snowballstem/snowball/tree/e103b5c257383ee94a96e7fc58cab3c567bf079b
> Snowball has just released v2.0 in 10/2019 with many improvements, new 
> supported languages ( Arabic, Indonesian…) and new features ( stringdef 
> notation for Unicode codepoints…). Details of the changes could be found 
> here: https://github.com/snowballstem/snowball/blob/master/NEWS. I think 
> these changes of Snowball could give a promising positive impact on Lucene.
> I wonder when Lucene should upgrade Snowball to the latest version ( v2.0).






[GitHub] [lucene-solr] janhoy commented on issue #365: [SOLR-12243] span query generalization + query parser tests

2020-02-15 Thread GitBox
janhoy commented on issue #365: [SOLR-12243] span query generalization + query 
parser tests
URL: https://github.com/apache/lucene-solr/pull/365#issuecomment-586652193
 
 
   Merged manually





[GitHub] [lucene-solr] janhoy closed pull request #365: [SOLR-12243] span query generalization + query parser tests

2020-02-15 Thread GitBox
janhoy closed pull request #365: [SOLR-12243] span query generalization + query 
parser tests
URL: https://github.com/apache/lucene-solr/pull/365
 
 
   





[GitHub] [lucene-solr] janhoy commented on issue #639: Solve the problem of highlighting Chinese inaccurately.

2020-02-15 Thread GitBox
janhoy commented on issue #639: Solve the problem of highlighting Chinese 
inaccurately.
URL: https://github.com/apache/lucene-solr/pull/639#issuecomment-586652109
 
 
   Thanks for contributing. You may want to discuss your problem on the mailing 
list and confirm it is a bug, and then invite developers to take a look at your 
PR.





[GitHub] [lucene-solr] janhoy commented on issue #807: Remove solr.jetty.https.port when SSL is not used

2020-02-15 Thread GitBox
janhoy commented on issue #807: Remove solr.jetty.https.port when SSL is not 
used
URL: https://github.com/apache/lucene-solr/pull/807#issuecomment-586651472
 
 
   This looks like a bug. Please create a corresponding JIRA issue and read the 
checklist above. There should be a line in solr/CHANGES.txt as well for this 
change.





[jira] [Commented] (LUCENE-9034) Officially publish the new site

2020-02-15 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037664#comment-17037664
 ] 

Uwe Schindler commented on LUCENE-9034:
---

Opened: INFRA-19859

> Officially publish the new site
> ---
>
> Key: LUCENE-9034
> URL: https://issues.apache.org/jira/browse/LUCENE-9034
> Project: Lucene - Core
>  Issue Type: Sub-task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Publishing the web site means creating a publish branch and adding the right 
> magic instructions to {{.asf.yml}} etc. This will then publish the new site 
> and disable old CMS.
> Before we do that we should
>  # Make sure all docs and release tools are updated for new site publishing 
> instructions
>  # Create a PR with latest changes in old CMS site since the export. This 
> will be the changes done during 8.3.0 release and possibly some news entries 
> related to security issues etc.
> After publishing we should ask INFRA to make old site svn read-only (and 
> perhaps do a commit that replaces svn content with a README.txt), so it is 
> obvious for everyone that we have migrated.






[GitHub] [lucene-solr] janhoy commented on issue #908: Change the file format of README files from README.txt to README.md a…

2020-02-15 Thread GitBox
janhoy commented on issue #908: Change the file format of README files from 
README.txt to README.md a…
URL: https://github.com/apache/lucene-solr/pull/908#issuecomment-586651025
 
 
   @pinkeshsharma Have you seen my review feedback? Please make adjustments so 
we can merge this.





[GitHub] [lucene-solr] janhoy closed pull request #873: Rename README.txt to README.md

2020-02-15 Thread GitBox
janhoy closed pull request #873: Rename README.txt to README.md
URL: https://github.com/apache/lucene-solr/pull/873
 
 
   





[GitHub] [lucene-solr] janhoy commented on issue #873: Rename README.txt to README.md

2020-02-15 Thread GitBox
janhoy commented on issue #873: Rename README.txt to README.md
URL: https://github.com/apache/lucene-solr/pull/873#issuecomment-586650863
 
 
   This is a duplicate of #908 which also aims to change formatting. Closing





[GitHub] [lucene-solr] janhoy closed pull request #988: Update README.md

2020-02-15 Thread GitBox
janhoy closed pull request #988: Update README.md
URL: https://github.com/apache/lucene-solr/pull/988
 
 
   





[GitHub] [lucene-solr] janhoy commented on issue #988: Update README.md

2020-02-15 Thread GitBox
janhoy commented on issue #988: Update README.md
URL: https://github.com/apache/lucene-solr/pull/988#issuecomment-586650685
 
 
   Closing this. Please contribute instead to the cwiki page already linked to.





[jira] [Commented] (LUCENE-9034) Officially publish the new site

2020-02-15 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037658#comment-17037658
 ] 

Uwe Schindler commented on LUCENE-9034:
---

OK, sorry for the delay. I will contact infra soon.

Uwe

> Officially publish the new site
> ---
>
> Key: LUCENE-9034
> URL: https://issues.apache.org/jira/browse/LUCENE-9034
> Project: Lucene - Core
>  Issue Type: Sub-task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Publishing the web site means creating a publish branch and adding the right 
> magic instructions to {{.asf.yml}} etc. This will then publish the new site 
> and disable old CMS.
> Before we do that we should
>  # Make sure all docs and release tools are updated for new site publishing 
> instructions
>  # Create a PR with latest changes in old CMS site since the export. This 
> will be the changes done during 8.3.0 release and possibly some news entries 
> related to security issues etc.
> After publishing we should ask INFRA to make old site svn read-only (and 
> perhaps do a commit that replaces svn content with a README.txt), so it is 
> obvious for everyone that we have migrated.






[GitHub] [lucene-solr] janhoy commented on issue #1005: LUCENE-9042: Refactor TopGroups.merge tests

2020-02-15 Thread GitBox
janhoy commented on issue #1005: LUCENE-9042: Refactor TopGroups.merge tests
URL: https://github.com/apache/lucene-solr/pull/1005#issuecomment-586645396
 
 
   Linking with LUCENE-9042





[GitHub] [lucene-solr] janhoy merged pull request #1090: Update README.txt for analysis-extras

2020-02-15 Thread GitBox
janhoy merged pull request #1090: Update README.txt for analysis-extras
URL: https://github.com/apache/lucene-solr/pull/1090
 
 
   





[GitHub] [lucene-solr] janhoy commented on issue #1090: Update README.txt for analysis-extras

2020-02-15 Thread GitBox
janhoy commented on issue #1090: Update README.txt for analysis-extras
URL: https://github.com/apache/lucene-solr/pull/1090#issuecomment-586645053
 
 
   Thanks for the contribution





[GitHub] [lucene-solr] janhoy commented on issue #1143: HdfsDirectory support createTempOutput

2020-02-15 Thread GitBox
janhoy commented on issue #1143: HdfsDirectory support createTempOutput
URL: https://github.com/apache/lucene-solr/pull/1143#issuecomment-586644825
 
 
   @kaynewu Have you opened a JIRA issue as well for this?





[GitHub] [lucene-solr] janhoy commented on issue #1036: SOLR-13967: Update query.css to make query form sticky on the top while scrolling page

2020-02-15 Thread GitBox
janhoy commented on issue #1036: SOLR-13967: Update query.css to make query 
form sticky on the top while scrolling page
URL: https://github.com/apache/lucene-solr/pull/1036#issuecomment-586644660
 
 
   @shuson Why is this PR still open? The JIRA issue is closed as implemented 
but I cannot see that anything has been committed?





[GitHub] [lucene-solr] janhoy closed pull request #932: SOLR-13829: Stop converting continuous numbers to Longs in Streaming Expressions

2020-02-15 Thread GitBox
janhoy closed pull request #932: SOLR-13829: Stop converting continuous numbers 
to Longs in Streaming Expressions
URL: https://github.com/apache/lucene-solr/pull/932
 
 
   





[GitHub] [lucene-solr] janhoy closed pull request #529: LUCENE-8617: Use SimpleFSDirectory on non-default FS

2020-02-15 Thread GitBox
janhoy closed pull request #529: LUCENE-8617: Use SimpleFSDirectory on 
non-default FS
URL: https://github.com/apache/lucene-solr/pull/529
 
 
   





[GitHub] [lucene-solr] janhoy closed pull request #500: LUCENE-8517: do not wrap FixedShingleFilter with conditional in TestR…

2020-02-15 Thread GitBox
janhoy closed pull request #500: LUCENE-8517: do not wrap FixedShingleFilter 
with conditional in TestR…
URL: https://github.com/apache/lucene-solr/pull/500
 
 
   





[GitHub] [lucene-solr] janhoy closed pull request #1016: SOLR-13662: Test fix & Reference guide for package manager

2020-02-15 Thread GitBox
janhoy closed pull request #1016: SOLR-13662: Test fix & Reference guide for 
package manager
URL: https://github.com/apache/lucene-solr/pull/1016
 
 
   





[GitHub] [lucene-solr] gus-asf commented on issue #976: SOLR-13749: Implement support for joining across collections with multiple shards

2020-02-15 Thread GitBox
gus-asf commented on issue #976: SOLR-13749: Implement support for joining 
across collections with multiple shards
URL: https://github.com/apache/lucene-solr/pull/976#issuecomment-586640936
 
 
   merged manually





[GitHub] [lucene-solr] gus-asf closed pull request #976: SOLR-13749: Implement support for joining across collections with multiple shards

2020-02-15 Thread GitBox
gus-asf closed pull request #976: SOLR-13749: Implement support for joining 
across collections with multiple shards
URL: https://github.com/apache/lucene-solr/pull/976
 
 
   





[jira] [Commented] (SOLR-13971) Velocity custom template RCE vulnerability

2020-02-15 Thread Jan Høydahl (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037589#comment-17037589
 ] 

Jan Høydahl commented on SOLR-13971:


Also add 7.7.3 as fixVersion in this JIRA?

> Velocity custom template RCE vulnerability
> --
>
> Key: SOLR-13971
> URL: https://issues.apache.org/jira/browse/SOLR-13971
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.0, 5.5.5, 6.0, 6.6.5, 7.0, 7.7, 8.0, 8.3
> Reporter: Ishan Chattopadhyaya
> Assignee: Ishan Chattopadhyaya
> Priority: Blocker
> Fix For: 8.4
>
> Attachments: SOLR-13971.patch
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> We need to disable this. There is a zero-day attack in the wild. 41 stars on 
> this GitHub project: 
> # https://github.com/jas502n/solr_rce
> # https://gist.github.com/s00py/a1ba36a3689fa13759ff910e179fc133
> We need to disable this in a way that cannot be re-enabled using the Config 
> API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[GitHub] [lucene-solr] dweiss closed pull request #543: LUCENE-8474: final cleanups and removal of RAMDirectory

2020-02-15 Thread GitBox
dweiss closed pull request #543: LUCENE-8474: final cleanups and removal of 
RAMDirectory
URL: https://github.com/apache/lucene-solr/pull/543
 
 
   





[GitHub] [lucene-solr] dweiss closed pull request #829: SOLR-13452: Update the lucene-solr build from Ivy+Ant+Maven (shadow build) to Gradle.

2020-02-15 Thread GitBox
dweiss closed pull request #829: SOLR-13452: Update the lucene-solr build from 
Ivy+Ant+Maven (shadow build) to Gradle.
URL: https://github.com/apache/lucene-solr/pull/829
 
 
   





[GitHub] [lucene-solr] dweiss closed pull request #536: LUCENE-8643: Decrease test complexity in the default case. Exclude simple text codec.

2020-02-15 Thread GitBox
dweiss closed pull request #536: LUCENE-8643: Decrease test complexity in the 
default case. Exclude simple text codec.
URL: https://github.com/apache/lucene-solr/pull/536
 
 
   





[GitHub] [lucene-solr] dweiss closed pull request #533: LUCENE-8636: TestPointQueries and long execution times

2020-02-15 Thread GitBox
dweiss closed pull request #533: LUCENE-8636: TestPointQueries and long 
execution times
URL: https://github.com/apache/lucene-solr/pull/533
 
 
   





[GitHub] [lucene-solr] janhoy closed pull request #124: fix small issue in solr shell script

2020-02-15 Thread GitBox
janhoy closed pull request #124: fix small issue in solr shell script
URL: https://github.com/apache/lucene-solr/pull/124
 
 
   





[jira] [Commented] (SOLR-14258) DocList (DocSlice) should not implement DocSet

2020-02-15 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037523#comment-17037523
 ] 

David Smiley commented on SOLR-14258:
-

I’m okay with master only, but I’d like your opinion: which change carries the 
most compatibility risk here?

> DocList (DocSlice) should not implement DocSet
> --
>
> Key: SOLR-14258
> URL: https://issues.apache.org/jira/browse/SOLR-14258
> Project: Solr
> Issue Type: Task
> Security Level: Public (Default Security Level. Issues are Public)
> Components: search
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> DocList is an internal interface used to hold the documents we'll ultimately 
> return from search.  It has one implementation -- DocSlice.  It implements 
> DocSet but I think that was a mistake.  Basically nowhere does Solr depend 
> on the fact that a DocList is a DocSet today, and keeping it this way 
> complicates maintenance on DocSet's abstraction.






[jira] [Commented] (SOLR-14258) DocList (DocSlice) should not implement DocSet

2020-02-15 Thread Mikhail Khludnev (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037499#comment-17037499
 ] 

Mikhail Khludnev commented on SOLR-14258:
-

I'm OK with the PR. Are you targeting it for master only? I worry about all the 
customer plugins built for 8.x.



