[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator(); remove natural comparator in favour of Java 8 one

2016-03-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174080#comment-15174080
 ] 

ASF subversion and git services commented on LUCENE-7053:
-

Commit 3c27980c4ae716ba74b3a0e2c70b3dd1c1d4 in lucene-solr's branch 
refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3c27980 ]

LUCENE-7053: Simplify code to work around Java 8u25 compiler bug


> Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator(); remove natural 
> comparator in favour of Java 8 one
> --
>
> Key: LUCENE-7053
> URL: https://issues.apache.org/jira/browse/LUCENE-7053
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/other
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: master, 6.0
>
> Attachments: LUCENE-7053.patch, LUCENE-7053.patch, LUCENE-7053.patch, 
> LUCENE-7053.patch
>
>
> Followup from LUCENE-7052: This removes the legacy, deprecated 
> getUTF8SortedAsUTF16Comparator() in the BytesRef class. Originally we 
> added the different comparators to allow the index term dict to be 
> sorted in a different order. This never proved to be useful, as many Lucene 
> queries rely on the default order. The only codec that used another byte 
> order internally was the Lucene 3 one (but it used the unicode spaghetti 
> algorithm to reorder its term enums at runtime).
> This patch also removes the BytesRef comparator completely and just 
> implements compareTo, so all code can rely on natural ordering.
> This patch also cleans up other usages of natural order comparators, e.g. in 
> ArrayUtil, because Java 8 natively provides a comparator.
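A minimal sketch of the idea in the description above: a type that implements compareTo needs no dedicated comparator, because Java 8's Comparator.naturalOrder() delegates to it. The Bytes class and its sorted() helper are hypothetical stand-ins for illustration, not Lucene's BytesRef, and the unsigned byte-by-byte comparison is an assumption about what "natural byte[] order" means here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a byte-sequence type; NOT Lucene's BytesRef.
final class Bytes implements Comparable<Bytes> {
  final byte[] data;

  Bytes(int... values) {
    data = new byte[values.length];
    for (int i = 0; i < values.length; i++) {
      data[i] = (byte) values[i];
    }
  }

  // Assumed ordering: compare bytes as unsigned values, shorter prefix first.
  @Override
  public int compareTo(Bytes other) {
    int n = Math.min(data.length, other.data.length);
    for (int i = 0; i < n; i++) {
      int a = data[i] & 0xFF;
      int b = other.data[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return data.length - other.data.length;
  }

  // Sorts a copy using the JDK-provided comparator instead of a custom one.
  static List<Bytes> sorted(List<Bytes> input) {
    List<Bytes> copy = new ArrayList<>(input);
    copy.sort(Comparator.naturalOrder());
    return copy;
  }

  public static void main(String[] args) {
    // 0x7F sorts before 0x80 because bytes are compared as unsigned values.
    System.out.println(new Bytes(0x7F).compareTo(new Bytes(0x80)) < 0); // true

    List<Bytes> out = sorted(Arrays.asList(new Bytes(0x80), new Bytes(1, 2), new Bytes(1)));
    for (Bytes b : out) {
      System.out.println(Arrays.toString(b.data));
    }
  }
}
```

With compareTo in place, callers that previously passed a comparator constant can pass Comparator.naturalOrder() or, where the API allows it, nothing at all.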



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator(); remove natural comparator in favour of Java 8 one

2016-02-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171680#comment-15171680
 ] 

Michael McCandless commented on LUCENE-7053:


Thanks [~thetaphi]!




[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator(); remove natural comparator in favour of Java 8 one

2016-02-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171596#comment-15171596
 ] 

ASF subversion and git services commented on LUCENE-7053:
-

Commit 8ffa436f00d24cb45af49160739f71b3654349ce in lucene-solr's branch 
refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8ffa436 ]

LUCENE-7053: Move comparator to better place in code; generalize to use 
CharSequence instead of String





[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator(); remove natural comparator in favour of Java 8 one

2016-02-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171576#comment-15171576
 ] 

ASF subversion and git services commented on LUCENE-7053:
-

Commit f48d23cd1448f20fb1b97ec986ded76a04a7075c in lucene-solr's branch 
refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f48d23c ]

LUCENE-7053: Remove custom comparators from BytesRef class and solely use 
natural byte[] comparator throughout codebase. It also replaces the natural 
comparator in ArrayUtil by Java 8's Comparator#naturalOrder().





[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator(); remove natural comparator in favour of Java 8 one

2016-02-28 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1517#comment-1517
 ] 

Robert Muir commented on LUCENE-7053:
-

+1




[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator(); remove natural comparator in favour of Java 8 one

2016-02-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171087#comment-15171087
 ] 

Uwe Schindler commented on LUCENE-7053:
---

All tests pass.




[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator()

2016-02-28 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171058#comment-15171058
 ] 

Robert Muir commented on LUCENE-7053:
-

Yes, please, and remove BytesRef.COMPARATOR which just duplicates that: 
naturalOrder() already returns a singleton.

Also in cases like TreeSet creation in the join tests, we should just make 
{{new TreeSet<>()}} and not pass any comparator in at all.
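A small sketch of the two points above, using only JDK classes. The class name NaturalOrderDemo and the helper naturalSet are made up for illustration. The identity check reflects how OpenJDK implements naturalOrder(); the Javadoc does not promise a singleton, so treat that part as an assumption about the implementation.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo class, not part of Lucene.
class NaturalOrderDemo {
  // Builds a sorted set that relies on the elements' own compareTo.
  static TreeSet<String> naturalSet(String... items) {
    TreeSet<String> set = new TreeSet<>(); // no comparator argument
    set.addAll(Arrays.asList(items));
    return set;
  }

  public static void main(String[] args) {
    TreeSet<String> set = naturalSet("b", "c", "a");
    System.out.println(set);                      // [a, b, c]
    System.out.println(set.comparator() == null); // true: natural ordering in use

    // Identity check across two calls; OpenJDK hands back the same instance.
    Comparator<String> first = Comparator.naturalOrder();
    Comparator<String> second = Comparator.naturalOrder();
    System.out.println(first == second);
  }
}
```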

> Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator()
> ---
>
> Key: LUCENE-7053
> URL: https://issues.apache.org/jira/browse/LUCENE-7053
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/other
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: master, 6.0
>
> Attachments: LUCENE-7053.patch, LUCENE-7053.patch, LUCENE-7053.patch
>
>
> Followup from LUCENE-7052: This removes the legacy, deprecated 
> getUTF8SortedAsUTF16Comparator() in the BytesRef class. The only remaining 
> user was TSTLookup, so the code moves there as a private implementation detail.
> This also converts the comparators to lambdas for better readability.






[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator()

2016-02-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171027#comment-15171027
 ] 

Uwe Schindler commented on LUCENE-7053:
---

As we implemented {{compareTo}}, we could remove the comparator completely. One 
could use {{Comparator.naturalOrder()}} instead (naturalOrder is defined to 
use {{compareTo}}). At places like {{Collections.sort()}} we could remove the 
comparator argument completely.

Any comments on this?




[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator()

2016-02-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171012#comment-15171012
 ] 

Uwe Schindler commented on LUCENE-7053:
---

bq. We can take this further, e.g. I grep'd for places calling 
BytesRef.getUTF8SortedAsUnicodeComparator and it turns up silliness in 
BlockTermsReader that should just be invoking BytesRef.compareTo directly 
instead, I think?

Yeah. As said, we may not remove the comparator completely, but we should only 
use it at places where we can't use the {{Comparable}} interface that 
BytesRef implements.

bq. You can also fix TestUnicodeUtil's custom String -> int[] code points logic 
maybe?

Will check this, too. I am currently investigating whether Java 8 already has 
a suitable Comparator ready to use somewhere, but it does not look like it.




[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator()

2016-02-28 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171010#comment-15171010
 ] 

Michael McCandless commented on LUCENE-7053:


+1 to the patch.

You can also fix {{TestUnicodeUtil}}'s custom String -> int[] code points logic 
maybe?

bq. There is a bit of code duplication in both tests (sorting Strings in code 
point order); should we maybe move the new comparator to TestUtil?

+1

We can take this further, e.g. I grep'd for places calling 
{{BytesRef.getUTF8SortedAsUnicodeComparator}} and it turns up silliness in 
{{BlockTermsReader}} that should just be invoking {{BytesRef.compareTo}} 
directly instead, I think?




[jira] [Commented] (LUCENE-7053) Remove deprecated BytesRef#getUTF8SortedAsUTF16Comparator()

2016-02-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171008#comment-15171008
 ] 

Uwe Schindler commented on LUCENE-7053:
---

There is a bit of code duplication in both tests (sorting Strings in code point 
order); should we maybe move the new comparator to TestUtil?
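For context, a rough sketch of what a shared "code point order" comparator for tests could look like. The class name CodePointOrder and the constant CODE_POINT_ORDER are hypothetical; this is not the comparator from the patch. It matters because String.compareTo orders by UTF-16 code units, which differs from code point order once supplementary characters are involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical test helper; not the comparator from the LUCENE-7053 patch.
class CodePointOrder {
  // Compares two character sequences code point by code point.
  static final Comparator<CharSequence> CODE_POINT_ORDER = (a, b) -> {
    int[] x = a.codePoints().toArray();
    int[] y = b.codePoints().toArray();
    int n = Math.min(x.length, y.length);
    for (int i = 0; i < n; i++) {
      if (x[i] != y[i]) {
        return Integer.compare(x[i], y[i]);
      }
    }
    return Integer.compare(x.length, y.length);
  };

  public static void main(String[] args) {
    String bmp = "\uFF61";                 // U+FF61, one UTF-16 code unit
    String supplementary = "\uD800\uDC00"; // U+10000, encoded as a surrogate pair

    List<String> utf16 = new ArrayList<>(Arrays.asList(bmp, supplementary));
    utf16.sort(Comparator.naturalOrder()); // UTF-16 order: surrogate pair sorts first

    List<String> byCodePoint = new ArrayList<>(Arrays.asList(supplementary, bmp));
    byCodePoint.sort(CODE_POINT_ORDER);    // code point order: U+FF61 sorts first

    System.out.println(utf16.get(0).equals(supplementary)); // true
    System.out.println(byCodePoint.get(0).equals(bmp));     // true
  }
}
```

Typing the comparator on CharSequence rather than String lets it accept both String and StringBuilder inputs.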
