Hi Ivan,

Would I be wrong if I described the logic of these two comparators as the following:

The comparator compares two character sequences as if each of them were first transformed into a tuple of the form:

(A0, N0, A1, N1, ..., An-1, Nn-1, An)

where:

A0 and An are (possibly empty) sub-sequences consisting of non-decimal-digit characters
A1 ... An-1 are non-empty sub-sequences consisting of non-decimal-digit characters
N0 ... Nn-1 are non-empty sub-sequences consisting of decimal-digit characters

...such that all sub-sequences concatenated together in order as they appear in the tuple yield the original character sequence.

Any character sequence can be uniquely transformed into such a tuple. For example:

"" -> (A0); where A0 == ""
"ab10" -> (A0, N0, A1); where A0 == "ab", N0 == "10", A1 == ""
"1" -> (A0, N0, A1); where A0 == "", N0 == "1", A1 == ""
...
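
To make the decomposition concrete, here is a minimal sketch in Java. It only illustrates the tuple form described above and is not code from the webrev: the class name TupleSplit and its methods are hypothetical, and it assumes that "decimal digit" means the ASCII characters '0' to '9'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the webrev: splits a character
// sequence into the tuple (A0, N0, A1, N1, ..., An).
class TupleSplit {
    // Assumption of this sketch: only ASCII '0'..'9' count as digits.
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    static List<String> split(CharSequence s) {
        List<String> parts = new ArrayList<>();
        int len = s.length();
        int i = 0;
        while (true) {
            // Ax: a run of non-digit characters (empty only at the ends).
            int start = i;
            while (i < len && !isDigit(s.charAt(i))) {
                i++;
            }
            parts.add(s.subSequence(start, i).toString());
            if (i == len) {
                return parts; // the tuple always ends with an A element
            }
            // Nx: a non-empty run of digit characters.
            start = i;
            while (i < len && isDigit(s.charAt(i))) {
                i++;
            }
            parts.add(s.subSequence(start, i).toString());
        }
    }

    public static void main(String[] args) {
        System.out.println(split(""));
        System.out.println(split("ab10"));
        System.out.println(split("1"));
    }
}
```

The resulting list always has an odd number of elements, with the A elements at even indices and the N elements at odd indices.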

After transformation, the tuples are compared by their elements (from left to right) so that corresponding Ax elements are compared lexicographically and Nx elements are compared as decimal integers (with two variations considering leading zeroes).
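
A rough sketch of that element-wise comparison in Java might look as follows. It is my reading of the description, not the implementation in the webrev, and it rests on several assumptions: the class name TupleCompareSketch and the constant BY_TUPLE are hypothetical, only ASCII '0' to '9' count as digits, a tuple that runs out of elements first sorts first, and digit runs that are numerically equal are treated as ties (so the two leading-zero variations are not modelled here).

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the webrev code.
class TupleCompareSketch {
    // Assumption: only ASCII '0'..'9' count as decimal digits.
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // End index of the run starting at 'from' whose characters are all
    // digits (digits == true) or all non-digits (digits == false).
    static int runEnd(CharSequence s, int from, boolean digits) {
        int i = from;
        while (i < s.length() && isDigit(s.charAt(i)) == digits) {
            i++;
        }
        return i;
    }

    static final Comparator<CharSequence> BY_TUPLE = (a, b) -> {
        int i = 0;
        int j = 0;
        while (true) {
            // Ax elements: compared lexicographically.
            int ie = runEnd(a, i, false);
            int je = runEnd(b, j, false);
            int c = a.subSequence(i, ie).toString()
                     .compareTo(b.subSequence(j, je).toString());
            if (c != 0) {
                return c;
            }
            i = ie;
            j = je;
            boolean aDone = i == a.length();
            boolean bDone = j == b.length();
            if (aDone || bDone) {
                // Assumption: the shorter tuple sorts first.
                return aDone ? (bDone ? 0 : -1) : 1;
            }
            // Nx elements: both are non-empty here, compared as integers.
            ie = runEnd(a, i, true);
            je = runEnd(b, j, true);
            c = new BigInteger(a.subSequence(i, ie).toString())
                    .compareTo(new BigInteger(b.subSequence(j, je).toString()));
            if (c != 0) {
                return c;
            }
            i = ie;
            j = je;
        }
    };

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("file10", "file2", "file1"));
        names.sort(BY_TUPLE);
        System.out.println(names); // prints [file1, file2, file10]
    }
}
```

BigInteger is used only so that arbitrarily long digit runs cannot overflow; a real implementation would more likely compare the digit runs in place without allocating.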

If I am right, then perhaps such a description would be easier to understand.

What do you think?


Regards, Peter

On 07/19/2017 10:41 AM, Ivan Gerasimov wrote:
Hello!

This is a proposal to provide a String comparator that takes into account the numbers embedded in the strings (should they be present).

This proposal was initially discussed back in 2014 and seemed to attract some interest from the community: http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030343.html

In the latest webrev two methods are added to the public API:
j.u.Comparator.comparingNumerically() and
j.u.Comparator.comparingNumericallyLeadingZerosAhead().

The regression test is extended to exercise this new comparator.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8134512
WEBREV: http://cr.openjdk.java.net/~igerasim/8134512/01/webrev/

Comments and suggestions are very welcome!

