paulirwin commented on code in PR #1121:
URL: https://github.com/apache/lucenenet/pull/1121#discussion_r1927206723
##########
src/Lucene.Net/Support/Arrays.cs:
##########
@@ -622,34 +624,5 @@ public static string ToString<T>(T[] array)
sb.Append(']');
return sb.ToString();
}
-
- /// <summary>
- /// Creates a <see cref="string"/> representation of the array passed.
- /// The result is surrounded by brackets <c>"[]"</c>, each
- /// element is converted to a <see cref="string"/> via the
- /// <paramref name="provider"/> and separated by <c>", "</c>. If
- /// the array is <c>null</c>, then <c>"null"</c> is returned.
- /// </summary>
- /// <typeparam name="T">The type of array element.</typeparam>
- /// <param name="array">The array to convert.</param>
- /// <param name="provider">A <see cref="IFormatProvider"/> instance
that supplies the culture formatting information.</param>
- /// <returns>The converted array string.</returns>
- public static string ToString<T>(T[] array, IFormatProvider provider)
Review Comment:
In this case, in FacetResult (the only usage), the formatted array is an
array of strings, so culture has no effect either way, making this more of a
theoretical discussion. But assuming that it was i.e. floating point numbers in
other usages in the future, or that this code might become part of J2N for
other users...
I disagree that this behavior is correct or not a bug. There is a difference
between formatting _just_ a number as a string in a way that might be useful to
end users (in which case a culture is necessary, i.e. displaying in a UI or for
subsequent round-trip parsing) and formatting an array. Apart from perhaps some
math-inclined nerds like myself, array syntax is only useful for debugging and
diagnostic purposes, not for end users. (Also, it is not a proper
serialization, in which case you might want to use something like JSON instead,
which also does not support decimal commas.) And if you do get that array
printed in i.e. a log message, it is not useful for it to have a decimal comma
for floating-point numbers if that is how your culture is configured. The
primary benefit of formatting an array as a string is for programmers (or those
that can understand data structures) to diagnose the values in an array, or
perhaps even to copy/paste into i.e. a unit test. If you did that with
values formatted with a decimal comma, that is not correct in C# or most
programming languages. Java is correct here in using the invariant culture, and
I do not see a good reason why our diagnostic messages should differ from Java
in this regard (again, assuming someone adds a use for this method on numbers
in the future).
And that's one problem even if there's just one number in the array. The
second problem is the list separator, which is hardcoded to ", " (although
correctly, IMO, per the above, but it's a problem if you're formatting the
values with a culture). When the culture uses a decimal comma, this breaks the
formatting of the list due to ambiguity. This theoretically could be fixed if
we had a CultureInfo parameter (instead of IFormatProvider) by using
`TextInfo.ListSeparator`, but AFAICT there's no way to get that from
IFormatProvider. Regardless, even if we could, and say it was `';'` for that
culture, the formatted array string of `"[1,1; 2,2]"` is not close to being
valid in most programming languages, which is the main benefit to the reader of
the output of `Arrays.ToString`, either from a cognitive understanding of the
diagnostic or to actually put into code like a unit test. Say you had 1000
items in the array. That would require extensive manual editing to put into a
unit test to
repeat behavior from that diagnostic string, whereas the string `"[1.1, 2.2]"`
would at least be valid syntax as-is for `double[]` or `object[]`. Of course
there are easy examples where this would not be valid C#/Java, but it at least
can get you close for many primitive types.
There is a good reason why Java's `Arrays.toString` is culture-invariant:
because Java code is. I think we should match the same behavior now and going
forward for formatting Arrays as a string, which again is only useful for
diagnostic purposes as if it were code, and not serialization or end-user use.
ChatGPT perhaps said it best when asked:
> Java's `Arrays.toString(...)` methods are not locale-aware because they
are designed to provide a consistent, straightforward string representation of
arrays for debugging and logging purposes, rather than being formatted for
user-facing output. Locale-sensitive formatting is typically handled by
dedicated classes like `NumberFormat` or `MessageFormat`, keeping
`Arrays.toString(...)` lightweight and predictable across different
environments.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]