[ https://issues.apache.org/jira/browse/MATH-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076160#comment-18076160 ]
Gilles Sadowski commented on MATH-1688:
---------------------------------------
Please confirm that the issue is present in the latest release of Commons Math.
> ComplexFormat.parse exhibits inconsistent behavior due to implicit comma
> skipping by NumberFormat
> -------------------------------------------------------------------------------------------------
>
> Key: MATH-1688
> URL: https://issues.apache.org/jira/browse/MATH-1688
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.6.1
> Reporter: 尹茂椿萱
> Priority: Major
>
> Description
> ComplexFormat.parse exhibits inconsistent and undocumented behavior when
> parsing inputs containing commas.
> Commas are silently ignored in numeric components, but not in structural
> positions (such as between a number and '+' or 'i'). This results in
> context-dependent parsing behavior that is difficult to predict and may hide
> malformed input.
> —
> Reproducible Example
> ```java
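> import org.apache.commons.math3.complex.ComplexFormat;
>
> // Default format; number parsing is delegated to a locale-default NumberFormat.
> ComplexFormat format = new ComplexFormat();
> // Leading and embedded commas are silently skipped:
> System.out.println(format.parse(",,7+,,,2i")); // 7 + 2i
> System.out.println(format.parse(",8+,,3i")); // 8 + 3i
> System.out.println(format.parse(",7")); // 7 + 0i
> // Other stray characters are not skipped, so parsing fails:
> System.out.println(format.parse(";7")); // null
> System.out.println(format.parse("#7")); // null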
> ```
> —
> Observed Behavior
> - Commas are ignored when they appear inside numeric components:
>   - ",,7" → 7
>   - ",,,2" → 2
> - As a result, inputs such as ",,7+,,,2i" and ",8+,,3i" are successfully
>   parsed into valid complex numbers.
> - However, commas are not accepted in structural positions: between a number
>   and '+' or 'i', for example, parsing may fail.
> - Other invalid characters (e.g. ';', '#') are not ignored and cause parsing
>   to fail.
> - *Importantly, the comma does not behave like a structural separator:*
>   the input "7,8" is parsed as the single number 78 rather than being
>   interpreted as two values (e.g. 7 + 0i and 8 + 0i).
> This further indicates that ',' is not treated as a consistent delimiter in
> any meaningful semantic sense.
> —
> Expected Behavior
> Parsing should be consistent and predictable:
> - Either commas should be explicitly supported as valid separators and
> documented
> - Or invalid characters should cause parsing to fail uniformly
> Silent skipping of certain characters in some contexts but not others leads
> to confusing and unsafe behavior.
> In particular, if ',' were intended as a delimiter, inputs like "7,8" should
> be parsed consistently as multiple values, not collapsed into a single number.
> —
> Root Cause Analysis
> The behavior originates from CompositeFormat.parseNumber:
> Number number = format.parse(source, pos);
> This delegates parsing to NumberFormat (typically DecimalFormat).
> DecimalFormat treats ',' as a grouping separator and ignores it during
> parsing. For example:
> ",,7" → 7
> "7,8" → 78
> This is confirmed by observing that:
> - pos.getIndex() advances after parsing ",,7"
> - startIndex != endIndex, so parsing is considered successful
> Therefore:
> - ',' is implicitly ignored inside numeric components by NumberFormat
> - but ComplexFormat does not handle ',' consistently in other parsing stages
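
The NumberFormat behavior described in this analysis can be reproduced with the JDK alone. The following is a minimal sketch, not the Commons Math code itself: `GroupingDemo` and `describe` are hypothetical names, and the US locale (where ',' is the grouping separator) is an assumption. It mirrors the `format.parse(source, pos)` call and reports both the parsed value and where the parse position ended.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical demo class (not part of Commons Math).
class GroupingDemo {
    // Parses a number at the start of 'source' with a US-locale NumberFormat.
    // Returns "<value>@<end index>", or "null@<end index>" when nothing was parsed.
    static String describe(String source) {
        NumberFormat format = NumberFormat.getInstance(Locale.US); // assumption: US locale
        ParsePosition pos = new ParsePosition(0);
        Number number = format.parse(source, pos);
        String value = (number == null) ? "null" : String.valueOf(number.doubleValue());
        return value + "@" + pos.getIndex();
    }

    public static void main(String[] args) {
        System.out.println(describe(",,7")); // 7.0@3: leading commas are skipped
        System.out.println(describe("7,8")); // 78.0@3: the comma is read as grouping
        System.out.println(describe(";7"));  // null@0: ';' is not skipped
    }
}
```

An end index greater than the start index is what makes the caller treat the parse as successful, which matches the `startIndex != endIndex` observation above.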
> —
> Consequence
> This leads to inconsistent parsing behavior:
> - ',' is ignored inside numeric values
> - ',' is not treated as a structural delimiter between complex numbers
> - but ',' is also rejected in structural positions (e.g. around '+' or 'i')
> As a result, parsing becomes context-dependent and non-intuitive.
> Additionally, malformed input may be silently accepted and interpreted as
> valid data, making it difficult to detect input errors.
> —
> Additional Notes
> This behavior is not documented in the ComplexFormat API and may surprise
> users expecting strict parsing.
> The issue arises from the interaction between:
> - a lenient numeric parser (NumberFormat)
> - and a stricter structural parser (ComplexFormat)
> —
> Possible Improvements
> - Disable grouping parsing in NumberFormat when used by ComplexFormat
> - Or explicitly handle separators at the ComplexFormat level
> - Or document the current behavior clearly
> Providing a strict parsing mode could also help avoid ambiguity.
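
The first option above (disabling grouping) could be sketched as follows. This is an illustration under stated assumptions rather than a proposed patch: `StrictNumberDemo`, `strictFormat` and `parseWhole` are hypothetical names, and the US locale is assumed. The whole-input check is one possible reading of a "strict parsing mode", not an existing Commons Math feature.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical demo class (not part of Commons Math).
class StrictNumberDemo {
    // Builds a NumberFormat that no longer treats ',' as a grouping separator.
    static NumberFormat strictFormat() {
        NumberFormat format = NumberFormat.getInstance(Locale.US); // assumption: US locale
        format.setGroupingUsed(false);
        return format;
    }

    // Returns the parsed value only if the whole input was consumed, else null.
    static Double parseWhole(String source) {
        ParsePosition pos = new ParsePosition(0);
        Number number = strictFormat().parse(source, pos);
        if (number == null || pos.getIndex() != source.length()) {
            return null; // nothing parsed, or trailing characters were left over
        }
        return number.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(parseWhole("7"));   // 7.0
        System.out.println(parseWhole(",,7")); // null: leading commas are rejected
        System.out.println(parseWhole("7,8")); // null: parsing stops at the comma
    }
}
```

A NumberFormat configured like `strictFormat()` could then be handed to the existing `ComplexFormat(NumberFormat)` constructor, so that commas are rejected in numeric components as well as in structural positions.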
--
This message was sent by Atlassian Jira
(v8.20.10#820010)