[ 
https://issues.apache.org/jira/browse/MATH-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076160#comment-18076160
 ] 

Gilles Sadowski commented on MATH-1688:
---------------------------------------

Please confirm that the issue is present in the latest release of Commons Math.

> ComplexFormat.parse exhibits inconsistent behavior due to implicit comma 
> skipping by NumberFormat
> -------------------------------------------------------------------------------------------------
>
>                 Key: MATH-1688
>                 URL: https://issues.apache.org/jira/browse/MATH-1688
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.6.1
>            Reporter: 尹茂椿萱
>            Priority: Major
>
> Description
> ComplexFormat.parse exhibits inconsistent and undocumented behavior when 
> parsing inputs containing commas.
> Commas are silently ignored in numeric components, but not in structural 
> positions (such as between a number and '+' or 'i'). This results in 
> context-dependent parsing behavior that is difficult to predict and may hide 
> malformed input.
> —
> Reproducible Example
> ```java
> ComplexFormat format = new ComplexFormat();
> System.out.println(format.parse(",,7+,,,2i"));   // 7 + 2i
> System.out.println(format.parse(",8+,,3i"));     // 8 + 3i
> System.out.println(format.parse(",7"));          // 7 + 0i
> System.out.println(format.parse(";7"));          // null
> System.out.println(format.parse("#7"));          // null
> ```
> —
> Observed Behavior
>  - Commas are ignored when they appear inside numeric components:
>   - ",,7" → 7
>   - ",,,2" → 2
>  - As a result, inputs like:
>   - ",,7+,,,2i"
>   - ",8+,,3i"
>   are successfully parsed into valid complex numbers
>  - However, commas are not accepted in structural positions:
>   - e.g. a comma between a number and '+' or 'i' may cause parsing to fail
>  - Other invalid characters (e.g. ';', '#') are not ignored and cause parsing 
> to fail
>  - Importantly, comma does not behave like a structural separator:
>   Input such as:
>   - "7,8"
>   
>   is parsed as:
>   - 78 (single number)
>   rather than being interpreted as:
>   - two values (e.g. 7 + 0i and 8 + 0i)
>   This further indicates that ',' is never treated as a delimiter between 
> values.
> —
> Expected Behavior
> Parsing should be consistent and predictable:
>  - Either commas should be explicitly supported as valid separators and 
> documented
>  - Or invalid characters should cause parsing to fail uniformly
> Silent skipping of certain characters in some contexts but not others leads 
> to confusing and unsafe behavior.
> In particular, if ',' were intended as a delimiter, inputs like "7,8" should 
> be parsed consistently as multiple values, not collapsed into a single number.
> —
> Root Cause Analysis
> The behavior originates from CompositeFormat.parseNumber:
>     Number number = format.parse(source, pos);
> This delegates parsing to NumberFormat (typically DecimalFormat).
> DecimalFormat treats ',' as a grouping separator and ignores it during 
> parsing. For example:
>     ",,7" → 7
>     "7,8" → 78
> This is confirmed by observing that:
>  - pos.getIndex() advances after parsing ",,7"
>  - startIndex != endIndex, so parsing is considered successful
> Therefore:
>  - ',' is implicitly ignored inside numeric components by NumberFormat
>  - but ComplexFormat does not handle ',' consistently in other parsing stages
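
The grouping behavior described in the root cause analysis can be reproduced with the JDK alone, without Commons Math. The following is a minimal sketch, assuming Locale.US (where ',' is the grouping separator) and the default lenient parsing mode. The class and method names are illustrative only and are not part of Commons Math.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class GroupingParseDemo {

    /**
     * Parses a number from the start of source with a default NumberFormat.
     * Returns {value, end index}, or null if nothing could be parsed.
     */
    public static long[] parsePrefix(String source) {
        // Assumption: Locale.US, whose DecimalFormat uses ',' for grouping.
        NumberFormat format = NumberFormat.getInstance(Locale.US);
        ParsePosition pos = new ParsePosition(0);
        Number number = format.parse(source, pos);
        if (number == null) {
            return null; // pos.getIndex() is unchanged on failure
        }
        return new long[] { number.longValue(), pos.getIndex() };
    }

    public static void main(String[] args) {
        for (String s : new String[] { ",,7", "7,8", ";7" }) {
            long[] r = parsePrefix(s);
            System.out.println("\"" + s + "\" -> "
                + (r == null ? "no parse" : r[0] + " (end index " + r[1] + ")"));
        }
    }
}
```

Because the parse position advances past the commas, a caller that only checks whether the index moved will treat both ",,7" and "7,8" as successfully parsed numbers.
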
> —
> Consequence
> This leads to inconsistent parsing behavior:
>  - ',' is ignored inside numeric values
>  - ',' is not treated as a structural delimiter between complex numbers
>  - ',' is nevertheless rejected in structural positions (e.g. around '+' or 'i')
> As a result, parsing becomes context-dependent and non-intuitive.
> Additionally, malformed input may be silently accepted and interpreted as 
> valid data, making it difficult to detect input errors.
> —
> Additional Notes
> This behavior is not documented in the ComplexFormat API and may surprise 
> users expecting strict parsing.
> The issue arises from the interaction between:
>  - a lenient numeric parser (NumberFormat)
>  - and a stricter structural parser (ComplexFormat)
> —
> Possible Improvements
>  - Disable grouping parsing in NumberFormat when used by ComplexFormat
>  - Or explicitly handle separators at the ComplexFormat level
>  - Or document the current behavior clearly
> Providing a strict parsing mode could also help avoid ambiguity.
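
The first option listed above (disabling grouping) might look like the following sketch. It uses only java.text and assumes Locale.US. StrictNumberParse and parseStrict are hypothetical names, and how such a NumberFormat would be wired into ComplexFormat is not shown or tested here.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class StrictNumberParse {

    /**
     * Parses source as a single number with grouping disabled.
     * Returns null unless the whole string is consumed.
     */
    public static Number parseStrict(String source) {
        // Assumption: Locale.US; with grouping off, ',' is no longer skipped.
        NumberFormat format = NumberFormat.getInstance(Locale.US);
        format.setGroupingUsed(false);

        ParsePosition pos = new ParsePosition(0);
        Number number = format.parse(source, pos);

        // Reject both outright failures and partial matches such as "7,8",
        // where parsing stops at the comma.
        if (number == null || pos.getIndex() != source.length()) {
            return null;
        }
        return number;
    }

    public static void main(String[] args) {
        for (String s : new String[] { "7", "1.5", ",,7", "7,8" }) {
            System.out.println("\"" + s + "\" -> " + parseStrict(s));
        }
    }
}
```

The full-consumption check is a design choice of this sketch: with grouping disabled, "7,8" still yields 7 from the prefix, so a strict mode also needs to reject trailing unparsed input.
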



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
