[ 
https://issues.apache.org/jira/browse/MATH-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076172#comment-18076172
 ] 

尹茂椿萱 commented on MATH-1688:
----------------------------

Hi there, thank you for your reply and suggestions. I have confirmed that 
this issue still exists in the latest version (4.0-beta1) and have updated 
the bug report accordingly.

> ComplexFormat.parse exhibits inconsistent behavior due to implicit comma 
> skipping by NumberFormat
> -------------------------------------------------------------------------------------------------
>
>                 Key: MATH-1688
>                 URL: https://issues.apache.org/jira/browse/MATH-1688
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 4.0-beta1
>            Reporter: 尹茂椿萱
>            Priority: Major
>
> Description
> ComplexFormat.parse exhibits inconsistent and undocumented behavior when 
> parsing inputs containing commas.
> Commas are silently ignored in numeric components, but not treated as 
> structural separators. In addition, error handling behavior differs between 
> versions (e.g., returning null vs throwing an exception), which further 
> complicates usage.
> —
> Reproducible Example (Commons Math 4.0)
> ```java
> import org.apache.commons.math4.legacy.util.ComplexFormat;
>
> public class StringUtils {
>     public static void main(String[] args) {
>         ComplexFormat format = new ComplexFormat();
>         // Commas in front of a numeric component are silently skipped.
>         System.out.println(format.parse(",,7+,,,2i"));   // (7.0, 2.0)
>         System.out.println(format.parse(",8+,,3i"));     // (8.0, 3.0)
>         System.out.println(format.parse(",7"));          // (7.0, 0.0)
>         // Commas between digits are dropped, merging the digits.
>         System.out.println(format.parse("7,,8"));        // (78.0, 0.0)
>         // Other unexpected characters are rejected.
>         System.out.println(format.parse("#7"));          // throws MathParseException
>     }
> }
> ```
> —
> Observed Behavior
>  - Commas are ignored when they appear inside numeric components:
>   - ",,7" → 7
>   - ",,,2" → 2
>  - As a result:
>   - ",,7+,,,2i" → (7.0, 2.0)
>   - ",8+,,3i" → (8.0, 3.0)
>  - Input such as "7,,8" is parsed as (78.0, 0.0).
>    This indicates that commas are not treated as delimiters between values,
>    but are instead silently removed inside numeric parsing.
>  - Other invalid characters (e.g. '#') are not ignored:
>   - "#7" results in a MathParseException
>  - Error handling differs from earlier versions:
>   - In some versions, invalid input may return null
>   - In Commons Math 4.0, invalid input throws an exception
> —
> Expected Behavior
> Parsing should be consistent and predictable:
>  - Either commas should be explicitly supported as valid separators and 
> documented
>  - Or invalid characters should cause parsing to fail uniformly
> In particular:
>  - If commas are treated as delimiters, "7,,8" should not collapse into 78
>  - If commas are not valid syntax, inputs containing them should fail 
> consistently
> Additionally, error handling behavior should be clearly defined and 
> consistent across versions.
> —
> Root Cause Analysis
> The behavior originates from CompositeFormat.parseNumber:
>     Number number = format.parse(source, pos);
> This delegates parsing to NumberFormat (typically DecimalFormat).
> When grouping is enabled, DecimalFormat treats the locale's grouping 
> separator (',' in locales such as en_US) as ignorable during parsing:
>     ",,7" → 7
>     "7,,8" → 78
> This is confirmed by observing that:
>  - pos.getIndex() advances after parsing ",,7"
>  - startIndex != endIndex, so parsing is considered successful
> Therefore:
>  - ',' is implicitly ignored inside numeric components by NumberFormat
>  - but ComplexFormat does not handle ',' consistently in structural parsing
> —
> Consequence
> This leads to inconsistent parsing behavior:
>  - ',' is ignored inside numeric values
>  - ',' is not treated as a structural delimiter
>  - ',' is rejected in structural positions (e.g. around '+' or 'i')
> As a result:
>  - Parsing becomes context-dependent and non-intuitive
>  - Malformed input may be silently accepted and misinterpreted
>  - Behavior differs across versions (null vs exception)
> —
> Additional Notes
> This behavior is not documented in the ComplexFormat API and may surprise 
> users expecting strict parsing.
> The issue arises from the interaction between:
>  - a lenient numeric parser (NumberFormat)
>  - and a stricter structural parser (ComplexFormat)
> —
> Possible Improvements
>  - Disable grouping parsing in NumberFormat when used by ComplexFormat
>  - Or explicitly handle separators at the ComplexFormat level
>  - Or document the current behavior clearly
> Providing a strict parsing mode could also help avoid ambiguity.

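The NumberFormat behavior described in the Root Cause Analysis can be checked 
without Commons Math. The following is a minimal sketch using only the JDK. 
The class name GroupingParseDemo and the helper probe are made up for 
illustration, and Locale.US is assumed so that ',' is the grouping separator; 
other default locales may behave differently.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class GroupingParseDemo {
    /** Hypothetical helper: parses from index 0; returns null if nothing could be parsed. */
    static Number probe(NumberFormat format, String source) {
        return format.parse(source, new ParsePosition(0));
    }

    public static void main(String[] args) {
        // Assumption: Locale.US, where ',' is the grouping separator.
        NumberFormat lenient = NumberFormat.getInstance(Locale.US);
        for (String s : new String[] {",,7", "7,,8", "#7"}) {
            ParsePosition pos = new ParsePosition(0);
            Number n = lenient.parse(s, pos);
            // An index greater than 0 means the commas were consumed as part of the number.
            System.out.println("\"" + s + "\" -> " + n + " (end index " + pos.getIndex() + ")");
        }
    }
}
```

If the analysis above holds, the first two inputs yield 7 and 78 with an 
advanced index, while "#7" yields null with the index left at 0.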

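To illustrate the "disable grouping" and "strict parsing mode" ideas from 
Possible Improvements, here is a rough sketch at the NumberFormat level, using 
only the JDK. The class StrictNumberParseDemo and the helpers noGrouping and 
parseFully are hypothetical names, not part of Commons Math, and Locale.US is 
an assumption. Whether such a format should be wired into ComplexFormat (for 
example through a constructor that accepts a NumberFormat, if 4.0-beta1 still 
provides one) is left open.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class StrictNumberParseDemo {
    /** Hypothetical helper: a NumberFormat that no longer accepts grouping separators. */
    static NumberFormat noGrouping(Locale locale) {
        NumberFormat format = NumberFormat.getInstance(locale);
        format.setGroupingUsed(false);
        return format;
    }

    /** Hypothetical helper: returns the number only if the whole input was consumed. */
    static Number parseFully(NumberFormat format, String source) {
        ParsePosition pos = new ParsePosition(0);
        Number n = format.parse(source, pos);
        return (n != null && pos.getIndex() == source.length()) ? n : null;
    }

    public static void main(String[] args) {
        NumberFormat strict = noGrouping(Locale.US); // assumption: ',' would be the grouping separator
        for (String s : new String[] {"7", ",,7", "7,,8"}) {
            // null signals that the input was rejected instead of silently repaired.
            System.out.println("\"" + s + "\" -> " + parseFully(strict, s));
        }
    }
}
```

With grouping disabled, a comma stops numeric parsing, so inputs such as 
",,7" and "7,,8" are rejected here rather than collapsed into 7 or 78.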

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
