[ https://issues.apache.org/jira/browse/MATH-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076172#comment-18076172 ]
尹茂椿萱 commented on MATH-1688:
----------------------------
Hi there, thank you for your reply and suggestions. I have just confirmed that
this issue still exists in the latest version (4.0-beta1), and I have updated
the bug report accordingly.
> ComplexFormat.parse exhibits inconsistent behavior due to implicit comma
> skipping by NumberFormat
> -------------------------------------------------------------------------------------------------
>
> Key: MATH-1688
> URL: https://issues.apache.org/jira/browse/MATH-1688
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 4.0-beta1
> Reporter: 尹茂椿萱
> Priority: Major
>
> Description
> ComplexFormat.parse exhibits inconsistent and undocumented behavior when
> parsing inputs containing commas.
> Commas are silently ignored in numeric components, but not treated as
> structural separators. In addition, error handling behavior differs between
> versions (e.g., returning null vs throwing an exception), which further
> complicates usage.
> —
> Reproducible Example (Commons Math 4.0)
> ```java
> import org.apache.commons.math4.legacy.util.ComplexFormat;
>
> public class StringUtils {
>     public static void main(String[] args) {
>         ComplexFormat format = new ComplexFormat();
>         System.out.println(format.parse(",,7+,,,2i")); // (7.0, 2.0)
>         System.out.println(format.parse(",8+,,3i"));   // (8.0, 3.0)
>         System.out.println(format.parse(",7"));        // (7.0, 0.0)
>         System.out.println(format.parse("7,,8"));      // (78.0, 0.0)
>         System.out.println(format.parse("#7"));        // throws MathParseException
>     }
> }
> ```
> —
> Observed Behavior
> - Commas are ignored when they appear inside numeric components:
>   - ",,7" → 7
>   - ",,,2" → 2
> - As a result:
>   - ",,7+,,,2i" → (7.0, 2.0)
>   - ",8+,,3i" → (8.0, 3.0)
> - Input such as "7,,8" is parsed as (78.0, 0.0).
>   This indicates that commas are not treated as delimiters between values,
>   but are instead silently removed inside numeric parsing.
> - Other invalid characters (e.g. '#') are not ignored:
>   - "#7" results in a MathParseException
> - Error handling differs from earlier versions:
>   - In some versions, invalid input may return null
>   - In Commons Math 4.0, invalid input throws an exception
> —
> Expected Behavior
> Parsing should be consistent and predictable:
> - Either commas should be explicitly supported as valid separators and
> documented
> - Or invalid characters should cause parsing to fail uniformly
> In particular:
> - If commas are treated as delimiters, "7,,8" should not collapse into 78
> - If commas are not valid syntax, inputs containing them should fail
> consistently
> Additionally, error handling behavior should be clearly defined and
> consistent across versions.
> —
> Root Cause Analysis
> The behavior originates from CompositeFormat.parseNumber:
> Number number = format.parse(source, pos);
> This delegates parsing to NumberFormat (typically DecimalFormat).
> DecimalFormat treats ',' as a grouping separator and ignores it during
> parsing:
> ",,7" → 7
> "7,,8" → 78
> This is confirmed by observing that:
> - pos.getIndex() advances after parsing ",,7"
> - startIndex != endIndex, so parsing is considered successful
> Therefore:
> - ',' is implicitly ignored inside numeric components by NumberFormat
> - but ComplexFormat does not handle ',' consistently in structural parsing
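The NumberFormat behavior described above can be reproduced without Commons Math. The following is a minimal sketch using only java.text; the class and method names are hypothetical, and Locale.US is pinned on the assumption that ',' is the grouping separator in the locale where the issue was observed.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

final class GroupingLeniencyDemo {

    /** Parses a number from the start of {@code source}, updating {@code pos}. */
    static Number parseLeniently(String source, ParsePosition pos) {
        // Default NumberFormat instances have grouping enabled, so ','
        // is accepted (and skipped) while digits are being read.
        NumberFormat format = NumberFormat.getInstance(Locale.US);
        return format.parse(source, pos);
    }

    public static void main(String[] args) {
        for (String source : new String[] {",,7", "7,,8", "#7"}) {
            ParsePosition pos = new ParsePosition(0);
            Number number = parseLeniently(source, pos);
            // ",,7"  -> 7, index 3
            // "7,,8" -> 78, index 4
            // "#7"   -> null, index 0
            System.out.println("\"" + source + "\" -> " + number
                    + ", index " + pos.getIndex());
        }
    }
}
```

Because the index advances past the commas, a caller that only checks whether the index moved (as CompositeFormat.parseNumber is described as doing) cannot tell ",,7" apart from "7".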
> —
> Consequence
> This leads to inconsistent parsing behavior:
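> - ',' is ignored inside numeric values, including leading commas that are
>   followed by a digit
> - ',' is never treated as a structural delimiter between values
> - ',' is rejected in structural positions where the numeric parser does not
>   consume it (e.g. directly before '+' or 'i')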
> As a result:
> - Parsing becomes context-dependent and non-intuitive
> - Malformed input may be silently accepted and misinterpreted
> - Behavior differs across versions (null vs exception)
> —
> Additional Notes
> This behavior is not documented in the ComplexFormat API and may surprise
> users expecting strict parsing.
> The issue arises from the interaction between:
> - a lenient numeric parser (NumberFormat)
> - and a stricter structural parser (ComplexFormat)
> —
> Possible Improvements
> - Disable grouping parsing in NumberFormat when used by ComplexFormat
> - Or explicitly handle separators at the ComplexFormat level
> - Or document the current behavior clearly
> Providing a strict parsing mode could also help avoid ambiguity.
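The first suggestion (disabling grouping) combined with a strict mode could look roughly like the sketch below. It uses only java.text; parseStrict is a hypothetical helper, not an existing Commons Math method, and Locale.US is an assumption. A NumberFormat configured this way could presumably be handed to ComplexFormat through its NumberFormat-accepting constructor, but that wiring is not exercised here.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

final class StrictNumberParsing {

    /**
     * Hypothetical helper: returns the parsed number only if the whole
     * input is a number; returns null otherwise.
     */
    static Number parseStrict(String source) {
        NumberFormat format = NumberFormat.getInstance(Locale.US);
        // With grouping disabled, ',' is no longer skipped during parsing.
        format.setGroupingUsed(false);
        ParsePosition pos = new ParsePosition(0);
        Number number = format.parse(source, pos);
        // Reject failed parses and partial matches such as "7" out of "7,,8".
        if (number == null || pos.getIndex() != source.length()) {
            return null;
        }
        return number;
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("78"));   // 78
        System.out.println(parseStrict(",,7"));  // null
        System.out.println(parseStrict("7,,8")); // null
        System.out.println(parseStrict("#7"));   // null
    }
}
```

Returning null is only a convenience for the sketch; a real strict mode would more likely throw, in line with the MathParseException behavior reported for 4.0.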