https://bugs.documentfoundation.org/show_bug.cgi?id=166195
Bug ID: 166195 Summary: Integer Overflow in Year Recognition Leads to Periodic Date/Text Interpretation Product: LibreOffice Version: unspecified Hardware: All OS: All Status: UNCONFIRMED Severity: normal Priority: medium Component: Calc Assignee: libreoffice-bugs@lists.freedesktop.org Reporter: gplhust...@gmail.com Description: When parsing date-like strings, Calc exhibits inconsistent behavior for large year values due to an integer overflow issue. Specifically, year recognition alternates between valid (Date) and invalid (Text) periodically as the input year number increases. This occurs because of how large year numbers are handled during conversion from sal_Int32 to sal_uInt16 and the subsequent interpretation of that sal_uInt16 as a sal_Int16. Enter the following strings into Calc cells: "32767-1-1" -> Correctly recognized as Date(32767, 1, 1) "32768-1-1" -> Incorrectly recognized as Text("32768-1-1") "80000-1-1" -> Incorrectly recognized as Date(14464, 1, 1) (due to 80000 % 65536 = 14464) "100000-1-1" -> Incorrectly recognized as Text("100000-1-1") (due to 100000 % 65536 = 34464, which is > 32767) The issue stems from the ImpSvNumberInputScan::ImplGetYear function in svl/source/numbers/zforfind.cxx. 1. The input year (as digits) is parsed into a sal_Int32. 2. This sal_Int32 is then cast to a sal_uInt16 (nYear). When the input year value exceeds 65535, this conversion results in uint16 overflow (modulo 65536 arithmetic). This explains the observed periodicity of 65536. 3. Downstream, this sal_uInt16 nYear value is effectively treated as a sal_Int16 when determining validity or setting the cell value (e.g., via Calc.setValue or similar mechanisms). 4. When the sal_uInt16 value falls within the range 32768 to 65535, its interpretation as a sal_Int16 results in a negative value. 5. Calc apparently treats these negative years as invalid in this context, causing the entire input string to be fallback-recognized as Text. Therefore: - Years resulting in an effective sal_uInt16 value between 0 and 32767 (inclusive) are considered valid Date years. - Years resulting in an effective sal_uInt16 value between 32768 and 65535 (inclusive) are considered invalid, and the input becomes Text. While this is unlikely to affect users entering typical, smaller year values, it can lead to unexpected data interpretation if large numbers are entered, potentially due to typos. This inconsistent behavior could compromise data integrity in specific scenarios. A straightforward mitigation is to enforce a maximum valid year limit, effectively treating any year input that would result in a value > 32767 as invalid (Text). This can be achieved with a small modification to the sal_Int32 to sal_uInt16 conversion: The following patch implements the proposed solution. It has been tested locally and resolves the described inconsistent behavior. diff --git a/svl/source/numbers/zforfind.cxx b/svl/source/numbers/zforfind.cxx index 2b91fa662..648c1007f 100644 --- a/svl/source/numbers/zforfind.cxx +++ b/svl/source/numbers/zforfind.cxx @@ -1111,7 +1111,11 @@ sal_uInt16 ImpSvNumberInputScan::ImplGetYear( sal_uInt16 nIndex ) // leading zero as convention. if (nLen <= 6) { - nYear = static_cast<sal_uInt16>(sStrArray[nNums[nIndex]].toInt32()); + sal_Int32 temp = sStrArray[nNums[nIndex]].toInt32(); + // safely convert int32 to uint16 by capping values at 65535 + // downstream code will cast uint16 to int16(where values >32767 become negative and are considerd invalid for year) + nYear = temp < 65536 ? temp : 65535; + // A year in another, not Gregorian CE era is never expanded. // A year < 100 entered with at least 3 digits with leading 0 is taken // as is without expansion. I have implemented and tested this fix. However, I am currently unfamiliar with the process for contributing code changes to the Calc/LibreOffice project. Guidance on submitting this patch would be appreciated. Steps to Reproduce: Enter the following strings into Calc cells: "32767-1-1" -> Correctly recognized as Date(32767, 1, 1) "32768-1-1" -> Incorrectly recognized as Text("32768-1-1") "80000-1-1" -> Incorrectly recognized as Date(14464, 1, 1) (due to 80000 % 65536 = 14464) "100000-1-1" -> Incorrectly recognized as Text("100000-1-1") (due to 100000 % 65536 = 34464, which is > 32767) Actual Results: "32767-1-1" -> Date(32767, 1, 1) "32768-1-1" -> Text("32768-1-1") "80000-1-1" -> Date(14464, 1, 1) "100000-1-1" -> Text("100000-1-1") Expected Results: "32767-1-1" -> Date(32767, 1, 1) "32768-1-1" -> Text("32768-1-1") "80000-1-1" -> Text("80000-1-1") "100000-1-1" -> Text("100000-1-1") Reproducible: Always User Profile Reset: No Additional Info: Version: 24.8.3.2 (AARCH64) / LibreOffice Community Build ID: 48a6bac9e7e268aeb4c3483fcf825c94556d9f92 CPU threads: 8; OS: macOS 15.0; UI render: Skia/Metal; VCL: osx Locale: en-US (en_CN.UTF-8); UI: en-US Calc: threaded -- You are receiving this mail because: You are the assignee for the bug.