no23reason opened a new issue, #38528:
URL: https://github.com/apache/arrow/issues/38528

   ### Describe the enhancement requested
   
   `pyarrow.compute.strptime` handles the `%Y` format directive (4-digit year) differently from the standard-library `time.strptime`.
   When the year in the input has only two digits, `time.strptime` fails to parse it, while `pyarrow.compute.strptime` parses it without error and yields a timestamp with a year in the 1st century.
   
   For example:
   
   ```python
   >>> import pyarrow as pa
   >>> import pyarrow.compute as pc
   >>> import time
   >>> input = "01-01-23"
   >>> format = "%m-%d-%Y"
   >>> time.strptime(input, format)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/no23reason/.pyenv/versions/3.11.4/lib/python3.11/_strptime.py", line 562, in _strptime_time
       tt = _strptime(data_string, format)[0]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/Users/no23reason/.pyenv/versions/3.11.4/lib/python3.11/_strptime.py", line 349, in _strptime
       raise ValueError("time data %r does not match format %r" %
   ValueError: time data '01-01-23' does not match format '%m-%d-%Y'
   >>> pc.strptime(pa.array([input]), format=format, unit="s")
   <pyarrow.lib.TimestampArray object at 0x152e40c40>
   [
     0023-01-01 00:00:00
   ]
   ```
   I believe `pyarrow.compute.strptime` should also fail in this case, since a two-digit year matched against `%Y` most likely means that the format is wrong.
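
Until the behavior changes, one possible workaround is to pre-validate the strings with Python's own parser before handing them to `pyarrow.compute.strptime`. The sketch below uses only the standard library; `strict_strptime_check` is a hypothetical helper name, not a pyarrow API, and it assumes the input is small enough to check element by element in Python.

```python
from datetime import datetime


def strict_strptime_check(values, fmt):
    """Return the values that Python's datetime.strptime rejects for fmt.

    Hypothetical helper (not part of pyarrow): meant to be run before
    pc.strptime to catch inputs such as a two-digit year under %Y.
    """
    rejected = []
    for value in values:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            # Python requires exactly four digits for %Y, so "23" lands here.
            rejected.append(value)
    return rejected


bad = strict_strptime_check(["01-01-23", "01-01-2023"], "%m-%d-%Y")
print(bad)  # ['01-01-23']
```

If the returned list is non-empty, the caller can raise or fix the format string before calling `pc.strptime`. This is a slow per-element check, so it would probably only suit validation of a sample or of small arrays.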
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
