This is an automated email from the ASF dual-hosted git repository.

AlenkaF pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 1fc0e194e8 GH-45644: [Doc][Python] Document timezone loss when converting timestamp arrays to NumPy (#49843)
1fc0e194e8 is described below

commit 1fc0e194e869ea4a66b03809d0bc3204ed83947f
Author: Alexandros Anastasiou <[email protected]>
AuthorDate: Tue May 5 13:26:15 2026 +0100

    GH-45644: [Doc][Python] Document timezone loss when converting timestamp arrays to NumPy (#49843)
    
    ### Rationale for this change
    
    NumPy's `datetime64` type does not support timezones. When converting a timezone-aware Arrow timestamp array to NumPy via `to_numpy()`, the timezone information is silently dropped. This behaviour is expected but undocumented, which can surprise users (see #45644).
    
    ### What changes are included in this PR?
    
    Adds a "Timezone-aware Timestamps" subsection to `docs/source/python/numpy.rst` that:
    
    - Explains the timezone loss when calling `to_numpy()` on tz-aware timestamp arrays
    - Shows a code example demonstrating the behavior
    - Documents two alternatives: `to_pandas()` for tz-aware Series, and `to_pylist()` for Python `datetime` objects with `tzinfo`
    
    ### Are these changes tested?
    
    Documentation-only change. All code examples were verified against pyarrow 24.0.0 and `sphinx-lint` passes clean.
    
    ### Are there any user-facing changes?
    
    No behaviour changes. This adds documentation for existing behaviour.
    
    ### AI-generated code disclosure
    
    This PR was developed with assistance from an AI coding tool (Claude, Anthropic). All changes have been reviewed, understood, and verified.
    
    * GitHub Issue: #45644
    Closes #45644
    
    Authored-by: Alexandros Anastasiou <[email protected]>
    Signed-off-by: AlenkaF <[email protected]>
---
 docs/source/python/numpy.rst | 52 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/docs/source/python/numpy.rst b/docs/source/python/numpy.rst
index 01fb1982d5..07a6aa803f 100644
--- a/docs/source/python/numpy.rst
+++ b/docs/source/python/numpy.rst
@@ -73,3 +73,55 @@ representation as Arrow, and assuming the Arrow data has no nulls.
 For more complex data types, you have to use the :meth:`~pyarrow.Array.to_pandas`
 method (which will construct a Numpy array with Pandas semantics for, e.g.,
 representation of null values).
+
+Timezone-aware Timestamps
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+NumPy's ``datetime64`` type does not support timezones. When converting a
+timezone-aware Arrow timestamp array to NumPy via :meth:`~pyarrow.Array.to_numpy`,
+the timezone information is silently dropped:
+
+.. code-block:: python
+
+   >>> arr = pa.array([1735689600, 1735689600], type=pa.timestamp("s", tz="UTC"))
+   >>> arr.type
+   TimestampType(timestamp[s, tz=UTC])
+   >>> arr.to_numpy()
+   array(['2025-01-01T00:00:00', '2025-01-01T00:00:00'],
+         dtype='datetime64[s]')
+
+If you need to preserve timezone information, there are two alternatives:
+
+* Convert to a Pandas Series, which supports timezone-aware ``datetime64`` dtypes:
+
+  .. code-block:: python
+
+     >>> arr.to_pandas()
+     0   2025-01-01 00:00:00+00:00
+     1   2025-01-01 00:00:00+00:00
+     dtype: datetime64[s, UTC]
+
+  To get a NumPy array while preserving timezone information, use
+  ``timestamp_as_object=True``:
+
+  .. code-block:: python
+
+     >>> arr.to_pandas(timestamp_as_object=True).to_numpy()  # doctest: +ELLIPSIS
+     array([datetime.datetime(2025, 1, 1, 0, 0, tzinfo=...),
+            datetime.datetime(2025, 1, 1, 0, 0, tzinfo=...)],
+           dtype=object)
+
+  .. note::
+
+     For nested types (e.g., list arrays containing timestamps),
+     ``to_pandas()`` may not preserve timezone information. Structs and maps
+     do retain timezones, but lists currently do not. See
+     `GH-41162 <https://github.com/apache/arrow/issues/41162>`_ for details.
+
+* Convert to Python ``datetime`` objects, which carry ``tzinfo``:
+
+  .. code-block:: python
+
+     >>> arr.to_pylist()  # doctest: +SKIP
+     [datetime.datetime(2025, 1, 1, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')),
+      datetime.datetime(2025, 1, 1, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC'))]
