[ 
https://issues.apache.org/jira/browse/ARROW-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360835#comment-16360835
 ] 

ASF GitHub Bot commented on ARROW-2129:
---------------------------------------

wesm closed pull request #1588: ARROW-2129: [Python] Handle conversion of empty tables to Pandas
URL: https://github.com/apache/arrow/pull/1588

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/cpp/src/arrow/python/arrow_to_pandas.cc b/cpp/src/arrow/python/arrow_to_pandas.cc
index 60a2eae5d..048898936 100644
--- a/cpp/src/arrow/python/arrow_to_pandas.cc
+++ b/cpp/src/arrow/python/arrow_to_pandas.cc
@@ -280,6 +280,9 @@ class PandasBlock {
 
 template <typename T>
 inline const T* GetPrimitiveValues(const Array& arr) {
+  if (arr.length() == 0) {
+    return nullptr;
+  }
   const auto& prim_arr = static_cast<const PrimitiveArray&>(arr);
   const T* raw_values = reinterpret_cast<const T*>(prim_arr.values()->data());
   return raw_values + arr.offset();
@@ -304,9 +307,11 @@ inline void ConvertIntegerNoNullsSameType(PandasOptions options, const ChunkedAr
                                           T* out_values) {
   for (int c = 0; c < data.num_chunks(); c++) {
     const auto& arr = *data.chunk(c);
-    const T* in_values = GetPrimitiveValues<T>(arr);
-    memcpy(out_values, in_values, sizeof(T) * arr.length());
-    out_values += arr.length();
+    if (arr.length() > 0) {
+      const T* in_values = GetPrimitiveValues<T>(arr);
+      memcpy(out_values, in_values, sizeof(T) * arr.length());
+      out_values += arr.length();
+    }
   }
 }
 
diff --git a/python/pyarrow/tests/test_convert_pandas.py b/python/pyarrow/tests/test_convert_pandas.py
index 7dbf0d7ed..ee202b706 100644
--- a/python/pyarrow/tests/test_convert_pandas.py
+++ b/python/pyarrow/tests/test_convert_pandas.py
@@ -1322,6 +1322,12 @@ def test_table_batch_empty_dataframe(self):
         _check_pandas_roundtrip(df2, preserve_index=True)
         _check_pandas_roundtrip(df2, as_batch=True, preserve_index=True)
 
+    def test_convert_empty_table(self):
+        arr = pa.array([], type=pa.int64())
+        tm.assert_almost_equal(arr.to_pandas(), np.array([], dtype=np.int64))
+        arr = pa.array([], type=pa.string())
+        tm.assert_almost_equal(arr.to_pandas(), np.array([], dtype=object))
+
     def test_array_from_pandas_date_with_mask(self):
         m = np.array([True, False, True])
         data = pd.Series([


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Python] Segmentation fault on conversion of empty array to Pandas
> ------------------------------------------------------------------
>
>                 Key: ARROW-2129
>                 URL: https://issues.apache.org/jira/browse/ARROW-2129
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.8.0
>            Reporter: Uwe L. Korn
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Converting an empty {{pyarrow.Array}} to a Pandas series causes a 
> segmentation fault.
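
For readers without an Arrow checkout, the guard added in the diff above can
be illustrated with a minimal standalone sketch. It does not use the Arrow
API: Chunk and ConcatenateChunks are hypothetical names invented for this
example, and the assumption that an empty chunk may carry a null data pointer
is mine, not something stated in the issue. The sketch only shows the general
pattern of skipping zero-length chunks so that memcpy never receives a null
source pointer (which is undefined behavior even for a zero-byte copy).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for one chunk of a chunked array: a raw pointer plus
// a length. Assumption: an empty chunk may have a null data pointer.
template <typename T>
struct Chunk {
  const T* data;
  std::size_t length;
};

// Copy all chunks back to back into one buffer, skipping empty chunks so
// that memcpy is never called with a null source pointer.
template <typename T>
std::vector<T> ConcatenateChunks(const std::vector<Chunk<T>>& chunks) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy requires a trivially copyable element type");

  std::size_t total = 0;
  for (const auto& chunk : chunks) {
    total += chunk.length;
  }

  std::vector<T> out(total);
  T* out_values = out.data();
  for (const auto& chunk : chunks) {
    if (chunk.length > 0) {
      // A non-empty chunk is expected to point at real data.
      assert(chunk.data != nullptr);
      std::memcpy(out_values, chunk.data, sizeof(T) * chunk.length);
      out_values += chunk.length;
    }
  }
  return out;
}
```

Calling ConcatenateChunks on a list that mixes empty and non-empty chunks
copies only the non-empty ones, and a list made up solely of empty chunks
yields an empty vector without touching memcpy at all.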



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)