[ https://issues.apache.org/jira/browse/ARROW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou updated ARROW-2814: ---------------------------------- Fix Version/s: (was: 0.10.0) 0.11.0 > [Python] Struct type inference and conversion works for lists but not NumPy > arrays with dtype object > ---------------------------------------------------------------------------------------------------- > > Key: ARROW-2814 > URL: https://issues.apache.org/jira/browse/ARROW-2814 > Project: Apache Arrow > Issue Type: Bug > Components: Python > Affects Versions: 0.9.0 > Reporter: rob > Priority: Major > Fix For: 0.11.0 > > Attachments: > part-00000-8f03690f-736d-43a9-9287-6db9e228d59c.c000.gz.parquet > > > Example, setup: > {code} > import pandas as pd > s = pd.Series([{'data': {'document_id': None, > 'document_type': None, > 'master_customer_id': None, > 'message': 'User Login Request', > 'policy_id': None, > 'sequence_no': 14, > 'user_name': None}, > 'header': {'actor_id': None, > 'actor_type': None, > 'brand_code': 'ES', > 'event_origin': None, > 'event_timestamp': '2018-01-01T18:25:43.511Z', > 'event_type': 'LOGIN', > 'master_customer_id': '14', > 'source': 'CUSTOMER_AUTH_SERVICE', > 'source_id': None, > 'source_version': None}, > 'payload_version': '1', > 'status': {'status_code': 100, 'status_message': 'Success'}}]) > {code} > This works: > {code} > In [24]: pa.array(list(s)) > Out[24]: > <pyarrow.lib.StructArray object at 0x7f8435b09c28> > [ > {'data': {'document_id': None, 'document_type': None, 'master_customer_id': > None, 'message': 'User Login Request', 'policy_id': None, 'sequence_no': 14, > 'user_name': None}, 'header': {'actor_id': None, 'actor_type': None, > 'brand_code': 'ES', 'event_origin': None, 'event_timestamp': > '2018-01-01T18:25:43.511Z', 'event_type': 'LOGIN', 'master_customer_id': > '14', 'source': 'CUSTOMER_AUTH_SERVICE', 'source_id': None, 'source_version': > None}, 'payload_version': '1', 'status': {'status_code': 100, > 'status_message': 'Success'}} > ] > {code} > This does not: > {code} > In [23]: pa.array(s) > --------------------------------------------------------------------------- > ArrowInvalid Traceback (most recent call last) > <ipython-input-23-eba23a1638b7> in <module>() > ----> 1 pa.array(s) > ~/code/arrow/python/pyarrow/array.pxi in pyarrow.lib.array() > 175 values, type = pdcompat.get_datetimetz_type(values, > obj.dtype, > 176 type) > --> 177 return _ndarray_to_array(values, mask, type, from_pandas, > pool) > 178 else: > 179 if mask is not None: > ~/code/arrow/python/pyarrow/array.pxi in pyarrow.lib._ndarray_to_array() > 75 > 76 with nogil: > ---> 77 check_status(NdarrayToArrow(pool, values, mask, > 78 use_pandas_null_sentinels, > 79 c_type, &chunked_out)) > ~/code/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status() > 79 message = frombytes(status.message()) > 80 if status.IsInvalid(): > ---> 81 raise ArrowInvalid(message) > 82 elif status.IsIOError(): > 83 raise ArrowIOError(message) > ArrowInvalid: ../src/arrow/python/numpy_to_arrow.cc:1742 code: > converter.Convert() > Error inferring Arrow type for Python object array. Got Python object of type > dict but can only handle these types: string, bool, float, int, date, time, > decimal, bytearray, list, array > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)