Joris Van den Bossche created ARROW-6158:
--------------------------------------------

Summary: [Python] possible to create StructArray with type that conflicts with child array's types
Key: ARROW-6158
URL: https://issues.apache.org/jira/browse/ARROW-6158
Project: Apache Arrow
Issue Type: Bug
Components: Python
Reporter: Joris Van den Bossche

Using the Python interface as an example. This creates a {{StructArray}} where the field types don't match the child array types:

{code}
import pyarrow as pa

a = pa.array([1, 2, 3], type=pa.int64())
b = pa.array(['a', 'b', 'c'], type=pa.string())
inconsistent_fields = [pa.field('a', pa.int32()), pa.field('b', pa.float64())]

a = pa.StructArray.from_arrays([a, b], fields=inconsistent_fields)
{code}

The above runs without error. I didn't find anything that errors (e.g. conversion to pandas, slicing), and validation also passes, but the struct type is inconsistent with the types of the child arrays:

{code}
In [2]: a
Out[2]:
<pyarrow.lib.StructArray object at 0x7f450af52eb8>
-- is_valid: all not null
-- child 0 type: int64
  [
    1,
    2,
    3
  ]
-- child 1 type: string
  [
    "a",
    "b",
    "c"
  ]

In [3]: a.type
Out[3]: StructType(struct<a: int32, b: double>)

In [4]: a.to_pandas()
Out[4]:
array([{'a': 1, 'b': 'a'}, {'a': 2, 'b': 'b'}, {'a': 3, 'b': 'c'}],
      dtype=object)

In [5]: a.validate()
{code}

Shouldn't this be disallowed somehow? (It could be checked in the Python {{from_arrays}} method, but maybe also in {{StructArray::Make}}, which already checks the number of fields against the number of arrays and that the arrays have a consistent length.)

Similarly to the discussion in ARROW-6132, I would also expect {{ValidateArray}} to catch this.
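
To illustrate the kind of check proposed above, here is a minimal sketch of what the Python side could verify before building the struct. The helper name check_struct_children is hypothetical and not part of pyarrow; it only assumes that each child array exposes a .type attribute and each field exposes .name and .type, as pyarrow arrays and fields do.

```python
def check_struct_children(arrays, fields):
    """Raise if the declared field types disagree with the child array types.

    Hypothetical helper for illustration only; it is not a pyarrow API.
    It relies solely on `array.type`, `field.name` and `field.type`.
    """
    # Same precondition the issue says StructArray::Make already enforces.
    if len(arrays) != len(fields):
        raise ValueError(
            f"got {len(arrays)} child arrays but {len(fields)} fields"
        )
    # The missing check: each field type must equal its child array's type.
    for index, (array, field) in enumerate(zip(arrays, fields)):
        if array.type != field.type:
            raise TypeError(
                f"child {index} ({field.name!r}): field type is {field.type}, "
                f"but the child array has type {array.type}"
            )
```

With the arrays from the example, calling check_struct_children on the two original child arrays and inconsistent_fields would raise a TypeError for child 0, since int64 is not equal to int32. Whether such a check belongs in {{from_arrays}}, in {{StructArray::Make}}, or in validation is the open question of this issue.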