[jira] [Created] (ARROW-7782) Losing index information when using write_to_dataset with partition_cols
Ludwik Bielczynski created ARROW-7782:
-----------------------------------------

             Summary: Losing index information when using write_to_dataset with partition_cols
                 Key: ARROW-7782
                 URL: https://issues.apache.org/jira/browse/ARROW-7782
             Project: Apache Arrow
          Issue Type: Bug
         Environment: pyarrow==0.15.1
            Reporter: Ludwik Bielczynski

The index of a DataFrame is not saved when {{pyarrow.parquet.write_to_dataset()}} is called with a {{partition_cols}} argument. Here is a minimal example that shows the issue:

{code:python}
from pathlib import Path

import pandas as pd
from pyarrow import Table
from pyarrow.parquet import write_to_dataset

path = Path('/home/ludwik/Documents/YieldPlanet/research/trials')
file_name = 'trial_pq.parquet'

# DataFrame with a named, non-default index
df = pd.DataFrame({"A": [1, 2, 3], "B": ['a', 'a', 'b']},
                  index=pd.Index(['a', 'b', 'c'], name='idx'))
table = Table.from_pandas(df)

# Partitioning on 'B' is what triggers the loss of the 'idx' index
write_to_dataset(table, str(path / file_name), partition_cols=['B'],
                 partition_filename_cb=None, filesystem=None)
{code}

The issue is rather important for pandas and dask users.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-1398) No support reading columns of type decimal(19,4)
Ludwik Bielczynski created ARROW-1398:
-----------------------------------------

             Summary: No support reading columns of type decimal(19,4)
                 Key: ARROW-1398
                 URL: https://issues.apache.org/jira/browse/ARROW-1398
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.6.0
         Environment: Ubuntu 16.04, Python 3.6
            Reporter: Ludwik Bielczynski

I have a locally saved Parquet dataset that was created in Spark by querying an SQL database. When I run:

{code:python}
import pyarrow.parquet as pq

path = "path/to/parquet/dataset"
dataset = pq.ParquetDataset(path)
dataset.read()
{code}

I get an error indicating that reading columns of type decimal(19,4) is not supported. It is quite a common type in SQL databases, and I saw in the source code that there is an implementation for decimals.

I'm stuck trying to figure out a solution. Is there a workaround (for example, converting decimals to integers during reading)?

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
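
On the workaround asked about in ARROW-1398 (converting decimals to integers): the sketch below only illustrates what such a conversion means for a decimal(19,4) value, using Python's standard decimal module. It is not pyarrow API and it does not read Parquet files. The names to_unscaled_int, from_unscaled_int, SCALE and INT64_MAX are hypothetical and chosen for illustration. The idea is to store value * 10**4 as an exact integer and divide again when a decimal is needed.

```python
from decimal import Decimal

SCALE = 4              # scale of decimal(19,4): four digits after the point
INT64_MAX = 2**63 - 1  # largest value of a signed 64-bit integer


def to_unscaled_int(value: Decimal, scale: int = SCALE) -> int:
    """Return value * 10**scale as an exact Python integer.

    Raises ValueError if `value` has more fractional digits than `scale`,
    because the conversion would then lose information.
    """
    shifted = value.scaleb(scale)  # move the decimal point right by `scale`
    if shifted != shifted.to_integral_value():
        raise ValueError(f"{value} has more than {scale} fractional digits")
    return int(shifted)


def from_unscaled_int(unscaled: int, scale: int = SCALE) -> Decimal:
    """Inverse of to_unscaled_int: rebuild the decimal from the integer."""
    return Decimal(unscaled).scaleb(-scale)


if __name__ == "__main__":
    price = Decimal("1234.5678")
    as_int = to_unscaled_int(price)
    print(as_int, from_unscaled_int(as_int))

    # The largest decimal(19,4) value does not fit into a signed 64-bit int.
    largest = Decimal("999999999999999.9999")
    print(to_unscaled_int(largest) > INT64_MAX)
```

One caveat with this approach: a precision of 19 digits can exceed the signed 64-bit range (10**19 - 1 is larger than 2**63 - 1), so an integer workaround is only lossless for the full type if the target integer type is wide enough. Python's own int is unbounded, which is why the sketch works for every decimal(19,4) value.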