[jira] [Commented] (ARROW-1997) [Python] to_pandas with strings_to_categorical fails
[ https://issues.apache.org/jira/browse/ARROW-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16330767#comment-16330767 ]

ASF GitHub Bot commented on ARROW-1997:
---

wesm commented on a change in pull request #1480: ARROW-1997: [C++/Python] Ignore zero-copy-option in to_pandas when `strings_to_categorical` is True
URL: https://github.com/apache/arrow/pull/1480#discussion_r162403987

## File path: cpp/src/arrow/python/arrow_to_pandas.cc

@@ -996,14 +996,19 @@ class CategoricalBlock : public PandasBlock {
       return Status::OK();
     };

-    if (data.num_chunks() == 1 && indices_first.null_count() == 0) {
+    if (!needs_copy_ && data.num_chunks() == 1 && indices_first.null_count() == 0) {
       RETURN_NOT_OK(CheckIndices(indices_first, dict_arr_first.dictionary()->length()));
       RETURN_NOT_OK(AllocateNDArrayFromIndices(npy_type, indices_first));
     } else {
       if (options_.zero_copy_only) {
         std::stringstream ss;
-        ss << "Needed to copy " << data.num_chunks() << " chunks with "
-           << indices_first.null_count() << " indices nulls, but zero_copy_only was True";
+        if (needs_copy_) {
+          ss << "Zero-copy is not allowed, but zero_copy_only was True";

Review comment:
Is it possible to add a unit test that hits this code path, or is there a test already that does?

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> [Python] to_pandas with strings_to_categorical fails
>
>
> Key: ARROW-1997
> URL: https://issues.apache.org/jira/browse/ARROW-1997
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Licht Takeuchi
> Assignee: Licht Takeuchi
> Priority: Major
> Labels: pull-request-available
>
> Repro code.
> It seems that an unexpected deallocation occurred.
> {code:java}
> import pandas as pd
> import pyarrow as pa
> df = pd.DataFrame({
>     'Foo': ['A', 'A', 'B', 'B']
> })
> table = pa.Table.from_pandas(df)
> df = table.to_pandas(strings_to_categorical=True)
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
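For readers following the review comment above, the branch being changed in CategoricalBlock can be modelled in a few lines of plain Python. This is only a sketch of the control flow visible in the quoted hunk, not the Arrow implementation: the function name choose_conversion_path and the ZeroCopyError class are hypothetical, and because the hunk is cut off after the new `if (needs_copy_)` line, the fallback to the old "Needed to copy ..." message is an assumption.

```python
class ZeroCopyError(Exception):
    """Hypothetical stand-in for the error Status returned by the C++ code."""


def choose_conversion_path(needs_copy, num_chunks, null_count, zero_copy_only):
    """Model of the branch in the quoted hunk (names are illustrative only).

    Returns "zero-copy" or "copy", or raises ZeroCopyError when a copy is
    unavoidable but the caller asked for zero_copy_only.
    """
    # Fast path: a single chunk without nulls, and no copy forced upstream.
    if not needs_copy and num_chunks == 1 and null_count == 0:
        return "zero-copy"

    if zero_copy_only:
        if needs_copy:
            # Message added by the pull request.
            raise ZeroCopyError(
                "Zero-copy is not allowed, but zero_copy_only was True"
            )
        # Assumption: the truncated else-branch keeps the original message.
        raise ZeroCopyError(
            f"Needed to copy {num_chunks} chunks with {null_count} "
            "indices nulls, but zero_copy_only was True"
        )

    # A copy is required and permitted.
    return "copy"
```

A unit test of the kind requested in the review would presumably need to exercise the case where needs_copy is set and zero_copy_only is True, since that is the only route to the new message.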
[jira] [Commented] (ARROW-1997) [Python] to_pandas with strings_to_categorical fails
[ https://issues.apache.org/jira/browse/ARROW-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329279#comment-16329279 ]

ASF GitHub Bot commented on ARROW-1997:
---

xhochy commented on issue #1480: ARROW-1997: [C++/Python] Ignore zero-copy-option in to_pandas when `strings_to_categorical` is True
URL: https://github.com/apache/arrow/pull/1480#issuecomment-358417487

@Licht-T The code looks good, but I fail to understand the initial problem and thus cannot really understand what the change should actually do. Can you explain it a bit more?
[jira] [Commented] (ARROW-1997) [Python] to_pandas with strings_to_categorical fails
[ https://issues.apache.org/jira/browse/ARROW-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326658#comment-16326658 ]

ASF GitHub Bot commented on ARROW-1997:
---

Licht-T opened a new pull request #1480: ARROW-1997: [C++/Python] Avoid zero-copy-option in to_pandas when `strings_to_categorical` is True
URL: https://github.com/apache/arrow/pull/1480

This closes [ARROW-1997](https://issues.apache.org/jira/browse/ARROW-1997).