[ 
https://issues.apache.org/jira/browse/ARROW-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546536#comment-16546536
 ] 

Antoine Pitrou commented on ARROW-2787:
---------------------------------------

Ok, I see, this is because we build the Linux binaries with 
{{-D_GLIBCXX_USE_CXX11_ABI=0}}, so you must pass that flag when building your 
Cython extension as well. It kind of sucks that it crashes weirdly instead of 
failing to compile if you don't pass it, though.
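
For illustration, the flag can be passed through the extension's compile arguments. The following is a minimal sketch of a setup.py fragment, not the project's official build recipe. The helper names `compile_args` and `make_extension` and the module name `cyth` are hypothetical, and the sketch assumes a pyarrow version that provides `get_include()`, `get_libraries()` and `get_library_dirs()`. The flag matters because it selects the libstdc++ layout of {{std::string}}, which is the type returned by {{name()}} in the reporter's C++ code.

```python
# Sketch only: helper names and the "cyth" module name are assumptions.
ABI_FLAG = "-D_GLIBCXX_USE_CXX11_ABI=0"


def compile_args(platform, extra=()):
    """Return compiler flags, adding the old libstdc++ ABI flag on Linux only."""
    args = list(extra)
    if platform.startswith("linux") and ABI_FLAG not in args:
        args.append(ABI_FLAG)
    return args


def make_extension():
    """Build the Extension object that setup(ext_modules=cythonize([...])) would take."""
    # Imported lazily so compile_args() can be used without these packages.
    import sys

    import numpy as np
    import pyarrow as pa
    from setuptools import Extension

    return Extension(
        "cyth",
        sources=["cyth.pyx"],
        language="c++",
        include_dirs=[np.get_include(), pa.get_include()],
        libraries=pa.get_libraries(),
        library_dirs=pa.get_library_dirs(),
        extra_compile_args=compile_args(sys.platform, ["-std=c++11"]),
    )
```

The flag is added only on Linux because the comment above is specifically about the Linux binaries; other platforms are left untouched in this sketch.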

> [Python] Memory Issue passing table from python to c++ via cython
> -----------------------------------------------------------------
>
>                 Key: ARROW-2787
>                 URL: https://issues.apache.org/jira/browse/ARROW-2787
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Integration, Python
>    Affects Versions: 0.9.0
>         Environment: clang6
>            Reporter: Joseph Toth
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: cython
>
> I wanted to create a simple example of reading a table in Python and passing 
> it to C++, but I'm doing something wrong or there is a memory issue. When the 
> table gets to C++ and I print out the column names, it also prints out a lot 
> of junk and what looks like pydocs. Let me know if you need any more info. 
> Thanks!
> *demo.py*
> {code:python}
> import numpy
> import pandas as pd
> import pyarrow as pa  # needed for pa.Table below
> from absl import app
> from psy.automl import cyth
> def main(argv):
>   sup = pd.DataFrame({
>     'int': [1, 2],
>     'str': ['a', 'b']
>   })
>   table = pa.Table.from_pandas(sup)
>   cyth.c_t(table)
> if __name__ == '__main__':
>   app.run(main)
> {code}
> *cyth.pyx*
> {code:python}
> import pandas as pd
> import pyarrow as pa
> from pyarrow.lib cimport *
> cdef extern from "cyth.h" namespace "psy":
>  void t(shared_ptr[CTable])
> def c_t(obj):
>   # These prints work:
>   # for i in range(obj.num_columns):
>   #   print(obj.column(i).name)
>   cdef shared_ptr[CTable] tbl = pyarrow_unwrap_table(obj)
>   t(tbl)
> {code}
>  *cyth.h*
> {code:c++}
> #include <iostream>
> #include <string>
> #include "arrow/api.h"
> #include "arrow/python/api.h"
> #include "Python.h"
> namespace psy {
> void t(std::shared_ptr<arrow::Table> pytable) {
>   // This works
>   std::cout << "NUM" << pytable->num_columns();
>   // This prints a lot of garbage
>   for (int i = 0; i < pytable->num_columns(); i++) {
>     std::cout << pytable->column(i)->name();
>   }
> }
> }  // namespace psy
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
