Re: Efficiently allocating an empty vector (python)

2019-12-11 Thread Ted Gooch
Not sure if this is any better, but I have an open PR in Iceberg right now where we are doing something similar:
https://github.com/apache/incubator-iceberg/pull/544/commits/28166fd3f0e3a24863048a2721f1ae69f243e2af#diff-51d6edf951c105e1e62a3f1e8b4640aaR319-R341
@staticmethod
def create_null_colum

[jira] [Created] (ARROW-7080) [Python][Parquet] Expose parquet field_id in Schema objects

2019-11-06 Thread Ted Gooch (Jira)
Ted Gooch created ARROW-7080:
Summary: [Python][Parquet] Expose parquet field_id in Schema objects
Key: ARROW-7080
URL: https://issues.apache.org/jira/browse/ARROW-7080
Project: Apache Arrow

Re: questions about Gandiva

2019-10-31 Thread Ted Gooch
You can also see some of the Gandiva Python bindings in the pyarrow tests:
https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_gandiva.py
On Thu, Oct 31, 2019 at 10:26 AM Wes McKinney wrote:
> hi
>
> On Thu, Oct 31, 2019 at 12:11 AM Yibo Cai wrote:
> >
> > Hi,
> >
> > Arro

Re: A couple of questions about pyarrow.parquet

2019-05-17 Thread Ted Gooch
Progress%22%2C%20Reopened)%20AND%20text%20~%20%22pushdown%22 > > > > > > > On Fri, May 17, 2019 at 11:48 AM Ted Gooch wrote: > > > > > Hi, > > > > > > I've been doing some work trying to get the parquet read path going for > > the >

A couple of questions about pyarrow.parquet

2019-05-17 Thread Ted Gooch
Hi, I've been working on getting the Parquet read path going for the Python Iceberg library. I have two questions that I couldn't figure out, and was hoping to get some guidance from the list. First, I'd like to create a Par

Re: Pyarrow filter/sort/bsearch

2019-05-13 Thread Ted Gooch
At least for the filtering part, isn't it already possible via Gandiva filters [1]? I had a similar question about pushing record-level filtering into the Parquet reader.
[1] https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_gandiva.py#L86-L100
On Mon, May 13, 2019 at 8:51 AM W