Re: Creating and populating Arrow table directly?

2020-10-12 Thread Jacob Quinn
I'm not familiar with the internals of the pyarrow implementation, but as the primary author of the Arrow.jl Julia implementation, I think I can provide a little insight that's probably applicable. The conceptual problem here is that the arrow format is immutable; arrow data is laid out in a fixed

Creating and populating Arrow table directly?

2020-10-12 Thread Jonathan Yu
Hello there, I'm recording an a-priori known number of entries per column, and I want to create a Table using these entries. I'm currently using numpy.empty to pre-allocate empty arrays, then creating a Table from that via the pyarrow.table(data={}) constructor. It seems a bit silly to create a b

Re: [R and C++] passing arrow::Arrow from R to C++ for reading and writing?

2020-10-12 Thread Neal Richardson
Hi Colin, Does the code you shared run? If not, how does it fail? One guess is that you're probably getting undefined symbol errors because you need more than just -larrow. See https://github.com/apache/arrow/blob/master/r/configure#L35 for others you need, and depending on how you installed arro

[R and C++] passing arrow::Arrow from R to C++ for reading and writing?

2020-10-12 Thread Colin McLean
Dear Arrow users, I was wondering if anyone can help me understand how I can create an arrow::Array object in R, then pass this into C++ (using the Rcpp library) for both reading and writing? Similar to what is done using the R bigmemory (https://privefl.github.io/blog/Tip-Optimize-your

Is there a `write_record_batch` method corresponding to `pa.ipc.read_record_batch`?

2020-10-12 Thread Shawn Yang
I want to write a record batch as a separate IPC message, without writing a schema. In my case, the schema is known to peers ahead of time. I noticed Arrow Java already has this method `org.apache.arrow.vector.ipc.message.MessageSerializer#serialize(org.apache.arrow.vector.ipc.WriteChannel, org.apac