It's "arbitrary" from Arrow's point of view, because Arrow itself cannot
represent this data (except as a binary blob). Though, as Micah said,
this may change at some point.
Instead of extending Arrow to fit this use case, perhaps it would be
better to write a separate library that sits atop Ar
Joris Van den Bossche created ARROW-5220:
Summary: [Python] index / unknown columns in specified schema in
Table.from_pandas
Key: ARROW-5220
URL: https://issues.apache.org/jira/browse/ARROW-5220
Liya Fan created ARROW-5221:
---
Summary: Improve the performance of class SegmentsUtil
Key: ARROW-5221
URL: https://issues.apache.org/jira/browse/ARROW-5221
Project: Apache Arrow
Issue Type: Impr
Can non-Arrow PMC members/committers vote?
If so, +1
-Brian
On 4/25/19, 4:34 PM, "Wes McKinney" wrote:
In a recent mailing list discussion [1] Micah Kornfield has proposed
to add new list and variable-size binary and unicode types to the
Arrow columnar format
hi Brian,
I doubt that such a change could be made on a short time horizon.
Collecting feedback and building consensus (if it is even possible)
with stakeholders would take some time. The appropriate place to have
the discussion is here on the mailing list, though
Thanks
On Mon, Apr 8, 2019 at 1
Hello Wes,
Thanks for the info! I'm working to better understand Parquet/Arrow design and
development processes. No hurry for LARGE_BYTE_ARRAY.
-Brian
On 4/26/19, 11:14 AM, "Wes McKinney" wrote:
hi Brian,
I doubt that such a change could be made on a short
Neal Richardson created ARROW-5222:
--
Summary: [Python] Issues with installing pyarrow for development
on MacOS
Key: ARROW-5222
URL: https://issues.apache.org/jira/browse/ARROW-5222
Project: Apache Ar
Hi Arrow developers,
I'm currently working on IPC in Rust, specifically reading Arrow files.
I've noticed that null buffers/bitmaps are always padded to 64 bits (from
pyarrow, not sure about others), while in Rust we pad to 8 bits.
1. Is this fine re. Rust per the spec?
I'm having issues with re
Hi Neville,
Here is my understanding. Per the spec [1], 8 bytes of padding is
allowed/required but 64 bytes is recommended (is "bits" in your e-mail a
typo?). The main rationale is to allow SIMD instructions.
For actual record batches, only padding to a multiple of 8 bytes is
required [2].
N
The Buffer struct / metadata need not be a multiple of 8 bytes necessarily
but you must write padding bytes when emitting the IPC protocol. So if your
validity bitmap is 2 bytes in-memory then you must write at least 6 more
bytes of padding on the wire.
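
To make the arithmetic above concrete, here is a minimal sketch of rounding a
buffer length up to an alignment boundary. The helper names (padded_length,
pad_buffer) are hypothetical and not taken from any Arrow implementation; the
sketch assumes the alignment is a power of two and that padding bytes are
written as zeros.

```python
def padded_length(nbytes: int, alignment: int = 8) -> int:
    """Round nbytes up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (nbytes + alignment - 1) & ~(alignment - 1)


def pad_buffer(data: bytes, alignment: int = 8) -> bytes:
    """Append zero bytes so len(result) is a multiple of alignment."""
    return data + b"\x00" * (padded_length(len(data), alignment) - len(data))


# A 2-byte validity bitmap needs 6 padding bytes to reach an 8-byte boundary.
bitmap = b"\xff\x03"
extra = padded_length(len(bitmap)) - len(bitmap)
on_wire = pad_buffer(bitmap)
```

Passing alignment=64 to the same helper gives the recommended 64-byte
boundary instead of the 8-byte minimum.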
On Fri, Apr 26, 2019, 3:48 PM Micah Kornfiel