Thank you for putting together this proposal. Very exciting development. I
left some comments in the RFC doc, summarized here as:
* FlatBuffers is usable as a serialization-agnostic IDL
(https://adsharma.github.io/flattools/)
* serde library + msgpack is a worthy candidate to consider for
serializ
I agree with you. Any thoughts on a way forward for at least hardening the
spec (or should this be done at the same time as adding the new field)?
On Mon, Aug 16, 2021 at 1:45 AM Wes McKinney wrote:
> I've been poking around the project, and I'm growing concerned that
> our use of the KeyValue fi
PS: we need to check what databases do / allow as well
On 16/08/2021 at 23:12, Antoine Pitrou wrote:
POSIX allows for a single leap second:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/time.h.html
The Windows API does not seem to know about leap seconds:
https://docs.microsoft
POSIX allows for a single leap second:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/time.h.html
The Windows API does not seem to know about leap seconds:
https://docs.microsoft.com/en-us/windows/win32/api/minwinbase/ns-minwinbase-systemtime
The standard Python type `datetime.time`
At the risk of opening a can of worms, isn't it possible that a time could
exceed 24 hours? Like, when there are leap seconds added?
> Some experiments inspired by an SO post[1] led me to question the meaning
> of time.
Looks like the arrow mailing list is taking a philosophical turn :)
Neal
On M
On 16/08/2021 at 20:52, Weston Pace wrote:
Some experiments inspired by an SO post[1] led me to question the
meaning of time. The main question is **what happens when the value
exceeds 24 hours?**
A) One potential interpretation is that these are invalid but neither
the C++ implementatio
Some experiments inspired by an SO post[1] led me to question the
meaning of time. The main question is **what happens when the value
exceeds 24 hours?**
A) One potential interpretation is that these are invalid, but neither
the C++ implementation nor pyarrow rejects these today. Nor do they
corr
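
As a rough illustration of interpretation A (treating values of 24 hours or more as invalid), a validating conversion might look like the sketch below. The helper names and the choice of seconds as the unit are assumptions made for this example and are not pyarrow API. It uses only the standard library, where `datetime.time` accepts hours 0..23 and seconds 0..59 and therefore cannot represent a leap second.

```python
import datetime

# Length of a day in seconds, ignoring leap seconds (assumption for this sketch).
SECONDS_PER_DAY = 24 * 60 * 60


def is_within_one_day(seconds_since_midnight: int) -> bool:
    """Return True if the value lies in the half-open range [0, 24h)."""
    return 0 <= seconds_since_midnight < SECONDS_PER_DAY


def to_python_time(seconds_since_midnight: int) -> datetime.time:
    """Convert seconds since midnight to datetime.time.

    Hypothetical helper: raises ValueError for values outside [0, 24h)
    instead of silently wrapping or passing them through.
    """
    if not is_within_one_day(seconds_since_midnight):
        raise ValueError(
            f"time value {seconds_since_midnight}s is outside [0, {SECONDS_PER_DAY})"
        )
    hours, remainder = divmod(seconds_since_midnight, 3600)
    minutes, seconds = divmod(remainder, 60)
    return datetime.time(hours, minutes, seconds)
```

Rejecting out-of-range values is only one of the possible interpretations; wrapping modulo 24 hours would be a different design choice.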
I agree that "what happens when NumPy is not available at runtime" is a
rather annoying problem. I'm not sure what happens when you call one
of the NumPy C API functions and NumPy is not found (crash? error
return?). It can probably be detected, but needs to be done
consistently at the start of
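
At the Python level, a check of this kind could be done once up front. The sketch below is only an assumption about how such detection might look (the function name and the `HAS_NUMPY` flag are hypothetical); it does not address what happens inside the NumPy C API.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


# Hypothetical module-level flag, evaluated once and consulted by callers
# before they take any NumPy-dependent code path.
HAS_NUMPY = module_available("numpy")
```

Callers could then branch on `HAS_NUMPY` and raise a clear error when a NumPy-only feature is requested.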
I've thought about this in the past, and I would like to make NumPy an
optional dependency, but one of the things that kept me from trying
was the extent to which NumPy arrays are supported as inputs (or
elements of inputs) to pyarrow.array. The implementation in
python_to_arrow.cc is significantly
It seems like a good idea to attempt to make this change. The most
difficult thing might be projects that use the arrow/python/pyarrow.h
C++ API, so we would have to provide a viable migration path for
those. turbodbc is one example
https://github.com/blue-yonder/turbodbc/search?l=C%2B%2B&q=pyarro
As Arrow/PyArrow grows more compute functions and features, more users
may come to rely on PyArrow without going through pandas or NumPy.
NumPy is a compile-time dependency for PyArrow, as it's required to compile
the C++ code needed to implement the pan
I agree with this proposal, the Arrow C++ library does not need to depend
on Python or PyArrow code.
AFAIU this will eliminate the use of the -DARROW_PYTHON build flag for Arrow
C++ given that Python-related code will be compiled with PyArrow builds.
Besides the use of "ARROW_PYTHON" env variable in CM
I definitely think this is desirable.
There's probably going to be a bit of work getting it to pass on all CI
(including the various nightly builds).
Regards
Antoine.
On 16/08/2021 at 17:08, Alessandro Molina wrote:
PyArrow is currently a full Cython codebase, but in reality it relies on
This seems reasonable as long as it is actually feasible (the dependencies
are cleanly separable).
A while ago I had a proof-of-concept Bazel build working that was able to
automatically build the changes together.
On Monday, August 16, 2021, David Li wrote:
> I support this. In the past I had
I support this. In the past I had to effectively do this manually to build
Arrow/PyArrow in a monorepo (to build for multiple Python versions
simultaneously without having conflicting copies of Arrow for each Python
version). From what I remember, there's some usage of Arrow-internal headers
th
PyArrow is currently a full Cython codebase, but in reality it relies on some
classes and functions that are implemented in C++ within the src/python
directory (https://github.com/apache/arrow/tree/master/cpp/src/arrow/python).
Especially for numpy/pandas conversion code that has to interface with
The vote passes with 3 +1 binding and 1 +1 non-binding
The release is available here:
https://dist.apache.org/repos/dist/release/arrow/arrow-rs-5.2.0
The release has also been published to crates.io:
https://crates.io/crates/arrow/5.2.0
https://crates.io/crates/arrow-flight/5.2.0
https://crates
I've been poking around the project, and I'm growing concerned that
our use of the KeyValue field has already been non-compliant in many
cases since we do not validate UTF8-ness. Since we also use KeyValue
to handle opaque data serialization for extension types [1], the fact
that the specification
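
To make the concern concrete, the sketch below shows one way UTF-8 validity of key/value metadata could be checked. It is an illustration under assumptions: the function names are hypothetical, the metadata is modelled as a plain `dict` of `bytes`, and nothing here reflects how any Arrow implementation actually validates (or does not validate) these fields.

```python
from typing import Dict, List


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def invalid_entries(metadata: Dict[bytes, bytes]) -> List[bytes]:
    """Return the keys of entries whose key or value is not valid UTF-8."""
    return [
        key
        for key, value in metadata.items()
        if not (is_valid_utf8(key) and is_valid_utf8(value))
    ]
```

Opaque binary payloads, such as serialized extension type data, would generally fail a check like this, which is the tension the message describes.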