Hi all,
Looking for guidance on how to submit a design and PR to add WASM32 support
to Apache Arrow's Rust libraries.
I am looking to use the Arrow library to pass data in Arrow format between
the host Spark environment and UDFs defined in WASM.
I created the following JIRA ticket to capture
Thanks Micah. I'll check in the test file that has the V6 metadata and
open a PR later today
On Mon, Jul 13, 2020 at 5:53 PM Micah Kornfield wrote:
>
> To clarify on UBSAN and enums. My understanding is:
>
> enum A { a = 1, b = 2, c = 3 };
> enum class B : int16_t { a = 1, b = 2, c = 3 };
>
> A a
To clarify on UBSAN and enums. My understanding is:
enum A { a = 1, b = 2, c = 3 };
enum class B : int16_t { a = 1, b = 2, c = 3 };
A a = static_cast<A>(4); // UB
B b = static_cast<B>(4); // Not UB. Declaring the underlying type makes this
                         // allowable.
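The distinction above can be sketched as follows. This is a minimal illustration, not code from Arrow: the helper names `BFromRaw` and `IsKnownB` are hypothetical. It assumes the usual rule that an enum with a fixed underlying type can hold every value of that type, so an out-of-range cast is well defined and can be checked afterwards.

```cpp
#include <cstdint>

// Unscoped enum without a fixed underlying type: its valid values are limited
// to the range implied by its enumerators (0 to 3 here).
enum A { a = 1, b = 2, c = 3 };

// Scoped enum with a fixed underlying type: every int16_t value is valid.
enum class B : std::int16_t { a = 1, b = 2, c = 3 };

// Hypothetical helper: converting any int16_t to B is well defined because
// the underlying type is fixed.
constexpr B BFromRaw(std::int16_t raw) { return static_cast<B>(raw); }

// Hypothetical helper: reports whether a B value matches a named enumerator.
constexpr bool IsKnownB(B value) {
  return value == B::a || value == B::b || value == B::c;
}

// The equivalent conversion for A is deliberately left commented out, since
// 4 lies outside A's range of values:
//   A bad = static_cast<A>(4);  // undefined behavior
```

With these definitions, `IsKnownB(BFromRaw(4))` evaluates to `false` without invoking undefined behavior, whereas the same cast for `A` has no defined result to inspect.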
On Mon, Jul 13, 2020 at 3:44 PM Micah Kornfield
Please see [1]. I ran this arrow-ipc-read-write-test with UBSAN enabled
and it passed (this isn't my normal dev environment so please double check).
https://github.com/emkornfield/arrow/commit/7fbd0fb95f7ea164284720428c7974b87b4b2443
On Mon, Jul 13, 2020 at 3:12 PM Micah Kornfield
wrote:
> I
I think this might be more complicated. Let me see if I can write a test
that demonstrates what I'm talking about.
On Mon, Jul 13, 2020 at 3:10 PM Wes McKinney wrote:
> Here's a patch that does the check
>
>
> https://github.com/wesm/arrow/commit/5bfdb4255a66a4ec62b1c36ba07682fad47df9a7
>
>
Here's a patch that does the check
https://github.com/wesm/arrow/commit/5bfdb4255a66a4ec62b1c36ba07682fad47df9a7
Here is a serialized schema that uses a V6 version
https://drive.google.com/file/d/1GiWh5yKXdMaLRWU5K4cnGW2ilybF0LF_/view?usp=sharing
See in action
On Mon, Jul 13, 2020 at 4:43 PM Micah Kornfield wrote:
>>
>> We don't have any test cases that have a future metadata version. I
>> made a branch where I added V6 and wrote an IPC message, then found
>> that I was unable to determine that it was out of bounds (presumably
>> UBSAN would error,
>
> We don't have any test cases that have a future metadata version. I
> made a branch where I added V6 and wrote an IPC message, then found
> that I was unable to determine that it was out of bounds (presumably
> UBSAN would error, though, but we need a runtime error outside of
> ASAN/UBSAN).
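One way to get a runtime error without relying on sanitizers is to compare the raw integer against the highest known version before treating it as an enum value. The sketch below is an assumption-laden illustration, not the patch discussed in this thread: the `sketch::MetadataVersion` enum, `kMaxKnownVersion`, and `IsSupportedVersion` are hypothetical stand-ins for the generated Flatbuffers definitions.

```cpp
#include <cstdint>

namespace sketch {

// Hypothetical stand-in for the generated enum; the real definition lives in
// the Flatbuffers-generated header.
enum class MetadataVersion : std::int16_t { V1 = 0, V2 = 1, V3 = 2, V4 = 3, V5 = 4 };

// Highest version this (hypothetical) reader understands.
constexpr std::int16_t kMaxKnownVersion =
    static_cast<std::int16_t>(MetadataVersion::V5);

// Validate the raw on-the-wire value before it is used as a MetadataVersion,
// so a future version such as V6 is rejected with an ordinary runtime check.
constexpr bool IsSupportedVersion(std::int16_t raw) {
  return raw >= 0 && raw <= kMaxKnownVersion;
}

}  // namespace sketch
```

A reader would call `sketch::IsSupportedVersion` on the integer it decoded and return an error status when the result is `false`, rather than casting first and checking afterwards.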
On Mon, Jul 13, 2020 at 4:31 PM Micah Kornfield wrote:
>
> >
> >
> > That static cast is currently undefined behavior.
>
> Is ubsan reporting this? When looking into the feature enum I tried to
> understand if that was valid. At the time I read the C++ spec* if the enum
> has an explicitly
>
>
> That static cast is currently undefined behavior.
Is ubsan reporting this? When looking into the feature enum I tried to
understand if that was valid. When I read the C++ spec* at the time, my
understanding was that if the enum has an explicitly declared underlying
type, all values in that type's range are supported.
The generated
I've discovered while working on ARROW-9399 that it is very difficult
with the Flatbuffers API in C++ to detect a MetadataVersion [1] that
is higher than the current version.
For example, suppose that 3 or 4 years from now we move from version
V5 to version V6. The generated Flatbuffers code
In that case, I will take my time :)
On Mon, Jul 13, 2020 at 11:00 AM Antoine Pitrou wrote:
>
> I don't think we want to introduce last-minute unforeseen issues (such
> as security issues) in the IPC layer, so personally I'd rather defer the
> feature enum implementation to the next version.
>
https://issues.apache.org/jira/browse/ARROW-9443 is the other R build
issue. I should get a fix one way or another today, but regardless, it is
not release-blocking.
Neal
On Mon, Jul 13, 2020 at 8:24 AM Neal Richardson
wrote:
> conda-r is ticketed
I don't think we want to introduce last-minute unforeseen issues (such
as security issues) in the IPC layer, so personally I'd rather defer the
feature enum implementation to the next version.
Just my two cents :)
Regards
Antoine.
On 13/07/2020 at 19:42, Micah Kornfield wrote:
> I'll try
I'll try to make PRs for the feature enum tonight. I don't think this is a
blocker as there are other mechanisms to detect the current values listed.
On Mon, Jul 13, 2020 at 10:37 AM Wes McKinney wrote:
> Aside from fixing nightly builds, which of the 25 issues remaining in
> the 1.0.0
Aside from fixing nightly builds, which of the 25 issues remaining in
the 1.0.0 milestone must be resolved in order to release? Speak now or
forever hold your peace =)
As one problem where I haven't seen activity, we have not implemented
the Feature Enum anywhere, do we want to try to add simple
conda-r is ticketed (https://issues.apache.org/jira/browse/ARROW-9409) and
has a PR (https://github.com/apache/arrow/pull/7706) but there are
remaining issues and I am uncertain that this is a build worth maintaining
anyway. If anyone has opinions, please comment on the PR.
As for any other R
Agreed, but even then, if some Parquet files are generated inside of a
well-defined system which only needs to be interoperable with itself,
it's not necessarily harmful to allow LZ4 compression when writing new files.
Regards
Antoine.
On 13/07/2020 at 17:07, Wes McKinney wrote:
> I didn’t
On Mon, Jul 13, 2020 at 11:15 AM Antoine Pitrou wrote:
>
>
> I'm not sure that's a good idea. There are probably Parquet files that
> are only ever used with the Arrow implementation (Arrow C++, Arrow
> Python, Arrow R...).
I tend to agree with Antoine here. As an alternative to disabling the
I didn’t say to disable _reading_ them, only writing them.
On Mon, Jul 13, 2020 at 4:15 AM Antoine Pitrou wrote:
>
> I'm not sure that's a good idea. There are probably Parquet files that
> are only ever used with the Arrow implementation (Arrow C++, Arrow
> Python, Arrow R...).
>
> I admit
I'll volunteer to disable writing/reading LZ4. I'll submit a patch in the next
few days.
On 2020/07/12 22:11:33, Wes McKinney wrote:
> Since there hasn't been other movement on this, we need to disable
> writing LZ4-compressed files until this can be investigated more
> thoroughly. If someone
Failures with patches:
- wheel-osx-*: https://github.com/apache/arrow/pull/7728 should fix it
- conda-cpp-valgrind: https://github.com/apache/arrow/pull/7727 should fix it
Known failures:
- conda-python-3.8-jpype: https://issues.apache.org/jira/browse/ARROW-9385
New failures:
-
I resubmitted the nightly jobs and the builds are running:
https://github.com/ursa-labs/crossbow/branches/all?query=build-834
We'll see tomorrow whether the issue persists or not.
On Mon, Jul 13, 2020 at 1:31 PM Krisztián Szűcs
wrote:
>
> This report is misleading because no builds were
This report is misleading because no builds were triggered at all, see
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-07-13-0
I'm investigating it.
On Mon, Jul 13, 2020 at 12:15 PM Crossbow wrote:
>
>
> Arrow Build Report for Job nightly-2020-07-13-0
>
> All tasks:
>
Hi,
My organization already uses the official C# Arrow library for a product.
It seems that the official library is working fine on the product,
so I think it has some stability compared to what it used to be.
Moreover, I think that if we focus on the official C# implementation,
we can test
Arrow Build Report for Job nightly-2020-07-13-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-07-13-0
Succeeded Tasks:
- centos-6-amd64:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-07-13-0-github-centos-6-amd64
-
On Sat, 11 Jul 2020 09:55:16 -0700
Jacques Nadeau wrote:
>
> I'm against extending use of flatbuf within Arrow. The language support is
> too weak. Language support isn't just about having a binding for different
> languages, it is about having a high-quality binding.
Could you please expand on
I'm not sure that's a good idea. There are probably Parquet files that
are only ever used with the Arrow implementation (Arrow C++, Arrow
Python, Arrow R...).
I admit I'm also not terribly bothered about this, since the Parquet
community itself doesn't seem to care much about the issue (it has