Re: [Discuss] Single offset per array has a non-trivial performance implication

2021-10-26 Thread Micah Kornfield
> To understand why this is the case, consider comparing two boolean arrays (a, b), where "a" has been sliced and has a validity and "b" does not. In this case, we could compare the values of the arrays (taking into account "a"'s offset), and clone "a"'s validity. However, this does not

Re: Arrow in HPC

2021-10-26 Thread Keith Kraus
Outside of just HPC, integrating UCX would potentially allow taking advantage of its shared memory backend which would be interesting from a performance perspective in the single-node, multi-process case in many situations. Not sure it's worth the UCX dependency in the long run, but would allow

Re: Arrow in HPC

2021-10-26 Thread Yibo Cai
On 10/26/21 10:02 PM, David Li wrote:
> Hi Yibo, Just curious, has there been more thought on this from your/the HPC side?
Yes. I will investigate the possible approach. Maybe build a quick (and dirty) POC test at first.
> I also realized we never asked, what is motivating Flight in this

Arrow sync call October 27 at 12:00 US/Eastern, 16:00 UTC

2021-10-26 Thread Ian Cook
Hi all, Our biweekly sync call is tomorrow at 12:00 noon Eastern time. The Zoom meeting URL for this and other biweekly Arrow sync calls is: https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09 Alternatively, enter this information into the Zoom website or app to join the call:

Re: [Discuss] Single offset per array has a non-trivial performance implication

2021-10-26 Thread Weston Pace
I don't think the presence of array-level offsets precludes the presence of buffer-level offsets. For example, in the C++ implementation we have both buffer offsets and array offsets. Buffer offsets are used mainly in the IPC layer I think when we are constructing arrays from larger memory
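
As a rough illustration of how the two kinds of offsets can coexist, the sketch below layers an element-level array offset on top of a byte-level buffer view into a larger memory block. All names here (`BufferView`, `UInt8Array`, `byte_offset`) are hypothetical and chosen for illustration; this is not the Arrow C++ API, only a minimal model of the idea under those assumptions.

```python
from dataclasses import dataclass


@dataclass
class BufferView:
    """Hypothetical zero-copy view into a larger memory block."""
    block: memoryview
    byte_offset: int   # buffer-level offset, in bytes
    byte_length: int

    def data(self) -> memoryview:
        # Slicing a memoryview does not copy the underlying bytes.
        return self.block[self.byte_offset:self.byte_offset + self.byte_length]


@dataclass
class UInt8Array:
    """Hypothetical array with one element-level offset on top of its buffer."""
    buffer: BufferView
    offset: int        # array-level offset, in elements
    length: int

    def get(self, i: int) -> int:
        # Both offsets apply: the buffer view first, then the array offset.
        return self.buffer.data()[self.offset + i]

    def slice(self, start: int, length: int) -> "UInt8Array":
        # Slicing the array only moves the array-level offset.
        return UInt8Array(self.buffer, self.offset + start, length)


# A 16-byte block holding 0..15; the buffer view covers bytes 4..11.
block = memoryview(bytes(range(16)))
arr = UInt8Array(BufferView(block, 4, 8), 0, 8)
sliced = arr.slice(2, 3)
values = [sliced.get(i) for i in range(sliced.length)]
```

In this model the buffer offset locates the array's data inside a larger allocation, while the array offset expresses a slice of that array.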

[Discuss] Single offset per array has a non-trivial performance implication

2021-10-26 Thread Jorge Cardoso Leitão
Hi, One aspect of the design of "arrow2" is that it deals with array slices differently from the rest of the implementations. Essentially, the offset is not stored in ArrayData, but on each individual Buffer. Some important consequences are: * people can work with buffers and bitmaps without
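
The per-buffer offset design described above can be sketched as follows. This is a simplified model, not arrow2's actual API: `Bitmap`, `BooleanArray`, and `equal` are hypothetical names, and bits are stored as a Python list of booleans for readability rather than packed into bytes. It assumes, as in the boolean comparison example discussed in this thread, that only "a" carries a validity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bitmap:
    """Hypothetical bit buffer that carries its own offset and length."""
    bits: List[bool]   # shared backing storage (one bool per bit, for clarity)
    offset: int
    length: int

    def slice(self, start: int, length: int) -> "Bitmap":
        # Zero-copy: only the offset and length change.
        return Bitmap(self.bits, self.offset + start, length)

    def get(self, i: int) -> bool:
        return self.bits[self.offset + i]


@dataclass
class BooleanArray:
    """Hypothetical array with no array-level offset; each buffer has its own."""
    values: Bitmap
    validity: Optional[Bitmap]

    def slice(self, start: int, length: int) -> "BooleanArray":
        validity = self.validity.slice(start, length) if self.validity else None
        return BooleanArray(self.values.slice(start, length), validity)


def equal(a: BooleanArray, b: BooleanArray) -> BooleanArray:
    """Element-wise equality; assumes only `a` may have a validity bitmap."""
    n = a.values.length
    out = Bitmap([a.values.get(i) == b.values.get(i) for i in range(n)], 0, n)
    # The validity is reused as-is: its offset travels with the buffer,
    # so it need not match the offset of the freshly computed values.
    return BooleanArray(out, a.validity)


# "a" is sliced and has a validity; "b" is unsliced and has none.
a = BooleanArray(
    Bitmap([True, False, True, True], 0, 4),
    Bitmap([True, True, False, True], 0, 4),
).slice(1, 3)
b = BooleanArray(Bitmap([False, False, True], 0, 3), None)
result = equal(a, b)
```

Here the result's values start at offset 0 while its validity keeps offset 1 and shares storage with "a", which is only expressible when each buffer stores its own offset.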

Re: [Parquet, C++] Writing Compliant Nested Types to Parquet

2021-10-26 Thread Sarah Gilmore
Hi Micah, Thanks for clearing this up for me! Best, Sarah From: Micah Kornfield Sent: Monday, October 25, 2021 12:06 PM To: dev Subject: Re: [Parquet, C++] Writing Compliant Nested Types to Parquet Hi Sarah, For new consumers of the library setting it to true

Re: [VOTE][RESULT] Release Apache Arrow 6.0.0 - RC3

2021-10-26 Thread Krisztián Szűcs
The current status of the post release tasks:
1. [in-pr] bump version numbers
2. [done] upload source
3. [done] upload binaries
4. [in-pr] update website
5. [depends-on-brew] upload ruby gems
6. [done] upload js packages
8. [done] upload C# packages
10. [ ] update conda recipes
11. [done] upload

Re: Arrow in HPC

2021-10-26 Thread David Li
Hi Yibo, Just curious, has there been more thought on this from your/the HPC side? I also realized we never asked, what is motivating Flight in this space in the first place? Presumably broader Arrow support in general? -David On Fri, Sep 10, 2021, at 12:27, Micah Kornfield wrote: > > > > I

[VOTE][RESULT] Release Apache Arrow 6.0.0 - RC3

2021-10-26 Thread Krisztián Szűcs
Resending with RESULT subject line. The VOTE carries with 3 binding +1 and 2 non-binding +1 votes. I'm starting the post release tasks and will keep you posted about the current status. Thanks everyone! > > On Tue, Oct 26, 2021 at 1:56 PM Benson Muite > wrote: > > > > Ok. Thanks for the

Re: [VOTE] Release Apache Arrow 6.0.0 - RC3

2021-10-26 Thread Krisztián Szűcs
The VOTE carries with 3 binding +1 and 2 non-binding +1 votes. I'm starting the post release tasks and will keep you posted about the current status. Thanks everyone! On Tue, Oct 26, 2021 at 1:56 PM Benson Muite wrote: > > Ok. Thanks for the feedback. > > Javascript may have problems when

Re: [VOTE] Release Apache Arrow 6.0.0 - RC3

2021-10-26 Thread Benson Muite
Ok. Thanks for the feedback. Javascript may have problems when using nohup so directly running env "TEST_DEFAULT=0" env "TEST_JS=1" bash dev/release/verify-release-candidate.sh source 6.0.0 3 seems to work, but nohup env "TEST_DEFAULT=0" env "TEST_JS=1" bash

Re: [VOTE] Release Apache Arrow 6.0.0 - RC3

2021-10-26 Thread Krisztián Szűcs
Thanks Benson for verifying! Created a Jira to track the deprecation warnings [1] and seems like you've already created a PR for the JavaScript issue [2]. Luckily, these issues are not blockers. [1]: https://issues.apache.org/jira/browse/ARROW-14468 [2]: