[jira] [Assigned] (ARROW-5530) [C++] Add options to ValueCount/Unique/DictEncode kernel to toggle null behavior
[ https://issues.apache.org/jira/browse/ARROW-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rok Mihevc reassigned ARROW-5530: - Assignee: Rok Mihevc > [C++] Add options to ValueCount/Unique/DictEncode kernel to toggle null > behavior > > > Key: ARROW-5530 > URL: https://issues.apache.org/jira/browse/ARROW-5530 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Rok Mihevc >Priority: Major > Labels: analytics > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-11759) [C++] Kernel to extract datetime components (year, month, day, etc) from timestamp type
[ https://issues.apache.org/jira/browse/ARROW-11759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rok Mihevc reassigned ARROW-11759:
----------------------------------

    Assignee: Rok Mihevc

> [C++] Kernel to extract datetime components (year, month, day, etc) from timestamp type
> ----------------------------------------------------------------------------------------
>
>                 Key: ARROW-11759
>                 URL: https://issues.apache.org/jira/browse/ARROW-11759
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Assignee: Rok Mihevc
>            Priority: Major
>
> It can be very useful to extract certain "fields" from the timestamp, such as
> the year, month, day, etc.
> See eg
> https://pandas.pydata.org/docs/user_guide/timeseries.html#time-date-components
> for the ones available in pandas.
> Using pandas as an example, there are the basic components of the datetime:
> {code}
> >>> ts = pd.Timestamp.now()
> >>> ts
> Timestamp('2021-02-24 10:47:54.294504')
> >>> ts.year
> 2021
> >>> ts.month
> 2
> >>> ts.day
> 24
> >>> ts.hour
> 10
> >>> ts.minute
> 49
> >>> ts.second
> 54
> >>> ts.microsecond
> 607393
> >>> ts.nanosecond
> 0
> {code}
> (only for the sub-second, this is not fully clear how to divide it in
> microseconds or milliseconds, etc)
> But in addition also some more "advanced" like:
> {code}
> >>> ts.dayofyear
> 55
> >>> ts.dayofweek
> 2
> >>> ts.week
> 8
> >>> ts.isocalendar()
> (2021, 8, 3)
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
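Editorial note: the components listed in the issue can be reproduced without pandas, which may help when checking a kernel's output against a reference. The following is a minimal sketch using only the Python standard library; the helper name datetime_components is hypothetical and is not part of Arrow or pandas, and the timestamp is the one printed in the issue.

```python
from datetime import datetime


def datetime_components(ts: datetime) -> dict:
    """Return the basic and calendar-derived fields of a timestamp.

    Hypothetical helper for illustration only; not an Arrow or pandas API.
    """
    iso_year, iso_week, iso_weekday = ts.isocalendar()
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
        "microsecond": ts.microsecond,
        # Day of the year, counting from 1 on January 1st.
        "dayofyear": ts.timetuple().tm_yday,
        # Monday is 0 and Sunday is 6, the same convention pandas uses.
        "dayofweek": ts.weekday(),
        # ISO 8601 week number and the full ISO calendar triple.
        "week": iso_week,
        "isocalendar": (iso_year, iso_week, iso_weekday),
    }


# The timestamp shown in the issue description.
ts = datetime(2021, 2, 24, 10, 47, 54, 294504)
components = datetime_components(ts)
print(components)
```

For this timestamp the calendar-derived fields agree with the values quoted in the issue: day of year 55, day of week 2, ISO week 8, and ISO calendar (2021, 8, 3).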
[jira] [Assigned] (ARROW-12437) [Rust] [Ballista] Ballista plans must not include RepartitionExec
[ https://issues.apache.org/jira/browse/ARROW-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove reassigned ARROW-12437: -- Assignee: Andy Grove > [Rust] [Ballista] Ballista plans must not include RepartitionExec > - > > Key: ARROW-12437 > URL: https://issues.apache.org/jira/browse/ARROW-12437 > Project: Apache Arrow > Issue Type: Bug > Components: Rust - Ballista >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Ballista plans must not include RepartitionExec because it results in > incorrect results. Ballista needs to manage its own repartitioning in a > distributed-aware way later on. For now we just need to configure the > DataFusion context to disable repartition. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-12437) [Rust] [Ballista] Ballista plans must not include RepartitionExec
[ https://issues.apache.org/jira/browse/ARROW-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-12437. Fix Version/s: 5.0.0 Resolution: Fixed Issue resolved by pull request 10086 [https://github.com/apache/arrow/pull/10086] > [Rust] [Ballista] Ballista plans must not include RepartitionExec > - > > Key: ARROW-12437 > URL: https://issues.apache.org/jira/browse/ARROW-12437 > Project: Apache Arrow > Issue Type: Bug > Components: Rust - Ballista >Reporter: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Ballista plans must not include RepartitionExec because it results in > incorrect results. Ballista needs to manage its own repartitioning in a > distributed-aware way later on. For now we just need to configure the > DataFusion context to disable repartition. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12437) [Rust] [Ballista] Ballista plans must not include RepartitionExec
[ https://issues.apache.org/jira/browse/ARROW-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-12437: --- Labels: pull-request-available (was: ) > [Rust] [Ballista] Ballista plans must not include RepartitionExec > - > > Key: ARROW-12437 > URL: https://issues.apache.org/jira/browse/ARROW-12437 > Project: Apache Arrow > Issue Type: Bug > Components: Rust - Ballista >Reporter: Andy Grove >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Ballista plans must not include RepartitionExec because it results in > incorrect results. Ballista needs to manage its own repartitioning in a > distributed-aware way later on. For now we just need to configure the > DataFusion context to disable repartition. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12437) [Rust] [Ballista] Ballista plans must not include RepartitionExec
Andy Grove created ARROW-12437: -- Summary: [Rust] [Ballista] Ballista plans must not include RepartitionExec Key: ARROW-12437 URL: https://issues.apache.org/jira/browse/ARROW-12437 Project: Apache Arrow Issue Type: Bug Components: Rust - Ballista Reporter: Andy Grove Ballista plans must not include RepartitionExec because it results in incorrect results. Ballista needs to manage its own repartitioning in a distributed-aware way later on. For now we just need to configure the DataFusion context to disable repartition. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12436) [Rust][Ballista] Add watch capabilities to config backend trait
[ https://issues.apache.org/jira/browse/ARROW-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-12436: --- Labels: pull-request-available (was: ) > [Rust][Ballista] Add watch capabilities to config backend trait > --- > > Key: ARROW-12436 > URL: https://issues.apache.org/jira/browse/ARROW-12436 > Project: Apache Arrow > Issue Type: Task > Components: Rust - Ballista >Reporter: Ximo Guanter >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [arrow/lib.rs at 66aa3e7c365a8d4c4eca6e23668f2988e714b493 · apache/arrow > (github.com)|https://github.com/apache/arrow/blob/66aa3e7c365a8d4c4eca6e23668f2988e714b493/rust/ballista/rust/scheduler/src/lib.rs#L183] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-12334) [Rust] [Ballista] Aggregate queries producing incorrect results
[ https://issues.apache.org/jira/browse/ARROW-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-12334. Fix Version/s: 5.0.0 Resolution: Fixed Issue resolved by pull request 10083 [https://github.com/apache/arrow/pull/10083] > [Rust] [Ballista] Aggregate queries producing incorrect results > --- > > Key: ARROW-12334 > URL: https://issues.apache.org/jira/browse/ARROW-12334 > Project: Apache Arrow > Issue Type: Bug > Components: Rust - Ballista >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > I just ran benchmarks for the first time in a while and I see duplicate > entries for group by keys. > > For example, query 1 has "group by l_returnflag, l_linestatus" and I see > multiple results with l_returnflag = 'A' and l_linestatus = 'F'. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated ARROW-12433: --- Component/s: Rust > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug > Components: Rust >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-12433. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 10082 [https://github.com/apache/arrow/pull/10082] > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12436) [Rust][Ballista] Add watch capabilities to config backend trait
Ximo Guanter created ARROW-12436: Summary: [Rust][Ballista] Add watch capabilities to config backend trait Key: ARROW-12436 URL: https://issues.apache.org/jira/browse/ARROW-12436 Project: Apache Arrow Issue Type: Task Components: Rust - Ballista Reporter: Ximo Guanter [arrow/lib.rs at 66aa3e7c365a8d4c4eca6e23668f2988e714b493 · apache/arrow (github.com)|https://github.com/apache/arrow/blob/66aa3e7c365a8d4c4eca6e23668f2988e714b493/rust/ballista/rust/scheduler/src/lib.rs#L183] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-12419) [Java] flatc is not used in mvn
[ https://issues.apache.org/jira/browse/ARROW-12419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-12419. -- Fix Version/s: 5.0.0 Resolution: Fixed Issue resolved by pull request 10067 [https://github.com/apache/arrow/pull/10067] > [Java] flatc is not used in mvn > --- > > Key: ARROW-12419 > URL: https://issues.apache.org/jira/browse/ARROW-12419 > Project: Apache Arrow > Issue Type: Improvement > Components: Java >Affects Versions: 4.0.0 >Reporter: Kazuaki Ishizaki >Assignee: Kazuaki Ishizaki >Priority: Minor > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > ARROW-12111 removed the usage of flatc during the build process in mvn. Thus, > it is not necessary to explicitly download flatc for s390x. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12435) [Rust][DataFusion] Remove unnecessary references to namespace in executor
[ https://issues.apache.org/jira/browse/ARROW-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-12435: --- Labels: pull-request-available (was: ) > [Rust][DataFusion] Remove unnecessary references to namespace in executor > - > > Key: ARROW-12435 > URL: https://issues.apache.org/jira/browse/ARROW-12435 > Project: Apache Arrow > Issue Type: Task > Components: Rust - Ballista >Reporter: Ximo Guanter >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > There is no need to support multiple executor clusters from a scheduler, so > the namespace of an executor is implicitly defined by the scheduler it > connects to. See > [https://the-asf.slack.com/archives/C01QUFS30TD/p1618679585211100] for more > context -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12435) [Rust][DataFusion] Remove unnecessary references to namespace in executor
Ximo Guanter created ARROW-12435: Summary: [Rust][DataFusion] Remove unnecessary references to namespace in executor Key: ARROW-12435 URL: https://issues.apache.org/jira/browse/ARROW-12435 Project: Apache Arrow Issue Type: Task Components: Rust - Ballista Reporter: Ximo Guanter There is no need to support multiple executor clusters from a scheduler, so the namespace of an executor is implicitly defined by the scheduler it connects to. See [https://the-asf.slack.com/archives/C01QUFS30TD/p1618679585211100] for more context -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12334) [Rust] [Ballista] Aggregate queries producing incorrect results
[ https://issues.apache.org/jira/browse/ARROW-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-12334: --- Labels: pull-request-available (was: ) > [Rust] [Ballista] Aggregate queries producing incorrect results > --- > > Key: ARROW-12334 > URL: https://issues.apache.org/jira/browse/ARROW-12334 > Project: Apache Arrow > Issue Type: Bug > Components: Rust - Ballista >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > I just ran benchmarks for the first time in a while and I see duplicate > entries for group by keys. > > For example, query 1 has "group by l_returnflag, l_linestatus" and I see > multiple results with l_returnflag = 'A' and l_linestatus = 'F'. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324321#comment-17324321 ]

Andy Grove commented on ARROW-12433:
------------------------------------

Thanks [~alippai], that is a good suggestion.

So the issue is that our builds with nightly Rust are failing (our SIMD feature requires nightly, and the nightly version of Rust we use does not have const generics yet).

I went ahead with a PR to pin to 0.8.3 to fix our builds.

> [Rust] Builds failing due to new flatbuffer release introducing const generics
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-12433
>                 URL: https://issues.apache.org/jira/browse/ARROW-12433
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Andy Grove
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> I filed [https://github.com/google/flatbuffers/issues/6572] but for now we
> should pin the dependency to 0.8.3

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated ARROW-12433: --- Priority: Blocker (was: Critical) > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Blocker > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-12433: --- Labels: pull-request-available (was: ) > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Critical > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12434) [Rust] [Ballista] Show executed plans with metrics
[ https://issues.apache.org/jira/browse/ARROW-12434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-12434: --- Labels: pull-request-available (was: ) > [Rust] [Ballista] Show executed plans with metrics > -- > > Key: ARROW-12434 > URL: https://issues.apache.org/jira/browse/ARROW-12434 > Project: Apache Arrow > Issue Type: New Feature > Components: Rust - Ballista >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Show executed plans with metrics to help with debugging and performance tuning -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12434) [Rust] [Ballista] Show executed plans with metrics
Andy Grove created ARROW-12434: -- Summary: [Rust] [Ballista] Show executed plans with metrics Key: ARROW-12434 URL: https://issues.apache.org/jira/browse/ARROW-12434 Project: Apache Arrow Issue Type: New Feature Components: Rust - Ballista Reporter: Andy Grove Assignee: Andy Grove Fix For: 5.0.0 Show executed plans with metrics to help with debugging and performance tuning -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324312#comment-17324312 ] Adam Lippai commented on ARROW-12433: - [~andygrove] I don't know if it makes any difference, I filed this https://github.com/google/flatbuffers/pull/6573/files > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Critical > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324311#comment-17324311 ]

Adam Lippai commented on ARROW-12433:
-------------------------------------

No, likely I'm the one who don't understand something here. I just followed the link in the error message to the closed issue, looked into the feature and found the blogpost that it's shipped (in a minimal version).

For me flatbuffers 0.8.4 and arrow master compiles:
{code:java}
alippai:/mnt/c/Repositories/arrow/rust/arrow$ cargo clean
alippai:/mnt/c/Repositories/arrow/rust/arrow$ cargo --version --verbose
cargo 1.51.0 (43b129a20 2021-03-16)
release: 1.51.0
commit-hash: 43b129a20fbf1ede0df411396ccf0c024bf34134
commit-date: 2021-03-16
alippai@DESKTOP-HTTH82C:/mnt/c/Repositories/arrow/rust/arrow$ cargo build
   Compiling autocfg v1.0.1
   Compiling libc v0.2.93
   Compiling proc-macro2 v1.0.24
   Compiling memchr v2.3.4
   Compiling unicode-xid v0.2.1
   Compiling syn v1.0.67
   Compiling serde v1.0.125
   Compiling cfg-if v1.0.0
   Compiling getrandom v0.1.16
   Compiling ryu v1.0.5
   Compiling bitflags v1.2.1
   Compiling byteorder v1.4.3
   Compiling lexical-core v0.7.5
   Compiling hashbrown v0.9.1
   Compiling cfg_aliases v0.1.1
   Compiling itoa v0.4.7
   Compiling serde_json v1.0.64
   Compiling ppv-lite86 v0.2.10
   Compiling serde_derive v1.0.125
   Compiling lazy_static v1.4.0
   Compiling regex-syntax v0.6.23
   Compiling arrayvec v0.5.2
   Compiling smallvec v1.6.1
   Compiling static_assertions v1.1.0
   Compiling hex v0.4.3
   Compiling arrow v4.0.0-SNAPSHOT (/mnt/c/Repositories/arrow/rust/arrow)
   Compiling regex-automata v0.1.9
   Compiling num-traits v0.2.14
   Compiling num-integer v0.1.44
   Compiling num-bigint v0.3.2
   Compiling num-rational v0.3.2
   Compiling num-iter v0.1.42
   Compiling indexmap v1.6.2
   Compiling aho-corasick v0.7.15
   Compiling csv-core v0.1.10
   Compiling quote v1.0.9
   Compiling regex v1.4.5
   Compiling time v0.1.43
   Compiling rand_core v0.5.1
   Compiling rand_chacha v0.2.2
   Compiling num-complex v0.3.1
   Compiling rand v0.7.3
   Compiling chrono v0.4.19
   Compiling bstr v0.2.15
   Compiling csv v1.1.6
   Compiling num v0.3.1
   Compiling thiserror-impl v1.0.24
   Compiling thiserror v1.0.24
   Compiling flatbuffers v0.8.4
    Finished dev [unoptimized + debuginfo] target(s) in 48.18s
{code}

> [Rust] Builds failing due to new flatbuffer release introducing const generics
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-12433
>                 URL: https://issues.apache.org/jira/browse/ARROW-12433
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Andy Grove
>            Priority: Critical
>
> I filed [https://github.com/google/flatbuffers/issues/6572] but for now we
> should pin the dependency to 0.8.3

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324310#comment-17324310 ] Daniël Heres commented on ARROW-12433: -- I think the nightly versions of rust are outdated in the CI [~andygrove] > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Critical > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324308#comment-17324308 ] Andy Grove commented on ARROW-12433: [~alippai] Am I misunderstanding this issue? > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Critical > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (ARROW-12430) [C++] Support LZO compression
[ https://issues.apache.org/jira/browse/ARROW-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324306#comment-17324306 ] Haowei Yu edited comment on ARROW-12430 at 4/17/21, 4:21 PM: - Yes, it's for parquet. was (Author: howryu): Yes, it's for parquest. > [C++] Support LZO compression > - > > Key: ARROW-12430 > URL: https://issues.apache.org/jira/browse/ARROW-12430 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Haowei Yu >Priority: Major > > I have some code that supports arrow compression with LZO and am willing to > contribute. However, I do understand there is a license concern w.r.t using > lzo library since it's under GPL2. I am not sure if you can take the change > set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324307#comment-17324307 ] Andy Grove commented on ARROW-12433: CI is already using 1.51 ... "latest update on 2021-03-25, rust version 1.51.0" > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Critical > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12430) [C++] Support LZO compression
[ https://issues.apache.org/jira/browse/ARROW-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324306#comment-17324306 ] Haowei Yu commented on ARROW-12430: --- Yes, it's for parquest. > [C++] Support LZO compression > - > > Key: ARROW-12430 > URL: https://issues.apache.org/jira/browse/ARROW-12430 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Haowei Yu >Priority: Major > > I have some code that supports arrow compression with LZO and am willing to > contribute. However, I do understand there is a license concern w.r.t using > lzo library since it's under GPL2. I am not sure if you can take the change > set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
[ https://issues.apache.org/jira/browse/ARROW-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324304#comment-17324304 ] Adam Lippai commented on ARROW-12433: - Shouldn't we bump stable rust version to 1.51 instead? Ref: https://blog.rust-lang.org/2021/03/25/Rust-1.51.0.html > [Rust] Builds failing due to new flatbuffer release introducing const generics > -- > > Key: ARROW-12433 > URL: https://issues.apache.org/jira/browse/ARROW-12433 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Andy Grove >Priority: Critical > > I filed [https://github.com/google/flatbuffers/issues/6572] but for now we > should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
Andy Grove created ARROW-12433: -- Summary: [Rust] Builds failing due to new flatbuffer release introducing const generics Key: ARROW-12433 URL: https://issues.apache.org/jira/browse/ARROW-12433 Project: Apache Arrow Issue Type: Bug Affects Versions: 4.0.0 Reporter: Andy Grove I filed [https://github.com/google/flatbuffers/issues/6572] but for now we should pin the dependency to 0.8.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-12432) [Rust] [DataFusion] Add metrics for SortExec
[ https://issues.apache.org/jira/browse/ARROW-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-12432: --- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Add metrics for SortExec > > > Key: ARROW-12432 > URL: https://issues.apache.org/jira/browse/ARROW-12432 > Project: Apache Arrow > Issue Type: New Feature > Components: Rust - DataFusion >Reporter: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Add metrics for SortExec -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12432) [Rust] [DataFusion] Add metrics for SortExec
Andy Grove created ARROW-12432: -- Summary: [Rust] [DataFusion] Add metrics for SortExec Key: ARROW-12432 URL: https://issues.apache.org/jira/browse/ARROW-12432 Project: Apache Arrow Issue Type: New Feature Components: Rust - DataFusion Reporter: Andy Grove Fix For: 5.0.0 Add metrics for SortExec -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12334) [Rust] [Ballista] Aggregate queries producing incorrect results
[ https://issues.apache.org/jira/browse/ARROW-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324288#comment-17324288 ]

Andy Grove commented on ARROW-12334:
------------------------------------

I tracked this down and there are two separate bugs:

1. We are getting RepartitionExec in the plan which is not compatible with Ballista and explodes the number of partitions (and likely causes incorrect results)
2. The query actually works fine and the final sort produces 2 rows, but the results are created by reading all the intermediate results as well

> [Rust] [Ballista] Aggregate queries producing incorrect results
> ---------------------------------------------------------------
>
>                 Key: ARROW-12334
>                 URL: https://issues.apache.org/jira/browse/ARROW-12334
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Rust - Ballista
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>
> I just ran benchmarks for the first time in a while and I see duplicate
> entries for group by keys.
>
> For example, query 1 has "group by l_returnflag, l_linestatus" and I see
> multiple results with l_returnflag = 'A' and l_linestatus = 'F'.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
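Editorial note: the second bug in the comment above (final results mixed with intermediate results) is enough on its own to produce duplicate group-by keys. The following toy sketch illustrates the symptom in plain Python. It is not Ballista or DataFusion code; the function names, the two-partition layout, and the counts are all made up for illustration.

```python
from collections import defaultdict


def merge_partials(partials):
    """Combine per-partition partial counts into one count per group key."""
    totals = defaultdict(int)
    for partial in partials:
        for key, count in partial.items():
            totals[key] += count
    return dict(totals)


def find_duplicate_keys(rows):
    """Return the group keys that occur more than once in a result set."""
    seen, duplicates = set(), set()
    for key, _value in rows:
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates


# Hypothetical partial aggregates keyed by (l_returnflag, l_linestatus).
partials = [
    {("A", "F"): 3, ("N", "O"): 2},  # partition 0
    {("A", "F"): 4},                 # partition 1
]
final = merge_partials(partials)

# Correct read: only the final stage output is returned to the client.
correct_rows = list(final.items())

# Buggy read: intermediate stage output is returned along with the final rows.
buggy_rows = [row for partial in partials for row in partial.items()]
buggy_rows += list(final.items())

print(find_duplicate_keys(correct_rows))
print(find_duplicate_keys(buggy_rows))
```

A check along the lines of find_duplicate_keys can serve as a quick sanity test on benchmark output: a correct GROUP BY result contains each key combination exactly once.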
[jira] [Commented] (ARROW-12430) [C++] Support LZO compression
[ https://issues.apache.org/jira/browse/ARROW-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324287#comment-17324287 ] Antoine Pitrou commented on ARROW-12430: Is it for Parquet? Otherwise I'm not sure what you'd need it for. > [C++] Support LZO compression > - > > Key: ARROW-12430 > URL: https://issues.apache.org/jira/browse/ARROW-12430 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Haowei Yu >Priority: Major > > I have some code that supports arrow compression with LZO and am willing to > contribute. However, I do understand there is a license concern w.r.t using > lzo library since it's under GPL2. I am not sure if you can take the change > set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12430) [C++] Support LZO compression
[ https://issues.apache.org/jira/browse/ARROW-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324280#comment-17324280 ] Haowei Yu commented on ARROW-12430: --- Ah ok, I really want arrow to support LZO since right now I have to keep the diff somewhere and re-apply it whenever I need to upgrade the arrow version, which is painful. I don't need a binary distribution, I can compile arrow by myself, that is fine. > [C++] Support LZO compression > - > > Key: ARROW-12430 > URL: https://issues.apache.org/jira/browse/ARROW-12430 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Haowei Yu >Priority: Major > > I have some code that supports arrow compression with LZO and am willing to > contribute. However, I do understand there is a license concern w.r.t using > lzo library since it's under GPL2. I am not sure if you can take the change > set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-12429) [C++] MergedGeneratorTestFixture is incorrectly instantiated
[ https://issues.apache.org/jira/browse/ARROW-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li resolved ARROW-12429. -- Fix Version/s: 5.0.0 Resolution: Fixed Issue resolved by pull request 10075 [https://github.com/apache/arrow/pull/10075] > [C++] MergedGeneratorTestFixture is incorrectly instantiated > > > Key: ARROW-12429 > URL: https://issues.apache.org/jira/browse/ARROW-12429 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: David Li >Assignee: David Li >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > [https://gist.github.com/kou/868eaed328b348e45865747044044272#file-source-cpp-txt] > Looks like the base class was accidentally instantiated instead of the actual > test -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (ARROW-12430) [C++] Support LZO compression
[ https://issues.apache.org/jira/browse/ARROW-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324218#comment-17324218 ] Antoine Pitrou edited comment on ARROW-12430 at 4/17/21, 10:25 AM: --- Indeed, the license issue is a bit tricky. It is not clear whether making use of the LZO APIs absolutely requires adherence to the GPL by Arrow itself. The GNU readline library (GPL-licensed) is in a similar situation and it [states|https://tiswww.cwru.edu/php/chet/readline/rltop.html] that you may make use of it inside software licensed under any GPL-compatible license (the Apache license 2.0 is GPL-compatible according to the FSF). However, the FSF [contradicts its own advice|https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.en.html#IfLibraryIsGPL] in the GPL FAQ. If you feel strongly about this feature, you should probably contact the LZO author and ask them their position, because that is what matters. Note that, in any case, we would not distribute binaries with LZO enabled; you would have to compile Arrow yourself for that. was (Author: pitrou): Indeed, the license issue is a bit tricky. It is not clear whether making use of the LZO APIs absolutely requires adherence to the GPL by Arrow itself. The GNU readline library (GPL-licensed) is in a similar situation and it [states|https://tiswww.cwru.edu/php/chet/readline/rltop.html] that you may make use of it inside software licensed under any GPL-compatible license (the Apache license 2.0 is GPL-compatible according to the FSF). However, the FSF [contradicts its own advice|https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.en.html#IfLibraryIsGPL] in the GPL FAQ. If you feel strongly about this feature, you should probably contact the LZO author and ask them their position. Note that, in any case, we would not distribute binaries with LZO enabled; you would have to compile Arrow yourself for that. 
> [C++] Support LZO compression > - > > Key: ARROW-12430 > URL: https://issues.apache.org/jira/browse/ARROW-12430 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Haowei Yu >Priority: Major > > I have some code that supports arrow compression with LZO and am willing to > contribute. However, I do understand there is a license concern w.r.t using > lzo library since it's under GPL2. I am not sure if you can take the change > set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (ARROW-12430) [C++] Support LZO compression
[ https://issues.apache.org/jira/browse/ARROW-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324218#comment-17324218 ] Antoine Pitrou edited comment on ARROW-12430 at 4/17/21, 10:24 AM: --- Indeed, the license issue is a bit tricky. It is not clear whether making use of the LZO APIs absolutely requires adherence to the GPL by Arrow itself. The GNU readline library (GPL-licensed) is in a similar situation and it [states|https://tiswww.cwru.edu/php/chet/readline/rltop.html] that you may make use of it inside software licensed under any GPL-compatible license (the Apache license 2.0 is GPL-compatible according to the FSF). However, the FSF [contradicts its own advice|https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.en.html#IfLibraryIsGPL] in the GPL FAQ. If you feel strongly about this feature, you should probably contact the LZO author and ask them their position. Note that, in any case, we would not distribute binaries with LZO enabled; you would have to compile Arrow yourself for that. was (Author: pitrou): Indeed, the license issue is a bit tricky. It is not clear whether making use of the LZO APIs absolutely requires adherence to the GPL by Arrow itself. The GNU readline library (GPL-licensed) is in a similar situation and it [states|https://tiswww.cwru.edu/php/chet/readline/rltop.html] that you may make use of it inside software licensed under software licensed under any GPL-compatible license (the Apache license 2.0 is GPL-compatible according to the FSF). However, the FSF [contradicts its own advice|https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.en.html#IfLibraryIsGPL] in the GPL FAQ. If you feel strongly about this feature, you should probably contact the LZO author and ask them their position. Note that, in any case, we would not distribute binaries with LZO enabled; you would have to compile Arrow yourself for that. 
> [C++] Support LZO compression > - > > Key: ARROW-12430 > URL: https://issues.apache.org/jira/browse/ARROW-12430 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Haowei Yu >Priority: Major > > I have some code that supports arrow compression with LZO and am willing to > contribute. However, I do understand there is a license concern w.r.t using > lzo library since it's under GPL2. I am not sure if you can take the change > set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-12430) [C++] Support LZO compression
[ https://issues.apache.org/jira/browse/ARROW-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324218#comment-17324218 ] Antoine Pitrou commented on ARROW-12430: Indeed, the license issue is a bit tricky. It is not clear whether making use of the LZO APIs absolutely requires adherence to the GPL by Arrow itself. The GNU readline library (GPL-licensed) is in a similar situation and it [states|https://tiswww.cwru.edu/php/chet/readline/rltop.html] that you may make use of it inside software licensed under any GPL-compatible license (the Apache license 2.0 is GPL-compatible according to the FSF). However, the FSF [contradicts its own advice|https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.en.html#IfLibraryIsGPL] in the GPL FAQ. If you feel strongly about this feature, you should probably contact the LZO author and ask them their position. Note that, in any case, we would not distribute binaries with LZO enabled; you would have to compile Arrow yourself for that. > [C++] Support LZO compression > - > > Key: ARROW-12430 > URL: https://issues.apache.org/jira/browse/ARROW-12430 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Haowei Yu >Priority: Major > > I have some code that supports arrow compression with LZO and am willing to > contribute. However, I do understand there is a license concern w.r.t using > lzo library since it's under GPL2. I am not sure if you can take the change > set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12431) [Python] pa.array mask inverted when type is binary and value to be converted in numpy array
Daniel Nugent created ARROW-12431: - Summary: [Python] pa.array mask inverted when type is binary and value to be converted in numpy array Key: ARROW-12431 URL: https://issues.apache.org/jira/browse/ARROW-12431 Project: Apache Arrow Issue Type: Bug Reporter: Daniel Nugent
{code:python}
Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import pyarrow as pa
>>>
>>> pa.array(np.array([b'\x00']),type=pa.binary(1), mask = np.array([False]))
[
  null
]
>>> pa.array(np.array([b'\x00']),type=pa.binary(1), mask = np.array([True]))
[
  00
]
>>> pa.array([b'\x00'],type=pa.binary(1), mask = np.array([False]))
[
  00
]
>>> pa.__version__
'3.0.0'
>>> np.__version__
'1.20.1'
{code}
This happens with both FixedSizeBinary and variable-sized binary (I was working with FixedSizeBinary). It does not happen for integers (and presumably not for other types, though I didn't check exhaustively).
-- This message was sent by Atlassian Jira (v8.3.4#803005)