[jira] [Created] (ARROW-11326) utf8 vector buffers don't work if allocated within Web Assembly memory of Node.js
Dmitri Bronnikov created ARROW-11326:

Summary: utf8 vector buffers don't work if allocated within Web Assembly memory of Node.js
Key: ARROW-11326
URL: https://issues.apache.org/jira/browse/ARROW-11326
Project: Apache Arrow
Issue Type: Bug
Components: JavaScript
Environment: Node.js on a MacBook Pro
Reporter: Dmitri Bronnikov

After making an Int32Array of offsets = [0, 1] and a Uint8Array of values = [ascii_code('A')], create a vector of strings:

const vec = arrow.Vector.new(arrow.Data.new(new Utf8(), 0, 1, 0, [offsets, values, null, null]))

then access the first and only element:

console.log(vec.get(0))

Works within browsers. Works in Node.js with fixed-size types, e.g. float or integer. Fails in Node.js with this call stack:

at ../../node_modules/@apache-arrow/es2015-umd/buffer/index.js:311:1
at __proto__ (../../node_modules/@apache-arrow/es2015-umd/buffer/index.js:167:1)
at Function._Buffer [as from] (../../node_modules/@apache-arrow/es2015-umd/buffer/index.js:154:1)
at prototype (../../node_modules/@apache-arrow/es2015-umd/util/utf8.ts:43:31)
at partial2 (../../node_modules/@apache-arrow/es2015-umd/visitor/get.ts:293:12)
at go.isArray [as get] (../../node_modules/@apache-arrow/es2015-umd/vector/index.ts:175:43)
at Sr.get (../../node_modules/@apache-arrow/es2015-umd/util/args.ts:27:7)

-- This message was sent by Atlassian Jira (v8.3.4#803005)
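For readers unfamiliar with the buffer layout used in ARROW-11326: a utf8 vector keeps all string bytes in one values buffer, plus an offsets buffer with length + 1 entries, so element i is the byte range offsets[i]..offsets[i + 1]. The Rust sketch below decodes an element from those two buffers. `get_utf8` is a hypothetical helper written for this illustration, not an Arrow API, and it does not reproduce or explain the Node.js failure itself.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not an Arrow API): decode element `i` of a
/// variable-length utf8 column from its offsets and values buffers.
/// Returns None for an out-of-range index, inconsistent offsets,
/// or bytes that are not valid UTF-8.
fn get_utf8<'a>(offsets: &[i32], values: &'a [u8], i: usize) -> Option<&'a str> {
    // Element i spans values[offsets[i]..offsets[i + 1]].
    let start = usize::try_from(*offsets.get(i)?).ok()?;
    let end = usize::try_from(*offsets.get(i + 1)?).ok()?;
    let bytes = values.get(start..end)?;
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // Same buffers as in the report: one element holding the single byte 'A'.
    let offsets = [0i32, 1];
    let values = [b'A'];
    println!("{:?}", get_utf8(&offsets, &values, 0)); // prints Some("A")
}
```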
[jira] [Created] (ARROW-11325) [Packaging][C#] Release Apache.Arrow.Flight and Apache.Arrow.Flight.AspNetCore
Kouhei Sutou created ARROW-11325:

Summary: [Packaging][C#] Release Apache.Arrow.Flight and Apache.Arrow.Flight.AspNetCore
Key: ARROW-11325
URL: https://issues.apache.org/jira/browse/ARROW-11325
Project: Apache Arrow
Issue Type: Improvement
Components: C#, Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
[jira] [Created] (ARROW-11324) [Rust] Querying datetime data in DataFusion with an embedded timezone always fails
Max Burke created ARROW-11324:

Summary: [Rust] Querying datetime data in DataFusion with an embedded timezone always fails
Key: ARROW-11324
URL: https://issues.apache.org/jira/browse/ARROW-11324
Project: Apache Arrow
Issue Type: Bug
Components: Rust - DataFusion
Reporter: Max Burke

We have a large number (hundreds of thousands) of Parquet files with embedded Arrow schemas whose time-valued columns have the type Timestamp(TimeUnit::Nanosecond, Some("UTC")).

One of the changes in the Arrow 2 -> 3 working window was to make the Parquet loader prefer the embedded Arrow schema over the one generated from the columns. But because DataFusion hardcodes the timezone field of the Timestamp variant as None, we can't load any of our data after this upgrade. For example, this query:

{{SELECT * FROM parquet_table WHERE ("timestamp" >= to_timestamp('2010-03-24T13:00:00.00Z') AND "timestamp" <= to_timestamp('2010-03-25T00:00:00.00Z')) ORDER BY timestamp ASC NULLS LAST;}}

fails with:

{{Plan("\'Timestamp(Nanosecond, Some(\"UTC\")) >= Timestamp(Nanosecond, None)\' can\'t be evaluated because there isn\'t a common type to coerce the types to")}}

Any ideas/thoughts?
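The plan error in ARROW-11324 comes from two timestamp types that differ only in their timezone field. Purely as an illustration, the toy model below shows one possible coercion rule: when the units match and the timezones do not conflict, adopt whichever timezone is present. `TimeUnit`, `DataType` and `common_timestamp_type` are hypothetical stand-ins defined in the sketch itself, not DataFusion's types or its actual coercion logic, and whether this rule is the right fix is left open.

```rust
// Hypothetical stand-ins for illustration only; not the Arrow/DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum TimeUnit {
    Nanosecond,
    Microsecond,
}

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Timestamp(TimeUnit, Option<String>),
    Int64,
}

/// One possible rule (an assumption, not DataFusion's behaviour):
/// two timestamps with the same unit share a common type unless both
/// carry a timezone and the timezones differ.
fn common_timestamp_type(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    match (lhs, rhs) {
        (DataType::Timestamp(lu, ltz), DataType::Timestamp(ru, rtz)) if lu == ru => {
            match (ltz, rtz) {
                // Conflicting timezones: refuse to coerce.
                (Some(l), Some(r)) if l != r => None,
                // At least one side has a timezone: keep it.
                (Some(tz), _) | (_, Some(tz)) => {
                    Some(DataType::Timestamp(lu.clone(), Some(tz.clone())))
                }
                // Neither side has a timezone.
                (None, None) => Some(DataType::Timestamp(lu.clone(), None)),
            }
        }
        // Different units or non-timestamp types are out of scope here.
        _ => None,
    }
}

fn main() {
    let column = DataType::Timestamp(TimeUnit::Nanosecond, Some("UTC".to_string()));
    let literal = DataType::Timestamp(TimeUnit::Nanosecond, None);
    let micros = DataType::Timestamp(TimeUnit::Microsecond, None);
    println!("{:?}", common_timestamp_type(&column, &literal));
    println!("{:?}", common_timestamp_type(&column, &micros));
    println!("{:?}", common_timestamp_type(&column, &DataType::Int64));
}
```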
[jira] [Created] (ARROW-11323) [Rust][DataFusion] with queries with ORDER BY or GROUP BY that return no
Andrew Lamb created ARROW-11323:

Summary: [Rust][DataFusion] with queries with ORDER BY or GROUP BY that return no
Key: ARROW-11323
URL: https://issues.apache.org/jira/browse/ARROW-11323
Project: Apache Arrow
Issue Type: Bug
Reporter: Andrew Lamb

If you run a SQL query in DataFusion whose predicates produce no rows and that also includes a GROUP BY or ORDER BY clause, you get the following error:

"ArrowError(ComputeError("concat requires input of at least one array"))"

Here are two test cases that show the problem:
https://github.com/apache/arrow/blob/master/rust/datafusion/src/execution/context.rs#L889

{code}
#[tokio::test]
async fn sort_empty() -> Result<()> {
    // The predicate on this query purposely generates no results
    let results = execute("SELECT c1, c2 FROM test WHERE c1 > 10 ORDER BY c1 DESC, c2 ASC", 4).await?;
    assert_eq!(results.len(), 0);
    Ok(())
}

#[tokio::test]
async fn aggregate_empty() -> Result<()> {
    // The predicate on this query purposely generates no results
    let results = execute("SELECT SUM(c1), SUM(c2) FROM test where c1 > 10", 4).await?;
    assert_eq!(results.len(), 0);
    Ok(())
}
{code}
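The failure mode in ARROW-11323 can be modelled without Arrow at all. In the sketch below, plain `Vec<i64>` values stand in for arrays, `concat` is a toy function that rejects an empty input list with the same message as in the report, and `concat_or_empty` is a hypothetical guard that maps "no input batches" to an empty result. Neither function is DataFusion code; this only illustrates the kind of check a sort or aggregate operator could make before concatenating.

```rust
/// Toy stand-in for a concat kernel that, like the error in the report,
/// rejects an empty list of inputs.
fn concat(arrays: &[Vec<i64>]) -> Result<Vec<i64>, String> {
    if arrays.is_empty() {
        return Err("concat requires input of at least one array".to_string());
    }
    Ok(arrays.iter().flat_map(|a| a.iter().cloned()).collect())
}

/// Hypothetical guard: treat "no input batches" as an empty result
/// instead of forwarding the empty list to `concat`.
fn concat_or_empty(arrays: &[Vec<i64>]) -> Result<Vec<i64>, String> {
    if arrays.is_empty() {
        Ok(Vec::new())
    } else {
        concat(arrays)
    }
}

fn main() {
    let no_batches: Vec<Vec<i64>> = Vec::new();
    // Without the guard, an empty input is an error.
    println!("{:?}", concat(&no_batches));
    // With the guard, it is simply an empty result.
    println!("{:?}", concat_or_empty(&no_batches));
    println!("{:?}", concat_or_empty(&[vec![1, 2], vec![3]]));
}
```

A real fix would also need the schema or data type in order to build a correctly typed empty array, which this toy model leaves out.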
[jira] [Created] (ARROW-11322) [Rust] Arrow `memory` made private is a breaking API change
Max Burke created ARROW-11322:

Summary: [Rust] Arrow `memory` made private is a breaking API change
Key: ARROW-11322
URL: https://issues.apache.org/jira/browse/ARROW-11322
Project: Apache Arrow
Issue Type: Bug
Reporter: Max Burke
Assignee: Jorge Leitão

We depend on functionality in the Arrow memory module for buffer building, and this module was recently made private. Please make it public again.
[jira] [Created] (ARROW-11321) [Rust][DataFusion] Fix DataFusion compilation error
Daniël Heres created ARROW-11321:

Summary: [Rust][DataFusion] Fix DataFusion compilation error
Key: ARROW-11321
URL: https://issues.apache.org/jira/browse/ARROW-11321
Project: Apache Arrow
Issue Type: Improvement
Components: Rust - DataFusion
Reporter: Daniël Heres
Assignee: Daniël Heres
[jira] [Created] (ARROW-11320) [C++] Spurious test failure when creating temporary dir
Antoine Pitrou created ARROW-11320:

Summary: [C++] Spurious test failure when creating temporary dir
Key: ARROW-11320
URL: https://issues.apache.org/jira/browse/ARROW-11320
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou

When running the release verification script, I sometimes get this error:

{code}
[----------] 5 tests from TestInt8/TestSparseTensorRoundTrip/0, where TypeParam = arrow::Int8Type
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexRowMajor
/tmp/arrow-3.0.0.4SRpe/apache-arrow-3.0.0/cpp/src/arrow/ipc/tensor_test.cc:53: Failure
Failed
'_error_or_value8.status()' failed with IOError: Path already exists: '/tmp/ipc-test-qj6ng827/'
[  FAILED  ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexRowMajor, where TypeParam = arrow::Int8Type (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexColumnMajor
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexColumnMajor (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSRIndex
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSRIndex (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSCIndex
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSCIndex (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSFIndex
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSFIndex (1 ms)
[----------] 5 tests from TestInt8/TestSparseTensorRoundTrip/0 (1 ms total)
{code}

It seems that in some fringe cases, the random generation of temporary directory names produces duplicates. Most likely this means the random generator is getting the same seed from different processes.
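One common way to make this kind of collision harmless, sketched here in Rust rather than in Arrow's C++ code, is to mix per-process information into the name and to retry when the directory already exists. `candidate_name` and `create_unique_dir` are hypothetical helpers for this illustration; the sketch is not the fix applied in Arrow and does not use Arrow's temporary-directory API.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Per-process counter so that repeated calls never reuse a name.
static COUNTER: AtomicU64 = AtomicU64::new(0);

/// Hypothetical helper: build a name from the process ID, the current
/// sub-second clock reading, and a counter, so that two processes started
/// at the same instant still produce different names.
fn candidate_name(prefix: &str) -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.subsec_nanos())
        .unwrap_or(0);
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("{}{}-{}-{}", prefix, process::id(), nanos, n)
}

/// Hypothetical helper: create a fresh directory under `parent`,
/// retrying with a new name if the chosen one already exists.
fn create_unique_dir(parent: &Path, prefix: &str) -> io::Result<PathBuf> {
    for _ in 0..16 {
        let path = parent.join(candidate_name(prefix));
        match fs::create_dir(&path) {
            Ok(()) => return Ok(path),
            // A name clash is not fatal: try again with a new name.
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => continue,
            Err(e) => return Err(e),
        }
    }
    Err(io::Error::new(
        io::ErrorKind::AlreadyExists,
        "could not create a unique temporary directory",
    ))
}

fn main() -> io::Result<()> {
    let tmp = std::env::temp_dir();
    let a = create_unique_dir(&tmp, "ipc-test-")?;
    let b = create_unique_dir(&tmp, "ipc-test-")?;
    println!("{}\n{}", a.display(), b.display());
    fs::remove_dir(&a)?;
    fs::remove_dir(&b)?;
    Ok(())
}
```

Because `fs::create_dir` fails when the path exists, the creation call itself acts as the uniqueness check, so no separate "does it exist?" test is needed before it.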
[jira] [Created] (ARROW-11319) [Rust] [DataFusion] Improve test comparisons to record batch
Andrew Lamb created ARROW-11319:

Summary: [Rust] [DataFusion] Improve test comparisons to record batch
Key: ARROW-11319
URL: https://issues.apache.org/jira/browse/ARROW-11319
Project: Apache Arrow
Issue Type: Improvement
Components: Rust - DataFusion
Reporter: Andrew Lamb
Assignee: Andrew Lamb

The test::format_batch function does not support a wide range of types (e.g. it doesn't support dictionaries), and its output makes tests hard to read and update, in my opinion.

We should consolidate the tests to use `arrow::util::pretty::pretty_format_batches`, both to reduce code duplication and to increase type support.
[jira] [Created] (ARROW-11318) [Rust] Support pretty printing timestamp, date, and time types
Andrew Lamb created ARROW-11318:

Summary: [Rust] Support pretty printing timestamp, date, and time types
Key: ARROW-11318
URL: https://issues.apache.org/jira/browse/ARROW-11318
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Andrew Lamb
Assignee: Andrew Lamb

I found this while removing `test::format_batches` (PR to come shortly): pretty printing was printing numbers rather than dates.
[jira] [Created] (ARROW-11317) [Arrow] Don't run CI tests twice and test the prettyprint feature
Andrew Lamb created ARROW-11317:

Summary: [Arrow] Don't run CI tests twice and test the prettyprint feature
Key: ARROW-11317
URL: https://issues.apache.org/jira/browse/ARROW-11317
Project: Apache Arrow
Issue Type: Improvement
Reporter: Andrew Lamb
Assignee: Andrew Lamb
[jira] [Created] (ARROW-11316) [Rust]: BitMap is_set should return Result rather than relying on inlined assertion
Mahmut Bulut created ARROW-11316:

Summary: [Rust]: BitMap is_set should return Result rather than relying on inlined assertion
Key: ARROW-11316
URL: https://issues.apache.org/jira/browse/ARROW-11316
Project: Apache Arrow
Issue Type: Bug
Reporter: Mahmut Bulut

The inlined assertion fails and panics whenever a caller passes an index outside the 0..7 range. As a result, incorrect usage of the method crashes the application that uses Arrow.
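To illustrate the request in ARROW-11316, here is a minimal sketch of a bounds-checked `is_set` that reports an out-of-range index as an error value instead of panicking. The `Bitmap` and `OutOfBounds` types below are hypothetical and defined only for this example; they are not the Arrow crate's types, and the least-significant-bit-first ordering is an assumption of the sketch.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct OutOfBounds {
    index: usize,
    len_bits: usize,
}

/// Hypothetical bitmap backed by a byte vector (not the Arrow type).
struct Bitmap {
    bits: Vec<u8>,
}

impl Bitmap {
    /// Returns whether bit `i` is set, or an error if `i` is past the end.
    /// Bits are numbered least-significant first within each byte.
    fn is_set(&self, i: usize) -> Result<bool, OutOfBounds> {
        let len_bits = self.bits.len() * 8;
        if i >= len_bits {
            // Report the problem to the caller instead of asserting.
            return Err(OutOfBounds { index: i, len_bits });
        }
        Ok((self.bits[i / 8] & (1 << (i % 8))) != 0)
    }
}

fn main() {
    let bitmap = Bitmap { bits: vec![0b0000_0101] };
    println!("{:?}", bitmap.is_set(0)); // prints Ok(true)
    println!("{:?}", bitmap.is_set(1)); // prints Ok(false)
    // An out-of-range index yields an Err value rather than a panic.
    println!("{:?}", bitmap.is_set(8));
}
```

The caller can then decide whether an out-of-range index is a bug worth aborting on or a condition to handle, which is the trade-off the report is asking for.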
[jira] [Created] (ARROW-11315) [Packaging][APT][arm64] Add missing gir1.2-gandiva-1.0
Kouhei Sutou created ARROW-11315:

Summary: [Packaging][APT][arm64] Add missing gir1.2-gandiva-1.0
Key: ARROW-11315
URL: https://issues.apache.org/jira/browse/ARROW-11315
Project: Apache Arrow
Issue Type: Bug
Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou