[jira] [Created] (ARROW-11326) utf8 vector buffers don't work if allocated within Web Assembly memory of Node.js

2021-01-19 Thread Dmitri Bronnikov (Jira)
Dmitri Bronnikov created ARROW-11326:


 Summary: utf8 vector buffers don't work if allocated within Web 
Assembly memory of Node.js
 Key: ARROW-11326
 URL: https://issues.apache.org/jira/browse/ARROW-11326
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
 Environment: Node.js on a MacBook Pro
Reporter: Dmitri Bronnikov


After making an Int32Array of offsets = [0, 1] and a Uint8Array of values = 
[ascii_code('A')], create a vector of strings:

const vec = arrow.Vector.new(arrow.Data.new(new Utf8(), 0, 1, 0, [offsets, 
values, null, null]))

then access the first and only element:

console.log(vec.get(0))

Works within browsers. Works in node.js with fixed size types, e.g. float or 
integer.
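
For context, the offsets/values layout used in the report can be sketched as 
follows. This is a minimal illustration in Rust, not Arrow code: the helper 
name utf8_get is hypothetical, and it only shows how element i is expected to 
be read as the byte range values[offsets[i]..offsets[i + 1]].

```rust
use std::convert::TryFrom;

/// Read element `i` of a UTF-8 column stored as an offsets buffer plus a
/// values buffer. Hypothetical helper for illustration, not an Arrow API.
fn utf8_get<'a>(offsets: &[i32], values: &'a [u8], i: usize) -> Option<&'a str> {
    // Element i occupies the byte range values[offsets[i]..offsets[i + 1]].
    let start = usize::try_from(*offsets.get(i)?).ok()?;
    let end = usize::try_from(*offsets.get(i + 1)?).ok()?;
    // Returns None for out-of-range offsets or invalid UTF-8.
    std::str::from_utf8(values.get(start..end)?).ok()
}
```

With offsets = [0, 1] and values = [65] (the ASCII code of 'A'), element 0 
should decode to "A", which is the value vec.get(0) is expected to return.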

Fails in node.js with this callstack

   at ../../node_modules/@apache-arrow/es2015-umd/buffer/index.js:311:1
   at __proto__ (../../node_modules/@apache-arrow/es2015-umd/buffer/index.js:167:1)
   at Function._Buffer [as from] (../../node_modules/@apache-arrow/es2015-umd/buffer/index.js:154:1)
   at prototype (../../node_modules/@apache-arrow/es2015-umd/util/utf8.ts:43:31)
   at partial2 (../../node_modules/@apache-arrow/es2015-umd/visitor/get.ts:293:12)
   at go.isArray [as get] (../../node_modules/@apache-arrow/es2015-umd/vector/index.ts:175:43)
   at Sr.get (../../node_modules/@apache-arrow/es2015-umd/util/args.ts:27:7)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-11325) [Packaging][C#] Release Apache.Arrow.Flight and Apache.Arrow.Flight.AspNetCore

2021-01-19 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-11325:


 Summary: [Packaging][C#] Release Apache.Arrow.Flight and 
Apache.Arrow.Flight.AspNetCore
 Key: ARROW-11325
 URL: https://issues.apache.org/jira/browse/ARROW-11325
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C#, Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Created] (ARROW-11324) [Rust] Querying datetime data in DataFusion with an embedded timezone always fails

2021-01-19 Thread Max Burke (Jira)
Max Burke created ARROW-11324:

 Summary: [Rust] Querying datetime data in DataFusion with an 
embedded timezone always fails
 Key: ARROW-11324
 URL: https://issues.apache.org/jira/browse/ARROW-11324
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust - DataFusion
Reporter: Max Burke


We have a number (~ hundreds of thousands) of Parquet files with embedded 
Arrow schemas that have time-valued columns with the type 
Timestamp(TimeUnit::Nanosecond, Some("UTC")).

One of the changes in the Arrow 2 -> 3 working window was to make the Parquet 
loader prefer the embedded Arrow schema over the one generated from the 
columns.

But because DataFusion has the timezone field of the Timestamp variant 
hardcoded as None, we can't load any of our data after this upgrade; we get 
errors like:



{{SELECT * FROM parquet_table WHERE ("timestamp" >= 
to_timestamp('2010-03-24T13:00:00.00Z') AND "timestamp" <= 
to_timestamp('2010-03-25T00:00:00.00Z')) ORDER BY timestamp ASC NULLS 
LAST;}}
{{Plan("\'Timestamp(Nanosecond, Some(\"UTC\")) >= Timestamp(Nanosecond, None)\' 
can\'t be evaluated because there isn\'t a common type to coerce the types 
to")}}
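
The failure above can be modelled with a small standalone sketch. The types 
and function names below (DataType, common_type_strict, common_type_relaxed) 
are hypothetical stand-ins, not DataFusion's actual coercion code; the relaxed 
rule is just one possible design, shown under the assumption that a 
timezone-naive timestamp may adopt the timezone of the other side.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TimeUnit {
    Microsecond,
    Nanosecond,
}

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    // Unit plus an optional timezone name, as in the error message.
    Timestamp(TimeUnit, Option<String>),
}

/// Strict rule: two types are comparable only if they are identical.
fn common_type_strict(l: &DataType, r: &DataType) -> Option<DataType> {
    if l == r { Some(l.clone()) } else { None }
}

/// Relaxed rule (an assumption, not the DataFusion fix): timestamps with the
/// same unit are comparable unless both carry different timezones.
fn common_type_relaxed(l: &DataType, r: &DataType) -> Option<DataType> {
    match (l, r) {
        (DataType::Timestamp(lu, ltz), DataType::Timestamp(ru, rtz)) if lu == ru => {
            match (ltz, rtz) {
                (Some(a), Some(b)) if a != b => None,
                // Keep whichever timezone is present, if any.
                _ => Some(DataType::Timestamp(*lu, ltz.clone().or_else(|| rtz.clone()))),
            }
        }
        _ => None,
    }
}
```

Under the strict rule, Timestamp(Nanosecond, Some("UTC")) and 
Timestamp(Nanosecond, None) have no common type, which matches the reported 
error; under the relaxed rule they would coerce to the timezone-aware type.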

 

Any ideas/thoughts? 





[jira] [Created] (ARROW-11323) [Rust][DataFusion] with queries with ORDER BY or GROUP BY that return no

2021-01-19 Thread Andrew Lamb (Jira)
Andrew Lamb created ARROW-11323:

 Summary: [Rust][DataFusion]  with queries with ORDER BY or GROUP 
BY that return no 
 Key: ARROW-11323
 URL: https://issues.apache.org/jira/browse/ARROW-11323
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Andrew Lamb


If you run a SQL query in DataFusion whose predicates produce no rows and that 
also includes a GROUP BY or ORDER BY clause, you get the following error:

Error of "ArrowError(ComputeError("concat requires input of at least one 
array"))"

Here are two test cases that show the problem: 
https://github.com/apache/arrow/blob/master/rust/datafusion/src/execution/context.rs#L889

{code}
#[tokio::test]
async fn sort_empty() -> Result<()> {
    // The predicate on this query purposely generates no results
    let results =
        execute("SELECT c1, c2 FROM test WHERE c1 > 10 ORDER BY c1 DESC, c2 ASC", 4).await?;
    assert_eq!(results.len(), 0);
    Ok(())
}

#[tokio::test]
async fn aggregate_empty() -> Result<()> {
    // The predicate on this query purposely generates no results
    let results = execute("SELECT SUM(c1), SUM(c2) FROM test where c1 > 10", 4).await?;
    assert_eq!(results.len(), 0);
    Ok(())
}
{code}
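
The error message suggests that a concat step is handed an empty list of 
arrays. A minimal sketch of that failure mode and of one possible guard, 
using plain Vec<i64> columns instead of Arrow arrays; both function names are 
hypothetical and this is not the DataFusion implementation:

```rust
/// Stand-in for a concat kernel that rejects empty input, mirroring the
/// error text in the report. Hypothetical, not the Arrow kernel itself.
fn concat(arrays: &[Vec<i64>]) -> Result<Vec<i64>, String> {
    if arrays.is_empty() {
        return Err("concat requires input of at least one array".to_string());
    }
    Ok(arrays.iter().flatten().copied().collect())
}

/// Guarded caller: when the predicate filtered out every batch, return an
/// empty column instead of forwarding the empty list to `concat`.
fn concat_or_empty(arrays: &[Vec<i64>]) -> Result<Vec<i64>, String> {
    if arrays.is_empty() {
        return Ok(Vec::new());
    }
    concat(arrays)
}
```

A sort or aggregate operator that calls something like concat_or_empty would 
return zero rows for the two test queries rather than an error.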





[jira] [Created] (ARROW-11322) [Rust] Arrow `memory` made private is a breaking API change

2021-01-19 Thread Max Burke (Jira)
Max Burke created ARROW-11322:

 Summary: [Rust] Arrow `memory` made private is a breaking API 
change
 Key: ARROW-11322
 URL: https://issues.apache.org/jira/browse/ARROW-11322
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Max Burke
Assignee: Jorge Leitão


We depend on functionality in the Arrow memory module for buffer building and 
this was recently made private. 

 

Please make this module public again.





[jira] [Created] (ARROW-11321) [Rust][DataFusion] Fix DataFusion compilation error

2021-01-19 Thread Jira
Daniël Heres created ARROW-11321:


 Summary: [Rust][DataFusion] Fix DataFusion compilation error
 Key: ARROW-11321
 URL: https://issues.apache.org/jira/browse/ARROW-11321
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: Daniël Heres
Assignee: Daniël Heres








[jira] [Created] (ARROW-11320) [C++] Spurious test failure when creating temporary dir

2021-01-19 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-11320:

 Summary: [C++] Spurious test failure when creating temporary dir
 Key: ARROW-11320
 URL: https://issues.apache.org/jira/browse/ARROW-11320
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou


When running the release verification script, I sometimes get this error:
{code}
[----------] 5 tests from TestInt8/TestSparseTensorRoundTrip/0, where TypeParam = arrow::Int8Type
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexRowMajor
/tmp/arrow-3.0.0.4SRpe/apache-arrow-3.0.0/cpp/src/arrow/ipc/tensor_test.cc:53: Failure
Failed
'_error_or_value8.status()' failed with IOError: Path already exists: '/tmp/ipc-test-qj6ng827/'
[  FAILED  ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexRowMajor, where TypeParam = arrow::Int8Type (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexColumnMajor
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCOOIndexColumnMajor (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSRIndex
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSRIndex (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSCIndex
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSCIndex (0 ms)
[ RUN      ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSFIndex
[       OK ] TestInt8/TestSparseTensorRoundTrip/0.WithSparseCSFIndex (1 ms)
[----------] 5 tests from TestInt8/TestSparseTensorRoundTrip/0 (1 ms total)
{code}

It seems that in some fringe cases, the random generation of temporary 
directory names produces duplicates. Most likely this means the random 
generator is getting the same seed from different processes.
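
To illustrate the suspected cause, here is a small standalone sketch (in Rust 
for brevity, not the Arrow C++ code). All names are hypothetical: a tiny 
xorshift generator stands in for the real RNG, and the pid-mixing seed is just 
one assumed way to make seeds differ between processes.

```rust
/// Tiny xorshift64 step; stands in for whatever RNG names the directory.
fn xorshift64(mut x: u64) -> u64 {
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    x
}

/// Build a name like "ipc-test-xxxxxxxx" from a seed (hypothetical helper).
fn temp_dir_name(seed: u64) -> String {
    const ALPHABET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789";
    let mut state = seed | 1; // xorshift must not start from zero
    let mut name = String::from("ipc-test-");
    for _ in 0..8 {
        state = xorshift64(state);
        name.push(ALPHABET[(state % ALPHABET.len() as u64) as usize] as char);
    }
    name
}

/// Seed taken from the clock only: two processes that start within the same
/// clock tick get the same seed and therefore the same directory name.
fn seed_from_clock(clock_ticks: u64) -> u64 {
    clock_ticks
}

/// Seed that also mixes in the process id, so concurrent processes differ.
fn seed_from_clock_and_pid(clock_ticks: u64, pid: u32) -> u64 {
    clock_ticks ^ u64::from(pid).wrapping_mul(0x9E37_79B9_7F4A_7C15)
}
```

Two calls to temp_dir_name(seed_from_clock(t)) with the same t always return 
the same name, which is the kind of duplicate seen in the log, whereas 
seed_from_clock_and_pid(t, pid) yields distinct seeds for distinct pids.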





[jira] [Created] (ARROW-11319) [Rust] [DataFusion] Improve test comparisons to record batch

2021-01-19 Thread Andrew Lamb (Jira)
Andrew Lamb created ARROW-11319:

 Summary: [Rust] [DataFusion] Improve test comparisons to record 
batch
 Key: ARROW-11319
 URL: https://issues.apache.org/jira/browse/ARROW-11319
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: Andrew Lamb
Assignee: Andrew Lamb


The test::format_batch function does not support a wide range of types (e.g. 
it doesn't support dictionaries), and its output makes tests hard to read / 
update, in my opinion. We should consolidate the tests to use 
`arrow::util::pretty::pretty_format_batches`, both to reduce code duplication 
and to increase type support.






[jira] [Created] (ARROW-11318) [Rust] Support pretty printing timestamp, date, and time types

2021-01-19 Thread Andrew Lamb (Jira)
Andrew Lamb created ARROW-11318:

 Summary: [Rust] Support pretty printing timestamp, date, and time 
types
 Key: ARROW-11318
 URL: https://issues.apache.org/jira/browse/ARROW-11318
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Andrew Lamb
Assignee: Andrew Lamb



I found this while removing `test::format_batches` (PR to come shortly): 
pretty printing was printing numbers rather than dates.
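
As an illustration of the difference, a date column stored as days since the 
Unix epoch can be rendered as a calendar date rather than a raw number. The 
sketch below uses only the standard library and a well-known days-to-civil 
conversion; the function names are hypothetical and this is not the code from 
the upcoming PR.

```rust
/// Convert days since 1970-01-01 to (year, month, day) in the proleptic
/// Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month: u32 = if mp < 10 { (mp + 3) as u32 } else { (mp - 9) as u32 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Format a days-since-epoch value as YYYY-MM-DD (hypothetical helper).
fn format_date32(days: i32) -> String {
    let (y, m, d) = civil_from_days(i64::from(days));
    format!("{:04}-{:02}-{:02}", y, m, d)
}
```

For example, format_date32(18646) yields "2021-01-19", whereas printing the 
raw value would show 18646.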






[jira] [Created] (ARROW-11317) [Arrow] Don't run CI tests twice and test the prettyprint feature

2021-01-19 Thread Andrew Lamb (Jira)
Andrew Lamb created ARROW-11317:

 Summary: [Arrow] Don't run CI tests twice and test the prettyprint 
feature
 Key: ARROW-11317
 URL: https://issues.apache.org/jira/browse/ARROW-11317
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Andrew Lamb
Assignee: Andrew Lamb








[jira] [Created] (ARROW-11316) [Rust]: BitMap is_set should return Result rather than relying on inlined assertion

2021-01-19 Thread Mahmut Bulut (Jira)
Mahmut Bulut created ARROW-11316:


 Summary: [Rust]: BitMap is_set should return Result rather than 
relying on inlined assertion
 Key: ARROW-11316
 URL: https://issues.apache.org/jira/browse/ARROW-11316
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Mahmut Bulut


The inlined assertion fails and panics when a caller of the method passes 
anything outside the 0..7 range. As a result, incorrect usage crashes the 
application that uses Arrow.
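
A minimal sketch of the proposed shape, assuming a simplified bitmap backed by 
a Vec<u8>. The Bitmap and OutOfBounds types and the try_is_set name are 
hypothetical and do not reflect the actual Arrow struct or error type:

```rust
/// Error returned when the requested bit index is past the end of the bitmap.
#[derive(Debug, PartialEq)]
struct OutOfBounds {
    index: usize,
    len: usize,
}

/// Simplified bitmap: bit i lives in byte i / 8 at bit position i % 8.
struct Bitmap {
    bits: Vec<u8>,
}

impl Bitmap {
    /// Number of addressable bits.
    fn len(&self) -> usize {
        self.bits.len() * 8
    }

    /// Assertion-based variant: panics on an out-of-range index.
    fn is_set(&self, i: usize) -> bool {
        assert!(i < self.len());
        (self.bits[i / 8] & (1 << (i % 8))) != 0
    }

    /// Result-based variant: reports an out-of-range index to the caller.
    fn try_is_set(&self, i: usize) -> Result<bool, OutOfBounds> {
        if i >= self.len() {
            return Err(OutOfBounds { index: i, len: self.len() });
        }
        Ok((self.bits[i / 8] & (1 << (i % 8))) != 0)
    }
}
```

With the Result-based variant, a caller can handle a bad index (for example by 
logging it or propagating it with ?) instead of having the process abort.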





[jira] [Created] (ARROW-11315) [Packaging][APT][arm64] Add missing gir1.2-gandiva-1.0

2021-01-19 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-11315:


 Summary: [Packaging][APT][arm64] Add missing gir1.2-gandiva-1.0
 Key: ARROW-11315
 URL: https://issues.apache.org/jira/browse/ARROW-11315
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou





