[jira] [Created] (ARROW-9591) [Rust] Investigate removing the offset requirement on the boolean kernels

2020-07-28 Thread Paddy Horan (Jira)
Paddy Horan created ARROW-9591:
--

 Summary: [Rust] Investigate removing the offset requirement on the 
boolean kernels
 Key: ARROW-9591
 URL: https://issues.apache.org/jira/browse/ARROW-9591
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Paddy Horan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-9590) [Rust] Ensure SIMD kernel implementations handle slicing correctly

2020-07-28 Thread Paddy Horan (Jira)
Paddy Horan created ARROW-9590:
--

 Summary: [Rust] Ensure SIMD kernel implementations handle slicing 
correctly
 Key: ARROW-9590
 URL: https://issues.apache.org/jira/browse/ARROW-9590
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Reporter: Paddy Horan








[jira] [Created] (ARROW-9589) [C++/R] arrow_exports.h contains structs declared as class

2020-07-28 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-9589:
---

 Summary: [C++/R] arrow_exports.h contains structs declared as class
 Key: ARROW-9589
 URL: https://issues.apache.org/jira/browse/ARROW-9589
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Affects Versions: 1.0.0
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 2.0.0


This is an issue in an MSVC-based toolchain.





[jira] [Created] (ARROW-9588) [C++] clang/win: Copy constructor of ParquetInvalidOrCorruptedFileException not correctly triggered

2020-07-28 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-9588:
---

 Summary: [C++] clang/win: Copy constructor of 
ParquetInvalidOrCorruptedFileException not correctly triggered
 Key: ARROW-9588
 URL: https://issues.apache.org/jira/browse/ARROW-9588
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Uwe Korn


The copy constructor of ParquetInvalidOrCorruptedFileException doesn't seem to 
be picked up correctly when building with clang 9.0.1 on Windows in an MSVC 
toolchain.

Adding {{ParquetInvalidOrCorruptedFileException(const 
ParquetInvalidOrCorruptedFileException&) = default;}} as an explicit copy 
constructor didn't help.

Happy to hear any ideas here, though it is probably a long shot as there are 
other clang-msvc problems.

{code}
[49/62] Building CXX object 
src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx.obj
FAILED: src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx.obj
C:\Users\Administrator\miniconda3\conda-bld\arrow-cpp-ext_1595962790058\_build_env\Library\bin\clang++.exe
  -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 
-DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_S
SE4_2 -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY 
-DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DARROW_WITH_ZLIB 
-DARROW_WITH_ZSTD -DAWS_COMMON_USE_IMPORT_EXPORT -DAWS_EVE
NT_STREAM_USE_IMPORT_EXPORT -DAWS_SDK_VERSION_MAJOR=1 -DAWS_SDK_VERSION_MINOR=7 
-DAWS_SDK_VERSION_PATCH=164 -DHAVE_INTTYPES_H -DHAVE_NETDB_H -DNOMINMAX 
-DPARQUET_EXPORTING -DUSE_IMPORT_EXPORT -DUSE_IMPORT
_EXPORT=1 -DUSE_WINDOWS_DLL_SEMANTICS -D_CRT_SECURE_NO_WARNINGS 
-Dparquet_shared_EXPORTS -Isrc -I../src -I../src/generated -isystem 
../thirdparty/flatbuffers/include -isystem C:/Users/Administrator/minico
nda3/conda-bld/arrow-cpp-ext_1595962790058/_h_env/Library/include -isystem 
../thirdparty/hadoop/include -fvisibility-inlines-hidden -std=c++14 
-fmessage-length=0 -march=k8 -mtune=haswell -ftree-vectorize
-fstack-protector-strong -O2 -ffunction-sections -pipe 
-D_CRT_SECURE_NO_WARNINGS -D_MT -D_DLL -nostdlib -Xclang --dependent-lib=msvcrt 
-fuse-ld=lld -fno-aligned-allocation -Qunused-arguments -fcolor-diagn
ostics -O3 -DNDEBUG  -Wa,-mbig-obj -Wall -Wno-unknown-warning-option 
-Wno-pass-failed -msse4.2  -O3 -DNDEBUG -D_DLL -D_MT -Xclang 
--dependent-lib=msvcrt   -std=c++14 -MD -MT src/parquet/CMakeFiles/parquet
_shared.dir/Unity/unity_1_cxx.cxx.obj -MF 
src\parquet\CMakeFiles\parquet_shared.dir\Unity\unity_1_cxx.cxx.obj.d -o 
src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx.obj -c 
src/parquet/CMakeF
iles/parquet_shared.dir/Unity/unity_1_cxx.cxx
In file included from 
src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx:3:
In file included from 
C:/Users/Administrator/miniconda3/conda-bld/arrow-cpp-ext_1595962790058/work/cpp/src/parquet/column_scanner.cc:18:
In file included from ../src\parquet/column_scanner.h:29:
In file included from ../src\parquet/column_reader.h:25:
In file included from ../src\parquet/exception.h:26:
In file included from ../src\parquet/platform.h:23:
In file included from ../src\arrow/buffer.h:28:
In file included from ../src\arrow/status.h:25:
../src\arrow/util/string_builder.h:49:10: error: invalid operands to binary 
expression ('std::ostream' (aka 'basic_ostream >') and 
'parquet::ParquetInvalidOrCorruptedFileException'
)
  stream << head;
  ~~ ^  
../src\arrow/util/string_builder.h:61:3: note: in instantiation of function 
template specialization 
'arrow::util::StringBuilderRecursive' requested here
  StringBuilderRecursive(ss.stream(), std::forward(args)...);
  ^
../src\arrow/status.h:160:31: note: in instantiation of function template 
specialization 
'arrow::util::StringBuilder' 
requested here
return Status(code, util::StringBuilder(std::forward(args)...));
  ^
../src\arrow/status.h:204:20: note: in instantiation of function template 
specialization 
'arrow::Status::FromArgs' 
requested here
return Status::FromArgs(StatusCode::Invalid, std::forward(args)...);
   ^
../src\parquet/exception.h:129:49: note: in instantiation of function template 
specialization 
'arrow::Status::Invalid' 
requested here
  : 
ParquetStatusException(::arrow::Status::Invalid(std::forward(args)...)) {}
^
C:/Users/Administrator/miniconda3/conda-bld/arrow-cpp-ext_1595962790058/work/cpp/src/parquet/file_reader.cc:270:13:
 note: in instantiation of function template specialization 
'parquet::ParquetInvalidOrCor
ruptedFileException::ParquetInvalidOrCorruptedFileException' requested here
  throw ParquetInvalidOrCorruptedFileException("Parquet file size is 0 
bytes");
^
C:\BuildTools\VC\Tools\MSVC\14.16.27023\include\ostream:480:36: note: candidate 
function not viable: no known conversion from 
'parquet::ParquetInvalidOrCorruptedFileException' to 'const void *' for 1st ar
gument; take the 

[jira] [Created] (ARROW-9587) [FlightRPC][Java] Clean up DoPut/FlightStream memory handling

2020-07-28 Thread David Li (Jira)
David Li created ARROW-9587:
---

 Summary: [FlightRPC][Java] Clean up DoPut/FlightStream memory 
handling
 Key: ARROW-9587
 URL: https://issues.apache.org/jira/browse/ARROW-9587
 Project: Apache Arrow
  Issue Type: Improvement
  Components: FlightRPC, Java
Affects Versions: 1.0.0
Reporter: David Li
Assignee: David Li


We've been running into issues with DoPut in Java. In particular:
 * Closing a FlightStream without draining it should not send a cancellation to 
the other side. A server will have sent an explicit error message, or will 
simply not want to read the entire stream. A client should cancel explicitly 
(gRPC will cancel for you anyway when you end the call).
 * The server should not close or clean up anything for you in DoPut (it should 
act like DoExchange). Otherwise trying to use it with ARROW-9586 becomes 
impossible (you need to close the FlightStream before ending the call, or 
you'll close the per-call allocator before you close the FlightStream)

I think this also ties into flakiness in unit tests.





[jira] [Created] (ARROW-9586) [FlightRPC][Java] Allow using a per-call Arrow allocator

2020-07-28 Thread David Li (Jira)
David Li created ARROW-9586:
---

 Summary: [FlightRPC][Java] Allow using a per-call Arrow allocator
 Key: ARROW-9586
 URL: https://issues.apache.org/jira/browse/ARROW-9586
 Project: Apache Arrow
  Issue Type: Improvement
  Components: FlightRPC, Java
Reporter: David Li
Assignee: David Li
 Fix For: 2.0.0


We've been running into issues with Flight and gRPC leaking direct memory at 
scale. One thing we'd like to do is have a (child) allocator per DoGet/DoPut 
call, so we can more accurately track memory usage. We have a candidate 
implementation that is rather messy, but can be upstreamed as part of 
flight-grpc.





[jira] [Created] (ARROW-9585) Remove duplicated to-do line in DataFusion readme

2020-07-28 Thread Paul Whalen (Jira)
Paul Whalen created ARROW-9585:
--

 Summary: Remove duplicated to-do line in DataFusion readme
 Key: ARROW-9585
 URL: https://issues.apache.org/jira/browse/ARROW-9585
 Project: Apache Arrow
  Issue Type: Task
  Components: Documentation
Reporter: Paul Whalen








[jira] [Created] (ARROW-9583) Offset is mishandled in arithmetic and boolean compute kernels

2020-07-28 Thread Jörn Horstmann (Jira)
Jörn Horstmann created ARROW-9583:
-

 Summary: Offset is mishandled in arithmetic and boolean compute 
kernels
 Key: ARROW-9583
 URL: https://issues.apache.org/jira/browse/ARROW-9583
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Affects Versions: 1.0.0
Reporter: Jörn Horstmann


Several compute kernels create the resulting ArrayData with the same offset as 
one of the operands. Instead, this offset should be 0, since the result buffer 
is freshly constructed with the correct len.

Example of one failing test:

 
{code:java}
#[test]
fn test_primitive_array_add_sliced() {
    let a = Int32Array::from(vec![0, 0, 0, 5, 6, 7, 8, 9, 0]);
    let b = Int32Array::from(vec![0, 0, 0, 6, 7, 8, 9, 8, 0]);
    // Slice both operands so that each has a non-zero offset.
    let a = a.slice(3, 5);
    let b = b.slice(3, 5);
    let a = a.as_any().downcast_ref::<Int32Array>().unwrap();
    let b = b.as_any().downcast_ref::<Int32Array>().unwrap();

    assert_eq!(5, a.value(0));
    assert_eq!(6, b.value(0));

    let c = add(&a, &b).unwrap();
    assert_eq!(5, c.len());

    assert_eq!(11, c.value(0));
    assert_eq!(13, c.value(1));
    assert_eq!(15, c.value(2));
    assert_eq!(17, c.value(3));
    assert_eq!(17, c.value(4));
}
 {code}
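To illustrate why the result offset should be 0, here is a minimal standalone sketch that does not use the arrow crate. `SlicedI32` and `add_slices` are hypothetical names invented for this illustration only; they stand in for a sliced primitive array and an add kernel, and are not the actual kernel code.

```rust
/// Hypothetical stand-in for a sliced primitive array:
/// a value buffer plus a logical window (offset, len) into it.
struct SlicedI32 {
    values: Vec<i32>,
    offset: usize,
    len: usize,
}

impl SlicedI32 {
    /// Read logical element `i`, applying the offset.
    fn value(&self, i: usize) -> i32 {
        assert!(i < self.len);
        self.values[self.offset + i]
    }
}

/// Hypothetical add kernel: reads both inputs through their offsets and
/// writes the sums into a freshly allocated buffer of exactly `len` values.
fn add_slices(a: &SlicedI32, b: &SlicedI32) -> SlicedI32 {
    assert_eq!(a.len, b.len);
    let values: Vec<i32> = (0..a.len).map(|i| a.value(i) + b.value(i)).collect();
    // The new buffer starts at its first element, so the offset is 0.
    // Copying `a.offset` here would make every read start 3 elements too late.
    SlicedI32 { values, offset: 0, len: a.len }
}

fn main() {
    let a = SlicedI32 { values: vec![0, 0, 0, 5, 6, 7, 8, 9, 0], offset: 3, len: 5 };
    let b = SlicedI32 { values: vec![0, 0, 0, 6, 7, 8, 9, 8, 0], offset: 3, len: 5 };
    let c = add_slices(&a, &b);
    let sums: Vec<i32> = (0..c.len).map(|i| c.value(i)).collect();
    println!("{:?}", sums);
}
```

The inputs mirror the failing test above. With the offset reset to 0, the five sums are read from the start of the new buffer.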
Additionally, the boolean kernels seem to require that both operands have the 
same offset. This shouldn't be needed, but it seems that the SIMD 
implementation requires the offset to be a multiple of 8 (bits) so that the 
operation works correctly on whole bytes. The scalar implementation should be 
fine with any offset.
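As a rough sketch of the scalar case, a bit-by-bit AND can read each operand at its own bit offset and pack the result starting at bit 0. This assumes LSB-first bit packing, as in Arrow bitmaps. The helper names (`get_bit`, `set_bit`, `and_with_offsets`) are hypothetical and not taken from the arrow crate.

```rust
/// Read bit `i` from an LSB-first packed bitmap.
fn get_bit(bytes: &[u8], i: usize) -> bool {
    (bytes[i / 8] >> (i % 8)) & 1 == 1
}

/// Set bit `i` in an LSB-first packed bitmap.
fn set_bit(bytes: &mut [u8], i: usize) {
    bytes[i / 8] |= 1 << (i % 8);
}

/// Hypothetical scalar AND over `len` bits. Each operand has its own bit
/// offset, which need not be equal or a multiple of 8.
/// The output is packed starting at bit 0.
fn and_with_offsets(
    left: &[u8],
    left_offset: usize,
    right: &[u8],
    right_offset: usize,
    len: usize,
) -> Vec<u8> {
    let mut out = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        if get_bit(left, left_offset + i) && get_bit(right, right_offset + i) {
            set_bit(&mut out, i);
        }
    }
    out
}

fn main() {
    // Bits 1..5 of `left` are 1,1,0,1 and bits 3..7 of `right` are 1,1,1,1.
    let left = [0b1011_0110u8];
    let right = [0b0111_1010u8];
    let out = and_with_offsets(&left, 1, &right, 3, 4);
    println!("{:#010b}", out[0]);
}
```

Working one bit at a time is slow but places no alignment requirement on either offset. A SIMD or byte-wise variant would need to shift misaligned inputs into alignment first.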





[jira] [Created] (ARROW-9582) [Rust] Implement Array::memory_size()

2020-07-28 Thread Andy Grove (Jira)
Andy Grove created ARROW-9582:
-

 Summary: [Rust] Implement Array::memory_size()
 Key: ARROW-9582
 URL: https://issues.apache.org/jira/browse/ARROW-9582
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Andy Grove
Assignee: Andy Grove


I would like to be able to determine how much memory is being used by Arrow 
Arrays so that I can better monitor and report on memory usage when profiling 
and tuning code.





[jira] [Created] (ARROW-9581) [Dev][Release] Bump next snapshot versions to 2.0.0

2020-07-28 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-9581:
--

 Summary: [Dev][Release] Bump next snapshot versions to 2.0.0
 Key: ARROW-9581
 URL: https://issues.apache.org/jira/browse/ARROW-9581
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Developer Tools
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
 Fix For: 2.0.0


The upcoming major release will be version 2.0.0, so the hardcoded version 
numbers need to be updated.


