[jira] [Created] (ARROW-9591) [Rust] Investigate removing the offset requirement on the boolean kernels
Paddy Horan created ARROW-9591:
Summary: [Rust] Investigate removing the offset requirement on the boolean kernels
Key: ARROW-9591
URL: https://issues.apache.org/jira/browse/ARROW-9591
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Paddy Horan

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9590) [Rust] Ensure SIMD kernel implementations handle slicing correctly
Paddy Horan created ARROW-9590:
Summary: [Rust] Ensure SIMD kernel implementations handle slicing correctly
Key: ARROW-9590
URL: https://issues.apache.org/jira/browse/ARROW-9590
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Reporter: Paddy Horan
[jira] [Created] (ARROW-9589) [C++/R] arrow_exports.h contains structs declared as class
Uwe Korn created ARROW-9589:
Summary: [C++/R] arrow_exports.h contains structs declared as class
Key: ARROW-9589
URL: https://issues.apache.org/jira/browse/ARROW-9589
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 1.0.0
Reporter: Uwe Korn
Assignee: Uwe Korn
Fix For: 2.0.0

This is an issue in an MSVC-based toolchain.
[jira] [Created] (ARROW-9588) [C++] clang/win: Copy constructor of ParquetInvalidOrCorruptedFileException not correctly triggered
Uwe Korn created ARROW-9588:
Summary: [C++] clang/win: Copy constructor of ParquetInvalidOrCorruptedFileException not correctly triggered
Key: ARROW-9588
URL: https://issues.apache.org/jira/browse/ARROW-9588
Project: Apache Arrow
Issue Type: Bug
Reporter: Uwe Korn

The copy constructor of ParquetInvalidOrCorruptedFileException doesn't seem to be taken correctly when building with clang 9.0.1 on Windows in an MSVC toolchain. Adding {{ParquetInvalidOrCorruptedFileException(const ParquetInvalidOrCorruptedFileException&) = default;}} as an explicit copy constructor didn't help.

Happy to hear any ideas here; this is probably a long shot, as there are other clang-msvc problems.

{code}
[49/62] Building CXX object src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx.obj
FAILED: src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx.obj
C:\Users\Administrator\miniconda3\conda-bld\arrow-cpp-ext_1595962790058\_build_env\Library\bin\clang++.exe -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_SSE4_2 -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DAWS_COMMON_USE_IMPORT_EXPORT -DAWS_EVENT_STREAM_USE_IMPORT_EXPORT -DAWS_SDK_VERSION_MAJOR=1 -DAWS_SDK_VERSION_MINOR=7 -DAWS_SDK_VERSION_PATCH=164 -DHAVE_INTTYPES_H -DHAVE_NETDB_H -DNOMINMAX -DPARQUET_EXPORTING -DUSE_IMPORT_EXPORT -DUSE_IMPORT_EXPORT=1 -DUSE_WINDOWS_DLL_SEMANTICS -D_CRT_SECURE_NO_WARNINGS -Dparquet_shared_EXPORTS -Isrc -I../src -I../src/generated -isystem ../thirdparty/flatbuffers/include -isystem C:/Users/Administrator/miniconda3/conda-bld/arrow-cpp-ext_1595962790058/_h_env/Library/include -isystem ../thirdparty/hadoop/include -fvisibility-inlines-hidden -std=c++14 -fmessage-length=0 -march=k8 -mtune=haswell -ftree-vectorize -fstack-protector-strong -O2 -ffunction-sections -pipe -D_CRT_SECURE_NO_WARNINGS -D_MT -D_DLL -nostdlib -Xclang --dependent-lib=msvcrt -fuse-ld=lld -fno-aligned-allocation -Qunused-arguments -fcolor-diagnostics -O3 -DNDEBUG -Wa,-mbig-obj -Wall -Wno-unknown-warning-option -Wno-pass-failed -msse4.2 -O3 -DNDEBUG -D_DLL -D_MT -Xclang --dependent-lib=msvcrt -std=c++14 -MD -MT src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx.obj -MF src\parquet\CMakeFiles\parquet_shared.dir\Unity\unity_1_cxx.cxx.obj.d -o src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx.obj -c src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx
In file included from src/parquet/CMakeFiles/parquet_shared.dir/Unity/unity_1_cxx.cxx:3:
In file included from C:/Users/Administrator/miniconda3/conda-bld/arrow-cpp-ext_1595962790058/work/cpp/src/parquet/column_scanner.cc:18:
In file included from ../src\parquet/column_scanner.h:29:
In file included from ../src\parquet/column_reader.h:25:
In file included from ../src\parquet/exception.h:26:
In file included from ../src\parquet/platform.h:23:
In file included from ../src\arrow/buffer.h:28:
In file included from ../src\arrow/status.h:25:
../src\arrow/util/string_builder.h:49:10: error: invalid operands to binary expression ('std::ostream' (aka 'basic_ostream >') and 'parquet::ParquetInvalidOrCorruptedFileException')
  stream << head;
  ~~ ^
../src\arrow/util/string_builder.h:61:3: note: in instantiation of function template specialization 'arrow::util::StringBuilderRecursive' requested here
  StringBuilderRecursive(ss.stream(), std::forward(args)...);
  ^
../src\arrow/status.h:160:31: note: in instantiation of function template specialization 'arrow::util::StringBuilder' requested here
  return Status(code, util::StringBuilder(std::forward(args)...));
  ^
../src\arrow/status.h:204:20: note: in instantiation of function template specialization 'arrow::Status::FromArgs' requested here
  return Status::FromArgs(StatusCode::Invalid, std::forward(args)...);
  ^
../src\parquet/exception.h:129:49: note: in instantiation of function template specialization 'arrow::Status::Invalid' requested here
  : ParquetStatusException(::arrow::Status::Invalid(std::forward(args)...)) {}
  ^
C:/Users/Administrator/miniconda3/conda-bld/arrow-cpp-ext_1595962790058/work/cpp/src/parquet/file_reader.cc:270:13: note: in instantiation of function template specialization 'parquet::ParquetInvalidOrCorruptedFileException::ParquetInvalidOrCorruptedFileException' requested here
  throw ParquetInvalidOrCorruptedFileException("Parquet file size is 0 bytes");
  ^
C:\BuildTools\VC\Tools\MSVC\14.16.27023\include\ostream:480:36: note: candidate function not viable: no known conversion from 'parquet::ParquetInvalidOrCorruptedFileException' to 'const void *' for 1st argument; take the
[jira] [Created] (ARROW-9587) [FlightRPC][Java] Clean up DoPut/FlightStream memory handling
David Li created ARROW-9587:
Summary: [FlightRPC][Java] Clean up DoPut/FlightStream memory handling
Key: ARROW-9587
URL: https://issues.apache.org/jira/browse/ARROW-9587
Project: Apache Arrow
Issue Type: Improvement
Components: FlightRPC, Java
Affects Versions: 1.0.0
Reporter: David Li
Assignee: David Li

We've been running into issues with DoPut in Java. In particular:
* Closing a FlightStream without draining it should not send a cancellation to the other side. A server will either have sent an explicit error message or simply not want to read the entire stream. A client should cancel explicitly; gRPC will cancel for you anyway when you end the call.
* The server should not close or clean up anything for you in DoPut (it should act like DoExchange). Otherwise, using it with ARROW-9586 becomes impossible: you need to close the FlightStream before ending the call, or you'll close the per-call allocator before you close the FlightStream.

I think this also ties into flakiness in unit tests.
[jira] [Created] (ARROW-9586) [FlightRPC][Java] Allow using a per-call Arrow allocator
David Li created ARROW-9586:
Summary: [FlightRPC][Java] Allow using a per-call Arrow allocator
Key: ARROW-9586
URL: https://issues.apache.org/jira/browse/ARROW-9586
Project: Apache Arrow
Issue Type: Improvement
Components: FlightRPC, Java
Reporter: David Li
Assignee: David Li
Fix For: 2.0.0

We've been running into issues with Flight and gRPC leaking direct memory at scale. One thing we'd like to do is have a (child) allocator per DoGet/DoPut call, so we can more accurately track memory usage. We have a candidate implementation that is rather messy, but can be upstreamed as part of flight-grpc.
[jira] [Created] (ARROW-9585) Remove duplicated to-do line in DataFusion readme
Paul Whalen created ARROW-9585:
Summary: Remove duplicated to-do line in DataFusion readme
Key: ARROW-9585
URL: https://issues.apache.org/jira/browse/ARROW-9585
Project: Apache Arrow
Issue Type: Task
Components: Documentation
Reporter: Paul Whalen
[jira] [Created] (ARROW-9583) Offset is mishandled in arithmetic and boolean compute kernels
Jörn Horstmann created ARROW-9583:
Summary: Offset is mishandled in arithmetic and boolean compute kernels
Key: ARROW-9583
URL: https://issues.apache.org/jira/browse/ARROW-9583
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Affects Versions: 1.0.0
Reporter: Jörn Horstmann

Several compute kernels create the resulting ArrayData with the same offset as one of the operands. Instead, this offset should be 0, since the buffer is freshly constructed with the correct length.

Example of one failing test:
{code:java}
#[test]
fn test_primitive_array_add_sliced() {
    let a = Int32Array::from(vec![0, 0, 0, 5, 6, 7, 8, 9, 0]);
    let b = Int32Array::from(vec![0, 0, 0, 6, 7, 8, 9, 8, 0]);
    let a = a.slice(3, 5);
    let b = b.slice(3, 5);
    let a = a.as_any().downcast_ref::<Int32Array>().unwrap();
    let b = b.as_any().downcast_ref::<Int32Array>().unwrap();

    assert_eq!(5, a.value(0));
    assert_eq!(6, b.value(0));

    let c = add(&a, &b).unwrap();
    assert_eq!(5, c.len());
    assert_eq!(11, c.value(0));
    assert_eq!(13, c.value(1));
    assert_eq!(15, c.value(2));
    assert_eq!(17, c.value(3));
    assert_eq!(17, c.value(4));
}
{code}

Additionally, the boolean kernels seem to require that both operands have the same offset. This shouldn't be needed, but it seems that the SIMD implementation requires the offset to be a multiple of 8 (bits) so that the operation works correctly on whole bytes. The scalar implementation should be fine with any offset.
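To illustrate the two points in this report, here is a minimal, self-contained sketch. It does not use the arrow crate: `SlicedI32`, `add`, `get_bit` and `and_bits` are hypothetical stand-ins written only for this example. It assumes the behaviour the report asks for, namely that a kernel reads each operand through its own offset and writes a fresh result whose offset is 0, and that a scalar boolean kernel can combine two bitmaps (least-significant-bit numbering) at arbitrary, different bit offsets.

```rust
/// Hypothetical stand-in for a sliced primitive array: a window
/// (`offset`, `len`) over a backing buffer. Not the arrow crate's type.
#[derive(Debug, Clone)]
struct SlicedI32 {
    values: Vec<i32>,
    offset: usize,
    len: usize,
}

impl SlicedI32 {
    fn from_vec(values: Vec<i32>) -> Self {
        let len = values.len();
        SlicedI32 { values, offset: 0, len }
    }

    /// Returns a view starting `offset` elements into this one.
    /// The buffer is cloned here for simplicity; a real array would share it.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        SlicedI32 {
            values: self.values.clone(),
            offset: self.offset + offset,
            len,
        }
    }

    /// Logical index `i` maps to physical index `offset + i`.
    fn value(&self, i: usize) -> i32 {
        assert!(i < self.len, "index out of bounds");
        self.values[self.offset + i]
    }
}

/// Element-wise addition that honours each operand's own offset.
fn add(a: &SlicedI32, b: &SlicedI32) -> Result<SlicedI32, String> {
    if a.len != b.len {
        return Err("operands must have the same length".to_string());
    }
    let values: Vec<i32> = (0..a.len).map(|i| a.value(i) + b.value(i)).collect();
    // The output buffer is freshly built and holds exactly `len` values,
    // so its offset is 0 and not the offset of either operand.
    Ok(SlicedI32 { values, offset: 0, len: a.len })
}

/// Reads bit `i` of a bitmap using least-significant-bit numbering.
fn get_bit(bytes: &[u8], i: usize) -> bool {
    bytes[i / 8] & (1 << (i % 8)) != 0
}

/// Scalar AND of two bitmaps whose bit offsets may differ and need not be
/// multiples of 8. The result is a fresh bitmap starting at bit 0.
fn and_bits(
    left: &[u8],
    left_offset: usize,
    right: &[u8],
    right_offset: usize,
    len: usize,
) -> Vec<u8> {
    let mut out = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        if get_bit(left, left_offset + i) && get_bit(right, right_offset + i) {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}
```

Slicing both inputs with `slice(3, 5)` and calling `add` on the results reproduces the values expected by the failing test above, with a result offset of 0. The bit-by-bit loop is deliberately simple; a SIMD or byte-wise variant would need extra shifting whenever an offset is not a multiple of 8.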
[jira] [Created] (ARROW-9582) [Rust] Implement Array::memory_size()
Andy Grove created ARROW-9582:
Summary: [Rust] Implement Array::memory_size()
Key: ARROW-9582
URL: https://issues.apache.org/jira/browse/ARROW-9582
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Andy Grove
Assignee: Andy Grove

I would like to be able to determine how much memory is being used by Arrow Arrays so that I can better monitor and report on memory usage when profiling and tuning code.
[jira] [Created] (ARROW-9581) [Dev][Release] Bump next snapshot versions to 2.0.0
Krisztian Szucs created ARROW-9581:
Summary: [Dev][Release] Bump next snapshot versions to 2.0.0
Key: ARROW-9581
URL: https://issues.apache.org/jira/browse/ARROW-9581
Project: Apache Arrow
Issue Type: Improvement
Components: Developer Tools
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
Fix For: 2.0.0

The upcoming major release will have version 2.0.0, so the hardcoded version numbers need to be updated.