[jira] [Created] (ARROW-9066) [Python] Raise correct error in isnull()
Uwe Korn created ARROW-9066:
---
Summary: [Python] Raise correct error in isnull()
Key: ARROW-9066
URL: https://issues.apache.org/jira/browse/ARROW-9066
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.17.1
Reporter: Uwe Korn
Assignee: Uwe Korn

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9026) [C++/Python] Force package removal from arrow-nightlies conda repository
Uwe Korn created ARROW-9026:
---
Summary: [C++/Python] Force package removal from arrow-nightlies conda repository
Key: ARROW-9026
URL: https://issues.apache.org/jira/browse/ARROW-9026
Project: Apache Arrow
Issue Type: Bug
Components: Packaging
Reporter: Uwe Korn
Assignee: Uwe Korn
[jira] [Created] (ARROW-9024) [C++/Python] Install anaconda-client in conda-clean job
Uwe Korn created ARROW-9024:
---
Summary: [C++/Python] Install anaconda-client in conda-clean job
Key: ARROW-9024
URL: https://issues.apache.org/jira/browse/ARROW-9024
Project: Apache Arrow
Issue Type: Bug
Components: Packaging
Reporter: Uwe Korn
Assignee: Uwe Korn
[jira] [Created] (ARROW-9023) [C++] Use mimalloc conda package
Uwe Korn created ARROW-9023:
---
Summary: [C++] Use mimalloc conda package
Key: ARROW-9023
URL: https://issues.apache.org/jira/browse/ARROW-9023
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Packaging
Reporter: Uwe Korn
Assignee: Uwe Korn
[jira] [Created] (ARROW-8962) [C++] Linking failure with clang-4.0
Uwe Korn created ARROW-8962:
---
Summary: [C++] Linking failure with clang-4.0
Key: ARROW-8962
URL: https://issues.apache.org/jira/browse/ARROW-8962
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Uwe Korn
Assignee: Uwe Korn

{code:java}
FAILED: release/arrow-file-to-stream
: && /Users/uwe/miniconda3/envs/pyarrow-dev/bin/ccache /Users/uwe/miniconda3/envs/pyarrow-dev/bin/x86_64-apple-darwin13.4.0-clang++ -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -std=c++14 -fmessage-length=0 -Qunused-arguments -fcolor-diagnostics -O3 -DNDEBUG -Wall -Wno-unknown-warning-option -Wno-pass-failed -msse4.2 -O3 -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names -Wl,-pie -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs src/arrow/ipc/CMakeFiles/arrow-file-to-stream.dir/file_to_stream.cc.o -o release/arrow-file-to-stream release/libarrow.a /usr/local/opt/openssl@1.1/lib/libssl.dylib /usr/local/opt/openssl@1.1/lib/libcrypto.dylib /Users/uwe/miniconda3/envs/pyarrow-dev/lib/libbrotlienc-static.a /Users/uwe/miniconda3/envs/pyarrow-dev/lib/libbrotlidec-static.a /Users/uwe/miniconda3/envs/pyarrow-dev/lib/libbrotlicommon-static.a /Users/uwe/miniconda3/envs/pyarrow-dev/lib/liblz4.dylib /Users/uwe/miniconda3/envs/pyarrow-dev/lib/libsnappy.1.1.7.dylib /Users/uwe/miniconda3/envs/pyarrow-dev/lib/libz.dylib /Users/uwe/miniconda3/envs/pyarrow-dev/lib/libzstd.dylib /Users/uwe/miniconda3/envs/pyarrow-dev/lib/liborc.a /Users/uwe/miniconda3/envs/pyarrow-dev/lib/libprotobuf.dylib jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a && :
Undefined symbols for architecture x86_64:
  "arrow::internal::(anonymous namespace)::StringToFloatConverterImpl::main_junk_value_", referenced from:
      arrow::internal::StringToFloat(char const*, unsigned long, float*) in libarrow.a(value_parsing.cc.o)
      arrow::internal::StringToFloat(char const*, unsigned long, double*) in libarrow.a(value_parsing.cc.o)
  "arrow::internal::(anonymous namespace)::StringToFloatConverterImpl::fallback_junk_value_", referenced from:
      arrow::internal::StringToFloat(char const*, unsigned long, float*) in libarrow.a(value_parsing.cc.o)
      arrow::internal::StringToFloat(char const*, unsigned long, double*) in libarrow.a(value_parsing.cc.o)
ld: symbol(s) not found for architecture x86_64
clang-4.0: error: linker command failed with exit code 1 (use -v to see invocation)
{code}
[jira] [Created] (ARROW-8941) [C++/Python] arrow-nightlies conda repository is full
Uwe Korn created ARROW-8941:
---
Summary: [C++/Python] arrow-nightlies conda repository is full
Key: ARROW-8941
URL: https://issues.apache.org/jira/browse/ARROW-8941
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Packaging, Python
Reporter: Uwe Korn

You currently have 3 public packages and 0 packages that require to be authenticated.
Using 10.0 GB of 3.0 GB storage

We need a script to delete old packages, e.g. once a week?
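The selection half of such a cleanup script could look roughly like the sketch below. It only decides which uploads are old enough to delete. The mapping of file name to upload time, the function name `select_stale_packages`, and the seven-day default are assumptions made for illustration; the actual removal call against the repository is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def select_stale_packages(uploads, now, max_age_days=7):
    """Return the names of uploads older than ``max_age_days``, oldest first.

    ``uploads`` maps a file name to its upload time (timezone-aware datetime).
    The mapping layout and the seven-day default are assumptions of this
    sketch, not something specified in the issue.
    """
    cutoff = now - timedelta(days=max_age_days)
    stale = [(uploaded, name) for name, uploaded in uploads.items() if uploaded < cutoff]
    # Sorting the (timestamp, name) pairs puts the oldest uploads first.
    return [name for _, name in sorted(stale)]


# Hypothetical file names, used only to show the call.
example_now = datetime(2020, 5, 26, tzinfo=timezone.utc)
example_uploads = {
    "nightly-old.tar.bz2": example_now - timedelta(days=30),
    "nightly-older.tar.bz2": example_now - timedelta(days=60),
    "nightly-fresh.tar.bz2": example_now - timedelta(days=1),
}
example_stale = select_stale_packages(example_uploads, example_now)
```

Keeping the selection separate from the deletion makes it possible to run the weekly job in a dry-run mode first.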
[jira] [Created] (ARROW-8571) [C++] Switch AppVeyor image to VS 2017
Uwe Korn created ARROW-8571:
---
Summary: [C++] Switch AppVeyor image to VS 2017
Key: ARROW-8571
URL: https://issues.apache.org/jira/browse/ARROW-8571
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Uwe Korn
Assignee: Uwe Korn
[jira] [Created] (ARROW-8359) [C++/Python] Enable aarch64/ppc64le build in conda recipes
Uwe Korn created ARROW-8359:
---
Summary: [C++/Python] Enable aarch64/ppc64le build in conda recipes
Key: ARROW-8359
URL: https://issues.apache.org/jira/browse/ARROW-8359
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Packaging, Python
Reporter: Uwe Korn
Fix For: 0.17.0

These two new architectures were added to the conda recipes; we should also build them as nightlies.
[jira] [Created] (ARROW-8350) [Python] Implement to_numpy on ChunkedArray
Uwe Korn created ARROW-8350:
---
Summary: [Python] Implement to_numpy on ChunkedArray
Key: ARROW-8350
URL: https://issues.apache.org/jira/browse/ARROW-8350
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Uwe Korn

We support {{to_numpy}} on {{Array}} instances but not on {{ChunkedArray}} instances. It would be quite useful to have it there as well, e.g. to support returning non-nanosecond datetime instances.
[jira] [Created] (ARROW-8288) [Python] Expose with_ modifiers on DataType
Uwe Korn created ARROW-8288:
---
Summary: [Python] Expose with_ modifiers on DataType
Key: ARROW-8288
URL: https://issues.apache.org/jira/browse/ARROW-8288
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Uwe Korn
Assignee: Uwe Korn
Fix For: 0.17.0

We have several {{WithX}} functions defined on {{DataType}} in C++, but so far only {{WithMetadata}} is exposed in Python. We should expose the rest of them.
[jira] [Created] (ARROW-8285) [Python][Dataset] ScalarExpression doesn't accept numpy scalars
Uwe Korn created ARROW-8285:
---
Summary: [Python][Dataset] ScalarExpression doesn't accept numpy scalars
Key: ARROW-8285
URL: https://issues.apache.org/jira/browse/ARROW-8285
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Uwe Korn

{{pyarrow.dataset.ScalarExpression}} doesn't accept numpy scalars. Accepting them would be useful because values coming out of {{pandas}} or {{numpy}} are numpy scalars.

Example:
{code:java}
import pyarrow.dataset as ds
import numpy as np

ds.ScalarExpression(np.int64(2))
{code}
{code:java}
---
TypeError Traceback (most recent call last)
 in
> 1 ds.ScalarExpression(np.int64(2))

~/miniconda3/envs/kartothek/lib/python3.7/site-packages/pyarrow/_dataset.pyx in pyarrow._dataset.ScalarExpression.__init__()

TypeError: Not yet supported scalar value: 2
{code}
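Until numpy scalars are accepted directly, a caller could convert them to built-in Python objects first. The helper below is a minimal sketch of that idea; `unwrap_numpy_scalar` is a hypothetical name and not part of pyarrow. It relies only on the `item()` method that numpy scalars provide, and it detects them by module name so that the sketch itself does not need to import numpy.

```python
def unwrap_numpy_scalar(value):
    """Return ``value.item()`` for numpy-style scalars, ``value`` otherwise.

    Hypothetical helper for illustration; it is not a pyarrow function.
    """
    is_numpy_type = type(value).__module__ == "numpy"
    # numpy scalars (and 0-d arrays) report ndim == 0 and offer .item(),
    # which returns the equivalent built-in Python object.
    if is_numpy_type and getattr(value, "ndim", None) == 0 and hasattr(value, "item"):
        return value.item()
    return value
```

Assuming built-in Python scalars are accepted, the failing call from the report would then be written as `ds.ScalarExpression(unwrap_numpy_scalar(np.int64(2)))`.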
[jira] [Created] (ARROW-8284) [C++][Dataset] Schema evolution for timestamp columns
Uwe Korn created ARROW-8284:
---
Summary: [C++][Dataset] Schema evolution for timestamp columns
Key: ARROW-8284
URL: https://issues.apache.org/jira/browse/ARROW-8284
Project: Apache Arrow
Issue Type: Improvement
Components: C++ - Dataset
Reporter: Uwe Korn
[jira] [Created] (ARROW-8283) [C++/Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset
Uwe Korn created ARROW-8283:
---
Summary: [C++/Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset
Key: ARROW-8283
URL: https://issues.apache.org/jira/browse/ARROW-8283
Project: Apache Arrow
Issue Type: Improvement
Components: C++ - Dataset, Python
Reporter: Uwe Korn

When passing a list of files to the constructor of {{pyarrow.dataset.FileSystemDataset}}, all files that don't exist are silently dropped immediately (i.e. no fragments are created for them). Instead, I would expect fragments to be created for them and an error to be raised when one tries to read a fragment whose file does not exist.
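The expected behaviour can be illustrated with a small pure-Python sketch. `LazyFileFragment` and `make_fragments` are hypothetical stand-ins and not pyarrow classes or functions; the point is only that construction never checks for the file and the error surfaces at read time.

```python
import os


class LazyFileFragment:
    """Hypothetical stand-in for a dataset fragment backed by one file."""

    def __init__(self, path):
        # No existence check here: construction always succeeds.
        self.path = path

    def read(self):
        # A missing file is reported only when the data is requested.
        if not os.path.isfile(self.path):
            raise FileNotFoundError(f"Fragment file does not exist: {self.path}")
        with open(self.path, "rb") as handle:
            return handle.read()


def make_fragments(paths):
    """Create one fragment per path, keeping entries for missing files."""
    return [LazyFileFragment(path) for path in paths]
```

With this shape, the number of fragments always equals the number of paths passed in, so a typo in a file name cannot shrink the dataset unnoticed.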
[jira] [Created] (ARROW-8282) [C++/Python][Dataset] Support schema evolution for integer columns
Uwe Korn created ARROW-8282:
---
Summary: [C++/Python][Dataset] Support schema evolution for integer columns
Key: ARROW-8282
URL: https://issues.apache.org/jira/browse/ARROW-8282
Project: Apache Arrow
Issue Type: Improvement
Components: C++ - Dataset
Reporter: Uwe Korn

When reading in a dataset where the schema specifies that column X is of type {{int64}} but the partition actually contains the data stored in that column as {{int32}}, an upcast should be done.
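One way to state when such an upcast is lossless is sketched below. `can_upcast` is a hypothetical helper that works on type names as plain strings; it does not describe how Arrow implements or plans to implement the check.

```python
import re

# Matches names such as "int32" or "uint8" (hypothetical string encoding).
_INT_TYPE = re.compile(r"^(u?)int(8|16|32|64)$")


def can_upcast(stored, target):
    """Return True if every value of integer type ``stored`` fits in ``target``."""
    stored_match = _INT_TYPE.match(stored)
    target_match = _INT_TYPE.match(target)
    if stored_match is None or target_match is None:
        raise ValueError(f"not an integer type name: {stored!r} / {target!r}")
    stored_unsigned = stored_match.group(1) == "u"
    target_unsigned = target_match.group(1) == "u"
    stored_bits = int(stored_match.group(2))
    target_bits = int(target_match.group(2))
    if stored_unsigned == target_unsigned:
        # Same signedness: widening (or keeping the width) never loses values.
        return stored_bits <= target_bits
    if stored_unsigned:
        # Unsigned to signed needs one extra bit for the sign.
        return stored_bits < target_bits
    # Signed to unsigned cannot represent negative values.
    return False
```

For the case in the report, `can_upcast("int32", "int64")` is true, while the reverse direction is rejected.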
[jira] [Created] (ARROW-8281) [R] Name collision of arrow.dll on Windows
Uwe Korn created ARROW-8281:
---
Summary: [R] Name collision of arrow.dll on Windows
Key: ARROW-8281
URL: https://issues.apache.org/jira/browse/ARROW-8281
Project: Apache Arrow
Issue Type: Improvement
Components: Packaging, R
Affects Versions: 0.16.0
Reporter: Uwe Korn

Currently we build the R extension for Windows only for CRAN, with static linkage. For conda-forge, however, we want to build it with dynamic linkage to {{arrow-cpp}}. Here we run into the issue that the R package and the C++ package both produce an {{arrow.dll}}. As there is no RPATH equivalent on Windows, the dynamic loader cannot determine the right relationship between the two and fails to load the library.

From my point of view, the simplest approach here would be to name the R {{arrow.dll}} differently, e.g. {{rarrow.dll}}. Would this be possible?
[jira] [Created] (ARROW-8174) [Python] Refactor context_choices in test_cuda_numba_interop to be a module level fixture
Uwe Korn created ARROW-8174:
---
Summary: [Python] Refactor context_choices in test_cuda_numba_interop to be a module level fixture
Key: ARROW-8174
URL: https://issues.apache.org/jira/browse/ARROW-8174
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Uwe Korn

It should be a module-level fixture instead of a global variable that is set/unset in setup_module/teardown_module.
[jira] [Created] (ARROW-8159) [Python] pyarrow.Schema.from_pandas doesn't support ExtensionDtype
Uwe Korn created ARROW-8159:
---
Summary: [Python] pyarrow.Schema.from_pandas doesn't support ExtensionDtype
Key: ARROW-8159
URL: https://issues.apache.org/jira/browse/ARROW-8159
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.16.0
Reporter: Uwe Korn
Assignee: Uwe Korn
Fix For: 0.17.0
[jira] [Created] (ARROW-8149) [C++/Python] Enable CUDA Support in conda recipes
Uwe Korn created ARROW-8149:
---
Summary: [C++/Python] Enable CUDA Support in conda recipes
Key: ARROW-8149
URL: https://issues.apache.org/jira/browse/ARROW-8149
Project: Apache Arrow
Issue Type: New Feature
Components: C++, Packaging
Reporter: Uwe Korn
Fix For: 0.17.0

See the changes in [https://github.com/conda-forge/arrow-cpp-feedstock/pull/123]; we need to copy them into the Arrow repository and also test CUDA in these recipes.
[jira] [Created] (ARROW-8008) [C++/Python] Framework Python is preferred even though not the activated one
Uwe Korn created ARROW-8008:
---
Summary: [C++/Python] Framework Python is preferred even though not the activated one
Key: ARROW-8008
URL: https://issues.apache.org/jira/browse/ARROW-8008
Project: Apache Arrow
Issue Type: Bug
Components: C++, Python
Reporter: Uwe Korn
Assignee: Uwe Korn

Currently the framework Python is preferred on macOS even though development happens in a completely different Python runtime.
[jira] [Created] (ARROW-8007) [Python] Remove unused and defunct assert_get_object_equal in plasma tests
Uwe Korn created ARROW-8007:
---
Summary: [Python] Remove unused and defunct assert_get_object_equal in plasma tests
Key: ARROW-8007
URL: https://issues.apache.org/jira/browse/ARROW-8007
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.16.0
Reporter: Uwe Korn
Assignee: Uwe Korn
[jira] [Created] (ARROW-7493) [Python] Expose sum kernel in pyarrow.compute and support ChunkedArray inputs
Uwe Korn created ARROW-7493:
---
Summary: [Python] Expose sum kernel in pyarrow.compute and support ChunkedArray inputs
Key: ARROW-7493
URL: https://issues.apache.org/jira/browse/ARROW-7493
Project: Apache Arrow
Issue Type: Improvement
Components: C++ - Compute, Python
Reporter: Uwe Korn
Assignee: Uwe Korn
Fix For: 1.0.0
[jira] [Created] (ARROW-7250) Undefined symbols for StringToFloatConverter::Impl with clang 4.x
Uwe Korn created ARROW-7250:
---
Summary: Undefined symbols for StringToFloatConverter::Impl with clang 4.x
Key: ARROW-7250
URL: https://issues.apache.org/jira/browse/ARROW-7250
Project: Apache Arrow
Issue Type: Bug
Components: C++
Affects Versions: 0.15.1
Reporter: Uwe Korn
Assignee: Uwe Korn

{code:java}
Undefined symbols for architecture x86_64:
  "arrow::internal::StringToFloatConverter::Impl::main_junk_value_", referenced from:
      arrow::internal::StringToFloatConverter::StringToFloat(char const*, unsigned long, float*) in libarrow.a(parsing.cc.o)
      arrow::internal::StringToFloatConverter::StringToFloat(char const*, unsigned long, double*) in libarrow.a(parsing.cc.o)
  "arrow::internal::StringToFloatConverter::Impl::fallback_junk_value_", referenced from:
      arrow::internal::StringToFloatConverter::StringToFloat(char const*, unsigned long, float*) in libarrow.a(parsing.cc.o)
      arrow::internal::StringToFloatConverter::StringToFloat(char const*, unsigned long, double*) in libarrow.a(parsing.cc.o)
ld: symbol(s) not found for architecture x86_64
{code}
[jira] [Created] (ARROW-7008) [Python] pyarrow.chunked_array([array]) fails on array with
Uwe Korn created ARROW-7008:
---
Summary: [Python] pyarrow.chunked_array([array]) fails on array with
Key: ARROW-7008
URL: https://issues.apache.org/jira/browse/ARROW-7008
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.15.0
Reporter: Uwe Korn

Minimal reproducer:
{code}
import pyarrow as pa

pa.chunked_array([pa.array([], type=pa.string()).dictionary_encode().dictionary])
{code}

Traceback
{code}
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x20)
  * frame #0: 0x000112cd5d0e libarrow.15.dylib`arrow::Status arrow::internal::ValidateVisitor::ValidateOffsets(arrow::BinaryArray const&) + 94
    frame #1: 0x000112cc79a3 libarrow.15.dylib`arrow::Status arrow::VisitArrayInline(arrow::Array const&, arrow::internal::ValidateVisitor*) + 915
    frame #2: 0x000112cc747d libarrow.15.dylib`arrow::Array::Validate() const + 829
    frame #3: 0x000112e3ea19 libarrow.15.dylib`arrow::ChunkedArray::Validate() const + 89
    frame #4: 0x000112b8eb7d lib.cpython-37m-darwin.so`__pyx_pw_7pyarrow_3lib_135chunked_array(_object*, _object*, _object*) + 3661
{code}
[jira] [Created] (ARROW-6996) [Python] Expose boolean filter kernel on Table
Uwe Korn created ARROW-6996:
---
Summary: [Python] Expose boolean filter kernel on Table
Key: ARROW-6996
URL: https://issues.apache.org/jira/browse/ARROW-6996
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Uwe Korn

This is currently only implemented for Array but would also be useful on Tables and ChunkedArrays.
[jira] [Created] (ARROW-6919) [Python] Expose more builders in Cython
Uwe Korn created ARROW-6919:
---
Summary: [Python] Expose more builders in Cython
Key: ARROW-6919
URL: https://issues.apache.org/jira/browse/ARROW-6919
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Uwe Korn
Assignee: Uwe Korn
[jira] [Created] (ARROW-6873) [Python] Stale CColumn reference break Cython cimport pyarrow
Uwe Korn created ARROW-6873:
---
Summary: [Python] Stale CColumn reference break Cython cimport pyarrow
Key: ARROW-6873
URL: https://issues.apache.org/jira/browse/ARROW-6873
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.15.0
Reporter: Uwe Korn
Fix For: 0.15.1

Traceback:
{code}
Error compiling Cython file:
...
# under the License.

from __future__ import absolute_import

from libcpp.memory cimport shared_ptr
from pyarrow.includes.libarrow cimport (CArray, CBuffer, CColumn, CDataType,
^
…/lib/python3.7/site-packages/pyarrow/__init__.pxd:21:0: 'pyarrow/includes/libarrow/CColumn.pxd' not found

Error compiling Cython file:
...
cdef object wrap_tensor(const shared_ptr[CTensor]& sp_tensor)
cdef object wrap_sparse_tensor_coo(
    const shared_ptr[CSparseTensorCOO]& sp_sparse_tensor)
cdef object wrap_sparse_tensor_csr(
    const shared_ptr[CSparseTensorCSR]& sp_sparse_tensor)
cdef object wrap_column(const shared_ptr[CColumn]& ccolumn)
^
…/lib/python3.7/site-packages/pyarrow/__init__.pxd:39:52: unknown type in template argument

Error compiling Cython file:
...
from pyarrow cimport Int64ArrayBuilder
^
/Users/uwe/.ipython/cython/_cython_magic_3eb31dd63fb578b618cc8e98a60dbdf5.pyx:2:0: 'pyarrow/Int64ArrayBuilder.pxd' not found
---
{code}