[jira] [Created] (ARROW-9066) [Python] Raise correct error in isnull()

2020-06-08 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-9066:
---

 Summary: [Python] Raise correct error in isnull()
 Key: ARROW-9066
 URL: https://issues.apache.org/jira/browse/ARROW-9066
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.17.1
Reporter: Uwe Korn
Assignee: Uwe Korn






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-9026) [C++/Python] Force package removal from arrow-nightlies conda repository

2020-06-03 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-9026:
---

 Summary: [C++/Python] Force package removal from arrow-nightlies 
conda repository
 Key: ARROW-9026
 URL: https://issues.apache.org/jira/browse/ARROW-9026
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging
Reporter: Uwe Korn
Assignee: Uwe Korn








[jira] [Created] (ARROW-9024) [C++/Python] Install anaconda-client in conda-clean job

2020-06-03 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-9024:
---

 Summary: [C++/Python] Install anaconda-client in conda-clean job
 Key: ARROW-9024
 URL: https://issues.apache.org/jira/browse/ARROW-9024
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging
Reporter: Uwe Korn
Assignee: Uwe Korn








[jira] [Created] (ARROW-9023) [C++] Use mimalloc conda package

2020-06-03 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-9023:
---

 Summary: [C++] Use mimalloc conda package
 Key: ARROW-9023
 URL: https://issues.apache.org/jira/browse/ARROW-9023
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Packaging
Reporter: Uwe Korn
Assignee: Uwe Korn








[jira] [Created] (ARROW-8962) [C++] Linking failure with clang-4.0

2020-05-27 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8962:
---

 Summary: [C++] Linking failure with clang-4.0
 Key: ARROW-8962
 URL: https://issues.apache.org/jira/browse/ARROW-8962
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Uwe Korn
Assignee: Uwe Korn


{code:java}
FAILED: release/arrow-file-to-stream
: && /Users/uwe/miniconda3/envs/pyarrow-dev/bin/ccache 
/Users/uwe/miniconda3/envs/pyarrow-dev/bin/x86_64-apple-darwin13.4.0-clang++  
-march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE 
-fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden 
-std=c++14 -fmessage-length=0 -Qunused-arguments -fcolor-diagnostics -O3 
-DNDEBUG  -Wall -Wno-unknown-warning-option -Wno-pass-failed -msse4.2  -O3 
-DNDEBUG -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk
 -Wl,-search_paths_first -Wl,-headerpad_max_install_names -Wl,-pie 
-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs 
src/arrow/ipc/CMakeFiles/arrow-file-to-stream.dir/file_to_stream.cc.o  -o 
release/arrow-file-to-stream  release/libarrow.a 
/usr/local/opt/openssl@1.1/lib/libssl.dylib 
/usr/local/opt/openssl@1.1/lib/libcrypto.dylib 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/libbrotlienc-static.a 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/libbrotlidec-static.a 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/libbrotlicommon-static.a 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/liblz4.dylib 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/libsnappy.1.1.7.dylib 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/libz.dylib 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/libzstd.dylib 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/liborc.a 
/Users/uwe/miniconda3/envs/pyarrow-dev/lib/libprotobuf.dylib 
jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a && :
Undefined symbols for architecture x86_64:
  "arrow::internal::(anonymous 
namespace)::StringToFloatConverterImpl::main_junk_value_", referenced from:
  arrow::internal::StringToFloat(char const*, unsigned long, float*) in 
libarrow.a(value_parsing.cc.o)
  arrow::internal::StringToFloat(char const*, unsigned long, double*) in 
libarrow.a(value_parsing.cc.o)
  "arrow::internal::(anonymous 
namespace)::StringToFloatConverterImpl::fallback_junk_value_", referenced from:
  arrow::internal::StringToFloat(char const*, unsigned long, float*) in 
libarrow.a(value_parsing.cc.o)
  arrow::internal::StringToFloat(char const*, unsigned long, double*) in 
libarrow.a(value_parsing.cc.o)
ld: symbol(s) not found for architecture x86_64
clang-4.0: error: linker command failed with exit code 1 (use -v to see 
invocation) {code}





[jira] [Created] (ARROW-8941) [C++/Python] arrow-nightlies conda repository is full

2020-05-26 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8941:
---

 Summary: [C++/Python] arrow-nightlies conda repository is full
 Key: ARROW-8941
 URL: https://issues.apache.org/jira/browse/ARROW-8941
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Packaging, Python
Reporter: Uwe Korn


You currently have 3 public packages and 0 packages that require to be 
authenticated.
Using 10.0 GB of 3.0 GB storage

 

We need a script to delete old packages, e.g. once a week?
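A minimal sketch of the selection step such a cleanup script could use. The package names, the upload dates, and the seven-day retention window are illustrative assumptions; querying and deleting packages through anaconda-client is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def select_stale(packages, now, max_age_days=7):
    """Return the names of packages uploaded more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [name for name, uploaded in packages if uploaded < cutoff]


# Hypothetical listing of (package name, upload time) pairs.
now = datetime(2020, 5, 26, tzinfo=timezone.utc)
packages = [
    ("arrow-cpp-nightly-a", datetime(2020, 5, 1, tzinfo=timezone.utc)),
    ("arrow-cpp-nightly-b", datetime(2020, 5, 25, tzinfo=timezone.utc)),
]
stale = select_stale(packages, now)
```

A weekly job would pass the real listing to `select_stale` and remove only the returned names, so recent nightlies stay available.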





[jira] [Created] (ARROW-8571) [C++] Switch AppVeyor image to VS 2017

2020-04-23 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8571:
---

 Summary: [C++] Switch AppVeyor image to VS 2017
 Key: ARROW-8571
 URL: https://issues.apache.org/jira/browse/ARROW-8571
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Uwe Korn
Assignee: Uwe Korn








[jira] [Created] (ARROW-8359) [C++/Python] Enable aarch64/ppc64le build in conda recipes

2020-04-06 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8359:
---

 Summary: [C++/Python] Enable aarch64/ppc64le build in conda recipes
 Key: ARROW-8359
 URL: https://issues.apache.org/jira/browse/ARROW-8359
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Packaging, Python
Reporter: Uwe Korn
 Fix For: 0.17.0


These two new architectures were added to the conda recipes; we should also 
build them as nightlies.





[jira] [Created] (ARROW-8350) [Python] Implement to_numpy on ChunkedArray

2020-04-06 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8350:
---

 Summary: [Python] Implement to_numpy on ChunkedArray
 Key: ARROW-8350
 URL: https://issues.apache.org/jira/browse/ARROW-8350
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe Korn


We support {{to_numpy}} on {{Array}} instances but not on {{ChunkedArray}} 
instances. It would be useful to have it there as well, e.g. to support 
returning non-nanosecond datetime instances.





[jira] [Created] (ARROW-8288) [Python] Expose with_ modifiers on DataType

2020-03-31 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8288:
---

 Summary: [Python] Expose with_ modifiers on DataType
 Key: ARROW-8288
 URL: https://issues.apache.org/jira/browse/ARROW-8288
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 0.17.0


We have several {{WithX}} functions defined on {{DataType}} in C++, but so far 
only {{WithMetadata}} is exposed in Python. We should expose the rest of them.





[jira] [Created] (ARROW-8285) [Python][Dataset] ScalarExpression doesn't accept numpy scalars

2020-03-31 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8285:
---

 Summary: [Python][Dataset] ScalarExpression doesn't accept numpy 
scalars
 Key: ARROW-8285
 URL: https://issues.apache.org/jira/browse/ARROW-8285
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe Korn


{{pyarrow.dataset.ScalarExpression}} doesn't accept numpy scalars. Accepting 
them would be useful because values coming out of {{pandas}} or {{numpy}} are 
often numpy scalars.

Example:
{code:python}
import pyarrow.dataset as ds
import numpy as np

ds.ScalarExpression(np.int64(2)){code}
{code:java}
---
TypeError Traceback (most recent call last)
 in 
> 1 ds.ScalarExpression(np.int64(2))

~/miniconda3/envs/kartothek/lib/python3.7/site-packages/pyarrow/_dataset.pyx in 
pyarrow._dataset.ScalarExpression.__init__()

TypeError: Not yet supported scalar value: 2 {code}
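Until numpy scalars are accepted, one possible workaround is to convert the value to a plain Python scalar first. The helper below is a hypothetical sketch, not part of pyarrow; it relies only on the fact that numpy scalars provide an `.item()` method returning the equivalent Python object.

```python
def to_python_scalar(value):
    """Return a plain Python scalar for NumPy-style scalars, else the value itself."""
    item = getattr(value, "item", None)  # NumPy scalars provide .item()
    return item() if callable(item) else value


# Plain Python values pass through unchanged.
plain = to_python_scalar(2)
```

With numpy available, the call from the report would become `ds.ScalarExpression(to_python_scalar(np.int64(2)))`. That the constructor accepts the resulting Python `int` is an assumption based on the error message, which rejects only the numpy value.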





[jira] [Created] (ARROW-8284) [C++][Dataset] Schema evolution for timestamp columns

2020-03-31 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8284:
---

 Summary: [C++][Dataset] Schema evolution for timestamp columns
 Key: ARROW-8284
 URL: https://issues.apache.org/jira/browse/ARROW-8284
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++ - Dataset
Reporter: Uwe Korn








[jira] [Created] (ARROW-8283) [C++/Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset

2020-03-31 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8283:
---

 Summary: [C++/Python][Dataset] Non-existent files are silently 
dropped in pa.dataset.FileSystemDataset
 Key: ARROW-8283
 URL: https://issues.apache.org/jira/browse/ARROW-8283
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++ - Dataset, Python
Reporter: Uwe Korn


When passing a list of files to the constructor of 
{{pyarrow.dataset.FileSystemDataset}}, all files that don't exist are silently 
dropped immediately (i.e. no fragments are created for them).

Instead, I would expect fragments to be created for them and an error to be 
raised when one tries to read a fragment whose file does not exist.
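The expected behaviour can be illustrated with a toy model in plain Python. `LazyFragment` and `make_fragments` are hypothetical names for this sketch and are not pyarrow classes or functions; the point is only that construction does not touch the file system, while reading does.

```python
class LazyFragment:
    """Toy stand-in for a dataset fragment; the path is not checked on creation."""

    def __init__(self, path):
        self.path = path

    def read(self):
        # open() raises FileNotFoundError if the file does not exist.
        with open(self.path, "rb") as handle:
            return handle.read()


def make_fragments(paths):
    """Create one fragment per path, whether or not the file exists."""
    return [LazyFragment(path) for path in paths]


fragments = make_fragments(["/definitely/not/here.parquet"])
```

Here `fragments` contains one entry for the missing file, and the problem only surfaces as a `FileNotFoundError` once `fragments[0].read()` is called.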





[jira] [Created] (ARROW-8282) [C++/Python][Dataset] Support schema evolution for integer columns

2020-03-31 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8282:
---

 Summary: [C++/Python][Dataset] Support schema evolution for 
integer columns
 Key: ARROW-8282
 URL: https://issues.apache.org/jira/browse/ARROW-8282
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++ - Dataset
Reporter: Uwe Korn


When reading in a dataset where the schema specifies that column X is of type 
{{int64}} but the partition actually contains the data stored in that column 
as {{int32}}, an upcast should be done.
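The rule described above could be sketched as a small check in plain Python. The function name and the restriction to signed integer types are assumptions made for this illustration; it does not reflect how the C++ dataset code is structured.

```python
# Bit widths of the signed integer types considered in this sketch.
INT_WIDTHS = {"int8": 8, "int16": 16, "int32": 32, "int64": 64}


def needs_upcast(stored_type, schema_type):
    """Return True if stored data must be widened to match the schema type."""
    stored = INT_WIDTHS[stored_type]
    target = INT_WIDTHS[schema_type]
    if stored > target:
        # Narrowing could lose data, so it is not treated as schema evolution.
        raise TypeError(f"cannot narrow {stored_type} to {schema_type}")
    return stored < target


upcast_required = needs_upcast("int32", "int64")
```

For the case in this issue, `needs_upcast("int32", "int64")` returns `True`, while identical types need no conversion and a narrowing request is rejected.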





[jira] [Created] (ARROW-8281) [R] Name collision of arrow.dll on Windows

2020-03-31 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8281:
---

 Summary: [R] Name collision of arrow.dll on Windows
 Key: ARROW-8281
 URL: https://issues.apache.org/jira/browse/ARROW-8281
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging, R
Affects Versions: 0.16.0
Reporter: Uwe Korn


Currently we build the R extension for Windows only for CRAN, with static 
linkage. For conda-forge, however, we want to build it with dynamic linkage to 
{{arrow-cpp}}. Here we run into the issue that both the R package and the 
C++ package produce an {{arrow.dll}}. As there is no RPATH equivalent on 
Windows, the dynamic loader cannot work out the correct relationship between 
the two and fails to load the library.

From my point of view, the simplest approach here would be to name the R 
{{arrow.dll}} differently, e.g. {{rarrow.dll}}. Would this be possible?





[jira] [Created] (ARROW-8174) [Python] Refactor context_choices in test_cuda_numba_interop to be a module level fixture

2020-03-20 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8174:
---

 Summary: [Python] Refactor context_choices in 
test_cuda_numba_interop to be a module level fixture
 Key: ARROW-8174
 URL: https://issues.apache.org/jira/browse/ARROW-8174
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe Korn


{{context_choices}} should be a module-level fixture instead of a global 
variable that is set and unset in {{setup_module}}/{{teardown_module}}.





[jira] [Created] (ARROW-8159) [Python] pyarrow.Schema.from_pandas doesn't support ExtensionDtype

2020-03-19 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8159:
---

 Summary: [Python] pyarrow.Schema.from_pandas doesn't support 
ExtensionDtype
 Key: ARROW-8159
 URL: https://issues.apache.org/jira/browse/ARROW-8159
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.16.0
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 0.17.0








[jira] [Created] (ARROW-8149) [C++/Python] Enable CUDA Support in conda recipes

2020-03-18 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8149:
---

 Summary: [C++/Python] Enable CUDA Support in conda recipes
 Key: ARROW-8149
 URL: https://issues.apache.org/jira/browse/ARROW-8149
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++, Packaging
Reporter: Uwe Korn
 Fix For: 0.17.0


See the changes in 
[https://github.com/conda-forge/arrow-cpp-feedstock/pull/123]. We need to copy 
these into the Arrow repository and also test CUDA in these recipes.





[jira] [Created] (ARROW-8008) [C++/Python] Framework Python is preferred even though not the activated one

2020-03-05 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8008:
---

 Summary: [C++/Python] Framework Python is preferred even though 
not the activated one
 Key: ARROW-8008
 URL: https://issues.apache.org/jira/browse/ARROW-8008
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Python
Reporter: Uwe Korn
Assignee: Uwe Korn


Currently the framework Python is preferred on macOS even though development 
happens in a completely different Python runtime.
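When debugging this kind of mismatch, it can help to print which interpreter is actually active and whether it is a framework build. The snippet below is only a diagnostic sketch using the standard library; it does not show how the Arrow build selects Python.

```python
import sys
import sysconfig


def is_framework_build():
    """Return True if the running interpreter is a macOS framework build."""
    # PYTHONFRAMEWORK is empty or undefined for non-framework builds.
    return bool(sysconfig.get_config_var("PYTHONFRAMEWORK"))


info = {
    "executable": sys.executable,
    "prefix": sys.prefix,
    "framework": is_framework_build(),
}
```

Comparing `info["prefix"]` with the Python that the build picked up shows quickly whether the activated environment or the framework installation was used.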





[jira] [Created] (ARROW-8007) [Python] Remove unused and defunct assert_get_object_equal in plasma tests

2020-03-05 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-8007:
---

 Summary: [Python] Remove unused and defunct 
assert_get_object_equal in plasma tests
 Key: ARROW-8007
 URL: https://issues.apache.org/jira/browse/ARROW-8007
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.16.0
Reporter: Uwe Korn
Assignee: Uwe Korn








[jira] [Created] (ARROW-7493) [Python] Expose sum kernel in pyarrow.compute and support ChunkedArray inputs

2020-01-03 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-7493:
---

 Summary: [Python] Expose sum kernel in pyarrow.compute and support 
ChunkedArray inputs
 Key: ARROW-7493
 URL: https://issues.apache.org/jira/browse/ARROW-7493
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++ - Compute, Python
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 1.0.0








[jira] [Created] (ARROW-7250) Undefined symbols for StringToFloatConverter::Impl with clang 4.x

2019-11-24 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-7250:
---

 Summary: Undefined symbols for StringToFloatConverter::Impl with 
clang 4.x
 Key: ARROW-7250
 URL: https://issues.apache.org/jira/browse/ARROW-7250
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.15.1
Reporter: Uwe Korn
Assignee: Uwe Korn


{code:java}
Undefined symbols for architecture x86_64:
  "arrow::internal::StringToFloatConverter::Impl::main_junk_value_", referenced 
from:
  arrow::internal::StringToFloatConverter::StringToFloat(char const*, 
unsigned long, float*) in libarrow.a(parsing.cc.o)
  arrow::internal::StringToFloatConverter::StringToFloat(char const*, 
unsigned long, double*) in libarrow.a(parsing.cc.o)
  "arrow::internal::StringToFloatConverter::Impl::fallback_junk_value_", 
referenced from:
  arrow::internal::StringToFloatConverter::StringToFloat(char const*, 
unsigned long, float*) in libarrow.a(parsing.cc.o)
  arrow::internal::StringToFloatConverter::StringToFloat(char const*, 
unsigned long, double*) in libarrow.a(parsing.cc.o)
ld: symbol(s) not found for architecture x86_64{code}





[jira] [Created] (ARROW-7008) [Python] pyarrow.chunked_array([array]) fails on array with

2019-10-28 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-7008:
---

 Summary: [Python] pyarrow.chunked_array([array]) fails on array 
with 
 Key: ARROW-7008
 URL: https://issues.apache.org/jira/browse/ARROW-7008
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.15.0
Reporter: Uwe Korn


Minimal reproducer:

{code}
import pyarrow as pa

pa.chunked_array([pa.array([], 
type=pa.string()).dictionary_encode().dictionary])
{code}

Traceback

{code}
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
(code=1, address=0x20)
  * frame #0: 0x000112cd5d0e libarrow.15.dylib`arrow::Status 
arrow::internal::ValidateVisitor::ValidateOffsets(arrow::BinaryArray const&) + 94
frame #1: 0x000112cc79a3 libarrow.15.dylib`arrow::Status 
arrow::VisitArrayInline(arrow::Array const&, 
arrow::internal::ValidateVisitor*) + 915
frame #2: 0x000112cc747d libarrow.15.dylib`arrow::Array::Validate() 
const + 829
frame #3: 0x000112e3ea19 
libarrow.15.dylib`arrow::ChunkedArray::Validate() const + 89
frame #4: 0x000112b8eb7d 
lib.cpython-37m-darwin.so`__pyx_pw_7pyarrow_3lib_135chunked_array(_object*, 
_object*, _object*) + 3661
{code}





[jira] [Created] (ARROW-6996) [Python] Expose boolean filter kernel on Table

2019-10-25 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-6996:
---

 Summary: [Python] Expose boolean filter kernel on Table
 Key: ARROW-6996
 URL: https://issues.apache.org/jira/browse/ARROW-6996
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe Korn


This is currently only implemented for Array but would also be useful on Tables 
and ChunkedArrays.
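What a boolean filter means for chunked columns and tables can be sketched with plain Python lists. This is a toy model: a chunked column is a list of lists, a table is a dict of such columns, and both function names are hypothetical rather than pyarrow API.

```python
def filter_chunked(chunks, mask):
    """Apply one flat boolean mask across all chunks, keeping the chunk layout."""
    if sum(len(chunk) for chunk in chunks) != len(mask):
        raise ValueError("mask length must equal the total number of values")
    result, position = [], 0
    for chunk in chunks:
        part = mask[position:position + len(chunk)]
        result.append([value for value, keep in zip(chunk, part) if keep])
        position += len(chunk)
    return result


def filter_table(table, mask):
    """Filter every column of a table (dict of chunked columns) with one mask."""
    return {name: filter_chunked(chunks, mask) for name, chunks in table.items()}


table = {"a": [[1, 2], [3]], "b": [["x", "y"], ["z"]]}
filtered = filter_table(table, [True, False, True])
```

The same mask is applied to every column, so rows stay aligned across columns after filtering.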





[jira] [Created] (ARROW-6919) [Python] Expose more builders in Cython

2019-10-17 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-6919:
---

 Summary: [Python] Expose more builders in Cython
 Key: ARROW-6919
 URL: https://issues.apache.org/jira/browse/ARROW-6919
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe Korn
Assignee: Uwe Korn








[jira] [Created] (ARROW-6873) [Python] Stale CColumn reference break Cython cimport pyarrow

2019-10-14 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-6873:
---

 Summary: [Python] Stale CColumn reference break Cython cimport 
pyarrow
 Key: ARROW-6873
 URL: https://issues.apache.org/jira/browse/ARROW-6873
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.15.0
Reporter: Uwe Korn
 Fix For: 0.15.1


Traceback:

{code}
Error compiling Cython file:

...
# under the License.

from __future__ import absolute_import

from libcpp.memory cimport shared_ptr
from pyarrow.includes.libarrow cimport (CArray, CBuffer, CColumn, CDataType,
^


…/lib/python3.7/site-packages/pyarrow/__init__.pxd:21:0: 
'pyarrow/includes/libarrow/CColumn.pxd' not found

Error compiling Cython file:

...
cdef object wrap_tensor(const shared_ptr[CTensor]& sp_tensor)
cdef object wrap_sparse_tensor_coo(
const shared_ptr[CSparseTensorCOO]& sp_sparse_tensor)
cdef object wrap_sparse_tensor_csr(
const shared_ptr[CSparseTensorCSR]& sp_sparse_tensor)
cdef object wrap_column(const shared_ptr[CColumn]& ccolumn)
   ^


…/lib/python3.7/site-packages/pyarrow/__init__.pxd:39:52: unknown type in 
template argument

Error compiling Cython file:

...

from pyarrow cimport Int64ArrayBuilder
^


/Users/uwe/.ipython/cython/_cython_magic_3eb31dd63fb578b618cc8e98a60dbdf5.pyx:2:0:
 'pyarrow/Int64ArrayBuilder.pxd' not found
---
{code}


