[jira] [Assigned] (ARROW-18099) [Python] Cannot create pandas categorical from table only with nulls

2022-12-15 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-18099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn reassigned ARROW-18099:


Assignee: Damian Barabonkov

> [Python] Cannot create pandas categorical from table only with nulls
> 
>
> Key: ARROW-18099
> URL: https://issues.apache.org/jira/browse/ARROW-18099
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 9.0.0
> Environment: OSX 12.6
> M1 silicon
>Reporter: Damian Barabonkov
>Assignee: Damian Barabonkov
>Priority: Minor
>  Labels: pull-request-available, python-conversion
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A pyarrow Table whose column contains only null values cannot be converted to 
> a pandas DataFrame with that column as a category. However, pandas itself does 
> support "empty" categoricals. A simple workaround is therefore to load the 
> pa.Table into pandas as an object column first and convert it, once in pandas, 
> to a categorical, which will be empty. However, that does not solve the 
> pyarrow bug at its root.
>  
> Sample reproducible example
> {code:java}
> import pyarrow as pa
> pylist = [{'x': None, '__index_level_0__': 2}, {'x': None, 
> '__index_level_0__': 3}]
> tbl = pa.Table.from_pylist(pylist)
>  
> # Errors
> df_broken = tbl.to_pandas(categories=["x"])
>  
> # Works
> df_works = tbl.to_pandas()
> df_works = df_works.astype({"x": "category"}) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ARROW-18150) [Python] test_cython failing on macOS

2022-10-25 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-18150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623659#comment-17623659
 ] 

Uwe Korn commented on ARROW-18150:
--

I haven't seen this before, but it is related to the following CPython issue: 
https://github.com/python/cpython/issues/97524
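
For local debugging, a small sketch of how to query the paths at which pyarrow 
expects its shared libraries, using the standard pyarrow build helpers (whether 
these directories actually end up on the extension module's rpath is what fails 
here):

{code:java}
import pyarrow as pa

# Directories containing libarrow_python and friends; a Cython extension
# built against pyarrow needs these on its library search path (rpath).
print(pa.get_library_dirs())

# Library names to link against and headers to compile against.
print(pa.get_libraries())
print(pa.get_include())
{code}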

> [Python] test_cython failing on macOS
> -
>
> Key: ARROW-18150
> URL: https://issues.apache.org/jira/browse/ARROW-18150
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Alenka Frim
>Priority: Critical
>
> [https://github.com/apache/arrow/actions/runs/3315249930/jobs/5475594297#step:5:354]
> {code:java}
> ImportError: 
> dlopen(/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/pytest-of-runner/pytest-0/test_cython_api0/pyarrow_cython_example.cpython-310-darwin.so,
>  0x0002): Library not loaded: '@rpath/libarrow_python.1000.dylib'
> 357  Referenced from: 
> '/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/pytest-of-runner/pytest-0/test_cython_api0/pyarrow_cython_example.cpython-310-darwin.so'
> 358  Reason: tried: '/usr/local/lib/libarrow_python.1000.dylib' (no such 
> file), '/usr/local/lib/libarrow_python.1000.dylib' (no such file), 
> '/usr/local/lib/libarrow_python.1000.dylib' (no such file), 
> '/usr/lib/libarrow_python.1000.dylib' (no such file) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ARROW-17335) [Python] Type checking support

2022-08-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578099#comment-17578099
 ] 

Uwe Korn commented on ARROW-17335:
--

My initial efforts with regard to typing the code base stopped because the 
inline type annotations (and their automatic extraction into pyi stubs) are the 
crucial component here. All the important data structures of pyarrow are 
implemented in Cython; only a very small fraction of the externally visible API 
is plain Python. Thus, as long as the referenced Cython issue isn't solved, I 
don't think it makes sense to progress on the Arrow side.
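
To illustrate the missing piece with a hypothetical sketch (class and method 
names are invented for the example): the annotations would live inline in the 
Cython sources, and a tool would have to extract them into stub files that type 
checkers can read.

{code:java}
# table.pyi -- the kind of stub a tool would need to generate from the
# inline annotations in the Cython source (hypothetical names):
class ChunkedArray: ...

class Table:
    def column(self, name: str) -> ChunkedArray: ...
{code}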

> [Python] Type checking support
> --
>
> Key: ARROW-17335
> URL: https://issues.apache.org/jira/browse/ARROW-17335
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Jorrick Sleijster
>Priority: Major
>   Original Estimate: 10h
>  Remaining Estimate: 10h
>
> h1. mypy and static type checking
> As of Python 3.6, it has been possible to add typing information to the 
> code. This became immensely popular in a short period of time. Shortly after, 
> the tool `mypy` arrived, and it has become the industry standard for static 
> type checking in Python. It checks for invalid types very quickly, which 
> makes it suitable as a pre-commit hook. It has caught many bugs that I did 
> not spot myself and has been a very valuable tool.
> h2. Now what does this mean for PyArrow?
> When you run mypy on code that uses PyArrow, you get error messages like the 
> following:
> {code}
> some_util_using_pyarrow/hdfs_utils.py:5: error: Skipping analyzing "pyarrow": 
> module is installed, but missing library stubs or py.typed marker
> some_util_using_pyarrow/hdfs_utils.py:9: error: Skipping analyzing "pyarrow": 
> module is installed, but missing library stubs or py.typed marker
> some_util_using_pyarrow/hdfs_utils.py:11: error: Skipping analyzing 
> "pyarrow.fs": module is installed, but missing library stubs or py.typed 
> marker
> {code}
> More information is available here: 
> [https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-library-stubs-or-py-typed-marker]
> h2. You can solve this in three ways:
>  # Ignore the message. This, however, treats all types from PyArrow as 
> `Any`, making mypy unable to find user errors against the PyArrow library.
>  # Create Python stub files. This used to be the standard, but it is no 
> longer a popular option, because stubs are extra files next to the source 
> code, while you can also annotate the code inline with type hints, which 
> brings me to the third option.
>  # Create a `py.typed` file and use inline type hints (see the sketch 
> below). This is the most popular option today because it requires no extra 
> files (except for the py.typed file), keeps all the type hints with the code 
> (as the documentation does now), and provides not only your users but also 
> the developers of the library themselves with type hints (and hinting of 
> issues inside your IDE).
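>  
> A minimal sketch of option 3 (the module and function names are hypothetical, 
> purely for illustration): the package ships an empty `py.typed` marker file 
> next to source code that is annotated inline, e.g.
> {code:java}
> # mypackage/util.py -- mypackage/py.typed is shipped as an empty marker file
> import pyarrow as pa
> 
> def read_column(table: pa.Table, name: str) -> pa.ChunkedArray:
>     # with inline hints, mypy can type-check callers of this function
>     return table.column(name)
> {code}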
>  
> My personal opinion already shines through in the options: it is option 3, as 
> this has quickly become the industry standard since its introduction.
> h2. What should we do?
> I'd very much like to work on this, however, I don't feel like wasting time. 
> Therefore, I am raising this ticket to see if this had been considered before 
> or if we just didn't get to this yet.
> I'd like to open the discussion here:
>  # Do you agree with option #3 for the type hints?
>  # Should we remove the type annotations from the documentation, given that 
> they will be on the functions themselves? Or should we keep them in the 
> documentation and also specify them in the code, which would duplicate them?
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ARROW-16797) [Python][Packaging] Update conda-recipes from conda-forge feedstock

2022-06-09 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-16797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552352#comment-17552352
 ] 

Uwe Korn commented on ARROW-16797:
--

I can take care of them in the next few days.

> [Python][Packaging] Update conda-recipes from conda-forge feedstock
> ---
>
> Key: ARROW-16797
> URL: https://issues.apache.org/jira/browse/ARROW-16797
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging, Python
>Reporter: Raúl Cumplido
>Priority: Major
>
> Our conda-recipes have not been updated for the last 4 months 
> ([https://github.com/apache/arrow/tree/master/dev/tasks/conda-recipes/.ci_support])
>  and they are not up-to-date with the upstream feedstocks:
> [arrow-cpp-feedstock]: [https://github.com/conda-forge/arrow-cpp-feedstock]
> [parquet-cpp-feedstock]: 
> [https://github.com/conda-forge/parquet-cpp-feedstock]
> We should keep them up-to-date.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (ARROW-16687) [C++] Can not find conda-installed Google Benchmark library

2022-06-01 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-16687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544760#comment-17544760
 ] 

Uwe Korn commented on ARROW-16687:
--

My main guess for such issues is that one version may have come from the 
defaults channel rather than from conda-forge. Hard for me to reproduce, though.

> [C++] Can not find conda-installed Google Benchmark library
> ---
>
> Key: ARROW-16687
> URL: https://issues.apache.org/jira/browse/ARROW-16687
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Major
>
> I have {{benchmark 1.6.1}} installed from conda-forge, yet when trying to 
> build Arrow C++ with benchmarks enabled I get the following error:
> {code}
> CMake Error at cmake_modules/ThirdpartyToolchain.cmake:253 (find_package):
>   By not providing "Findbenchmark.cmake" in CMAKE_MODULE_PATH this project
>   has asked CMake to find a package configuration file provided by
>   "benchmark", but CMake did not find one.
>   Could not find a package configuration file provided by "benchmark"
>   (requested version 1.6.0) with any of the following names:
> benchmarkConfig.cmake
> benchmark-config.cmake
>   Add the installation prefix of "benchmark" to CMAKE_PREFIX_PATH or set
>   "benchmark_DIR" to a directory containing one of the above files.  If
>   "benchmark" provides a separate development package or SDK, be sure it has
>   been installed.
> Call Stack (most recent call first):
>   cmake_modules/ThirdpartyToolchain.cmake:2141 (resolve_dependency)
>   CMakeLists.txt:567 (include)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (ARROW-16520) [C++][Flight] Symbol lookup error in tests with conda-forge g++

2022-05-11 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534877#comment-17534877
 ] 

Uwe Korn commented on ARROW-16520:
--

This won't make a difference as {{lib.cpp}} isn't used for the build of 
{{main}}.

> [C++][Flight] Symbol lookup error in tests with conda-forge g++
> ---
>
> Key: ARROW-16520
> URL: https://issues.apache.org/jira/browse/ARROW-16520
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC
>Reporter: Antoine Pitrou
>Priority: Major
>
> When building Arrow with conda-forge {{g\+\+}}, I get the following errors in 
> the Flight tests. This did not seem to happen with {{clang\+\+}}.
> {code}
> 29/69 Test #51: arrow-flight-internals-test ..***Failed0.18 
> sec
> Running arrow-flight-internals-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-internals-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-internals-test: 
> symbol lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 55: arrow-ipc-json-simple-test
> 30/69 Test #52: arrow-flight-test ***Failed0.16 
> sec
> Running arrow-flight-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-test.txt 
> (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-test: symbol lookup 
> error: /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: 
> undefined symbol: 
> google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 56: arrow-ipc-read-write-test
> 31/69 Test #53: arrow-flight-sql-test ***Failed0.17 
> sec
> Running arrow-flight-sql-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-sql-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-sql-test: symbol 
> lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight/sql
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (ARROW-16520) [C++][Flight] Symbol lookup error in tests with conda-forge g++

2022-05-11 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534866#comment-17534866
 ] 

Uwe Korn commented on ARROW-16520:
--

Still trying to build a minimal reproducer for this: 
https://gist.github.com/xhochy/0ec41c962e669d90026c671105aebb5f
This breaks independently of {{-fvisibility-inlines-hidden}}.

> [C++][Flight] Symbol lookup error in tests with conda-forge g++
> ---
>
> Key: ARROW-16520
> URL: https://issues.apache.org/jira/browse/ARROW-16520
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC
>Reporter: Antoine Pitrou
>Priority: Major
>
> When building Arrow with conda-forge {{g\+\+}}, I get the following errors in 
> the Flight tests. This did not seem to happen with {{clang\+\+}}.
> {code}
> 29/69 Test #51: arrow-flight-internals-test ..***Failed0.18 
> sec
> Running arrow-flight-internals-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-internals-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-internals-test: 
> symbol lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 55: arrow-ipc-json-simple-test
> 30/69 Test #52: arrow-flight-test ***Failed0.16 
> sec
> Running arrow-flight-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-test.txt 
> (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-test: symbol lookup 
> error: /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: 
> undefined symbol: 
> google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 56: arrow-ipc-read-write-test
> 31/69 Test #53: arrow-flight-sql-test ***Failed0.17 
> sec
> Running arrow-flight-sql-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-sql-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-sql-test: symbol 
> lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight/sql
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (ARROW-16520) [C++][Flight] Symbol lookup error in tests with conda-forge g++

2022-05-11 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534861#comment-17534861
 ] 

Uwe Korn commented on ARROW-16520:
--

No, that should have triggered it already. My current feeling is that you have 
different versions of {{libprotobuf}} mixed in your setup.

> [C++][Flight] Symbol lookup error in tests with conda-forge g++
> ---
>
> Key: ARROW-16520
> URL: https://issues.apache.org/jira/browse/ARROW-16520
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC
>Reporter: Antoine Pitrou
>Priority: Major
>
> When building Arrow with conda-forge {{g\+\+}}, I get the following errors in 
> the Flight tests. This did not seem to happen with {{clang\+\+}}.
> {code}
> 29/69 Test #51: arrow-flight-internals-test ..***Failed0.18 
> sec
> Running arrow-flight-internals-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-internals-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-internals-test: 
> symbol lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 55: arrow-ipc-json-simple-test
> 30/69 Test #52: arrow-flight-test ***Failed0.16 
> sec
> Running arrow-flight-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-test.txt 
> (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-test: symbol lookup 
> error: /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: 
> undefined symbol: 
> google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 56: arrow-ipc-read-write-test
> 31/69 Test #53: arrow-flight-sql-test ***Failed0.17 
> sec
> Running arrow-flight-sql-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-sql-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-sql-test: symbol 
> lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight/sql
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (ARROW-16520) [C++][Flight] Symbol lookup error in tests with conda-forge g++

2022-05-11 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534853#comment-17534853
 ] 

Uwe Korn commented on ARROW-16520:
--

I'm a bit confused that the error only shows up at runtime. Shouldn't it 
already trigger at link time?

> [C++][Flight] Symbol lookup error in tests with conda-forge g++
> ---
>
> Key: ARROW-16520
> URL: https://issues.apache.org/jira/browse/ARROW-16520
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC
>Reporter: Antoine Pitrou
>Priority: Major
>
> When building Arrow with conda-forge {{g\+\+}}, I get the following errors in 
> the Flight tests. This did not seem to happen with {{clang\+\+}}.
> {code}
> 29/69 Test #51: arrow-flight-internals-test ..***Failed0.18 
> sec
> Running arrow-flight-internals-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-internals-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-internals-test: 
> symbol lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 55: arrow-ipc-json-simple-test
> 30/69 Test #52: arrow-flight-test ***Failed0.16 
> sec
> Running arrow-flight-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-test.txt 
> (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-test: symbol lookup 
> error: /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: 
> undefined symbol: 
> google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight
>   Start 56: arrow-ipc-read-write-test
> 31/69 Test #53: arrow-flight-sql-test ***Failed0.17 
> sec
> Running arrow-flight-sql-test, redirecting output into 
> /home/antoine/arrow/dev/cpp/build-test/build/test-logs/arrow-flight-sql-test.txt
>  (attempt 1/1)
> /home/antoine/arrow/dev/cpp/build-test/debug/arrow-flight-sql-test: symbol 
> lookup error: 
> /home/antoine/miniconda3/envs/pyarrow/lib/libarrow_flight.so.900: undefined 
> symbol: google::protobuf::internal::InternalMetadata::~InternalMetadata()
> ~/arrow/dev/cpp/build-test/src/arrow/flight/sql
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (ARROW-14323) [Python] Import error (ppc64le) - Illegal instruction (core dumped)

2022-04-12 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521159#comment-17521159
 ] 

Uwe Korn commented on ARROW-14323:
--

This should be fixed, as it was a compiler bug in GCC 7. We patched GCC and 
rebuilt the arrow packages on conda-forge.

> [Python] Import error (ppc64le) - Illegal instruction (core dumped)
> ---
>
> Key: ARROW-14323
> URL: https://issues.apache.org/jira/browse/ARROW-14323
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 5.0.0
> Environment: Platform: Red Hat Enterprise Linux 8.2 (Ootpa) - PPC64le
> Python version: 3.6
> PyArrow version: pyarrow - 5.0.0 - 
> conda-forge/linux-ppc64le::pyarrow-5.0.0-py36h7a46c7e_8_cpu
>Reporter: Gerardo Cervantes
>Priority: Major
>
> h2. Description
> The bug occurs when installing the PyArrow library with conda.
> When trying to import it in Python, this error shows up:
>  
> {code:java}
> Illegal instruction (core dumped){code}
>  
> h2. Environment
>  * Platform: Red Hat Enterprise Linux 8.2 (Ootpa) - PPC64le
>  * Python version: 3.6
>  * PyArrow version: pyarrow - 5.0.0 - 
> conda-forge/linux-ppc64le::pyarrow-5.0.0-py36h7a46c7e_8_cpu
> h2. Steps to reproduce the bug
> Installing with conda and then trying to import the package through Python 
> gives the error.
> {code:java}
> conda create --name pyarrow_py36 python=3.6
> conda activate pyarrow_py36
> conda install pyarrow
> {code}
> h2. Tracebacks
> conda create --name pyarrow_py36 python=3.6
>  
> {code:java}
> Collecting package metadata (current_repodata.json): done
> Solving environment: done
> ==> WARNING: A newer version of conda exists. <==
>  current version: 4.9.2
>  latest version: 4.10.3
> Please update conda by running
> $ conda update -n base -c defaults conda
>  
> ## Package Plan ##
> environment location: /p/home/gerryc/.conda/envs/pyarrow_py36
> added / updated specs:
>  - python=3.6
> The following NEW packages will be INSTALLED:
> _libgcc_mutex conda-forge/linux-ppc64le::_libgcc_mutex-0.1-conda_forge
>  _openmp_mutex conda-forge/linux-ppc64le::_openmp_mutex-4.5-1_gnu
>  ca-certificates 
> conda-forge/linux-ppc64le::ca-certificates-2021.10.8-h1084571_0
>  certifi pkgs/main/linux-ppc64le::certifi-2020.12.5-py36h6ffa863_0
>  ld_impl_linux-ppc~ 
> conda-forge/linux-ppc64le::ld_impl_linux-ppc64le-2.36.1-ha35d02b_2
>  libffi conda-forge/linux-ppc64le::libffi-3.4.2-h3b9df90_4
>  libgcc-ng conda-forge/linux-ppc64le::libgcc-ng-11.2.0-h7698a5e_11
>  libgomp conda-forge/linux-ppc64le::libgomp-11.2.0-h7698a5e_11
>  libstdcxx-ng conda-forge/linux-ppc64le::libstdcxx-ng-11.2.0-habdf983_11
>  libzlib conda-forge/linux-ppc64le::libzlib-1.2.11-h339bb43_1013
>  ncurses conda-forge/linux-ppc64le::ncurses-6.2-hea85c5d_4
>  openssl conda-forge/linux-ppc64le::openssl-1.1.1l-h4e0d66e_0
>  pip conda-forge/noarch::pip-21.3-pyhd8ed1ab_0
>  python conda-forge/linux-ppc64le::python-3.6.13-h57873ef_2_cpython
>  readline conda-forge/linux-ppc64le::readline-8.1-h5c45dff_0
>  setuptools pkgs/main/linux-ppc64le::setuptools-58.0.4-py36h6ffa863_0
>  sqlite conda-forge/linux-ppc64le::sqlite-3.36.0-h4e2196e_2
>  tk conda-forge/linux-ppc64le::tk-8.6.11-h41c6715_1
>  wheel conda-forge/noarch::wheel-0.37.0-pyhd8ed1ab_1
>  xz conda-forge/linux-ppc64le::xz-5.2.5-h6eb9509_1
>  zlib conda-forge/linux-ppc64le::zlib-1.2.11-h339bb43_1013
> Proceed ([y]/n)? y
> Preparing transaction: done
> Verifying transaction: done
> Executing transaction: done
> #
> # To activate this environment, use
> #
> # $ conda activate pyarrow_py36
> #
> # To deactivate an active environment, use
> #
> # $ conda deactivate
> {code}
>  
> conda activate pyarrow_py36
> conda install pyarrow
>  
> {code:java}
> Collecting package metadata (current_repodata.json): done
> Solving environment: failed with initial frozen solve. Retrying with flexible 
> solve.
> Solving environment: failed with repodata from current_repodata.json, will 
> retry with next repodata source.
> Collecting package metadata (repodata.json): done
> Solving environment: done
> ==> WARNING: A newer version of conda exists. <==
>  current version: 4.9.2
>  latest version: 4.10.3
> Please update conda by running
> $ conda update -n base -c defaults conda
>  
> ## Package Plan ##
> environment location: /p/home/gerryc/.conda/envs/pyarrow_py36
> added / updated specs:
>  - pyarrow
> The following NEW packages will be INSTALLED:
> abseil-cpp conda-forge/linux-ppc64le::abseil-cpp-20210324.2-h3b9df90_0
>  arrow-cpp conda-forge/linux-ppc64le::arrow-cpp-5.0.0-py36hf9cf308_8_cpu
>  aws-c-cal conda-forge/linux-ppc64le::aws-c-cal-0.5.11-hb3fac3d_0
>  aws-c-common conda-forge/linux-ppc64le::aws-c-common-0.6.2-h4e0d66e_0
>  aws-c-event-stream 
> conda-forge/li

[jira] [Commented] (ARROW-15897) [C++] Linker error when building Flight tests

2022-03-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504406#comment-17504406
 ] 

Uwe Korn commented on ARROW-15897:
--

Not that easily, but nowadays the {{conda}} and {{mamba}} commands are no 
longer plain executables but shell functions, precisely so that they can handle 
the reactivation of the environment when such packages are installed.

> [C++] Linker error when building Flight tests
> -
>
> Key: ARROW-15897
> URL: https://issues.apache.org/jira/browse/ARROW-15897
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: 8.0.0
>
>
> When building the Flight unit tests, I get the following error:
> {code}
> [16/25] Linking CXX executable debug/flight-test-integration-server
> FAILED: debug/flight-test-integration-server 
> [... snip long linker command line ...]
> /usr/bin/ld: /home/antoine/miniconda3/envs/pyarrow/lib/libgrpc++.so.1.44.0: 
> undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> {code}
> This is using a pre-built gRPC from conda-forge ({{grpc-cpp 1.44.0 
> h3d78c48_1}}).
> Other possibly relevant options: {{CMAKE_CXX_STANDARD=17}}, 
> {{ARROW_BUILD_STATIC=OFF}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15897) [C++] Linker error when building Flight tests

2022-03-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504387#comment-17504387
 ] 

Uwe Korn commented on ARROW-15897:
--

The environment variables are typically set by the packages whose names end in 
the platform, e.g. {{gxx_linux-64}}.

> [C++] Linker error when building Flight tests
> -
>
> Key: ARROW-15897
> URL: https://issues.apache.org/jira/browse/ARROW-15897
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: 8.0.0
>
>
> When building the Flight unit tests, I get the following error:
> {code}
> [16/25] Linking CXX executable debug/flight-test-integration-server
> FAILED: debug/flight-test-integration-server 
> [... snip long linker command line ...]
> /usr/bin/ld: /home/antoine/miniconda3/envs/pyarrow/lib/libgrpc++.so.1.44.0: 
> undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> {code}
> This is using a pre-built gRPC from conda-forge ({{grpc-cpp 1.44.0 
> h3d78c48_1}}).
> Other possibly relevant options: {{CMAKE_CXX_STANDARD=17}}, 
> {{ARROW_BUILD_STATIC=OFF}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15897) [C++] Linker error when building Flight tests

2022-03-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504385#comment-17504385
 ] 

Uwe Korn commented on ARROW-15897:
--

The error is exactly as you discovered: conda-forge uses a more recent 
libstdc++ than your system. Reactivating your environment loads the correct 
environment variables. This should also happen automatically if you install 
packages with {{conda}} or a recent {{mamba}} version. 
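
A quick sketch to verify this from inside the (re)activated environment, 
assuming the conda-forge compiler activation scripts have run:

{code:java}
import os

# The compiler activation packages export these variables; they should
# point into the conda environment, not at the system toolchain.
for var in ("CC", "CXX", "CFLAGS", "CXXFLAGS", "LDFLAGS"):
    print(var, "=", os.environ.get(var))
{code}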

> [C++] Linker error when building Flight tests
> -
>
> Key: ARROW-15897
> URL: https://issues.apache.org/jira/browse/ARROW-15897
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: 8.0.0
>
>
> When building the Flight unit tests, I get the following error:
> {code}
> [16/25] Linking CXX executable debug/flight-test-integration-server
> FAILED: debug/flight-test-integration-server 
> [... snip long linker command line ...]
> /usr/bin/ld: /home/antoine/miniconda3/envs/pyarrow/lib/libgrpc++.so.1.44.0: 
> undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> {code}
> This is using a pre-built gRPC from conda-forge ({{grpc-cpp 1.44.0 
> h3d78c48_1}}).
> Other possibly relevant options: {{CMAKE_CXX_STANDARD=17}}, 
> {{ARROW_BUILD_STATIC=OFF}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15903) [C++] Cannot build with system CUDA toolkit and conda-forge compilers

2022-03-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504382#comment-17504382
 ] 

Uwe Korn commented on ARROW-15903:
--

Do you have {{nvcc_linux-64}} installed? That could be missing.

> [C++] Cannot build with system CUDA toolkit and conda-forge compilers
> -
>
> Key: ARROW-15903
> URL: https://issues.apache.org/jira/browse/ARROW-15903
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, GPU
>Reporter: Antoine Pitrou
>Priority: Major
>
> I have the CUDA toolkit installed on my Ubuntu 20.04 machine. I can build 
> Arrow with CUDA support fine using the system-provided clang. However, if I 
> try the conda-forge clang, I get the following CMake error when configuring:
> {code}
> -- Unable to find cudart library.
> CMake Error at 
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230
>  (message):
>   Could NOT find CUDAToolkit (missing: CUDA_CUDART) (found version
>   "10.1.243")
> Call Stack (most recent call first):
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594
>  (_FPHSA_FAILURE_MESSAGE)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/FindCUDAToolkit.cmake:814
>  (find_package_handle_standard_args)
>   src/arrow/gpu/CMakeLists.txt:40 (find_package)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15752) [C++] Compilation failure with aws-c-cal from conda-forge

2022-03-04 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501248#comment-17501248
 ] 

Uwe Korn commented on ARROW-15752:
--

[~apitrou] Can you check again? All rebuilds are through, and a new aws-sdk-cpp 
version was also released in the meantime.

> [C++] Compilation failure with aws-c-cal from conda-forge
> -
>
> Key: ARROW-15752
> URL: https://issues.apache.org/jira/browse/ARROW-15752
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Major
>
> I get the following CMake error when trying to configure with aws-sdk-cpp 
> from conda-forge:
> {code}
> CMake Error at 
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/modules/FindLibCrypto.cmake:21
>  (get_target_property):
>   get_target_property() called with non-existent target "crypto".
> Call Stack (most recent call first):
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/aws-c-cal-config.cmake:7
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-io/cmake/aws-c-io-config.cmake:8
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-http/cmake/aws-c-http-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-crt-cpp/cmake/aws-crt-cpp-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/aws-cpp-sdk-core/aws-cpp-sdk-core-config.cmake:13
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/AWSSDK/AWSSDKConfig.cmake:307 
> (find_package)
>   cmake_modules/ThirdpartyToolchain.cmake:4326 (find_package)
>   CMakeLists.txt:548 (include)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15752) [C++] Compilation failure with aws-c-cal from conda-forge

2022-02-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496203#comment-17496203
 ] 

Uwe Korn commented on ARROW-15752:
--

I'm confused, though, that you can see it. This error has prevented the whole 
AWS SDK chain from rebuilding, so you shouldn't be seeing it downstream.

> [C++] Compilation failure with aws-c-cal from conda-forge
> -
>
> Key: ARROW-15752
> URL: https://issues.apache.org/jira/browse/ARROW-15752
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Major
>
> I get the following CMake error when trying to configure with aws-sdk-cpp 
> from conda-forge:
> {code}
> CMake Error at 
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/modules/FindLibCrypto.cmake:21
>  (get_target_property):
>   get_target_property() called with non-existent target "crypto".
> Call Stack (most recent call first):
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/aws-c-cal-config.cmake:7
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-io/cmake/aws-c-io-config.cmake:8
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-http/cmake/aws-c-http-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-crt-cpp/cmake/aws-crt-cpp-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/aws-cpp-sdk-core/aws-cpp-sdk-core-config.cmake:13
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/AWSSDK/AWSSDKConfig.cmake:307 
> (find_package)
>   cmake_modules/ThirdpartyToolchain.cmake:4326 (find_package)
>   CMakeLists.txt:548 (include)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15752) [C++] Compilation failure with aws-c-cal from conda-forge

2022-02-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496201#comment-17496201
 ] 

Uwe Korn commented on ARROW-15752:
--

This is probably fixed by aws-c-cal 0.5.14, but it needs 1-2 days for the whole 
AWS stack to rebuild.

> [C++] Compilation failure with aws-c-cal from conda-forge
> -
>
> Key: ARROW-15752
> URL: https://issues.apache.org/jira/browse/ARROW-15752
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Major
>
> I get the following CMake error when trying to configure with aws-sdk-cpp 
> from conda-forge:
> {code}
> CMake Error at 
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/modules/FindLibCrypto.cmake:21
>  (get_target_property):
>   get_target_property() called with non-existent target "crypto".
> Call Stack (most recent call first):
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/aws-c-cal-config.cmake:7
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-io/cmake/aws-c-io-config.cmake:8
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-http/cmake/aws-c-http-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-crt-cpp/cmake/aws-crt-cpp-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/aws-cpp-sdk-core/aws-cpp-sdk-core-config.cmake:13
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/AWSSDK/AWSSDKConfig.cmake:307 
> (find_package)
>   cmake_modules/ThirdpartyToolchain.cmake:4326 (find_package)
>   CMakeLists.txt:548 (include)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15752) [C++] Compilation failure with aws-c-cal from conda-forge

2022-02-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496128#comment-17496128
 ] 

Uwe Korn commented on ARROW-15752:
--

[~apitrou] Can you post me a {{conda list}} here?

> [C++] Compilation failure with aws-c-cal from conda-forge
> -
>
> Key: ARROW-15752
> URL: https://issues.apache.org/jira/browse/ARROW-15752
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Major
>
> I get the following CMake error when trying to configure with aws-sdk-cpp 
> from conda-forge:
> {code}
> CMake Error at 
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/modules/FindLibCrypto.cmake:21
>  (get_target_property):
>   get_target_property() called with non-existent target "crypto".
> Call Stack (most recent call first):
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-cal/cmake/aws-c-cal-config.cmake:7
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-io/cmake/aws-c-io-config.cmake:8
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-c-http/cmake/aws-c-http-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/aws-crt-cpp/cmake/aws-crt-cpp-config.cmake:3
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/share/cmake-3.22/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/aws-cpp-sdk-core/aws-cpp-sdk-core-config.cmake:13
>  (find_dependency)
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/cmake/AWSSDK/AWSSDKConfig.cmake:307 
> (find_package)
>   cmake_modules/ThirdpartyToolchain.cmake:4326 (find_package)
>   CMakeLists.txt:548 (include)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (ARROW-15444) [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc

2022-02-18 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-15444.
--
  Assignee: Uwe Korn
Resolution: Invalid

This was an internal compiler error in GCC, and I have applied the same patch 
that Ubuntu carries to the conda-forge compilers.

> [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc
> --
>
> Key: ARROW-15444
> URL: https://issues.apache.org/jira/browse/ARROW-15444
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Uwe Korn
>Assignee: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 8.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Building with GCC 7.5 currently fails with the following internal error. We 
> need to support this GCC version for CUDA-enabled and PPC64LE builds on 
> conda-forge. See also the updated conda recipe in 
> https://github.com/apache/arrow/pull/11916
> {code:java}
> 2022-01-24T14:18:48.2261185Z [182/405] Building CXX object 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o
> 2022-01-24T14:18:48.2261792Z FAILED: 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> 2022-01-24T14:18:48.2268608Z 
> /build/arrow-cpp-ext_1643033227908/_build_env/bin/powerpc64le-conda-linux-gnu-c++
>  -DARROW_EXPORTING -DARROW_HDFS -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_WITH_BACKTRACE 
> -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 
> -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC 
> -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/build/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src/generated -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/flatbuffers/include 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/jemalloc_ep-prefix/src 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/xsimd_ep/src/xsimd_ep-install/include
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/hadoop/include 
> -Wno-noexcept-type -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 
> -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong 
> -fno-plt -O3 -pipe -isystem 
> /build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/include
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/work=/usr/local/src/conda/arrow-cpp-7.0.0.dev553
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla=/usr/local/src/conda-prefix
>  -fdiagnostics-color=always -fuse-ld=gold -O3 -DNDEBUG  -Wall 
> -fno-semantic-interposition  -O3 -DNDEBUG -fPIC -std=c++1z -MD -MT 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -MF 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o.d 
> -o src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -c 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc
> 2022-01-24T14:18:48.2273037Z In file included from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/codegen_internal.h:46:0,
> 2022-01-24T14:18:48.2273811Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/util_internal.h:26,
> 2022-01-24T14:18:48.2274563Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:20,
> 2022-01-24T14:18:48.2275318Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic_internal.h:24,
> 2022-01-24T14:18:48.2276088Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc:19:
> 2022-01-24T14:18:48.2277993Z 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:
>  In instantiation of 'arrow::compute::internal::SumArray(const 
> arrow::ArrayData&, ValueFunc&&):: [with ValueType = double; 
> SumType = double; arrow::compute::SimdLevel::type SimdLevel = 
> (arrow::compute::SimdLevel::type)0; ValueFunc = 
> a

[jira] [Created] (ARROW-15670) [C++/Python/Packaging] Update conda pinnings and enable GCS on Windows

2022-02-13 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-15670:


 Summary: [C++/Python/Packaging] Update conda pinnings and enable 
GCS on Windows
 Key: ARROW-15670
 URL: https://issues.apache.org/jira/browse/ARROW-15670
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Packaging, Python
Reporter: Uwe Korn
Assignee: Uwe Korn






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15444) [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc

2022-01-27 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483052#comment-17483052
 ] 

Uwe Korn commented on ARROW-15444:
--

Ongoing work on this is at 
[https://github.com/conda-forge/ctng-compilers-feedstock/pull/82]

> [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc
> --
>
> Key: ARROW-15444
> URL: https://issues.apache.org/jira/browse/ARROW-15444
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 7.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Building with GCC 7.5 currently fails with the following internal error. We 
> need to support this GCC version for CUDA-enabled and PPC64LE builds on 
> conda-forge. See also the updated conda recipe in 
> https://github.com/apache/arrow/pull/11916
> {code:java}
> 2022-01-24T14:18:48.2261185Z [182/405] Building CXX object 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o
> 2022-01-24T14:18:48.2261792Z FAILED: 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> 2022-01-24T14:18:48.2268608Z 
> /build/arrow-cpp-ext_1643033227908/_build_env/bin/powerpc64le-conda-linux-gnu-c++
>  -DARROW_EXPORTING -DARROW_HDFS -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_WITH_BACKTRACE 
> -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 
> -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC 
> -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/build/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src/generated -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/flatbuffers/include 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/jemalloc_ep-prefix/src 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/xsimd_ep/src/xsimd_ep-install/include
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/hadoop/include 
> -Wno-noexcept-type -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 
> -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong 
> -fno-plt -O3 -pipe -isystem 
> /build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/include
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/work=/usr/local/src/conda/arrow-cpp-7.0.0.dev553
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla=/usr/local/src/conda-prefix
>  -fdiagnostics-color=always -fuse-ld=gold -O3 -DNDEBUG  -Wall 
> -fno-semantic-interposition  -O3 -DNDEBUG -fPIC -std=c++1z -MD -MT 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -MF 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o.d 
> -o src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -c 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc
> 2022-01-24T14:18:48.2273037Z In file included from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/codegen_internal.h:46:0,
> 2022-01-24T14:18:48.2273811Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/util_internal.h:26,
> 2022-01-24T14:18:48.2274563Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:20,
> 2022-01-24T14:18:48.2275318Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic_internal.h:24,
> 2022-01-24T14:18:48.2276088Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc:19:
> 2022-01-24T14:18:48.2277993Z 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:
>  In instantiation of 'arrow::compute::internal::SumArray(const 
> arrow::ArrayData&, ValueFunc&&):: [with ValueType = double; 
> SumType = double; arrow::compute::SimdLevel::type SimdLevel = 
> (arrow::compute::SimdLevel::type)0; ValueFunc = 
> arrow::compute::internal::SumArray(cons

[jira] [Commented] (ARROW-15444) [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc

2022-01-25 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482091#comment-17482091
 ] 

Uwe Korn commented on ARROW-15444:
--

Disabling CMAKE_CXX_STANDARD and GCS together fixes the issue. I'll now try to 
find out what the actual cause is, but this is no longer a release blocker.

> [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc
> --
>
> Key: ARROW-15444
> URL: https://issues.apache.org/jira/browse/ARROW-15444
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 7.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Building with GCC 7.5 currently fails with the following internal error. We 
> need to support this GCC version for CUDA-enabled and PPC64LE builds on 
> conda-forge. See also the updated conda recipe in 
> https://github.com/apache/arrow/pull/11916
> {code:java}
> 2022-01-24T14:18:48.2261185Z [182/405] Building CXX object 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o
> 2022-01-24T14:18:48.2261792Z FAILED: 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> 2022-01-24T14:18:48.2268608Z 
> /build/arrow-cpp-ext_1643033227908/_build_env/bin/powerpc64le-conda-linux-gnu-c++
>  -DARROW_EXPORTING -DARROW_HDFS -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_WITH_BACKTRACE 
> -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 
> -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC 
> -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/build/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src/generated -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/flatbuffers/include 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/jemalloc_ep-prefix/src 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/xsimd_ep/src/xsimd_ep-install/include
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/hadoop/include 
> -Wno-noexcept-type -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 
> -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong 
> -fno-plt -O3 -pipe -isystem 
> /build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/include
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/work=/usr/local/src/conda/arrow-cpp-7.0.0.dev553
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla=/usr/local/src/conda-prefix
>  -fdiagnostics-color=always -fuse-ld=gold -O3 -DNDEBUG  -Wall 
> -fno-semantic-interposition  -O3 -DNDEBUG -fPIC -std=c++1z -MD -MT 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -MF 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o.d 
> -o src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -c 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc
> 2022-01-24T14:18:48.2273037Z In file included from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/codegen_internal.h:46:0,
> 2022-01-24T14:18:48.2273811Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/util_internal.h:26,
> 2022-01-24T14:18:48.2274563Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:20,
> 2022-01-24T14:18:48.2275318Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic_internal.h:24,
> 2022-01-24T14:18:48.2276088Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc:19:
> 2022-01-24T14:18:48.2277993Z 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:
>  In instantiation of 'arrow::compute::internal::SumArray(const 
> arrow::ArrayData&, ValueFunc&&):: [with ValueType = double; 
> SumType = double; arrow::compute::SimdLevel::type SimdLevel = 
> (arrow::compute::SimdLevel::type)0

[jira] [Commented] (ARROW-15444) [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc

2022-01-25 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482075#comment-17482075
 ] 

Uwe Korn commented on ARROW-15444:
--

The following CXXFLAGS seem to work:
{code:java}
-Wno-noexcept-type  -fdiagnostics-color=always -O3 -DNDEBUG  -Wall 
-Wno-conversion -Wno-deprecated-declarations -Wno-sign-conversion 
-Wunused-result -fno-semantic-interposition -msse4.2  {code}
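
For anyone reproducing this locally, a minimal sketch of applying these flags to 
an out-of-tree Arrow C++ build; the {{arrow/cpp}} and {{build}} paths below are 
placeholders, not part of the report:
{code:python}
# Sketch: configure and build Arrow C++ with the flags above.
import os
import subprocess

flags = (
    "-Wno-noexcept-type -fdiagnostics-color=always -O3 -DNDEBUG -Wall "
    "-Wno-conversion -Wno-deprecated-declarations -Wno-sign-conversion "
    "-Wunused-result -fno-semantic-interposition -msse4.2"
)
env = {**os.environ, "CXXFLAGS": flags}
subprocess.run(["cmake", "-S", "arrow/cpp", "-B", "build"], env=env, check=True)
subprocess.run(["cmake", "--build", "build"], check=True)
{code}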

> [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc
> --
>
> Key: ARROW-15444
> URL: https://issues.apache.org/jira/browse/ARROW-15444
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 7.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Building with GCC 7.5 currently fails with the following internal error. We 
> need to support this GCC version for CUDA-enabled and PPC64LE builds on 
> conda-forge. See also the updated conda recipe in 
> https://github.com/apache/arrow/pull/11916
> {code:java}
> 2022-01-24T14:18:48.2261185Z [182/405] Building CXX object 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o
> 2022-01-24T14:18:48.2261792Z FAILED: 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> 2022-01-24T14:18:48.2268608Z 
> /build/arrow-cpp-ext_1643033227908/_build_env/bin/powerpc64le-conda-linux-gnu-c++
>  -DARROW_EXPORTING -DARROW_HDFS -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_WITH_BACKTRACE 
> -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 
> -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC 
> -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/build/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src/generated -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/flatbuffers/include 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/jemalloc_ep-prefix/src 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/xsimd_ep/src/xsimd_ep-install/include
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/hadoop/include 
> -Wno-noexcept-type -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 
> -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong 
> -fno-plt -O3 -pipe -isystem 
> /build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/include
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/work=/usr/local/src/conda/arrow-cpp-7.0.0.dev553
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla=/usr/local/src/conda-prefix
>  -fdiagnostics-color=always -fuse-ld=gold -O3 -DNDEBUG  -Wall 
> -fno-semantic-interposition  -O3 -DNDEBUG -fPIC -std=c++1z -MD -MT 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -MF 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o.d 
> -o src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -c 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc
> 2022-01-24T14:18:48.2273037Z In file included from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/codegen_internal.h:46:0,
> 2022-01-24T14:18:48.2273811Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/util_internal.h:26,
> 2022-01-24T14:18:48.2274563Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:20,
> 2022-01-24T14:18:48.2275318Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic_internal.h:24,
> 2022-01-24T14:18:48.2276088Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc:19:
> 2022-01-24T14:18:48.2277993Z 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:
>  In instantiation of 'arrow::compute::internal::SumArray(const 
> arrow::ArrayData&, ValueFunc&&):: [with ValueType = double; 
> SumType 

[jira] [Commented] (ARROW-15444) [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc

2022-01-25 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481965#comment-17481965
 ] 

Uwe Korn commented on ARROW-15444:
--

Somewhere in these CXXFLAGS must be a problem. These are the common ones in the 
ppc64le and CUDA builds.
{code:java}
 CXXFLAGS=-fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 
-ftree-vectorize -fPIC -fstack-protector-strong -fno-plt{code}
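
A brute-force way to find the offending flag, sketched under the assumption that 
the failing translation unit can be recompiled in isolation (the compiler name, 
missing include paths, and the source path below are placeholders):
{code:python}
# Sketch: drop one flag at a time and recompile the failing file to see
# which flag triggers the internal compiler error.
import subprocess

flags = ["-fvisibility-inlines-hidden", "-std=c++17", "-fmessage-length=0",
         "-ftree-vectorize", "-fPIC", "-fstack-protector-strong", "-fno-plt"]
src = "cpp/src/arrow/compute/kernels/aggregate_basic.cc"  # placeholder

for dropped in flags:
    trial = [f for f in flags if f != dropped]
    res = subprocess.run(["g++", "-c", src, "-o", "/dev/null", *trial],
                         capture_output=True)
    # if dropping a flag makes the compile succeed, that flag is the culprit
    print(dropped, "->", "OK" if res.returncode == 0 else "FAIL")
{code}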
 

> [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc
> --
>
> Key: ARROW-15444
> URL: https://issues.apache.org/jira/browse/ARROW-15444
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
> Fix For: 7.0.0
>
>
> Building with GCC 7.5 currently fails with the following internal error. We 
> need to support this GCC version for CUDA-enabled and PPC64LE builds on 
> conda-forge. See also the updated conda recipe in 
> https://github.com/apache/arrow/pull/11916
> {code:java}
> 2022-01-24T14:18:48.2261185Z [182/405] Building CXX object 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o
> 2022-01-24T14:18:48.2261792Z FAILED: 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> 2022-01-24T14:18:48.2268608Z 
> /build/arrow-cpp-ext_1643033227908/_build_env/bin/powerpc64le-conda-linux-gnu-c++
>  -DARROW_EXPORTING -DARROW_HDFS -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_WITH_BACKTRACE 
> -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 
> -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC 
> -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/build/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src 
> -I/build/arrow-cpp-ext_1643033227908/work/cpp/src/generated -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/flatbuffers/include 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/jemalloc_ep-prefix/src 
> -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/build/xsimd_ep/src/xsimd_ep-install/include
>  -isystem 
> /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/hadoop/include 
> -Wno-noexcept-type -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 
> -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong 
> -fno-plt -O3 -pipe -isystem 
> /build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/include
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/work=/usr/local/src/conda/arrow-cpp-7.0.0.dev553
>  
> -fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla=/usr/local/src/conda-prefix
>  -fdiagnostics-color=always -fuse-ld=gold -O3 -DNDEBUG  -Wall 
> -fno-semantic-interposition  -O3 -DNDEBUG -fPIC -std=c++1z -MD -MT 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -MF 
> src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o.d 
> -o src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
> -c 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc
> 2022-01-24T14:18:48.2273037Z In file included from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/codegen_internal.h:46:0,
> 2022-01-24T14:18:48.2273811Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/util_internal.h:26,
> 2022-01-24T14:18:48.2274563Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:20,
> 2022-01-24T14:18:48.2275318Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic_internal.h:24,
> 2022-01-24T14:18:48.2276088Z  from 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc:19:
> 2022-01-24T14:18:48.2277993Z 
> /build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:
>  In instantiation of 'arrow::compute::internal::SumArray(const 
> arrow::ArrayData&, ValueFunc&&):: [with ValueType = double; 
> SumType = double; arrow::compute::SimdLevel::type SimdLevel = 
> (arrow::compute::SimdLevel::type)0

[jira] [Created] (ARROW-15445) [C++/Python] pyarrow build incorrectly detects x86 as system processor during cross-compile

2022-01-25 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-15445:


 Summary: [C++/Python] pyarrow build incorrectly detects x86 as 
system processor during cross-compile
 Key: ARROW-15445
 URL: https://issues.apache.org/jira/browse/ARROW-15445
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Python
Reporter: Uwe Korn


When cross-compiling {{pyarrow}} for aarch64 or ppc64le we run into the 
following issue:
{code:java}
-- System processor: x86_64
-- Performing Test CXX_SUPPORTS_SSE4_2
-- Performing Test CXX_SUPPORTS_SSE4_2 - Failed
-- Performing Test CXX_SUPPORTS_AVX2
-- Performing Test CXX_SUPPORTS_AVX2 - Failed
-- Performing Test CXX_SUPPORTS_AVX512
-- Performing Test CXX_SUPPORTS_AVX512 - Failed
-- Arrow build warning level: PRODUCTION
CMake Error at cmake_modules/SetupCxxFlags.cmake:456 (message):
  SSE4.2 required but compiler doesn't support it.
Call Stack (most recent call first):
  CMakeLists.txt:121 (include)


-- Configuring incomplete, errors occurred!
 {code}
The error is valid as we are building for a target system that doesn't support 
SSE at all.
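
One possible workaround, sketched under the assumption that the 
{{ARROW_SIMD_LEVEL}} CMake option and pyarrow's {{PYARROW_CMAKE_OPTIONS}} hook 
behave as documented, is to drop the SIMD requirement entirely for the 
cross-build:
{code:python}
# Hypothetical workaround sketch: disable the SSE4.2 requirement when the
# target (aarch64/ppc64le) cannot support it. The build invocation below is
# a placeholder for however the cross-build is driven.
import os
import subprocess

os.environ["PYARROW_CMAKE_OPTIONS"] = "-DARROW_SIMD_LEVEL=NONE"
subprocess.run(["python", "setup.py", "build_ext", "--inplace"], check=True)
{code}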



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (ARROW-15444) [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc

2022-01-25 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-15444:


 Summary: [C++] Compilation with GCC 7.5 fails in aggregate_basic.cc
 Key: ARROW-15444
 URL: https://issues.apache.org/jira/browse/ARROW-15444
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Uwe Korn


Building with GCC 7.5 currently fails with the following internal error. We 
need to support this GCC version for CUDA-enabled and PPC64LE builds on 
conda-forge. See also the updated conda recipe in 
https://github.com/apache/arrow/pull/11916
{code:java}
2022-01-24T14:18:48.2261185Z [182/405] Building CXX object 
src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o
2022-01-24T14:18:48.2261792Z FAILED: 
src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o 
2022-01-24T14:18:48.2268608Z 
/build/arrow-cpp-ext_1643033227908/_build_env/bin/powerpc64le-conda-linux-gnu-c++
 -DARROW_EXPORTING -DARROW_HDFS -DARROW_JEMALLOC 
-DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_WITH_BACKTRACE 
-DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 
-DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC 
-DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD 
-I/build/arrow-cpp-ext_1643033227908/work/cpp/build/src 
-I/build/arrow-cpp-ext_1643033227908/work/cpp/src 
-I/build/arrow-cpp-ext_1643033227908/work/cpp/src/generated -isystem 
/build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/flatbuffers/include 
-isystem 
/build/arrow-cpp-ext_1643033227908/work/cpp/build/jemalloc_ep-prefix/src 
-isystem 
/build/arrow-cpp-ext_1643033227908/work/cpp/build/mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7
 -isystem 
/build/arrow-cpp-ext_1643033227908/work/cpp/build/xsimd_ep/src/xsimd_ep-install/include
 -isystem /build/arrow-cpp-ext_1643033227908/work/cpp/thirdparty/hadoop/include 
-Wno-noexcept-type -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 
-mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong 
-fno-plt -O3 -pipe -isystem 
/build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/include
 
-fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/work=/usr/local/src/conda/arrow-cpp-7.0.0.dev553
 
-fdebug-prefix-map=/build/arrow-cpp-ext_1643033227908/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla=/usr/local/src/conda-prefix
 -fdiagnostics-color=always -fuse-ld=gold -O3 -DNDEBUG  -Wall 
-fno-semantic-interposition  -O3 -DNDEBUG -fPIC -std=c++1z -MD -MT 
src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o -MF 
src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o.d -o 
src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/aggregate_basic.cc.o -c 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc
2022-01-24T14:18:48.2273037Z In file included from 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/codegen_internal.h:46:0,
2022-01-24T14:18:48.2273811Z  from 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/util_internal.h:26,
2022-01-24T14:18:48.2274563Z  from 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:20,
2022-01-24T14:18:48.2275318Z  from 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic_internal.h:24,
2022-01-24T14:18:48.2276088Z  from 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_basic.cc:19:
2022-01-24T14:18:48.2277993Z 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:
 In instantiation of 'arrow::compute::internal::SumArray(const 
arrow::ArrayData&, ValueFunc&&):: [with ValueType = double; 
SumType = double; arrow::compute::SimdLevel::type SimdLevel = 
(arrow::compute::SimdLevel::type)0; ValueFunc = 
arrow::compute::internal::SumArray(const arrow::ArrayData&) [with ValueType = 
double; SumType = double; arrow::compute::SimdLevel::type SimdLevel = 
(arrow::compute::SimdLevel::type)0]::]':
2022-01-24T14:18:48.2281061Z 
/build/arrow-cpp-ext_1643033227908/work/cpp/src/arrow/compute/kernels/aggregate_internal.h:181:5:
   required from 'struct arrow::compute::internal::SumArray(const 
arrow::ArrayData&, ValueFunc&&) [with ValueType = double; SumType = double; 
arrow::compute::SimdLevel::type SimdLevel = (arrow::compute::SimdLevel::type)0; 
ValueFunc = arrow::compute::internal::SumArray(const arrow::ArrayData&) [with 
ValueType = double; SumType = d

[jira] [Resolved] (ARROW-15141) [C++] Fatal error condition occurred in aws_thread_launch

2021-12-19 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-15141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-15141.
--
Resolution: Fixed

Fixed by
 * [https://github.com/conda-forge/arrow-cpp-feedstock/pull/637]
 * [https://github.com/conda-forge/arrow-cpp-feedstock/pull/638]
 * [https://github.com/conda-forge/arrow-cpp-feedstock/pull/639]
 * [https://github.com/conda-forge/arrow-cpp-feedstock/pull/640]

> [C++] Fatal error condition occurred in aws_thread_launch
> -
>
> Key: ARROW-15141
> URL: https://issues.apache.org/jira/browse/ARROW-15141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Affects Versions: 6.0.0, 6.0.1
> Environment: - `uname -a`:
> Linux datalab2 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 
> x86_64 x86_64 x86_64 GNU/Linux
> - `mamba list | grep -i "pyarrow\|tensorflow\|^python"`
> pyarrow   6.0.0   py39hff6fa39_1_cpuconda-forge
> python3.9.7   hb7a2778_3_cpythonconda-forge
> python-dateutil   2.8.2  pyhd8ed1ab_0conda-forge
> python-flatbuffers1.12   pyhd8ed1ab_1conda-forge
> python-irodsclient1.0.0  pyhd8ed1ab_0conda-forge
> python-rocksdb0.7.0py39h7fcd5f3_4conda-forge
> python_abi3.9  2_cp39conda-forge
> tensorflow2.6.2   cuda112py39h9333c2f_0conda-forge
> tensorflow-base   2.6.2   cuda112py39h7de589b_0conda-forge
> tensorflow-estimator  2.6.2   cuda112py39h9333c2f_0conda-forge
> tensorflow-gpu2.6.2   cuda112py39h0bbbad9_0conda-forge
>Reporter: F. H.
>Assignee: Uwe Korn
>Priority: Major
>
> Hi, I am getting randomly the following error when first running inference 
> with a Tensorflow model and then writing the result to a `.parquet` file:
> {code}
> Fatal error condition occurred in 
> /home/conda/feedstock_root/build_artifacts/aws-c-io_1633633131324/work/source/event_loop.c:72:
>  aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, 
> el_group, &thread_options) == AWS_OP_SUCCESS
> Exiting Application
> 
> Stack trace:
> 
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_backtrace_print+0x59)
>  [0x7ffb14235f19]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_fatal_assert+0x48)
>  [0x7ffb14227098]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0x10a43)
>  [0x7ffb1406ea43]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
>  [0x7ffb14237fad]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0xe35a)
>  [0x7ffb1406c35a]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
>  [0x7ffb14237fad]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-crt-cpp.so(_ZN3Aws3Crt2Io15ClientBootstrapD1Ev+0x3a)
>  [0x7ffb142a2f5a]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././libaws-cpp-sdk-core.so(+0x5f570)
>  [0x7ffb147fd570]
> /lib/x86_64-linux-gnu/libc.so.6(+0x49a27) [0x7ffb17f7da27]
> /lib/x86_64-linux-gnu/libc.so.6(on_exit+0) [0x7ffb17f7dbe0]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfa) [0x7ffb17f5b0ba]
> /home//miniconda3/envs/spliceai_env/bin/python3.9(+0x20aa51) 
> [0x562576609a51]
> /bin/bash: line 1: 2341494 Aborted                 (core dumped)
> {code}
> My colleague ran into the same issue on Centos 8 while running the same job + 
> same environment on SLURM, so I guess it could be some issue with tensorflow 
> + pyarrow.
> Also I found a github issue with multiple people running into the same issue:
> [https://github.com/huggingface/datasets/issues/3310]
>  
> It would be very important to my lab that this bug gets resolved, as we 
> cannot work with parquet any more. Unfortunately, we do not have the 
> knowledge to fix it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (ARROW-15141) [C++] Fatal error condition occurred in aws_thread_launch

2021-12-19 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-15141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn reassigned ARROW-15141:


Assignee: Uwe Korn

> [C++] Fatal error condition occurred in aws_thread_launch
> -
>
> Key: ARROW-15141
> URL: https://issues.apache.org/jira/browse/ARROW-15141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Affects Versions: 6.0.0, 6.0.1
> Environment: - `uname -a`:
> Linux datalab2 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 
> x86_64 x86_64 x86_64 GNU/Linux
> - `mamba list | grep -i "pyarrow\|tensorflow\|^python"`
> pyarrow   6.0.0   py39hff6fa39_1_cpuconda-forge
> python3.9.7   hb7a2778_3_cpythonconda-forge
> python-dateutil   2.8.2  pyhd8ed1ab_0conda-forge
> python-flatbuffers1.12   pyhd8ed1ab_1conda-forge
> python-irodsclient1.0.0  pyhd8ed1ab_0conda-forge
> python-rocksdb0.7.0py39h7fcd5f3_4conda-forge
> python_abi3.9  2_cp39conda-forge
> tensorflow2.6.2   cuda112py39h9333c2f_0conda-forge
> tensorflow-base   2.6.2   cuda112py39h7de589b_0conda-forge
> tensorflow-estimator  2.6.2   cuda112py39h9333c2f_0conda-forge
> tensorflow-gpu2.6.2   cuda112py39h0bbbad9_0conda-forge
>Reporter: F. H.
>Assignee: Uwe Korn
>Priority: Major
>
> Hi, I am getting randomly the following error when first running inference 
> with a Tensorflow model and then writing the result to a `.parquet` file:
> {code}
> Fatal error condition occurred in 
> /home/conda/feedstock_root/build_artifacts/aws-c-io_1633633131324/work/source/event_loop.c:72:
>  aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, 
> el_group, &thread_options) == AWS_OP_SUCCESS
> Exiting Application
> 
> Stack trace:
> 
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_backtrace_print+0x59)
>  [0x7ffb14235f19]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_fatal_assert+0x48)
>  [0x7ffb14227098]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0x10a43)
>  [0x7ffb1406ea43]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
>  [0x7ffb14237fad]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0xe35a)
>  [0x7ffb1406c35a]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
>  [0x7ffb14237fad]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-crt-cpp.so(_ZN3Aws3Crt2Io15ClientBootstrapD1Ev+0x3a)
>  [0x7ffb142a2f5a]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././libaws-cpp-sdk-core.so(+0x5f570)
>  [0x7ffb147fd570]
> /lib/x86_64-linux-gnu/libc.so.6(+0x49a27) [0x7ffb17f7da27]
> /lib/x86_64-linux-gnu/libc.so.6(on_exit+0) [0x7ffb17f7dbe0]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfa) [0x7ffb17f5b0ba]
> /home//miniconda3/envs/spliceai_env/bin/python3.9(+0x20aa51) 
> [0x562576609a51]
> /bin/bash: line 1: 2341494 Aborted                 (core dumped)
> {code}
> My colleague ran into the same issue on Centos 8 while running the same job + 
> same environment on SLURM, so I guess it could be some issue with tensorflow 
> + pyarrow.
> Also I found a github issue with multiple people running into the same issue:
> [https://github.com/huggingface/datasets/issues/3310]
>  
> It would be very important to my lab that this bug gets resolved, as we 
> cannot work with parquet any more. Unfortunately, we do not have the 
> knowledge to fix it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15141) [C++] Fatal error condition occurred in aws_thread_launch

2021-12-17 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461315#comment-17461315
 ] 

Uwe Korn commented on ARROW-15141:
--

[~apitrou] I would simply rebuild all pyarrow conda versions with the old SDK 
again until I see a fix for this. It would be nice to have a reproducer for 
this on Linux in the conda recipe; currently, the code that fails on Windows 
passes there.
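
The smallest reproducer sketch I can think of (an assumption about the trigger, 
not a verified test) would just initialize the S3 subsystem and let the 
interpreter exit, since the assertion fires in the AWS SDK's atexit cleanup:
{code:python}
# Hypothetical reproducer: creating an S3 filesystem spins up the AWS client
# bootstrap threads; the crash, if any, happens during interpreter shutdown.
import pyarrow.fs as fs

s3 = fs.S3FileSystem(anonymous=True)
# no explicit teardown on purpose
{code}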

> [C++] Fatal error condition occurred in aws_thread_launch
> -
>
> Key: ARROW-15141
> URL: https://issues.apache.org/jira/browse/ARROW-15141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Affects Versions: 6.0.0, 6.0.1
> Environment: - `uname -a`:
> Linux datalab2 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 
> x86_64 x86_64 x86_64 GNU/Linux
> - `mamba list | grep -i "pyarrow\|tensorflow\|^python"`
> pyarrow   6.0.0   py39hff6fa39_1_cpuconda-forge
> python3.9.7   hb7a2778_3_cpythonconda-forge
> python-dateutil   2.8.2  pyhd8ed1ab_0conda-forge
> python-flatbuffers1.12   pyhd8ed1ab_1conda-forge
> python-irodsclient1.0.0  pyhd8ed1ab_0conda-forge
> python-rocksdb0.7.0py39h7fcd5f3_4conda-forge
> python_abi3.9  2_cp39conda-forge
> tensorflow2.6.2   cuda112py39h9333c2f_0conda-forge
> tensorflow-base   2.6.2   cuda112py39h7de589b_0conda-forge
> tensorflow-estimator  2.6.2   cuda112py39h9333c2f_0conda-forge
> tensorflow-gpu2.6.2   cuda112py39h0bbbad9_0conda-forge
>Reporter: F. H.
>Priority: Major
>
> Hi, I am getting randomly the following error when first running inference 
> with a Tensorflow model and then writing the result to a `.parquet` file:
> {code}
> Fatal error condition occurred in 
> /home/conda/feedstock_root/build_artifacts/aws-c-io_1633633131324/work/source/event_loop.c:72:
>  aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, 
> el_group, &thread_options) == AWS_OP_SUCCESS
> Exiting Application
> 
> Stack trace:
> 
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_backtrace_print+0x59)
>  [0x7ffb14235f19]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_fatal_assert+0x48)
>  [0x7ffb14227098]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0x10a43)
>  [0x7ffb1406ea43]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
>  [0x7ffb14237fad]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0xe35a)
>  [0x7ffb1406c35a]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
>  [0x7ffb14237fad]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-crt-cpp.so(_ZN3Aws3Crt2Io15ClientBootstrapD1Ev+0x3a)
>  [0x7ffb142a2f5a]
> /home//miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././libaws-cpp-sdk-core.so(+0x5f570)
>  [0x7ffb147fd570]
> /lib/x86_64-linux-gnu/libc.so.6(+0x49a27) [0x7ffb17f7da27]
> /lib/x86_64-linux-gnu/libc.so.6(on_exit+0) [0x7ffb17f7dbe0]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfa) [0x7ffb17f5b0ba]
> /home//miniconda3/envs/spliceai_env/bin/python3.9(+0x20aa51) 
> [0x562576609a51]
> /bin/bash: line 1: 2341494 Aborted                 (core dumped)
> {code}
> My colleague ran into the same issue on Centos 8 while running the same job + 
> same environment on SLURM, so I guess it could be some issue with tensorflow 
> + pyarrow.
> Also I found a github issue with multiple people running into the same issue:
> [https://github.com/huggingface/datasets/issues/3310]
>  
> It would be very important to my lab that this bug gets resolved, as we 
> cannot work with parquet any more. Unfortunately, we do not have the 
> knowledge to fix it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15024) [C++] Link failure when using google-cloud-cpp from conda-forge

2021-12-08 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17455228#comment-17455228
 ] 

Uwe Korn commented on ARROW-15024:
--

Looks like we are missing one of the {{libgoogle_cloud_cpp_*.so.1.34.1}} 
libraries; sadly, I have no idea which one.
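
A quick sketch for narrowing it down, assuming binutils' {{nm}} is on PATH and 
reusing the library path from the linker command quoted below:
{code:python}
# Sketch: search the installed google-cloud-cpp libraries for one of the
# undefined symbols from the link error.
import glob
import subprocess

symbol = "MakeInsecureCredentials"
pattern = "/home/antoine/miniconda3/envs/pyarrow/lib/libgoogle_cloud_cpp_*.so*"
for lib in sorted(glob.glob(pattern)):
    out = subprocess.run(["nm", "-D", "--defined-only", lib],
                         capture_output=True, text=True).stdout
    if symbol in out:
        print(lib)
{code}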

> [C++] Link failure when using google-cloud-cpp from conda-forge
> ---
>
> Key: ARROW-15024
> URL: https://issues.apache.org/jira/browse/ARROW-15024
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Major
>
> {code}
> [125/169] Linking CXX executable debug/arrow-gcsfs-test
> FAILED: debug/arrow-gcsfs-test 
> : && /usr/bin/ccache /usr/bin/g++-9 -Wno-noexcept-type  
> -fdiagnostics-color=always -fuse-ld=gold -ggdb -O0  -Wall 
> -fno-semantic-interposition -msse4.2  -D_GLIBCXX_USE_CXX11_ABI=1 
> -D_GLIBCXX_USE_CXX11_ABI=1 -fno-omit-frame-pointer -g  
> src/arrow/filesystem/CMakeFiles/arrow-gcsfs-test.dir/gcsfs_test.cc.o -o 
> debug/arrow-gcsfs-test  
> -Wl,-rpath,/home/antoine/arrow/dev/cpp/build-test/debug:/home/antoine/miniconda3/envs/pyarrow/lib
>   debug/libarrow_testing.so.700.0.0  debug/libarrow.so.700.0.0  
> /home/antoine/miniconda3/envs/pyarrow/lib/libcrypto.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libssl.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libbrotlienc.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libbrotlidec.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libbrotlicommon.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/liborc.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libprotobuf.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libgoogle_cloud_cpp_storage.so.1.34.1
>   /home/antoine/miniconda3/envs/pyarrow/lib/libutf8proc.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libre2.so.9.0.0  -ldl  
> /home/antoine/miniconda3/envs/pyarrow/lib/libgtest_main.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libgmock.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libboost_filesystem.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libboost_system.so  -ldl  
> /home/antoine/miniconda3/envs/pyarrow/lib/libssl.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libcrypto.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_str_format_internal.so.2103.0.1
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/libgoogle_cloud_cpp_common.so.1.34.1
>   /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_time.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_strings.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_strings_internal.so.2103.0.1
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_throw_delegate.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_base.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_spinlock_wait.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_int128.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_civil_time.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_time_zone.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_bad_variant_access.so.2103.0.1
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_bad_optional_access.so.2103.0.1
>   
> /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_raw_logging_internal.so.2103.0.1
>   /home/antoine/miniconda3/envs/pyarrow/lib/libabsl_log_severity.so.2103.0.1  
> /home/antoine/miniconda3/envs/pyarrow/lib/libcrc32c.so.1.1.0  
> /home/antoine/miniconda3/envs/pyarrow/lib/libcurl.so  
> /home/antoine/miniconda3/envs/pyarrow/lib/libz.so  
> jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a  
> mimalloc_ep/src/mimalloc_ep/lib/mimalloc-1.7/libmimalloc-debug.a  -lrt  
> /home/antoine/miniconda3/envs/pyarrow/lib/libgtest.so  -pthread && :
> /home/antoine/arrow/dev/cpp/src/arrow/filesystem/gcsfs_test.cc:91: error: 
> undefined reference to 'google::cloud::v1::MakeInsecureCredentials()'
> /home/antoine/arrow/dev/cpp/src/arrow/filesystem/gcsfs_test.cc:91: error: 
> undefined reference to 
> 'google::cloud::storage::v1::Client::Client(google::cloud::v1::Options)'
> /home/antoine/arrow/dev/cpp/src/arrow/filesystem/gcsfs_test.cc:128: error: 
> undefined reference to 'google::cloud::v1::MakeInsecureCredentials()'
> /home/antoine/arrow/dev/cpp/src/arrow/filesystem/gcsfs_test.cc:128: error: 
> undefined reference to 
> 'google::cloud::storage::v1::Client::Client(google::cloud::v1::Options)'
> /home/antoine/arrow/dev/cpp/src/arrow/filesystem/gcsfs_test.cc:337: error: 
> undefined reference to 
> 'absl::lts_20210324::ParseTime(absl::lts_20210324::string_view, 
> absl::lts_20210324::string_view, absl::lts_20210324::Time*, 
> std::__cxx11::basic_string, std::allocator 
> >*)'
> /home/antoine/arrow/dev/cpp/src/arrow/filesystem/gcsfs_test.cc:338: 

[jira] [Commented] (ARROW-14256) [CI][Packaging] conda build failures

2021-10-20 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-14256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17431428#comment-17431428
 ] 

Uwe Korn commented on ARROW-14256:
--

Upstream issue: https://github.com/conda-forge/conda-forge.github.io/issues/1528

> [CI][Packaging] conda build failures
> 
>
> Key: ARROW-14256
> URL: https://issues.apache.org/jira/browse/ARROW-14256
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration, Packaging
>Reporter: Antoine Pitrou
>Priority: Critical
> Fix For: 7.0.0
>
>
> It seems many of the conda packaging nightly builds are failing due to conda 
> dependency resolution errors, example here:
> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=12793&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=3049



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-14323) [Python] Import error (ppc64le) - Illegal instruction (core dumped)

2021-10-14 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428969#comment-17428969
 ] 

Uwe Korn commented on ARROW-14323:
--

You could also try reaching out on https://gitter.im/conda-forge-ppc64le/Lobby 
to see whether anyone there has an idea.

> [Python] Import error (ppc64le) - Illegal instruction (core dumped)
> ---
>
> Key: ARROW-14323
> URL: https://issues.apache.org/jira/browse/ARROW-14323
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 5.0.0
> Environment: Platform: Red Hat Enterprise Linux 8.2 (Ootpa) - PPC64le
> Python version: 3.6
> PyArrow version: pyarrow - 5.0.0 - 
> conda-forge/linux-ppc64le::pyarrow-5.0.0-py36h7a46c7e_8_cpu
>Reporter: Gerardo Cervantes
>Priority: Critical
>
> h2. Description
> The bug occurs when Installing the PyArrow library with Conda.
> After trying to import with Python, this error shows up:
>  
> {code:java}
> Illegal instruction (core dumped){code}
>  
> h2. Environment
>  * Platform: Red Hat Enterprise Linux 8.2 (Ootpa) - PPC64le
>  * Python version: 3.6
>  * PyArrow version: pyarrow - 5.0.0 - 
> conda-forge/linux-ppc64le::pyarrow-5.0.0-py36h7a46c7e_8_cpu
> h2. Steps to reproduce the bug
> Installing with Conda and then trying to import the package through Python 
> gives the error. 
> {code:java}
> conda create --name pyarrow_py36 python=3.6
> conda activate pyarrow_py36
> conda install pyarrow
> {code}
> h2. Tracebacks
> conda create --name pyarrow_py36 python=3.6
>  
> {code:java}
> Collecting package metadata (current_repodata.json): done
> Solving environment: done
> ==> WARNING: A newer version of conda exists. <==
>  current version: 4.9.2
>  latest version: 4.10.3
> Please update conda by running
> $ conda update -n base -c defaults conda
>  
> ## Package Plan ##
> environment location: /p/home/gerryc/.conda/envs/pyarrow_py36
> added / updated specs:
>  - python=3.6
> The following NEW packages will be INSTALLED:
> _libgcc_mutex conda-forge/linux-ppc64le::_libgcc_mutex-0.1-conda_forge
>  _openmp_mutex conda-forge/linux-ppc64le::_openmp_mutex-4.5-1_gnu
>  ca-certificates 
> conda-forge/linux-ppc64le::ca-certificates-2021.10.8-h1084571_0
>  certifi pkgs/main/linux-ppc64le::certifi-2020.12.5-py36h6ffa863_0
>  ld_impl_linux-ppc~ 
> conda-forge/linux-ppc64le::ld_impl_linux-ppc64le-2.36.1-ha35d02b_2
>  libffi conda-forge/linux-ppc64le::libffi-3.4.2-h3b9df90_4
>  libgcc-ng conda-forge/linux-ppc64le::libgcc-ng-11.2.0-h7698a5e_11
>  libgomp conda-forge/linux-ppc64le::libgomp-11.2.0-h7698a5e_11
>  libstdcxx-ng conda-forge/linux-ppc64le::libstdcxx-ng-11.2.0-habdf983_11
>  libzlib conda-forge/linux-ppc64le::libzlib-1.2.11-h339bb43_1013
>  ncurses conda-forge/linux-ppc64le::ncurses-6.2-hea85c5d_4
>  openssl conda-forge/linux-ppc64le::openssl-1.1.1l-h4e0d66e_0
>  pip conda-forge/noarch::pip-21.3-pyhd8ed1ab_0
>  python conda-forge/linux-ppc64le::python-3.6.13-h57873ef_2_cpython
>  readline conda-forge/linux-ppc64le::readline-8.1-h5c45dff_0
>  setuptools pkgs/main/linux-ppc64le::setuptools-58.0.4-py36h6ffa863_0
>  sqlite conda-forge/linux-ppc64le::sqlite-3.36.0-h4e2196e_2
>  tk conda-forge/linux-ppc64le::tk-8.6.11-h41c6715_1
>  wheel conda-forge/noarch::wheel-0.37.0-pyhd8ed1ab_1
>  xz conda-forge/linux-ppc64le::xz-5.2.5-h6eb9509_1
>  zlib conda-forge/linux-ppc64le::zlib-1.2.11-h339bb43_1013
> Proceed ([y]/n)? y
> Preparing transaction: done
> Verifying transaction: done
> Executing transaction: done
> #
> # To activate this environment, use
> #
> # $ conda activate pyarrow_py36
> #
> # To deactivate an active environment, use
> #
> # $ conda deactivate
> {code}
>  
> conda activate pyarrow_py36
> conda install pyarrow
>  
> {code:java}
> Collecting package metadata (current_repodata.json): done
> Solving environment: failed with initial frozen solve. Retrying with flexible 
> solve.
> Solving environment: failed with repodata from current_repodata.json, will 
> retry with next repodata source.
> Collecting package metadata (repodata.json): done
> Solving environment: done
> ==> WARNING: A newer version of conda exists. <==
>  current version: 4.9.2
>  latest version: 4.10.3
> Please update conda by running
> $ conda update -n base -c defaults conda
>  
> ## Package Plan ##
> environment location: /p/home/gerryc/.conda/envs/pyarrow_py36
> added / updated specs:
>  - pyarrow
> The following NEW packages will be INSTALLED:
> abseil-cpp conda-forge/linux-ppc64le::abseil-cpp-20210324.2-h3b9df90_0
>  arrow-cpp conda-forge/linux-ppc64le::arrow-cpp-5.0.0-py36hf9cf308_8_cpu
>  aws-c-cal conda-forge/linux-ppc64le::aws-c-cal-0.5.11-hb3fac3d_0
>  aws-c-common conda-forge/linux-ppc64le::aws-c-common-0.6.2-h4e0d66e_0
>  aws-c-event-stream 
> conda-forge/linux-ppc64l

[jira] [Commented] (ARROW-14323) [Python] Import error (ppc64le) - Illegal instruction (core dumped)

2021-10-14 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428968#comment-17428968
 ] 

Uwe Korn commented on ARROW-14323:
--

This is weird; I would guess that this is happening in the allocation step, but 
I'm out of ideas as to what could cause it.
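
If someone with access to such a machine wants to test the allocation theory, a 
sketch, assuming the {{ARROW_DEFAULT_MEMORY_POOL}} environment variable is 
honored by this build, would be to force the plain system allocator before 
import:
{code:python}
# Sketch: rule the bundled allocators (jemalloc/mimalloc) in or out by
# forcing the system allocator; must be set before pyarrow is imported.
import os
os.environ["ARROW_DEFAULT_MEMORY_POOL"] = "system"

import pyarrow as pa
print(pa.default_memory_pool().backend_name)  # should report "system"
buf = pa.allocate_buffer(1024)                # exercises the allocator
{code}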

> [Python] Import error (ppc64le) - Illegal instruction (core dumped)
> ---
>
> Key: ARROW-14323
> URL: https://issues.apache.org/jira/browse/ARROW-14323
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 5.0.0
> Environment: Platform: Red Hat Enterprise Linux 8.2 (Ootpa) - PPC64le
> Python version: 3.6
> PyArrow version: pyarrow - 5.0.0 - 
> conda-forge/linux-ppc64le::pyarrow-5.0.0-py36h7a46c7e_8_cpu
>Reporter: Gerardo Cervantes
>Priority: Critical
>
> h2. Description
> The bug occurs when Installing the PyArrow library with Conda.
> After trying to import with Python, this error shows up:
>  
> {code:java}
> Illegal instruction (core dumped){code}
>  
> h2. Environment
>  * Platform: Red Hat Enterprise Linux 8.2 (Ootpa) - PPC64le
>  * Python version: 3.6
>  * PyArrow version: pyarrow - 5.0.0 - 
> conda-forge/linux-ppc64le::pyarrow-5.0.0-py36h7a46c7e_8_cpu
> h2. Steps to reproduce the bug
> Installing with Conda and then trying to import the package through Python 
> gives the error. 
> {code:java}
> conda create --name pyarrow_py36 python=3.6
> conda activate pyarrow_py36
> conda install pyarrow
> {code}
> h2. Tracebacks
> conda create --name pyarrow_py36 python=3.6
>  
> {code:java}
> Collecting package metadata (current_repodata.json): done
> Solving environment: done
> ==> WARNING: A newer version of conda exists. <==
>  current version: 4.9.2
>  latest version: 4.10.3
> Please update conda by running
> $ conda update -n base -c defaults conda
>  
> ## Package Plan ##
> environment location: /p/home/gerryc/.conda/envs/pyarrow_py36
> added / updated specs:
>  - python=3.6
> The following NEW packages will be INSTALLED:
> _libgcc_mutex conda-forge/linux-ppc64le::_libgcc_mutex-0.1-conda_forge
>  _openmp_mutex conda-forge/linux-ppc64le::_openmp_mutex-4.5-1_gnu
>  ca-certificates 
> conda-forge/linux-ppc64le::ca-certificates-2021.10.8-h1084571_0
>  certifi pkgs/main/linux-ppc64le::certifi-2020.12.5-py36h6ffa863_0
>  ld_impl_linux-ppc~ 
> conda-forge/linux-ppc64le::ld_impl_linux-ppc64le-2.36.1-ha35d02b_2
>  libffi conda-forge/linux-ppc64le::libffi-3.4.2-h3b9df90_4
>  libgcc-ng conda-forge/linux-ppc64le::libgcc-ng-11.2.0-h7698a5e_11
>  libgomp conda-forge/linux-ppc64le::libgomp-11.2.0-h7698a5e_11
>  libstdcxx-ng conda-forge/linux-ppc64le::libstdcxx-ng-11.2.0-habdf983_11
>  libzlib conda-forge/linux-ppc64le::libzlib-1.2.11-h339bb43_1013
>  ncurses conda-forge/linux-ppc64le::ncurses-6.2-hea85c5d_4
>  openssl conda-forge/linux-ppc64le::openssl-1.1.1l-h4e0d66e_0
>  pip conda-forge/noarch::pip-21.3-pyhd8ed1ab_0
>  python conda-forge/linux-ppc64le::python-3.6.13-h57873ef_2_cpython
>  readline conda-forge/linux-ppc64le::readline-8.1-h5c45dff_0
>  setuptools pkgs/main/linux-ppc64le::setuptools-58.0.4-py36h6ffa863_0
>  sqlite conda-forge/linux-ppc64le::sqlite-3.36.0-h4e2196e_2
>  tk conda-forge/linux-ppc64le::tk-8.6.11-h41c6715_1
>  wheel conda-forge/noarch::wheel-0.37.0-pyhd8ed1ab_1
>  xz conda-forge/linux-ppc64le::xz-5.2.5-h6eb9509_1
>  zlib conda-forge/linux-ppc64le::zlib-1.2.11-h339bb43_1013
> Proceed ([y]/n)? y
> Preparing transaction: done
> Verifying transaction: done
> Executing transaction: done
> #
> # To activate this environment, use
> #
> # $ conda activate pyarrow_py36
> #
> # To deactivate an active environment, use
> #
> # $ conda deactivate
> {code}
>  
> conda activate pyarrow_py36
> conda install pyarrow
>  
> {code:java}
> Collecting package metadata (current_repodata.json): done
> Solving environment: failed with initial frozen solve. Retrying with flexible 
> solve.
> Solving environment: failed with repodata from current_repodata.json, will 
> retry with next repodata source.
> Collecting package metadata (repodata.json): done
> Solving environment: done
> ==> WARNING: A newer version of conda exists. <==
>  current version: 4.9.2
>  latest version: 4.10.3
> Please update conda by running
> $ conda update -n base -c defaults conda
>  
> ## Package Plan ##
> environment location: /p/home/gerryc/.conda/envs/pyarrow_py36
> added / updated specs:
>  - pyarrow
> The following NEW packages will be INSTALLED:
> abseil-cpp conda-forge/linux-ppc64le::abseil-cpp-20210324.2-h3b9df90_0
>  arrow-cpp conda-forge/linux-ppc64le::arrow-cpp-5.0.0-py36hf9cf308_8_cpu
>  aws-c-cal conda-forge/linux-ppc64le::aws-c-cal-0.5.11-hb3fac3d_0
>  aws-c-common conda-forge/linux-ppc64le::aws-c-common-0.6.2-h4e0d66e_0
>  aws-c-event-stream 
> conda-forge

[jira] [Commented] (ARROW-14256) [CI][Packaging] conda build failures

2021-10-14 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-14256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428829#comment-17428829
 ] 

Uwe Korn commented on ARROW-14256:
--

This seems to be a general conda/conda-build issue with the release of OpenSSL 
3.0. Not all dependencies of arrow-cpp support it yet, but conda aggressively 
updates to OpenSSL 3.0 and thus generates the conflicts.

> [CI][Packaging] conda build failures
> 
>
> Key: ARROW-14256
> URL: https://issues.apache.org/jira/browse/ARROW-14256
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration, Packaging
>Reporter: Antoine Pitrou
>Priority: Critical
> Fix For: 6.0.0
>
>
> It seems many of the conda packaging nightly builds are failing due to conda 
> dependency resolution errors, example here:
> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=12793&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=3049



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-13743) [CI] OSX job fails due to incompatible git and libcurl

2021-09-02 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408916#comment-17408916
 ] 

Uwe Korn commented on ARROW-13743:
--

The issue here is that {{git}} is pulled from {{pkgs/main}} and not from 
{{conda-forge}}. You should switch to {{channel_priority: strict}} in the conda 
configuration (e.g. via {{conda config --set channel_priority strict}}) to 
avoid channel clashes.

> [CI] OSX job fails due to incompatible git and libcurl
> --
>
> Key: ARROW-13743
> URL: https://issues.apache.org/jira/browse/ARROW-13743
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration
>Reporter: Yibo Cai
>Priority: Major
> Fix For: 6.0.0
>
>
> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=10372&view=logs&j=cf796865-97b7-5cd1-be8e-6e00ce4fd8cf&t=9f7de14c-8ff0-55c4-a998-d852f888262c&l=15
> [NIGHTLY] Arrow Build Report for Job nightly-2021-08-24-0
> https://www.mail-archive.com/builds@arrow.apache.org/msg00109.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-13583) Create wheels for M1 Silicon for older versions

2021-08-07 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395467#comment-17395467
 ] 

Uwe Korn commented on ARROW-13583:
--

We have builds for older pyarrow versions for the M1 on conda-forge. You can 
either switch to using conda/mamba as a package manager here (e.g. 
{{mamba install "pyarrow=3.0.*"}} from conda-forge) or look up the patches 
that we use there to build the older versions.

> Create wheels for M1 Silicon for older versions
> ---
>
> Key: ARROW-13583
> URL: https://issues.apache.org/jira/browse/ARROW-13583
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Affects Versions: 3.0.0
>Reporter: Michelangelo D'Agostino
>Priority: Minor
>
> The Snowflake data warehouse has a python client that requires an older 
> version of pyarrow, >=3.0.0,<3.1.0:
> [https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#requirements]
> I've tried everything I can think of or read about to get the older versions 
> of pyarrow working on my M1 but to no avail.  Snowflake is saying they won't 
> update their connector until September:
>  
> [https://github.com/snowflakedb/snowflake-connector-python/issues/815] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-13134) [C++] arrow-s3fs-test fails or hangs with aws-sdk-cpp 1.9.45

2021-07-02 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-13134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373466#comment-17373466
 ] 

Uwe Korn commented on ARROW-13134:
--

Merged 1.9.51 on conda-forge, will be available in 60-90min

> [C++] arrow-s3fs-test fails or hangs with aws-sdk-cpp 1.9.45
> 
>
> Key: ARROW-13134
> URL: https://issues.apache.org/jira/browse/ARROW-13134
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration
>Reporter: Antoine Pitrou
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> See possible hang on AppVeyor:
> [https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/39683796/job/olkossi0bpq4k186?fullLog=true]
> Running locally on Ubuntu produces multiple failures and then a crash:
> https://gist.github.com/pitrou/eccdffa2161f7257bead90c63c10037c



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-13209) [Python][CI] macOS wheel builds should raise on linker warnings

2021-06-29 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371302#comment-17371302
 ] 

Uwe Korn commented on ARROW-13209:
--

I have no idea besides grep'ing the build log.
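
The grep approach could at least be wired into CI as a small check; a minimal 
sketch, assuming the wheel build writes its output to a log file ({{build.log}} 
is a placeholder path):
{code:python}
# Sketch: fail the job if the macOS linker emitted version-mismatch warnings.
import re
import sys

with open("build.log") as f:
    hits = [line for line in f
            if re.search(r"ld: warning: .*built for newer macOS version", line)]

if hits:
    sys.stderr.writelines(hits)
    sys.exit(1)
{code}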

> [Python][CI] macOS wheel builds should raise on linker warnings
> ---
>
> Key: ARROW-13209
> URL: https://issues.apache.org/jira/browse/ARROW-13209
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, Python
>Reporter: Krisztian Szucs
>Priority: Major
>
> In order to prevent similar issues to 
> https://issues.apache.org/jira/browse/ARROW-13108 we need to halt the macOS 
> wheel build on linker warnings like the following:
> - ld: warning: object file 
> (/usr/local/Cellar/brotli/1.0.9/lib/libbrotlidec-static.a(decode.c.o)) was 
> built for newer macOS version (10.15) than being linked (10.13)
> - ld: warning: object file 
> (/usr/local/Cellar/brotli/1.0.9/lib/libbrotlienc-static.a(encode.c.o)) was 
> built for newer macOS version (10.15) than being linked (10.13)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-13108) [Python] Pyarrow 4.0.0 crashes upon import on macOS 10.13.6

2021-06-23 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368245#comment-17368245
 ] 

Uwe Korn commented on ARROW-13108:
--

From the Snowflake CI log, it seems that the latest wheels require macOS 
10.15+. [~kszucs] Maybe this is even intentional?
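
A sketch for verifying what a wheel was actually linked against, assuming 
macOS's {{otool}} is available; the dylib path is taken from the traceback 
quoted below:
{code:python}
# Sketch: print the version-related load commands of the bundled libarrow,
# which reveal the macOS deployment target the wheel was built for.
import subprocess

dylib = ("/Users/Shared/Jenkins/Home/workspace/BuildPyConnector-Mac/venv-36/"
         "lib/python3.6/site-packages/pyarrow/libarrow.400.dylib")
out = subprocess.run(["otool", "-l", dylib],
                     capture_output=True, text=True).stdout
for line in out.splitlines():
    if "minos" in line or "version" in line or "sdk" in line:
        print(line.strip())
{code}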

> [Python] Pyarrow 4.0.0 crashes upon import on macOS 10.13.6
> ---
>
> Key: ARROW-13108
> URL: https://issues.apache.org/jira/browse/ARROW-13108
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 4.0.0, 4.0.1
>Reporter: Mark Keller
>Priority: Major
> Fix For: 5.0.0
>
>
> Our Jenkins worker that we use for building `snowflake-connector-python` has 
> the following setup:
>  
> {code:java}
> $ uname -a
> Darwin imac.local 17.7.0 Darwin Kernel Version 17.7.0: Fri Jul  6 19:54:51 
> PDT 2018; root:xnu-4570.71.3~2/RELEASE_X86_64 x86_64
> $ python --version --version
> Python 3.6.8 (v3.6.8:3c6b436a57, Dec 24 2018, 02:04:31) 
> [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
> $ pip list
> PackageVersion
> -- ---
> Cython 0.29.23
> numpy  1.19.5
> pip21.1.2
> pyarrow4.0.0
> setuptools 57.0.0
> wheel  0.36.2
> {code}
> This is in a completely new venv.
> Then after installing these dependencies see the issue here:
> {code:java}
> $ python -c "import pyarrow"
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/Users/Shared/Jenkins/Home/workspace/BuildPyConnector-Mac/venv-36/lib/python3.6/site-packages/pyarrow/__init__.py",
>  line 63, in 
> import pyarrow.lib as _lib
> ImportError: 
> dlopen(/Users/Shared/Jenkins/Home/workspace/BuildPyConnector-Mac/venv-36/lib/python3.6/site-packages/pyarrow/lib.cpython-36m-darwin.so,
>  2): Symbol not found: chkstk_darwin
>   Referenced from: 
> /Users/Shared/Jenkins/Home/workspace/BuildPyConnector-Mac/venv-36/lib/python3.6/site-packages/pyarrow/libarrow.400.dylib
>   Expected in: /usr/lib/libSystem.B.dylib
>  in 
> /Users/Shared/Jenkins/Home/workspace/BuildPyConnector-Mac/venv-36/lib/python3.6/site-packages/pyarrow/libarrow.400.dylib
> {code}
> I'm sorry I'm not too sure what could be causing this, but please see what 
> Uwe said here: 
> [https://github.com/snowflakedb/snowflake-connector-python/pull/762#issuecomment-863531840]
>  
> I'd be happy to help you test a potential fix if you don't have a machine 
> with such an old MacOS version
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-13140) [C++/Python] Upgrade libthrift pin in the nightlies

2021-06-22 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-13140:


 Summary: [C++/Python] Upgrade libthrift pin in the nightlies
 Key: ARROW-13140
 URL: https://issues.apache.org/jira/browse/ARROW-13140
 Project: Apache Arrow
  Issue Type: Task
  Components: C++, Packaging, Python
Reporter: Uwe Korn
Assignee: Uwe Korn






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-13134) [C++] arrow-s3fs-test fails or hangs with aws-sdk-cpp 1.9.39

2021-06-21 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-13134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366615#comment-17366615
 ] 

Uwe Korn commented on ARROW-13134:
--

aws-sdk-cpp 1.9.44 should be available in ~30min, so you could also give that a 
try then.

> [C++] arrow-s3fs-test fails or hangs with aws-sdk-cpp 1.9.39
> 
>
> Key: ARROW-13134
> URL: https://issues.apache.org/jira/browse/ARROW-13134
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Major
>
> See possible hang on AppVeyor:
> [https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/39683796/job/olkossi0bpq4k186?fullLog=true]
> Running locally on Ubuntu produces multiple failures and then a crash:
> https://gist.github.com/pitrou/eccdffa2161f7257bead90c63c10037c



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-12738) [CI] [Gandiva] Nightly build error in azure-conda-osx-clang-py38 (and py39, py*-r*)

2021-06-10 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-12738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-12738.
--
Fix Version/s: 5.0.0
   Resolution: Fixed

Issue resolved by pull request 10499
[https://github.com/apache/arrow/pull/10499]

> [CI] [Gandiva] Nightly build error in azure-conda-osx-clang-py38 (and py39, 
> py*-r*)
> ---
>
> Key: ARROW-12738
> URL: https://issues.apache.org/jira/browse/ARROW-12738
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++ - Gandiva, Continuous Integration
>Reporter: Mauricio 'Pachá' Vargas Sepúlveda
>Assignee: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 5.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> These are all failing because of a mismatch in LLVM version; they all have 
> some variation on the following when referencing the precompiled gandiva:
> {code}
> Unknown attribute kind (70) (Producer: 'LLVM12.0.0' Reader: 'LLVM 10.0.1')
> {code}
> It _looks like_ the precompiled Gandiva LLVM version might be coming from 
> the {{.env}} file ({{LLVM=12}}).
> Examples:
> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=6219&view=logs&j=cf796865-97b7-5cd1-be8e-6e00ce4fd8cf&t=88ee2fb8-46fd-5c68-fde3-1c8d31ba2a5f&l=1069
> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=6250&view=logs&j=cf796865-97b7-5cd1-be8e-6e00ce4fd8cf&t=88ee2fb8-46fd-5c68-fde3-1c8d31ba2a5f&l=1046
> For some of these, the build phase will pass (even though it did not 
> succeed) and the error is then "File ... does not exist". This is a red 
> herring; the build problem above is the real issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-9431) [C++/Python] Kernel for SetItem(IntegerArray, values)

2021-05-27 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352713#comment-17352713
 ] 

Uwe Korn commented on ARROW-9431:
-

{quote}[~uwe] What is the intended use case? {quote}

The intention was to support pandas' {code}def __setitem__(self, key, 
value){code} where key is an array and value is an array of equal length. In 
fletcher we currently roundtrip to {{numpy}} for that: 
https://github.com/xhochy/fletcher/blob/master/fletcher/base.py#L998-L1018 The 
biggest use case for this, though, is the ArrowStringArray that is currently 
being built inside of pandas.
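
For illustration, the numpy roundtrip that such a kernel would replace looks 
roughly like this (a sketch, not fletcher's exact code):
{code:python}
import numpy as np
import pyarrow as pa

arr = pa.array([1.0, 2.0, 3.0, 4.0])
key = np.array([0, 2])          # integer indexer
value = np.array([10.0, 30.0])  # values of equal length

np_values = arr.to_numpy(zero_copy_only=False).copy()
np_values[key] = value
arr = pa.array(np_values)       # full copy back into Arrow each time
{code}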

> [C++/Python] Kernel for SetItem(IntegerArray, values)
> -
>
> Key: ARROW-9431
> URL: https://issues.apache.org/jira/browse/ARROW-9431
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Affects Versions: 2.0.0
>Reporter: Uwe Korn
>Priority: Major
>
> We should have a kernel that allows overriding the values of an array using 
> an integer array as the indexer and a scalar or array of equal length as the 
> values.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-12649) [Python/Packaging] Move conda-aarch64 to Azure with cross-compilation

2021-05-07 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-12649.
--
Resolution: Fixed

Issue resolved by pull request 10243
[https://github.com/apache/arrow/pull/10243]

> [Python/Packaging] Move conda-aarch64 to Azure with cross-compilation
> -
>
> Key: ARROW-12649
> URL: https://issues.apache.org/jira/browse/ARROW-12649
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging, Python
>Reporter: Uwe Korn
>Assignee: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 5.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-12649) [Python/Packaging] Move conda-aarch64 to Azure with cross-compilation

2021-05-04 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-12649:


 Summary: [Python/Packaging] Move conda-aarch64 to Azure with 
cross-compilation
 Key: ARROW-12649
 URL: https://issues.apache.org/jira/browse/ARROW-12649
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging, Python
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 5.0.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-12420) [C++/Dataset] Reading null columns as dictionary no longer possible

2021-04-19 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324790#comment-17324790
 ] 

Uwe Korn commented on ARROW-12420:
--

Thanks [~kszucs]!

> [C++/Dataset] Reading null columns as dictionary no longer possible
> 
>
> Key: ARROW-12420
> URL: https://issues.apache.org/jira/browse/ARROW-12420
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Affects Versions: 4.0.0
>Reporter: Uwe Korn
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Reading a dataset with a dictionary column where some of the files don't 
> contain any data for that column (and thus are typed as null) broke with 
> https://github.com/apache/arrow/pull/9532. It worked with the 3.0 release 
> though and thus I would consider this a regression.
> This can be reproduced using the following Python snippet:
> {code}
> import pyarrow as pa
> import pyarrow.parquet as pq
> import pyarrow.dataset as ds
> table = pa.table({"a": [None, None]})
> pq.write_table(table, "test.parquet")
> schema = pa.schema([pa.field("a", pa.dictionary(pa.int32(), pa.string()))])
> fsds = ds.FileSystemDataset.from_paths(
> paths=["test.parquet"],
> schema=schema,
> format=pa.dataset.ParquetFileFormat(),
> filesystem=pa.fs.LocalFileSystem(),
> )
> fsds.to_table()
> {code}
> The exception on master is currently:
> {code}
> ---
> ArrowNotImplementedError  Traceback (most recent call last)
>  in 
>   6 filesystem=pa.fs.LocalFileSystem(),
>   7 )
> > 8 fsds.to_table()
> ~/Development/arrow/python/pyarrow/_dataset.pyx in 
> pyarrow._dataset.Dataset.to_table()
> 456 table : Table instance
> 457 """
> --> 458 return self._scanner(**kwargs).to_table()
> 459 
> 460 def head(self, int num_rows, **kwargs):
> ~/Development/arrow/python/pyarrow/_dataset.pyx in 
> pyarrow._dataset.Scanner.to_table()
>2887 result = self.scanner.ToTable()
>2888 
> -> 2889 return pyarrow_wrap_table(GetResultValue(result))
>2890 
>2891 def take(self, object indices):
> ~/Development/arrow/python/pyarrow/error.pxi in 
> pyarrow.lib.pyarrow_internal_check_status()
> 139 cdef api int pyarrow_internal_check_status(const CStatus& status) \
> 140 nogil except -1:
> --> 141 return check_status(status)
> 142 
> 143 
> ~/Development/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
> 116 raise ArrowKeyError(message)
> 117 elif status.IsNotImplemented():
> --> 118 raise ArrowNotImplementedError(message)
> 119 elif status.IsTypeError():
> 120 raise ArrowTypeError(message)
> ArrowNotImplementedError: Unsupported cast from null to 
> dictionary (no available cast 
> function for target type)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-12420) [C++/Dataset] Reading null columns as dictionary no longer possible

2021-04-16 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17323792#comment-17323792
 ] 

Uwe Korn commented on ARROW-12420:
--

cc [~bkietz] who wrote the PR that broke it ;)

> [C++/Dataset] Reading null columns as dictionary no longer possible
> 
>
> Key: ARROW-12420
> URL: https://issues.apache.org/jira/browse/ARROW-12420
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Affects Versions: 4.0.0
>Reporter: Uwe Korn
>Priority: Major
> Fix For: 4.0.0
>
>
> Reading a dataset with a dictionary column where some of the files don't 
> contain any data for that column (and thus are typed as null) broke with 
> https://github.com/apache/arrow/pull/9532. It worked with the 3.0 release 
> though and thus I would consider this a regression.
> This can be reproduced using the following Python snippet:
> {code}
> import pyarrow as pa
> import pyarrow.parquet as pq
> import pyarrow.dataset as ds
> table = pa.table({"a": [None, None]})
> pq.write_table(table, "test.parquet")
> schema = pa.schema([pa.field("a", pa.dictionary(pa.int32(), pa.string()))])
> fsds = ds.FileSystemDataset.from_paths(
> paths=["test.parquet"],
> schema=schema,
> format=pa.dataset.ParquetFileFormat(),
> filesystem=pa.fs.LocalFileSystem(),
> )
> fsds.to_table()
> {code}
> The exception on master is currently:
> {code}
> ---
> ArrowNotImplementedError  Traceback (most recent call last)
>  in 
>   6 filesystem=pa.fs.LocalFileSystem(),
>   7 )
> > 8 fsds.to_table()
> ~/Development/arrow/python/pyarrow/_dataset.pyx in 
> pyarrow._dataset.Dataset.to_table()
> 456 table : Table instance
> 457 """
> --> 458 return self._scanner(**kwargs).to_table()
> 459 
> 460 def head(self, int num_rows, **kwargs):
> ~/Development/arrow/python/pyarrow/_dataset.pyx in 
> pyarrow._dataset.Scanner.to_table()
>2887 result = self.scanner.ToTable()
>2888 
> -> 2889 return pyarrow_wrap_table(GetResultValue(result))
>2890 
>2891 def take(self, object indices):
> ~/Development/arrow/python/pyarrow/error.pxi in 
> pyarrow.lib.pyarrow_internal_check_status()
> 139 cdef api int pyarrow_internal_check_status(const CStatus& status) \
> 140 nogil except -1:
> --> 141 return check_status(status)
> 142 
> 143 
> ~/Development/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
> 116 raise ArrowKeyError(message)
> 117 elif status.IsNotImplemented():
> --> 118 raise ArrowNotImplementedError(message)
> 119 elif status.IsTypeError():
> 120 raise ArrowTypeError(message)
> ArrowNotImplementedError: Unsupported cast from null to 
> dictionary (no available cast 
> function for target type)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-12420) [C++/Dataset] Reading null columns as dictionary no longer possible

2021-04-16 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-12420:


 Summary: [C++/Dataset] Reading null columns as dictionary no 
longer possible
 Key: ARROW-12420
 URL: https://issues.apache.org/jira/browse/ARROW-12420
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Affects Versions: 4.0.0
Reporter: Uwe Korn
 Fix For: 4.0.0


Reading a dataset with a dictionary column where some of the files don't 
contain any data for that column (and thus are typed as null) broke with 
https://github.com/apache/arrow/pull/9532. It worked with the 3.0 release 
though and thus I would consider this a regression.

This can be reproduced using the following Python snippet:

{code}
import pyarrow as pa
import pyarrow.parquet as pq
import pyarrow.dataset as ds

table = pa.table({"a": [None, None]})
pq.write_table(table, "test.parquet")
schema = pa.schema([pa.field("a", pa.dictionary(pa.int32(), pa.string()))])
fsds = ds.FileSystemDataset.from_paths(
paths=["test.parquet"],
schema=schema,
format=pa.dataset.ParquetFileFormat(),
filesystem=pa.fs.LocalFileSystem(),
)
fsds.to_table()
{code}

The exception on master is currently:

{code}
---
ArrowNotImplementedError  Traceback (most recent call last)
 in 
  6 filesystem=pa.fs.LocalFileSystem(),
  7 )
> 8 fsds.to_table()

~/Development/arrow/python/pyarrow/_dataset.pyx in 
pyarrow._dataset.Dataset.to_table()
456 table : Table instance
457 """
--> 458 return self._scanner(**kwargs).to_table()
459 
460 def head(self, int num_rows, **kwargs):

~/Development/arrow/python/pyarrow/_dataset.pyx in 
pyarrow._dataset.Scanner.to_table()
   2887 result = self.scanner.ToTable()
   2888 
-> 2889 return pyarrow_wrap_table(GetResultValue(result))
   2890 
   2891 def take(self, object indices):

~/Development/arrow/python/pyarrow/error.pxi in 
pyarrow.lib.pyarrow_internal_check_status()
139 cdef api int pyarrow_internal_check_status(const CStatus& status) \
140 nogil except -1:
--> 141 return check_status(status)
142 
143 

~/Development/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
116 raise ArrowKeyError(message)
117 elif status.IsNotImplemented():
--> 118 raise ArrowNotImplementedError(message)
119 elif status.IsTypeError():
120 raise ArrowTypeError(message)

ArrowNotImplementedError: Unsupported cast from null to 
dictionary (no available cast function 
for target type)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-8284) [C++][Dataset] Schema evolution for timestamp columns

2021-04-15 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-8284.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> [C++][Dataset] Schema evolution for timestamp columns
> -
>
> Key: ARROW-8284
> URL: https://issues.apache.org/jira/browse/ARROW-8284
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: dataset
> Fix For: 4.0.0
>
>
> In a dataset, one can have timestamp columns with different resolutions. 
> There should be an option to cast all timestamps to the type mentioned in 
> the schema. A typical example could be that we store a pandas DataFrame with 
> {{ns}} precision to Parquet files that only support {{us}} resolution in 
> their most widespread form. Then the dataset schema and the actual file 
> content don't match anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8284) [C++][Dataset] Schema evolution for timestamp columns

2021-04-15 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17322141#comment-17322141
 ] 

Uwe Korn commented on ARROW-8284:
-

Tested this with the following code to verify that this is fixed in master:

{code}
import pyarrow as pa
import pyarrow.parquet as pq
import pyarrow.dataset as ds
import pandas as pd
import numpy as np

table1 = pa.table({"timestamp": pa.array([0, 1, 2], type=pa.timestamp("us"))})
pq.write_table(table1, "ts1.parquet")
table2 = pa.table({"timestamp": pa.array([1, 10, 200], 
type=pa.timestamp("ms"))})
pq.write_table(table2, "ts2.parquet")

dataset = ds.FileSystemDataset.from_paths(
paths=["ts1.parquet", "ts2.parquet"],
schema=table1.schema,
format=pa.dataset.ParquetFileFormat(),
filesystem=pa.fs.LocalFileSystem()
)
dataset.to_table().to_pandas()
{code}

> [C++][Dataset] Schema evolution for timestamp columns
> -
>
> Key: ARROW-8284
> URL: https://issues.apache.org/jira/browse/ARROW-8284
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: dataset
>
> In a dataset, one can have timestamp columns with different resolutions. 
> There should be an option to cast all timestamps to the type mentioned in 
> the schema. A typical example could be that we store a pandas DataFrame with 
> {{ns}} precision to Parquet files that only support {{us}} resolution in 
> their most widespread form. Then the dataset schema and the actual file 
> content don't match anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-8282) [C++/Python][Dataset] Support schema evolution for integer columns

2021-04-15 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-8282.
-
Resolution: Fixed

This has been resolved on master in the meantime, thus this will work starting 
with the 4.0 release.

> [C++/Python][Dataset] Support schema evolution for integer columns
> --
>
> Key: ARROW-8282
> URL: https://issues.apache.org/jira/browse/ARROW-8282
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: dataset
> Fix For: 4.0.0
>
>
> When reading in a dataset where the schema specifies that column X is of type 
> {{int64}} but the partition actually contains the data stored in that column 
> as {{int32}}, an upcast should be done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8282) [C++/Python][Dataset] Support schema evolution for integer columns

2021-04-15 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn updated ARROW-8282:

Fix Version/s: (was: 5.0.0)
   4.0.0

> [C++/Python][Dataset] Support schema evolution for integer columns
> --
>
> Key: ARROW-8282
> URL: https://issues.apache.org/jira/browse/ARROW-8282
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: dataset
> Fix For: 4.0.0
>
>
> When reading in a dataset where the schema specifies that column X is of type 
> {{int64}} but the partition actually contains the data stored in that column 
> as {{int32}}, an upcast should be done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-12246) [CI] Sync conda recipes with upstream feedstock

2021-04-15 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-12246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-12246.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 9923
[https://github.com/apache/arrow/pull/9923]

> [CI] Sync conda recipes with upstream feedstock
> ---
>
> Key: ARROW-12246
> URL: https://issues.apache.org/jira/browse/ARROW-12246
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Continuous Integration
>Reporter: Joris Van den Bossche
>Assignee: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ARROW-12246) [CI] Sync conda recipes with upstream feedstock

2021-04-15 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-12246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn reassigned ARROW-12246:


Assignee: Uwe Korn  (was: Joris Van den Bossche)

> [CI] Sync conda recipes with upstream feedstock
> ---
>
> Key: ARROW-12246
> URL: https://issues.apache.org/jira/browse/ARROW-12246
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Continuous Integration
>Reporter: Joris Van den Bossche
>Assignee: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-12316) [C++] Switch default memory allocator from jemalloc to mimalloc

2021-04-11 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-12316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318907#comment-17318907
 ] 

Uwe Korn commented on ARROW-12316:
--

[~npr] Where can I find these benchmarks?
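
For context, the allocator can already be selected explicitly from Python, so 
individual results are easy to reproduce once the benchmarks are located. A 
minimal sketch (assuming a pyarrow build with mimalloc enabled):

{code}
import pyarrow as pa

# Backend that was compiled in as the default, e.g. "jemalloc".
print(pa.default_memory_pool().backend_name)

# Allocate explicitly from the mimalloc pool instead.
pool = pa.mimalloc_memory_pool()
buf = pa.allocate_buffer(1 << 20, memory_pool=pool)
print(pool.bytes_allocated())  # 1048576
{code}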

> [C++] Switch default memory allocator from jemalloc to mimalloc
> ---
>
> Key: ARROW-12316
> URL: https://issues.apache.org/jira/browse/ARROW-12316
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Neal Richardson
>Priority: Major
> Fix For: 4.0.0
>
>
> Benchmarking shows that mimalloc seems to be faster on real workflows (at 
> least on macOS, still collecting data on Ubuntu). We could switch the default 
> memory pool cases so that mimalloc is preferred. 
> cc [~jonkeane] [~apitrou]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11684) [Python] Publish macos m1 arm wheels for pypi

2021-04-06 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315772#comment-17315772
 ] 

Uwe Korn commented on ARROW-11684:
--

I can review this, but I personally have no interest in wheels; conda packages 
work much more smoothly.

> [Python] Publish macos m1 arm wheels for pypi 
> --
>
> Key: ARROW-11684
> URL: https://issues.apache.org/jira/browse/ARROW-11684
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging, Python
>Affects Versions: 3.0.0
>Reporter: Jordan Mendelson
>Priority: Major
> Fix For: 4.0.0
>
>
> It would be nice to get macos arm64 build published to pypi to support the m1 
> processor.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-12230) [C++/Python/Packaging] Move conda aarch64 builds to Azure Pipelines

2021-04-06 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-12230:


 Summary: [C++/Python/Packaging] Move conda aarch64 builds to Azure 
Pipelines
 Key: ARROW-12230
 URL: https://issues.apache.org/jira/browse/ARROW-12230
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Packaging, Python
Reporter: Uwe Korn


We should move the nightly conda builds for aarch64 to Azure Pipelines as they 
currently fail on drone due to the hard 1h timeout. On Azure Pipelines, they 
should work automatically thanks to conda-forge's cross-compilation setup. The 
necessary trick here is that the {{.ci_support}} files contain a 
{{target_platform}} line.
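
For illustration, the relevant part of such a {{.ci_support}} file looks 
roughly like this (abridged; the file name is an assumption):

{code}
# .ci_support/linux_aarch64_python3.8.yaml (abridged)
target_platform:
- linux-aarch64
{code}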



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-12122) [Python] Cannot install via pip. M1 mac

2021-03-31 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-12122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312129#comment-17312129
 ] 

Uwe Korn commented on ARROW-12122:
--

https://github.com/matthew-brett/multibuild now supports cross-compiling for 
Apple Silicon; we should be able to reuse the current macOS builds with a 
different target.

> [Python] Cannot install via pip. M1 mac
> ---
>
> Key: ARROW-12122
> URL: https://issues.apache.org/jira/browse/ARROW-12122
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bastien Boutonnet
>Priority: Major
>
> when doing {{pip install pyarrow --no-use-pep517}}
> {noformat}
> Collecting pyarrow
>  Using cached pyarrow-3.0.0.tar.gz (682 kB)
> Requirement already satisfied: numpy>=1.16.6 in 
> /Users/bastienboutonnet/Library/Caches/pypoetry/virtualenvs/dbt-sugar-lJO0x__U-py3.8/lib/python3.8/site-packages
>  (from pyarrow) (1.20.2)
> Building wheels for collected packages: pyarrow
>  Building wheel for pyarrow (setup.py) ... error
>  ERROR: Command errored out with exit status 1:
>  command: 
> /Users/bastienboutonnet/Library/Caches/pypoetry/virtualenvs/dbt-sugar-lJO0x__U-py3.8/bin/python
>  -u -c 'import sys, setuptools, tokenize; sys.argv[0] = 
> '"'"'/private/var/folders/v2/lfkghkc147j06_jd13v1f0yrgn/T/pip-install-ri2w315u/pyarrow_8d01252c437341798da24cfec11f603e/setup.py'"'"';
>  
> __file__='"'"'/private/var/folders/v2/lfkghkc147j06_jd13v1f0yrgn/T/pip-install-ri2w315u/pyarrow_8d01252c437341798da24cfec11f603e/setup.py'"'"';f=getattr(tokenize,
>  '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', 
> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' 
> bdist_wheel -d 
> /private/var/folders/v2/lfkghkc147j06_jd13v1f0yrgn/T/pip-wheel-vpkwqzyi
>  cwd: 
> /private/var/folders/v2/lfkghkc147j06_jd13v1f0yrgn/T/pip-install-ri2w315u/pyarrow_8d01252c437341798da24cfec11f603e/
>  Complete output (238 lines):
>  running bdist_wheel
>  running build
>  running build_py
>  creating build
>  creating build/lib.macosx-11.2-arm64-3.8
>  creating build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/orc.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/_generated_version.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/compat.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/benchmark.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/parquet.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/ipc.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/util.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/flight.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/cffi.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/filesystem.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/__init__.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/plasma.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/types.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/dataset.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/cuda.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/feather.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/pandas_compat.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/fs.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/csv.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/jvm.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/hdfs.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/json.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/serialization.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  copying pyarrow/compute.py -> build/lib.macosx-11.2-arm64-3.8/pyarrow
>  creating build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/test_tensor.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/test_ipc.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/conftest.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/test_convert_builtin.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/test_misc.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/test_gandiva.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/strategies.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/test_adhoc_memory_leak.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/arrow_7980.py -> 
> build/lib.macosx-11.2-arm64-3.8/pyarrow/tests
>  copying pyarrow/tests/util.py -> 
> 

[jira] [Commented] (ARROW-11608) [CI] turbodbc integration tests are failing (build issue)

2021-03-29 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310931#comment-17310931
 ] 

Uwe Korn commented on ARROW-11608:
--

Yes, I should take care of this.

> [CI] turbodbc integration tests are failing (build issue)
> 
>
> Key: ARROW-11608
> URL: https://issues.apache.org/jira/browse/ARROW-11608
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: CI
>Reporter: Joris Van den Bossche
>Priority: Major
>
> Both turbodbc builds are failing, see e.g. 
> https://github.com/ursacomputing/crossbow/runs/1885201762
> It seems to be a failure to build turbodbc: 
> {code}
> /build/turbodbc /
> -- The CXX compiler identification is GNU 9.3.0
> -- Detecting CXX compiler ABI info
> -- Detecting CXX compiler ABI info - done
> -- Check for working CXX compiler: 
> /opt/conda/envs/arrow/bin/x86_64-conda-linux-gnu-c++ - skipped
> -- Detecting CXX compile features
> -- Detecting CXX compile features - done
> -- Build type: Debug
> CMake Error at CMakeLists.txt:14 (add_subdirectory):
>   add_subdirectory given source "pybind11" which is not an existing
>   directory.
> -- Found GTest: /opt/conda/envs/arrow/lib/libgtest.so  
> -- Found Boost: /opt/conda/envs/arrow/include (found version "1.74.0") found 
> components: locale 
> -- Detecting unixODBC library
> --   Found header files at: /opt/conda/envs/arrow/include
> --   Found library at: /opt/conda/envs/arrow/lib/libodbc.so
> -- Found Boost: /opt/conda/envs/arrow/include (found version "1.74.0") found 
> components: system date_time locale 
> -- Detecting unixODBC library
> --   Found header files at: /opt/conda/envs/arrow/include
> --   Found library at: /opt/conda/envs/arrow/lib/libodbc.so
> -- Found Boost: /opt/conda/envs/arrow/include (found version "1.74.0") found 
> components: system 
> -- Detecting unixODBC library
> --   Found header files at: /opt/conda/envs/arrow/include
> --   Found library at: /opt/conda/envs/arrow/lib/libodbc.so
> CMake Error at cpp/turbodbc_python/Library/CMakeLists.txt:3 
> (pybind11_add_module):
>   Unknown CMake command "pybind11_add_module".
> -- Configuring incomplete, errors occurred!
> See also "/build/turbodbc/CMakeFiles/CMakeOutput.log".
> See also "/build/turbodbc/CMakeFiles/CMakeError.log".
> 1
> Error: `docker-compose --file 
> /home/runner/work/crossbow/crossbow/arrow/docker-compose.yml run --rm -e 
> SETUPTOOLS_SCM_PRETEND_VERSION=3.1.0.dev174 conda-python-turbodbc` exited 
> with a non-zero exit code 1, see the process log above.
> {code}
> cc [~uwe]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-12142) [Python] undefined symbol: _ZN5arrow6StatusC1ENS_10StatusCodeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

2021-03-29 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310925#comment-17310925
 ] 

Uwe Korn commented on ARROW-12142:
--

This hasn't changed; you will also need that flag with newer {{manylinux}} 
releases.
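
For illustration, a minimal {{setup.py}} sketch showing where the flag goes 
when building a C++/Cython extension against the manylinux wheels (module and 
file names are hypothetical):

{code}
import pyarrow as pa
from setuptools import Extension, setup

ext = Extension(
    "mymodule",  # hypothetical extension name
    sources=["mymodule.cpp"],
    include_dirs=[pa.get_include()],
    library_dirs=pa.get_library_dirs(),
    libraries=pa.get_libraries(),
    # The manylinux wheels are built with the pre-C++11 libstdc++ ABI,
    # so extensions linking against them have to match it:
    extra_compile_args=["-D_GLIBCXX_USE_CXX11_ABI=0"],
)
setup(name="mymodule", ext_modules=[ext])
{code}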

> [Python] undefined symbol: 
> _ZN5arrow6StatusC1ENS_10StatusCodeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
> 
>
> Key: ARROW-12142
> URL: https://issues.apache.org/jira/browse/ARROW-12142
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment:  ubuntu-20.04 in Github Actions CI
>Reporter: Shane Harvey
>Priority: Major
>
> Using Ubuntu 20.04 in Github Actions CI to test a python extension that 
> integrates with MongoDB and pyarrow, I get this error when attempting to 
> import the Cython+pyarrow extension module:
> {code:python}
> ImportError: Failed to import test module: test_arrow
> Traceback (most recent call last):
>   File "/usr/lib/python3.8/unittest/loader.py", line 436, in _find_test_path
> module = self._get_module_from_name(name)
>   File "/usr/lib/python3.8/unittest/loader.py", line 377, in 
> _get_module_from_name
> __import__(name)
>   File 
> "/home/runner/work/mongo-arrow/mongo-arrow/bindings/python/test/test_arrow.py",
>  line 21, in 
> from pymongoarrow.api import aggregate_arrow_all, find_arrow_all, Schema
>   File 
> "/home/runner/work/mongo-arrow/mongo-arrow/bindings/python/pymongoarrow/__init__.py",
>  line 18, in 
> from pymongoarrow.lib import libbson_version
> ImportError: 
> /home/runner/work/mongo-arrow/mongo-arrow/bindings/python/pymongoarrow/lib.cpython-38-x86_64-linux-gnu.so:
>  undefined symbol: 
> _ZN5arrow6StatusC1ENS_10StatusCodeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
> {code}
> The task installs pyarrow 3.0 from the manylinux2014 wheel:
> {code}
> Collecting pyarrow>=3
>   Downloading pyarrow-3.0.0-cp38-cp38-manylinux2014_x86_64.whl (20.7 MB)
> {code}
> The same project works fine locally on macOS (10.15) also using pyarrow 3.0 
> installed via pip. Upon googling I found this blog: 
> https://uwekorn.com/2019/09/15/how-we-build-apache-arrows-manylinux-wheels.html
> The article explains that the fix for {{"undefined symbol: 
> _ZN5arrow6StatusC1ENS_10StatusCodeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"}}
>  is to add {{-D_GLIBCXX_USE_CXX11_ABI=0}} to CFLAGS which did work for me. 
> However, the article says this is only needed for manylinux1 wheels because 
> they build on an old platform. Is it expected that users still need to define 
> this flag when using manylinux2014 wheels?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8282) [C++/Python][Dataset] Support schema evolution for integer columns

2021-03-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306400#comment-17306400
 ] 

Uwe Korn commented on ARROW-8282:
-

This is still an issue, especially in my context; I can have a look at it in 
the next two weeks.
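
For reference, a minimal sketch of the scenario (analogous to the timestamp 
reproduction in ARROW-8284; the file name is arbitrary):

{code}
import pyarrow as pa
import pyarrow.dataset as ds
import pyarrow.parquet as pq

# The file stores the column as int32, the dataset schema declares int64.
table32 = pa.table({"x": pa.array([1, 2, 3], type=pa.int32())})
pq.write_table(table32, "part0.parquet")

dataset = ds.FileSystemDataset.from_paths(
    paths=["part0.parquet"],
    schema=pa.schema([pa.field("x", pa.int64())]),
    format=pa.dataset.ParquetFileFormat(),
    filesystem=pa.fs.LocalFileSystem(),
)
# Should upcast int32 -> int64 instead of raising.
dataset.to_table()
{code}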

> [C++/Python][Dataset] Support schema evolution for integer columns
> --
>
> Key: ARROW-8282
> URL: https://issues.apache.org/jira/browse/ARROW-8282
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe Korn
>Priority: Major
>  Labels: dataset
> Fix For: 4.0.0
>
>
> When reading in a dataset where the schema specifies that column X is of type 
> {{int64}} but the partition actually contains the data stored in that column 
> as {{int32}}, an upcast should be done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-9367) [Python] Sorting on pyarrow data structures ?

2021-03-16 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-9367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302404#comment-17302404
 ] 

Uwe Korn commented on ARROW-9367:
-

This is now working in 3.0:

{code}
import pyarrow as pa
import pyarrow.compute as pc

table = pa.table({
"a": [1, 2, 3, -1],
"b": [10, 9, 9, 7],
"c": [10, 10, 8, 11]
})

indices = pc.sort_indices(table, sort_keys=[("b", "ascending"), ("c", 
"ascending")])
table = pc.take(table, indices)
table.to_pydict()
# {'a': [-1, 3, 2, 1], 'b': [7, 9, 9, 10], 'c': [11, 8, 10, 10]}
{code}

> [Python] Sorting on pyarrow data structures ?
> -
>
> Key: ARROW-9367
> URL: https://issues.apache.org/jira/browse/ARROW-9367
> Project: Apache Arrow
>  Issue Type: Wish
>  Components: Python
>Reporter: Athanassios Hatzis
>Priority: Major
>  Labels: sort
>
> Hi, I consider sorting a fundamental operation for any in-memory data 
> structures, including those of PyArrow.
> It would be nice if pa.array, pa.table, etc. had sorting methods, but I did not 
> find any. One has to pass sorting indices calculated from some other library, 
> such as numpy, to sort them. Sorting indices could have been calculated 
> directly from PyArrow. Am I missing something here? That significantly 
> increases complexity for the developer.
> Do you have any plans on implementing such a feature in the near future?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-9367) [Python] Sorting on pyarrow data structures ?

2021-03-16 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-9367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-9367.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

> [Python] Sorting on pyarrow data structures ?
> -
>
> Key: ARROW-9367
> URL: https://issues.apache.org/jira/browse/ARROW-9367
> Project: Apache Arrow
>  Issue Type: Wish
>  Components: Python
>Reporter: Athanassios Hatzis
>Priority: Major
>  Labels: sort
> Fix For: 3.0.0
>
>
> Hi, I consider sorting a fundamental operation for any in-memory data 
> structures, including those of PyArrow.
> It would be nice if pa.array, pa.table, etc. had sorting methods, but I did not 
> find any. One has to pass sorting indices calculated from some other library, 
> such as numpy, to sort them. Sorting indices could have been calculated 
> directly from PyArrow. Am I missing something here? That significantly 
> increases complexity for the developer.
> Do you have any plans on implementing such a feature in the near future?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-11835) [Python] PyArrow 3.0/Pip installation errors on Big Sur.

2021-03-13 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-11835.
--
Fix Version/s: 3.0.0
 Assignee: Uwe Korn
   Resolution: Not A Problem

Old versions of {{pip}} didn't recognize that macOS 10 and 11 are compatible 
and thus ignored existing wheels.

> [Python] PyArrow 3.0/Pip installation errors on Big Sur.
> 
>
> Key: ARROW-11835
> URL: https://issues.apache.org/jira/browse/ARROW-11835
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: Big Sur 11.2.2. Python 3.7.9 with PyEnv 1.2.23 and Pip 
> 20.1.1.
>Reporter: Davin Chia
>Assignee: Uwe Korn
>Priority: Minor
> Fix For: 3.0.0
>
>
>  Hi,
> I'm running into the below error while trying to install `PyArrow` via Pip. 
> I've installed the latest apache arrow C++ libraries via brew install 
> apache-arrow.
> I spent some time digging into this without much progress, and SO/Google 
> wasn't much help, so any help is much appreciated. Thanks!
> Specific error message:
> {code:java}
> creating build/bdist.macosx-11.2-x86_64/wheel/pyarrow/include
>  error: can't copy 'build/lib.macosx-11.2-x86_64-3.7/pyarrow/include/arrow': 
> doesn't exist or not a regular file{code}
> Full stack trace:
> {code:java}
> ➜ ~ pip install pyarrow
> Collecting pyarrow
>  Using cached pyarrow-3.0.0.tar.gz (682 kB)
>  Installing build dependencies ... done
>  Getting requirements to build wheel ... done
>  Preparing wheel metadata ... done
> Requirement already satisfied: numpy>=1.16.6 in 
> ./.pyenv/versions/3.7.9/lib/python3.7/site-packages (from pyarrow) (1.20.1)
> Building wheels for collected packages: pyarrow
>  Building wheel for pyarrow (PEP 517) ... error
>  ERROR: Command errored out with exit status 1:
>  command: /Users/davinchia/.pyenv/versions/3.7.9/bin/python3.7 
> /Users/davinchia/.pyenv/versions/3.7.9/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py
>  build_wheel /var/folders/4h/k452x8ns06q7w_w7wkb3j16hgn/T/tmpuw3ooopv
>  cwd: 
> /private/var/folders/4h/k452x8ns06q7w_w7wkb3j16hgn/T/pip-install-5cyfkenw/pyarrow
>  Complete output (438 lines):
>  running bdist_wheel
>  running build
>  running build_py
>  creating build
>  creating build/lib.macosx-11.2-x86_64-3.7
>  creating build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/orc.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/_generated_version.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/compat.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/benchmark.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/parquet.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/ipc.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/util.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/flight.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/cffi.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/filesystem.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/__init__.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/plasma.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/types.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/dataset.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/cuda.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/feather.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/pandas_compat.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/fs.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/csv.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/jvm.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/hdfs.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/json.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/serialization.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/compute.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  creating build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_tensor.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_ipc.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/conftest.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_convert_builtin.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_misc.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_gandiva.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/strate

[jira] [Commented] (ARROW-11835) [Python] PyArrow 3.0/Pip installation errors on Big Sur.

2021-03-12 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300613#comment-17300613
 ] 

Uwe Korn commented on ARROW-11835:
--

What happens if you upgrade to `pip>=20.3.2`?

> [Python] PyArrow 3.0/Pip installation errors on Big Sur.
> 
>
> Key: ARROW-11835
> URL: https://issues.apache.org/jira/browse/ARROW-11835
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: Big Sur 11.2.2. Python 3.7.9 with PyEnv 1.2.23 and Pip 
> 20.1.1.
>Reporter: Davin Chia
>Priority: Minor
>
>  Hi,
> I'm running into the below error while trying to install `PyArrow` via Pip. 
> I've installed the latest apache arrow C++ libraries via brew install 
> apache-arrow.
> I spent some time digging into this without much progress, and SO/Google 
> wasn't much help, so any help is much appreciated. Thanks!
> Specific error message:
> {code:java}
> creating build/bdist.macosx-11.2-x86_64/wheel/pyarrow/include
>  error: can't copy 'build/lib.macosx-11.2-x86_64-3.7/pyarrow/include/arrow': 
> doesn't exist or not a regular file{code}
> Full stack trace:
> {code:java}
> ➜ ~ pip install pyarrow
> Collecting pyarrow
>  Using cached pyarrow-3.0.0.tar.gz (682 kB)
>  Installing build dependencies ... done
>  Getting requirements to build wheel ... done
>  Preparing wheel metadata ... done
> Requirement already satisfied: numpy>=1.16.6 in 
> ./.pyenv/versions/3.7.9/lib/python3.7/site-packages (from pyarrow) (1.20.1)
> Building wheels for collected packages: pyarrow
>  Building wheel for pyarrow (PEP 517) ... error
>  ERROR: Command errored out with exit status 1:
>  command: /Users/davinchia/.pyenv/versions/3.7.9/bin/python3.7 
> /Users/davinchia/.pyenv/versions/3.7.9/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py
>  build_wheel /var/folders/4h/k452x8ns06q7w_w7wkb3j16hgn/T/tmpuw3ooopv
>  cwd: 
> /private/var/folders/4h/k452x8ns06q7w_w7wkb3j16hgn/T/pip-install-5cyfkenw/pyarrow
>  Complete output (438 lines):
>  running bdist_wheel
>  running build
>  running build_py
>  creating build
>  creating build/lib.macosx-11.2-x86_64-3.7
>  creating build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/orc.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/_generated_version.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/compat.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/benchmark.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/parquet.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/ipc.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/util.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/flight.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/cffi.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/filesystem.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/__init__.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/plasma.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/types.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/dataset.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/cuda.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/feather.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/pandas_compat.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/fs.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/csv.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/jvm.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/hdfs.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/json.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/serialization.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  copying pyarrow/compute.py -> build/lib.macosx-11.2-x86_64-3.7/pyarrow
>  creating build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_tensor.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_ipc.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/conftest.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_convert_builtin.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_misc.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_gandiva.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/strategies.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/tests
>  copying pyarrow/tests/test_adhoc_memory_leak.py -> 
> build/lib.macosx-11.2-x86_64-3.7/pyarrow/

[jira] [Commented] (ARROW-11695) [C++][FlightRPC][Packaging] Update support for disabling TLS server verification for recent gRPC versions

2021-02-26 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292036#comment-17292036
 ] 

Uwe Korn commented on ARROW-11695:
--

[~amasend] please update and try again; I have pushed a new build with the 
final patch.

> [C++][FlightRPC][Packaging] Update support for disabling TLS server 
> verification for recent gRPC versions
> -
>
> Key: ARROW-11695
> URL: https://issues.apache.org/jira/browse/ARROW-11695
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC, Packaging
>Affects Versions: 3.0.0
> Environment: macOS, conda env, python 3.6 / 3.7
>Reporter: Amadeusz
>Assignee: David Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> It is regarding issue from github: 
> [https://github.com/apache/arrow/issues/9525]
> Output of `conda list`:
> {code:java}
> Name Version Build Channel
>  abseil-cpp 20200923.3 h046ec9c_0 conda-forge
>  aif360 0.3.0 pypi_0 pypi
>  appdirs 1.4.4 pypi_0 pypi
>  appnope 0.1.2 pypi_0 pypi
>  arrow-cpp 3.0.0 py36h25f3d33_3_cpu conda-forge
>  astunparse 1.6.3 pypi_0 pypi
>  attrs 20.3.0 pypi_0 pypi
>  aws-c-cal 0.4.5 hf7813a8_6 conda-forge
>  aws-c-common 0.4.67 hbcf498f_0 conda-forge
>  aws-c-event-stream 0.2.6 h8218164_4 conda-forge
>  aws-c-io 0.8.3 h339dee7_1 conda-forge
>  aws-checksums 0.1.11 h339dee7_1 conda-forge
>  aws-sdk-cpp 1.8.138 h5307d9a_1 conda-forge
>  backcall 0.2.0 pypi_0 pypi
>  bayesian-optimization 1.2.0 pypi_0 pypi
>  black 19.10b0 pypi_0 pypi
>  boto3 1.17.9 pypi_0 pypi
>  botocore 1.20.9 pypi_0 pypi
>  brotli 1.0.9 h046ec9c_4 conda-forge
>  bzip2 1.0.8 hc929b4f_4 conda-forge
>  c-ares 1.17.1 hc929b4f_0 conda-forge
>  ca-certificates 2020.12.5 h033912b_0 conda-forge
>  cached-property 1.5.2 pypi_0 pypi
>  category-encoders 2.1.0 pypi_0 pypi
>  certifi 2020.12.5 py36h79c6626_1 conda-forge
>  chardet 3.0.4 pypi_0 pypi
>  click 7.1.2 pypi_0 pypi
>  cycler 0.10.0 pypi_0 pypi
>  cython 0.29.21 pypi_0 pypi
>  decorator 4.4.2 pypi_0 pypi
>  docutils 0.15.2 pypi_0 pypi
>  flask 1.1.2 pypi_0 pypi
>  future 0.18.2 pypi_0 pypi
>  gflags 2.2.2 hb1e8313_1004 conda-forge
>  glog 0.4.0 hb7f4fc5_3 conda-forge
>  greenery 3.1 pypi_0 pypi
>  grpc-cpp 1.35.0 h330f241_0 conda-forge
>  grpcio 1.35.0 pypi_0 pypi
>  h5py 2.10.0 pypi_0 pypi
>  hpsklearn 0.1.0 pypi_0 pypi
>  hyperopt 0.1.2 pypi_0 pypi
>  ibm-cos-sdk 2.7.0 pypi_0 pypi
>  ibm-cos-sdk-core 2.7.0 pypi_0 pypi
>  ibm-cos-sdk-s3transfer 2.7.0 pypi_0 pypi
>  ibm-watson-machine-learning 1.0.45 pypi_0 pypi
>  idna 2.9 pypi_0 pypi
>  imageio 2.9.0 pypi_0 pypi
>  ipython 7.16.1 pypi_0 pypi
>  ipython-genutils 0.2.0 pypi_0 pypi
>  itsdangerous 1.1.0 pypi_0 pypi
>  jedi 0.18.0 pypi_0 pypi
>  jinja2 2.11.3 pypi_0 pypi
>  jmespath 0.9.5 pypi_0 pypi
>  joblib 1.0.1 pyhd8ed1ab_0 conda-forge
>  jsonschema 2.6.0 pypi_0 pypi
>  jsonsubschema 0.0.1 pypi_0 pypi
>  keras 2.3.1 pypi_0 pypi
>  keras-applications 1.0.8 pypi_0 pypi
>  keras-preprocessing 1.1.2 pypi_0 pypi
>  kiwisolver 1.3.1 pypi_0 pypi
>  krb5 1.17.1 hddcf347_0 
>  lale 0.4.13 pypi_0 pypi
>  liac-arff 2.5.0 pypi_0 pypi
>  libblas 3.9.0 8_openblas conda-forge
>  libcblas 3.9.0 8_openblas conda-forge
>  libcurl 7.71.1 h9bf37e3_8 conda-forge
>  libcxx 11.0.1 habf9029_0 conda-forge
>  libcxxabi 4.0.1 hcfea43d_1 
>  libedit 3.1.20181209 hb402a30_0 
>  libev 4.33 haf1e3a3_1 conda-forge
>  libevent 2.1.10 hddc9c9b_3 conda-forge
>  libffi 3.2.1 h475c297_4 
>  libgfortran 5.0.0 9_3_0_h6c81a4c_18 conda-forge
>  libgfortran5 9.3.0 h6c81a4c_18 conda-forge
>  liblapack 3.9.0 8_openblas conda-forge
>  libnghttp2 1.43.0 h07e645a_0 conda-forge
>  libopenblas 0.3.12 openmp_h54245bb_1 conda-forge
>  libprotobuf 3.14.0 hfd3ada9_0 conda-forge
>  libssh2 1.9.0 h8a08a2b_5 conda-forge
>  libthrift 0.13.0 h990abc0_6 conda-forge
>  libutf8proc 2.6.1 h35c211d_0 conda-forge
>  lightgbm 2.2.3 pypi_0 pypi
>  llvm-openmp 11.0.1 h7c73e74_0 conda-forge
>  lomond 0.3.3 pypi_0 pypi
>  lxml 4.5.1 pypi_0 pypi
>  lz4-c 1.9.3 h046ec9c_0 conda-forge
>  markupsafe 1.1.1 pypi_0 pypi
>  matplotlib 3.3.4 pypi_0 pypi
>  minepy 1.2.5 pypi_0 pypi
>  mock 4.0.3 pypi_0 pypi
>  ncurses 6.2 h0a44026_0 
>  networkx 2.4 pypi_0 pypi
>  nose 1.3.7 pypi_0 pypi
>  numexpr 2.7.2 pypi_0 pypi
>  numpy 1.18.1 pypi_0 pypi
>  opencv-python 4.5.1.48 pypi_0 pypi
>  openml 0.9.0 pypi_0 pypi
>  openssl 1.1.1j hbcf498f_0 conda-forge
>  orc 1.6.7 h675e114_0 conda-forge
>  pandas 0.25.3 pypi_0 pypi
>  parquet-cpp 1.5.1 2 conda-forge
>  parso 0.8.1 pypi_0 pypi
>  pathspec 0.8.1 pypi_0 pypi
>  patsy 0.5.1 pypi_0 pypi
>  pexpect 4.8.0 pypi_0 pypi
>  pickleshare 0.7.5 pypi_0 pypi
>  pillow 8.1.0 pypi_0

[jira] [Commented] (ARROW-11695) [C++][FlightRPC][Packaging] Update support for disabling TLS server verification for recent gRPC versions

2021-02-25 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291452#comment-17291452
 ] 

Uwe Korn commented on ARROW-11695:
--

The conda package is already fixed, on PyPI you need to wait for Arrow 4.0

> [C++][FlightRPC][Packaging] Update support for disabling TLS server 
> verification for recent gRPC versions
> -
>
> Key: ARROW-11695
> URL: https://issues.apache.org/jira/browse/ARROW-11695
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC, Packaging
>Affects Versions: 3.0.0
> Environment: macOS, conda env, python 3.6 / 3.7
>Reporter: Amadeusz
>Assignee: David Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> It is regarding issue from github: 
> [https://github.com/apache/arrow/issues/9525]
> Output of `conda list`:
> {code:java}
> Name Version Build Channel
>  abseil-cpp 20200923.3 h046ec9c_0 conda-forge
>  aif360 0.3.0 pypi_0 pypi
>  appdirs 1.4.4 pypi_0 pypi
>  appnope 0.1.2 pypi_0 pypi
>  arrow-cpp 3.0.0 py36h25f3d33_3_cpu conda-forge
>  astunparse 1.6.3 pypi_0 pypi
>  attrs 20.3.0 pypi_0 pypi
>  aws-c-cal 0.4.5 hf7813a8_6 conda-forge
>  aws-c-common 0.4.67 hbcf498f_0 conda-forge
>  aws-c-event-stream 0.2.6 h8218164_4 conda-forge
>  aws-c-io 0.8.3 h339dee7_1 conda-forge
>  aws-checksums 0.1.11 h339dee7_1 conda-forge
>  aws-sdk-cpp 1.8.138 h5307d9a_1 conda-forge
>  backcall 0.2.0 pypi_0 pypi
>  bayesian-optimization 1.2.0 pypi_0 pypi
>  black 19.10b0 pypi_0 pypi
>  boto3 1.17.9 pypi_0 pypi
>  botocore 1.20.9 pypi_0 pypi
>  brotli 1.0.9 h046ec9c_4 conda-forge
>  bzip2 1.0.8 hc929b4f_4 conda-forge
>  c-ares 1.17.1 hc929b4f_0 conda-forge
>  ca-certificates 2020.12.5 h033912b_0 conda-forge
>  cached-property 1.5.2 pypi_0 pypi
>  category-encoders 2.1.0 pypi_0 pypi
>  certifi 2020.12.5 py36h79c6626_1 conda-forge
>  chardet 3.0.4 pypi_0 pypi
>  click 7.1.2 pypi_0 pypi
>  cycler 0.10.0 pypi_0 pypi
>  cython 0.29.21 pypi_0 pypi
>  decorator 4.4.2 pypi_0 pypi
>  docutils 0.15.2 pypi_0 pypi
>  flask 1.1.2 pypi_0 pypi
>  future 0.18.2 pypi_0 pypi
>  gflags 2.2.2 hb1e8313_1004 conda-forge
>  glog 0.4.0 hb7f4fc5_3 conda-forge
>  greenery 3.1 pypi_0 pypi
>  grpc-cpp 1.35.0 h330f241_0 conda-forge
>  grpcio 1.35.0 pypi_0 pypi
>  h5py 2.10.0 pypi_0 pypi
>  hpsklearn 0.1.0 pypi_0 pypi
>  hyperopt 0.1.2 pypi_0 pypi
>  ibm-cos-sdk 2.7.0 pypi_0 pypi
>  ibm-cos-sdk-core 2.7.0 pypi_0 pypi
>  ibm-cos-sdk-s3transfer 2.7.0 pypi_0 pypi
>  ibm-watson-machine-learning 1.0.45 pypi_0 pypi
>  idna 2.9 pypi_0 pypi
>  imageio 2.9.0 pypi_0 pypi
>  ipython 7.16.1 pypi_0 pypi
>  ipython-genutils 0.2.0 pypi_0 pypi
>  itsdangerous 1.1.0 pypi_0 pypi
>  jedi 0.18.0 pypi_0 pypi
>  jinja2 2.11.3 pypi_0 pypi
>  jmespath 0.9.5 pypi_0 pypi
>  joblib 1.0.1 pyhd8ed1ab_0 conda-forge
>  jsonschema 2.6.0 pypi_0 pypi
>  jsonsubschema 0.0.1 pypi_0 pypi
>  keras 2.3.1 pypi_0 pypi
>  keras-applications 1.0.8 pypi_0 pypi
>  keras-preprocessing 1.1.2 pypi_0 pypi
>  kiwisolver 1.3.1 pypi_0 pypi
>  krb5 1.17.1 hddcf347_0 
>  lale 0.4.13 pypi_0 pypi
>  liac-arff 2.5.0 pypi_0 pypi
>  libblas 3.9.0 8_openblas conda-forge
>  libcblas 3.9.0 8_openblas conda-forge
>  libcurl 7.71.1 h9bf37e3_8 conda-forge
>  libcxx 11.0.1 habf9029_0 conda-forge
>  libcxxabi 4.0.1 hcfea43d_1 
>  libedit 3.1.20181209 hb402a30_0 
>  libev 4.33 haf1e3a3_1 conda-forge
>  libevent 2.1.10 hddc9c9b_3 conda-forge
>  libffi 3.2.1 h475c297_4 
>  libgfortran 5.0.0 9_3_0_h6c81a4c_18 conda-forge
>  libgfortran5 9.3.0 h6c81a4c_18 conda-forge
>  liblapack 3.9.0 8_openblas conda-forge
>  libnghttp2 1.43.0 h07e645a_0 conda-forge
>  libopenblas 0.3.12 openmp_h54245bb_1 conda-forge
>  libprotobuf 3.14.0 hfd3ada9_0 conda-forge
>  libssh2 1.9.0 h8a08a2b_5 conda-forge
>  libthrift 0.13.0 h990abc0_6 conda-forge
>  libutf8proc 2.6.1 h35c211d_0 conda-forge
>  lightgbm 2.2.3 pypi_0 pypi
>  llvm-openmp 11.0.1 h7c73e74_0 conda-forge
>  lomond 0.3.3 pypi_0 pypi
>  lxml 4.5.1 pypi_0 pypi
>  lz4-c 1.9.3 h046ec9c_0 conda-forge
>  markupsafe 1.1.1 pypi_0 pypi
>  matplotlib 3.3.4 pypi_0 pypi
>  minepy 1.2.5 pypi_0 pypi
>  mock 4.0.3 pypi_0 pypi
>  ncurses 6.2 h0a44026_0 
>  networkx 2.4 pypi_0 pypi
>  nose 1.3.7 pypi_0 pypi
>  numexpr 2.7.2 pypi_0 pypi
>  numpy 1.18.1 pypi_0 pypi
>  opencv-python 4.5.1.48 pypi_0 pypi
>  openml 0.9.0 pypi_0 pypi
>  openssl 1.1.1j hbcf498f_0 conda-forge
>  orc 1.6.7 h675e114_0 conda-forge
>  pandas 0.25.3 pypi_0 pypi
>  parquet-cpp 1.5.1 2 conda-forge
>  parso 0.8.1 pypi_0 pypi
>  pathspec 0.8.1 pypi_0 pypi
>  patsy 0.5.1 pypi_0 pypi
>  pexpect 4.8.0 pypi_0 pypi
>  pickleshare 0.7.5 pypi_0 pypi
>  pillow 8.1.0 pypi_0 pypi
>  pip 2

[jira] [Resolved] (ARROW-11695) [C++][FlightRPC][Packaging] Update support for disabling TLS server verification for recent gRPC versions

2021-02-25 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-11695.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 9569
[https://github.com/apache/arrow/pull/9569]

> [C++][FlightRPC][Packaging] Update support for disabling TLS server 
> verification for recent gRPC versions
> -
>
> Key: ARROW-11695
> URL: https://issues.apache.org/jira/browse/ARROW-11695
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, FlightRPC, Packaging
>Affects Versions: 3.0.0
> Environment: macOS, conda env, python 3.6 / 3.7
>Reporter: Amadeusz
>Assignee: David Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> It is regarding issue from github: 
> [https://github.com/apache/arrow/issues/9525]
> Output of `conda list`:
> {code:java}
> Name Version Build Channel
>  abseil-cpp 20200923.3 h046ec9c_0 conda-forge
>  aif360 0.3.0 pypi_0 pypi
>  appdirs 1.4.4 pypi_0 pypi
>  appnope 0.1.2 pypi_0 pypi
>  arrow-cpp 3.0.0 py36h25f3d33_3_cpu conda-forge
>  astunparse 1.6.3 pypi_0 pypi
>  attrs 20.3.0 pypi_0 pypi
>  aws-c-cal 0.4.5 hf7813a8_6 conda-forge
>  aws-c-common 0.4.67 hbcf498f_0 conda-forge
>  aws-c-event-stream 0.2.6 h8218164_4 conda-forge
>  aws-c-io 0.8.3 h339dee7_1 conda-forge
>  aws-checksums 0.1.11 h339dee7_1 conda-forge
>  aws-sdk-cpp 1.8.138 h5307d9a_1 conda-forge
>  backcall 0.2.0 pypi_0 pypi
>  bayesian-optimization 1.2.0 pypi_0 pypi
>  black 19.10b0 pypi_0 pypi
>  boto3 1.17.9 pypi_0 pypi
>  botocore 1.20.9 pypi_0 pypi
>  brotli 1.0.9 h046ec9c_4 conda-forge
>  bzip2 1.0.8 hc929b4f_4 conda-forge
>  c-ares 1.17.1 hc929b4f_0 conda-forge
>  ca-certificates 2020.12.5 h033912b_0 conda-forge
>  cached-property 1.5.2 pypi_0 pypi
>  category-encoders 2.1.0 pypi_0 pypi
>  certifi 2020.12.5 py36h79c6626_1 conda-forge
>  chardet 3.0.4 pypi_0 pypi
>  click 7.1.2 pypi_0 pypi
>  cycler 0.10.0 pypi_0 pypi
>  cython 0.29.21 pypi_0 pypi
>  decorator 4.4.2 pypi_0 pypi
>  docutils 0.15.2 pypi_0 pypi
>  flask 1.1.2 pypi_0 pypi
>  future 0.18.2 pypi_0 pypi
>  gflags 2.2.2 hb1e8313_1004 conda-forge
>  glog 0.4.0 hb7f4fc5_3 conda-forge
>  greenery 3.1 pypi_0 pypi
>  grpc-cpp 1.35.0 h330f241_0 conda-forge
>  grpcio 1.35.0 pypi_0 pypi
>  h5py 2.10.0 pypi_0 pypi
>  hpsklearn 0.1.0 pypi_0 pypi
>  hyperopt 0.1.2 pypi_0 pypi
>  ibm-cos-sdk 2.7.0 pypi_0 pypi
>  ibm-cos-sdk-core 2.7.0 pypi_0 pypi
>  ibm-cos-sdk-s3transfer 2.7.0 pypi_0 pypi
>  ibm-watson-machine-learning 1.0.45 pypi_0 pypi
>  idna 2.9 pypi_0 pypi
>  imageio 2.9.0 pypi_0 pypi
>  ipython 7.16.1 pypi_0 pypi
>  ipython-genutils 0.2.0 pypi_0 pypi
>  itsdangerous 1.1.0 pypi_0 pypi
>  jedi 0.18.0 pypi_0 pypi
>  jinja2 2.11.3 pypi_0 pypi
>  jmespath 0.9.5 pypi_0 pypi
>  joblib 1.0.1 pyhd8ed1ab_0 conda-forge
>  jsonschema 2.6.0 pypi_0 pypi
>  jsonsubschema 0.0.1 pypi_0 pypi
>  keras 2.3.1 pypi_0 pypi
>  keras-applications 1.0.8 pypi_0 pypi
>  keras-preprocessing 1.1.2 pypi_0 pypi
>  kiwisolver 1.3.1 pypi_0 pypi
>  krb5 1.17.1 hddcf347_0 
>  lale 0.4.13 pypi_0 pypi
>  liac-arff 2.5.0 pypi_0 pypi
>  libblas 3.9.0 8_openblas conda-forge
>  libcblas 3.9.0 8_openblas conda-forge
>  libcurl 7.71.1 h9bf37e3_8 conda-forge
>  libcxx 11.0.1 habf9029_0 conda-forge
>  libcxxabi 4.0.1 hcfea43d_1 
>  libedit 3.1.20181209 hb402a30_0 
>  libev 4.33 haf1e3a3_1 conda-forge
>  libevent 2.1.10 hddc9c9b_3 conda-forge
>  libffi 3.2.1 h475c297_4 
>  libgfortran 5.0.0 9_3_0_h6c81a4c_18 conda-forge
>  libgfortran5 9.3.0 h6c81a4c_18 conda-forge
>  liblapack 3.9.0 8_openblas conda-forge
>  libnghttp2 1.43.0 h07e645a_0 conda-forge
>  libopenblas 0.3.12 openmp_h54245bb_1 conda-forge
>  libprotobuf 3.14.0 hfd3ada9_0 conda-forge
>  libssh2 1.9.0 h8a08a2b_5 conda-forge
>  libthrift 0.13.0 h990abc0_6 conda-forge
>  libutf8proc 2.6.1 h35c211d_0 conda-forge
>  lightgbm 2.2.3 pypi_0 pypi
>  llvm-openmp 11.0.1 h7c73e74_0 conda-forge
>  lomond 0.3.3 pypi_0 pypi
>  lxml 4.5.1 pypi_0 pypi
>  lz4-c 1.9.3 h046ec9c_0 conda-forge
>  markupsafe 1.1.1 pypi_0 pypi
>  matplotlib 3.3.4 pypi_0 pypi
>  minepy 1.2.5 pypi_0 pypi
>  mock 4.0.3 pypi_0 pypi
>  ncurses 6.2 h0a44026_0 
>  networkx 2.4 pypi_0 pypi
>  nose 1.3.7 pypi_0 pypi
>  numexpr 2.7.2 pypi_0 pypi
>  numpy 1.18.1 pypi_0 pypi
>  opencv-python 4.5.1.48 pypi_0 pypi
>  openml 0.9.0 pypi_0 pypi
>  openssl 1.1.1j hbcf498f_0 conda-forge
>  orc 1.6.7 h675e114_0 conda-forge
>  pandas 0.25.3 pypi_0 pypi
>  parquet-cpp 1.5.1 2 conda-forge
>  parso 0.8.1 pypi_0 pypi
>  pathspec 0.8.1 pypi_0 pypi
>  patsy 0.5.1 pypi_0 pypi
>  pexpect 4.8.0 pypi_0 pypi
>  pickleshare 0.7.5 pypi_0 pypi
>  pillow 8.1.0 pypi_0 pypi
>  pip 2

[jira] [Commented] (ARROW-11695) [Python][macOS] Using encryption with server verification disabled is unsupported.

2021-02-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1724#comment-1724
 ] 

Uwe Korn commented on ARROW-11695:
--

[~lidavidm] It would be nice to have a CMake option that requires this check to 
pass. I would then set that option in the conda packages so that we are sure the 
feature is always compiled in. Then we would at least be notified if future 
versions break again.
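
For context, a rough sketch of the user-facing call that this issue is about 
(endpoint hypothetical; {{disable_server_verification}} is the Flight option 
that silently breaks when the compile-time check fails):

{code:python}
import pyarrow.flight as flight

# On builds where the gRPC TLS-options try_compile check failed, this call
# raises "Using encryption with server verification disabled is unsupported."
client = flight.connect(
    "grpc+tls://localhost:8815",
    disable_server_verification=True,
)
{code}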

> [Python][macOS] Using encryption with server verification disabled is 
> unsupported.
> --
>
> Key: ARROW-11695
> URL: https://issues.apache.org/jira/browse/ARROW-11695
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 3.0.0
> Environment: macOS, conda env, python 3.6 / 3.7
>Reporter: Amadeusz
>Priority: Major
>
> It is regarding issue from github: 
> [https://github.com/apache/arrow/issues/9525]

[jira] [Commented] (ARROW-11695) [Python][macOS] Using encryption with server verification disabled is unsupported.

2021-02-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288396#comment-17288396
 ] 

Uwe Korn commented on ARROW-11695:
--

The build fails with the latest gRPC 1.35 release:

{code}
check_tls_opts_132.cc: 
/Users/uwe/mambaforge/conda-bld/arrow-cpp-ext_1614003135390/work/cpp/src/arrow/flight/try_compile/check_tls_opts_132.cc:29:61:
 error: no member named 'server_verification_option' in 
'grpc::experimental::TlsCredentialsOptions'
check_tls_opts_132.cc:   grpc_tls_server_verification_option server_opt = 
options->server_verification_option()
check_tls_opts_132.cc:
~~~  ^
check_tls_opts_132.cc: 
/Users/uwe/mambaforge/conda-bld/arrow-cpp-ext_1614003135390/work/cpp/src/arrow/flight/try_compile/check_tls_opts_132.cc:34:39:
 warning: unused variable 'opt' [-Wunused-variable]
check_tls_opts_132.cc:   grpc_tls_server_verification_option opt = 
check(nullptr)
check_tls_opts_132.cc:   ^
check_tls_opts_132.cc: 1 warning and 1 error generated.
check_tls_opts_132.cc: ninja: build stopped: subcommand failed.
check_tls_opts_132.cc:
{code}

> [Python][macOS] Using encryption with server verification disabled is 
> unsupported.
> --
>
> Key: ARROW-11695
> URL: https://issues.apache.org/jira/browse/ARROW-11695
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 3.0.0
> Environment: macOS, conda env, python 3.6 / 3.7
>Reporter: Amadeusz
>Priority: Major
>
> It is regarding issue from github: 
> [https://github.com/apache/arrow/issues/9525]

[jira] [Created] (ARROW-11724) [C++] Namespace collisions with protobuf 3.15

2021-02-21 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-11724:


 Summary: [C++] Namespace collisions with protobuf 3.15
 Key: ARROW-11724
 URL: https://issues.apache.org/jira/browse/ARROW-11724
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, FlightRPC
Affects Versions: 3.0.0
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 4.0.0


We define {{pb}} as a namespace alias in the Flight sources. This conflicts 
with {{protobuf}}, which has started to introduce it as its own global 
namespace alias.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-5280) [C++] Find a better solution to the conda compilers macOS issue

2021-02-21 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288082#comment-17288082
 ] 

Uwe Korn commented on ARROW-5280:
-

Nowadays you should be able to run compilation on macOS with a newer SDK. It 
may take some weeks for a freshly released SDK to be supported by the 
compilers, but we have upgraded all tooling to work with a variety of SDKs. If 
this doesn't work, open a bug on the conda-forge side.

> [C++] Find a better solution to the conda compilers macOS issue
> ---
>
> Key: ARROW-5280
> URL: https://issues.apache.org/jira/browse/ARROW-5280
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Developer Tools
>Reporter: Neal Richardson
>Priority: Major
>
> See [https://github.com/apache/arrow/pull/4231#pullrequestreview-234617308] 
> and https://issues.apache.org/jira/browse/ARROW-4935. Conda's `compilers` 
> require an old macOS SDK, which makes installation awkward at best. We can 
> _almost_ build on macOS without conda `compilers`, but the jemalloc failure 
> remains. As Uwe says, "Maybe we can figure out a way in conda-forge to use 
> newer compilers than the ones referenced by the {{compilers}} package."



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11552) [Python][Packaging] Include CUDA support in wheels

2021-02-09 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281768#comment-17281768
 ] 

Uwe Korn commented on ARROW-11552:
--

[~VoVAllen] With the new repo, as with the existing code, this lacks a 
maintainer to actually do it. Currently I think there is only willingness 
to delete code, not to do any further work :(

> [Python][Packaging] Include CUDA support in wheels
> --
>
> Key: ARROW-11552
> URL: https://issues.apache.org/jira/browse/ARROW-11552
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GPU, Packaging, Python
>Reporter: Allen Zhou
>Priority: Major
>
> I'm developing a new project based on the plasma store with CUDA capability. 
> But I found it is unfriendly for users to compile arrow with CUDA capability 
> and install it on their systems. Therefore I'm wondering whether the dev team 
> will consider releasing the C++ library with CUDA capability. Or how can I 
> make my own distribution using tools like pip?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-11555) The pyarrow installation is not built with support for 'HadoopFileSystem'

2021-02-08 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-11555.
--
Resolution: Invalid

Closing this here as this is an Anaconda issue which is tracked over at 
https://github.com/AnacondaRecipes/pyarrow-feedstock/issues/2

> The pyarrow installation is not built with support for 'HadoopFileSystem'
> -
>
> Key: ARROW-11555
> URL: https://issues.apache.org/jira/browse/ARROW-11555
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: Ubuntu 20.04 LTS
> conda 4.8.3
> Python 3.7.7
> pyarrow=3.0.0
>Reporter: Nikolai Janakiev
>Assignee: Uwe Korn
>Priority: Minor
>
> When running:
> {code:java}
> import pyarrow as pa
> pa.fs.HadoopFileSystem("node-master", port=54310)
> {code}
> I get the following error:
> {code:java}
> ImportError: The pyarrow installation is not built with support for 
> 'HadoopFileSystem'{code}
> Installed on Ubuntu 20.04 LTS and via:
> {code:java}
> conda install -c conda-forge "pyarrow=3.0.0"{code}
> But when I run the following command, I am able to connect to my HDFS cluster:
> {code:java}
> hdfs = pa.hdfs.connect('node-master', port=54310)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11555) The pyarrow installation is not built with support for 'HadoopFileSystem'

2021-02-08 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281076#comment-17281076
 ] 

Uwe Korn commented on ARROW-11555:
--

This environment has {{arrow-cpp}} from defaults, which is missing HDFS 
support. You have a lot of packages from defaults; I suggest recreating the 
environment using solely {{conda-forge}} as the package source. The defaults 
{{arrow-cpp}} / {{pyarrow}} packages are currently in a broken state.

> The pyarrow installation is not built with support for 'HadoopFileSystem'
> -
>
> Key: ARROW-11555
> URL: https://issues.apache.org/jira/browse/ARROW-11555
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: Ubuntu 20.04 LTS
> conda 4.8.3
> Python 3.7.7
> pyarrow=3.0.0
>Reporter: Nikolai Janakiev
>Assignee: Uwe Korn
>Priority: Minor
>
> When running:
> {code:java}
> import pyarrow as pa
> pa.fs.HadoopFileSystem("node-master", port=54310)
> {code}
> I get the following error:
> {code:java}
> ImportError: The pyarrow installation is not built with support for 
> 'HadoopFileSystem'{code}
> Installed on Ubuntu 20.04 LTS and via:
> {code:java}
> conda install -c conda-forge "pyarrow=3.0.0"{code}
> But when I run the following command, I am able to connect to my HDFS cluster:
> {code:java}
> hdfs = pa.hdfs.connect('node-master', port=54310)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11552) [Python][Packaging] Include CUDA support in wheels

2021-02-08 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281027#comment-17281027
 ] 

Uwe Korn commented on ARROW-11552:
--

We cannot build official packages with CUDA support as this would make us 
depend hard on non-free libraries. On the other hand, in contrast to things 
like Tensorflow, we only use a limited set of CUDA features, so we can compile 
for the oldest supported CUDA version and it will work on all newer ones.

I'm still +1 on dropping Plasma from the Arrow source tree, we get a lot of bug 
reports but zero maintenance. Having it in tree leads to false expectations.

> [Python][Packaging] Include CUDA support in wheels
> --
>
> Key: ARROW-11552
> URL: https://issues.apache.org/jira/browse/ARROW-11552
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GPU, Packaging, Python
>Reporter: Allen Zhou
>Priority: Major
>
> I'm developing a new project based on the plasma store with CUDA capability. 
> But I found it is unfriendly for users to compile arrow with CUDA capability 
> and install it on their systems. Therefore I'm wondering whether the dev team 
> will consider releasing the C++ library with CUDA capability. Or how can I 
> make my own distribution using tools like pip?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11555) The pyarrow installation is not built with support for 'HadoopFileSystem'

2021-02-08 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281020#comment-17281020
 ] 

Uwe Korn commented on ARROW-11555:
--

Can you please post the output of {{conda list}} for your environment?

We explicitly check for this support in 
https://github.com/conda-forge/arrow-cpp-feedstock/blob/0aa7d9e91df13451efa5d500e9dc500e786222ff/recipe/meta.yaml#L218
 and thus it shouldn't be missing.

> The pyarrow installation is not built with support for 'HadoopFileSystem'
> -
>
> Key: ARROW-11555
> URL: https://issues.apache.org/jira/browse/ARROW-11555
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: Ubuntu 20.04 LTS
> conda 4.8.3
> Python 3.7.7
> pyarrow=3.0.0
>Reporter: Nikolai Janakiev
>Assignee: Uwe Korn
>Priority: Minor
>
> When running:
> {code:java}
> import pyarrow as pa
> pa.fs.HadoopFileSystem("node-master", port=54310)
> {code}
> I get the following error:
> {code:java}
> ImportError: The pyarrow installation is not built with support for 
> 'HadoopFileSystem'{code}
> Installed on Ubuntu 20.04 LTS and via:
> {code:java}
> conda install -c conda-forge "pyarrow=3.0.0"{code}
> But when I run the following command, I am able to connect to my HDFS cluster:
> {code:java}
> hdfs = pa.hdfs.connect('node-master', port=54310)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ARROW-11555) The pyarrow installation is not built with support for 'HadoopFileSystem'

2021-02-08 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn reassigned ARROW-11555:


Assignee: Uwe Korn

> The pyarrow installation is not built with support for 'HadoopFileSystem'
> -
>
> Key: ARROW-11555
> URL: https://issues.apache.org/jira/browse/ARROW-11555
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: Ubuntu 20.04 LTS
> conda 4.8.3
> Python 3.7.7
> pyarrow=3.0.0
>Reporter: Nikolai Janakiev
>Assignee: Uwe Korn
>Priority: Minor
>
> When running:
> {code:java}
> import pyarrow as pa
> pa.fs.HadoopFileSystem("node-master", port=54310)
> {code}
> I get the following error:
> {code:java}
> ImportError: The pyarrow installation is not built with support for 
> 'HadoopFileSystem'{code}
> Installed on Ubuntu 20.04 LTS and via:
> {code:java}
> conda install -c conda-forge "pyarrow=3.0.0"{code}
> But when I run the following command, I am able to connect to my HDFS cluster:
> {code:java}
> hdfs = pa.hdfs.connect('node-master', port=54310)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11075) [Python] Getting reference not found with ORC enabled pyarrow

2021-02-04 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278803#comment-17278803
 ] 

Uwe Korn commented on ARROW-11075:
--

Can you provide a reproducible dockerfile or similar? I fail to see anything 
obvious here.

> [Python] Getting reference not found with ORC enabled pyarrow
> -
>
> Key: ARROW-11075
> URL: https://issues.apache.org/jira/browse/ARROW-11075
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 1.0.1
> Environment: PPC64LE
>Reporter: Kandarpa
>Priority: Major
> Attachments: arrow_cpp_build.log, arrow_python_build.log, 
> conda_list.txt
>
>
> Generated pyarrow with ORC enabled on Power using the following steps:
> {code:java}
> export ARROW_HOME=$CONDA_PREFIX
> mkdir cpp/build
> cd cpp/build
> cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>       -DCMAKE_INSTALL_LIBDIR=lib \
>       -DARROW_WITH_BZ2=ON \
>       -DARROW_WITH_ZLIB=ON \
>       -DARROW_WITH_ZSTD=ON \
>       -DARROW_WITH_LZ4=ON \
>       -DARROW_WITH_SNAPPY=ON \
>       -DARROW_WITH_BROTLI=ON \
>       -DARROW_PARQUET=ON \
>       -DARROW_PYTHON=ON \
>       -DARROW_BUILD_TESTS=ON \
>       -DARROW_CUDA=ON \
>       -DCUDA_CUDA_LIBRARY=/usr/local/cuda/lib64/stubs/libcuda.so \
>       -DARROW_ORC=ON \
>   ..
> make -j
> make install
> cd ../../python
> python setup.py build_ext --bundle-arrow-cpp --with-orc --with-cuda 
> --with-parquet bdist_wheel
> {code}
>  
>  
> With the generated whl package installed, ran CUDF tests and observed 
> following error:
> *_ERROR cudf - ImportError: 
> /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/_orc.cpython-37m-powerpc64le-linux-gnu.so:
>  undefined symbol: _ZN5arrow8adapters3orc13OR..._*
> Please find the whole error log below:
> 
>  ERRORS 
> 
>   ERROR collecting test session 
>  /conda/envs/rmm/lib/python3.7/importlib/__init__.py:127: in import_module
>      return _bootstrap._gcd_import(name[level:], package, level)
>  <frozen importlib._bootstrap>:1006: in _gcd_import
>      ???
>  <frozen importlib._bootstrap>:983: in _find_and_load
>      ???
>  <frozen importlib._bootstrap>:953: in _find_and_load_unlocked
>      ???
>  <frozen importlib._bootstrap>:219: in _call_with_frames_removed
>      ???
>  <frozen importlib._bootstrap>:1006: in _gcd_import
>      ???
>  <frozen importlib._bootstrap>:983: in _find_and_load
>      ???
>  <frozen importlib._bootstrap>:953: in _find_and_load_unlocked
>      ???
>  <frozen importlib._bootstrap>:219: in _call_with_frames_removed
>      ???
>  <frozen importlib._bootstrap>:1006: in _gcd_import
>      ???
>  <frozen importlib._bootstrap>:983: in _find_and_load
>      ???
>  <frozen importlib._bootstrap>:967: in _find_and_load_unlocked
>      ???
>  <frozen importlib._bootstrap>:677: in _load_unlocked
>      ???
>  <frozen importlib._bootstrap_external>:728: in exec_module
>      ???
>  <frozen importlib._bootstrap>:219: in _call_with_frames_removed
>      ???
>  cudf/cudf/__init__.py:60: in <module>
>      from cudf.io import (
>  cudf/cudf/io/__init__.py:8: in <module>
>      from cudf.io.orc import read_orc, read_orc_metadata, to_orc
>  cudf/cudf/io/orc.py:6: in <module>
>      from pyarrow import orc as orc
>  /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/orc.py:24: in <module>
>      import pyarrow._orc as _orc
>  {color:#de350b}E   ImportError: 
> /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/_orc.cpython-37m-powerpc64le-linux-gnu.so:
>  undefined symbol: 
> _ZN5arrow8adapters3orc13ORCFileReader4ReadEPSt10shared_ptrINS_5TableEE{color}
>  === 
> short test summary info 
> 
>  *_ERROR cudf - ImportError: 
> /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/_orc.cpython-37m-powerpc64le-linux-gnu.so:
>  undefined symbol: _ZN5arrow8adapters3orc13OR..._*
>   
> Interrupted: 1 error during collection 
> 
>  === 
> 1 error in 1.54s 
> ===
>  Fatal Python error: Segmentation fault



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-9215) pyarrow parquet writer converts uint32 columns to int64

2021-02-03 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277918#comment-17277918
 ] 

Uwe Korn commented on ARROW-9215:
-

Yes, "because it is possible to be safe" is exactly the reason for this 
inconsistency.

> pyarrow parquet writer converts uint32 columns to int64
> ---
>
> Key: ARROW-9215
> URL: https://issues.apache.org/jira/browse/ARROW-9215
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Devavret Makkar
>Assignee: Uwe Korn
>Priority: Major
>
> pyarrow parquet writer changes uint32 columns to int64. This change is not 
> made for other types and uint8, uint16, and uint64 columns retain their type.
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: import pyarrow.parquet as pq
> In [5]: df = pd.DataFrame({'a':pd.Series([1,2,3], dtype='uint32')})
> In [6]: padf = pa.Table.from_pandas(df)
> In [7]: padf
> Out[7]: 
> pyarrow.Table
> a: uint32
> In [8]: pq.write_table(padf, 'pa.parquet')
> In [9]: pq.read_table('pa.parquet')
> Out[9]: 
> pyarrow.Table
> a: int64
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-11472) [Python][CI] Kartothek integrations build is failing with numpy 1.20

2021-02-03 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-11472.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 9396
[https://github.com/apache/arrow/pull/9396]

> [Python][CI] Kartothek integrations build is failing with numpy 1.20
> 
>
> Key: ARROW-11472
> URL: https://issues.apache.org/jira/browse/ARROW-11472
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Joris Van den Bossche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> See eg https://github.com/ursacomputing/crossbow/runs/1804464537, failure 
> looks like:
> {code}
>   ERROR collecting tests/io/dask/dataframe/test_read.py 
> _
> tests/io/dask/dataframe/test_read.py:185: in 
> @pytest.mark.parametrize("col", get_dataframe_not_nested().columns)
> kartothek/core/testing.py:65: in get_dataframe_not_nested
> "unicode": pd.Series(["Ö"], dtype=np.unicode),
> /opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/series.py:335: 
> in __init__
> data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
> /opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/construction.py:480:
>  in sanitize_array
> subarr = _try_cast(data, dtype, copy, raise_cast_failure)
> /opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/construction.py:587:
>  in _try_cast
> maybe_cast_to_integer_array(arr, dtype)
> /opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/dtypes/cast.py:1723:
>  in maybe_cast_to_integer_array
> casted = np.array(arr, dtype=dtype, copy=copy)
> E   ValueError: invalid literal for int() with base 10: 'Ö'
> {code}
> So it seems that {{pd.Series(["Ö"], dtype=np.unicode)}} stopped working with 
> numpy 1.20.0
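
A minimal sketch of the replacement spelling (an assumption about the kind of 
fix, not necessarily the exact change in the pull request): {{np.unicode}} is 
just a deprecated alias for the builtin {{str}}, which NumPy 1.20 started 
warning about, so spelling the dtype with the builtin works across versions.

{code:python}
import pandas as pd

# np.unicode is a deprecated alias for the builtin str as of NumPy 1.20;
# using the builtin avoids the alias entirely.
s = pd.Series(["Ö"], dtype=str)
print(s.dtype)  # object
{code}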



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-9215) pyarrow parquet writer converts uint32 columns to int64

2021-02-03 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277907#comment-17277907
 ] 

Uwe Korn commented on ARROW-9215:
-

For uint64, we have no better option than int64 and have to live with possible 
overflows. For uint32, where some values don't fit into int32, all possible 
values definitely fit inside the int64 range, so we can avoid overflows by 
upcasting to int64.
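
To make the reasoning concrete, here is a minimal sketch of the round-trip 
described above (file name hypothetical):

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

# uint32's maximum (2**32 - 1 = 4294967295) overflows int32 but always fits
# into int64, so the upcast on write is lossless. For uint64 there is no
# wider signed type to escape to, hence it keeps its type.
tbl = pa.table({"u32": pa.array([0, 2**32 - 1], type=pa.uint32())})
pq.write_table(tbl, "u32.parquet")
print(pq.read_table("u32.parquet").schema)  # u32: int64
{code}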

> pyarrow parquet writer converts uint32 columns to int64
> ---
>
> Key: ARROW-9215
> URL: https://issues.apache.org/jira/browse/ARROW-9215
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Devavret Makkar
>Assignee: Uwe Korn
>Priority: Major
>
> pyarrow parquet writer changes uint32 columns to int64. This change is not 
> made for other types and uint8, uint16, and uint64 columns retain their type.
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: import pyarrow.parquet as pq
> In [5]: df = pd.DataFrame({'a':pd.Series([1,2,3], dtype='uint32')})
> In [6]: padf = pa.Table.from_pandas(df)
> In [7]: padf
> Out[7]: 
> pyarrow.Table
> a: uint32
> In [8]: pq.write_table(padf, 'pa.parquet')
> In [9]: pq.read_table('pa.parquet')
> Out[9]: 
> pyarrow.Table
> a: int64
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-11445) Type conversion failure on numpy 1.20

2021-01-31 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-11445.
--
Fix Version/s: 3.0.0
 Assignee: Uwe Korn
   Resolution: Duplicate

This is a known issue with wheels of older pyarrow versions; see also the 
linked issue and https://github.com/numpy/numpy/issues/17913. Please update 
your {{pyarrow}} version if you want to use the latest NumPy release.
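
As a quick sanity check after upgrading, something like the following should 
succeed (a sketch, assuming a pyarrow build that supports NumPy 1.20, such as 
the 3.0.0 wheels):

{code:python}
import numpy as np
import pandas as pd
import pyarrow as pa

print(np.__version__, pd.__version__, pa.__version__)
df = pd.DataFrame({"a": np.random.randn(3)})
# With a compatible pyarrow this conversion succeeds instead of raising
# ArrowTypeError: ('Did not pass numpy.dtype object', ...)
print(pa.Table.from_pandas(df).schema)
{code}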

> Type conversion failure on numpy 1.20
> ---
>
> Key: ARROW-11445
> URL: https://issues.apache.org/jira/browse/ARROW-11445
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 2.0.0
> Environment: Python 3.7.4
> Mac OS
>Reporter: Carlo Mazzaferro
>Assignee: Uwe Korn
>Priority: Major
> Fix For: 3.0.0
>
>
> While I have not dug deep enough into the Arrow codebase, it seems to me that 
> this is caused by the new numpy release: 
> [https://github.com/numpy/numpy/releases] 
> The issue below is in fact not observed when using numpy 1.19.*
>  
>  
>  
> {code:java}
> >>> pandas.__version__, pa.__version__, numpy.__version__
> ('1.2.1', '2.0.0', '1.20.0')
> >>> df = pandas.DataFrame({'a': numpy.random.randn(10), 'b': 
> >>> numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': 
> >>> list(range(9)) + [numpy.nan]})
> >>> pa.Table.from_pandas(df)
> Traceback (most recent call last):
>   File "", line 1, in 
> pa.Table.from_pandas(df)
>   File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas
>   File 
> "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
>  line 588, in dataframe_to_arrays
> for c, f in zip(columns_to_convert, convert_fields)]
>   File 
> "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
>  line 588, in 
> for c, f in zip(columns_to_convert, convert_fields)]
>   File 
> "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
>  line 574, in convert_column
> raise e
>   File 
> "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
>  line 568, in convert_column
> result = pa.array(col, type=type_, from_pandas=True, safe=safe)
>   File "pyarrow/array.pxi", line 292, in pyarrow.lib.array
>   File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array
>   File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type
>   File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
> pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
> failed for column a with type float64')
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-2034) [C++] Filesystem implementation for Azure Blob Store

2021-01-28 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273858#comment-17273858
 ] 

Uwe Korn commented on ARROW-2034:
-

It is just as confusing in reality. Here is what they all are (though I'm 
already a year out of date on this):

* Blob Store: Like S3, a simple but limited API
* Data Lake Gen 1: An HDFS-like deployment with a different but more 
user-friendly API / attributes
* Data Lake Gen 2: Some improvements were made to the Blob Store so that there 
is no need for a special (more expensive) Data Lake service anymore; everything 
now lives on the Blob Store. A new set of APIs was released, though, that 
exposes some nice features the initial Blob Store API didn't have. Probably for 
marketing purposes this was named Data Lake Gen 2, although technically Blob 
Store Gen 2 would have been more appropriate.

> [C++] Filesystem implementation for Azure Blob Store
> 
>
> Key: ARROW-2034
> URL: https://issues.apache.org/jira/browse/ARROW-2034
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Uwe Korn
>Priority: Major
>  Labels: filesystem
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-11372) Support RC verification on macOS-ARM64

2021-01-28 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-11372.
--
Resolution: Fixed

Issue resolved by pull request 9315
[https://github.com/apache/arrow/pull/9315]

> Support RC verification on macOS-ARM64
> --
>
> Key: ARROW-11372
> URL: https://issues.apache.org/jira/browse/ARROW-11372
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Uwe Korn
>Assignee: Uwe Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> There are some places in the verification scripts that assume an x86 system.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11390) [Python] pyarrow 3.0 issues with turbodbc

2021-01-27 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272806#comment-17272806
 ] 

Uwe Korn commented on ARROW-11390:
--

You should use {{pyarrow}} and {{turbodbc}} both from conda-forge; that is the 
most reliable way. When you install {{turbodbc}} with {{pip}}, it is built and 
the build is then cached on your system. If you change your {{pyarrow}}, you 
need to uninstall {{turbodbc}}, delete the caches and build it from source 
again; conda, on the other hand, takes care of all that for you.

{{turbodbc}} has not yet been rebuilt on conda-forge for the new Arrow version; 
it will probably be available in 3-4 hours, so just wait until then before 
running new tests.

> [Python] pyarrow 3.0 issues with turbodbc
> -
>
> Key: ARROW-11390
> URL: https://issues.apache.org/jira/browse/ARROW-11390
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: pyarrow 3.0.0
> fsspec 0.8.4
> adlfs v0.5.9
> pandas 1.2.1
> numpy 1.19.5
> turbodbc 4.1.1
>Reporter: Lance Dacey
>Priority: Major
>  Labels: python, turbodbc
>
> This is more of a turbodbc issue I think, but perhaps someone here would have 
> some idea of what changed to cause potential issues. 
> {code:java}
> cursor = connection.cursor()
> cursor.execute("select top 10 * from dbo.tickets")
> table = cursor.fetchallarrow(){code}
> I am able to run table.num_rows and it will print out 10.
> If I run table.to_pandas() or table.schema or try to write the table to a 
> dataset, my kernel dies with no explanation. I reverted back to pyarrow 2.0 
> and the same code works again.
> [https://github.com/blue-yonder/turbodbc/issues/289]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11390) [Python] pyarrow 3.0 issues with turbodbc

2021-01-26 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272204#comment-17272204
 ] 

Uwe Korn commented on ARROW-11390:
--

Did you recompile {{turbodbc}} from source after you installed the new Arrow 
version?

> [Python] pyarrow 3.0 issues with turbodbc
> -
>
> Key: ARROW-11390
> URL: https://issues.apache.org/jira/browse/ARROW-11390
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: pyarrow 3.0.0
> fsspec 0.8.4
> adlfs v0.5.9
> pandas 1.2.1
> numpy 1.19.5
> turbodbc 4.1.1
>Reporter: Lance Dacey
>Priority: Major
>  Labels: python, turbodbc
>
> This is more of a turbodbc issue I think, but perhaps someone here would have 
> some idea of what changed to cause potential issues. 
> {code:java}
> cursor = connection.cursor()
> cursor.execute("select top 10 * from dbo.tickets")
> table = cursor.fetchallarrow(){code}
> I am able to run table.num_rows and it will print out 10.
> If I run table.to_pandas() or table.schema or try to write the table to a 
> dataset, my kernel dies with no explanation. I reverted back to pyarrow 2.0 
> and the same code works again.
> [https://github.com/blue-yonder/turbodbc/issues/289]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11179) [Format] Make comments in fb files friendly to rust doc

2021-01-25 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271345#comment-17271345
 ] 

Uwe Korn commented on ARROW-11179:
--

We should be OK on the C++ side here: doxygen supports backticks as used in the 
PR, but I'm not sure whether we actually parse the generated files.

> [Format] Make comments in fb files friendly to rust doc
> ---
>
> Key: ARROW-11179
> URL: https://issues.apache.org/jira/browse/ARROW-11179
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Qingyou Meng
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: format-0ed34c83.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, comments in flatbuffer files are directly copied to the rust and 
> c++ source code.
> That's great, but there are some problems with cargo doc:
>  * array element abc[1] or link label [smith2017knl] causes `broken intra doc 
> links` warning
>  * example code/figure blocks are flatten into one line, see example [arrow 
> 2.0.0 
> doc|https://docs.rs/arrow/2.0.0/arrow/ipc/gen/SparseTensor/struct.SparseTensorIndexCSF.html#method.indptrType]
> After flatc generates them, those ipc files have to be updated manually to fix 
> the above problems.
> So I'm suggesting update flatbuffer comments to address this problem.
>  * Escape inline code with ` and `
>  * Escape text block with ```text and ```
>  * change {color:#00875a}[smith2017knl]:{color} 
> [http://shaden.io/pub-files/smith2017knl.pdf] to 
> {color:#403294}[smith2017knl]({color}{color:#403294}[http://shaden.io/pub-files/smith2017knl.pdf]){color}
> {color:#172b4d}The attachment file *format-0ed34c83.patch*{color} is created 
> with git command
> {code:java}
> git diff 0ed34c83 -p format > format-0ed34c83.patch{code}
> where *0ed34c83* is this commit: 
> {noformat}
> 0ed34c83c ARROW-9400: [Python] Do not depend on conda-forge static libraries 
> in Windows wheel builds{noformat}
> [~emkornfield] may I create a pull request for this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-11372) Support RC verification on macOS-ARM64

2021-01-25 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-11372:


 Summary: Support RC verification on macOS-ARM64
 Key: ARROW-11372
 URL: https://issues.apache.org/jira/browse/ARROW-11372
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Developer Tools
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 3.0.0


There are some places in the verification scripts that assume an x86 system.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11075) [Python] Getting reference not found with ORC enabled pyarrow

2021-01-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270419#comment-17270419
 ] 

Uwe Korn commented on ARROW-11075:
--

The latest ORC release supports shared linkage and the conda toolchain has 
been reworked to link dynamically: 
https://github.com/conda-forge/arrow-cpp-feedstock/blob/1.0.x/recipe/meta.yaml. 
The major issue here is probably that ORC 0.6.2 is built as part of the Arrow 
thirdparty toolchain while 0.6.6 headers are used during the build. I am not 
sure how this links at all, but that feels like the most likely issue to me.

> [Python] Getting reference not found with ORC enabled pyarrow
> -
>
> Key: ARROW-11075
> URL: https://issues.apache.org/jira/browse/ARROW-11075
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 1.0.1
> Environment: PPC64LE
>Reporter: Kandarpa
>Priority: Major
> Attachments: arrow_cpp_build.log, arrow_python_build.log, 
> conda_list.txt
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11075) [Python] Getting reference not found with ORC enabled pyarrow

2021-01-22 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270411#comment-17270411
 ] 

Uwe Korn commented on ARROW-11075:
--

I would guess that the issue is related to {{-DORC_SOURCE=BUNDLED}} and having 
{{orc}} installed as a conda package at the same time. Can you remove the 
{{-DORC_SOURCE=BUNDLED}} flag and do a clean build? Do you know why you have 
set that?

> [Python] Getting reference not found with ORC enabled pyarrow
> -
>
> Key: ARROW-11075
> URL: https://issues.apache.org/jira/browse/ARROW-11075
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 1.0.1
> Environment: PPC64LE
>Reporter: Kandarpa
>Priority: Major
> Attachments: arrow_cpp_build.log, arrow_python_build.log, 
> conda_list.txt
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11231) [Packaging] Add mimalloc to Linux builds

2021-01-13 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264075#comment-17264075
 ] 

Uwe Korn commented on ARROW-11231:
--

The conda packages already come with both.

> [Packaging] Add mimalloc to Linux builds
> 
>
> Key: ARROW-11231
> URL: https://issues.apache.org/jira/browse/ARROW-11231
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Packaging, Python, R
>Reporter: Weston Pace
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As a result of ARROW-11009/ARROW-11049/etc. it is now possible to switch 
> allocators at runtime.  Previously only one allocator was compiled into the 
> released builds (e.g. the PYPI version of pyarrow only included jemalloc on 
> Linux and mimalloc on Windows).  It might now make sense to compile both 
> mimalloc and jemalloc on Linux builds to allow the user to switch out to the 
> appropriate allocator at runtime if they need to.
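
For reference, a minimal sketch of what runtime allocator selection looks like 
from Python (assuming a build where the respective allocators were compiled 
in; each accessor raises if its allocator is unavailable):

{code:python}
import pyarrow as pa

# Each accessor returns a MemoryPool backed by the named allocator, or raises
# NotImplementedError if that allocator was not compiled into the build.
for name, get_pool in [
    ("system", pa.system_memory_pool),
    ("jemalloc", pa.jemalloc_memory_pool),
    ("mimalloc", pa.mimalloc_memory_pool),
]:
    try:
        print(name, get_pool().bytes_allocated())
    except NotImplementedError:
        print(name, "not available in this build")
{code}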



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-6715) [Website] Describe "non-free" component is needed for Plasma packages in install page

2021-01-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262291#comment-17262291
 ] 

Uwe Korn commented on ARROW-6715:
-

Sounds good. It wasn't obvious to me from the initial description whether we 
suddenly had a hard dependency on CUDA in Plasma. This is the setup/reasoning I 
was aware of, so everything is fine from my side 👍

> [Website] Describe "non-free" component is needed for Plasma packages in 
> install page
> -
>
> Key: ARROW-6715
> URL: https://issues.apache.org/jira/browse/ARROW-6715
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Website
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Because Plasma packages depend on the nvidia-cuda-toolkit package, which is in 
> the non-free component.
> Note that Plasma packages are available only for amd64, because the 
> nvidia-cuda-toolkit package isn't available for arm64.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-10853) [Java] Undeprecate sqlToArrow helpers

2021-01-10 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-10853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-10853.
--
Fix Version/s: (was: 3.0.0)
   Resolution: Won't Fix

> [Java] Undeprecate sqlToArrow helpers
> -
>
> Key: ARROW-10853
> URL: https://issues.apache.org/jira/browse/ARROW-10853
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java
>Affects Versions: 2.0.0
>Reporter: Uwe Korn
>Assignee: Uwe Korn
>Priority: Major
>
> These helper functions are really useful when called from Python as they deal 
> with a lot of "internals" of Java that we don't want to handle from the 
> Python side. We would rather keep using these functions.
> Note that some of them are broken due to recent refactoring and only return 
> 1024 rows (the default iterator size) without the ability to change that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-10853) [Java] Undeprecate sqlToArrow helpers

2021-01-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-10853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262274#comment-17262274
 ] 

Uwe Korn commented on ARROW-10853:
--

After using the non-deprecated interface for a bit, I think we can close this. 
It was a bit fiddly at first, but after reading over the code three times I 
came up with okish-looking and not too complex Python/Java code to use it.

Cross-reference: I use this in 
https://uwekorn.com/2020/12/30/fast-jdbc-revisited.html 

> [Java] Undeprecate sqlToArrow helpers
> -
>
> Key: ARROW-10853
> URL: https://issues.apache.org/jira/browse/ARROW-10853
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java
>Affects Versions: 2.0.0
>Reporter: Uwe Korn
>Assignee: Uwe Korn
>Priority: Major
> Fix For: 3.0.0
>
>
> These helper functions are really useful when called from Python as they deal 
> with a lot of "internals" of Java that we don't want to handle from the 
> Python side. We would rather keep using these functions.
> Note that some of them are broken due to recent refactoring and only return 
> 1024 rows (the default iterator size) without the ability to change that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-10744) [Python] Enable wheel deployment for Mac OS 11 Big Sur

2021-01-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-10744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262268#comment-17262268
 ] 

Uwe Korn commented on ARROW-10744:
--

Can you update to the latest {{pip}} version and try again? I heard there was a 
bug in {{pip}} that prevented the wheels built on 10.x from being used. They 
are compatible and should be picked up correctly with the latest {{pip}} 
release.

> [Python] Enable wheel deployment for Mac OS 11 Big Sur
> --
>
> Key: ARROW-10744
> URL: https://issues.apache.org/jira/browse/ARROW-10744
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: David de L.
>Priority: Major
>
> It is currently quite tricky to get pyarrow to build on latest Mac 
> distributions.
> Since GitHub runners 
> [support|https://docs.github.com/en/free-pro-team@latest/actions/reference/specifications-for-github-hosted-runners#supported-runners-and-hardware-resources]
>  Mac 11.0 Big Sur, could wheels be built for this OS in CD?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11075) [Python] Getting reference not found with ORC enabled pyarrow

2021-01-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262210#comment-17262210
 ] 

Uwe Korn commented on ARROW-11075:
--

Can you post the output of {{conda list}} and the build logs for the C++ and 
Python part of Arrow? Without these three it will be hard to debug.

> [Python] Getting reference not found with ORC enabled pyarrow
> -
>
> Key: ARROW-11075
> URL: https://issues.apache.org/jira/browse/ARROW-11075
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 1.0.1
> Environment: PPC64LE
>Reporter: Kandarpa
>Priority: Major
>
> Generated the pyarrow with OCR enabled on Power using following steps:
> {code:java}
> export ARROW_HOME=$CONDA_PREFIX
> mkdir cpp/build
> cd cpp/build
> cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>       -DCMAKE_INSTALL_LIBDIR=lib \
>       -DARROW_WITH_BZ2=ON \
>       -DARROW_WITH_ZLIB=ON \
>       -DARROW_WITH_ZSTD=ON \
>       -DARROW_WITH_LZ4=ON \
>       -DARROW_WITH_SNAPPY=ON \
>       -DARROW_WITH_BROTLI=ON \
>       -DARROW_PARQUET=ON \
>       -DARROW_PYTHON=ON \
>       -DARROW_BUILD_TESTS=ON \
>       -DARROW_CUDA=ON \
>       -DCUDA_CUDA_LIBRARY=/usr/local/cuda/lib64/stubs/libcuda.so \
>       -DARROW_ORC=ON \
>   ..
> make -j
> make install
> cd ../../python
> python setup.py build_ext --bundle-arrow-cpp --with-orc --with-cuda 
> --with-parquet bdist_wheel
> {code}
>  
>  
> With the generated whl package installed, ran CUDF tests and observed 
> following error:
> *_ERROR cudf - ImportError: 
> /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/_orc.cpython-37m-powerpc64le-linux-gnu.so:
>  undefined symbol: _ZN5arrow8adapters3orc13OR..._*
> Please find the whole error log below:
> =================================== ERRORS ====================================
> _______________________ ERROR collecting test session ________________________
> /conda/envs/rmm/lib/python3.7/importlib/__init__.py:127: in import_module
>     return _bootstrap._gcd_import(name[level:], package, level)
> <frozen importlib._bootstrap>:1006: in _gcd_import
>     ???
> <frozen importlib._bootstrap>:983: in _find_and_load
>     ???
> <frozen importlib._bootstrap>:953: in _find_and_load_unlocked
>     ???
> <frozen importlib._bootstrap>:219: in _call_with_frames_removed
>     ???
> <frozen importlib._bootstrap>:1006: in _gcd_import
>     ???
> <frozen importlib._bootstrap>:983: in _find_and_load
>     ???
> <frozen importlib._bootstrap>:953: in _find_and_load_unlocked
>     ???
> <frozen importlib._bootstrap>:219: in _call_with_frames_removed
>     ???
> <frozen importlib._bootstrap>:1006: in _gcd_import
>     ???
> <frozen importlib._bootstrap>:983: in _find_and_load
>     ???
> <frozen importlib._bootstrap>:967: in _find_and_load_unlocked
>     ???
> <frozen importlib._bootstrap>:677: in _load_unlocked
>     ???
> <frozen importlib._bootstrap_external>:728: in exec_module
>     ???
> <frozen importlib._bootstrap>:219: in _call_with_frames_removed
>     ???
> cudf/cudf/__init__.py:60: in <module>
>     from cudf.io import (
> cudf/cudf/io/__init__.py:8: in <module>
>     from cudf.io.orc import read_orc, read_orc_metadata, to_orc
> cudf/cudf/io/orc.py:6: in <module>
>     from pyarrow import orc as orc
> /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/orc.py:24: in <module>
>     import pyarrow._orc as _orc
> {color:#de350b}E   ImportError: /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/_orc.cpython-37m-powerpc64le-linux-gnu.so: undefined symbol: _ZN5arrow8adapters3orc13ORCFileReader4ReadEPSt10shared_ptrINS_5TableEE{color}
> =========================== short test summary info ===========================
> *_ERROR cudf - ImportError: /conda/envs/rmm/lib/python3.7/site-packages/pyarrow/_orc.cpython-37m-powerpc64le-linux-gnu.so: undefined symbol: _ZN5arrow8adapters3orc13OR..._*
> !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!
> =============================== 1 error in 1.54s ===============================
> Fatal Python error: Segmentation fault



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-11198) [Packaging][Python] Ensure setuptools version during build supports markdown

2021-01-10 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-11198:


 Summary: [Packaging][Python] Ensure setuptools version during 
build supports markdown
 Key: ARROW-11198
 URL: https://issues.apache.org/jira/browse/ARROW-11198
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging, Python
Reporter: Uwe Korn
Assignee: Uwe Korn
 Fix For: 3.0.0


We use a {{text/markdown}} long description and thus should always build/upload 
with at least setuptools 38.6.
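A minimal sketch of what this means on the packaging side (where exactly the 
pin lives is up to the build scripts):

{code:python}
# setup.py (sketch): long_description_content_type is only carried into
# the package metadata by setuptools >= 38.6.0; with older versions PyPI
# falls back to rendering the description as plain text.
from setuptools import setup

with open("README.md") as f:
    long_description = f.read()

setup(
    name="pyarrow",
    long_description=long_description,
    long_description_content_type="text/markdown",
)
{code}

The corresponding pin is then {{setuptools>=38.6.0}} in the build requirements, 
e.g. in the {{build-system.requires}} list of {{pyproject.toml}}.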



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-10180) [C++][Doc] Update dependency management docs following aws-sdk-cpp addition

2021-01-10 Thread Uwe Korn (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Korn resolved ARROW-10180.
--
Resolution: Fixed

Issue resolved by pull request 9150
[https://github.com/apache/arrow/pull/9150]
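For reference, the per-dependency source selection that the updated docs 
describe can be exercised roughly like this (a sketch; the exact flag set 
depends on the build):

{code:bash}
# Build the AWS SDK from source ("bundled"); its own system prerequisites
# (e.g. libcurl and openssl on Linux) still have to come from the host.
cmake .. \
      -DARROW_S3=ON \
      -DAWSSDK_SOURCE=BUNDLED
{code}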

> [C++][Doc] Update dependency management docs following aws-sdk-cpp addition
> ---
>
> Key: ARROW-10180
> URL: https://issues.apache.org/jira/browse/ARROW-10180
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Documentation
>Reporter: Neal Richardson
>Assignee: Kouhei Sutou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> https://arrow.apache.org/docs/developers/cpp/building.html#build-dependency-management
>  needs updating after (esp.) ARROW-10068. aws-sdk-cpp, for example, can be 
> "bundled" but still has system dependencies that cannot be.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-6715) [Website] Describe "non-free" component is needed for Plasma packages in install page

2021-01-10 Thread Uwe Korn (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262208#comment-17262208
 ] 

Uwe Korn commented on ARROW-6715:
-

Isn't that an optional dependency? Or is this issue for the deb/rpm packages?
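If it does end up being a hard dependency on the deb side, users would also 
need the component enabled before installing, along these lines (release name 
illustrative):

{code:bash}
# /etc/apt/sources.list -- nvidia-cuda-toolkit lives in non-free, so the
# component has to be enabled for the Plasma packages to be installable.
deb http://deb.debian.org/debian buster main contrib non-free
{code}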

> [Website] Describe "non-free" component is needed for Plasma packages in 
> install page
> -
>
> Key: ARROW-6715
> URL: https://issues.apache.org/jira/browse/ARROW-6715
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Website
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Because the Plasma packages depend on the nvidia-cuda-toolkit package, which is 
> in the non-free component.
> Note that the Plasma packages are available only for amd64, because the 
> nvidia-cuda-toolkit package isn't available for arm64.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

