Re: [Rust] Adding owners to crates.io for arrow and parquet crates

2019-01-26 Thread Uwe L. Korn
Also for me https://crates.io/users/xhochy, please On Sat, Jan 19, 2019, at 10:33 PM, Krisztián Szűcs wrote: > Me too please: https://crates.io/users/kszucs > > Thanks, Krisztian > > On Sat, Jan 19, 2019 at 10:18 PM Kouhei Sutou wrote: > > > Could you add me? > > > > Here is my account: https:

[jira] [Created] (ARROW-4374) [C++] DictionaryBuilder does not correctly report length and null_count

2019-01-25 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4374: -- Summary: [C++] DictionaryBuilder does not correctly report length and null_count Key: ARROW-4374 URL: https://issues.apache.org/jira/browse/ARROW-4374 Project: Apache

[jira] [Created] (ARROW-4367) [C++] StringDictionaryBuilder segfaults on Finish with only null entries

2019-01-25 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4367: -- Summary: [C++] StringDictionaryBuilder segfaults on Finish with only null entries Key: ARROW-4367 URL: https://issues.apache.org/jira/browse/ARROW-4367 Project: Apache

[jira] [Created] (ARROW-4362) [Java] Test OpenJDK 11 in CI

2019-01-24 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4362: -- Summary: [Java] Test OpenJDK 11 in CI Key: ARROW-4362 URL: https://issues.apache.org/jira/browse/ARROW-4362 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-4360) [C++] Query homebrew for Thrift

2019-01-24 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4360: -- Summary: [C++] Query homebrew for Thrift Key: ARROW-4360 URL: https://issues.apache.org/jira/browse/ARROW-4360 Project: Apache Arrow Issue Type: Bug

Re: [VOTE] Accept donation of Rust DataFusion library for Apache Arrow

2019-01-24 Thread Uwe L. Korn
+1 (binding) as the Rust community seems to support this. Uwe On Thu, Jan 24, 2019, at 7:45 AM, Melik-Adamyan, Areg wrote: > +1 (non-binding) > > Is there a plan for C++ API? > > -Original Message- > From: Renjie Liu [mailto:liurenjie2...@gmail.com] > Sent: Wednesday, January 23, 2019

[jira] [Created] (ARROW-4356) [CI] Add integration (docker) test for turbodbc

2019-01-24 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4356: -- Summary: [CI] Add integration (docker) test for turbodbc Key: ARROW-4356 URL: https://issues.apache.org/jira/browse/ARROW-4356 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-4355) [C++] test-util functions are no longer part of libarrow

2019-01-24 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4355: -- Summary: [C++] test-util functions are no longer part of libarrow Key: ARROW-4355 URL: https://issues.apache.org/jira/browse/ARROW-4355 Project: Apache Arrow

[jira] [Created] (ARROW-4322) [CI] docker nightlies fails after conda-forge compiler migration

2019-01-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4322: -- Summary: [CI] docker nightlies fails after conda-forge compiler migration Key: ARROW-4322 URL: https://issues.apache.org/jira/browse/ARROW-4322 Project: Apache Arrow

Re: Confluence Edit Access (build verification instructions)?

2019-01-21 Thread Uwe L. Korn
I guess it's the same as everywhere else. Gave you the appropriate rights. On Mon, Jan 21, 2019, at 7:23 PM, Uwe L. Korn wrote: > Hello Micah, > > what's your username on confluence? > > Uwe > > On Mon, Jan 21, 2019, at 7:21 PM, Micah Kornfield wrote: > >

Re: Confluence Edit Access (build verification instructions)?

2019-01-21 Thread Uwe L. Korn
Hello Micah, what's your username on confluence? Uwe On Mon, Jan 21, 2019, at 7:21 PM, Micah Kornfield wrote: > I ran into an issue running release verification on ubuntu 18.04 (I think > "jq" needs to be installed with apt-get). I wanted to update the > confluence page [1], but I don't see an

[jira] [Created] (ARROW-4303) [Gandiva/Python] Build LLVM with RTTI in manylinux1 container

2019-01-20 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4303: -- Summary: [Gandiva/Python] Build LLVM with RTTI in manylinux1 container Key: ARROW-4303 URL: https://issues.apache.org/jira/browse/ARROW-4303 Project: Apache Arrow

[jira] [Created] (ARROW-4298) [Java] Building Flight fails with OpenJDK 11

2019-01-19 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4298: -- Summary: [Java] Building Flight fails with OpenJDK 11 Key: ARROW-4298 URL: https://issues.apache.org/jira/browse/ARROW-4298 Project: Apache Arrow Issue Type

Re: [RESULT] [VOTE] Release Apache Arrow 0.12.0 RC4

2019-01-19 Thread Uwe L. Korn
I can help with the conda-forge packages. Uwe On Sat, Jan 19, 2019, at 5:21 PM, Wes McKinney wrote: > The vote carries with 3 binding +1 votes. Thanks to everyone for > helping verify the release > > There are a number of post-release tasks in > > https://cwiki.apache.org/confluence/display/ARR

[jira] [Created] (ARROW-4291) [Dev] Support selecting features in release scripts

2019-01-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4291: -- Summary: [Dev] Support selecting features in release scripts Key: ARROW-4291 URL: https://issues.apache.org/jira/browse/ARROW-4291 Project: Apache Arrow Issue

[jira] [Created] (ARROW-4290) [C++/Gandiva] Support detecting correct LLVM version in Homebrew

2019-01-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4290: -- Summary: [C++/Gandiva] Support detecting correct LLVM version in Homebrew Key: ARROW-4290 URL: https://issues.apache.org/jira/browse/ARROW-4290 Project: Apache Arrow

[jira] [Created] (ARROW-4289) [C++] Forward AR and RANLIB to thirdparty builds

2019-01-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4289: -- Summary: [C++] Forward AR and RANLIB to thirdparty builds Key: ARROW-4289 URL: https://issues.apache.org/jira/browse/ARROW-4289 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-4287) [C++] Ensure minimal bison version on OSX for Thrift

2019-01-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4287: -- Summary: [C++] Ensure minimal bison version on OSX for Thrift Key: ARROW-4287 URL: https://issues.apache.org/jira/browse/ARROW-4287 Project: Apache Arrow Issue

[jira] [Created] (ARROW-4286) [C++/R] Namespace vendored Boost

2019-01-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4286: -- Summary: [C++/R] Namespace vendored Boost Key: ARROW-4286 URL: https://issues.apache.org/jira/browse/ARROW-4286 Project: Apache Arrow Issue Type: New Feature

Re: Benchmarking dashboard proposal

2019-01-18 Thread Uwe L. Korn
Hello, note that we have (had?) the Python benchmarks continuously running and reported at https://pandas.pydata.org/speed/arrow/. Seems like this stopped in July 2018. Uwe On Fri, Jan 18, 2019, at 9:23 AM, Antoine Pitrou wrote: > > Hi Areg, > > That sounds like a good idea to me. Note our be

[jira] [Created] (ARROW-4265) [C++] Automatic conversion between Table and std::vector>

2019-01-15 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4265: -- Summary: [C++] Automatic conversion between Table and std::vector> Key: ARROW-4265 URL: https://issues.apache.org/jira/browse/ARROW-4265 Project: Apache Ar

Re: Move arrow-site.git to gitbox

2019-01-15 Thread Uwe L. Korn
Created https://issues.apache.org/jira/browse/INFRA-17655 On Thu, Jan 3, 2019, at 6:07 PM, Krisztián Szűcs wrote: > +1 > > On Thu, Jan 3, 2019 at 4:34 PM Wes McKinney wrote: > > > +1 > > > > On Thu, Jan 3, 2019, 9:22 AM Uwe L. Korn > > > > Hello, >

Re: Compiling Arrow for RaspberryPi

2019-01-09 Thread Uwe L. Korn
Hello Suvayu, for arrow-cpp it is definitely possible to cross-compile on the desktop as it is using standard CMake for the build. There are a lot of guides available for doing cross compilation with CMake. This may work but I would expect that in some places we're probably not passing all flags t

Re: RecordBatchFile with no batches, Error: Pyarrow.lib.ArrowInvalid: File is smaller than indicated metadata size.

2019-01-09 Thread Uwe L. Korn
Hello Ryan, for CentOS and pip, I would recommend using the docker scripts that we use to build the manylinux1 compatible wheels (the ones we also upload to PyPI): https://github.com/apache/arrow/tree/master/python/manylinux1 They will bootstrap an isolated environment in docker that is indepe

[jira] [Created] (ARROW-4210) [Python] Mention boost-cpp directly in the conda meta.yaml for pyarrow

2019-01-09 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4210: -- Summary: [Python] Mention boost-cpp directly in the conda meta.yaml for pyarrow Key: ARROW-4210 URL: https://issues.apache.org/jira/browse/ARROW-4210 Project: Apache

[jira] [Created] (ARROW-4191) [C++] Use same CC and AR for jemalloc as for the main sources

2019-01-08 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4191: -- Summary: [C++] Use same CC and AR for jemalloc as for the main sources Key: ARROW-4191 URL: https://issues.apache.org/jira/browse/ARROW-4191 Project: Apache Arrow

Re: [Rust] crate versions and release process

2019-01-06 Thread Uwe L. Korn
This is definitely possible for Apache projects. Currently we still have two releases: "Arrow without JS" and "Arrow JS". We can have separate release votes for security and small fixes for subcrates. There are mainly two things that "limit us": 1. We still need to do the release votes (there a

Re: Arrow Rust roadmapping [was Re: [Gandiva] Representing logical query plans in protobuf]

2019-01-06 Thread Uwe L. Korn
Hello Andy, one thing that we had in discussions in the past and also opened me up a bit to the parquet-cpp merge is that merging code into a repo doesn't mean that it will reside always there. Apache has the infrastructure and guidelines to split a part of a project into a separate one. This i

Re: Building arrow using Xcode on Mac OS

2019-01-04 Thread Uwe L. Korn
Hello Hatem, I don't know of anyone that has used Xcode to build Arrow yet. We're normally using `-GNinja` or the default make generator to build it. As I have a Mac, I'll have a look at this but "cmake -G Xcode" is not running for me at the moment. To help us debug this, can you open a JIRA on

Re: Enabling an installation path for Arrow R users

2019-01-03 Thread Uwe L. Korn
We probably need to support both, conda-forge and CRAN. As a first shot, conda-forge will be much easier to set up as we should have a better build toolchain available there and this could also then be used in the multilanguage scenario demos really well. From my experience, the usage of conda in

Move arrow-site.git to gitbox

2019-01-03 Thread Uwe L. Korn
Hello, as requested per the mail from ASF infra, I would like to move the arrow-site git repo to gitbox. This is the repo used for the distribution of the rendered version of the website. Some +1s or a point why we should consider alternatives would help to bring this forward. If there is conse

[jira] [Created] (ARROW-4129) [Python] Fix syntax problem in benchmark docs

2018-12-28 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4129: -- Summary: [Python] Fix syntax problem in benchmark docs Key: ARROW-4129 URL: https://issues.apache.org/jira/browse/ARROW-4129 Project: Apache Arrow Issue Type

Re: C++ documentation overhaul

2018-12-27 Thread Uwe L. Korn
I also see this problem. This is due to the underlying filesystem on macOS being case insensitive. The fix is to make your file system case sensitive (this is possible but takes a while). We have two generated files, pyarrow.array.rst and pyarrow.Array.rst. For me the latter is the one that relia
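The clash described above can be spotted before it bites on a case-insensitive filesystem. The following is a minimal sketch, not part of the Arrow docs build: `find_case_collisions` is a hypothetical helper, and the file list is illustrative (only the first two names come from the message).

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group file names that differ only by letter case (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        # casefold() gives a case-insensitive key, similar to what a
        # case-insensitive filesystem effectively compares.
        groups[name.casefold()].append(name)
    # Keep only keys that more than one spelling maps to.
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Illustrative list; the third name is made up for the example.
generated = ["pyarrow.array.rst", "pyarrow.Array.rst", "pyarrow.Table.rst"]
collisions = find_case_collisions(generated)
print(collisions)
```

Any group reported here would end up as a single file on a case-insensitive volume, so one of the two generated pages silently overwrites the other.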

[jira] [Created] (ARROW-4107) [Python] Use ninja in pyarrow manylinux1 build

2018-12-23 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4107: -- Summary: [Python] Use ninja in pyarrow manylinux1 build Key: ARROW-4107 URL: https://issues.apache.org/jira/browse/ARROW-4107 Project: Apache Arrow Issue Type

Re: How to append to parquet file periodically and read intermediate data - pyarrow.lib.ArrowIOError: Invalid parquet file. Corrupt footer.

2018-12-19 Thread Uwe L. Korn
>>> what Uwe suggests is usually the way to go, your active process writes to a >>> new file every time. Then you have a parallel process/thread that does >>> compaction of smaller files in the background such that you don't have too >>> many files.

Re: How to append to parquet file periodically and read intermediate data - pyarrow.lib.ArrowIOError: Invalid parquet file. Corrupt footer.

2018-12-19 Thread Uwe L. Korn
Hello Darren, you're out of luck here. Parquet files are immutable and meant for batch writes. Once they're written you cannot modify them anymore. To load them, you need to know their metadata which is in the footer. The footer is always at the end of the file and written once you call close.
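To illustrate why a file that is still being written cannot be read, here is a small sketch that only inspects the tail of a file. It relies on the Parquet layout in which a file ends with the metadata, a 4-byte little-endian metadata length, and the magic bytes `PAR1`. The function name `has_complete_footer` and the placeholder payloads are assumptions made for this example; the check does not parse the metadata itself.

```python
import io
import struct

MAGIC = b"PAR1"


def has_complete_footer(stream):
    """Return True if the stream ends with a plausible Parquet footer.

    Hypothetical helper: it checks only the trailing length field and
    magic bytes, not the Thrift-encoded metadata in front of them.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    # Smallest possible file: leading magic + 4-byte length + trailing magic.
    if size < 12:
        return False
    stream.seek(size - 8)
    tail = stream.read(8)
    footer_len = struct.unpack("<I", tail[:4])[0]
    # The footer has to fit between the leading magic and the 8-byte tail.
    return tail[4:] == MAGIC and footer_len <= size - 12


# Placeholder bytes, not real Parquet data: a "closed" file with a 4-byte
# fake footer, and a file whose writer has not called close yet.
closed = io.BytesIO(MAGIC + b"data" + b"meta" + struct.pack("<I", 4) + MAGIC)
in_progress = io.BytesIO(MAGIC + b"data")
print(has_complete_footer(closed), has_complete_footer(in_progress))
```

A reader that sees no trailing magic has no way to locate the metadata, which is consistent with the "Corrupt footer" error in the subject line.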

Re: Reviewing PRs (was: Re: Arrow sync call)

2018-12-19 Thread Uwe L. Korn
+1, I would also like to see them in Sphinx. Uwe > On 19.12.2018 at 11:13, Antoine Pitrou wrote: > > > We should decide where we want to put developer docs. > > I would favour putting them in the Sphinx docs, personally. > > Regards > > Antoine. > > >> On 19/12/2018 at 02:20, Wes McKinne

[jira] [Created] (ARROW-4054) [Python] Update gtest, flatbuffers and OpenSSL in manylinux1 base image

2018-12-17 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4054: -- Summary: [Python] Update gtest, flatbuffers and OpenSSL in manylinux1 base image Key: ARROW-4054 URL: https://issues.apache.org/jira/browse/ARROW-4054 Project: Apache

[jira] [Created] (ARROW-3995) [CI] Use understandable names in Travis Matrix

2018-12-11 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3995: -- Summary: [CI] Use understandable names in Travis Matrix Key: ARROW-3995 URL: https://issues.apache.org/jira/browse/ARROW-3995 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3972) [Gandiva] Update to LLVM 7

2018-12-09 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3972: -- Summary: [Gandiva] Update to LLVM 7 Key: ARROW-3972 URL: https://issues.apache.org/jira/browse/ARROW-3972 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-3944) [Python] Build manylinux1 docker image directly in the CI

2018-12-05 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3944: -- Summary: [Python] Build manylinux1 docker image directly in the CI Key: ARROW-3944 URL: https://issues.apache.org/jira/browse/ARROW-3944 Project: Apache Arrow

[jira] [Created] (ARROW-3932) [Python/Documentation] Include Benchmarks.md in Sphinx docs

2018-12-03 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3932: -- Summary: [Python/Documentation] Include Benchmarks.md in Sphinx docs Key: ARROW-3932 URL: https://issues.apache.org/jira/browse/ARROW-3932 Project: Apache Arrow

Re: [VOTE] Accept donation of Rust Parquet implementation

2018-12-01 Thread Uwe L. Korn
+1, nice to see this joining the Apache community Uwe > On 01.12.2018 at 10:16, Antoine Pitrou wrote: > > >> On 01/12/2018 at 00:50, Wes McKinney wrote: >> >> This vote is to determine if the Arrow PMC is in favor of accepting >> this donation. If the vote passes, the PMC and the authors

[jira] [Created] (ARROW-3834) [Doc] Merge Python & C++ and move to top-level

2018-11-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3834: -- Summary: [Doc] Merge Python & C++ and move to top-level Key: ARROW-3834 URL: https://issues.apache.org/jira/browse/ARROW-3834 Project: Apache Arrow Issue

[jira] [Created] (ARROW-3829) [Python] Support protocols to extract Arrow objects from third-party classes

2018-11-17 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3829: -- Summary: [Python] Support protocols to extract Arrow objects from third-party classes Key: ARROW-3829 URL: https://issues.apache.org/jira/browse/ARROW-3829 Project

[jira] [Created] (ARROW-3767) [C++] Add cast for Null to any type

2018-11-12 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3767: -- Summary: [C++] Add cast for Null to any type Key: ARROW-3767 URL: https://issues.apache.org/jira/browse/ARROW-3767 Project: Apache Arrow Issue Type: Improvement

Re: [ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Uwe L. Korn
Congratulations Krisztián! On Thu, Nov 8, 2018, at 9:56 PM, Philipp Moritz wrote: > Congrats and welcome Krisztián! > > On Thu, Nov 8, 2018 at 11:48 AM Wes McKinney wrote: > > > The Project Management Committee (PMC) for Apache Arrow has invited > > Krisztián Szűcs to become a PMC member and we

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Uwe L. Korn
Welcome to all of you! On Thu, Nov 8, 2018, at 8:56 PM, Wes McKinney wrote: > On behalf of the Arrow PMC, I'm happy to announce that Romain > François, Sebastien Binet, and Yosuke Shiro have been invited to be > committers on the project. > > Welcome, and thanks for your contributions!

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Uwe L. Korn
Hello Randy, you are looking for https://arrow.apache.org/docs/python/generated/pyarrow.foreign_buffer.html#pyarrow.foreign_buffer This takes an address, size and a Python object for having a reference on the object. In your case the last one can be None. Note that this will not do a copy and

[jira] [Created] (ARROW-3711) [C++] Don't pass CXX_FLAGS to C_FLAGS

2018-11-06 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3711: -- Summary: [C++] Don't pass CXX_FLAGS to C_FLAGS Key: ARROW-3711 URL: https://issues.apache.org/jira/browse/ARROW-3711 Project: Apache Arrow Issue Type

Re: Help by following "parquet" and "pyarrow" tags on StackOverflow

2018-11-06 Thread Uwe L. Korn
We also have an `apache-arrow` tag on StackOverflow. I was only following this and not pyarrow. Note that you can set up email notifications for these tags at https://stackexchange.com/filters Cheers Uwe On Tue, Nov 6, 2018, at 10:06 AM, Wes McKinney wrote: > hi folks, > > We are getting a lot

Re: Encoding options (delta, rle, ...) in pyarrow bindings

2018-11-02 Thread Uwe L. Korn
r Delta encoding in the Arrow columnar format. I suspect > > this will eventually be added as it can be quite important to improve > > in-memory query execution performance. > > > > Wes > > > > On Fri, Nov 2, 2018, 2:18 PM Uwe L. Korn > > > > Hello S

Re: Encoding options (delta, rle, ...) in pyarrow bindings

2018-11-02 Thread Uwe L. Korn
Hello Sebastian, currently you can only switch between plain and dictionary-encoding-combined-with-run-length encoding using the `use_dictionary` flag on https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html#pyarrow.parquet.write_table . Other encodings are yet only im

[jira] [Created] (ARROW-3670) [C++] Use FindBacktrace to find execinfo.h support

2018-11-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3670: -- Summary: [C++] Use FindBacktrace to find execinfo.h support Key: ARROW-3670 URL: https://issues.apache.org/jira/browse/ARROW-3670 Project: Apache Arrow Issue

[jira] [Created] (ARROW-3642) [C++] Add arrowConfig.cmake generation

2018-10-28 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3642: -- Summary: [C++] Add arrowConfig.cmake generation Key: ARROW-3642 URL: https://issues.apache.org/jira/browse/ARROW-3642 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3641) [C++/Python] remove public keyword from Cython api functions

2018-10-28 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3641: -- Summary: [C++/Python] remove public keyword from Cython api functions Key: ARROW-3641 URL: https://issues.apache.org/jira/browse/ARROW-3641 Project: Apache Arrow

[jira] [Created] (ARROW-3610) [C++] Add interface to turn stl_allocator into arrow::MemoryPool

2018-10-24 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3610: -- Summary: [C++] Add interface to turn stl_allocator into arrow::MemoryPool Key: ARROW-3610 URL: https://issues.apache.org/jira/browse/ARROW-3610 Project: Apache Arrow

Re: [RESULT] [VOTE] Release Apache Arrow 0.11.1 (RC0)

2018-10-23 Thread Uwe L. Korn
I'll take care of > * Upload the new wheels to PyPI > * Update the conda packages Uwe

Re: [VOTE] Release Apache Arrow 0.11.1 (RC0)

2018-10-21 Thread Uwe L. Korn
+1 (binding) Ran the verification script on OSX, had the same Plasma failures in Python as in the 0.11 vote and thus not considering them as critical. On Sun, Oct 21, 2018, at 11:15 PM, Krisztián Szűcs wrote: > I can't run the verification script right now, but I've followed the > changes, and it's

[jira] [Created] (ARROW-3583) [Python/Java] Create RecordBatch from VectorSchemaRoot

2018-10-21 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3583: -- Summary: [Python/Java] Create RecordBatch from VectorSchemaRoot Key: ARROW-3583 URL: https://issues.apache.org/jira/browse/ARROW-3583 Project: Apache Arrow

Re: Making a bugfix 0.11.1 release

2018-10-20 Thread Uwe L. Korn
I have triggered the wheel builds on my crossbow repo with build-25, feel free to use them. Uwe On Sat, Oct 20, 2018, at 3:52 PM, Wes McKinney wrote: > I'm having problems with Crossbow. I am going to try a few things > (going through the setup process "from scratch" -- new tokens, new > local r

[jira] [Created] (ARROW-3565) [Python] Pin tensorflow to 1.11.0 in manylinux1 container

2018-10-19 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3565: -- Summary: [Python] Pin tensorflow to 1.11.0 in manylinux1 container Key: ARROW-3565 URL: https://issues.apache.org/jira/browse/ARROW-3565 Project: Apache Arrow

Re: [VOTE] Accept donation of Ruby bindings to Parquet GLib

2018-10-18 Thread Uwe L. Korn
+1 > On 18.10.2018 at 22:59, Wes McKinney wrote: > > hello, > > Kouhei Sutou is proposing to donate Ruby bindings to the Parquet GLib > library, which was received as a donation in September. This Ruby > library was originally developed at > > https://github.com/red-data-tools/red-parquet/ >

[jira] [Created] (ARROW-3535) [Python] pip install tensorflow install too new numpy in manylinux1 build

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3535: -- Summary: [Python] pip install tensorflow install too new numpy in manylinux1 build Key: ARROW-3535 URL: https://issues.apache.org/jira/browse/ARROW-3535 Project: Apache

[jira] [Created] (ARROW-3534) [Python] Update zlib library in manylinux1 image

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3534: -- Summary: [Python] Update zlib library in manylinux1 image Key: ARROW-3534 URL: https://issues.apache.org/jira/browse/ARROW-3534 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3533) [Python/Documentation] Use sphinx_rtd_theme instead of Bootstrap

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3533: -- Summary: [Python/Documentation] Use sphinx_rtd_theme instead of Bootstrap Key: ARROW-3533 URL: https://issues.apache.org/jira/browse/ARROW-3533 Project: Apache Arrow

[jira] [Created] (ARROW-3530) [Java/Python] Add conversion for pyarrow.Schema from org.apache…pojo.Schema

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3530: -- Summary: [Java/Python] Add conversion for pyarrow.Schema from org.apache…pojo.Schema Key: ARROW-3530 URL: https://issues.apache.org/jira/browse/ARROW-3530 Project

Re: [VOTE] Accept donation of Arrow C# .NET implementation

2018-10-15 Thread Uwe L. Korn
+1 On Mon, Oct 15, 2018, at 5:27 PM, Wes McKinney wrote: > hi folks, > > Individuals from Feyen Zylstra LLC have developed a C# implementation > of Apache Arrow and are proposing to donate it to the Apache project, > as discussed on the mailing list > > https://github.com/feyenzylstra/apache-arr

Re: [DRAFT] Apache Arrow board report October 2018

2018-10-11 Thread Uwe L. Korn
You could also mention that we are about to receive a C# donation. Otherwise this looks good. Uwe On Thu, Oct 11, 2018, at 6:05 PM, Wes McKinney wrote: > ## Description: > > Apache Arrow is a cross-language development platform for in-memory data. It > specifies a standardized language-independ

Re: parquet-column_scanner-test failure

2018-10-11 Thread Uwe L. Korn
ering > EEMCS, TU Delft, The Netherlands > ____________ > From: Uwe L. Korn [uw...@xhochy.com] > Sent: Thursday, October 11, 2018 2:43 PM > To: dev@arrow.apache.org > Subject: Re: parquet-column_scanner-test failure > > Hello Tanveer, > > your attachment did

Re: parquet-column_scanner-test failure

2018-10-11 Thread Uwe L. Korn
Hello Tanveer, your attachment did not come through as attachments are not allowed on the mailing list. Can you post it somewhere? Uwe On Thu, Oct 11, 2018, at 12:33 PM, Tanveer Ahmad - EWI wrote: > Hi, > > I enabled following flags and got error in the attachment (parquet- > column_scanner-t

[jira] [Created] (ARROW-3482) [C++] Build with JEMALLOC by default

2018-10-10 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3482: -- Summary: [C++] Build with JEMALLOC by default Key: ARROW-3482 URL: https://issues.apache.org/jira/browse/ARROW-3482 Project: Apache Arrow Issue Type

Re: [VOTE] Release Apache Arrow 0.11.0 (RC1)

2018-10-08 Thread Uwe L. Korn
+1 (binding) I'm quite uncomfortable with the number of breakages but I think that at this size of the project it will be inevitable that we will still have some minor problems in the release. On Mon, Oct 8, 2018, at 6:39 AM, Bryan Cutler wrote: > +1 (non-binding) > > I ran tests for C++, Pyth

Re: [JIRA] -ARROW-1780 - JDBC Adapter - resolved.

2018-10-05 Thread Uwe L. Korn
Hello Atul, sorry for the long turnaround time. I finally had the time to spin up the code from Python. I simply did some tests with a table of New York Taxi trip data and Apache Drill. Using the bundled JDBC driver and JayDeBeAPI, the default for accessing JDBC from Python, it took 11 minutes

Re: Petastorm: PyArrow based library for Tensorflow, PyTorch and others...

2018-10-05 Thread Uwe L. Korn
Hello Yevgeni, this looks interesting. Can you make a PR to https://github.com/apache/arrow so that Petastorm is listed on https://arrow.apache.org/powered_by/ ? I browsed a bit through your code. As far as I can see your approach is to store a set of Parquet files in a directory with a

Re: [VOTE] Release Apache Arrow 0.11.0 (RC1)

2018-10-04 Thread Uwe L. Korn
for > > fixing this .asc problem? > > > > > > Thanks, > > -- > > kou > > > > In <1538639225.4190225.1530323248.542da...@webmail.messagingengine.com> > > "Re: [VOTE] Release Apache Arrow 0.11.0 (RC1)" on Thu, 04 Oct 2018 >

[jira] [Created] (ARROW-3443) [Java] Flight reports memory leaks in TestBasicOperation

2018-10-04 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3443: -- Summary: [Java] Flight reports memory leaks in TestBasicOperation Key: ARROW-3443 URL: https://issues.apache.org/jira/browse/ARROW-3443 Project: Apache Arrow

Re: [VOTE] Release Apache Arrow 0.11.0 (RC1)

2018-10-04 Thread Uwe L. Korn
Hello Kou, It seems like you have used a GPG key that is not in the main keys files: ``` + gpg --verify apache-arrow-0.11.0.tar.gz.asc apache-arrow-0.11.0.tar.gz gpg: Signature made Thu Oct 4 05:46:23 2018 CEST gpg:using DSA key 7714A383F6F73E2D9828791D17423F641C837F31 gpg: Can't

[jira] [Created] (ARROW-3395) [C++/Python] Add docker container for linting

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3395: -- Summary: [C++/Python] Add docker container for linting Key: ARROW-3395 URL: https://issues.apache.org/jira/browse/ARROW-3395 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3392) [Python] Support filters in disjunctive normal form in ParquetDataset

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3392: -- Summary: [Python] Support filters in disjunctive normal form in ParquetDataset Key: ARROW-3392 URL: https://issues.apache.org/jira/browse/ARROW-3392 Project: Apache

[jira] [Created] (ARROW-3391) [Python] Support \0 characters in binary Parquet predicate values

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3391: -- Summary: [Python] Support \0 characters in binary Parquet predicate values Key: ARROW-3391 URL: https://issues.apache.org/jira/browse/ARROW-3391 Project: Apache Arrow

[jira] [Created] (ARROW-3388) [Python] boolean Partition keys in ParquetDataset are reconstructed as string

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3388: -- Summary: [Python] boolean Partition keys in ParquetDataset are reconstructed as string Key: ARROW-3388 URL: https://issues.apache.org/jira/browse/ARROW-3388 Project

[jira] [Created] (ARROW-3363) [C++/Python] Add helper functions to detect scalar Python types

2018-09-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3363: -- Summary: [C++/Python] Add helper functions to detect scalar Python types Key: ARROW-3363 URL: https://issues.apache.org/jira/browse/ARROW-3363 Project: Apache Arrow

[jira] [Created] (ARROW-3335) [Python] Add ccache to manylinux1 container

2018-09-26 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3335: -- Summary: [Python] Add ccache to manylinux1 container Key: ARROW-3335 URL: https://issues.apache.org/jira/browse/ARROW-3335 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3334) [Python] Update conda packages to new numpy requirement

2018-09-26 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3334: -- Summary: [Python] Update conda packages to new numpy requirement Key: ARROW-3334 URL: https://issues.apache.org/jira/browse/ARROW-3334 Project: Apache Arrow

Re: Timeline for 0.11 Arrow release

2018-09-25 Thread Uwe L. Korn
> I may not be able to work on this for 0.11 but can you add > me to https://bintray.com/apache ? I want write permission > to https://bintray.com/apache/arrow . > > https://bintray.com/kou is my account. You will need to make an INFRA ticket for this. The apache org is centrally managed. Uwe

Re: [VOTE] Accept donation of C GLib bindings to Parquet C++ libraries

2018-09-25 Thread Uwe L. Korn
+1 (binding) On Tue, Sep 25, 2018, at 1:36 PM, Wes McKinney wrote: > hello, > > Kouhei Sutou is proposing to donate C GLib bindings to the Parquet C++ > libraries (which can read Arrow tables back), designed to work > together with the existing GLib bindings in Apache Arrow. This work > was origi

[jira] [Created] (ARROW-3301) [Website] Update Jekyll and Bootstrap

2018-09-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3301: -- Summary: [Website] Update Jekyll and Bootstrap Key: ARROW-3301 URL: https://issues.apache.org/jira/browse/ARROW-3301 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3267) [Python] Create empty table from schema

2018-09-19 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3267: -- Summary: [Python] Create empty table from schema Key: ARROW-3267 URL: https://issues.apache.org/jira/browse/ARROW-3267 Project: Apache Arrow Issue Type

Re: Lighter build matrix on a language specific fork.

2018-09-06 Thread Uwe L. Korn
The problem could be that it checks against master and you will probably have changes for R in the ci/ directory. Changes in that directory will trigger a build for the full matrix. So to get the build simple and fast, we should get the ci/ changes for R into master soon. Uwe On Thu, Sep 6, 20

Re: [DISCUSS] Dropping support for CentOS 5 / RHEL5 in Python packages

2018-09-06 Thread Uwe L. Korn
Hello Wes, I'm ok with option 2 when we use the yet unfinished manylinux2010 image as the base. This way, we will still be able to produce wheels that in the near future are actually based on an architecture tag supported by a PEP. Also as I have some packaging nightmare, I would feel much bette

[jira] [Created] (ARROW-3143) [C++] CopyBitmap into existing memory

2018-08-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3143: -- Summary: [C++] CopyBitmap into existing memory Key: ARROW-3143 URL: https://issues.apache.org/jira/browse/ARROW-3143 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3141) [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14

2018-08-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3141: -- Summary: [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14 Key: ARROW-3141 URL: https://issues.apache.org/jira/browse/ARROW-3141 Project: Apache Ar

Re: How to concatenate RecordBatches into a single RecordBatch?

2018-08-28 Thread Uwe L. Korn
Hello Jacob, while not optimal, you could try to use https://docs.python.org/3/library/io.html#io.BufferedReader together with a much larger buffer_size than the default. This might not be the best way possible as we have to cross the Python/C++ boundary more often but should improve on the cu
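A rough sketch of the suggestion above, using only the standard library: wrapping a raw stream in `io.BufferedReader` with a larger `buffer_size` reduces how often the underlying stream is hit. `CountingRaw` and `count_raw_reads` are hypothetical names made up for this illustration, and the in-memory payload stands in for whatever file-like object is actually being read.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that counts calls to the underlying read."""

    def __init__(self, data):
        super().__init__()
        self._inner = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Every call here stands for one trip to the slow underlying source.
        self.calls += 1
        return self._inner.readinto(b)


def count_raw_reads(buffer_size, payload, chunk=16):
    """Read the payload in small chunks and report the raw read count."""
    raw = CountingRaw(payload)
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    while reader.read(chunk):
        pass
    return raw.calls


payload = b"x" * 65536
small = count_raw_reads(64, payload)
large = count_raw_reads(1 << 20, payload)
print(small, large)
```

With the small buffer, nearly every few reads reach the raw stream; with the large one, most reads are served from memory. How much this helps in the RecordBatch case depends on the reader's access pattern, so it is worth measuring.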

[jira] [Created] (ARROW-3109) [Python] Add Python 3.7 virtualenvs to manylinux1 container

2018-08-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3109: -- Summary: [Python] Add Python 3.7 virtualenvs to manylinux1 container Key: ARROW-3109 URL: https://issues.apache.org/jira/browse/ARROW-3109 Project: Apache Arrow

[jira] [Created] (ARROW-3108) [C++] arrow::PrettyPrint for Table instances

2018-08-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3108: -- Summary: [C++] arrow::PrettyPrint for Table instances Key: ARROW-3108 URL: https://issues.apache.org/jira/browse/ARROW-3108 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3107) [C++] arrow::PrettyPrint for Column instances

2018-08-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3107: -- Summary: [C++] arrow::PrettyPrint for Column instances Key: ARROW-3107 URL: https://issues.apache.org/jira/browse/ARROW-3107 Project: Apache Arrow Issue Type

Re: [VOTE] Combining Arrow C++ development process with Apache Parquet C++

2018-08-21 Thread Uwe L. Korn
+1 On Wed, Aug 22, 2018, at 12:53 AM, Phillip Cloud wrote: > +1 > > On Tue, Aug 21, 2018 at 6:26 PM Jacques Nadeau wrote: > > > +1 > > > > > > > > On Tue, Aug 21, 2018 at 3:21 PM Philipp Moritz wrote: > > > > > +1 for the monorepo plan and push access to Parquet C++ committers > > > > > > -- P

Re: Timeline for 0.11 Arrow release

2018-08-21 Thread Uwe L. Korn
Hello, I will also go over the release and add items. For my personal goal for 0.11, I want to have predicate pushdown for Parquet files working. This means that we should be able to determine in Python code what the relevant RowGroups in a file are as well as filtering a Table given the set of
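The row-group selection mentioned above can be pictured with per-row-group min/max statistics. This is only a sketch of the idea, not the pyarrow API: `RowGroupStats` and `relevant_row_groups` are hypothetical names, and the statistics are made-up values for a single integer column.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical min/max statistics of one column in one row group."""

    index: int
    min_value: int
    max_value: int


def relevant_row_groups(stats, lower, upper):
    """Return indices of row groups that may hold values in [lower, upper]."""
    # A row group can be skipped only if its value range lies entirely
    # outside the requested interval.
    return [
        s.index
        for s in stats
        if s.max_value >= lower and s.min_value <= upper
    ]


# Made-up statistics for three row groups.
stats = [
    RowGroupStats(0, 1, 100),
    RowGroupStats(1, 101, 200),
    RowGroupStats(2, 201, 300),
]
selected = relevant_row_groups(stats, 150, 250)
print(selected)
```

Statistics only rule row groups out; the rows read from the selected groups still have to be filtered against the predicate afterwards.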
