Re: [VOTE] Split Go release process

2024-08-27 Thread Uwe L. Korn
+1 (binding) On Tue, Aug 27, 2024, at 3:04 PM, Joris Van den Bossche wrote: > +1 (binding) > > On Mon, 26 Aug 2024 at 09:56, Antoine Pitrou wrote: >> >> +1 (binding) >> >> On 26/08/2024 at 04:37, Sutou Kouhei wrote: >> > Hi, >> > >> > I would like to propose splitting Go release process. >> > >

Re: [VOTE] Move Arrow DataFusion Subproject to new Top Level Apache Project

2024-03-01 Thread Uwe L. Korn
+1 (binding) On Fri, Mar 1, 2024, at 2:37 PM, Andy Grove wrote: > +1 (binding) > > On Fri, Mar 1, 2024 at 6:20 AM Weston Pace wrote: > >> +1 (binding) >> >> On Fri, Mar 1, 2024 at 3:33 AM Andrew Lamb wrote: >> >> > Hello, >> > >> > As we have discussed[1][2] I would like to vote on the proposal

Re: [VOTE] Move issue tracking to GitHub Issues

2022-10-27 Thread Uwe L. Korn
+1 On Thu, Oct 27, 2022, at 5:13 PM, Nic wrote: > +1 > > On Thu, 27 Oct 2022 at 14:00, Alenka Frim > wrote: > >> +1 >> >> On Thu, Oct 27, 2022 at 2:36 PM prem sagar gali >> wrote: >> >> > +1 >> > >> > On Thu, Oct 27, 2022 at 7:13 AM Dewey Dunnington >> > wrote: >> > >> > > +1 (non-binding)! >>

Re: [Discuss][Python] Stop publishing universal wheels?

2022-10-27 Thread Uwe L. Korn
Hello, if we have wheels for x86_64 and arm64 individually, I don't see an argument for keeping universal2 ones. x86_64 Macs will probably stay around for a while as Apple is quite good at keeping old hardware updated, and the laptops themselves are pretty solid. Best Uwe On Thu, Oct 27, 2022

Re: [VOTE] Release Apache Arrow 7.0.0 - RC6

2022-01-25 Thread Uwe L. Korn
Hello all, I sadly get an issue with compiling with GCC 7.5 at the moment as reported in https://issues.apache.org/jira/browse/ARROW-15444 We need this version to support CUDA-enabled and ppc64le builds on conda-forge. Cheers Uwe On Tue, Jan 25, 2022, at 10:35 AM, Krisztián Szűcs wrote: > Than

Re: [DISCUSS] Dropping support for Visual Studio 2015

2021-08-14 Thread Uwe L. Korn
+1 VS2017 should also be compatible with VS2015, so this shouldn't cause any issues for downstream users that link dynamically. > On 14.08.2021 at 01:56, Benjamin Kietzman wrote: > > Thanks for commenting, all. I'll open a JIRA/PR to remove support next week. > >> On Tue, Aug 10, 2021, 09

Re: [RESULT] [VOTE] Release Apache Arrow 3.0.0 - RC2

2021-01-27 Thread Uwe L. Korn
1. [done] rebase master 2. [done] upload source 3. [done] upload binaries 4. [done] update website 5. [done] upload ruby gems 6. [done] upload js packages 8. [done] upload C# packages 9. [done] upload rust crates 10. [done] update conda recipes 11. [done] upload wheels/sdist to pypi 12. [ ]

Re: [VOTE] Release Apache Arrow 3.0.0 - RC2

2021-01-25 Thread Uwe L. Korn
+1 (binding) Verified C++, Python and Rust on the Apple M1 (natively!) and all works. I had to do some slight modifications to the verification script but they are independent of the source tarball: https://github.com/apache/arrow/pull/9315 Cheers Uwe On Fri, Jan 22, 2021, at 4:59 PM, Neal Ric

Re: Incompatability of all existing pyarrow releases with the next NumPy release

2020-12-04 Thread Uwe L. Korn
Still, the PR is so trivial that we should merge it. I'm not up to date on what the status of the 2.0.1 release is, but this would be an essential patch for that. On Fri, Dec 4, 2020, at 9:22 PM, Antoine Pitrou wrote: > > > On 04/12/2020 at 21:11, Uwe L. Korn wrote: > > Hell

Incompatability of all existing pyarrow releases with the next NumPy release

2020-12-04 Thread Uwe L. Korn
Hello all, Today the kartothek CI turned quite red in https://github.com/JDASoftwareGroup/kartothek/pull/383 / https://github.com/JDASoftwareGroup/kartothek/pull/383/checks?check_run_id=1497941813 as the new NumPy 1.20rc1 was pulled in. It simply broke all pyarrow<->NumPy interop as now dtype

Re: Removing Python 3.5 support

2020-11-26 Thread Uwe L. Korn
+1 from my side too On Thu, Nov 26, 2020, at 1:04 PM, Joris Van den Bossche wrote: > +1 on dropping Python 3.5 > > On Thu, 26 Nov 2020 at 12:26, Antoine Pitrou wrote: > > > > > Hello, > > > > Python 3.5 is not supported upstream, neither by the CPython development > > team nor by third-party pr

Re: ursa-labs/crossbow on travis-ci.com is disabled

2020-11-26 Thread Uwe L. Korn
Also note that drone.io supports linux-arm64 which we use in conda-forge for this architecture and is already setup in crossbow (although we had issues with branches not being seen). On Thu, Nov 26, 2020, at 1:31 AM, Jeroen Ooms wrote: > On Wed, Nov 25, 2020 at 10:54 PM Sutou Kouhei wrote: > >

Re: [Governance] [Proposal] Stop force-pushing to PRs after release?

2020-11-25 Thread Uwe L. Korn
Hello Jorge, I know from the past on the Python/C++ side, we needed to do this for a lot of contributors to enable them to work with their branches/PRs again as they were overwhelmed with the complexity of these rebases. Personally, I wouldn't like to spend much time on whether we should rebase

Re: Development with C++ and Cython APIs in Arrow

2020-11-06 Thread Uwe L. Korn
Vibhatha Abeykoon wrote: > > > Hello Uwe, > > > > Nice example. I will follow this. > > > > With Regards, > > Vibhatha Abeykoon > > > > > > On Fri, Nov 6, 2020 at 9:36 AM Uwe L. Korn wrote: > > > >> Hello Vibhatha, > &

Re: Development with C++ and Cython APIs in Arrow

2020-11-06 Thread Uwe L. Korn
Hello Vibhatha, the best is to set a relative RPATH on the libraries. An example for this can be seen in the turbodbc sources: https://github.com/blue-yonder/turbodbc/blob/80a29a7edfbdabf12410af01c0c0ae74bfc3aab4/setup.py#L186-L189 Cheers Uwe On Tue, Nov 3, 2020, at 11:44 PM, Vibhatha Abeykoon
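A relative RPATH lets an extension module find the shared libraries that sit next to it, wherever the package is installed. The following is a minimal sketch of how that could look in a `setup.py`; it is not taken from the turbodbc sources linked above, and the names `relative_rpath_args`, `make_extension`, `mypkg._native` and `mypkg/_native.cpp` are hypothetical. It assumes a GNU-style linker on Linux (`$ORIGIN`) and the Apple linker on macOS (`@loader_path`).

```python
import sys


def relative_rpath_args(platform=sys.platform):
    """Return linker flags that make a library search relative to its own location."""
    if platform == "darwin":
        # macOS: @loader_path expands to the directory of the loading binary.
        return ["-Wl,-rpath,@loader_path"]
    if platform.startswith("linux"):
        # Linux: $ORIGIN expands to the directory of the loading binary.
        return ["-Wl,-rpath,$ORIGIN"]
    # Windows has no RPATH concept; DLLs are resolved differently.
    return []


def make_extension():
    """Build a setuptools Extension for a hypothetical module linking against libarrow."""
    from setuptools import Extension  # imported lazily; only needed at build time

    return Extension(
        "mypkg._native",                # hypothetical module name
        sources=["mypkg/_native.cpp"],  # hypothetical source file
        libraries=["arrow"],
        extra_link_args=relative_rpath_args(),
    )
```

In a real `setup.py` the result of `make_extension()` would be passed to `setup(ext_modules=[...])`. If the Arrow libraries live in a different directory than the extension, the path after `$ORIGIN` or `@loader_path` would need a relative suffix.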

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-11-05-0

2020-11-05 Thread Uwe L. Korn
Taking care of the failing conda-win jobs in https://issues.apache.org/jira/browse/ARROW-10502 On Thu, Nov 5, 2020, at 11:14 AM, Crossbow wrote: > > Arrow Build Report for Job nightly-2020-11-05-0 > > All tasks: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-11-05-0 >

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-21 Thread Uwe L. Korn
> > 1. [done] rebase master > > > >> > > > 2. [done] upload source > > > >> > > > 3. [done] upload binaries > > > >> > > > 4. [kszucs] update website > > > >> > > > 5. [done] upload

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Uwe L. Korn
+0 from my side, I see no big issues. I was able to verify the wheels, the source verification fails due to the llvm package issues on brew; thus I'm not able to +1 this time. Uwe On Mon, Oct 19, 2020, at 7:38 PM, Krisztián Szűcs wrote: > On Mon, Oct 19, 2020 at 5:32 PM Uwe L. Kor

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Uwe L. Korn
s to address this with Homebrew though and re-add the llvm@10 package. This isn't a change in policy and I guess that it may suffice to add the new Arrow release to homebrew to get llvm@10 re-added. Uwe > > Neal > > On Mon, Oct 19, 2020 at 6:04 AM Uwe L. Korn wrote: > >

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Uwe L. Korn
Trying to verify on macOS but run into the following two issues: * The default S3 region is „eu-central-1“ for me despite setting LANG=C * llvm@10 is not available for homebrew anymore, see also https://github.com/Homebrew/homebrew-core/pull/62798#issuecomment-711606370

Re: [C++] Arrow to ORC type conversion

2020-10-18 Thread Uwe L. Korn
This sounds reasonable from an Arrow perspective; you might want to CC the ORC list as well or ask someone there to co-review your work in the adapter. Uwe > On 18.10.2020 at 17:24, Ying Zhou wrote: > > Hi, > > I’m developing the adapter that converts Arrow Arrays, ChunkedArrays, > RecordBa

Re: [VOTE] Accept donation of Julia implementation for Apache Arrow

2020-10-14 Thread Uwe L. Korn
+1 (binding) On Wed, Oct 14, 2020, at 3:58 PM, Andy Grove wrote: > +1 (binding) > > On Tue, Oct 13, 2020 at 8:26 PM Fan Liya wrote: > > > +1 (non-binding) > > > > Best, > > Liya Fan > > > > > > On Wed, Oct 14, 2020 at 9:02 AM Sutou Kouhei wrote: > > > > > +1 (binding) > > > > > > In > > > "

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-10-02-0

2020-10-02 Thread Uwe L. Korn
conda-*-aarch64 hit the 1h time limit on drone.io, probably not easy to fix. On Fri, Oct 2, 2020, at 12:23 PM, Crossbow wrote: > > Arrow Build Report for Job nightly-2020-10-02-0 > > All tasks: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-10-02-0 > > Failed Tasks: >

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-09-27-0

2020-09-27 Thread Uwe L. Korn
I'm working on a fix for the conda failures in https://github.com/apache/arrow/pull/8282 On Sun, Sep 27, 2020, at 12:20 PM, Crossbow wrote: > > Arrow Build Report for Job nightly-2020-09-27-0 > > All tasks: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-09-27-0 > > Fa

Re: Closing Plasma issues?

2020-09-07 Thread Uwe L. Korn
If we do that, we should be clear about that and remove the code. Shipping Plasma as part of the release and not maintaining it like other parts of the Arrow libraries seems inconsistent and will just be an annoyance to users who find a partly unusable component. Cheers Uwe On Mon, Sep 7, 2020, at

Re: [VOTE] Release Apache Arrow 1.0.0 - RC2

2020-07-24 Thread Uwe L. Korn
1. [done] rebase master 2. [done] upload source 3. [kszucs] upload binaries 4. [ ] update website 5. [ ] upload ruby gems 6. [ ] upload js packages 8. [ ] upload C# packages 9. [andygrove] upload rust crates 10. [uwe] update conda recipes 11. [kszucs] upload wheels to pypi 12. [ ] update ho

Re: Introducing Cylon

2020-07-22 Thread Uwe L. Korn
Hello Niranda, cool to see this. Feel free to open a PR to add it to the Powered By list on https://arrow.apache.org/powered_by/ Cheers Uwe On Tue, Jul 21, 2020, at 8:03 PM, Niranda Perera wrote: > Hi all, > > We would like to introduce Cylon to the Arrow community. It is an > open-source, lea

Re: [DRAFT] Arrow Board Report July 2020

2020-07-08 Thread Uwe L. Korn
Happy with the current version. I think this gives enough input for the board. We have so many things happening that are much better presented in the process of the 1.0 release. On Wed, Jul 8, 2020, at 12:52 AM, Micah Kornfield wrote: > Worth mentioning the website work? > > On Tue, Jul 7, 2020

Re: Developing a C++ Python extension

2020-07-02 Thread Uwe L. Korn
work though: import ctypes arrow_python = ctypes.CDLL('libarrow.so', ctypes.RTLD_GLOBAL) libarrow_python = ctypes.CDLL('libarrow_python.so', ctypes.RTLD_GLOBAL) On Thu, Jul 2, 2020, at 4:32 PM, Uwe L. Korn wrote: > I had so much fun with the wheels in the past, I'm no
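The snippet quoted in this message preloads two shared libraries with `RTLD_GLOBAL` so that their symbols are visible to libraries loaded later. A slightly more defensive sketch of the same idea follows; the helper name `preload_global` is hypothetical, and whether the bare file names resolve depends on the loader search path of the machine it runs on.

```python
import ctypes


def preload_global(library_names):
    """Try to load each shared library with RTLD_GLOBAL.

    Returns a dict mapping each name to its ctypes handle, or to None
    when the loader could not find or open the library.
    """
    handles = {}
    for name in library_names:
        try:
            handles[name] = ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            # Library missing or not loadable on this system.
            handles[name] = None
    return handles


# Load order matters: libarrow first, then the library that depends on it.
handles = preload_global(["libarrow.so", "libarrow_python.so"])
```

Keeping the handles in a dict keeps the libraries referenced for the lifetime of the process and makes it easy to report which ones failed to load.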

Re: Developing a C++ Python extension

2020-07-02 Thread Uwe L. Korn
m all > into our library. Feel free to scrape the perspective repo's cmake > lists and setup.py for details. > > Tim Paine > tim.paine.nyc > > > On Jul 2, 2020, at 10:32, Uwe L. Korn wrote: > > > > I had so much fun with the wheels in the past,

Re: Developing a C++ Python extension

2020-07-02 Thread Uwe L. Korn
I had so much fun with the wheels in the past, I'm now a happy member of conda-forge core instead :D The good thing first: * The C++ ABI didn't change between the manylinux versions, it is the old one in all cases. So you can mix & match manylinux versions. The sad things: * The manylinuxX standa

Re: [DISCUSS] Ongoing LZ4 problems with Parquet files

2020-06-30 Thread Uwe L. Korn
I'm also in favor of disabling support for now. Having to deal with broken files or the detection of various incompatible implementations in the long-term will harm more than not supporting LZ4 for a while. Snappy is generally more used than LZ4 in this category as it has been available since th

Re: [VOTE] Permitting unsigned integers for Arrow dictionary indices

2020-06-30 Thread Uwe L. Korn
+1 (binding) On Tue, Jun 30, 2020, at 6:24 AM, Wes McKinney wrote: > +1 (binding) > > On Mon, Jun 29, 2020 at 11:11 PM Ben Kietzman > wrote: > > > > +1 (non binding) > > > > On Mon, Jun 29, 2020, 18:00 Wes McKinney wrote: > > > > > Hi, > > > > > > As discussed on the mailing list [1], it has b

Re: [VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release

2020-06-30 Thread Uwe L. Korn
+1 (binding) On Tue, Jun 30, 2020, at 11:11 AM, Neville Dipale wrote: > +1 (non-binding) > > On Tue, 30 Jun 2020 at 06:29, Ben Kietzman wrote: > > > +1 (non binding) > > > > On Tue, Jun 30, 2020, 00:25 Wes McKinney wrote: > > > > > +1 (binding) > > > > > > On Mon, Jun 29, 2020 at 10:49 PM Mica

Re: [DISCUSS][C++] Performance work and compiler standardization for linux

2020-06-23 Thread Uwe L. Korn
FTR: We can use the latest(!) clang for all platforms for conda and wheels. It probably isn't even that complicated a setup. On Mon, Jun 22, 2020, at 5:42 PM, Francois Saint-Jacques wrote: > We should aim to improve the performance of the most widely used > *default* packages, which are p

Re: [DISCUSS][C++] Performance work and compiler standardization for linux

2020-06-22 Thread Uwe L. Korn
With my conda-forge background, I would suggest to use clang as a performance baseline, because it's currently the only compiler that works reliably on all platforms. Most Linux distributions are nowadays built with gcc, also making a strong argument, but on OSX and Windows the picture is a bit

[C++] Kernels with scalar input

2020-06-17 Thread Uwe L. Korn
Hello all, I'm trying to implement a `contains` kernel that takes as an input a StringArray and a scalar string (see https://issues.apache.org/jira/browse/ARROW-9160). I feel confident with the rest of the new Kernels setup but I didn't find an example kernel where we also pass in a scalar att
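To make the intended semantics concrete, here is a pure-Python reference sketch of a `contains` operation that takes an array of strings plus one scalar pattern. It is not the C++ kernel from ARROW-9160; the function name and the choice to propagate nulls (`None` in, `None` out) are assumptions made for illustration.

```python
def contains(values, pattern):
    """Element-wise substring test against a single scalar pattern.

    `values` is a sequence of str or None; None stands in for a null slot
    and is propagated to the output unchanged (an assumed null policy).
    """
    return [None if value is None else (pattern in value) for value in values]


result = contains(["arrow", None, "parquet"], "rr")
```

The scalar is passed once and applied to every element, which is the shape of the array-plus-scalar kernel signature the message asks about.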

Re: [DISCUSS] [C++] custom allocator for large objects

2020-06-05 Thread Uwe L. Korn
On Fri, Jun 5, 2020, at 3:13 PM, Rémi Dettai wrote: > Hi Antoine ! > > I would indeed have expected jemalloc to do that (remap the pages) > I have no idea about the performance gain this would provide (if any). > Could be interesting to explore. This would actually be the most interesting thing.

Re: [DISCUSS] [C++] custom allocator for large objects

2020-06-05 Thread Uwe L. Korn
Hello Rémi, under the hood jemalloc does quite similar things to what you describe. I'm not sure what the offset is in the current version but in earlier releases, it used a different allocation strategy for objects above 4MB. For the initial large allocation, you will see quite some copies as

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-05-30-0

2020-05-30 Thread Uwe L. Korn
https://github.com/apache/arrow/pull/7305 should enable us to upload conda packages again. On Sat, May 30, 2020, at 12:10 PM, Crossbow wrote: > > Arrow Build Report for Job nightly-2020-05-30-0 > > All tasks: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-05-30-0 > >

Re: Arrow sync all at 12pm US-Eastern / 16:00 UTC

2020-05-27 Thread Uwe L. Korn
No, we are just talking about removing static libraries from conda-forge that may be (/have been) used as part of the Arrow build. This shouldn't affect any non-conda Arrow users/developers. Cheers, Uwe On Wed, May 27, 2020, at 6:53 PM, Rémi Dettai wrote: > @Uwe: Just a quick question about the

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-05-26-0

2020-05-26 Thread Uwe L. Korn
The conda builds are failing as we have exceeded the storage available for our conda repository: You currently have 3 public packages and 0 packages that require to be authenticated. Using 10.0 GB of 3.0 GB storage I guess we need something that deletes old builds automatically. On Tue, May 26, 2020

Re: Arrow Flight connector for SQL Server

2020-05-21 Thread Uwe L. Korn
Hello Brendan, welcome to the community. In addition to the folks at Dremio, I wanted to make you aware of the Python ODBC client library https://github.com/blue-yonder/turbodbc which provides a high-performance ODBC<->Arrow adapter. It is especially popular with MS SQL Server users as the fas

Re: [VOTE] Release Apache Arrow 0.17.1 - RC1

2020-05-19 Thread Uwe L. Korn
Current status: 1. [done] rebase (not required for a patch release) 2. [done] upload source 3. [done] upload binaries 4. [done|in-pr] update website 5. [done] upload ruby gems 6. [ ] upload js packages 8. [done] upload C# packages 9. [ ] upload rust crates 10. [done] update conda recipes (

Re: [Python] black vs. autopep8

2020-04-09 Thread Uwe L. Korn
The non-configurability of black is one of the strongest arguments I see for black. The code style will always be subjective. From previous discussions I know that my personal preference of readability conflicts with that of Antoine and Wes, as probably will that of others. We have the same issue with us

Re: [C++] Compute: Datum and "ChunkedArray&" inputs

2020-04-07 Thread Uwe L. Korn
-types. On Tue, Apr 7, 2020, at 1:00 PM, Uwe L. Korn wrote: > Hello all, > > I'm in the process of changing the implementation of the Take kernel > to work on ChunkedArrays without concatenating them into a single Array > first. While working on the implementation, I rea

[C++] Compute: Datum and "ChunkedArray&" inputs

2020-04-07 Thread Uwe L. Korn
Hello all, I'm in the process of changing the implementation of the Take kernel to work on ChunkedArrays without concatenating them into a single Array first. While working on the implementation, I realised that we switch often between Datum and the specific-typed parameters. This works quite
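One way to take from chunked data without concatenating is to translate each global index into a (chunk, offset) pair using the cumulative chunk lengths. The sketch below shows that idea with plain Python lists standing in for chunks; it is an illustration of the general technique, not the implementation discussed in this thread, and `take_chunked` is a hypothetical name.

```python
from bisect import bisect_right
from itertools import accumulate


def take_chunked(chunks, indices):
    """Gather elements by global index from a list of chunks, without concatenating."""
    # offsets[i] is the global index of the first element of chunk i;
    # the final entry is the total length.
    offsets = [0] + list(accumulate(len(chunk) for chunk in chunks))
    total = offsets[-1]
    out = []
    for index in indices:
        if not 0 <= index < total:
            raise IndexError(f"index {index} out of bounds for length {total}")
        # Rightmost chunk whose start offset is <= index; this skips empty chunks.
        chunk_no = bisect_right(offsets, index) - 1
        out.append(chunks[chunk_no][index - offsets[chunk_no]])
    return out


result = take_chunked([[10, 20], [], [30, 40, 50]], [4, 0, 2])
```

Each lookup costs a binary search over the chunk offsets, so the total work is proportional to the number of indices times the logarithm of the number of chunks, and no chunk data is copied up front.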

Re: Proposal to use Black for automatic formatting of Python code

2020-03-27 Thread Uwe L. Korn
I'm also very much in favor of this. For the black / cython support, I think the current state is reflected in https://github.com/pablogsal/black/tree/cython. On Fri, Mar 27, 2020, at 4:40 AM, Micah Kornfield wrote: > +1 from me as well. > > On Thursday, March 26, 2020, Neal Richardson > wrote

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-05 Thread Uwe L. Korn
I'm failing to verify C++ on macOS as it seems that we nowadays pull all dependencies from the system. Is there a known way to build & test on OSX with the script and use conda for the requirements? Otherwise I probably need to invest time to create such a way. Cheers Uwe On Wed, Feb 5, 2020, at

[Python] Exposing compute kernels

2019-12-17 Thread Uwe L. Korn
Hello all, we have developed quite some compute kernels in C++ nowadays and I would like to call them from Python. We could expose the kernels on the Array/ChunkedArray classes themselves or as standalone functions (or as both). What would be the preferred way? Also exposing them as standalone

Re: Adding stronger warnings about pre-production Arrow IPC implementations (C#, Rust)

2019-11-22 Thread Uwe L. Korn
Hello Wes, what about adding an implementation status (table) to the README of every language? Things like "Supports Arrow File Format", "Supports Arrow Stream Format", "Passes IPC integration tests", "Supports Flight" are things that are interesting to users and show how far an implementation

Re: [DISCUSS] Reviewing Arrow commit/code review policy

2019-10-14 Thread Uwe L. Korn
Hello all, I also think we should stay with CTR for the moment. If we wanted to enforce RTC or at least a bit better notification for reviewers of certain parts of Arrow, we could setup a CODEOWNERS file[1] to add experts of a certain file/folder as a reviewer on PRs on Github. Cheers Uwe [1]

Re: [DISCUSS] C-level in-process array protocol

2019-10-08 Thread Uwe L. Korn
I'm not sure whether flatbuffers is actually an issue in the end but keeping it out of the C-API definitely simplifies it a bit adoption-wise. I don't think, though, that using protobuf would make a difference here. In general, I really like the C-interface work as sadly C-APIs are still the

Re: [DRAFT] Apache Arrow Board Report - October 2019

2019-10-08 Thread Uwe L. Korn
I'm not sure what qualifies for "board attention" but it seems that CI is a critical problem in Apache projects, not just Arrow. Should we raise that? Uwe On Tue, Oct 8, 2019, at 12:00 AM, Wes McKinney wrote: > Here is a start for our Q3 board report > > ## Description: > The mission of Apache

Re: Collecting Arrow critique and our roadmap on that

2019-09-23 Thread Uwe L. Korn
ted to a roadmap on the confluence wiki that > > should be folded in as appropriate too. > > > > Neal > > > > On Thu, Sep 19, 2019 at 10:26 AM Uwe L. Korn wrote: > > > > > > Hello, > > > > > > there has been a lot of public discussio

Collecting Arrow critique and our roadmap on that

2019-09-19 Thread Uwe L. Korn
Hello, there has been a lot of public discussion lately with some mentions of actually informed, valid critique of things in the Arrow project. From my perspective, these things include "there is no STL-native C++ Arrow API", "the base build requires too many dependencies", "the pyarrow packa

Re: [DISCUSS] C-level in-process array protocol

2019-09-19 Thread Uwe L. Korn
Hello, I like this proposal as it will make interfacing inside a process between various Arrow supports much easier. I'm a bit critical though of using a string as the format representation as one needs to parse it correctly. Couldn't we use the enums we already have and reimplement them as C-d

Re: Build issues on macOS [newbie]

2019-09-19 Thread Uwe L. Korn
Hello Tarek, this error message is normally the one you get when CONDA_BUILD_SYSROOT doesn't point to your 10.9 SDK. Please delete your build folder again and do `export CONDA_BUILD_SYSROOT=..` immediately before running cmake. Running e.g. a conda install will sadly reset this variable to some

Re: [DISCUSS] Changing C++ build system default options to produce more barebones builds

2019-09-18 Thread Uwe L. Korn
> > This is also a lot of work, but could also potentially benefit the > developer experience because we can make unit tests depend on individual > compilable units instead of all of libarrow. There are trade-offs here as > well in terms of public API coverage. > > On Tue, Sep

Re: [DISCUSS] Changing C++ build system default options to produce more barebones builds

2019-09-17 Thread Uwe L. Korn
Hello, I can think of two other alternatives that make it more visible what Arrow core is and what are the optional components: * Error out when no component is selected instead of building just the core Arrow. Here we could add an explanatory message that lists all components and for each comp

Re: [DISCUSS][C++] Rethinking our current C++ shared library (.so / .dll) approach

2019-09-17 Thread Uwe L. Korn
Hello, I'm actually against this proposal. My main concern is at the moment that Arrow C++/Python grows to a really heavy tool where you always have to bring along all baggage even when you're only using a small part of it. This is a problem which makes it harder to use Arrow in projects becau

Re: [PROPOSAL] Consolidate Arrow's CI configuration

2019-09-05 Thread Uwe L. Korn
Hello Krisztián, > On 05.09.2019 at 14:22, Krisztián Szűcs wrote: > >> * The build configuration is automatically updated on a merge to master? >> > Not yet, but this can be automatized too with buildbot itself. This is something I would actually like to have before getting rid of the Travi

Re: [PROPOSAL] Consolidate Arrow's CI configuration

2019-09-05 Thread Uwe L. Korn
Hello Krisztián, I like this proposal. CI coverage and response time are crucial for the health of the project. In general I like the consolidation and local reproducibility of the builds. Some questions I wanted to ask to make sure I understand your proposal correctly (hopefully they a

Re: Parquet to Arrow in Java

2019-09-04 Thread Uwe L. Korn
Hello, You may want to interact with the Apache Iceberg community here. They are currently doing similar things: https://lists.apache.org/thread.html/3bb4f89a0b37f474cf67915f91326fa845afa597bdd2463c98a2c8b9@%3Cdev.iceberg.apache.org%3E I'm not involved in this, just reading both mailing lists and t

Re: Trouble building on Mac OS Mojave

2019-08-31 Thread Uwe L. Korn
Hello Chris, as a contributor, it is often simpler to use conda to construct a local development environment as outlined in https://arrow.apache.org/docs/developers/python.html#using-conda This is the typical environment most contributors work in. Even when not using conda as a package/environm

Re: Building on Arrow CUDA

2019-07-31 Thread Uwe L. Korn
Hello Paul, you might want to look into https://github.com/conda-forge/conda-forge.github.io/issues/687 where CUDA support on conda-forge is discussed. I'm not up to date on this anymore, but reading the whole issue should give you the current level of support. Once this is solved, adding cuda sup

Re: [VOTE] Adopt FORMAT and LIBRARY SemVer-based version schemes for Arrow 1.0.0 and beyond

2019-07-31 Thread Uwe L. Korn
+1 from me. I really like the separate versions Uwe On Tue, Jul 30, 2019, at 2:21 PM, Antoine Pitrou wrote: > > +1 from me. > > Regards > > Antoine. > > > > On Fri, 26 Jul 2019 14:33:30 -0500 > Wes McKinney wrote: > > hello, > > > > As discussed on the mailing list thread [1], Micah Korn

Re: [C++] Private implementations and virtual interfaces

2019-07-28 Thread Uwe L. Korn
ney a écrit : > > On Sat, Jul 27, 2019 at 4:38 PM Uwe L. Korn wrote: > >> > >> The PIMPL is a thing I would trade a bit of performance as it brings ABI > >> stability. This is something that will help us making Arrow usage in > >> thirdparty code much simple

Re: [C++] Private implementations and virtual interfaces

2019-07-27 Thread Uwe L. Korn
PIMPL is something I would trade a bit of performance for, as it brings ABI stability. This is something that will help us make Arrow usage in third-party code much simpler. Simple updates when an API was only extended but the ABI is intact are a great relief on the Arrow consumer side. I know that

Re: [Discuss] Do a 0.15.0 release before 1.0.0?

2019-07-23 Thread Uwe L. Korn
It is also a good way to test the change in public. We don't want to adjust something like this anymore in a 1.0.0 release. Already doing this in 0.15.0 and then maybe doing adjustments due to issues that appear "in the wild" is psychologically the easier way. There is a lot of thinking of users

Re: [Memo] API Behavior changes

2019-07-22 Thread Uwe L. Korn
Hello Liya, what about having this as part of the repository, e.g. java/api-changes.md? We have an auto-generated changelog that is quite verbose but having such documentation for consumers of the Java library would be really helpful as it gives more densely packed information on upgrading versi

Re: Caution about CI builds on personal forks

2019-07-17 Thread Uwe L. Korn
Docker works well for all people on all OSes. Interesting will be Windows, OSX or aarch64 builds which require a special system. Uwe On Wed, Jul 17, 2019, at 6:11 PM, Antoine Pitrou wrote: > > I'm not sure how Docker will work for people not on Linux though? > (and/or for macOS builds) > >

Re: Sharing Java Arrow Buffer with C++ in same process

2019-07-17 Thread Uwe L. Korn
Hello Hans, we sadly have no code for the C++<->Java interaction but a good example is the Python<->Java interaction code in https://github.com/apache/arrow/blob/master/python/pyarrow/jvm.py . This calls Java from Python using the jpype1 module and then uses the memory pointers in the Java obje

[jira] [Created] (ARROW-5919) [R] Add nightly tests for building r-arrow with dependencies from conda-forge

2019-07-12 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-5919: -- Summary: [R] Add nightly tests for building r-arrow with dependencies from conda-forge Key: ARROW-5919 URL: https://issues.apache.org/jira/browse/ARROW-5919 Project

Re: [Python] Wheel questions

2019-07-12 Thread Uwe L. Korn
Hello, On Thu, Jul 11, 2019, at 9:51 PM, Wes McKinney wrote: > On Thu, Jul 11, 2019 at 11:26 AM Antoine Pitrou wrote: > > > > > > On 11/07/2019 at 17:52, Krisztián Szűcs wrote: > > > Hi All, > > > > > > I have a couple of questions about the wheel packaging: > > > - why do we build an arrow nam

Re: [Discuss] Support an alternative memory layout for varchar/varbinary vectors

2019-07-11 Thread Uwe L. Korn
t; What do you think? Let's brain-storm it. > > Best, > Liya Fan > > > On Thu, Jul 11, 2019 at 8:05 PM Uwe L. Korn wrote: > > > Hello Liya, > > > > I'm quite -1 on this type as Arrow is about efficient columnar structures. > > We have opened the

Re: [Discuss] Support an alternative memory layout for varchar/varbinary vectors

2019-07-11 Thread Uwe L. Korn
Hello Liya, I'm quite -1 on this type as Arrow is about efficient columnar structures. We have opened the standard also to matrix-like types but always keep the constraint of consecutive memory. Now also adding types where memory is no longer consecutive but spread in the heap will make the sco

Re: [DISCUSS][C++] Evaluating the arrow::Column C++ class

2019-07-09 Thread Uwe L. Korn
e, Jul 9, 2019, 2:54 AM Uwe L. Korn wrote: > > > Hello Wes, > > > > where do you intend the Field object living then? Would this be part of > > the schema of the Table object? > > > > Uwe > > > > On Mon, Jul 8, 2019, at 11:18 PM, Wes McKin

Re: [DISCUSS][C++] Evaluating the arrow::Column C++ class

2019-07-09 Thread Uwe L. Korn
Hello Wes, where do you intend the Field object living then? Would this be part of the schema of the Table object? Uwe On Mon, Jul 8, 2019, at 11:18 PM, Wes McKinney wrote: > hi folks, > > For some time now I have been uncertain about the utility provided by > the arrow::Column C++ class. Fund

Re: [DISCUSS] C++ SO versioning with 1.0.0

2019-07-03 Thread Uwe L. Korn
I've documented that some time ago: https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst I actually wanted to add this to the build but we were breaking the ABI so often that it would have never been green. Uwe On Wed, Jul 3, 2019, at 9:52 PM, Sutou Kouhei wrote: > Ruby u

Re: New CI system: Ursabot

2019-06-16 Thread Uwe L. Korn
On Fri, Jun 14, 2019, at 11:23 PM, Krisztián Szűcs wrote: > On Fri, Jun 14, 2019 at 9:04 PM Wes McKinney wrote: > > > hi Krisz, > > > > Thanks for working on this! It already helped me fix a Python 2.7-only > > bug yesterday https://github.com/apache/arrow/pull/4553 > > > > I have a bunch of q

Re: Reduced Arrow CI capacity

2019-05-31 Thread Uwe L. Korn
On Fri, May 31, 2019, at 12:11 AM, Antoine Pitrou wrote: > > On 30/05/2019 at 22:39, Uwe L. Korn wrote: > > Hello all, > > > > Krisztián has been lately working on getting Buildbot running for Arrow. > > While I have not yet had the time to look at it in de

Re: Reduced Arrow CI capacity

2019-05-30 Thread Uwe L. Korn
Hello all, Krisztián has been lately working on getting Buildbot running for Arrow. While I have not yet had the time to look at it in detail, what would hinder us from using it as the main Linux builder and ditching Travis except for OSX? Otherwise I have lately had really good experiences with Git

Re: Not testing Python 2.7 on CI

2019-05-30 Thread Uwe L. Korn
Hello Antoine, when we're not testing Python 2.7 on CI anymore, I would suggest to drop Python 2 support completely then. My personal experience tells me that once we drop Python 2 on CI, we will immediately build a simple thing that breaks Python 2 support. Pushing out releases that might wo

Re: Python development setup and LLVM 7 / Gandiva

2019-05-26 Thread Uwe L. Korn
Hello John, I guess you also have some other llvm-* packages installed on OSX. We currently have the problem that they override each other on OSX: https://github.com/conda-forge/llvmdev-feedstock/issues/60 The compilers shipped by conda-forge on OSX use llvm=4.0.1 and thus this is also installe

Re: [Python] Any reason to exclude __lt__ from ArrayValue ?

2019-05-26 Thread Uwe L. Korn
Hello John, as with most things concerning the *Value classes: Missing implementations are simply "not-done-yet" and not explicit omissions. The value instances have not yet seen that much use and therefore lack a lot of functionality. Feel free to add this to them. Uwe On Sat, May 25, 2019, a

Re: A couple of questions about pyarrow.parquet

2019-05-23 Thread Uwe L. Korn
Hello Ted, regarding predicate pushdown in Python, have a look at my unfinished PR at https://github.com/apache/arrow/pull/2623. This was stopped since we were missing a native filter in Arrow. The requirements for that have now been implemented and we could probably reactivate the PR. Uwe On S

Re: [Discuss] [Python] protocol for conversion to pyarrow Array

2019-05-09 Thread Uwe L. Korn
+1 to the idea of adding a protocol to let other objects define their way to Arrow structures. For pandas.Series I would expect that they return an Arrow Column. For the Arrow->pandas conversion I have somewhat mixed feelings. In the normal Fletcher case I would expect that we don't convert anyth

[jira] [Created] (ARROW-5265) [Python/CI] Add integration test with kartothek

2019-05-06 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-5265: -- Summary: [Python/CI] Add integration test with kartothek Key: ARROW-5265 URL: https://issues.apache.org/jira/browse/ARROW-5265 Project: Apache Arrow Issue Type

Re: C++ and Python size problems with Arrow 0.13.0

2019-04-07 Thread Uwe L. Korn
> By the way, I don't understand why those are not symlinks. They should be symlinks, we have special code for this: https://github.com/apache/arrow/blob/4495305092411e8551c60341e273c8aa3c14b282/python/setup.py#L489-L499 This is probably not going into the wheel as wheels are zip-files and the

Re: C++ and Python size problems with Arrow 0.13.0

2019-04-07 Thread Uwe L. Korn
The only magic that auditwheel does on the Linux package is that it pulls in our shared version of libz.so into the wheel, otherwise there should be no differences in the wheel contents. Uwe On Wed, Apr 3, 2019, at 12:06 PM, Krisztián Szűcs wrote: > This is what the wheel contains before runnin

Re: [VOTE] Proposed changes to Arrow Flight protocol

2019-04-07 Thread Uwe L. Korn
+1 (binding) On Sat, Apr 6, 2019, at 3:09 AM, Kouhei Sutou wrote: > +1 (binding) > > In > "[VOTE] Proposed changes to Arrow Flight protocol" on Tue, 2 Apr 2019 > 19:05:27 -0500, > Wes McKinney wrote: > > > Hi, > > > > David Li has proposed to make the following additions or changes > > t

Re: [DRAFT] Apache Arrow ASF Board Report April 2019

2019-04-07 Thread Uwe L. Korn
+1 On Fri, Apr 5, 2019, at 10:02 PM, Wes McKinney wrote: > ## Description: > > Apache Arrow is a cross-language development platform for in-memory data. It > specifies a standardized language-independent columnar memory format for flat > and hierarchical data, organized for efficient analytic ope

Re: [VOTE] Add new DurationInterval Type to Arrow Format

2019-04-07 Thread Uwe L. Korn
+1 (binding) On Sat, Apr 6, 2019, at 2:44 AM, Kouhei Sutou wrote: > +1 (binding) > > In > "[VOTE] Add new DurationInterval Type to Arrow Format" on Wed, 3 Apr > 2019 07:59:56 -0700, > Jacques Nadeau wrote: > > > I'd like to propose a change to the Arrow format to support a new duration >

[jira] [Created] (ARROW-5074) [C++/Python] When installing into a SYSTEM prefix, RPATHs are not correctly set

2019-03-31 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-5074: -- Summary: [C++/Python] When installing into a SYSTEM prefix, RPATHs are not correctly set Key: ARROW-5074 URL: https://issues.apache.org/jira/browse/ARROW-5074 Project

[jira] [Created] (ARROW-4987) [C++] Use orc conda-package on Linux and OSX

2019-03-21 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4987: -- Summary: [C++] Use orc conda-package on Linux and OSX Key: ARROW-4987 URL: https://issues.apache.org/jira/browse/ARROW-4987 Project: Apache Arrow Issue Type

Re: [VOTE] Release Apache Arrow JS 0.4.1 - RC0

2019-03-21 Thread Uwe L. Korn
This sadly fails locally for me on OSX High Sierra: ``` + npm run test > apache-arrow@0.4.1 test > /private/var/folders/3j/b8ctc4654q71hd_nqqh8yxc0gp/T/arrow-js-0.4.1.X.8XkDsa8C/apache-arrow-js-0.4.1 > NODE_NO_WARNINGS=1 gulp test [15:23:02] Using gulpfile /private/var/folders/3j/b8ctc

Re: MemoryPool in Arrow libraries

2019-03-21 Thread Uwe L. Korn
Hello, > On alignment: The Arrow Spec calls for at least 8-byte alignment but > recommends 64-byte alignment precisely for SIMD use-cases. There is still > an open JIRA item [3] to make Java have 64-byte alignment, so I don't think > Java is handling 64-byte alignment (I don't know about 8-byte

[jira] [Created] (ARROW-4985) [C++] arrow/testing headers are not installed

2019-03-21 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4985: -- Summary: [C++] arrow/testing headers are not installed Key: ARROW-4985 URL: https://issues.apache.org/jira/browse/ARROW-4985 Project: Apache Arrow Issue Type

Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Uwe L. Korn
from > >>>>>> 0.12. > >>>>>>>>>> > >>>>>>>>>> We need an RM for 0.13, any PMCs want to volunteer? > >>>>>>>>>> > >>>>>>>>>> Take a look at our rele
