Compute kernels and Gandiva operators

2019-02-12 Thread Ravindra Pindikura
Hi, I was looking at the recent check-in for Arrow kernels, and started to think about how they would work alongside Gandiva. Here are my thoughts: 1. Gandiva already has two high-level operators, namely project and filter, with runtime code generation * It already supports 100s of functions (e.g.

[jira] [Created] (ARROW-4559) pyarrow can't read/write filenames with special characters

2019-02-12 Thread Jean-Christophe Petkovich (JIRA)
Jean-Christophe Petkovich created ARROW-4559: Summary: pyarrow can't read/write filenames with special characters Key: ARROW-4559 URL: https://issues.apache.org/jira/browse/ARROW-4559 Proje

[jira] [Created] (ARROW-4558) [C++][Flight] Avoid undefined behavior with gRPC memory optimizations

2019-02-12 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4558: --- Summary: [C++][Flight] Avoid undefined behavior with gRPC memory optimizations Key: ARROW-4558 URL: https://issues.apache.org/jira/browse/ARROW-4558 Project: Apache Arr

[jira] [Created] (ARROW-4557) [JS] Add Table/Schema/RecordBatch `selectAt(...indices)` method

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4557: -- Summary: [JS] Add Table/Schema/RecordBatch `selectAt(...indices)` method Key: ARROW-4557 URL: https://issues.apache.org/jira/browse/ARROW-4557 Project: Apache Arrow

[jira] [Created] (ARROW-4556) [Rust] Preserve order of JSON inferred schema

2019-02-12 Thread Neville Dipale (JIRA)
Neville Dipale created ARROW-4556: - Summary: [Rust] Preserve order of JSON inferred schema Key: ARROW-4556 URL: https://issues.apache.org/jira/browse/ARROW-4556 Project: Apache Arrow Issue Ty

[jira] [Created] (ARROW-4555) [JS] Add high-level Table and Column creation methods

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4555: -- Summary: [JS] Add high-level Table and Column creation methods Key: ARROW-4555 URL: https://issues.apache.org/jira/browse/ARROW-4555 Project: Apache Arrow Issue

[jira] [Created] (ARROW-4554) [JS] Implement logic for combining Vectors with different lengths/chunksizes

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4554: -- Summary: [JS] Implement logic for combining Vectors with different lengths/chunksizes Key: ARROW-4554 URL: https://issues.apache.org/jira/browse/ARROW-4554 Project: Apach

[jira] [Created] (ARROW-4553) [JS] Implement Schema/Field/DataType comparators

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4553: -- Summary: [JS] Implement Schema/Field/DataType comparators Key: ARROW-4553 URL: https://issues.apache.org/jira/browse/ARROW-4553 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-4552) [JS] Table and Schema assign implementations

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4552: -- Summary: [JS] Table and Schema assign implementations Key: ARROW-4552 URL: https://issues.apache.org/jira/browse/ARROW-4552 Project: Apache Arrow Issue Type: New

[jira] [Created] (ARROW-4551) [JS] Investigate using Symbols to access Row columns by index

2019-02-12 Thread Brian Hulette (JIRA)
Brian Hulette created ARROW-4551: Summary: [JS] Investigate using Symbols to access Row columns by index Key: ARROW-4551 URL: https://issues.apache.org/jira/browse/ARROW-4551 Project: Apache Arrow

[jira] [Created] (ARROW-4550) [JS] Fix AMD pattern

2019-02-12 Thread Dominik Moritz (JIRA)
Dominik Moritz created ARROW-4550: - Summary: [JS] Fix AMD pattern Key: ARROW-4550 URL: https://issues.apache.org/jira/browse/ARROW-4550 Project: Apache Arrow Issue Type: Bug Compone

Re: [Rust] Rust 0.13.0 release

2019-02-12 Thread Chao Sun
I’m also interested in the Parquet/Arrow integration and may help there. This is, however, a relatively large feature and I’m not sure if it can be done in 0.13. Another area I’d like to work in is high-level Parquet writer support. This issue has been discussed several times in the past. People shoul

Re: [Rust] Rust 0.13.0 release

2019-02-12 Thread paddy horan
Hi All, The focus for me for 0.13.0 is SIMD. I would like to port all the "ops" in "array_ops" to the new "compute" module and leverage SIMD for them all. I have most of this done in various forks. Past 0.13.0 I would really like to work toward getting Rust running in the integration tests.

[jira] [Created] (ARROW-4549) [C++] Can't build benchmark code on CUDA enabled build

2019-02-12 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-4549: --- Summary: [C++] Can't build benchmark code on CUDA enabled build Key: ARROW-4549 URL: https://issues.apache.org/jira/browse/ARROW-4549 Project: Apache Arrow Iss

[jira] [Created] (ARROW-4548) [C++] run-clang-format.py is not supported on Windows

2019-02-12 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4548: --- Summary: [C++] run-clang-format.py is not supported on Windows Key: ARROW-4548 URL: https://issues.apache.org/jira/browse/ARROW-4548 Project: Apache Arrow Issu

[jira] [Created] (ARROW-4547) [Python][Documentation] Update python/development.rst with instructions for CUDA-enabled builds

2019-02-12 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4547: --- Summary: [Python][Documentation] Update python/development.rst with instructions for CUDA-enabled builds Key: ARROW-4547 URL: https://issues.apache.org/jira/browse/ARROW-4547

Re: Arrow Flight protocol/API questions

2019-02-12 Thread Wes McKinney
On Tue, Feb 12, 2019 at 3:46 PM Antoine Pitrou wrote: > > > Le 12/02/2019 à 22:34, Wes McKinney a écrit : > > On Tue, Feb 12, 2019 at 2:48 PM Antoine Pitrou wrote: > >> > >> > >> Hi David, > >> > >> I think allowing to send application-specific ancillary data in addition > >> to Arrow data makes

Re: Arrow Flight protocol/API questions

2019-02-12 Thread Antoine Pitrou
Le 12/02/2019 à 22:34, Wes McKinney a écrit : > On Tue, Feb 12, 2019 at 2:48 PM Antoine Pitrou wrote: >> >> >> Hi David, >> >> I think allowing to send application-specific ancillary data in addition >> to Arrow data makes sense. >> >> (I'm also wondering whether the choice of gRPC is appropriat

Re: Arrow Flight protocol/API questions

2019-02-12 Thread Wes McKinney
Even if zeromq did make more sense, we couldn't take it on as a dependency because of non-ASF-compatible licenses. Java zeromq: MPL 2.0; libzmq: GPL. On Tue, Feb 12, 2019 at 3:33 PM Jonathan Chiang wrote: > > Would zeromq make more sense than gRPC? > > Thanks, > Jonathan > > > On Feb 12, 2019, at 1

Re: Arrow Flight protocol/API questions

2019-02-12 Thread Wes McKinney
On Tue, Feb 12, 2019 at 2:48 PM Antoine Pitrou wrote: > > > Hi David, > > I think allowing to send application-specific ancillary data in addition > to Arrow data makes sense. > > (I'm also wondering whether the choice of gRPC is appropriate at all - > the current C++ hacks around "zero-copy" are

Re: Arrow Flight protocol/API questions

2019-02-12 Thread Jonathan Chiang
Would zeromq make more sense than gRPC? Thanks, Jonathan > On Feb 12, 2019, at 12:48 PM, Antoine Pitrou wrote: > > > Hi David, > > I think allowing to send application-specific ancillary data in addition > to Arrow data makes sense. > > (I'm also wondering whether the choice of gRPC is app

Re: Arrow Flight protocol/API questions

2019-02-12 Thread Antoine Pitrou
Hi David, I think allowing to send application-specific ancillary data in addition to Arrow data makes sense. (I'm also wondering whether the choice of gRPC is appropriate at all - the current C++ hacks around "zero-copy" are not pretty and they may not translate to other languages either) Reg

Arrow Flight protocol/API questions

2019-02-12 Thread David Ming Li
Hi all, We've been evaluating Flight for our use, and we're wondering if the protocol is still open to extensions, as having a few application-defined metadata fields would help our use cases a lot. (Apologies if this is a repost - was having issues with the spam filter.) Specifically, in

[jira] [Created] (ARROW-4546) LICENSE.txt should be updated.

2019-02-12 Thread Renat Valiullin (JIRA)
Renat Valiullin created ARROW-4546: -- Summary: LICENSE.txt should be updated. Key: ARROW-4546 URL: https://issues.apache.org/jira/browse/ARROW-4546 Project: Apache Arrow Issue Type: Task

[jira] [Created] (ARROW-4545) [C#] Extend Append/AppendRange in BinaryArray to support building rows

2019-02-12 Thread Chris Hutchinson (JIRA)
Chris Hutchinson created ARROW-4545: --- Summary: [C#] Extend Append/AppendRange in BinaryArray to support building rows Key: ARROW-4545 URL: https://issues.apache.org/jira/browse/ARROW-4545 Project: A

Re: [Rust] Rust 0.13.0 release

2019-02-12 Thread Neville Dipale
Thanks for bringing this up, Andy. I'm unemployed/on recovery leave, so I've had some surplus time to work on Rust. There are a lot of features that I've wanted to work on, some of which I've spent some time attempting, but struggled with. A few block additional work that I could contribute. In 0.13.0

[Rust] Rust 0.13.0 release

2019-02-12 Thread Andy Grove
I was curious what our Rust committers and contributors are excited about for 0.13.0. The feature I would most like to see is the ability for DataFusion to run SQL against Parquet files again, as that would give me an excuse for a PoC in my day job using Arrow. I know there were some efforts und

[jira] [Created] (ARROW-4544) [Rust] Read nested JSON structs into StructArrays

2019-02-12 Thread Neville Dipale (JIRA)
Neville Dipale created ARROW-4544: - Summary: [Rust] Read nested JSON structs into StructArrays Key: ARROW-4544 URL: https://issues.apache.org/jira/browse/ARROW-4544 Project: Apache Arrow Issu

[jira] [Created] (ARROW-4543) [C#] Update Flat Buffers code to latest version

2019-02-12 Thread Eric Erhardt (JIRA)
Eric Erhardt created ARROW-4543: --- Summary: [C#] Update Flat Buffers code to latest version Key: ARROW-4543 URL: https://issues.apache.org/jira/browse/ARROW-4543 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-4542) Denominate row group size in bytes (not in no of rows)

2019-02-12 Thread Remek Zajac (JIRA)
Remek Zajac created ARROW-4542: -- Summary: Denominate row group size in bytes (not in no of rows) Key: ARROW-4542 URL: https://issues.apache.org/jira/browse/ARROW-4542 Project: Apache Arrow Issue

[jira] [Created] (ARROW-4541) [Gandiva] Enable timestamp tests on windows platform

2019-02-12 Thread shyam narayan singh (JIRA)
shyam narayan singh created ARROW-4541: -- Summary: [Gandiva] Enable timestamp tests on windows platform Key: ARROW-4541 URL: https://issues.apache.org/jira/browse/ARROW-4541 Project: Apache Arrow

[jira] [Created] (ARROW-4540) [Rust] Add basic JSON reader

2019-02-12 Thread Neville Dipale (JIRA)
Neville Dipale created ARROW-4540: - Summary: [Rust] Add basic JSON reader Key: ARROW-4540 URL: https://issues.apache.org/jira/browse/ARROW-4540 Project: Apache Arrow Issue Type: Sub-task

[jira] [Created] (ARROW-4539) [Java]List vector child value count not set correctly

2019-02-12 Thread Praveen Kumar Desabandu (JIRA)
Praveen Kumar Desabandu created ARROW-4539: -- Summary: [Java]List vector child value count not set correctly Key: ARROW-4539 URL: https://issues.apache.org/jira/browse/ARROW-4539 Project: Apach

[jira] [Created] (ARROW-4538) pa.Table.from_pandas() with df.index.name != None breaks write_to_dataset()

2019-02-12 Thread Christian Thiel (JIRA)
Christian Thiel created ARROW-4538: -- Summary: pa.Table.from_pandas() with df.index.name != None breaks write_to_dataset() Key: ARROW-4538 URL: https://issues.apache.org/jira/browse/ARROW-4538 Project

[jira] [Created] (ARROW-4537) [CI] Suppress shell warning on travis-ci

2019-02-12 Thread Kenta Murata (JIRA)
Kenta Murata created ARROW-4537: --- Summary: [CI] Suppress shell warning on travis-ci Key: ARROW-4537 URL: https://issues.apache.org/jira/browse/ARROW-4537 Project: Apache Arrow Issue Type: Task