Re: [Discuss] Streaming: Differentiate between length of RecordBatch and utilized portion-- common use-case?

2019-10-17 Thread Micah Kornfield
On the specification: I'm -.5 on saying array lengths can be different than row batch length (especially if both are valid lengths). I can see some wiggle room in the current language [1][2] that might allow for modifying this, so I think we should update it one way or another however this conver

[jira] [Created] (ARROW-6931) [Java] Consider starting to use Google Truth Fluent Assertions library

2019-10-17 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-6931: -- Summary: [Java] Consider starting to use Google Truth Fluent Assertions library Key: ARROW-6931 URL: https://issues.apache.org/jira/browse/ARROW-6931 Project: Apa

[jira] [Created] (ARROW-6930) [Java] Create static factory methods for common array types of testing.

2019-10-17 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-6930: -- Summary: [Java] Create static factory methods for common array types of testing. Key: ARROW-6930 URL: https://issues.apache.org/jira/browse/ARROW-6930 Project: Ap

[jira] [Created] (ARROW-6929) [C++] ValidateArray is out of sync with the ListArray IPC specification

2019-10-17 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-6929: -- Summary: [C++] ValidateArray is out of sync with the ListArray IPC specification Key: ARROW-6929 URL: https://issues.apache.org/jira/browse/ARROW-6929 Project: Ap

pyarrow and pyzmq no copy

2019-10-17 Thread seshu yamajala
I would like to use pyarrow with pyzmq no copy to send dicts of arrays across the network without having to make copies of the arrays stored in the dicts. I've come up with an example, here: https://gist.github.com/syamajala/51d52f5e326ff719bf6231546091991d However, I'm having trouble with deser
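As a rough illustration of the pattern the question above is after, the sketch below splits a dict of arrays into a small metadata frame plus one buffer frame per array, so the array data itself is never copied. It deliberately uses only the Python standard library (array and memoryview) and is not taken from the linked gist; the helper names to_frames and from_frames are hypothetical, and it assumes numeric array.array typecodes only. With pyzmq the frame list would presumably be handed to a multipart send with copying disabled, and with pyarrow the buffers would come from the arrays' own buffers, but neither step is shown here.

```python
import json
from array import array


def to_frames(arrays):
    """Split a dict of array.array objects into [header, buf1, buf2, ...].

    The header is a small JSON blob describing each array; the remaining
    frames are memoryviews over the arrays' own memory (no data copy).
    """
    header = [{"name": name, "typecode": arr.typecode}
              for name, arr in arrays.items()]
    frames = [json.dumps(header).encode("utf-8")]
    frames.extend(memoryview(arr) for arr in arrays.values())
    return frames


def from_frames(frames):
    """Rebuild a dict of typed memoryviews from the frames, without copying."""
    header = json.loads(bytes(frames[0]).decode("utf-8"))
    result = {}
    for meta, buf in zip(header, frames[1:]):
        # Cast to raw bytes first, then to the element type named in the header.
        result[meta["name"]] = memoryview(buf).cast("B").cast(meta["typecode"])
    return result


if __name__ == "__main__":
    source = {"x": array("d", [1.0, 2.0]), "y": array("q", [3, 4, 5])}
    rebuilt = from_frames(to_frames(source))
    print({name: view.tolist() for name, view in rebuilt.items()})
```

Because the rebuilt values are views rather than copies, a write to the source array on the sending side is visible through the corresponding view, which is an easy way to check that no copy was made.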

Re: [ANNOUNCE] New Arrow committer: Eric Erhardt

2019-10-17 Thread Fan Liya
Congrats Eric! Best, Liya Fan On Fri, Oct 18, 2019 at 3:06 AM paddy horan wrote: > Congrats Eric! > > > From: Micah Kornfield > Sent: Thursday, October 17, 2019 12:45:15 PM > To: dev > Subject: Re: [ANNOUNCE] New Arrow committer: Eric Erhardt > > Congrats Eric

[jira] [Created] (ARROW-6928) [Rust] Add FixedSizeList type

2019-10-17 Thread Neville Dipale (Jira)
Neville Dipale created ARROW-6928: - Summary: [Rust] Add FixedSizeList type Key: ARROW-6928 URL: https://issues.apache.org/jira/browse/ARROW-6928 Project: Apache Arrow Issue Type: Sub-task

Re: [DISCUSS] [Rust] Adding support for Flight protocol

2019-10-17 Thread Neville Dipale
Thanks Andy, Please see https://github.com/apache/arrow/pull/4167#issuecomment-543381089 for the status of the PR. We have a few missing data types (fixed list, timezone to timestamp, etc.) that are currently stopping me from testing the reading of files. I'm trying out creating a fixed size list

Re: [DISCUSS] [Rust] Adding support for Flight protocol

2019-10-17 Thread Andy Grove
Thanks for all the updates. I'd like to get involved and help out with this effort as well. I don't have any major work planned for DataFusion for 1.0.0 now other than maybe moving to the new parquet ArrowReader, if it is ready in time. I have been chatting with the author of the Rust Flatbuffer p

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-17-0

2019-10-17 Thread Sutou Kouhei
https://github.com/apache/arrow/pull/5689 will fix them: > - debian-stretch: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-17-0-azure-debian-stretch > - debian-buster: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-17-0-a

[jira] [Created] (ARROW-6927) [C++] Add gRPC version check

2019-10-17 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-6927: --- Summary: [C++] Add gRPC version check Key: ARROW-6927 URL: https://issues.apache.org/jira/browse/ARROW-6927 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-6926) Support __sizeof__ protocol for Python objects

2019-10-17 Thread Matthew Rocklin (Jira)
Matthew Rocklin created ARROW-6926: -- Summary: Support __sizeof__ protocol for Python objects Key: ARROW-6926 URL: https://issues.apache.org/jira/browse/ARROW-6926 Project: Apache Arrow Issue

[jira] [Created] (ARROW-6925) Arrow fails to build on MacOS 10.13.6 using brew gcc 7 and 8

2019-10-17 Thread John Norris (Jira)
John Norris created ARROW-6925: -- Summary: Arrow fails to build on MacOS 10.13.6 using brew gcc 7 and 8 Key: ARROW-6925 URL: https://issues.apache.org/jira/browse/ARROW-6925 Project: Apache Arrow

[jira] [Created] (ARROW-6924) [Python] Disallow writing Parquet V2 files from Python until PARQUET-458 is resolved

2019-10-17 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6924: --- Summary: [Python] Disallow writing Parquet V2 files from Python until PARQUET-458 is resolved Key: ARROW-6924 URL: https://issues.apache.org/jira/browse/ARROW-6924 Proj

Re: [DISCUSS] [Rust] Adding support for Flight protocol

2019-10-17 Thread Neville Dipale
Good evening With support for testing against integration files now done, I've resumed work on the IPC reader. If I don't encounter trouble reading the existing files, I expect to be done with this work by the end of the weekend. I had taken the approach of one large PR to include all Rust-support

C data interface: draft C++ implementation and Python <-> R bridge

2019-10-17 Thread Antoine Pitrou
Hi, For the record, I've been working on a C++ implementation of exporting and importing data using the C data interface: https://github.com/apache/arrow/pull/5608 (some datatypes are not handled yet, and metadata is currently not implemented) Also Neal has used that PR to create a proof-

Re: [ANNOUNCE] New Arrow committer: Eric Erhardt

2019-10-17 Thread paddy horan
Congrats Eric! From: Micah Kornfield Sent: Thursday, October 17, 2019 12:45:15 PM To: dev Subject: Re: [ANNOUNCE] New Arrow committer: Eric Erhardt Congrats Eric! On Thu, Oct 17, 2019 at 6:58 AM Wes McKinney wrote: > On behalf of the Arrow PMC, I'm happy to a

[jira] [Created] (ARROW-6923) [C++] Option for Filter kernel how to handle nulls in the selection vector

2019-10-17 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6923: Summary: [C++] Option for Filter kernel how to handle nulls in the selection vector Key: ARROW-6923 URL: https://issues.apache.org/jira/browse/ARROW-6923

[jira] [Created] (ARROW-6922) [Python] Pandas master build is failing (MultiIndex.levels change)

2019-10-17 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6922: Summary: [Python] Pandas master build is failing (MultiIndex.levels change) Key: ARROW-6922 URL: https://issues.apache.org/jira/browse/ARROW-6922 Proj

[jira] [Created] (ARROW-6921) Regression: Cannot round-trip IPC files between PyArrow and Arrow JS

2019-10-17 Thread Joe Quigley (Jira)
Joe Quigley created ARROW-6921: -- Summary: Regression: Cannot round-trip IPC files between PyArrow and Arrow JS Key: ARROW-6921 URL: https://issues.apache.org/jira/browse/ARROW-6921 Project: Apache Arrow

Re: [ANNOUNCE] New Arrow committer: Eric Erhardt

2019-10-17 Thread Micah Kornfield
Congrats Eric! On Thu, Oct 17, 2019 at 6:58 AM Wes McKinney wrote: > On behalf of the Arrow PMC, I'm happy to announce that Eric has > accepted an invitation to become a committer on Apache Arrow. > > Welcome, and thank you for your contributions! >

[jira] [Created] (ARROW-6920) [python] create manylinux wheels for python3.8

2019-10-17 Thread Simon Hewitt (Jira)
Simon Hewitt created ARROW-6920: --- Summary: [python] create manylinux wheels for python3.8 Key: ARROW-6920 URL: https://issues.apache.org/jira/browse/ARROW-6920 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-6919) [Python] Expose more builders in Cython

2019-10-17 Thread Uwe Korn (Jira)
Uwe Korn created ARROW-6919: --- Summary: [Python] Expose more builders in Cython Key: ARROW-6919 URL: https://issues.apache.org/jira/browse/ARROW-6919 Project: Apache Arrow Issue Type: Improvement

Re: [DISCUSS] [Rust] Adding support for Flight protocol

2019-10-17 Thread David Li
Just for reference, it's possible once the basic IPC support is merged; I had a proof of concept, though it needs to be updated to use Tonic over tower-grpc, actually implement the zero-copy optimizations, provide a real API, etc. https://github.com/apache/arrow/pull/4167#issuecomment-529695811 O

[jira] [Created] (ARROW-6918) [R] Make docker-compose setup faster

2019-10-17 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6918: - Summary: [R] Make docker-compose setup faster Key: ARROW-6918 URL: https://issues.apache.org/jira/browse/ARROW-6918 Project: Apache Arrow Issue Type: Impro

Re: [DISCUSS] [Rust] Adding support for Flight protocol

2019-10-17 Thread Wes McKinney
I hope to see Flight in all the reference implementations eventually. Having hardened IPC support is a prerequisite; it would be ideal to have Rust as a participant in the integration tests. On Thu, Oct 17, 2019 at 9:41 AM Andy Grove wrote: > > I was approached directly about adding Flight suppo

[DISCUSS] [Rust] Adding support for Flight protocol

2019-10-17 Thread Andy Grove
I was approached directly about adding Flight support to the Rust implementation, and said I would start a discussion here on the mailing list. There is ongoing work with IPC and integration and I believe that it would make sense to start looking at adding Flight support. I'd like to hear what ot

Re: [Discuss] Streaming: Differentiate between length of RecordBatch and utilized portion-- common use-case?

2019-10-17 Thread John Muehlhausen
Micah, thanks very much for your input. A few thoughts in response: ``Over the time horizon of desired latency if you aren't receiving enough messages to take advantage of columnar analytics, a system probably has enough time to compact batches after the fact for later analysis and conversely if

[jira] [Created] (ARROW-6917) [Developer] Implement Python script to generate git cherry-pick commands needed to create patch build branch for maint releases

2019-10-17 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6917: --- Summary: [Developer] Implement Python script to generate git cherry-pick commands needed to create patch build branch for maint releases Key: ARROW-6917 URL: https://issues.apache.o

[jira] [Created] (ARROW-6916) [Developer] Alphabetize task names in nightly Crossbow report

2019-10-17 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6916: --- Summary: [Developer] Alphabetize task names in nightly Crossbow report Key: ARROW-6916 URL: https://issues.apache.org/jira/browse/ARROW-6916 Project: Apache Arrow

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-17-0

2019-10-17 Thread Wes McKinney
Holy moly lots of failures! - docker-clang-format https://issues.apache.org/jira/browse/ARROW-6914 - wheel-osx-cp36m: - wheel-osx-cp37m: - wheel-osx-cp27m: - wheel-osx-cp35m: - gandiva-jar-osx: - All seem to be caused by the Homebrew issues? I'm looking at the others and will create JIRA issue

[jira] [Created] (ARROW-6915) [Developer] Do not overwrite minor release version with merge script, even if not specified by committer

2019-10-17 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6915: --- Summary: [Developer] Do not overwrite minor release version with merge script, even if not specified by committer Key: ARROW-6915 URL: https://issues.apache.org/jira/browse/ARROW-69

[jira] [Created] (ARROW-6914) [CI] docker-clang-format nightly failing

2019-10-17 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6914: --- Summary: [CI] docker-clang-format nightly failing Key: ARROW-6914 URL: https://issues.apache.org/jira/browse/ARROW-6914 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-6913) [R] Potential bug in compute.cc

2019-10-17 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6913: - Summary: [R] Potential bug in compute.cc Key: ARROW-6913 URL: https://issues.apache.org/jira/browse/ARROW-6913 Project: Apache Arrow Issue Type: Bug

[ANNOUNCE] New Arrow committer: Eric Erhardt

2019-10-17 Thread Wes McKinney
On behalf of the Arrow PMC, I'm happy to announce that Eric has accepted an invitation to become a committer on Apache Arrow. Welcome, and thank you for your contributions!

Re: Possible Arrow 0.15.1 release

2019-10-17 Thread Wes McKinney
Thanks. Yes, you have to type 1.0.0,0.15.1 when merging to set both fix versions. On Thu, Oct 17, 2019 at 8:52 AM Antoine Pitrou wrote: > > > I added 0.15.1 back to a couple of fixed issues where the merge script > removed it :-/ > > > > On 17/10/2019 at 09:30, Wes McKinney wrote: > > Nearly a

Re: Possible Arrow 0.15.1 release

2019-10-17 Thread Antoine Pitrou
I added 0.15.1 back to a couple of fixed issues where the merge script removed it :-/ On 17/10/2019 at 09:30, Wes McKinney wrote: Nearly all the fixes are in for 0.15.1. I think the only thing preventing us from making an RC soon is (surprise) new problems with packaging (Homebrew-relat

Re: Possible Arrow 0.15.1 release

2019-10-17 Thread Wes McKinney
Nearly all the fixes are in for 0.15.1. I think the only thing preventing us from making an RC soon is (surprise) new problems with packaging (Homebrew-related, and others). Anything else that needs to go into 0.15.1 that is not marked as such? On Sat, Oct 12, 2019 at 1:52 AM Fan Liya wrote: >

[NIGHTLY] Arrow Build Report for Job nightly-2019-10-17-0

2019-10-17 Thread Crossbow
Arrow Build Report for Job nightly-2019-10-17-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-17-0 Failed Tasks: - wheel-osx-cp36m: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-17-0-travis-wheel-osx-cp36m - debian-stretc