[jira] [Created] (ARROW-3921) [CI][GLib] Log Homebrew output

2018-11-30 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3921: Summary: [CI][GLib] Log Homebrew output Key: ARROW-3921 URL: https://issues.apache.org/jira/browse/ARROW-3921 Project: Apache Arrow Issue Type: Sub-task

Re: [VOTE] Accept donation of Rust Parquet implementation

2018-11-30 Thread Kouhei Sutou
+1 In "[VOTE] Accept donation of Rust Parquet implementation" on Fri, 30 Nov 2018 17:50:49 -0600, Wes McKinney wrote: > Dear all, > > The developers of > > https://github.com/sunchao/parquet-rs > > have been in touch with Apache Arrow and Apache Parquet. Based on > mailing list

Re: [VOTE] Accept donation of Rust Parquet implementation

2018-11-30 Thread Felix Cheung
+1!! From: Andy Grove Sent: Friday, November 30, 2018 4:26:21 PM To: dev@arrow.apache.org Subject: Re: [VOTE] Accept donation of Rust Parquet implementation +1 and great to see this happening! On Fri, Nov 30, 2018 at 4:51 PM Wes McKinney wrote: > Dear all, >

Re: [VOTE] Accept donation of Rust Parquet implementation

2018-11-30 Thread Andy Grove
+1 and great to see this happening! On Fri, Nov 30, 2018 at 4:51 PM Wes McKinney wrote: > Dear all, > > The developers of > > https://github.com/sunchao/parquet-rs > > have been in touch with Apache Arrow and Apache Parquet. Based on > mailing list discussions, it is being proposed to donate

[jira] [Created] (ARROW-3920) Plasma reference counting not properly done in TensorFlow custom operator.

2018-11-30 Thread Robert Nishihara (JIRA)
Robert Nishihara created ARROW-3920: Summary: Plasma reference counting not properly done in TensorFlow custom operator. Key: ARROW-3920 URL: https://issues.apache.org/jira/browse/ARROW-3920

Timeline for Arrow 0.12.0 release

2018-11-30 Thread Wes McKinney
hi folks, Tomorrow is December 1. The last major Arrow release (0.11.0) took place on October 8. Given how much work has happened in the project in the last ~2 months, I think it would be great to complete the next major release before the end-of-year holidays set in. I've been curating the JIRA

[jira] [Created] (ARROW-3919) [Python] Support 64 bit indices for pyarrow.serialize and pyarrow.deserialize

2018-11-30 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-3919: Summary: [Python] Support 64 bit indices for pyarrow.serialize and pyarrow.deserialize Key: ARROW-3919 URL: https://issues.apache.org/jira/browse/ARROW-3919

[jira] [Created] (ARROW-3918) [Python] ParquetWriter.write_table doesn't support coerce_timestamps or allow_truncated_timestamps

2018-11-30 Thread David Lee (JIRA)
David Lee created ARROW-3918: Summary: [Python] ParquetWriter.write_table doesn't support coerce_timestamps or allow_truncated_timestamps Key: ARROW-3918 URL: https://issues.apache.org/jira/browse/ARROW-3918

Re: zlib shared linking patch -- arrow 0.11.0 vs 0.11.1?

2018-11-30 Thread Wes McKinney
I opened a PR to update the website for the 0.11.1 patch release https://github.com/apache/arrow/pull/3060 On Wed, Nov 28, 2018 at 8:02 AM Wes McKinney wrote: > > We forgot to update the website when we made the 0.11.1 bugfix > release; it would be a good idea to do that. > On Mon, Nov 26, 2018

Re: [JAVA] Arrow performance measurement

2018-11-30 Thread Wes McKinney
hi Animesh -- can you link to JIRA issues about the C++ improvements you're describing? Want to make sure this doesn't fall through the cracks Thanks Wes On Mon, Nov 26, 2018 at 7:54 AM Antoine Pitrou wrote: > > > Hi Animesh, > > Le 26/11/2018 à 14:23, Animesh Trivedi a écrit : > > > > * C++

Re: RFC: Type inference rules

2018-11-30 Thread Wes McKinney
On Fri, Nov 30, 2018 at 10:00 AM Ben Kietzman wrote: > > I think the fallback graph approach might still be useful in the case of > parsing with unions allowed, albeit with a much broader graph. > > For example, > > INT64 + STRING -> UNION(INT64, STRING) > T + UNION (*) -> UNION(T, *) > # ... > >

Re: Arrow development sync call: 12p Eastern / 17:00 UTC

2018-11-30 Thread Wes McKinney
By the way: if anyone would like to be added to the calendar invite for the biweekly syncs, please contact me off-list On Fri, Nov 30, 2018 at 8:58 AM Wes McKinney wrote: > > I've heard from a couple people that they weren't able to join at 12pm > when clicking on the link. Since that particular

Re: RFC: Type inference rules

2018-11-30 Thread Ben Kietzman
I think the fallback graph approach might still be useful in the case of parsing with unions allowed, albeit with a much broader graph. For example, INT64 + STRING -> UNION(INT64, STRING) T + UNION (*) -> UNION(T, *) # ... Related: how should ordinarily convertible types be handled in the
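The two union rules quoted above can be sketched in a few lines of Python. This is only an illustration, not Arrow code: the `merge` function and the representation of a union as a `frozenset` of type names are assumptions made for the example.

```python
from typing import FrozenSet, Union

# Hypothetical model: a primitive type is a name such as "INT64";
# a union is a frozenset of primitive type names.
InferredType = Union[str, FrozenSet[str]]


def merge(left: InferredType, right: InferredType) -> InferredType:
    """Combine two inferred types, widening to a union when they differ."""
    left_members = left if isinstance(left, frozenset) else frozenset([left])
    right_members = right if isinstance(right, frozenset) else frozenset([right])
    members = left_members | right_members
    if len(members) == 1:
        # Both sides are the same primitive, so no union is needed.
        return next(iter(members))
    # Covers INT64 + STRING -> UNION(INT64, STRING) and T + UNION(*) -> UNION(T, *).
    return members
```

Because set union is commutative and associative, the order in which values are seen would not change the resulting union under this model.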

Re: RFC: Type inference rules

2018-11-30 Thread Wes McKinney
I think there are two useful modes for schema-on-read: * Unions allowed * Unions not allowed We haven't implemented union inference for converting Python sequences yet. See e.g. In [1]: import pyarrow as pa In [2]: pa.array([{'a': 'foo'}, {'a': 'bar'}]) Out[2]: -- is_valid: all not null --

Re: RFC: Type inference rules

2018-11-30 Thread Antoine Pitrou
Le 30/11/2018 à 15:43, Ben Kietzman a écrit : > Hi Antoine, > > The conversion of previous blocks is part of the fall back mechanism I'm > trying to describe. When type inference fails (even in a different block), > conversion of all blocks of the column is attempted to the next type in the >
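The fallback idea described in the quoted text (when one block fails to convert, retry every block of the column with the next candidate type) might look roughly like the sketch below. The `convert_column` helper and the `int -> float -> str` chain are assumptions chosen for illustration; the thread does not specify the actual order or API.

```python
def convert_column(blocks, chain=(int, float, str)):
    """Return (type, converted_blocks) for the first type that fits every block.

    `blocks` is a list of lists of raw string cells. The chain of candidate
    types is hypothetical and only stands in for a real fallback graph.
    """
    for candidate in chain:
        try:
            # Reconvert all blocks, including ones that succeeded earlier.
            converted = [[candidate(cell) for cell in block] for block in blocks]
        except ValueError:
            # One cell did not fit: fall back to the next type in the chain.
            continue
        return candidate, converted
    raise ValueError("no type in the chain fits the column")
```

For example, `convert_column([["1", "2"], ["3.5"]])` would settle on `float`, since the second block rules out `int` for the whole column.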

Re: RFC: Type inference rules

2018-11-30 Thread Francois Saint-Jacques
Hello, With JSON and other "typed" formats (msgpack, protobuf, ...) you need to take unions into account, e.g. {a: "herp", b: 10} {a: true, c: "derp"} The type for `a` would be union. I think we should also evaluate investing in ingesting different schema DSLs (protobuf idl, json-schema) to
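To make the example above concrete, here is a minimal sketch using only the Python standard library. It collects the set of value kinds seen for each field across newline-delimited JSON records. The helper names `kind_of` and `infer_field_types` are hypothetical, and the keys are quoted so the records are valid JSON.

```python
import json
from collections import defaultdict


def kind_of(value):
    """Name the JSON kind of a decoded value (bool is checked before int)."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "list"
    return "object"


def infer_field_types(lines):
    """Map each field name to the sorted list of value kinds seen for it."""
    seen = defaultdict(set)
    for line in lines:
        for name, value in json.loads(line).items():
            seen[name].add(kind_of(value))
    return {name: sorted(kinds) for name, kinds in seen.items()}


records = ['{"a": "herp", "b": 10}', '{"a": true, "c": "derp"}']
field_types = infer_field_types(records)
```

A field with more than one kind, such as `a` here, is the case that would need a union type (or an error, if unions are not allowed).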

[jira] [Created] (ARROW-3917) [Python] Provide Python API to ArrayBuilder classes

2018-11-30 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3917: Summary: [Python] Provide Python API to ArrayBuilder classes Key: ARROW-3917 URL: https://issues.apache.org/jira/browse/ARROW-3917 Project: Apache Arrow Issue

Re: Arrow development sync call: 12p Eastern / 17:00 UTC

2018-11-30 Thread Wes McKinney
I've heard from a couple people that they weren't able to join at 12pm when clicking on the link. Since that particular Google Meet room was set up by Dremio, someone there has to let you in. So if this happens in the future, wait a couple minutes and try again. You can also write to the mailing

[jira] [Created] (ARROW-3916) [Python] Support caller-provided filesystem in `ParquetWriter` constructor

2018-11-30 Thread Mackenzie (JIRA)
Mackenzie created ARROW-3916: Summary: [Python] Support caller-provided filesystem in `ParquetWriter` constructor Key: ARROW-3916 URL: https://issues.apache.org/jira/browse/ARROW-3916 Project: Apache

[jira] [Created] (ARROW-3915) [Python] Support partition columns when incrementally writing

2018-11-30 Thread Mackenzie (JIRA)
Mackenzie created ARROW-3915: Summary: [Python] Support partition columns when incrementally writing Key: ARROW-3915 URL: https://issues.apache.org/jira/browse/ARROW-3915 Project: Apache Arrow

Re: [DISCUSS] Rust add adapter for parquet

2018-11-30 Thread Wes McKinney
Thanks. I will review today and start a vote about the code donation. I know that we already voted to accept in Apache Parquet, but I want to double check that the Arrow community is also on board with sharing responsibility for this code. If the Parquet community wants to make Rust Parquet

[jira] [Created] (ARROW-3914) [C++/Python/Packaging] Docker-compose setup for Alpine linux

2018-11-30 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-3914: Summary: [C++/Python/Packaging] Docker-compose setup for Alpine linux Key: ARROW-3914 URL: https://issues.apache.org/jira/browse/ARROW-3914 Project: Apache Arrow

Re: RFC: Type inference rules

2018-11-30 Thread Antoine Pitrou
Hi Ben, Le 30/11/2018 à 02:19, Ben Kietzman a écrit : > Currently, to figure out which types may be inferred and under which > circumstances they will be inferred involves digging through code. I think > it would be useful to have an API for expressing type inference rules. > Ideally this would