Re: Coordinating / scheduling C++ Parquet-Arrow nested data work (ARROW-1644 and others)

2020-03-03 Thread Micah Kornfield
Hi Igor, If you have the time, https://issues.apache.org/jira/browse/ARROW-7960 might be a good task to pick up for this. I think it should be a relatively small amount of code, so it is probably a good contribution to the project. Once that is wrapped up we can see where we both are. Cheers, Micah

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Wes McKinney
On Tue, Mar 3, 2020, 8:11 PM Fan Liya wrote: > Sure. I agree with you that we should not overdo this. > I am wondering if we should provide an option to allow users to plug in > their customized compression strategies. > Can you provide a patch showing changes to Message.fbs (or Schema.fbs) that

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Fan Liya
Sure. I agree with you that we should not overdo this. I am wondering if we should provide an option to allow users to plug in their customized compression strategies. Best, Liya Fan On Tue, Mar 3, 2020 at 9:47 PM Wes McKinney wrote: > On Tue, Mar 3, 2020, 7:36 AM Fan Liya wrote: > > > I am so

Arrow sync call March 4 at 12:00 US/Eastern, 17:00 UTC

2020-03-03 Thread Neal Richardson
Hi all, Reminder that our biweekly call is coming up tomorrow/later today at https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will be sent out to the mailing list afterward. Neal

[jira] [Created] (ARROW-7998) [C++][Plasma] Make Seal requests synchronous

2020-03-03 Thread Stephanie Wang (Jira)
Stephanie Wang created ARROW-7998: - Summary: [C++][Plasma] Make Seal requests synchronous Key: ARROW-7998 URL: https://issues.apache.org/jira/browse/ARROW-7998 Project: Apache Arrow Issue

[jira] [Created] (ARROW-7997) Schema equals method with inconsistent docs in pyarrow.

2020-03-03 Thread Jira
Otávio Vasques created ARROW-7997: - Summary: Schema equals method with inconsistent docs in pyarrow. Key: ARROW-7997 URL: https://issues.apache.org/jira/browse/ARROW-7997 Project: Apache Arrow

[jira] [Created] (ARROW-7996) Error serializing empty pandas DataFrame with pyarrow

2020-03-03 Thread Juan David Agudelo (Jira)
Juan David Agudelo created ARROW-7996: - Summary: Error serializing empty pandas DataFrame with pyarrow Key: ARROW-7996 URL: https://issues.apache.org/jira/browse/ARROW-7996 Project: Apache Arrow

[jira] [Created] (ARROW-7995) [C++] IO: coalescing and caching read ranges

2020-03-03 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7995: - Summary: [C++] IO: coalescing and caching read ranges Key: ARROW-7995 URL: https://issues.apache.org/jira/browse/ARROW-7995 Project: Apache Arrow Issue

Re: Coordinating / scheduling C++ Parquet-Arrow nested data work (ARROW-1644 and others)

2020-03-03 Thread Igor Calabria
Hi Micah, I actually got involved with another personal project and had to postpone my contribution to arrow a bit. The good news is that I'm almost done with it, so I could help you with the read side very soon. Any ideas how we could coordinate this? On Wed, Feb 26, 2020 at 21:06, Wes

[jira] [Created] (ARROW-7994) [CI][C++] Move AppVeyor MinGW builds to Github Actions

2020-03-03 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7994: - Summary: [CI][C++] Move AppVeyor MinGW builds to Github Actions Key: ARROW-7994 URL: https://issues.apache.org/jira/browse/ARROW-7994 Project: Apache Arrow

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Wes McKinney
On Tue, Mar 3, 2020, 7:36 AM Fan Liya wrote: > I am so glad to see this discussion, and I am willing to provide help from > the Java side. > > In the proposal, I see the support for basic compression strategies > (e.g. gzip, snappy). > IMO, applying a single basic strategy is not likely to

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Antoine Pitrou
Well, we shouldn't overdo this either. We are not trying to replicate the Parquet format. Regards Antoine. On 03/03/2020 at 14:36, Fan Liya wrote: > I am so glad to see this discussion, and I am willing to provide help from > the Java side. > > In the proposal, I see the support for

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Fan Liya
I am so glad to see this discussion, and I am willing to provide help from the Java side. In the proposal, I see the support for basic compression strategies (e.g. gzip, snappy). IMO, applying a single basic strategy is not likely to achieve performance improvement for most scenarios. The optimal

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Antoine Pitrou
If we want to use an HTTP header, it would be more of an Accept-Encoding header, no? In any case, we would have to put non-standard values there (e.g. lz4), so I'm not sure how desirable it is to repurpose HTTP headers for that, rather than add some dedicated field to the Flight messages.

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread David Li
gRPC supports headers, so for Flight we could send essentially an Accept header and perhaps a Content-Type header. David On Mon, Mar 2, 2020, 23:15 Micah Kornfield wrote: > Hi Wes, > A few thoughts on this. In general, I think it is a good idea. But before > proceeding, I think the following