[jira] [Created] (ARROW-6790) [Release] Automatically disable integration test cases in release verification

2019-10-03 Thread Bryan Cutler (Jira)
Bryan Cutler created ARROW-6790: --- Summary: [Release] Automatically disable integration test cases in release verification Key: ARROW-6790 URL: https://issues.apache.org/jira/browse/ARROW-6790 Project: A

Re: Docker organization for development images

2019-10-03 Thread Sutou Kouhei
https://hub.docker.com/u/ktou In "Docker organization for development images" on Thu, 3 Oct 2019 15:10:25 +0200, Krisztián Szűcs wrote: > Hi, > > We've created a docker hub organisation called "arrowdev" > to host the images defined in the docker-compose.yml, see > the following commit [1

[jira] [Created] (ARROW-6789) [Python] Automatically box bytes/buffer-like values yielded from `FlightServerBase.do_action` in Result values

2019-10-03 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6789: --- Summary: [Python] Automatically box bytes/buffer-like values yielded from `FlightServerBase.do_action` in Result values Key: ARROW-6789 URL: https://issues.apache.org/jira/browse/AR

Re: uncertain about JIRA issue granularity

2019-10-03 Thread Micah Kornfield
Hi John, It depends on what the change encompasses. If it affects the format then it would be nice to have tracking bugs in all languages to implement the feature (i.e. adding data to the footer). If it is an implementation specific feature then only the target languages need to be implemented (i

Re: uncertain about JIRA issue granularity

2019-10-03 Thread John Muehlhausen
I thought I should open all of the issues for tracking even if I don't implement all of them right away? On Thu, Oct 3, 2019 at 5:46 PM Antoine Pitrou wrote: > > On 04/10/2019 at 00:18, John Muehlhausen wrote: > > I need to create two (or more) issues for > > custom_metadata in Footer ... >

Re: arrow::io::MemoryMappedFile from fd rather than path

2019-10-03 Thread Antoine Pitrou
On 04/10/2019 at 00:31, John Muehlhausen wrote: > http://lackingrhoticity.blogspot.com/2015/05/passing-fds-handles-between-processes.html > > If I'm reading this correctly, it doesn't affect our Open(fd) API on > Windows, but only how descriptors are communicated between processes that > want

Re: uncertain about JIRA issue granularity

2019-10-03 Thread Antoine Pitrou
On 04/10/2019 at 00:18, John Muehlhausen wrote: > I need to create two (or more) issues for > custom_metadata in Footer ... > https://lists.apache.org/thread.html/c3b3d1456b7062a435f6795c0308ccb7c8fe55c818cfed2cf55f76c5@%3Cdev.arrow.apache.org%3E > > and > memory map based on fd ... > http

Re: arrow::io::MemoryMappedFile from fd rather than path

2019-10-03 Thread John Muehlhausen
http://lackingrhoticity.blogspot.com/2015/05/passing-fds-handles-between-processes.html If I'm reading this correctly, it doesn't affect our Open(fd) API on Windows, but only how descriptors are communicated between processes that want to make use of it. On Thu, Oct 3, 2019 at 4:24 PM Antoine Pit

uncertain about JIRA issue granularity

2019-10-03 Thread John Muehlhausen
I need to create two (or more) issues for custom_metadata in Footer ... https://lists.apache.org/thread.html/c3b3d1456b7062a435f6795c0308ccb7c8fe55c818cfed2cf55f76c5@%3Cdev.arrow.apache.org%3E and memory map based on fd ... https://lists.apache.org/thread.html/83373ab00f552ee8afd2bac2b2721468b

[jira] [Created] (ARROW-6788) [CI] Migrate Travis CI lint job to GitHub Actions

2019-10-03 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6788: --- Summary: [CI] Migrate Travis CI lint job to GitHub Actions Key: ARROW-6788 URL: https://issues.apache.org/jira/browse/ARROW-6788 Project: Apache Arrow Issue Ty

Re: arrow::io::MemoryMappedFile from fd rather than path

2019-10-03 Thread Antoine Pitrou
On 03/10/2019 at 23:21, John Muehlhausen wrote: > > Would we just make a variant of Open() that takes a fd rather than a path? That sounds like a good idea. Would you like to open a JIRA and a PR? > Would this API have any analogy on Windows? Do we have platform-specific > functionality?

[jira] [Created] (ARROW-6787) [CI] Decommission "C++ with clang 7 and system packages" Travis CI job

2019-10-03 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6787: --- Summary: [CI] Decommission "C++ with clang 7 and system packages" Travis CI job Key: ARROW-6787 URL: https://issues.apache.org/jira/browse/ARROW-6787 Project: Apache Ar

arrow::io::MemoryMappedFile from fd rather than path

2019-10-03 Thread John Muehlhausen
I have a situation where multiple processes need to access a memory mapped file. However, between the time the first process maps the file and the time a subsequent process in the group maps the file, the file may have been removed from the filesystem. (I.e. has no "path") Coordinating the cache
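
The scenario described here (mapping a file that no longer has a path) can be illustrated with a minimal sketch. It uses only the Python standard library, not Arrow's `MemoryMappedFile` API, and assumes POSIX semantics where an open descriptor stays valid after the file is unlinked. The helper name `map_unlinked_file` is hypothetical.

```python
import mmap
import os
import tempfile


def map_unlinked_file(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink it, then map it via the fd alone."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        # After unlink the file has no path, but the descriptor is still valid
        # (POSIX behaviour; this is an assumption and does not hold on Windows).
        os.unlink(path)
        # The size is recovered from the descriptor, not from the filesystem path.
        size = os.fstat(fd).st_size
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as view:
            return bytes(view[:])
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(map_unlinked_file(b"arrow"))
```

In a multi-process setup the descriptor itself would have to be handed to the other processes (for example by inheritance or over a Unix domain socket); that part is outside this sketch.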

[jira] [Created] (ARROW-6786) [C++] arrow-dataset-file-parquet-test is slow

2019-10-03 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6786: - Summary: [C++] arrow-dataset-file-parquet-test is slow Key: ARROW-6786 URL: https://issues.apache.org/jira/browse/ARROW-6786 Project: Apache Arrow Issue Ty

Re: Collecting Arrow critique and our roadmap on that

2019-10-03 Thread Bryan Cutler
A lot of good info here, I added a point that has come up often for me. On Thu, Oct 3, 2019 at 10:03 AM Wes McKinney wrote: > I read through and left some comments. > > Would be great to turn into an FAQ section in the docs and add a link > to the navigation on the front page of the website. > >

[jira] [Created] (ARROW-6785) [JS] Remove superfluous child assignment

2019-10-03 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6785: --- Summary: [JS] Remove superfluous child assignment Key: ARROW-6785 URL: https://issues.apache.org/jira/browse/ARROW-6785 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-6784) [C++][R] Move filter, take, select C++ code from Rcpp to C++ library

2019-10-03 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6784: -- Summary: [C++][R] Move filter, take, select C++ code from Rcpp to C++ library Key: ARROW-6784 URL: https://issues.apache.org/jira/browse/ARROW-6784 Project: Apach

Re: Docker organization for development images

2019-10-03 Thread Bryan Cutler
Sounds good, thanks Krisztian! On Thu, Oct 3, 2019 at 6:10 AM Krisztián Szűcs wrote: > Hi, > > We've created a docker hub organisation called "arrowdev" > to host the images defined in the docker-compose.yml, see > the following commit [1]. > So now it is possible to speed up the image builds by

Re: Collecting Arrow critique and our roadmap on that

2019-10-03 Thread Wes McKinney
I read through and left some comments. Would be great to turn into an FAQ section in the docs and add a link to the navigation on the front page of the website. On Mon, Sep 23, 2019 at 1:22 PM Uwe L. Korn wrote: > > Thanks for all the contributions that already came in. I made some more > additi

[jira] [Created] (ARROW-6783) [C++] Provide API for reconstruction of RecordBatch from Flatbuffer containing process memory addresses instead of relative offsets into an IPC message

2019-10-03 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6783: --- Summary: [C++] Provide API for reconstruction of RecordBatch from Flatbuffer containing process memory addresses instead of relative offsets into an IPC message Key: ARROW-6783 URL

[jira] [Created] (ARROW-6782) [C++] Build minimal core Arrow libraries without any Boost headers

2019-10-03 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6782: --- Summary: [C++] Build minimal core Arrow libraries without any Boost headers Key: ARROW-6782 URL: https://issues.apache.org/jira/browse/ARROW-6782 Project: Apache Arrow

Re: [DISCUSS] raw pointers and FFI (C-level in-process array protocol)

2019-10-03 Thread Wes McKinney
Related: Gandiva invented its own particular way of passing memory addresses through the JNI boundary rather than using Flatbuffers messages https://github.com/apache/arrow/blob/master/cpp/src/gandiva/jni/jni_common.cc#L505 I'm all for language-agnostic in-memory data passing, but there is a use

[jira] [Created] (ARROW-6781) [C++] Improve and consolidate ARROW_CHECK, DCHECK macros

2019-10-03 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-6781: --- Summary: [C++] Improve and consolidate ARROW_CHECK, DCHECK macros Key: ARROW-6781 URL: https://issues.apache.org/jira/browse/ARROW-6781 Project: Apache Arrow I

[NIGHTLY] Arrow Build Report for Job nightly-2019-10-03-0

2019-10-03 Thread Crossbow
Arrow Build Report for Job nightly-2019-10-03-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-03-0 Failed Tasks: - wheel-manylinux1-cp37m: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-03-0-travis-wheel-manylinux1-cp37m -

[jira] [Created] (ARROW-6780) [C++][Parquet] Support DurationType in writing/reading parquet

2019-10-03 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6780: Summary: [C++][Parquet] Support DurationType in writing/reading parquet Key: ARROW-6780 URL: https://issues.apache.org/jira/browse/ARROW-6780 Project:

[jira] [Created] (ARROW-6779) [Python] Conversion from datetime.datetime to timestamp('ns') can overflow

2019-10-03 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6779: Summary: [Python] Conversion from datetime.datetime to timestamp('ns') can overflow Key: ARROW-6779 URL: https://issues.apache.org/jira/browse/ARROW-6779
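
The arithmetic behind this kind of overflow can be sketched in plain Python. This is not pyarrow's conversion code; it only assumes that a nanosecond timestamp is stored as a signed 64-bit count of nanoseconds since the Unix epoch, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)


def to_nanoseconds(value: datetime) -> int:
    """Exact nanoseconds since the epoch, using Python's unbounded ints."""
    micros = (value - EPOCH) // timedelta(microseconds=1)
    return micros * 1000


def fits_in_int64_ns(value: datetime) -> bool:
    """True if the value is representable as a signed 64-bit nanosecond count."""
    return INT64_MIN <= to_nanoseconds(value) <= INT64_MAX


if __name__ == "__main__":
    for candidate in (datetime(2019, 10, 3), datetime(2300, 1, 1)):
        print(candidate, fits_in_int64_ns(candidate))
```

A signed 64-bit nanosecond count covers roughly 292 years on either side of 1970, while `datetime.datetime` spans years 1 through 9999, so most of the `datetime` range falls outside it.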

[jira] [Created] (ARROW-6778) [C++] Support DurationType in Cast kernel

2019-10-03 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6778: Summary: [C++] Support DurationType in Cast kernel Key: ARROW-6778 URL: https://issues.apache.org/jira/browse/ARROW-6778 Project: Apache Arrow

Docker organization for development images

2019-10-03 Thread Krisztián Szűcs
Hi, We've created a docker hub organisation called "arrowdev" to host the images defined in the docker-compose.yml, see the following commit [1]. So now it is possible to speed up the image builds by pulling the layers first, I suggest using the --pull flag for building images: `docker-compose bu

Re: Clarifying interpretation of Buffer "length" field in Arrow protocol

2019-10-03 Thread Wes McKinney
On Thu, Oct 3, 2019 at 7:33 AM Antoine Pitrou wrote: > > > On 03/10/2019 at 14:22, Wes McKinney wrote: > > On Thu, Oct 3, 2019 at 4:26 AM Antoine Pitrou wrote: > >> > >> > >> Yeah, I think the spec should be strict. And for convenience, I'd say > >> it should probably be the padded length (tho

Re: Clarifying interpretation of Buffer "length" field in Arrow protocol

2019-10-03 Thread Antoine Pitrou
On 03/10/2019 at 14:22, Wes McKinney wrote: > On Thu, Oct 3, 2019 at 4:26 AM Antoine Pitrou wrote: >> >> >> Yeah, I think the spec should be strict. And for convenience, I'd say >> it should probably be the padded length (though I don't have a strong >> opinion). > > The reason I'm against t

Re: Clarifying interpretation of Buffer "length" field in Arrow protocol

2019-10-03 Thread Wes McKinney
On Thu, Oct 3, 2019 at 4:26 AM Antoine Pitrou wrote: > > > Yeah, I think the spec should be strict. And for convenience, I'd say > it should probably be the padded length (though I don't have a strong > opinion). The reason I'm against this is that it makes it impossible for a producer to preser
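
The distinction between an unpadded and a padded buffer length can be shown with a short sketch. The message above is cut off, so this does not reconstruct its argument; it only illustrates that rounding a length up to an alignment boundary is lossy. The 8-byte alignment is an assumption for the example, and `padded_length` is a hypothetical helper, not an Arrow function.

```python
def padded_length(actual: int, alignment: int = 8) -> int:
    """Round a byte length up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (actual + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    # Several distinct unpadded lengths map to the same padded length,
    # so the unpadded value cannot be recovered from the padded one.
    for actual in (9, 13, 16):
        print(actual, "->", padded_length(actual))
```

If only the padded value is recorded, a reader cannot tell whether the producer's buffer held 9, 13 or 16 meaningful bytes.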

Re: [DISCUSS] Result vs Status

2019-10-03 Thread Antoine Pitrou
On 03/10/2019 at 06:13, Micah Kornfield wrote: > > It was my impression that we had workable solutions for using Result in at > least Python and Glib/Ruby (I don't know about R). In Python we do (though it needed a C++-side helper). Regards Antoine.

Re: [DISCUSS] raw pointers and FFI (C-level in-process array protocol)

2019-10-03 Thread Antoine Pitrou
Hi Jacques, On 03/10/2019 at 02:46, Jacques Nadeau wrote: > > I think it is reasonable to argue that keeping any ABI (or header/struct > pattern) as narrow as possible would allow us to minimize overlap with the > existing in-memory specification. In Arrow's case, this could be as simple > as

Re: Clarifying interpretation of Buffer "length" field in Arrow protocol

2019-10-03 Thread Antoine Pitrou
Yeah, I think the spec should be strict. And for convenience, I'd say it should probably be the padded length (though I don't have a strong opinion). Regards Antoine. On 03/10/2019 at 06:23, Micah Kornfield wrote: > Hi Wes, > It seems fine to be flexible here. However: > > >> This could