Re: [DRAFT] Apache Arrow Board Report - October 2019

2019-10-09 Thread Wes McKinney
On Thu, Oct 10, 2019 at 12:22 AM Jacques Nadeau wrote:
> I'm not dismissing that there are issues, but I also don't feel like there has been constant discussion for months on the list that INFRA is not being responsive to Arrow community requests. It seems like you might be saying a couple

Re: [DRAFT] Apache Arrow Board Report - October 2019

2019-10-09 Thread Jacques Nadeau
I'm not dismissing that there are issues, but I also don't feel like there has been constant discussion for months on the list that INFRA is not being responsive to Arrow community requests. It seems like you might be saying one of two things (or both?): 1) The Arrow

Re: [DISCUSS] Proposal about integration test of arrow parquet reader

2019-10-09 Thread Renjie Liu
It would be fine in that case.
On Thu, Oct 10, 2019 at 12:58 PM Wes McKinney wrote:
> On Wed, Oct 9, 2019 at 10:16 PM Renjie Liu wrote:
> > 1. A low-level Parquet writer that can produce Parquet files already exists, so unit tests should be fine. But a writer from Arrow to Parquet

Re: [DISCUSS] Proposal about integration test of arrow parquet reader

2019-10-09 Thread Wes McKinney
On Wed, Oct 9, 2019 at 10:16 PM Renjie Liu wrote:
> 1. A low-level Parquet writer that can produce Parquet files already exists, so unit tests should be fine. But a writer from Arrow to Parquet doesn't exist yet, and it may take some time to finish.
> 2. In fact, my data are

Re: Looking ahead to 1.0

2019-10-09 Thread Wes McKinney
Hi John, Since the 1.0.0 release is focused on Format stability, probably the only real "blockers" will be ensuring that we have hardened multiple implementations (in particular C++ and Java) of the columnar format as specified with integration tests to prove it. The issues you listed sound more

Re: [DRAFT] Apache Arrow Board Report - October 2019

2019-10-09 Thread Wes McKinney
Hi Jacques, I think we need to share the concerns that many PMC members have over the constraints that INFRA is placing on us. Can we rephrase the concern in a way that is more helpful? Firstly, I respect and appreciate the ASF's desire to limit write access to committers only from an IP

[jira] [Created] (ARROW-6844) List columns read broken with 0.15.0

2019-10-09 Thread Benoit Rostykus (Jira)
Benoit Rostykus created ARROW-6844: Summary: List columns read broken with 0.15.0 Key: ARROW-6844 URL: https://issues.apache.org/jira/browse/ARROW-6844 Project: Apache Arrow Issue Type:

Re: [DISCUSS] Proposal about integration test of arrow parquet reader

2019-10-09 Thread Renjie Liu
1. A low-level Parquet writer that can produce Parquet files already exists, so unit tests should be fine. But a writer from Arrow to Parquet doesn't exist yet, and it may take some time to finish. 2. In fact, my data are randomly generated and definitely reproducible. However,

Re: [DRAFT] Apache Arrow Board Report - October 2019

2019-10-09 Thread Jacques Nadeau
I think we need to be more direct in listing issues for the board. What have we done? What do we want them to do? In general, any large org is going to be slow to add new deep integrations into GitHub. I don't think we should expect Apache to be any different (it took several years before we could

[jira] [Created] (ARROW-6843) [Website] Disable deploy on pull request

2019-10-09 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-6843: Summary: [Website] Disable deploy on pull request Key: ARROW-6843 URL: https://issues.apache.org/jira/browse/ARROW-6843 Project: Apache Arrow Issue Type:

Re: Can't find myself in contributor list

2019-10-09 Thread Hengruo Zhang
Got it. 6408 was reverted. That makes sense.
On Wed, Oct 9, 2019 at 3:19 PM Wes McKinney wrote:
> I'm seeing
>
> $ git hist | grep Hengruo
> * f9cd2958a 2019-10-09 | ARROW-6274: [Rust] [DataFusion] Add support for writing results to CSV [Hengruo Zhang]
> * 3145e9bef 2019-09-08 | ARROW-6408:

[jira] [Created] (ARROW-6842) [Website] Jekyll error building website

2019-10-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6842: Summary: [Website] Jekyll error building website Key: ARROW-6842 URL: https://issues.apache.org/jira/browse/ARROW-6842 Project: Apache Arrow Issue Type: Bug

Re: Can't find myself in contributor list

2019-10-09 Thread Wes McKinney
I'm seeing
$ git hist | grep Hengruo
* f9cd2958a 2019-10-09 | ARROW-6274: [Rust] [DataFusion] Add support for writing results to CSV [Hengruo Zhang]
* 3145e9bef 2019-09-08 | ARROW-6408: [Rust] use "if cfg!" pattern [Hengruo Zhang]
So there's only 1 commit in the last month. This doesn't appear
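The check quoted above can be sketched in a few lines of Python. This is only an illustration under assumptions: `git hist` is taken to be a local alias for a pretty-formatted `git log` that prints `* <hash> <date> | <subject> [<author>]`, and the 30-day window is a guess at what "the last month" means here, not GitHub's documented rule for the Pulse view. The helper names `parse_hist` and `commits_since` are hypothetical.

```python
import re
from collections import Counter
from datetime import date, timedelta

# Sample lines copied from the `git hist` output quoted in the thread.
SAMPLE = """\
* f9cd2958a 2019-10-09 | ARROW-6274: [Rust] [DataFusion] Add support for writing results to CSV [Hengruo Zhang]
* 3145e9bef 2019-09-08 | ARROW-6408: [Rust] use "if cfg!" pattern [Hengruo Zhang]
"""

# Assumed line layout: "* <hash> <YYYY-MM-DD> | <subject> [<author>]".
# The author is taken from the last bracketed group on the line.
LINE_RE = re.compile(
    r"^\*\s+(?P<sha>[0-9a-f]+)\s+(?P<day>\d{4}-\d{2}-\d{2})\s+\|\s+"
    r"(?P<subject>.*)\s+\[(?P<author>[^\]]+)\]\s*$"
)


def parse_hist(text):
    """Return (sha, commit date, author) tuples for lines matching the layout."""
    commits = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that does not look like a commit line
        commits.append((m["sha"], date.fromisoformat(m["day"]), m["author"]))
    return commits


def commits_since(commits, today, days):
    """Count commits per author dated within `days` days before `today`."""
    cutoff = today - timedelta(days=days)
    return Counter(author for _, day, author in commits if day >= cutoff)


if __name__ == "__main__":
    commits = parse_hist(SAMPLE)
    recent = commits_since(commits, today=date(2019, 10, 9), days=30)
    print(recent["Hengruo Zhang"])
```

With the two sample lines and a reference date of 2019-10-09, only the ARROW-6274 commit falls inside the 30-day window, which matches the "only 1 commit" observation; widening the window to 60 days picks up both.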

Re: Can't find myself in contributor list

2019-10-09 Thread paddy horan
It might also be due to our merge tool. PRs are merged locally and pushed to master (with the corresponding PR on GitHub being “closed” rather than “merged”). This might not be reflected in the Pulse view.
P
From: Wes McKinney
Sent: Wednesday, October 9,

[jira] [Created] (ARROW-6841) [C++] Upgrade to LLVM 8

2019-10-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6841: Summary: [C++] Upgrade to LLVM 8 Key: ARROW-6841 URL: https://issues.apache.org/jira/browse/ARROW-6841 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-6840) [C++/Python] retrieve fd of open memory mapped file and Open() memory mapped file by fd

2019-10-09 Thread John Muehlhausen (Jira)
John Muehlhausen created ARROW-6840: Summary: [C++/Python] retrieve fd of open memory mapped file and Open() memory mapped file by fd Key: ARROW-6840 URL: https://issues.apache.org/jira/browse/ARROW-6840

[jira] [Created] (ARROW-6839) [Java] access File Footer custom_metadata

2019-10-09 Thread John Muehlhausen (Jira)
John Muehlhausen created ARROW-6839: Summary: [Java] access File Footer custom_metadata Key: ARROW-6839 URL: https://issues.apache.org/jira/browse/ARROW-6839 Project: Apache Arrow Issue

[jira] [Created] (ARROW-6838) [JS] access File Footer custom_metadata

2019-10-09 Thread John Muehlhausen (Jira)
John Muehlhausen created ARROW-6838: Summary: [JS] access File Footer custom_metadata Key: ARROW-6838 URL: https://issues.apache.org/jira/browse/ARROW-6838 Project: Apache Arrow Issue

[jira] [Created] (ARROW-6837) [C++/Python] access File Footer custom_metadata

2019-10-09 Thread John Muehlhausen (Jira)
John Muehlhausen created ARROW-6837: Summary: [C++/Python] access File Footer custom_metadata Key: ARROW-6837 URL: https://issues.apache.org/jira/browse/ARROW-6837 Project: Apache Arrow

[jira] [Created] (ARROW-6836) [Format] add a custom_metadata:[KeyValue] field to the Footer table in File.fbs

2019-10-09 Thread John Muehlhausen (Jira)
John Muehlhausen created ARROW-6836: Summary: [Format] add a custom_metadata:[KeyValue] field to the Footer table in File.fbs Key: ARROW-6836 URL: https://issues.apache.org/jira/browse/ARROW-6836

[jira] [Created] (ARROW-6835) [Archery][CMake] Restore ARROW_LINT_ONLY

2019-10-09 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-6835: Summary: [Archery][CMake] Restore ARROW_LINT_ONLY Key: ARROW-6835 URL: https://issues.apache.org/jira/browse/ARROW-6835 Project: Apache Arrow

Re: Can't find myself in contributor list

2019-10-09 Thread Wes McKinney
GitHub only shows the top 100 contributors to the project in https://github.com/apache/arrow/graphs/contributors. Similarly, I think you need more commits to show up in the Pulse view.
On Wed, Oct 9, 2019 at 2:58 PM Hengruo Zhang wrote:
> Hi,
>
> My two PRs have been already merged to the

[jira] [Created] (ARROW-6834) [C++] Appveyor build failing on master

2019-10-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6834: Summary: [C++] Appveyor build failing on master Key: ARROW-6834 URL: https://issues.apache.org/jira/browse/ARROW-6834 Project: Apache Arrow Issue Type: Bug

Looking ahead to 1.0

2019-10-09 Thread Neal Richardson
Congratulations everyone on 0.15! I know a lot of hard work went into it, not only in the software itself but also in the build and release process. Once you've caught your breath from the release, we should start thinking about what's in scope for our next release, the big 1.0. To get us started

[jira] [Created] (ARROW-6833) [R][CI] Add crossbow job for full R autobrew macOS build

2019-10-09 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6833: Summary: [R][CI] Add crossbow job for full R autobrew macOS build Key: ARROW-6833 URL: https://issues.apache.org/jira/browse/ARROW-6833 Project: Apache Arrow

[jira] [Created] (ARROW-6832) [R] Implement Codec::IsAvailable

2019-10-09 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6832: Summary: [R] Implement Codec::IsAvailable Key: ARROW-6832 URL: https://issues.apache.org/jira/browse/ARROW-6832 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-6831) [R] Update R macOS/Windows builds for change in cmake compression defaults

2019-10-09 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6831: Summary: [R] Update R macOS/Windows builds for change in cmake compression defaults Key: ARROW-6831 URL: https://issues.apache.org/jira/browse/ARROW-6831

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-09-0

2019-10-09 Thread Neal Richardson
FWIW there appears to have been a recent update to grpc on Homebrew involving protobuf: https://github.com/Homebrew/homebrew-core/commits/master/Formula/grpc.rb Last time we had a Homebrew grpc issue, I made this at Kou's suggestion: https://github.com/Homebrew/homebrew-core/pull/44198 I think

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-09-0

2019-10-09 Thread Wes McKinney
It looks like protobuf and other gRPC dependencies are being built from source when doing `brew install grpc`. This is probably an issue with the Homebrew stack. Do we know how to address this situation now and in the future (probably requires asking the Homebrew community about grpc "bottles")?

[jira] [Created] (ARROW-6830) Question / Feature Request- Select Subset of Columns in read_arrow

2019-10-09 Thread Anthony Abate (Jira)
Anthony Abate created ARROW-6830: Summary: Question / Feature Request- Select Subset of Columns in read_arrow Key: ARROW-6830 URL: https://issues.apache.org/jira/browse/ARROW-6830 Project: Apache

[jira] [Created] (ARROW-6829) [Docs] Migrate integration test docs to Sphinx, fix instructions after ARROW-6466

2019-10-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6829: Summary: [Docs] Migrate integration test docs to Sphinx, fix instructions after ARROW-6466 Key: ARROW-6829 URL: https://issues.apache.org/jira/browse/ARROW-6829

[jira] [Created] (ARROW-6827) [Archery] lint sub-command should provide a --fail-fast option

2019-10-09 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-6827: Summary: [Archery] lint sub-command should provide a --fail-fast option Key: ARROW-6827 URL: https://issues.apache.org/jira/browse/ARROW-6827

[jira] [Created] (ARROW-6828) [Archery] Benchmark diff should provide a TUI friendly output

2019-10-09 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-6828: Summary: [Archery] Benchmark diff should provide a TUI friendly output Key: ARROW-6828 URL: https://issues.apache.org/jira/browse/ARROW-6828

[NIGHTLY] Arrow Build Report for Job nightly-2019-10-09-0

2019-10-09 Thread Crossbow
Arrow Build Report for Job nightly-2019-10-09-0
All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-09-0
Failed Tasks:
- gandiva-jar-trusty: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-09-0-travis-gandiva-jar-trusty
-

[jira] [Created] (ARROW-6826) [Archery] Default build should be minimal

2019-10-09 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-6826: Summary: [Archery] Default build should be minimal Key: ARROW-6826 URL: https://issues.apache.org/jira/browse/ARROW-6826 Project: Apache Arrow

[jira] [Created] (ARROW-6825) [C++] Rework CSV reader IO around readahead iterator

2019-10-09 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6825: Summary: [C++] Rework CSV reader IO around readahead iterator Key: ARROW-6825 URL: https://issues.apache.org/jira/browse/ARROW-6825 Project: Apache Arrow

[jira] [Created] (ARROW-6823) [C++][Python][R] Support metadata in the feather format?

2019-10-09 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6823: Summary: [C++][Python][R] Support metadata in the feather format? Key: ARROW-6823 URL: https://issues.apache.org/jira/browse/ARROW-6823 Project:

Re: Table.cast throws ArrowNotImplementedError (pyarrow==0.15.0)

2019-10-09 Thread Joris Van den Bossche
Hi Lucas, Do you have a small code example? Trying the following worked in pyarrow 0.14, and still seems to work now:
In [1]: table = pa.table({'a': [1, 2, 3]})
In [2]: table
Out[2]:
pyarrow.Table
a: int64
In [3]: table.cast(pa.schema([('a', pa.int32())]))
Out[3]:
pyarrow.Table
a: int32
In