[jira] [Created] (ARROW-3721) [Gandiva] [Python] Support all Gandiva literals

2018-11-08 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-3721: Summary: [Gandiva] [Python] Support all Gandiva literals Key: ARROW-3721 URL: https://issues.apache.org/jira/browse/ARROW-3721 Project: Apache Arrow Issue

[jira] [Created] (ARROW-3722) [C++] Allow specifying column types to CSV reader

2018-11-08 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-3722: Summary: [C++] Allow specifying column types to CSV reader Key: ARROW-3722 URL: https://issues.apache.org/jira/browse/ARROW-3722 Project: Apache Arrow Issu

[jira] [Created] (ARROW-3718) [Gandiva] Remove spurious gtest include

2018-11-08 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-3718: Summary: [Gandiva] Remove spurious gtest include Key: ARROW-3718 URL: https://issues.apache.org/jira/browse/ARROW-3718 Project: Apache Arrow Issue Type: Im

Creating Buffer directly from pointer/length

2018-11-08 Thread Randy Zwitch
Within OmniSci (MapD), we have the following code that takes a pointer and length and reads to a NumPy array before calling py_buffer: https://github.com/omnisci/pymapd/blob/master/pymapd/shm.pyx#L31-L52 Is it possible to eliminate the NumPy step and go directly to an Arrow buffer? There is bo

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Antoine Pitrou
You should be able to use pa.foreign_buffer(): https://arrow.apache.org/docs/python/generated/pyarrow.foreign_buffer.html#pyarrow.foreign_buffer Regards, Antoine. On 08/11/2018 at 18:49, Randy Zwitch wrote: > Within OmniSci (MapD), we have the following code that takes a pointer > and lengt

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Wes McKinney
Yes, see pyarrow.foreign_buffer. If this isn't in the documentation, could you open a JIRA to fix that? Thanks, Wes On Thu, Nov 8, 2018, 11:53 AM Randy Zwitch wrote: > Within OmniSci (MapD), we have the following code that takes a pointer > and length and reads to a NumPy array before calling py_buffer: >

[jira] [Created] (ARROW-3727) [Python] Document use of pyarrow.foreign_buffer in Sphinx

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3727: Summary: [Python] Document use of pyarrow.foreign_buffer in Sphinx Key: ARROW-3727 URL: https://issues.apache.org/jira/browse/ARROW-3727 Project: Apache Arrow

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Pearu Peterson
Hi, For host memory, you can use pyarrow.foreign_buffer, see https://arrow.apache.org/docs/python/generated/pyarrow.foreign_buffer.html For device memory, one can use pyarrow.cuda.foreign_buffer. HTH, Pearu On Thu, Nov 8, 2018 at 7:53 PM Randy Zwitch wrote: > Within OmniSci (MapD), we have

[jira] [Created] (ARROW-3726) [Rust] CSV Reader & Writer

2018-11-08 Thread nevi_me (JIRA)
nevi_me created ARROW-3726: Summary: [Rust] CSV Reader & Writer Key: ARROW-3726 URL: https://issues.apache.org/jira/browse/ARROW-3726 Project: Apache Arrow Issue Type: New Feature Compone

[jira] [Created] (ARROW-3728) Merging Parquet Files - Pandas Meta in Schema Mismatch

2018-11-08 Thread Micah Williamson (JIRA)
Micah Williamson created ARROW-3728: Summary: Merging Parquet Files - Pandas Meta in Schema Mismatch Key: ARROW-3728 URL: https://issues.apache.org/jira/browse/ARROW-3728 Project: Apache Arrow

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Uwe L. Korn
Hello Randy, you are looking for https://arrow.apache.org/docs/python/generated/pyarrow.foreign_buffer.html#pyarrow.foreign_buffer This takes an address, a size, and a Python object that holds a reference to the owner of the memory. In your case the last one can be None. Note that this will not do a copy and

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Randy Zwitch
Thanks Uwe, Wes, Pearu and Antoine. This is in the pyarrow docs, but without an example, so I'll open a JIRA so that it is more obvious to the next person. On 11/8/18 12:59 PM, Uwe L. Korn wrote: Hello Randy, you are looking for https://arrow.apache.org/docs/python/generated/pyarrow.foreign_b

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Wes McKinney
I opened https://issues.apache.org/jira/browse/ARROW-3727 about adding examples. I will also mention adding an example for CUDA. On Thu, Nov 8, 2018 at 2:30 PM Randy Zwitch wrote: > > Thanks Uwe, Wes, Pearu and Antoine. This is in the pyarrow docs, but no > example, so I'll open up a JIRA so that i

[ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Wes McKinney
The Project Management Committee (PMC) for Apache Arrow has invited Krisztián Szűcs to become a PMC member and we are pleased to announce that he has accepted. Congratulations and welcome, Krisztián!

[jira] [Created] (ARROW-3724) [GLib] Update gitignore

2018-11-08 Thread Yosuke Shiro (JIRA)
Yosuke Shiro created ARROW-3724: Summary: [GLib] Update gitignore Key: ARROW-3724 URL: https://issues.apache.org/jira/browse/ARROW-3724 Project: Apache Arrow Issue Type: Improvement

[ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Wes McKinney
On behalf of the Arrow PMC, I'm happy to announce that Romain François, Sebastien Binet, and Yosuke Shiro have been invited to be committers on the project. Welcome, and thanks for your contributions!

Re: [ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Philipp Moritz
Congrats and welcome Krisztián! On Thu, Nov 8, 2018 at 11:48 AM Wes McKinney wrote: > The Project Management Committee (PMC) for Apache Arrow has invited > Krisztián Szűcs to become a PMC member and we are pleased to announce > that he has accepted. > > Congratulations and welcome, Krisztián! >

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Antoine Pitrou
It's nice to have new people onboard. Welcome everyone :-) On 08/11/2018 at 20:56, Wes McKinney wrote: > On behalf of the Arrow PMC, I'm happy to announce that Romain > François, Sebastien Binet, and Yosuke Shiro have been invited to be > committers on the project. > > Welcome, and thanks fo

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Philipp Moritz
Welcome everybody! On Thu, Nov 8, 2018 at 12:57 PM Antoine Pitrou wrote: > > It's nice to have new people onboard. Welcome everyone :-) > > On 08/11/2018 at 20:56, Wes McKinney wrote: > > On behalf of the Arrow PMC, I'm happy to announce that Romain > > François, Sebastien Binet, and Yosuke S

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Uwe L. Korn
Welcome to all of you! On Thu, Nov 8, 2018, at 8:56 PM, Wes McKinney wrote: > On behalf of the Arrow PMC, I'm happy to announce that Romain > François, Sebastien Binet, and Yosuke Shiro have been invited to be > committers on the project. > > Welcome, and thanks for your contributions!

Re: [ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Uwe L. Korn
Congratulations Krisztián! On Thu, Nov 8, 2018, at 9:56 PM, Philipp Moritz wrote: > Congrats and welcome Krisztián! > > On Thu, Nov 8, 2018 at 11:48 AM Wes McKinney wrote: > > > The Project Management Committee (PMC) for Apache Arrow has invited > > Krisztián Szűcs to become a PMC member and we

[jira] [Created] (ARROW-3725) [GLib] Add field readers to GArrowStructDataType

2018-11-08 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3725: Summary: [GLib] Add field readers to GArrowStructDataType Key: ARROW-3725 URL: https://issues.apache.org/jira/browse/ARROW-3725 Project: Apache Arrow Issue Typ

[jira] [Created] (ARROW-3723) [Plasma] [Ruby] Add Ruby bindings of Plasma

2018-11-08 Thread Yosuke Shiro (JIRA)
Yosuke Shiro created ARROW-3723: Summary: [Plasma] [Ruby] Add Ruby bindings of Plasma Key: ARROW-3723 URL: https://issues.apache.org/jira/browse/ARROW-3723 Project: Apache Arrow Issue Type: Ne

[jira] [Created] (ARROW-3720) [GLib] Use "indices" instead of "indexes"

2018-11-08 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3720: Summary: [GLib] Use "indices" instead of "indexes" Key: ARROW-3720 URL: https://issues.apache.org/jira/browse/ARROW-3720 Project: Apache Arrow Issue Type: Impr

[jira] [Created] (ARROW-3717) Add GCSFSWrapper for DaskFileSystem

2018-11-08 Thread Emmett McQuinn (JIRA)
Emmett McQuinn created ARROW-3717: Summary: Add GCSFSWrapper for DaskFileSystem Key: ARROW-3717 URL: https://issues.apache.org/jira/browse/ARROW-3717 Project: Apache Arrow Issue Type: New Fe

Re: Assign/update : NA bitmap vs sentinel

2018-11-08 Thread Wes McKinney
hey Matt, Thanks for giving your perspective on the mailing list. My objective in writing about this recently (http://wesmckinney.com/blog/bitmaps-vs-sentinel-values/, though I need to update since the sentinel case can be done more efficiently than what's there now) was to help dispel the notion

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Li Jin
Welcome! On Thu, Nov 8, 2018 at 4:01 PM Uwe L. Korn wrote: > Welcome to all of you! > > On Thu, Nov 8, 2018, at 8:56 PM, Wes McKinney wrote: > > On behalf of the Arrow PMC, I'm happy to announce that Romain > > François, Sebastien Binet, and Yosuke Shiro have been invited to be > > committers on

Re: [ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Li Jin
Congrats! On Thu, Nov 8, 2018 at 4:02 PM Uwe L. Korn wrote: > Congratulations Krisztián! > > On Thu, Nov 8, 2018, at 9:56 PM, Philipp Moritz wrote: > > Congrats and welcome Krisztián! > > > > On Thu, Nov 8, 2018 at 11:48 AM Wes McKinney > wrote: > > > > > The Project Management Committee (PMC)

[jira] [Created] (ARROW-3730) [Python] Output a representation of pyarrow.Schema that can be used to reconstruct a schema in a script

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3730: Summary: [Python] Output a representation of pyarrow.Schema that can be used to reconstruct a schema in a script Key: ARROW-3730 URL: https://issues.apache.org/jira/browse/ARROW-373

[jira] [Created] (ARROW-3731) [R] R API for reading and writing Parquet files

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3731: Summary: [R] R API for reading and writing Parquet files Key: ARROW-3731 URL: https://issues.apache.org/jira/browse/ARROW-3731 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-3729) [C++] Support for writing TIMESTAMP_NANOS Parquet metadata

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3729: Summary: [C++] Support for writing TIMESTAMP_NANOS Parquet metadata Key: ARROW-3729 URL: https://issues.apache.org/jira/browse/ARROW-3729 Project: Apache Arrow

Re: Support for TIMESTAMP_NANOS in parquet-cpp

2018-11-08 Thread Wes McKinney
I opened an issue here: https://issues.apache.org/jira/browse/ARROW-3729 Patches would be welcome. On Sat, Oct 20, 2018 at 12:55 PM Wes McKinney wrote: > > hi Roman, > > We would welcome adding such a document to the Arrow wiki > https://cwiki.apache.org/confluence/display/ARROW. As to your other >

[jira] [Created] (ARROW-3732) [R] Add functions to write RecordBatch or Schema to Message value, then read back

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3732: Summary: [R] Add functions to write RecordBatch or Schema to Message value, then read back Key: ARROW-3732 URL: https://issues.apache.org/jira/browse/ARROW-3732 Project

Re: Assign/update : NA bitmap vs sentinel

2018-11-08 Thread Phillip Cloud
There is one database that I'm aware of that uses sentinels _and_ supports complex types with missing values: Kx's KDB+. This has led to some seriously strange choices like the ASCII space character being used as the sentinel value for strings. See https://code.kx.com/wiki/Reference/Datatypes for m
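To make the ambiguity mentioned above concrete, here is a small sketch in plain Python. It is not how KDB+ or Arrow is implemented; the helper names and the choice of a single space as the sentinel are assumptions made only to illustrate the point. With a sentinel, a genuine space and a missing string look the same; with a separate validity bitmap (least-significant bit first, a set bit meaning "valid") they stay distinct.

```python
SENTINEL = " "  # assumption for illustration: one ASCII space marks "missing"


def nulls_from_sentinel(values):
    """Treat every value equal to the sentinel as missing."""
    return [v == SENTINEL for v in values]


def nulls_from_bitmap(validity, length):
    """Read a validity bitmap, least-significant bit first; a set bit means valid."""
    return [((validity[i // 8] >> (i % 8)) & 1) == 0 for i in range(length)]


# Slot 0 holds "a", slot 1 holds a genuine space, slot 2 is missing.
sentinel_column = ["a", " ", SENTINEL]
bitmap_values = ["a", " ", ""]
validity = bytes([0b00000011])  # bits 0 and 1 set: slots 0 and 1 are valid

sentinel_result = nulls_from_sentinel(sentinel_column)
bitmap_result = nulls_from_bitmap(validity, len(bitmap_values))
```

In the sentinel version, slot 1 is reported as missing even though it holds a real space. The bitmap version reports only slot 2 as missing, at the cost of one extra bit per value.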

[jira] [Created] (ARROW-3733) [GLib] Add to_string() to GArrowTable and GArrowColumn

2018-11-08 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3733: Summary: [GLib] Add to_string() to GArrowTable and GArrowColumn Key: ARROW-3733 URL: https://issues.apache.org/jira/browse/ARROW-3733 Project: Apache Arrow Iss

Re: Support for TIMESTAMP_NANOS in parquet-cpp

2018-11-08 Thread Roman Karlstetter
I would be willing to implement that. I’ll probably need some advice on my patch though, as I’m fairly new to the parquet code. Roman From: Wes McKinney Sent: Thursday, 8 November 2018 23:22 To: dev@arrow.apache.org Subject: Re: Support for TIMESTAMP_NANOS in parquet-cpp I opened an issue