[jira] [Created] (ARROW-2292) [Python] More consistent / intuitive name for pyarrow.frombuffer

2018-03-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2292:
---

 Summary: [Python] More consistent / intuitive name for 
pyarrow.frombuffer
 Key: ARROW-2292
 URL: https://issues.apache.org/jira/browse/ARROW-2292
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Wes McKinney
 Fix For: 0.9.0


Now that we have {{pyarrow.foreign_buffer}}, things are a bit odd. We could 
call {{frombuffer}} something like {{py_buffer}} instead?
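
If the function is renamed, the old spelling could stay around for a release as a deprecated alias. The following is only a sketch of that general pattern in plain Python, not pyarrow's code: the function bodies return a memoryview purely as a stand-in, and py_buffer is simply the name floated above.

```python
import warnings


def py_buffer(obj):
    """Wrap an object that supports the buffer protocol (stand-in body)."""
    # memoryview is only a placeholder here; it is not what pyarrow returns.
    return memoryview(obj)


def frombuffer(obj):
    """Deprecated alias kept so that existing callers keep working."""
    warnings.warn(
        "frombuffer is deprecated, use py_buffer instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return py_buffer(obj)
```

Callers of the old name would still get a result, plus a DeprecationWarning pointing at their own call site.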



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Trying to compile Arrow C++

2018-03-08 Thread Wes McKinney
I see now where this is missing in the build docs. I think we have
updated the docs elsewhere (e.g. in the Python side source build
instructions). I got confused when you said you installed
"libboost-dev" because that will also install libboost-regex-dev.

thanks!
Wes

On Thu, Mar 8, 2018 at 10:52 PM, Andy Grove  wrote:
> OK, so after installing libboost-regex-dev I can generate a makefile. The
> docs don't state that I need to install this. I'll submit a one line PR for
> the docs.
>
> Thanks for the help.
>
> On Thu, Mar 8, 2018 at 7:40 PM, Wes McKinney  wrote:
>
>> Sure, please go ahead and open a JIRA. Do you have libboost-regex-dev
>> installed (is /usr/lib/x86_64-linux-gnu/libboost_regex.so present?)?
>>
>> On Thu, Mar 8, 2018 at 9:31 PM, Andy Grove  wrote:
>> > So I cloned arrow again in a new directory and it is no longer looking
>> for
>> > the local boost install.
>> >
>> > It is looking in /usr/include for headers and recognizes now that I have
>> > boost 1.58.0
>> >
>> > It still fails though, looking for boost_regex. Here is a new gist.
>> >
>> > https://gist.github.com/andygrove/840b5f4d9c500669bbd1de1b84287a0e
>> >
>> > Should I go ahead and file a JIRA for this?
>> >
>> > On Thu, Mar 8, 2018 at 8:32 AM, Wes McKinney 
>> wrote:
>> >
>> >> OK, does moving the ~/git/boost_1_66_0 someplace else (or removing it)
>> >> make the problem go away? We should open a JIRA to see why the CMake
>> >> build system is being fooled by that directory and see if it can be
>> >> fixed
>> >>
>> >> On Thu, Mar 8, 2018 at 10:28 AM, Andy Grove 
>> wrote:
>> >> > Thanks. Here's the gist. I do not have BOOST env vars set. It does
>> seem
>> >> to
>> >> > be looking for headers in a boost directory parallel to arrow though.
>> >> >
>> >> > https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c
>> >> >
>> >> > On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney 
>> >> wrote:
>> >> >
>> >> >> If you could also run with
>> >> >>
>> >> >> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
>> >> >>
>> >> >> that would provide additional debugging help
>> >> >>
>> >> >> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney 
>> >> wrote:
>> >> >> > hi Andy,
>> >> >> >
>> >> >> > Can you post the complete output of running CMake in a gist or
>> >> >> > someplace for us to have a look? Do you have any BOOST_*
>> environment
>> >> >> > variables set?
>> >> >> >
>> >> >> > Thanks
>> >> >> > Wes
>> >> >> >
>> >> >> > On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove
>> >> >> wrote:
>> >> >> >> So I'm following the instructions and installed the binary
>> >> dependencies,
>> >> >> >> including libboost-dev. I see boost headers in /usr/include/boost.
>> >> I'm
>> >> >> >> using Ubuntu 16.04.
>> >> >> >>
>> >> >> >> In the Arrow cpp directory, I ran:
>> >> >> >>
>> >> >> >> cmake -G "Unix Makefiles"
>> >> >> >>
>> >> >> >> I get this output:
>> >> >> >>
>> >> >> >>   Unable to find the requested Boost libraries.
>> >> >> >>
>> >> >> >>   Boost version: 0.0.0
>> >> >> >>
>> >> >> >>   Boost include path: /home/andy/git/boost_1_66_0
>> >> >> >>
>> >> >> >>   Could not find the following Boost libraries:
>> >> >> >>
>> >> >> >>   boost_regex
>> >> >> >>
>> >> >> >>   Some (but not all) of the required Boost libraries were found.
>> You
>> >> >> may
>> >> >> >>   need to install these additional Boost libraries.
>> Alternatively,
>> >> set
>> >> >> >>   BOOST_LIBRARYDIR to the directory containing Boost libraries or
>> >> >> BOOST_ROOT
>> >> >> >>   to the location of Boost.
>> >> >> >>
>> >> >> >> I also tried installing boost headers and going that route but ran
>> >> into
>> >> >> >> different problems.
>> >> >> >>
>> >> >> >> I'd appreciate some guidance.
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >>
>> >> >> >> Andy.
>> >> >>
>> >>
>>


[jira] [Created] (ARROW-2291) cpp README missing instructions for libboost-regex-dev

2018-03-08 Thread Andy Grove (JIRA)
Andy Grove created ARROW-2291:
-

 Summary: cpp README missing instructions for libboost-regex-dev
 Key: ARROW-2291
 URL: https://issues.apache.org/jira/browse/ARROW-2291
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
 Environment: Ubuntu 16.04
Reporter: Andy Grove


After following the instructions in the README, I could not generate a makefile 
using CMake because of a missing dependency.

The README needs to be updated to include installing libboost-regex-dev.
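
One quick way to tell whether the dependency is present before running CMake is to look for the library file itself. A minimal sketch, assuming the usual Ubuntu library directories; the directory list and the function name are illustrative and not part of the Arrow build:

```python
import glob
import os

# Assumed search locations: the first is the Ubuntu x86_64 path, the
# others are common guesses and may not apply to every system.
DEFAULT_LIB_DIRS = (
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib",
    "/usr/local/lib",
)


def find_boost_regex(lib_dirs=DEFAULT_LIB_DIRS):
    """Return the sorted paths of any libboost_regex libraries found."""
    hits = []
    for lib_dir in lib_dirs:
        # Matches libboost_regex.so, libboost_regex.so.1.58.0, libboost_regex.a
        hits.extend(glob.glob(os.path.join(lib_dir, "libboost_regex.*")))
    return sorted(hits)


if __name__ == "__main__":
    found = find_boost_regex()
    if found:
        print("found:", *found, sep="\n  ")
    else:
        print("libboost_regex not found; on Ubuntu, try installing libboost-regex-dev")
```

An empty result suggests that only the Boost headers are installed, which would match the "Could not find the following Boost libraries: boost_regex" message from CMake.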

 





Re: Trying to compile Arrow C++

2018-03-08 Thread Wes McKinney
Sure, please go ahead and open a JIRA. Do you have libboost-regex-dev
installed (is /usr/lib/x86_64-linux-gnu/libboost_regex.so present?)?

On Thu, Mar 8, 2018 at 9:31 PM, Andy Grove  wrote:
> So I cloned arrow again in a new directory and it is no longer looking for
> the local boost install.
>
> It is looking in /usr/include for headers and recognizes now that I have
> boost 1.58.0
>
> It still fails though, looking for boost_regex. Here is a new gist.
>
> https://gist.github.com/andygrove/840b5f4d9c500669bbd1de1b84287a0e
>
> Should I go ahead and file a JIRA for this?
>
> On Thu, Mar 8, 2018 at 8:32 AM, Wes McKinney  wrote:
>
>> OK, does moving the ~/git/boost_1_66_0 someplace else (or removing it)
>> make the problem go away? We should open a JIRA to see why the CMake
>> build system is being fooled by that directory and see if it can be
>> fixed
>>
>> On Thu, Mar 8, 2018 at 10:28 AM, Andy Grove  wrote:
>> > Thanks. Here's the gist. I do not have BOOST env vars set. It does seem
>> to
>> > be looking for headers in a boost directory parallel to arrow though.
>> >
>> > https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c
>> >
>> > On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney 
>> wrote:
>> >
>> >> If you could also run with
>> >>
>> >> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
>> >>
>> >> that would provide additional debugging help
>> >>
>> >> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney 
>> wrote:
>> >> > hi Andy,
>> >> >
>> >> > Can you post the complete output of running CMake in a gist or
>> >> > someplace for us to have a look? Do you have any BOOST_* environment
>> >> > variables set?
>> >> >
>> >> > Thanks
>> >> > Wes
>> >> >
>> >> > On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove 
>> >> wrote:
>> >> >> So I'm following the instructions and installed the binary
>> dependencies,
>> >> >> including libboost-dev. I see boost headers in /usr/include/boost.
>> I'm
>> >> >> using Ubuntu 16.04.
>> >> >>
>> >> >> In the Arrow cpp directory, I ran:
>> >> >>
>> >> >> cmake -G "Unix Makefiles"
>> >> >>
>> >> >> I get this output:
>> >> >>
>> >> >>   Unable to find the requested Boost libraries.
>> >> >>
>> >> >>   Boost version: 0.0.0
>> >> >>
>> >> >>   Boost include path: /home/andy/git/boost_1_66_0
>> >> >>
>> >> >>   Could not find the following Boost libraries:
>> >> >>
>> >> >>   boost_regex
>> >> >>
>> >> >>   Some (but not all) of the required Boost libraries were found.  You
>> >> may
>> >> >>   need to install these additional Boost libraries.  Alternatively,
>> set
>> >> >>   BOOST_LIBRARYDIR to the directory containing Boost libraries or
>> >> BOOST_ROOT
>> >> >>   to the location of Boost.
>> >> >>
>> >> >> I also tried installing boost headers and going that route but ran
>> into
>> >> >> different problems.
>> >> >>
>> >> >> I'd appreciate some guidance.
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Andy.
>> >>
>>


Re: Trying to compile Arrow C++

2018-03-08 Thread Andy Grove
So I cloned arrow again in a new directory and it is no longer looking for
the local boost install.

It is looking in /usr/include for headers and recognizes now that I have
boost 1.58.0

It still fails though, looking for boost_regex. Here is a new gist.

https://gist.github.com/andygrove/840b5f4d9c500669bbd1de1b84287a0e

Should I go ahead and file a JIRA for this?

On Thu, Mar 8, 2018 at 8:32 AM, Wes McKinney  wrote:

> OK, does moving the ~/git/boost_1_66_0 someplace else (or removing it)
> make the problem go away? We should open a JIRA to see why the CMake
> build system is being fooled by that directory and see if it can be
> fixed
>
> On Thu, Mar 8, 2018 at 10:28 AM, Andy Grove  wrote:
> > Thanks. Here's the gist. I do not have BOOST env vars set. It does seem
> to
> > be looking for headers in a boost directory parallel to arrow though.
> >
> > https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c
> >
> > On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney 
> wrote:
> >
> >> If you could also run with
> >>
> >> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
> >>
> >> that would provide additional debugging help
> >>
> >> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney 
> wrote:
> >> > hi Andy,
> >> >
> >> > Can you post the complete output of running CMake in a gist or
> >> > someplace for us to have a look? Do you have any BOOST_* environment
> >> > variables set?
> >> >
> >> > Thanks
> >> > Wes
> >> >
> >> > On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove 
> >> wrote:
> >> >> So I'm following the instructions and installed the binary
> dependencies,
> >> >> including libboost-dev. I see boost headers in /usr/include/boost.
> I'm
> >> >> using Ubuntu 16.04.
> >> >>
> >> >> In the Arrow cpp directory, I ran:
> >> >>
> >> >> cmake -G "Unix Makefiles"
> >> >>
> >> >> I get this output:
> >> >>
> >> >>   Unable to find the requested Boost libraries.
> >> >>
> >> >>   Boost version: 0.0.0
> >> >>
> >> >>   Boost include path: /home/andy/git/boost_1_66_0
> >> >>
> >> >>   Could not find the following Boost libraries:
> >> >>
> >> >>   boost_regex
> >> >>
> >> >>   Some (but not all) of the required Boost libraries were found.  You
> >> may
> >> >>   need to install these additional Boost libraries.  Alternatively,
> set
> >> >>   BOOST_LIBRARYDIR to the directory containing Boost libraries or
> >> BOOST_ROOT
> >> >>   to the location of Boost.
> >> >>
> >> >> I also tried installing boost headers and going that route but ran
> into
> >> >> different problems.
> >> >>
> >> >> I'd appreciate some guidance.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Andy.
> >>
>


Re: Working towards getting 0.9.0 release candidate up next week

2018-03-08 Thread Kouhei Sutou
Thanks!

--
kou

In 
  "Re: Working towards getting 0.9.0 release candidate up next week" on Thu, 8 
Mar 2018 20:44:14 -0500,
  Wes McKinney  wrote:

> hi Kou -- yes, I think this is a good idea. It will require a little
> bit of work to be able to produce a viable standalone source tarball.
> Between Uwe, Phillip, Antoine, and I, we should be able to come up
> with a plan to do this
> 
> - Wes
> 
> On Thu, Mar 8, 2018 at 8:33 PM, Kouhei Sutou  wrote:
>> Hi,
>>
>>>- Updating pip packages for C++ and Python
>>
>> Can we try adding PyArrow source package to PyPI at the
>> 0.9.0?
>>
>> I want to install PyArrow with Arrow C++ installed by .deb
>> or .rpm. I want to use both Red Arrow (Ruby bindings) and
>> PyArrow in the same process via PyCall (Ruby library
>> to integrate with Python). In the case, I need to use the
>> same Arrow C++ in both Red Arrow and PyArrow.
>>
>> Now, there are only binary packages for PyArrow at
>> https://pypi.python.org/pypi/pyarrow . If there is a source
>> package for PyArrow at PyPI, I can install PyArrow with
>> Arrow C++ installed by .deb or .rpm by "pip --no-binary
>> pyarrow".
>>
>> Red Arrow can also use Arrow C++ installed by .deb or .rpm.
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In 
>>   "Re: Working towards getting 0.9.0 release candidate up next week" on Thu, 
>> 8 Mar 2018 11:25:32 -0800,
>>   Siddharth Teotia  wrote:
>>
>>> All,
>>>
>>> I plan to get RC out over the weekend or early Monday. Is that fine with
>>> everybody?
>>>
>>> We have 6 items in progress --
>>> https://issues.apache.org/jira/projects/ARROW/versions/12341707#release-report-tab-body.
>>> How do people feel about completing these JIRAs by tomorrow? I am
>>> completely fine with deferring the RC to early next week (Mon/Tue/Wed) if
>>> necessary. Just looking for consensus. Also, I suggest that we defer the
>>> ones with TODO status. I will do it later today unless I hear otherwise.
>>>
>>> I was wondering if anyone else is interested in collaborating for the
>>> post-release tasks. As per
>>> https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md,
>>> following are the high level post-release tasks. Please let me know if you
>>> would like to take up something. I have written my name against some of
>>> them.
>>>
>>>
>>>- Updating the Arrow Website (Sidd)
>>>- Uploading release artifacts to SVN -- looks like PMC karma is needed
>>>to do this
>>>- Announcing release (Sidd)
>>>- Updating website with new API documentation (Sidd)
>>>- Updating pip packages for C++ and Python
>>>- Updating conda packages for C++ and Python (Sidd)
>>>- Updating Java Maven artifacts in Maven central (Sidd)
>>>- Release blog post
>>>
>>> If anything is missing, please add to the above list. It will be helpful
>>> for tracking.
>>>
>>> Thanks,
>>> Sidd
>>>
>>> On Sun, Mar 4, 2018 at 12:34 PM, Wes McKinney  wrote:
>>>
>>>> hey Sidd,
>>>>
>>>> The Python backlog is still in pretty rough shape. I'd like to see if
>>>> we can make an RC by Friday but if not we can defer to Monday/Tuesday
>>>> the following week (3/12 or 13). I will trim as much as possible out
>>>> of the current backlog to get things down to the essential
>>>>
>>>> - Wes
>>>>
>>>> On Sun, Feb 25, 2018 at 11:58 AM, Siddharth Teotia 
>>>> wrote:
>>>> > Sounds good.
>>>> >
>>>> > Thanks
>>>> > Sidd
>>>> >
>>>> > On Feb 24, 2018 6:24 PM, "Wes McKinney"  wrote:
>>>> >
>>>> > Hi Sidd,
>>>> >
>>>> > I think we have too many bugs to make an RC this coming week. I suggest
>>>> we
>>>> > defer to the following week.
>>>> >
>>>> > Thanks
>>>> > Wes
>>>> >
>>>> > On Feb 24, 2018 7:09 PM, "Siddharth Teotia" 
>>>> wrote:
>>>> >
>>>> > Hi All,
>>>> >
>>>> > We currently have 10 issues in progress and PRs are available for 8 of
>>>> > them. In interest of getting a release candidate next week, I would
>>>> request
>>>> > people to review PRs as soon as they can to help make progress and close
>>>> > out as many JIRAs as we can.
>>>> >
>>>> > There are 32 issues in TODO list and 25 of them are not yet assigned. I
>>>> am
>>>> > planning to defer some of the unassigned ones later today or tomorrow. It
>>>> > would be good to soon grab/assign the issues that people want to be fixed
>>>> > for 0.9.0.
>>>> >
>>>> > Here is the link to backlog:
>>>> > https://issues.apache.org/jira/projects/ARROW/versions/12341707
>>>> >
>>>> > Thanks,
>>>> > Sidd



Re: Working towards getting 0.9.0 release candidate up next week

2018-03-08 Thread Wes McKinney
hi Kou -- yes, I think this is a good idea. It will require a little
bit of work to be able to produce a viable standalone source tarball.
Between Uwe, Phillip, Antoine, and I, we should be able to come up
with a plan to do this

- Wes

On Thu, Mar 8, 2018 at 8:33 PM, Kouhei Sutou  wrote:
> Hi,
>
>>- Updating pip packages for C++ and Python
>
> Can we try adding PyArrow source package to PyPI at the
> 0.9.0?
>
> I want to install PyArrow with Arrow C++ installed by .deb
> or .rpm. I want to use both Red Arrow (Ruby bindings) and
> PyArrow in the same process via PyCall (Ruby library
> to integrate with Python). In the case, I need to use the
> same Arrow C++ in both Red Arrow and PyArrow.
>
> Now, there are only binary packages for PyArrow at
> https://pypi.python.org/pypi/pyarrow . If there is a source
> package for PyArrow at PyPI, I can install PyArrow with
> Arrow C++ installed by .deb or .rpm by "pip --no-binary
> pyarrow".
>
> Red Arrow can also use Arrow C++ installed by .deb or .rpm.
>
>
> Thanks,
> --
> kou
>
> In 
>   "Re: Working towards getting 0.9.0 release candidate up next week" on Thu, 
> 8 Mar 2018 11:25:32 -0800,
>   Siddharth Teotia  wrote:
>
>> All,
>>
>> I plan to get RC out over the weekend or early Monday. Is that fine with
>> everybody?
>>
>> We have 6 items in progress --
>> https://issues.apache.org/jira/projects/ARROW/versions/12341707#release-report-tab-body.
>> How do people feel about completing these JIRAs by tomorrow? I am
>> completely fine with deferring the RC to early next week (Mon/Tue/Wed) if
>> necessary. Just looking for consensus. Also, I suggest that we defer the
>> ones with TODO status. I will do it later today unless I hear otherwise.
>>
>> I was wondering if anyone else is interested in collaborating for the
>> post-release tasks. As per
>> https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md,
>> following are the high level post-release tasks. Please let me know if you
>> would like to take up something. I have written my name against some of
>> them.
>>
>>
>>- Updating the Arrow Website (Sidd)
>>- Uploading release artifacts to SVN -- looks like PMC karma is needed
>>to do this
>>- Announcing release (Sidd)
>>- Updating website with new API documentation (Sidd)
>>- Updating pip packages for C++ and Python
>>- Updating conda packages for C++ and Python (Sidd)
>>- Updating Java Maven artifacts in Maven central (Sidd)
>>- Release blog post
>>
>> If anything is missing, please add to the above list. It will be helpful
>> for tracking.
>>
>> Thanks,
>> Sidd
>>
>> On Sun, Mar 4, 2018 at 12:34 PM, Wes McKinney  wrote:
>>
>>> hey Sidd,
>>>
>>> The Python backlog is still in pretty rough shape. I'd like to see if
>>> we can make an RC by Friday but if not we can defer to Monday/Tuesday
>>> the following week (3/12 or 13). I will trim as much as possible out
>>> of the current backlog to get things down to the essential
>>>
>>> - Wes
>>>
>>> On Sun, Feb 25, 2018 at 11:58 AM, Siddharth Teotia 
>>> wrote:
>>> > Sounds good.
>>> >
>>> > Thanks
>>> > Sidd
>>> >
>>> > On Feb 24, 2018 6:24 PM, "Wes McKinney"  wrote:
>>> >
>>> > Hi Sidd,
>>> >
>>> > I think we have too many bugs to make an RC this coming week. I suggest
>>> we
>>> > defer to the following week.
>>> >
>>> > Thanks
>>> > Wes
>>> >
>>> > On Feb 24, 2018 7:09 PM, "Siddharth Teotia" 
>>> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > We currently have 10 issues in progress and PRs are available for 8 of
>>> > them. In interest of getting a release candidate next week, I would
>>> request
>>> > people to review PRs as soon as they can to help make progress and close
>>> > out as many JIRAs as we can.
>>> >
>>> > There are 32 issues in TODO list and 25 of them are not yet assigned. I
>>> am
>>> > planning to defer some of the unassigned ones later today or tomorrow. It
>>> > would be good to soon grab/assign the issues that people want to be fixed
>>> > for 0.9.0.
>>> >
>>> > Here is the link to backlog:
>>> > https://issues.apache.org/jira/projects/ARROW/versions/12341707
>>> >
>>> > Thanks,
>>> > Sidd
>>>


Re: Working towards getting 0.9.0 release candidate up next week

2018-03-08 Thread Kouhei Sutou
Hi,

>- Updating pip packages for C++ and Python

Can we try adding PyArrow source package to PyPI at the
0.9.0?

I want to install PyArrow with Arrow C++ installed by .deb
or .rpm. I want to use both Red Arrow (Ruby bindings) and
PyArrow in the same process via PyCall (Ruby library
to integrate with Python). In the case, I need to use the
same Arrow C++ in both Red Arrow and PyArrow.

Now, there are only binary packages for PyArrow at
https://pypi.python.org/pypi/pyarrow . If there is a source
package for PyArrow at PyPI, I can install PyArrow with
Arrow C++ installed by .deb or .rpm by "pip --no-binary
pyarrow".

Red Arrow can also use Arrow C++ installed by .deb or .rpm.
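
For reference, the pip invocation would look roughly like the sketch below. It only builds and prints the command; the helper name is made up for illustration, and the install itself would only work once a source package exists on PyPI and the Arrow C++ development files are installed.

```python
import sys


def source_install_command(package="pyarrow"):
    """Build a pip command that refuses binary wheels for `package`."""
    # --no-binary takes the package name(s) it applies to; the package to
    # install is then given again as the requirement.
    return [sys.executable, "-m", "pip", "install", "--no-binary", package, package]


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.check_call to run it.
    print(" ".join(source_install_command()))
```

Using "python -m pip" rather than a bare "pip" keeps the install tied to the interpreter that will later be embedded via PyCall.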


Thanks,
--
kou

In 
  "Re: Working towards getting 0.9.0 release candidate up next week" on Thu, 8 
Mar 2018 11:25:32 -0800,
  Siddharth Teotia  wrote:

> All,
> 
> I plan to get RC out over the weekend or early Monday. Is that fine with
> everybody?
> 
> We have 6 items in progress --
> https://issues.apache.org/jira/projects/ARROW/versions/12341707#release-report-tab-body.
> How do people feel about completing these JIRAs by tomorrow? I am
> completely fine with deferring the RC to early next week (Mon/Tue/Wed) if
> necessary. Just looking for consensus. Also, I suggest that we defer the
> ones with TODO status. I will do it later today unless I hear otherwise.
> 
> I was wondering if anyone else is interested in collaborating for the
> post-release tasks. As per
> https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md,
> following are the high level post-release tasks. Please let me know if you
> would like to take up something. I have written my name against some of
> them.
> 
> 
>- Updating the Arrow Website (Sidd)
>- Uploading release artifacts to SVN -- looks like PMC karma is needed
>to do this
>- Announcing release (Sidd)
>- Updating website with new API documentation (Sidd)
>- Updating pip packages for C++ and Python
>- Updating conda packages for C++ and Python (Sidd)
>- Updating Java Maven artifacts in Maven central (Sidd)
>- Release blog post
> 
> If anything is missing, please add to the above list. It will be helpful
> for tracking.
> 
> Thanks,
> Sidd
> 
> On Sun, Mar 4, 2018 at 12:34 PM, Wes McKinney  wrote:
> 
>> hey Sidd,
>>
>> The Python backlog is still in pretty rough shape. I'd like to see if
>> we can make an RC by Friday but if not we can defer to Monday/Tuesday
>> the following week (3/12 or 13). I will trim as much as possible out
>> of the current backlog to get things down to the essential
>>
>> - Wes
>>
>> On Sun, Feb 25, 2018 at 11:58 AM, Siddharth Teotia 
>> wrote:
>> > Sounds good.
>> >
>> > Thanks
>> > Sidd
>> >
>> > On Feb 24, 2018 6:24 PM, "Wes McKinney"  wrote:
>> >
>> > Hi Sidd,
>> >
>> > I think we have too many bugs to make an RC this coming week. I suggest
>> we
>> > defer to the following week.
>> >
>> > Thanks
>> > Wes
>> >
>> > On Feb 24, 2018 7:09 PM, "Siddharth Teotia" 
>> wrote:
>> >
>> > Hi All,
>> >
>> > We currently have 10 issues in progress and PRs are available for 8 of
>> > them. In interest of getting a release candidate next week, I would
>> request
>> > people to review PRs as soon as they can to help make progress and close
>> > out as many JIRAs as we can.
>> >
>> > There are 32 issues in TODO list and 25 of them are not yet assigned. I
>> am
>> > planning to defer some of the unassigned ones later today or tomorrow. It
>> > would be good to soon grab/assign the issues that people want to be fixed
>> > for 0.9.0.
>> >
>> > Here is the link to backlog:
>> > https://issues.apache.org/jira/projects/ARROW/versions/12341707
>> >
>> > Thanks,
>> > Sidd
>>


[jira] [Created] (ARROW-2290) [C++/Python] Add ability to set codec options for lz4 codec

2018-03-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2290:
---

 Summary: [C++/Python] Add ability to set codec options for lz4 
codec
 Key: ARROW-2290
 URL: https://issues.apache.org/jira/browse/ARROW-2290
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Reporter: Wes McKinney


The LZ4 library has many parameters; currently we do not expose these in C++ or 
Python.





Re: Working towards getting 0.9.0 release candidate up next week

2018-03-08 Thread Siddharth Teotia
Thanks, Wes. Let's shoot for Monday.

On Thu, Mar 8, 2018 at 11:31 AM, Wes McKinney  wrote:

> Since almost all of the items in TODO are C++ or Python issues, I can
> do a final review today to remove anything that isn't absolutely
> necessary for 0.9.0. We have a couple of nasty bugs still in TODO that
> we should try to fix -- in the event that they cannot be fixed, we may
> need to do a 0.9.1 in a week or two. I would suggest we wait to cut
> the RC until Monday to give enough time for these last items to get
> fixes in.
>
> There are some other things that need doing, like updates per changes
> to the ASF checksum policy ARROW-2268.
>
> I can write by EOD today with a status report on the issues in TODO.
>
> I believe you need to be a PMC to undertake the source release process
> prior to the vote -- I am happy to help with this on Monday.
>
> - Wes
>
> On Thu, Mar 8, 2018 at 2:25 PM, Siddharth Teotia 
> wrote:
> > All,
> >
> > I plan to get RC out over the weekend or early Monday. Is that fine with
> > everybody?
> >
> > We have 6 items in progress --
> > https://issues.apache.org/jira/projects/ARROW/versions/
> 12341707#release-report-tab-body.
> > How do people feel about completing these JIRAs by tomorrow? I am
> > completely fine with deferring the RC to early next week (Mon/Tue/Wed) if
> > necessary. Just looking for consensus. Also, I suggest that we defer the
> > ones with TODO status. I will do it later today unless I hear otherwise.
> >
> > I was wondering if anyone else is interested in collaborating for the
> > post-release tasks. As per
> > https://github.com/apache/arrow/blob/master/dev/release/
> RELEASE_MANAGEMENT.md,
> > following are the high level post-release tasks. Please let me know if
> you
> > would like to take up something. I have written my name against some of
> > them.
> >
> >
> >- Updating the Arrow Website (Sidd)
> >- Uploading release artifacts to SVN -- looks like PMC karma is needed
> >to do this
> >- Announcing release (Sidd)
> >- Updating website with new API documentation (Sidd)
> >- Updating pip packages for C++ and Python
> >- Updating conda packages for C++ and Python (Sidd)
> >- Updating Java Maven artifacts in Maven central (Sidd)
> >- Release blog post
> >
> > If anything is missing, please add to the above list. It will be helpful
> > for tracking.
> >
> > Thanks,
> > Sidd
> >
> > On Sun, Mar 4, 2018 at 12:34 PM, Wes McKinney 
> wrote:
> >
> >> hey Sidd,
> >>
> >> The Python backlog is still in pretty rough shape. I'd like to see if
> >> we can make an RC by Friday but if not we can defer to Monday/Tuesday
> >> the following week (3/12 or 13). I will trim as much as possible out
> >> of the current backlog to get things down to the essential
> >>
> >> - Wes
> >>
> >> On Sun, Feb 25, 2018 at 11:58 AM, Siddharth Teotia <
> siddha...@dremio.com>
> >> wrote:
> >> > Sounds good.
> >> >
> >> > Thanks
> >> > Sidd
> >> >
> >> > On Feb 24, 2018 6:24 PM, "Wes McKinney"  wrote:
> >> >
> >> > Hi Sidd,
> >> >
> >> > I think we have too many bugs to make an RC this coming week. I
> suggest
> >> we
> >> > defer to the following week.
> >> >
> >> > Thanks
> >> > Wes
> >> >
> >> > On Feb 24, 2018 7:09 PM, "Siddharth Teotia" 
> >> wrote:
> >> >
> >> > Hi All,
> >> >
> >> > We currently have 10 issues in progress and PRs are available for 8 of
> >> > them. In interest of getting a release candidate next week, I would
> >> request
> >> > people to review PRs as soon as they can to help make progress and
> close
> >> > out as many JIRAs as we can.
> >> >
> >> > There are 32 issues in TODO list and 25 of them are not yet assigned.
> I
> >> am
> >> > planning to defer some of the unassigned ones later today or
> tomorrow. It
> >> > would be good to soon grab/assign the issues that people want to be
> fixed
> >> > for 0.9.0.
> >> >
> >> > Here is the link to backlog:
> >> > https://issues.apache.org/jira/projects/ARROW/versions/12341707
> >> >
> >> > Thanks,
> >> > Sidd
> >>
>


Re: Working towards getting 0.9.0 release candidate up next week

2018-03-08 Thread Wes McKinney
Since almost all of the items in TODO are C++ or Python issues, I can
do a final review today to remove anything that isn't absolutely
necessary for 0.9.0. We have a couple of nasty bugs still in TODO that
we should try to fix -- in the event that they cannot be fixed, we may
need to do a 0.9.1 in a week or two. I would suggest we wait to cut
the RC until Monday to give enough time for these last items to get
fixes in.

There are some other things that need doing, like updates per changes
to the ASF checksum policy ARROW-2268.

I can write by EOD today with a status report on the issues in TODO.

I believe you need to be a PMC to undertake the source release process
prior to the vote -- I am happy to help with this on Monday.

- Wes

On Thu, Mar 8, 2018 at 2:25 PM, Siddharth Teotia  wrote:
> All,
>
> I plan to get RC out over the weekend or early Monday. Is that fine with
> everybody?
>
> We have 6 items in progress --
> https://issues.apache.org/jira/projects/ARROW/versions/12341707#release-report-tab-body.
> How do people feel about completing these JIRAs by tomorrow? I am
> completely fine with deferring the RC to early next week (Mon/Tue/Wed) if
> necessary. Just looking for consensus. Also, I suggest that we defer the
> ones with TODO status. I will do it later today unless I hear otherwise.
>
> I was wondering if anyone else is interested in collaborating for the
> post-release tasks. As per
> https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md,
> following are the high level post-release tasks. Please let me know if you
> would like to take up something. I have written my name against some of
> them.
>
>
>- Updating the Arrow Website (Sidd)
>- Uploading release artifacts to SVN -- looks like PMC karma is needed
>to do this
>- Announcing release (Sidd)
>- Updating website with new API documentation (Sidd)
>- Updating pip packages for C++ and Python
>- Updating conda packages for C++ and Python (Sidd)
>- Updating Java Maven artifacts in Maven central (Sidd)
>- Release blog post
>
> If anything is missing, please add to the above list. It will be helpful
> for tracking.
>
> Thanks,
> Sidd
>
> On Sun, Mar 4, 2018 at 12:34 PM, Wes McKinney  wrote:
>
>> hey Sidd,
>>
>> The Python backlog is still in pretty rough shape. I'd like to see if
>> we can make an RC by Friday but if not we can defer to Monday/Tuesday
>> the following week (3/12 or 13). I will trim as much as possible out
>> of the current backlog to get things down to the essential
>>
>> - Wes
>>
>> On Sun, Feb 25, 2018 at 11:58 AM, Siddharth Teotia 
>> wrote:
>> > Sounds good.
>> >
>> > Thanks
>> > Sidd
>> >
>> > On Feb 24, 2018 6:24 PM, "Wes McKinney"  wrote:
>> >
>> > Hi Sidd,
>> >
>> > I think we have too many bugs to make an RC this coming week. I suggest
>> we
>> > defer to the following week.
>> >
>> > Thanks
>> > Wes
>> >
>> > On Feb 24, 2018 7:09 PM, "Siddharth Teotia" 
>> wrote:
>> >
>> > Hi All,
>> >
>> > We currently have 10 issues in progress and PRs are available for 8 of
>> > them. In interest of getting a release candidate next week, I would
>> request
>> > people to review PRs as soon as they can to help make progress and close
>> > out as many JIRAs as we can.
>> >
>> > There are 32 issues in TODO list and 25 of them are not yet assigned. I
>> am
>> > planning to defer some of the unassigned ones later today or tomorrow. It
>> > would be good to soon grab/assign the issues that people want to be fixed
>> > for 0.9.0.
>> >
>> > Here is the link to backlog:
>> > https://issues.apache.org/jira/projects/ARROW/versions/12341707
>> >
>> > Thanks,
>> > Sidd
>>


Re: Working towards getting 0.9.0 release candidate up next week

2018-03-08 Thread Siddharth Teotia
All,

I plan to get RC out over the weekend or early Monday. Is that fine with
everybody?

We have 6 items in progress --
https://issues.apache.org/jira/projects/ARROW/versions/12341707#release-report-tab-body.
How do people feel about completing these JIRAs by tomorrow? I am
completely fine with deferring the RC to early next week (Mon/Tue/Wed) if
necessary. Just looking for consensus. Also, I suggest that we defer the
ones with TODO status. I will do it later today unless I hear otherwise.

I was wondering if anyone else is interested in collaborating on the
post-release tasks. As per
https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md,
the following are the high-level post-release tasks. Please let me know if you
would like to take up something. I have written my name against some of
them.


   - Updating the Arrow Website (Sidd)
   - Uploading release artifacts to SVN -- looks like PMC karma is needed
   to do this
   - Announcing release (Sidd)
   - Updating website with new API documentation (Sidd)
   - Updating pip packages for C++ and Python
   - Updating conda packages for C++ and Python (Sidd)
   - Updating Java Maven artifacts in Maven central (Sidd)
   - Release blog post

If anything is missing, please add to the above list. It will be helpful
for tracking.

Thanks,
Sidd

On Sun, Mar 4, 2018 at 12:34 PM, Wes McKinney  wrote:

> hey Sidd,
>
> The Python backlog is still in pretty rough shape. I'd like to see if
> we can make an RC by Friday but if not we can defer to Monday/Tuesday
> the following week (3/12 or 13). I will trim as much as possible out
> of the current backlog to get things down to the essential
>
> - Wes
>
> On Sun, Feb 25, 2018 at 11:58 AM, Siddharth Teotia 
> wrote:
> > Sounds good.
> >
> > Thanks
> > Sidd
> >
> > On Feb 24, 2018 6:24 PM, "Wes McKinney"  wrote:
> >
> > Hi Sidd,
> >
> > I think we have too many bugs to make an RC this coming week. I suggest
> we
> > defer to the following week.
> >
> > Thanks
> > Wes
> >
> > On Feb 24, 2018 7:09 PM, "Siddharth Teotia" 
> wrote:
> >
> > Hi All,
> >
> > We currently have 10 issues in progress and PRs are available for 8 of
> > them. In interest of getting a release candidate next week, I would
> request
> > people to review PRs as soon as they can to help make progress and close
> > out as many JIRAs as we can.
> >
> > There are 32 issues in TODO list and 25 of them are not yet assigned. I
> am
> > planning to defer some of the unassigned ones later today or tomorrow. It
> > would be good to soon grab/assign the issues that people want to be fixed
> > for 0.9.0.
> >
> > Here is the link to backlog:
> > https://issues.apache.org/jira/projects/ARROW/versions/12341707
> >
> > Thanks,
> > Sidd
>


[jira] [Created] (ARROW-2289) [GLib] Add Numeric, Integer and FloatingPoint data types

2018-03-08 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-2289:
---

 Summary: [GLib] Add Numeric, Integer and FloatingPoint data types
 Key: ARROW-2289
 URL: https://issues.apache.org/jira/browse/ARROW-2289
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Affects Versions: 0.8.0
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
 Fix For: 0.9.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Trying to compile Arrow C++

2018-03-08 Thread Wes McKinney
OK, does moving the ~/git/boost_1_66_0 someplace else (or removing it)
make the problem go away? We should open a JIRA to see why the CMake
build system is being fooled by that directory and see if it can be
fixed

On Thu, Mar 8, 2018 at 10:28 AM, Andy Grove  wrote:
> Thanks. Here's the gist. I do not have BOOST env vars set. It does seem to
> be looking for headers in a boost directory parallel to arrow though.
>
> https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c
>
> On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney  wrote:
>
>> If you could also run with
>>
>> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
>>
>> that would provide additional debugging help
>>
>> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney  wrote:
>> > hi Andy,
>> >
>> > Can you post the complete output of running CMake in a gist or
>> > someplace for us to have a look? Do you have any BOOST_* environment
>> > variables set?
>> >
>> > Thanks
>> > Wes
>> >
>> > On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove 
>> wrote:
>> >> So I'm following the instructions and installed the binary dependencies,
>> >> including libboost-dev. I see boost headers in /usr/include/boost. I'm
>> >> using Ubuntu 16.04.
>> >>
>> >> In the Arrow cpp directory, I ran:
>> >>
>> >> cmake -G "Unix Makefiles"
>> >>
>> >> I get this output:
>> >>
>> >>   Unable to find the requested Boost libraries.
>> >>
>> >>   Boost version: 0.0.0
>> >>
>> >>   Boost include path: /home/andy/git/boost_1_66_0
>> >>
>> >>   Could not find the following Boost libraries:
>> >>
>> >>   boost_regex
>> >>
>> >>   Some (but not all) of the required Boost libraries were found.  You
>> may
>> >>   need to install these additional Boost libraries.  Alternatively, set
>> >>   BOOST_LIBRARYDIR to the directory containing Boost libraries or
>> BOOST_ROOT
>> >>   to the location of Boost.
>> >>
>> >> I also tried installing boost headers and going that route but ran into
>> >> different problems.
>> >>
>> >> I'd appreciate some guidance.
>> >>
>> >> Thanks,
>> >>
>> >> Andy.
>>


Re: Trying to compile Arrow C++

2018-03-08 Thread Andy Grove
Thanks. Here's the gist. I do not have BOOST env vars set. It does seem to
be looking for headers in a boost directory parallel to arrow though.

https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c

On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney  wrote:

> If you could also run with
>
> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
>
> that would provide additional debugging help
>
> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney  wrote:
> > hi Andy,
> >
> > Can you post the complete output of running CMake in a gist or
> > someplace for us to have a look? Do you have any BOOST_* environment
> > variables set?
> >
> > Thanks
> > Wes
> >
> > On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove 
> wrote:
> >> So I'm following the instructions and installed the binary dependencies,
> >> including libboost-dev. I see boost headers in /usr/include/boost. I'm
> >> using Ubuntu 16.04.
> >>
> >> In the Arrow cpp directory, I ran:
> >>
> >> cmake -G "Unix Makefiles"
> >>
> >> I get this output:
> >>
> >>   Unable to find the requested Boost libraries.
> >>
> >>   Boost version: 0.0.0
> >>
> >>   Boost include path: /home/andy/git/boost_1_66_0
> >>
> >>   Could not find the following Boost libraries:
> >>
> >>   boost_regex
> >>
> >>   Some (but not all) of the required Boost libraries were found.  You
> may
> >>   need to install these additional Boost libraries.  Alternatively, set
> >>   BOOST_LIBRARYDIR to the directory containing Boost libraries or
> BOOST_ROOT
> >>   to the location of Boost.
> >>
> >> I also tried installing boost headers and going that route but ran into
> >> different problems.
> >>
> >> I'd appreciate some guidance.
> >>
> >> Thanks,
> >>
> >> Andy.
>


Re: Trying to compile Arrow C++

2018-03-08 Thread Wes McKinney
If you could also run with

-DARROW_VERBOSE_THIRDPARTY_BUILD=ON

that would provide additional debugging help

On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney  wrote:
> hi Andy,
>
> Can you post the complete output of running CMake in a gist or
> someplace for us to have a look? Do you have any BOOST_* environment
> variables set?
>
> Thanks
> Wes
>
> On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove  wrote:
>> So I'm following the instructions and installed the binary dependencies,
>> including libboost-dev. I see boost headers in /usr/include/boost. I'm
>> using Ubuntu 16.04.
>>
>> In the Arrow cpp directory, I ran:
>>
>> cmake -G "Unix Makefiles"
>>
>> I get this output:
>>
>>   Unable to find the requested Boost libraries.
>>
>>   Boost version: 0.0.0
>>
>>   Boost include path: /home/andy/git/boost_1_66_0
>>
>>   Could not find the following Boost libraries:
>>
>>   boost_regex
>>
>>   Some (but not all) of the required Boost libraries were found.  You may
>>   need to install these additional Boost libraries.  Alternatively, set
>>   BOOST_LIBRARYDIR to the directory containing Boost libraries or BOOST_ROOT
>>   to the location of Boost.
>>
>> I also tried installing boost headers and going that route but ran into
>> different problems.
>>
>> I'd appreciate some guidance.
>>
>> Thanks,
>>
>> Andy.


Trying to compile Arrow C++

2018-03-08 Thread Andy Grove
So I'm following the instructions and installed the binary dependencies,
including libboost-dev. I see boost headers in /usr/include/boost. I'm
using Ubuntu 16.04.

In the Arrow cpp directory, I ran:

cmake -G "Unix Makefiles"

I get this output:

  Unable to find the requested Boost libraries.

  Boost version: 0.0.0

  Boost include path: /home/andy/git/boost_1_66_0

  Could not find the following Boost libraries:

  boost_regex

  Some (but not all) of the required Boost libraries were found.  You may
  need to install these additional Boost libraries.  Alternatively, set
  BOOST_LIBRARYDIR to the directory containing Boost libraries or BOOST_ROOT
  to the location of Boost.

I also tried installing boost headers and going that route but ran into
different problems.

I'd appreciate some guidance.

Thanks,

Andy.


Re: Introducing myself

2018-03-08 Thread Andy Grove
Hi Krisztian,

Yes, I'd love to team up. Thanks for the link ... I had been looking at a
different Rust Arrow project.

Once I have arrow building I will let you know.

Thanks,

Andy.

On Wed, Mar 7, 2018 at 8:41 AM, Krisztián Szűcs 
wrote:

> Hey Andy!
>
> In the last couple of days I was digging into arrow and iron-arrow (
> https://github.com/jihoonson/iron-arrow)
> in order to create a rust impl for arrow.
> My background is mostly p[c]ythonic, so I'd gladly team up if you are
> interested.
>
> Krisztian
> On Mar 7 2018, at 4:25 pm, Andy Grove  wrote:
> >
> > Hi,
> > I just wanted to introduce myself to the group before I start asking lots
> > of questions. I'm a software engineer mostly working with
> > Scala/Spark/Kudu/Parquet in my day job and in my spare time I have been
> > working on a POC of a distributed data platform implemented in Rust. The
> > project is called DataFusion (https://www.datafusion.rs/).
> >
> > The project is very early and the implementation is currently very simple
> > row-based processing but the performance is already quite exciting to me
> > (current test case is 4x faster than Apache Spark).
> >
> > I have decided that I should now concentrate on making Apache Arrow the
> > native memory format so that I can implement more efficient data
> processing
> > and make it easier in the future to be able to integrate with things like
> > Kudu and Parquet. It's also just a great way for me to learn about
> > columnar-processing.
> >
> > I'm just in the process of getting Arrow compiling and reading the docs.
> > I'll be back soon with questions I'm sure.
> >
> > Thanks,
> > Andy.
>


[jira] [Created] (ARROW-2288) [Python] slicing logic defective

2018-03-08 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2288:
-

 Summary: [Python] slicing logic defective
 Key: ARROW-2288
 URL: https://issues.apache.org/jira/browse/ARROW-2288
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.8.0
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou


The slicing logic tends to go too far when normalizing large negative bounds, 
which leads to results not in line with Python's slicing semantics:
{code}
>>> arr = pa.array([1,2,3,4])
>>> arr[-99:100]

[
  2,
  3,
  4
]
>>> arr.to_pylist()[-99:100]
[1, 2, 3, 4]
>>> 
>>> 
>>> arr[-6:-5]

[
  3
]
>>> arr.to_pylist()[-6:-5]
[]
{code}
Also note this crash:
{code}
>>> arr[10:13]
/home/antoine/arrow/cpp/src/arrow/array.cc:105 Check failed: (offset) <= (data.length)
Abandon (core dumped)
{code}
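
For reference, a minimal sketch of the clamping that Python's own sequences apply, assuming a step of 1 only. The helper name normalize_slice is hypothetical and this is not pyarrow's implementation; it only leans on the built-in slice.indices() to show which (offset, length) pair each of the cases above would be expected to produce.

```python
def normalize_slice(start, stop, length):
    """Return (offset, length) for seq[start:stop] using Python's slicing rules.

    Hypothetical helper for illustration only; handles step == 1.
    """
    # slice.indices() resolves negative bounds and clamps both ends to [0, length]
    lo, hi, _step = slice(start, stop).indices(length)
    # An inverted or empty range yields a zero-length slice rather than an error
    return lo, max(hi - lo, 0)


values = [1, 2, 3, 4]
for start, stop in [(-99, 100), (-6, -5), (10, 13)]:
    offset, n = normalize_slice(start, stop, len(values))
    # The clamped (offset, length) pair must agree with plain list slicing
    assert values[offset:offset + n] == values[start:stop]
    print((start, stop), "->", (offset, n))
```

With bounds clamped this way, the offset never exceeds the array length, so the out-of-range case would be expected to give an empty slice instead of tripping the offset check.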



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)