[jira] [Created] (ARROW-2292) [Python] More consistent / intuitive name for pyarrow.frombuffer
Wes McKinney created ARROW-2292:

Summary: [Python] More consistent / intuitive name for pyarrow.frombuffer
Key: ARROW-2292
URL: https://issues.apache.org/jira/browse/ARROW-2292
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Wes McKinney
Fix For: 0.9.0

Now that we have {{pyarrow.foreign_buffer}}, things are a bit odd. We could call {{from_buffer}} something like {{py_buffer}} instead?

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Trying to compile Arrow C++
I see now where this is missing in the build docs. I think we have updated the docs elsewhere (e.g. in the Python side source build instructions). I got confused when you said you installed "libboost-dev" because that will also install libboost-regex-dev.

thanks!
Wes

On Thu, Mar 8, 2018 at 10:52 PM, Andy Grove wrote:
> OK, so after installing libboost-regex-dev I can generate a makefile. The
> docs don't state that I need to install this. I'll submit a one line PR for
> the docs.
>
> Thanks for the help.
>
> On Thu, Mar 8, 2018 at 7:40 PM, Wes McKinney wrote:
>
>> Sure, please go ahead and open a JIRA. Do you have libboost-regex-dev
>> installed (is /usr/lib/x86_64-linux-gnu/libboost_regex.so present?)?
>>
>> On Thu, Mar 8, 2018 at 9:31 PM, Andy Grove wrote:
>>> So I cloned arrow again in a new directory and it is no longer looking for
>>> the local boost install.
>>>
>>> It is looking in /usr/include for headers and recognizes now that I have
>>> boost 1.58.0
>>>
>>> It still fails though, looking for boost_regex. Here is a new gist.
>>>
>>> https://gist.github.com/andygrove/840b5f4d9c500669bbd1de1b84287a0e
>>>
>>> Should I go ahead and file a JIRA for this?
>>>
>>> On Thu, Mar 8, 2018 at 8:32 AM, Wes McKinney wrote:
>>>
>>>> OK, does moving the ~/git/boost_1_66_0 someplace else (or removing it)
>>>> make the problem go away? We should open a JIRA to see why the CMake
>>>> build system is being fooled by that directory and see if it can be
>>>> fixed
>>>>
>>>> On Thu, Mar 8, 2018 at 10:28 AM, Andy Grove wrote:
>>>>> Thanks. Here's the gist. I do not have BOOST env vars set. It does seem to
>>>>> be looking for headers in a boost directory parallel to arrow though.
>>>>>
>>>>> https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c
>>>>>
>>>>> On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney wrote:
>>>>>
>>>>>> If you could also run with
>>>>>>
>>>>>> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
>>>>>>
>>>>>> that would provide additional debugging help
>>>>>>
>>>>>> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney wrote:
>>>>>>> hi Andy,
>>>>>>>
>>>>>>> Can you post the complete output of running CMake in a gist or
>>>>>>> someplace for us to have a look? Do you have any BOOST_* environment
>>>>>>> variables set?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Wes
>>>>>>>
>>>>>>> On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove wrote:
>>>>>>>> So I'm following the instructions and installed the binary dependencies,
>>>>>>>> including libboost-dev. I see boost headers in /usr/include/boost. I'm
>>>>>>>> using Ubuntu 16.04.
>>>>>>>>
>>>>>>>> In the Arrow cpp directory, I ran:
>>>>>>>>
>>>>>>>> cmake -G "Unix Makefiles"
>>>>>>>>
>>>>>>>> I get this output:
>>>>>>>>
>>>>>>>> Unable to find the requested Boost libraries.
>>>>>>>>
>>>>>>>> Boost version: 0.0.0
>>>>>>>>
>>>>>>>> Boost include path: /home/andy/git/boost_1_66_0
>>>>>>>>
>>>>>>>> Could not find the following Boost libraries:
>>>>>>>>
>>>>>>>> boost_regex
>>>>>>>>
>>>>>>>> Some (but not all) of the required Boost libraries were found. You may
>>>>>>>> need to install these additional Boost libraries. Alternatively, set
>>>>>>>> BOOST_LIBRARYDIR to the directory containing Boost libraries or BOOST_ROOT
>>>>>>>> to the location of Boost.
>>>>>>>>
>>>>>>>> I also tried installing boost headers and going that route but ran into
>>>>>>>> different problems.
>>>>>>>>
>>>>>>>> I'd appreciate some guidance.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Andy.
[jira] [Created] (ARROW-2291) cpp README missing instructions for libboost-regex-dev
Andy Grove created ARROW-2291:

Summary: cpp README missing instructions for libboost-regex-dev
Key: ARROW-2291
URL: https://issues.apache.org/jira/browse/ARROW-2291
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Environment: Ubuntu 16.04
Reporter: Andy Grove

After following the instructions in the README, I could not generate a makefile using CMake because of a missing dependency. The README needs to be updated to include installing libboost-regex-dev.
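As a rough illustration of the check suggested in the thread (is `libboost_regex.so` present under `/usr/lib/x86_64-linux-gnu`?), the sketch below looks for Boost.Regex library files before CMake is run. The function name `find_boost_regex` and the extra candidate directories are assumptions made for this example; only `/usr/lib/x86_64-linux-gnu` comes from the discussion.

```python
from pathlib import Path

# Candidate directories are assumptions for illustration; the thread only
# mentions /usr/lib/x86_64-linux-gnu (Ubuntu 16.04).
DEFAULT_LIB_DIRS = ("/usr/lib/x86_64-linux-gnu", "/usr/lib", "/usr/local/lib")


def find_boost_regex(lib_dirs=DEFAULT_LIB_DIRS):
    """Return every libboost_regex* file found in the given directories."""
    found = []
    for name in lib_dirs:
        directory = Path(name)
        if not directory.is_dir():
            # Skip directories that do not exist on this machine.
            continue
        found.extend(sorted(directory.glob("libboost_regex*")))
    return found


if __name__ == "__main__":
    hits = find_boost_regex()
    if hits:
        for path in hits:
            print(f"found {path}")
    else:
        print("libboost_regex not found; libboost-regex-dev may be missing")
```

If nothing is found on Ubuntu, installing libboost-regex-dev (the package named in this issue) is the fix reported in the thread.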
Re: Trying to compile Arrow C++
Sure, please go ahead and open a JIRA. Do you have libboost-regex-dev installed (is /usr/lib/x86_64-linux-gnu/libboost_regex.so present?)?
Re: Trying to compile Arrow C++
So I cloned arrow again in a new directory and it is no longer looking for the local boost install.

It is looking in /usr/include for headers and recognizes now that I have boost 1.58.0.

It still fails though, looking for boost_regex. Here is a new gist.

https://gist.github.com/andygrove/840b5f4d9c500669bbd1de1b84287a0e

Should I go ahead and file a JIRA for this?
Re: Working towards getting 0.9.0 release candidate up next week
Thanks!

--
kou

In "Re: Working towards getting 0.9.0 release candidate up next week" on Thu, 8 Mar 2018 20:44:14 -0500,
Wes McKinney wrote:

> hi Kou -- yes, I think this is a good idea. It will require a little
> bit of work to be able to produce a viable standalone source tarball.
> Between Uwe, Phillip, Antoine, and I, we should be able to come up
> with a plan to do this
>
> - Wes
>
> On Thu, Mar 8, 2018 at 8:33 PM, Kouhei Sutou wrote:
>> Hi,
>>
>>> - Updating pip packages for C++ and Python
>>
>> Can we try adding PyArrow source package to PyPI at the
>> 0.9.0?
>>
>> I want to install PyArrow with Arrow C++ installed by .deb
>> or .rpm. I want to use both Red Arrow (Ruby bindings) and
>> PyArrow in the same process via PyCall (Ruby library
>> to integrate with Python). In the case, I need to use the
>> same Arrow C++ in both Red Arrow and PyArrow.
>>
>> Now, there are only binary packages for PyArrow at
>> https://pypi.python.org/pypi/pyarrow . If there is a source
>> package for PyArrow at PyPI, I can install PyArrow with
>> Arrow C++ installed by .deb or .rpm by "pip --no-binary
>> pyarrow".
>>
>> Red Arrow can also use Arrow C++ installed by .deb or .rpm.
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In "Re: Working towards getting 0.9.0 release candidate up next week" on Thu, 8 Mar 2018 11:25:32 -0800,
>> Siddharth Teotia wrote:
>>
>>> All,
>>>
>>> I plan to get RC out over the weekend or early Monday. Is that fine with
>>> everybody?
>>>
>>> We have 6 items in progress --
>>> https://issues.apache.org/jira/projects/ARROW/versions/12341707#release-report-tab-body.
>>> How do people feel about completing these JIRAs by tomorrow? I am
>>> completely fine with deferring the RC to early next week (Mon/Tue/Wed) if
>>> necessary. Just looking for consensus. Also, I suggest that we defer the
>>> ones with TODO status. I will do it later today unless I hear otherwise.
>>>
>>> I was wondering if anyone else is interested in collaborating for the
>>> post-release tasks. As per
>>> https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md,
>>> following are the high level post-release tasks. Please let me know if you
>>> would like to take up something. I have written my name against some of
>>> them.
>>>
>>> - Updating the Arrow Website (Sidd)
>>> - Uploading release artifacts to SVN -- looks like PMC karma is needed
>>>   to do this
>>> - Announcing release (Sidd)
>>> - Updating website with new API documentation (Sidd)
>>> - Updating pip packages for C++ and Python
>>> - Updating conda packages for C++ and Python (Sidd)
>>> - Updating Java Maven artifacts in Maven central (Sidd)
>>> - Release blog post
>>>
>>> If anything is missing, please add to the above list. It will be helpful
>>> for tracking.
>>>
>>> Thanks,
>>> Sidd
>>>
>>> On Sun, Mar 4, 2018 at 12:34 PM, Wes McKinney wrote:
>>>
>>>> hey Sidd,
>>>>
>>>> The Python backlog is still in pretty rough shape. I'd like to see if
>>>> we can make an RC by Friday but if not we can defer to Monday/Tuesday
>>>> the following week (3/12 or 13). I will trim as much as possible out
>>>> of the current backlog to get things down to the essential
>>>>
>>>> - Wes
>>>>
>>>> On Sun, Feb 25, 2018 at 11:58 AM, Siddharth Teotia wrote:
>>>>> Sounds good.
>>>>>
>>>>> Thanks
>>>>> Sidd
>>>>>
>>>>> On Feb 24, 2018 6:24 PM, "Wes McKinney" wrote:
>>>>>
>>>>> Hi Sidd,
>>>>>
>>>>> I think we have too many bugs to make an RC this coming week. I suggest we
>>>>> defer to the following week.
>>>>>
>>>>> Thanks
>>>>> Wes
>>>>>
>>>>> On Feb 24, 2018 7:09 PM, "Siddharth Teotia" wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> We currently have 10 issues in progress and PRs are available for 8 of
>>>>> them. In interest of getting a release candidate next week, I would request
>>>>> people to review PRs as soon as they can to help make progress and close
>>>>> out as many JIRAs as we can.
>>>>>
>>>>> There are 32 issues in TODO list and 25 of them are not yet assigned. I am
>>>>> planning to defer some of the unassigned ones later today or tomorrow. It
>>>>> would be good to soon grab/assign the issues that people want to be fixed
>>>>> for 0.9.0.
>>>>>
>>>>> Here is the link to backlog:
>>>>> https://issues.apache.org/jira/projects/ARROW/versions/12341707
>>>>>
>>>>> Thanks,
>>>>> Sidd
Re: Working towards getting 0.9.0 release candidate up next week
hi Kou -- yes, I think this is a good idea. It will require a little bit of work to be able to produce a viable standalone source tarball. Between Uwe, Phillip, Antoine, and I, we should be able to come up with a plan to do this.

- Wes
Re: Working towards getting 0.9.0 release candidate up next week
Hi,

>- Updating pip packages for C++ and Python

Can we try adding PyArrow source package to PyPI at the 0.9.0?

I want to install PyArrow with Arrow C++ installed by .deb or .rpm. I want to use both Red Arrow (Ruby bindings) and PyArrow in the same process via PyCall (Ruby library to integrate with Python). In the case, I need to use the same Arrow C++ in both Red Arrow and PyArrow.

Now, there are only binary packages for PyArrow at https://pypi.python.org/pypi/pyarrow . If there is a source package for PyArrow at PyPI, I can install PyArrow with Arrow C++ installed by .deb or .rpm by "pip --no-binary pyarrow".

Red Arrow can also use Arrow C++ installed by .deb or .rpm.

Thanks,
--
kou
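The "pip --no-binary pyarrow" shorthand in the request above can be spelled out as a full command. pip's `--no-binary` option takes a package name (or `:all:`), and the package still has to be named as the requirement to install. The sketch below only builds and prints that command; the helper name `pip_source_install_cmd` is made up for this example, and the command can only succeed once a source distribution of pyarrow is actually published on PyPI, which is what the message asks for.

```python
import sys


def pip_source_install_cmd(package="pyarrow"):
    """Build a pip command that refuses wheels for `package` and installs it from source."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-binary", package,  # forbid binary wheels for this package only
        package,                 # the requirement to install
    ]


if __name__ == "__main__":
    # Print rather than execute: a source build also needs Arrow C++
    # (e.g. from the .deb or .rpm packages) to be installed already.
    print(" ".join(pip_source_install_cmd()))
```

Building from source this way is what would let PyArrow link against the same system-installed Arrow C++ that Red Arrow uses.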
[jira] [Created] (ARROW-2290) [C++/Python] Add ability to set codec options for lz4 codec
Wes McKinney created ARROW-2290:

Summary: [C++/Python] Add ability to set codec options for lz4 codec
Key: ARROW-2290
URL: https://issues.apache.org/jira/browse/ARROW-2290
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Python
Reporter: Wes McKinney

The LZ4 library has many parameters; currently we do not expose these in C++ or Python.
Re: Working towards getting 0.9.0 release candidate up next week
Thanks, Wes. Let's shoot for Monday.
Re: Working towards getting 0.9.0 release candidate up next week
Since almost all of the items in TODO are C++ or Python issues, I can do a final review today to remove anything that isn't absolutely necessary for 0.9.0. We have a couple of nasty bugs still in TODO that we should try to fix -- in the event that they cannot be fixed, we may need to do a 0.9.1 in a week or two. I would suggest we wait to cut the RC until Monday to give enough time for these last items to get fixes in.

There are some other things that need doing, like updates per changes to the ASF checksum policy ARROW-2268.

I can write by EOD today with a status report on the issues in TODO.

I believe you need to be a PMC to undertake the source release process prior to the vote -- I am happy to help with this on Monday.

- Wes
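For context on the checksum item mentioned above: the thread does not say what ARROW-2268 changes, so the following is only a generic sketch of producing a SHA-512 (or SHA-256) checksum file for a release tarball with Python's standard library. The helper names, the `<artifact>.<algorithm>` file naming, and the "digest, two spaces, file name" layout are assumptions for illustration, not the project's actual release scripts.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs are not read into memory at once."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_checksum_file(path, algorithm="sha512"):
    """Write '<digest>  <name>' next to the artifact and return the new file's path."""
    path = Path(path)
    digest = file_digest(path, algorithm)
    out = path.with_name(f"{path.name}.{algorithm}")
    out.write_text(f"{digest}  {path.name}\n")
    return out
```

A release manager would call `write_checksum_file("apache-arrow-0.9.0.tar.gz")` (a hypothetical artifact name) and publish the resulting file alongside the tarball.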
Re: Working towards getting 0.9.0 release candidate up next week
All,

I plan to get RC out over the weekend or early Monday. Is that fine with everybody?

We have 6 items in progress -- https://issues.apache.org/jira/projects/ARROW/versions/12341707#release-report-tab-body. How do people feel about completing these JIRAs by tomorrow? I am completely fine with deferring the RC to early next week (Mon/Tue/Wed) if necessary. Just looking for consensus. Also, I suggest that we defer the ones with TODO status. I will do it later today unless I hear otherwise.

I was wondering if anyone else is interested in collaborating for the post-release tasks. As per https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md, following are the high level post-release tasks. Please let me know if you would like to take up something. I have written my name against some of them.

- Updating the Arrow Website (Sidd)
- Uploading release artifacts to SVN -- looks like PMC karma is needed to do this
- Announcing release (Sidd)
- Updating website with new API documentation (Sidd)
- Updating pip packages for C++ and Python
- Updating conda packages for C++ and Python (Sidd)
- Updating Java Maven artifacts in Maven central (Sidd)
- Release blog post

If anything is missing, please add to the above list. It will be helpful for tracking.

Thanks,
Sidd
[jira] [Created] (ARROW-2289) [GLib] Add Numeric, Integer and FloatingPoint data types
Kouhei Sutou created ARROW-2289:
---

Summary: [GLib] Add Numeric, Integer and FloatingPoint data types
Key: ARROW-2289
URL: https://issues.apache.org/jira/browse/ARROW-2289
Project: Apache Arrow
Issue Type: Improvement
Components: GLib
Affects Versions: 0.8.0
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
Fix For: 0.9.0

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Trying to compile Arrow C++
OK, does moving the ~/git/boost_1_66_0 someplace else (or removing it) make the problem go away? We should open a JIRA to see why the CMake build system is being fooled by that directory and see if it can be fixed

On Thu, Mar 8, 2018 at 10:28 AM, Andy Grove wrote:
> Thanks. Here's the gist. I do not have BOOST env vars set. It does seem to
> be looking for headers in a boost directory parallel to arrow though.
>
> https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c
>
> On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney wrote:
>
>> If you could also run with
>>
>> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
>>
>> that would provide additional debugging help
>>
>> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney wrote:
>> > hi Andy,
>> >
>> > Can you post the complete output of running CMake in a gist or
>> > someplace for us to have a look? Do you have any BOOST_* environment
>> > variables set?
>> >
>> > Thanks
>> > Wes
>> >
>> > On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove wrote:
>> >> So I'm following the instructions and installed the binary dependencies,
>> >> including libboost-dev. I see boost headers in /usr/include/boost. I'm
>> >> using Ubuntu 16.04.
>> >>
>> >> In the Arrow cpp directory, I ran:
>> >>
>> >> cmake -G "Unix Makefiles"
>> >>
>> >> I get this output:
>> >>
>> >> Unable to find the requested Boost libraries.
>> >>
>> >> Boost version: 0.0.0
>> >>
>> >> Boost include path: /home/andy/git/boost_1_66_0
>> >>
>> >> Could not find the following Boost libraries:
>> >>
>> >> boost_regex
>> >>
>> >> Some (but not all) of the required Boost libraries were found. You may
>> >> need to install these additional Boost libraries. Alternatively, set
>> >> BOOST_LIBRARYDIR to the directory containing Boost libraries or
>> >> BOOST_ROOT to the location of Boost.
>> >>
>> >> I also tried installing boost headers and going that route but ran into
>> >> different problems.
>> >>
>> >> I'd appreciate some guidance.
>> >>
>> >> Thanks,
>> >>
>> >> Andy.
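
The two checks raised in this thread (are any BOOST_* environment variables set, and is there a Boost tree sitting next to the arrow checkout) can be scripted. What follows is only a rough sketch under stated assumptions: the helper names `boost_env_vars` and `sibling_boost_dirs` are made up for illustration and are not part of Arrow or CMake, and the `boost*` directory pattern is a guess based on the `~/git/boost_1_66_0` path above. It does not explain why CMake picked that directory up; it only lists the candidates.

```python
import os
from pathlib import Path


def boost_env_vars(environ=os.environ):
    """Return the BOOST_* variables found in the given environment mapping."""
    return {k: v for k, v in environ.items() if k.startswith("BOOST_")}


def sibling_boost_dirs(source_dir):
    """List directories named boost* that sit next to the given checkout.

    Assumption: a stray Boost source tree would be named like boost_1_66_0.
    """
    parent = Path(source_dir).resolve().parent
    return sorted(p for p in parent.glob("boost*") if p.is_dir())


if __name__ == "__main__":
    # Assumes the script is run from the root of the arrow checkout.
    print("BOOST_* variables:", boost_env_vars() or "none set")
    for path in sibling_boost_dirs("."):
        print("possible stray Boost tree:", path)
```

Passing the environment in as a parameter keeps the first helper easy to try against a hand-written dict instead of the real shell environment.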
Re: Trying to compile Arrow C++
Thanks. Here's the gist. I do not have BOOST env vars set. It does seem to be looking for headers in a boost directory parallel to arrow though.

https://gist.github.com/andygrove/8abfa027fa29fb9f31efeab90043682c

On Thu, Mar 8, 2018 at 8:22 AM, Wes McKinney wrote:
> If you could also run with
>
> -DARROW_VERBOSE_THIRDPARTY_BUILD=ON
>
> that would provide additional debugging help
>
> On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney wrote:
> > hi Andy,
> >
> > Can you post the complete output of running CMake in a gist or
> > someplace for us to have a look? Do you have any BOOST_* environment
> > variables set?
> >
> > Thanks
> > Wes
> >
> > On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove wrote:
> >> So I'm following the instructions and installed the binary dependencies,
> >> including libboost-dev. I see boost headers in /usr/include/boost. I'm
> >> using Ubuntu 16.04.
> >>
> >> In the Arrow cpp directory, I ran:
> >>
> >> cmake -G "Unix Makefiles"
> >>
> >> I get this output:
> >>
> >> Unable to find the requested Boost libraries.
> >>
> >> Boost version: 0.0.0
> >>
> >> Boost include path: /home/andy/git/boost_1_66_0
> >>
> >> Could not find the following Boost libraries:
> >>
> >> boost_regex
> >>
> >> Some (but not all) of the required Boost libraries were found. You may
> >> need to install these additional Boost libraries. Alternatively, set
> >> BOOST_LIBRARYDIR to the directory containing Boost libraries or
> >> BOOST_ROOT to the location of Boost.
> >>
> >> I also tried installing boost headers and going that route but ran into
> >> different problems.
> >>
> >> I'd appreciate some guidance.
> >>
> >> Thanks,
> >>
> >> Andy.
Re: Trying to compile Arrow C++
If you could also run with

-DARROW_VERBOSE_THIRDPARTY_BUILD=ON

that would provide additional debugging help

On Thu, Mar 8, 2018 at 10:17 AM, Wes McKinney wrote:
> hi Andy,
>
> Can you post the complete output of running CMake in a gist or
> someplace for us to have a look? Do you have any BOOST_* environment
> variables set?
>
> Thanks
> Wes
>
> On Thu, Mar 8, 2018 at 10:12 AM, Andy Grove wrote:
>> So I'm following the instructions and installed the binary dependencies,
>> including libboost-dev. I see boost headers in /usr/include/boost. I'm
>> using Ubuntu 16.04.
>>
>> In the Arrow cpp directory, I ran:
>>
>> cmake -G "Unix Makefiles"
>>
>> I get this output:
>>
>> Unable to find the requested Boost libraries.
>>
>> Boost version: 0.0.0
>>
>> Boost include path: /home/andy/git/boost_1_66_0
>>
>> Could not find the following Boost libraries:
>>
>> boost_regex
>>
>> Some (but not all) of the required Boost libraries were found. You may
>> need to install these additional Boost libraries. Alternatively, set
>> BOOST_LIBRARYDIR to the directory containing Boost libraries or BOOST_ROOT
>> to the location of Boost.
>>
>> I also tried installing boost headers and going that route but ran into
>> different problems.
>>
>> I'd appreciate some guidance.
>>
>> Thanks,
>>
>> Andy.
Trying to compile Arrow C++
So I'm following the instructions and installed the binary dependencies, including libboost-dev. I see boost headers in /usr/include/boost. I'm using Ubuntu 16.04.

In the Arrow cpp directory, I ran:

cmake -G "Unix Makefiles"

I get this output:

Unable to find the requested Boost libraries.

Boost version: 0.0.0

Boost include path: /home/andy/git/boost_1_66_0

Could not find the following Boost libraries:

boost_regex

Some (but not all) of the required Boost libraries were found. You may need to install these additional Boost libraries. Alternatively, set BOOST_LIBRARYDIR to the directory containing Boost libraries or BOOST_ROOT to the location of Boost.

I also tried installing boost headers and going that route but ran into different problems.

I'd appreciate some guidance.

Thanks,

Andy.
Re: Introducing myself
Hi Krisztian,

Yes, I'd love to team up. Thanks for the link ... I had been looking at a different Rust Arrow project.

Once I have arrow building I will let you know.

Thanks,

Andy.

On Wed, Mar 7, 2018 at 8:41 AM, Krisztián Szűcs wrote:
> Hey Andy!
>
> In the last couple of days I was digging arrow and iron-arrow
> (https://github.com/jihoonson/iron-arrow)
> in order to create a rust impl for arrow.
> My background is mostly p[c]ythonic, so I'd gladly team up if You are
> interested.
>
> Krisztian
> On Mar 7 2018, at 4:25 pm, Andy Grove wrote:
> >
> > Hi,
> > I just wanted to introduce myself to the group before I start asking lots
> > of questions. I'm a software engineer mostly working with
> > Scala/Spark/Kudu/Parquet in my day job and in my spare time I have been
> > working on a POC of a distributed data platform implemented in Rust. The
> > project is called DataFusion (https://www.datafusion.rs/).
> >
> > The project is very early and the implementation is currently very simple
> > row-based processing but the performance is already quite exciting to me
> > (current test case is 4x faster than Apache Spark).
> >
> > I have decided that I should now concentrate on making Apache Arrow the
> > native memory format so that I can implement more efficient data
> > processing and make it easier in the future to be able to integrate with
> > things like Kudu and Parquet. It's also just a great way for me to learn
> > about columnar-processing.
> >
> > I'm just in the process of getting Arrow compiling and reading the docs.
> > I'll be back soon with questions I'm sure.
> >
> > Thanks,
> > Andy.
[jira] [Created] (ARROW-2288) [Python] slicing logic defective
Antoine Pitrou created ARROW-2288:
---

Summary: [Python] slicing logic defective
Key: ARROW-2288
URL: https://issues.apache.org/jira/browse/ARROW-2288
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.8.0
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou

The slicing logic tends to go too far when normalizing large negative bounds, which leads to results not in line with Python's slicing semantics:

{code}
>>> arr = pa.array([1,2,3,4])
>>> arr[-99:100]
[
  2,
  3,
  4
]
>>> arr.to_pylist()[-99:100]
[1, 2, 3, 4]
>>>
>>>
>>> arr[-6:-5]
[
  3
]
>>> arr.to_pylist()[-6:-5]
[]
{code}

Also note this crash:

{code}
>>> arr[10:13]
/home/antoine/arrow/cpp/src/arrow/array.cc:105 Check failed: (offset) <= (data.length)
Abandon (core dumped)
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
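
For reference, the clamping that Python itself applies to slice bounds is available through the built-in `slice.indices` method. The sketch below uses it to turn the three bound pairs from this report into an (offset, count) pair. It is only an illustration of the expected semantics: `normalize_slice` is a hypothetical helper, it handles step 1 only, and it is not pyarrow's implementation or the eventual fix.

```python
def normalize_slice(start, stop, length):
    """Clamp slice bounds the way Python sequences do (step 1 only).

    Hypothetical helper for illustration; not pyarrow's implementation.
    Returns an (offset, count) pair.
    """
    # slice.indices applies Python's rules for negative and out-of-range
    # bounds, so begin and end always land inside [0, length].
    begin, end, _ = slice(start, stop, 1).indices(length)
    return begin, max(end - begin, 0)


values = [1, 2, 3, 4]
for start, stop in [(-99, 100), (-6, -5), (10, 13)]:
    offset, count = normalize_slice(start, stop, len(values))
    # The normalized pair selects the same elements as plain list slicing.
    assert values[offset:offset + count] == values[start:stop]
    print((start, stop), "->", (offset, count))
```

With bounds clamped this way the offset never exceeds the sequence length, which is the condition the failing check in the crash above tests for.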