Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
Hmm it seems that the PyArrow wheel doesn’t actually install on my Mac. Sorry I didn’t report any source testing issues since my environment is pretty messed up.

(arrowvenv) (base) karlkatzen@chloes venv % python3
Python 3.8.3 (default, Jul 2 2020, 11:26:31)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow as pa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/karlkatzen/Documents/code/venv/arrowvenv/lib/python3.8/site-packages/pyarrow/__init__.py", line 63, in <module>
    import pyarrow.lib as _lib
ImportError: dlopen(/Users/karlkatzen/Documents/code/venv/arrowvenv/lib/python3.8/site-packages/pyarrow/lib.cpython-38-darwin.so, 2): Symbol not found: __ZN5arrow10StopSource5tokenEv
  Referenced from: /Users/karlkatzen/Documents/code/venv/arrowvenv/lib/python3.8/site-packages/pyarrow/lib.cpython-38-darwin.so
  Expected in: /usr/local/lib/libarrow.400.dylib
 in /Users/karlkatzen/Documents/code/venv/arrowvenv/lib/python3.8/site-packages/pyarrow/lib.cpython-38-darwin.so

What is __ZN5arrow10StopSource5tokenEv?

> On Apr 27, 2021, at 11:23 PM, Micah Kornfield wrote:
>
> Oh, nice, I thought they just missed the cutoff.
>
> On Tue, Apr 27, 2021 at 8:19 PM Ying Zhou wrote:
>
>> They actually did.
>>
>> Ying
>>
>>> On Apr 27, 2021, at 11:11 PM, Micah Kornfield wrote:
>>>
>>> Did the ORC additions actually make it into 4.0?
>>>
>>> On Tue, Apr 27, 2021 at 7:55 PM Ying Zhou wrote:
>>> Sure. I just added some info about the ORC writer. I think we need to update the documentation in both C++ and Python as well to include ORC.
>> I will do it. Ying
> On Apr 27, 2021, at 5:28 PM, Neal Richardson < neal.p.richard...@gmail.com> wrote:
>
> 4.0 blog post is still pretty bare and could use some help filling in:
> https://github.com/apache/arrow-site/pull/104
>
> Thanks,
> Neal
>
> On Tue, Apr 27, 2021 at 1:55 PM Sutou Kouhei wrote:
>
>> The remaining tasks:
>>
>> 3.
[in-pr|Kou] upload binaries >> https://github.com/apache/arrow/pull/10172 >> 10. [Uwe] update conda recipes >> 12. [in-pr|Ian] update homebrew packages >> https://github.com/Homebrew/homebrew-core/pull/76060 >> >> I updated versions on JIRA: >> >> * >> >> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-Markingthereleasedversionas%22RELEASED%22onJIRA >> * >> >> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-StartingthenewversiononJIRA >> >> In >> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 >> 10:37:04 -0500, >> Paul Taylor wrote: >> >>> JS packages have been uploaded. >>> >>> Paul >>> >>> On 4/27/21 9:47 AM, Neal Richardson wrote: R package has been accepted by CRAN. Neal On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs wrote: > I've just opened a PR with the updated documentation. > > The remaining tasks: > > 3. [in-pr|Kou] upload binaries > 6. [Paul] upload js packages > 10. [Uwe] update conda recipes > 12. [todo] update homebrew packages > 14. [Kou] update msys2 > 15. [Neal] update R packages > 16. [in-pr|Krisztian] update docs > > On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs > wrote: >> On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor < ptaylor.apa...@gmail.com >>> > wrote: >>> These look like the errors resolved in >>> https://github.com/apache/arrow/pull/10156. Can we cherry-pick that >>> commit to the release branch? >> Great, I'll cherry-pick that commit. >> >> Could you please release the JS packages to npm? I think the >> lerna.json needs to be updated before npm publish. >> >> Thank Paul! >>> >>> On 4/27/21 7:04 AM, Krisztián Szűcs wrote: I'd need some help to both release the JS packages using the new > lerna configuration and to fix the JS documentation generation [1]. We should backport these changes to the release-4.0.0 branch. 
[1]: > >> >> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei < >> k...@clear-code.com> > wrote: > I'll also update MSYS2 packages: > > 1. [x] open a pull request to bump the version numbers in the > source code > 2. [x] upload source >
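On the question in the message above ("What is __ZN5arrow10StopSource5tokenEv?"): the name follows the Itanium C++ ABI mangling scheme, and macOS adds one extra leading underscore to symbol names. A tool such as c++filt would normally decode it; the sketch below is only a toy decoder, written for illustration, that handles plain nested names with no arguments. The function name demangle_simple is hypothetical and not part of Arrow or any library.

```python
import re


def demangle_simple(symbol: str) -> str:
    """Decode a plain nested Itanium-ABI name such as _ZN5arrow10StopSource5tokenEv.

    Toy decoder (an illustration, not a full demangler): it only understands
    _ZN <length-prefixed names> E v, i.e. a nested name taking no arguments.
    """
    # macOS prefixes every symbol with one extra underscore.
    if symbol.startswith("__Z"):
        symbol = symbol[1:]
    if not symbol.startswith("_ZN"):
        raise ValueError("not a simple nested mangled name")

    pos = 3  # skip "_ZN"
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError(f"unsupported construct at offset {pos}")
        length = int(match.group())
        pos += len(match.group())
        parts.append(symbol[pos:pos + length])  # one namespace/class/function name
        pos += length

    # After the closing "E", a lone "v" means an empty (void) parameter list.
    if symbol[pos + 1:] != "v":
        raise ValueError("only parameterless functions are supported")
    return "::".join(parts) + "()"


print(demangle_simple("__ZN5arrow10StopSource5tokenEv"))
```

The symbol decodes to arrow::StopSource::token(). The loader output says it expected to find that symbol in /usr/local/lib/libarrow.400.dylib, so that library, rather than anything inside the virtualenv, is the one to inspect.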
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
Oh, nice, I thought they just missed the cutoff. On Tue, Apr 27, 2021 at 8:19 PM Ying Zhou wrote: > They actually did. > > Ying > > > On Apr 27, 2021, at 11:11 PM, Micah Kornfield > wrote: > > > > Did the ORC additions actually make it into 4.0? > > > > On Tue, Apr 27, 2021 at 7:55 PM Ying Zhou wrote: > > > >> Sure. I just added some info about the ORC writer. I think we need to > >> update the documentation in both C++ and Python as well to include ORC. > I > >> will do it. > >> > >> Ying > >> > >>> On Apr 27, 2021, at 5:28 PM, Neal Richardson < > >> neal.p.richard...@gmail.com> wrote: > >>> > >>> 4.0 blog post is still pretty bare and could use some help filling in: > >>> https://github.com/apache/arrow-site/pull/104 > >>> > >>> Thanks, > >>> Neal > >>> > >>> On Tue, Apr 27, 2021 at 1:55 PM Sutou Kouhei > wrote: > >>> > The remaining tasks: > > 3. [in-pr|Kou] upload binaries > https://github.com/apache/arrow/pull/10172 > 10. [Uwe] update conda recipes > 12. [in-pr|Ian] update homebrew packages > https://github.com/Homebrew/homebrew-core/pull/76060 > > I updated versions on JIRA: > > * > > >> > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-Markingthereleasedversionas%22RELEASED%22onJIRA > * > > >> > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-StartingthenewversiononJIRA > > In > "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 > 10:37:04 -0500, > Paul Taylor wrote: > > > JS packages have been uploaded. > > > > Paul > > > > On 4/27/21 9:47 AM, Neal Richardson wrote: > >> R package has been accepted by CRAN. > >> > >> Neal > >> > >> On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs > >> > >> wrote: > >> > >>> I've just opened a PR with the updated documentation. > >>> > >>> The remaining tasks: > >>> > >>> 3. [in-pr|Kou] upload binaries > >>> 6. [Paul] upload js packages > >>> 10. [Uwe] update conda recipes > >>> 12. 
[todo] update homebrew packages > >>> 14. [Kou] update msys2 > >>> 15. [Neal] update R packages > >>> 16. [in-pr|Krisztian] update docs > >>> > >>> On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs > >>> wrote: > On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor < > >> ptaylor.apa...@gmail.com > > > >>> wrote: > > These look like the errors resolved in > > https://github.com/apache/arrow/pull/10156. Can we cherry-pick > >> that > > commit to the release branch? > Great, I'll cherry-pick that commit. > > Could you please release the JS packages to npm? I think the > lerna.json needs to be updated before npm publish. > > Thank Paul! > > > > On 4/27/21 7:04 AM, Krisztián Szűcs wrote: > >> I'd need some help to both release the JS packages using the new > >>> lerna > >> configuration and to fix the JS documentation generation [1]. We > >> should backport these changes to the release-4.0.0 branch. > >> > >> [1]: > >>> > > >> > https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 > >> On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei < > k...@clear-code.com> > >>> wrote: > >>> I'll also update MSYS2 packages: > >>> > >>> 1. [x] open a pull request to bump the version numbers in the > >>> source code > >>> 2. [x] upload source > >>> 3. [kou] upload binaries > >>> 4. [x] update website > >>> 5. [x] upload ruby gems > >>> 6. [ ] upload js packages > >>> 8. [x] upload C# packages > >>> 9. [x] upload rust crates > >>> 10. [ ] update conda recipes > >>> 11. [x] upload wheels/sdist to pypi > >>> 12. [ ] update homebrew packages > >>> 13. [x] update maven artifacts > >>> 14. [kou] update msys2 > >>> 15. [nealrichardson] update R packages > >>> 16. 
[ ] update docs > >>> > >>> In >>> r...@mail.gmail.com> > >>> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr > >>> 2021 01:48:37 +0200, > >>> Krisztián Szűcs wrote: > >>> > On Tue, Apr 27, 2021 at 1:05 AM Andy Grove < > >> andygrov...@gmail.com > > > >>> wrote: > > The following Rust crates have been published: arrow, > >>> arrow-flight, parquet, parquet_derive, datafusion > Thanks Andy! > > The current status is: > 1. [x] open a pull request to bump the version numbers in the > >>> source code > 2. [x] upload source > 3. [kou]
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
They actually did. Ying > On Apr 27, 2021, at 11:11 PM, Micah Kornfield wrote: > > Did the ORC additions actually make it into 4.0? > > On Tue, Apr 27, 2021 at 7:55 PM Ying Zhou wrote: > >> Sure. I just added some info about the ORC writer. I think we need to >> update the documentation in both C++ and Python as well to include ORC. I >> will do it. >> >> Ying >> >>> On Apr 27, 2021, at 5:28 PM, Neal Richardson < >> neal.p.richard...@gmail.com> wrote: >>> >>> 4.0 blog post is still pretty bare and could use some help filling in: >>> https://github.com/apache/arrow-site/pull/104 >>> >>> Thanks, >>> Neal >>> >>> On Tue, Apr 27, 2021 at 1:55 PM Sutou Kouhei wrote: >>> The remaining tasks: 3. [in-pr|Kou] upload binaries https://github.com/apache/arrow/pull/10172 10. [Uwe] update conda recipes 12. [in-pr|Ian] update homebrew packages https://github.com/Homebrew/homebrew-core/pull/76060 I updated versions on JIRA: * >> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-Markingthereleasedversionas%22RELEASED%22onJIRA * >> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-StartingthenewversiononJIRA In "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 10:37:04 -0500, Paul Taylor wrote: > JS packages have been uploaded. > > Paul > > On 4/27/21 9:47 AM, Neal Richardson wrote: >> R package has been accepted by CRAN. >> >> Neal >> >> On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs >> >> wrote: >> >>> I've just opened a PR with the updated documentation. >>> >>> The remaining tasks: >>> >>> 3. [in-pr|Kou] upload binaries >>> 6. [Paul] upload js packages >>> 10. [Uwe] update conda recipes >>> 12. [todo] update homebrew packages >>> 14. [Kou] update msys2 >>> 15. [Neal] update R packages >>> 16. 
[in-pr|Krisztian] update docs >>> >>> On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs >>> wrote: On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor < >> ptaylor.apa...@gmail.com > >>> wrote: > These look like the errors resolved in > https://github.com/apache/arrow/pull/10156. Can we cherry-pick >> that > commit to the release branch? Great, I'll cherry-pick that commit. Could you please release the JS packages to npm? I think the lerna.json needs to be updated before npm publish. Thank Paul! > > On 4/27/21 7:04 AM, Krisztián Szűcs wrote: >> I'd need some help to both release the JS packages using the new >>> lerna >> configuration and to fix the JS documentation generation [1]. We >> should backport these changes to the release-4.0.0 branch. >> >> [1]: >>> >> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 >> On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei >>> wrote: >>> I'll also update MSYS2 packages: >>> >>> 1. [x] open a pull request to bump the version numbers in the >>> source code >>> 2. [x] upload source >>> 3. [kou] upload binaries >>> 4. [x] update website >>> 5. [x] upload ruby gems >>> 6. [ ] upload js packages >>> 8. [x] upload C# packages >>> 9. [x] upload rust crates >>> 10. [ ] update conda recipes >>> 11. [x] upload wheels/sdist to pypi >>> 12. [ ] update homebrew packages >>> 13. [x] update maven artifacts >>> 14. [kou] update msys2 >>> 15. [nealrichardson] update R packages >>> 16. [ ] update docs >>> >>> In >> r...@mail.gmail.com> >>> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr >>> 2021 01:48:37 +0200, >>> Krisztián Szűcs wrote: >>> On Tue, Apr 27, 2021 at 1:05 AM Andy Grove < >> andygrov...@gmail.com > >>> wrote: > The following Rust crates have been published: arrow, >>> arrow-flight, parquet, parquet_derive, datafusion Thanks Andy! The current status is: 1. 
[x] open a pull request to bump the version numbers in the >>> source code 2. [x] upload source 3. [kou] upload binaries 4. [x] update website 5. [x] upload ruby gems 6. [ ] upload js packages 8. [x] upload C# packages 9. [x] upload rust crates 10. [ ] update conda recipes 11. [x] upload wheels/sdist to pypi 12. [ ] update homebrew packages 13. [x] update maven
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
Did the ORC additions actually make it into 4.0? On Tue, Apr 27, 2021 at 7:55 PM Ying Zhou wrote: > Sure. I just added some info about the ORC writer. I think we need to > update the documentation in both C++ and Python as well to include ORC. I > will do it. > > Ying > > > On Apr 27, 2021, at 5:28 PM, Neal Richardson < > neal.p.richard...@gmail.com> wrote: > > > > 4.0 blog post is still pretty bare and could use some help filling in: > > https://github.com/apache/arrow-site/pull/104 > > > > Thanks, > > Neal > > > > On Tue, Apr 27, 2021 at 1:55 PM Sutou Kouhei wrote: > > > >> The remaining tasks: > >> > >> 3. [in-pr|Kou] upload binaries > >>https://github.com/apache/arrow/pull/10172 > >> 10. [Uwe] update conda recipes > >> 12. [in-pr|Ian] update homebrew packages > >>https://github.com/Homebrew/homebrew-core/pull/76060 > >> > >> I updated versions on JIRA: > >> > >> * > >> > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-Markingthereleasedversionas%22RELEASED%22onJIRA > >> * > >> > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-StartingthenewversiononJIRA > >> > >> In > >> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 > >> 10:37:04 -0500, > >> Paul Taylor wrote: > >> > >>> JS packages have been uploaded. > >>> > >>> Paul > >>> > >>> On 4/27/21 9:47 AM, Neal Richardson wrote: > R package has been accepted by CRAN. > > Neal > > On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs > > wrote: > > > I've just opened a PR with the updated documentation. > > > > The remaining tasks: > > > > 3. [in-pr|Kou] upload binaries > > 6. [Paul] upload js packages > > 10. [Uwe] update conda recipes > > 12. [todo] update homebrew packages > > 14. [Kou] update msys2 > > 15. [Neal] update R packages > > 16. 
[in-pr|Krisztian] update docs > > > > On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs > > wrote: > >> On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor < > ptaylor.apa...@gmail.com > >>> > > wrote: > >>> These look like the errors resolved in > >>> https://github.com/apache/arrow/pull/10156. Can we cherry-pick > that > >>> commit to the release branch? > >> Great, I'll cherry-pick that commit. > >> > >> Could you please release the JS packages to npm? I think the > >> lerna.json needs to be updated before npm publish. > >> > >> Thank Paul! > >>> > >>> On 4/27/21 7:04 AM, Krisztián Szűcs wrote: > I'd need some help to both release the JS packages using the new > > lerna > configuration and to fix the JS documentation generation [1]. We > should backport these changes to the release-4.0.0 branch. > > [1]: > > > >> > https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 > On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei > > wrote: > > I'll also update MSYS2 packages: > > > > 1. [x] open a pull request to bump the version numbers in the > > source code > > 2. [x] upload source > > 3. [kou] upload binaries > > 4. [x] update website > > 5. [x] upload ruby gems > > 6. [ ] upload js packages > > 8. [x] upload C# packages > > 9. [x] upload rust crates > > 10. [ ] update conda recipes > > 11. [x] upload wheels/sdist to pypi > > 12. [ ] update homebrew packages > > 13. [x] update maven artifacts > > 14. [kou] update msys2 > > 15. [nealrichardson] update R packages > > 16. [ ] update docs > > > > In > r...@mail.gmail.com> > >"Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr > > 2021 01:48:37 +0200, > >Krisztián Szűcs wrote: > > > >> On Tue, Apr 27, 2021 at 1:05 AM Andy Grove < > andygrov...@gmail.com > >>> > > wrote: > >>> The following Rust crates have been published: arrow, > > arrow-flight, parquet, parquet_derive, datafusion > >> Thanks Andy! 
> >> > >> The current status is: > >> 1. [x] open a pull request to bump the version numbers in the > > source code > >> 2. [x] upload source > >> 3. [kou] upload binaries > >> 4. [x] update website > >> 5. [x] upload ruby gems > >> 6. [ ] upload js packages > >> 8. [x] upload C# packages > >> 9. [x] upload rust crates > >> 10. [ ] update conda recipes > >> 11. [x] upload wheels/sdist to pypi > >> 12. [ ] update homebrew packages > >> 13. [x] update maven artifacts > >> 14. [ ] update msys2 > >> 15. [nealrichardson] update R packages > >> 16. [ ] update docs >
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
Sure. I just added some info about the ORC writer. I think we need to update the documentation in both C++ and Python as well to include ORC. I will do it. Ying > On Apr 27, 2021, at 5:28 PM, Neal Richardson > wrote: > > 4.0 blog post is still pretty bare and could use some help filling in: > https://github.com/apache/arrow-site/pull/104 > > Thanks, > Neal > > On Tue, Apr 27, 2021 at 1:55 PM Sutou Kouhei wrote: > >> The remaining tasks: >> >> 3. [in-pr|Kou] upload binaries >>https://github.com/apache/arrow/pull/10172 >> 10. [Uwe] update conda recipes >> 12. [in-pr|Ian] update homebrew packages >>https://github.com/Homebrew/homebrew-core/pull/76060 >> >> I updated versions on JIRA: >> >> * >> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-Markingthereleasedversionas%22RELEASED%22onJIRA >> * >> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-StartingthenewversiononJIRA >> >> In >> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 >> 10:37:04 -0500, >> Paul Taylor wrote: >> >>> JS packages have been uploaded. >>> >>> Paul >>> >>> On 4/27/21 9:47 AM, Neal Richardson wrote: R package has been accepted by CRAN. Neal On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs wrote: > I've just opened a PR with the updated documentation. > > The remaining tasks: > > 3. [in-pr|Kou] upload binaries > 6. [Paul] upload js packages > 10. [Uwe] update conda recipes > 12. [todo] update homebrew packages > 14. [Kou] update msys2 > 15. [Neal] update R packages > 16. [in-pr|Krisztian] update docs > > On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs > wrote: >> On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor >> > wrote: >>> These look like the errors resolved in >>> https://github.com/apache/arrow/pull/10156. Can we cherry-pick that >>> commit to the release branch? >> Great, I'll cherry-pick that commit. >> >> Could you please release the JS packages to npm? 
I think the >> lerna.json needs to be updated before npm publish. >> >> Thank Paul! >>> >>> On 4/27/21 7:04 AM, Krisztián Szűcs wrote: I'd need some help to both release the JS packages using the new > lerna configuration and to fix the JS documentation generation [1]. We should backport these changes to the release-4.0.0 branch. [1]: > >> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei > wrote: > I'll also update MSYS2 packages: > > 1. [x] open a pull request to bump the version numbers in the > source code > 2. [x] upload source > 3. [kou] upload binaries > 4. [x] update website > 5. [x] upload ruby gems > 6. [ ] upload js packages > 8. [x] upload C# packages > 9. [x] upload rust crates > 10. [ ] update conda recipes > 11. [x] upload wheels/sdist to pypi > 12. [ ] update homebrew packages > 13. [x] update maven artifacts > 14. [kou] update msys2 > 15. [nealrichardson] update R packages > 16. [ ] update docs > > In r...@mail.gmail.com> >"Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr > 2021 01:48:37 +0200, >Krisztián Szűcs wrote: > >> On Tue, Apr 27, 2021 at 1:05 AM Andy Grove >> > wrote: >>> The following Rust crates have been published: arrow, > arrow-flight, parquet, parquet_derive, datafusion >> Thanks Andy! >> >> The current status is: >> 1. [x] open a pull request to bump the version numbers in the > source code >> 2. [x] upload source >> 3. [kou] upload binaries >> 4. [x] update website >> 5. [x] upload ruby gems >> 6. [ ] upload js packages >> 8. [x] upload C# packages >> 9. [x] upload rust crates >> 10. [ ] update conda recipes >> 11. [x] upload wheels/sdist to pypi >> 12. [ ] update homebrew packages >> 13. [x] update maven artifacts >> 14. [ ] update msys2 >> 15. [nealrichardson] update R packages >> 16. 
[ ] update docs >>> On Mon, Apr 26, 2021 at 4:34 PM Andy Grove < >> andygrov...@gmail.com> > wrote: Yes, I can handle the Rust release. On Mon, Apr 26, 2021, 4:17 PM Krisztián Szűcs < > szucs.kriszt...@gmail.com> wrote: > @Andy Grove could you please handle the rust release? > > On Mon, Apr 26, 2021 at 11:51 PM Krisztián Szűcs > wrote: >>
Arrow sync call April 28 at 12:00 US/Eastern, 16:00 UTC
Hi all, Our biweekly call is coming up tomorrow at https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will be shared with the mailing list afterward. Neal
Re: [Format][RFC] Introduce COMPLEX type for IntervalUnit
Thanks Micah, I commented in the PR. Once we've settled on the details we can come up with an implementation / vote plan.

On Tue, Apr 27, 2021 at 1:12 PM Micah Kornfield wrote:
> To nudge this along I opened up https://github.com/apache/arrow/pull/10177
>
> Comments welcome.
>
> On Sun, Apr 11, 2021 at 9:38 PM Micah Kornfield wrote:
>
>> If there are no more comments on this maybe we should update the original
>> RFC PR and ensure we are OK with it in principle (Dmitry do you want to do
>> this or should we start a new PR)? I can try to work on the C++/Python and
>> Java code in the next few weeks.
>>
>> On Sun, Apr 4, 2021 at 1:35 PM Micah Kornfield wrote:
>>
>>> Looking more at the Postgres spec and storage details, I'd be supportive of having a COMPLEX interval type which could be a packed type (possibly using the same 16-byte storage layout as Postgres -- depending on whether this complex interval needs granularity smaller than seconds, more analysis needed)
>>>
>>> I agree we seem to be coalescing towards this representation which IIUC
>>> is (32 bit months, 32 bit days, 64 bit for Seconds + fractional seconds).
>>> I think the main questions here are:
>>>
>>> 1. Configurable sub-second granularity (I would again lean no and just
>>> say nanoseconds, since this is the highest granularity other Arrow time
>>> types support).
>>> 2. Range compatibility with other interval types (if we fix at
>>> nanoseconds IIUC there are some extreme values that couldn't be converted
>>> to Arrow). I would guess that these extrema are rare enough that this
>>> should not be an issue.
>>>
>>> I think that adding an entry to the IntervalUnit enum doesn't pose any forward compatibility problems (because implementations _should_ be able to recognize an unsupported unit?)
>>>
>>> I also agree with this, I can do an audit before an official proposal of
>>> Java and C++.
>>>
>>> -Micah
>>>
>>> On Sun, Apr 4, 2021 at 11:10 AM Wes McKinney wrote:
>>>
>>> Looking more at the Postgres spec and storage details, I'd be supportive of having a COMPLEX interval type which could be a packed type (possibly using the same 16-byte storage layout as Postgres -- depending on whether this complex interval needs granularity smaller than seconds, more analysis needed).
>>>
>>> I think that adding an entry to the IntervalUnit enum doesn't pose any forward compatibility problems (because implementations _should_ be able to recognize an unsupported unit?).
>>>
>>> It isn't wonderful to have 3 different types of intervals (ideally just SIMPLE and COMPLEX), but it seems like something we can live with. If we're going to do this, we should try to do it properly now since I would probably object at adding a 4th interval type if the need for it came up in the future.

On Fri, Apr 2, 2021 at 5:37 PM Micah Kornfield wrote:
> > However it seems a little unfortunate that there is no way to represent a
> > "common" interval like "1 week and 1 hour" with native arrow types
>
> I might have misunderstood, but at least in postgres, I thought this boils
> down to "0 months, 7 days, 3600 seconds". Since months is 0, this seems
> like it fits squarely in the existing interval type Days_Millis.
>
> I thought what can't be represented today is "1 Year 1 Hour". It seems
> like none of the proposals so far cover weeks as an explicit type?
>
> On Fri, Apr 2, 2021 at 2:42 PM Andrew Lamb wrote:
>
> > I think it is plausible that we use Arrow structs to create a synthetic
> > interval type for DataFusion (I don't have a compelling use case to store
> > the intervals themselves, or to expose them outside of DataFusion).
> >
> > However it seems a little unfortunate that there is no way to represent a
> > "common" interval like "1 week and 1 hour" with native arrow types
> >
> > On Fri, Apr 2, 2021 at 4:38 PM Micah Kornfield < emkornfi...@gmail.com> wrote:
> >
> >> The real use case I have is "postgres compatibility"
> >>
> >> Yeah, I'm a little conflicted on this. A broader analysis might be
> >> necessary and I'd welcome others' thoughts, but at what point should we
> >> mostly consider the type system closed? Should we be aiming for full
> >> parity with ANSI SQL/Postgres SQL or something else?
> >>
> >>> I have no known need for the actual postgres timestamp internal
> >>> representation.
> >>
> >> I suppose there is an edge case that the seconds range is larger for
> >> microseconds compared to nanoseconds with the simple representation. But
> >> that seems minor.
> >>
> >> On Fri, Apr 2, 2021 at 1:25 PM Andrew
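The representation the thread is converging on (32-bit months, 32-bit days, 64-bit nanoseconds) can be sketched as a 16-byte packed value. This is only an illustration of the layout being discussed, not a finalized Arrow format: the field order, the little-endian byte order, and the helper names pack_interval and unpack_interval are assumptions made here.

```python
import struct

# Assumed layout (from the discussion, not a ratified spec):
# little-endian int32 months, int32 days, int64 nanoseconds = 16 bytes.
LAYOUT = struct.Struct("<iiq")

NANOS_PER_SECOND = 1_000_000_000


def pack_interval(months: int, days: int, nanos: int) -> bytes:
    """Pack the three interval fields into one 16-byte value."""
    return LAYOUT.pack(months, days, nanos)


def unpack_interval(raw: bytes) -> tuple:
    """Return (months, days, nanos) from a 16-byte packed interval."""
    return LAYOUT.unpack(raw)


# "1 week and 1 hour" needs no months field: 0 months, 7 days, 3600 seconds.
week_and_hour = pack_interval(0, 7, 3600 * NANOS_PER_SECOND)

# "1 Year 1 Hour" needs months and sub-day time together.
year_and_hour = pack_interval(12, 0, 3600 * NANOS_PER_SECOND)

# Range of the 64-bit nanosecond field alone, expressed in years.
max_years = (2**63 - 1) / NANOS_PER_SECOND / 86_400 / 365.25

print(len(week_and_hour), unpack_interval(year_and_hour), round(max_years, 1))
```

The nanosecond field on its own covers roughly 292 years in either direction, which is the range limitation mentioned in point 2 of the quoted message.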
Re: [DISCUSS] experimental repos
This process seems pretty reasonable to me. Thanks for writing up the document.

On Tue, Apr 27, 2021 at 10:56 AM Micah Kornfield wrote:
> Hi Jorge,
> I think especially for the second case, it might be better to keep things
> on branches in the repo even if they aren't quite mergeable. Even for the
> first case, I would potentially aim for the "closest possible" repo with a
> new branch.
>
> I think standalone repos tend to indicate a higher level of
> maturity/commitment to casual observers than these experiments are meant to
> represent (even clearly documented experimental repos). I think once the
> project reaches some level of maturity and there is broader community
> commitment, the decision/technical part of spinning up new repos can be made.
>
> Again, this isn't based on any data or any real experience so I'm happy for
> whoever wants to try out the process to go the way they see fit. I also
> guess there might be some technical implications for some languages based
> on branches/repos.
>
> Cheers,
> Micah
>
> On Mon, Apr 26, 2021 at 8:44 AM Jorge Cardoso Leitão < jorgecarlei...@gmail.com> wrote:
>
> > Hi Micah,
> >
> > For code that is mergeable, I would say that a branch is superior, as it
> > keeps lineage and thus enables rebasing. IMO there are two use-cases for
> > this mechanism:
> >
> > * create a new component from scratch (e.g. Ballista, bindings to language
> > Z, python-datafusion).
> > * re-write an existing implementation / component, with no intentions of
> > being mergeable in the traditional sense.
> >
> > Operationally, the first case would be things like
> > mv new_repo/* old_repo/component/
> > The second case would be for things like
> > rm -rf old_repo/*; mv new_repo/* old_repo/
> > or "this repo is now official code, old is in maintenance mode".
> >
> > Best,
> > Jorge
> >
> > On Mon, Apr 26, 2021 at 6:40 AM Micah Kornfield wrote:
> >
> >> Hi Jorge,
> >> Thanks for doing this.
> >>
> >> One question that springs to mind: For work that is mostly intended to be
> >> merged back to an existing Arrow repo, what are the trade-offs between
> >> brand new repos as compared to a separate branch in an existing one?
> >>
> >> Cheers,
> >> Micah
> >>
> >> On Sun, Apr 25, 2021 at 9:31 PM Jorge Cardoso Leitão < jorgecarlei...@gmail.com> wrote:
> >>
> >> > Hi,
> >> >
> >> > As discussed in other threads (rust sync and parquet2), I would like to
> >> > open the discussion around opening repos for experimental work that may or
> >> > may not be merged / used.
> >> >
> >> > The rationale is that we incentivise work to be conducted within ASF and
> >> > Apache Arrow's governance, thereby clarifying the context and governance of
> >> > that work, while still offering the freedoms that people enjoy when
> >> > creating a new repo on their personal accounts.
> >> >
> >> > I wrote a draft proposal here: [1]
> >> >
> >> > [1]
> >> > https://docs.google.com/document/d/1rDTWezkKkmOQ3HQeX8NhaXbKZLeuwcyFdC4ZA7ZatDY/edit?usp=sharing
> >> >
> >> > Best,
> >> > Jorge
Re: [DISCUSS] Moving the format directory to arrow-format repository
I wouldn't be too excited about this. Here are my thoughts:

1. Having the format/ directory in apache/arrow be a submodule would be cumbersome and error-prone for developers. The only submodules we have right now are optional testing dependencies; not having these initialized and updated does not result in a broken project, whereas this change would. We have a copy of parquet.thrift from apache/parquet-format for similar reasons.

2. So based on #1, we would want to maintain a copy of the format files in apache/arrow, even if there were a separate apache/arrow-format repository. The format files are slow-moving enough that I don't think it's burdensome to mirror these into satellite repositories like apache/arrow-rs.

On Tue, Apr 27, 2021 at 10:54 AM Neville Dipale wrote:
> Hi Arrow devs,
>
> Andy noticed that we carry a copy of the format directory in arrow-rs, which
> is bound to get outdated in the future.
>
> We would like to propose creating an arrow-format repository, similar to
> parquet-format, so that arrow-rs and other future separate repositories could
> add this as a submodule.
>
> What are your thoughts?
>
> Regards
> Neville
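If the format files are mirrored rather than pulled in as a submodule, the concern about the copy getting outdated could be handled with a small drift check run in CI. The following is a minimal sketch under that assumption; it is not something either repository is known to do, and the function names digest_tree and format_drift, as well as any directory paths passed to them, are hypothetical.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def format_drift(upstream: Path, mirror: Path) -> list:
    """Return relative paths that are missing, extra, or different in the mirror."""
    up = digest_tree(upstream)
    mi = digest_tree(mirror)
    # A path drifts when its digest differs or it exists on only one side.
    return sorted(name for name in up.keys() | mi.keys() if up.get(name) != mi.get(name))
```

A CI job could call format_drift on a checkout of the upstream format/ directory and the mirrored copy, and fail when the returned list is non-empty.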
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
4.0 blog post is still pretty bare and could use some help filling in: https://github.com/apache/arrow-site/pull/104 Thanks, Neal On Tue, Apr 27, 2021 at 1:55 PM Sutou Kouhei wrote: > The remaining tasks: > > 3. [in-pr|Kou] upload binaries > https://github.com/apache/arrow/pull/10172 > 10. [Uwe] update conda recipes > 12. [in-pr|Ian] update homebrew packages > https://github.com/Homebrew/homebrew-core/pull/76060 > > I updated versions on JIRA: > > * > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-Markingthereleasedversionas%22RELEASED%22onJIRA > * > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-StartingthenewversiononJIRA > > In > "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 > 10:37:04 -0500, > Paul Taylor wrote: > > > JS packages have been uploaded. > > > > Paul > > > > On 4/27/21 9:47 AM, Neal Richardson wrote: > >> R package has been accepted by CRAN. > >> > >> Neal > >> > >> On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs > >> > >> wrote: > >> > >>> I've just opened a PR with the updated documentation. > >>> > >>> The remaining tasks: > >>> > >>> 3. [in-pr|Kou] upload binaries > >>> 6. [Paul] upload js packages > >>> 10. [Uwe] update conda recipes > >>> 12. [todo] update homebrew packages > >>> 14. [Kou] update msys2 > >>> 15. [Neal] update R packages > >>> 16. [in-pr|Krisztian] update docs > >>> > >>> On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs > >>> wrote: > On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor > > >>> wrote: > > These look like the errors resolved in > > https://github.com/apache/arrow/pull/10156. Can we cherry-pick that > > commit to the release branch? > Great, I'll cherry-pick that commit. > > Could you please release the JS packages to npm? I think the > lerna.json needs to be updated before npm publish. > > Thank Paul! 
> > > > On 4/27/21 7:04 AM, Krisztián Szűcs wrote: > >> I'd need some help to both release the JS packages using the new > >>> lerna > >> configuration and to fix the JS documentation generation [1]. We > >> should backport these changes to the release-4.0.0 branch. > >> > >> [1]: > >>> > https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 > >> On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei > >>> wrote: > >>> I'll also update MSYS2 packages: > >>> > >>> 1. [x] open a pull request to bump the version numbers in the > >>> source code > >>> 2. [x] upload source > >>> 3. [kou] upload binaries > >>> 4. [x] update website > >>> 5. [x] upload ruby gems > >>> 6. [ ] upload js packages > >>> 8. [x] upload C# packages > >>> 9. [x] upload rust crates > >>> 10. [ ] update conda recipes > >>> 11. [x] upload wheels/sdist to pypi > >>> 12. [ ] update homebrew packages > >>> 13. [x] update maven artifacts > >>> 14. [kou] update msys2 > >>> 15. [nealrichardson] update R packages > >>> 16. [ ] update docs > >>> > >>> In >>> r...@mail.gmail.com> > >>> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr > >>> 2021 01:48:37 +0200, > >>> Krisztián Szűcs wrote: > >>> > On Tue, Apr 27, 2021 at 1:05 AM Andy Grove > > >>> wrote: > > The following Rust crates have been published: arrow, > >>> arrow-flight, parquet, parquet_derive, datafusion > Thanks Andy! > > The current status is: > 1. [x] open a pull request to bump the version numbers in the > >>> source code > 2. [x] upload source > 3. [kou] upload binaries > 4. [x] update website > 5. [x] upload ruby gems > 6. [ ] upload js packages > 8. [x] upload C# packages > 9. [x] upload rust crates > 10. [ ] update conda recipes > 11. [x] upload wheels/sdist to pypi > 12. [ ] update homebrew packages > 13. [x] update maven artifacts > 14. [ ] update msys2 > 15. [nealrichardson] update R packages > 16. 
[ ] update docs > > On Mon, Apr 26, 2021 at 4:34 PM Andy Grove < > andygrov...@gmail.com> > >>> wrote: > >> Yes, I can handle the Rust release. > >> > >> On Mon, Apr 26, 2021, 4:17 PM Krisztián Szűcs < > >>> szucs.kriszt...@gmail.com> wrote: > >>> @Andy Grove could you please handle the rust release? > >>> > >>> On Mon, Apr 26, 2021 at 11:51 PM Krisztián Szűcs > >>> wrote: > 1. [x] open a pull request to bump the version numbers in the > >>> source code > 2. [x] upload source > 3. [kou] upload binaries > 4. [x] update website > 5. [x] upload ruby gems > 6. [ ] upload js packag
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
The remaining tasks: 3. [in-pr|Kou] upload binaries https://github.com/apache/arrow/pull/10172 10. [Uwe] update conda recipes 12. [in-pr|Ian] update homebrew packages https://github.com/Homebrew/homebrew-core/pull/76060 I updated versions on JIRA: * https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-Markingthereleasedversionas%22RELEASED%22onJIRA * https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-StartingthenewversiononJIRA In "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 10:37:04 -0500, Paul Taylor wrote: > JS packages have been uploaded. > > Paul > > On 4/27/21 9:47 AM, Neal Richardson wrote: >> R package has been accepted by CRAN. >> >> Neal >> >> On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs >> >> wrote: >> >>> I've just opened a PR with the updated documentation. >>> >>> The remaining tasks: >>> >>> 3. [in-pr|Kou] upload binaries >>> 6. [Paul] upload js packages >>> 10. [Uwe] update conda recipes >>> 12. [todo] update homebrew packages >>> 14. [Kou] update msys2 >>> 15. [Neal] update R packages >>> 16. [in-pr|Krisztian] update docs >>> >>> On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs >>> wrote: On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor >>> wrote: > These look like the errors resolved in > https://github.com/apache/arrow/pull/10156. Can we cherry-pick that > commit to the release branch? Great, I'll cherry-pick that commit. Could you please release the JS packages to npm? I think the lerna.json needs to be updated before npm publish. Thank Paul! > > On 4/27/21 7:04 AM, Krisztián Szűcs wrote: >> I'd need some help to both release the JS packages using the new >>> lerna >> configuration and to fix the JS documentation generation [1]. We >> should backport these changes to the release-4.0.0 branch. 
>> >> [1]: >>> https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 >> On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei >>> wrote: >>> I'll also update MSYS2 packages: >>> >>> 1. [x] open a pull request to bump the version numbers in the >>> source code >>> 2. [x] upload source >>> 3. [kou] upload binaries >>> 4. [x] update website >>> 5. [x] upload ruby gems >>> 6. [ ] upload js packages >>> 8. [x] upload C# packages >>> 9. [x] upload rust crates >>> 10. [ ] update conda recipes >>> 11. [x] upload wheels/sdist to pypi >>> 12. [ ] update homebrew packages >>> 13. [x] update maven artifacts >>> 14. [kou] update msys2 >>> 15. [nealrichardson] update R packages >>> 16. [ ] update docs >>> >>> In >> r...@mail.gmail.com> >>> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr >>> 2021 01:48:37 +0200, >>> Krisztián Szűcs wrote: >>> On Tue, Apr 27, 2021 at 1:05 AM Andy Grove >>> wrote: > The following Rust crates have been published: arrow, >>> arrow-flight, parquet, parquet_derive, datafusion Thanks Andy! The current status is: 1. [x] open a pull request to bump the version numbers in the >>> source code 2. [x] upload source 3. [kou] upload binaries 4. [x] update website 5. [x] upload ruby gems 6. [ ] upload js packages 8. [x] upload C# packages 9. [x] upload rust crates 10. [ ] update conda recipes 11. [x] upload wheels/sdist to pypi 12. [ ] update homebrew packages 13. [x] update maven artifacts 14. [ ] update msys2 15. [nealrichardson] update R packages 16. [ ] update docs > On Mon, Apr 26, 2021 at 4:34 PM Andy Grove >>> wrote: >> Yes, I can handle the Rust release. >> >> On Mon, Apr 26, 2021, 4:17 PM Krisztián Szűcs < >>> szucs.kriszt...@gmail.com> wrote: >>> @Andy Grove could you please handle the rust release? >>> >>> On Mon, Apr 26, 2021 at 11:51 PM Krisztián Szűcs >>> wrote: 1. 
[x] open a pull request to bump the version numbers in the >>> source code 2. [x] upload source 3. [kou] upload binaries 4. [x] update website 5. [x] upload ruby gems 6. [ ] upload js packages 8. [x] upload C# packages 9. [ ] upload rust crates 10. [ ] update conda recipes 11. [in-progress] upload wheels/sdist to pypi 12. [ ] update homebrew packages 13. [x] update maven artifacts 14. [ ] update msys2 15. [nealrichardson] update R packages 16. [ ] update docs The JS post release task is failing with: >>>
Re: [C++][Python] Parquet INT96 overflow for arrow timestamps
Hi Karik,

I answered in the JIRA itself. Feel free to ask any more questions!

Regards

Antoine.

On 27/04/2021 at 16:28, Karik Isichei wrote:
> Hi there,
>
> I previously raised an issue regarding arrow timestamp values overflowing
> when reading parquet type INT96
> (https://issues.apache.org/jira/browse/ARROW-12096).
>
> I would like to contribute a fix for this, but wanted to check the
> following:
>
> - Should I start with the code in "arrow/cpp/src/parquet/" to try and
>   identify a fix?
> - Is there any existing process or function to check this overflow that I
>   should apply to the reader?
> - Or should I write a PR to expose a setting in the pyarrow parquet reader
>   to read INT96 types as a different timestamp type than the one arrow
>   infers?
>
> Thanks,
> Karik
Re: [Format][RFC] Introduce COMPLEX type for IntervalUnit
To nudge this along I opened up https://github.com/apache/arrow/pull/10177 Comments welcome. On Sun, Apr 11, 2021 at 9:38 PM Micah Kornfield wrote: > If there are no more comments on this maybe we should update the original > RFC PR and ensure we are OK with it in principle (Dmitry do you want to do > this or should we start a new PR)? I can try to work on the C++/Python and > Java code in the next few weeks. > > > On Sun, Apr 4, 2021 at 1:35 PM Micah Kornfield > wrote: > >> Looking more at the Postgres spec and storage details, I'd be >>> supportive of having a COMPLEX interval type which could be a packed >>> type (possibly using the same 16-byte storage layout as Postgres -- >>> depending on whether this complex interval needs granularity smaller >>> than seconds, more analysis needed) >> >> >> I agree we seem to be coalescing towards this representation which IIUC >> is (32 bit months, 32 bit days, 64 bit for Seconds + fractional seconds). >> I think the main questions here are: >> >> 1. Configurable sub-second granularity (I would again lean no and just >> say nanoseconds, since this is the highest granularity other Arrow time >> types support). >> 2. Range compatibility with other interval types (if we fix at >> nanoseconds IIUC there are some extreme values that couldn't be converted >> to Arrow). I would guess that these extrema are rare enough that this >> should not be an issue. >> >> I think that adding an entry to >>> the IntervalUnit enum doesn't pose any forward compatibility problems >>> (because implementations _should_ be able to recognized an unsupported >>> unit? >> >> >> I also agree with this, I can do an audit before an official proposal of >> Java and C++. 
>> >> -Micah >> >> >> >> >> >> On Sun, Apr 4, 2021 at 11:10 AM Wes McKinney wrote: >> >>> Looking more at the Postgres spec and storage details, I'd be >>> supportive of having a COMPLEX interval type which could be a packed >>> type (possibly using the same 16-byte storage layout as Postgres -- >>> depending on whether this complex interval needs granularity smaller >>> than seconds, more analysis needed). I think that adding an entry to >>> the IntervalUnit enum doesn't pose any forward compatibility problems >>> (because implementations _should_ be able to recognized an unsupported >>> unit?). It isn't wonderful to have 3 different types of intervals >>> (ideally just SIMPLE and COMPLEX), but it seems like something we can >>> live with. If we're going to do this, we should try to do it properly >>> now since I would probably object at adding a 4th interval type if the >>> need for it came up in the future. >>> >>> On Fri, Apr 2, 2021 at 5:37 PM Micah Kornfield >>> wrote: >>> > >>> > > >>> > > However it seems a little unfortunate that there is now way to >>> represent a >>> > > "common" interval like "1 week and 1 hour" with native arrow types >>> > >>> > >>> > I might have misunderstood,but at least in postgres, I thought this >>> boils >>> > down to "0 months, 7 days, 3600 seconds". Since months is 0, this >>> seems >>> > like it fits squarely in the existing interval type Days_Mills. >>> > >>> > I thought what can't be represented today is "1 Year 1 Hour". It seems >>> > like none of the proposals so far cover weeks as an explicit type? >>> > >>> > On Fri, Apr 2, 2021 at 2:42 PM Andrew Lamb >>> wrote: >>> > >>> > > I think it is plausible that we use Arrow structs to create a >>> synthetic >>> > > interval type for DataFusion (I don't have a compelling usecase to >>> store >>> > > the intervals themselves, or to expose them outside of DataFusion). 
>>> > > >>> > > However it seems a little unfortunate that there is now way to >>> represent a >>> > > "common" interval like "1 week and 1 hour" with native arrow types >>> > > >>> > > >>> > > >>> > > On Fri, Apr 2, 2021 at 4:38 PM Micah Kornfield < >>> emkornfi...@gmail.com> >>> > > wrote: >>> > > >>> > >> The real usecase I have is "postgres compatibility" >>> > >> >>> > >> >>> > >> Yeah, I'm a little conflicted on this. A broader analysis might be >>> > >> necessary and I'd welcome others thoughts, but at what point should >>> we >>> > >> mostly consider the type system closed? Should we be aiming for >>> full >>> > >> parity with ANSI SQL/Postgres SQL or something else? >>> > >> >>> > >> >>> > >>> I have no known need for the actual postgres timestamp internal >>> > >>> representation. >>> > >> >>> > >> >>> > >> I suppose there is an edge case that the seconds range is larger for >>> > >> microseconds compared to nanoseconds with the simple >>> representation. But >>> > >> that seems minor. >>> > >> >>> > >> On Fri, Apr 2, 2021 at 1:25 PM Andrew Lamb >>> wrote: >>> > >> >>> > >>> The real usecase I have is "postgres compatibility" - in the sense >>> that >>> > >>> we can write SQL queries / expressions that use postgres interval >>> type [1] >>> > >>> and corresponding expressions with the full postgres interval >>> range. I have >>> > >>> no known need for t
Re: [DISCUSS] experimental repos
Hi Jorge, I think especially for the second case, it might be better to keep things on branches in the repo even if they aren't quite mergeable. Even for the first case, I would potentially aim for the "closest possible" repo with a new branch. I think standalone repos tend to indicate a higher level of maturity/commitment to casual observers than these experiments are meant to represent (even clearly documented experimental repos). I think once the project reaches some level of maturity and there is broader community commitment, the decision to spin up new repos can be made and the technical part carried out. Again, this isn't based on any data or any real experience so I'm happy for whoever wants to try out the process to go the way they see fit. I also guess there might be some technical implications for some languages based on branches/repos. Cheers, Micah On Mon, Apr 26, 2021 at 8:44 AM Jorge Cardoso Leitão < jorgecarlei...@gmail.com> wrote: > Hi Micah, > > For code that is mergeable, I would say that a branch is superior, as it > keeps lineage and thus enables rebasing. IMO there are two use-cases for > this mechanism: > > * create a new component from scratch (e.g. Ballista, bindings to language > Z, python-datafusion). > * re-write an existing implementation / component, with no intentions of > being mergeable in the traditional sense. > > Operationally, the first case would be things like mv new_repo/* > old_repo/component/ > The second case would be for things like rm -rf old_repo/*; mv new_repo/* > old_repo/ or "this repo is now official code, old is in maintenance mode". > > Best, > Jorge > > > > > On Mon, Apr 26, 2021 at 6:40 AM Micah Kornfield > wrote: > >> Hi Jorge, >> Thanks for doing this. >> >> One question that springs to mind: For work that is mostly intended to be >> merged back to an existing Arrow repo what are the trade-offs between >> brand >> new repos as compared to a separate branch in an existing one? 
>> >> Cheers, >> Micah >> >> On Sun, Apr 25, 2021 at 9:31 PM Jorge Cardoso Leitão < >> jorgecarlei...@gmail.com> wrote: >> >> > Hi, >> > >> > As discussed in other threads (rust sync and parquet2), I would like to >> > open the discussion around opening repos for experimental work that may >> or >> > may not be merged / used. >> > >> > The rationale is that we incentivise work to be conducted within ASF and >> > Apache Arrows' governance, thereby clarifying the context and >> governance of >> > that work, while still offering the freedoms that people enjoy when >> > creating a new repo on their personal accounts. >> > >> > I wrote a draft proposal here: [1] >> > >> > [1] >> > >> > >> https://docs.google.com/document/d/1rDTWezkKkmOQ3HQeX8NhaXbKZLeuwcyFdC4ZA7ZatDY/edit?usp=sharing >> > >> > Best, >> > Jorge >> > >> >
[DISCUSS] Moving the format directory to arrow-format repository
Hi Arrow devs, Andy noticed that we carry a copy of the format directory in arrow-rs, which is bound to get outdated in the future. We would like to propose creating an arrow-format repository, similar to parquet-format, so that arrow-rs and other future separate repositories could add this as a submodule. What are your thoughts? Regards Neville
Re: Issue with pyarrow v4.0.0 - Write parquet files with non str datatypes
Hi Jorge,

How did you install pyarrow 4.0.0? The error you show typically points to an installation issue (e.g. built with the wrong NumPy).

Best,
Joris

On Tue, 27 Apr 2021 at 16:47, Jorge Alarcon wrote:
> Hi everybody,
>
> There is an issue with pyarrow (version 4.0.0) when you try to write a
> Parquet file with the pyarrow engine. It is not possible to write a
> Parquet file from a pandas DataFrame when it includes non-str columns
> (datetime64, float64, int64…)
>
> Example:
>
>     df = pd.DataFrame({'A':[1, 2, 3], 'B':['a', 'b', 'c']})
>     df.to_parquet('example.parquet', engine='pyarrow')  # Not working
>
>     ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed
>     for column InternalId with type float64')
>
>     df['A'] = df['A'].astype(str)
>     df.to_parquet('example.parquet', engine='pyarrow')  # Working
>
> Best!
>
> Jorge Alarcon
> Senior Data Analytics Specialist
>
> Mail: jorge.alar...@maccresi.com
> Telf: +34 683541389
> 28020 Madrid
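When following up on an installation question like the one above, it usually helps to post the exact interpreter and package versions. The sketch below is one possible way to collect them using only the standard library; `installed_version` and `environment_report` are hypothetical helper names, and the list of packages is an assumption about what is relevant here.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("pyarrow", "numpy", "pandas")) -> dict:
    """Collect the interpreter and package versions relevant to the report."""
    report = {"python": sys.version.split()[0], "executable": sys.executable}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

The `executable` entry is included because a mismatch between the environment that installed the packages and the interpreter actually running the code is a common source of confusing import and dtype errors.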
Re: [DISCUSS] [Rust] Python-datafusion
Hi Jorge, This all sounds good to me. It might be nice to test against both the pinned released version of pyarrow and at head if possible. I like the idea of not causing release churn as long as all the underlying libraries are compatible. Thanks for the write up. -Micah On Mon, Apr 26, 2021 at 10:30 AM Jorge Cardoso Leitão < jorgecarlei...@gmail.com> wrote: > Hi Micah, > > All testing is actually done from Python: create a record batch in > pyarrow, push it to datafusion, > consume it back in Python, and compare the result using pyarrow's > equality. Sometimes parquet is used instead. > The library is tested against pyarrow==1 from pypi: we can bump that, but > if it works in pyarrow==1, > chances are things will improve with higher versions :) > > Releases: I thought to have it released as a separate wheel for two > reasons: > > * not force people that want pyarrow to download datafusion binaries with > it > * have independent versioning from pyarrow > > and "bracketed" the pyarrow that we ensure compatibility with. > > Another alternative is to release with the same versioning as datafusion, > like arrow c++ / pyarrow and spark / pyspark. > The upside is that the versions are aligned. The downside is that we will > be releasing a lot of majors for no reason: so far, all backward > incompatible changes in datafusion were not backward incompatible in > python-datafusion: it is easier to break backward compat. in a Rust library > than it is in a Python wrapper to a Rust library. > > What are your thoughts, Micah? > > Best, > Jorge > > > > > > On Sun, Apr 25, 2021 at 10:32 PM Micah Kornfield > wrote: > >> Hi Jorge, >> I think this would certainly be a valuable contribution. How were you >> thinking of hosting (which repo)/publishing it (maintaining a separate >> wheel)? Also did you have thoughts on integration testing with pyarrow? 
>> >> Cheers, >> Micah >> >> On Sun, Apr 25, 2021 at 9:13 AM Jorge Cardoso Leitão < >> jorgecarlei...@gmail.com> wrote: >> >> > Hi, >> > >> > I fielded a PR [1] to open up a discussion to incorporate >> python-datafusion >> > [2] into the Apache Arrow project. >> > >> > Python-datafusion is a Python library [3] built on top of DataFusions >> that >> > enables people to use DataFusion from Python. It leverages the C data >> > interface for zero-cost copy between DataFusion and pyarrow (a bunch of >> > pointers is shared around). >> > >> > For example, it allows users to read a CSV from Rust, pass the arrays >> to a >> > C++ kernel, continue the computation in Rust's kernels, and export to >> > parquet using Rust (or C++ parquet, or whatever ^_^). It supports UDFs >> and >> > UDAFs, in case someone wants to go crazy with Pyarrow, Pandas, numpy or >> > tensorflow. =) >> > >> > Best, >> > Jorge >> > >> > [1] https://github.com/apache/arrow-datafusion/pull/69 >> > [2] https://github.com/jorgecarleitao/datafusion-python >> > [3] https://pypi.org/project/datafusion/ >> > >> >
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
JS packages have been uploaded. Paul On 4/27/21 9:47 AM, Neal Richardson wrote: R package has been accepted by CRAN. Neal On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs wrote: I've just opened a PR with the updated documentation. The remaining tasks: 3. [in-pr|Kou] upload binaries 6. [Paul] upload js packages 10. [Uwe] update conda recipes 12. [todo] update homebrew packages 14. [Kou] update msys2 15. [Neal] update R packages 16. [in-pr|Krisztian] update docs On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs wrote: On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor wrote: These look like the errors resolved in https://github.com/apache/arrow/pull/10156. Can we cherry-pick that commit to the release branch? Great, I'll cherry-pick that commit. Could you please release the JS packages to npm? I think the lerna.json needs to be updated before npm publish. Thank Paul! On 4/27/21 7:04 AM, Krisztián Szűcs wrote: I'd need some help to both release the JS packages using the new lerna configuration and to fix the JS documentation generation [1]. We should backport these changes to the release-4.0.0 branch. [1]: https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei wrote: I'll also update MSYS2 packages: 1. [x] open a pull request to bump the version numbers in the source code 2. [x] upload source 3. [kou] upload binaries 4. [x] update website 5. [x] upload ruby gems 6. [ ] upload js packages 8. [x] upload C# packages 9. [x] upload rust crates 10. [ ] update conda recipes 11. [x] upload wheels/sdist to pypi 12. [ ] update homebrew packages 13. [x] update maven artifacts 14. [kou] update msys2 15. [nealrichardson] update R packages 16. 
[ ] update docs In r...@mail.gmail.com> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 01:48:37 +0200, Krisztián Szűcs wrote: On Tue, Apr 27, 2021 at 1:05 AM Andy Grove wrote: The following Rust crates have been published: arrow, arrow-flight, parquet, parquet_derive, datafusion Thanks Andy! The current status is: 1. [x] open a pull request to bump the version numbers in the source code 2. [x] upload source 3. [kou] upload binaries 4. [x] update website 5. [x] upload ruby gems 6. [ ] upload js packages 8. [x] upload C# packages 9. [x] upload rust crates 10. [ ] update conda recipes 11. [x] upload wheels/sdist to pypi 12. [ ] update homebrew packages 13. [x] update maven artifacts 14. [ ] update msys2 15. [nealrichardson] update R packages 16. [ ] update docs On Mon, Apr 26, 2021 at 4:34 PM Andy Grove wrote: Yes, I can handle the Rust release. On Mon, Apr 26, 2021, 4:17 PM Krisztián Szűcs < szucs.kriszt...@gmail.com> wrote: @Andy Grove could you please handle the rust release? On Mon, Apr 26, 2021 at 11:51 PM Krisztián Szűcs wrote: 1. [x] open a pull request to bump the version numbers in the source code 2. [x] upload source 3. [kou] upload binaries 4. [x] update website 5. [x] upload ruby gems 6. [ ] upload js packages 8. [x] upload C# packages 9. [ ] upload rust crates 10. [ ] update conda recipes 11. [in-progress] upload wheels/sdist to pypi 12. [ ] update homebrew packages 13. [x] update maven artifacts 14. [ ] update msys2 15. [nealrichardson] update R packages 16. [ ] update docs The JS post release task is failing with: lerna ERR! ENOLERNA `lerna.json` does not exist, have you run `lerna init`? I assume the lerna configuration should be updated including the version number. @Paul Taylor could you please handle the JS release? On Mon, Apr 26, 2021 at 9:01 PM Krisztián Szűcs wrote: The current status of the post-release tasks: 1. [x] open a pull request to bump the version numbers in the source code 2. [x] upload source 3. 
[can't do] upload binaries 4. [x] update website 5. [x] upload ruby gems 6. [ ] upload js packages 8. [ ] upload C# packages 9. [ ] upload rust crates 10. [ ] update conda recipes 11. [kszucs] upload wheels/sdist to pypi 12. [ ] update homebrew packages 13. [kszucs] update maven artifacts 14. [ ] update msys2 15. [nealrichardson] update R packages 16. [ ] update docs On Mon, Apr 26, 2021 at 8:19 PM Krisztián Szűcs wrote: The VOTE carries with 4 binding +1 and 5 non-binding +1 votes. Thanks everyone! I'm starting the post release tasks and keep you posted about the current status. On Mon, Apr 26, 2021 at 6:06 PM Neal Richardson wrote: +1 (binding) GitHub Actions verifications are green and R artifact builds are successful. Neal On Mon, Apr 26, 2021 at 6:02 AM Krisztián Szűcs < szucs.kriszt...@gmail.com> wrote: On Sun, Apr 25, 2021 at 10:59 PM Sutou Kouhei < k...@clear-code.com> wrote: Here: https://github.com/apache/arrow/pull/10126 I've incorporated the automatic verification step to the release procedure so we can start the VOTE after having positive feedback from
Issue with pyarrow v4.0.0 - Write parquet files with non str datatypes
Hi everybody,

There is an issue with pyarrow (version 4.0.0) when you try to write a Parquet file with the pyarrow engine. It is not possible to write a Parquet file from a pandas DataFrame when it includes non-str columns (datetime64, float64, int64...)

Example:

    import pandas as pd

    df = pd.DataFrame({'A':[1, 2, 3], 'B':['a', 'b', 'c']})
    df.to_parquet('example.parquet', engine='pyarrow')  # Not working

    ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column InternalId with type float64')

    df['A'] = df['A'].astype(str)
    df.to_parquet('example.parquet', engine='pyarrow')  # Working

Best!

Jorge Alarcon
Senior Data Analytics Specialist

Mail: jorge.alar...@maccresi.com
Telf: +34 683541389
28020 Madrid
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
R package has been accepted by CRAN. Neal On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs wrote: > I've just opened a PR with the updated documentation. > > The remaining tasks: > > 3. [in-pr|Kou] upload binaries > 6. [Paul] upload js packages > 10. [Uwe] update conda recipes > 12. [todo] update homebrew packages > 14. [Kou] update msys2 > 15. [Neal] update R packages > 16. [in-pr|Krisztian] update docs > > On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs > wrote: > > > > On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor > wrote: > > > > > > These look like the errors resolved in > > > https://github.com/apache/arrow/pull/10156. Can we cherry-pick that > > > commit to the release branch? > > Great, I'll cherry-pick that commit. > > > > Could you please release the JS packages to npm? I think the > > lerna.json needs to be updated before npm publish. > > > > Thank Paul! > > > > > > > > > On 4/27/21 7:04 AM, Krisztián Szűcs wrote: > > > > I'd need some help to both release the JS packages using the new > lerna > > > > configuration and to fix the JS documentation generation [1]. We > > > > should backport these changes to the release-4.0.0 branch. > > > > > > > > [1]: > https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181 > > > > > > > > On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei > wrote: > > > >> I'll also update MSYS2 packages: > > > >> > > > >> 1. [x] open a pull request to bump the version numbers in the > source code > > > >> 2. [x] upload source > > > >> 3. [kou] upload binaries > > > >> 4. [x] update website > > > >> 5. [x] upload ruby gems > > > >> 6. [ ] upload js packages > > > >> 8. [x] upload C# packages > > > >> 9. [x] upload rust crates > > > >> 10. [ ] update conda recipes > > > >> 11. [x] upload wheels/sdist to pypi > > > >> 12. [ ] update homebrew packages > > > >> 13. [x] update maven artifacts > > > >> 14. [kou] update msys2 > > > >> 15. 
[nealrichardson] update R packages > > > >> 16. [ ] update docs > > > >> > > > >> In r...@mail.gmail.com> > > > >>"Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr > 2021 01:48:37 +0200, > > > >>Krisztián Szűcs wrote: > > > >> > > > >>> On Tue, Apr 27, 2021 at 1:05 AM Andy Grove > wrote: > > > The following Rust crates have been published: arrow, > arrow-flight, parquet, parquet_derive, datafusion > > > >>> Thanks Andy! > > > >>> > > > >>> The current status is: > > > >>> 1. [x] open a pull request to bump the version numbers in the > source code > > > >>> 2. [x] upload source > > > >>> 3. [kou] upload binaries > > > >>> 4. [x] update website > > > >>> 5. [x] upload ruby gems > > > >>> 6. [ ] upload js packages > > > >>> 8. [x] upload C# packages > > > >>> 9. [x] upload rust crates > > > >>> 10. [ ] update conda recipes > > > >>> 11. [x] upload wheels/sdist to pypi > > > >>> 12. [ ] update homebrew packages > > > >>> 13. [x] update maven artifacts > > > >>> 14. [ ] update msys2 > > > >>> 15. [nealrichardson] update R packages > > > >>> 16. [ ] update docs > > > On Mon, Apr 26, 2021 at 4:34 PM Andy Grove > wrote: > > > > Yes, I can handle the Rust release. > > > > > > > > On Mon, Apr 26, 2021, 4:17 PM Krisztián Szűcs < > szucs.kriszt...@gmail.com> wrote: > > > >> @Andy Grove could you please handle the rust release? > > > >> > > > >> On Mon, Apr 26, 2021 at 11:51 PM Krisztián Szűcs > > > >> wrote: > > > >>> 1. [x] open a pull request to bump the version numbers in the > source code > > > >>> 2. [x] upload source > > > >>> 3. [kou] upload binaries > > > >>> 4. [x] update website > > > >>> 5. [x] upload ruby gems > > > >>> 6. [ ] upload js packages > > > >>> 8. [x] upload C# packages > > > >>> 9. [ ] upload rust crates > > > >>> 10. [ ] update conda recipes > > > >>> 11. [in-progress] upload wheels/sdist to pypi > > > >>> 12. [ ] update homebrew packages > > > >>> 13. [x] update maven artifacts > > > >>> 14. [ ] update msys2 > > > >>> 15. 
[nealrichardson] update R packages > > > >>> 16. [ ] update docs > > > >>> > > > >>> The JS post release task is failing with: > > > >> lerna ERR! ENOLERNA `lerna.json` does not exist, have you > run `lerna init`? > > > >>> I assume the lerna configuration should be updated including > the version number. > > > >>> > > > >>> @Paul Taylor could you please handle the JS release? > > > >>> > > > >>> On Mon, Apr 26, 2021 at 9:01 PM Krisztián Szűcs > > > >>> wrote: > > > The current status of the post-release tasks: > > > > > > 1. [x] open a pull request to bump the version numbers in > the source code > > > 2. [x] upload source > > > 3. [can't do] upload binaries > > > 4. [x] update website > > > 5. [x] upload ruby gems > > > 6. [ ] upload js packages > > > >
[C++][Python] Parquet INT96 overflow for arrow timestamps
Hi there,

I previously raised an issue regarding arrow timestamp values overflowing when reading parquet type INT96 (https://issues.apache.org/jira/browse/ARROW-12096).

I would like to contribute a fix for this, but wanted to check the following:

- Should I start with the code in "arrow/cpp/src/parquet/" to try and identify a fix?
- Is there any existing process or function to check this overflow that I should apply to the reader?
- Or should I write a PR to expose a setting in the pyarrow parquet reader to read INT96 types as a different timestamp type than the one arrow infers?

Thanks,
Karik
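For readers unfamiliar with the overflow being described, the following Python sketch shows where it comes from. It assumes the commonly used INT96 timestamp convention (8 little-endian bytes of nanoseconds within the day, followed by a 4-byte Julian day number, read here as unsigned); the function names are hypothetical and this is not the Arrow C++ reader code.

```python
import struct
from datetime import date

JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588
NANOS_PER_DAY = 86_400 * 1_000_000_000
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def int96_to_epoch_nanos(raw: bytes) -> int:
    """Decode a 12-byte INT96 timestamp into nanoseconds since the Unix epoch.

    Python integers are unbounded, so the result never wraps around here.
    """
    nanos_of_day, julian_day = struct.unpack("<qI", raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day


def fits_in_int64(value: int) -> bool:
    """True if `value` can be stored in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


def encode_int96(day: date, nanos_of_day: int = 0) -> bytes:
    """Helper for the example: build INT96 bytes for a calendar date."""
    days_since_epoch = day.toordinal() - date(1970, 1, 1).toordinal()
    return struct.pack("<qI", nanos_of_day, JULIAN_DAY_OF_UNIX_EPOCH + days_since_epoch)


nanos = int96_to_epoch_nanos(encode_int96(date(3000, 1, 1)))
print(fits_in_int64(nanos))           # nanosecond resolution
print(fits_in_int64(nanos // 1_000))  # microsecond resolution
```

For the year-3000 value this prints `False` and then `True`: a signed 64-bit count of nanoseconds only covers roughly 1677 to 2262, while the same instant expressed in microseconds fits comfortably. That is the motivation behind the third option above, reading INT96 into a coarser timestamp unit.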
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
I've just opened a PR with the updated documentation.

The remaining tasks:

3. [in-pr|Kou] upload binaries
6. [Paul] upload js packages
10. [Uwe] update conda recipes
12. [todo] update homebrew packages
14. [Kou] update msys2
15. [Neal] update R packages
16. [in-pr|Krisztian] update docs

On Tue, Apr 27, 2021 at 2:42 PM Krisztián Szűcs wrote:
>
> On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor wrote:
> >
> > These look like the errors resolved in
> > https://github.com/apache/arrow/pull/10156. Can we cherry-pick that
> > commit to the release branch?
> Great, I'll cherry-pick that commit.
>
> Could you please release the JS packages to npm? I think the
> lerna.json needs to be updated before npm publish.
>
> Thank Paul!
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor wrote:
>
> These look like the errors resolved in
> https://github.com/apache/arrow/pull/10156. Can we cherry-pick that
> commit to the release branch?

Great, I'll cherry-pick that commit.

Could you please release the JS packages to npm? I think the lerna.json needs to be updated before npm publish.

Thanks Paul!
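As a rough illustration of the check implied by "lerna.json needs to be updated before npm publish" and by the ENOLERNA failure quoted in this thread, a pre-publish sanity check could look like the sketch below. The function name `check_lerna_config` and the idea of comparing against an expected release version are assumptions made for illustration. This is not part of the Arrow release scripts. The only facts it relies on are that lerna reads a `lerna.json` file and that this file carries a `version` field.

```python
import json
import tempfile
from pathlib import Path


def check_lerna_config(package_root, expected_version):
    """Return a list of problems that would block publishing; empty means OK."""
    config_path = Path(package_root) / "lerna.json"
    if not config_path.is_file():
        # This is the situation lerna reports as ENOLERNA.
        return [f"{config_path} does not exist"]
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{config_path} is not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return [f"{config_path} does not contain a JSON object"]
    problems = []
    version = config.get("version")
    if version != expected_version:
        problems.append(
            f"lerna.json version is {version!r}, expected {expected_version!r}"
        )
    return problems


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        print(check_lerna_config(tmp, "4.0.0"))
        (Path(tmp) / "lerna.json").write_text(json.dumps({"version": "4.0.0"}))
        print(check_lerna_config(tmp, "4.0.0"))
```

Running such a check on the release branch before `npm publish` would surface both a missing configuration file and a stale version number early.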
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
These look like the errors resolved in https://github.com/apache/arrow/pull/10156. Can we cherry-pick that commit to the release branch?

On 4/27/21 7:04 AM, Krisztián Szűcs wrote:

I'd need some help to both release the JS packages using the new lerna configuration and to fix the JS documentation generation [1]. We should backport these changes to the release-4.0.0 branch.

[1]: https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181

On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei wrote:

I'll also update MSYS2 packages:

1. [x] open a pull request to bump the version numbers in the source code
2. [x] upload source
3. [kou] upload binaries
4. [x] update website
5. [x] upload ruby gems
6. [ ] upload js packages
8. [x] upload C# packages
9. [x] upload rust crates
10. [ ] update conda recipes
11. [x] upload wheels/sdist to pypi
12. [ ] update homebrew packages
13. [x] update maven artifacts
14. [kou] update msys2
15. [nealrichardson] update R packages
16. [ ] update docs

In "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Tue, 27 Apr 2021 01:48:37 +0200, Krisztián Szűcs wrote:

On Tue, Apr 27, 2021 at 1:05 AM Andy Grove wrote:

The following Rust crates have been published: arrow, arrow-flight, parquet, parquet_derive, datafusion

Thanks Andy!

The current status is:

1. [x] open a pull request to bump the version numbers in the source code
2. [x] upload source
3. [kou] upload binaries
4. [x] update website
5. [x] upload ruby gems
6. [ ] upload js packages
8. [x] upload C# packages
9. [x] upload rust crates
10. [ ] update conda recipes
11. [x] upload wheels/sdist to pypi
12. [ ] update homebrew packages
13. [x] update maven artifacts
14. [ ] update msys2
15. [nealrichardson] update R packages
16. [ ] update docs

On Mon, Apr 26, 2021 at 4:34 PM Andy Grove wrote:

Yes, I can handle the Rust release.

On Mon, Apr 26, 2021, 4:17 PM Krisztián Szűcs wrote:

@Andy Grove could you please handle the rust release?

On Mon, Apr 26, 2021 at 11:51 PM Krisztián Szűcs wrote:

1. [x] open a pull request to bump the version numbers in the source code
2. [x] upload source
3. [kou] upload binaries
4. [x] update website
5. [x] upload ruby gems
6. [ ] upload js packages
8. [x] upload C# packages
9. [ ] upload rust crates
10. [ ] update conda recipes
11. [in-progress] upload wheels/sdist to pypi
12. [ ] update homebrew packages
13. [x] update maven artifacts
14. [ ] update msys2
15. [nealrichardson] update R packages
16. [ ] update docs

The JS post release task is failing with:

lerna ERR! ENOLERNA `lerna.json` does not exist, have you run `lerna init`?

I assume the lerna configuration should be updated including the version number.

@Paul Taylor could you please handle the JS release?

On Mon, Apr 26, 2021 at 9:01 PM Krisztián Szűcs wrote:

The current status of the post-release tasks:

1. [x] open a pull request to bump the version numbers in the source code
2. [x] upload source
3. [can't do] upload binaries
4. [x] update website
5. [x] upload ruby gems
6. [ ] upload js packages
8. [ ] upload C# packages
9. [ ] upload rust crates
10. [ ] update conda recipes
11. [kszucs] upload wheels/sdist to pypi
12. [ ] update homebrew packages
13. [kszucs] update maven artifacts
14. [ ] update msys2
15. [nealrichardson] update R packages
16. [ ] update docs

On Mon, Apr 26, 2021 at 8:19 PM Krisztián Szűcs wrote:

The VOTE carries with 4 binding +1 and 5 non-binding +1 votes.

Thanks everyone!

I'm starting the post release tasks and keep you posted about the current status.

On Mon, Apr 26, 2021 at 6:06 PM Neal Richardson wrote:

+1 (binding)

GitHub Actions verifications are green and R artifact builds are successful.

Neal

On Mon, Apr 26, 2021 at 6:02 AM Krisztián Szűcs wrote:

On Sun, Apr 25, 2021 at 10:59 PM Sutou Kouhei wrote:

Here: https://github.com/apache/arrow/pull/10126

I've incorporated the automatic verification step to the release procedure so we can start the VOTE after having positive feedback from the verification tasks.

In "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Sun, 25 Apr 2021 15:12:30 -0500, Wes McKinney wrote:

Have we run the GitHub Actions release verifications, or can we do that? I will try to run the RC verification on my dev machine (I recently reinstalled Linux so wasn't equipped to immediately run the verification script)

On Sun, Apr 25, 2021 at 2:31 PM Jorge Cardoso Leitão wrote:

+1, based on Rust alone. All tests pass as they should.

Thanks a lot everyone for making this happen.

Best,
Jorge

On Thu, Apr 22, 2021 at 5:17 PM Jonathan Keane

+1 (non-binding)

Verified wheels, sources, and binaries on macOS 11.2 using the verification script (except for Java Integration, Glib, and Ruby). Like Antoine I ran into the same issue with Ruby. I also installed Arrow and the R package locally + ran some adhoc tests using som
Re: [VOTE] Release Apache Arrow 4.0.0 - RC3
I'd need some help to both release the JS packages using the new lerna configuration and to fix the JS documentation generation [1]. We should backport these changes to the release-4.0.0 branch.

[1]: https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=4297&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181

On Tue, Apr 27, 2021 at 1:50 AM Sutou Kouhei wrote:
>
> I'll also update MSYS2 packages:
>
> 1. [x] open a pull request to bump the version numbers in the source code
> 2. [x] upload source
> 3. [kou] upload binaries
> 4. [x] update website
> 5. [x] upload ruby gems
> 6. [ ] upload js packages
> 8. [x] upload C# packages
> 9. [x] upload rust crates
> 10. [ ] update conda recipes
> 11. [x] upload wheels/sdist to pypi
> 12. [ ] update homebrew packages
> 13. [x] update maven artifacts
> 14. [kou] update msys2
> 15. [nealrichardson] update R packages
> 16. [ ] update docs
[NIGHTLY] Arrow Build Report for Job nightly-2021-04-27-0
Arrow Build Report for Job nightly-2021-04-27-0

All tasks: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0

Failed Tasks:
- conda-linux-gcc-py36-arm64:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-drone-conda-linux-gcc-py36-arm64
- conda-linux-gcc-py37-arm64:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-drone-conda-linux-gcc-py37-arm64
- conda-linux-gcc-py38-arm64:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-drone-conda-linux-gcc-py38-arm64
- conda-linux-gcc-py39-arm64:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-drone-conda-linux-gcc-py39-arm64
- test-conda-python-3.7-turbodbc-latest:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-github-test-conda-python-3.7-turbodbc-latest
- test-conda-python-3.7-turbodbc-master:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-github-test-conda-python-3.7-turbodbc-master
- test-conda-python-3.7:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-github-test-conda-python-3.7
- test-conda-python-3.8-jpype:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-github-test-conda-python-3.8-jpype
- test-conda-python-3.9:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-github-test-conda-python-3.9
- test-ubuntu-20.10-docs:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-test-ubuntu-20.10-docs

Succeeded Tasks:
- centos-7-amd64:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-github-centos-7-amd64
- centos-8-amd64:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-github-centos-8-amd64
- centos-8-arm64:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-travis-centos-8-arm64
- conda-clean:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-clean
- conda-linux-gcc-py36-cpu-r36:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py36-cpu-r36
- conda-linux-gcc-py36-cuda:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py36-cuda
- conda-linux-gcc-py37-cpu-r40:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py37-cpu-r40
- conda-linux-gcc-py37-cuda:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py37-cuda
- conda-linux-gcc-py38-cpu:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py38-cpu
- conda-linux-gcc-py38-cuda:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py38-cuda
- conda-linux-gcc-py39-cpu:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py39-cpu
- conda-linux-gcc-py39-cuda:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-linux-gcc-py39-cuda
- conda-osx-arm64-clang-py38:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-osx-arm64-clang-py38
- conda-osx-arm64-clang-py39:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-osx-arm64-clang-py39
- conda-osx-clang-py36-r36:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-osx-clang-py36-r36
- conda-osx-clang-py37-r40:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-osx-clang-py37-r40
- conda-osx-clang-py38:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-osx-clang-py38
- conda-osx-clang-py39:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-osx-clang-py39
- conda-win-vs2017-py36-r36:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-win-vs2017-py36-r36
- conda-win-vs2017-py37-r40:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-win-vs2017-py37-r40
- conda-win-vs2017-py38:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-win-vs2017-py38
- conda-win-vs2017-py39:
  URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0-azure-conda-win-vs2017-py39
- debian-bullseye-amd64:
  URL: https://github.com/ursacomputing