Re: Coordinating / scheduling C++ Parquet-Arrow nested data work (ARROW-1644 and others)

2020-02-04 Thread Micah Kornfield
>
> Glad to hear about the progress. As I mentioned on #2, what do you
> think about setting up a feature branch for you to merge PRs into?
> Then the branch can be iterated on and we can merge it back when it's
> feature complete and does not have perf regressions for the flat
> read/write path.
>
I'd like to avoid a separate branch if possible.  I'm willing to close the
open PR till I'm sure it is needed, but I'm hoping that keeping PRs as small
and focused as possible, with performance testing along the way, will be a
better reviewer and developer experience here.

> The earliest I'd have time to work on this myself would likely be
> sometime in March. Others are welcome to jump in as well (and it'd be
> great to increase the overall level of knowledge of the Parquet
> codebase)

Hopefully Igor can help out; otherwise I'll take up the read path after I
finish the write path.

-Micah

On Tue, Feb 4, 2020 at 3:31 PM Wes McKinney  wrote:

> hi Micah
>
> On Mon, Feb 3, 2020 at 12:01 AM Micah Kornfield 
> wrote:
> >
> > Just to give an update.  I've been a little bit delayed, but my progress
> > is as follows:
> > 1.  Had 1 PR merged that will exercise basic end-to-end tests.
> > 2.  Have another PR open that adds a configuration option in C++ to
> > determine which algorithm version to use for reading/writing: the
> > existing version or the new version that supports complex nested arrays.
> > I think a large amount of code will be reused/delegated to, but I will
> > err on the side of not touching the existing code/algorithms so that any
> > errors in the implementation or performance regressions can hopefully be
> > mitigated at runtime.  I expect that in later releases (once the code has
> > "baked") the option will become a no-op.
>
> Glad to hear about the progress. As I mentioned on #2, what do you
> think about setting up a feature branch for you to merge PRs into?
> Then the branch can be iterated on and we can merge it back when it's
> feature complete and does not have perf regressions for the flat
> read/write path.
>
> > 3.  Started coding the write path.
> >
> > Which leaves:
> > 1.  Finishing the write path (I estimate 2-3 weeks) to be code complete
> > 2.  Implementing the read path.
>
> The earliest I'd have time to work on this myself would likely be
> sometime in March. Others are welcome to jump in as well (and it'd be
> great to increase the overall level of knowledge of the Parquet
> codebase)
>
> > Again, I'm happy to collaborate if people have bandwidth and want to
> > contribute.
> >
> > Thanks,
> > Micah
> >
> > On Thu, Jan 9, 2020 at 10:31 PM Micah Kornfield 
> > wrote:
> >
> > > Hi Wes,
> > > I'm still interested in doing the work.  But I don't want to hold anybody
> > > up if they have bandwidth.
> > >
> > > In order to actually make progress on this, my plan will be to:
> > > 1.  Help with the current Java review backlog through early next week or
> > > so (this has been taking the majority of my time allocated for Arrow
> > > contributions for the last 6 months or so).
> > > 2.  Shift all my attention to trying to get this done (this means no
> > > reviews other than closing out existing ones that I've started until it
> > > is done).  Hopefully, other Java committers can help shrink the backlog
> > > further (Jacques, thanks for your recent efforts here).
> > >
> > > Thanks,
> > > Micah
> > >
> > > On Thu, Jan 9, 2020 at 8:16 AM Wes McKinney wrote:
> > >
> > >> hi folks,
> > >>
> > >> I think we have reached a point where the incomplete C++ Parquet
> > >> nested data assembly/disassembly is harming the value of several
> > >> other parts of the project, for example the Datasets API. As another
> > >> example, it's possible to ingest nested data from JSON but not write
> > >> it to Parquet in general.
> > >>
> > >> Implementing the nested data read and write path completely is a
> > >> difficult project requiring at least several weeks of dedicated work,
> > >> so it's not so surprising that it hasn't been accomplished yet. I know
> > >> that several people have expressed interest in working on it, but I
> > >> would like to see if anyone would be able to volunteer a commitment of
> > >> time and guess on a rough timeline when this work could be done. It
> > >> seems to me if this slips beyond 2020 it will significantly diminish the
> > >> value being created by other parts of the project.
> > >>
> > >> Since I'm pretty familiar with all the Parquet code I'm one candidate
> > >> person to take on this project (and I can dedicate the time, but it
> > >> would come at the expense of other projects where I can also be
> > >> useful). But Micah and others expressed interest in working on it, so
> > >> I wanted to have a discussion about it to see what others think.
> > >>
> > >> Thanks
> > >> Wes
> > >>
> > >
>


Re: [DISCUSS] Improving GitHub Actions configurations + tooling

2020-02-04 Thread Wes McKinney
hi Kou,

Thanks for your thoughts.

My preference would be to generate _some_ configurations but not
necessarily _all_. The more copy-and-pasting is going on, the harder
it will be to refactor and make general improvements to some
configurations (since you may have to change N things instead of 1
thing). Since I haven't really worked much on the GitHub Actions files
I will leave the decisions to the people doing the work. If the builds
run and we can verify patches, then that's the important thing, but we
should monitor how much time is being spent manually editing and
refactoring these files over the course of this year.
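
A minimal sketch of the "generate some configurations" idea discussed in this thread, written in Python and using only the standard library. The template contents, the job names (`ubuntu-cpp`, `ubuntu-python`), and the function `render_workflows` are assumptions made for illustration; they are not taken from the Arrow repository or from any existing tool.

```python
from string import Template

# Hypothetical shared template: every job differs only in the three
# placeholders, so a general change is made once here instead of in N files.
WORKFLOW_TEMPLATE = Template("""\
name: $name
on: [push, pull_request]
jobs:
  $job:
    runs-on: $runner
    steps:
      - uses: actions/checkout@v2
      - name: Build and test
        run: |
          docker-compose pull $job
          docker-compose build $job
          docker-compose run $job
""")

# Hypothetical job list; a real generator would read this from a config file.
JOBS = [
    {"name": "C++", "job": "ubuntu-cpp", "runner": "ubuntu-latest"},
    {"name": "Python", "job": "ubuntu-python", "runner": "ubuntu-latest"},
]


def render_workflows(jobs):
    """Return a mapping of workflow file name to rendered YAML text."""
    return {
        f"{spec['job']}.yml": WORKFLOW_TEMPLATE.substitute(spec)
        for spec in jobs
    }


if __name__ == "__main__":
    for filename, text in render_workflows(JOBS).items():
        print(f"# .github/workflows/{filename}")
        print(text)
```

Workflows that do not fit the template could stay hand-written next to the generated ones, which is one way to generate some configurations without generating all of them.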

- Wes

On Tue, Feb 4, 2020 at 8:43 PM Sutou Kouhei  wrote:
>
> Hi,
>
> -0 for this idea.
> I'm not strongly against this idea, but I don't like it.
>
>
> The current raw GitHub Actions configurations + raw Docker
> tools (+ raw shell scripts) are simple and easy to
> understand for many developers because there is no magic.
> If we create our own tools, they will introduce some magic. That
> may block new developers. (We want to increase the number of Apache
> Arrow developers as much as possible, right?)
>
>
> If we use raw GitHub Actions configuration, we can bring in
> knowledge from other projects. For example, I'll migrate
> MinGW builds on AppVeyor to GitHub Actions. I know how to
> set up a MinGW environment on GitHub Actions:
>
>   https://github.com/groonga/groonga/blob/master/.github/workflows/windows-mingw.yml
>
> If I can use raw GitHub Actions configuration, I can easily
> reuse a similar approach in Apache Arrow. If we have our own
> tools, I need to translate my GitHub Actions configuration
> knowledge into code/configuration for our tools.
>
>
> I agree that duplication is a problem. But it may be resolved by
> improving GitHub Actions configuration syntax. For example,
> GitLab CI configuration has an "include" feature:
>
>   https://docs.gitlab.com/ee/ci/yaml/#include
>
> If GitHub Actions configuration implements a similar feature,
> duplication can be reduced. GitHub accepts feature
> requests in its community forum:
>
>   https://github.community/t5/GitHub-Actions/bd-p/actions
>
> Feature request example:
>
>   
> https://github.community/t5/GitHub-Actions/Feature-Request-context-property-similar-to-github-repository/m-p/38964/highlight/true#M3608
>
> No "include" like feature in GitHub Actions:
>
>   
> https://github.community/t5/GitHub-Actions/Is-it-possible-to-reuse-workflow-yaml-to-setup-similar-workflows/m-p/40634#
>
>
>
> Thanks,
> --
> kou
>
> In 
>   "Re: [DISCUSS] Improving GitHub Actions configurations + tooling" on Tue, 4 
> Feb 2020 22:54:14 +0100,
>   Krisztián Szűcs  wrote:
>
> > On Tue, Feb 4, 2020 at 6:31 PM Wes McKinney  wrote:
> >>
> >> I'm personally not too concerned with the details as long as people
> >> generally agree that the solution is maintainable (Kou's and Antoine's
> >> feedback here would be helpful) and there is not an abundant odor of
> >> code duplication
> > I'm not entirely satisfied with the current solution, but so far so good.
> >
> > Before improving the current GHA setup it'd indeed be nice to see others'
> > opinions and preferences.
> >>
> >> On Tue, Feb 4, 2020 at 9:48 AM Neal Richardson
> >>  wrote:
> >> >
> >> > What if we wrote our own action(s) to wrap up some of the boilerplate? It
> >> > doesn't seem that there are any off-the-shelf actions we could use to
> >> > drive docker-compose:
> >> > https://github.com/marketplace?utf8=%E2%9C%93&type=actions&query=docker-compose
> >> > but I don't think it would be that difficult to wrap `docker-compose pull
> >> > $JOB && docker-compose build $JOB && docker-compose run $JOB` or similar
> >> > in an action.
> >> >
> >> > Neal
> >> >
> >> > On Tue, Feb 4, 2020 at 6:57 AM Krisztián Szűcs 
> >> > 
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > On Mon, Feb 3, 2020 at 5:25 PM Wes McKinney  
> >> > > wrote:
> >> > > >
> >> > > > hi folks,
> >> > > >
> >> > > > I have noticed that many of our GitHub Actions configurations are
> >> > > > very similar to each other
> >> > > They are indeed.
> >> > > >
> >> > > > https://www.diffchecker.com/eF4tHdzo
> >> > > >
> >> > > > Aside from the "copy-paste" issue, some work would have to be done to
> >> > > > generate a Crossbow configuration using GHA.
> >> > > Do you mean having GHA as a backend for crossbow? @Kou has just
> >> > > added support for that in [1]
> >> > > >
> >> > > > It seems like a solution to these issues is to create a program to
> >> > > > generate the GHA configurations (using some templates or other
> >> > > > tools). So what is in .github/workflows would not be edited by human
> >> > > > hands in general but rather generated by this program.
> >> > > That would be quite similar to what I've implemented in ursabot [2],
> >> > > just generating GHA flavored ymls instead of buildbot objects, so it
> >> > > seems doable.
> >> > > doable. Of course we'll need commit hooks to force the regeneration of
> >> > > 

Re: [Gandiva] LLVM version

2020-02-04 Thread Sutou Kouhei
Thanks!

In <3ab45aeb-d789-4876-8a1e-ac8424257...@www.fastmail.com>
  "Re: [Gandiva] LLVM version" on Tue, 04 Feb 2020 17:36:12 +0530,
  "Projjal Chanda"  wrote:

> Hi Kou,
> Sure. I will let you know.
> 
> Regards,
> Projjal
> 
> On Tue, Feb 4, 2020, at 2:20 AM, Sutou Kouhei wrote:
>> Hi Projjal,
>> 
>> > Let me test the change by running it with Dremio.
>> 
>> Thanks!
>> 
>> > Will update if there are any issues.
>> 
>> It means that we can move forward if we don't get any
>> responses from you in a week (long? short?), right?
>> 
>> 
>> Thanks,
>> --
>> kou
>> 
>> In <29da1c69-6f14-45aa-8ea6-4293dc615...@www.fastmail.com>
>>  "Re: [Gandiva] LLVM version" on Mon, 03 Feb 2020 11:26:43 +0530,
>>  "Projjal Chanda"  wrote:
>> 
>> > Hi Kou,
>> > Let me test the change by running it with Dremio. Will update if there
>> > are any issues.
>> > 
>> > Regards,
>> > Projjal
>> > 
>> > On Mon, Feb 3, 2020, at 9:11 AM, Wes McKinney wrote:
>> >> hi Kou,
>> >> 
>> >> Since nearly 2 weeks have passed, and the changes do not seem too
>> >> risky, absent more comments I think it's safe to move forward with the
>> >> upgrade.
>> >> 
>> >> - Wes
>> >> 
>> >> On Sun, Feb 2, 2020 at 6:55 PM Sutou Kouhei  wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > Does Gandiva have any policy about LLVM version?
>> >> >
>> >> > The current Gandiva requires LLVM 7. Other LLVM versions
>> >> > aren't supported. But the latest LLVM is 9. Can we upgrade
>> >> > LLVM?
>> >> >
>> >> > Homebrew provides LLVM 4, 6, 7, 8 and 9 but doesn't accept
>> >> > an apache-arrow package that depends on an outdated LLVM:
>> >> >
>> >> > https://github.com/Homebrew/homebrew-core/pull/42385
>> >> >
>> >> > It means that apache-arrow package on Homebrew can't enable
>> >> > Gandiva until we upgrade LLVM to the latest version.
>> >> >
>> >> >
>> >> > We have a pull request that upgrades supported LLVM to 8:
>> >> > https://github.com/apache/arrow/pull/6266
>> >> >
>> >> > In the pull request, Wes mentioned the Gandiva developers but
>> >> > there have been no responses.
>> >> >
>> >> >
>> >> > In the pull request, there are no Gandiva changes. So we
>> >> > will be able to support LLVM 7 and 8 without any #ifdef.
>> >> > Can we support multiple LLVM versions? Or should we support
>> >> > only one LLVM version?
>> >> >
>> >> >
>> >> > I think that we can consider C++ tools provided by LLVM such
>> >> > as clang-format separately. We will be able to use different
>> >> > LLVM versions for Gandiva and C++ tools. For example, we
>> >> > will be able to use LLVM 8 for Gandiva and LLVM 7 for
>> >> > clang-format at the same time by improving our CMake
>> >> > configuration.
>> >> >
>> >> >
>> >> > Thanks,
>> >> > --
>> >> > kou
>> >> 
>> 


Re: [DISCUSS] Improving GitHub Actions configurations + tooling

2020-02-04 Thread Sutou Kouhei
Hi,

-0 for this idea.
I'm not strongly against this idea, but I don't like it.


The current raw GitHub Actions configurations + raw Docker
tools (+ raw shell scripts) are simple and easy to
understand for many developers because there is no magic.
If we create our own tools, they will introduce some magic. That
may block new developers. (We want to increase the number of Apache
Arrow developers as much as possible, right?)


If we use raw GitHub Actions configuration, we can bring in
knowledge from other projects. For example, I'll migrate
MinGW builds on AppVeyor to GitHub Actions. I know how to
set up a MinGW environment on GitHub Actions:

  https://github.com/groonga/groonga/blob/master/.github/workflows/windows-mingw.yml

If I can use raw GitHub Actions configuration, I can easily
reuse a similar approach in Apache Arrow. If we have our own
tools, I need to translate my GitHub Actions configuration
knowledge into code/configuration for our tools.


I agree that duplication is a problem. But it may be resolved by
improving GitHub Actions configuration syntax. For example,
GitLab CI configuration has an "include" feature:

  https://docs.gitlab.com/ee/ci/yaml/#include

If GitHub Actions configuration implements a similar feature,
duplication can be reduced. GitHub accepts feature
requests in its community forum:

  https://github.community/t5/GitHub-Actions/bd-p/actions

Feature request example:

  
https://github.community/t5/GitHub-Actions/Feature-Request-context-property-similar-to-github-repository/m-p/38964/highlight/true#M3608

No "include" like feature in GitHub Actions:

  
https://github.community/t5/GitHub-Actions/Is-it-possible-to-reuse-workflow-yaml-to-setup-similar-workflows/m-p/40634#



Thanks,
--
kou

In 
  "Re: [DISCUSS] Improving GitHub Actions configurations + tooling" on Tue, 4 
Feb 2020 22:54:14 +0100,
  Krisztián Szűcs  wrote:

> On Tue, Feb 4, 2020 at 6:31 PM Wes McKinney  wrote:
>>
>> I'm personally not too concerned with the details as long as people
>> generally agree that the solution is maintainable (Kou's and Antoine's
>> feedback here would be helpful) and there is not an abundant odor of
>> code duplication
> I'm not entirely satisfied with the current solution, but so far so good.
> 
> Before improving the current GHA setup it'd indeed be nice to see others'
> opinions and preferences.
>>
>> On Tue, Feb 4, 2020 at 9:48 AM Neal Richardson
>>  wrote:
>> >
>> > What if we wrote our own action(s) to wrap up some of the boilerplate? It
>> > doesn't seem that there are any off-the-shelf actions we could use to drive
>> > docker-compose:
>> > https://github.com/marketplace?utf8=%E2%9C%93&type=actions&query=docker-compose
>> > but I don't think it would be that difficult to wrap `docker-compose pull
>> > $JOB && docker-compose build $JOB && docker-compose run $JOB` or similar in
>> > an action.
>> >
>> > Neal
>> >
>> > On Tue, Feb 4, 2020 at 6:57 AM Krisztián Szűcs 
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > On Mon, Feb 3, 2020 at 5:25 PM Wes McKinney  wrote:
>> > > >
>> > > > hi folks,
>> > > >
>> > > > I have noticed that many of our GitHub Actions configurations are very
>> > > > similar to each other
>> > > They are indeed.
>> > > >
>> > > > https://www.diffchecker.com/eF4tHdzo
>> > > >
>> > > > Aside from the "copy-paste" issue, some work would have to be done to
>> > > > generate a Crossbow configuration using GHA.
>> > > Do you mean having GHA as a backend for crossbow? @Kou has just
>> > > added support for that in [1]
>> > > >
>> > > > It seems like a solution to these issues is to create a program to
>> > > > generate the GHA configurations (using some templates or other tools).
>> > > > So what is in .github/workflows would not be edited by human hands in
>> > > > general but rather generated by this program.
>> > > That would be quite similar to what I've implemented in ursabot [2], just
>> > > generating GHA flavored ymls instead of buildbot objects, so it seems
>> > > doable. Of course we'll need commit hooks to force the regeneration of
>> > > these configuration files.
>> > > >
>> > > > This program could also assist with local automation for
>> > > > reproducibility purposes (for example, locally executing a cascade of
>> > > > dependent docker-compose steps).
>> > > Another independent improvement could be to ditch docker-compose
>> > > completely. I'd say that 70% of the docker-compose.yml [3] and the
>> > > related dockerfiles are filled with duplication that is necessary
>> > > because of the limited parametrization and reusability of docker and
>> > > docker-compose. It also makes it harder to use new docker features
>> > > like https://docs.docker.com/buildx/working-with-buildx/
>> > >
>> > > Again I'm referring to ursabot, where I've already implemented the
>> > > ideas: the docker files [4] and the image hierarchy from the compose
>> > > file [3] could be replaced by something similar to the ursabot docker
>> > > utility [6].
>> > > The builder definitions [7] which are 

Re: [Java] Issues with IntelliJ + errorprone + OpenJDK

2020-02-04 Thread Fan Liya
I could not reproduce the problem, and I did not see "central maven org"
from my environment.

1. If the problem occurred when installing the plugin, maybe we can
download the plugin (from
https://plugins.jetbrains.com/plugin/7349-error-prone-compiler/versions)
and install it locally.
2. If the problem occurred when building the code, maybe we can use another
maven repo by overriding the settings.xml file. (The default repo is
https://repo.maven.apache.org/maven2/ in my environment, when no
settings.xml is specified).
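
A minimal sketch of option 2, in Python with only the standard library: it builds a `settings.xml` that routes requests for the `central` repository to an HTTPS URL. The mirror id `central-https` and the function name `build_settings` are assumptions made for illustration, and this only affects Maven builds; it would not change where an IDE plugin downloads its own JARs from.

```python
import xml.etree.ElementTree as ET


def build_settings(mirror_url="https://repo.maven.apache.org/maven2/"):
    """Return the text of a Maven settings.xml with one mirror of 'central'."""
    settings = ET.Element("settings")
    mirrors = ET.SubElement(settings, "mirrors")
    mirror = ET.SubElement(mirrors, "mirror")
    # Hypothetical id; any unique string works as a mirror identifier.
    ET.SubElement(mirror, "id").text = "central-https"
    # 'central' is the id of Maven's built-in default repository.
    ET.SubElement(mirror, "mirrorOf").text = "central"
    ET.SubElement(mirror, "url").text = mirror_url
    return ET.tostring(settings, encoding="unicode")


if __name__ == "__main__":
    print(build_settings())
```

The output could be saved as `~/.m2/settings.xml`, or to any other path and passed to Maven with `mvn -s <path>`.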

Best,
Liya Fan

On Wed, Feb 5, 2020 at 2:38 AM Bryan Cutler  wrote:

> Here is where it turned up; it looks to be installed in the m2 repository
>
> bryan@lm-P50 ~ $ find ~/ -name "failureaccess-*.jar" -type f
>
> /home/bryan/.m2/repository/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar
>
> /home/bryan/.IdeaIC2019.2/system/download-cache/error-prone/2.3.3/failureaccess-1.0.1.jar
>
> Also error-prone jar is in the IntelliJ plugin directory
> find ~/ -name "error-prone*.jar" -type f
> /home/bryan/.IdeaIC2019.2/config/plugins/error-prone/lib/error-prone.jar
>
> /home/bryan/.IdeaIC2019.2/config/plugins/error-prone/lib/jps/error-prone-jps-plugin.jar
>
>
>
>
> On Tue, Feb 4, 2020 at 7:44 AM Andy Grove  wrote:
>
> > Actually, central.maven.org doesn't even exist ...
> >
> > On Tue, Feb 4, 2020 at 8:28 AM Andy Grove  wrote:
> >
> > > Thanks for the help but I followed the same instructions and get this
> > > error:
> > >
> > > Error:Failed to download error-prone compiler JARs: Failed to download
> > > 'http://central.maven.org/maven2/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar':
> > > central.maven.org
> > >
> > > The issue is that this maven central no longer supports http and requires
> > > https. Maybe I could manually install this file somewhere? I did try
> > > installing in my local m2 repo but that didn't work.
> > >
> > > If anyone could scan their local drive for this file and let me know
> > > where it is installed that could unblock me.
> > >
> > > Thanks,
> > >
> > > Andy.
> > >
> > >
> > >
> > > On Mon, Feb 3, 2020 at 6:24 PM Fan Liya  wrote:
> > >
> > >> I was having the same problem, and it was solved by
> > >>
> > >> 1. Install the "Error Prone Compiler" plugin to intellij
> > >> 2. setting "Settings/Build, Execution, Deployment/Compiler/Java
> > >> Compiler/Use compiler" to "Javac with error-prone"
> > >>
> > >> I am using Intellij 2019.3 (Community Edition)
> > >>
> > >> Best,
> > >> Liya Fan
> > >>
> > >> On Tue, Feb 4, 2020 at 7:25 AM Bryan Cutler 
> wrote:
> > >>
> > >> > Ahh, now that you sent that link it jogged my memory. A while ago I
> > >> > think I did see that error and installed the error prone compiler
> > >> > plugin mentioned. It worked after that I believe, but I am on IntelliJ
> > >> > 2019.2.4 on Ubuntu, and it was a while ago so maybe something changed.
> > >> > If there is anything I can check to help you out, let me know.
> > >> >
> > >> > On Mon, Feb 3, 2020 at 12:22 PM Andy Grove 
> > >> wrote:
> > >> >
> > >> > > So it turns out there are specific instructions [1] for using
> > >> > > errorprone with IntelliJ. Unfortunately, this doesn't work due to a
> > >> > > bug in IntelliJ that was fixed a few days ago but not released yet [2].
> > >> > >
> > >> > > [1] https://errorprone.info/docs/installation
> > >> > > [2] https://intellij-support.jetbrains.com/hc/en-us/community/posts/360007052380-error-prone-compile-plugin-cant-download-jar
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Mon, Feb 3, 2020 at 1:10 PM Andy Grove 
> > >> wrote:
> > >> > >
> > >> > > > Hi Bryan,
> > >> > > >
> > >> > > > Yes, I tried opening as a Maven project and got the same error. I'm
> > >> > > > using OpenJDK 1.8.0_232 on both Ubuntu 19.04 and macOS 10.14.6 and
> > >> > > > get the same error on both. I'm using IntelliJ Ultimate 2019.3.2.
> > >> > > > Building from the command line with Maven works fine.
> > >> > > >
> > >> > > > Very odd. I guess I'll do a little more research on errorprone.
> > >> > > >
> > >> > > > Thanks,
> > >> > > >
> > >> > > > Andy.
> > >> > > >
> > >> > > >
> > >> > > > On Mon, Feb 3, 2020 at 12:50 PM Bryan Cutler  >
> > >> > wrote:
> > >> > > >
> > >> > > >> Hi Andy,
> > >> > > >> What is your JDK version? I haven't seen that exact error. Did you
> > >> > > >> open Arrow as a Maven project in IntelliJ?
> > >> > > >>
> > >> > > >> On Mon, Feb 3, 2020 at 7:47 AM Andy Grove <
> andygrov...@gmail.com
> > >
> > >> > > wrote:
> > >> > > >>
> > >> > > >> > I'm working on the Java codebase and cannot run code inside
> > >> > > >> > IntelliJ and it looks like some kind of compatibility issue
> > >> > > >> > between errorprone and the JDK that IntelliJ is using. I'm hoping
> > >> > > >> > other Java committers have found
> > >> > 

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-04 Thread Krisztián Szűcs
Hi,

I've cherry-picked the wheel fix [1] on top of the 0.16 release tag,
re-built the wheels using crossbow [2], and uploaded them to
bintray [3] (also removed win-py38m).

Anyone who has voted after verifying the wheels, please re-run
the verification script for the wheels and re-vote.

Thanks, Krisztian

[1] https://github.com/apache/arrow/commit/67e34c53b3be4c88348369f8109626b4a8a997aa
[2] https://github.com/ursa-labs/crossbow/branches/all?query=build-733
[3] https://bintray.com/apache/arrow/python-rc/0.16.0-rc2#files

On Tue, Feb 4, 2020 at 7:08 PM Wes McKinney  wrote:
>
> +1 (binding)
>
> Some patches were required to the verification scripts but I have run:
>
> * Full source verification on Ubuntu 18.04
> * Linux binary verification
> * Source verification on Windows 10 (needed ARROW-6757)
> * Windows binary verification. Note that Python 3.8 wheel is broken
> (see ARROW-7755). Whoever uploads the wheels to PyPI _SHOULD NOT_
> upload this 3.8 wheel until we know what's wrong (if we upload a
> broken wheel then `pip install pyarrow==0.16.0` will be permanently
> broken on Windows/Python 3.8)
>
> On Mon, Feb 3, 2020 at 9:26 PM Francois Saint-Jacques
>  wrote:
> >
> > Tested on ubuntu 18.04 for the source release.
> >
> > On Mon, Feb 3, 2020 at 10:07 PM Francois Saint-Jacques
> >  wrote:
> > >
> > > +1
> > >
> > > Binaries verification didn't have any issues.
> > > Sources verification worked with some local environment hiccups
> > >
> > > François
> > >
> > > On Mon, Feb 3, 2020 at 8:46 PM Andy Grove  wrote:
> > > >
> > > > +1 (binding) based on running the Rust tests
> > > >
> > > > Thanks.
> > > >
> > > > On Thu, Jan 30, 2020 at 8:13 PM Krisztián Szűcs 
> > > > 
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I would like to propose the following release candidate (RC2) of
> > > > > Apache Arrow version 0.16.0. This is a release consisting of 728
> > > > > resolved JIRA issues[1].
> > > > >
> > > > > This release candidate is based on commit:
> > > > > 729a7689fd87572e6a14ad36f19cd579a8b8d9c5 [2]
> > > > >
> > > > > The source release rc2 is hosted at [3].
> > > > > The binary artifacts are hosted at [4][5][6][7].
> > > > > The changelog is located at [8].
> > > > >
> > > > > Please download, verify checksums and signatures, run the unit tests,
> > > > > and vote on the release. See [9] for how to validate a release
> > > > > candidate.
> > > > >
> > > > > The vote will be open for at least 72 hours.
> > > > >
> > > > > [ ] +1 Release this as Apache Arrow 0.16.0
> > > > > [ ] +0
> > > > > [ ] -1 Do not release this as Apache Arrow 0.16.0 because...
> > > > >
> > > > > [1]:
> > > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20%28Resolved%2C%20Closed%29%20AND%20fixVersion%20%3D%200.16.0
> > > > > [2]:
> > > > > https://github.com/apache/arrow/tree/729a7689fd87572e6a14ad36f19cd579a8b8d9c5
> > > > > [3]: 
> > > > > https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-0.16.0-rc2
> > > > > [4]: https://bintray.com/apache/arrow/centos-rc/0.16.0-rc2
> > > > > [5]: https://bintray.com/apache/arrow/debian-rc/0.16.0-rc2
> > > > > [6]: https://bintray.com/apache/arrow/python-rc/0.16.0-rc2
> > > > > [7]: https://bintray.com/apache/arrow/ubuntu-rc/0.16.0-rc2
> > > > > [8]:
> > > > > https://github.com/apache/arrow/blob/729a7689fd87572e6a14ad36f19cd579a8b8d9c5/CHANGELOG.md
> > > > > [9]:
> > > > > https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> > > > >


Re: Coordinating / scheduling C++ Parquet-Arrow nested data work (ARROW-1644 and others)

2020-02-04 Thread Wes McKinney
hi Micah

On Mon, Feb 3, 2020 at 12:01 AM Micah Kornfield  wrote:
>
> Just to give an update.  I've been a little bit delayed, but my progress is
> as follows:
> 1.  Had 1 PR merged that will exercise basic end-to-end tests.
> 2.  Have another PR open that adds a configuration option in C++ to
> determine which algorithm version to use for reading/writing: the existing
> version or the new version that supports complex nested arrays.  I think a
> large amount of code will be reused/delegated to, but I will err on the side
> of not touching the existing code/algorithms so that any errors in the
> implementation or performance regressions can hopefully be mitigated at
> runtime.  I expect that in later releases (once the code has "baked") the
> option will become a no-op.
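
A minimal sketch of the runtime switch described in item 2, written in Python for brevity rather than in the C++ of the actual PR. `EngineVersion`, `write_column`, and both writer functions are hypothetical names chosen for illustration and do not describe the real Parquet C++ API.

```python
from enum import Enum


class EngineVersion(Enum):
    """Hypothetical option selecting the read/write algorithm."""
    V1 = "v1"  # existing algorithm, left untouched
    V2 = "v2"  # new algorithm with nested-data support


def _write_v1(values):
    # Stand-in for the existing path: flat columns only.
    if any(isinstance(v, list) for v in values):
        raise NotImplementedError("nested data is not supported by V1")
    return ("v1", list(values))


def _write_v2(values):
    # Stand-in for the new path: accepts flat and nested columns.
    return ("v2", list(values))


_WRITERS = {EngineVersion.V1: _write_v1, EngineVersion.V2: _write_v2}


def write_column(values, engine_version=EngineVersion.V1):
    """Dispatch to the implementation chosen at runtime."""
    return _WRITERS[engine_version](values)
```

Keeping both implementations behind one option means a caller who hits a bug or a regression in the new path can fall back to the old one without a rebuild. Once the new code has baked, both option values could map to the same implementation, which makes the option a no-op.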

Glad to hear about the progress. As I mentioned on #2, what do you
think about setting up a feature branch for you to merge PRs into?
Then the branch can be iterated on and we can merge it back when it's
feature complete and does not have perf regressions for the flat
read/write path.

> 3.  Started coding the write path.
>
> Which leaves:
> 1.  Finishing the write path (I estimate 2-3 weeks) to be code complete
> 2.  Implementing the read path.

The earliest I'd have time to work on this myself would likely be
sometime in March. Others are welcome to jump in as well (and it'd be
great to increase the overall level of knowledge of the Parquet
codebase)

> Again, I'm happy to collaborate if people have bandwidth and want to
> contribute.
>
> Thanks,
> Micah
>
> On Thu, Jan 9, 2020 at 10:31 PM Micah Kornfield 
> wrote:
>
> > Hi Wes,
> > I'm still interested in doing the work.  But I don't want to hold anybody
> > up if they have bandwidth.
> >
> > In order to actually make progress on this, my plan will be to:
> > 1.  Help with the current Java review backlog through early next week or
> > so (this has been taking the majority of my time allocated for Arrow
> > contributions for the last 6 months or so).
> > 2.  Shift all my attention to trying to get this done (this means no
> > reviews other than closing out existing ones that I've started until it is
> > done).  Hopefully, other Java committers can help shrink the backlog
> > further (Jacques, thanks for your recent efforts here).
> >
> > Thanks,
> > Micah
> >
> > On Thu, Jan 9, 2020 at 8:16 AM Wes McKinney  wrote:
> >
> >> hi folks,
> >>
> >> I think we have reached a point where the incomplete C++ Parquet
> >> nested data assembly/disassembly is harming the value of several
> >> other parts of the project, for example the Datasets API. As another
> >> example, it's possible to ingest nested data from JSON but not write
> >> it to Parquet in general.
> >>
> >> Implementing the nested data read and write path completely is a
> >> difficult project requiring at least several weeks of dedicated work,
> >> so it's not so surprising that it hasn't been accomplished yet. I know
> >> that several people have expressed interest in working on it, but I
> >> would like to see if anyone would be able to volunteer a commitment of
> >> time and guess on a rough timeline when this work could be done. It
> >> seems to me if this slips beyond 2020 it will significantly diminish the
> >> value being created by other parts of the project.
> >>
> >> Since I'm pretty familiar with all the Parquet code I'm one candidate
> >> person to take on this project (and I can dedicate the time, but it
> >> would come at the expense of other projects where I can also be
> >> useful). But Micah and others expressed interest in working on it, so
> >> I wanted to have a discussion about it to see what others think.
> >>
> >> Thanks
> >> Wes
> >>
> >


[jira] [Created] (ARROW-7772) Unable to dplyr::filter on date32 object when using open_dataset()

2020-02-04 Thread Stephanie Hazlitt (Jira)
Stephanie Hazlitt created ARROW-7772:


 Summary: Unable to dplyr::filter on date32 object when using open_dataset()
 Key: ARROW-7772
 URL: https://issues.apache.org/jira/browse/ARROW-7772
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Affects Versions: 0.15.1
 Environment: version  R version 3.6.2 (2019-12-12)
 os   macOS Mojave 10.14.6
 system   x86_64, darwin15.6.0
Reporter: Stephanie Hazlitt
 Fix For: 0.16.0


I am trying to filter on a date column using `open_dataset()` and `dplyr::filter()`:

library(arrow)
library(dplyr)

tmp <- tempfile()
dir.create(tmp)
df <- data.frame(date = Sys.Date())
write_parquet(df, file.path(tmp, "file.parquet"))

ds <- open_dataset(tmp)


ds %>%
 filter(date > as.Date("2020-02-02")) %>%
 collect()

 

This code crashes R with this error message:
{quote}/private/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T/hbtmp/apache-arrow-20200203-29929-1uoyri7/cpp/src/arrow/result.cc:28:
 ValueOrDie called on an error: NotImplemented: casting scalars of type
 date64[ms] to type date32[day]
0 arrow.so 0x000104461f1d _ZN5arrow4util7CerrLogD2Ev + 209
1 arrow.so 0x000104461e3e _ZN5arrow4util7CerrLogD0Ev + 14
2 arrow.so 0x000104461de6 _ZN5arrow4util8ArrowLogD1Ev + 34
3 arrow.so 0x00010436c57f 
_ZN5arrow8internal14DieWithMessageERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIc
 + 63
4 arrow.so 0x00010436d384 
_ZNR5arrow6ResultINSt3__110shared_ptrINS_6Scalar10ValueOrDieEv + 192
5 arrow.so 0x00010426ee1a 
_ZN5arrow7dataset23InsertImplicitCastsImpl4CastENSt3__110shared_ptrINS_8DataTypeEEEPNS3_INS0_10ExpressionEEE
 + 186
6 arrow.so 0x00010426e033 
_ZN5arrow7dataset23InsertImplicitCastsImplclERKNS0_20ComparisonExpressionE + 757
7 arrow.so 0x000104269de7 
_ZN5arrow7dataset15VisitExpressionINS0_23InsertImplicitCastsImplEEEDTclfp0_fp_EERKNS0_10ExpressionEOT_
 + 317
8 arrow.so 0x000104269c76 
_ZN5arrow7dataset19InsertImplicitCastsERKNS0_10ExpressionERKNS_6SchemaE + 36
9 arrow.so 0x000104134e04 
_Z32dataset___ScannerBuilder__FilterRKNSt3__110shared_ptrIN5arrow7dataset14ScannerBuilderEEERKNS0_INS2_10ExpressionEEE
 + 52
10 arrow.so 0x0001040f35b7 _arrow_dataset___ScannerBuilder__Filter + 135
11 libR.dylib 0x0001001f375b R_doDotCall + 955
12 libR.dylib 0x00010023d46a bcEval + 99306
13 libR.dylib 0x00010022494d Rf_eval + 445
14 libR.dylib 0x000100243209 R_execClosure + 2153
15 libR.dylib 0x00010024210a Rf_applyClosure + 346
16 libR.dylib 0x000100224e7d Rf_eval + 1773
17 libR.dylib 0x000100245880 do_begin + 432
18 libR.dylib 0x000100224b40 Rf_eval + 944
19 libR.dylib 0x000100243209 R_execClosure + 2153
20 libR.dylib 0x00010024210a Rf_applyClosure + 346
21 libR.dylib 0x00010022bd71 bcEval + 27889
22 libR.dylib 0x00010022494d Rf_eval + 445
23 libR.dylib 0x000100243209 R_execClosure + 2153
24 libR.dylib 0x00010024210a Rf_applyClosure + 346
25 libR.dylib 0x000100287675 dispatchMethod + 757
26 libR.dylib 0x000100287332 Rf_usemethod + 738
27 libR.dylib 0x000100287926 do_usemethod + 646
28 libR.dylib 0x00010022c369 bcEval + 29417
29 libR.dylib 0x00010022494d Rf_eval + 445
30 libR.dylib 0x000100243209 R_execClosure + 2153
31 libR.dylib 0x00010024210a Rf_applyClosure + 346
32 libR.dylib 0x000100224e7d Rf_eval + 1773
33 libR.dylib 0x000100243209 R_execClosure + 2153
34 libR.dylib 0x00010024210a Rf_applyClosure + 346
35 libR.dylib 0x00010022bd71 bcEval + 27889
36 libR.dylib 0x00010022494d Rf_eval + 445
37 libR.dylib 0x0001002418c3 forcePromise + 179
38 libR.dylib 0x000100224c30 Rf_eval + 1184
39 libR.dylib 0x000100247241 do_withVisible + 49
40 libR.dylib 0x000100286603 do_internal + 339
41 libR.dylib 0x00010022c369 bcEval + 29417
42 libR.dylib 0x00010022494d Rf_eval + 445
43 libR.dylib 0x000100243209 R_execClosure + 2153
44 libR.dylib 0x00010024210a Rf_applyClosure + 346
45 libR.dylib 0x00010022bd71 bcEval + 27889
46 libR.dylib 0x00010022494d Rf_eval + 445
47 libR.dylib 0x000100243209 R_execClosure + 2153
48 libR.dylib 0x00010024210a Rf_applyClosure + 346
49 libR.dylib 0x000100224e7d Rf_eval + 1773
50 libR.dylib 0x000100243209 R_execClosure + 2153
51 libR.dylib 0x00010024210a Rf_applyClosure + 346
52 libR.dylib 0x000100224e7d Rf_eval + 1773
53 libR.dylib 0x000100246bc6 do_eval + 646
54 libR.dylib 0x00010022c186 bcEval + 28934
55 libR.dylib 0x00010022494d Rf_eval + 445
56 libR.dylib 0x000100243209 R_execClosure + 2153
57 libR.dylib 0x00010024210a Rf_applyClosure + 346
58 libR.dylib 0x00010022bd71 bcEval + 27889
59 libR.dylib 0x00010022494d Rf_eval + 445
60 libR.dylib 0x0001002418c3 forcePromise + 179
61 libR.dylib 0x000100224c30 Rf_eval + 1184
62 libR.dylib 0x00010024724
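
For context on the error message above: in the Arrow format, date64 stores milliseconds since the UNIX epoch and date32 stores days since the epoch, so the scalar cast reported as NotImplemented is arithmetically a division by 86,400,000. A minimal Python sketch of that arithmetic follows; the function names are hypothetical and this is not Arrow's own implementation.

```python
import datetime

MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 milliseconds in a day


def date64_to_date32(ms_since_epoch):
    """Convert a date64 value (ms since epoch) to a date32 value (days since epoch)."""
    # Floor division keeps pre-1970 (negative) values on the correct day.
    return ms_since_epoch // MS_PER_DAY


def date32_to_date(days_since_epoch):
    """Render a date32 value as a calendar date."""
    return datetime.date(1970, 1, 1) + datetime.timedelta(days=days_since_epoch)


# 2020-02-02 00:00:00 UTC expressed as a date64 value.
ms = 1580601600000
days = date64_to_date32(ms)
print(days, date32_to_date(days))
```

The conversion only loses information when the date64 value carries a time-of-day component, which is one reason an implicit cast in this direction may deserve care.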

[jira] [Created] (ARROW-7771) [Release] Use ARROW_TMPDIR environment variable in the verification scripts instead of TMPDIR

2020-02-04 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-7771:
--

 Summary: [Release] Use ARROW_TMPDIR environment variable in the 
verification scripts instead of TMPDIR
 Key: ARROW-7771
 URL: https://issues.apache.org/jira/browse/ARROW-7771
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Developer Tools
Reporter: Krisztian Szucs
 Fix For: 1.0.0


See discussion https://github.com/apache/arrow/pull/6344#issuecomment-582128686



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Improving GitHub Actions configurations + tooling

2020-02-04 Thread Krisztián Szűcs
On Tue, Feb 4, 2020 at 6:31 PM Wes McKinney  wrote:
>
> I'm personally not too concerned with the details as long as people
> generally agree that the solution is maintainable (Kou's and Antoine's
> feedback here would be helpful) and there is not an abundant odor of
> code duplication
I'm not entirely satisfied with the current solution, but so far so good.

Before improving the current GHA setup, it'd indeed be nice to see others'
opinions and preferences.
>
> On Tue, Feb 4, 2020 at 9:48 AM Neal Richardson
>  wrote:
> >
> > What if we wrote our own action(s) to wrap up some of the boilerplate? It
> > doesn't seem that there are any off-the-shelf actions we could use to drive
> > docker-compose:
> > https://github.com/marketplace?utf8=%E2%9C%93&type=actions&query=docker-compose
> > but I don't think it would be that difficult to wrap `docker-compose pull
> > $JOB && docker-compose build $JOB && docker-compose run $JOB` or similar in
> > an action.
> >
> > Neal
> >
> > On Tue, Feb 4, 2020 at 6:57 AM Krisztián Szűcs 
> > wrote:
> >
> > > Hi,
> > >
> > > On Mon, Feb 3, 2020 at 5:25 PM Wes McKinney  wrote:
> > > >
> > > > hi folks,
> > > >
> > > > I have noticed that many of our GitHub Actions configurations are very
> > > > similar to each other
> > > They are indeed.
> > > >
> > > > https://www.diffchecker.com/eF4tHdzo
> > > >
> > > > Aside from the "copy-paste" issue, some work would have to be done to
> > > > generate a Crossbow configuration using GHA.
> > > Do you mean having GHA as a backend for crossbow? @Kou has just
> > > added support for that in [1]
> > > >
> > > > It seems like a solution to these issues is to create a program to
> > > > generate the GHA configurations (using some templates or other tools).
> > > > So what is in .github/workflows would not be edited by human hands in
> > > > general but rather generated by this program.
> > > That would be quite similar to what I've implemented in ursabot [2], just
> > > generating GHA flavored ymls instead of buildbot objects, so it seems
> > > doable. Of course we'll need commit hooks to force the regeneration of
> > > these configuration files.
> > > >
> > > > This program could also assist with local automation for
> > > > reproducibility purposes (for example, locally executing a cascade of
> > > > dependent docker-compose steps).
> > > Another independent improvement could be to ditch docker-compose
> > > completely.
> > > I'd say that 70% of the docker-compose.yml [3] and the related
> > > Dockerfiles are filled with duplication made necessary by the limited
> > > parametrization and reusability of docker and docker-compose. It also
> > > makes it harder to use new docker features like
> > > https://docs.docker.com/buildx/working-with-buildx/
> > >
> > > Again I'm referring to ursabot, where I've already implemented these
> > > ideas: the docker files [4] and the image hierarchy from the compose
> > > file [3] could be replaced by something similar to the ursabot docker
> > > utility [6]. The builder definitions [7] are the counterparts of the
> > > scripts [8], the compose command [9] and the compose parameters [10].
> > > I'm not saying that we should use the exact build definition syntax
> > > from ursabot; we can have arbitrary ymls etc., but the configuration
> > > I've referenced from ursabot approximates what we can achieve by
> > > increasing the abstraction level.
> > >
> > > Note, that the ursabot sources I've referenced are not GPL contaminated.
> > > >
> > > > Thoughts?
> > > On the other hand I see one big advantage of the current setup. By using
> > > plain GHA ymls, a single docker-compose.yml, Dockerfiles and a set of
> > > small bash scripts, our CI configuration is more idiomatic. It sits
> > > closer to the developers' expectations.
> > > It's hard for me to judge the developer friendliness of any of these "CI
> > > tools" (GHA+docker setup, crossbow, ursabot) because I was the main
> > > perpetrator of all of them, but I suppose that the new GHA setup is the
> > > easiest to work with.
> > > >
> > > Thanks, Krisztian
> > >
> > > [1] https://github.com/apache/arrow/pull/6286
> > > [2]
> > > https://github.com/ursa-labs/ursabot/blob/master/projects/arrow/master.cfg#L67-L256
> > > [3] https://github.com/apache/arrow/blob/master/docker-compose.yml
> > > [4] https://github.com/apache/arrow/tree/master/ci/docker
> > > [6]
> > > https://github.com/ursa-labs/ursabot/blob/master/projects/arrow/arrow/docker.py
> > > [7]
> > > https://github.com/ursa-labs/ursabot/blob/master/projects/arrow/arrow/builders.py#L332
> > > [8] https://github.com/apache/arrow/tree/master/ci/scripts
> > > [9] https://github.com/apache/arrow/blob/master/docker-compose.yml#L306
> > > [10] https://github.com/apache/arrow/blob/master/docker-compose.yml#L264
> > > > - Wes
> > >
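
The idea discussed in this thread (generate the files under .github/workflows from a template, wrapping the docker-compose pull/build/run sequence Neal mentions) could look roughly like the sketch below. It is only an illustration under stated assumptions: the template contents, the job list and the function names are hypothetical, and it is not what ursabot or crossbow actually do.

```python
from string import Template

# Hypothetical workflow template; ${name} and ${job} are filled in per job.
WORKFLOW_TEMPLATE = Template("""\
# Generated file, do not edit by hand.
name: ${name}
on: [push, pull_request]
jobs:
  ${job}:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run ${job} via docker-compose
        run: |
          docker-compose pull ${job}
          docker-compose build ${job}
          docker-compose run ${job}
""")


def render_workflow(name, job):
    """Return the workflow YAML text for one docker-compose service."""
    return WORKFLOW_TEMPLATE.substitute(name=name, job=job)


def render_all(jobs):
    """Map an output file name to rendered YAML for each (name, job) pair."""
    return {f"{job}.yml": render_workflow(name, job) for name, job in jobs}


# Hypothetical job list, for illustration only.
JOBS = [("C++", "conda-cpp"), ("Python", "conda-python")]

if __name__ == "__main__":
    for filename, text in render_all(JOBS).items():
        print(f"--- .github/workflows/{filename}")
        print(text)
```

A commit hook or CI check, as suggested above, would then re-run the generator and fail if the checked-in files differ from its output.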


Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-04 Thread Bryan Cutler
+1

I had some trouble due to ARROW-7760 at first, but applied the same patch
and passed. I ran the command:
TMPDIR=/tmp/arrow TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1 TEST_PYTHON=1
TEST_JAVA=1 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1
dev/release/verify-release-candidate.sh source 0.16.0 2

On Tue, Feb 4, 2020 at 10:08 AM Wes McKinney  wrote:

> +1 (binding)
>
> Some patches were required to the verification scripts but I have run:
>
> * Full source verification on Ubuntu 18.04
> * Linux binary verification
> * Source verification on Windows 10 (needed ARROW-6757)
> * Windows binary verification. Note that Python 3.8 wheel is broken
> (see ARROW-7755). Whoever uploads the wheels to PyPI _SHOULD NOT_
> upload this 3.8 wheel until we know what's wrong (if we upload a
> broken wheel then `pip install pyarrow==0.16.0` will be permanently
> broken on Windows/Python 3.8)
>
> On Mon, Feb 3, 2020 at 9:26 PM Francois Saint-Jacques
>  wrote:
> >
> > Tested on ubuntu 18.04 for the source release.
> >
> > On Mon, Feb 3, 2020 at 10:07 PM Francois Saint-Jacques
> >  wrote:
> > >
> > > +1
> > >
> > > Binaries verification didn't have any issues.
> > > Sources verification worked with some local environment hiccups
> > >
> > > François
> > >
> > > On Mon, Feb 3, 2020 at 8:46 PM Andy Grove 
> wrote:
> > > >
> > > > +1 (binding) based on running the Rust tests
> > > >
> > > > Thanks.
> > > >
> > > > On Thu, Jan 30, 2020 at 8:13 PM Krisztián Szűcs <
> szucs.kriszt...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I would like to propose the following release candidate (RC2) of
> Apache
> > > > > Arrow version 0.16.0. This is a release consisting of 728
> > > > > resolved JIRA issues[1].
> > > > >
> > > > > This release candidate is based on commit:
> > > > > 729a7689fd87572e6a14ad36f19cd579a8b8d9c5 [2]
> > > > >
> > > > > The source release rc2 is hosted at [3].
> > > > > The binary artifacts are hosted at [4][5][6][7].
> > > > > The changelog is located at [8].
> > > > >
> > > > > Please download, verify checksums and signatures, run the unit
> tests,
> > > > > and vote on the release. See [9] for how to validate a release
> candidate.
> > > > >
> > > > > The vote will be open for at least 72 hours.
> > > > >
> > > > > [ ] +1 Release this as Apache Arrow 0.16.0
> > > > > [ ] +0
> > > > > [ ] -1 Do not release this as Apache Arrow 0.16.0 because...
> > > > >
> > > > > [1]:
> > > > >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20%28Resolved%2C%20Closed%29%20AND%20fixVersion%20%3D%200.16.0
> > > > > [2]:
> > > > >
> https://github.com/apache/arrow/tree/729a7689fd87572e6a14ad36f19cd579a8b8d9c5
> > > > > [3]:
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-0.16.0-rc2
> > > > > [4]: https://bintray.com/apache/arrow/centos-rc/0.16.0-rc2
> > > > > [5]: https://bintray.com/apache/arrow/debian-rc/0.16.0-rc2
> > > > > [6]: https://bintray.com/apache/arrow/python-rc/0.16.0-rc2
> > > > > [7]: https://bintray.com/apache/arrow/ubuntu-rc/0.16.0-rc2
> > > > > [8]:
> > > > >
> https://github.com/apache/arrow/blob/729a7689fd87572e6a14ad36f19cd579a8b8d9c5/CHANGELOG.md
> > > > > [9]:
> > > > >
> https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> > > > >
>


[jira] [Created] (ARROW-7770) [Release] Archery does not use correct integration test args

2020-02-04 Thread Bryan Cutler (Jira)
Bryan Cutler created ARROW-7770:
---

 Summary: [Release] Archery does not use correct integration test 
args
 Key: ARROW-7770
 URL: https://issues.apache.org/jira/browse/ARROW-7770
 Project: Apache Arrow
  Issue Type: Bug
  Components: Archery
Reporter: Bryan Cutler
Assignee: Bryan Cutler


When using the release verification script and selecting integration tests, Archery 
ignores the selection and runs all tests.





Arrow sync call February 5 at 12:00 US/Eastern, 17:00 UTC

2020-02-04 Thread Neal Richardson
Hi all,
Reminder that our biweekly call is tomorrow at
https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will
be sent out to the mailing list afterwards (like, right after, I promise).

Neal


Re: Arrow sync call January 22 at 12:00 US/Eastern, 17:00 UTC

2020-02-04 Thread Neal Richardson
Oops, forgot to send the notes for the call from two weeks ago:

Attendees:
Projjal Chanda
Todd Hendricks
Ben Kietzman
Micah Kornfield
Rok Mihevc
Antoine Pitrou
Neal Richardson
François Saint-Jacques
Krisztián Szucs
Aaron
Luke

Discussion:
* 0.16
* C++ Result class refactor: we've stalled, with a lot still left to do. See
https://issues.apache.org/jira/browse/ARROW-7231. We should push to finish it
before 1.0, but it is not a blocker (1.0 is about hardening the
specification, not the implementations)
* offset/C data API followup: should we add an offset/filter field to the
format, since most implementations do something like that?
* Need help reviewing and merging Java patches

On Tue, Jan 21, 2020 at 3:34 PM Neal Richardson 
wrote:

> Hi all,
> Reminder that our biweekly call is tomorrow (or much later today,
> depending on your time zone) at https://meet.google.com/vtm-teks-phx. All
> are welcome to join. Notes will be sent out to the mailing list afterwards.
>
> Neal
>


Re: [Java] Issues with IntelliJ + errorprone + OpenJDK

2020-02-04 Thread Bryan Cutler
Here is where it turned up; it looks to be installed in the m2 repository:

bryan@lm-P50 ~ $ find ~/ -name "failureaccess-*.jar" -type f
/home/bryan/.m2/repository/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar
/home/bryan/.IdeaIC2019.2/system/download-cache/error-prone/2.3.3/failureaccess-1.0.1.jar

Also, the error-prone jar is in the IntelliJ plugin directory:
find ~/ -name "error-prone*.jar" -type f
/home/bryan/.IdeaIC2019.2/config/plugins/error-prone/lib/error-prone.jar
/home/bryan/.IdeaIC2019.2/config/plugins/error-prone/lib/jps/error-prone-jps-plugin.jar




On Tue, Feb 4, 2020 at 7:44 AM Andy Grove  wrote:

> Actually, central.maven.org doesn't even exist ...
>
> On Tue, Feb 4, 2020 at 8:28 AM Andy Grove  wrote:
>
> > Thanks for the help but I followed the same instructions and get this
> > error:
> >
> > Error:Failed to download error-prone compiler JARs: Failed to download '
> >
> http://central.maven.org/maven2/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar
> > ':
> > central.maven.org
> >
> > The issue is that Maven Central no longer supports http and requires
> > https. Maybe I could manually install this file somewhere? I did try
> > installing in my local m2 repo but that didn't work.
> >
> > If anyone could scan their local drive for this file and let me know
> where
> > it is installed that could unblock me.
> >
> > Thanks,
> >
> > Andy.
> >
> >
> >
> > On Mon, Feb 3, 2020 at 6:24 PM Fan Liya  wrote:
> >
> >> I was having the same problem, and it was solved by
> >>
> >> 1. Installing the "Error Prone Compiler" plugin in IntelliJ
> >> 2. Setting "Settings/Build, Execution, Deployment/Compiler/Java
> >> Compiler/Use compiler" to "Javac with error-prone"
> >>
> >> I am using Intellij 2019.3 (Community Edition)
> >>
> >> Best,
> >> Liya Fan
> >>
> >> On Tue, Feb 4, 2020 at 7:25 AM Bryan Cutler  wrote:
> >>
> >> > Ahh, now that you sent that link it jogged my memory. A while ago I
> >> think I
> >> > did see that error and installed the error prone compiler plugin
> >> mentioned.
> >> > It worked after that I believe, but I am on IntelliJ 2019.2.4 on
> Ubuntu,
> >> and
> >> > it was a while ago so maybe something changed. If there is anything I
> >> can
> >> > check to help you out, let me know.
> >> >
> >> > On Mon, Feb 3, 2020 at 12:22 PM Andy Grove 
> >> wrote:
> >> >
> >> > > So it turns out there are specific instructions [1] for using
> >> errorprone
> >> > > with IntelliJ. Unfortunately, this doesn't work due to a bug in
> >> IntelliJ
> >> > > that was fixed a few days ago but not released yet [2].
> >> > >
> >> > > [1] https://errorprone.info/docs/installation
> >> > > [2]
> >> > >
> >> > >
> >> >
> >>
> https://intellij-support.jetbrains.com/hc/en-us/community/posts/360007052380-error-prone-compile-plugin-cant-download-jar
> >> > >
> >> > >
> >> > >
> >> > > On Mon, Feb 3, 2020 at 1:10 PM Andy Grove 
> >> wrote:
> >> > >
> >> > > > Hi Bryan,
> >> > > >
> >> > > > Yes, I tried opening as a Maven project and got the same error.
> I'm
> >> > using
> >> > > > OpenJDK 1.8.0_232 on both Ubuntu 19.04 and macOS 10.14.6 and get
> the
> >> > same
> >> > > > error on both. I'm using IntelliJ Ultimate 2019.3.2. Building from
> >> the
> >> > > > command line with Maven works fine.
> >> > > >
> >> > > > Very odd. I guess I'll do a little more research on errorprone.
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Andy.
> >> > > >
> >> > > >
> >> > > > On Mon, Feb 3, 2020 at 12:50 PM Bryan Cutler 
> >> > wrote:
> >> > > >
> >> > > >> Hi Andy,
> >> > > >> What is your JDK version? I haven't seen that exact error, did
> you
> >> > open
> >> > > >> Arrow as a Maven project in Intellij?
> >> > > >>
> >> > > >> On Mon, Feb 3, 2020 at 7:47 AM Andy Grove  >
> >> > > wrote:
> >> > > >>
> >> > > >> > I'm working on the Java codebase and cannot run code inside
> >> IntelliJ
> >> > > >> and it
> >> > > >> > looks like some kind of compatibility issue between errorprone
> >> and
> >> > the
> >> > > >> JDK
> >> > > >> > that IntelliJ is using. I'm hoping other Java committers have
> >> found
> >> > a
> >> > > >> > solution already to this?
> >> > > >> >
> >> > > >> > Error:java: java.lang.RuntimeException:
> >> java.lang.NoSuchMethodError:
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > >
> >> >
> >>
> com.sun.tools.javac.util.JavacMessages.add(Lcom/sun/tools/javac/util/JavacMessages$ResourceBundleHelper;)V
> >> > > >> > Error:java: Caused by: java.lang.NoSuchMethodError:
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > >
> >> >
> >>
> com.sun.tools.javac.util.JavacMessages.add(Lcom/sun/tools/javac/util/JavacMessages$ResourceBundleHelper;)V
> >> > > >> > Error:java: at
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > >
> >> >
> >>
> com.google.errorprone.BaseErrorProneJavaCompiler.setupMessageBundle(BaseErrorProneJavaCompiler.java:202)
> >> > > >> > Error:java: at
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > >
> >> >
> >>
> com.google.errorprone.ErrorProneJavacPlugin.init(ErrorProneJavacPlu

[jira] [Created] (ARROW-7769) [Python] Default PYARROW_CMAKE_GENERATOR can yield broken libraries with MSVC if the C++ toolchain is different

2020-02-04 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-7769:
---

 Summary: [Python] Default PYARROW_CMAKE_GENERATOR can yield broken 
libraries with MSVC if the C++ toolchain is different
 Key: ARROW-7769
 URL: https://issues.apache.org/jira/browse/ARROW-7769
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Wes McKinney


See discussion in https://github.com/apache/arrow/pull/6350.

Python's setup.py defaults to the "Visual Studio 14 2015 Win64" CMake generator 
here

https://github.com/apache/arrow/blob/apache-arrow-0.16.0/python/setup.py#L130

We found in ARROW-6757 that if VS 2017 or newer was used to build the C++ 
libraries, then there can be a toolchain conflict causing segfaults. I'm not 
sure if there's a better way to infer which VS toolchain is "preferred" (based 
on which "VsDevCmd.bat" was run), but we should consider doing something 
other than what we have now.
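
One possible direction, sketched below in Python, is to let an explicit PYARROW_CMAKE_GENERATOR win and otherwise guess from the developer prompt's environment. This is an assumption-laden sketch, not what setup.py does today: it assumes the VS developer command prompt exports a VisualStudioVersion variable with values like "15.0", and the mapping table and function name are hypothetical.

```python
import os

# Assumed mapping from the VisualStudioVersion environment variable (set by
# the VS developer command prompt) to a CMake generator name.
# Note: the VS 2019 generator takes the platform via "-A x64" rather than a
# "Win64" suffix, so a real implementation would need to pass that as well.
GENERATORS = {
    "14.0": "Visual Studio 14 2015 Win64",
    "15.0": "Visual Studio 15 2017 Win64",
    "16.0": "Visual Studio 16 2019",
}
DEFAULT_GENERATOR = "Visual Studio 14 2015 Win64"


def infer_cmake_generator(environ):
    """Pick a CMake generator, preferring an explicit override."""
    explicit = environ.get("PYARROW_CMAKE_GENERATOR")
    if explicit:
        return explicit
    version = environ.get("VisualStudioVersion", "")
    return GENERATORS.get(version, DEFAULT_GENERATOR)


if __name__ == "__main__":
    print(infer_cmake_generator(os.environ))
```

Falling back to the current default keeps behavior unchanged when nothing can be inferred.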





[jira] [Created] (ARROW-7768) Implement Length and TryClone traits for Cursor<Vec<u8>> in reader.rs

2020-02-04 Thread David Kegley (Jira)
David Kegley created ARROW-7768:
---

 Summary: Implement Length and TryClone traits for Cursor<Vec<u8>> 
in reader.rs
 Key: ARROW-7768
 URL: https://issues.apache.org/jira/browse/ARROW-7768
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: David Kegley


Currently Length and TryClone are implemented for Cursor<&'a [u8]> in 
src/file/reader.rs



Attempting to create a cursor from a Vec


{code:java}
fn test_cursor_and_file_has_the_same_behaviour() {
  let mut buf: Vec<u8> = Vec::new();
  get_test_file("alltypes_plain.parquet")
    .read_to_end(&mut buf)
    .unwrap();
  let cursor = Cursor::new(buf.as_slice());
...
{code}
 

results in:

 
{code:java}
`buf` does not live long enough

borrowed value does not live long enough rustc(E0597)
reader.rs(681, 34): borrowed value does not live long enough
reader.rs(681, 34): argument requires that `buf` is borrowed for `'static`
reader.rs(691, 5): `buf` dropped here while still borrowed
{code}
 

 

Implementing Length and TryClone for Cursor<Vec<u8>> would allow for:


{code:java}
fn test_cursor_and_file_has_the_same_behaviour() {
  let mut buf: Vec<u8> = Vec::new();
  get_test_file("alltypes_plain.parquet")
    .read_to_end(&mut buf)
    .unwrap();
  let cursor = Cursor::new(buf);
  let read_from_cursor = SerializedFileReader::new(cursor).unwrap();
...
{code}

Otherwise, buf: Vec<u8> must be declared static in order to initialize a 
SerializedFileReader from a Cursor.

I'm new to Rust so perhaps this is the intended behavior, but if not I'm happy 
to submit a PR for this.

 





[jira] [Created] (ARROW-7767) [C++] Add a facility to create a Bitmap buffer from a data pointer with a specified sentinel

2020-02-04 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7767:
-

 Summary: [C++] Add a facility to create a Bitmap buffer from a 
data pointer with a specified sentinel
 Key: ARROW-7767
 URL: https://issues.apache.org/jira/browse/ARROW-7767
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, R
Reporter: Francois Saint-Jacques


This is a special case for R and other cases where the null value is 
represented by a sentinel. This would read the data pointer and return a null 
bitmap buffer where a bit is set for every row whose value is not the 
sentinel value. If no sentinel is encountered, return nullptr.


{code:c++}
template <typename CType>
Result<std::shared_ptr<Buffer>> NullBitmapFromSentinelData(
    MemoryPool* pool, const CType* data, size_t n_values, CType sentinel_value);
{code}
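
To make the intended semantics concrete, here is a small Python sketch of the same logic: one bit per value, set when the value is not the sentinel, and no bitmap at all when the sentinel never occurs. It is an illustration only, not the proposed C++ implementation; the function name is hypothetical, and the bits are packed least-significant-bit first as in Arrow validity bitmaps.

```python
def null_bitmap_from_sentinel(values, sentinel):
    """Return a validity bitmap (bytes) for values, or None if no sentinel is found.

    Bit i is set when values[i] is NOT the sentinel; bits are packed
    least-significant-bit first, as in Arrow validity bitmaps.
    """
    bitmap = bytearray((len(values) + 7) // 8)
    saw_sentinel = False
    for i, value in enumerate(values):
        if value == sentinel:
            saw_sentinel = True
        else:
            bitmap[i // 8] |= 1 << (i % 8)
    # Mirror the "return nullptr" case: no sentinel means no bitmap is needed.
    return bytes(bitmap) if saw_sentinel else None


# R represents NA_integer_ as the minimum 32-bit integer.
NA_INTEGER = -2**31

print(null_bitmap_from_sentinel([1, NA_INTEGER, 3, 4], NA_INTEGER))
print(null_bitmap_from_sentinel([1, 2, 3], NA_INTEGER))
```

Note that a plain equality test would not work for a NaN-based floating-point sentinel, since NaN does not compare equal to itself; a real implementation would need a different comparison for that case.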






[jira] [Created] (ARROW-7766) [Python][Packaging] Windows py38 wheels are built with wrong ABI tag

2020-02-04 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-7766:
--

 Summary: [Python][Packaging] Windows py38 wheels are built with 
wrong ABI tag
 Key: ARROW-7766
 URL: https://issues.apache.org/jira/browse/ARROW-7766
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging, Python
Reporter: Neal Richardson
Assignee: Neal Richardson


File paths have {{cp38m}} in them, which confuses pip.
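
For background: the "m" (pymalloc) flag was dropped from CPython's ABI tag starting with Python 3.8, so a 3.8 wheel is expected to be tagged cp38-cp38 rather than cp38-cp38m. A rough Python sketch of a check for this follows; the function names and the example file names are hypothetical, the regular expression ignores optional build tags, and only CPython 3 wheels are considered.

```python
import re


def expected_abi_tag(major, minor):
    """Return the CPython ABI tag expected for a default (non-debug) build."""
    # The "m" (pymalloc) flag was dropped from the ABI tag in Python 3.8.
    suffix = "m" if (major, minor) < (3, 8) else ""
    return f"cp{major}{minor}{suffix}"


# name-version-pythontag-abitag-platform.whl (no build tag, for simplicity)
WHEEL_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)-cp(?P<major>\d)(?P<minor>\d+)"
    r"-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def has_expected_abi_tag(filename):
    """Check that a CPython wheel's ABI tag matches its Python tag."""
    match = WHEEL_RE.match(filename)
    if match is None:
        raise ValueError(f"not a recognised CPython wheel name: {filename}")
    expected = expected_abi_tag(int(match["major"]), int(match["minor"]))
    return match["abi"] == expected


# Hypothetical file names, for illustration only.
print(has_expected_abi_tag("pyarrow-0.16.0-cp38-cp38m-win_amd64.whl"))
print(has_expected_abi_tag("pyarrow-0.16.0-cp38-cp38-win_amd64.whl"))
print(has_expected_abi_tag("pyarrow-0.16.0-cp37-cp37m-win_amd64.whl"))
```

A check along these lines could run in the packaging job before upload, so a mis-tagged wheel is caught early.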





Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-04 Thread Wes McKinney
+1 (binding)

Some patches were required to the verification scripts but I have run:

* Full source verification on Ubuntu 18.04
* Linux binary verification
* Source verification on Windows 10 (needed ARROW-6757)
* Windows binary verification. Note that Python 3.8 wheel is broken
(see ARROW-7755). Whoever uploads the wheels to PyPI _SHOULD NOT_
upload this 3.8 wheel until we know what's wrong (if we upload a
broken wheel then `pip install pyarrow==0.16.0` will be permanently
broken on Windows/Python 3.8)

On Mon, Feb 3, 2020 at 9:26 PM Francois Saint-Jacques
 wrote:
>
> Tested on ubuntu 18.04 for the source release.
>
> On Mon, Feb 3, 2020 at 10:07 PM Francois Saint-Jacques
>  wrote:
> >
> > +1
> >
> > Binaries verification didn't have any issues.
> > Sources verification worked with some local environment hiccups
> >
> > François
> >
> > On Mon, Feb 3, 2020 at 8:46 PM Andy Grove  wrote:
> > >
> > > +1 (binding) based on running the Rust tests
> > >
> > > Thanks.
> > >
> > > On Thu, Jan 30, 2020 at 8:13 PM Krisztián Szűcs 
> > > 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I would like to propose the following release candidate (RC2) of Apache
> > > > Arrow version 0.16.0. This is a release consisting of 728
> > > > resolved JIRA issues[1].
> > > >
> > > > This release candidate is based on commit:
> > > > 729a7689fd87572e6a14ad36f19cd579a8b8d9c5 [2]
> > > >
> > > > The source release rc2 is hosted at [3].
> > > > The binary artifacts are hosted at [4][5][6][7].
> > > > The changelog is located at [8].
> > > >
> > > > Please download, verify checksums and signatures, run the unit tests,
> > > > and vote on the release. See [9] for how to validate a release 
> > > > candidate.
> > > >
> > > > The vote will be open for at least 72 hours.
> > > >
> > > > [ ] +1 Release this as Apache Arrow 0.16.0
> > > > [ ] +0
> > > > [ ] -1 Do not release this as Apache Arrow 0.16.0 because...
> > > >
> > > > [1]:
> > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20%28Resolved%2C%20Closed%29%20AND%20fixVersion%20%3D%200.16.0
> > > > [2]:
> > > > https://github.com/apache/arrow/tree/729a7689fd87572e6a14ad36f19cd579a8b8d9c5
> > > > [3]: 
> > > > https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-0.16.0-rc2
> > > > [4]: https://bintray.com/apache/arrow/centos-rc/0.16.0-rc2
> > > > [5]: https://bintray.com/apache/arrow/debian-rc/0.16.0-rc2
> > > > [6]: https://bintray.com/apache/arrow/python-rc/0.16.0-rc2
> > > > [7]: https://bintray.com/apache/arrow/ubuntu-rc/0.16.0-rc2
> > > > [8]:
> > > > https://github.com/apache/arrow/blob/729a7689fd87572e6a14ad36f19cd579a8b8d9c5/CHANGELOG.md
> > > > [9]:
> > > > https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> > > >


Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-02-04-0

2020-02-04 Thread Wes McKinney
congratulations! =)

On Tue, Feb 4, 2020 at 7:55 AM Francois Saint-Jacques
 wrote:
>
> This is a first!
>
> On Tue, Feb 4, 2020 at 8:47 AM Crossbow  wrote:
> >
> >
> > Arrow Build Report for Job nightly-2020-02-04-0
> >
> > All tasks: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0
> >
> > Succeeded Tasks:
> > - centos-6:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-centos-6
> > - centos-7:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-centos-7
> > - centos-8:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-centos-8
> > - conda-linux-gcc-py27:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py27
> > - conda-linux-gcc-py36:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py36
> > - conda-linux-gcc-py37:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py37
> > - conda-linux-gcc-py38:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py38
> > - conda-osx-clang-py27:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py27
> > - conda-osx-clang-py36:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py36
> > - conda-osx-clang-py37:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py37
> > - conda-osx-clang-py38:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py38
> > - conda-win-vs2015-py36:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-win-vs2015-py36
> > - conda-win-vs2015-py37:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-win-vs2015-py37
> > - conda-win-vs2015-py38:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-win-vs2015-py38
> > - debian-buster:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-debian-buster
> > - debian-stretch:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-debian-stretch
> > - gandiva-jar-osx:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-gandiva-jar-osx
> > - gandiva-jar-trusty:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-gandiva-jar-trusty
> > - homebrew-cpp:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-homebrew-cpp
> > - macos-r-autobrew:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-macos-r-autobrew
> > - test-conda-cpp:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-cpp
> > - test-conda-python-2.7-pandas-latest:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-2.7-pandas-latest
> > - test-conda-python-2.7:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-2.7
> > - test-conda-python-3.6:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.6
> > - test-conda-python-3.7-dask-latest:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-dask-latest
> > - test-conda-python-3.7-hdfs-2.9.2:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-hdfs-2.9.2
> > - test-conda-python-3.7-pandas-latest:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-pandas-latest
> > - test-conda-python-3.7-pandas-master:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-pandas-master
> > - test-conda-python-3.7-spark-master:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-spark-master
> > - test-conda-python-3.7-turbodbc-latest:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-turbodbc-latest
> > - test-conda-python-3.7-turbodbc-master:
> >   URL: 
> > https://github.com/ursa-labs/crossbow/branches/

Re: [DISCUSS] Improving GitHub Actions configurations + tooling

2020-02-04 Thread Wes McKinney
I'm personally not too concerned with the details as long as people
generally agree that the solution is maintainable (Kou's and Antoine's
feedback here would be helpful) and there is not an abundant odor of
code duplication



[jira] [Created] (ARROW-7765) [C++] Add Result to the Visitor pattern

2020-02-04 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7765:
-

 Summary: [C++] Add Result to the Visitor pattern
 Key: ARROW-7765
 URL: https://issues.apache.org/jira/browse/ARROW-7765
 Project: Apache Arrow
  Issue Type: Sub-task
Reporter: Francois Saint-Jacques






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-7764) [C++] Builders allocate a null bitmap buffer even if there is no nulls

2020-02-04 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7764:
-

 Summary: [C++] Builders allocate a null bitmap buffer even if 
there is no nulls
 Key: ARROW-7764
 URL: https://issues.apache.org/jira/browse/ARROW-7764
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Francois Saint-Jacques


This is an optimization: a builder can coalesce the null bitmap buffer to 
nullptr when the finished array contains no nulls.
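
A minimal sketch of the idea, written in Python for brevity rather than in
Arrow's C++. ToyInt64Builder and pack_bits are hypothetical names used only
for illustration and are not part of the Arrow API. The builder counts nulls
as values are appended and hands back no validity bitmap at all when the
count is zero.

```python
from typing import List, Optional, Tuple


def pack_bits(bits: List[bool]) -> bytes:
    """Pack booleans into bytes, least-significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


class ToyInt64Builder:
    """Hypothetical builder used for illustration; not Arrow's API."""

    def __init__(self) -> None:
        self._values: List[int] = []
        self._valid: List[bool] = []
        self._null_count = 0

    def append(self, value: Optional[int]) -> None:
        is_valid = value is not None
        # Null slots still occupy a value slot; store a placeholder zero.
        self._values.append(value if is_valid else 0)
        self._valid.append(is_valid)
        if not is_valid:
            self._null_count += 1

    def finish(self) -> Tuple[Optional[bytes], List[int], int]:
        """Return (validity_bitmap, values, null_count)."""
        # The optimization: skip the bitmap entirely when nothing is null.
        bitmap = None if self._null_count == 0 else pack_bits(self._valid)
        return bitmap, self._values, self._null_count
```

Consumers then have to treat a missing bitmap as "all values valid", which
is why the null count is returned alongside it.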





Re: [DISCUSS] Improving GitHub Actions configurations + tooling

2020-02-04 Thread Neal Richardson
What if we wrote our own action(s) to wrap up some of the boilerplate? It
doesn't seem that there are any off-the-shelf actions we could use to drive
docker-compose:
https://github.com/marketplace?utf8=%E2%9C%93&type=actions&query=docker-compose
but I don't think it would be that difficult to wrap `docker-compose pull
$JOB && docker-compose build $JOB && docker-compose run $JOB` or similar in
an action.
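
To make the wrapping concrete, here is a rough sketch of what the body of
such a wrapper could look like. It is only an illustration under my own
assumptions: compose_commands and run_job are hypothetical helper names, and
packaging this as an actual GitHub Action is not shown.

```python
import shlex
import subprocess
from typing import Callable, List, Sequence


def compose_commands(job: str) -> List[List[str]]:
    """Return the pull/build/run command sequence for one docker-compose job."""
    return [
        ["docker-compose", "pull", job],
        ["docker-compose", "build", job],
        ["docker-compose", "run", job],
    ]


def run_job(job: str, runner: Callable[[Sequence[str]], int] = subprocess.call) -> int:
    """Run each step in order, stopping at the first failure like `&&` does."""
    for cmd in compose_commands(job):
        # Echo the command so the CI log shows what was executed.
        print("+", " ".join(shlex.quote(part) for part in cmd))
        status = runner(cmd)
        if status != 0:
            return status
    return 0
```

The runner argument is there only so the sequence can be exercised without
docker installed; by default it calls subprocess.call.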

Neal



Re: [Java] Issues with IntelliJ + errorprone + OpenJDK

2020-02-04 Thread Andy Grove
Actually, central.maven.org doesn't even exist ...



Re: [Java] Issues with IntelliJ + errorprone + OpenJDK

2020-02-04 Thread Andy Grove
Thanks for the help but I followed the same instructions and get this error:

Error:Failed to download error-prone compiler JARs: Failed to download '
http://central.maven.org/maven2/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar
':
central.maven.org

The issue is that Maven Central no longer supports HTTP and requires
HTTPS. Maybe I could manually install this file somewhere? I did try
installing it in my local m2 repo but that didn't work.
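
For reference, a small sketch of how the failing URL maps to an HTTPS one.
It assumes repo1.maven.org is the canonical Maven Central host, and the
helper names to_https_central and artifact_path are made up for this
example. It does not address where the IntelliJ plugin looks for the JAR.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the canonical HTTPS host for Maven Central.
CANONICAL_HOST = "repo1.maven.org"


def to_https_central(url: str) -> str:
    """Rewrite a central.maven.org URL to HTTPS on the canonical host."""
    parts = urlsplit(url)
    if parts.netloc != "central.maven.org":
        return url  # leave unrelated URLs untouched
    return urlunsplit(
        ("https", CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )


def artifact_path(group_id: str, artifact_id: str, version: str) -> str:
    """Relative path of a JAR in the standard Maven repository layout."""
    group_dir = group_id.replace(".", "/")
    return f"{group_dir}/{artifact_id}/{version}/{artifact_id}-{version}.jar"
```

The same relative path is what a local ~/.m2/repository uses, so
artifact_path also shows where a manually downloaded JAR would normally go.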

If anyone could scan their local drive for this file and let me know where
it is installed that could unblock me.

Thanks,

Andy.



On Mon, Feb 3, 2020 at 6:24 PM Fan Liya  wrote:

> I was having the same problem, and it was solved by
>
> 1. Installing the "Error Prone Compiler" plugin in IntelliJ
> 2. Setting "Settings/Build, Execution, Deployment/Compiler/Java
> Compiler/Use compiler" to "Javac with error-prone"
>
> I am using IntelliJ 2019.3 (Community Edition)
>
> Best,
> Liya Fan
>
> On Tue, Feb 4, 2020 at 7:25 AM Bryan Cutler  wrote:
>
> > Ahh, now that you sent that link it jogged my memory. A while ago I think I
> > did see that error and installed the error prone compiler plugin mentioned.
> > It worked after that I believe, but I am on IntelliJ 2019.2.4 on Ubuntu, and
> > it was a while ago so maybe something changed. If there is anything I can
> > check to help you out, let me know.
> >
> > On Mon, Feb 3, 2020 at 12:22 PM Andy Grove wrote:
> >
> > > So it turns out there are specific instructions [1] for using errorprone
> > > with IntelliJ. Unfortunately, this doesn't work due to a bug in IntelliJ
> > > that was fixed a few days ago but not released yet [2].
> > >
> > > [1] https://errorprone.info/docs/installation
> > > [2]
> > > https://intellij-support.jetbrains.com/hc/en-us/community/posts/360007052380-error-prone-compile-plugin-cant-download-jar
> > >
> > >
> > >
> > > On Mon, Feb 3, 2020 at 1:10 PM Andy Grove wrote:
> > >
> > > > Hi Bryan,
> > > >
> > > > Yes, I tried opening as a Maven project and got the same error. I'm using
> > > > OpenJDK 1.8.0_232 on both Ubuntu 19.04 and macOS 10.14.6 and get the same
> > > > error on both. I'm using IntelliJ Ultimate 2019.3.2. Building from the
> > > > command line with Maven works fine.
> > > >
> > > > Very odd. I guess I'll do a little more research on errorprone.
> > > >
> > > > Thanks,
> > > >
> > > > Andy.
> > > >
> > > >
> > > > On Mon, Feb 3, 2020 at 12:50 PM Bryan Cutler wrote:
> > > >
> > > > >> Hi Andy,
> > > > >> What is your JDK version? I haven't seen that exact error. Did you open
> > > > >> Arrow as a Maven project in IntelliJ?
> > > >>
> > > > >> On Mon, Feb 3, 2020 at 7:47 AM Andy Grove wrote:
> > > > >>
> > > > >> > I'm working on the Java codebase and cannot run code inside IntelliJ and it
> > > > >> > looks like some kind of compatibility issue between errorprone and the JDK
> > > > >> > that IntelliJ is using. I'm hoping other Java committers have found a
> > > > >> > solution already to this?
> > > > >> >
> > > > >> > Error:java: java.lang.RuntimeException: java.lang.NoSuchMethodError:
> > > > >> > com.sun.tools.javac.util.JavacMessages.add(Lcom/sun/tools/javac/util/JavacMessages$ResourceBundleHelper;)V
> > > > >> > Error:java: Caused by: java.lang.NoSuchMethodError:
> > > > >> > com.sun.tools.javac.util.JavacMessages.add(Lcom/sun/tools/javac/util/JavacMessages$ResourceBundleHelper;)V
> > > > >> > Error:java: at
> > > > >> > com.google.errorprone.BaseErrorProneJavaCompiler.setupMessageBundle(BaseErrorProneJavaCompiler.java:202)
> > > > >> > Error:java: at
> > > > >> > com.google.errorprone.ErrorProneJavacPlugin.init(ErrorProneJavacPlugin.java:40)


[jira] [Created] (ARROW-7763) [Java] Update README with instructions for IntelliJ users

2020-02-04 Thread Andy Grove (Jira)
Andy Grove created ARROW-7763:
-

 Summary: [Java] Update README with instructions for IntelliJ users
 Key: ARROW-7763
 URL: https://issues.apache.org/jira/browse/ARROW-7763
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Java
Reporter: Andy Grove
Assignee: Andy Grove
 Fix For: 1.0.0


IntelliJ needs to be configured to use the errorprone compiler and this is not 
currently documented, making it hard for new contributors to build/test the 
project. We can pretty much just link to the instructions at 
https://errorprone.info/docs/installation





Re: [DISCUSS] Improving GitHub Actions configurations + tooling

2020-02-04 Thread Krisztián Szűcs
Hi,

On Mon, Feb 3, 2020 at 5:25 PM Wes McKinney  wrote:
>
> hi folks,
>
> I have noticed that many of our GitHub Actions configurations are very
> similar to each other
They are indeed.
>
> https://www.diffchecker.com/eF4tHdzo
>
> Aside from the "copy-paste" issue, some work would have to be done to
> generate a Crossbow configuration using GHA.
Do you mean having GHA as a backend for crossbow? @Kou has just
added support for that in [1]
>
> It seems like a solution to these issues is to create a program to
> generate the GHA configurations (using some templates or other tools).
> So what is in .github/workflows would not be edited by human hands in
> general but rather generated by this program.
That would be quite similar to what I've implemented in ursabot [2], just
generating GHA flavored ymls instead of buildbot objects, so it seems
doable. Of course we'll need commit hooks to force the regeneration of
these configuration files.
>
> This program could also assist with local automation for
> reproducibility purposes (for example, locally executing a cascade of
> dependent docker-compose steps).
Another independent improvement could be to ditch docker-compose completely.
I'd say that 70% of the docker-compose.yml [3] and the related Dockerfiles are
filled with duplication made necessary by the limited parametrization and
reusability of docker and docker-compose. It also makes it harder to use new
docker features like https://docs.docker.com/buildx/working-with-buildx/

Again I'm referring to ursabot, where I've already implemented these ideas: the
Dockerfiles [4] and the image hierarchy from the compose file [3] could be
replaced by something similar to the ursabot docker utility [6].
The builder definitions [7] are the counterparts of the scripts [8], the
compose command [9] and the compose parameters [10].
I'm not saying that we should use the exact build definition syntax from
ursabot, we can have arbitrary ymls etc., but the configuration I've referenced
from ursabot approximates what we can achieve by increasing the abstraction
level.

Note that the ursabot sources I've referenced are not GPL contaminated.
>
> Thoughts?
On the other hand I see one big advantage of the current setup. By using plain
GHA ymls, a single docker-compose.yml, Dockerfiles and a set of small bash
scripts, our CI configuration is more idiomatic. It sits closer to the
developers' expectations.
It's hard for me to judge the developer friendliness of any of the "CI tools"
(GHA+docker setup, crossbow, ursabot) because I was the main perpetrator
of those, but I suppose that the new GHA setup is the easiest to work with.
>
Thanks, Krisztian

[1] https://github.com/apache/arrow/pull/6286
[2] 
https://github.com/ursa-labs/ursabot/blob/master/projects/arrow/master.cfg#L67-L256
[3] https://github.com/apache/arrow/blob/master/docker-compose.yml
[4] https://github.com/apache/arrow/tree/master/ci/docker
[6] 
https://github.com/ursa-labs/ursabot/blob/master/projects/arrow/arrow/docker.py
[7] 
https://github.com/ursa-labs/ursabot/blob/master/projects/arrow/arrow/builders.py#L332
[8] https://github.com/apache/arrow/tree/master/ci/scripts
[9] https://github.com/apache/arrow/blob/master/docker-compose.yml#L306
[10] https://github.com/apache/arrow/blob/master/docker-compose.yml#L264
> - Wes
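
To illustrate the generator idea discussed above, here is a minimal sketch
that renders a GHA workflow from templates using only the Python standard
library. The template contents, the render_workflow helper and the service
names are all assumptions made up for this example; this is not how ursabot
or crossbow do it.

```python
from string import Template

# Top-level workflow skeleton; $jobs is filled with rendered job blocks.
WORKFLOW_TEMPLATE = Template("""\
# Generated file, do not edit by hand.
name: $name
on: [push, pull_request]
jobs:
$jobs""")

# One job per docker-compose service, sharing the same boilerplate steps.
JOB_TEMPLATE = Template("""\
  $job_id:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: docker-compose pull $service
      - run: docker-compose build $service
      - run: docker-compose run $service
""")


def render_workflow(name, services):
    """Render one workflow with a job per docker-compose service."""
    jobs = "".join(
        # Job ids may not contain dots, so map them to dashes.
        JOB_TEMPLATE.substitute(job_id=service.replace(".", "-"), service=service)
        for service in services
    )
    return WORKFLOW_TEMPLATE.substitute(name=name, jobs=jobs)
```

A commit hook could then call render_workflow and compare the result with
the committed file under .github/workflows, failing when they differ.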


[jira] [Created] (ARROW-7762) [Python] Exceptions in ParquetWriter get ignored

2020-02-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-7762:


 Summary: [Python] Exceptions in ParquetWriter get ignored
 Key: ARROW-7762
 URL: https://issues.apache.org/jira/browse/ARROW-7762
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Joris Van den Bossche


For example:

{code:python}
# Assumes: import pyarrow as pa; import pyarrow.parquet as pq
In [43]: table = pa.table({'a': [1, 2, 3]})

In [44]: pq.write_table(table, "test.parquet", version="2.2")
---
ArrowException                            Traceback (most recent call last)
ArrowException: Unsupported Parquet format version
Exception ignored in: 'pyarrow._parquet.ParquetWriter._set_version'
pyarrow.lib.ArrowException: Unsupported Parquet format version
{code}







Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-02-04-0

2020-02-04 Thread Francois Saint-Jacques
This is a first!


[NIGHTLY] Arrow Build Report for Job nightly-2020-02-04-0

2020-02-04 Thread Crossbow


Arrow Build Report for Job nightly-2020-02-04-0

All tasks: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0

Succeeded Tasks:
- centos-6:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-centos-6
- centos-7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-centos-7
- centos-8:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-centos-8
- conda-linux-gcc-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py27
- conda-linux-gcc-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py36
- conda-linux-gcc-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py37
- conda-linux-gcc-py38:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-linux-gcc-py38
- conda-osx-clang-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py27
- conda-osx-clang-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py36
- conda-osx-clang-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py37
- conda-osx-clang-py38:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-osx-clang-py38
- conda-win-vs2015-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-win-vs2015-py36
- conda-win-vs2015-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-win-vs2015-py37
- conda-win-vs2015-py38:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-conda-win-vs2015-py38
- debian-buster:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-debian-buster
- debian-stretch:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-azure-debian-stretch
- gandiva-jar-osx:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-gandiva-jar-osx
- gandiva-jar-trusty:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-gandiva-jar-trusty
- homebrew-cpp:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-homebrew-cpp
- macos-r-autobrew:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-travis-macos-r-autobrew
- test-conda-cpp:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-cpp
- test-conda-python-2.7-pandas-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-2.7-pandas-latest
- test-conda-python-2.7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-2.7
- test-conda-python-3.6:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.6
- test-conda-python-3.7-dask-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-dask-latest
- test-conda-python-3.7-hdfs-2.9.2:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-hdfs-2.9.2
- test-conda-python-3.7-pandas-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-pandas-latest
- test-conda-python-3.7-pandas-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-pandas-master
- test-conda-python-3.7-spark-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-spark-master
- test-conda-python-3.7-turbodbc-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-turbodbc-latest
- test-conda-python-3.7-turbodbc-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7-turbodbc-master
- test-conda-python-3.7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.7
- test-conda-python-3.8-dask-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.8-dask-master
- test-conda-python-3.8-pandas-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-04-0-circle-test-conda-python-3.8-pandas-latest
- test-c

[jira] [Created] (ARROW-7761) [C++] Add S3 support to fs::FileSystemFromUri

2020-02-04 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7761:
-

 Summary: [C++] Add S3 support to fs::FileSystemFromUri
 Key: ARROW-7761
 URL: https://issues.apache.org/jira/browse/ARROW-7761
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Francois Saint-Jacques


FileSystemFromUri doesn't support S3. Adding it would give almost immediate 
S3 support in Python/R.
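
As a rough illustration of what choosing a filesystem from a URI involves, here is a minimal Python sketch. It is not Arrow's implementation: the function name `filesystem_kind_from_uri`, the backend labels, and the table of schemes are all hypothetical, and only the idea of adding an `s3` case comes from the issue.

```python
from urllib.parse import urlparse

# Hypothetical scheme-to-backend table; the labels are illustrative only
# and do not correspond to Arrow class names.
_BACKENDS = {
    "": "local",      # bare paths such as /tmp/x
    "file": "local",
    "hdfs": "hdfs",
    "s3": "s3",       # the case this issue proposes to add
}


def filesystem_kind_from_uri(uri):
    """Return (backend label, path inside that filesystem) for a URI."""
    parsed = urlparse(uri)
    scheme = parsed.scheme
    if scheme not in _BACKENDS:
        raise ValueError(f"unsupported URI scheme: {scheme!r}")
    if scheme == "s3":
        # Assumption: for S3 the URI authority is treated as the bucket name,
        # so it is kept as the first component of the returned path.
        path = parsed.netloc + parsed.path
    else:
        path = parsed.path
    return _BACKENDS[scheme], path
```

With a single entry point like this, a binding only needs to pass the URI string through, which is presumably why the issue expects S3 support to reach Python and R with little extra work.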



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-7760) [Release] Fix verify-release-candidate.sh since pip3 seems to no longer be in miniconda

2020-02-04 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-7760:
--

 Summary: [Release] Fix verify-release-candidate.sh since pip3 
seems to no longer be in miniconda
 Key: ARROW-7760
 URL: https://issues.apache.org/jira/browse/ARROW-7760
 Project: Apache Arrow
  Issue Type: Bug
  Components: Developer Tools
Reporter: Krisztian Szucs








Re: [Gandiva] LLVM version

2020-02-04 Thread Projjal Chanda
Hi Kou,
Sure. I will let you know.

Regards,
Projjal

On Tue, Feb 4, 2020, at 2:20 AM, Sutou Kouhei wrote:
> Hi Projjal,
> 
> > Let me test the change by running it with Dremio.
> 
> Thanks!
> 
> > Will update if there are any issues.
> 
> So we can move forward if we don't hear back from you
> within a week (too long? too short?), right?
> 
> 
> Thanks,
> --
> kou
> 
> In <29da1c69-6f14-45aa-8ea6-4293dc615...@www.fastmail.com>
>  "Re: [Gandiva] LLVM version" on Mon, 03 Feb 2020 11:26:43 +0530,
>  "Projjal Chanda"  wrote:
> 
> > Hi Kou,
> > Let me test the change by running it with Dremio. Will update if there are 
> > any issues.
> > 
> > Regards,
> > Projjal
> > 
> > On Mon, Feb 3, 2020, at 9:11 AM, Wes McKinney wrote:
> >> hi Kou,
> >> 
> >> Since nearly 2 weeks have passed, and the changes do not seem too
> >> risky, absent more comments I think it's safe to move forward with the
> >> upgrade.
> >> 
> >> - Wes
> >> 
> >> On Sun, Feb 2, 2020 at 6:55 PM Sutou Kouhei  wrote:
> >> >
> >> > Hi,
> >> >
> >> > Does Gandiva have any policy about LLVM version?
> >> >
> >> > The current Gandiva requires LLVM 7. Other LLVM versions
> >> > aren't supported. But the latest LLVM is 9. Can we upgrade
> >> > LLVM?
> >> >
> >> > Homebrew provides LLVM 4, 6, 7, 8 and 9 but doesn't accept an
> >> > apache-arrow package that depends on an outdated LLVM:
> >> >
> >> > https://github.com/Homebrew/homebrew-core/pull/42385
> >> >
> >> > This means that the apache-arrow package on Homebrew can't
> >> > enable Gandiva until we upgrade LLVM to the latest version.
> >> >
> >> >
> >> > We have a pull request that upgrades the supported LLVM to 8:
> >> > https://github.com/apache/arrow/pull/6266
> >> >
> >> > In the pull request, Wes pinged the Gandiva developers, but
> >> > there have been no responses.
> >> >
> >> >
> >> > In the pull request, there are no Gandiva changes. So we
> >> > will be able to support LLVM 7 and 8 without any #ifdef.
> >> > Can we support multiple LLVM versions? Or should we support
> >> > only one LLVM version?
> >> >
> >> >
> >> > I think that we can consider C++ tools provided by LLVM such
> >> > as clang-format separately. We will be able to use different
> >> > LLVM versions for Gandiva and C++ tools. For example, we
> >> > will be able to use LLVM 8 for Gandiva and LLVM 7 for
> >> > clang-format at the same time by improving our CMake
> >> > configuration.
> >> >
> >> >
> >> > Thanks,
> >> > --
> >> > kou
> >> 
>
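
The version policy discussed in the thread above (Gandiva accepting LLVM 7 and 8, with clang-format pinned separately) can be pictured as a per-consumer table of accepted major versions. The following is a minimal Python sketch of that idea only; the table, the consumer names, and both function names are hypothetical and are not taken from Arrow's CMake configuration.

```python
# Hypothetical policy table: which LLVM major versions each consumer accepts.
# The values mirror the example in the thread and are not Arrow's real config.
ACCEPTED_LLVM_MAJORS = {
    "gandiva": {7, 8},
    "clang-format": {7},
}


def llvm_major(version):
    """Extract the major version from a string such as '8.0.1'."""
    return int(version.split(".")[0])


def is_accepted(consumer, version):
    """Return True if this consumer accepts the given LLVM version."""
    return llvm_major(version) in ACCEPTED_LLVM_MAJORS[consumer]
```

Keeping the accepted versions per consumer, rather than as one project-wide pin, is what would let Gandiva and the formatting tools move to newer LLVM releases independently.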