Re: [Rust] Long compile times causing CI to fail

2019-09-08 Thread Krisztián Szűcs
Hey Paddy,

You can also take a look at the buildbot build times (tab) at
https://ci.ursalabs.org/#/builders/93

On Sun, Sep 8, 2019 at 2:58 AM paddy horan  wrote:

> Hi All,
>
> We have recently had a lot of CI builds fail for Rust due to long compile
> times. This was first pointed out by Francois on the following PR:
> https://github.com/apache/arrow/pull/5303
>
> However, it seems unrelated to this change, as the following PRs are
> failing for the same reason:
> https://github.com/apache/arrow/pull/5310
> https://github.com/apache/arrow/pull/5309
>
> I'm not sure what has changed (4 days ago I posted a PR that did not have
> this issue, https://github.com/apache/arrow/pull/5269), but the increase
> might be due to updates to the nightly compiler. I will try to find out.
> Any other ideas?
>
> Paddy


Re: [PROPOSAL] Consolidate Arrow's CI configuration

2019-09-08 Thread Krisztián Szűcs
On Sat, Sep 7, 2019 at 9:54 AM Sutou Kouhei  wrote:

> Hi,
>
> I may have some Ursabot experience because I've tried to create an
> Ursabot configuration for GLib:
>
>   https://github.com/ursa-labs/ursabot/pull/172

Which is great, thanks for doing that!

>
>
> I like the proposal to consolidate the CI configuration into the
> Arrow repository. But I like the current docker-compose-based
> approach for describing how to run each CI job.
>
> I know that Krisztián pointed out that the docker-compose-based
> approach has a problem with Docker image dependency
> resolution.
>
>
> https://lists.apache.org/thread.html/fd801fa85c3393edd0db415d70dbc4c3537a811ec8587a6fbcc842cd@%3Cdev.arrow.apache.org%3E
>
> > The "docker-compose setup"
> > --
> > ...
> > However docker-compose is not suitable for building and running
> > hierarchical images. This is why we have added Makefile [1] to execute
> > a "build" with a single make command instead of manually executing
> > multiple commands involving multiple images (which is error prone). It
> > can also leave a lot of garbage after both containers and images.
> > ...
> > [1]: https://github.com/apache/arrow/blob/master/Makefile.docker
>
> But while creating the c_glib configuration for Ursabot, I felt that I
> would rather use a widely used approach than our own Python-based DSL.
> If we use a widely used approach, we can apply it in other projects
> too, which lowers the learning cost.
>
> I also felt that creating a class for each command to get a
> readable DSL is over-engineering. I know that I could use the
> raw ShellCommand class, but that would break consistency.

I added those command aliases mostly for convenience, and so that we don't
forget to customize them a bit. Each command can be customized to parse,
e.g., the number of failing/warning/succeeded test cases from a step and
create a summary, which can greatly improve the readability of the build
log. They can set different behaviours for different states, use locks
across the whole CI, and do other dynamic things, like triggering other
schedulers.
These commands are not shell commands: we can represent more with buildbot
build steps than with shell scripts. The conversion would also work from
buildbot BuildSteps to bash scripts by mocking out the non-ShellCommand
steps. Thus the buildbot DSL can be executed as a shell script, whereas a
shell script cannot represent certain logic that would be useful for the
hosted build master.
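
To make the conversion idea concrete, here is a minimal Python sketch of rendering a list of build steps as a bash script while mocking out the steps that are not shell commands. The classes ShellStep and MasterOnlyStep and the function render_bash are hypothetical stand-ins invented for this illustration; they are not buildbot or Ursabot APIs.

```python
import shlex
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class ShellStep:
    """Hypothetical stand-in for a step that runs a command on a worker."""
    name: str
    command: Sequence[str]


@dataclass
class MasterOnlyStep:
    """Hypothetical step that only makes sense on the build master,
    for example triggering another scheduler."""
    name: str


Step = Union[ShellStep, MasterOnlyStep]


def render_bash(steps: List[Step]) -> str:
    """Render shell steps as script lines; mock out everything else."""
    lines = ["#!/usr/bin/env bash", "set -e"]
    for step in steps:
        if isinstance(step, ShellStep):
            lines.append(f"# {step.name}")
            # Quote each argument so the script is safe to execute.
            lines.append(" ".join(shlex.quote(arg) for arg in step.command))
        else:
            # Non-shell steps cannot be expressed in bash, so leave a note.
            lines.append(f"# skipped (not a shell step): {step.name}")
    return "\n".join(lines) + "\n"


steps = [
    ShellStep("configure", ["cmake", "-DARROW_BUILD_TESTS=ON", ".."]),
    ShellStep("build", ["make", "-j4"]),
    MasterOnlyStep("trigger downstream scheduler"),
]
script = render_bash(steps)
print(script)
```

The reverse direction is the hard one: a plain script has nowhere to record the information carried by the skipped step.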

>
> For example:
>
>   Creating Meson class to run meson command:
>
> https://github.com/ursa-labs/ursabot/pull/172/files#diff-663dab3e9eab42dfac85d2fdb69c7e95R313-R315
>
> How about just creating a wrapper script for docker-compose
> instead of creating DSL?
>
I've also tried to figure out a way to reuse the bits from the
docker-compose setup, but after some time I realised that it would be
easier to generate bash scripts and docker-compose.yml from the buildbot
DSL, because it represents more abstractions.
Additionally, docker-compose was not convenient at first either: it took a
couple of iterations to reach the current state, which balances the
limitations of docker-compose against Arrow's requirements.
While docker-compose and the docker builders would work for Linux and
Windows builds, they would fall short on other platforms.

>
> For example, we will be able to use labels [labels] to put
> metadata to each image:
>
> 
> diff --git a/arrow-docker-compose b/arrow-docker-compose
> new file mode 100755
> index 000000000..fcb7f5e37
> --- /dev/null
> +++ b/arrow-docker-compose
> @@ -0,0 +1,13 @@
> +#!/usr/bin/env ruby
> +
> +require "yaml"
> +
> +if ARGV == ["build", "c_glib"]
> +  config = YAML.load(File.read("docker-compose.yml"))
> +  from = config["services"]["c_glib"]["build"]["labels"]["org.apache.arrow.from"]
> +  if from
> +    system("docker-compose", "build", from)
> +  end
> +end
> +system("docker-compose", *ARGV)
> diff --git a/docker-compose.yml b/docker-compose.yml
> index 4f3f4128a..acd649a19 100644
> --- a/docker-compose.yml
> +++ b/docker-compose.yml
> @@ -103,6 +103,8 @@ services:
>      build:
>        context: .
>        dockerfile: c_glib/Dockerfile
> +      labels:
> +        "org.apache.arrow.from": cpp
>      volumes: *ubuntu-volumes
>
>    cpp:
> 
>
> "./arrow-docker-compose build c_glib" runs
> "docker-compose build cpp" then
> "docker-compose build c_glib".
>
> [labels] https://docs.docker.com/compose/compose-file/#labels
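
For illustration, the label-based lookup in the wrapper above can be sketched in Python as follows. This is a hedged sketch and not part of the proposed patch: the function build_order is a hypothetical name, the services dictionary stands in for a parsed docker-compose.yml, and unlike the Ruby wrapper (which handles a single parent for c_glib only) it follows the whole chain of "org.apache.arrow.from" labels.

```python
from typing import Dict, List

# Label name taken from the patch above.
LABEL = "org.apache.arrow.from"


def build_order(services: Dict[str, dict], target: str) -> List[str]:
    """Return the services to build, parents first, ending with target."""
    order: List[str] = []
    seen = set()
    current = target
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in image dependencies at {current!r}")
        seen.add(current)
        order.append(current)
        # A service without the label has no parent image to build first.
        labels = services[current].get("build", {}).get("labels", {})
        current = labels.get(LABEL)
    order.reverse()
    return order


# Stand-in for the parsed docker-compose.yml (an assumption for this sketch).
services = {
    "cpp": {"build": {"context": "."}},
    "c_glib": {"build": {"context": ".", "labels": {LABEL: "cpp"}}},
}

commands = [["docker-compose", "build", name]
            for name in build_order(services, "c_glib")]
for command in commands:
    print(" ".join(command))
```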
>
>
> If we just have convenient docker-compose wrapper, can we
> use raw Buildbot that just runs the docker-compose wrapper?
>
> I also know that Krisztián pointed out using docker-compose
> from Buildbot approach has some problems.
>
>
> https://lists.apache.org/thread.html/fd801fa85c3393edd0db415d70dbc4c3537a811ec8587a6fbcc842cd@%3Cdev.arrow.apache.org%3E
>
> > Use docker-compose from ursabot?
> > 
> >
> > So assume that we should use docker-compose commands in the buildbot
> > b

Can the R interface to write_parquet accept strings?

2019-09-08 Thread Daniel Feenberg
Can the R interface to Arrow Parquet write string data? Take the
following script:

   library(arrow)
   library(tidyverse)
   write_parquet(table = tibble(y = c("a", "b", "c")), file = "string.parquet")

I get the error message:

   Error in write_parquet_file(to_arrow(table), file) :
   Arrow error: IOError: Metadata contains Thrift LogicalType that is
   not recognized.

after warnings that stats::filter(), stats::lag() and
arrow::read_table() are masked, but I assume that isn't the problem.
This is with R 3.5.1 and arrow_0.14.1.1


Daniel Feenberg


[jira] [Created] (ARROW-6486) [Python] Allow subclassing & monkey-patching of Table

2019-09-08 Thread ARF (Jira)
ARF created ARROW-6486:
--

 Summary: [Python] Allow subclassing & monkey-patching of Table
 Key: ARROW-6486
 URL: https://issues.apache.org/jira/browse/ARROW-6486
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: ARF


Currently, many classes in {{pyarrow}} behave strangely to the Python user: 
they are neither subclassable nor monkey-patchable.

 

{{>>> import pyarrow as pa}}
{{>>> class MyTable(pa.Table):}}
{{... pass}}
{{...}}
{{>>> table = MyTable.from_arrays([], [])}}
{{>>> type(table)}}
{{}}

The factory method did not return an instance of our subclass...

Never mind, let's monkey-patch {{Table}}:

{{>>> pa.TableOriginal = pa.Table}}
{{>>> pa.Table = MyTable}}
{{>>> table = pa.Table.from_arrays([], [])}}
{{>>> type(table)}}
{{}}
{{}}

 

OK, that did not work either.

Let's be sneaky:

{{>>> table.__class__ = MyTable}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{TypeError: __class__ assignment only supported for heap types or ModuleType subclasses}}
{{>>>}}

 

There is currently no way to modify or extend the behaviour of a {{Table}} 
instance. Users can use only what {{pyarrow}} provides out of the box. This 
is likely to be a source of frustration for many Python users.
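
As a plain-Python illustration of the first symptom, the sketch below contrasts a factory method that constructs the base class directly with one that constructs {{cls}}. It only shows the general pattern; the {{Table}} class here is a toy stand-in, the method name {{from_arrays_hardcoded}} is hypothetical, and none of this is the Cython code in {{pyarrow}} or the attached PR.

```python
class Table:
    """Toy stand-in for illustration only; not pyarrow.Table."""

    def __init__(self, columns):
        self.columns = list(columns)

    @classmethod
    def from_arrays_hardcoded(cls, arrays):
        # Names the base class explicitly, so subclasses are ignored.
        return Table(arrays)

    @classmethod
    def from_arrays(cls, arrays):
        # Uses cls, so a subclass gets an instance of itself back.
        return cls(arrays)


class MyTable(Table):
    pass


hardcoded = MyTable.from_arrays_hardcoded([])
cooperative = MyTable.from_arrays([])
print(type(hardcoded).__name__)    # Table
print(type(cooperative).__name__)  # MyTable
```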

 

The attached PR remedies this for the {{Table}} class:

{{>>> import pyarrow as pa}}
{{>>> class MyTable(pa.Table):}}
{{... pass}}
{{...}}
{{>>> table = MyTable.from_arrays([], [])}}
{{>>> type(table)}}
{{}}
{{>>>}}
{{>>> pa.TableOriginal = pa.Table}}
{{>>> pa.Table = MyTable}}
{{>>> table = pa.Table.from_arrays([], [])}}
{{>>> type(table)}}
{{}}
{{>>>}}

 

Ideally, these modifications would be extended to the other Cython-defined 
classes of {{pyarrow}}, but given that {{Table}} is likely to be the interface 
that most users start with, I thought this would be a good start.

Keeping the changes limited to a single class should also keep merge conflicts 
manageable.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (ARROW-6487) [Rust] [DataFusion] Create test utils module

2019-09-08 Thread Andy Grove (Jira)
Andy Grove created ARROW-6487:
-

 Summary: [Rust] [DataFusion] Create test utils module
 Key: ARROW-6487
 URL: https://issues.apache.org/jira/browse/ARROW-6487
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove
 Fix For: 0.15.0


I've been learning how to better organize unit test code in Rust and would like 
to introduce a test utils module containing common test helper functions. This 
code will use {{#[cfg(test)]}} to make sure it doesn't ship with the production 
code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (ARROW-6488) [Python] pyarrow.NULL equals to itself

2019-09-08 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6488:


 Summary: [Python] pyarrow.NULL equals to itself
 Key: ARROW-6488
 URL: https://issues.apache.org/jira/browse/ARROW-6488
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Joris Van den Bossche
 Fix For: 0.15.0


Somewhat related to ARROW-6386 on the interpretation of nulls, we currently 
have the following behaviour:

{code}
In [28]: pa.NULL == pa.NULL
Out[28]: True
{code}

I think this is certainly unexpected for a null / missing value. I still need 
to check what the array-level compare kernel does (NULL or False? Ideally NULL, 
I think), but we should follow that.
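
To show what "NULL" as the result of a comparison could look like, here is a minimal pure-Python sketch of a null singleton whose comparisons propagate null instead of returning True or False. The class {{NullType}} is hypothetical and is not how {{pyarrow}} implements {{pa.NULL}}; making {{bool()}} raise is an extra assumption of this sketch, not something the issue asks for.

```python
class NullType:
    """Hypothetical null singleton; not pyarrow's implementation."""

    _instance = None

    def __new__(cls):
        # Keep a single shared instance.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __eq__(self, other):
        # Comparing with a missing value yields a missing value.
        return self

    def __ne__(self, other):
        return self

    # Defining __eq__ disables hashing unless it is restored explicitly.
    __hash__ = object.__hash__

    def __bool__(self):
        # Assumption of this sketch: refuse to pick True or False.
        raise TypeError("truth value of NULL is ambiguous")

    def __repr__(self):
        return "NULL"


NULL = NullType()
result = NULL == NULL
print(result)     # NULL
print(1 == NULL)  # NULL
```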



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (ARROW-6489) [Developer][Documentation]Fix merge script and readme

2019-09-08 Thread Kenta Murata (Jira)
Kenta Murata created ARROW-6489:
---

 Summary: [Developer][Documentation]Fix merge script and readme
 Key: ARROW-6489
 URL: https://issues.apache.org/jira/browse/ARROW-6489
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Developer Tools
Reporter: Kenta Murata
Assignee: Kenta Murata


The following things should be fixed.

- merge_arrow_pr.py shouldn't be affected by git's merge.ff value.
- README should document APACHE_JIRA_USERNAME and APACHE_JIRA_PASSWORD.
- README should state that users need to install the requests and jira 
libraries before running merge_arrow_pr.py.
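
A minimal sketch of the first two points, under stated assumptions: it assumes the script performs a squash merge and reads the two values from environment variables, and the helper names squash_merge_command and jira_credentials are hypothetical, not taken from merge_arrow_pr.py. The sketch relies only on the fact that an option given on the git command line, such as --ff, takes precedence over the configured merge.ff value.

```python
import os
from typing import List, Mapping, Optional, Tuple


def squash_merge_command(branch: str) -> List[str]:
    """Build a merge command that does not depend on the user's merge.ff."""
    # An explicit --ff on the command line overrides merge.ff from git config.
    return ["git", "merge", "--ff", "--squash", branch]


def jira_credentials(environ: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Read the Jira credentials, failing early with a clear message."""
    env = os.environ if environ is None else environ
    names = ("APACHE_JIRA_USERNAME", "APACHE_JIRA_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("please set " + ", ".join(missing))
    return env[names[0]], env[names[1]]


print(" ".join(squash_merge_command("pr-branch")))
```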



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [ANNOUNCE] New committers: Ben Kietzman, Kenta Murata, and Neal Richardson

2019-09-08 Thread Joris Van den Bossche
Congratulations!

On Sat, 7 Sep 2019 at 20:54, Rok Mihevc  wrote:

> Congrats all!
>
> On Sat, Sep 7, 2019 at 5:02 AM Bryan Cutler  wrote:
>
> > Congrats Ben, Kenta and Neal!
> >
> > On Fri, Sep 6, 2019, 12:15 PM Krisztián Szűcs wrote:
> >
> > > Congratulations!
> > >
> > > On Fri, Sep 6, 2019 at 8:12 PM Ben Kietzman wrote:
> > >
> > > > Thanks!
> > > >
> > > > On Fri, Sep 6, 2019 at 1:09 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > > >
> > > > > Congrats everyone! (apologies if I double sent this).
> > > > >
> > > > > On Fri, Sep 6, 2019 at 10:06 AM Neal Richardson <neal.p.richard...@gmail.com> wrote:
> > > > >
> > > > > > Thanks, y'all!
> > > > > >
> > > > > > On Fri, Sep 6, 2019 at 5:44 AM David Li wrote:
> > > > > > >
> > > > > > > Congrats all! :)
> > > > > > >
> > > > > > > Best,
> > > > > > > David
> > > > > > >
> > > > > > > On 9/6/19, Francois Saint-Jacques wrote:
> > > > > > > > Congrats to everyone!
> > > > > > > >
> > > > > > > > François
> > > > > > > >
> > > > > > > > On Fri, Sep 6, 2019 at 4:34 AM Kenta Murata wrote:
> > > > > > > >>
> > > > > > > >> Thank you very much everyone!
> > > > > > > >> I'm very happy to join this community.
> > > > > > > >>
> > > > > > > >> On Fri, Sep 6, 2019 at 12:39, Micah Kornfield wrote:
> > > > > > > >>
> > > > > > > >> >
> > > > > > > >> > Congrats everyone.
> > > > > > > >> >
> > > > > > > >> > On Thu, Sep 5, 2019 at 7:06 PM Ji Liu wrote:
> > > > > > > >> >
> > > > > > > >> > > Congratulations!
> > > > > > > >> > >
> > > > > > > >> > > Thanks,
> > > > > > > >> > > Ji Liu
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > --
> > > > > > > >> > > From: Fan Liya
> > > > > > > >> > > Send Time: Fri, Sep 6, 2019, 09:28
> > > > > > > >> > > To: dev
> > > > > > > >> > > Subject: Re: [ANNOUNCE] New committers: Ben Kietzman, Kenta Murata, and Neal Richardson
> > > > > > > >> > >
> > > > > > > >> > > Big congratulations to Ben, Kenta and Neal!
> > > > > > > >> > >
> > > > > > > >> > > Best,
> > > > > > > >> > > Liya Fan
> > > > > > > >> > >
> > > > > > > >> > > On Fri, Sep 6, 2019 at 5:33 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > > > > >> > >
> > > > > > > >> > > > hi all,
> > > > > > > >> > > >
> > > > > > > >> > > > on behalf of the Arrow PMC, I'm pleased to announce that Ben, Kenta,
> > > > > > > >> > > > and Neal have accepted invitations to become Arrow committers.
> > > > > > > >> > > > Welcome and thank you for all your contributions!
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >> Kenta Murata
> > > > > > > >> OpenPGP FP = 1D69 ADDE 081C 9CC2 2E54  98C1 CEFE 8AFB 6081 B062
> > > > > > > >>
> > > > > > > >> I wrote a book!!
> > > > > > > >> 『Ruby 逆引きレシピ』 http://www.amazon.co.jp/dp/4798119881/mrkn-22
> > > > > > > >>
> > > > > > > >> E-mail: m...@mrkn.jp
> > > > > > > >> twitter: http://twitter.com/mrkn/
> > > > > > > >> blog: http://d.hatena.ne.jp/mrkn/
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>