Re: [ANNOUNCE] New Arrow committer: Dane Pitkin

2024-05-07 Thread Nic Crane
Congrats Dane, well deserved!

On Tue, 7 May 2024 at 15:16, Gang Wu  wrote:
>
> Congratulations Dane!
>
> Best,
> Gang
>
> On Tue, May 7, 2024 at 10:12 PM Ian Cook  wrote:
>
> > Congratulations Dane!
> >
> > On Tue, May 7, 2024 at 10:10 AM Alenka Frim wrote:
> >
> > > Yay, congratulations Dane!!
> > >
> > > On Tue, May 7, 2024 at 4:00 PM Rok Mihevc  wrote:
> > >
> > > > Congrats Dane!
> > > >
> > > > Rok
> > > >
> > > > On Tue, May 7, 2024 at 3:57 PM wish maple 
> > > wrote:
> > > >
> > > > > Congrats!
> > > > >
> > > > > Best,
> > > > > Xuwei Fu
> > > > >
> > > > > > On Tue, May 7, 2024 at 21:53, Joris Van den Bossche wrote:
> > > > >
> > > > > > On behalf of the Arrow PMC, I'm happy to announce that Dane Pitkin
> > > has
> > > > > > accepted an invitation to become a committer on Apache Arrow.
> > > Welcome,
> > > > > > and thank you for your contributions!
> > > > > >
> > > > > > Joris
> > > > > >
> > > > >
> > > >
> > >
> >


[ANNOUNCE] New Arrow committer: Bryce Mecum

2024-03-17 Thread Nic Crane
On behalf of the Arrow PMC, I'm happy to announce that Bryce Mecum has
accepted an invitation to become a committer on Apache Arrow. Welcome, and
thank you for your contributions!

Nic


Re: New tag for releases for R-universe

2024-02-10 Thread Nic Crane
Thanks Raul for asking.  For the moment, to get it set up, it was simplest
to just have the one tag, which we can then move to a different commit upon
the next release.  Jacob has started pushing the CRAN maintenance branches
to the main repo (from 14.0.2 onwards) with "-cran" as the suffix, so the
head of each CRAN release branch will also match the version on R-universe,
since it's the same version.  We could rename them to something other than
"-cran" to make it less ambiguous though.
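
For illustration only, moving a tag like this could look roughly as
follows.  This is a minimal sketch in a throwaway repository, not
necessarily the exact steps used on the Arrow repo; the identity and the
commit messages are made up for the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing here touches a real checkout.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Hypothetical identity and empty commits, only for this demo.
commit() {
  git -c user.name="Demo" -c user.email="demo@example.invalid" \
    commit -q --allow-empty -m "$1"
}

commit "first release"
first="$(git rev-parse HEAD)"
git tag r-universe-release        # tag marks the current release commit

commit "second release"
second="$(git rev-parse HEAD)"
git tag -f r-universe-release     # -f moves the existing tag to HEAD

# On a shared remote the moved tag would also need a forced push, e.g.
#   git push --force origin refs/tags/r-universe-release
tagged="$(git rev-parse "r-universe-release^{commit}")"
echo "tag moved from $first to $tagged"
```

A versioned tag per release (as Raul suggests below) would avoid the
forced move, at the cost of R-universe needing to be pointed at a new
tag name each time.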

On Sat, 10 Feb 2024 at 21:17, Raúl Cumplido  wrote:

> Hi,
>
> Thanks for doing this! Only one question: would there be any downside on
> the R side to having the tag carry a version, so that we keep a history in
> the repo?
>
> Something along the lines of "r-universe-release-15.0.0".
>
> Maybe not relevant at the moment but if in the future we decide to have
> some long term support releases or something like this it might be
> relevant.
>
> Thanks,
> Raúl
>
>
> On Sat, 10 Feb 2024, 22:10, Jonathan Keane wrote:
>
> > Thanks for this Nic.
> >
> > And just to clarify: the latest here is the latest _release_ of Apache
> > Arrow with this new setup. Prior to this, the builds available on
> > R-universe were effectively dev builds (commits to main), but with this
> > new tag, R-universe will only have (or at least default to having) the
> > latest release.
> >
> > -Jon
> >
> >
> > On Sat, Feb 10, 2024 at 2:18 PM Nic Crane  wrote:
> >
> > > Hi folks,
> > >
> > > The Arrow R package is distributed via a few different methods, one of
> > > which is R-universe[1].
> > >
> > > In order for r-universe to track the latest version of the R package,
> we
> > > have started using the tag "r-universe-release" to indicate the commit
> > > which represents the latest version of the R package (which is also
> > > submitted to CRAN).
> > >
> > > I'm mentioning it here just to be transparent about this - it doesn't
> > make
> > > any changes to the current release process as it's still the version
> > based
> > > off the release candidate and this is just an additional step for the R
> > > package which follows a successful main project release.
> > >
> > > Hope this all sounds OK - if not, happy to take feedback for changes
> etc
> > on
> > > this.
> > >
> > > Thanks,
> > >
> > > Nic
> > >
> > >
> > > [1] https://r-universe.dev/
> > >
> >
>


New tag for releases for R-universe

2024-02-10 Thread Nic Crane
Hi folks,

The Arrow R package is distributed via a few different methods, one of
which is R-universe[1].

In order for r-universe to track the latest version of the R package, we
have started using the tag "r-universe-release" to indicate the commit
which represents the latest version of the R package (which is also
submitted to CRAN).

I'm mentioning it here just to be transparent about this - it doesn't make
any changes to the current release process as it's still the version based
off the release candidate and this is just an additional step for the R
package which follows a successful main project release.

Hope this all sounds OK - if not, happy to take feedback for changes etc on
this.

Thanks,

Nic


[1] https://r-universe.dev/


Arrow R Package Development Sync Call - New meeting time and Google Calendar link

2024-01-24 Thread Nic Crane
The fortnightly Arrow R package dev community call has a new time,
alternating between Mondays at 15:00 GMT (10:00 ET) and 16:00 GMT (11:00
ET), so that more of the people who contribute to the R package can attend.

I will no longer be sending out these notifications to the mailing list,
but the new meeting times can be found on a Google Calendar:
https://calendar.google.com/calendar/u/0?cid=NzlmYzA5MjliZDVhMDU4NjEyMWJkZDYwNGRlYTkwM2VlNWRmYjY5MDc1NmNhYzRkMWQ3OTlmMWQzZDcxN2ZlNEBncm91cC5jYWxlbmRhci5nb29nbGUuY29t

Video call link: https://meet.google.com/ghw-qfvv-cjb

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R Package Development Sync Call - Thurs 11th Jan

2024-01-10 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 11th
January at 17:30 UTC (12:30 ET).

Video call link: https://meet.google.com/ghw-qfvv-cjb

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R Package Development Sync Call - Thurs 14th Dec

2023-12-13 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 14th
December at 17:30 UTC (12:30 ET).

Video call link: https://meet.google.com/ghw-qfvv-cjb

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R Package Development Sync Call - Thurs 30th Nov

2023-11-30 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 30th
November at 17:30 UTC (12:30 ET).

Video call link: https://meet.google.com/ghw-qfvv-cjb

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: [ANNOUNCE] New Arrow PMC chair: Andy Grove

2023-11-27 Thread Nic Crane
Congrats Andy!

On Mon, 27 Nov 2023 at 15:17, Gang Wu  wrote:

> Congrats Andy!
>
> Thanks Andrew for the past year as well.
>
> Best,
> Gang
>
> On Mon, Nov 27, 2023 at 10:59 PM Matt Topol 
> wrote:
>
> > Congrats Andy!
> >
> > On Mon, Nov 27, 2023 at 9:44 AM Gavin Ray  wrote:
> >
> > > Yay, congrats Andy! Well-deserved!
> > >
> > > > On Mon, Nov 27, 2023 at 9:13 AM Kevin Gurney wrote:
> > >
> > > > Congratulations, Andy!
> > > > 
> > > > From: Raúl Cumplido 
> > > > Sent: Monday, November 27, 2023 8:58 AM
> > > > To: dev@arrow.apache.org 
> > > > Subject: Re: [ANNOUNCE] New Arrow PMC chair: Andy Grove
> > > >
> > > > Congratulations Andy and thanks for the effort during last year
> Andrew!
> > > >
> > > > > On Mon, 27 Nov 2023 at 14:54, David Li wrote:
> > > > >
> > > > > Congrats Andy!
> > > > >
> > > > > On Mon, Nov 27, 2023, at 08:02, Mehmet Ozan Kabak wrote:
> > > > > > Congratulations Andy. I am sure we will keep building great tech
> > this
> > > > > > year, just like last year, under your watch.
> > > > > >
> > > > > > Mehmet Ozan Kabak
> > > > > >
> > > > > >
> > > > > > >> On Nov 27, 2023, at 3:47 PM, Daniël Heres <danielhe...@gmail.com> wrote:
> > > > > >>
> > > > > >> Congrats Andy!
> > > > > >>
> > > > > > >> On Mon, 27 Nov 2023 at 13:47, Andrew Lamb <al...@influxdata.com> wrote:
> > > > > >>
> > > > > >>> I am pleased to announce that the Arrow Project has a new PMC
> > chair
> > > > and VP
> > > > > >>> as per our tradition of rotating the chair once a year. I have
> > > > resigned and
> > > > > >>> Andy Grove was duly elected by the PMC and approved unanimously
> > by
> > > > the
> > > > > >>> board.
> > > > > >>>
> > > > > >>> Please join me in congratulating Andy Grove!
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> Andrew
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> Daniël Heres
> > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido

2023-11-13 Thread Nic Crane
Congrats Raul!

On Tue, 14 Nov 2023, 03:28 Andrew Lamb,  wrote:

> The Project Management Committee (PMC) for Apache Arrow has invited
> Raúl Cumplido to become a PMC member and we are pleased to announce
> that Raúl Cumplido has accepted.
>
> Please join me in congratulating them.
>
> Andrew
>


Arrow R Package Development Sync Call - Thurs 2nd Nov

2023-11-01 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 2nd
November at 16:30 UTC (12:30 ET).

Video call link: https://meet.google.com/ghw-qfvv-cjb

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: Help regarding setting up the r package in arrow apache

2023-10-20 Thread Nic Crane
Hi Divyansh,

You mentioned previously that you were trying to get an R dev setup in
Docker, but that gist looks quite different to what's recommended in the R
dev setup instructions linked above.  It looks like you're trying to set up
a Docker image based on one of our CI jobs which builds the docs for
multiple Arrow implementations, which I wouldn't recommend, as it involves
a lot more steps than you need for an R dev setup, and probably won't be
that useful for the specific task of getting a working R dev setup on
Docker.

I'd recommend instead looking at the article mentioned by me, Bryce, and
Jon [1].  Happy to answer any questions if any issues come up with those
instructions, as they could potentially be made more clear, and it's always
useful to get feedback on docs like these.

Nic

[1] https://arrow.apache.org/docs/r/articles/developers/docker.html

On Fri, 20 Oct 2023 at 08:13, Divyansh Khatri 
wrote:

> Please see this and help me resolve the issue:
> https://gist.github.com/Divyansh200102/3ba4f5e391d8e62307f8b584a5a659d8
>
> On Wed, 18 Oct 2023 at 19:14, Jonathan Keane  wrote:
>
> > For development of the R package with docker containers, the link [1]
> that
> > Nic sent in this same thread is the place to go. In addition to that
> > docker-focused one, there are a handful of others that might prove useful
> > to you in getting your development environment setup [2].
> >
> > If you run into any issues, feel free to post here, but it's helpful to
> do
> > so with debugging mode on (i.e. set the env var ARROW_DEV to true) and to
> > provide the exact commands you sent along with the output you're seeing
> so
> > we can help diagnose what's going wrong.
> >
> > [1] – https://arrow.apache.org/docs/r/articles/developers/docker.html
> > [2] –
> https://arrow.apache.org/docs/r/articles/index.html#developer-guides
> >
> > -Jon
> >
> >
> > On Wed, Oct 18, 2023 at 2:48 AM Divyansh Khatri <
> > divyanshkhatri...@gmail.com>
> > wrote:
> >
> > > I am trying to contribute to the Arrow project, so I am trying to set up
> > > the project locally.
> > >
> > > On Tue, 17 Oct 2023 at 05:14, Bryce Mecum 
> wrote:
> > >
> > > > That error makes it look like you're running `docker compose up` from
> > > > the root of the Arrow source tree which is likely not what you want.
> > > > Are you trying to use the Arrow R package in a Docker container or
> are
> > > > you trying to contribute to it by developing inside of a Docker
> > > > container? Nic's link [1] is a good starting point.
> > > >
> > > > [1] https://arrow.apache.org/docs/r/articles/developers/docker.html
> > > >
> > > > On Mon, Oct 16, 2023 at 4:31 AM Divyansh Khatri
> > > >  wrote:
> > > > >
> > > > > Hi, I am running the Docker command 'docker compose up -d' with the
> > > > > docker-compose.yml, but I am encountering this error: "Error response
> > > > > from daemon: manifest for amd64/maven:3.5.4-eclipse-temurin-8 not
> > > > > found: manifest unknown: manifest unknown". I am not sure how to
> > > > > proceed from here.
> > > > >
> > > > > On Mon, 16 Oct 2023 at 14:17, Nic Crane 
> wrote:
> > > > >
> > > > > > Hi Divyansh,
> > > > > >
> > > > > > There are instructions for creating a R package dev setup here:
> > > > > > https://arrow.apache.org/docs/r/articles/developers/setup.html
> > > > > >
> > > > > > If you can explain a bit more about what you've tried so far and
> > > > what's not
> > > > > > working, we may be able to advise.
> > > > > >
> > > > > > Best wishes,
> > > > > >
> > > > > > Nic
> > > > > >
> > > > > > On Mon, 16 Oct 2023 at 06:02, Divyansh Khatri <
> > > > divyanshkhatri...@gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > I am having problems setting up the Apache Arrow R package using
> > > > > > > Docker. Can you give me the step-by-step process for setting up
> > > > > > > the R package in VS Code using Docker?
> > > > > > >
> > > > > >
> > > >
> > >
> >
>


Arrow R Package Development Sync Call - Thursday 19th October

2023-10-19 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 19th
October at 16:30 UTC (12:30 ET).

Video call link: https://meet.google.com/ghw-qfvv-cjb

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: Help regarding setting up the r package in arrow apache

2023-10-16 Thread Nic Crane
I'm not sure that's quite right, but I don't know enough Docker without
looking it up to say what's happened.

There's some R-specific guidance here with an example:
https://arrow.apache.org/docs/r/articles/developers/docker.html


On Mon, 16 Oct 2023, 13:31 Divyansh Khatri, 
wrote:

> Hi, I am running the Docker command 'docker compose up -d' with the
> docker-compose.yml, but I am encountering this error: "Error response from
> daemon: manifest for amd64/maven:3.5.4-eclipse-temurin-8 not found:
> manifest unknown: manifest unknown". I am not sure how to proceed from
> here.
>
> On Mon, 16 Oct 2023 at 14:17, Nic Crane  wrote:
>
> > Hi Divyansh,
> >
> > There are instructions for creating a R package dev setup here:
> > https://arrow.apache.org/docs/r/articles/developers/setup.html
> >
> > If you can explain a bit more about what you've tried so far and what's
> not
> > working, we may be able to advise.
> >
> > Best wishes,
> >
> > Nic
> >
> > On Mon, 16 Oct 2023 at 06:02, Divyansh Khatri <
> divyanshkhatri...@gmail.com
> > >
> > wrote:
> >
> > > I am having problems setting up the Apache Arrow R package using
> > > Docker. Can you give me the step-by-step process for setting up the
> > > R package in VS Code using Docker?
> > >
> >
>


Re: Help regarding setting up the r package in arrow apache

2023-10-16 Thread Nic Crane
Hi Divyansh,

There are instructions for creating a R package dev setup here:
https://arrow.apache.org/docs/r/articles/developers/setup.html

If you can explain a bit more about what you've tried so far and what's not
working, we may be able to advise.

Best wishes,

Nic

On Mon, 16 Oct 2023 at 06:02, Divyansh Khatri 
wrote:

> I am having problems setting up the Apache Arrow R package using Docker.
> Can you give me the step-by-step process for setting up the R package in
> VS Code using Docker?
>


Re: [ANNOUNCE] New Arrow PMC member: Jonathan Keane

2023-10-15 Thread Nic Crane
Congrats Jon!

On Sun, 15 Oct 2023, 05:52 Jacob Wujciak-Jens,
 wrote:

> Congratulations !
>
> On Sun, 15 Oct 2023, 00:58, Raúl Cumplido wrote:
>
> > Congratulations Jon!
> >
> > On Sun, 15 Oct 2023, 0:05, Antoine Pitrou wrote:
> >
> > >
> > > Welcome to the PMC, Jon!
> > >
> > > On 14/10/2023 at 19:42, David Li wrote:
> > > > Congrats Jon!
> > > >
> > > > On Sat, Oct 14, 2023, at 13:25, Ian Cook wrote:
> > > >> Congratulations Jonathan!
> > > >>
> > > >> On Sat, Oct 14, 2023 at 13:24 Andrew Lamb 
> > wrote:
> > > >>
> > > >>> The Project Management Committee (PMC) for Apache Arrow has invited
> > > >>> Jonathan Keane to become a PMC member and we are pleased to
> announce
> > > >>> that Jonathan Keane has accepted.
> > > >>>
> > > >>> Congratulations and welcome!
> > > >>>
> > > >>> Andrew
> > > >>>
> > >
> >
>


Arrow R Package Development Sync Call - **NEW URL** - Thursday 5th October

2023-10-05 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 5th
October at 16:30 UTC (12:30 ET).

There is a new URL for joining the call.

Video call link: https://meet.google.com/ghw-qfvv-cjb

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


No Arrow R Package Development Sync Call this week

2023-09-20 Thread Nic Crane
Although the Arrow R package dev meeting is usually held on alternate
Thursdays, there will be no meeting this week as lots of people are away at
various conferences. The next meeting will be Thursday 5th October.


Arrow R Package Development Sync Call - Thursday 7th September

2023-09-06 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 7th
September at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R Package Development Sync Call

2023-08-24 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 24th
August at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R package development sync call - Thursday 10th August at 16:30 UTC, 12:30 ET

2023-08-09 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 10th
August at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R package development sync call - Thursday 27th July at 16:30 UTC (12:30 ET)

2023-07-27 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 27th July
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R package development sync call - Thursday 13th July at 16:30 UTC (12:30 ET)

2023-07-12 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 13th July
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R package development sync call - Thursday 29th June at 16:30 UTC (12:30 ET)

2023-06-28 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 29th June
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: [ANNOUNCE] New Arrow PMC member: Dewey Dunnington

2023-06-23 Thread Nic Crane
Well-deserved Dewey, congratulations!

On Fri, 23 Jun 2023 at 11:53, Vibhatha Abeykoon  wrote:

> Congratulations Dewey!
>
> On Fri, Jun 23, 2023 at 4:16 PM Alenka Frim wrote:
>
> > Congratulations Dewey!! 
> >
> > On Fri, Jun 23, 2023 at 12:10 PM Raúl Cumplido 
> > wrote:
> >
> > > Congratulations Dewey!
> > >
> > > On Fri, 23 Jun 2023, 11:55, Andrew Lamb wrote:
> > >
> > > > The Project Management Committee (PMC) for Apache Arrow has invited
> > > > Dewey Dunnington (paleolimbot) to become a PMC member and we are
> > > > pleased to announce that Dewey Dunnington has accepted.
> > > >
> > > > Congratulations and welcome!
> > > >
> > >
> >
>


Arrow R package development sync call - Thursday 15th June at 16:30 UTC (12:30 ET)

2023-06-14 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 15th June
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: [VOTE][Julia] Release Apache Arrow Julia 2.6.2 RC1

2023-06-12 Thread Nic Crane
+1 (Ubuntu 22.04)

On Mon, 12 Jun 2023 at 01:50, Sutou Kouhei  wrote:

> Hi,
>
> Could PMC members vote on this?
>
> Thanks,
> --
> kou
>
> In <20230610.044039.1468288593045013710@clear-code.com>
>   "[VOTE][Julia] Release Apache Arrow Julia 2.6.2 RC1" on Sat, 10 Jun 2023
> 04:40:39 +0900 (JST),
>   Sutou Kouhei  wrote:
>
> > Hi,
> >
> > I would like to propose the following release candidate (RC1) of
> > Apache Arrow Julia version 2.6.2.
> >
> > This release candidate is based on commit:
> > 9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf [1]
> >
> > The source release rc1 is hosted at [2].
> >
> > Please download, verify checksums and signatures, run the unit tests,
> > and vote on the release. See [3] for how to validate a release candidate.
> >
> > The vote will be open for at least 24 hours.
> >
> > [ ] +1 Release this as Apache Arrow Julia 2.6.2
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow Julia 2.6.2 because...
> >
> > [1]:
> https://github.com/apache/arrow-julia/tree/9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf
> > [2]:
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-julia-2.6.2-rc1/
> > [3]:
> https://github.com/apache/arrow-julia/blob/main/dev/release/README.md#verify
>


Arrow R package development sync call - Thursday 1st June at 16:30 UTC (12:30 ET)

2023-05-31 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 1st June
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R package development sync call - Thursday 18th May at 16:30 UTC (12:30 ET)

2023-05-17 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 18th May
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: [RESULT][VOTE] Release Apache Arrow 12.0.0 - RC0

2023-05-10 Thread Nic Crane
The R package release tasks are all complete now.

On Tue, 9 May 2023 at 19:34, Raúl Cumplido  wrote:

> Hi,
>
> Almost all post-release tasks have been completed:
> - [done] Update the released milestone Date and set to "Closed" on GitHub
> - [done] Merge changes on release branch to maintenance branch for
> patch releases
> - [done] Add the new release to the Apache Reporter System
> - [done] Upload source
> - [done] Upload binaries
> - [done] Update website
> - [done] Upload JavaScript packages
> - [done] Upload C# packages
> - [done] Upload wheels/sdist to pypi
> - [done] Publish Maven artifacts
> - [done] Bump versions
> - [done] Update tags for Go modules
> - [done] Update docs.
> - [done] Update conda recipes
> - [done] Update MSYS2 package
> - [done] Update Homebrew packages
> - [done] Upload RubyGems
> - [done] Publish release blog posts
> - [done] Remove old artifacts
> - [done] Announce the release on Twitter
>
> There are still some in progress which I'll keep working on:
> - [In progress] Announce the new release
> Waiting for the email to arrive at:
> https://lists.apache.org/list?annou...@apache.org
> - [in progress] Update vcpkg port
>   NOTE: Add -DARROW_ACERO=ON and remove -DARROW_PLASMA
>   PR: https://github.com/microsoft/vcpkg/pull/31321
> - [in progress ] Update Conan recipe
>   NOTE: Add -DARROW_ACERO=ON and remove -DARROW_PLASMA only
>   for 12.0.0
> Currently in progress for 11.0.0 (It was not done on the previous release)
> PR: https://github.com/conan-io/conan-center-index/pull/17458
> I'll follow up with 12.0.0 too
> - [ ] Update version in Apache Arrow Cookbook
>
> There are some which I can't do; from my understanding, the R package
> update has been submitted too:
> - [ ] Make the CPP PARQUET related version as "RELEASED" on JIRA
> - [ ] Start the new version on JIRA for the related CPP PARQUET version
> - [ ] Update R packages
>
> Thanks!
>
> On Tue, 2 May 2023 at 16:27, Sutou Kouhei wrote:
> >
> > Hi Raúl,
> >
> > > I am having some problems with linuxbrew:
> > >
> > > $ cd "$(brew --repository homebrew/core)"
> > > bash: cd:
> /home/linuxbrew/.linuxbrew/Homebrew/Library/Taps/homebrew/homebrew-core:
> > > No such file or directory
> >
> > Ah, recent Homebrew doesn't clone homebrew/homebrew-core by
> > default:
> >
> >   https://brew.sh/2023/02/16/homebrew-4.0.0/
> >
> > I think that $(brew --repository homebrew/core) works again
> > by running "brew tap homebrew/core".
> >
> >
> > Thanks,
> > --
> > kou
> >
> > In 
> >   "Re: [RESULT][VOTE] Release Apache Arrow 12.0.0 - RC0" on Tue, 2 May
> 2023 15:25:34 +0200,
> >   Raúl Cumplido  wrote:
> >
> > > Hi,
> > >
> > > Thanks Kou,
> > >
> > > The current status for the post-release tasks can be seen on the list
> below.
> > >
> > > I am having some problems with linuxbrew:
> > >
> > > $ cd "$(brew --repository homebrew/core)"
> > > bash: cd:
> /home/linuxbrew/.linuxbrew/Homebrew/Library/Taps/homebrew/homebrew-core:
> > > No such file or directory
> > >
> > > I am trying to start a macOS instance on AWS to create the homebrew-core
> > > PR, but if someone can take that one I will continue trying to update
> > > vcpkg, conan and the cookbooks.
> > >
> > > DONE:
> > > - [done] Update the released milestone Date and set to "Closed" on
> GitHub
> > > - [done] Merge changes on release branch to maintenance branch for
> > > patch releases
> > > - [done] Add the new release to the Apache Reporter System
> > > - [done] Upload source
> > > - [done] Upload binaries
> > > - [done] Update website
> > > - [done] Upload JavaScript packages
> > > - [done] Upload C# packages
> > > - [done] Upload wheels/sdist to pypi
> > > - [done] Publish Maven artifacts
> > > - [done] Bump versions
> > > - [done] Update tags for Go modules
> > > - [done] Update docs.
> > >
> > > IN PROGRESS:
> > > - [In progress] Update conda recipes
> > >   Update to 12.0.0 Done
> > >   NOTE: Add -DARROW_ACERO=ON and remove -DARROW_PLASMA
> > >   PR Created for ARROW_ACERO:
> > > https://github.com/conda-forge/arrow-cpp-feedstock/pull/1037
> > > - [In progress] Update MSYS2 package
> > > PR: https://github.com/msys2/MINGW-packages/pull/17050
> > >   NOTE: Add -DARROW_ACERO=ON and remove -DARROW_PLASMA
> > >
> > > TODO:
> > > - [ ] Make the CPP PARQUET related version as "RELEASED" on JIRA
> > > - [ ] Start the new version on JIRA for the related CPP PARQUET version
> > > - [ ] Update Homebrew packages
> > >   NOTE: Add -DARROW_ACERO=ON and remove -DARROW_PLASMA
> > > - [ ] Upload RubyGems
> > > - [ ] Update R packages
> > > - [ ] Update vcpkg port
> > >   NOTE: Add -DARROW_ACERO=ON and remove -DARROW_PLASMA
> > > - [ ] Update Conan recipe
> > >   NOTE: Add -DARROW_ACERO=ON and remove -DARROW_PLASMA only
> > >   for 12.0.0+
> > > - [ ] Update version in Apache Arrow Cookbook
> > > - [ ] Announce the new release
> > > - [ ] Publish release blog posts
> > > - [ ] Announce the release on Twitter
> > > - [ ] Remove old artifacts
> > >
> > > El mar, 2 may 2023 a las 2:48, Sutou 

Re: [ANNOUNCE] New Arrow PMC member: Matt Topol

2023-05-03 Thread Nic Crane
Congratulations!

On Thu, 4 May 2023, 05:24 Vibhatha Abeykoon,  wrote:

> Congratulations Matt!
>
> On Thu, May 4, 2023 at 7:35 AM Ian Cook  wrote:
>
> > Congratulations Matt!!!
> >
> > On Wed, May 3, 2023 at 9:55 PM Yibo Cai  wrote:
> > >
> > > Congrats Matt!
> > >
> > > On 5/4/23 07:07, Krisztián Szűcs wrote:
> > > > Congrats Matt!
> > > >
> > > > On Wed, May 3, 2023 at 11:44 PM Rok Mihevc 
> > wrote:
> > > >>
> > > >> Congrats Matt. Well deserved!
> > > >>
> > > >> Rok
> > > >>
> > > >> On Wed, May 3, 2023 at 11:03 PM David Li 
> wrote:
> > > >>
> > > >>> Congrats Matt!
> > > >>>
> > > >>> On Wed, May 3, 2023, at 16:06, Neal Richardson wrote:
> > >  Congratulations!
> > > 
> > >  On Wed, May 3, 2023 at 1:58 PM Jacob Wujciak
> > > >>> 
> > >  wrote:
> > > 
> > > > Congratulations, well deserved!
> > > >
> > > > On Wed, May 3, 2023 at 7:48 PM Weston Pace <
> weston.p...@gmail.com>
> > > >>> wrote:
> > > >
> > > >> Congratulations!
> > > >>
> > > >> On Wed, May 3, 2023 at 10:47 AM Raúl Cumplido <
> > raulcumpl...@gmail.com
> > > 
> > > >> wrote:
> > > >>
> > > >>> Congratulations Matt!
> > > >>>
> > > >>> On Wed, 3 May 2023, 19:44, vin jake wrote:
> > > >>>
> > >  Congratulations, Matt!
> > > 
> > >  On Thu, May 4, 2023 at 01:42, Felipe Oliveira Carvalho wrote:
> > > 
> > > > Congratulations, Matt!
> > > >
> > > > On Wed, 3 May 2023 at 14:37 Andrew Lamb <
> al...@influxdata.com>
> > > >> wrote:
> > > >
> > > >> The Project Management Committee (PMC) for Apache Arrow has
> > > > invited
> > > >> Matt Topol (zeroshade) to become a PMC member and we are
> > > >>> pleased
> > > > to
> > > >> announce
> > > >> that Matt has accepted.
> > > >>
> > > >> Congratulations and welcome!
> > > >>
> > > >
> > > 
> > > >>>
> > > >>
> > > >
> > > >>>
> >
>


Re: Arrow R package development sync call - Thursday 3rd May at 16:30 UTC (12:30 ET)

2023-05-03 Thread Nic Crane
Apologies, that should have read Thursday 4th May.

On Wed, 3 May 2023 at 17:54, Nic Crane  wrote:

> The fortnightly Arrow R package dev community call is on Thursday 3rd May
> at 16:30 UTC (12:30 ET).
>
> Joining instructions are below.
>
> Video call link: https://meet.google.com/dbm-ybmv-evb
> Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233
>
> The meeting notes can be found here; please feel free to add items to the
> agenda:
> https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit
>
> Thanks,
>
> Nic
>


Arrow R package development sync call - Thursday 3rd May at 16:30 UTC (12:30 ET)

2023-05-03 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 3rd May
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R package development sync call - Thursday 6th April at 16:30 UTC (12:30 ET)

2023-04-05 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 6th April
at 16:30 UTC (12:30 ET).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Arrow R package development sync call - Thursday 23rd March at 17:30 UTC (13:30 EST)

2023-03-23 Thread Nic Crane
The fortnightly Arrow R package dev community call is on Thursday 23rd
March at 17:30 UTC (13:30 EST).

Joining instructions are below.

Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: [ANNOUNCE] New Arrow PMC member: Will Jones

2023-03-15 Thread Nic Crane
Congratulations Will! :)

On Tue, 14 Mar 2023 at 03:09, Jin Shang  wrote:

> Congrats Will!
>
> > On 14 Mar 2023, at 11:06, Vibhatha Abeykoon wrote:
> >
> > Congratulations Will.
> >
> > On Tue, Mar 14, 2023 at 6:53 AM Gang Wu  wrote:
> >
> >> Congrats, Will!
> >>
> >> Best,
> >> Gang
> >>
> >> On Tue, Mar 14, 2023 at 9:21 AM Junming Chen <
> junming.che...@outlook.com>
> >> wrote:
> >>
> >>> Congrats, Will!
> >>> 
> >>> From: David Li 
> >>> Sent: Tuesday, March 14, 2023 5:16 AM
> >>> To: dev@arrow.apache.org 
> >>> Subject: Re: [ANNOUNCE] New Arrow PMC member: Will Jones
> >>>
> >>> Congrats, Will!
> >>>
> >>> On Mon, Mar 13, 2023, at 17:06, Joris Van den Bossche wrote:
>  Congrats Will!
> 
>  On Mon, 13 Mar 2023 at 22:01, Michel Miotto Barbosa
>   wrote:
> >
> > > Congratulations Will!
> >
> > A disposição | At your disposal
> >
> > Michel Miotto Barbosa
> > https://www.linkedin.com/in/michelmiottobarbosa/
> > mmiottobarb...@gmail.com
> > +55 11 984 342 347
> >
> >
> >
> >
> > On Mon, Mar 13, 2023 at 2:58 PM Andrew Lamb 
> >>> wrote:
> >
> >> The Project Management Committee (PMC) for Apache Arrow has invited
> >> Will Jones to become a PMC member and we are pleased to announce
> >> that Will Jones has accepted.
> >>
> >> Congratulations and welcome!
> >>
> >>>
> >>
>
>


Arrow R package development sync call - Thursday 9th March at 17:30 UTC

2023-03-08 Thread Nic Crane
The Arrow R package dev community call is on Thursday 9th March at 17:30
UTC.
Joining instructions are below.

Thursday 9th March · 17:30 – 18:30
Video call link: https://meet.google.com/dbm-ybmv-evb
Phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The meeting notes can be found here; please feel free to add items to the
agenda:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit

Thanks,

Nic


Re: [VOTE][Format] Fixed shape tensor Canonical Extension Type

2023-03-06 Thread Nic Crane
+1

On Mon, 6 Mar 2023 at 12:41, Alenka Frim 
wrote:

> Hi all,
>
> I am starting a new voting thread with this email as the first voting
> thread [1] opened up new
> comments and suggestions and we wanted to take time to see how that
> evolves.
>
> *I would like to propose we vote on adding the fixed shape tensor canonical
> extension type*
> *with the following specification:*
>
> Fixed shape tensor
> ==================
>
> * Extension name: `arrow.fixed_shape_tensor`.
>
> * The storage type of the extension: ``FixedSizeList`` where:
>
>   * **value_type** is the data type of individual tensor elements.
>   * **list_size** is the product of all the elements in tensor shape.
>
> * Extension type parameters:
>
>   * **value_type** = the Arrow data type of individual tensor elements.
>   * **shape** = the physical shape of the contained tensors
> as an array.
>
>   Optional parameters describing the logical layout:
>
>   * **dim_names** = explicit names to tensor dimensions
> as an array. The length of it should be equal to the shape
> length and equal to the number of dimensions.
>
> ``dim_names`` can be used if the dimensions have well-known
> names and they map to the physical layout (row-major).
>
>   * **permutation**  = indices of the desired ordering of the
> original dimensions, defined as an array.
>
> The indices contain a permutation of the values [0, 1, .., N-1] where
> N is the number of dimensions. The permutation indicates which
> dimension of the logical layout corresponds to which dimension of the
> physical tensor (the i-th dimension of the logical view corresponds
> to the dimension with number ``permutations[i]`` of the physical
> tensor).
>
> Permutation can be useful in case the logical order of
> the tensor is a permutation of the physical order (row-major).
>
> When logical and physical layout are equal, the permutation will always
> be ([0, 1, .., N-1]) and can therefore be left out.
>
> * Description of the serialization:
>
>   The metadata must be a valid JSON object including shape of
>   the contained tensors as an array with key **"shape"** plus optional
>   dimension names with keys **"dim_names"** and ordering of the
>   dimensions with key **"permutation"**.
>
>   - Example: ``{ "shape": [2, 5]}``
>   - Example with ``dim_names`` metadata for NCHW ordered data:
>
> ``{ "shape": [100, 200, 500], "dim_names": ["C", "H", "W"]}``
>
>   - Example of permuted 3-dimensional tensor:
>
> ``{ "shape": [100, 200, 500], "permutation": [2, 0, 1]}``
>
> This is the physical layout shape and the shape of the logical
> layout would in this case be ``[500, 100, 200]``.
>
> .. note::
>
>   Elements in a fixed shape tensor extension array are stored
>   in row-major/C-contiguous order.
>
> * The specification is submitted as a PR [2] to the Canonical Extension Types
>   document under the format specifications directory [3].
>
> There are also two implementations submitted to Apache Arrow repository:
> * C++ implementation of the proposed specification [4]
> * Python example implementation of the proposed specification and usage
> (only illustrative) [5]
>
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Accept this proposal
> [ ] +0
> [ ] -1 Do not accept this proposal because...
>
>
> Regards, Alenka
>
> [1]: https://lists.apache.org/thread/3cj0cr44hg3t2rn0kxly8td82yfob1nd
> [2]: https://github.com/apache/arrow/pull/33925/files
> [3]:
>
> https://github.com/apache/arrow/blob/main/docs/source/format/CanonicalExtensions.rst
>
> [4]: https://github.com/apache/arrow/pull/8510/files
> [5]: https://github.com/apache/arrow/pull/33948/files
>
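
The storage size, permutation and metadata rules in the proposal quoted above
can be sketched in a few lines of Python. This is only an illustration using
the standard library, not the Arrow implementation: the helper names
`list_size`, `logical_shape` and `serialize_metadata` are hypothetical, and
only the metadata keys "shape", "dim_names" and "permutation" come from the
proposal text.

```python
import json
from math import prod


def list_size(shape):
    """Length of the FixedSizeList storage: the product of the tensor shape."""
    return prod(shape)


def logical_shape(shape, permutation=None):
    """Shape of the logical view: logical dimension i is physical dimension permutation[i]."""
    if permutation is None:
        return list(shape)
    if sorted(permutation) != list(range(len(shape))):
        raise ValueError("permutation must contain each of 0..N-1 exactly once")
    return [shape[p] for p in permutation]


def serialize_metadata(shape, dim_names=None, permutation=None):
    """Build the JSON metadata string using the keys named in the proposal."""
    if dim_names is not None and len(dim_names) != len(shape):
        raise ValueError("dim_names must have one entry per dimension")
    metadata = {"shape": list(shape)}
    if dim_names is not None:
        metadata["dim_names"] = list(dim_names)
    if permutation is not None:
        logical_shape(shape, permutation)  # raises if the permutation is invalid
        metadata["permutation"] = list(permutation)
    return json.dumps(metadata)
```

With the proposal's own example, `logical_shape([100, 200, 500], [2, 0, 1])`
gives `[500, 100, 200]`, and `list_size([2, 5])` gives 10.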


Re: [VOTE] Release Apache Arrow nanoarrow 0.1.0 - RC1

2023-03-02 Thread Nic Crane
+1 (binding)

Verified on Ubuntu 22.04

On Thu, 2 Mar 2023 at 12:19, SHIMA Tatsuya  wrote:

> +1 (non-binding)
>
> Verified on Ubuntu 22.04 (amd64).
>
> I used R 4.2.2 installed from source.
>
> On 2023/03/02 20:39, Joris Van den Bossche wrote:
> > +1 (binding)
> >
> > Verified on Ubuntu 20.04
> >
> > It worked with conda R for me, I only needed to ensure to install a
> > conda compiler to get it building
> > (https://github.com/apache/arrow-nanoarrow/pull/142)
> >
> > On Thu, 2 Mar 2023 at 05:29, Jin Shang  wrote:
> >> +1 (non-binding). Verified on macOS 12.5 aarch64 and Ubuntu 22.04
> aarch64.
> >> Dependencies were installed via homebrew and apt. Everything went
> smoothly.
> >>
> >> On Thu, Mar 2, 2023 at 1:04 AM Dewey Dunnington
> >>  wrote:
> >>
> >>> Hello,
> >>>
> >>> I would like to propose the following release candidate (RC1) of Apache
> >>> Arrow nanoarrow [0] version 0.1.0. This is an initial release
> consisting of
> >>> 31 resolved GitHub issues [1].
> >>>
> >>> Special thanks to David Li for his reviews and support during the
> >>> preparation of this initial release candidate!
> >>>
> >>> This release candidate is based on commit:
> >>> 341279af1b2fdede36871d212f339083ffbd75eb [2]
> >>>
> >>> The source release rc1 is hosted at [3].
> >>> The changelog is located at [4].
> >>>
> >>> Please download, verify checksums and signatures, run the unit tests,
> and
> >>> vote on the release. See [5] for how to validate a release candidate.
> >>>
> >>> The vote will be open for at least 72 hours.
> >>>
> >>> [ ] +1 Release this as Apache Arrow nanoarrow 0.1.0
> >>> [ ] +0
> >>> [ ] -1 Do not release this as Apache Arrow nanoarrow 0.1.0 because...
> >>>
> >>> [0] https://github.com/apache/arrow-nanoarrow
> >>> [1] https://github.com/apache/arrow-nanoarrow/milestone/1?closed=1
> >>> [2]
> >>>
> >>>
> https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.1.0-rc1
> >>> [3]
> >>>
> >>>
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.1.0-rc1/
> >>> [4]
> >>>
> >>>
> https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.1.0-rc1/CHANGELOG.md
> >>> [5]
> >>>
> https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md
> >>>
>
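
The "verify checksums" step requested in the vote email above could look
roughly like the following Python sketch. It assumes a SHA-512 checksum file
in the usual `sha512sum` layout (hex digest, then the file name); that
format, and the helper names `sha512_of` and `checksum_matches`, are
assumptions rather than anything stated in the thread. Signature (GPG)
verification is not covered here; see the release README [5] for the actual
procedure.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, checksum_path):
    """Compare a file's digest with the first token of its checksum file."""
    with open(checksum_path, "r", encoding="utf-8") as handle:
        expected = handle.read().split()[0].lower()
    return sha512_of(path) == expected
```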


Arrow R package development sync call - today at 17:30 UTC

2023-02-23 Thread Nic Crane
The Arrow R package dev community call is today at 17:30 UTC.
Joining instructions are below.

Thursday, 23 February · 17:30 – 18:30
Google Meet joining info
Video call link: https://meet.google.com/dbm-ybmv-evb
Or dial: (GB) +44 20 3957 3353 PIN: 487 675 609#
More phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

Thanks,

Nic


Re: R arrow package question

2023-02-09 Thread Nic Crane
Hi Angelo,

I think what might be happening here is that you have space characters in
your integer column, which are causing problems.

I created what could be a reproducible example of your problem at:
https://gist.github.com/thisisnic/af265166d5cd1ebce605cf3e478ee6d8

In short, can you try including the parameter (and values) `null_values =
c("", " ", "NA")` in your call to `open_dataset()`?  By default, empty
strings are set to NA values, but spaces are not, so this could be the
source of your error.
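
To show why a lone space breaks integer conversion while an empty string
does not, here is a toy Python model of the behaviour described above. It is
not how arrow parses CSVs; the function `parse_int_column`, the sample data
and the default null set are all assumptions made for illustration.

```python
import csv
import io

# Assumption for this sketch: only the empty string is null by default.
DEFAULT_NULLS = frozenset({""})


def parse_int_column(text, column, null_values=DEFAULT_NULLS):
    """Parse one CSV column as integers, mapping null spellings to None."""
    parsed = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        raw = row[column]
        if raw in null_values:
            parsed.append(None)
            continue
        try:
            parsed.append(int(raw))
        except ValueError:
            raise ValueError(
                f"Row #{row_number}: conversion error to int: invalid value {raw!r}"
            ) from None
    return parsed


# Row 2 holds a single space, row 3 is empty, row 4 holds the text "NA".
SAMPLE = "id,count\n1,10\n2, \n3,\n4,NA\n"
```

Calling `parse_int_column(SAMPLE, "count")` fails on row 2 because of the
space, whereas passing `null_values={"", " ", "NA"}` turns all three blank
spellings into missing values.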

Nic

On Thu, 9 Feb 2023 at 05:06, Angelo Casalan  wrote:

> Hi Everyone,
>
> Thanks for the responses. I hope you are all well.
>
> Hi Dewey. As to the problematic column error message: Invalid: Could not
> open CSV input source 'folder/name.CSV': Invalid: In CSV column #30: Row
> #5: CSV conversion error to int32: invalid value ''
>
> I manually opened the csv and saw the cells are empty or blanks along with
> integers on the same column 30. Also present in some other columns.
>
> I tried manually setting via schema() the columns as utf8()/character
> equivalent in R, or string().
>
> I still get the same error message.
>
> disk.frame read these columns mixing integers with spaces/blanks as
> integers smoothly with no error messages at all. I think disk.frame read
> the spaces/blanks as null values/NA in R studio.
>
> I am scripting all of these in RMarkdown if that might be a factor.
>
> Questions:
> 1.Is there a way in open_dataset() to automatically set all blanks as null
> values across multiple csvs which im trying to load into R? Similar in
> logic to pandas.read_csv('test.csv',na_values=['nan'])
>
> manual re-encoding is not feasible because im dealing with millions of data
> points, I am also just a secondary user of this data, and my goal is to
> automate in R for my organization.
>
> 2.  Are there other arrow functions/commands that can load multiple csvs
> from my local folder as an arrow object?
>
> Regards,
>
> On Tue, Jan 31, 2023 at 8:50 AM Angelo Casalan 
> wrote:
>
> > Hi Jacob,
> >
> > Thanks. To provide some specifics on my query:
> >
> > 1.which version of arrow are you running?
> > - 10.0.1
> >
> > 2. The error message provides an exact col,row position, have you checked
> > the value there?
> > Yes. It is int64. This is after running open_dataset without specifying
> > schema:
> > '''
> > arrow<-open_dataset(
> > sources="location of csv files",
> > format="csv"
> > )
> > '''
> >
> >  3. I have to correct the exact error message:
> > CSV conversion error to int64:invalid value ' '
> > I think arrow tells me the invalid value present is ' '
> >
> >  4. This reminds me of cases where scientific notation is used for
> > integers
> >  which causes an error but that usually shows the value e.g. "1e6".
> > the invalid value is: ' '
> >
> > 5. I am really confused because using disk.frame() function, on the same
> > csvs, I have not encountered this problem on this column because it was
> > cleanly encoded as a numeric variable.
> >
> > Regards,
> >
> >
> >
> > On Fri, Jan 27, 2023 at 9:43 AM Angelo Casalan 
> > wrote:
> >
> >> Hi ,
> >>
> >> I hope you are well. I wish to ask how I can resolve this error:
> >>
> >> "CSV conversion error to int64: invalid value"
> >>
> >>
> >> To give an idea of my dataset. I have 4 csvs all placed in a local
> folder.
> >>
> >>
> >> The code below worked when importing:
> >>
> >>
> >> arrow<-open_dataset(
> >> sources="csv location",
> >> format="csv")
> >>
> >>
> >> However, when I run:
> >>
> >>
> >> arrow %>% count(column) %>% collect()
> >> nrow(arrow %>% collect)
> >>
> >> head(arrow %>% collect(),10 )
> >>
> >> I always get the same  error message: "Invalid: In CSV column #12: Row
> >> #580. CSV conversion error to int64: invalid value"
> >>
> >> I tried going back to open_dataset(,schema() ). Where the column that is
> >> giving me problems is set as utf8 or sometimes str in the schema
> argument.
> >>
> >> schema(
> >> col=utf8(),
> >> other nth columns
> >> )
> >>
> >> But I still encounter the same problem.
> >>
> >> The code below does not work either.
> >>
> >> arrow2<-arrow_table(arrow)
> >>
> >> Thanks in advance if you can help me.
> >>
> >> --
> >> Regards,
> >>
> >> Angelo Casalan
> >> Statistical Methodology Unit
> >>
> >
> >
> > --
> > Regards,
> >
> > Angelo Casalan
> > Statistical Methodology Unit
> >
>
>
> --
> Regards,
>
> Angelo Casalan
>


Arrow R package development sync call - Thursday 9th Feb at 17:30 UTC

2023-02-08 Thread Nic Crane
The next Arrow R package dev community call is on Thursday 9th February at
17:30 UTC.

You can add items to the agenda on the day, or by adding a comment to the
meeting notes here:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit?usp=sharing

Joining instructions are below.

Thursday, 9 February · 17:30 – 18:30
Google Meet joining info
Video call link: https://meet.google.com/dbm-ybmv-evb
Or dial: (GB) +44 20 3957 3353 PIN: 487 675 609#
More phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

Thanks,

Nic


Re: [DISCUSS] PR automation workflow

2023-02-03 Thread Nic Crane
I have no specific comments on the what/how, other than to say I'm strongly
in favour of some kind of system being implemented and tried out, as I
currently rely on manual processes that are inefficient and make it easy to
miss PRs which need looking at.

On Thu, 2 Feb 2023 at 18:16, Andrew Lamb  wrote:

> A process that we use in arrow-rs / arrow-datafusion,  which is less
> precise but seems to be working well enough at the moment, is :
>
> 1. Mark  PRs that have received feedback and need more work prior to merge
> from `Ready to Review` back to `Draft`
> 2. Ask the author to set it back to "ready to review" when it is ready for
> the next round of review
>
> Andrew
>
>
> On Thu, Feb 2, 2023 at 4:17 AM Antoine Pitrou  wrote:
>
> >
> > Hi Raul,
> >
> > Since I'm the one who proposed that we reuse CPython's existing workflow
> > infrastructure, it follows logically that I'm in favour :-)
> >
> > I'm a CPython core developer myself (though inactive lately), I will add
> > that this workflow is really easing the work of reviewing PRs, as it
> > makes obvious whether a PR is needing attention from a committer.
> >
> > Once we start working with it, we may decide to make adjustments to
> > better fit our expectations, but I think we can start with the
> > unmodified workflow scheme.
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On 01/02/2023 at 15:34, Raúl Cumplido wrote:
> > > Hi,
> > >
> > > I would like to start working on some automation for our PRs and issues
> > > workflows.
> > >
> > > I've heard, and have experienced, the frustration of spending a lot of
> > time
> > > on our issue tracker and our PRs to follow up on their status.
> > > It is very hard to keep track of which PRs and issues are waiting for
> > user
> > > feedback, have gone stale or are pending maintainer/committer action.
> > > This means users frequently get no timely response, all the while we
> > > regularly spend time on GH to look for PRs / issues needing action from
> > us.
> > > As a first step we should probably tackle PRs, once PRs are tackled and
> > we
> > > are satisfied with the solution, we can try to devise a similar one for
> > GH
> > > issues.
> > >
> > > An example of a great improvement is the CODEOWNERS addition [1]. This
> > > allows us to use filters like `is:pr is:open user-review-requested:@me
> `
> > [2]
> > > which will show PRs that have requested a review from us. This does not
> > > solve the problem of what are the PRs waiting for second review,
> > > waiting for changes, etcetera.
> > >
> > > I don't think we have to reinvent the wheel, CPython has something that
> > > works well and can easily be adapted/tweaked.
> > > They use a GitHub bot (bedevere) with the following state machine:
> > > https://github.com/python/bedevere#pr-state-machine
> > >
> > > PRs have one label of the following workflow labels, depending of the
> > state:
> > > - `Awaiting review`
> > > - `Awaiting core review`
> > > - `Awaiting changes`
> > > - `Awaiting change review`
> > > - `Awaiting merge`
> > >
> > > I would like to propose adding a GitHub bot to our repo that triggers
> on
> > PR
> > > changes / comments implementing a similar workflow than the one on the
> > > CPython repository.
> > >
> > > I am going to start working on it and I would love to hear feedback
> about
> > > that workflow. I have also created an issue on the Repo [3].
> > >
> > > Kind regards,
> > > Raúl
> > >
> > > [1] https://github.com/apache/arrow/pull/33622
> > > [2]
> > >
> >
> https://github.com/apache/arrow/pulls?q=is%3Apr+is%3Aopen+user-review-requested%3A%40me+
> > > [3] https://github.com/apache/arrow/issues/33977
> > >
> >
>
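
The review filter mentioned in the thread above maps onto the search URL in
link [2] by URL-encoding the filter string. A small Python sketch, in which
the helper name `pr_search_url` is made up for illustration:

```python
from urllib.parse import quote_plus

BASE_URL = "https://github.com/apache/arrow/pulls"


def pr_search_url(query):
    """Return the pull-request search URL for a GitHub filter string."""
    # quote_plus encodes ':' as %3A, '@' as %40 and spaces as '+'.
    return f"{BASE_URL}?q={quote_plus(query)}"
```

For example, `pr_search_url("is:pr is:open user-review-requested:@me")`
yields the same address as link [2], apart from its trailing `+`.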


Re: R arrow package question

2023-01-31 Thread Nic Crane
Hi Angelo,

The original code with just `open_dataset()` works because it creates a
dataset without actually pulling the data into your R session.  The
subsequent commands you tried (i.e. those involving `collect()`) read in the
files, and the error occurs when the data is actually read.
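
As a toy illustration of that deferred behaviour (a Python sketch, not how
arrow is implemented; the class `LazyCsvColumn` is invented for this
example), creating the object succeeds even with bad data, and the failure
only shows up once the values are collected:

```python
class LazyCsvColumn:
    """Records where the data lives; parsing is deferred until collect()."""

    def __init__(self, lines):
        self._lines = lines  # nothing is parsed at this point

    def collect(self):
        # Conversion happens only now, so bad values surface here.
        return [int(line) for line in self._lines]


# Succeeds even though the third value is a lone space: no data is read yet.
dataset = LazyCsvColumn(["1", "2", " ", "4"])
```

Calling `dataset.collect()` is what raises the conversion error.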

It looks like there's an invalid value in your dataset which is causing it
to fail to load.  From the error message you see there, it looks like it's
in the 12th column of your data in row 580.  I think when Jacob asked "have
you checked the value there", another way of phrasing what he said would be
to ask if you have manually checked the contents of whichever CSV is
causing the problem, in row 580 and column 12, to see what value is there?
(rather than checking the data type/value reported by Arrow).

It's going to be tricky to help diagnose the issue without a reproducible
example. If I'm working with a larger dataset, I usually narrow down the
issue by dividing it into two smaller datasets and running the code on each
to see which one contains the problematic row, and then keep going until I
find the row which is failing to load.  If you can get to the point where
you can pinpoint the exact values which are causing problems, this will be
the quickest way we can help you.
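
The halving approach described above can be sketched in Python as follows.
This is only an illustration: `first_bad_row` and `all_ints` are
hypothetical helpers, and the sketch assumes each row loads or fails
independently of the others.

```python
def first_bad_row(rows, loads_ok):
    """Return the index of the first row that fails to load, or None.

    `loads_ok(chunk)` is a caller-supplied check returning True when every
    row in `chunk` can be loaded.
    """
    if loads_ok(rows):
        return None
    lo, hi = 0, len(rows)  # invariant: the first failing row is in rows[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if loads_ok(rows[lo:mid]):
            lo = mid  # first half is clean, so look in the second half
        else:
            hi = mid  # the failure is in the first half
    return lo


def all_ints(chunk):
    """Example check: every value is either empty or parses as an integer."""
    for value in chunk:
        if value == "":
            continue
        try:
            int(value)
        except ValueError:
            return False
    return True
```

For instance, `first_bad_row(["1", "2", "", "4", " ", "6"], all_ints)`
points at index 4, the value holding a lone space.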

Best wishes,

Nic

On Tue, 31 Jan 2023 at 00:52, Angelo Casalan  wrote:

> Hi Jacob,
>
> Thanks. To provide some specifics on my query:
>
> 1.which version of arrow are you running?
> - 10.0.1
>
> 2. The error message provides an exact col,row position, have you checked
> the value there?
> Yes. It is int64. This is after running open_dataset without specifying
> schema:
> '''
> arrow<-open_dataset(
> sources="location of csv files",
> format="csv"
> )
> '''
>
>  3. I have to correct the exact error message:
> CSV conversion error to int64:invalid value ' '
> I think arrow tells me the invalid value present is ' '
>
>  4. This reminds me of cases where scientific notation is used for integers
>  which causes an error but that usually shows the value e.g. "1e6".
> the invalid value is: ' '
>
> 5. I am really confused because using disk.frame() function, on the same
> csvs, I have not encountered this problem on this column because it was
> cleanly encoded as a numeric variable.
>
> Regards,
>
>
>
> On Fri, Jan 27, 2023 at 9:43 AM Angelo Casalan 
> wrote:
>
> > Hi ,
> >
> > I hope you are well. I wish to ask how I can resolve this error:
> >
> > "CSV conversion error to int64: invalid value"
> >
> >
> > To give an idea of my dataset. I have 4 csvs all placed in a local
> folder.
> >
> >
> > The code below worked when importing:
> >
> >
> > arrow<-open_dataset(
> > sources="csv location",
> > format="csv")
> >
> >
> > However, when I run:
> >
> >
> > arrow %>% count(column) %>% collect()
> > nrow(arrow %>% collect)
> >
> > head(arrow %>% collect(),10 )
> >
> > I always get the same  error message: "Invalid: In CSV column #12: Row
> > #580. CSV conversion error to int64: invalid value"
> >
> > I tried going back to open_dataset(,schema() ). Where the column that is
> > giving me problems is set as utf8 or sometimes str in the schema
> argument.
> >
> > schema(
> > col=utf8(),
> > other nth columns
> > )
> >
> > But I still encounter the same problem.
> >
> > The code below does not work either.
> >
> > arrow2<-arrow_table(arrow)
> >
> > Thanks in advance if you can help me.
> >
> > --
> > Regards,
> >
> > Angelo Casalan
> > Statistical Methodology Unit
> >
>
>
> --
> Regards,
>
> Angelo Casalan
> Statistical Methodology Unit
>


Arrow R package development sync call - today at 17:30 UTC

2023-01-26 Thread Nic Crane
The Arrow R package dev community call is today at 17:30 UTC.
Joining instructions are below.

Thursday, 26 January · 17:30 – 18:30
Google Meet joining info
Video call link: https://meet.google.com/dbm-ybmv-evb
Or dial: (GB) +44 20 3957 3353 PIN: 487 675 609#
More phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

Thanks,

Nic


Arrow R package development sync call - tomorrow (Thurs 12th Jan) at 17:30 UTC

2023-01-11 Thread Nic Crane
The Arrow R package dev community call is tomorrow at 17:30 UTC.

Joining instructions are below.

Thursday, 12 January · 17:30 – 18:30
Google Meet joining info
Video call link: https://meet.google.com/dbm-ybmv-evb
Or dial: (ES) +34 910 48 95 10 PIN: 919 955 818 9233#
More phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

The notes from the last call can be found at:
https://docs.google.com/document/d/1nSIfJw8mfqtvScqvSVqmktpWff80pFmkqiZT7nTtiDo/edit?usp=sharing


Thanks,

Nic


Re: [ANNOUNCE] New Arrow PMC chair: Andrew Lamb

2022-12-26 Thread Nic Crane
Congratulations!

On Mon, 26 Dec 2022, 11:01 Daniël Heres,  wrote:

> Congrats Andrew!
>
> On Mon, Dec 26, 2022, 09:00 L. C. Hsieh  wrote:
>
> > Congratulations!
> >
> > On Sun, Dec 25, 2022 at 10:36 PM Weston Pace 
> > wrote:
> > >
> > > Congratulations!
> > >
> > > On Sun, Dec 25, 2022, 9:44 PM Remzi Yang <1371656737...@gmail.com>
> > wrote:
> > >
> > > > Congratulation Andrew!
> > > >
> > > > On Mon, 26 Dec 2022 at 13:40, David Li  wrote:
> > > >
> > > > > Congrats Andrew!
> > > > >
> > > > > On Mon, Dec 26, 2022, at 00:26, vin jake wrote:
> > > > > > congratulation!
> > > > > >
> > > > > > > On Mon, 26 Dec 2022 at 12:54, Sutou Kouhei wrote:
> > > > > >
> > > > > >> I am pleased to announce that we have a new PMC chair and VP as
> > per
> > > > > >> our newly started tradition of rotating the chair once a year. I
> > have
> > > > > >> resigned and Andrew Lamb was duly elected by the PMC and
> approved
> > > > > >> unanimously by the board. Please join me in congratulating
> Andrew
> > > > Lamb!
> > > > > >>
> > > > > >> Thanks,
> > > > > >> --
> > > > > >> kou
> > > > > >>
> > > > >
> > > >
> >
>


[ANNOUNCE] New Arrow committer: Jacob Wujciak

2022-12-15 Thread Nic Crane
On behalf of the Arrow PMC, I'm happy to announce that Jacob Wujciak has
accepted an invitation to become a committer on Apache Arrow. Welcome, and
thank you for your contributions!

Nic


Arrow R package development sync call - today at 17:30 UTC

2022-12-15 Thread Nic Crane
The Arrow R package dev community call is today at 17:30 UTC.
Joining instructions are below.

Thursday, 15 December · 17:30 – 18:30
Google Meet joining info
Video call link: https://meet.google.com/dbm-ybmv-evb
Or dial: (ES) +34 910 48 95 10 PIN: 919 955 818 9233#
More phone numbers: https://tel.meet/dbm-ybmv-evb?pin=9199558189233

Thanks,

Nic


Re: [ANNOUNCE] New Arrow committer: Nic Crane

2021-09-10 Thread Nic Crane
Thanks everyone! :)

On Fri, Sep 10, 2021 at 4:54 AM QP Hou  wrote:

> Congrats Nic!
>
> On Thu, Sep 9, 2021 at 8:47 AM Neal Richardson
>  wrote:
> >
> > On behalf of the Apache Arrow PMC, I'm happy to announce that Nic Crane
> > has accepted an invitation to become a committer on Apache Arrow.
> >
> > Welcome and thank you for your contributions!
> >
> > Neal
>