[jira] [Created] (ARROW-7514) [C#] Make GetValueOffset Obsolete

2020-01-07 Thread Takashi Hashida (Jira)
Takashi Hashida created ARROW-7514:
--

 Summary: [C#] Make GetValueOffset Obsolete
 Key: ARROW-7514
 URL: https://issues.apache.org/jira/browse/ARROW-7514
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C#
Reporter: Takashi Hashida


[BinaryArray.GetValueOffset|https://github.com/apache/arrow/blob/master/csharp/src/Apache.Arrow/Arrays/BinaryArray.cs#L172]
 and 
[ListArray.GetValueOffset|https://github.com/apache/arrow/blob/master/csharp/src/Apache.Arrow/Arrays/ListArray.cs#L47]
 are no longer useful.



We should add an `Obsolete` attribute to these methods in the next release, 
then remove them in a future release.
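
The deprecate-first, remove-later approach proposed above can be sketched as follows. This is only an illustration in Python, not the C# change itself: the class `BinaryArraySketch`, its `value_offsets` field, and the suggestion to index that field directly are assumptions made for the example, and Python's `warnings.warn` stands in for the C# `Obsolete` attribute.

```python
import warnings


class BinaryArraySketch:
    """Hypothetical stand-in for a binary array backed by an offsets buffer."""

    def __init__(self, offsets, data):
        # offsets[i] .. offsets[i + 1] delimit the bytes of element i
        self.value_offsets = list(offsets)
        self.data = bytes(data)

    def get_value_offset(self, index):
        # Step 1 (next release): keep the method but warn callers.
        # Step 2 (a later release): delete the method entirely.
        warnings.warn(
            "get_value_offset is obsolete; index value_offsets directly",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.value_offsets[index]

    def get_bytes(self, index):
        # Callers that only need the element never touch the accessor.
        start = self.value_offsets[index]
        end = self.value_offsets[index + 1]
        return self.data[start:end]
```

Existing callers keep working for one release while seeing the warning, which gives them time to migrate before the method disappears.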

 

See this discussion: 
[https://github.com/apache/arrow/pull/6029#discussion_r361505788]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-7513) [JS] Arrow Tutorial: Common data types

2020-01-07 Thread Leo Meyerovich (Jira)
Leo Meyerovich created ARROW-7513:
-

 Summary: [JS] Arrow Tutorial: Common data types
 Key: ARROW-7513
 URL: https://issues.apache.org/jira/browse/ARROW-7513
 Project: Apache Arrow
  Issue Type: Task
  Components: JavaScript
Reporter: Leo Meyerovich
Assignee: Leo Meyerovich


The JS client lacks basic introductory material on creating the common 
basic data types, such as turning JS arrays into ints, dicts, etc. There is no 
equivalent of Python's [https://arrow.apache.org/docs/python/data.html] . This 
has made the library difficult for me to use, and I bet for others too.

 

As with previous tutorials, I started sketching at 
[https://observablehq.com/@lmeyerov/rich-data-types-in-apache-arrow-js-efficient-data-tables-wit].
Once we're happy with it, it may make sense to export it as HTML or similar to 
the repo, or just link to it from the main README.

I believe the target topics worth covering are:
 * Common user data types: Ints, Dicts, Struct, Time
 * Common column types: Data, Vector, Column
 * Going from individual & arrays & buffers of JS values to Arrow-wrapped 
forms, and basic inspection of the result
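
To make the "Dicts" topic concrete, here is a minimal sketch of what dictionary-encoding an array of plain values means: each distinct value is stored once, and the column becomes a list of integer indices into that dictionary. It is written in plain Python and does not use the Arrow JS API; the function name `dictionary_encode` and the use of `None` for nulls are assumptions for illustration only.

```python
def dictionary_encode(values):
    """Split values into (indices, dictionary); None entries stay None."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> its index in `dictionary`
    indices = []      # one entry per input value
    for value in values:
        if value is None:
            indices.append(None)  # nulls carry no dictionary index
            continue
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(indices, dictionary):
    """Basic inspection of the result: rebuild the original values."""
    return [None if i is None else dictionary[i] for i in indices]
```

Decoding the pair returned by `dictionary_encode` gives back the original list, which is the kind of basic inspection of the result that a tutorial could walk through.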

Tables vs. RecordBatches is not worth going into here; that is covered by the 
other tutorial.

 

1. Ideas of what to add/edit/remove?

2. Is anyone up for helping with the discussion of Data vs. Vector, and the 
ingestion of Time & Struct?

3. ... Should we be encouraging Struct or Map? I saw some PRs changing stuff 
here.

 

cc [~wesm] [~bhulette] [~paul.e.taylor]



Re: Timeline for next major release [was Re: Looking to 1.0]

2020-01-07 Thread Wes McKinney
I just finished an initial curation of the JIRA backlog. There are now
137 issues, which is probably more than will be resolved before
releasing. I noticed some concerning bugs that may need attention, but
if there are any new feature or nice-to-have issues that you are
familiar with, please remove them from the 0.16.0 milestone if you
don't think they will be done in the next 7-10 days.

On Tue, Jan 7, 2020 at 5:28 PM Krisztián Szűcs
 wrote:
>
> On Tue, Jan 7, 2020 at 11:40 PM Neal Richardson
>  wrote:
> >
> > If we expect that the release process may be less stable this time, should
> > we bump up our target date for an RC, like to the 20th or 21st (two weeks
> > from now)? That would give us more leeway to make sure we get a release out
> > before the end of January.
> Agree.
> >
> > Neal

Re: Timeline for next major release [was Re: Looking to 1.0]

2020-01-07 Thread Wes McKinney
That sounds fine to me. I don't see many blocking issues for a major
release, and the nightly reports are fairly clean, so I think we
should try to be ready to go at the beginning of that week of the
19th.

On Tue, Jan 7, 2020 at 4:40 PM Neal Richardson
 wrote:
>
> If we expect that the release process may be less stable this time, should
> we bump up our target date for an RC, like to the 20th or 21st (two weeks
> from now)? That would give us more leeway to make sure we get a release out
> before the end of January.
>
> Neal

Re: Timeline for next major release [was Re: Looking to 1.0]

2020-01-07 Thread Krisztián Szűcs
On Tue, Jan 7, 2020 at 11:40 PM Neal Richardson
 wrote:
>
> If we expect that the release process may be less stable this time, should
> we bump up our target date for an RC, like to the 20th or 21st (two weeks
> from now)? That would give us more leeway to make sure we get a release out
> before the end of January.
Agree.
>
> Neal

[jira] [Created] (ARROW-7512) Dictionary memo missing elements in id_to_dictionary_ map after deserialization

2020-01-07 Thread Wamsi Viswanath (Jira)
Wamsi Viswanath created ARROW-7512:
--

 Summary: Dictionary memo missing elements in id_to_dictionary_ map 
after deserialization
 Key: ARROW-7512
 URL: https://issues.apache.org/jira/browse/ARROW-7512
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.15.0
Reporter: Wamsi Viswanath


The `id_to_dictionary_` map is empty after deserializing a schema with the 
ReadSchema method.

An example for reproduction:

[https://gist.github.com/wamsiv/77dc1db44b5805828172e6c94d61d2d9]

I see that it is probably being missed here: 
https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/metadata_internal.cc#L804

Please let me know whether this behavior is expected and, if so, how the client 
is expected to obtain the dictionary array values.





Re: Timeline for next major release [was Re: Looking to 1.0]

2020-01-07 Thread Neal Richardson
If we expect that the release process may be less stable this time, should
we bump up our target date for an RC, like to the 20th or 21st (two weeks
from now)? That would give us more leeway to make sure we get a release out
before the end of January.

Neal

On Tue, Jan 7, 2020 at 1:02 PM Krisztián Szűcs 
wrote:

> Sounds good to me. I'll help with the jira curation.
>
> Because of the recent CI migrations we'll need to be more thorough during
> the verification, and I also expect minor issues during the release
> process.
> So I volunteer to be the RM if no one else wants to jump in.
>
> Thanks, Krisztian

Re: Timeline for next major release [was Re: Looking to 1.0]

2020-01-07 Thread Krisztián Szűcs
Sounds good to me. I'll help with the jira curation.

Because of the recent CI migrations we'll need to be more thorough during
the verification, and I also expect minor issues during the release process.
So I volunteer to be the RM if no one else wants to jump in.

Thanks, Krisztian

On Tue, Jan 7, 2020 at 7:26 PM Wes McKinney  wrote:
>
> I just renamed the 1.0.0 release version in JIRA to 0.16.0 and will
> work on removing issues that are not necessary to be able to release
> (others, please help). If we make miraculous progress with the 1.0.0
> columnar format blockers (per discussion below), we can change this
> back, but I think either way we should put ourselves on a critical
> path to have an RC cut by Friday January 24. Does that seem doable?
>
> On Tue, Jan 7, 2020 at 10:25 AM Wes McKinney  wrote:
> >
> > We absolutely should have a list of exactly what needs to be done to
> > put out the 1.0.0 release, but based on what we know needs to be done
> > I am not optimistic that it can all be accomplished before the end of
> > January. That doesn't mean that we should assume these things won't
> > get done before March/April time frame. If they get done sooner, let's
> > release 1.0.0 sooner.
> >
> > On Mon, Jan 6, 2020 at 6:03 PM Neal Richardson
> >  wrote:
> > >
> > > I'm all for maintaining a regular cadence of releases, but before we cast
> > > > aside the idea of 1.0, I'd still encourage us to do the work of enumerating
> > > what truly must happen before we call a release 1.0 so that we can get it
> > > done. Otherwise, in April we're going to be talking about doing a 0.17
> > > release.
> > >
> > > I believe I've found the issues that Wes referenced and added them as
> > > "blockers" to 1.0.0. That brings the total blocker count listed on
> > > > https://cwiki.apache.org/confluence/display/ARROW/Arrow+1.0.0+Release to 10
> > > issues, though some may be overlapping/redundant. Do we think this is an
> > > exhaustive list of blockers? Should some of these be downgraded to
> > > not-blocking? If we were to resolve all 10 of these issues, would we have
> > > consensus that we're ready for 1.0?
> > >
> > > Would it help to update this wiki, which seems pretty stale at this point?
> > > https://cwiki.apache.org/confluence/display/ARROW/Columnar+Format+1.0+Milestone
> > >
> > > Thanks,
> > > Neal
> > >
> > >
> > > On Mon, Jan 6, 2020 at 11:40 AM Bryan Cutler  wrote:
> > >
> > > > I agree on a 0.16.0 release. In the meantime I'll try to help out with
> > > > getting the Java side ready for 1.0.
> > > >
> > > > On Sat, Jan 4, 2020 at 7:21 PM Fan Liya  wrote:
> > > >
> > > > > Hi Jacques,
> > > > >
> > > > > ARROW-4526 is interesting. I would like to try to resolve it.
> > > > > Thanks a lot for the information.
> > > > >
> > > > > Best,
> > > > > Liya Fan
> > > > >
> > > > >
> > > > > > On Sun, Jan 5, 2020 at 6:14 AM Jacques Nadeau wrote:
> > > > >
> > > > > > The third ticket I was commenting on was ARROW-4526.
> > > > > >
> > > > > > Fan, do you want to take a shot at that one?
> > > > > >
> > > > > > > On Fri, Jan 3, 2020 at 8:16 PM Fan Liya wrote:
> > > > > >
> > > > > > >   Hi Jacques,
> > > > > > >
> > > > > > > > I am interested in the issues, and if it is possible, I would like to
> > > > > > > > try to resolve them.
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > Liya Fan
> > > > > > >
> > > > > > > > On Sat, Jan 4, 2020 at 7:16 AM Jacques Nadeau wrote:
> > > > > > >
> > > > > > > > > I identified three things in the java library that I think are top of
> > > > > > > > > mind and should be fixed before 1.0 to avoid weird incompatibility
> > > > > > > > > changes in the java apis (technical debt). I've tagged them as pre-1.0
> > > > > > > > > as I don't exactly see what is the right way to tag/label a target
> > > > > > > > > release for a ticket.
> > > > > > > > >
> > > > > > > > > https://issues.apache.org/jira/browse/ARROW-7495?jql=labels%20%3D%20pre-1.0
> > > > > > > >
> > > > > > > > > For the three tickets I identified, does anyone have interest in
> > > > > > > > > trying to resolve?
> > > > > > > >
> > > > > > > > thanks,
> > > > > > > > Jacques
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jan 2, 2020 at 11:55 AM Neal Richardson <
> > > > > > > > neal.p.richard...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > > > Happy new year! As we look ahead to 2020, it's time to start
> > > > > > > > > > mobilizing for the Arrow 1.0 release. At 0.15, I believe we decided
> > > > > > > > > > that our next release should be 1.0, and it's been a couple of months
> > > > > > > > > > since 0.15, so we're due to release again this month, give or take.
> > > > > > > > > > (See [1] for 

Re: Timeline for next major release [was Re: Looking to 1.0]

2020-01-07 Thread Neal Richardson
Thanks, Wes. I made
https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.16.0+Release to
help us track 0.16.

Neal

On Tue, Jan 7, 2020 at 10:26 AM Wes McKinney  wrote:

> I just renamed the 1.0.0 release version in JIRA to 0.16.0 and will
> work on removing issues that are not necessary to be able to release
> (others, please help). If we make miraculous progress with the 1.0.0
> columnar format blockers (per discussion below), we can change this
> back, but I think either way we should put ourselves on a critical
> path to have an RC cut by Friday January 24. Does that seem doable?
>
> On Tue, Jan 7, 2020 at 10:25 AM Wes McKinney  wrote:
> >
> > We absolutely should have a list of exactly what needs to be done to
> > put out the 1.0.0 release, but based on what we know needs to be done
> > I am not optimistic that it can all be accomplished before the end of
> > January. That doesn't mean that we should assume these things won't
> > get done before March/April time frame. If they get done sooner, let's
> > release 1.0.0 sooner.
> >

[jira] [Created] (ARROW-7511) [C#] - Batch / Data Size Can't Exceed 2 gigs

2020-01-07 Thread Anthony Abate (Jira)
Anthony Abate created ARROW-7511:


 Summary: [C#] - Batch / Data Size Can't Exceed 2 gigs
 Key: ARROW-7511
 URL: https://issues.apache.org/jira/browse/ARROW-7511
 Project: Apache Arrow
  Issue Type: Bug
  Components: C#
Affects Versions: 0.15.1
Reporter: Anthony Abate


While the Arrow spec does not forbid batches larger than 2 GB, the C# library 
cannot support them in its current form because of limits on managed memory: 
it tries to put the whole batch into a single Span/Memory.

It is possible to fix this by not using Memory/Span/byte[] for the entire 
batch and instead moving the memory mapping to the ArrowBuffers. This only 
moves the problem 'lower', as it would still limit the data of a single column 
in a single batch to 2 GB.

This seems like plenty of memory, but for string columns the data is just one 
giant string of all values appended together, plus offsets, and it can get 
very large quickly.

I think the unfortunate problem is that memory management in the C# managed 
world is always going to hit the 2 GB limit somewhere. (Please correct me if I 
am wrong on this statement.)

That ultimately means the C# library either has to reject files with certain 
characteristics (i.e. validation checks on opening), or the spec needs to put 
upper limits on certain internal Arrow constructs (i.e. the Arrow buffer) to 
eliminate the need for more than 2 GB of contiguous memory for the smallest 
Arrow object.

However, if the spec was indeed designed for the smallest buffer object to be 
larger than 2 GB, or for the entire memory buffer of Arrow to be contiguous, 
one has to wonder whether at some point it might just make sense for the C# 
library to use the C++ library as its memory manager, as replicating very 
large blocks of memory is more work than it's worth.

In any case, this issue is more about 'deferring' the 2 GB size problem by 
moving it down to the buffer objects. This might require some rewrite of the 
batch data structures.
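The "validation checks on opening" option can be pictured with a small sketch. It is written in C++ only for illustration and is not code from any Arrow library: `AllBuffersFitInSpan` and `kMaxSpanLength` are hypothetical names, and the limit is assumed to be the largest 32-bit signed integer, matching the length type of a managed span.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Assumption for this sketch: one contiguous managed span can address at
// most INT32_MAX bytes.
constexpr int64_t kMaxSpanLength = std::numeric_limits<int32_t>::max();

// Hypothetical check: returns true only if every individual buffer fits in
// a single span. A batch whose total size exceeds the limit still passes as
// long as no single buffer does, which is the "deferred" limit described
// in the issue.
inline bool AllBuffersFitInSpan(const std::vector<int64_t>& buffer_lengths) {
  for (int64_t length : buffer_lengths) {
    assert(length >= 0);
    if (length > kMaxSpanLength) {
      return false;
    }
  }
  return true;
}
```

With a per-buffer check like this, three 1 GiB buffers in one batch would be accepted, while a single 2 GiB buffer would be rejected.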

 

 





[jira] [Created] (ARROW-7510) [C++] Array::null_count() is not thread-compatible

2020-01-07 Thread Zhuo Peng (Jira)
Zhuo Peng created ARROW-7510:


 Summary: [C++] Array::null_count() is not thread-compatible
 Key: ARROW-7510
 URL: https://issues.apache.org/jira/browse/ARROW-7510
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Zhuo Peng


ArrayData has a mutable member, null_count, which can be updated in a const 
member function. However, null_count is not atomic, so it is subject to a data 
race.

I guess Arrays are not thread-safe (which is reasonable), but at least they 
should be thread-compatible, so that concurrent calls to const member functions 
are fine.

(The race looks "benign", but see [1] and [2].)

[https://github.com/apache/arrow/blob/dbe708c7527a4aa6b63df7722cd57db4e0bd2dc7/cpp/src/arrow/array.cc#L123]

[1] [https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong]

[2] [https://bartoszmilewski.com/2014/10/25/dealing-with-benign-data-races-the-c-way/]
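One possible direction, shown here only as a minimal sketch and not as the actual Arrow code, is to keep the lazily computed count in a std::atomic. `ArrayDataSketch`, `kUnknownNullCount`, and the use of a `std::vector<bool>` in place of a validity bitmap are all assumptions made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Sentinel meaning "null count not computed yet" (hypothetical name).
constexpr int64_t kUnknownNullCount = -1;

// Simplified, hypothetical stand-in for arrow::ArrayData. The null count is
// computed on first use and cached in an atomic, so concurrent calls to the
// const accessor do not race on the cached value.
class ArrayDataSketch {
 public:
  // validity[i] == true means slot i holds a value (is not null).
  explicit ArrayDataSketch(std::vector<bool> validity)
      : validity_(std::move(validity)), null_count_(kUnknownNullCount) {}

  int64_t null_count() const {
    int64_t cached = null_count_.load(std::memory_order_relaxed);
    if (cached == kUnknownNullCount) {
      cached = 0;
      for (bool valid : validity_) {
        if (!valid) ++cached;
      }
      assert(cached >= 0);
      // Several threads may reach this point at once. They all compute and
      // store the same value, so a relaxed store is enough.
      null_count_.store(cached, std::memory_order_relaxed);
    }
    return cached;
  }

 private:
  std::vector<bool> validity_;
  mutable std::atomic<int64_t> null_count_;
};
```

The design accepts redundant work (two threads may both count the nulls) in exchange for not needing a lock; only the cached integer is shared, so no stronger memory ordering is required in this sketch.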





Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-01-07-0

2020-01-07 Thread Wes McKinney
Thanks, I commented on ARROW-6727. I would guess that adding retry logic to
the asset upload will fix the common cases.
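As a rough illustration of what such retry logic could look like, the sketch below retries a failing operation with exponential backoff. It is not taken from Crossbow: `RetryWithBackoff`, its parameters, and the callable it wraps are hypothetical names chosen for this example, and the real upload step may need a different policy.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of Crossbow): calls `op` until it returns
// true or `max_attempts` calls have been made. It sleeps between failed
// attempts and doubles the delay each time (exponential backoff).
inline bool RetryWithBackoff(const std::function<bool()>& op, int max_attempts,
                             std::chrono::milliseconds initial_delay) {
  assert(max_attempts > 0);
  std::chrono::milliseconds delay = initial_delay;
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    if (op()) {
      return true;  // the operation succeeded, stop retrying
    }
    if (attempt < max_attempts) {
      std::this_thread::sleep_for(delay);
      delay *= 2;
    }
  }
  return false;  // every attempt failed
}
```

A caller would wrap the upload in a callable that returns true on success, for example `RetryWithBackoff(upload, 5, std::chrono::milliseconds(500))`, where `upload` is whatever function performs the transfer.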

On Tue, Jan 7, 2020 at 11:38 AM Neal Richardson
 wrote:
>
> https://issues.apache.org/jira/browse/ARROW-6727 and
> https://issues.apache.org/jira/browse/ARROW-6739 are about artifact upload
> timeouts.
>
> On Tue, Jan 7, 2020 at 9:09 AM Wes McKinney  wrote:
>
> > wheel-osx-cp37m failed due to GitHub uploading flakiness. This manner
> > of flakiness seems to occur quite often, is there any mitigation
> > against this?
> >
> > https://travis-ci.org/ursa-labs/crossbow/builds/633670862#L6621
> >
> > gandiva-jar-osx also failed during the deploy step, has this been
> > happening consistently?
> >
> > https://travis-ci.org/ursa-labs/crossbow/builds/633670142#L2020
> >

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-01-07-0

2020-01-07 Thread Neal Richardson
https://issues.apache.org/jira/browse/ARROW-6727 and
https://issues.apache.org/jira/browse/ARROW-6739 are about artifact upload
timeouts.

On Tue, Jan 7, 2020 at 9:09 AM Wes McKinney  wrote:

> wheel-osx-cp37m failed due to GitHub uploading flakiness. This manner
> of flakiness seems to occur quite often, is there any mitigation
> against this?
>
> https://travis-ci.org/ursa-labs/crossbow/builds/633670862#L6621
>
> gandiva-jar-osx also failed during the deploy step, has this been
> happening consistently?
>
> https://travis-ci.org/ursa-labs/crossbow/builds/633670142#L2020
>

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-01-07-0

2020-01-07 Thread Wes McKinney
wheel-osx-cp37m failed due to GitHub uploading flakiness. This kind
of flakiness seems to occur quite often; is there any mitigation
against this?

https://travis-ci.org/ursa-labs/crossbow/builds/633670862#L6621

gandiva-jar-osx also failed during the deploy step; has this been
happening consistently?

https://travis-ci.org/ursa-labs/crossbow/builds/633670142#L2020


Arrow sync call January 8 at 12:00 US/Eastern, 17:00 UTC

2020-01-07 Thread Neal Richardson
Hi all,
Happy 2020! Reminder that our biweekly call is in 24 hours at
https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will
be sent out to the mailing list afterwards.

Neal


[jira] [Created] (ARROW-7509) Turn on Checked mode for debug builds

2020-01-07 Thread Anthony Abate (Jira)
Anthony Abate created ARROW-7509:


 Summary: Turn on Checked mode for debug builds
 Key: ARROW-7509
 URL: https://issues.apache.org/jira/browse/ARROW-7509
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C#
Affects Versions: 0.15.1
Reporter: Anthony Abate


Does anyone object to turning on checked mode for debug builds?

There have been many arithmetic overflow bugs. These could have been caught 
earlier simply by running the code with checked arithmetic turned on.

The unit tests could then be run in debug mode, and any obvious overflow bugs 
would likely be caught.





[jira] [Created] (ARROW-7508) DateTime Reading is Broken

2020-01-07 Thread Anthony Abate (Jira)
Anthony Abate created ARROW-7508:


 Summary: DateTime Reading is Broken
 Key: ARROW-7508
 URL: https://issues.apache.org/jira/browse/ARROW-7508
 Project: Apache Arrow
  Issue Type: Bug
  Components: C#
Affects Versions: 0.15.1
Reporter: Anthony Abate
Assignee: Anthony Abate


DateTime support for writing works, but reading is broken.

This is another arithmetic overflow bug (a few have been reported already), 
and it causes dates to be misinterpreted.

I extracted the current logic into LINQPad to show the bug and the fix:

{code:java}
var dto = DateTimeOffset.Parse("2024-09-25");
(dto.ToUnixTimeMilliseconds() / 86400000).Dump();
// YIELDS: 19991

// Current code: the multiplication is done in 32-bit int and overflows.
unchecked
{
    DateTimeOffset.FromUnixTimeMilliseconds(19991 * 86400000).Dump();
    // 1/8/1970 WRONG
}

// Fix: cast to long so the multiplication is done in 64 bits.
checked
{
    DateTimeOffset.FromUnixTimeMilliseconds((long)19991 * 86400000).Dump();
    // 9/25/2024 CORRECT
}
{code}

The fix is trivial: a cast to long is missing wherever 
FromUnixTimeMilliseconds is used.
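The arithmetic behind the wrong date can be reproduced outside C#. The sketch below is C++ written only for illustration, assuming the day-to-millisecond factor is 86,400,000 (milliseconds per day); `DaysToMillisWrapped32` and `DaysToMillis64` are hypothetical names, not functions from the Arrow C# library.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: days are converted to milliseconds with this factor.
constexpr int64_t kMillisPerDay = 86400000;

// Mimics the buggy path: the product is reduced to 32 bits, as happens when
// both operands are 32-bit ints and overflow is not checked. The multiply is
// done on unsigned values because signed overflow is undefined in C++.
inline int64_t DaysToMillisWrapped32(int32_t days) {
  assert(days >= 0);  // this sketch only models dates on or after the epoch
  uint32_t product =
      static_cast<uint32_t>(days) * static_cast<uint32_t>(kMillisPerDay);
  // Reinterpret the low 32 bits as a signed value, like a wrapped C# int.
  return static_cast<int64_t>(static_cast<int32_t>(product));
}

// Mirrors the fix: widen to 64 bits before multiplying.
inline int64_t DaysToMillis64(int32_t days) {
  assert(days >= 0);
  return static_cast<int64_t>(days) * kMillisPerDay;
}
```

For day 19991 the 64-bit product is 1,727,222,400,000 ms, while the wrapped 32-bit product is 645,547,008 ms, which is about 7.5 days after the epoch and explains the 1/8/1970 result.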

 

 





Re: Looking to 1.0

2020-01-07 Thread Wes McKinney
We absolutely should have a list of exactly what needs to be done to
put out the 1.0.0 release, but based on what we know needs to be done,
I am not optimistic that it can all be accomplished before the end of
January. That doesn't mean that we should assume these things won't
get done before the March/April time frame. If they get done sooner, let's
release 1.0.0 sooner.

On Mon, Jan 6, 2020 at 6:03 PM Neal Richardson
 wrote:
>
> I'm all for maintaining a regular cadence of releases, but before we cast
> aside the idea of 1.0, I'd still encourage us to do the work of enumerating
> what truly must happen before we call a release 1.0 so that we can get it
> done. Otherwise, in April we're going to be talking about doing a 0.17
> release.
>
> I believe I've found the issues that Wes referenced and added them as
> "blockers" to 1.0.0. That brings the total blocker count listed on
> https://cwiki.apache.org/confluence/display/ARROW/Arrow+1.0.0+Release to 10
> issues, though some may be overlapping/redundant. Do we think this is an
> exhaustive list of blockers? Should some of these be downgraded to
> not-blocking? If we were to resolve all 10 of these issues, would we have
> consensus that we're ready for 1.0?
>
> Would it help to update this wiki, which seems pretty stale at this point?
> https://cwiki.apache.org/confluence/display/ARROW/Columnar+Format+1.0+Milestone
>
> Thanks,
> Neal
>
>
> On Mon, Jan 6, 2020 at 11:40 AM Bryan Cutler  wrote:
>
> > I agree on a 0.16.0 release. In the meantime I'll try to help out with
> > getting the Java side ready for 1.0.
> >
> > On Sat, Jan 4, 2020 at 7:21 PM Fan Liya  wrote:
> >
> > > Hi Jacques,
> > >
> > > ARROW-4526 is interesting. I would like to try to resolve it.
> > > Thanks a lot for the information.
> > >
> > > Best,
> > > Liya Fan
> > >
> > >
> > > On Sun, Jan 5, 2020 at 6:14 AM Jacques Nadeau 
> > wrote:
> > >
> > > > The third ticket I was commenting on was ARROW-4526.
> > > >
> > > > Fan, do you want to take a shot at that one?
> > > >
> > > > On Fri, Jan 3, 2020 at 8:16 PM Fan Liya  wrote:
> > > >
> > > > >   Hi Jacques,
> > > > >
> > > > > I am interested in the issues, and if it is possible, I would like to
> > > try
> > > > > to resolve them.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Liya Fan
> > > > >
> > > > > On Sat, Jan 4, 2020 at 7:16 AM Jacques Nadeau 
> > > > wrote:
> > > > >
> > > > > > I identified three things in the java library that I think are top
> > of
> > > > > mind
> > > > > > and should be fixed before 1.0 to avoid weird incompatibility
> > changes
> > > > in
> > > > > > the java apis (technical debt). I've tagged them as pre-1.0 as I
> > > don't
> > > > > > exactly see what is the right way to tag/label a target release
> > for a
> > > > > > ticket.
> > > > > >
> > > > >
> > > >
> > >
> > https://issues.apache.org/jira/browse/ARROW-7495?jql=labels%20%3D%20pre-1.0
> > > > > >
> > > > > > For the three tickets I identified, does anyone have interest in
> > > trying
> > > > > to
> > > > > > resolve?
> > > > > >
> > > > > > thanks,
> > > > > > Jacques
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jan 2, 2020 at 11:55 AM Neal Richardson <
> > > > > > neal.p.richard...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > > Happy new year! As we look ahead to 2020, it's time to start
> > > > mobilizing
> > > > > > for
> > > > > > > the Arrow 1.0 release. At 0.15, I believe we decided that our
> > next
> > > > > > release
> > > > > > > should be 1.0, and it's been a couple of months since 0.15, so
> > > we're
> > > > > due
> > > > > > to
> > > > > > > release again this month, give or take. (See [1] for when we most
> > > > > > recently
> > > > > > > discussed doing 1.0 back in June, or if you're a fan of ancient
> > > > > history,
> > > > > > > see [2] for a similar discussion from July 2017.)
> > > > > > >
> > > > > > > Since there appeared to be consensus before that it is time for
> > > 1.0,
> > > > > > let's
> > > > > > > discuss how to get it done. One first step would be to make sure
> > > that
> > > > > > we've
> > > > > > > identified all format/specification issues we think we must
> > resolve
> > > > > > before
> > > > > > > declaring 1.0. [3] shows 3 "blockers" for the 1.0 release
> > already.
> > > > > There
> > > > > > > are an additional 14 "Format" issues ([4]); perhaps some of those
> > > > > should
> > > > > > > also be labeled blockers for 1.0.
> > > > > > >
> > > > > > > It would be great if folks could review Jira in their areas of
> > > > > expertise
> > > > > > > and make sure everything essential for 1.0 is ticketed and
> > > > prioritized
> > > > > > > appropriately. Once we've identified the required tasks for
> > making
> > > a
> > > > > 1.0
> > > > > > > release, we can work together on burning those down.
> > > > > > >
> > > > > > > Neal
> > > > > > >
> > > > > > > [1]:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > 

[NIGHTLY] Arrow Build Report for Job nightly-2020-01-07-0

2020-01-07 Thread Crossbow


Arrow Build Report for Job nightly-2020-01-07-0

All tasks: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0

Failed Tasks:
- gandiva-jar-osx:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-travis-gandiva-jar-osx
- homebrew-cpp:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-travis-homebrew-cpp
- wheel-osx-cp37m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-travis-wheel-osx-cp37m

Succeeded Tasks:
- centos-6:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-centos-6
- centos-7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-centos-7
- centos-8:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-centos-8
- conda-linux-gcc-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-linux-gcc-py27
- conda-linux-gcc-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-linux-gcc-py36
- conda-linux-gcc-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-linux-gcc-py37
- conda-linux-gcc-py38:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-linux-gcc-py38
- conda-osx-clang-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-osx-clang-py27
- conda-osx-clang-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-osx-clang-py36
- conda-osx-clang-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-osx-clang-py37
- conda-osx-clang-py38:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-osx-clang-py38
- conda-win-vs2015-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-win-vs2015-py36
- conda-win-vs2015-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-win-vs2015-py37
- conda-win-vs2015-py38:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-conda-win-vs2015-py38
- debian-buster:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-debian-buster
- debian-stretch:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-azure-debian-stretch
- gandiva-jar-trusty:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-travis-gandiva-jar-trusty
- macos-r-autobrew:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-travis-macos-r-autobrew
- test-conda-cpp:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-cpp
- test-conda-python-2.7-pandas-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-2.7-pandas-latest
- test-conda-python-2.7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-2.7
- test-conda-python-3.6:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.6
- test-conda-python-3.7-dask-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7-dask-latest
- test-conda-python-3.7-hdfs-2.9.2:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7-hdfs-2.9.2
- test-conda-python-3.7-pandas-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7-pandas-latest
- test-conda-python-3.7-pandas-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7-pandas-master
- test-conda-python-3.7-spark-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7-spark-master
- test-conda-python-3.7-turbodbc-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7-turbodbc-latest
- test-conda-python-3.7-turbodbc-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7-turbodbc-master
- test-conda-python-3.7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.7
- test-conda-python-3.8-dask-master:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-07-0-circle-test-conda-python-3.8-dask-master
-