[ANNOUNCE] Apache Arrow 13.0.0 released

2023-08-25 Thread Raúl Cumplido
The Apache Arrow community is pleased to announce the 13.0.0 release.
It includes 456 resolved issues ([1]) since the 12.0.1 release.

The release is available now from our website and [2]:
http://arrow.apache.org/install/

Read about what's new in the release
https://arrow.apache.org/blog/2023/08/24/13.0.0-release/

Changelog
https://arrow.apache.org/release/13.0.0.html

What is Apache Arrow?
---------------------

Apache Arrow is a columnar in-memory analytics layer designed to accelerate big
data. It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides low-overhead streaming and batch messaging,
zero-copy interprocess communication (IPC), and vectorized in-memory analytics
libraries.

Please report any feedback to the mailing lists ([3]).

Regards,
The Apache Arrow community

[1]: https://github.com/apache/arrow/milestone/53?closed=1
[2]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-13.0.0/
[3]: https://lists.apache.org/list.html?dev@arrow.apache.org


Re: [DataFusion] What should the Python / High Level Interface Look Like?

2023-08-25 Thread Andrew Lamb
Thanks for bringing this up Josh!

I agree the current DataFusion community is very focused on building a
foundation for data-intensive systems like databases, data flow engines,
etc. It is NOT really focused on the end users of those systems.

In my mind there is not yet a community around the DataFusion Python
bindings that will drive it forward in any way other than bindings to the
underlying engine.

One of DataFusion's strengths is that it can be used to build many things
(including the things you list above).

Therefore, I think the question is "what do you want to build?" Insofar
as DataFusion (the technology and the community at large) can help, we
would love to. This could be within the DataFusion / Arrow project itself
or it could be entirely outside using the code from the project, or maybe
something in between.

Hope that helps
Andrew


On Thu, Aug 24, 2023 at 5:23 AM Josh Magarick wrote:

> Ahoy!
> Recently, there was a request to find people to take a more active role in
> defining and building a Python interface to DataFusion here:
> https://github.com/apache/arrow-datafusion-python/issues/440
>
> In response, I've filed the following to get a sense of what's important to
> people:
> https://github.com/apache/arrow-datafusion-python/issues/462
>
> However, in addition to wanting to publicize my request more, some nagging
> questions about the broader goals of DataFusion and Arrow remain. Given
> that DF is pitched as a foundation for database systems, what are the
> aspirations for an interface in Python or other high level languages? Are
> folks imagining it will be used for building pipelines, automated analysis,
> interactive EDA? All of the above? Something else entirely?
>
> Given my background I'm inclined toward something aimed at both interactive
> and automated data analysis. It seems like a lot of the foundation is
> there, though I think doing it right requires more than just an interface
> on what exists already. There's more to discuss but hopefully this is
> enough to get started. Thanks for taking the time to read this.
>
> Regards,
>
> Josh
>


Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 30.0.0 RC1

2023-08-25 Thread Andy Grove
The vote passes with 5 +1 votes (4 binding). Thanks, everyone! The crates
were released without issue this time.

On Wed, Aug 23, 2023 at 1:28 AM vin jake wrote:

> +1 (binding)
>
> Verified on M1 macbook.
>
> Thanks Andy!
>
> On Tue, Aug 22, 2023 at 10:48 PM Andy Grove wrote:
>
> > Hi,
> >
> > I would like to propose a release of Apache Arrow DataFusion
> > Implementation, version 30.0.0.
> >
> > This release candidate is based on commit:
> > c703526596c8602f24d470d98c469c985a99b4b5 [1]
> > The proposed release tarball and signatures are hosted at [2].
> > The changelog is located at [3].
> >
> > Please download, verify checksums and signatures, run the unit tests, and
> > vote on the release. The vote will be open for at least 72 hours.
> >
> > Only votes from PMC members are binding, but all members of the community
> > are encouraged to test the release and vote with "(non-binding)".
> >
> > The standard verification procedure is documented at
> > https://github.com/apache/arrow-datafusion/blob/main/dev/release/README.md#verifying-release-candidates
> >
> > [ ] +1 Release this as Apache Arrow DataFusion 30.0.0
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow DataFusion 30.0.0 because...
> >
> > Here is my vote:
> >
> > +1
> >
> > [1]: https://github.com/apache/arrow-datafusion/tree/c703526596c8602f24d470d98c469c985a99b4b5
> > [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-30.0.0-rc1
> > [3]: https://github.com/apache/arrow-datafusion/blob/c703526596c8602f24d470d98c469c985a99b4b5/CHANGELOG.md
> >
>


[RESULT][VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 30.0.0 RC1

2023-08-25 Thread Andy Grove
On Fri, Aug 25, 2023 at 9:58 AM Andy Grove wrote:

> The vote passes with 5 +1 votes (4 binding). Thanks, everyone! The crates
> were released without issue this time.
>
> On Wed, Aug 23, 2023 at 1:28 AM vin jake wrote:
>
>> +1 (binding)
>>
>> Verified on M1 macbook.
>>
>> Thanks Andy!
>>
>> On Tue, Aug 22, 2023 at 10:48 PM Andy Grove wrote:
>>
>> > Hi,
>> >
>> > I would like to propose a release of Apache Arrow DataFusion
>> > Implementation, version 30.0.0.
>> >
>> > This release candidate is based on commit:
>> > c703526596c8602f24d470d98c469c985a99b4b5 [1]
>> > The proposed release tarball and signatures are hosted at [2].
>> > The changelog is located at [3].
>> >
>> > Please download, verify checksums and signatures, run the unit tests,
>> > and vote on the release. The vote will be open for at least 72 hours.
>> >
>> > Only votes from PMC members are binding, but all members of the
>> > community are encouraged to test the release and vote with
>> > "(non-binding)".
>> >
>> > The standard verification procedure is documented at
>> > https://github.com/apache/arrow-datafusion/blob/main/dev/release/README.md#verifying-release-candidates
>> >
>> > [ ] +1 Release this as Apache Arrow DataFusion 30.0.0
>> > [ ] +0
>> > [ ] -1 Do not release this as Apache Arrow DataFusion 30.0.0 because...
>> >
>> > Here is my vote:
>> >
>> > +1
>> >
>> > [1]: https://github.com/apache/arrow-datafusion/tree/c703526596c8602f24d470d98c469c985a99b4b5
>> > [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-30.0.0-rc1
>> > [3]: https://github.com/apache/arrow-datafusion/blob/c703526596c8602f24d470d98c469c985a99b4b5/CHANGELOG.md
>> >
>>
>


Improved nightly build dashboard

2023-08-25 Thread Jacob Wujciak-Jens
Hello Everyone!

Sam spent some time this week improving our nightly build dashboard with
new features like filtering by job name (failed and passing builds!) and
fancy new graphs.

This will make investigating and fixing nightly failures even easier (the
next release will come soon ;) )! Thanks Sam!

You can find the dashboard here: http://crossbow.voltrondata.com/
(it is intentionally http due to technical limitations)

Best
Jacob