Re: [ANNOUNCE] Apache Kudu 1.0.0 release

2016-09-20 Thread Benjamin Kim
Todd,

Thanks. I’ll look into those.

Cheers,
Ben


> On Sep 20, 2016, at 12:11 AM, Todd Lipcon wrote:
> 
> The Apache Kudu team is happy to announce the release of Kudu 1.0.0!
> 
> Kudu is an open source storage engine for structured data which supports 
> low-latency random access together with efficient analytical access patterns. 
> It is designed within the context of the Apache Hadoop ecosystem and supports 
> many integrations with other data analytics projects both inside and outside 
> of the Apache Software Foundation.
> 
> This latest version adds several new features, including:
> 
> - Removal of multiversion concurrency control (MVCC) history is now 
> supported. This allows Kudu to reclaim disk space, where previously Kudu 
> would keep a full history of all changes made to a given table since the 
> beginning of time.
> 
> - Most of Kudu’s command line tools have been consolidated under a new 
> top-level "kudu" tool. This reduces the number of large binaries distributed 
> with Kudu and also includes much-improved help output.
> 
> - Administrative tools including "kudu cluster ksck" now support running 
> against multi-master Kudu clusters.
> 
> - The C++ client API now supports writing data in AUTO_FLUSH_BACKGROUND mode. 
> This can provide higher throughput for ingest workloads.
> 
> This release also includes many bug fixes, optimizations, and other 
> improvements, detailed in the release notes available at:
> http://kudu.apache.org/releases/1.0.0/docs/release_notes.html 
> 
> 
> Download the source release here:
> http://kudu.apache.org/releases/1.0.0/ 
> 
> 
> Convenience binary artifacts for the Java client and various Java 
> integrations (e.g., Spark, Flume) are also now available via the ASF Maven 
> repository.
> 
> Enjoy the new release!
> 
> - The Apache Kudu team



Re: [ANNOUNCE] Apache Kudu 1.0.0 release

2016-09-20 Thread Todd Lipcon
-announce


On Tue, Sep 20, 2016 at 11:34 AM, Benjamin Kim wrote:

> This is awesome!!! Great!!!
>
> Do you know if any improvements were also made to the Spark plugin jar?
>

Looks like a few changes based on the git log:
https://gist.github.com/4fa3ccc3b9be787227fed89c1bd42837

as well as a number of changes to the Java client (which gets pulled into
the Spark jar):
https://gist.github.com/e2a8ca78e51773fabb70aae34207199f


In particular, I think the partition pruning work in the Java client should
reduce the number of Spark partitions if you have predicates on your
DataFrames, though I haven't personally verified it.
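
As a rough illustration of the idea only: a scan with a predicate on the
partition key needs to touch just the partitions whose key range overlaps the
predicate, which is why fewer Spark partitions would be expected. The toy
Python sketch below is not the Kudu Java client's or Spark's actual code;
RangePartition, prune_partitions, and the partition bounds are all
hypothetical names and values made up for the example.

```python
from typing import List, NamedTuple, Optional


class RangePartition(NamedTuple):
    """Hypothetical half-open key range [lower, upper); None means unbounded."""
    name: str
    lower: Optional[int]
    upper: Optional[int]


def prune_partitions(
    partitions: List[RangePartition],
    pred_lower: Optional[int],
    pred_upper: Optional[int],
) -> List[RangePartition]:
    """Keep only partitions overlapping the predicate [pred_lower, pred_upper)."""
    kept = []
    for p in partitions:
        # The partition lies entirely below the predicate range.
        below = (
            p.upper is not None
            and pred_lower is not None
            and p.upper <= pred_lower
        )
        # The partition lies entirely above the predicate range.
        above = (
            p.lower is not None
            and pred_upper is not None
            and p.lower >= pred_upper
        )
        if not (below or above):
            kept.append(p)
    return kept


# Made-up table layout: four range partitions on an integer key.
partitions = [
    RangePartition("p0", None, 100),
    RangePartition("p1", 100, 200),
    RangePartition("p2", 200, 300),
    RangePartition("p3", 300, None),
]

# Predicate: 150 <= key < 250, so only p1 and p2 can hold matching rows.
survivors = prune_partitions(partitions, 150, 250)
print([p.name for p in survivors])
```

With no predicate at all (both bounds None), every partition survives, which
corresponds to the unpruned case.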

-Todd





-- 
Todd Lipcon
Software Engineer, Cloudera


Re: [ANNOUNCE] Apache Kudu 1.0.0 release

2016-09-20 Thread Aminul Islam
Congrats
On Sep 20, 2016 9:35 PM, "Jacques Nadeau" wrote:

> Congrats to everyone. This is a great accomplishment!