[ANNOUNCE] YCSB 0.11.0 released

2016-09-21 Thread Govind Kamat
On behalf of the development community, I'm pleased to announce the
release of YCSB version 0.11.0.

Highlights:
  * Support for ArangoDB.  This is a new binding.
  * Update to Apache Geode (incubating) to improve memory footprint.
  * "couchbase" client deprecated in favor of "couchbase2".
  * Capability to specify TTL for Couchbase2.
  * Various Elasticsearch improvements.
  * Kudu binding updated for version 0.9.0.
  * Fix for issue with hdrhistogram+raw.
  * Performance optimizations for BasicDB and RandomByteIterator. 

Full release notes, including links to source and convenience binaries:
https://github.com/brianfrankcooper/YCSB/releases/tag/0.11.0

This release covers changes since the beginning of July.

Govind


Re: [ANNOUNCE] Apache Kudu 1.0.0 release

2016-09-21 Thread Jean-Daniel Cryans
(with my vendor hat on)

From a Cloudera perspective, support for Kudu is still in beta. We offer
the bits with no guarantees. If you have more questions regarding parcels,
CM, etc, please direct them to
http://community.cloudera.com/t5/Beta-Releases-Apache-Kudu/bd-p/Beta

Thanks!

J-D

On Wed, Sep 21, 2016 at 11:11 AM, Benjamin Kim wrote:

> I tried installing using Cloudera Manager and noticed that the
> documentation doesn’t state the URL to enter in the Parcel Settings. So, I
> just re-used the old one for the beta, but there is an annoying reminder
> that Kudu is still beta. Is there a new parcel URL that is not for the beta?
>
> Thanks,
> Ben
>
>
> On Sep 20, 2016, at 11:23 PM, Matteo Durighetto wrote:
>
>
> 2016-09-20 9:11 GMT+02:00 Todd Lipcon:
>
>> The Apache Kudu team is happy to announce the release of Kudu 1.0.0!
>>
>> Kudu is an open source storage engine for structured data which supports
>> low-latency random access together with efficient analytical access
>> patterns. It is designed within the context of the Apache Hadoop ecosystem
>> and supports many integrations with other data analytics projects both
>> inside and outside of the Apache Software Foundation.
>>
>> This latest version adds several new features, including:
>>
>> - Removal of multiversion concurrency control (MVCC) history is now
>> supported. This allows Kudu to reclaim disk space, where previously Kudu
>> would keep a full history of all changes made to a given table since the
>> beginning of time.
>>
>> - Most of Kudu’s command line tools have been consolidated under a new
>> top-level "kudu" tool. This reduces the number of large binaries
>> distributed with Kudu and also includes much-improved help output.
>>
>> - Administrative tools including "kudu cluster ksck" now support running
>> against multi-master Kudu clusters.
>>
>> - The C++ client API now supports writing data in AUTO_FLUSH_BACKGROUND
>> mode. This can provide higher throughput for ingest workloads.
>>
>> This release also includes many bug fixes, optimizations, and other
>> improvements, detailed in the release notes available at:
>> http://kudu.apache.org/releases/1.0.0/docs/release_notes.html
>>
>> Download the source release here:
>> http://kudu.apache.org/releases/1.0.0/
>>
>> Convenience binary artifacts for the Java client and various Java
>> integrations (eg Spark, Flume) are also now available via the ASF Maven
>> repository.
>>
>> Enjoy the new release!
>>
>> - The Apache Kudu team
>>
>
>
> Really great. Moreover, there is a new producer in the flume-kudu sink,
> the regexp Kudu producer:
>
> https://github.com/cloudera/kudu/blob/master/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/RegexpKuduOperationsProducer.java
>
> With the regexp Kudu producer it is simple to parse events with a regular
> expression and write the resulting records into Kudu tables:
>
>  * A regular expression serializer that generates one {@link Insert} or
>  * {@link Upsert} per {@link Event} by parsing the payload into values using a
>  * regular expression. Values are coerced to the proper column types.
>  *
>  * Example: if the Kudu table has the schema
>  *
>  * key INT32
>  * name STRING
>  *
>  * and producer.pattern is '(?<key>\\d+),(?<name>\\w+)', then the
>  * RegexpKuduOperationsProducer will parse the string
>  *
>  * |12345,Mike||54321,Todd|
>  *
>  * into the rows (key=12345, name=Mike) and (key=54321, name=Todd).
>
> We are just testing it, and it's working.
>
> Kind Regards
>
> Matteo Durighetto
> e-mail: m.durighe...@miriade.it
>
>
>
>
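
The parsing behaviour described in the quoted Javadoc can be tried out with a
minimal Python sketch. This is not the Flume sink's code, only an illustration
of the same idea, and it assumes the pattern uses named capture groups called
key and name, as the resulting rows suggest. The SCHEMA dictionary and the
parse_payload function are hypothetical names introduced here. Note that
Python spells named groups (?P<name>...), while Java uses (?<name>...).

```python
import re

# Named groups map directly onto column names.
# Python syntax is (?P<name>...); the Java equivalent is (?<name>...).
PATTERN = re.compile(r"(?P<key>\d+),(?P<name>\w+)")

# Hypothetical stand-in for the table schema: column name -> coercion function.
SCHEMA = {"key": int, "name": str}


def parse_payload(payload):
    """Return one row (a dict) per regex match found in the payload."""
    rows = []
    for match in PATTERN.finditer(payload):
        # Coerce each captured string to the type its column expects.
        row = {col: coerce(match.group(col)) for col, coerce in SCHEMA.items()}
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in parse_payload("|12345,Mike||54321,Todd|"):
        print(row)
```

Each match yields one row, so a single event payload containing several
records produces several rows, which mirrors the "one Insert or Upsert per
match" behaviour the Javadoc example shows.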


Re: [ANNOUNCE] Apache Kudu 1.0.0 release

2016-09-21 Thread Benjamin Kim
I tried installing using Cloudera Manager and noticed that the documentation 
doesn’t state the URL to enter in the Parcel Settings. So, I just re-used the 
old one for the beta, but there is an annoying reminder that Kudu is still 
beta. Is there a new parcel URL that is not for the beta?

Thanks,
Ben





Re: Create encoded columns in kudu

2016-09-21 Thread Dan Burkert
On Wed, Sep 21, 2016 at 7:53 AM, Jean-Daniel Cryans wrote:

> Hi Amit,
>
> There's this jira on the Impala side:
> https://issues.cloudera.org/browse/IMPALA-3726
>
> I don't know exactly when it'll be available, but I think it's being
> looked at.
>
> Dan Burkert also has a Rust shell for Kudu somewhere, I'll let him comment
> about it.
>

Yes, there is an experimental shell called kudusql that can create tables
with column encoding and compression:

kudu> show tables;
Table | ID
------+---------------------------------
s     | 783e6d7003e74db28b272349fb16a959
t     | 757e4d14c1aa4af1a85cd0983c512359

kudu> CREATE TABLE with_encoding (
a INT32 NOT NULL ENCODING bitshuffle,
PRIMARY KEY (a)
) DISTRIBUTE BY HASH (a) INTO 4 BUCKETS
WITH 1 REPLICA;
table created

kudu> SHOW TABLES;
Table         | ID
--------------+---------------------------------
with_encoding | 584424c61d614dec8e75f9868620a72f
s             | 783e6d7003e74db28b272349fb16a959
t             | 757e4d14c1aa4af1a85cd0983c512359

kudu> DESCRIBE TABLE WITH_ENCODING;
error: Master(MasterError { code: TableNotFound, status: NotFound: The
table does not exist: table_name: "WITH_ENCODING" })

kudu> DESCRIBE TABLE with_encoding;
Column | Type  | Nullable | Encoding   | Compression
-------+-------+----------+------------+------------
a      | Int32 | False    | BitShuffle | Default

kudusql is very experimental at the moment, and I do not recommend pointing
it at a production cluster.

- Dan



>
> J-D
>
> On Wed, Sep 21, 2016 at 5:36 AM, Amit Adhau wrote:
>
>> Hi kudu Team,
>>
>> Is there a direct way apart from api, to add the encoding for the kudu
>> table columns, e.g. in Create table statement which we run on impala shell,
>> can we specify the dictionary encoding?
>>
>> --
>> Thanks & Regards,
>>
>> *Amit Adhau* | Data Architect
>>
>> *GLOBANT* | IND:+91 9821518132
>>
>> The information contained in this e-mail may be confidential. It has been
>> sent for the sole use of the intended recipient(s). If the reader of this
>> message is not an intended recipient, you are hereby notified that any
>> unauthorized review, use, disclosure, dissemination, distribution or
>> copying of this communication, or any of its contents,
>> is strictly prohibited. If you have received it by mistake please let us
>> know by e-mail immediately and delete it from your system. Many thanks.
>


Re: Create encoded columns in kudu

2016-09-21 Thread Amit Adhau
Thank you Jean, having it on the Impala side would certainly be helpful.

Thanks,
Amit

On Sep 21, 2016 8:23 PM, "Jean-Daniel Cryans" wrote:

> Hi Amit,
>
> There's this jira on the Impala side:
> https://issues.cloudera.org/browse/IMPALA-3726
>
> I don't know exactly when it'll be available, but I think it's being
> looked at.
>
> Dan Burkert also has a Rust shell for Kudu somewhere, I'll let him comment
> about it.
>
> J-D
>
> On Wed, Sep 21, 2016 at 5:36 AM, Amit Adhau wrote:
>
>> Hi kudu Team,
>>
>> Is there a direct way apart from api, to add the encoding for the kudu
>> table columns, e.g. in Create table statement which we run on impala shell,
>> can we specify the dictionary encoding?
>>
>> --
>> Thanks & Regards,
>>
>> *Amit Adhau* | Data Architect
>>
>> *GLOBANT* | IND:+91 9821518132
>




Re: Create encoded columns in kudu

2016-09-21 Thread Jean-Daniel Cryans
Hi Amit,

There's this jira on the Impala side:
https://issues.cloudera.org/browse/IMPALA-3726

I don't know exactly when it'll be available, but I think it's being looked
at.

Dan Burkert also has a Rust shell for Kudu somewhere, I'll let him comment
about it.

J-D

On Wed, Sep 21, 2016 at 5:36 AM, Amit Adhau wrote:

> Hi kudu Team,
>
> Is there a direct way apart from api, to add the encoding for the kudu
> table columns, e.g. in Create table statement which we run on impala shell,
> can we specify the dictionary encoding?
>
> --
> Thanks & Regards,
>
> *Amit Adhau* | Data Architect
>
> *GLOBANT* | IND:+91 9821518132
>


Create encoded columns in kudu

2016-09-21 Thread Amit Adhau
Hi kudu Team,

Is there a direct way, apart from the API, to set the encoding for Kudu
table columns? For example, can we specify dictionary encoding in the
CREATE TABLE statement that we run in the Impala shell?

-- 
Thanks & Regards,

*Amit Adhau* | Data Architect

*GLOBANT* | IND:+91 9821518132
