Re: Performance Question

Mike Percy Fri, 27 May 2016 18:23:06 -0700

Have you considered whether your workload is scan-heavy or
random-access-heavy? Have you considered whether you always access / update
a whole row or only part of a row? Kudu is a column store, so it has some
awesome performance characteristics when you are doing a lot of scanning of
just a couple of columns.
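
To make the column-store point concrete, here is a minimal sketch in plain
Python. It is not Kudu code and does not use any Kudu API; the table, column
names, and helper functions are all hypothetical, chosen only to show why
scanning one column is cheap in a columnar layout while fetching a whole row
touches every column.

```python
# Hypothetical user-profile table; names and values are made up for illustration.
rows = [
    {"user_id": 1, "country": "US", "age": 34, "segment": "sports"},
    {"user_id": 2, "country": "DE", "age": 27, "segment": "travel"},
    {"user_id": 3, "country": "US", "age": 45, "segment": "autos"},
]


def avg_age_row_store(rows):
    """Row-oriented layout: a scan of one field still walks every full record."""
    return sum(r["age"] for r in rows) / len(rows)


# Column-oriented layout: the values of each column are stored together.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def avg_age_column_store(columns):
    """A scan of one field reads only that column's values."""
    ages = columns["age"]
    return sum(ages) / len(ages)


def fetch_row_column_store(columns, i):
    """Whole-row access has to visit every column to reassemble the record."""
    return {name: values[i] for name, values in columns.items()}
```

Both average functions return the same result; the difference is how much
data each layout has to touch to get there, which is why the scan vs. random
access and whole-row vs. partial-row questions matter.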


I don't know the answer to your question, but if your concern is
performance, I would be interested in seeing performance comparisons on
specific workloads.

Finally, a year ago Aerospike did quite poorly in a Jepsen test:
https://aphyr.com/posts/324-jepsen-aerospike

I wonder if they have addressed any of those issues.

Mike

On Friday, May 27, 2016, Benjamin Kim <bbuil...@gmail.com> wrote:

> I am just curious. How will Kudu compare with Aerospike (
> http://www.aerospike.com)? I went to a Spark Roadshow and found out about
> this piece of software. It appears to fit our use case perfectly since we
> are an ad-tech company trying to leverage our user profiles data. Plus, it
> already has a Spark connector and has a SQL-like client. The tables can be
> accessed using Spark SQL DataFrames and, also, made into SQL tables for
> direct use with Spark SQL ODBC/JDBC Thriftserver. I see from the work done
> here http://gerrit.cloudera.org:8080/#/c/2992/ that the Spark integration
> is well underway and, from the looks of it lately, almost complete. I would
> prefer to use Kudu since we are already a Cloudera shop, and Kudu is easy
> to deploy and configure using Cloudera Manager. I also hope that some of
> Aerospike’s speed optimization techniques can make it into Kudu in the
> future, if they have not already been thought of or included.
>
> Just some thoughts…
>
> Cheers,
> Ben



-- 
Mike Percy
Software Engineer, Cloudera
