subject:"Spark on HBase vs. Spark on HDFS"

Re: Spark on HBase vs. Spark on HDFS

2014-05-23 Thread Mayur Rustagi

Also, I am unsure whether Spark on HBase leverages locality. When you cache and process data, do you see node_local jobs in the process list? Spark on HDFS leverages locality quite well, which can boost performance by 3-4x in my experience. If you are loading all your data from HBase to Spark then you are

Spark on HBase vs. Spark on HDFS

2014-05-22 Thread Limbeck, Philip

Hi! We are currently using HBase as our primary data store for different event-like data. On top of that, we use Shark to aggregate this data and keep it in memory for fast data access. Since we use no specific HBase functionality whatsoever except putting data into it, a discussion came up on

Re: Spark on HBase vs. Spark on HDFS

2014-05-22 Thread Nick Pentreath

Hi, in my opinion, running HBase for immutable data is generally overkill, in particular if you are using Shark anyway to cache and analyse the data and provide the speed. HBase is designed for random-access data patterns and high-throughput R/W activities. If you are only ever writing immutable

3 matches
