Geert-Jan, we're currently working on a somewhat similar project: using Flume to ingest data into Riak CS for later processing with Hadoop. The limitations of running Hadoop against Riak CS through the s3:// or s3n:// URIs seem to revolve around renaming objects, which is carried out as a copy followed by a delete in Riak CS. If you can avoid renames, this combination should work fine.
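To illustrate the rename point: the S3 API has no rename call, so a client has to copy the object to the new key and then delete the old one, which means the cost grows with object size. The toy Python model below is only a sketch of that behaviour. `ToyObjectStore` and the key names are made up for illustration and are not part of Riak CS, Hadoop, or Flume.

```python
class ToyObjectStore:
    """In-memory stand-in for an S3-style store (hypothetical, not the Riak CS API)."""

    def __init__(self):
        self.objects = {}      # key -> bytes
        self.bytes_copied = 0  # running total of data moved by copy operations

    def put(self, key, data):
        self.objects[key] = bytes(data)

    def copy(self, src, dst):
        data = self.objects[src]
        self.objects[dst] = data
        self.bytes_copied += len(data)

    def delete(self, key):
        del self.objects[key]

    def rename(self, src, dst):
        # No native rename: copy the whole object, then delete the original.
        self.copy(src, dst)
        self.delete(src)


store = ToyObjectStore()
store.put("flume/_tmp/part-0000", b"x" * 1024)  # made-up key for illustration
store.rename("flume/_tmp/part-0000", "flume/part-0000")
print(sorted(store.objects), store.bytes_copied)
```

In this model every rename moves the full object once, so a job that writes to a temporary key and renames afterwards pays for each byte twice.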
Regarding how data is stored in Riak CS: data blocks are stored in Bitcask, and manifests are held in LevelDB. Riak CS is optimized for larger objects, and I believe smaller objects would not be handled nearly as efficiently as with plain Riak, if only because of the overhead Riak CS adds. The benefits of Riak generally carry over to Riak CS, so there shouldn't be any need to worry about losing raw power.

Respectfully,
Dan Kerrigan

On Tue, Jul 30, 2013 at 2:21 PM, gbrits <[email protected]> wrote:

> This may be totally missing the mark, but I've been reading up on ways to do
> fast iterative processing in Storm or Spark/Shark, with the ultimate goal of
> results ending up in Riak for fast multi-key retrieval.
>
> I want this setup to be as lean as possible for obvious reasons, so I've
> started to look more closely at the possible Riak CS / Spark combo.
>
> Apparently, please correct me if I'm wrong, Riak CS sits on top of Riak and is
> S3-API compliant. The underlying db for the objects is LevelDB (which would
> have been my choice anyway, because of the low in-memory key overhead).
> Apparently Bitcask is also used, although it's not clear to me what for exactly.
>
> At the same time, Spark (with Shark on top, which is what Hive is for Hadoop,
> if that in any way makes things clearer) can use HDFS or S3 as its so-called
> 'deep store'.
>
> Combining this, it seems Riak CS and Spark/Shark could be a nice, pretty
> tight combo, providing iterative and ad hoc querying through Shark plus all the
> excellent stuff of Riak through the S3 protocol, which they both speak.
>
> Is this correct?
> Would I lose any of the raw power of Riak when going with Riak CS? Has anyone
> ever tried this combo?
>
> Thanks,
> Geert-Jan
>
> --
> View this message in context:
> http://riak-users.197444.n3.nabble.com/combining-Riak-CS-and-Spark-shark-by-speaking-over-s3-protocol-tp4028621.html
> Sent from the Riak Users mailing list archive at Nabble.com.
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
