We use 2MB chunks for our CFS implementation of HDFS:
http://www.datastax.com/dev/blog/cassandra-file-system-design
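As a rough illustration of that fixed-size chunking, here is a minimal sketch; the class below, and the idea that each chunk then becomes its own column value, are assumptions for illustration rather than the actual CFS code.

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: split a file into fixed-size chunks (2 MB, the size
// CFS uses) so each chunk can be stored as a separate column value.
public class FileChunker {

    static final int CHUNK_SIZE = 2 * 1024 * 1024; // 2 MB

    // Returns the file content as a list of chunks, each at most CHUNK_SIZE bytes.
    static List<byte[]> chunk(String path) throws IOException {
        List<byte[]> chunks = new ArrayList<byte[]>();
        InputStream in = Files.newInputStream(Paths.get(path));
        try {
            byte[] buf = new byte[CHUNK_SIZE];
            int read;
            while ((read = fill(in, buf)) > 0) {
                chunks.add(Arrays.copyOf(buf, read));
            }
        } finally {
            in.close();
        }
        return chunks;
    }

    // Fills buf as far as the stream allows; returns bytes read, 0 at end of file.
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) break;
            off += n;
        }
        return off;
    }
}

Keeping chunks this small bounds how much of a file a single read has to pull into memory at once, which matters given the Thrift buffering point raised further down in the thread.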
On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter franc.car...@sirca.org.au wrote:
Hi,
We are in the early stages of thinking about a project that needs to store
data that will be ...
On Wed, Apr 4, 2012 at 8:56 AM, Jonathan Ellis jbel...@gmail.com wrote:
We use 2MB chunks for our CFS implementation of HDFS:
http://www.datastax.com/dev/blog/cassandra-file-system-design
thanks
On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter franc.car...@sirca.org.au wrote:
Hi,
We are in the early stages of thinking about a project that needs to store
data that will be accessed by Hadoop. One of the concerns we have is around
the latency of HDFS, as our use case is not for reading all the data, and
hence we will need custom RecordReaders etc.
I've seen a couple of ...
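As a rough sketch of the custom-RecordReader route mentioned above, this is the shape of a new-API (org.apache.hadoop.mapreduce) RecordReader; the class name and the placeholder record source are made up for illustration, not taken from any existing project.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Skeleton of a RecordReader that only surfaces the records a job actually
// needs, instead of scanning everything in the split.
public class SelectiveRecordReader extends RecordReader<LongWritable, Text> {

    private long pos;
    private long total;
    private final LongWritable key = new LongWritable();
    private final Text value = new Text();

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context)
            throws IOException, InterruptedException {
        // Open the underlying store here (e.g. a Cassandra connection) and seek
        // to the first record this split is responsible for.
        total = split.getLength(); // placeholder: one "record" per unit of length
        pos = 0;
    }

    @Override
    public boolean nextKeyValue() throws IOException, InterruptedException {
        // Fetch only the next record that matches the job's criteria;
        // return false once the split is exhausted.
        if (pos >= total) {
            return false;
        }
        key.set(pos);
        value.set("record-" + pos); // placeholder payload
        pos++;
        return true;
    }

    @Override
    public LongWritable getCurrentKey() {
        return key;
    }

    @Override
    public Text getCurrentValue() {
        return value;
    }

    @Override
    public float getProgress() {
        return total == 0 ? 1.0f : (float) pos / total;
    }

    @Override
    public void close() throws IOException {
        // Release whatever was opened in initialize().
    }
}

A matching InputFormat would hand this reader back from createRecordReader(); the selective logic lives in nextKeyValue(), which is where skipping unwanted data avoids paying the full-scan cost.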
On Tue, Apr 3, 2012 at 4:18 AM, Ben Coverston ben.covers...@datastax.com wrote:
This is a difficult question to answer for a variety of reasons, but I'll
give it a try; maybe it will be helpful, maybe not.
The most obvious problem with this is that Thrift is buffer-based, not
streaming. That means that whatever the size of your chunk, it needs to
be received, deserialized, and held in memory in its entirety.
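To make that buffering concrete, here is roughly what a single-chunk read looks like against the raw Thrift interface (Cassandra's Thrift port, 9160); the keyspace, column family, and keys below are hypothetical, and the only point is that get() returns the entire column value as one already-deserialized byte[].

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Illustration only: fetch one chunk and note that the whole value is
// buffered in memory before the client sees any of it.
public class ThriftChunkRead {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("cfs"); // hypothetical keyspace

        // Hypothetical layout: one row per file, one column per chunk.
        ColumnPath path = new ColumnPath("chunks");
        path.setColumn(ByteBuffer.wrap("chunk-0".getBytes("UTF-8")));
        ByteBuffer fileKey = ByteBuffer.wrap("myfile".getBytes("UTF-8"));

        ColumnOrSuperColumn result = client.get(fileKey, path, ConsistencyLevel.QUORUM);

        // By this point the full chunk has been received and deserialized; Thrift
        // offers no way to stream it piecewise, so the chunk size is effectively
        // the per-read memory cost on both the server and the client.
        byte[] chunk = result.getColumn().getValue();
        System.out.println("chunk bytes held in memory: " + chunk.length);

        transport.close();
    }
}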