On Wed, Jun 2, 2010 at 3:53 PM, Ben Browning ben...@gmail.com wrote:
There really aren't seed nodes in a Cassandra cluster. When you
specify a seed in a node's configuration it's just a way to let it
know how to find the other nodes in the cluster. A node functions the
same whether it is another node's seed or not. In other words, all of
the nodes in a cluster are peers.
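To make the point concrete, here is what the seed list looked like in the 0.6-era XML configuration (a hypothetical excerpt; the host addresses are placeholders). Every node can carry the same list, and being named as a seed changes nothing about a node's role:

```xml
<!-- hypothetical excerpt from conf/storage-conf.xml (Cassandra 0.6 era);
     host addresses are placeholders -->
<Seeds>
    <Seed>10.0.0.1</Seed>
    <Seed>10.0.0.2</Seed>
</Seeds>
```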
How many subcolumns are in each supercolumn and how large are the
values? Your example shows 8 subcolumns, but I didn't know if that was
the actual number. I've been able to read columns out of Cassandra at
an order of magnitude higher than what you're seeing here, but there
are too many variables to compare the two setups directly.
Martin,
On Wed, Jun 2, 2010 at 8:34 AM, Dr. Martin Grabmüller
martin.grabmuel...@eleven.de wrote:
I think you can specify an end key, but it should be a key that
actually exists in your column family.
Logically, it doesn't make sense to ever specify an end key with the
random partitioner. If you need key-range queries, you have to use an
order-preserving partitioner.
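To see why an end key is meaningless under a random partitioner, here is a toy sketch in plain Java (this is not Cassandra's actual code; the class and method names are mine). An MD5-based partitioner stores rows in token (hash) order, so a lexical start/end key range does not correspond to a contiguous range of stored rows:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;

// Toy illustration (not Cassandra code): a random partitioner orders rows
// by the hash of the key, not by the key itself.
public class RandomPartitionerDemo {

    // Token for a key: the MD5 digest interpreted as a non-negative integer.
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Returns the keys in storage (token) order.
    static List<String> tokenOrder(Collection<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.comparing(RandomPartitionerDemo::token));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> lexical = Arrays.asList("a", "b", "c", "d");
        // The storage order bears no relation to key order, which is why a
        // lexical end key selects an essentially arbitrary set of rows.
        System.out.println(tokenOrder(lexical));
    }
}
```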
I like to model this kind of data as columns, where the timestamps are
the column name (either longs, TimeUUIDs, or string depending on your
usage). If you have too much data for a single row, you'd need to have
multiple rows of these. For time-series data, it makes sense to use
one row per time period (for example, one row per day).
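The row-per-time-period idea can be sketched like this (a minimal illustration, not a Cassandra API; the key format and helper names are mine). The row key is "series:day" and each event becomes a column whose name is its timestamp, so columns sort chronologically and a column slice is a time-range query:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Sketch of time-series keys: one row per series per day keeps any single
// row bounded in size, and timestamp-named columns sort chronologically.
public class TimeSeriesKeys {
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    // Row key: "<series>:<yyyyMMdd>" in UTC.
    static String rowKey(String series, long epochMillis) {
        return series + ":" + DAY.format(Instant.ofEpochMilli(epochMillis));
    }

    // Column name: the timestamp itself (a long), so slicing a column
    // range slices a time range.
    static long columnName(long epochMillis) {
        return epochMillis;
    }

    public static void main(String[] args) {
        long t = 1275403980000L; // 2010-06-02T14:53:00Z
        System.out.println(rowKey("cpu-load", t) + " / " + columnName(t));
    }
}
```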
Maxim,
Check out the getLocation() method from this file:
http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java
Basically, it loops over the list of nodes containing this split of
data and, if any of them are the local node, it returns the local
node, so the Hadoop task reads from local disk instead of over the
network.
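That locality check boils down to something like the following rough sketch (simplified and renamed; see the linked ColumnFamilyRecordReader source for the real thing):

```java
import java.util.Arrays;
import java.util.List;

// Rough sketch of the locality check described above (names are mine):
// prefer a replica that is the local machine, otherwise fall back to the
// first replica for the split.
public class SplitLocation {

    // Pick the replica to read from.
    static String pickLocation(List<String> replicas, List<String> localAddresses) {
        for (String replica : replicas) {
            if (localAddresses.contains(replica)) {
                return replica; // a local replica: no network hop needed
            }
        }
        return replicas.get(0);
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList("10.0.0.7", "10.0.0.8");
        List<String> local = Arrays.asList("127.0.0.1", "10.0.0.8");
        System.out.println(pickLocation(replicas, local)); // prints 10.0.0.8
    }
}
```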
I like to base my batch sizes off of the total number of columns
instead of the number of rows. This effectively means counting the
number of Mutation objects in your mutation map and submitting the
batch once it reaches a certain size. For my data, batch sizes of
about 25,000 columns work best.
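Counting Mutation objects rather than rows can be sketched like this (helper and class names are mine; `Mutation` here is a stand-in for the Thrift `Mutation` class, and the nested map mirrors the row-key → column-family → mutations shape used by `batch_mutate`):

```java
import java.util.*;

// Sketch of batching by column count rather than row count: flush the
// batch once the total number of Mutation objects hits a threshold.
public class ColumnBatcher {
    static final int BATCH_COLUMNS = 25_000; // the sweet spot reported above

    static class Mutation { /* stand-in for org.apache.cassandra.thrift.Mutation */ }

    // Count every Mutation in the row -> column family -> mutations map.
    static int countColumns(Map<String, Map<String, List<Mutation>>> mutationMap) {
        int total = 0;
        for (Map<String, List<Mutation>> byCf : mutationMap.values()) {
            for (List<Mutation> mutations : byCf.values()) {
                total += mutations.size();
            }
        }
        return total;
    }

    // Submit the accumulated batch once it reaches the column threshold.
    static boolean shouldFlush(Map<String, Map<String, List<Mutation>>> mutationMap) {
        return countColumns(mutationMap) >= BATCH_COLUMNS;
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<Mutation>>> batch = new HashMap<>();
        batch.put("row1", Map.of("Standard1",
                Arrays.asList(new Mutation(), new Mutation())));
        System.out.println(countColumns(batch)); // 2 columns queued so far
    }
}
```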
On Tue, May 11, 2010 at 8:31 AM, David Boxenhorn da...@lookin2.com wrote:
Thanks a lot! 25,000 is a number I can work with.
Any other suggestions?
On Tue, May 11, 2010 at 3:21 PM, Ben Browning ben...@gmail.com wrote: