Re: Exactly one wide row per node for a given CF?

2013-12-11 Thread Aaron Morton
  Querying the table was fast. What I didn’t do was test the table under load, 
 nor did I try this in a multi-node cluster.
As the number of columns in a row increases, so does the size of the column
index, which is read as part of the read path.

For background and comparisons of latency, see
http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html or my talk
on performance at the SF summit last year:
http://thelastpickle.com/speaking/2012/08/08/Cassandra-Summit-SF.html
While the column index has been lifted to the -Index.db component, AFAIK it
must still be fully loaded.

Larger rows take longer to go through compaction, tend to cause more JVM GC, and
have issues during repair. See the in_memory_compaction_limit_in_mb comments in
the yaml file. During repair we detect differences in ranges of rows and stream
them between the nodes. If you have wide rows and a single column is out of
sync, we will create a new copy of that row on the node, which must then be
compacted. I’ve seen the load on nodes with very wide rows go down by 150GB
just by reducing the compaction settings.

IMHO, all things being equal, rows in the few 10’s of MB work better.

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 11/12/2013, at 2:41 am, Robert Wille rwi...@fold3.com wrote:

 I have a question about this statement:
 
 When rows get above a few 10’s of MB things can slow down, when they get 
 above 50 MB they can be a pain, when they get above 100MB it’s a warning 
 sign. And when they get above 1GB, well you don’t want to know what 
 happens then. 
 
 I tested a data model that I created. Here’s the schema for the table in 
 question:
 
 CREATE TABLE bdn_index_pub (
   tree INT,
   pord INT,
   hpath VARCHAR,
   PRIMARY KEY (tree, pord)
 );
 
 As a test, I inserted 100 million records. tree had the same value for every 
 record, and I had 100 million values for pord. hpath averaged about 50 
 characters in length. My understanding is that all 100 million strings would 
 have been stored in a single row, since they all had the same value for the 
 first component of the primary key. I didn’t look at the size of the table, 
 but it had to be several gigs (uncompressed). Contrary to what Aaron says, I 
 do want to know what happens, because I didn’t experience any issues with 
 this table during my test. Inserting was fast. The last batch of records 
 inserted in approximately the same amount of time as the first batch. 
 Querying the table was fast. What I didn’t do was test the table under load, 
 nor did I try this in a multi-node cluster.
 
 If this is bad, can somebody suggest a better pattern? This table was 
 designed to support a query like this: select hpath from bdn_index_pub where 
 tree = :tree and pord >= :start and pord <= :end. In my application, most 
 trees will have less than a million records. A handful will have 10’s of 
 millions, and one of them will have 100 million.
 
 If I need to break up my rows, my first instinct would be to divide each tree 
 into blocks of say 10,000 and change tree to a string that contains the tree 
 and the block number. Something like this:
 
 17:0, 0, ‘/’
 …
 17:0, , ‘/a/b/c’
 17:1, 1, ‘/a/b/d’
 …
 
 I’d then need to issue an extra query for ranges that crossed block 
 boundaries.
 
 Any suggestions on a better pattern?
 
 Thanks
 
 Robert
 
 From: Aaron Morton aa...@thelastpickle.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, December 10, 2013 at 12:33 AM
 To: Cassandra User user@cassandra.apache.org
 Subject: Re: Exactly one wide row per node for a given CF?
 
 But this becomes troublesome if I add or remove nodes. What effectively I 
 want is to partition on the unique id of the record modulus N (id % N; 
 where N is the number of nodes).
 This is exactly the problem consistent hashing (used by cassandra) is 
 designed to solve. If you hash the key and modulo the number of nodes, adding 
 and removing nodes requires a lot of data to move. 
 
 I want to be able to randomly distribute a large set of records but keep 
 them clustered in one wide row per node.
 Sounds like you should revisit your data modelling, this is a pretty well 
 known anti pattern. 
 
 When rows get above a few 10’s of MB things can slow down, when they get 
 above 50 MB they can be a pain, when they get above 100MB it’s a warning 
 sign. And when they get above 1GB, well you don’t want to know what 
 happens then. 
 
 It’s a bad idea and you should take another look at the data model. If you 
 have to do it, you can try the ByteOrderedPartitioner which uses the row key 
 as a token, giving you total control of the row placement. 
 
 Cheers
 
 
 -
 Aaron Morton
 New Zealand
 @aaronmorton
 
 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com
 
 On 4/12/2013, at 8:32 pm, Vivek Mishra

Re: Exactly one wide row per node for a given CF?

2013-12-10 Thread Robert Wille
I have a question about this statement:

When rows get above a few 10’s of MB things can slow down, when they get
above 50 MB they can be a pain, when they get above 100MB it’s a warning
sign. And when they get above 1GB, well you don’t want to know what
happens then. 

I tested a data model that I created. Here’s the schema for the table in
question:

CREATE TABLE bdn_index_pub (
  tree INT,
  pord INT,
  hpath VARCHAR,
  PRIMARY KEY (tree, pord)
);


As a test, I inserted 100 million records. tree had the same value for every
record, and I had 100 million values for pord. hpath averaged about 50
characters in length. My understanding is that all 100 million strings would
have been stored in a single row, since they all had the same value for the
first component of the primary key. I didn’t look at the size of the table,
but it had to be several gigs (uncompressed). Contrary to what Aaron says, I
do want to know what happens, because I didn’t experience any issues with
this table during my test. Inserting was fast. The last batch of records
inserted in approximately the same amount of time as the first batch.
Querying the table was fast. What I didn’t do was test the table under load,
nor did I try this in a multi-node cluster.

If this is bad, can somebody suggest a better pattern? This table was
designed to support a query like this: select hpath from bdn_index_pub where
tree = :tree and pord >= :start and pord <= :end. In my application, most
trees will have less than a million records. A handful will have 10’s of
millions, and one of them will have 100 million.

If I need to break up my rows, my first instinct would be to divide each
tree into blocks of say 10,000 and change tree to a string that contains the
tree and the block number. Something like this:

17:0, 0, ‘/’
…
17:0, , ‘/a/b/c’
17:1, 1, ‘/a/b/d’
…

I’d then need to issue an extra query for ranges that crossed block
boundaries.
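
For illustration, one way that blocking idea could look in CQL, assuming the
block is kept as a separate column rather than concatenated into a string, and
is computed client-side as block = pord / 10000 (table name and numbers below
are illustrative only):

CREATE TABLE bdn_index_pub_blocked (
  tree INT,
  block INT,    -- pord / 10000, computed by the application
  pord INT,
  hpath VARCHAR,
  PRIMARY KEY ((tree, block), pord)
);

-- a range such as pord 9995..10005 crosses the block 0/1 boundary,
-- so the client issues one query per block and concatenates the results:
SELECT hpath FROM bdn_index_pub_blocked
WHERE tree = 17 AND block = 0 AND pord >= 9995 AND pord <= 9999;
SELECT hpath FROM bdn_index_pub_blocked
WHERE tree = 17 AND block = 1 AND pord >= 10000 AND pord <= 10005;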

Any suggestions on a better pattern?

Thanks

Robert

From:  Aaron Morton aa...@thelastpickle.com
Reply-To:  user@cassandra.apache.org
Date:  Tuesday, December 10, 2013 at 12:33 AM
To:  Cassandra User user@cassandra.apache.org
Subject:  Re: Exactly one wide row per node for a given CF?

 But this becomes troublesome if I add or remove nodes. What effectively I
 want is to partition on the unique id of the record modulus N (id % N; where
 N is the number of nodes).
This is exactly the problem consistent hashing (used by cassandra) is
designed to solve. If you hash the key and modulo the number of nodes,
adding and removing nodes requires a lot of data to move.

 I want to be able to randomly distribute a large set of records but keep them
 clustered in one wide row per node.
Sounds like you should revisit your data modelling, this is a pretty well
known anti pattern.

When rows get above a few 10’s of MB things can slow down, when they get
above 50 MB they can be a pain, when they get above 100MB it’s a warning
sign. And when they get above 1GB, well you don’t want to know what
happens then. 

It’s a bad idea and you should take another look at the data model. If you
have to do it, you can try the ByteOrderedPartitioner which uses the row key
as a token, giving you total control of the row placement.

Cheers


-
Aaron Morton
New Zealand
@aaronmorton

 Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 4/12/2013, at 8:32 pm, Vivek Mishra mishra.v...@gmail.com wrote:

 So basically you want to create a cluster of multiple unique keys, but data
 which belongs to one unique key should be colocated, correct?
 
 -Vivek
 
 
 On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com
 wrote:
 Subject says it all. I want to be able to randomly distribute a large set of
 records but keep them clustered in one wide row per node.
 
 As an example, let’s say I’ve got a collection of about 1 million records each
 with a unique id. If I just go ahead and set the primary key (and therefore
 the partition key) as the unique id, I’ll get very good random distribution
 across my server cluster. However, each record will be its own row. I’d like
 to have each record belong to one large wide row (per server node) so I can
 have them sorted or clustered on some other column.
 
 If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 5
 at the time of creation and have the partition key set to this value. But
 this becomes troublesome if I add or remove nodes. What effectively I want is
 to partition on the unique id of the record modulus N (id % N; where N is the
 number of nodes).
 
 I have to imagine there’s a mechanism in Cassandra to simply randomize the 
 partitioning without even using a key (and then clustering on some column).
 
 Thanks for any help.
 





Re: Exactly one wide row per node for a given CF?

2013-12-10 Thread onlinespending
comments below

On Dec 9, 2013, at 11:33 PM, Aaron Morton aa...@thelastpickle.com wrote:

 But this becomes troublesome if I add or remove nodes. What effectively I 
 want is to partition on the unique id of the record modulus N (id % N; where 
 N is the number of nodes).
 This is exactly the problem consistent hashing (used by cassandra) is 
 designed to solve. If you hash the key and modulo the number of nodes, adding 
 and removing nodes requires a lot of data to move. 
 
 I want to be able to randomly distribute a large set of records but keep 
 them clustered in one wide row per node.
 Sounds like you should revisit your data modelling, this is a pretty well 
 known anti pattern. 
 
 When rows get above a few 10’s of MB things can slow down, when they get 
 above 50 MB they can be a pain, when they get above 100MB it’s a warning 
 sign. And when they get above 1GB, well you don’t want to know what 
 happens then. 
 
 It’s a bad idea and you should take another look at the data model. If you 
 have to do it, you can try the ByteOrderedPartitioner which uses the row key 
 as a token, giving you total control of the row placement.


You should re-read my last paragraph as an example that would clearly benefit
from such an approach. If one understands how paging works, then you’ll see how
you’d benefit from grouping probabilistically similar data within each node,
while also wanting to split data across nodes to reduce hot spotting.

Regardless, I no longer think it’s necessary to have a single wide row per 
node. Several wide rows per node are just as good, since for all practical 
purposes paging in the first N key/values per M rows on a node is the same as 
reading in the first N*M key/values from a single row.

So I’m going to do what I alluded to before. Treat the LSB of a record’s id as 
the partition key, and then cluster on something meaningful (such as geohash) 
and the prefix of the id.

create table test (
  id_prefix int,
  id_suffix int,
  geohash text,
  value text,
  primary key (id_suffix, geohash, id_prefix)
);

So if this was for a collection of users, they would be randomly distributed 
across nodes to increase parallelism and reduce hotspots, but within each wide 
row they'd be meaningfully clustered by geographic region, so as to increase 
the probability that paged in data is going to contain more of the working set.
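
A rough usage sketch against that table (the values are made up; the
application would derive id_suffix as id % 256 and id_prefix as id / 256
before reading or writing):

-- write: id 123456 splits into suffix 123456 % 256 = 64 and prefix 123456 / 256 = 482
INSERT INTO test (id_suffix, geohash, id_prefix, value)
VALUES (64, '9q8yy', 482, 'some user record');

-- read a geographic slice within one randomly distributed partition
SELECT geohash, id_prefix, value
FROM test
WHERE id_suffix = 64 AND geohash >= '9q8' AND geohash < '9q9';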



 
 Cheers
 
 
 -
 Aaron Morton
 New Zealand
 @aaronmorton
 
 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com
 
 On 4/12/2013, at 8:32 pm, Vivek Mishra mishra.v...@gmail.com wrote:
 
 So basically you want to create a cluster of multiple unique keys, but data 
 which belongs to one unique key should be colocated, correct?
 
 -Vivek
 
 
 On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com 
 wrote:
 Subject says it all. I want to be able to randomly distribute a large set of 
 records but keep them clustered in one wide row per node.
 
 As an example, let’s say I’ve got a collection of about 1 million records 
 each with a unique id. If I just go ahead and set the primary key (and 
 therefore the partition key) as the unique id, I’ll get very good random 
 distribution across my server cluster. However, each record will be its own 
 row. I’d like to have each record belong to one large wide row (per server 
 node) so I can have them sorted or clustered on some other column.
 
 If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 
 5 at the time of creation and have the partition key set to this value. But 
 this becomes troublesome if I add or remove nodes. What effectively I want 
 is to partition on the unique id of the record modulus N (id % N; where N is 
 the number of nodes).
 
 I have to imagine there’s a mechanism in Cassandra to simply randomize the 
 partitioning without even using a key (and then clustering on some column).
 
 Thanks for any help.
 
 



Re: Exactly one wide row per node for a given CF?

2013-12-10 Thread onlinespending
Where you’ll run into trouble is with compaction.  It looks as if pord is some 
sequentially increasing value.  Try your experiment again with a clustering key 
which is a bit more random at the time of insertion.


On Dec 10, 2013, at 5:41 AM, Robert Wille rwi...@fold3.com wrote:

 I have a question about this statement:
 
 When rows get above a few 10’s of MB things can slow down, when they get 
 above 50 MB they can be a pain, when they get above 100MB it’s a warning 
 sign. And when they get above 1GB, well you don’t want to know what 
 happens then. 
 
 I tested a data model that I created. Here’s the schema for the table in 
 question:
 
 CREATE TABLE bdn_index_pub (
   tree INT,
   pord INT,
   hpath VARCHAR,
   PRIMARY KEY (tree, pord)
 );
 
 As a test, I inserted 100 million records. tree had the same value for every 
 record, and I had 100 million values for pord. hpath averaged about 50 
 characters in length. My understanding is that all 100 million strings would 
 have been stored in a single row, since they all had the same value for the 
 first component of the primary key. I didn’t look at the size of the table, 
 but it had to be several gigs (uncompressed). Contrary to what Aaron says, I 
 do want to know what happens, because I didn’t experience any issues with 
 this table during my test. Inserting was fast. The last batch of records 
 inserted in approximately the same amount of time as the first batch. 
 Querying the table was fast. What I didn’t do was test the table under load, 
 nor did I try this in a multi-node cluster.
 
 If this is bad, can somebody suggest a better pattern? This table was 
 designed to support a query like this: select hpath from bdn_index_pub where 
 tree = :tree and pord >= :start and pord <= :end. In my application, most 
 trees will have less than a million records. A handful will have 10’s of 
 millions, and one of them will have 100 million.
 
 If I need to break up my rows, my first instinct would be to divide each tree 
 into blocks of say 10,000 and change tree to a string that contains the tree 
 and the block number. Something like this:
 
 17:0, 0, ‘/’
 …
 17:0, , ‘/a/b/c’
 17:1, 1, ‘/a/b/d’
 …
 
 I’d then need to issue an extra query for ranges that crossed block 
 boundaries.
 
 Any suggestions on a better pattern?
 
 Thanks
 
 Robert
 
 From: Aaron Morton aa...@thelastpickle.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, December 10, 2013 at 12:33 AM
 To: Cassandra User user@cassandra.apache.org
 Subject: Re: Exactly one wide row per node for a given CF?
 
 But this becomes troublesome if I add or remove nodes. What effectively I 
 want is to partition on the unique id of the record modulus N (id % N; 
 where N is the number of nodes).
 This is exactly the problem consistent hashing (used by cassandra) is 
 designed to solve. If you hash the key and modulo the number of nodes, adding 
 and removing nodes requires a lot of data to move. 
 
 I want to be able to randomly distribute a large set of records but keep 
 them clustered in one wide row per node.
 Sounds like you should revisit your data modelling, this is a pretty well 
 known anti pattern. 
 
 When rows get above a few 10’s of MB things can slow down, when they get 
 above 50 MB they can be a pain, when they get above 100MB it’s a warning 
 sign. And when they get above 1GB, well you don’t want to know what 
 happens then. 
 
 It’s a bad idea and you should take another look at the data model. If you 
 have to do it, you can try the ByteOrderedPartitioner which uses the row key 
 as a token, giving you total control of the row placement. 
 
 Cheers
 
 
 -
 Aaron Morton
 New Zealand
 @aaronmorton
 
 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com
 
 On 4/12/2013, at 8:32 pm, Vivek Mishra mishra.v...@gmail.com wrote:
 
 So basically you want to create a cluster of multiple unique keys, but data 
 which belongs to one unique key should be colocated, correct?
 
 -Vivek
 
 
 On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com 
 wrote:
 Subject says it all. I want to be able to randomly distribute a large set 
 of records but keep them clustered in one wide row per node.
 
 As an example, let’s say I’ve got a collection of about 1 million records 
 each with a unique id. If I just go ahead and set the primary key (and 
 therefore the partition key) as the unique id, I’ll get very good random 
 distribution across my server cluster. However, each record will be its own 
 row. I’d like to have each record belong to one large wide row (per server 
 node) so I can have them sorted or clustered on some other column.
 
 If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 
 5 at the time of creation and have the partition key set to this value. But 
 this becomes troublesome if I add or remove nodes. What effectively I want 
 is to partition on the unique id

Re: Exactly one wide row per node for a given CF?

2013-12-10 Thread Laing, Michael
You could shard your rows like the following.

You would need over 100 shards, possibly... so testing is in order :)

Michael

-- put this in a file and run using 'cqlsh -f file'

DROP KEYSPACE robert_test;

CREATE KEYSPACE robert_test WITH replication = {
'class': 'SimpleStrategy',
'replication_factor' : 1
};

USE robert_test;

CREATE TABLE bdn_index_pub (
tree int,
shard int,
pord int,
hpath text,
PRIMARY KEY ((tree, shard), pord)
);

-- shard is calculated as pord % 12

COPY bdn_index_pub (tree, shard, pord, hpath) FROM STDIN;
1, 1, 1, Chicago
5, 3, 15, New York
1, 5, 5, Melbourne
3, 2, 2, San Francisco
1, 3, 3, Palo Alto
\.

SELECT * FROM bdn_index_pub
WHERE shard IN (0,1,2,3,4,5,6,7,8,9,10,11)
AND tree = 1
AND pord < 4
AND pord > 0
ORDER BY pord desc
;

-- returns:

--  tree | shard | pord | hpath
-- ------+-------+------+-----------
--     1 |     3 |    3 | Palo Alto
--     1 |     1 |    1 | Chicago
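
As an aside, a point lookup can compute the shard client-side rather than
fanning out with IN; e.g. pord 15 falls in shard 15 % 12 = 3, so with the
sample data above:

SELECT hpath FROM bdn_index_pub
WHERE tree = 5 AND shard = 3 AND pord = 15;

-- returns: New York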



On Tue, Dec 10, 2013 at 8:41 AM, Robert Wille rwi...@fold3.com wrote:

 I have a question about this statement:

 When rows get above a few 10’s of MB things can slow down, when they get
 above 50 MB they can be a pain, when they get above 100MB it’s a warning
 sign. And when they get above 1GB, well you don’t want to know what
 happens then.

 I tested a data model that I created. Here’s the schema for the table in
 question:

 CREATE TABLE bdn_index_pub (
   tree INT,
   pord INT,
   hpath VARCHAR,
   PRIMARY KEY (tree, pord)
 );

 As a test, I inserted 100 million records. tree had the same value for
 every record, and I had 100 million values for pord. hpath averaged about
 50 characters in length. My understanding is that all 100 million strings
 would have been stored in a single row, since they all had the same value
 for the first component of the primary key. I didn’t look at the size of
 the table, but it had to be several gigs (uncompressed). Contrary to what
 Aaron says, I do want to know what happens, because I didn’t experience any
 issues with this table during my test. Inserting was fast. The last batch
 of records inserted in approximately the same amount of time as the first
 batch. Querying the table was fast. What I didn’t do was test the table
 under load, nor did I try this in a multi-node cluster.

 If this is bad, can somebody suggest a better pattern? This table was
 designed to support a query like this: select hpath from bdn_index_pub
 where tree = :tree and pord >= :start and pord <= :end. In my application,
 most trees will have less than a million records. A handful will have 10’s
 of millions, and one of them will have 100 million.

 If I need to break up my rows, my first instinct would be to divide each
 tree into blocks of say 10,000 and change tree to a string that contains
 the tree and the block number. Something like this:

 17:0, 0, ‘/’
 …
 17:0, , ‘/a/b/c’
 17:1, 1, ‘/a/b/d’
 …

 I’d then need to issue an extra query for ranges that crossed block
 boundaries.

 Any suggestions on a better pattern?

 Thanks

 Robert

 From: Aaron Morton aa...@thelastpickle.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, December 10, 2013 at 12:33 AM
 To: Cassandra User user@cassandra.apache.org
 Subject: Re: Exactly one wide row per node for a given CF?

 But this becomes troublesome if I add or remove nodes. What effectively I
 want is to partition on the unique id of the record modulus N (id % N;
 where N is the number of nodes).

 This is exactly the problem consistent hashing (used by cassandra) is
 designed to solve. If you hash the key and modulo the number of nodes,
 adding and removing nodes requires a lot of data to move.

 I want to be able to randomly distribute a large set of records but keep
 them clustered in one wide row per node.

 Sounds like you should revisit your data modelling, this is a pretty well
 known anti pattern.

 When rows get above a few 10’s of MB things can slow down, when they get
 above 50 MB they can be a pain, when they get above 100MB it’s a warning
 sign. And when they get above 1GB, well you don’t want to know what
 happens then.

 It’s a bad idea and you should take another look at the data model. If you
 have to do it, you can try the ByteOrderedPartitioner which uses the row
 key as a token, giving you total control of the row placement.

 Cheers


 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 4/12/2013, at 8:32 pm, Vivek Mishra mishra.v...@gmail.com wrote:

 So basically you want to create a cluster of multiple unique keys, but
 data which belongs to one unique key should be colocated, correct?

 -Vivek


 On Tue, Dec 3, 2013 at 10:39 AM, onlinespending 
 onlinespend...@gmail.com wrote:

 Subject says it all. I want to be able to randomly distribute a large set
 of records but keep them clustered in one wide row per node.

 As an example, let’s say I’ve got a collection

Re: Exactly one wide row per node for a given CF?

2013-12-09 Thread Aaron Morton
 But this becomes troublesome if I add or remove nodes. What effectively I 
 want is to partition on the unique id of the record modulus N (id % N; where 
 N is the number of nodes).
This is exactly the problem consistent hashing (used by cassandra) is designed 
to solve. If you hash the key and modulo the number of nodes, adding and 
removing nodes requires a lot of data to move. 

 I want to be able to randomly distribute a large set of records but keep them 
 clustered in one wide row per node.
Sounds like you should revisit your data modelling, this is a pretty well known 
anti pattern. 

When rows get above a few 10’s of MB things can slow down, when they get above 
50 MB they can be a pain, when they get above 100MB it’s a warning sign. And 
when they get above 1GB, well you don’t want to know what happens then. 

It’s a bad idea and you should take another look at the data model. If you have 
to do it, you can try the ByteOrderedPartitioner which uses the row key as a 
token, giving you total control of the row placement. 

Cheers


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 4/12/2013, at 8:32 pm, Vivek Mishra mishra.v...@gmail.com wrote:

 So basically you want to create a cluster of multiple unique keys, but data 
 which belongs to one unique key should be colocated, correct?
 
 -Vivek
 
 
 On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com 
 wrote:
 Subject says it all. I want to be able to randomly distribute a large set of 
 records but keep them clustered in one wide row per node.
 
 As an example, let’s say I’ve got a collection of about 1 million records each 
 with a unique id. If I just go ahead and set the primary key (and therefore 
 the partition key) as the unique id, I’ll get very good random distribution 
 across my server cluster. However, each record will be its own row. I’d like 
 to have each record belong to one large wide row (per server node) so I can 
 have them sorted or clustered on some other column.
 
 If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 5 
 at the time of creation and have the partition key set to this value. But 
 this becomes troublesome if I add or remove nodes. What effectively I want is 
 to partition on the unique id of the record modulus N (id % N; where N is the 
 number of nodes).
 
 I have to imagine there’s a mechanism in Cassandra to simply randomize the 
 partitioning without even using a key (and then clustering on some column).
 
 Thanks for any help.
 



Re: Exactly one wide row per node for a given CF?

2013-12-09 Thread Aaron Morton
 Basically this desire all stems from wanting efficient use of memory. 
Do you have any real latency numbers you are trying to tune?

Otherwise this sounds a little like premature optimisation.

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 5/12/2013, at 6:16 am, onlinespending onlinespend...@gmail.com wrote:

 Pretty much yes. Although I think it’d be nice if Cassandra handled such a 
 case, I’ve resigned to the fact that it cannot at the moment. The workaround 
 will be to partition on the LSB portion of the id (giving 256 rows spread 
 amongst my nodes) which allows room for scaling, and then cluster each row on 
 geohash or something else.
 
 Basically this desire all stems from wanting efficient use of memory. 
 Frequently accessed keys’ values are kept in RAM through the OS page cache. 
 But the page size is 4KB. This is a problem if you are accessing several 
 small records of data (say 200 bytes), since each record only occupies a 
 small % of a page. This is why it’s important to increase the probability 
 that neighboring data on the disk is relevant. Worst case would be to read in 
 a full 4KB page into RAM, of which you’re only accessing one record that’s a 
 couple hundred bytes. All of the other unused data of the page is wastefully 
 occupying RAM. Now project this problem to a collection of millions of small 
 records all indiscriminately and randomly scattered on the disk, and you can 
 easily see how inefficient your memory usage will become.
 
 That’s why it’s best to cluster data in some meaningful way, all in an effort 
 to increase the probability that when one record is accessed in that 4KB 
 block, its neighboring records will also be accessed. This brings me back 
 to the question of this thread. I want to randomly distribute the data 
 amongst the nodes to avoid hot spotting, but within each node I want to 
 cluster the data meaningfully such that the probability that neighboring data 
 is relevant is increased.
 
 An example of this would be having a huge collection of small records that 
 store basic user information. If you partition on the unique user id, then 
 you’ll get nice random distribution but with no ability to cluster (each 
 record would occupy its own row). You could partition on say geographical 
 region, but then you’ll end up with hot spotting when one region is more 
 active than another. So ideally you’d like to randomly assign a node to each 
 record to increase parallelism, but then cluster all records on a node by say 
 geohash since it is more likely (however small that may be) that when one 
 user from a geographical region is accessed other users from the same region 
 will also need to be accessed. It’s certainly better than having some random 
 user record next to the one you are accessing at the moment.
 
 
 
 
 On Dec 3, 2013, at 11:32 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 
 So basically you want to create a cluster of multiple unique keys, but data 
 which belongs to one unique key should be colocated, correct?
 
 -Vivek
 
 
 On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com 
 wrote:
 Subject says it all. I want to be able to randomly distribute a large set of 
 records but keep them clustered in one wide row per node.
 
 As an example, let’s say I’ve got a collection of about 1 million records 
 each with a unique id. If I just go ahead and set the primary key (and 
 therefore the partition key) as the unique id, I’ll get very good random 
 distribution across my server cluster. However, each record will be its own 
 row. I’d like to have each record belong to one large wide row (per server 
 node) so I can have them sorted or clustered on some other column.
 
 If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 
 5 at the time of creation and have the partition key set to this value. But 
 this becomes troublesome if I add or remove nodes. What effectively I want 
 is to partition on the unique id of the record modulus N (id % N; where N is 
 the number of nodes).
 
 I have to imagine there’s a mechanism in Cassandra to simply randomize the 
 partitioning without even using a key (and then clustering on some column).
 
 Thanks for any help.
 
 



Re: Exactly one wide row per node for a given CF?

2013-12-04 Thread onlinespending
Pretty much yes. Although I think it’d be nice if Cassandra handled such a 
case, I’ve resigned to the fact that it cannot at the moment. The workaround 
will be to partition on the LSB portion of the id (giving 256 rows spread 
amongst my nodes) which allows room for scaling, and then cluster each row on 
geohash or something else.

Basically this desire all stems from wanting efficient use of memory. 
Frequently accessed keys’ values are kept in RAM through the OS page cache. But 
the page size is 4KB. This is a problem if you are accessing several small 
records of data (say 200 bytes), since each record only occupies a small % of a 
page. This is why it’s important to increase the probability that neighboring 
data on the disk is relevant. Worst case would be to read in a full 4KB page 
into RAM, of which you’re only accessing one record that’s a couple hundred 
bytes. All of the other unused data of the page is wastefully occupying RAM. 
Now project this problem to a collection of millions of small records all 
indiscriminately and randomly scattered on the disk, and you can easily see how 
inefficient your memory usage will become.

That’s why it’s best to cluster data in some meaningful way, all in an effort 
to increase the probability that when one record is accessed in that 4KB 
block, its neighboring records will also be accessed. This brings me back 
to the question of this thread. I want to randomly distribute the data amongst 
the nodes to avoid hot spotting, but within each node I want to cluster the 
data meaningfully such that the probability that neighboring data is relevant 
is increased.

An example of this would be having a huge collection of small records that 
store basic user information. If you partition on the unique user id, then 
you’ll get nice random distribution but with no ability to cluster (each record 
would occupy its own row). You could partition on say geographical region, but 
then you’ll end up with hot spotting when one region is more active than 
another. So ideally you’d like to randomly assign a node to each record to 
increase parallelism, but then cluster all records on a node by say geohash 
since it is more likely (however small that may be) that when one user from a 
geographical region is accessed other users from the same region will also need 
to be accessed. It’s certainly better than having some random user record next 
to the one you are accessing at the moment.
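
A minimal sketch of that kind of layout, assuming a 256-way bucket derived
from the user id (bucket = id % 256, computed by the application) and
clustering by geohash; table and column names here are illustrative only:

CREATE TABLE users_by_region (
  bucket int,
  geohash text,
  user_id bigint,
  name text,
  PRIMARY KEY (bucket, geohash, user_id)
);

Reads for users in one geographic area would then touch a contiguous
clustering range inside each of the 256 partitions rather than rows scattered
randomly across the disk.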




On Dec 3, 2013, at 11:32 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 So basically you want to create a cluster of multiple unique keys, but data 
 which belongs to one unique key should be colocated, correct?
 
 -Vivek
 
 
 On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com 
 wrote:
 Subject says it all. I want to be able to randomly distribute a large set of 
 records but keep them clustered in one wide row per node.
 
 As an example, let’s say I’ve got a collection of about 1 million records each 
 with a unique id. If I just go ahead and set the primary key (and therefore 
 the partition key) as the unique id, I’ll get very good random distribution 
 across my server cluster. However, each record will be its own row. I’d like 
 to have each record belong to one large wide row (per server node) so I can 
 have them sorted or clustered on some other column.
 
 If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 5 
 at the time of creation and have the partition key set to this value. But 
 this becomes troublesome if I add or remove nodes. What effectively I want is 
 to partition on the unique id of the record modulus N (id % N; where N is the 
 number of nodes).
 
 I have to imagine there’s a mechanism in Cassandra to simply randomize the 
 partitioning without even using a key (and then clustering on some column).
 
 Thanks for any help.
 



Re: Exactly one wide row per node for a given CF?

2013-12-03 Thread Vivek Mishra
So basically you want to create a cluster of multiple unique keys, but data
which belongs to one unique key should be colocated, correct?

-Vivek


On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com wrote:

 Subject says it all. I want to be able to randomly distribute a large set
 of records but keep them clustered in one wide row per node.

 As an example, let’s say I’ve got a collection of about 1 million records
 each with a unique id. If I just go ahead and set the primary key (and
 therefore the partition key) as the unique id, I’ll get very good random
 distribution across my server cluster. However, each record will be its own
 row. I’d like to have each record belong to one large wide row (per server
 node) so I can have them sorted or clustered on some other column.

 If I say have 5 nodes in my cluster, I could randomly assign a value of 1
 - 5 at the time of creation and have the partition key set to this value.
 But this becomes troublesome if I add or remove nodes. What effectively I
 want is to partition on the unique id of the record modulus N (id % N;
 where N is the number of nodes).

 I have to imagine there’s a mechanism in Cassandra to simply randomize the
 partitioning without even using a key (and then clustering on some column).

 Thanks for any help.