Re: new nodetool ring output and unbalanced ring?

2012-09-10 Thread Guy Incognito

out of interest, why -100 and not -1 or + 1?  any particular reason?

On 06/09/2012 19:17, Tyler Hobbs wrote:
To minimize the impact on the cluster, I would bootstrap a new 1d node 
at (42535295865117307932921825928971026432 - 100), then decommission 
the 1c node at 42535295865117307932921825928971026432 and run cleanup 
on your us-east nodes.
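For reference, evenly spaced RandomPartitioner tokens for the 4 us-east nodes can be computed directly; a quick plain-Python sketch (the -100 offset is just the one discussed in this thread):

# evenly spaced RandomPartitioner tokens for a 4-node DC (range is 0 .. 2**127)
num_nodes = 4
tokens = [i * (2 ** 127) // num_nodes for i in range(num_nodes)]
# -> [0, 42535295865117307932921825928971026432,
#     85070591730234615865843651857942052864,
#     127605887595351923798765477786913079296]

# the replacement 1d node bootstraps just below the 1c node it replaces
new_token = 42535295865117307932921825928971026432 - 100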


On Thu, Sep 6, 2012 at 1:11 PM, William Oberman
ober...@civicscience.com wrote:


Didn't notice the racks!  Of course

If I change a 1c to a 1d, what would I have to do to make sure
data shuffles around correctly?  Repair everywhere?

will

On Thu, Sep 6, 2012 at 2:09 PM, Tyler Hobbs ty...@datastax.com wrote:

The main issue is that one of your us-east nodes is in rack
1d, while the rest are in rack 1c.  With NTS and multiple
racks, Cassandra will try to use one node from each rack as a
replica for a range until it either meets the RF for the DC,
or runs out of racks, in which case it just picks nodes
sequentially going clockwise around the ring (starting from
the range being considered, not the last node that was chosen
as a replica).

To fix this, you'll either need to make the 1d node a 1c node,
or make 42535295865117307932921825928971026432 a 1d node so
that you're alternating racks within that DC.
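To make the selection order concrete, here is a rough sketch of that walk in plain Python (an illustration of the behaviour described above, not Cassandra's actual code; the node names are made up):

def nts_replicas(ring, start, rf):
    """ring: list of (node, rack) in token order; walk clockwise from the
    range being considered, preferring racks not yet used."""
    order = [ring[(start + i) % len(ring)] for i in range(len(ring))]
    replicas, racks_seen = [], set()
    for node, rack in order:                  # first pass: one node per rack
        if len(replicas) == rf:
            break
        if rack not in racks_seen:
            replicas.append(node)
            racks_seen.add(rack)
    for node, rack in order:                  # out of racks: fill clockwise
        if len(replicas) == rf:
            break
        if node not in replicas:
            replicas.append(node)
    return replicas

# the four us-east nodes from the ring output quoted below, in token order
us_east = [('n1', '1c'), ('n2', '1c'), ('n3', '1c'), ('n4', '1d')]
for i, (node, rack) in enumerate(us_east):
    print(node, nts_replicas(us_east, i, rf=3))
# n1 ['n1', 'n4', 'n2'], n2 ['n2', 'n4', 'n3'], n3 ['n3', 'n4', 'n1'],
# n4 ['n4', 'n1', 'n2'] -- the lone 1d node is a replica for every range,
# which is roughly the 75/75/50/100 effective ownership seen below.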


On Thu, Sep 6, 2012 at 12:54 PM, William Oberman
ober...@civicscience.com wrote:

Hi,

I recently upgraded from 0.8.x to 1.1.x (through 1.0
briefly) and nodetool -ring seems to have changed from
owns to effectively owns.  Effectively owns seems to
account for replication factor (RF).  I'm ok with all of
this, yet I still can't figure out what's up with my
cluster.  I have a NetworkTopologyStrategy with two data
centers (DCs) with RF/number nodes in DC combinations of:
DC Name, RF, # in DC
analytics, 1, 2
us-east, 3, 4
So I'd expect 50% on each analytics node, and 75% for each
us-east node.  Instead, I have two nodes in us-east with
50/100??? (the other two are 75/75 as expected).

Here is the output of nodetool (all nodes report the same
thing):
Address   DC         Rack  Status  State   Load       Effective-Ownership  Token
                                                                            127605887595351923798765477786913079296
x.x.x.x   us-east    1c    Up      Normal  94.57 GB   75.00%               0
x.x.x.x   analytics  1c    Up      Normal  60.64 GB   50.00%               1
x.x.x.x   us-east    1c    Up      Normal  131.76 GB  75.00%               42535295865117307932921825928971026432
x.x.x.x   us-east    1c    Up      Normal  43.45 GB   50.00%               85070591730234615865843651857942052864
x.x.x.x   analytics  1d    Up      Normal  60.88 GB   50.00%               85070591730234615865843651857942052865
x.x.x.x   us-east    1d    Up      Normal  98.56 GB   100.00%              127605887595351923798765477786913079296

If I use cassandra-cli to do show keyspaces; I get (and
again, all nodes report the same thing):
Keyspace: civicscience:
  Replication Strategy:
org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [analytics:1, us-east:3]
I removed the output about all of my column families
(CFs), hopefully that doesn't matter.

Did I compute the tokens wrong?  Is there a combination of
nodetool commands I can run to migrate the data around to
rebalance to 75/75/75/75?  I routinely run repair already.
 And as the release notes required, I ran upgradesstables
during the upgrade process.

Before the upgrade, I was getting analytics = 0%, and
us-east = 25% on each node, which I expected for owns.

will




-- 
Tyler Hobbs

DataStax http://datastax.com/




-- 
Will Oberman

Civic Science, Inc.
3030 Penn Avenue, First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com




--
Tyler Hobbs
DataStax http://datastax.com/





Re: Data Modeling- another question

2012-08-28 Thread Guy Incognito
i would respectfully disagree; what you have said is true, but it really
depends on the use case.


1) do you expect to be doing updates to individual fields of an item, or
will you always update all fields at once?  if you are doing separate
updates then the first is definitely easier for handling updates.
2) do you expect to do paging of the list?  this will be easier with the
json approach, as in the first your item may span across a page boundary
- not an insurmountable problem by any means, but more complicated
nonetheless.  this is obviously not an issue if all your items have the
same number of fields.
3) do you expect to read or delete multiple items individually? you may 
have to do multiple reads/deletes of a row if the items are not adjacent 
to each other as you cannot do 'disjoint' slices of columns at the 
moment.  with the json approach you can just specify individual columns 
and you're done.  again this is less of an issue if items have a known 
set of fields, but your list of columns to read/delete may get quite 
large fairly quickly


the first is definitely better if you want to update individual fields, 
read-then-write is not a good idea in cassandra.  but it is more 
complicated for most usage scenarios, so you have to work out if you 
really need the extra flexibility.
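as a concrete sketch of point 1: with composite columns a single-field update is one column insert, no read needed.  pycassa-style, where the keyspace, CF name and a CompositeType(UTF8Type, UTF8Type) comparator are just assumptions for illustration:

import pycassa

pool = pycassa.ConnectionPool('mykeyspace', ['localhost:9160'])   # placeholder
items = pycassa.ColumnFamily(pool, 'UserItems')                   # placeholder CF

# option A: bump only the Qty field of itemid2 -- one column write, no read
items.insert('UserId1', {('itemid2', 'Qty'): '16'})

# option B would need to read the itemid2 json column, json.loads it,
# change Qty, and write the whole blob back (read-then-write).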


On 24/08/2012 13:54, samal wrote:

The first is the better choice; each field can be updated separately (write only).
With the second you have to take care of the json yourself (read first, modify, then write).

On Fri, Aug 24, 2012 at 5:45 PM, Roshni Rajagopal
roshni.rajago...@wal-mart.com wrote:


Hi,

Suppose I have a column family to associate a user to a dynamic
list of items. I want to store 5-10 key pieces of information about
the item; there are no specific sorting requirements.
I have two options

A) use composite columns
UserId1 : {
 itemid1:Name = Betty Crocker,
 itemid1:Descr = Cake
itemid1:Qty = 5
 itemid2:Name = Nutella,
 itemid2:Descr = Choc spread
itemid2:Qty = 15
}

B) use a json with the data
UserId1 : {
 itemid1 = {name: Betty Crocker,descr: Cake, Qty: 5},
 itemid2 ={name: Nutella,descr: Choc spread, Qty: 15}
}

Which do you suggest would be better?


Regards,
Roshni







Truncate failing with 1.0 client against 0.7 cluster

2012-07-16 Thread Guy Incognito
i'm doing an upgrade of Cassandra 0.7 to 1.0 at the moment, and as part 
of the preparation i'm upgrading to 1.0 client libraries (we use Hector 
1.0-5) prior to upgrading the cluster itself.  I'm seeing some of our 
integration tests against the dev 0.7 cluster fail as they get 
UnavailableExceptions when trying to truncate the test column families.  
This is new behaviour with the 1.0 client libraries; it doesn't happen
with the 0.7 libraries.


It seems to fail immediately; it doesn't wait for e.g. the 10 second
RPC timeout, it fails straight away.  Anyone have any ideas as to what
may be happening?  Interestingly I seem to be able to get around it if i 
only tell Hector about one of the nodes (we have 4). If I give it all 
four then it throws the UnavailableException.


Re: Truncate failing with 1.0 client against 0.7 cluster

2012-07-16 Thread Guy Incognito
sorry i don't have the exact text right now but it's along the lines of
'not enough replicas available to handle the requested consistency
level'.  i'm requesting QUORUM, but i've tried with ONE and ANY and it
made no difference.


On 16/07/2012 19:30, aaron morton wrote:
UnavailableException is a server side error, what's the full error
message?



Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/07/2012, at 5:31 AM, Guy Incognito wrote:

i'm doing an upgrade of Cassandra 0.7 to 1.0 at the moment, and as 
part of the preparation i'm upgrading to 1.0 client libraries (we use 
Hector 1.0-5) prior to upgrading the cluster itself.  I'm seeing some 
of our integration tests against the dev 0.7 cluster fail as they get 
UnavailableExceptions when trying to truncate the test column 
families.  This is new behaviour with the 1.0 client libraries, it 
doesn't happen with the 0.7 libraries.


It seems to fail immediately; it doesn't wait for e.g. the 10 second
RPC timeout, it fails straight away.  Anyone have any ideas as to
what may be happening?  Interestingly I seem to be able to get around 
it if i only tell Hector about one of the nodes (we have 4). If I 
give it all four then it throws the UnavailableException.







Re: cassandra 1.0.9 error - Read an invalid frame size of 0

2012-06-26 Thread Guy Incognito

i have seen this as well, is it a known issue?

On 18/06/2012 19:38, Gurpreet Singh wrote:


I found a fix for this one, rather a workaround.

I changed the rpc_server_type in cassandra.yaml, from hsha to sync, 
and the error went away. I guess, there is some issue with the thrift 
nonblocking server.
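For what it's worth, the framing the server error refers to looks like this with raw thrift bindings (a sketch only; the generated cassandra module name depends on your gen-py output, and the keyspace is a placeholder):

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from cassandra import Cassandra          # thrift-generated bindings (gen-py)

sock = TSocket.TSocket('localhost', 9160)
transport = TTransport.TFramedTransport(sock)   # framed, as the server expects
client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()
client.set_keyspace('mykeyspace')               # placeholder keyspace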


Thanks
Gurpreet

On Wed, May 16, 2012 at 7:04 PM, Gurpreet Singh
gurpreet.si...@gmail.com wrote:


Thanks Aaron. will do!


On Mon, May 14, 2012 at 1:14 PM, aaron morton
aa...@thelastpickle.com wrote:

Are you using framed transport on the client side ?

Try the Hector user list for hector specific help
https://groups.google.com/forum/?fromgroups#!searchin/hector-users

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/05/2012, at 5:44 AM, Gurpreet Singh wrote:


This is hampering our testing of cassandra a lot, and our
move to cassandra 1.0.9.
Has anyone seen this before? Should I be trying a different
version of cassandra?

/G

On Thu, May 10, 2012 at 11:29 PM, Gurpreet Singh
gurpreet.si...@gmail.com wrote:

Hi,
i have created 1 node cluster of cassandra 1.0.9. I am
setting this up for testing reads/writes.

I am seeing the following error in the server system.log

ERROR [Selector-Thread-7] 2012-05-10 22:44:02,607
TNonblockingServer.java (line 467) Read an invalid frame
size of 0. Are you using TFramedTransport on the client
side?

Initially i was using an old hector 0.7.x, but even after
switching to hector 1.0-5 and thrift version 0.6.1, i
still see this error.
I am using 20 threads writing/reading from cassandra. The
max write batch size is 10, with a constant payload size of
600 bytes per key.

On the client side, i see Hector exceptions happening,
coinciding with these messages on the server.

Any ideas why these errors are happening?

Thanks
Gurpreet











Re: Schema advice/help

2012-03-28 Thread Guy Incognito
well, no.  my assumption is that he knows what the 5 itemTypes (or 
appropriate corresponding ids) are, so he can do a known 5-rowkey 
lookup.  if he does not know, then agreed, my proposal is not a great fit.


could do (as originally suggested)

userId -> itemType:activityId

if you want to keep everything in the same row (again assumes that you 
know what the itemTypes are).  but then you can't really do a multiget, 
you have to do 5 separate slice queries, one for each item type.


can also do some wacky stuff around maintaining a row that explicitly 
only holds the last 10 items by itemType (meaning you have to delete the 
oldest one everytime you insert a new one), but that prolly requires 
read-on-write etc and is a lot messier.  and you will prolly need to 
worry about the case where you (transiently) have more than 10 'latest' 
items for a single itemType.


On 28/03/2012 09:49, Maciej Miklas wrote:
yes - but anyway in your example you need a key range query and that
requires OPP, right?


On Tue, Mar 27, 2012 at 5:13 PM, Guy Incognito dnd1...@gmail.com wrote:


multiget does not require OPP.

On 27/03/2012 09:51, Maciej Miklas wrote:

multiget would require Order Preserving Partitioner, and this can
lead to an unbalanced ring and hot spots.

Maybe you can use a secondary index on itemtype - it must have
small cardinality:
http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/



On Tue, Mar 27, 2012 at 10:10 AM, Guy Incognito
dnd1...@gmail.com wrote:

without the ability to do disjoint column slices, i would
probably use 5 different rows.

userId:itemType -> activityId

then it's a multiget slice of 10 items from each of your 5 rows.


On 26/03/2012 22:16, Ertio Lew wrote:

I need to store activities by each user, on 5 items
types. I always want to read last 10 activities on each
item type, by a user (ie, total activities to read at a
time =50).

I am wanting to store these activities in a single row
for each user so that they can be retrieved in single row
query, since I want to read all the last 10 activities on
each item.. I am thinking of creating composite names
appending itemtype : activityId(activityId is just
timestamp value) but then, I don't see about how to read
the last 10 activities from all itemtypes.

Any ideas about schema to do this better way ?










Re: Schema advice/help

2012-03-27 Thread Guy Incognito
without the ability to do disjoint column slices, i would probably use 5 
different rows.


userId:itemType -> activityId

then it's a multiget slice of 10 items from each of your 5 rows.
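eg with pycassa that read looks something like this (the CF name, keyspace and item type ids are placeholders, and the activity columns are assumed to sort by time):

import pycassa

pool = pycassa.ConnectionPool('mykeyspace', ['localhost:9160'])   # placeholders
activity = pycassa.ColumnFamily(pool, 'ActivityByUserAndType')

user_id = 'user123'
item_types = ['type1', 'type2', 'type3', 'type4', 'type5']
keys = ['%s:%s' % (user_id, t) for t in item_types]   # userId:itemType row keys

# newest 10 activity columns from each of the 5 rows, in one round trip
latest = activity.multiget(keys, column_count=10, column_reversed=True)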

On 26/03/2012 22:16, Ertio Lew wrote:
I need to store activities by each user, on 5 item types. I always
want to read the last 10 activities on each item type, by a user (ie,
total activities to read at a time = 50).


I am wanting to store these activities in a single row for each user 
so that they can be retrieved in single row query, since I want to 
read all the last 10 activities on each item.. I am thinking of 
creating composite names appending itemtype : 
activityId(activityId is just timestamp value) but then, I don't see 
about how to read the last 10 activities from all itemtypes.


Any ideas about schema to do this better way ?




Re: problem in create column family

2012-03-27 Thread Guy Incognito

why don't you show us the command you're actually trying to run?

On 27/03/2012 08:52, puneet loya wrote:

I m using cassandra 1.0.8..

Please reply

On Tue, Mar 27, 2012 at 12:28 PM, R. Verlangen ro...@us2.nl wrote:


Not sure about that, what version of Cassandra are you using?
Maybe someone else here knows how to solve this..


2012/3/27 puneet loya puneetl...@gmail.com

ya had created with UTF8Type before.. It gave the same error.

On executing help assume command it is giving 'utf8' as a type.

so can i use comparator='utf8' or not??


Please reply


On Mon, Mar 26, 2012 at 9:17 PM, R. Verlangen ro...@us2.nl wrote:

You should use the full type names, e.g.

create column family MyColumnFamily with comparator=UTF8Type;


2012/3/26 puneet loya puneetl...@gmail.com

It is giving errors like  Unable to find
abstract-type class
'org.apache.cassandra.db.marshal.utf8' 

and java.lang.RuntimeException:
org.apache.cassandra.db.marshal.MarshalException:
cannot parse 'catalogueId' as hex bytes

where catalogueId is a column that has utf8 as its
data type. they may be just syntactical errors..

Please suggest if u can help me out on dis??




-- 
With kind regards,


Robin Verlangen
www.robinverlangen.nl





-- 
With kind regards,


Robin Verlangen
www.robinverlangen.nl






Re: Schema advice/help

2012-03-27 Thread Guy Incognito

multiget does not require OPP.

On 27/03/2012 09:51, Maciej Miklas wrote:
multiget would require Order Preserving Partitioner, and this can lead
to an unbalanced ring and hot spots.


Maybe you can use a secondary index on itemtype - it must have small
cardinality:
http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/




On Tue, Mar 27, 2012 at 10:10 AM, Guy Incognito dnd1...@gmail.com wrote:


without the ability to do disjoint column slices, i would probably
use 5 different rows.

userId:itemType -> activityId

then it's a multiget slice of 10 items from each of your 5 rows.


On 26/03/2012 22:16, Ertio Lew wrote:

I need to store activities by each user, on 5 items types. I
always want to read last 10 activities on each item type, by a
user (ie, total activities to read at a time =50).

I am wanting to store these activities in a single row for
each user so that they can be retrieved in single row query,
since I want to read all the last 10 activities on each item..
I am thinking of creating composite names appending itemtype
: activityId(activityId is just timestamp value) but then, I
don't see about how to read the last 10 activities from all
itemtypes.

Any ideas about schema to do this better way ?







Re: Exceptions related to thrift transport

2012-03-22 Thread Guy Incognito
are you perhaps trying to send a large batch mutate?  i've seen broken 
pipes etc in cassandra 0.7 (currently in the process of upgrading to 
1.0.8) when a large batch mutate is sent.
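if that's the case, one easy mitigation is to cap how big each batch gets on the client; eg pycassa's queued mutator does this (a sketch for illustration only - names, sizes and rows_to_write are placeholders):

import pycassa

pool = pycassa.ConnectionPool('mykeyspace', ['localhost:9160'])   # placeholders
cf = pycassa.ColumnFamily(pool, 'MyCF')

rows_to_write = []   # replace with your (key, columns) pairs

# queue_size caps how many inserts go into each batch_mutate call, keeping
# every thrift message well under the server's message length limit
b = cf.batch(queue_size=100)
for key, columns in rows_to_write:
    b.insert(key, columns)
b.send()             # flush whatever is left in the queue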


On 22/03/2012 07:09, Tiwari, Dushyant wrote:


Hector version 1.0-3

What is the reason for the second exception, BTW?

Thanks,

Dushyant

*From:*aaron morton [mailto:aa...@thelastpickle.com]
*Sent:* Wednesday, March 21, 2012 10:46 PM
*To:* user@cassandra.apache.org
*Subject:* Re: Exceptions related to thrift transport

1.org.apache.thrift.TException: Message length exceeded: 134218240

thrift_max_message_length_in_mb

https://github.com/apache/cassandra/blob/cassandra-1.0/conf/cassandra.yaml#L243

(134218240 is 128MB, which is a lot of data)

2.org.apache.thrift.protocol.TProtocolException: Missing version
in readMessageBegin, old client?

What version of hector are you using  ?

Cheers

-

Aaron Morton

Freelance Developer

@aaronmorton

http://www.thelastpickle.com

On 22/03/2012, at 12:02 AM, Tiwari, Dushyant wrote:



Hi Cassandra Users,

A couple of questions on the server side exceptions that I see 
sometimes --


1.org.apache.thrift.TException: Message length exceeded: 134218240

- How to configure message length?

2.org.apache.thrift.protocol.TProtocolException: Missing version in 
readMessageBegin, old client?


- How to rectify this exception?

Some related client side Exceptions are --

1.org.apache.thrift.transport.TTransportException: 
java.net.SocketException: Broken pipe


2.me.prettyprint.hector.api.exceptions.HectorTransportException: 
org.apache.thrift.transport.TTransportException


Caused by: org.apache.thrift.transport.TTransportException: null

at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)


Using Hector as client. The queries are writes to CF with indexes.

The frequency of these exceptions are very low.

Thanks,

Dushyant










Re: Newbie Question: Cassandra consuming 100% CPU on ubuntu server

2012-02-18 Thread Guy Incognito
perhaps entirely unrelated, but somebody was asking about lockups on EC2 
yesterday and found: http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs


On 18/02/2012 14:58, Aditya Gupta wrote:
Am I installing it the right way ? While installing I didn't verify 
the signatures using public key.


On Sat, Feb 18, 2012 at 8:21 PM, Aditya Gupta ady...@gmail.com wrote:


No data at all. just a fresh installation


On Sat, Feb 18, 2012 at 6:57 PM, R. Verlangen ro...@us2.nl wrote:

You might want to check your Cassandra logs, they contain
important information that might lead you to the actual cause
of the problems.

2012/2/18 Aditya Gupta ady...@gmail.com

Thanks! But what about the 100% cpu consumption that
is causing the server to hang?


On Sat, Feb 18, 2012 at 6:19 PM, Watanabe Maki
watanabe.m...@gmail.com wrote:

I haven't used the packaged kit, but Cassandra uses
half of the physical memory on your system by default.
You need to edit cassandra-env.sh to decrease the heap size.
Update MAX_HEAP_SIZE and HEAP_NEWSIZE and restart.

From iPhone


On 2012/02/18, at 20:40, Aditya Gupta ady...@gmail.com wrote:


I just installed Cassandra on my ubuntu server by
adding the following to the sources list:

deb http://www.apache.org/dist/cassandra/debian
10x main
deb-src
http://www.apache.org/dist/cassandra/debian 10x main


Soon after install I started getting OOM errors and
then the server became unresponsive. I added more RAM
to the server but found that cassandra was consuming
100% CPU and 1GB RAM as soon as the server was
started. Why is this happening, and how can I get it
back to normal?










Re: read-repair?

2012-02-02 Thread Guy Incognito
sorry to be dense, but which is it?  do i get the old version or the new 
version?  or is it indeterminate?


On 02/02/2012 01:42, Peter Schuller wrote:

i have RF=3, my row/column lives on 3 nodes right?  if (for some reason, eg
a timed-out write at quorum) node 1 has a 'new' version of the row/column
(eg clock = 10), but node 2 and 3 have 'old' versions (clock = 5), when i
try to read my row/column at quorum, what do i get back?

You either get back the new version or the old version, depending on
whether node 1 participated in the read. In your scenario, the
previous write at quorum failed (since it only made it to one node),
so this is not a violation of the contract.

Once node 2 and/or 3 return their response, read repair (if it is
active) will cause a re-read and reconciliation, followed by a row
mutation being sent to the nodes to correct the column.


do i get the clock 5 version because that is what the quorum agrees on, and

No; a quorum of nodes is waited for, and the newest column wins. This
accomplishes the reads-see-writes invariant.
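In other words, the resolution on the read path is just newest-timestamp-wins across the responses that were waited for; a tiny sketch:

# responses: (value, clock) pairs from the replicas the quorum read waited for
def resolve(responses):
    return max(responses, key=lambda vc: vc[1])

print(resolve([('old', 5), ('new', 10)]))   # ('new', 10) -- node 1 answered
print(resolve([('old', 5), ('old', 5)]))    # ('old', 5)  -- only nodes 2 and 3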





read-repair?

2012-02-01 Thread Guy Incognito

how does read repair work in the following scenario?

i have RF=3, my row/column lives on 3 nodes right?  if (for some reason, 
eg a timed-out write at quorum) node 1 has a 'new' version of the 
row/column (eg clock = 10), but node 2 and 3 have 'old' versions (clock 
= 5), when i try to read my row/column at quorum, what do i get back?


do i get the clock 5 version because that is what the quorum agrees on, 
and then read-repair kicks in and nodes 2 and 3 are updated to clock 10 
so a subsequent read returns clock 10?  or are nodes 2 and 3 updated to 
clock 10 first, and i get the clock 10 version on the initial read?


atomicity of a row write

2012-01-23 Thread Guy Incognito

hi all,

having read: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

i would like some clarification:

is a write to a single row key in a single column family atomic in the 
sense that i can do a batch mutate where i


1) write col 'A' to key 'B'
2) write 'col 'C' to key 'B'

and either both column writes will succeed, or both will fail?  i won't
get the situation where eg col 'A' is written and col 'C' fails, my
client receives an error, but col 'A' is actually persisted and becomes
visible to other clients?
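for concreteness, the kind of write i mean, sketched with pycassa (keyspace, CF name and values are placeholders):

import pycassa

pool = pycassa.ConnectionPool('mykeyspace', ['localhost:9160'])   # placeholders
cf = pycassa.ColumnFamily(pool, 'MyCF')

# two columns, one row key: this reaches the server as a single row mutation
cf.insert('B', {'A': 'value1', 'C': 'value2'})

# the same two columns written to a second column family would be a separate
# mutation, which is the cross-CF case asked about below.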


does this hold if i write key 'B' across two different column families?  
(i assume not, but the faq doesn't seem to explicitly exclude this).


PS i'm not worried about isolation per se, i'm interested in what the 
'eventually consistent' state is.


Re: Replacing supercolumns with composite columns; Getting the equivalent of retrieving a list of supercolumns by name

2012-01-04 Thread Guy Incognito
i know it's a throwaway example, but i would probably structure your 
column the other way around in that case.


ie steve.4, steve.5, steve.6, greg.4, greg.6, greg.9.

and then do two slice queries, steve.4-steve.10, greg.4-greg.10.
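eg with pycassa composite slices it would look roughly like this (the CF name and a CompositeType(UTF8Type, IntegerType) comparator are assumptions):

import pycassa

pool = pycassa.ConnectionPool('mykeyspace', ['localhost:9160'])   # placeholders
cf = pycassa.ColumnFamily(pool, 'Items')

# columns named (id, number), e.g. ('steve', 4) ... ('greg', 9)
steve = cf.get('rowkey1', column_start=('steve', 4), column_finish=('steve', 10))
greg  = cf.get('rowkey1', column_start=('greg', 4), column_finish=('greg', 10))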

On 04/01/2012 15:41, Jeremiah Jordan wrote:
You can't use a slice range.  But you can query for the specific 
columns.  4.steve, 5.steve, 6.steve ... 4.greg, 5.greg, 
6.greg.  Just have to ask for all of the possible columns you want.


On 01/03/2012 04:31 PM, Stephen Pope wrote:

  The bonus you're talking about here, how do I apply that?

  For example, my columns are in the form of number.id such as 
4.steve, 4.greg, 5.steve, 5.george. Is there a way to query a slice 
of numbers with a list of ids? As in, I want all the columns with 
numbers between 4 and 10 which have ids steve or greg.


  Cheers,
  Steve

-Original Message-
From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com]
Sent: Tuesday, January 03, 2012 3:12 PM
To: user@cassandra.apache.org
Cc: Asil Klin
Subject: Re: Replacing supercolumns with composite columns; Getting 
the equivalent of retrieving a list of supercolumns by name


The main issue with replacing super columns with composite columns 
right now is that if you don't know all your sub-column names you 
can't select multiple super columns worth of data in the same query 
without getting extra stuff.  You have to use a slice to get all 
subcolumns of a given super column, and you can't have disjoint 
slices, so if you want two super columns full, you have to get all 
the other stuff that is in between them, or make two queries.
If you know what all of the sub-column names are you can ask for all 
of the super/sub column pairs for all of the super columns you want 
and not get extra data.


If you don't need to pull multiple super columns at a time with 
slices like that, then there isn't really an issue.


A bonus of using composite keys like this, is that if there is a 
specific sub column you want from multiple super columns, you can 
pull all those out with a single multiget and you don't have to pull 
the rest of the columns...


So there are pros and cons...

-Jeremiah


On 01/03/2012 01:58 PM, Asil Klin wrote:

I have a super column family which I always use to retrieve a list of
supercolumns (with all subcolumns) by name. I am looking to
replace all SuperColumns in my schema with composite columns.

How could I design schema so that I could do the equivalent of
retrieving a list of supercolumns by name, in case of using composite
columns.

(As of now I thought of using the supercolumn name as the first
component of the composite name and the subcolumn name as 2nd
component of composite name.)




Re: Doubts related to composite type column names/values

2011-12-20 Thread Guy Incognito
afaik composite lets you do sorting in a way that would be 
difficult/impossible with string concatenation.


eg String, Integer with the string ascending, and the integer descending.

if i had composites available (which i don't b/c we are on 0.7), i would 
use them over string concatenation.  string concatenation is a pain.
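a toy illustration of the kind of ordering composites give you that concatenation doesn't (plain python; the descending integer is faked with negation):

# composite (String asc, Integer desc): compared component by component
cols = [('apple', 3), ('apple', 10), ('banana', 2)]
print(sorted(cols, key=lambda c: (c[0], -c[1])))
# [('apple', 10), ('apple', 3), ('banana', 2)]

# naive string concatenation compares byte by byte, so numbers sort wrongly
print(sorted('%s:%d' % c for c in cols))
# ['apple:10', 'apple:3', 'banana:2']  -- 10 lands before 3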


On 20/12/2011 20:33, Maxim Potekhin wrote:
Thank you Aaron! As long as I have plain strings, would you say that I 
would do almost as well with catenation?


Of course I realize that mixed types are a very different case where 
the composite is very useful.


Thanks

Maxim


On 12/20/2011 2:44 PM, aaron morton wrote:
Component values are compared in a type-aware fashion: an Integer is
an Integer, not a 10-character zero-padded string.


You can also slice on the components. Just like with string concat,
but nicer.  e.g. If your app is storing comments for a thing, and
the column names have the form comment_id, field or Integer,
String, you can slice for all properties of a comment or all
properties for comments between two comment_id's.


Finally, the client library knows what's going on.

Hope that helps.

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/12/2011, at 7:43 AM, Maxim Potekhin wrote:


With regards to static, what are major benefits as it compares with
string catenation (with some convenient separator inserted)?

Thanks

Maxim


On 12/20/2011 1:39 PM, Richard Low wrote:
On Tue, Dec 20, 2011 at 5:28 PM, Ertio Lew ertio...@gmail.com wrote:

With regard to the composite columns stuff in Cassandra, I have the
following doubts :

1. What is the storage overhead of the composite type column 
names/values,
The values are the same.  For each dimension, there are 3 bytes of
overhead.


2. what exactly is the difference between the DynamicComposite and 
Static

Composite ?

Static composite type has the types of each dimension specified in the
column family definition, so all names within that column family have
the same type.  Dynamic composite type lets you specify the type for
each column, so they can be different.  There is extra storage
overhead for this and care must be taken to ensure all column names
remain comparable.











Re: memory estimate for each key in the key cache

2011-12-18 Thread Guy Incognito
to be blunt, this doesn't sound right to me, unless it's doing something 
rather more clever to manage the memory.


i mocked up a simple class containing a byte[], ByteBuffer and long, and 
the shallow size alone is 32 bytes.  deep size with a byte[16], 1-byte 
bytebuffer and long is 132.  this is on a 64-bit jvm on win x64, but
is consistent(ish) with what i've seen in the past on linux jvms.  the 
actual code has rather more objects than this (it's a map, it has a 
pair, decoratedKey) so would be quite a bit bigger per key.


On 17/12/2011 03:42, Brandon Williams wrote:

On Fri, Dec 16, 2011 at 9:31 PM, Dave Brosius dbros...@mebigfatguy.com wrote:

Wow, Java is a lot better than I thought if it can perform that kind of
magic.  I'm guessing the wiki information is just old and out of date. It's
probably more like 60 + sizeof(key)

With jamm and MAT it's fairly easy to test.  The number is accurate
last I checked.

-Brandon




Re: best practices for simulating transactions in Cassandra

2011-12-10 Thread Guy Incognito

you could try writing with the clock of the initial replay entry?

On 06/12/2011 20:26, John Laban wrote:
Ah, neat.  It is similar to what was proposed in (4) above with adding 
transactions to Cages, but instead of snapshotting the data to be 
rolled back (the before data), you snapshot the data to be replayed 
(the after data).  And then later, if you find that the transaction 
didn't complete, you just keep replaying the transaction until it takes.


The part I don't understand with this approach though:  how do you 
ensure that someone else didn't change the data between your initial 
failed transaction and the later replaying of the transaction?  You 
could get lost writes in that situation.


Dominic (in the Cages blog post) explained a workaround with that for 
his rollback proposal:  all subsequent readers or writers of that data 
would have to check for abandoned transactions and roll them back 
themselves before they could read the data.  I don't think this is 
possible with the XACT_LOG replay approach in these slides though, 
based on how the data is indexed (cassandra node token + timeUUID).



PS:  How are you liking Cages?




2011/12/6 Jérémy SEVELLEC jsevel...@gmail.com


Hi John,

I had exactly the same reflections.

I'm using zookeeper and cages to lock and isolate.

but how to rollback?
It's impossible so try replay!

the idea is explained in this presentation
http://www.slideshare.net/mattdennis/cassandra-data-modeling (starting
from slide 24)

- insert your whole data into one column
- do the job
- remove (or expire) your column.

if there is a problem while doing the job, you keep the
possibility to replay and replay and replay (synchronously or in a
batch).

Regards

Jérémy


2011/12/5 John Laban j...@pagerduty.com

Hello,

I'm building a system using Cassandra as a datastore and I
have a few places where I am in need of transactions.

I'm using ZooKeeper to provide locking when I'm in need of
some concurrency control or isolation, so that solves that
half of the puzzle.

What I need now is to sometimes be able to get atomicity
across multiple writes by simulating the
begin/rollback/commit abilities of a relational DB.  In
other words, there are places where I need to perform multiple
updates/inserts, and if I fail partway through, I would
ideally be able to rollback the partially-applied updates.

Now, I *know* this isn't possible with Cassandra.  What I'm
looking for are all the best practices, or at least tips and
tricks, so that I can get around this limitation in Cassandra
and still maintain a consistent datastore.  (I am using quorum
reads/writes so that eventual consistency doesn't kick my ass
here as well.)

Below are some ideas I've been able to dig up.  Please let me
know if any of them don't make sense, or if there are better
approaches:


1) Updates to a row in a column family are atomic.  So try to
model your data so that you would only ever need to update a
single row in a single CF at once.  Essentially, you model
your data around transactions.  This is tricky but can
certainly be done in some situations.

2) If you are only dealing with multiple row *inserts* (and
not updates), have one of the rows act as a 'commit' by
essentially validating the presence of the other rows.  For
example, say you were performing an operation where you wanted
to create an Account row and 5 User rows all at once (this is
an unlikely example, but bear with me).  You could insert 5
rows into the Users CF, and then the 1 row into the Accounts
CF, which acts as the commit.  If something went wrong before
the Account could be created, any Users that had been created
so far would be orphaned and unusable, as your business logic
can ensure that they can't exist without an Account.  You
could also have an offline cleanup process that swept away
orphans.

3) Try to model your updates as idempotent column inserts
instead.  How do you model updates as inserts?  Instead of
munging the value directly, you could insert a column
containing the operation you want to perform (like +5).  It
would work kind of like the Consistent Vote Counting
implementation: ( https://gist.github.com/41 ).  How do
you make the inserts idempotent?  Make sure the column names
correspond to a request ID or some other identifier that would
be identical across re-drives of a given (perhaps originally
failed) request.  This could leave your datastore in a
temporarily inconsistent state, but would eventually 

Re: UUIDType

2011-11-21 Thread Guy Incognito
no particular reason, just wanting clarification b/c i saw a post (from 
ed anuff i think) about java.util.UUID being inconsistent with RFC4122, 
and this coming to light when looking at Cassandra's TimeUUIDType and 
LexicalUUIDType.  so i wondered if cassandra's types were consistent 
with RFC4122, and it seems like they are not either.
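a quick python illustration of the signed vs unsigned difference in question (java.util.UUID compares its two 64-bit halves as signed longs, whereas RFC4122 ordering treats them as unsigned):

import uuid

a = uuid.UUID('00000000-0000-0000-8000-000000000000')  # high bit set in low half
b = uuid.UUID('00000000-0000-0000-0000-000000000001')

# unsigned, RFC4122-style ordering: compare the raw 128-bit value
print(a.int > b.int)                                    # True

# signed-long ordering on the least-significant half (java.util.UUID style)
def signed64(x):
    return x - (1 << 64) if x >= (1 << 63) else x
print(signed64(a.int & (2**64 - 1)) > signed64(b.int & (2**64 - 1)))   # False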


On 21/11/2011 18:34, Jonathan Ellis wrote:

I think that's correct, but why would you want to do that?

On Sun, Nov 20, 2011 at 2:55 AM, Guy Incognito dnd1...@gmail.com wrote:

am i correct that neither of Cassandra's UUIDTypes (at least in 0.7) compare
UUIDs according to RFC4122 (ie as two unsigned longs)?









UUIDType

2011-11-20 Thread Guy Incognito
am i correct that neither of Cassandra's UUIDTypes (at least in 0.7) 
compare UUIDs according to RFC4122 (ie as two unsigned longs)?


Re: Mass deletion -- slowing down

2011-11-14 Thread Guy Incognito
i think what he means is...do you know what day the 'oldest' day is?  eg 
if you have a rolling window of say 2 weeks, structure your query so 
that your slice range only goes back 2 weeks, rather than to the 
beginning of time.  this would avoid iterating over all the tombstones 
from prior to the 2 week window.  this wouldn't work if you are deleting 
arbitrary days in the middle of your date range.


On 14/11/2011 02:02, Maxim Potekhin wrote:

Thanks Peter,

I'm not sure I entirely follow. By the oldest data, do you mean the
primary key corresponding to the limit of the time horizon? Unfortunately,
unique IDs and the timestamps do not correlate, in the sense that
chronologically newer entries might have a smaller sequential ID. That's
because the timestamp corresponds to the last update, which is stochastic
in the sense that the jobs can take from seconds to days to complete. As I
said, I'm not sure I understood you correctly.

Also, I note that queries on different dates (i.e. not contaminated 
with lots

of tombstones) work just fine, which is consistent with the picture that
emerged so far.

Theoretically -- would compaction or cleanup help?

Thanks

Maxim




On 11/13/2011 8:39 PM, Peter Schuller wrote:
I do limit the number of rows I'm asking for in Pycassa. Queries on
primary keys still work fine,

Is it feasible in your situation to keep track of the oldest possible
data (for example, if there is a single sequential writer that rotates
old entries away it could keep a record of what the oldest might be)
so that you can bound your index lookup >= that value (and avoid the
tombstones)?







Re: indexes from CassandraSF

2011-11-14 Thread Guy Incognito

ok great, thanks ed, that's really helpful.

just wanted to make sure i wasn't missing something fundamental.

On 13/11/2011 23:57, Ed Anuff wrote:

Yes, correct, it's not going to clean itself.  Using your example with
a little more detail:

1 ) A(T1) reads previous location (T0,L0) from index_entries for user U0
2 ) B(T2) reads previous location (T0,L0) from index_entries for user U0
3 ) A(T1) deletes previous location (T0,L0) from index_entries for user U0
4 ) B(T2) deletes previous location (T0,L0) from index_entries for user U0
5 ) A(T1) deletes previous location (L0,T0,U0) for user U0 from index
6 ) B(T2) deletes previous location (L0,T0,U0) for user U0 from index
7 ) A(T1) inserts new location (T1,L1) into index_entries for user U0
8 ) B(T2) inserts new location (T2,L2) into index_entries for user U0
9 ) index_entries for user U0 now contains (T1,L1),(T2,L2)
10) A(T1) inserts new location (L1,T1,U0) for user U0 into index
11) B(T2) inserts new location (L2,T2,U0) for user U0 into index
12) A(T1) sets new location (L1) on user U0
13) B(T2) sets new location (L2) on user U0
14) C(T3) queries for users where location equals L1, gets back user
U0 where current location is actually L2

So, you want to either verify on read by making sure the queried field
is correct before returning it in your result set to the rest of your
app, or you want to use locking (ex. lock on (U0,location) during
updates).  The key thing here is that although the index is not in the
desired state at (14), the information is in the system to get to that
state (the previous values in index_entries).  This lets the cleanup
happen on the next update of location for user U0:

15) D(T4) reads previous locations (T1,L1),(T2,L2) from index entries
for user U0
16) D(T4) deletes previous locations (T1,L1),(T2,L2) from index
entries for user U0
17) D(T4) deletes previous locations (L1,T1,U0),(L2,T2,U0) for user U0
from index
18) D(T4) inserts new location (T4,L3) into index entries for user U0
19) D(T4) inserts new location (L3,T4,U0) for user U0 into index
20) D(T4) sets new location (L3) on user U0

BTW, just to reiterate since this sometimes comes up, the timestamps
being stored in these tuples are not longs, they're time UUIDs, so T1
and T2 are never equal.
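A sketch of the verify-on-read option above in pycassa terms (the CF names follow the slides; everything else is a placeholder, and pulling the candidate keys out of the Users_By_Location index row is left out):

import pycassa

pool = pycassa.ConnectionPool('mykeyspace', ['localhost:9160'])   # placeholders
users = pycassa.ColumnFamily(pool, 'Users')

def users_at_location(location, candidate_user_keys):
    """candidate_user_keys: user keys sliced out of the Users_By_Location
    index row for this location."""
    for user_key in candidate_user_keys:
        user = users.get(user_key, columns=['location'])
        if user.get('location') == location:   # verify on read: skip stale entries
            yield user_key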

Ed


On Sun, Nov 13, 2011 at 6:52 AM, Guy Incognito dnd1...@gmail.com wrote:

[1] i'm not particularly worried about transient conditions so that's ok.  i
think there's still the possibility of a non-transient false positive...if 2
writes were to happen at exactly the same time (highly unlikely), eg

1) A reads previous location (L1) from index entries
2) B reads previous location (L1) from index entries
3) A deletes previous location (L1) from index entries
4) B deletes previous location (L1) from index entries
5) A deletes previous location (L1) from index
6) B deletes previous location (L1) from index
7) A enters new location (L2) into index entries
8) B enters new location (L3) into index entries
9 ) A enters new location (L2) into index
10) B enters new location (L3) into index
11) A sets new location (L2) on users
12) B sets new location (L3) on users

after this, don't i end up with an incorrect L2 location in index entries
and in the index, that won't be resolved until the next write of location
for that user?

[2] ah i see...so the client would continuously retry until the update
works.  that's fine provided the client doesn't bomb out with some other
error, if that were to happen then i have potentially deleted the index
entry columns without deleting the corresponding index columns.

i can handle both of the above for my use case, i just want to clarify
whether they are possible (however unlikely) scenarios.

On 13/11/2011 02:41, Ed Anuff wrote:

1) The index updates should be eventually consistent.  This does mean
that you can get a transient false-positive on your search results.
If this doesn't work for you, then you either need to use ZK or some
other locking solution or do read repair by making sure that the row
you retrieve contains the value you're searching for before passing it
on to the rest of your application.

2)  You should be able to reapply the batch updates til they succeed.
The update is idempotent.  One thing that's important that the slides
don't make clear is that this requires using time-based uuids as your
timestamp components.  Take a look at the sample code.

Hope this helps,

Ed

On Sat, Nov 12, 2011 at 3:59 PM, Guy Incognito dnd1...@gmail.com wrote:

help?

On 10/11/2011 19:34, Guy Incognito wrote:

hi,

i've been looking at the model below from Ed Anuff's presentation at
Cassandra SF (http://www.slideshare.net/edanuff/indexing-in-cassandra).
  Couple of questions:

1) Isn't there still the chance that two concurrent updates may end up
with the index containing two entries for the given user, only one of
which
would match the actual value in the Users cf?

2) What happens if your batch fails partway through the update?  If i
understand

Re: indexes from CassandraSF

2011-11-13 Thread Guy Incognito
[1] i'm not particularly worried about transient conditions so that's 
ok.  i think there's still the possibility of a non-transient false 
positive...if 2 writes were to happen at exactly the same time (highly 
unlikely), eg


1) A reads previous location (L1) from index entries
2) B reads previous location (L1) from index entries
3) A deletes previous location (L1) from index entries
4) B deletes previous location (L1) from index entries
5) A deletes previous location (L1) from index
6) B deletes previous location (L1) from index
7) A enters new location (L2) into index entries
8) B enters new location (L3) into index entries
9 ) A enters new location (L2) into index
10) B enters new location (L3) into index
11) A sets new location (L2) on users
12) B sets new location (L3) on users

after this, don't i end up with an incorrect L2 location in index 
entries and in the index, that won't be resolved until the next write of 
location for that user?


[2] ah i see...so the client would continuously retry until the update 
works.  that's fine provided the client doesn't bomb out with some other 
error, if that were to happen then i have potentially deleted the index 
entry columns without deleting the corresponding index columns.


i can handle both of the above for my use case, i just want to clarify 
whether they are possible (however unlikely) scenarios.


On 13/11/2011 02:41, Ed Anuff wrote:

1) The index updates should be eventually consistent.  This does mean
that you can get a transient false-positive on your search results.
If this doesn't work for you, then you either need to use ZK or some
other locking solution or do read repair by making sure that the row
you retrieve contains the value you're searching for before passing it
on to the rest of your application.

2)  You should be able to reapply the batch updates til they succeed.
The update is idempotent.  One thing that's important that the slides
don't make clear is that this requires using time-based uuids as your
timestamp components.  Take a look at the sample code.

Hope this helps,

Ed

On Sat, Nov 12, 2011 at 3:59 PM, Guy Incognito dnd1...@gmail.com wrote:

help?

On 10/11/2011 19:34, Guy Incognito wrote:

hi,

i've been looking at the model below from Ed Anuff's presentation at
Cassandra SF (http://www.slideshare.net/edanuff/indexing-in-cassandra).
  Couple of questions:

1) Isn't there still the chance that two concurrent updates may end up
with the index containing two entries for the given user, only one of which
would match the actual value in the Users cf?

2) What happens if your batch fails partway through the update?  If i
understand correctly there are no guarantees about ordering when a batch is
executed, so isn't it possible that eg the previous
value entries in Users_Index_Entries may have been deleted, and then the
batch fails before the entries in Indexes are deleted, ie the mechanism has
'lost' those values?  I assume this can be addressed
by not deleting the old entries until the batch has succeeded (ie put the
previous entry deletion into a separate, subsequent batch).  this at least
lets you retry at a later time.

perhaps i'm missing something?

SELECT {location}..{location, *}
FROM Users_Index_Entries WHERE KEY =user_key;

BEGIN BATCH

DELETE {location, ts1}, {location, ts2}, ...
FROM Users_Index_Entries WHERE KEY =user_key;

DELETE {value1,user_key, ts1}, {value2,user_key, ts2}, ...
FROM Indexes WHERE KEY = Users_By_Location;

UPDATE Users_Index_Entries SET {location, ts3} =value3
WHERE KEY=user_key;

UPDATE Indexes SET {value3, user_key, ts3} = null
WHERE KEY = Users_By_Location;

UPDATE Users SET location =value3
WHERE KEY =user_key;

APPLY BATCH







Re: indexes from CassandraSF

2011-11-12 Thread Guy Incognito

help?

On 10/11/2011 19:34, Guy Incognito wrote:

hi,

i've been looking at the model below from Ed Anuff's presentation at 
Cassandra SF
(http://www.slideshare.net/edanuff/indexing-in-cassandra).  Couple of 
questions:


1) Isn't there still the chance that two concurrent updates may end up 
with the index containing two entries for the given user, only one of 
which would match the actual value in the Users cf?


2) What happens if your batch fails partway through the update?  If i 
understand correctly there are no guarantees about ordering when a 
batch is executed, so isn't it possible that eg the previous
value entries in Users_Index_Entries may have been deleted, and then 
the batch fails before the entries in Indexes are deleted, ie the 
mechanism has 'lost' those values?  I assume this can be addressed
by not deleting the old entries until the batch has succeeded (ie put 
the previous entry deletion into a separate, subsequent batch).  this 
at least lets you retry at a later time.


perhaps i'm missing something?

SELECT {location}..{location, *}
FROM Users_Index_Entries WHERE KEY = user_key;

BEGIN BATCH

DELETE {location, ts1}, {location, ts2}, ...
FROM Users_Index_Entries WHERE KEY = user_key;

DELETE {value1, user_key, ts1}, {value2, user_key, ts2}, ...
FROM Indexes WHERE KEY = Users_By_Location;

UPDATE Users_Index_Entries SET {location, ts3} = value3
WHERE KEY=user_key;

UPDATE Indexes SET {value3, user_key, ts3} = null
WHERE KEY = Users_By_Location;

UPDATE Users SET location = value3
WHERE KEY = user_key;

APPLY BATCH





indexes from CassandraSF

2011-11-10 Thread Guy Incognito

hi,

i've been looking at the model below from Ed Anuff's presentation at 
Cassandra SF (http://www.slideshare.net/edanuff/indexing-in-cassandra).
Couple of questions:


1) Isn't there still the chance that two concurrent updates may end up 
with the index containing two entries for the given user, only one of 
which would match the actual value in the Users cf?


2) What happens if your batch fails partway through the update?  If i 
understand correctly there are no guarantees about ordering when a batch 
is executed, so isn't it possible that eg the previous
value entries in Users_Index_Entries may have been deleted, and then the 
batch fails before the entries in Indexes are deleted, ie the mechanism 
has 'lost' those values?  I assume this can be addressed
by not deleting the old entries until the batch has succeeded (ie put 
the previous entry deletion into a separate, subsequent batch).  this at 
least lets you retry at a later time.


perhaps i'm missing something?

SELECT {location}..{location, *}
FROM Users_Index_Entries WHERE KEY = user_key;

BEGIN BATCH

DELETE {location, ts1}, {location, ts2}, ...
FROM Users_Index_Entries WHERE KEY = user_key;

DELETE {value1, user_key, ts1}, {value2, user_key, ts2}, ...
FROM Indexes WHERE KEY = Users_By_Location;

UPDATE Users_Index_Entries SET {location, ts3} = value3
WHERE KEY=user_key;

UPDATE Indexes SET {value3, user_key, ts3} = null
WHERE KEY = Users_By_Location;

UPDATE Users SET location = value3
WHERE KEY = user_key;

APPLY BATCH



Re: security

2011-11-09 Thread Guy Incognito

ok, thx for the input!

On 09/11/2011 15:19, Mohit Anchlia wrote:

We lockdown ssh to root from any network. We also provide individual
logins including sysadmin and they go through LDAP authentication.
Anyone who does sudo su as root gets logged and alerted via trapsend.
We use firewalls and also have a separate vlan for datastore servers.
We then open only specific ports from our application servers to
datastore servers.

You should also look at Cassandra authentication as additional means
of securing your data.

On Wed, Nov 9, 2011 at 6:39 AM, Sasha Dolgy sdo...@gmail.com wrote:

Firewall with appropriate rules.


On Tue, Nov 8, 2011 at 6:30 PM, Guy Incognito dnd1...@gmail.com wrote:

hi,

is there a standard approach to securing cassandra eg within a corporate
network?  at the moment in our dev environment, anybody with network
connectivity to the cluster can connect to it and mess with it.  this would
not be acceptable in prod.  do people generally write custom authenticators
etc, or just put the cluster behind a firewall with the appropriate rules to
limit access?




security

2011-11-08 Thread Guy Incognito

hi,

is there a standard approach to securing cassandra eg within a corporate 
network?  at the moment in our dev environment, anybody with network 
connectivity to the cluster can connect to it and mess with it.  this 
would not be acceptable in prod.  do people generally write custom 
authenticators etc, or just put the cluster behind a firewall with the 
appropriate rules to limit access?


CompositeType for use with 0.7

2011-11-05 Thread Guy Incognito
Is this a lib I can just drop into a 0.7 instance of cassandra and use?  
I'm not sure what to make of the README about not using it with versions 
earlier than 0.8.0-rc1.


https://github.com/riptano/hector-composite

the goal is to start using CompositeTypes in 0.7 (which I can't upgrade 
to 0.8 at the moment), with a seamless transition to 0.8 when I do 
upgrade.  will using this with hector 0.8.x allow this?


super sub slice query?

2011-10-27 Thread Guy Incognito
is there such a thing?  a query that runs against a SC family and 
returns a subset of subcolumns from a set of super-columns?


is there a way to have eg a slice query (or super slice query) only 
return the column names, rather than the value as well?