Doubt in Row key range scan

2012-05-28 Thread Prakrati Agrawal
Dear all

I have stored my data in a Cassandra database with row keys in the format 
tickerID_date. When I specify a row key range such as 1_2012/05/24 (start) to 
1_2012/05/27 (end), it says that the end key's md5 value is less than the start 
key's md5 value. So I changed my start key to 1_2012/05/27 and my end key to 
1_2012/05/24, and then I got all the keys, even ones that are not in my range, 
such as 67_2012/05/23 and 54_2012/05/28. I am using the Thrift API.
Please help me, as I want only the columns of 1_2012/05/24, 1_2012/05/25, 
1_2012/05/26 and 1_2012/05/27.

Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com



This email message may contain proprietary, private and confidential 
information. The information transmitted is intended only for the person(s) or 
entities to which it is addressed. Any review, retransmission, dissemination or 
other use of, or taking of any action in reliance upon, this information by 
persons or entities other than the intended recipient is prohibited and may be 
illegal. If you received this in error, please contact the sender and delete 
the message from your system.

Mu Sigma takes all reasonable steps to ensure that its electronic 
communications are free from viruses. However, given Internet accessibility, 
the Company cannot accept liability for any virus introduced by this e-mail or 
any attachment and you are advised to use up-to-date virus checking software.


Re: Doubt in Row key range scan

2012-05-28 Thread Pierre Chalamet
Hi,

That's expected.

Keys are mapped to replicas with a hash (MD5) when using the random 
partitioner (which I guess you are using).

You probably want to switch to the order-preserving partitioner, or tweak your 
data model so that you can rely on a secondary index for this kind of filtering.
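
To illustrate why the range scan behaves this way, here is a minimal Python sketch. It is not Cassandra code: the token() function only approximates how the random partitioner derives a token from the MD5 of a key, and keys_in_token_range() is a hypothetical helper that mimics a scan between two tokens. The key list is taken from the question above.

```python
import hashlib


def token(key: str) -> int:
    """Approximate a random-partitioner token (assumption: the MD5 digest
    of the key, read as a signed integer, with the sign dropped)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def keys_in_token_range(keys, start_key, end_key):
    """Hypothetical helper: return the keys whose tokens lie between the
    tokens of start_key and end_key, in token order."""
    lo, hi = token(start_key), token(end_key)
    if lo > hi:
        # Mirrors the error from the question: the start key's hash
        # sorts after the end key's hash.
        raise ValueError("start key's token sorts after end key's token")
    return sorted((k for k in keys if lo <= token(k) <= hi), key=token)


keys = [
    "1_2012/05/24", "1_2012/05/25", "1_2012/05/26", "1_2012/05/27",
    "67_2012/05/23", "54_2012/05/28",
]

# Token order is what a range scan follows; it is unrelated to the
# lexical order of the keys themselves.
by_token = sorted(keys, key=token)
for k in by_token:
    print(f"{token(k):>40d}  {k}")
```

Because the scan follows token order, any key whose hash happens to fall between the two endpoint hashes is returned, whatever its ticker ID or date. That is consistent with unrelated keys such as 67_2012/05/23 showing up in the result.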

- Pierre



Re: Doubt in Row key range scan

2012-05-28 Thread Alain RODRIGUEZ
You are using the Random Partitioner.

Using the RP is a good thing because you avoid hot spots, but it has
its drawbacks too. You can't scan a slice of rows in key order, because
rows are placed according to the md5 values of their keys.

You should review your data model to use columns to order your data.
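
One way to read this advice, sketched in plain Python rather than against a real Cassandra client: use the ticker ID alone as the row key and the date as the column name, so that the dates of one ticker sit in a single row in sorted order. The WideRow class, the store() helper and the prices are all made up for illustration; they only model the fact that columns inside a row are kept sorted by name.

```python
from bisect import bisect_left, bisect_right


class WideRow:
    """In-memory stand-in for one row whose columns stay sorted by name."""

    def __init__(self):
        self._names = []    # column names, kept in sorted order
        self._values = {}   # column name -> column value

    def insert(self, name, value):
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name <= finish."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


rows = {}  # row key (ticker ID) -> WideRow


def store(ticker_id, date, price):
    rows.setdefault(ticker_id, WideRow()).insert(date, price)


# Illustrative data only, inserted out of order on purpose.
for date, price in [("2012/05/27", 11.2), ("2012/05/23", 10.0),
                    ("2012/05/25", 10.8), ("2012/05/28", 11.5),
                    ("2012/05/24", 10.5), ("2012/05/26", 10.9)]:
    store("1", date, price)
store("67", "2012/05/23", 99.0)

# A column slice on row "1" touches only that ticker and returns the
# dates in order, which is the result the original question asks for.
result = rows["1"].slice("2012/05/24", "2012/05/27")
print(result)
```

This works because dates written as yyyy/mm/dd sort the same way lexically and chronologically. With the real Thrift API the equivalent would be a column slice on a single row instead of a row key range scan.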

Alain



RE: Doubt in Row key range scan

2012-05-28 Thread Prakrati Agrawal
Could you please tell me how to tweak my data model to rely on a secondary index?
Thank you


Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com



Re: Doubt in Row key range scan

2012-05-28 Thread Luís Ferreira
Check this out: http://www.anuff.com/2011/02/indexing-in-cassandra.html#more

Or just google for wide row indexes.

Regards,
Luís Ferreira