Re: Backup solution

2013-03-15 Thread Rene Kochen
Thank you. I have a high bandwidth connection. But that also means that
regular repairs on the backup data-center will take a long time.

2013/3/14 Jabbar Azam aja...@gmail.com

 Hello,

 If the live data centre disappears, restoring the data from the backup is
 going to take ages, especially if the data is going from one data centre to
 another, unless you have a high bandwidth connection between data centres
 or you have a small amount of data.

 Jabbar Azam
 On 14 Mar 2013 14:31, Rene Kochen rene.koc...@schange.com wrote:

 Hi all,

 Is the following a good backup solution?

 Create two data-centers:

 - A live data-center with multiple nodes (commodity hardware). Clients
 connect to this cluster with LOCAL_QUORUM.
 - A backup data-center with 1 node (with fast SSDs). Clients do not
 connect to this cluster. Cluster only used for creating and storing
 snapshots.

 Advantages:

 - No snapshots and bulk network I/O (transfer snapshots) needed on the
 live cluster.
 - Clients are not slowed down because writes to the backup data-center
 are async.
 - On the backup cluster snapshots are made on a regular basis. This again
 does not affect the live cluster.
 - The back-up cluster does not need to process client requests/reads, so
 we need fewer machines for the backup cluster than for the live cluster.

 Are there any disadvantages with this approach?

 Thanks!
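
 (For reference, a minimal sketch of how such a two data-center layout is
 usually declared, assuming Cassandra 1.2 CQL3 syntax; the keyspace name,
 the data-center names "live" and "backup", and the replication factors are
 only placeholders:

 CREATE KEYSPACE my_keyspace
   WITH replication = {'class': 'NetworkTopologyStrategy',
                       'live': 3,
                       'backup': 1};

 Clients would then read and write at LOCAL_QUORUM against the "live"
 data-center, while the single "backup" node receives the same writes
 asynchronously through normal replication.)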




Re: cql query not giving any result.

2013-03-15 Thread Kuldeep Mishra
Hi,
Is it possible in Cassandra to have multiple columns with the same name? In
this particular scenario I have two columns with the same name, "key": the
first one is the row key and the second one is a column name.


Thanks and Regards
Kuldeep

On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote:


 Hi,
 The following CQL query is not returning any result:
 cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep';

I have enabled secondary indexes on both columns.

 Screen shot is attached

 Please help


 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199




-- 
Thanks and Regards
Kuldeep Kumar Mishra
+919540965199


Re: cql query not giving any result.

2013-03-15 Thread Jason Wee
Here is a list of keywords and whether or not the words are reserved. A
reserved keyword cannot be used as an identifier unless you enclose the
word in double quotation marks. Non-reserved keywords have a specific
meaning in certain context but can be used as an identifier outside this
context.

http://www.datastax.com/docs/1.2/cql_cli/cql_lexicon#cql-keywords
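
For example, a column whose name collides with a reserved word has to be
double-quoted everywhere it appears. A small sketch (the table and column
names here are made up; "select" is used only because it is unambiguously
reserved):

cqlsh> CREATE TABLE users (id text PRIMARY KEY, "select" text);
cqlsh> SELECT "select" FROM users WHERE id = 'kuldeep';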


On Fri, Mar 15, 2013 at 6:43 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote:

 Hi,
 Is it possible in Cassandra to have multiple columns with the same name? In
 this particular scenario I have two columns with the same name, "key": the
 first one is the row key and the second one is a column name.


 Thanks and Regards
 Kuldeep


 On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra 
 kuld.cs.mis...@gmail.com wrote:


 Hi,
 The following CQL query is not returning any result:
 cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep';

I have enabled secondary indexes on both columns.

 Screen shot is attached

 Please help


 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199




 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199



Re: cql query not giving any result.

2013-03-15 Thread Sylvain Lebresne
On Fri, Mar 15, 2013 at 11:43 AM, Kuldeep Mishra
kuld.cs.mis...@gmail.com wrote:

 Hi,
 Is it possible in Cassandra to have multiple columns with the same name? In
 this particular scenario I have two columns with the same name, "key": the
 first one is the row key and the second one is a column name.


No, it shouldn't be possible and that is your problem. How did you create
that table?

--
Sylvain



 Thanks and Regards
 Kuldeep


 On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra 
 kuld.cs.mis...@gmail.com wrote:


 Hi,
 The following CQL query is not returning any result:
 cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep';

I have enabled secondary indexes on both columns.

 Screen shot is attached

 Please help


 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199




 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199



Re: cql query not giving any result.

2013-03-15 Thread Kuldeep Mishra
Hi Sylvain,
  I created it using thrift client, here is column family creation
script,

Cassandra.Client client;
CfDef user_Def = new CfDef();
user_Def.name = "DOCTOR";
user_Def.keyspace = "KunderaExamples";
user_Def.setComparator_type("UTF8Type");
user_Def.setDefault_validation_class("UTF8Type");
user_Def.setKey_validation_class("UTF8Type");
ColumnDef key = new ColumnDef(ByteBuffer.wrap("KEY".getBytes()),
"UTF8Type");
key.index_type = IndexType.KEYS;
ColumnDef age = new ColumnDef(ByteBuffer.wrap("AGE".getBytes()),
"UTF8Type");
age.index_type = IndexType.KEYS;
user_Def.addToColumn_metadata(key);
user_Def.addToColumn_metadata(age);

client.set_keyspace("KunderaExamples");
client.system_add_column_family(user_Def);


Thanks
KK

On Fri, Mar 15, 2013 at 4:24 PM, Sylvain Lebresne sylv...@datastax.com wrote:

 On Fri, Mar 15, 2013 at 11:43 AM, Kuldeep Mishra kuld.cs.mis...@gmail.com
  wrote:

 Hi,
 Is it possible in Cassandra to have multiple columns with the same name?
 In this particular scenario I have two columns with the same name, "key": the
 first one is the row key and the second one is a column name.


 No, it shouldn't be possible and that is your problem. How did you create
 that table?

 --
 Sylvain



 Thanks and Regards
 Kuldeep


 On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com
  wrote:


 Hi,
 The following CQL query is not returning any result:
 cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep';

I have enabled secondary indexes on both columns.

 Screen shot is attached

 Please help


 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199




 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199





-- 
Thanks and Regards
Kuldeep Kumar Mishra
+919540965199


Re: cql query not giving any result.

2013-03-15 Thread Vivek Mishra
OK. So it's a case where CQL returns the row key value as "key" and there is
also a column present with the name "key".

Sounds like a bug?

-Vivek

On Fri, Mar 15, 2013 at 5:17 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote:

 Hi Sylvain,
   I created it using thrift client, here is column family creation
 script,

 Cassandra.Client client;
 CfDef user_Def = new CfDef();
 user_Def.name = "DOCTOR";
 user_Def.keyspace = "KunderaExamples";
 user_Def.setComparator_type("UTF8Type");
 user_Def.setDefault_validation_class("UTF8Type");
 user_Def.setKey_validation_class("UTF8Type");
 ColumnDef key = new ColumnDef(ByteBuffer.wrap("KEY".getBytes()),
 "UTF8Type");
 key.index_type = IndexType.KEYS;
 ColumnDef age = new ColumnDef(ByteBuffer.wrap("AGE".getBytes()),
 "UTF8Type");
 age.index_type = IndexType.KEYS;
 user_Def.addToColumn_metadata(key);
 user_Def.addToColumn_metadata(age);

 client.set_keyspace("KunderaExamples");
 client.system_add_column_family(user_Def);


 Thanks
 KK


 On Fri, Mar 15, 2013 at 4:24 PM, Sylvain Lebresne sylv...@datastax.com wrote:

 On Fri, Mar 15, 2013 at 11:43 AM, Kuldeep Mishra 
 kuld.cs.mis...@gmail.com wrote:

 Hi,
 Is it possible in Cassandra to have multiple columns with the same name?
 In this particular scenario I have two columns with the same name, "key": the
 first one is the row key and the second one is a column name.


 No, it shouldn't be possible and that is your problem. How did you
 create that table?

 --
 Sylvain



 Thanks and Regards
 Kuldeep


 On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra 
 kuld.cs.mis...@gmail.com wrote:


 Hi,
 The following CQL query is not returning any result:
 cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep';

I have enabled secondary indexes on both columns.

 Screen shot is attached

 Please help


 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199




 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199





 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199



Re: Backup solution

2013-03-15 Thread Aaron Turner
On Fri, Mar 15, 2013 at 3:12 AM, Rene Kochen
rene.koc...@emea.schange.com wrote:
 Thank you. I have a high bandwidth connection. But that also means that
 regular repairs on the backup data-center will take a long time.



Honestly, at this point I don't think anyone can provide you any good
feedback based on facts because so far you haven't given us any facts.
 Like:

1. How big of a data set?
2. How many nodes in your primary DC?
3. How many transactions/sec is your primary DC doing?
4. What are your uptime SLA's?
5. Just how fast is "high bandwidth"? How much latency?

Anyways, will it work?  Possibly.  What are the disadvantages?  Well
it depends on a bunch of things you haven't told us.



-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
carpe diem quam minimum credula postero


Re: Backup solution

2013-03-15 Thread Philip O'Toole
You can consider using a WAN optimization appliance such as a Riverbed 
Steelhead to significantly speed up your transfers, though that will cost. It 
is a common approach to speed up inter-datacenter transfers. Steelheads for the 
AWS EC2 cloud are also available. 

(Disclaimer: I used to write software for the physical and AWS Steelheads.)

Philip

On Mar 15, 2013, at 9:22 AM, Aaron Turner synfina...@gmail.com wrote:

 On Fri, Mar 15, 2013 at 3:12 AM, Rene Kochen
 rene.koc...@emea.schange.com wrote:
 Thank you. I have a high bandwidth connection. But that also means that
 regular repairs on the backup data-center will take a long time.
 
 
 
 Honestly, at this point I don't think anyone can provide you any good
 feedback based on facts because so far you haven't given us any facts.
 Like:
 
 1. How big of a data set?
 2. How many nodes in your primary DC?
 3. How many transactions/sec is your primary DC doing?
 4. What are your uptime SLA's?
 5. Just how fast is "high bandwidth"? How much latency?
 
 Anyways, will it work?  Possibly.  What are the disadvantages?  Well
 it depends on a bunch of things you haven't told us.
 
 
 
 -- 
 Aaron Turner
 http://synfin.net/ Twitter: @synfinatic
 http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
 Windows
 Those who would give up essential Liberty, to purchase a little temporary
 Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
 carpe diem quam minimum credula postero


Re: cassandra 1.2.2 build generates slightly different than one on website

2013-03-15 Thread Hiller, Dean
Very weird.  I went back and tried to reproduce it after cleaning all changes 
from git.  I am not sure how that got deleted nor how I ended up with wordcount 
*.class files since I am not doing any map/reduce or anything…..oh well, must 
have made a mistake somewhere.

Thanks,
Dean

From: Sylvain Lebresne sylv...@datastax.com
Reply-To: user@cassandra.apache.org
Date: Friday, March 15, 2013 11:51 AM
To: user@cassandra.apache.org
Subject: Re: cassandra 1.2.2 build generates slightly different than one on 
website

I suspect you are doing something wrong because both the released archive (I 
just checked) and the tag (as shown here: 
https://github.com/apache/cassandra/blob/cassandra-1.2.2/bin/cassandra.in.sh) 
have the cassandra.in.sh file.

--
Sylvain


On Fri, Mar 15, 2013 at 6:11 PM, Hiller, Dean 
dean.hil...@nrel.gov wrote:
On git, I checked out tag 1.2.2 and built it and then tar -xvf the bin distro, 
but it

 1.  Has extra *.class files in the apache-cassandra-1.2.2-SNAPSHOT/bin directory
 2.  Is missing the cassandra.in.sh, so it would not 
actually start properly.

The second one took me a while to figure out.  This makes me unsure whether the tag 
actually matches what was released.

Thanks,
Dean



Re: Can't replace dead node

2013-03-15 Thread Andrey Ilinykh
I removed Priam and get the same picture.


What I do is: I added two lines to cassandra-env.sh and started Cassandra.

JVM_OPTS="$JVM_OPTS -Dcassandra.initial_token=aaba"
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_token=aaba"

Then I can successfully run the ring command:


Note: Ownership information does not include topology, please specify a
keyspace.
Address         DC       Rack  Status  State   Load       Owns    Token
                                                                   Token(bytes[aaba])
10.28.241.14    us-east  1a    Up      Normal  251.96 GB  33.33%  Token(bytes[0010])
10.240.119.230  us-east  1b    Up      Normal  252.48 GB  33.33%  Token(bytes[5565])
10.147.174.27   us-east  1c    Up      Normal  11.26 KB   33.33%  Token(bytes[aaba])

It shows the current node as part of the ring, but it is empty. In the data
directory I can see only the system keyspace.

There are no errors in the log file. It just doesn't stream data from other
nodes.
I can launch 1.1.6 but not 1.1.7 or higher. Any ideas what was changed in
1.1.7?

Thank you,
  Andrey



INFO [main] 2013-03-15 18:20:45,303 AbstractCassandraDaemon.java (line 101)
Logging initialized
 INFO [main] 2013-03-15 18:20:45,309 AbstractCassandraDaemon.java (line
122) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_35
 INFO [main] 2013-03-15 18:20:45,310 AbstractCassandraDaemon.java (line
123) Heap size: 1931476992/1931476992
 INFO [main] 2013-03-15 18:20:45,311 AbstractCassandraDaemon.java (line
124) Classpath:
/opt/apache-cassandra-1.1.10/bin/../conf:/opt/apache-cassandra-1.1.10/bin/../build/classes/main:/opt/apache-cassandra-1.1.10/bin/../build/classes/thrift:/opt/apache-cassandra-1.1.10/bin/../lib/antlr-3.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/apache-cassandra-1.1.10.jar:/opt/apache-cassandra-1.1.10/bin/../lib/apache-cassandra-clientutil-1.1.10.jar:/opt/apache-cassandra-1.1.10/bin/../lib/apache-cassandra-thrift-1.1.10.jar:/opt/apache-cassandra-1.1.10/bin/../lib/avro-1.4.0-fixes.jar:/opt/apache-cassandra-1.1.10/bin/../lib/avro-1.4.0-sources-fixes.jar:/opt/apache-cassandra-1.1.10/bin/../lib/commons-cli-1.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/commons-codec-1.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/commons-lang-2.4.jar:/opt/apache-cassandra-1.1.10/bin/../lib/compress-lzf-0.8.4.jar:/opt/apache-cassandra-1.1.10/bin/../lib/concurrentlinkedhashmap-lru-1.3.jar:/opt/apache-cassandra-1.1.10/bin/../lib/guava-r08.jar:/opt/apache-cassandra-1.1.10/bin/../lib/high-scale-lib-1.1.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jackson-core-asl-1.9.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jackson-mapper-asl-1.9.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jamm-0.2.5.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jline-0.9.94.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jna.jar:/opt/apache-cassandra-1.1.10/bin/../lib/json-simple-1.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/libthrift-0.7.0.jar:/opt/apache-cassandra-1.1.10/bin/../lib/log4j-1.2.16.jar:/opt/apache-cassandra-1.1.10/bin/../lib/metrics-core-2.0.3.jar:/opt/apache-cassandra-1.1.10/bin/../lib/priam.jar:/opt/apache-cassandra-1.1.10/bin/../lib/servlet-api-2.5-20081211.jar:/opt/apache-cassandra-1.1.10/bin/../lib/slf4j-api-1.6.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/snakeyaml-1.6.jar:/opt/apache-cassandra-1.1.10/bin/../lib/snappy-java-1.0.4.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/snaptree-0.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jamm-0.2.5.jar
 INFO [main] 2013-03-15 18:20:47,406 CLibrary.java (line 111) JNA mlockall
successful
 INFO [main] 2013-03-15 18:20:47,419 DatabaseDescriptor.java (line 123)
Loading settings from file:/opt/apache-cassandra-1.1.10/conf/cassandra.yaml
 INFO [main] 2013-03-15 18:20:47,840 DatabaseDescriptor.java (line 182)
DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
 INFO [main] 2013-03-15 18:20:47,853 DatabaseDescriptor.java (line 246)
Global memtable threshold is enabled at 614MB
 INFO [main] 2013-03-15 18:20:47,879 Ec2Snitch.java (line 66) EC2Snitch
using region: us-east, zone: 1c.
 INFO [main] 2013-03-15 18:20:48,359 CacheService.java (line 96)
Initializing key cache with capacity of 92 MBs.
 INFO [main] 2013-03-15 18:20:48,376 CacheService.java (line 107)
Scheduling key cache save to each 14400 seconds (going to save all keys).
 INFO [main] 2013-03-15 18:20:48,377 CacheService.java (line 121)
Initializing row cache with capacity of 0 MBs and provider
org.apache.cassandra.cache.SerializingCacheProvider
 INFO [main] 2013-03-15 18:20:48,384 CacheService.java (line 133)
Scheduling row cache save to each 0 seconds (going to save all keys).
 INFO [main] 2013-03-15 18:20:48,661 DatabaseDescriptor.java (line 509)
Couldn't detect any schema definitions in local storage.

Re: HintedHandoff IOError?

2013-03-15 Thread Janne Jalkanen

JMX ended up just with lots more IOErrors. Did a rolling restart of the cluster 
and removed the HH family in the meantime. That seemed to do the trick. Thanks!

/Janne

On Mar 14, 2013, at 06:58 , aaron morton aa...@thelastpickle.com wrote:

 What is the sanctioned way of removing hints? rm -f HintsColumnFamily*? 
 Truncate from CLI?
 There is a JMX command to do it for a particular node. 
 But if you just want to remove all of them, stop and delete the files. 
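 
 For example, something like this once the node is stopped (the data directory
 below is just the typical default location; adjust to your cassandra.yaml):
 
 rm /var/lib/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-*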
 
 the only ones with zero size are the -tmp- files.  It seems odd…
 Temp files are created during compaction and flushing sstables. 
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 11/03/2013, at 11:19 PM, Janne Jalkanen janne.jalka...@ecyrd.com wrote:
 
 
 Oops, forgot to mention that, did I… Cass 1.1.10. 
 
 What is the sanctioned way of removing hints? rm -f HintsColumnFamily*? 
 Truncate from CLI?
 
 This is ls -l of my /system/HintsColumnFamily/ btw - the only ones with zero 
 size are the -tmp- files.  It seems odd…
 
 -rw-rw-r--  1 ubuntu ubuntu 86373144 Jan 26 21:39 
 system-HintsColumnFamily-hf-11-Data.db
 -rw-rw-r--  1 ubuntu ubuntu   80 Jan 26 21:39 
 system-HintsColumnFamily-hf-11-Digest.sha1
 -rw-rw-r--  1 ubuntu ubuntu  976 Jan 26 21:39 
 system-HintsColumnFamily-hf-11-Filter.db
 -rw-rw-r--  1 ubuntu ubuntu   11 Jan 26 21:39 
 system-HintsColumnFamily-hf-11-Index.db
 -rw-rw-r--  1 ubuntu ubuntu 4348 Jan 26 21:39 
 system-HintsColumnFamily-hf-11-Statistics.db
 -rw-rw-r--  1 ubuntu ubuntu  569 Feb 27 08:33 
 system-HintsColumnFamily-hf-23-Data.db
 -rw-rw-r--  1 ubuntu ubuntu   80 Feb 27 08:33 
 system-HintsColumnFamily-hf-23-Digest.sha1
 -rw-rw-r--  1 ubuntu ubuntu 1936 Feb 27 08:33 
 system-HintsColumnFamily-hf-23-Filter.db
 -rw-rw-r--  1 ubuntu ubuntu   11 Feb 27 08:33 
 system-HintsColumnFamily-hf-23-Index.db
 -rw-rw-r--  1 ubuntu ubuntu 4356 Feb 27 08:33 
 system-HintsColumnFamily-hf-23-Statistics.db
 -rw-rw-r--  1 ubuntu ubuntu  5500155 Feb 27 08:57 
 system-HintsColumnFamily-hf-24-Data.db
 -rw-rw-r--  1 ubuntu ubuntu   80 Feb 27 08:57 
 system-HintsColumnFamily-hf-24-Digest.sha1
 -rw-rw-r--  1 ubuntu ubuntu   16 Feb 27 08:57 
 system-HintsColumnFamily-hf-24-Filter.db
 -rw-rw-r--  1 ubuntu ubuntu   26 Feb 27 08:57 
 system-HintsColumnFamily-hf-24-Index.db
 -rw-rw-r--  1 ubuntu ubuntu 4340 Feb 27 08:57 
 system-HintsColumnFamily-hf-24-Statistics.db
 -rw-rw-r--  1 ubuntu ubuntu0 Feb 27 08:57 
 system-HintsColumnFamily-tmp-hf-25-Data.db
 -rw-rw-r--  1 ubuntu ubuntu0 Feb 27 08:57 
 system-HintsColumnFamily-tmp-hf-25-Index.db
 
 
 /Janne
 
 On Mar 12, 2013, at 08:07 , aaron morton aa...@thelastpickle.com wrote:
 
 What version of cassandra are you using?
 I would stop each node and delete the hints. If it happens again it could 
 either indicate a failing disk or a bug. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 11/03/2013, at 2:13 PM, Robert Coli robert.d.a.c...@gmail.com wrote:
 
 On Mon, Mar 11, 2013 at 7:05 AM, Janne Jalkanen
 janne.jalka...@ecyrd.com wrote:
 I keep seeing these in my log.  Three-node cluster, one node is working 
 fine, but two other nodes have increased latencies and these in the error 
 logs (might of course be unrelated). No obvious GC pressure, no disk 
 errors that I can see.  Ubuntu 12.04 on EC2, Java 7. Repair is run 
 regularly.
 
 My two questions: 1) should I worry, and 2) what might be going on, and 
 3) is there any way to get rid of these? Can I just blow my HintedHandoff 
 table to smithereens?
 
 http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/io/sstable/IndexHelper.java
 
 public static Filter defreezeBloomFilter(FileDataInput file, long
 maxSize, boolean useOldBuffer) throws IOException
 {
     int size = file.readInt();
     if (size > maxSize || size <= 0)
         throw new EOFException("bloom filter claims to be " + size
             + " bytes, longer than entire row size " + maxSize);
     ByteBuffer bytes = file.readBytes(size);
 
 
 Based on the above, I would suspect either a zero byte -Filter.db file
 or a corrupt one. Probably worry a little bit, but only a little bit
 unless your cluster is RF=1.
 
 =Rob
 
 
 



secondary index problem

2013-03-15 Thread Brett Tinling
We have a CF with an indexed column 'type', but we get incomplete results when 
we query that CF for all rows matching 'type'.  We can find the missing rows if 
we query by key.

 * we are seeing this on a small, single node, 1.2.2 instance with few rows.
 * we use thrift execute_cql_query, no CL is specified
 * none of repair, restart, compact, scrub helped
 
 Finally, nodetool rebuild_index fixed it.  
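 
 The invocation looks roughly like this (keyspace, column family and index
 names are placeholders; the last argument is the index name as reported in
 the schema):
 
 nodetool -h localhost rebuild_index MyKeyspace MyColumnFamily MyColumnFamily.type_idx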
 
 Is index rebuild something we need to do periodically?  How often?  Is there a 
way to know when it needs to be done?  Do we have to run rebuild on all nodes?
 
 We have not noticed this until 1.2
 
 Regards,
  - Brett
 

 
 






Secondary Indexes

2013-03-15 Thread Andy Stec
We need to provide search capability based on a field that is a bitmap
combination of 18 possible values. We want to use secondary indexes to
improve performance. One possible solution is to create a named column for
each value and have a secondary index for each of the 18 columns.
Questions we have are:


- Will that result in Cassandra creating 18 new column families,
one for each index?

- If a given column is not specified in any rows, will Cassandra
still create an index column family?

- The documentation says that indexes are rebuilt with every
Cassandra restart. Why is that needed? What does the rebuild do? Does it
read the whole column family into memory at once?
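
A rough CQL3 sketch of the named-column approach (table, column and index
names here are made up, and only two of the 18 flags are shown):

CREATE TABLE items (
    id      text PRIMARY KEY,
    flag_01 boolean,
    flag_02 boolean
    -- ... one boolean column per possible value, up to flag_18
);
CREATE INDEX items_flag_01_idx ON items (flag_01);
CREATE INDEX items_flag_02_idx ON items (flag_02);

SELECT * FROM items WHERE flag_01 = true;

As far as we understand, each such index is maintained internally as its own
index column family, which is what the first question above is getting at.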


Re: secondary index problem

2013-03-15 Thread Janne Jalkanen

This could be either of the following bugs (which might be the same thing).  I 
get it too every time I recycle a node on 1.1.10.

https://issues.apache.org/jira/browse/CASSANDRA-4973
or
https://issues.apache.org/jira/browse/CASSANDRA-4785

/Janne

On Mar 15, 2013, at 23:24 , Brett Tinling btinl...@lacunasystems.com wrote:

 We have a CF with an indexed column 'type', but we get incomplete results 
 when we query that CF for all rows matching 'type'.  We can find the missing 
 rows if we query by key.
 
 * we are seeing this on a small, single node, 1.2.2 instance with few rows.
 * we use thrift execute_cql_query, no CL is specified
 * none of repair, restart, compact, scrub helped
 
 Finally, nodetool rebuild_index fixed it.  
 
 Is index rebuild something we need to do periodically?  How often?  Is there 
 a way to know when it needs to be done?  Do we have to run rebuild on all 
 nodes?
 
 We have not noticed this until 1.2
 
 Regards,
  - Brett
 
 
 
 
 
 



java.lang.OutOfMemoryError: unable to create new native thread

2013-03-15 Thread S C
I have a Cassandra node that is going down frequently with 
'java.lang.OutOfMemoryError: unable to create new native thread'. It's a 16GB VM 
out of which 4GB is set as Xmx and there are no other processes running on the 
VM.  I have about 300 clients connecting to this node on average. I have no 
indication from vmstats/SAR that my VM has used more memory or is memory 
hungry. It doesn't indicate a memory issue to me. Appreciate any pointers to this.
System Specs: 2 CPU, 16GB, RHEL 6.2
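
(This particular error is usually about the OS per-user thread/process limit,
or native memory for thread stacks, rather than the Java heap. A quick sketch
of how to check, assuming the node runs as a "cassandra" user on RHEL:

  # total threads currently owned by the cassandra user
  ps -u cassandra -o nlwp= | awk '{sum += $1} END {print sum}'
  # the per-user limit in effect (run in a shell as that same user)
  ulimit -u

On RHEL 6 the default nproc cap in /etc/security/limits.d/90-nproc.conf is a
common culprit and may need to be raised for the cassandra user.)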


Thank you.

Cassandra Compression and Wide Rows

2013-03-15 Thread Drew Kutcharian
Hey Guys,

I remember reading somewhere that C* compression is not very effective when 
most of the CFs are in wide-row format and some folks turn the compression off 
and use disk level compression as a workaround. Considering that wide rows with 
composites are first class citizens in CQL3, is this still the case? Have 
there been any improvements on this?

Thanks,

Drew
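
For reference, compression is configured per table, so it is easy to compare
with it on and off. A sketch of the CQL3 options (made-up table definition and
an illustrative chunk size; chunk_length_kb is the knob most often tuned when
rows are wide):

CREATE TABLE events (
    key   text,
    col   text,
    value blob,
    PRIMARY KEY (key, col)
) WITH compression = {'sstable_compression': 'SnappyCompressor',
                      'chunk_length_kb': 64};

-- or disable it for the table entirely:
ALTER TABLE events WITH compression = {'sstable_compression': ''};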

Waiting on read repair?

2013-03-15 Thread Jasdeep Hundal
I've got a couple of questions related to issues I'm encountering using
Cassandra under a heavy write load:

1. With a ConsistencyLevel of quorum, does
FBUtilities.waitForFutures() wait for read repair to complete before
returning?
2. When read repair applies a mutation, it needs to obtain a lock for
the associated memtable. Does compaction obtain this same lock? (I'm
asking because I've seen readrepair spend a few seconds stalling in
org.apache.cassandra.db.Table.apply).

Thanks,
Jasdeep


Re: Backup solution

2013-03-15 Thread Aaron Turner
On Fri, Mar 15, 2013 at 10:35 AM, Rene Kochen
rene.koc...@emea.schange.com wrote:
 Hi Aaron,

 We have many deployments, but typically:

 - Live cluster of six nodes, replication factor = 3.
 - A node processes more reads than writes (approximately 100 get_slices
 per second, narrow rows).
 - Data per node is about 50 to 100 GBytes.
 - We should recover within 4 hours.

 The idea is to put the backup cluster close to the live cluster with a
 gigabit connection only for Cassandra.

100 reads/sec/node doesn't sound like a lot to me... And 100G/node is
far below the recommended limit.  Sounds to me like you've possibly over
spec'd your cluster (not a bad thing, just an observation).  Of
course, if your data set is growing, then...

That said, I wouldn't consider a single node in a 2nd DC receiving
updates via Cassandra a backup.  That's because a bug in cassandra
which corrupts your data or a user accidentally doing the wrong thing
(like issuing deletes they shouldn't) means that gets replicated to
all your nodes- including the one in the other DC.

A real backup would be to take snapshots on the nodes and then copy
them off the cluster.
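
A sketch of what that looks like in practice (host, keyspace, tag and paths
below are placeholders; the snapshot directory layout is the usual default):

nodetool -h localhost snapshot MyKeyspace -t backup_2013_03_15
# snapshots land under the data directory, per column family:
#   /var/lib/cassandra/data/MyKeyspace/<cf>/snapshots/backup_2013_03_15/
rsync -aR /var/lib/cassandra/data/MyKeyspace/*/snapshots/backup_2013_03_15/ \
    backuphost:/backups/$(hostname)/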

I'd say replication is good if you want a hot-standby for a disaster
recovery site so you can quickly recover from a hardware fault.
Especially if you have a 4hr SLA, how are you going to get your
primary DC back up after a fire, earthquake, etc in 4 hours?  Heck, a
switch failure might knock you out for 4 hours depending on how
quickly you can swap another one in and how recent your config backups
are.

Better to have a DR site with a smaller set of nodes with the data
ready to go.  Maybe they won't be as fast, but hopefully you can make
sure the most important queries are handled.  But for that, I would
probably go with something more than just a single node in the DR DC.

One thing to remember is that compactions will impact the feasible
single node size to something smaller than you can potentially
allocate disk space for.   Ie: just because you can build a 4TB disk
array, doesn't mean you can have a single Cassandra node with 4TB of
data.  Typically, people around here seem to recommend ~400GB, but
that depends on hardware.

Honestly, for the price of a single computer you could test this
pretty easily.  That's what I'd do.

-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
carpe diem quam minimum credula postero


RE: java.lang.OutOfMemoryError: unable to create new native thread

2013-03-15 Thread S C
I think I figured out where the issue is. I will keep you posted soon.

From: as...@outlook.com
To: user@cassandra.apache.org
Subject: java.lang.OutOfMemoryError: unable to create new native thread
Date: Fri, 15 Mar 2013 17:54:25 -0500




I have a Cassandra node that is going down frequently with 
'java.lang.OutOfMemoryError: unable to create new native thread'. It's a 16GB VM 
out of which 4GB is set as Xmx and there are no other processes running on the 
VM.  I have about 300 clients connecting to this node on average. I have no 
indication from vmstats/SAR that my VM has used more memory or is memory 
hungry. It doesn't indicate a memory issue to me. Appreciate any pointers to this.
System Specs: 2 CPU, 16GB, RHEL 6.2


Thank you.  
  

Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?

2013-03-15 Thread Arya Goudarzi
Thank you very much Aaron. I recall that the logs of this node upgraded to
1.2.2 reported seeing others as dead.
https://issues.apache.org/jira/browse/CASSANDRA-5332 that I should at least
upgrade from 1.1.7. So, I decided to try upgrading to 1.1.10 first before
upgrading to 1.2.2. I am in the middle of troubleshooting some other issues
I had with that upgrade (posted separately), once I am done, I will give
your suggestion a try.

On Mon, Mar 11, 2013 at 10:34 PM, aaron morton aa...@thelastpickle.com wrote:

  Is this just a display bug in nodetool or does this upgraded node really see
 the other ones as dead?
 Is the 1.2.2 node which sees all the others as down processing requests?
 Is it showing the others as down in the log ?

 I'm not really sure what's happening. But you can try starting the 1.2.2
 node with the

 -Dcassandra.load_ring_state=false

 parameter, append it at the bottom of the cassandra-env.sh file. It will
 force the node to get the ring state from the others.
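 
 Concretely, that is one line at the end of cassandra-env.sh, e.g.:
 
 JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"
 
 and it is usually removed again once the node has rejoined cleanly.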

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 8/03/2013, at 10:24 PM, Arya Goudarzi gouda...@gmail.com wrote:

  OK. I upgraded one node from 1.1.6 to 1.2.2 today. Despite some new
 problems that I had and I posted them in a separate email, this issue still
 exists but now it is only on the 1.2.2 node. This means that the nodes running
 1.1.6 see all other nodes including 1.2.2 as Up. Here is the ring and
 gossip from nodes with 1.1.6 for example. Bold denotes upgraded node:
 
  Address      DC       Rack  Status  State   Load      Effective-Ownership  Token
                                                                             141784319550391026443072753098378663700
  XX.180.36    us-east  1b    Up      Normal  49.47 GB  25.00%               1808575600
  XX.231.121   us-east  1c    Up      Normal  47.08 GB  25.00%               7089215977519551322153637656637080005
  XX.177.177   us-east  1d    Up      Normal  33.64 GB  25.00%               14178431955039102644307275311465584410
  XX.7.148     us-east  1b    Up      Normal  41.27 GB  25.00%               42535295865117307932921825930779602030
  XX.20.9      us-east  1c    Up      Normal  38.51 GB  25.00%               49624511842636859255075463585608106435
  XX.86.255    us-east  1d    Up      Normal  34.78 GB  25.00%               56713727820156410577229101240436610840
  XX.63.230    us-east  1b    Up      Normal  38.11 GB  25.00%               85070591730234615865843651859750628460
  XX.163.36    us-east  1c    Up      Normal  44.25 GB  25.00%               92159807707754167187997289514579132865
  XX.31.234    us-east  1d    Up      Normal  44.66 GB  25.00%               99249023685273718510150927169407637270
  XX.132.169   us-east  1b    Up      Normal  44.2 GB   25.00%               127605887595351923798765477788721654890
  XX.71.63     us-east  1c    Up      Normal  38.74 GB  25.00%               134695103572871475120919115443550159295
  XX.197.209   us-east  1d    Up      Normal  41.5 GB   25.00%               141784319550391026443072753098378663700
 
  /XX.71.63
RACK:1c
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:4.1598705272E10
DC:us-east
INTERNAL_IP:XX.194.92
STATUS:NORMAL,134695103572871475120919115443550159295
RPC_ADDRESS:XX.194.92
RELEASE_VERSION:1.1.6
  /XX.86.255
RACK:1d
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:3.734334162E10
DC:us-east
INTERNAL_IP:XX.6.195
STATUS:NORMAL,56713727820156410577229101240436610840
RPC_ADDRESS:XX.6.195
RELEASE_VERSION:1.1.6
  /XX.7.148
RACK:1b
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:4.4316975808E10
DC:us-east
INTERNAL_IP:XX.47.250
STATUS:NORMAL,42535295865117307932921825930779602030
RPC_ADDRESS:XX.47.250
RELEASE_VERSION:1.1.6
  /XX.63.230
RACK:1b
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:4.0918593305E10
DC:us-east
INTERNAL_IP:XX.89.127
STATUS:NORMAL,85070591730234615865843651859750628460
RPC_ADDRESS:XX.89.127
RELEASE_VERSION:1.1.6
  /XX.132.169
RACK:1b
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:4.745883458E10
DC:us-east
INTERNAL_IP:XX.94.161
STATUS:NORMAL,127605887595351923798765477788721654890
RPC_ADDRESS:XX.94.161
RELEASE_VERSION:1.1.6
  /XX.180.36
RACK:1b
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:5.311963027E10
DC:us-east
INTERNAL_IP:XX.123.112
STATUS:NORMAL,1808575600
RPC_ADDRESS:XX.123.112
RELEASE_VERSION:1.1.6
  /XX.163.36
RACK:1c
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:4.7516755022E10
DC:us-east
INTERNAL_IP:XX.163.180
STATUS:NORMAL,92159807707754167187997289514579132865
RPC_ADDRESS:XX.163.180
RELEASE_VERSION:1.1.6
  /XX.31.234
RACK:1d
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575
LOAD:4.7954372912E10
  

Re: cql query not giving any result.

2013-03-15 Thread Vivek Mishra
Any suggestions?
-Vivek

On Fri, Mar 15, 2013 at 5:20 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 OK. So it's a case where CQL returns the row key value as "key" and there is
 also a column present with the name "key".

 Sounds like a bug?

 -Vivek


 On Fri, Mar 15, 2013 at 5:17 PM, Kuldeep Mishra 
 kuld.cs.mis...@gmail.com wrote:

 Hi Sylvain,
   I created it using thrift client, here is column family creation
 script,

 Cassandra.Client client;
 CfDef user_Def = new CfDef();
 user_Def.name = "DOCTOR";
 user_Def.keyspace = "KunderaExamples";
 user_Def.setComparator_type("UTF8Type");
 user_Def.setDefault_validation_class("UTF8Type");
 user_Def.setKey_validation_class("UTF8Type");
 ColumnDef key = new ColumnDef(ByteBuffer.wrap("KEY".getBytes()),
 "UTF8Type");
 key.index_type = IndexType.KEYS;
 ColumnDef age = new ColumnDef(ByteBuffer.wrap("AGE".getBytes()),
 "UTF8Type");
 age.index_type = IndexType.KEYS;
 user_Def.addToColumn_metadata(key);
 user_Def.addToColumn_metadata(age);

 client.set_keyspace("KunderaExamples");
 client.system_add_column_family(user_Def);


 Thanks
 KK


 On Fri, Mar 15, 2013 at 4:24 PM, Sylvain Lebresne 
 sylv...@datastax.com wrote:

 On Fri, Mar 15, 2013 at 11:43 AM, Kuldeep Mishra 
 kuld.cs.mis...@gmail.com wrote:

 Hi,
 Is it possible in Cassandra to have multiple columns with the same name?
 In this particular scenario I have two columns with the same name, "key": the
 first one is the row key and the second one is a column name.


 No, it shouldn't be possible and that is your problem. How did you
 create that table?

 --
 Sylvain



 Thanks and Regards
 Kuldeep


 On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra 
 kuld.cs.mis...@gmail.com wrote:


 Hi,
 The following CQL query is not returning any result:
 cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep';

I have enabled secondary indexes on both columns.

 Screen shot is attached

 Please help


 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199




 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199





 --
 Thanks and Regards
 Kuldeep Kumar Mishra
 +919540965199





Re: hinted handoff disabling trade-offs

2013-03-15 Thread Matt Kap
Thanks Aaron.

I am using CL=ONE. read_repair_chance=0. The part which I'm wondering
about is what happens to the internal Cassandra writes if Hinted
Handoffs are disabled. I think I understand what it means for
application-level data, but the part I'm not entirely sure about is
what it could mean for Cassandra internals.

My cluster is under heavy write load. I'm considering disabling Hinted
Handoffs so the nodes recover quicker in case compactions begin to
back up.
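
For reference, these are the cassandra.yaml knobs involved (names as of the
1.0/1.1 line; the values shown are just illustrative):

hinted_handoff_enabled: false            # turn hint storage off entirely
# or, if hints stay on, bound them instead:
max_hint_window_in_ms: 3600000           # stop hinting for a node down longer than this
hinted_handoff_throttle_delay_in_ms: 1   # pause between hint deliveries on replay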

On Wed, Mar 6, 2013 at 2:06 AM, aaron morton aa...@thelastpickle.com wrote:
 The advantage of HH is that it reduces the probability of a DigestMismatch
 when using a CL > ONE. A DigestMismatch means the read has to run a second
 time before returning to the client.

 - No risk of hinted-handoffs building up
 - No risk of hinted-handoffs flooding a node that just came up

 See the yaml config settings for the max hint window and the throttling.

 Can anyone suggest any other factors that I'm missing here, specifically
 reasons not to do this?

 If you are doing this for performance first make sure your data model is
 efficient, that you are doing the most efficient reads (see my presentation
 here http://www.datastax.com/events/cassandrasummit2012/presentations), and
 your caching is bang on. Then consider if you can tune the CL, and if your
 client is token aware so it directs traffic to a node that has it.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/03/2013, at 9:19 PM, Michael Kjellman mkjell...@barracuda.com wrote:

 Also, if you have enough hints being created that it's significantly
 impacting your heap, I have a feeling things are going to get out of sync
 very quickly.

 On Mar 4, 2013, at 9:17 PM, Wz1975 wz1...@yahoo.com wrote:

 Why do you think disabling hinted handoff will improve memory usage?


 Thanks.
 -Wei

 Sent from my Samsung smartphone on AT&T


  Original message 
 Subject: Re: hinted handoff disabling trade-offs
 From: Michael Kjellman mkjell...@barracuda.com
 To: user@cassandra.apache.org user@cassandra.apache.org
 CC:


 Repair is slow.

 On Mar 4, 2013, at 8:07 PM, Matt Kap matvey1...@gmail.com wrote:

 I am looking to get a second opinion about disabling hinted-handoffs. I
 have an application that can tolerate a fair amount of inconsistency
 (advertising domain), and so I'm weighing the pros and cons of hinted
 handoffs. I'm running Cassandra 1.0, looking to upgrade to 1.1 soon.

 Pros of disabling hinted handoffs:
 - Reduces heap
 - Improves GC performance
 - No risk of hinted-handoffs building up
 - No risk of hinted-handoffs flooding a node that just came up

 Cons
 - Some writes can be lost, at least until repair runs

 Can anyone suggest any other factors that I'm missing here, specifically
 reasons not to do this?

 Cheers!
 -Matt






-- 
www.calcmachine.com - easy online calculator.