Tables showing up as our_table-147a2090ed4211e480153bc81e542ebd/ in data dir

2015-04-28 Thread Donald Smith
Using 2.1.4, tables in our data/ directory are showing up as


our_table-147a2090ed4211e480153bc81e542ebd/


instead of as


 our_table/


Why would that happen? We're also seeing lagging compactions and high CPU usage.


 Thanks, Don


Re: Best Practice to add a node in a Cluster

2015-04-28 Thread Neha Trivedi
Interesting, Eric!!!
Not sure if this would be allowed: alter the keyspace to RF=3 and then add a
node.

On Tue, Apr 28, 2015 at 8:54 PM, Eric Stevens  wrote:

> I would double check in a test cluster (or with a tool like CCM to set up
> a local throwaway cluster to confirm), but for this *specific* use case
> (going from RF==NodeCount to RF==NodeCount with a higher number) you should
> be able to have a simpler path.  Set RF=3 before you add your new node,
> then add the new node.  It will bootstrap all data from the other two
> nodes, then your job is done.
>
> You shouldn't have to run repair (which you normally have to do after
> increasing RF in order to make sure all nodes have their data - the nodes
> already have all their data), and you shouldn't have to run cleanup (which
> you normally have to do after increasing node count to instruct the old
> nodes to forget data for which they are no longer responsible).  The data
> responsibility hasn't changed for any node, all nodes are still responsible
> for all data.
>
> On Mon, Apr 27, 2015 at 9:19 PM, Neha Trivedi 
> wrote:
>
>> Thanks, Arun!
>>
>> On Tue, Apr 28, 2015 at 9:44 AM, arun sirimalla 
>> wrote:
>>
>>> Hi Neha,
>>>
>>>
>>> After you add the node to the cluster, run nodetool cleanup on all nodes.
>>> Next, running repair on each node will replicate the data. Make sure you
>>> run the repair on one node at a time, because repair is an expensive
>>> process (it utilizes high CPU).
>>>
>>>
>>>
>>>
>>> On Mon, Apr 27, 2015 at 8:36 PM, Neha Trivedi 
>>> wrote:
>>>
 Thanks Eric and Matt :) !!

 Yes the purpose is to improve reliability.
 Right now, from our driver we are querying using degradePolicy for
 reliability.



 *For changing the keyspace for RF=3, the procedure is as under:*
 1. Add a new node to the cluster (new node is not in seed list)

 2. ALTER KEYSPACE system_auth WITH REPLICATION =
   {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};


 3. On each affected node, run nodetool repair.

 4. Wait until repair completes on a node, then move to the next node.


 Any other things to take care?

 Thanks
 Regards
 neha


 On Mon, Apr 27, 2015 at 9:45 PM, Eric Stevens 
 wrote:

> It depends on why you're adding a new node.  If you're running out of
> disk space or IO capacity in your 2 node cluster, then changing RF to 3
> will not improve either condition - you'd still be writing all data to all
> three nodes.
>
> However if you're looking to improve reliability, a 2 node RF=2
> cluster cannot have either node offline without losing quorum, while a 3
> node RF=3 cluster can have one node offline and still be able to achieve
> quorum.  RF=3 is a common replication factor because of this 
> characteristic.
>
> Make sure your new node is not in its own seeds list, or it will not
> bootstrap (it will come online immediately and start serving requests).
>
> On Mon, Apr 27, 2015 at 8:46 AM, Neha Trivedi 
> wrote:
>
>> Hi
>> We have a 2 Cluster Node with RF=2. We are planing to add a new node.
>>
>> Should we change RF to 3 in the schema?
>> OR Just added a new node with the same RF=2?
>>
>> Any other Best Practice that we need to take care?
>>
>> Thanks
>> regards
>> Neha
>>
>>
>

>>>
>>>
>>> --
>>> Arun
>>> Senior Hadoop/Cassandra Engineer
>>> Cloudwick
>>>
>>> Champion of Big Data (Cloudera)
>>>
>>> http://www.cloudera.com/content/dev-center/en/home/champions-of-big-data.html
>>>
>>> 2014 Data Impact Award Winner (Cloudera)
>>>
>>> http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html
>>>
>>>
>>
>


Re: How to use nodetool ring only for one data center

2015-04-28 Thread Surbhi Gupta
When we have multiple datacenters, the datacenter (like DC1) in the output is
not easy to associate with each node, the way the rack is.
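
For reference, one possible shape for the quick shell filter suggested in the
reply quoted below. This is only a sketch: it assumes this version's nodetool
ring output prints a "Datacenter: <name>" header above each data center's
section (check your own output first), and DC1 is a placeholder name.

  # Print only the ring section for data center DC1 (placeholder name).
  # Assumes each DC's section starts with a "Datacenter: <name>" header line.
  nodetool ring | awk '/^Datacenter: DC1$/ {show=1; next} /^Datacenter:/ {show=0} show'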

On 28 April 2015 at 16:30, Rahul Neelakantan  wrote:

> Do you want this for some sort of reporting requirement? If so, you may be
> able to write a quick shell script using grep to remove the unwanted data
>
> Rahul
>
> > On Apr 28, 2015, at 7:24 PM, Surbhi Gupta 
> wrote:
> >
> > Hi,
> >
> > I wanted to know how we can get the token ring information for only one
> > data center when using vnodes and multiple data centers.
> >
> > Thanks
> > Surbhi
>


Re: How to use nodetool ring only for one data center

2015-04-28 Thread Rahul Neelakantan
Do you want this for some sort of reporting requirement? If so, you may be able
to write a quick shell script using grep to remove the unwanted data.

Rahul

> On Apr 28, 2015, at 7:24 PM, Surbhi Gupta  wrote:
> 
> Hi,
> 
> I wanted to know how we can get the token ring information for only one
> data center when using vnodes and multiple data centers.
> 
> Thanks
> Surbhi


How to use nodetool ring only for one data center

2015-04-28 Thread Surbhi Gupta
Hi,

I wanted to know how we can get the token ring information for only one
data center when using vnodes and multiple data centers.

Thanks
Surbhi


Denormalization leads to terrible, rather than better, Cassandra performance -- I am really puzzled

2015-04-28 Thread dlu66061
Cassandra gurus, I am really puzzled by my observations, and hope to get some
help explaining the results. Thanks in advance.
I think it has always been advocated in the Cassandra community that
de-normalization leads to better performance. I wanted to see how much
performance improvement it can offer, but the results were totally opposite:
performance degraded dramatically for simultaneous requests for the same set
of data.
*Environment:*
I have a Cassandra cluster consisting of 3 AWS m3.large instances, with
Cassandra 2.0.6 installed and pretty much default settings. My program is
written in Java using Java Driver 2.0.8.
*Normalized case:*
I have two tables created with the following 2 CQL statements:
CREATE TABLE event (event_id UUID, time_token timeuuid, … 30 other
attributes, … PRIMARY KEY (event_id))
CREATE TABLE event_index (index_key text, time_token timeuuid, event_id
UUID, PRIMARY KEY (index_key, time_token))
In my program, given the proper index_key and a token range (tokenLowerBound
to tokenUpperBound), I first query the event_index table
/Query 1:/
SELECT * FROM event_index WHERE index_key IN (…) AND time_token >
tokenLowerBound AND time_token <= tokenUpperBound ORDER BY time_token ASC
LIMIT 2000
to get a list of event_ids and then run the following CQL to get the event
details.
/Query 2:/
SELECT * FROM event WHERE event_id IN (a list of event_ids from the above
query)
I repeat the above process, with updated token range from the previous run.
This actually performs pretty well.
In this normalized process, I have to *run 2 queries* to get data: the first
one should be very quick since it is getting a slice of an internally wide
row. The second query may take long because it needs to hit up to 2000 rows
of the event table.
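
As a concrete illustration of that iteration (hypothetical only: the key and
the timeuuid literals are placeholders, and the new lower bound is simply the
last time_token returned by the previous page), one pass of Query 1 would look
like:

  SELECT * FROM event_index
  WHERE index_key IN ('some_index_key')                      -- placeholder key
    AND time_token > 8b6f7d30-ed42-11e4-8015-3bc81e542ebd    -- last timeuuid seen in the previous page
    AND time_token <= c0f1a2b0-ed42-11e4-8015-3bc81e542ebd   -- upper bound of the current window
  ORDER BY time_token ASC
  LIMIT 2000;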
*De-normalized case:*
What if we can attach event detail to the index and run just 1 query? Like
Query 1, would it be much faster since it is also getting a slice of an
internally wide row?
I created a third table that merged the above two tables together. Notice the
first three attributes and the PRIMARY KEY definition are exactly the same as
the "event_index" table.
CREATE TABLE event_index_with_detail (index_key text, time_token timeuuid,
event_id UUID, … 30 other attributes, … PRIMARY KEY
(index_key, time_token))
Then I can just run the following query to achieve my goal, with the same
index and token range as in Query 1:
/Query 3:/
SELECT * FROM event_index_with_detail WHERE index_key IN (…) AND
time_token > tokenLowerBound AND time_token <= tokenUpperBound ORDER BY
time_token ASC LIMIT 2000
*Performance observations*
Using Java Driver 2.0.8, I wrote a program that runs Query 1 + Query 2 in the
normalized case, or Query 3 in the denormalized case. All queries are set with
the LOCAL_QUORUM consistency level.
Then I created 1 or more instances of the program to simultaneously retrieve
the SAME set of 1 million events stored in Cassandra. Each test runs for 5
minutes, and the results are shown below.

               1 instance   5 instances   10 instances
Normalized         89           315            417
Denormalized      100            43              3

Note that the unit of measure is the number of operations. So in the
normalized case, the program runs 89 times and retrieves 178K events for a
single instance, 315 times and 630K events for 5 instances (each instance gets
about 126K events), and 417 times and 834K events for 10 instances running
simultaneously (each instance gets about 83.4K events).
For the de-normalized case, the performance is a little better in the
single-instance case, in which the program runs 100 times and retrieves 200K
events. However, it turns sharply south for multiple simultaneous instances.
All 5 instances together completed only 43 operations successfully, and all 10
instances together completed only 3 operations. For the latter case, the log
showed that 3 instances each retrieved 2000 events successfully, and the 7
other instances retrieved 0.
In the de-normalized case, the program reported a lot of exceptions like the
ones below:
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout
during read query at consistency LOCAL_QUORUM (2 responses were required but
only 1 replica responded)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
tried for query failed (no host was tried)
I repeated the two cases back and forth several times, and the results
remained the same.
I also observed CPU usage on the 3 Cassandra servers, and they were all much
higher for the de-normalized case.

               1 instance          5 instances        10 instances
Normalized     7% usr, 2% sys      30% usr, 8% sys    40% usr, 10% sys
Denormalized   44% usr, 0.3% sys   65% usr, 1% sys    70% usr, 2% sys

*Questions*
This is really not what I expected, and I am puzzled and have not figured out
a good explanation

Re: New node got stuck joining the cluster after a while

2015-04-28 Thread Analia Lorenzatto
Thanks very much for answering!

Do you think that, after a node fails to join the cluster, I should run some
repairs and cleanups?

Thanks!
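
For checking whether the joining node is still streaming rather than truly
stuck, something like the following (run on the joining node) can help; this
is a general sketch, not advice given in this thread:

  # Shows whether bootstrap streams are still active on the joining (UJ) node.
  nodetool netstats

  # The reported Load should keep growing while streaming makes progress.
  nodetool info | grep Load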

On Tue, Apr 28, 2015 at 5:13 AM, Carlos Rolo  wrote:

> Hi,
>
> The 2.1.x series is not recommended for use, especially the first versions.
> I would downgrade to 2.0.14 or, if you must stay on 2.1, upgrade your cluster
> to 2.1.4 or the imminent 2.1.5 release.
>
> This mailing list has a few tips on how to deal with the 2.1.x releases, but
> the best way is indeed a downgrade, or to wait for 2.1.5.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
> *
> Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
> www.pythian.com
>
> On Tue, Apr 28, 2015 at 3:30 AM, Analia Lorenzatto <
> analialorenza...@gmail.com> wrote:
>
>>
>> Hello guys,
>>
>> I have a cluster comprised of 2 nodes, configured with vnodes, using
>> Cassandra version 2.1.0-2.
>>
>> I am facing an issue when I want to join a new node to the cluster.
>>
>> At first it started joining, but then it got stuck:
>>
>> UN  1x.x.x.x  348.11 GB  256 100.0%  1c
>> UN  1x.x.x.x  342.74 GB  256 100.0%  1c
>> UJ  1x.x.x.x  26.86 GB   256 ?   1c
>>
>>
>> I can see some errors on the already working nodes:
>>
>> *WARN  [SharedPool-Worker-7] 2015-04-27 17:41:16,060
>> SliceQueryFilter.java:236 - Read 5001 live and 66548 tombstoned cells in
>> usmc.userpixel (see tombstone_warn_threshol*
>> *d). 5000 columns was requested, slices=[-],
>> delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647
>> <2147483647>}*
>> *WARN  [SharedPool-Worker-32] 2015-04-27 17:41:16,668
>> SliceQueryFilter.java:236 - Read 2012 live and 30440 tombstoned cells in
>> usmc.userpixel (see tombstone_warn_thresho*
>> *ld). 5001 columns was requested,
>> slices=[b6d051df-0a8f-4c13-b93c-1b4ff0d82b8d:date-],
>> delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}*
>>
>> *ERROR [CompactionExecutor:35638] 2015-04-27 19:06:07,613
>> CassandraDaemon.java:166 - Exception in thread
>> Thread[CompactionExecutor:35638,1,main]*
>> *java.lang.AssertionError: Memory was freed*
>> *at
>> org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:281)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at org.apache.cassandra.io.util.Memory.getInt(Memory.java:233)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.io.sstable.IndexSummary.getPositionInSummary(IndexSummary.java:118)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:123)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> * at
>> org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:92)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:1209)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:1165)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328)
>> ~[apache-cassandra-2.1.0.jar:2.1.0*
>> *]*
>> *at
>> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:365)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:127)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:112)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:229)
>> ~[apache-cassandra-2.1.0.jar:2.1.0]*
>> *at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> ~[na:1.7.0_51]*
>> *at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> ~[na:1.7.0_51]*
>> *at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> ~[na:1.7.0_51]*
>> *at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> [na:1.7.0_51]*
>> *at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]*
>>
>> But I do not see any warning or error message in the logs of the joining
>> node.  I just see an exception there when I run "nodetool info":
>>
>> root@:~# nodetool info
>> ID   : f5e49647-59fa-474f-b6af-9f65abc43581
>> Gossip active: true
>> Thrift active: false
>> Native Transport active: f

Re: Best Practice to add a node in a Cluster

2015-04-28 Thread Eric Stevens
I would double check in a test cluster (or with a tool like CCM to set up
a local throwaway cluster to confirm), but for this *specific* use case
(going from RF==NodeCount to RF==NodeCount with a higher number) you should
be able to have a simpler path.  Set RF=3 before you add your new node,
then add the new node.  It will bootstrap all data from the other two
nodes, then your job is done.

You shouldn't have to run repair (which you normally have to do after
increasing RF in order to make sure all nodes have their data - the nodes
already have all their data), and you shouldn't have to run cleanup (which
you normally have to do after increasing node count to instruct the old
nodes to forget data for which they are no longer responsible).  The data
responsibility hasn't changed for any node, all nodes are still responsible
for all data.
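
As a rough sketch of that sequence (keyspace, data center, and service names
are placeholders; as noted above, verify in a test or CCM cluster first):

  -- 1. Raise the replication factor first, while the cluster still has 2 nodes.
  ALTER KEYSPACE my_keyspace
    WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

  # 2. Start the new (third) node, making sure it is NOT in its own seeds list,
  #    so it bootstraps and streams all of the data from the existing two nodes.
  sudo service cassandra start   # or however you normally start the node

  # 3. Watch the bootstrap finish; for this specific RF == node count case,
  #    no repair and no cleanup should be needed afterwards.
  nodetool netstats
  nodetool status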

On Mon, Apr 27, 2015 at 9:19 PM, Neha Trivedi 
wrote:

> Thanks, Arun!
>
> On Tue, Apr 28, 2015 at 9:44 AM, arun sirimalla 
> wrote:
>
>> Hi Neha,
>>
>>
>> After you add the node to the cluster, run nodetool cleanup on all nodes.
>> Next, running repair on each node will replicate the data. Make sure you
>> run the repair on one node at a time, because repair is an expensive
>> process (it utilizes high CPU).
>>
>>
>>
>>
>> On Mon, Apr 27, 2015 at 8:36 PM, Neha Trivedi 
>> wrote:
>>
>>> Thanks Eric and Matt :) !!
>>>
>>> Yes the purpose is to improve reliability.
>>> Right now, from our driver we are querying using degradePolicy for
>>> reliability.
>>>
>>>
>>>
>>> *For changing the keyspace for RF=3, the procedure is as under:*
>>> 1. Add a new node to the cluster (new node is not in seed list)
>>>
>>> 2. ALTER KEYSPACE system_auth WITH REPLICATION =
>>>   {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};
>>>
>>>
>>> 3. On each affected node, run nodetool repair.
>>>
>>> 4. Wait until repair completes on a node, then move to the next node.
>>>
>>>
>>> Any other things to take care?
>>>
>>> Thanks
>>> Regards
>>> neha
>>>
>>>
>>> On Mon, Apr 27, 2015 at 9:45 PM, Eric Stevens  wrote:
>>>
 It depends on why you're adding a new node.  If you're running out of
 disk space or IO capacity in your 2 node cluster, then changing RF to 3
 will not improve either condition - you'd still be writing all data to all
 three nodes.

 However if you're looking to improve reliability, a 2 node RF=2 cluster
 cannot have either node offline without losing quorum, while a 3 node RF=3
 cluster can have one node offline and still be able to achieve quorum.
 RF=3 is a common replication factor because of this characteristic.

 Make sure your new node is not in its own seeds list, or it will not
 bootstrap (it will come online immediately and start serving requests).

 On Mon, Apr 27, 2015 at 8:46 AM, Neha Trivedi 
 wrote:

> Hi
> We have a 2 Cluster Node with RF=2. We are planing to add a new node.
>
> Should we change RF to 3 in the schema?
> OR Just added a new node with the same RF=2?
>
> Any other Best Practice that we need to take care?
>
> Thanks
> regards
> Neha
>
>

>>>
>>
>>
>> --
>> Arun
>> Senior Hadoop/Cassandra Engineer
>> Cloudwick
>>
>> Champion of Big Data (Cloudera)
>>
>> http://www.cloudera.com/content/dev-center/en/home/champions-of-big-data.html
>>
>> 2014 Data Impact Award Winner (Cloudera)
>>
>> http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html
>>
>>
>


RE: minimum bandwidth requirement between two Geo Redundant sites of Cassandra database

2015-04-28 Thread Daniels, Kelly
We will be anxious to confirm results in production, but we have been seeing 30
ms or less as the best case; at 45 ms we start having more errors than we can
tolerate, and at 50 ms usability is no longer acceptable.  We are also looking
into features from Oracle Data Guard such as ‘Apply Lag’ and ‘Transport Lag’ to
see if we can manage the lag internally when the transport distance is beyond
what is practical.

From: Carlos Rolo [mailto:r...@pythian.com]
Sent: Tuesday, April 28, 2015 2:07 AM
To: user@cassandra.apache.org
Subject: Re: minimum bandwidth requirement between two Geo Redundant sites of 
Cassandra database

Hi,
I would not recommend anything below 1 Gbps for the bandwidth. As for latency,
try to have it as low as you can.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: 
linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Apr 28, 2015 at 5:28 AM, Gaurav Bhatnagar <gauravb...@gmail.com>
wrote:
Hi,
 Is there any minimum bandwidth requirement between two Geo Redundant data 
centres?
What latency should the link between two Geo Redundant data centres have for
operations to be efficient?
Regards,
Gaurav




Re: New node got stuck joining the cluster after a while

2015-04-28 Thread Carlos Rolo
Hi,

The 2.1.x series is not recommended for use, especially the first versions.
I would downgrade to 2.0.14 or, if you must stay on 2.1, upgrade your cluster
to 2.1.4 or the imminent 2.1.5 release.

This mailing list has a few tips on how to deal with the 2.1.x releases, but
the best way is indeed a downgrade, or to wait for 2.1.5.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
*
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Apr 28, 2015 at 3:30 AM, Analia Lorenzatto <
analialorenza...@gmail.com> wrote:

>
> Hello guys,
>
> I have a cluster comprised of 2 nodes, configured with vnodes, using
> Cassandra version 2.1.0-2.
>
> I am facing an issue when I want to join a new node to the cluster.
>
> At first it started joining, but then it got stuck:
>
> UN  1x.x.x.x  348.11 GB  256 100.0%  1c
> UN  1x.x.x.x  342.74 GB  256 100.0%  1c
> UJ  1x.x.x.x  26.86 GB   256 ?   1c
>
>
> I can see some errors on the already working nodes:
>
> *WARN  [SharedPool-Worker-7] 2015-04-27 17:41:16,060
> SliceQueryFilter.java:236 - Read 5001 live and 66548 tombstoned cells in
> usmc.userpixel (see tombstone_warn_threshol*
> *d). 5000 columns was requested, slices=[-],
> delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647
> <2147483647>}*
> *WARN  [SharedPool-Worker-32] 2015-04-27 17:41:16,668
> SliceQueryFilter.java:236 - Read 2012 live and 30440 tombstoned cells in
> usmc.userpixel (see tombstone_warn_thresho*
> *ld). 5001 columns was requested,
> slices=[b6d051df-0a8f-4c13-b93c-1b4ff0d82b8d:date-],
> delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}*
>
> *ERROR [CompactionExecutor:35638] 2015-04-27 19:06:07,613
> CassandraDaemon.java:166 - Exception in thread
> Thread[CompactionExecutor:35638,1,main]*
> *java.lang.AssertionError: Memory was freed*
> *at
> org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:281)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at org.apache.cassandra.io.util.Memory.getInt(Memory.java:233)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.io.sstable.IndexSummary.getPositionInSummary(IndexSummary.java:118)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:123)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> * at
> org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:92)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:1209)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:1165)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328)
> ~[apache-cassandra-2.1.0.jar:2.1.0*
> *]*
> *at
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:365)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:127)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:112)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:229)
> ~[apache-cassandra-2.1.0.jar:2.1.0]*
> *at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> ~[na:1.7.0_51]*
> *at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> ~[na:1.7.0_51]*
> *at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> ~[na:1.7.0_51]*
> *at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_51]*
> *at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]*
>
> But I do not see any warning or error message in the logs of the joining
> node.  I just see an exception there when I run "nodetool info":
>
> root@:~# nodetool info
> ID   : f5e49647-59fa-474f-b6af-9f65abc43581
> Gossip active: true
> Thrift active: false
> Native Transport active: false
> Load : 26.86 GB
> Generation No: 1430163258
> Uptime (seconds) : 18799
> Heap Memory (MB) : 4185.15 / 7566.00
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at
> org.apache.cassandra.locator.TokenMetadata.getTokens(TokenMetadata.java:440)
> at
> org.apache.cassandra.service.StorageService.getTokens(Stor

Re: minimum bandwidth requirement between two Geo Redundant sites of Cassandra database

2015-04-28 Thread Carlos Rolo
Hi,

I would not recommend anything below 1 Gbps for the bandwidth. As for latency,
try to have it as low as you can.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
*
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Apr 28, 2015 at 5:28 AM, Gaurav Bhatnagar 
wrote:

> Hi,
>  Is there any minimum bandwidth requirement between two Geo Redundant
> data centres?
> What latency should the link between two Geo Redundant data centres have
> for operations to be efficient?
>
> Regards,
> Gaurav
>
