Re: How to come up with a predefined topology

2012-07-12 Thread prasenjit mukherjee
Thanks. Some follow up questions :

1.  How do the reads use strategy/snitch information ? I am assuming
the reads can go to any of the replicas. Will it also use the
snitch/strategy info to find the next 'R' replicas 'closest' to the
coordinator node ?

2. In a single DC ( with n racks and r replicas ), what algorithm does
cassandra use to place its replicas in the following scenarios :
a. n > r : I am assuming, have 1 replica in each rack.
b. n < r : ?? I am assuming, try to equally distribute replicas
across the racks.

-Thanks,
Prasenjit

On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs ty...@datastax.com wrote:
 I highly recommend specifying the same rack for all nodes (using
 cassandra-topology.properties) unless you really have a good reason not to
 (and you probably don't).  The way that replicas are chosen when multiple
 racks are in play can be fairly confusing and lead to a data imbalance if
 you don't catch it.


 On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee prasen@gmail.com
 wrote:

  As far as I know there isn't any way to use the rack name in the
  strategy_options for a keyspace. You
  might want to look at the code to dig into that, perhaps.

 Aha, I was wondering if I could do that as well ( specify rack options )
 :)

 Thanks for the pointer, I will dig into the code.

 -Thanks,
 Prasenjit

 On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe richard.l...@arkivum.com
 wrote:
  If you then specify the parameters for the keyspace to use these, you
  can control exactly which set of nodes replicas end up on.
 
  For example, in cassandra-cli:
 
  create keyspace ks1 with placement_strategy =
  'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options
  = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
 
  As far as I know there isn't any way to use the rack name in the
  strategy_options for a keyspace. You might want to look at the code to dig
  into that, perhaps.
 
  Whichever snitch you use, the nodes are sorted in order of proximity to
  the client node. How this is determined depends on the snitch that's used
  but most (the ones that ship with Cassandra) will use the default ordering
  of same-node < same-rack < same-datacenter < different-datacenter. Each
  snitch has methods to tell Cassandra which rack and DC a node is in, so it
  always knows which node is closest. Used with the Bloom filters this can
  tell us where the nearest replica is.
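The proximity ordering described above can be sketched as a comparator. This is a toy model with an invented Node type, not Cassandra's actual snitch classes:

```java
import java.util.*;

public class ProximitySort {
    // Invented Node type standing in for a cluster endpoint.
    public record Node(String name, String dc, String rack) {}

    // Ordering: same node < same rack < same DC < different DC.
    public static int distance(Node self, Node other) {
        if (self.name().equals(other.name())) return 0;       // same node
        if (self.dc().equals(other.dc()))
            return self.rack().equals(other.rack()) ? 1 : 2;  // same rack / same DC
        return 3;                                             // different datacenter
    }

    public static List<Node> sortByProximity(Node self, List<Node> replicas) {
        List<Node> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingInt(n -> distance(self, n)));
        return sorted;
    }

    public static void main(String[] args) {
        Node self = new Node("n1", "DC1", "r1");
        List<Node> replicas = List.of(new Node("n4", "DC2", "r1"),
                                      new Node("n2", "DC1", "r1"),
                                      new Node("n3", "DC1", "r2"));
        // n2 (same rack) sorts before n3 (same DC) before n4 (other DC).
        System.out.println(sortByProximity(self, replicas));
    }
}
```

The real snitches compute this per-request on the coordinator; the point here is only the relative ordering.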
 
 
 
  -Original Message-
  From: prasenjit mukherjee [mailto:prasen@gmail.com]
  Sent: 11 July 2012 06:33
  To: user
  Subject: How to come up with a predefined topology
 
  Quoting from
  http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy
  :
 
  Asymmetrical replication groupings are also possible depending on your
  use case. For example, you may want to have three replicas per data center
  to serve real-time application requests, and then have a single replica in 
  a
  separate data center designated to running analytics.
 
  Have 2 questions :
  1. Any example how to configure a topology with 3 replicas in one DC (
  with 2 in 1 rack + 1 in another rack ) and one replica in another DC ?
   The default NetworkTopologyStrategy with RackInferringSnitch will only
  give me equal distribution ( 2+2 )
 
   2. I am assuming the reads can go to any of the replicas. Is there a
  client which will send the query to a node ( in the cassandra ring ) which is
  closest to the client ?
 
  -Thanks,
  Prasenjit
 
 




 --
 Tyler Hobbs
 DataStax



Re: Why is our range query failing in Cassandra 0.8.10 Client

2012-07-12 Thread Sylvain Lebresne
When executing a query like:
  get events WHERE Firm=434550 AND ds_timestamp>=1341955958200 AND
ds_timestamp<=1341955958200;
what the 2ndary index implementation does is:
1) it queries the Firm index for the row with key 434550 (because
that's the only column restricted by an equality clause, and that is why
you need at least one equality clause).
2) the query from 1 will return a bunch of event row keys for those
events whose Firm=434550. So for each of those row keys it queries the
corresponding event.
3) if a given queried event matches the remaining clauses (here
ds_timestamp>=1341955958200 AND ds_timestamp<=1341955958200), it adds
it to the result; otherwise it skips it.
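The three steps above can be sketched as a toy model (invented names and plain Java maps standing in for the index and the data, not Cassandra's implementation):

```java
import java.util.*;

public class IndexScanSketch {
    public static List<String> query(Map<Long, List<String>> firmIndex,
                                     Map<String, Long> eventTimestamps,
                                     long firm, long tsMin, long tsMax, int limit) {
        List<String> results = new ArrayList<>();
        // 1) equality clause: fetch candidate row keys from the Firm index
        for (String rowKey : firmIndex.getOrDefault(firm, List.of())) {
            // 2) read the candidate event row
            long ts = eventTimestamps.get(rowKey);
            // 3) apply the remaining (range) clauses
            if (ts >= tsMin && ts <= tsMax) {
                results.add(rowKey);
                if (results.size() == limit) break; // stop once 'limit' rows found
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Long> events = new HashMap<>();
        List<String> keys = new ArrayList<>();
        long base = 1341955958200L;
        for (int i = 0; i < 10_000; i++) {          // 10,000 events for one firm
            events.put("event" + i, base + i);
            keys.add("event" + i);
        }
        Map<Long, List<String>> firmIndex = Map.of(434550L, keys);
        // A window matching only the last event: all 10,000 candidates are
        // scanned before the single match is found -- the slow case described below.
        System.out.println(query(firmIndex, events, 434550L,
                                 base + 9_999, base + 9_999, 100));
    }
}
```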

So what I suspect is happening is that you have *lots* of events matching
'Firm=434550' but only one matching 'ds_timestamp>=1341955958200 AND
ds_timestamp<=1341955958200'. And given that by default it tries to
find 100 results, it will scan all the events having 'Firm=434550'
before returning, which it probably cannot do within the timeout.

But when you do
  get events WHERE Firm=434550 AND ds_timestamp>=1341955958200
given that lots of events having 'Firm=434550' probably match
ds_timestamp>=1341955958200, it is able to find 100 of them quickly.

Lastly, when you do
  get events WHERE Firm=434550 AND ds_timestamp=1341955958200;
both clauses are now equality clauses, and based on internal stats the
implementation can determine that querying the ds_timestamp index will
discriminate potential results more efficiently. So it queries the
ds_timestamp index instead of the Firm one, which yields all events
whose timestamp is 1341955958200; since there aren't many such events,
it quickly finds the one matching Firm=434550.

In other words, you are not doing something wrong, you are just
hitting a limitation/weakness of the 2ndary index implementation.

So if the query you really want is for a specific timestamp, you
definitely want to use an equality rather than two non-strict
inequalities. But if what you want is to query events in a very small
but non-discrete window of time, then 2ndary indexes might just not
fit the bill.

In that case, one option is to build a custom/specialized index
yourself. If you construct an index where the row key is the Firm and
the column name is the ds_timestamp (and the column value is the event
identifier), then finding events for a specific firm within a time
window will always be efficient. However, that's more work on your
side, since unfortunately Cassandra cannot (yet) create such a
specialized index automatically.
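A minimal in-memory sketch of such a custom index, assuming a TreeMap stands in for a Cassandra row's sorted columns (all names invented):

```java
import java.util.*;

public class FirmTimeIndex {
    // Hypothetical custom index: one "row" per firm, columns keyed by
    // ds_timestamp with the event row key as the column value.
    private final Map<Long, NavigableMap<Long, String>> rows = new HashMap<>();

    public void index(long firm, long tsMillis, String eventKey) {
        rows.computeIfAbsent(firm, f -> new TreeMap<>()).put(tsMillis, eventKey);
    }

    // A time-window query is a column slice: it touches only the columns
    // inside [from, to], however many events the firm has overall.
    public Collection<String> eventsInWindow(long firm, long from, long to) {
        NavigableMap<Long, String> cols = rows.get(firm);
        if (cols == null) return List.of();
        return cols.subMap(from, true, to, true).values();
    }

    public static void main(String[] args) {
        FirmTimeIndex idx = new FirmTimeIndex();
        idx.index(434550L, 1341955958100L, "event1");
        idx.index(434550L, 1341955958200L, "event2");
        idx.index(434550L, 1341955958300L, "event3");
        // Only event2 falls inside the window.
        System.out.println(idx.eventsInWindow(434550L, 1341955958150L, 1341955958250L));
    }
}
```

In Cassandra terms, the slice corresponds to a column range read on the firm's row, which is efficient because columns are stored sorted by name.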

--
Sylvain

On Wed, Jul 11, 2012 at 10:01 PM, JohnB j...@tiac.net wrote:
 Hi:

 We are currently using Cassandra 0.8.10 and have run into some strange issues 
 when querying for a range of data.


 I ran a couple of get statements via the Cassandra client and found some 
 interesting results:


 Consider the following Column Family Definition:

 ColumnFamily: events
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
   Row cache size / save period in seconds: 0.0/0
   Row Cache Provider: 
 org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.2953125/1440/63 (millions of ops/minutes/MB)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: true
   Built indexes: [events.events_Firm_idx, events.events_OrdType_idx, 
 events.events_OrderID_idx
  , events.events_OrderQty_idx, events.events_Price_idx, 
 events.events_Symbol_idx, events.events_ds_timestamp_idx]
   Column Metadata:
 Column Name: Firm
   Validation Class: org.apache.cassandra.db.marshal.BytesType
   Index Name: events_Firm_idx
   Index Type: KEYS
 Column Name: OrdType
   Validation Class: org.apache.cassandra.db.marshal.BytesType
   Index Name: events_OrdType_idx
   Index Type: KEYS
 Column Name: OrderID
   Validation Class: org.apache.cassandra.db.marshal.BytesType
   Index Name: events_OrderID_idx
   Index Type: KEYS
 Column Name: OrderQty
   Validation Class: org.apache.cassandra.db.marshal.LongType
   Index Name: events_OrderQty_idx
   Index Type: KEYS
 Column Name: Price
   Validation Class: org.apache.cassandra.db.marshal.LongType
   Index Name: events_Price_idx
   Index Type: KEYS
 Column Name: Symbol
   Validation Class: org.apache.cassandra.db.marshal.BytesType
   Index Name: events_Symbol_idx
 Column Name: ds_timestamp
   Validation Class: org.apache.cassandra.db.marshal.LongType
   Index Name: 

Re: failed to delete commitlog, cassandra can't accept writes

2012-07-12 Thread Holger Hoffstaette
On Tue, 10 Jul 2012 14:35:23 -0700, Frank Hsueh wrote:

 after reading the JIRA, I decided to use Java 6.

It has nothing to do with the JDK. I can reproduce it with either JDK6 or
JDK7 as well.

 anybody seen this before?  is this related to 4337 ?

It's exactly that.

-h




Increased replication factor not evident in CLI

2012-07-12 Thread Dustin Wenz
We recently increased the replication factor of a keyspace in our cassandra 
1.1.1 cluster from 2 to 4. This was done by setting the replication factor to 4 
in cassandra-cli, and then running a repair on each node.

Everything seems to have worked; the commands completed successfully and disk 
usage increased significantly. However, if I perform a describe on the 
keyspace, it still shows replication_factor:2. So, it appears that the 
replication factor might be 4, but it reports as 2. I'm not entirely sure how 
to confirm one or the other.

Since then, I've stopped and restarted the cluster, and even ran an 
upgradesstables on each node. The replication factor still doesn't report as I 
would expect. Am I missing something here?

- .Dustin



How to speed up data loading

2012-07-12 Thread Leonid Ilyevsky
I am loading a large set of data into a CF with a composite key. The load is 
going pretty slow, hundreds or even thousands of times slower than it would be 
in an RDBMS.
I have a choice of how granular my physical key (the first component of the 
primary key) is; this way I can balance between smaller rows with more keys 
vs. wide rows with fewer keys. What are the guidelines about this? How does the 
width of the physical row affect load speed?

I see that Cassandra is doing a lot of processing behind the scenes; even when I 
kill the client, the server keeps consuming a lot of CPU for a long time.

What else should I look at ? Anything in configuration?




Re: Connected file list in Cassandra

2012-07-12 Thread aaron morton
Can pages appear in many documents ? If not try this

Document CF:
row_key: doc_id
column: page_number:page_id (page_number is the order of the pages; page_id is the 
row key for the Page CF below)

Page CF:
row_key: page_id
columns:
- doc_id
- page_data


If you know the page_id, read the doc_id from the Page CF, then iterate over the 
columns of that row in the Document CF and read each page from the Page CF. 
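A toy in-memory sketch of the two-CF lookup described above (plain Java maps standing in for the CFs; names invented, page data omitted):

```java
import java.util.*;

public class PagedDocs {
    // Document CF: doc_id -> sorted columns (page_number -> page_id).
    private final Map<String, NavigableMap<Integer, String>> documentCf = new HashMap<>();
    // Page CF: page_id -> doc_id (the page_data column is omitted for brevity).
    private final Map<String, String> pageCf = new HashMap<>();

    public void addPage(String docId, int pageNumber, String pageId) {
        documentCf.computeIfAbsent(docId, d -> new TreeMap<>()).put(pageNumber, pageId);
        pageCf.put(pageId, docId);
    }

    // Given any page_id: read doc_id from the Page CF, then read the whole
    // document's pages, already sorted by page_number, from the Document CF.
    public List<String> allPagesFor(String pageId) {
        String docId = pageCf.get(pageId);
        if (docId == null) return List.of();
        return new ArrayList<>(documentCf.get(docId).values());
    }

    public static void main(String[] args) {
        PagedDocs store = new PagedDocs();
        store.addPage("doc1", 2, "page-b");   // inserted out of order on purpose
        store.addPage("doc1", 1, "page-a");
        store.addPage("doc1", 3, "page-c");
        System.out.println(store.allPagesFor("page-b")); // [page-a, page-b, page-c]
    }
}
```

The TreeMap plays the role of Cassandra's sorted columns, which is what gives the pages back in order with no extra sorting step.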

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/07/2012, at 7:47 AM, David Brosius wrote:

 
 why not just hold the pages as different columns in the same row? columns are 
 automatically sorted such that if the column name was associated with the 
 page number it would automatically flow the way you wanted.
 
 - Original Message -
 From: Tomek Hankus tom...@gmail.com 
 Sent: Wed, July 11, 2012 14:34
 Subject: Connected file list in Cassandra
 
 Hi,
 at the moment I'm doing research about keeping linked/connected file list 
 in Cassandra- e.g. PDF file cut into pages (multiple PDFs) where first page 
 is connected to second, second to third etc.
  The files' connection/link is not specified. The main goal is to be able to 
  get all linked files (the whole PDF / all pages) while having only the key to 
  the first file (page).
 
  Is there any Cassandra tool/feature which could help me do that, or is the 
  only way to create some wrapper holding the key relations?
 
 
 Tom H
 
 



Re: Composite column/key creation via Hector

2012-07-12 Thread aaron morton
You may have better luck on the Hector Mailing list… 
https://groups.google.com/forum/?fromgroups#!forum/hector-users


Here is something I found in the docs though 
http://hector-client.github.com/hector/build/html/content/composite_with_templates.html

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/07/2012, at 9:04 AM, Michael Cherkasov wrote:

 Hi all,
 
 What is the right way to create CF with dynamic composite column and 
 composite key?
 
 Now I use code like this:
 
   private static final String DEFAULT_DYNAMIC_COMPOSITE_ALIAES =
  
  "(a=AsciiType,b=BytesType,i=IntegerType,x=LexicalUUIDType,l=LongType,t=TimeUUIDType,s=UTF8Type,u=UUIDType,A=AsciiType(reversed=true),B=BytesType(reversed=true),I=IntegerType(reversed=true),X=LexicalUUIDType(reversed=true),L=LongType(reversed=true),T=TimeUUIDType(reversed=true),S=UTF8Type(reversed=true),U=UUIDType(reversed=true))";
 
 for composite columns:
  BasicColumnFamilyDefinition columnFamilyDefinition = new 
 BasicColumnFamilyDefinition();
 columnFamilyDefinition.setComparatorType( 
 ComparatorType.DYNAMICCOMPOSITETYPE );
 columnFamilyDefinition.setComparatorTypeAlias( 
 DEFAULT_DYNAMIC_COMPOSITE_ALIAES );
 columnFamilyDefinition.setKeyspaceName( keyspaceName );
  columnFamilyDefinition.setName( "TestCase" );
 columnFamilyDefinition.setColumnType( ColumnType.STANDARD );
 ColumnFamilyDefinition cfDefStandard = new ThriftCfDef( 
 columnFamilyDefinition );
 cfDefStandard.setKeyValidationClass( 
 ComparatorType.UTF8TYPE.getClassName() );
 cfDefStandard.setDefaultValidationClass( 
 ComparatorType.UTF8TYPE.getClassName() );
 
 for keys:
 columnFamilyDefinition = new BasicColumnFamilyDefinition();
 columnFamilyDefinition.setComparatorType( ComparatorType.UTF8TYPE );
 columnFamilyDefinition.setKeyspaceName( keyspaceName );
  columnFamilyDefinition.setName( "Parameter" );
 columnFamilyDefinition.setColumnType( ColumnType.STANDARD );
 cfDefStandard = new ThriftCfDef( columnFamilyDefinition );
 cfDefStandard.setKeyValidationClass( 
 ComparatorType.DYNAMICCOMPOSITETYPE.getClassName() + 
 DEFAULT_DYNAMIC_COMPOSITE_ALIAES );
 cfDefStandard.setDefaultValidationClass( 
 ComparatorType.UTF8TYPE.getClassName() );
 
  Is this code correct? Do I really need such a terrible 
  DEFAULT_DYNAMIC_COMPOSITE_ALIAES ?



Re: Concerns about Cassandra upgrade from 1.0.6 to 1.1.X

2012-07-12 Thread aaron morton
It's always a good idea to have a read of the NEWS.txt file 
https://github.com/apache/cassandra/blob/cassandra-1.1/NEWS.txt

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/07/2012, at 5:51 PM, Tyler Hobbs wrote:

 On Wed, Jul 11, 2012 at 8:38 PM, Roshan codeva...@gmail.com wrote:
 
 
 Currently we are using Cassandra 1.0.6 in our production system but suffer
 with the CASSANDRA-3616 (it is already fixed in 1.0.7 version).
 
 We thought to upgrade the Cassandra to 1.1.X versions, to get it's new
 features, but having some concerns about the upgrade and expert advices are
 mostly welcome.
 
  1. Can Cassandra 1.1.X identify 1.0.X configurations like SSTables, commit
  logs, etc. without any issue? And vice versa? Because if something happens to
  1.1.X after it is deployed to production, we want to downgrade to the 1.0.6 version
  (because that's the version we tested with our applications).
 
 1.1 can handle 1.0 data/schemas/etc without a problem, but the reverse is not 
 necessarily true.  I don't know what in particular might break if you 
 downgrade from 1.1 to 1.0, but in general, Cassandra does not handle 
 downgrading gracefully; typically the SSTable formats have changed during 
 major releases.  If you snapshot prior to upgrading, you can always roll back 
 to that, but you will have lost anything written since the upgrade.
  
 
  2. How should we do the upgrade process?  Currently we have a 3-node 1.0.6
  cluster in production. Can we upgrade node by node? If we upgrade node by
  node, will the other 1.0.6 nodes identify the 1.1.X nodes without any issue?
 
 Yes, you can do a rolling upgrade to 1.1, one node at a time.  It's usually 
 fine to leave the cluster in a mixed state for a short while as long as you 
 don't do things like repairs, decommissions, or bootstraps, but I wouldn't 
 stay in a mixed state any longer than you have to.
 
 It's best to test major upgrades with a second, non-production cluster if 
 that's an option.
 
 -- 
 Tyler Hobbs
 DataStax
 



Re: How to come up with a predefined topology

2012-07-12 Thread aaron morton
 Will it also use the
 snitch/strategy info to find the next 'R' replicas 'closest' to the
 coordinator node ?
yes. 

 2. In a single DC ( with n racks and r replicas ) what algorithm
The logic is here
https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java#L78

 a. n > r : I am assuming, have 1 replica in each rack.
You have 1 replica in each of the first r racks. 

 b. n < r : ?? I am assuming, try to equally distribute replicas across
 the racks.
Each rack gets int(r/n) replicas; r % n racks will have one more. 

This is why multi-rack replication can be tricky. 
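The distribution for n racks and r replicas can be sketched with toy round-robin arithmetic. Note this is only an approximation: Cassandra's real placement (NetworkTopologyStrategy, linked above) walks the token ring, so actual assignments depend on token order.

```java
import java.util.Arrays;

public class RackSpread {
    // Spread r replicas over n racks round-robin: each rack gets r / n
    // replicas, and the first r % n racks get one extra.
    public static int[] replicasPerRack(int racks, int replicas) {
        int[] counts = new int[racks];
        for (int i = 0; i < replicas; i++) counts[i % racks]++;
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(replicasPerRack(4, 3))); // n > r: [1, 1, 1, 0]
        System.out.println(Arrays.toString(replicasPerRack(3, 5))); // n < r: [2, 2, 1]
    }
}
```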

Hope that helps. 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com




Re: Composite column/key creation via Hector

2012-07-12 Thread Dave Brosius
BTW, an issue with dynamic columns in Hector was just fixed; you might 
want to try trunk.


https://github.com/hector-client/hector/commit/2910b484629add683f61f392553e824c291fb6eb





Re: Increased replication factor not evident in CLI

2012-07-12 Thread aaron morton
Do multiple nodes say the RF is 2 ? Can you show the output from the CLI ? Do 
show schema and show keyspace say the same thing ?

Cheers



-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/07/2012, at 7:39 AM, Dustin Wenz wrote:

 We recently increased the replication factor of a keyspace in our cassandra 
 1.1.1 cluster from 2 to 4. This was done by setting the replication factor to 
 4 in cassandra-cli, and then running a repair on each node.
 
 Everything seems to have worked; the commands completed successfully and disk 
 usage increased significantly. However, if I perform a describe on the 
 keyspace, it still shows replication_factor:2. So, it appears that the 
 replication factor might be 4, but it reports as 2. I'm not entirely sure how 
 to confirm one or the other.
 
 Since then, I've stopped and restarted the cluster, and even ran an 
 upgradesstables on each node. The replication factor still doesn't report as 
 I would expect. Am I missing something here?
 
   - .Dustin
 



Re: Concerns about Cassandra upgrade from 1.0.6 to 1.1.X

2012-07-12 Thread Roshan
Thanks Aaron. My major concern is upgrading node by node, because currently we
are using 1.0.6 in production and the plan is to upgrade a single node to 1.1.2
at a time.

Any comments?

Thanks.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Concerns-about-Cassandra-upgrade-from-1-0-6-to-1-1-X-tp7581197p7581221.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Increased replication factor not evident in CLI

2012-07-12 Thread Edward Capriolo
Possibly the bug with nanotime causing cassandra to think the change
happened in the past. Talked about onlist in past few days.
On Thursday, July 12, 2012, aaron morton aa...@thelastpickle.com wrote:
 Do multiple nodes say the RF is 2 ? Can you show the output from the CLI
? Do show schema and show keyspace say the same thing ?
 Cheers


 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 On 13/07/2012, at 7:39 AM, Dustin Wenz wrote:

 We recently increased the replication factor of a keyspace in our
cassandra 1.1.1 cluster from 2 to 4. This was done by setting the
replication factor to 4 in cassandra-cli, and then running a repair on each
node.

 Everything seems to have worked; the commands completed successfully and
disk usage increased significantly. However, if I perform a describe on the
keyspace, it still shows replication_factor:2. So, it appears that the
replication factor might be 4, but it reports as 2. I'm not entirely sure
how to confirm one or the other.

 Since then, I've stopped and restarted the cluster, and even ran an
upgradesstables on each node. The replication factor still doesn't report
as I would expect. Am I missing something here?

 - .Dustin





Re: BulkLoading sstables from v1.0.3 to v1.1.1

2012-07-12 Thread Edward Capriolo
Historically you have not been able to stream sstables between different
file formats. Cassandra 1.0 creates files named hc, while 1.1 uses hd.
Since bulk loading streams, I am not sure this will work.

On Thursday, July 12, 2012, aaron morton aa...@thelastpickle.com wrote:
 Do you have the full error logs ? There should be a couple of caused by:
errors that will help track down where the original Assertion is thrown.
 The second error is probably the result of the first. Something has upset
the SSTable tracking.
 If you can get the full error stack, and some steps to reproduce, can you
raise a ticket on https://issues.apache.org/jira/browse/CASSANDRA ?
 Thanks

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 On 10/07/2012, at 7:43 PM, rubbish me wrote:

 Thanks Ivo.
  We are quite close to releasing, so we'd hope to understand what is causing
the error and may try to avoid it where possible. As said, it seems to work
ok the first time round.
  The problem you were referring to in the last mail, was it restricted to bulk
loading or otherwise?
 Thanks
 -A

 On 10 Jul 2012 07:20, Ivo Meißner i...@overtronic.com wrote:

 Hi,
 there are some problems in version 1.1.1 with secondary indexes and key
caches that are fixed in 1.1.2.
 I would try to upgrade to 1.1.2 and see if the error still occurs.
 Ivo




 Hi

 As part of a continuous development of a system migration, we have a test
build to take a snapshot of a keyspace from cassandra v 1.0.3 and bulk load
it to a cluster of 1.1.1 using the sstableloader.sh.  Not sure if relevant,
but one of the cf contains a secondary index.

 The build basically does:
 Drop the destination keyspace if exist
 Add the destination keyspace, wait for schema to agree



Re: Increased replication factor not evident in CLI

2012-07-12 Thread Michael Theroux
Sounds a lot like a bug that I hit that was filed and fixed recently:

https://issues.apache.org/jira/browse/CASSANDRA-4432

-Mike

On Jul 12, 2012, at 8:16 PM, Edward Capriolo wrote:

 Possibly the bug with nanotime causing cassandra to think the change happened 
 in the past. Talked about onlist in past few days.


Re: is this something to be concerned about - MUTATION message dropped

2012-07-12 Thread Frank Hsueh
oh.  darn.  I was hoping for something like, here's the data you
requested, and by the way, latencies are 80% to the point of timeout; might
want to back off a little

mx4j it is.


On Wed, Jul 11, 2012 at 10:46 PM, Tyler Hobbs ty...@datastax.com wrote:

 JMX is really the only way it exposes that kind of information.  I
 recommend setting up mx4j if you want to check on the server stats
 programmatically.


 On Wed, Jul 11, 2012 at 8:17 PM, Frank Hsueh frank.hs...@gmail.comwrote:

  out of curiosity, is there a way that Cassandra can communicate that it's
  close to being overloaded ?


 On Sun, Jun 17, 2012 at 6:29 PM, aaron morton aa...@thelastpickle.comwrote:

 http://wiki.apache.org/cassandra/FAQ#dropped_messages

 https://www.google.com/#q=cassandra+dropped+messages

 Cheers


   -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 15/06/2012, at 12:54 AM, Poziombka, Wade L wrote:

 INFO [ScheduledTasks:1] 2012-06-14 07:49:54,355 MessagingService.java
 (line 615) 15 MUTATION message dropped in last 5000ms
  It is at INFO level so I'm inclined to think not, but it seems like
  whenever messages are dropped there may be some issue?





 --
 Frank Hsueh | frank.hs...@gmail.com




 --
 Tyler Hobbs
 DataStax http://datastax.com/




-- 
Frank Hsueh | frank.hs...@gmail.com


Re: Cassandra take 100% CPU for 2~3 minutes every half an hour and mutation lost

2012-07-12 Thread Jason Tang
Hi

After change the parameter of concurrent compactor, we can limit Cassandra
to use 100% of one core at that moment. (concurrent_compactors: 1)

And I captured the stack of the runaway thread; it stays on the same
stack for 2-3 minutes.

Any clues about this issue?

Thread 18114: (state = IN_JAVA)

 - java.util.AbstractList$Itr.hasNext() @bci=8, line=339 (Compiled frame;
information may be imprecise)

 -
org.apache.cassandra.db.ColumnFamilyStore.removeDeletedStandard(org.apache.cassandra.db.ColumnFamily,
int) @bci=6, line=841 (Compiled frame)

 -
org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(org.apache.cassandra.db.ColumnFamily,
int) @bci=17, line=835 (Compiled frame)

 -
org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(org.apache.cassandra.db.ColumnFamily,
int) @bci=8, line=826 (Compiled frame)

 -
org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(org.apache.cassandra.db.DecoratedKey,
org.apache.cassandra.db.compaction.CompactionController,
org.apache.cassandra.db.ColumnFamily) @bci=38, line=77 (Compiled frame)

 -
org.apache.cassandra.db.compaction.PrecompactedRow.init(org.apache.cassandra.db.compaction.CompactionController,
java.util.List) @bci=33, line=102 (Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(java.util.List)
@bci=223, line=133 (Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced()
@bci=44, line=102 (Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced()
@bci=1, line=87 (Compiled frame)

 - org.apache.cassandra.utils.MergeIterator$ManyToOne.consume() @bci=88,
line=116 (Compiled frame)

 - org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext() @bci=5,
line=99 (Compiled frame)

 - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9,
line=140 (Compiled frame)

 - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135
(Compiled frame)

 - com.google.common.collect.Iterators$7.computeNext() @bci=4, line=614
(Compiled frame)

 - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9,
line=140 (Compiled frame)

 - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135
(Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
@bci=542, line=141 (Compiled frame)

 - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=117,
line=134 (Interpreted frame)

 - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=1,
line=114 (Interpreted frame)

 - java.util.concurrent.FutureTask$Sync.innerRun() @bci=30, line=303
(Interpreted frame)

 - java.util.concurrent.FutureTask.run() @bci=4, line=138 (Interpreted
frame)

 -
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
@bci=59, line=886 (Compiled frame)

 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=28, line=908
(Compiled frame)

 - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)



BRs

//Jason



2012/7/11 Jason Tang ares.t...@gmail.com

 Hi

 I'm running into a high-CPU problem on Cassandra 1.0.3; it happens with
 both size-tiered and leveled compaction, 6G heap, 64-bit Oracle Java. Under
 normal traffic, Cassandra uses about 15% CPU.

 But every half an hour, Cassandra uses almost 100% of the total CPU (SUSE,
 12 cores).

 And here is the top information for that moment.

 #top -H -p 12451

 top - 12:30:14 up 15 days, 12:49,  6 users,  load average: 10.52, 8.92,
 8.14
 Tasks: 706 total,  21 running, 685 sleeping,   0 stopped,   0 zombie
 Cpu(s): 25.7%us, 14.0%sy, 48.9%ni,  6.5%id,  0.0%wa,  0.0%hi,  4.9%si,
  0.0%st
 Mem:  24150M total, 12218M used, 11932M free,   142M buffers
 Swap:     0M total,     0M used,     0M free,  3714M cached

    PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  20291 casadm   24   4 8003m 5.4g 167m R   92 22.7  0:42.46 java
  20276 casadm   24   4 8003m 5.4g 167m R   88 22.7  0:43.88 java
  20181 casadm   24   4 8003m 5.4g 167m R   86 22.7  0:52.97 java
  20213 casadm   24   4 8003m 5.4g 167m R   85 22.7  0:49.21 java
  20188 casadm   24   4 8003m 5.4g 167m R   82 22.7  0:54.34 java
  20268 casadm   24   4 8003m 5.4g 167m R   81 22.7  0:46.25 java
  20269 casadm   24   4 8003m 5.4g 167m R   41 22.7  0:15.11 java
  20316 casadm   24   4 8003m 5.4g 167m S   20 22.7  0:02.35 java
  20191 casadm   24   4 8003m 5.4g 167m R   15 22.7  0:16.85 java
  12500 casadm   20   0 8003m 5.4g 167m R    6 22.7  1:07.86 java
  15245 casadm   20   0 8003m 5.4g 167m D    5 22.7  0:36.45 java

 Jstack cannot print the stack.
 Thread 20291: (state = IN_JAVA)
 Error occurred during stack walking:
 ...
 Thread 20276: (state = IN_JAVA)
 Error occurred during stack walking:

 After it come back, the stack shows:
 Thread 20291: (state = BLOCKED)
  - sun.misc.Unsafe.park(boolean, long) 

High RecentWriteLatencyMicro

2012-07-12 Thread rohit bhatia
Hi

As I understand it, writes in Cassandra go straight to memory, and
incrementing counters with CL.ONE shouldn't incur the counter read
latency. So writes that increment counters with CL.ONE should
basically be really fast.

But in my 8-node cluster (16 cores/32G RAM/Cassandra 1.0.5/Java 7 each)
with RF=2, at a traffic of 55k qps = 14k increments per node / 7k write
requests per node, the write latency (from JMX) increases to around 7-8
ms from the low-traffic value of 0.5 ms. The nodes aren't even pushed:
no I/O pressure, lots of free RAM, and 30% CPU idle time / OS load 20.
The write latency from cfstats (supposedly the latency for one node to
increment its counter) is tiny (< 0.05 ms).

1) Is the whole of the 7-8 ms being spent in Thrift overheads and
scheduling delays? (There is an insignificant 0.1 ms ping time between
machines.)

2) Does keeping a large number of CFs (17 in our case) adversely affect
write performance? (apart from the extreme flushing scenario)

3) I see a lot of threads (4,000-10,000) with names like
pool-2-thread-* (pointed out as client-connection threads on the
mailing list before) periodically building up. With idle CPU time
and zero pending tasks in tpstats, why do requests keep piling up? (GC
stops threads for 100 ms every 1-2 seconds, effectively pausing
Cassandra 5-10% of the time, but this doesn't seem to be the reason.)

Thanks
Rohit


Re: How to come up with a predefined topology

2012-07-12 Thread prasenjit mukherjee
On Fri, Jul 13, 2012 at 4:04 AM, aaron morton aa...@thelastpickle.com wrote:
 The logic is here
 https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java#L78

Thanks Aaron for pointing to the code.


  a. n > r : I am assuming, have 1 replica in each rack.

 You have 1 replica in the first n racks.

  b. n < r : ?? I am assuming, try to equally distribute replicas across
  in each racks.

 int(n/r) racks will have the same number of replicas. n % r will have more.

Did you mean r % n (since r > n)?

Shouldn't the logic be: all racks will have at least int(r/n) replicas,
and r % n racks will have 1 additional replica?

Sample use case (r = 8, n = 3):
n1: 3 (2+1)
n2: 3 (2+1)
n3: 2

Is the above understanding correct ?
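The distribution Prasenjit proposes can be sketched numerically; this is an assumption about the intended math (each rack gets at least int(r/n) replicas, and r % n racks get one extra), not NetworkTopologyStrategy's actual placement code:

```python
def replicas_per_rack(r, n):
    """Spread r replicas over n racks as evenly as possible:
    every rack gets at least r // n, and the first r % n racks get one extra."""
    base, extra = divmod(r, n)
    return [base + 1 if i < extra else base for i in range(n)]

# For the sample use case above (r = 8, n = 3):
print(replicas_per_rack(8, 3))  # [3, 3, 2]
```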

-Thanks,
Prasenjit


 This is why multi rack replication can be tricky.

 Hope that helps.


 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 12/07/2012, at 8:05 PM, prasenjit mukherjee wrote:

 Thanks. Some follow up questions :

 1.  How do the reads use strategy/snitch information ? I am assuming
 the reads can go to any of the replicas. WIll it also use the
 snitch/strategy info to find next 'R' replicas 'closest' to
 coordinator-node ?

 2. In a single DC ( with n racks and r replicas ) what algorithm
 cassandra uses to write its replicas in following scenarios :
  a. n > r : I am assuming, have 1 replica in each rack.
  b. n < r : ?? I am assuming, try to equally distribute replicas across
  in each racks.

 -Thanks,
 Prasenjit

 On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs ty...@datastax.com wrote:

 I highly recommend specifying the same rack for all nodes (using

  cassandra-topology.properties) unless you really have a good reason not to

 (and you probably don't).  The way that replicas are chosen when multiple

 racks are in play can be fairly confusing and lead to a data imbalance if

 you don't catch it.
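For reference, cassandra-topology.properties (used by PropertyFileSnitch) maps each node's IP to a DC and rack; a minimal sketch with hypothetical IPs and names, placing every node in the same rack as Tyler suggests:

```properties
# cassandra-topology.properties (IPs, DC and rack names are hypothetical)
192.168.1.10=DC1:RAC1
192.168.1.11=DC1:RAC1
192.168.1.12=DC1:RAC1
# fallback for any node not listed above
default=DC1:RAC1
```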



 On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee prasen@gmail.com

 wrote:


 As far as I know there isn't any way to use the rack name in the

 strategy_options for a keyspace. You

 might want to look at the code to dig into that, perhaps.


 Aha, I was wondering if I could do that as well ( specify rack options )

 :)


 Thanks for the pointer, I will dig into the code.


 -Thanks,

 Prasenjit


 On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe richard.l...@arkivum.com

 wrote:

 If you then specify the parameters for the keyspace to use these, you

 can control exactly which set of nodes replicas end up on.


 For example, in cassandra-cli:


 create keyspace ks1 with placement_strategy =

 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options

 = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };


 As far as I know there isn't any way to use the rack name in the

 strategy_options for a keyspace. You might want to look at the code to dig

 into that, perhaps.


 Whichever snitch you use, the nodes are sorted in order of proximity to

 the client node. How this is determined depends on the snitch that's used

 but most (the ones that ship with Cassandra) will use the default ordering

 of same-node < same-rack < same-datacenter < different-datacenter. Each

 snitch has methods to tell Cassandra which rack and DC a node is in, so it

 always knows which node is closest. Used with the Bloom filters this can

 tell us where the nearest replica is.
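The proximity ordering Richard describes can be sketched as follows; a toy Python illustration of the ranking (node names, DCs, and racks are made up), not the actual snitch code:

```python
def proximity(coordinator, node):
    """Lower score = closer: same-node < same-rack < same-DC < different-DC."""
    if node["name"] == coordinator["name"]:
        return 0
    if node["dc"] == coordinator["dc"]:
        return 1 if node["rack"] == coordinator["rack"] else 2
    return 3

def sort_by_proximity(coordinator, replicas):
    """Order replica endpoints by their distance from the coordinator."""
    return sorted(replicas, key=lambda n: proximity(coordinator, n))
```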




 -Original Message-

 From: prasenjit mukherjee [mailto:prasen@gmail.com]

 Sent: 11 July 2012 06:33

 To: user

 Subject: How to come up with a predefined topology


 Quoting from

 http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy

 :


 Asymmetrical replication groupings are also possible depending on your

 use case. For example, you may want to have three replicas per data center

 to serve real-time application requests, and then have a single replica in a

 separate data center designated to running analytics.


 Have 2 questions :

 1. Any example how to configure a topology with 3 replicas in one DC (

 with 2 in 1 rack + 1 in another rack ) and one replica in another DC ?

 The default NetworkTopologyStrategy with RackInferringSnitch will only

 give me equal distribution ( 2+2 )


 2. I am assuming the reads can go to any of the replicas. Is there a

 client which will send query to a node ( in cassandra ring ) which is

 closest to the client ?


 -Thanks,

 Prasenjit







 --

 Tyler Hobbs

 DataStax