Re: [Cas 2.0.2] Looping Repair since activating PasswordAuthenticator

2013-11-04 Thread Dennis Schwan

Hi Yuki,

thanks for your answer. I still do not know whether it is expected behaviour 
that Cassandra tries to repair these 1280 ranges every time I run a 
nodetool repair on each node?


Regards,
Dennis

On 03.11.2013 03:27, Yuki Morishita wrote:

Hi Dennis,

As you can see in the output,


[2013-10-31 09:39:59,811] Starting repair command #1, repairing 1280 ranges for 
keyspace system_auth

repair was trying to repair 1280 ranges.
I imagine you are using vnodes, and since Cassandra repairs range by range, 
almost sequentially, it will take some time.
You can specify the range to repair using the '-st' and '-et' options.
For more info, see
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/operations/ops_repair_nodes_c.html.
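
For illustration, a hedged sketch of a repair restricted to a single range (the token values below are placeholders, not real tokens from this cluster):

    nodetool repair -st <start_token> -et <end_token> system_auth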

On Thu, Oct 31, 2013 at 3:42 AM, Dennis Schwan dennis.sch...@1und1.de wrote:

Hi there,

I have used this manual:
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/security/security_config_native_authenticate_t.html
to enable the PasswordAuthenticator, but now every time I run a nodetool repair
it repairs the system_auth keyspace, which takes about 10 to 15 minutes.

nodetool repair
[2013-10-31 09:39:59,623] Nothing to repair for keyspace 'system'
[2013-10-31 09:39:59,811] Starting repair command #1, repairing 1280 ranges
for keyspace system_auth

This is what I get on every node every time I start a repair.

Logfile:
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,632 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and
/10.30.9.61 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,634 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and
/10.30.9.58 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,638 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and
/10.30.9.59 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,642 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.60 and
/10.30.9.57 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,643 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.61 and
/10.30.9.58 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,644 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.61 and
/10.30.9.59 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,644 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.61 and
/10.30.9.57 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,645 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.58 and
/10.30.9.59 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,646 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.58 and
/10.30.9.57 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,647 Differencer.java (line
67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints /10.30.9.59 and
/10.30.9.57 are consistent for credentials
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,648 RepairSession.java (line
214) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] credentials is fully
synced
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,942 RepairSession.java (line
157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for
permissions from /10.30.9.60
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,047 RepairSession.java (line
157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for
permissions from /10.30.9.61
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,129 RepairSession.java (line
157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for
permissions from /10.30.9.58
  INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,190 RepairSession.java (line
157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received merkle tree for
permissions from /10.30.9.59

Is this expected behaviour? I have only added a new superuser and changed
the password of the default superuser, so there should not be much to do
at all.

Thanks for your help!
Dennis

--
Dennis Schwan

Oracle DBA
Mail Core

1&1 Internet AG | Brauerstraße 48 | 76135 Karlsruhe | Germany
Phone: +49 721 91374-8738
E-Mail: dennis.sch...@1und1.de | Web: www.1und1.de

Headquarters Montabaur, Amtsgericht Montabaur, HRB 6484

Executive Board: Ralph Dommermuth, Frank Einhellinger, Robert Hoffmann, Andreas
Hofmann, Markus Huhn, Hans-Henning Kettler, Uwe Lamnek, Jan Oetjen,
Christian Würst
Chairman of the Supervisory Board: Michael Scheeren

Member of United Internet


Re: Bad Request: No indexed columns present in by-columns clause with Equal operator?

2013-11-04 Thread Hannu Kröger
I tested the same and it seems that you cannot run such queries using only
indexed columns. You probably need at least one equality condition in the
WHERE clause. I am not sure.

You can achieve your goal by defining the primary key as follows:

create table test (
employee_id text,
employee_name text,
value text,
last_modified_date timeuuid,
primary key (employee_id, last_modified_date)
   );

and then querying like this:
select * from test where last_modified_date > mintimeuuid('2013-11-03
13:33:30') and last_modified_date < maxtimeuuid('2013-11-05 13:33:45')
ALLOW FILTERING;

However, that will be slow because it has to do scanning. Therefore you
need to say ALLOW FILTERING. Without that you will get a warning:
Bad Request: Cannot execute this query as it might involve data filtering
and thus may have unpredictable performance. If you want to execute this
query despite the performance unpredictability, use ALLOW FILTERING

The performance by using Cassandra like this is probably far from optimal.

Hannu




2013/11/3 Techy Teck comptechge...@gmail.com

 Thanks Hannu, I got your point. But in my example `employee_id` won't be
 larger than `32767`, so I am thinking of creating an index on these two
 columns -

 create index employee_name_idx on test (employee_name);
 create index last_modified_date_idx on test (last_modified_date);

 The chances of executing the queries above are very minimal. We will run
 them only very rarely, but if we do, I want the system to be capable of
 handling it.

 Now I can execute the below queries after creating an index -

 select * from test where employee_name = 'e27';

 select employee_id from test where employee_name = 'e27';
 select * from test where employee_id = '1';

 But I cannot execute the below query, which is: give me everything that
 has changed within the last 15 minutes. So I wrote the query like this -

 select * from test where last_modified_date > mintimeuuid('2013-11-03
 13:33:30') and last_modified_date < maxtimeuuid('2013-11-03 13:33:45');

 But it doesn't run, and I always get this error -

 Bad Request: No indexed columns present in by-columns clause with
 Equal operator


 Any thoughts on what I am doing wrong here?



 On Sun, Nov 3, 2013 at 12:43 PM, Hannu Kröger hkro...@gmail.com wrote:

 Hi,

 You cannot query using a field that is not indexed in CQL. You have to
 either create a secondary index, or create index tables, manage those
 indexes yourself, and query using them. Since those keys are of high
 cardinality, the usual recommendation for this kind of use case is to
 create several tables, each containing all the data.

 1) A table with employee_id as the primary key.
 2) A table with last_modified_at as the primary key (use case 2)
 3) A table with employee_name as the primary key (your test query with
 employee_name 'e27' and use cases 1 & 3).

 Then you populate all those tables with your data and then you use those
 tables depending on the query.
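
 For illustration, a rough CQL sketch of two of those query tables (the table
 names are invented here, and the day bucket in the second table is an extra
 assumption to keep the time-ordered partitions bounded):

 -- use cases 1 & 3: look up by employee_name, newest change first
 create table test_by_name (
 employee_name text,
 last_modified_date timeuuid,
 employee_id text,
 value text,
 primary key (employee_name, last_modified_date)
 ) with clustering order by (last_modified_date desc);

 -- use case 2: recent changes, bucketed by day
 create table test_by_change_time (
 day text,
 last_modified_date timeuuid,
 employee_id text,
 value text,
 primary key (day, last_modified_date)
 );

 The application writes every insert to each of these tables and reads from
 whichever table matches the query.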

 Cheers,
 Hannu



 2013/11/3 Techy Teck comptechge...@gmail.com

 I have below table in CQL-

 create table test (
 employee_id text,
 employee_name text,
 value text,
 last_modified_date timeuuid,
 primary key (employee_id)
);


 I inserted a couple of records into the above table like this, which is how
 I will be inserting them in our actual use case as well -

 insert into test (employee_id, employee_name, value,
 last_modified_date) values ('1', 'e27',  'some_value', now());
 insert into test (employee_id, employee_name, value,
 last_modified_date) values ('2', 'e27',  'some_new_value', now());
 insert into test (employee_id, employee_name, value,
 last_modified_date) values ('3', 'e27',  'some_again_value', now());
 insert into test (employee_id, employee_name, value,
 last_modified_date) values ('4', 'e28',  'some_values', now());
 insert into test (employee_id, employee_name, value,
 last_modified_date) values ('5', 'e28',  'some_new_values', now());



 Now I was running a select query for: give me all the employee_id values
 for employee_name `e27`.

 select employee_id from test where employee_name = 'e27';

 And this is the error I am getting -

 Bad Request: No indexed columns present in by-columns clause with
 Equal operator
 Perhaps you meant to use CQL 2? Try using the -2 option when
 starting cqlsh.


 Is there anything wrong I am doing here?

 My use cases are in general -

  1. Give me everything for any of the employee_name?
  2. Give me everything for what has changed in last 5 minutes?
  3. Give me the latest employee_id for any of the employee_name?

 I am running Cassandra 1.2.11






filter using timeuuid column type

2013-11-04 Thread Turi, Ferenc (GE Power Water, Non-GE)
Hi,

Is it possible to filter records by using timeuuid column types in case the 
column is not part of the primary key?

I tried the followings:

[cqlsh 3.1.2 | Cassandra 1.2.10.1 | CQL spec 3.0.0 | Thrift protocol 19.36.0]

CREATE TABLE timeuuid_test2(
row_key text,
time timeuuid,
time2 timeuuid,
message text,
PRIMARY KEY (row_key, time)
)

cqlsh> select * from timeuuid_test2 where time2 > now();

Bad Request: No indexed columns present in by-columns clause with Equal 
operator
I tried to create the required index:

create index timeuuid_test2_idx on timeuuid_test2 (time2);

Bad Request: No indexed columns present in by-columns clause with Equal 
operator
The result is the same...

If the column used is 'time', then everything is OK.

select * from timeuuid_test2 where time > now() ALLOW FILTERING;

The question here: why can't I use the 'time2' column when filtering, despite 
the column being indexed?

Thanks,

Ferenc


Managing index tables

2013-11-04 Thread Thomas Stets
What is the best way to manage index tables on update/deletion of the
indexed data?

I have a table containing all kinds of data for a user, i.e. name, address,
contact data, company data etc. The key to this table is the user ID.

I also maintain about a dozen index tables matching my queries, like name,
email address, company D.U.N.S number, permissions the user has, etc. These
index tables contain the user IDs matching the search key as column names,
with the column values left empty.

Whenever a user is deleted or updated I have to make sure to update the
index tables, i.e. if the permissions of a user changes I have to remove
the user ID from the rows matching the permission he no longer has.

My problem is to find all matching entries, especially for data I no longer
have.

My solution so far is to keep a separate table to keep track of all index
tables and keys the user can be found in. In the case mentioned I look up
the keys for the permissions table, remove the user ID from there, then
remove the entry in the keys table.

This works so far (in production for more than a year and a half), and it
also allows me to clean up after something has gone wrong.
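
For illustration, a rough sketch of that bookkeeping expressed as CQL3 tables
(the table and column names are invented for the example, not taken from the
original schema):

-- one of the index tables: permission value -> user IDs holding it
create table users_by_permission (
permission text,
user_id text,
primary key (permission, user_id)
);

-- reverse map: which index tables and keys currently reference a user
create table index_entries_by_user (
user_id text,
index_table text,
index_key text,
primary key (user_id, index_table, index_key)
);

On an update, the application reads index_entries_by_user for the user, removes
the user ID from the stale index rows it lists, writes the new index rows, and
then updates the reverse map, which mirrors the cleanup flow described above.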

Still, this additional level of meta information adds a lot of complexity. I
was wondering whether there is some kind of pattern that addresses my problem.
I found lots of information saying that creating index tables is the way to
go, but nobody ever mentions maintaining the index tables.

tia, Thomas


Re: Cass 1.1.11 out of memory during compaction ?

2013-11-04 Thread Oleg Dulin
If I do that, wouldn't I need to scrub my sstables ?


Takenori Sato ts...@cloudian.com wrote:
 Try increasing column_index_size_in_kb.
 
 A slice query to get some ranges (SliceFromReadCommand) requires reading
 all the column indexes for the row, and thus could hit OOM if you have a very 
 wide row.
 
 On Sun, Nov 3, 2013 at 11:54 PM, Oleg Dulin oleg.du...@gmail.com wrote:
 
 Cass 1.1.11 ran out of memory on me with this exception (see below).
 
 My parameters are 8gig heap, new gen is 1200M.
 
 ERROR [ReadStage:55887] 2013-11-02 23:35:18,419
 AbstractCassandraDaemon.java (line 132) Exception in thread
 Thread[ReadStage:55887,5,main] java.lang.OutOfMemoryError: Java heap
 space   
 at 
 org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:323)
 
at org.apache.cassandra.utils.ByteBufferUtil.read(
 ByteBufferUtil.java:398)at
 org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:380)
 
at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:88)
 
at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:83)
 
at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:73)
 
at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
 
at org.apache.cassandra.db.columniterator.IndexedSliceReader$
 IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:179)at
 org.apache.cassandra.db.columniterator.IndexedSliceReader.
 computeNext(IndexedSliceReader.java:121)at
 org.apache.cassandra.db.columniterator.IndexedSliceReader.
 computeNext(IndexedSliceReader.java:48)at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 
at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 
at org.apache.cassandra.db.columniterator.
 SSTableSliceIterator.hasNext(SSTableSliceIterator.java:116)at
 org.apache.cassandra.utils.MergeIterator$Candidate.
 advance(MergeIterator.java:147)at
 org.apache.cassandra.utils.MergeIterator$ManyToOne.
 advance(MergeIterator.java:126)at
 org.apache.cassandra.utils.MergeIterator$ManyToOne.
 computeNext(MergeIterator.java:100)at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 
at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 
at org.apache.cassandra.db.filter.SliceQueryFilter.
 collectReducedColumns(SliceQueryFilter.java:117)at
 org.apache.cassandra.db.filter.QueryFilter.
 collateColumns(QueryFilter.java:140)   
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:292)
 
at
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
 
at
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1362)
 
at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1224)
 
at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1159)
 
at org.apache.cassandra.db.Table.getRow(Table.java:378)at
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
 
at org.apache.cassandra.db.ReadVerbHandler.doVerb(
 ReadVerbHandler.java:51)at
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
 
at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 
at java.lang.Thread.run(Thread.java:722)
 
 Any thoughts ?
 
 This is a dual data center set up, with 4 nodes in each DC and RF=2 in each.
 
 --
 Regards,
 Oleg Dulin http://www.olegdulin.com



Re: IllegalStateException when bootstrapping new nodes

2013-11-04 Thread Cyril Scetbon
Can no one find anything useful in our logs? :(

-- 
Cyril SCETBON

On 29 Oct 2013, at 16:38, Cyril Scetbon cyril.scet...@free.fr wrote:

 Sorry, the original link was bad; here is the correct one: 
 http://www.sendspace.com/file/7p81lz



Duplicate hard link - Cassandra 1.2.9

2013-11-04 Thread Elias Ross
Cassandra 1.2.9, embedded into the RHQ 4.9 project.

I'm getting the following:

Caused by: java.lang.RuntimeException: Tried to create duplicate hard link
to
/data05/rhq/data/system/NodeIdInfo/snapshots/1383587405678/system-NodeIdInfo-ic-1-TOC.txt
at
org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:70)
at
org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1081)
at
org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1567)
at
org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1612)
at org.apache.cassandra.db.Table.snapshot(Table.java:194)
at
org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:2203)

Clearing the snapshot directory doesn't seem to fix this issue. Restarting
the node doesn't fix it. This is obviously a bug, but I'm not sure what to
do about it.

I don't have enough context to know what this means or what to do to fix it.


1.1.11: system keyspace is filling up

2013-11-04 Thread Oleg Dulin

I have a dual DC setup, 4 nodes, RF=4 in each.

The one that is used as primary has its system keyspace filling up with 
200 gigs of data, the majority of which is hints.


Why does this happen ?

How can I clean it up ?

--
Regards,
Oleg Dulin
http://www.olegdulin.com




CFP - NoSQL Room FOSDEM Cassandra Community

2013-11-04 Thread laura.czajkow...@gmail.com
Hi all,

We're pleased to announce the call for participation for the NoSQL devroom,
returning after a great last year.

NoSQL is an encompassing term that covers a multitude of different and
interesting database solutions.  As the interest in NoSQL continues to
grow, we are looking for talks on any open source NoSQL database or related
topic.

Speaking slots are 25 or 50 minutes. To propose a talk please go to:
http://bit.ly/nosql-devroom-2013
https://penta.fosdem.org/submission/FOSDEM14

As FOSDEM is a friendly open source conference, please refrain from
slagging matches about each other’s projects. Keep it respectful, keep it
non-commercial, and remember that all decks are subject to approval.

http://bit.ly/nosql-devroom-2013

If you do not want to give a talk yourself but have ideas for NoSQL topics,
send them to the mailing list at nosql-devr...@lists.fosdem.org. Know
someone who might be interested in the devroom? Please forward them this
email on our behalf. Want to help out but don’t know how? Contact us!

The devroom is scheduled for Sunday, February 2nd and has approx 80 seats. The
call for proposals is open until Dec 13th and speakers will be notified by
December 20th. The final schedule will then be announced by January 10th.

Any changes will be announced on the mailing list:
https://lists.fosdem.org/listinfo/nosql-devroom



Original announcement: http://www.lczajkowski.com/2013/10/10/cfp-for-nosql-devroom-at-fosdem/


Laura


Re: Duplicate hard link - Cassandra 1.2.9

2013-11-04 Thread Robert Coli
On Mon, Nov 4, 2013 at 10:08 AM, Elias Ross gen...@noderunner.net wrote:

 Cassandra 1.2.9, embedded into the RHQ 4.9 project.

 I'm getting the following:

 Caused by: java.lang.RuntimeException: Tried to create duplicate hard link
 to
 /data05/rhq/data/system/NodeIdInfo/snapshots/1383587405678/system-NodeIdInfo-ic-


Someone else had a similar issue, but on upgrade to 2.0.x from 1.2.10:

http://mail-archives.apache.org/mod_mbox/cassandra-user/201309.mbox/%3C00b801ceb9ef$e3cc6a70$ab653f50$@struq.com%3E

In that case, it was : https://issues.apache.org/jira/browse/CASSANDRA-6093

But I'm not sure that applies to your issue? If not, file a JIRA! :D

=Rob


Re: 1.1.11: system keyspace is filling up

2013-11-04 Thread Robert Coli
On Mon, Nov 4, 2013 at 11:34 AM, Oleg Dulin oleg.du...@gmail.com wrote:

 I have a dual DC setup, 4 nodes, RF=4 in each.

 The one that is used as primary has its system keyspace fill up with 200
 gigs of data, majority of which is hints.

 Why does this happen ?

 How can I clean it up ?


If you have this many hints, you probably have flapping / frequent network
partition, or very overloaded nodes. If you compare the number of hints to
the number of dropped messages, that would be informative. If you're
hinting because you're dropping, increase capacity. If you're hinting
because of partition, figure out why there's so much partition.

WRT cleaning up hints, they will automatically be cleaned up eventually, as
long as they are successfully being delivered. If you need to clean them up
manually, you can truncate the system.hints column family.
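
For illustration, the manual cleanup would look something like this from cqlsh
(note that this discards any hints that have not yet been delivered):

    cqlsh> TRUNCATE system.hints;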

=Rob


Re: Strange exception when storing heavy data in cassandra 2.0.0...

2013-11-04 Thread Robert Coli
On Fri, Nov 1, 2013 at 10:29 PM, Krishna Chaitanya
bnsk1990r...@gmail.comwrote:

  I am newbie to the  Cassandra world. I am currently using
 Cassandra 2.0.0 with thrift 0.8.0 for storing netflow packets using
 libQtCassandra library.  ... Is this a known issue because it did not occur
 when we were using Cassandra-1.2.6 and previous versions with pycassa
 library for accessing the database. ... How can I avoid this exception and
 also is there any way in which I can get back my node to running state even
 if it means reinstalling cassandra???  Thank You in advance for any help.


https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

You could try upgrading to 2.0.2, or downgrading back to 1.2.6. The former
is likely to be easier/more possible than the latter, but may or may not
resolve your problem. The latter may require a dump/load or reload of your
cluster's data, if sstables have been upgraded to a 2.0 version.

Note: I have not looked for your particular issue in the Apache JIRA; there
may be value in doing so.

=Rob


Re: Duplicate hard link - Cassandra 1.2.9

2013-11-04 Thread Elias Ross
Thanks Robert.

CASSANDRA-6298

Is there a possible workaround? My thinking is that the duplicate hard link is
probably pretty harmless, and getting rid of the check would at least get me
past this issue.


Re: Cass 1.1.11 out of memory during compaction ?

2013-11-04 Thread Takenori Sato
I would go with cleanup.

Be careful for this bug.
https://issues.apache.org/jira/browse/CASSANDRA-5454


On Mon, Nov 4, 2013 at 9:05 PM, Oleg Dulin oleg.du...@gmail.com wrote:

 If I do that, wouldn't I need to scrub my sstables ?


 Takenori Sato ts...@cloudian.com wrote:
  Try increasing column_index_size_in_kb.
 
   A slice query to get some ranges (SliceFromReadCommand) requires reading
   all the column indexes for the row, and thus could hit OOM if you have a
  very wide row.
 
  On Sun, Nov 3, 2013 at 11:54 PM, Oleg Dulin oleg.du...@gmail.com
 wrote:
 
  Cass 1.1.11 ran out of memory on me with this exception (see below).
 
  My parameters are 8gig heap, new gen is 1200M.
 
  ERROR [ReadStage:55887] 2013-11-02 23:35:18,419
  AbstractCassandraDaemon.java (line 132) Exception in thread
  Thread[ReadStage:55887,5,main] java.lang.OutOfMemoryError: Java heap
  space
  at
 org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:323)
 
 at org.apache.cassandra.utils.ByteBufferUtil.read(
  ByteBufferUtil.java:398)at
 
 org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:380)
 
 at
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:88)
 
 at
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:83)
 
 at
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:73)
 
 at
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
 
 at org.apache.cassandra.db.columniterator.IndexedSliceReader$
  IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:179)at
  org.apache.cassandra.db.columniterator.IndexedSliceReader.
  computeNext(IndexedSliceReader.java:121)at
  org.apache.cassandra.db.columniterator.IndexedSliceReader.
  computeNext(IndexedSliceReader.java:48)at
 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 
 at org.apache.cassandra.db.columniterator.
  SSTableSliceIterator.hasNext(SSTableSliceIterator.java:116)at
  org.apache.cassandra.utils.MergeIterator$Candidate.
  advance(MergeIterator.java:147)at
  org.apache.cassandra.utils.MergeIterator$ManyToOne.
  advance(MergeIterator.java:126)at
  org.apache.cassandra.utils.MergeIterator$ManyToOne.
  computeNext(MergeIterator.java:100)at
 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 
 at org.apache.cassandra.db.filter.SliceQueryFilter.
  collectReducedColumns(SliceQueryFilter.java:117)at
  org.apache.cassandra.db.filter.QueryFilter.
  collateColumns(QueryFilter.java:140)
  at
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:292)
 
 at
 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
 
 at
 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1362)
 
 at
 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1224)
 
 at
 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1159)
 
 at org.apache.cassandra.db.Table.getRow(Table.java:378)at
 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
 
 at org.apache.cassandra.db.ReadVerbHandler.doVerb(
  ReadVerbHandler.java:51)at
 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
 
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 
 at java.lang.Thread.run(Thread.java:722)
 
  Any thoughts ?
 
  This is a dual data center set up, with 4 nodes in each DC and RF=2 in
 each.
 
  --
  Regards,
  Oleg Dulin http://www.olegdulin.com




Re: CF name length restrictions (CASSANDRA-4157 and CASSANDRA-4110)

2013-11-04 Thread Aaron Morton
My understanding of CASSANDRA-4110 is that the file name (not the total path 
length) has to be <= 255 chars long. 

On non-Windows platforms in 1.1.0+ you should be ok with KS + CF names that 
combined go up to about 230 chars, leaving room for the extra few things 
Cassandra adds to the SSTable file names. 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 1/11/2013, at 5:54 am, Peter Sanford psanf...@nearbuysystems.com wrote:

 We're working on upgrading from 1.0.12 to 1.1.12. After upgrading a test node 
 I ran into CASSANDRA-4157 which restricts the max length of CF names to <= 48 
 characters. It looks like CASSANDRA-4110 will allow us to upgrade and keep 
 our existing long CF names, but we won't be able to create new CFs with names 
 longer than 48 chars. 
 
 Is there any reason that the logic from 4110 wasn't also applied to the 4157 
 code path? 
 
 (Our naming convention results in a lot of materialized view CFs that have 
 names > 48 characters.)
 
 -psanford



Re: Storage management during rapid growth

2013-11-04 Thread Aaron Morton
 However, when monitoring the performance of our cluster, we see sustained 
 periods - especially during repair/compaction/cleanup - of several hours 
 where there are >2000 IOPS.
If the IOPS are there, compaction / repair / cleanup will use them if the 
configuration allows it. If they are not there and the configuration matches 
the resources, the only issue will be that things take longer (assuming the HW 
can handle the throughput). 

 2) Move some nodes to a SAN solution, ensuring that there is a mix of 
 storage, drives,
IMHO you will have a terrible time and regret the decision. Performance in 
anger rarely matches local disks and when someone decides the SAN needs to go 
through a maintenance process say goodbye to your node. Also you will need very 
good network links. 

Cassandra is designed for a shared-nothing architecture; it’s best to embrace 
that. 


 1) Has anyone moved from SSDs to spinning-platter disks, or managed a cluster 
 that contained both? Do the numbers we're seeing exaggerate the performance 
 hit we'd see if we moved to spinners?
Try to get a feel for the general IOPS used for reads without compaction etc 
running. 
Also for the bytes going into the cluster on the rpc / native binary interface. 
 
 2) Have you successfully used a SAN or a hybrid SAN solution (some local, 
 some SAN-based) to dynamically add storage to the cluster? What type of SAN 
 have you used, and what issues have you run into?
I’ve worked with people who have internal SANS and those that have used EBS. I 
would not describe either solution as optimal. The issues are performance under 
load, network contention, SLA / consistency. 


 3) Am I missing a way of economically scaling storage?
version 1.2+ has better support for fat nodes, nodes with up to 5TB of data via:

* JBOD: mount each disk independently and add it to data_file_directories. 
Cassandra will balance the write load between disks and have one flush thread 
per data directory; I’ve heard this gives good performance with HDDs. This 
will give you 100% of the raw disk capacity and means a single disk failure 
does not necessitate a full node rebuild (see the configuration sketch after 
this list). 
* disk failure: set the disk_failure_policy to best_effort or stop so the node 
can handle disk failure 
https://github.com/apache/cassandra/blob/cassandra-1.2/conf/cassandra.yaml#L125
* have good networking in place so you can rebuild a failed node, either 
completely or from a failed disk. 
* use vnodes so that as the number of nodes grows the time to rebuild a failed 
node drops. 
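
A minimal cassandra.yaml sketch of the JBOD layout described above (the mount
points are placeholders, not taken from the message):

data_file_directories:
    - /mnt/disk1/cassandra/data
    - /mnt/disk2/cassandra/data
    - /mnt/disk3/cassandra/data

# keep serving from the remaining disks if a single disk fails
disk_failure_policy: best_effort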

I would be a little uneasy about very high node loads with only three nodes. 
The main concern is how long it will take to replace a node that completely 
fails. 

I’ve also seen people have a good time moving from SSD to 12 fast disks in a 
RAID10 config.

You can mix HDD and SSD’s and have some hot CF’s on the SSD and others on the 
HDD. 

Hope that helps. 
 

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 1/11/2013, at 10:01 am, Franc Carter franc.car...@sirca.org.au wrote:

 
 I can't comment on the technical question, however one thing I learnt with 
 managing the growth of data is that the $/GB of storage tends to drop at a 
 rate that can absorb a moderate proportion of the increase in cost due to the 
 increase in data size. I'd recommend having a wet-finger-in-the-air stab at 
 projecting the growth in data sizes versus the historical trend in the 
 decrease in the cost of storage.
 
 cheers
 
 
 
 On Fri, Nov 1, 2013 at 7:15 AM, Dave Cowen d...@luciddg.com wrote:
 Hi, all -
 
 I'm currently managing a small Cassandra cluster, several nodes with local 
 SSD storage.
 
 It's difficult to forecast the growth of the Cassandra data over the next 
 couple of years for various reasons, but it is virtually guaranteed to grow 
 substantially.
 
 During this time, there may be times where it is desirable to increase the 
 amount of storage available to each node, but, assuming we are not I/O bound, 
 keep from expanding the cluster horizontally with additional nodes that have 
 local storage. In addition, expanding with local SSDs is costly.
 
 My colleagues and I have had several discussions of a couple of other options 
 that don't involve scaling horizontally or adding SSDs:
 
 1) Move to larger, cheaper spinning-platter disks. However, when monitoring 
 the performance of our cluster, we see sustained periods - especially during 
 repair/compaction/cleanup - of several hours where there are >2000 IOPS. It 
 will be hard to get to that level of performance in each node with spinning 
 platter disks, and we'd prefer not to take that kind of performance hit 
 during maintenance operations.
 
 2) Move some nodes to a SAN solution, ensuring that there is a mix of 
 storage, drives, LUNs and RAIDs so that there isn't a single point of 
 failure. While we're aware that this is frowned on in the Cassandra community 
 due to Cassandra's 

Re: Heap almost full

2013-11-04 Thread Aaron Morton
 When we analyzed the heap, almost all of it was memtables.
What were the top classes ? 
I would normally expect an OOM in pre 1.2 days to be the result of bloom 
filters, compaction meta data and index samples. 

 Is there any known issue with 1.1.5 which causes memtable_total_space_in_mb 
 not to be respected, or not defaulting to 1/3rd of the heap size?
Nothing I can remember. 
We estimate the in memory size of the memtables using the live ratio. That’s 
been pretty good for a while now, but you may want to check the change log for 
changes there. 

 The latest test was running on high performance 32-core, 128 GB RAM, 7 RAID-0 
 1TB disks (regular).
With all those cores grab the TLAB setting from the 1.2 cassandra-env.sh file. 
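
For reference, the setting being referred to is most likely the TLAB flag
shipped in the 1.2 cassandra-env.sh; a sketch of the relevant line (its exact
placement and surrounding conditionals may differ):

    JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"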

Cheers


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 1/11/2013, at 2:59 pm, Arindam Barua aba...@247-inc.com wrote:

 
 Thank you for your responses. In another recent test, the heap actually got 
 full, and we got an out of memory error. When we analyzed the heap, almost 
 all of it was memtables. Is there any known issue with 1.1.5 which causes 
 memtable_total_space_in_mb not to be respected, or not defaulting to 1/3rd of 
 the heap size? Or is it possible that the load in the test is that high that 
 Cassandra is not able to keep flushing even though it starts the process when 
 memtable_total_space_in_mb is 1/3rd of the heap?
 
 We recently switched to LeveledCompaction, however, when we got the earlier 
 heap warning, that was running on SizeTiered.
 The latest test was running on high performance 32-core, 128 GB RAM, 7 RAID-0 
 1TB disks (regular). Earlier tests were run on lesser hardware with the same 
 load, but there was no memory problem. We are running more tests to check if 
 this is always reproducible.
 
 Answering some of the earlier questions if it helps:
 
 We have Cassandra 1.1.5 running in production. Upgrading to the latest 1.2.x 
 release is on the roadmap, but till then this needs to be figured out.
 
  - How much data do you have per node?
 We are running into these errors while running tests in QA starting with 0 
 load. These are around 4 hr tests which end up adding under 1 GB of data on 
 each node of a 4-node ring, or a 2-node ring.
 
  - What is the value of index_interval (cassandra.yaml)?
 It's the default value of 128.
 
 Thanks,
 Arindam
 
 -Original Message-
 From: Aaron Morton [mailto:aa...@thelastpickle.com] 
 Sent: Monday, October 28, 2013 12:09 AM
 To: Cassandra User
 Subject: Re: Heap almost full
 
 1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1:  WARN GCInspector.java (line 
 145) Heap is 0.8287082580489245 full.  You may need to reduce memtable 
 and/or cache sizes.  Cassandra will now flush up to the two largest 
 memtables to free up memory.  Adjust flush_largest_memtables_at threshold in 
 cassandra.yaml if you don't want Cassandra to do this automatically
 This means that the CMS GC was unable to free memory quickly, you've not run 
 out but may do under heavy load. 
 
  CMS uses CPU resources to do its job; how much CPU do you have available? 
  Check the behaviour of the CMS collector using JConsole or another tool to 
  watch the heap size: you should see a nice saw-tooth graph. It should 
  gradually grow, then drop quickly to below 3ish GB. If the heap does not 
  drop low enough after a CMS collection, you will spend more time in GC. 
 
  You may also want to adjust flush_largest_memtables_at to .8 to give CMS a 
  chance to do its work. It starts at .75.
 
 In 1.2+ bloomfilters are off-heap, you can use vnodes...
 +1 for 1.2 with off heap bloom filters. 
 
 - increasing the heap to 10GB.
 
 -1 
  Unless you have a node under heavy memory pressure, pre-1.2 with 1+ billion 
  rows and lots of bloom filters, increasing the heap is not the answer. It 
  will increase the time taken for ParNew & CMS and just kicks the problem 
  down the road. 
 
 Cheers
 
 -
 Aaron Morton
 New Zealand
 @aaronmorton
 
  Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com
 
 On 26/10/2013, at 8:32 am, Alain RODRIGUEZ arodr...@gmail.com wrote:
 
  If you are starting with Cassandra I really advise you to start with 1.2.11.
 
 In 1.2+ bloomfilters are off-heap, you can use vnodes...
 
 I summed up the bloom filter usage reported by nodetool cfstats in all the 
 CFs and it was under 50 MB.
 
 This is quite a small value. Is there no error in your conversion from Bytes 
 read in cfstats ?
 
 If you are trying to understand this could you tell us :
 
  - How much data do you have per node?
  - What is the value of index_interval (cassandra.yaml)?
 
 If you are trying to fix this, you can try :
 
 - changing the memtable_total_space_in_mb to 1024
 - increasing the heap to 10GB.
 
 Hope this will help somehow :).
 
 Good luck
 
 
 2013/10/16 Arindam Barua aba...@247-inc.com
 
 
 During performance testing being run on 

Re: How to generate tokens for my two node Cassandra cluster?

2013-11-04 Thread Aaron Morton
For a while now the binary distribution has included a tool to calculate tokens:

aarons-MBP-2011:apache-cassandra-1.2.11 aaron$ tools/bin/token-generator 
Token Generator Interactive Mode


 How many datacenters will participate in this Cassandra cluster? 1
 How many nodes are in datacenter #1? 3

DC #1:
  Node #1:0
  Node #2:   56713727820156410577229101238628035242
  Node #3:  113427455640312821154458202477256070484
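
For reference, the same spacing can be reproduced by hand; a small Python
sketch using the i*(2**127)/num formula that appears later in this digest
(RandomPartitioner token space):

    num = 3
    for i in range(num):
        print("Node #%d: %d" % (i + 1, i * (2**127) // num))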

Cheers



-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 2/11/2013, at 3:43 am, Ray Sutton ray.sut...@gmail.com wrote:

 Your quotes need to be escaped:
 python -c "num=2; print \"\n\".join([(\"token %d: %d\" %(i,(i*(2**127)/num))) 
 for i in range(0,num)])"
 
 
 --
 Ray  //o-o\\
 
 
 
 On Fri, Nov 1, 2013 at 10:36 AM, Peter Sanford psanf...@nearbuysystems.com 
 wrote:
 I can't tell you why that one-liner isn't working, but you can try 
 http://www.cassandraring.com for generating balanced tokens.
 
 
 On Thu, Oct 31, 2013 at 11:59 PM, Techy Teck comptechge...@gmail.com wrote:
  I am trying to set up a two-node Cassandra cluster on Windows machines. I 
  basically have two Windows machines, and I was following this DataStax tutorial 
 (http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows)
 
 Whenever I use the below command to get the token number from the above 
 tutorial -
 
  python -c "num=2; print "\n".join([("token %d: %d" 
  %(i,(i*(2**127)/num))) for i in range(0,num)])"
 
 
 I always get this error - 
 
  C:\Users\username>python -c "num=2; print "\n".join([("token %d: %d" 
  %(i,(i*(2**127)/num))) for i
  in range(0,num)])"
    File "<string>", line 1
  num=2; print \n.join([(token %d: %d %(i,(i*(2**127)/num))) for i 
  in range(0,num)])
  ^
  SyntaxError: invalid syntax
 
 
 
 



Re: Not able to form a Cassandra cluster of two nodes in Windows?

2013-11-04 Thread Aaron Morton
 My First Node details are - 
 
 initial_token: 0
 seeds: 10.0.0.4  
 listen_address: 10.0.0.4   #IP of Machine - A (Wireless LAN adapter 
 Wireless Network Connection)
 rpc_address: 10.0.0.4
 
 My Second Node details are - 
 
 initial_token: 0
 seeds: 10.0.0.4
 listen_address: 10.0.0.7   #IP of Machine - B (Wireless LAN adapter 
 Wireless Network Connection)
 rpc_address: 10.0.0.7
 

You cannot have two nodes with the same token. Is there an error in the logs? 

If you are just starting out, the simplest thing is to delete all the data and 
restart the machines. 
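
For illustration, a sketch of how the second node's settings would differ once
the duplicate token is removed (the token value is a placeholder; generate a
real one for your partitioner, for example with the token-generator tool
mentioned earlier in this digest):

    initial_token: <a token other than 0>
    seeds: 10.0.0.4
    listen_address: 10.0.0.7
    rpc_address: 10.0.0.7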

Hope that helps. 

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 2/11/2013, at 6:02 am, Techy Teck comptechge...@gmail.com wrote:

 In my case, both of my laptops are running Windows 7 64 bit. Not sure what's 
 the problem...
 
 
 On Fri, Nov 1, 2013 at 4:48 AM, Aaron Mintz aaron.m.mi...@gmail.com wrote:
 One issue I ran into that produced similar symptoms: if you have 
 internode_compression turned on without the proper snappy library available 
  for your architecture (I had 64-bit Linux), starting up will fail to link the 
 nodes. It'll also be silent unless you set a certain class logging level to 
 DEBUG, but it basically presented as if nodes would each form their own 
 single-machine ring
 
 
 On Fri, Nov 1, 2013 at 3:52 AM, Techy Teck comptechge...@gmail.com wrote:
  I am trying to set up a two-node Cassandra cluster on my Windows machines. 
  Basically, I have two Windows machines, and on both of them I have 
  installed Cassandra 1.2.11 from DataStax. Now I was following this 
  [tutorial](http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows)
   to set up a two-node Cassandra cluster.
 
  After installing Cassandra on those two machines, I stopped the services 
  for the Cassandra server, DataStax OpsCenter, and the DataStax OpsCenter 
  agent on both machines. 
 
 And then I started making changes in the yaml file - 
 
 My First Node details are - 
 
 initial_token: 0
 seeds: 10.0.0.4  
 listen_address: 10.0.0.4   #IP of Machine - A (Wireless LAN adapter 
 Wireless Network Connection)
 rpc_address: 10.0.0.4
 
 My Second Node details are - 
 
 initial_token: 0
 seeds: 10.0.0.4
 listen_address: 10.0.0.7   #IP of Machine - B (Wireless LAN adapter 
 Wireless Network Connection)
 rpc_address: 10.0.0.7
 
  Both of my servers start up properly after I start the services. But 
  somehow they are not forming a cluster of two nodes. Is there 
  anything I am missing here?
 
 Machine-A Nodetool Information-
 
 Datacenter: datacenter1
 ==
 Replicas: 1
 
 Address   RackStatus State   LoadOwns
 Token
 
 
 10.0.0.4  rack1   Up Normal  212.1 KB100.00% 
 5264744098649860606
 
 Machine-B Nodetool Information-
 
 Starting NodeTool
 
 Datacenter: datacenter1
 ==
 Replicas: 1
 
 Address   RackStatus State   LoadOwns
 Token
 
 
 10.0.0.7  rack1   Up Normal  68.46 KB100.00% 
 407804996740764696