Re: Cassandra and "-server" JVM parameter

2013-06-07 Thread Romain Hardouin
Hi,

In HotSpot 64-bit, only the server JIT is included.
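A quick way to confirm this on a given box is just:

    java -version

On a 64-bit HotSpot build the output reports a "64-Bit Server VM", and (as far
as I know) passing -client is accepted but silently ignored, so the -server
flag is effectively a no-op there.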

Cheers





Re: Bulk loader with Cassandra 1.2.5

2013-06-07 Thread Keith Wright
Looking into it further, I believe your issue is that you did not define the 
table with compact storage.  Without that, CQL3 will treat every column as a 
composite (as is hinted in your stack trace where you see AbstractCompositeType 
is the cause of the error).  Try changing your table definition as follows:


create table users (
  id uuid primary key,
  firstname varchar,
  lastname varchar,
  password varchar,
  age int,
  email varchar
) WITH COMPACT STORAGE
  AND compaction = {'class' : 'LeveledCompactionStrategy' };

From: Davide Anastasia <davide.anasta...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Friday, June 7, 2013 2:11 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Bulk loader with Cassandra 1.2.5

AbstractCompositeType.java


Re: Bulk loader with Cassandra 1.2.5

2013-06-07 Thread Davide Anastasia
Hi Keith,
You are my hero :-)
It does work now.

Thanks a lot,
Davide
On 7 Jun 2013 10:57, "Keith Wright"  wrote:

> Looking into it further, I believe your issue is that you did not define
> the table with compact storage.  Without that, CQL3 will treat every column
> as a composite (as is hinted in your stack trace where you see
> AbstractCompositeType is the cause of the error).  Try changing your table
> definition as follows:
>
> create table users (
> id uuid primary key,
> firstname varchar,
> lastname varchar,
> password varchar,
> age int,
> email varchar)
>
> WITH COMPACT STORAGE
>
> and compaction = {'class' : 'LeveledCompactionStrategy' }
>
>
> From: Davide Anastasia 
> Reply-To: "user@cassandra.apache.org" 
> Date: Friday, June 7, 2013 2:11 AM
> To: "user@cassandra.apache.org" 
> Subject: Re: Bulk loader with Cassandra 1.2.5
>
> AbstractCompositeType.java
>


Re: Reduce Cassandra GC

2013-06-07 Thread Joel Samuelsson
I keep having issues with GC. Besides the cluster mentioned above, we also
have a single-node development cluster with the same issues. This node
has 12.33 GB of data, a couple of million skinny rows and basically no load.
It has default memory settings but keeps getting very long stop-the-world GC
pauses:
INFO [ScheduledTasks:1] 2013-06-07 10:37:02,537 GCInspector.java (line 122)
GC for ParNew: 99342 ms for 1 collections, 1400754488 used; max is
4114612224
To try to rule out the amount of memory, I set it to 16GB (we're on a virtual
environment), with 4GB of it for the Cassandra heap, but that didn't help
either; the incredibly long GC pauses keep coming.
So I think something else is causing these issues, unless everyone is
having really long GC pauses (which I doubt). I came across this thread:
http://www.mail-archive.com/user@cassandra.apache.org/msg24042.html
suggesting # date -s “`date`” might help my issues. It didn't however
(unless I am supposed to replace that second date with the actual date?).

Has anyone had similar issues?
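
In case it helps anyone dig into this, one way to get more detail on the pauses
is to turn on GC logging in cassandra-env.sh (standard HotSpot flags; the log
path is just an example):

JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"

That at least shows whether the time is really spent collecting or in reaching
a safepoint, which on a virtual environment can also point at swapping or a
stalled hypervisor.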


2013/4/17 aaron morton 

> > INFO [ScheduledTasks:1] 2013-04-15 14:00:02,749 GCInspector.java (line
> 122) GC for ParNew: 338798 ms for 1 collections, 592212416 used; max is
> 1046937600
> This does not say that the heap is full.
> ParNew is GC activity for the new heap, which is typically a smaller part
> of the overall heap.
>
> It sounds like you are running with defaults for the memory config, which
> is generally a good idea. But 4GB total memory for a node is on the small
> side.
>
> Try some changes, edit the cassandra-env.sh file and change
>
> MAX_HEAP_SIZE="2G"
> HEAP_NEWSIZE="400M"
>
> You may also want to try:
>
> MAX_HEAP_SIZE="2G"
> HEAP_NEWSIZE="800M"
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
>
> The size of the new heap generally depends on the number of cores
> available, see the comments in the -env file.
>
> An older discussion about memory use; note that in 1.2 the bloom filters
> (and compression data) are off heap now.
> http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 17/04/2013, at 11:06 PM, Joel Samuelsson 
> wrote:
>
> > You're right, it's probably hard. I should have provided more data.
> >
> > I'm running Ubuntu 10.04 LTS with JNA installed. I believe this line in
> the log indicates that JNA is working, please correct me if I'm wrong:
> > CLibrary.java (line 111) JNA mlockall successful
> >
> > Total amount of RAM is 4GB.
> >
> > My description of data size was very bad. Sorry about that. Data set
> size is 12.3 GB per node, compressed.
> >
> > Heap size is 998.44MB according to nodetool info.
> > Key cache is 49MB according to nodetool info.
> > Row cache size is 0 bytes according to nodetool info.
> > Max new heap is 205MB according to Memory Pool "Par Eden Space"
> max in jconsole.
> > Memtable is left at default which should give it 333MB according to
> documentation (uncertain where I can verify this).
> >
> > Our production cluster seems similar to your dev cluster so possibly
> increasing the heap to 2GB might help our issues.
> >
> > I am still interested in getting rough estimates of how much heap will
> be needed as data grows. Other than empirical studies how would I go about
> getting such estimates?
> >
> >
> > 2013/4/16 Viktor Jevdokimov 
> > How one could provide any help without any knowledge about your cluster,
> node and environment settings?
> >
> >
> >
> > 40GB was calculated from 2 nodes with RF=2 (each has 100% data range),
> 2.4-2.5M rows * 6 cols * 3kB as a minimum without compression and any
> overhead (sstable, bloom filters and indexes).
> >
> >
> >
> > With ParNew GC times such as yours, even if it is a swapping issue, I
> could only say that the heap size is too small.
> >
> >
> >
> > Check Heap, New Heap sizes, memtable and cache sizes. Are you on Linux?
> Is JNA installed and used? What is total amount of RAM?
> >
> >
> >
> > Just for a DEV environment we use 3 virtual machines with 4GB RAM and
> use 2GB heap without any GC issue with amount of data from 0 to 16GB
> compressed on each node. Memtable space sized to 100MB, New Heap 400MB.
> >
> >
> >
> > Best regards / Pagarbiai
> > Viktor Jevdokimov
> > Senior Developer
> >
> > Email: viktor.jevdoki...@adform.com
> > Phone: +370 5 212 3063, Fax +370 5 261 0453
> > J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
> > Follow us on Twitter: @adforminsider
> > Take a ride with Adform's Rich Media Suite
> > 
> > 
> >

Re: unable to delete

2013-06-07 Thread Nikolay Mihaylov
Hi

please note that when you drop a column family, the data on disk is not
deleted.

This is something you should do yourself.

>> Do the files get deleted on GC/server restart?
The question actually translates to: does the column family still exist after
the restart?

John, please correct me if I am explaining it wrong.

Nick.


On Mon, Jun 3, 2013 at 10:14 PM, Robert Coli  wrote:

> On Mon, Jun 3, 2013 at 11:57 AM, John R. Frank  wrote:
> > Is it considered normal for cassandra to experience this error:
> >
> > ERROR [NonPeriodicTasks:1] 2013-06-03 18:17:05,374
> SSTableDeletingTask.java
> > (line 72) Unable to delete
> > /raid0/cassandra/data///--ic-19-Data.db (it
> will
> > be removed on server restart; we'll also retry after GC)
>
>
> cassandra//src/java/org/apache/cassandra/io/sstable/SSTableDeletingTask.java
> "
>File datafile = new File(desc.filenameFor(Component.DATA));
> if (!datafile.delete())
> {
> logger.error("Unable to delete " + datafile + " (it will
> be removed on server restart; we'll also retry after GC)");
> failedTasks.add(this);
> return;
> }
> "
>
> There are contexts where it is appropriate for Cassandra to be unable
> to delete a file using io.File.delete.
>
> "
> // Deleting sstables is tricky because the mmapping might not have
> been finalized yet,
> // and delete will fail (on Windows) until it is (we only force the
> unmapping on SUN VMs).
> // Additionally, we need to make sure to delete the data file first,
> so on restart the others
> // will be recognized as GCable.
> "
>
> Do the files get deleted on GC/server restart?
>
> > This is on the DataStax EC2 AMI in a two-node cluster.  After deleting
> 1,000
> > rows from a CF with 20,000 rows, the DB becomes slow, and I'm trying to
> > figure out why.  Could this error message be pointing at a proximate
> cause?
>
> Almost certainly not. By the time that a sstable file is subject to
> deletion, it should no longer be "live". When it is no longer "live"
> it is not in the read path.
>
> You can verify this by using nodetool getsstables on a given key.
>
> What operation are you trying to do when the "DB becomes slow"?
>
> =Rob
>


Re: [Cassandra] Conflict resolution in Cassandra

2013-06-07 Thread Edward Capriolo
Conflicts are managed at the column level.
1) If two columns have the same name the column with the highest timestamp
wins.
2) If two columns have the same column name and the same timestamp the
value of the column is compared and the highest* wins.

Someone correct me if I am wrong about the *. I know the algorithm is
deterministic, I do not remember if it is highest or lowest.
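
To make rule 1 concrete, with client-supplied timestamps in CQL3 (table and
values are made up for illustration):

INSERT INTO users (id, email) VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'first@example.com') USING TIMESTAMP 100;
INSERT INTO users (id, email) VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'second@example.com') USING TIMESTAMP 200;

A read afterwards returns 'second@example.com' no matter which write reached a
replica last, because 200 > 100. The value comparison in rule 2 only kicks in
when both writes carry exactly the same timestamp.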


On Thu, Jun 6, 2013 at 6:25 PM, Emalayan Vairavanathan  wrote:

> I tried google and found conflicting answers. That's why I wanted to double
> check with the user forum.
>
> Thanks
>
>   --
>  *From:* Bryan Talbot 
> *To:* user@cassandra.apache.org; Emalayan Vairavanathan <
> svemala...@yahoo.com>
> *Sent:* Thursday, 6 June 2013 3:19 PM
> *Subject:* Re: [Cassandra] Conflict resolution in Cassandra
>
> For generic questions like this, google is your friend:
> http://lmgtfy.com/?q=cassandra+conflict+resolution
>
> -Bryan
>
>
> On Thu, Jun 6, 2013 at 11:23 AM, Emalayan Vairavanathan <
> svemala...@yahoo.com> wrote:
>
> Hi All,
>
> Can someone tell me about the conflict resolution mechanisms provided by
> Cassandra?
>
> More specifically, does Cassandra provide a way to define
> application-specific conflict resolution mechanisms (per-row / per-column
> basis)? Or does it automatically manage conflicts based on some
> synchronization algorithm?
>
>
> Thank you
> Emalayan
>
>
>
>
>
>


Re: [Cassandra] Conflict resolution in Cassandra

2013-06-07 Thread Theo Hultberg
Like Edward says, Cassandra's conflict resolution strategy is LWW (last
write wins). This may seem simplistic, but Cassandra's Big Query-esque data
model makes it less of an issue than in a pure key/value store like Riak,
for example. When all you have is an opaque value for a key, you want to be
able to do things like keeping conflicting writes so that you can resolve
them later. Since Cassandra's rows aren't opaque, but more like a sorted
map, LWW is almost always enough. With Cassandra you can add new
columns/cells to a row from multiple clients without having to worry about
conflicts. It's only when multiple clients write to the same column/cell
that there is an issue, but in that case you usually can (and you probably
should) model your way around that.
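
A concrete example of modelling around it (a sketch, not from the original
thread): instead of several clients overwriting one "state" cell, give each
client its own cell and resolve at read time:

CREATE TABLE item_state (
  item_id uuid,
  client_id uuid,
  state text,
  PRIMARY KEY (item_id, client_id)
);

Each client only ever writes its own (item_id, client_id) cell, so concurrent
writers never collide on the same column, and a reader can fetch the whole row
and apply whatever application-level resolution it likes.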

T#


On Fri, Jun 7, 2013 at 4:51 PM, Edward Capriolo wrote:

> Conflicts are managed at the column level.
> 1) If two columns have the same name the column with the highest timestamp
> wins.
> 2) If two columns have the same column name and the same timestamp the
> value of the column is compared and the highest* wins.
>
> Someone correct me if I am wrong about the *. I know the algorithm is
> deterministic, I do not remember if it is highest or lowest.
>
>
> On Thu, Jun 6, 2013 at 6:25 PM, Emalayan Vairavanathan <
> svemala...@yahoo.com> wrote:
>
>> I tried google and found conflicting answers. That's why I wanted to double
>> check with the user forum.
>>
>> Thanks
>>
>>   --
>>  *From:* Bryan Talbot 
>> *To:* user@cassandra.apache.org; Emalayan Vairavanathan <
>> svemala...@yahoo.com>
>> *Sent:* Thursday, 6 June 2013 3:19 PM
>> *Subject:* Re: [Cassandra] Conflict resolution in Cassandra
>>
>> For generic questions like this, google is your friend:
>> http://lmgtfy.com/?q=cassandra+conflict+resolution
>>
>> -Bryan
>>
>>
>> On Thu, Jun 6, 2013 at 11:23 AM, Emalayan Vairavanathan <
>> svemala...@yahoo.com> wrote:
>>
>> Hi All,
>>
>> Can someone tell me about the conflict resolution mechanisms provided by
>> Cassandra?
>>
>> More specifically, does Cassandra provide a way to define
>> application-specific conflict resolution mechanisms (per-row / per-column
>> basis)? Or does it automatically manage conflicts based on some
>> synchronization algorithm?
>>
>>
>> Thank you
>> Emalayan
>>
>>
>>
>>
>>
>>
>


Data model for financial time series

2013-06-07 Thread Davide Anastasia
Hi,
I am trying to build the storage of stock prices in Cassandra. My queries are 
ideally of three types:
- give me everything between time A and time B;
- give me everything about symbol X;
- give me everything of type Y;
...or an intersection of the three. Something I will be happy doing is:
- give me all the trades about APPL between 7:00am and 3:00pm of a certain day.

However, being a time series, I will be happy to retrieve the data in ascending 
order of timestamp (from 7:00 to 3:00).

I have tried to build my table with the timestamp (as timeuuid) as the primary
key; however, I cannot manage to get my data in order, and "order by" in CQL3
raises an error and doesn't perform the query.

Does anybody have any suggestion for a good design that fits my queries?
Thanks,
David


Re: Data model for financial time series

2013-06-07 Thread Jake Luciani
We have built a similar system; you can read about our data model in CQL3
here:

http://www.slideshare.net/carlyeks/nyc-big-tech-day-2013

We are going to be presenting a similar talk next week at the cassandra
summit.
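
The general shape is a compound primary key that partitions by instrument (and
a time bucket) and clusters by event time; a rough sketch, not the exact schema
from the slides:

CREATE TABLE trades (
  symbol text,
  day text,             -- e.g. '2013-06-07', keeps partitions bounded
  event_time timeuuid,
  type text,
  price double,
  size int,
  PRIMARY KEY ((symbol, day), event_time)
) WITH CLUSTERING ORDER BY (event_time ASC);

With that layout, "everything for one symbol between 7:00am and 3:00pm of a
given day" is a single-partition slice on event_time, returned in timestamp
order; ORDER BY works here because event_time is a clustering column rather
than the partition key. Queries by type alone would need either a second table
keyed by type or a secondary index.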


On Fri, Jun 7, 2013 at 12:34 PM, Davide Anastasia <
davide.anasta...@qualitycapital.com> wrote:

>  Hi,
>
> I am trying to build the storage of stock prices in Cassandra. My queries
> are ideally of three types:
>
> - give me everything between time A and time B;
>
> - give me everything about symbol X;
>
> - give me everything of type Y;
>
> …or an intersection of the three. Something I will be happy doing is:
>
> - give me all the trades about APPL between 7:00am and 3:00pm of a certain
> day.
>
>
> However, being a time series, I will be happy to retrieve the data in
> ascending order of timestamp (from 7:00 to 3:00).
>
>
> I have tried to build my table with the timestamp (as timeuuid) as the
> primary key; however, I cannot manage to get my data in order, and “order
> by” in CQL3 raises an error and doesn’t perform the query.
>
> Does anybody have any suggestion for a good design that fits my queries?
>
>
> Thanks,
>
> David
>



-- 
http://twitter.com/tjake


Re: unable to delete

2013-06-07 Thread Radim Kolar



> Could this error message be pointing at a proximate cause?

no


Re: Reduce Cassandra GC

2013-06-07 Thread Igor
If you are talking about 1.2.x, then I also have memory problems on an
idle cluster: Java memory constantly grows slowly up to the limit, then spends
a long time in GC. I never saw such behaviour with 1.0.x and 1.1.x, where
on an idle cluster the Java memory stays at the same value.


On 06/07/2013 05:19 PM, Joel Samuelsson wrote:
I keep having issues with GC. Besides the cluster mentioned above, we 
also have a single node development cluster having the same issues. 
This node has 12.33 GB data, a couple of million skinny rows and 
basically no load. It has default memory settings but keeps getting 
very long stop-the-world GC pauses:
INFO [ScheduledTasks:1] 2013-06-07 10:37:02,537 GCInspector.java (line 
122) GC for ParNew: 99342 ms for 1 collections, 1400754488 used; max 
is 4114612224
To try to rule out amount of memory, I set it to 16GB (we're on a 
virtual environment), with 4GB of it for Cassandra heap but that 
didn't help either, the incredibly long GC pauses keep coming.
So I think something else is causing these issues, unless everyone is 
having really long GC pauses (which I doubt). I came across this thread:

http://www.mail-archive.com/user@cassandra.apache.org/msg24042.html
suggesting # date -s “`date`” might help my issues. It didn't however 
(unless I am supposed to replace that second date with the actual date?).


Has anyone had similar issues?


2013/4/17 aaron morton


> INFO [ScheduledTasks:1] 2013-04-15 14:00:02,749 GCInspector.java
(line 122) GC for ParNew: 338798 ms for 1 collections, 592212416
used; max is 1046937600
This does not say that the heap is full.
ParNew is GC activity for the new heap, which is typically a
smaller part of the overall heap.

It sounds like you are running with defaults for the memory
config, which is generally a good idea. But 4GB total memory for a
node is on the small side.

Try some changes, edit the cassandra-env.sh file and change

MAX_HEAP_SIZE="2G"
HEAP_NEWSIZE="400M"

You may also want to try:

MAX_HEAP_SIZE="2G"
HEAP_NEWSIZE="800M"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"

The size of the new heap generally depends on the number of cores
available, see the comments in the -env file.

An older discussion about memory use; note that in 1.2 the bloom
filters (and compression data) are off heap now.
http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html

Hope that helps.

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/04/2013, at 11:06 PM, Joel Samuelsson <samuelsson.j...@gmail.com> wrote:

> You're right, it's probably hard. I should have provided more data.
>
> I'm running Ubuntu 10.04 LTS with JNA installed. I believe this
line in the log indicates that JNA is working, please correct me
if I'm wrong:
> CLibrary.java (line 111) JNA mlockall successful
>
> Total amount of RAM is 4GB.
>
> My description of data size was very bad. Sorry about that. Data
set size is 12.3 GB per node, compressed.
>
> Heap size is 998.44MB according to nodetool info.
> Key cache is 49MB according to nodetool info.
> Row cache size is 0 bytes according to nodetool info.
> Max new heap is 205MB according to Memory Pool "Par Eden
Space" max in jconsole.
> Memtable is left at default which should give it 333MB according
to documentation (uncertain where I can verify this).
>
> Our production cluster seems similar to your dev cluster so
possibly increasing the heap to 2GB might help our issues.
>
> I am still interested in getting rough estimates of how much
heap will be needed as data grows. Other than empirical studies
how would I go about getting such estimates?
>
>
> 2013/4/16 Viktor Jevdokimov <viktor.jevdoki...@adform.com>
> How one could provide any help without any knowledge about your
cluster, node and environment settings?
>
>
>
> 40GB was calculated from 2 nodes with RF=2 (each has 100% data
range), 2.4-2.5M rows * 6 cols * 3kB as a minimum without
compression and any overhead (sstable, bloom filters and indexes).
>
>
>
> With ParNew GC time such as yours even if it is a swapping issue
I could say only that heap size is too small.
>
>
>
> Check Heap, New Heap sizes, memtable and cache sizes. Are you on
Linux? Is JNA installed and used? What is total amount of RAM?
>
>
>
> Just for a DEV environment we use 3 virtual machines with 4GB
RAM and use 2GB heap without any GC issue with amount of data from
0 to 16GB compressed on each node. Memtable space sized to 100MB,
New Heap 400MB.
>
>
>
> Best regards / Pagarbiai
> Viktor Jevdokimo

Re: changing ips on node replacement

2013-06-07 Thread Robert Coli
On Fri, May 24, 2013 at 10:01 AM, Robert Coli  wrote:
> On Fri, May 24, 2013 at 9:01 AM, Hiller, Dean  wrote:
>> I seem to remember problems with ghost nodes, etc. and I seem to remember if 
>> you are replacing a node and you don’t use the same ip, this can cause 
>> issues.  Is this correct?
>
> If you don't use replace_token, this won't work at all. You'll get
> "attempt to bootstrap node into range of live node" type error,
> because the old ip will still own the token.

As a followup, this is actually incorrect.

If you:

1) have dead (or live) node A with token X
2) start node B with token X and auto_bootstrap=false
3) node B will take over responsibility for token X from node A
without bootstrapping, due to having a higher generation number

This is a pretty good reason to make sure that you don't:

a) have auto_bootstrap=false in your config file
b) have cassandra set to auto start
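
Spelled out, the combination that lets node B silently take over node A's range
looks like this in B's cassandra.yaml (a sketch; the token value is whatever A
owns):

initial_token: <token X>
auto_bootstrap: false

An accidentally restarted (or cloned) node carrying those two lines can claim a
live node's token without streaming any data.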

=Rob


headed to cassandra conference next week in San Fran?

2013-06-07 Thread Hiller, Dean
I would not mind meeting people there.  My cell is 303-517-8902, best to text 
me probably or just email me at d...@alvazan.com.

Later,
Dean


Re: headed to cassandra conference next week in San Fran?

2013-06-07 Thread Faraaz Sareshwala
I'll be attending  and will try and meet up with you :). I see your posts often
on this list -- would love to pick your brain and learn more about what you are
using cassandra for and how it's working for you.

I'm a software engineer at Quantcast and we're just beginning to use cassandra.
So far it's been great, but there's still a lot to learn in this space.

See you at the conference, hopefully!

Faraaz

On Fri, Jun 07, 2013 at 01:15:08PM -0700, Hiller, Dean wrote:
> I would not mind meeting people there.  My cell is 303-517-8902, best to text 
> me probably or just email me at d...@alvazan.com.
> 
> Later,
> Dean


Cassandra (1.2.5) + Pig (0.11.1) Errors with large column families

2013-06-07 Thread Mark Lewandowski
I'm currently trying to get Cassandra (1.2.5) and Pig (0.11.1) to play nice
together.  I'm running a basic script:

rows = LOAD 'cassandra://keyspace/colfam' USING CassandraStorage();
dump rows;

This fails for my column family which has ~100,000 rows.  However, if I
modify the script to this:

rows = LOAD 'cassandra://betable_games/bets' USING CassandraStorage();
rows = limit rows 7000;
dump rows;

Then it seems to work.  7000 is about as high as I've been able to get it
before it fails.  The error I keep getting is:

2013-06-07 14:58:49,119 [Thread-4] WARN
 org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: org.apache.thrift.TException: Message length
exceeded: 4480
at
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
at
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
at
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at
org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.getProgress(PigRecordReader.java:169)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:514)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:539)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
Caused by: org.apache.thrift.TException: Message length exceeded: 4480
at
org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
at
org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
at org.apache.cassandra.thrift.Column.read(Column.java:535)
at
org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
at
org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at
org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
at
org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
at
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
... 13 more


I've seen a similar problem on this mailing list with Cassandra 1.2.3;
however, the fixes from that thread (increasing
thrift_framed_transport_size_in_mb and thrift_max_message_length_in_mb in
cassandra.yaml) did not appear to have any effect.  Has anyone else seen
this issue, and how can I fix it?

Thanks,

-Mark


Re: smallest/largest UUIDs for LexicalUUIDType

2013-06-07 Thread John R. Frank
Follow-up question:  it seems that range queries on the *second* field 
of a CompositeType(UUIDType(), UUIDType()) do not work.


If I concatenate the two UUID.hex values into a 32-character string 
instead of a CompositeType of two UUIDs, then range queries work 
correctly.


This is illustrated below... so the question is:  what is the point of a 
CompositeType if range queries only work on the first field?  Is it just a 
convenience class for keeping things strongly typed and cleanly organized, 
or did I break something in the way I set up CompositeType in the example 
earlier in this thread?



def join_uuids(*uuids):
return ''.join(map(attrgetter('hex'), uuids))

def split_uuids(uuid_str):
return map(lambda s: uuid.UUID(hex=''.join(s)), grouper(uuid_str, 32))

def grouper(iterable, n, fillvalue=None):
"Collect data into fixed-length chunks or blocks"
# grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
args = [iter(iterable)] * n
return itertools.izip_longest(fillvalue=fillvalue, *args)

def test_composite_column_names_second_level_range_query_with_decomposited_keys():
'''
check that we can execute range queries on the second part of a
CompositeType column name after we unpack the composite key into a
long string of concatenated hex forms of the UUIDs
'''
sm = SystemManager(chosen_server)
sm.create_keyspace(namespace, SIMPLE_STRATEGY, {'replication_factor': '1'})

family = 'test'
sm.create_column_family(
namespace, family, super=False,
key_validation_class = ASCII_TYPE,
default_validation_class = BYTES_TYPE,
comparator_type=UTF8Type(),
)
pool = ConnectionPool(namespace, config['storage_addresses'],
  max_retries=1000, pool_timeout=10, pool_size=2, 
timeout=120)

cf = pycassa.ColumnFamily(pool, family)
u1, u2, u3, u4 = uuid.uuid1(), uuid.uuid1(), uuid.uuid1(), uuid.uuid1()

cf.insert('inbound', {join_uuids(u1, u2): b''})
cf.insert('inbound', {join_uuids(u1, u3): b''})
cf.insert('inbound', {join_uuids(u1, u4): b''})

## test range searching
start  = uuid.UUID(int=u3.int - 1)
finish = uuid.UUID(int=u3.int + 1)
assert start.int < u3.int < finish.int
rec3 = cf.get('inbound',
  column_start =join_uuids(u1, start),
  column_finish=join_uuids(u1, finish)).items()
assert len(rec3) == 1
assert split_uuids(rec3[0][0])[1] == u3
## This assert above passes!

## This next part fails :-/
## now insert many rows -- enough that some should fall in each
## subrange below
for i in xrange(1000):
cf.insert('inbound', {join_uuids(u1, uuid.uuid4()): b''})

## do four ranges, and expect more than zero in each
step_size = 2**(128 - 2)
for i in range(2**2, 0, -1):
start =  uuid.UUID(int=(i-1) * step_size)
finish = uuid.UUID(int=min(i * step_size, 2**128 - 1))
recs = cf.get('inbound',
  column_start =join_uuids(u1, start),
  column_finish=join_uuids(u1, finish)).items()
for key, val in recs:
key = split_uuids(key)
assert val == b''
assert key[0] == u1
assert key[1] < finish
assert start < key[1]   ## this passes!! (fails with CompositeType...)

assert len(recs) > 0
print len(recs), ' for ', start, finish

sm.close()


Multiple data center performance

2013-06-07 Thread Daning Wang
We have deployed multiple data centers but are seeing a performance issue. When
the nodes in the other data center are up, the read response time from clients
is 4 or 5 times higher. When we take those nodes down, the response time becomes
normal (compared to the time before we changed to multi-datacenter).

We have high volume on the cluster, and the consistency level is ONE for reads,
so my understanding is that most of the traffic between data centers should be
read repair; but it seems that should not create much delay.

What could cause the problem? How can we debug this?
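
One way to see where the time is going might be cqlsh's request tracing, which
is new in 1.2 (a sketch against the keyspace below; the query itself is just
illustrative):

USE dsat;
TRACING ON;
SELECT * FROM categorization_cache LIMIT 1;

The trace lists which replicas the coordinator contacted and how long each step
took, which should show whether reads are being routed to, or waiting on, the
remote data center.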

Here is the keyspace,

[default@dsat] describe dsat;
Keyspace: dsat:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [dc2:1, dc1:3]
  Column Families:
ColumnFamily: categorization_cache


Ring

Datacenter: dc1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load   Tokens  Owns (effective)  Host ID
  Rack
UN  xx.xx.xx..111   59.2 GB256 37.5%
4d6ed8d6-870d-4963-8844-08268607757e  rac1
DN  xx.xx.xx..121   99.63 GB   256 37.5%
9d0d56ce-baf6-4440-a233-ad6f1d564602  rac1
UN  xx.xx.xx..120   66.32 GB   256 37.5%
0fd912fb-3187-462b-8c8a-7d223751b649  rac1
UN  xx.xx.xx..118   63.61 GB   256 37.5%
3c6e6862-ab14-4a8c-9593-49631645349d  rac1
UN  xx.xx.xx..117   68.16 GB   256 37.5%
ee6cdf23-d5e4-4998-a2db-f6c0ce41035a  rac1
UN  xx.xx.xx..116   32.41 GB   256 37.5%
f783eeef-1c51-4f91-ab7c-a60669816770  rac1
UN  xx.xx.xx..115   64.24 GB   256 37.5%
e75105fb-b330-4f40-aa4f-8e6e11838e37  rac1
UN  xx.xx.xx..112   61.32 GB   256 37.5%
2547ee54-88dd-4994-a1ad-d9ba367ed11f  rac1
Datacenter: dc2
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load   Tokens  Owns (effective)  Host ID
  Rack
DN  xx.xx.xx.19958.39 GB   256 50.0%
6954754a-e9df-4b3c-aca7-146b938515d8  rac1
DN  xx.xx.xx..61  33.79 GB   256 50.0%
91b8d510-966a-4f2d-a666-d7edbe986a1c  rac1


Thank you in advance,

Daning