Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Thank you for confirming that the per node data size is most likely causing
the long repair process.  I have tried a repair on smaller column families
and it was significantly faster.

On Wed, Apr 11, 2012 at 9:55 PM, aaron morton wrote:

> If you have 1TB of data it will take a long time to repair. Every bit of
> data has to be read and a hash generated. This is one of the reasons we
> often suggest that around 300 to 400GB per node is a good load in the
> general case.
>
> Look at nodetool compactionstats. Is there a validation compaction
> running? If so it is still building the Merkle hash tree.
>
> Look at nodetool netstats. Is it streaming data? If so all hash trees
> have been calculated.
>
> Cheers
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 12/04/2012, at 2:16 AM, Frank Ng wrote:
>
> Can you expand further on your issue? Were you using RandomPartitioner?
>
> thanks
>
> On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach  wrote:
>
>> I had this happen when I had really poorly generated tokens for the ring.
>>  Cassandra seems to accept numbers that are too big.  You get hot spots
>> when you think you should be balanced and repair never ends (I think there
>> is a 48 hour timeout).
>>
>>
>> On Tuesday, April 10, 2012, Frank Ng wrote:
>>
>>> I am not using size-tiered compaction.
>>>
>>>
>>> On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone wrote:
>>>
 Data size, number of nodes, RF?

 Are you using size-tiered compaction on any of the column families that
 hold a lot of your data?

 Do your cassandra logs say you are streaming a lot of ranges?
 zgrep -E "(Performing streaming repair|out of sync)"


 On Tue, Apr 10, 2012 at 9:45 AM, Igor  wrote:

>  On 04/10/2012 07:16 PM, Frank Ng wrote:
>
> Short answer - yes.
> But you are asking the wrong question.
>
>
> I think both processes are taking a while.  When it starts up,
> netstats and compactionstats show nothing.  Anyone out there successfully
> using ext3 and their repair processes are faster than this?
>
>  On Tue, Apr 10, 2012 at 10:42 AM, Igor  wrote:
>
>> Hi
>>
>> You can check with nodetool  which part of repair process is slow -
>> network streams or verify compactions. use nodetool netstats or
>> compactionstats.
>>
>>
>> On 04/10/2012 05:16 PM, Frank Ng wrote:
>>
>>> Hello,
>>>
>>> I am on Cassandra 1.0.7.  My repair processes are taking over 30
>>> hours to complete.  Is it normal for the repair process to take this 
>>> long?
>>>  I wonder if it's because I am using the ext3 file system.
>>>
>>> thanks
>>>
>>
>>
>
>


 --
 Jonathan Rhone
 Software Engineer

 *TinyCo*
 800 Market St., Fl 6
 San Francisco, CA 94102
 www.tinyco.com


>>>
>
>


Re: Repair Process Taking too long

2012-04-11 Thread aaron morton
If you have 1TB of data it will take a long time to repair. Every bit of data 
has to be read and a hash generated. This is one of the reasons we often 
suggest that around 300 to 400GB per node is a good load in the general case. 

Look at nodetool compactionstats. Is there a validation compaction running? If 
so it is still building the Merkle hash tree. 

Look at nodetool netstats. Is it streaming data? If so all hash trees have 
been calculated. 
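
A minimal way to watch those two phases from the shell might look like this
(host is a placeholder):

  # phase 1: the validation compaction (Merkle tree build) shows up here
  nodetool -h localhost compactionstats

  # phase 2: once the trees are built, differing ranges are streamed; watch it here
  nodetool -h localhost netstats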

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/04/2012, at 2:16 AM, Frank Ng wrote:

> Can you expand further on your issue? Were you using RandomPartitioner?
> 
> thanks
> 
> On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach  wrote:
> I had this happen when I had really poorly generated tokens for the ring.  
> Cassandra seems to accept numbers that are too big.  You get hot spots when 
> you think you should be balanced and repair never ends (I think there is a 48 
> hour timeout).
> 
> 
> On Tuesday, April 10, 2012, Frank Ng wrote:
> I am not using size-tiered compaction.
> 
> 
> On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone  wrote:
> Data size, number of nodes, RF?
> 
> Are you using size-tiered compaction on any of the column families that hold 
> a lot of your data?
> 
> Do your cassandra logs say you are streaming a lot of ranges?
> zgrep -E "(Performing streaming repair|out of sync)" 
> 
> 
> On Tue, Apr 10, 2012 at 9:45 AM, Igor  wrote:
> On 04/10/2012 07:16 PM, Frank Ng wrote:
> 
> Short answer - yes.
> But you are asking the wrong question.
> 
> 
>> I think both processes are taking a while.  When it starts up, netstats and 
>> compactionstats show nothing.  Anyone out there successfully using ext3 and 
>> their repair processes are faster than this?
>> 
>> On Tue, Apr 10, 2012 at 10:42 AM, Igor  wrote:
>> Hi
>> 
>> You can check with nodetool  which part of repair process is slow - network 
>> streams or verify compactions. use nodetool netstats or compactionstats.
>> 
>> 
>> On 04/10/2012 05:16 PM, Frank Ng wrote:
>> Hello,
>> 
>> I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours to 
>> complete.  Is it normal for the repair process to take this long?  I wonder 
>> if it's because I am using the ext3 file system.
>> 
>> thanks
>> 
>> 
> 
> 
> 
> 
> -- 
> Jonathan Rhone
> Software Engineer
> 
> TinyCo
> 800 Market St., Fl 6
> San Francisco, CA 94102
> www.tinyco.com
> 
> 
> 



Re: Why so many SSTables?

2012-04-11 Thread Watanabe Maki
If you increase sstable_size_in_mb to 200MB, you will need more IO for each 
compaction. For example, if your memtable is flushed and LCS needs to 
compact it with 10 overlapping L1 sstables, you will need almost 2GB of reads and 
2GB of writes for that single compaction.
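
If you do decide to raise it, a rough sketch of the change via cassandra-cli
follows (exact CLI syntax can vary by version; the keyspace and CF names come
from the earlier post, and existing sstables only pick up the new size as they
are compacted again):

  cassandra-cli -h localhost <<'EOF'
  use OurKeyspace;
  update column family Documents with compaction_strategy_options = {sstable_size_in_mb: 200};
  EOF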

From iPhone


On 2012/04/11, at 21:43, Romain HARDOUIN  wrote:

> 
> Thank you for your answers. 
> 
> I originally posted this question because we encountered an OOM Exception on 2 
> nodes during a repair session. 
> Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner 
> which contains as many objects as there are SSTables on disk (7747 objects at 
> the time). 
> This ArrayList consumes 47% of the heap space (786 MB). 
> 
> We want each node to handle 1 TB, so we must dramatically reduce the number 
> of SSTables. 
> 
> Thus, is there any drawback if we set sstable_size_in_mb to 200MB? 
> Otherwise should we go back to Tiered Compaction? 
> 
> Regards, 
> 
> Romain
> 
> 
> Maki Watanabe  a écrit sur 11/04/2012 04:21:47 :
> 
> > You can configure the sstable size with the sstable_size_in_mb parameter for LCS.
> > The default value is 5MB.
> > You should also check that you don't have many pending compaction tasks
> > with nodetool tpstats and compactionstats.
> > If you have enough IO throughput, you can increase
> > compaction_throughput_mb_per_sec
> > in cassandra.yaml to reduce pending compactions.
> > 
> > maki
> > 
> > 2012/4/10 Romain HARDOUIN :
> > >
> > > Hi,
> > >
> > > We are surprised by the number of files generated by Cassandra.
> > > Our cluster consists of 9 nodes and each node handles about 35 GB.
> > > We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
> > > We have 30 CF.
> > >
> > > We've got roughly 45,000 files under the keyspace directory on each node:
> > > ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
> > > 44372
> > >
> > > The biggest CF is spread over 38,000 files:
> > > ls -l Documents* | wc -l
> > > 37870
> > >
> > > ls -l Documents*-Data.db | wc -l
> > > 7586
> > >
> > > Many SSTables are about 4 MB:
> > >
> > > 19 MB -> 1 SSTable
> > > 12 MB -> 2 SSTables
> > > 11 MB -> 2 SSTables
> > > 9.2 MB -> 1 SSTable
> > > 7.0 MB to 7.9 MB -> 6 SSTables
> > > 6.0 MB to 6.4 MB -> 6 SSTables
> > > 5.0 MB to 5.4 MB -> 4 SSTables
> > > 4.0 MB to 4.7 MB -> 7139 SSTables
> > > 3.0 MB to 3.9 MB -> 258 SSTables
> > > 2.0 MB to 2.9 MB -> 35 SSTables
> > > 1.0 MB to 1.9 MB -> 13 SSTables
> > > 87 KB to  994 KB -> 87 SSTables
> > > 0 KB -> 32 SSTables
> > >
> > > FYI here is CF information:
> > >
> > > ColumnFamily: Documents
> > >   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
> > >   Default column value validator: 
> > > org.apache.cassandra.db.marshal.BytesType
> > >   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
> > >   Row cache size / save period in seconds / keys to save : 0.0/0/all
> > >   Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
> > >   Key cache size / save period in seconds: 20.0/14400
> > >   GC grace seconds: 1728000
> > >   Compaction min/max thresholds: 4/32
> > >   Read repair chance: 1.0
> > >   Replicate on write: true
> > >   Column Metadata:
> > > Column Name: refUUID (7265664944)
> > >   Validation Class: org.apache.cassandra.db.marshal.BytesType
> > >   Index Name: refUUID_idx
> > >   Index Type: KEYS
> > >   Compaction Strategy:
> > > org.apache.cassandra.db.compaction.LeveledCompactionStrategy
> > >   Compression Options:
> > > sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
> > >
> > > Is it a bug? If not, how can we tune Cassandra to avoid this?
> > >
> > > Regards,
> > >
> > > Romain


Re: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Watanabe Maki
The auto_bootstrap parameter has been removed and is always enabled since 1.0.

maki


On 2012/04/12, at 6:10, Paolo Bernardi  wrote:

> I think that setting auto_bootstrap = true or false into cassandra.yaml is 
> enough (if it isn't there already just add it, for example, after 
> initial_token)
> 
> Paolo
> 
> On Apr 11, 2012 10:34 PM, "Jay Parashar"  wrote:
> Thanks a lot Jeremiah.
> Also would you be able to tell me where to  configure the auto_bootstrap
> parameter in version 1.0.8?
> 
> Thanks
> Jay
> 
> -Original Message-
> From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com]
> Sent: Wednesday, April 11, 2012 3:03 PM
> To: user@cassandra.apache.org
> Subject: RE: Initial token - newbie question (version 1.0.8)
> 
> You have to use nodetool move to change the token after the node has started
> the first time.  The value in the config file is only used on first startup.
> 
> Unless you were using RF=3 on your 3 node ring, you can't just start with a
> new token without using nodetool.  You have to do move so that the data gets
> put in the right place.
> 
> How you would do it with out nodetool:
> Dangerous, not smart, can easily shoot yourself in the foot and lose your
> data way, if you were RF = 3:
> If you used RF=3, then all nodes should have all data, and you can stop all
> nodes, remove the system keyspace data, and start up the new cluster with
> the right stuff in the yaml file (blowing away system means this is like
> starting a brand new cluster).  Then re-create all of your keyspaces/column
> families and they will pick up the already existing data.
> 
> Though, if you are rf=3, nodetool move shouldn't be moving anything anyway,
> so you should just do it the right way and use nodetool.
> 
> 
> From: Jay Parashar [jparas...@itscape.com]
> Sent: Wednesday, April 11, 2012 1:44 PM
> To: user@cassandra.apache.org
> Subject: Initial token - newbie question (version 1.0.8)
> 
> I created a 3 node ring with the initial_token blank. Of course as expected,
> Cassandra generated its own tokens on startup (e.g. tokens X, Y and Z). The
> nodes of course were not properly balanced, so I did the following steps
>
>1)  stopped all the 3 nodes
>2) assigned initial_tokens (A, B, C) respectively
>3) Restarted the nodes
>
> What I find is that the nodes were still using the original tokens (X, Y and
> Z). Log messages for node 1 show "Using saved token X"
>
> I could rebalance using nodetool and now the nodes are using the correct
> tokens.
> 
> But the question is, why were the new tokens not read from the
> Cassandra.yaml file? Without using nodetool, how do I make it get the token
> from the yaml file? Where is it saved?
> 
> Another question: I could not find the auto_bootstrap in the yaml file as
> per the documentation. Where is this param located?
> Appreciate it.
> Thanks in advance
> Jay
> 
> 


Re: Why so many SSTables?

2012-04-11 Thread Ben Coverston
>In general I would limit the data load per node to 300 to 400GB. Otherwise
> things can get painful when it comes time to run compaction / repair / move.

+1 on more nodes of moderate size


RE: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Paolo Bernardi
I think that setting auto_bootstrap = true or false into cassandra.yaml is
enough (if it isn't there already just add it, for example, after
initial_token)

Paolo
On Apr 11, 2012 10:34 PM, "Jay Parashar"  wrote:

> Thanks a lot Jeremiah.
> Also would you be able to tell me where to  configure the auto_bootstrap
> parameter in version 1.0.8?
>
> Thanks
> Jay
>
> -Original Message-
> From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com]
> Sent: Wednesday, April 11, 2012 3:03 PM
> To: user@cassandra.apache.org
> Subject: RE: Initial token - newbie question (version 1.0.8)
>
> You have to use nodetool move to change the token after the node has
> started
> the first time.  The value in the config file is only used on first
> startup.
>
> Unless you were using RF=3 on your 3 node ring, you can't just start with a
> new token without using nodetool.  You have to do move so that the data
> gets
> put in the right place.
>
> How you would do it with out nodetool:
> Dangerous, not smart, can easily shoot yourself in the foot and lose your
> data way, if you were RF = 3:
> If you used RF=3, then all nodes should have all data, and you can stop all
> nodes, remove the system keyspace data, and start up the new cluster with
> the right stuff in the yaml file (blowing away system means this is like
> starting a brand new cluster).  Then re-create all of your keyspaces/column
> families and they will pick up the already existing data.
>
> Though, if you are rf=3, nodetool move shouldn't be moving anything anyway,
> so you should just do it the right way and use nodetool.
>
> 
> From: Jay Parashar [jparas...@itscape.com]
> Sent: Wednesday, April 11, 2012 1:44 PM
> To: user@cassandra.apache.org
> Subject: Initial token - newbie question (version 1.0.8)
>
> I created a 3 node ring with the initial_token blank. Of course as expected,
> Cassandra generated its own tokens on startup (e.g. tokens X, Y and Z). The
> nodes of course were not properly balanced, so I did the following steps
>
>1)  stopped all the 3 nodes
>2) assigned initial_tokens (A, B, C) respectively
>3) Restarted the nodes
>
> What I find is that the nodes were still using the original tokens (X, Y and
> Z). Log messages for node 1 show "Using saved token X"
>
> I could rebalance using nodetool and now the nodes are using the correct
> tokens.
>
> But the question is, why were the new tokens not read from the
> Cassandra.yaml file? Without using nodetool, how do I make it get the token
> from the yaml file? Where is it saved?
>
> Another question: I could not find the auto_bootstrap in the yaml file as
> per the documentation. Where is this param located?
> Appreciate it.
> Thanks in advance
> Jay
>
>
>


RE: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Jay Parashar
Thanks a lot Jeremiah.
Also would you be able to tell me where to  configure the auto_bootstrap
parameter in version 1.0.8?

Thanks
Jay

-Original Message-
From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com] 
Sent: Wednesday, April 11, 2012 3:03 PM
To: user@cassandra.apache.org
Subject: RE: Initial token - newbie question (version 1.0.8)

You have to use nodetool move to change the token after the node has started
the first time.  The value in the config file is only used on first startup.

Unless you were using RF=3 on your 3 node ring, you can't just start with a
new token without using nodetool.  You have to do move so that the data gets
put in the right place.

How you would do it with out nodetool:
Dangerous, not smart, can easily shoot yourself in the foot and lose your
data way, if you were RF = 3:
If you used RF=3, then all nodes should have all data, and you can stop all
nodes, remove the system keyspace data, and start up the new cluster with
the right stuff in the yaml file (blowing away system means this is like
starting a brand new cluster).  Then re-create all of your keyspaces/column
families and they will pick up the already existing data.

Though, if you are rf=3, nodetool move shouldn't be moving anything anyway,
so you should just do it the right way and use nodetool.


From: Jay Parashar [jparas...@itscape.com]
Sent: Wednesday, April 11, 2012 1:44 PM
To: user@cassandra.apache.org
Subject: Initial token - newbie question (version 1.0.8)

I created a 3 node ring with the initial_token blank. Of course as expected,
Cassandra generated its own tokens on startup (e.g. tokens X, Y and Z). The
nodes of course were not properly balanced, so I did the following steps

1)  stopped all the 3 nodes
2) assigned initial_tokens (A, B, C) respectively
3) Restarted the nodes

What I find is that the nodes were still using the original tokens (X, Y and
Z). Log messages for node 1 show "Using saved token X"

I could rebalance using nodetool and now the nodes are using the correct
tokens.

But the question is, why were the new tokens not read from the
Cassandra.yaml file? Without using nodetool, how do I make it get the token
from the yaml file? Where is it saved?

Another question: I could not find the auto_bootstrap in the yaml file as
per the documentation. Where is this param located?
Appreciate it.
Thanks in advance
Jay




RE: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Jeremiah Jordan
You have to use nodetool move to change the token after the node has started 
the first time.  The value in the config file is only used on first startup.

Unless you were using RF=3 on your 3 node ring, you can't just start with a new 
token without using nodetool.  You have to do move so that the data gets put in 
the right place.

How you would do it with out nodetool:
Dangerous, not smart, can easily shoot yourself in the foot and lose your data 
way, if you were RF = 3:
If you used RF=3, then all nodes should have all data, and you can stop all 
nodes, remove the system keyspace data, and start up the new cluster with the 
right stuff in the yaml file (blowing away system means this is like starting a 
brand new cluster).  Then re-create all of your keyspaces/column families and 
they will pick up the already existing data.

Though, if you are rf=3, nodetool move shouldn't be moving anything anyway, so 
you should just do it the right way and use nodetool.
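
A hedged sketch of that nodetool path (host and token are placeholders; move
one node at a time):

  nodetool -h 10.0.0.1 move 56713727820156410577229101238628035242
  nodetool -h 10.0.0.1 cleanup   # drop data the node no longer owns
  nodetool -h 10.0.0.1 ring      # verify ownership afterwards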


From: Jay Parashar [jparas...@itscape.com]
Sent: Wednesday, April 11, 2012 1:44 PM
To: user@cassandra.apache.org
Subject: Initial token - newbie question (version 1.0.8)

I created a 3 node ring with the initial_token blank. Of course as expected,
Cassandra generated its own tokens on startup (e.g. tokens X, Y and Z).
The nodes of course were not properly balanced, so I did the following steps

1)  stopped all the 3 nodes
2) assigned initial_tokens (A, B, C) respectively
3) Restarted the nodes

What I find is that the nodes were still using the original tokens (X, Y and
Z). Log messages for node 1 show "Using saved token X"

I could rebalance using nodetool and now the nodes are using the correct
tokens.

But the question is, why were the new tokens not read from the
Cassandra.yaml file? Without using nodetool, how do I make it get the token
from the yaml file? Where is it saved?

Another question: I could not find the auto_bootstrap in the yaml file as
per the documentation. Where is this param located?
Appreciate it.
Thanks in advance
Jay



Re: Why so many SSTables?

2012-04-11 Thread aaron morton
In general I would limit the data load per node to 300 to 400GB. Otherwise 
things can get painful when it comes time to run compaction / repair / move. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/04/2012, at 1:00 AM, Dave Brosius wrote:

> It's easy to spend other people's money, but handling 1TB of data with 1.5 g 
> heap?  Memory is cheap, and just a little more will solve many problems.
> 
> 
> On 04/11/2012 08:43 AM, Romain HARDOUIN wrote:
>> 
>> 
>> Thank you for your answers. 
>> 
>> I originally posted this question because we encountered an OOM Exception on 2 
>> nodes during a repair session. 
>> Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner 
>> which contains as many objects as there are SSTables on disk (7747 objects at 
>> the time). 
>> This ArrayList consumes 47% of the heap space (786 MB). 
>> 
>> We want each node to handle 1 TB, so we must dramatically reduce the number 
>> of SSTables. 
>> 
>> Thus, is there any drawback if we set sstable_size_in_mb to 200MB? 
>> Otherwise should we go back to Tiered Compaction? 
>> 
>> Regards, 
>> 
>> Romain
>> 
>> 
>> Maki Watanabe  a écrit sur 11/04/2012 04:21:47 :
>> 
>> > You can configure the sstable size with the sstable_size_in_mb parameter for LCS.
>> > The default value is 5MB.
>> > You should also check that you don't have many pending compaction tasks
>> > with nodetool tpstats and compactionstats.
>> > If you have enough IO throughput, you can increase
>> > compaction_throughput_mb_per_sec
>> > in cassandra.yaml to reduce pending compactions.
>> > 
>> > maki
>> > 
>> > 2012/4/10 Romain HARDOUIN :
>> > >
>> > > Hi,
>> > >
>> > > We are surprised by the number of files generated by Cassandra.
>> > > Our cluster consists of 9 nodes and each node handles about 35 GB.
>> > > We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
>> > > We have 30 CF.
>> > >
>> > > We've got roughly 45,000 files under the keyspace directory on each node:
>> > > ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
>> > > 44372
>> > >
>> > > The biggest CF is spread over 38,000 files:
>> > > ls -l Documents* | wc -l
>> > > 37870
>> > >
>> > > ls -l Documents*-Data.db | wc -l
>> > > 7586
>> > >
>> > > Many SSTables are about 4 MB:
>> > >
>> > > 19 MB -> 1 SSTable
>> > > 12 MB -> 2 SSTables
>> > > 11 MB -> 2 SSTables
>> > > 9.2 MB -> 1 SSTable
>> > > 7.0 MB to 7.9 MB -> 6 SSTables
>> > > 6.0 MB to 6.4 MB -> 6 SSTables
>> > > 5.0 MB to 5.4 MB -> 4 SSTables
>> > > 4.0 MB to 4.7 MB -> 7139 SSTables
>> > > 3.0 MB to 3.9 MB -> 258 SSTables
>> > > 2.0 MB to 2.9 MB -> 35 SSTables
>> > > 1.0 MB to 1.9 MB -> 13 SSTables
>> > > 87 KB to  994 KB -> 87 SSTables
>> > > 0 KB -> 32 SSTables
>> > >
>> > > FYI here is CF information:
>> > >
>> > > ColumnFamily: Documents
>> > >   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>> > >   Default column value validator: 
>> > > org.apache.cassandra.db.marshal.BytesType
>> > >   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>> > >   Row cache size / save period in seconds / keys to save : 0.0/0/all
>> > >   Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
>> > >   Key cache size / save period in seconds: 20.0/14400
>> > >   GC grace seconds: 1728000
>> > >   Compaction min/max thresholds: 4/32
>> > >   Read repair chance: 1.0
>> > >   Replicate on write: true
>> > >   Column Metadata:
>> > > Column Name: refUUID (7265664944)
>> > >   Validation Class: org.apache.cassandra.db.marshal.BytesType
>> > >   Index Name: refUUID_idx
>> > >   Index Type: KEYS
>> > >   Compaction Strategy:
>> > > org.apache.cassandra.db.compaction.LeveledCompactionStrategy
>> > >   Compression Options:
>> > > sstable_compression: 
>> > > org.apache.cassandra.io.compress.SnappyCompressor
>> > >
>> > > Is it a bug? If not, how can we tune Cassandra to avoid this?
>> > >
>> > > Regards,
>> > >
>> > > Romain
> 



Re: need of regular nodetool repair

2012-04-11 Thread aaron morton
HH in 1.X+ is very good, but it is still an optimisation for achieving 
consistency. 

>> So I expect that even if I lose some HH then some other replica will reply 
>> with data.  Is it correct?
Run a repair and see. 
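
For those who do schedule it, a minimal sketch of a weekly repair job (paths
and log location are assumptions; stagger the day across nodes and keep it
well inside gc_grace_seconds):

  # /etc/cron.d/cassandra-repair
  0 3 * * 0  cassandra  /usr/bin/nodetool -h localhost repair >> /var/log/cassandra/repair.log 2>&1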

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/04/2012, at 9:10 PM, Igor wrote:

> On 04/11/2012 12:04 PM, ruslan usifov wrote:
>> 
>> HH - this is hinted handoff?
>> 
> Yes
> 
>> 2012/4/11 Igor 
>> On 04/11/2012 11:49 AM, R. Verlangen wrote:
>> 
>> Not everything, just HH :)
>> 
>> I hope this works for me for the following reasons: I have a quite large RF (6 
>> datacenters, each carrying one replica of the whole dataset), read and write at CL 
>> ONE, a relatively small TTL - 10 days, I have no deletes, and servers almost never 
>> go down for an hour. So I expect that even if I lose some HH then some other 
>> replica will reply with data.  Is it correct?
>> 
>> Hope this works for me, but it may not work for others. 
>> 
>> 
>>> Well, if everything works 100% at any time there should be nothing to 
>>> repair, however with a distributed cluster it would be pretty rare for that 
>>> to occur. At least that is how I interpret this.
>>> 
>>> 2012/4/11 Igor 
>>> BTW, I heard that you don't need to run repair if all your data have a TTL, 
>>> all HH works, and you never delete your data.
>>> 
>>> 
>>> On 04/11/2012 11:34 AM, ruslan usifov wrote:
 
 Sorry for my bad English. So does QUORUM allow us to not run repair regularly? 
 But from your answer that does not seem to follow
 
 2012/4/11 R. Verlangen 
 Yes, I personally have configured it to perform a repair once a week, as 
 the GCGraceSeconds is at 10 days.
 
 This is also what's in the manual  
 http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
  (point 2)
 
 
 2012/4/11 ruslan usifov 
 Hello
 
 I have the following question: if we read and write to a Cassandra cluster with 
 QUORUM consistency level, does this allow us to not call nodetool 
 repair regularly? (i.e. every GCGraceSeconds) 
 
 
 
 -- 
 With kind regards,
 
 Robin Verlangen
 www.robinverlangen.nl
 
 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> With kind regards,
>>> 
>>> Robin Verlangen
>>> www.robinverlangen.nl
>>> 
>> 
>> 
> 



Re: Materialized Views or Index CF - data model question

2012-04-11 Thread aaron morton
> a) "These queries are not easily supported on standard Cassandra"
> select * from book where price  < 992   order by price descending limit 30;
> 
> This is a typical (time series data)timeline query well supported by
> Cassandra, from my understanding.
Queries that use a secondary index (on price) must include an equality 
operator. 

> 
> b) "You do not need a different CF for each custom secondary index.
> Try putting the name of the index in the row key. "
> 
> I couldn't understand it. Can you help to build an demo with CF
> structure and some sample data?
You can have one CF that contains multiple secondary indexes. 

key: col_1:value_1
col_name: entity_id_1

key: col_2:value_2
col_name: entity_id_1
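
A rough cassandra-cli illustration of that layout (keyspace, CF, and values
are made up for the example):

  cassandra-cli -h localhost <<'EOF'
  use Bookstore;
  create column family BookIndex with comparator = UTF8Type
      and key_validation_class = UTF8Type and default_validation_class = UTF8Type;
  set BookIndex['price:992']['book_id_42'] = '1';
  set BookIndex['isbn:XYZ']['book_id_42'] = '1';
  list BookIndex;
  EOF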

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/04/2012, at 7:24 AM, Data Craftsman wrote:

> Hi Aaron,
> 
> Thanks for the quick answer, I'll build a prototype to benchmark each
> approach next week.
> 
> Here are more questions based on your reply:
> 
> a) "These queries are not easily supported on standard Cassandra"
> select * from book where price  < 992   order by price descending limit 30;
> 
> This is a typical (time series data)timeline query well supported by
> Cassandra, from my understanding.
> 
> b) "You do not need a different CF for each custom secondary index.
> Try putting the name of the index in the row key. "
> 
> I couldn't understand it. Can you help to build an demo with CF
> structure and some sample data?
> 
> Thanks,
> Charlie | DBA developer
> 
> 
> 
> On Sun, Apr 8, 2012 at 2:30 PM, aaron morton  wrote:
>> We need to query data by each column, do pagination as below,
>> 
>> select * from book where isbn   < "XYZ" order by ISBN   descending limit 30;
>> select * from book where price  < 992   order by price  descending limit 30;
>> select * from book where col_n1 < 789   order by col_n1 descending limit 30;
>> select * from book where col_n2 < "MUJ" order by col_n2 descending limit 30;
>> ...
>> select * from book where col_nm < 978 order by col_nm descending limit 30;
>> 
>> These queries are not easily supported on standard Cassandra. If you need
>> this level of query complexity consider Data Stax Enterprise, Solr, or a
>> RDBMS.
>> 
>> If we choose the Materialized Views approach, we have to update all
>> 20 Materialized View column family(s) for each base row update.
>> Will the Cassandra write performance be acceptable?
>> 
>> Yes, depending on the size of the cluster and the machine spec.
>> 
>> It's often a good idea to design CF's to match the workloads. If you have
>> some data that changes faster than other, consider splitting them into
>> different CFs.
>> 
>> Should we just normalize the data, create base book table with book_id
>> as primary key, and then
>> build 20 index column family(s), use wide row column slicing approach,
>> with index column data value as column name and book_id as value?
>> 
>> You do not need a different CF for each custom secondary index. Try putting
>> the name of the index in the row key.
>> 
>> What will you recommend?
>> 
>> Take another look at the queries you *need* to support. Then build a small
>> proof of concept to see if Cassandra will work for you.
>> 
>> Hope that helps.
>> 
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 6/04/2012, at 6:46 AM, Data Craftsman wrote:
>> 
>> Howdy,
>> 
>> Can I ask a data model question here?
>> 
>> We have a book table with 20 columns, 300 million rows, average row
>> size is 1500 bytes.
>> 
>> create table book(
>> book_id,
>> isbn,
>> price,
>> author,
>> titile,
>> ...
>> col_n1,
>> col_n2,
>> ...
>> col_nm
>> );
>> 
>> Data usage:
>> 
>> We need to query data by each column, do pagination as below,
>> 
>> select * from book where isbn   < "XYZ" order by ISBN   descending limit 30;
>> select * from book where price  < 992   order by price  descending limit 30;
>> select * from book where col_n1 < 789   order by col_n1 descending limit 30;
>> select * from book where col_n2 < "MUJ" order by col_n2 descending limit 30;
>> ...
>> select * from book where col_nm < 978 order by col_nm descending limit 30;
>> 
>> Write: 100 million updates a day.
>> Read : 16  million queries a day. 200 queries per second, one query
>> returns 30 rows.
>> 
>> ***
>> Materialized Views approach
>> 
>> {"ISBN_01",book_object1},{"ISBN_02",book_object2},...,{"ISBN_N",book_objectN}
>> ...
>> We will end up with 20 timelines.
>> 
>> 
>> ***
>> Index approach - create 2nd Column Family as Index
>> 
>> 'ISBN_01': 'book_id_a01','book_id_a02',...,'book_id_aN'
>> 'ISBN_02': 'book_id_b01','book_id_b02',...,'book_id_bN'
>> ...
>> 'ISBN_0m': 'book_id_m01','book_id_m02',...,'book_id_mN'
>> 
>> This way, we will create 20 index Column Family(s).
>> 
>> ---
>> 
>> If we choose Materialized Views approach, we have to update all
>> 20 Materialized View column family(s), for each base row update.
>> Will the

Re: insert in cql

2012-04-11 Thread puneet loya
thank you :)

On Wed, Apr 11, 2012 at 8:55 PM, Eric Evans  wrote:

> On Wed, Apr 11, 2012 at 5:20 AM, puneet loya  wrote:
> > insert into users (KEY) values (512313);
> >
> > users is my column family and key is its only attribute..
> >
> > It is giving an error
> > bad request : line 1:24 required (...)+ loop did not match anything at
> input
> > ')'
> >
> > do you find any error here?
>
> Yes.  KEY here is presumably the row key (aka PRIMARY KEY) and you
> cannot store an otherwise empty row, you need at least one actual
> column.
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
>


Re: Issue with SStable loader.

2012-04-11 Thread aaron morton
See this post for info on how the sstable loader is configured 
http://www.datastax.com/dev/blog/bulk-loading
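
A rough sketch of the usual checks (paths and keyspace name are placeholders):
make sure the cassandra.yaml that sstableloader picks up no longer lists the
decommissioned node, and point the loader at a directory named after the
keyspace:

  grep -n seeds /path/to/loader-conf/cassandra.yaml
  bin/sstableloader /path/to/MyKeyspace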

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/04/2012, at 6:07 AM, aaron morton wrote:

> Did you update the config for sstableloader ?
> 
> Are there any data files in the data directory pointed to by the 
> sstableloader config ? 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 10/04/2012, at 11:56 PM, Rishabh Agrawal wrote:
> 
>> Hello,
>>  
>> I had a three node cluster which I converted to a 4 node one. Later I 
>> decommissioned one of them and load balanced the data on the remaining 3. I 
>> removed the decommissioned node from the ‘seed list’. I restarted all nodes and 
>> performed compaction. After that, when I am using sstableloader it is trying 
>> to connect to that decommissioned node and hence failing.
>>  
>> Can someone provide me a solution to this?
>>  
>> Regards
>> Rishabh Agrawal
>> 
>> 
> 



Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Jay Parashar
I created a 3 node ring with the initial_token blank. Of course as expected,
Cassandra generated its own tokens on startup (e.g. tokens X, Y and Z).
The nodes of course were not properly balanced, so I did the following steps

1)  stopped all the 3 nodes
2) assigned initial_tokens (A, B, C) respectively
3) Restarted the nodes

What I find is that the nodes were still using the original tokens (X, Y and
Z). Log messages for node 1 show "Using saved token X"

I could rebalance using nodetool and now the nodes are using the correct
tokens.

But the question is, why were the new tokens not read from the
Cassandra.yaml file? Without using nodetool, how do I make it get the token
from the yaml file? Where is it saved?

Another question: I could not find the auto_bootstrap in the yaml file as
per the documentation. Where is this param located?
Appreciate it.
Thanks in advance
Jay
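
For RandomPartitioner, evenly spaced initial tokens can be computed up front;
a rough Python 2 one-liner for a 3 node ring (each node gets i * 2**127 / node_count):

  # node1, node2, node3 get i * (2**127 / 3) for i = 0, 1, 2
  python -c 'print [i * (2**127 / 3) for i in range(3)]'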



Re: Trouble with wrong data

2012-04-11 Thread aaron morton
> However after recovering from this issue (freeing some space and fixing the 
> value of  "commitlog_total_space_in_mb" in cassandra.yaml)
Did the commit log grow larger than commitlog_total_space_in_mb ? 

> I realized that all statistics were all destroyed. I have bad values on every 
> single counter since I start using them (september) !
Counter operations are not idempotent. If your client retries a counter 
operation it may result in the increment being applied twice. Could this have 
been your issue? 

Cheers

 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/04/2012, at 2:35 AM, Alain RODRIGUEZ wrote:

> By the way, I am using Cassandra 1.0.7, CL = ONE (R/W), RF = 2, 2 EC2 
> c1.medium nodes cluster
> 
> Alain
> 
> 2012/4/10 Alain RODRIGUEZ 
> Hi, I'm experiencing a strange and very annoying phenomenon.
> 
> I had a problem with the commit log size which grew too much and filled one of 
> the hard disks on all my nodes almost at the same time (2 nodes only, RF=2, 
> so the 2 nodes are behaving in exactly the same way)
> 
> My data are mounted in an other partition that was not full. However after 
> recovering from this issue (freeing some space and fixing the value of  
> "commitlog_total_space_in_mb" in cassandra.yaml) I realized that all 
> statistics were all destroyed. I have bad values on every single counter 
> since I start using them (september) !
> 
> Has anyone experienced something similar or have any clue about this?
> 
> Do you need more information ?
> 
> Alain
> 



Re: json2sstable error: Can not write to the Standard columns Super Column Family

2012-04-11 Thread aaron morton
Your json is for a standard CF but you are trying to load it into a super CF. 

There is a dedicated bulk loader interface you may find useful 
http://www.datastax.com/dev/blog/bulk-loading
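
For reference, a hedged sketch of the json2sstable call itself once the json
matches the target CF type (the output sstable path/name is an assumption;
keyspace and CF names come from the question):

  bin/json2sstable -K testkeyspace -c testCF rows.json \
      /var/lib/cassandra/data/testkeyspace/testCF-hc-1-Data.db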

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/04/2012, at 2:22 AM, Aliou SOW wrote:

> Dear All,
> 
> I am new to Cassandra 1.0.8, and I use the tool json2sstable for bulk insert, 
> but I still have the error:
> 
> java.lang.RuntimeException: Can't write Super columns to the Standard Column Family.
>     at org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:368)
>     at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:255)
>     at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:479)
> ERROR: Can't write Super columns to the Standard Column Family.
> 
> Before that, I first created a keyspace "testkeyspace", then a column family 
> "testCF" wherein I defined the key as varchar. Here is the structure of my 
> json file:
>  
> {
> "rs12354060": {
> "X1714T": 1.0,
> "X1905T": 1.0,
> ...
> "X3155T": 1.0
> }
> "rs3115850": {
> "X1714T": 0938,
> "X1905T": 0879,
> ...
> "X3155T": 0822
> }
> }
>  
> Help please, and if there is another, easier way to do bulk inserts, please let me 
> know.
>  
> Kind regards, Aliou.



Re: Cassandra running out of memory?

2012-04-11 Thread aaron morton
> 'system_memory_in_mb' (3760) and the 'system_cpu_cores' (1) according to our 
> nodes' specification. We also changed the 'MAX_HEAP_SIZE' to 2G and the 
> 'HEAP_NEWSIZE' to 200M (we think the second is related to the Garbage 
> Collection). 
It's best to leave the default settings unless you know what you are doing 
here. 

> In case you find this useful, swap is off and unevictable memory seems to be 
> very high on all 3 servers (2.3GB, we usually observe the amount of 
> unevictable memory on other Linux servers of around 0-16KB)
Cassandra locks the java memory so it cannot be swapped out. 

> The problem is that the node we hit from our thrift interface dies regularly 
> (approximately after we store 2-2.5G of data). Error message: 
> OutOfMemoryError: Java Heap Space and according to the log it in fact used 
> all of the allocated memory.
The easiest solution will be to use a larger EC2 instance. 

People normally use an m1.xlarge with 16Gb of ram (you would also try an 
m1.large).

If you are still experimenting I would suggest using the larger instances so 
you can make some progress. Once you have a feel for how things work you can 
then try to match the instances to your budget.
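
A couple of quick, rough checks while experimenting (paths assume a packaged
install):

  nodetool -h localhost info     # reports heap used / total for the JVM
  grep -E 'MAX_HEAP_SIZE|HEAP_NEWSIZE' /etc/cassandra/cassandra-env.sh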

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/04/2012, at 1:54 AM, Vasileios Vlachos wrote:

> Hello,
> 
> We are experimenting a bit with Cassandra lately (version 1.0.7) and we seem 
> to have some problems with memory. We use EC2 as our test environment and we 
> have three nodes with 3.7G of memory and 1 core @ 2.4G, all running Ubuntu 
> server 11.10. 
> 
> The problem is that the node we hit from our thrift interface dies regularly 
> (approximately after we store 2-2.5G of data). Error message: 
> OutOfMemoryError: Java Heap Space and according to the log it in fact used 
> all of the allocated memory.
> 
> The nodes are under relatively constant load and store about 2000-4000 row 
> keys a minute, which are batched through the Trift interface in 10-30 row 
> keys at once (with about 50 columns each). The number of reads is very low 
> with around 1000-2000 a day and only requesting the data of a single row key. 
> There is currently only one column family in use.
> 
> The initial thought was that something was wrong in the cassandra-env.sh 
> file. So, we specified the variables 'system_memory_in_mb' (3760) and the 
> 'system_cpu_cores' (1) according to our nodes' specification. We also changed 
> the 'MAX_HEAP_SIZE' to 2G and the 'HEAP_NEWSIZE' to 200M (we think the second 
> is related to the Garbage Collection). Unfortunately, that did not solve the 
> issue and the node we hit via thrift keeps on dying regularly.
> 
> In case you find this useful, swap is off and unevictable memory seems to be 
> very high on all 3 servers (2.3GB, we usually observe the amount of 
> unevictable memory on other Linux servers of around 0-16KB) (We are not quite 
> sure how the unevictable memory ties into Cassandra, its just something we 
> observed while looking into the problem). The CPU is pretty much idle the 
> entire time. The heap memory is clearly being reduced once in a while 
> according to nodetool, but obviously grows over the limit as time goes by.
> 
> Any ideas? Thanks in advance.
> 
> Bill



Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
I will disable read repair for slice requests fully (we can handle those on
the application side) until we upgrade to 1.0.8.

Thanks,
Thibaut


On Wed, Apr 11, 2012 at 7:04 PM, Jeremy Hanna wrote:

> I backported this to 0.8.4 and it didn't fix the problem we were seeing
> (as I outlined in my parallel post) but if it fixes it for you, then
> beautiful.  Just wanted to let you know our experience with similar
> symptoms.
>
> On Apr 11, 2012, at 11:56 AM, Thibaut Britz wrote:
>
> > Fixed in  https://issues.apache.org/jira/browse/CASSANDRA-3843
> >
> >
> >
> > On Wed, Apr 11, 2012 at 5:58 PM, Thibaut Britz <
> thibaut.br...@trendiction.com> wrote:
> > We have read repair disabled (0.0).
> >
> > Even if this would be the case, this also doesn't explain why the writes
> are executed again and again when going over the same range again and again.
> >
> > The keyspace is new, it doesn't contain any tombstones and only 1
> keys.
> >
> >
> >
> > On Wed, Apr 11, 2012 at 5:52 PM, R. Verlangen  wrote:
> > Are you sure this isn't read-repair?
> http://wiki.apache.org/cassandra/ReadRepair
> >
> >
> > 2012/4/11 Thibaut Britz 
> > Also executing the same multiget rangeslice query over the same range
> again will trigger the same writes again and again.
> >
> > On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz <
> thibaut.br...@trendiction.com> wrote:
> > Hi,
> >
> > I just diagnosed this strange behavior:
> >
> > When I fetch a rangeslice through hector and set the consistency level
> to quorum, according to cfstats (and also to the output files on the hd),
> cassandra seems to execute a write request for each read I execute. The
> write count in cfstats is increased when I execute the rangeslice function
> over the same range again and again (without saving anything at all).
> >
> > If I set the consistency level to ONE, no writes are executed.
> >
> > How can I disable this? Why are the records rewritten each time, even
> though I don't want them to be rewritten?
> >
> > Thanks,
> > Thibaut.
> >
> >
> > Code:
> > Keyspace ks = getConnection(cluster,
> consistencylevel);
> >
> >   RangeSlicesQuery
> rangeSlicesQuery = HFactory.createRangeSlicesQuery(ks,
> StringSerializer.get(), StringSerializer.get(), s);
> >
> >
> rangeSlicesQuery.setColumnFamily(columnFamily);
> >   rangeSlicesQuery.setColumnNames(column);
> >
> >   rangeSlicesQuery.setKeys(start, end);
> >   rangeSlicesQuery.setRowCount(maxrows);
> >
> >   QueryResult V>> result = rangeSlicesQuery.execute();
> >   return result.get();
> >
> >
> >
> >
> >
> >
> >
> > --
> > With kind regards,
> >
> > Robin Verlangen
> > www.robinverlangen.nl
> >
> >
> >
>
>


Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Jeremy Hanna
I backported this to 0.8.4 and it didn't fix the problem we were seeing (as I 
outlined in my parallel post) but if it fixes it for you, then beautiful.  Just 
wanted to let you know our experience with similar symptoms.

On Apr 11, 2012, at 11:56 AM, Thibaut Britz wrote:

> Fixed in  https://issues.apache.org/jira/browse/CASSANDRA-3843 
> 
> 
> 
> On Wed, Apr 11, 2012 at 5:58 PM, Thibaut Britz 
>  wrote:
> We have read repair disabled (0.0).
> 
> Even if this would be the case, this also doesn't explain why the writes are 
> executed again and again when going over the same range again and again.
> 
> The keyspace is new, it doesn't contain any tombstones and only 1 keys.
> 
> 
> 
> On Wed, Apr 11, 2012 at 5:52 PM, R. Verlangen  wrote:
> Are you sure this isn't read-repair?  
> http://wiki.apache.org/cassandra/ReadRepair 
> 
> 
> 2012/4/11 Thibaut Britz 
> Also executing the same multiget rangeslice query over the same range again 
> will trigger the same writes again and again.
> 
> On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz 
>  wrote:
> Hi,
> 
> I just diagnosed this strange behavior:
> 
> When I fetch a rangeslice through hector and set the consistency level to 
> quorum, according to cfstats (and also to the output files on the hd), 
> cassandra seems to execute a write request for each read I execute. The write 
> count in cfstats is increased when I execute the rangeslice function over the 
> same range again and again (without saving anything at all).
> 
> If I set the consistency level to ONE, no writes are executed.
> 
> How can I disable this? Why are the records rewritten each time, even though 
> I don't want them to be rewritten?
> 
> Thanks,
> Thibaut.
> 
> 
> Code:
> Keyspace ks = getConnection(cluster, 
> consistencylevel);
> 
>   RangeSlicesQuery 
> rangeSlicesQuery = HFactory.createRangeSlicesQuery(ks, 
> StringSerializer.get(), StringSerializer.get(), s);
> 
>   rangeSlicesQuery.setColumnFamily(columnFamily);
>   rangeSlicesQuery.setColumnNames(column);
> 
>   rangeSlicesQuery.setKeys(start, end);
>   rangeSlicesQuery.setRowCount(maxrows);
> 
>   QueryResult> 
> result = rangeSlicesQuery.execute();
>   return result.get();
> 
> 
> 
> 
> 
> 
> 
> -- 
> With kind regards,
> 
> Robin Verlangen
> www.robinverlangen.nl
> 
> 
> 



Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
Fixed in  https://issues.apache.org/jira/browse/CASSANDRA-3843



On Wed, Apr 11, 2012 at 5:58 PM, Thibaut Britz <
thibaut.br...@trendiction.com> wrote:

> We have read repair disabled (0.0).
>
> Even if this would be the case, this also doesn't explain why the writes
> are executed again and again when going over the same range again and again.
>
> The keyspace is new, it doesn't contain any tombstones and only 1
> keys.
>
>
>
> On Wed, Apr 11, 2012 at 5:52 PM, R. Verlangen  wrote:
>
>> Are you sure this isn't read-repair?
>> http://wiki.apache.org/cassandra/ReadRepair
>>
>>
>> 2012/4/11 Thibaut Britz 
>>
>>> Also executing the same multiget rangeslice query over the same range
>>> again will trigger the same writes again and again.
>>>
>>> On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz <
>>> thibaut.br...@trendiction.com> wrote:
>>>
 Hi,

 I just diagnosed this strange behavior:

 When I fetch a rangeslice through hector and set the consistency level
 to quorum, according to cfstats (and also to the output files on the hd),
 cassandra seems to execute a write request for each read I execute. The
 write count in cfstats is increased when I execute the rangeslice function
 over the same range again and again (without saving anything at all).

 If I set the consistency level to ONE, no writes are executed.

 How can I disable this? Why are the records rewritten each time, even
 though I don't want them to be rewritten?

 Thanks,
 Thibaut.


 Code:
 Keyspace ks = getConnection(cluster,
 consistencylevel);

  RangeSlicesQuery rangeSlicesQuery =
 HFactory.createRangeSlicesQuery(ks, StringSerializer.get(),
 StringSerializer.get(), s);

 rangeSlicesQuery.setColumnFamily(columnFamily);
 rangeSlicesQuery.setColumnNames(column);

 rangeSlicesQuery.setKeys(start, end);
 rangeSlicesQuery.setRowCount(maxrows);

 QueryResult> result =
 rangeSlicesQuery.execute();
 return result.get();




>>>
>>
>>
>> --
>> With kind regards,
>>
>> Robin Verlangen
>> www.robinverlangen.nl
>>
>>
>


Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Jeremy Hanna
fwiw - we had a similar problem reading at quorum with 0.8.4 when reading with 
hadoop.  The symptom we see is when reading a column family with hadoop using 
quorum using 0.8.4, we have lots of minor compactions as a result of heavy 
writes.  When we read at CL.ONE or move to 1.0.8 the problem is not there.  I 
tried to backport a couple of related patches but that did not solve the 
problem.  We're looking to upgrade soon to 1.0.9 but the workaround until then 
for us is read at CL.ONE and write at CL.ALL with hadoop.  I'm not sure this is 
the same problem, but it sounds like it and it doesn't have to do with hector.

On Apr 11, 2012, at 10:58 AM, Thibaut Britz wrote:

> We have read repair disabled (0.0).
> 
> Even if this would be the case, this also doesn't explain why the writes are 
> executed again and again when going over the same range again and again.
> 
> The keyspace is new, it doesn't contain any tombstones and only 1 keys.
> 
> 
> 
> On Wed, Apr 11, 2012 at 5:52 PM, R. Verlangen  wrote:
> Are you sure this isn't read-repair?  
> http://wiki.apache.org/cassandra/ReadRepair 
> 
> 
> 2012/4/11 Thibaut Britz 
> Also executing the same multiget rangeslice query over the same range again 
> will trigger the same writes again and again.
> 
> On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz 
>  wrote:
> Hi,
> 
> I just diagnosed this strange behavior:
> 
> When I fetch a rangeslice through hector and set the consistency level to 
> quorum, according to cfstats (and also to the output files on the hd), 
> cassandra seems to execute a write request for each read I execute. The write 
> count in cfstats is increased when I execute the rangeslice function over the 
> same range again and again (without saving anything at all).
> 
> If I set the consistency level to ONE, no writes are executed.
> 
> How can I disable this? Why are the records rewritten each time, even though 
> I don't want them to be rewritten?
> 
> Thanks,
> Thibaut.
> 
> 
> Code:
> Keyspace ks = getConnection(cluster, 
> consistencylevel);
> 
>   RangeSlicesQuery 
> rangeSlicesQuery = HFactory.createRangeSlicesQuery(ks, 
> StringSerializer.get(), StringSerializer.get(), s);
> 
>   rangeSlicesQuery.setColumnFamily(columnFamily);
>   rangeSlicesQuery.setColumnNames(column);
> 
>   rangeSlicesQuery.setKeys(start, end);
>   rangeSlicesQuery.setRowCount(maxrows);
> 
>   QueryResult> 
> result = rangeSlicesQuery.execute();
>   return result.get();
> 
> 
> 
> 
> 
> 
> 
> -- 
> With kind regards,
> 
> Robin Verlangen
> www.robinverlangen.nl
> 
> 



RE: INserting data in Cassandra

2012-04-11 Thread Aliou SOW

Thanks :)

But finally I used Hector and it works fine :D

Date: Wed, 11 Apr 2012 17:19:15 +0200
From: berna...@gmail.com
To: user@cassandra.apache.org
Subject: Re: INserting data in Cassandra

On 04/11/12 11:42, Aliou SOW wrote:

> And I used the tool json2sstable, but that does not work, I always have an error:
>
> java.lang.RuntimeException: Can't write Super columns to the Standard Column Family.
>
> So I have two questions:
> 1) What did I do wrong? Must I define the complete structure of my column family
> before using json2sstable, or is it the structure of my json file which is not good?

I think that you just have to specify column_type = 'super' when
creating your column family (see
http://stackoverflow.com/questions/6835183/set-super-column-family-in-cassandra-cli)

Paolo


Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
We have read repair disabled (0.0).

Even if this would be the case, this also doesn't explain why the writes
are executed again and again when going over the same range again and again.

The keyspace is new, it doesn't contain any tombstones and only 1 keys.



On Wed, Apr 11, 2012 at 5:52 PM, R. Verlangen  wrote:

> Are you sure this isn't read-repair?
> http://wiki.apache.org/cassandra/ReadRepair
>
>
> 2012/4/11 Thibaut Britz 
>
>> Also executing the same multiget rangeslice query over the same range
>> again will trigger the same writes again and again.
>>
>> On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz <
>> thibaut.br...@trendiction.com> wrote:
>>
>>> Hi,
>>>
>>> I just diagnosed this strange behavior:
>>>
>>> When I fetch a rangeslice through hector and set the consistency level
>>> to quorum, according to cfstats (and also to the output files on the hd),
>>> cassandra seems to execute a write request for each read I execute. The
>>> write count in cfstats is increased when I execute the rangeslice function
>>> over the same range again and again (without saving anything at all).
>>>
>>> If I set the consistency level to ONE, no writes are executed.
>>>
>>> How can I disable this? Why are the records rewritten each time, even
>>> though I don't want them to be rewritten?
>>>
>>> Thanks,
>>> Thibaut.
>>>
>>>
>>> Code:
>>> Keyspace ks = getConnection(cluster,
>>> consistencylevel);
>>>
>>>  RangeSlicesQuery rangeSlicesQuery =
>>> HFactory.createRangeSlicesQuery(ks, StringSerializer.get(),
>>> StringSerializer.get(), s);
>>>
>>> rangeSlicesQuery.setColumnFamily(columnFamily);
>>> rangeSlicesQuery.setColumnNames(column);
>>>
>>> rangeSlicesQuery.setKeys(start, end);
>>> rangeSlicesQuery.setRowCount(maxrows);
>>>
>>> QueryResult> result =
>>> rangeSlicesQuery.execute();
>>> return result.get();
>>>
>>>
>>>
>>>
>>
>
>
> --
> With kind regards,
>
> Robin Verlangen
> www.robinverlangen.nl
>
>


Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread R. Verlangen
Are you sure this isn't read-repair?
http://wiki.apache.org/cassandra/ReadRepair

2012/4/11 Thibaut Britz 

> Also executing the same multiget rangeslice query over the same range
> again will trigger the same writes again and again.
>
> On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz <
> thibaut.br...@trendiction.com> wrote:
>
>> Hi,
>>
>> I just diagnosed this strange behavior:
>>
>> When I fetch a rangeslice through hector and set the consistency level to
>> quorum, according to cfstats (and also to the output files on the hd),
>> cassandra seems to execute a write request for each read I execute. The
>> write count in cfstats is increased when I execute the rangeslice function
>> over the same range again and again (without saving anything at all).
>>
>> If I set the consistency level to ONE, no writes are executed.
>>
>> How can I disable this? Why are the records rewritten each time, even
>> though I don't want them to be rewritten?
>>
>> Thanks,
>> Thibaut.
>>
>>
>> Code:
>> Keyspace ks = getConnection(cluster,
>> consistencylevel);
>>
>>  RangeSlicesQuery rangeSlicesQuery =
>> HFactory.createRangeSlicesQuery(ks, StringSerializer.get(),
>> StringSerializer.get(), s);
>>
>> rangeSlicesQuery.setColumnFamily(columnFamily);
>> rangeSlicesQuery.setColumnNames(column);
>>
>> rangeSlicesQuery.setKeys(start, end);
>> rangeSlicesQuery.setRowCount(maxrows);
>>
>> QueryResult> result =
>> rangeSlicesQuery.execute();
>> return result.get();
>>
>>
>>
>>
>


-- 
With kind regards,

Robin Verlangen
www.robinverlangen.nl


Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
Also executing the same multiget rangeslice query over the same range again
will trigger the same writes again and again.

On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz <
thibaut.br...@trendiction.com> wrote:

> Hi,
>
> I just diagnosed this strange behavior:
>
> When I fetch a rangeslice through hector and set the consistency level to
> quorum, according to cfstats (and also to the output files on the hd),
> cassandra seems to execute a write request for each read I execute. The
> write count in cfstats is increased when I execute the rangeslice function
> over the same range again and again (without saving anything at all).
>
> If I set the consistency level to ONE, no writes are executed.
>
> How can I disable this? Why are the records rewritten each time, even
> though I don't want them to be rewritten?
>
> Thanks,
> Thibaut.
>
>
> Code:
> Keyspace ks = getConnection(cluster,
> consistencylevel);
>
>  RangeSlicesQuery rangeSlicesQuery =
> HFactory.createRangeSlicesQuery(ks, StringSerializer.get(),
> StringSerializer.get(), s);
>
> rangeSlicesQuery.setColumnFamily(columnFamily);
> rangeSlicesQuery.setColumnNames(column);
>
> rangeSlicesQuery.setKeys(start, end);
> rangeSlicesQuery.setRowCount(maxrows);
>
> QueryResult> result =
> rangeSlicesQuery.execute();
> return result.get();
>
>
>
>


cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
Hi,

I just diagnosed this strange behavior:

When I fetch a rangeslice through hector and set the consistency level to
quorum, according to cfstats (and also to the output files on the hd),
cassandra seems to execute a write request for each read I execute. The
write count in cfstats is increased when I execute the rangeslice function
over the same range again and again (without saving anything at all).

If I set the consistency level to ONE, no writes are executed.

How can I disable this? Why are the records rewritten each time, even
though I don't want them to be rewritten?

Thanks,
Thibaut.


Code:
Keyspace ks = getConnection(cluster,
consistencylevel);

RangeSlicesQuery rangeSlicesQuery =
HFactory.createRangeSlicesQuery(ks, StringSerializer.get(),
StringSerializer.get(), s);

rangeSlicesQuery.setColumnFamily(columnFamily);
rangeSlicesQuery.setColumnNames(column);

rangeSlicesQuery.setKeys(start, end);
rangeSlicesQuery.setRowCount(maxrows);

QueryResult> result =
rangeSlicesQuery.execute();
return result.get();


Re: insert in cql

2012-04-11 Thread Eric Evans
On Wed, Apr 11, 2012 at 5:20 AM, puneet loya  wrote:
> insert into users (KEY) values (512313);
>
> users is my column family and key is its only attribute..
>
> It is giving an error
> bad request : line 1:24 required (...)+ loop did not match anything at input
> ')'
>
> do you find any error here?

Yes.  KEY here is presumably the row key (aka PRIMARY KEY), and you
cannot store an otherwise empty row; you need at least one actual
column.
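
For example, something like the following should be accepted (a minimal sketch; the password
column is an assumption, and whether the key needs quotes depends on your key validation class):

    INSERT INTO users (KEY, password) VALUES ('512313', 'secret');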

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Re: INserting data in Cassandra

2012-04-11 Thread Paolo Bernardi

On 04/11/12 11:42, Aliou SOW wrote:


And I used the tool json2sstable, but that does not work, I always 
have an error:


java.lang.RuntimeException: Can't write Super columns to the Standard 
Column Family.


So I have two questions:
1) What I did wrong, must I define the complete structure of my column 
family before using json2sstable or is it the structure of my json 
file which is not good?


I think that you just have to specify column_type = 'super' when 
creating your column family (see 
http://stackoverflow.com/questions/6835183/set-super-column-family-in-cassandra-cli)
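
For example, something along these lines in cassandra-cli (a sketch only, reusing the testCF
name from the original mail; adjust comparators and validators to your data):

    create column family testCF
        with column_type = 'Super'
        and comparator = UTF8Type
        and subcomparator = UTF8Type;

That said, the layout described (probe as row key, sample name as column name, float as value)
fits a standard column family just as well; in that case the JSON would need to follow the flat
per-row layout that sstable2json produces for standard column families rather than the nested
one, or the data could be loaded through a client instead of json2sstable. A rough Hector sketch
of such a loader (the cluster address, the testKS/testCF names and the FloatSerializer usage are
assumptions, not tested code):

    import me.prettyprint.cassandra.serializers.FloatSerializer;
    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    public class ProbeLoader {
        public static void main(String[] args) {
            // connect and pick the keyspace
            Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");
            Keyspace ks = HFactory.createKeyspace("testKS", cluster);

            // one row per probe, one column per sample
            Mutator<String> mutator = HFactory.createMutator(ks, StringSerializer.get());
            mutator.addInsertion("probe1", "testCF",
                    HFactory.createColumn("sample 1", 0.42f, StringSerializer.get(), FloatSerializer.get()));
            mutator.addInsertion("probe1", "testCF",
                    HFactory.createColumn("sample 2", 1.7f, StringSerializer.get(), FloatSerializer.get()));
            mutator.execute();
        }
    }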


Paolo



Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Can you expand further on your issue? Were you using Random Partitioner?

thanks

On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach  wrote:

> I had this happen when I had really poorly generated tokens for the ring.
>  Cassandra seems to accept numbers that are too big.  You get hot spots
> when you think you should be balanced and repair never ends (I think there
> is a 48 hour timeout).
>
>
> On Tuesday, April 10, 2012, Frank Ng wrote:
>
>> I am not using tier-sized compaction.
>>
>>
>> On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone wrote:
>>
>>> Data size, number of nodes, RF?
>>>
>>> Are you using size-tiered compaction on any of the column families that
>>> hold a lot of your data?
>>>
>>> Do your cassandra logs say you are streaming a lot of ranges?
>>> zgrep -E "(Performing streaming repair|out of sync)"
>>>
>>>
>>> On Tue, Apr 10, 2012 at 9:45 AM, Igor  wrote:
>>>
  On 04/10/2012 07:16 PM, Frank Ng wrote:

 Short answer - yes.
 But you are asking wrong question.


 I think both processes are taking a while.  When it starts up, netstats
 and compactionstats show nothing.  Anyone out there successfully using ext3
 and their repair processes are faster than this?

  On Tue, Apr 10, 2012 at 10:42 AM, Igor  wrote:

> Hi
>
> You can check with nodetool  which part of repair process is slow -
> network streams or verify compactions. use nodetool netstats or
> compactionstats.
>
>
> On 04/10/2012 05:16 PM, Frank Ng wrote:
>
>> Hello,
>>
>> I am on Cassandra 1.0.7.  My repair processes are taking over 30
>> hours to complete.  Is it normal for the repair process to take this 
>> long?
>>  I wonder if it's because I am using the ext3 file system.
>>
>> thanks
>>
>
>


>>>
>>>
>>> --
>>> Jonathan Rhone
>>> Software Engineer
>>>
>>> *TinyCo*
>>> 800 Market St., Fl 6
>>> San Francisco, CA 94102
>>> www.tinyco.com
>>>
>>>
>>


RE: INserting data in Cassandra

2012-04-11 Thread Aliou SOW

Hello,

Any help or ideas?

Thanks

From: aliouji...@hotmail.com
To: user@cassandra.apache.org
Subject: INserting data in Cassandra
Date: Wed, 11 Apr 2012 09:42:52 +









Hello all,

We would like to adopt Cassandra for storing our biological data, which is essentially
microarray data.

These data, formatted as tab-delimited text files, have the form:

            sample 1    sample 2    …    sample n
probe 1     value 1     value 2     ……
probe 2
…
probe n

In fact the probes can vary from one chip to another, and a chip can have more than one
million probes; the samples, on the other hand, vary from one project to another, and the
values are floats.

So we would like to represent our column families the same way they are formatted in the
files: the names of the probes (which are unique) as the keys, the names of the samples as
column names, and the float values as column values :)

To insert data, I created, for example, a keyspace testKS and a column family testCF,
defining only the key (because the column names vary), then a JSON file for insertion:

{"probe1" : {
    "sample 1": value1(float),
    "sample 2": value2(float),
    …,
    "sample n": value n(float)
  },
 "probe2" : {
    "sample 1": value1(float),
    "sample 2": value2(float),
    …,
    "sample n": value n(float)
  }
}

And I used the tool json2sstable, but that does not work, I always get an error:

java.lang.RuntimeException: Can't write Super columns to the Standard Column Family.

So I have two questions:
1) What did I do wrong: must I define the complete structure of my column family before
using json2sstable, or is it the structure of my JSON file that is not good?
2) Otherwise, what would be the best way to proceed for insertion given the data I have?

We use Cassandra 1.0.8.
Any help would be welcome.

Thanks.


  

Re: Why so many SSTables?

2012-04-11 Thread Dave Brosius
It's easy to spend other people's money, but handling 1 TB of data with
a 1.5 GB heap?  Memory is cheap, and just a little more will solve many
problems.
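
For what it's worth, on a dedicated box the heap is usually raised by overriding the automatic
sizing in conf/cassandra-env.sh, something like the following (the values are only an
illustration, tune them to your hardware):

    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"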



On 04/11/2012 08:43 AM, Romain HARDOUIN wrote:


Thank you for your answers.

I originally post this question because we encoutered an OOM Exception 
on 2 nodes during repair session.
Memory analyzing shows an hotspot: an ArrayList of 
SSTableBoundedScanner which contains as many objects there are 
SSTables on disk (7747 objects at the time).

This ArrayList consumes 47% of the heap space (786 MB).

We want each node to handle 1 TB, so we must dramatically reduce the 
number of SSTables.


Thus, is there any drawback if we set sstable_size_in_mb to 200MB?
Otherwise shoudl we go back to Tiered Compaction?

Regards,

Romain


Maki Watanabe  a écrit sur 11/04/2012 04:21:47 :

> You can configure sstable size by sstable_size_in_mb parameter for LCS.
> The default value is 5MB.
> You should better to check you don't have many pending compaction tasks
> with nodetool tpstats and compactionstats also.
> If you have enough IO throughput, you can increase
> compaction_throughput_mb_per_sec
> in cassandra.yaml to reduce pending compactions.
>
> maki
>
> 2012/4/10 Romain HARDOUIN :
> >
> > Hi,
> >
> > We are surprised by the number of files generated by Cassandra.
> > Our cluster consists of 9 nodes and each node handles about 35 GB.
> > We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
> > We have 30 CF.
> >
> > We've got roughly 45,000 files under the keyspace directory on 
each node:

> > ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
> > 44372
> >
> > The biggest CF is spread over 38,000 files:
> > ls -l Documents* | wc -l
> > 37870
> >
> > ls -l Documents*-Data.db | wc -l
> > 7586
> >
> > Many SSTable are about 4 MB:
> >
> > 19 MB -> 1 SSTable
> > 12 MB -> 2 SSTables
> > 11 MB -> 2 SSTables
> > 9.2 MB -> 1 SSTable
> > 7.0 MB to 7.9 MB -> 6 SSTables
> > 6.0 MB to 6.4 MB -> 6 SSTables
> > 5.0 MB to 5.4 MB -> 4 SSTables
> > 4.0 MB to 4.7 MB -> 7139 SSTables
> > 3.0 MB to 3.9 MB -> 258 SSTables
> > 2.0 MB to 2.9 MB -> 35 SSTables
> > 1.0 MB to 1.9 MB -> 13 SSTables
> > 87 KB to  994 KB -> 87 SSTables
> > 0 KB -> 32 SSTables
> >
> > FYI here is CF information:
> >
> > ColumnFamily: Documents
> >   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
> >   Default column value validator: 
org.apache.cassandra.db.marshal.BytesType

> >   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
> >   Row cache size / save period in seconds / keys to save : 0.0/0/all
> >   Row Cache Provider: 
org.apache.cassandra.cache.SerializingCacheProvider

> >   Key cache size / save period in seconds: 20.0/14400
> >   GC grace seconds: 1728000
> >   Compaction min/max thresholds: 4/32
> >   Read repair chance: 1.0
> >   Replicate on write: true
> >   Column Metadata:
> > Column Name: refUUID (7265664944)
> >   Validation Class: org.apache.cassandra.db.marshal.BytesType
> >   Index Name: refUUID_idx
> >   Index Type: KEYS
> >   Compaction Strategy:
> > org.apache.cassandra.db.compaction.LeveledCompactionStrategy
> >   Compression Options:
> > sstable_compression: 
org.apache.cassandra.io.compress.SnappyCompressor

> >
> > Is it a bug? If not, how can we tune Cassandra to avoid this?
> >
> > Regards,
> >
> > Romain




Re: Why so many SSTables?

2012-04-11 Thread Sylvain Lebresne
On Wed, Apr 11, 2012 at 2:43 PM, Romain HARDOUIN
 wrote:
>
> Thank you for your answers.
>
> I originally post this question because we encoutered an OOM Exception on 2
> nodes during repair session.
> Memory analyzing shows an hotspot: an ArrayList of SSTableBoundedScanner
> which contains as many objects there are SSTables on disk (7747 objects at
> the time).
> This ArrayList consumes 47% of the heap space (786 MB).

That's 101KB per element!! I know Java object representation is not
concise but that feels like more than is reasonable. Are you sure of
those numbers?

In any case, we should improve that so as to not create all those
SSTableBoundedScanners up front. Would you mind opening a ticket on
https://issues.apache.org/jira/browse/CASSANDRA with as much info as
you have on this?

> We want each node to handle 1 TB, so we must dramatically reduce the number
> of SSTables.
>
> Thus, is there any drawback if we set sstable_size_in_mb to 200MB?
> Otherwise shoudl we go back to Tiered Compaction?
>
> Regards,
>
> Romain
>
>
> Maki Watanabe  a écrit sur 11/04/2012 04:21:47 :
>
>
>> You can configure sstable size by sstable_size_in_mb parameter for LCS.
>> The default value is 5MB.
>> You should better to check you don't have many pending compaction tasks
>> with nodetool tpstats and compactionstats also.
>> If you have enough IO throughput, you can increase
>> compaction_throughput_mb_per_sec
>> in cassandra.yaml to reduce pending compactions.
>>
>> maki
>>
>> 2012/4/10 Romain HARDOUIN :
>> >
>> > Hi,
>> >
>> > We are surprised by the number of files generated by Cassandra.
>> > Our cluster consists of 9 nodes and each node handles about 35 GB.
>> > We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
>> > We have 30 CF.
>> >
>> > We've got roughly 45,000 files under the keyspace directory on each
>> > node:
>> > ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
>> > 44372
>> >
>> > The biggest CF is spread over 38,000 files:
>> > ls -l Documents* | wc -l
>> > 37870
>> >
>> > ls -l Documents*-Data.db | wc -l
>> > 7586
>> >
>> > Many SSTable are about 4 MB:
>> >
>> > 19 MB -> 1 SSTable
>> > 12 MB -> 2 SSTables
>> > 11 MB -> 2 SSTables
>> > 9.2 MB -> 1 SSTable
>> > 7.0 MB to 7.9 MB -> 6 SSTables
>> > 6.0 MB to 6.4 MB -> 6 SSTables
>> > 5.0 MB to 5.4 MB -> 4 SSTables
>> > 4.0 MB to 4.7 MB -> 7139 SSTables
>> > 3.0 MB to 3.9 MB -> 258 SSTables
>> > 2.0 MB to 2.9 MB -> 35 SSTables
>> > 1.0 MB to 1.9 MB -> 13 SSTables
>> > 87 KB to  994 KB -> 87 SSTables
>> > 0 KB -> 32 SSTables
>> >
>> > FYI here is CF information:
>> >
>> > ColumnFamily: Documents
>> >   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>> >   Default column value validator:
>> > org.apache.cassandra.db.marshal.BytesType
>> >   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>> >   Row cache size / save period in seconds / keys to save : 0.0/0/all
>> >   Row Cache Provider:
>> > org.apache.cassandra.cache.SerializingCacheProvider
>> >   Key cache size / save period in seconds: 20.0/14400
>> >   GC grace seconds: 1728000
>> >   Compaction min/max thresholds: 4/32
>> >   Read repair chance: 1.0
>> >   Replicate on write: true
>> >   Column Metadata:
>> >     Column Name: refUUID (7265664944)
>> >       Validation Class: org.apache.cassandra.db.marshal.BytesType
>> >       Index Name: refUUID_idx
>> >       Index Type: KEYS
>> >   Compaction Strategy:
>> > org.apache.cassandra.db.compaction.LeveledCompactionStrategy
>> >   Compression Options:
>> >     sstable_compression:
>> > org.apache.cassandra.io.compress.SnappyCompressor
>> >
>> > Is it a bug? If not, how can we tune Cassandra to avoid this?
>> >
>> > Regards,
>> >
>> > Romain


Re: Why so many SSTables?

2012-04-11 Thread Romain HARDOUIN
Thank you for your answers.

I originally posted this question because we encountered an OOM exception on
2 nodes during a repair session.
Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner
which contains as many objects as there are SSTables on disk (7747 objects at
the time).
This ArrayList consumes 47% of the heap space (786 MB).

We want each node to handle 1 TB, so we must dramatically reduce the
number of SSTables.

Thus, is there any drawback if we set sstable_size_in_mb to 200 MB?
Otherwise, should we go back to Tiered Compaction?
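For reference, the change we are weighing would look something like this in cassandra-cli
(syntax as we understand it for 1.0, untested); presumably existing SSTables only pick up the
new size as they get recompacted:

    update column family Documents with compaction_strategy_options = {sstable_size_in_mb: 200};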

Regards,

Romain


Maki Watanabe  a écrit sur 11/04/2012 04:21:47 :

> You can configure sstable size by sstable_size_in_mb parameter for LCS.
> The default value is 5MB.
> You should better to check you don't have many pending compaction tasks
> with nodetool tpstats and compactionstats also.
> If you have enough IO throughput, you can increase
> compaction_throughput_mb_per_sec
> in cassandra.yaml to reduce pending compactions.
> 
> maki
> 
> 2012/4/10 Romain HARDOUIN :
> >
> > Hi,
> >
> > We are surprised by the number of files generated by Cassandra.
> > Our cluster consists of 9 nodes and each node handles about 35 GB.
> > We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
> > We have 30 CF.
> >
> > We've got roughly 45,000 files under the keyspace directory on each 
node:
> > ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
> > 44372
> >
> > The biggest CF is spread over 38,000 files:
> > ls -l Documents* | wc -l
> > 37870
> >
> > ls -l Documents*-Data.db | wc -l
> > 7586
> >
> > Many SSTable are about 4 MB:
> >
> > 19 MB -> 1 SSTable
> > 12 MB -> 2 SSTables
> > 11 MB -> 2 SSTables
> > 9.2 MB -> 1 SSTable
> > 7.0 MB to 7.9 MB -> 6 SSTables
> > 6.0 MB to 6.4 MB -> 6 SSTables
> > 5.0 MB to 5.4 MB -> 4 SSTables
> > 4.0 MB to 4.7 MB -> 7139 SSTables
> > 3.0 MB to 3.9 MB -> 258 SSTables
> > 2.0 MB to 2.9 MB -> 35 SSTables
> > 1.0 MB to 1.9 MB -> 13 SSTables
> > 87 KB to  994 KB -> 87 SSTables
> > 0 KB -> 32 SSTables
> >
> > FYI here is CF information:
> >
> > ColumnFamily: Documents
> >   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
> >   Default column value validator: 
org.apache.cassandra.db.marshal.BytesType
> >   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
> >   Row cache size / save period in seconds / keys to save : 0.0/0/all
> >   Row Cache Provider: 
org.apache.cassandra.cache.SerializingCacheProvider
> >   Key cache size / save period in seconds: 20.0/14400
> >   GC grace seconds: 1728000
> >   Compaction min/max thresholds: 4/32
> >   Read repair chance: 1.0
> >   Replicate on write: true
> >   Column Metadata:
> > Column Name: refUUID (7265664944)
> >   Validation Class: org.apache.cassandra.db.marshal.BytesType
> >   Index Name: refUUID_idx
> >   Index Type: KEYS
> >   Compaction Strategy:
> > org.apache.cassandra.db.compaction.LeveledCompactionStrategy
> >   Compression Options:
> > sstable_compression: 
org.apache.cassandra.io.compress.SnappyCompressor
> >
> > Is it a bug? If not, how can we tune Cassandra to avoid this?
> >
> > Regards,
> >
> > Romain


insert in cql

2012-04-11 Thread puneet loya
insert into users (KEY) values (512313);

users is my column family and key is its only attribute..

It is giving an error
bad request : line 1:24 required (...)+ loop did not match anything at
input ')'

do you find any error here?


Re: need of regular nodetool repair

2012-04-11 Thread Igor

On 04/11/2012 12:04 PM, ruslan usifov wrote:

HH - this is hinted handoff?


Yes


2012/4/11 Igor <i...@4friends.od.ua>

On 04/11/2012 11:49 AM, R. Verlangen wrote:

Not everything, just HH :)

I hope this works for me for the next reasons: I have quite large
RF (6 datacenters, each carry one replica of all dataset), read
and write at CL ONE, relatively small TTL - 10 days, I have no
deletes, servers almost never go down for hour. So I expect that
even if I loose some HH then some other replica will reply with
data.  Is it correct?

Hope this works for me, but can not work for others.



Well, if everything works 100% at any time there should be
nothing to repair, however with a distributed cluster it would be
pretty rare for that to occur. At least that is how I interpret this.

2012/4/11 Igor <i...@4friends.od.ua>

BTW, I heard that we don't need to run repair if all your
data have TTL, all HH works,  and you never delete your data.


On 04/11/2012 11:34 AM, ruslan usifov wrote:

Sorry fo my bad english, so QUORUM allow  doesn't make
repair regularity? But form your anser it does not follow

2012/4/11 R. Verlangen <ro...@us2.nl>

Yes, I personally have configured it to perform a repair
once a week, as the GCGraceSeconds is at 10 days.

This is also what's in the manual

http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
 (point
2)


2012/4/11 ruslan usifov <ruslan.usi...@gmail.com>

Hello

I have follow question, if we Read and write to
cassandra claster with QUORUM consistency level,
does this allow to us do not call nodetool repair
regular? (i.e. every GCGraceSeconds)




-- 
With kind regards,


Robin Verlangen
www.robinverlangen.nl 







-- 
With kind regards,


Robin Verlangen
www.robinverlangen.nl 








Re: need of regular nodetool repair

2012-04-11 Thread ruslan usifov
HH - is this hinted handoff?

2012/4/11 Igor 

>  On 04/11/2012 11:49 AM, R. Verlangen wrote:
>
> Not everything, just HH :)
>
> I hope this works for me for the next reasons: I have quite large RF (6
> datacenters, each carry one replica of all dataset), read and write at CL
> ONE, relatively small TTL - 10 days, I have no deletes, servers almost
> never go down for hour. So I expect that even if I loose some HH then some
> other replica will reply with data.  Is it correct?
>
> Hope this works for me, but can not work for others.
>
>
> Well, if everything works 100% at any time there should be nothing to
> repair, however with a distributed cluster it would be pretty rare for that
> to occur. At least that is how I interpret this.
>
>  2012/4/11 Igor 
>
>>  BTW, I heard that we don't need to run repair if all your data have TTL,
>> all HH works,  and you never delete your data.
>>
>>
>> On 04/11/2012 11:34 AM, ruslan usifov wrote:
>>
>> Sorry fo my bad english, so QUORUM allow  doesn't make repair regularity?
>> But form your anser it does not follow
>>
>> 2012/4/11 R. Verlangen 
>>
>>> Yes, I personally have configured it to perform a repair once a week, as
>>> the GCGraceSeconds is at 10 days.
>>>
>>>  This is also what's in the manual
>>> http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
>>>  (point
>>> 2)
>>>
>>>
>>>  2012/4/11 ruslan usifov 
>>>
 Hello

 I have follow question, if we Read and write to cassandra claster with
 QUORUM consistency level, does this allow to us do not call nodetool repair
 regular? (i.e. every GCGraceSeconds)

>>>
>>>
>>>
>>>  --
>>> With kind regards,
>>>
>>>  Robin Verlangen
>>> www.robinverlangen.nl
>>>
>>>
>>
>>
>
>
>  --
> With kind regards,
>
>  Robin Verlangen
> www.robinverlangen.nl
>
>
>


Re: need of regular nodetool repair

2012-04-11 Thread Igor

On 04/11/2012 11:49 AM, R. Verlangen wrote:

Not everything, just HH :)

I hope this works for me, for the following reasons: I have a quite large RF
(6 datacenters, each carrying one replica of the whole dataset), reads and
writes at CL ONE, a relatively small TTL (10 days), no deletes, and servers
almost never go down for as long as an hour. So I expect that even if I lose
some HH, some other replica will reply with the data.  Is that correct?

Hopefully this works for me, but it may not work for others.


Well, if everything works 100% at any time there should be nothing to 
repair, however with a distributed cluster it would be pretty rare for 
that to occur. At least that is how I interpret this.


2012/4/11 Igor <i...@4friends.od.ua>

BTW, I heard that we don't need to run repair if all your data
have TTL, all HH works,  and you never delete your data.


On 04/11/2012 11:34 AM, ruslan usifov wrote:

Sorry fo my bad english, so QUORUM allow  doesn't make repair
regularity? But form your anser it does not follow

2012/4/11 R. Verlangen <ro...@us2.nl>

Yes, I personally have configured it to perform a repair once
a week, as the GCGraceSeconds is at 10 days.

This is also what's in the manual

http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
 (point
2)


2012/4/11 ruslan usifov <ruslan.usi...@gmail.com>

Hello

I have follow question, if we Read and write to cassandra
claster with QUORUM consistency level, does this allow to
us do not call nodetool repair regular? (i.e. every
GCGraceSeconds)




-- 
With kind regards,


Robin Verlangen
www.robinverlangen.nl 







--
With kind regards,

Robin Verlangen
www.robinverlangen.nl 





Re: need of regular nodetool repair

2012-04-11 Thread R. Verlangen
Well, if everything works 100% of the time there should be nothing to
repair; however, with a distributed cluster it would be pretty rare for that
to occur. At least that is how I interpret it.

2012/4/11 Igor 

>  BTW, I heard that we don't need to run repair if all your data have TTL,
> all HH works,  and you never delete your data.
>
>
> On 04/11/2012 11:34 AM, ruslan usifov wrote:
>
> Sorry fo my bad english, so QUORUM allow  doesn't make repair regularity?
> But form your anser it does not follow
>
> 2012/4/11 R. Verlangen 
>
>> Yes, I personally have configured it to perform a repair once a week, as
>> the GCGraceSeconds is at 10 days.
>>
>>  This is also what's in the manual
>> http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
>>  (point
>> 2)
>>
>>
>>  2012/4/11 ruslan usifov 
>>
>>> Hello
>>>
>>> I have follow question, if we Read and write to cassandra claster with
>>> QUORUM consistency level, does this allow to us do not call nodetool repair
>>> regular? (i.e. every GCGraceSeconds)
>>>
>>
>>
>>
>>  --
>> With kind regards,
>>
>>  Robin Verlangen
>> www.robinverlangen.nl
>>
>>
>
>


-- 
With kind regards,

Robin Verlangen
www.robinverlangen.nl


Re: need of regular nodetool repair

2012-04-11 Thread Igor
BTW, I have heard that we don't need to run repair if all your data has a TTL,
all HH works, and you never delete your data.


On 04/11/2012 11:34 AM, ruslan usifov wrote:
Sorry fo my bad english, so QUORUM allow  doesn't make repair 
regularity? But form your anser it does not follow


2012/4/11 R. Verlangen <ro...@us2.nl>

Yes, I personally have configured it to perform a repair once a
week, as the GCGraceSeconds is at 10 days.

This is also what's in the manual

http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
 (point
2)


2012/4/11 ruslan usifov <ruslan.usi...@gmail.com>

Hello

I have follow question, if we Read and write to cassandra
claster with QUORUM consistency level, does this allow to us
do not call nodetool repair regular? (i.e. every GCGraceSeconds)




-- 
With kind regards,


Robin Verlangen
www.robinverlangen.nl 






Re: need of regular nodetool repair

2012-04-11 Thread ruslan usifov
Sorry for my bad English. So does using QUORUM allow us to skip regular repair?
That does not follow from your answer.

2012/4/11 R. Verlangen 

> Yes, I personally have configured it to perform a repair once a week, as
> the GCGraceSeconds is at 10 days.
>
> This is also what's in the manual
> http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
>  (point
> 2)
>
>
> 2012/4/11 ruslan usifov 
>
>> Hello
>>
>> I have follow question, if we Read and write to cassandra claster with
>> QUORUM consistency level, does this allow to us do not call nodetool repair
>> regular? (i.e. every GCGraceSeconds)
>>
>
>
>
> --
> With kind regards,
>
> Robin Verlangen
> www.robinverlangen.nl
>
>


Re: need of regular nodetool repair

2012-04-11 Thread R. Verlangen
Yes, I personally have configured it to perform a repair once a week, as
the GCGraceSeconds is at 10 days.

This is also what's in the manual
http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
(point
2)
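
For example, a weekly cron entry along these lines would do it (the paths, host and keyspace
name are placeholders):

    0 3 * * 0  /opt/cassandra/bin/nodetool -h localhost repair MyKeyspace >> /var/log/cassandra/repair.log 2>&1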

2012/4/11 ruslan usifov 

> Hello
>
> I have follow question, if we Read and write to cassandra claster with
> QUORUM consistency level, does this allow to us do not call nodetool repair
> regular? (i.e. every GCGraceSeconds)
>



-- 
With kind regards,

Robin Verlangen
www.robinverlangen.nl


need of regular nodetool repair

2012-04-11 Thread ruslan usifov
Hello

I have the following question: if we read and write to the Cassandra cluster with
QUORUM consistency level, does this allow us to not run nodetool repair
regularly (i.e. every GCGraceSeconds)?