Where are the Cassandra Debian packages?
Hello. It looks like http://www.apache.org/dist/cassandra/debian is missing (HTTP 404). Has Cassandra perhaps moved to another Debian repository?
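For context, these packages were normally consumed through an APT source rather than browsed directly; a typical setup of that era looked like the fragment below (the series name 11x is an assumption taken from the directory layout discussed in this thread; pick the series matching your Cassandra version):

```
# /etc/apt/sources.list.d/cassandra.list
deb http://www.apache.org/dist/cassandra/debian 11x main
deb-src http://www.apache.org/dist/cassandra/debian 11x main
```

After adding the source (and its signing key), `apt-get update && apt-get install cassandra` fetches the packages from whichever mirror the hostname resolves to, which also explains why a broken regional mirror can 404 while others work.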
Re: Where are the Cassandra Debian packages?
No, I got a 404 error.

2012/8/24 Romain HARDOUIN romain.hardo...@urssaf.fr: Hi, the URL you mentioned is OK, e.g. http://www.apache.org/dist/cassandra/debian/dists/11x/

ruslan usifov ruslan.usi...@gmail.com wrote on 24/08/2012 11:26:11: Hello. It looks like http://www.apache.org/dist/cassandra/debian is missing (HTTP 404). Has Cassandra perhaps moved to another Debian repository?
Re: Where are the Cassandra Debian packages?
Hm, from European servers the Cassandra packages are present, but from Russian mirrors they are absent.

2012/8/24 Michal Michalski mich...@opera.com: Well, works for me.

On 24.08.2012 11:43, ruslan usifov wrote: No, I got a 404 error.

2012/8/24 Romain HARDOUIN romain.hardo...@urssaf.fr: Hi, the URL you mentioned is OK, e.g. http://www.apache.org/dist/cassandra/debian/dists/11x/

ruslan usifov ruslan.usi...@gmail.com wrote on 24/08/2012 11:26:11: Hello. It looks like http://www.apache.org/dist/cassandra/debian is missing (HTTP 404). Has Cassandra perhaps moved to another Debian repository?
Offtopic: ksoftirqd takes more CPU after a DDoS; as a result Cassandra latency is very high
Hello. We were under a DDoS attack, and as a result we saw high ksoftirqd activity, and Cassandra began answering very slowly. But after the DDoS ended, the high ksoftirqd activity persisted; it disappears when I stop the Cassandra daemon and reappears when I start it again. The only full resolution of the problem is a full reboot of the server. What could this be (why does ksoftirqd work so intensively while Cassandra is running? We disabled all working traffic to the cluster, but that didn't help, so it can't be due to heavy load), and how can we solve it? PS: OS Ubuntu 10.04 (kernel 2.6.32-41), Cassandra 1.0.10, Java 1.6.0_32 (from Oracle)
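When chasing this kind of problem it helps to see which softirq classes ksoftirqd is actually busy with; the kernel exposes per-CPU counters in /proc/softirqs. A minimal read-only check (Linux only; the column layout varies by kernel and CPU count):

```shell
# Sample the NET_RX softirq counters (network receive processing) twice,
# one second apart; fast-growing counters while Cassandra is running point
# at where the softirq load is coming from.
grep NET_RX /proc/softirqs
sleep 1
grep NET_RX /proc/softirqs
```

The same two-sample trick on TIMER or SCHED rows distinguishes network-driven softirq load from timer-driven load (the latter is what the leap-second bug discussed in the reply produces).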
Re: Offtopic: ksoftirqd takes more CPU after a DDoS; as a result Cassandra latency is very high
2012/7/1 David Daeschler david.daesch...@gmail.com: Good afternoon. This again looks like it could be the leap-second issue: it looks like the problem a bunch of us were having yesterday that isn't cleared without a reboot or a date command. It seems to be related to the leap second that was added between June 30th and July 1st. See the mailing-list thread with subject "High CPU usage as of 8pm eastern time". If you are still seeing high CPU usage and a stall after restarting Cassandra, and you are on Linux, try: date; date `date +%m%d%H%M%C%y.%S`; date; in a terminal and see if everything starts working again. I hope this helps. Please spread the word if you see others having issues with unresponsive kernels/high CPU.

Hello, this really helps. In our case two problems crossed each other :-( and we had not assumed that it might be a kernel problem. On one data cluster we simply rebooted; on the second we applied the date solution, and everything is fine. Thanks.
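The workaround quoted above reportedly works because setting the clock, even to the time it already shows, clears the stuck timer state left behind by the leap second. A commented version of the same command; since the set step needs root, this sketch only builds and prints the value that would be applied:

```shell
# Build a timestamp in the MMDDhhmmCCYY.SS form that `date` accepts for
# setting the system clock (month, day, hour, minute, century, year, seconds).
ts=$(date +%m%d%H%M%C%y.%S)
# Applying it requires root, e.g.:  sudo date "$ts"
echo "workaround would run: date $ts"
```

Note the value is read from the current clock, so applying it does not change the wall time; it only forces the kernel to re-arm its timers.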
Cassandra 1.0.x and Java 1.7
Hello! Is it safe to use Java 1.7 with Cassandra 1.0.x? The reason I want to do that is that Java 1.7 introduces options for rotating the GC log: http://bugs.sun.com/bugdatabase/view_bug.do;jsessionid=ff824681055961e1f62393b68deb5?bug_id=6941923
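For reference, the rotation options tracked by that bug are plain HotSpot flags; in a cassandra-env.sh style setup (matching the JVM_OPTS convention used elsewhere in this thread) they would look roughly like this. The log path and sizes are illustrative assumptions, not values from the thread:

```shell
# HotSpot GC log rotation (JDK 7; later backported to late JDK 6 updates).
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
JVM_OPTS="$JVM_OPTS -XX:+UseGCLogFileRotation"
JVM_OPTS="$JVM_OPTS -XX:NumberOfGCLogFiles=10"
JVM_OPTS="$JVM_OPTS -XX:GCLogFileSize=10M"
echo "$JVM_OPTS"
```

This keeps at most ten gc.log.N files of 10 MB each instead of one unbounded log.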
Re: kswapd0 causing read timeouts
Upgrade Java (version 1.6.0_21 has memory leaks) to the latest 1.6.0_32. It's abnormal that for 80 GB of data you have 15 GB resident. vfs_cache_pressure is used for inodes and dentries. Also, to check whether you have memory leaks, use the drop_caches sysctl.

2012/6/14 Gurpreet Singh gurpreet.si...@gmail.com: JNA is installed. swappiness was 0. vfs_cache_pressure was 100. Two questions on this: 1. Is there a way to find out if mlockall really worked, other than just the "mlockall successful" log message? 2. Does Cassandra mlock only the JVM heap, or also the mmapped memory? I disabled mmap completely, and things look so much better; latency is surprisingly half of what I see when mmap is enabled. It's funny that I keep reading tall claims about mmap, but in practice a lot of people have problems with it, especially when it uses up all the memory. We had tried mmap for different purposes in our company before, and finally ended up disabling it, because it just doesn't handle things right when memory is low. Maybe /proc/sys/vm needs to be configured right, but that's not the easiest of configurations to get right. Right now I am handling only 80 gigs of data. Kernel version is 2.6.26, Java version is 1.6.0_21. /G

On Wed, Jun 13, 2012 at 8:42 PM, Al Tobey a...@ooyala.com wrote: I would check /etc/sysctl.conf and get the values of /proc/sys/vm/swappiness and /proc/sys/vm/vfs_cache_pressure. If you don't have JNA enabled (which Cassandra uses to fadvise) and swappiness is at its default of 60, the Linux kernel will happily swap out your heap for cache space. Set swappiness to 1 or 'swapoff -a' and kswapd shouldn't be doing much unless you have a too-large heap or some other app using up memory on the system.

On Wed, Jun 13, 2012 at 11:30 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Hm, it's very strange. What is the amount of your data? Your Linux kernel version? Java version?

PS: I can suggest switching disk_access_mode to standard in your case. PS PS: also upgrade your Linux to the latest, and Java HotSpot to 1.6.0_32 (from the Oracle site).

2012/6/13 Gurpreet Singh gurpreet.si...@gmail.com: Alright, here it goes again... Even with mmap_index_only, once the RES memory hit 15 gigs, the read latency went berserk. This happens in 12 hours if disk_access_mode is mmap, about 48 hours if it's mmap_index_only. Only reads happening, at 50 reads/second. Row cache size: 730 MB, row cache hit ratio: 0.75. Key cache size: 400 MB, key cache hit ratio: 0.4. Heap size (max 8 gigs): used 6.1-6.9 gigs. No messages about reducing cache sizes in the logs. Stats: vmstat 1: no swapping here, however high sys CPU utilization. iostat (looks great): avgqu-sz = 8, avg await = 7 ms, svctime = 0.6, util = 15-30%. top: VIRT 19.8g, SHR 6.1g, RES 15g, high CPU, buffers 2 MB. cfstats: 70-100 ms; this number used to be 20-30 ms. The value of SHR keeps increasing (owing to mmap, I guess), while at the same time buffers keep decreasing: buffers start as high as 50 MB and go down to 2 MB. This is very easily reproducible for me. Every time the RES memory hits about 15 gigs, the client starts getting timeouts from Cassandra and the sys CPU jumps a lot. All this even though my row cache hit ratio is almost 0.75. Other than just turning off mmap completely, is there any other solution or setting to avoid a Cassandra restart every couple of days? Something to keep the RES memory from hitting such a high number. I have been constantly monitoring the RES, and was not seeing issues when RES was at 14 gigs. /G

On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh gurpreet.si...@gmail.com wrote: Aaron, Ruslan, I changed the disk access mode to mmap_index_only, and it has been stable ever since, well at least for the past 20 hours. Previously, in about 10-12 hours, as soon as the resident memory was full, the client would start timing out on all its reads. It looks fine for now; I am going to let it continue, to see how long it lasts and whether the problem comes again. Aaron, yes, I had turned swap off. The total CPU utilization was at roughly 700%. It looked like kswapd0 was using just one CPU, but Cassandra (jsvc) CPU utilization increased quite a bit. top was reporting high system CPU and low user CPU. vmstat was not showing swapping. Max Java heap size is 8 gigs, while only 4 gigs was in use, so the Java heap was doing great; no GC in the logs. iostat was doing OK from what I remember; I will have to reproduce the issue for the exact numbers. cfstats latency had gone very high, but that is partly due to high CPU usage. One thing was clear: SHR was inching higher (due to the mmap) while the buffer cache, which started at about 20-25 MB, reduced to 2 MB by the end, which probably means that the page cache was being evicted by kswapd0. Is there a way to fix the size of the buffer cache and not let the system evict it in favour
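The checks Al suggests are quick to script; a read-only sketch using the same /proc paths named in the thread (Linux only):

```shell
# Current VM tunables that decide whether the kernel evicts page cache
# or swaps out application memory under pressure.
echo "swappiness:         $(cat /proc/sys/vm/swappiness)"
echo "vfs_cache_pressure: $(cat /proc/sys/vm/vfs_cache_pressure)"

# To drop clean caches while hunting a suspected leak (needs root; harmless
# to data but briefly slows I/O while caches refill):
#   sync && echo 3 > /proc/sys/vm/drop_caches
```

If resident memory stays high even after dropping caches, the memory is held by the process itself (heap or mlocked/mmapped pages), not by the page cache.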
Re: kswapd0 causing read timeouts
2012/6/14 Gurpreet Singh gurpreet.si...@gmail.com: JNA is installed. swappiness was 0. vfs_cache_pressure was 100. Two questions on this: 1. Is there a way to find out if mlockall really worked, other than just the "mlockall successful" log message?

Yes, you must see something like this (from our test server): INFO [main] 2012-06-14 02:03:14,745 DatabaseDescriptor.java (line 233) Global memtable threshold is enabled at 512MB

2. Does Cassandra mlock only the JVM heap, or also the mmapped memory?

Cassandra mlocks only the heap; it does not mlock the mmapped sstables.

I disabled mmap completely, and things look so much better; latency is surprisingly half of what I see when mmap is enabled. It's funny that I keep reading tall claims about mmap, but in practice a lot of people have problems with it, especially when it uses up all the memory. We had tried mmap for different purposes in our company before, and finally ended up disabling it, because it just doesn't handle things right when memory is low. Maybe /proc/sys/vm needs to be configured right, but that's not the easiest of configurations to get right. Right now I am handling only 80 gigs of data. Kernel version is 2.6.26, Java version is 1.6.0_21. /G

On Wed, Jun 13, 2012 at 8:42 PM, Al Tobey a...@ooyala.com wrote: I would check /etc/sysctl.conf and get the values of /proc/sys/vm/swappiness and /proc/sys/vm/vfs_cache_pressure. If you don't have JNA enabled (which Cassandra uses to fadvise) and swappiness is at its default of 60, the Linux kernel will happily swap out your heap for cache space. Set swappiness to 1 or 'swapoff -a' and kswapd shouldn't be doing much unless you have a too-large heap or some other app using up memory on the system.

On Wed, Jun 13, 2012 at 11:30 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Hm, it's very strange. What is the amount of your data? Your Linux kernel version? Java version?
Re: kswapd0 causing read timeouts
Sorry, I was mistaken; here is the right string: INFO [main] 2012-06-14 02:03:14,520 CLibrary.java (line 109) JNA mlockall successful

2012/6/15 ruslan usifov ruslan.usi...@gmail.com: 2012/6/14 Gurpreet Singh gurpreet.si...@gmail.com: JNA is installed. swappiness was 0. vfs_cache_pressure was 100. Two questions on this: 1. Is there a way to find out if mlockall really worked, other than just the "mlockall successful" log message?

Yes, you must see something like this (from our test server): INFO [main] 2012-06-14 02:03:14,745 DatabaseDescriptor.java (line 233) Global memtable threshold is enabled at 512MB

2. Does Cassandra mlock only the JVM heap, or also the mmapped memory?

Cassandra mlocks only the heap; it does not mlock the mmapped sstables.
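A quick way to verify the line Ruslan quotes is to grep the Cassandra system log. The sketch below runs that grep against a sample line taken from this thread, since the real log path (commonly /var/log/cassandra/system.log, but install-dependent) may differ:

```shell
# Sample log line from the thread; on a live node you would instead run:
#   grep "mlockall successful" /var/log/cassandra/system.log
sample='INFO [main] 2012-06-14 02:03:14,520 CLibrary.java (line 109) JNA mlockall successful'
printf '%s\n' "$sample" | grep -o 'mlockall successful'
```

If the grep against the real log is empty, JNA either is not on the classpath or mlockall failed (e.g. insufficient `ulimit -l`).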
Re: kswapd0 causing read timeouts
Hm, it's very strange. What is the amount of your data? Your Linux kernel version? Java version?

PS: I can suggest switching disk_access_mode to standard in your case. PS PS: also upgrade your Linux to the latest, and Java HotSpot to 1.6.0_32 (from the Oracle site).

2012/6/13 Gurpreet Singh gurpreet.si...@gmail.com: Alright, here it goes again... Even with mmap_index_only, once the RES memory hit 15 gigs, the read latency went berserk. This happens in 12 hours if disk_access_mode is mmap, about 48 hours if it's mmap_index_only. Only reads happening, at 50 reads/second. Row cache size: 730 MB, row cache hit ratio: 0.75. Key cache size: 400 MB, key cache hit ratio: 0.4. Heap size (max 8 gigs): used 6.1-6.9 gigs. No messages about reducing cache sizes in the logs. Stats: vmstat 1: no swapping here, however high sys CPU utilization. iostat (looks great): avgqu-sz = 8, avg await = 7 ms, svctime = 0.6, util = 15-30%. top: VIRT 19.8g, SHR 6.1g, RES 15g, high CPU, buffers 2 MB. cfstats: 70-100 ms; this number used to be 20-30 ms. The value of SHR keeps increasing (owing to mmap, I guess), while at the same time buffers keep decreasing: buffers start as high as 50 MB and go down to 2 MB. This is very easily reproducible for me. Every time the RES memory hits about 15 gigs, the client starts getting timeouts from Cassandra and the sys CPU jumps a lot. All this even though my row cache hit ratio is almost 0.75. Other than just turning off mmap completely, is there any other solution or setting to avoid a Cassandra restart every couple of days? Something to keep the RES memory from hitting such a high number. I have been constantly monitoring the RES, and was not seeing issues when RES was at 14 gigs. /G

On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh gurpreet.si...@gmail.com wrote: Aaron, Ruslan, I changed the disk access mode to mmap_index_only, and it has been stable ever since, well at least for the past 20 hours.
Previously, in about 10-12 hours, as soon as the resident memory was full, the client would start timing out on all its reads. It looks fine for now; I am going to let it continue, to see how long it lasts and whether the problem comes again. Aaron, yes, I had turned swap off. The total CPU utilization was at roughly 700%. It looked like kswapd0 was using just one CPU, but Cassandra (jsvc) CPU utilization increased quite a bit. top was reporting high system CPU and low user CPU. vmstat was not showing swapping. Max Java heap size is 8 gigs, while only 4 gigs was in use, so the Java heap was doing great; no GC in the logs. iostat was doing OK from what I remember; I will have to reproduce the issue for the exact numbers. cfstats latency had gone very high, but that is partly due to high CPU usage. One thing was clear: SHR was inching higher (due to the mmap) while the buffer cache, which started at about 20-25 MB, reduced to 2 MB by the end, which probably means that the page cache was being evicted by kswapd0. Is there a way to fix the size of the buffer cache and not let the system evict it in favour of mmap? Also, mmapping data files would basically cause not only the data asked for to be read into main memory, but also a bunch of extra pages (readahead), which would not be very useful, right? The same thing for the index would actually be more useful, as there would be more index entries in the readahead part, and the index files, being small, wouldn't cause the memory pressure that evicts the page cache. Mmapping the data files would make sense if the data size, or at least the hot data set, is smaller than RAM; otherwise just the index would probably be a better thing to mmap, no? In my case the data size is 85 gigs, while available RAM is 16 gigs (only 8 gigs after heap). /G

On Fri, Jun 8, 2012 at 11:44 AM, aaron morton aa...@thelastpickle.com wrote: Ruslan, why did you suggest changing the disk_access_mode?
Gurpreet, I would leave the disk_access_mode with the default until you have a reason to change it. 8 core, 16 gb ram, 6 data disks raid0, no swap configured is swap disabled ? Gradually, the system cpu becomes high almost 70%, and the client starts getting continuous timeouts 70% of one core or 70% of all cores ? Check the server logs, is there GC activity ? check nodetool cfstats to see the read latency for the cf. Take a look at vmstat to see if you are swapping, and look at iostats to see if io is the problem http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 8/06/2012, at 9:00 PM, Gurpreet Singh wrote: Thanks Ruslan. I will try the mmap_index_only. Is there any guideline as to when to leave it to auto and when to use mmap_index_only? /G On Fri, Jun 8, 2012 at 1:21 AM, ruslan usifov ruslan.usi...@gmail.com wrote: disk_access_mode: mmap?? set to disk_access_mode: mmap_index_only
Re: kswapd0 causing read timeouts
disk_access_mode: mmap? Set it to disk_access_mode: mmap_index_only in cassandra.yaml.

2012/6/8 Gurpreet Singh gurpreet.si...@gmail.com: Hi, I am testing Cassandra 1.1 on a 1-node cluster: 8 core, 16 GB RAM, 6 data disks RAID0, no swap configured, cassandra 1.1.1, heap size 8 gigs, key cache size in MB: 800 (used only 200 MB till now), memtable_total_space_in_mb: 2048. I am running a read workload, about 30 reads/second, no writes at all. The system runs fine for roughly 12 hours. jconsole shows that my heap size has hardly touched 4 gigs. top shows: SHR increasing slowly from 100 MB to 6.6 gigs in these 12 hours; RES increasing slowly from 6 gigs all the way to 15 gigs; buffers at a healthy 25 MB at some point, going down to 2 MB in these 12 hours; VIRT staying at 85 gigs. I understand that SHR goes up because of mmap, and RES goes up because it includes the SHR value as well. After around 10-12 hours, the CPU utilization of the system starts increasing, and I notice that the kswapd0 process becomes more active. Gradually, the system CPU becomes high, almost 70%, and the client starts getting continuous timeouts. The fact that the buffers went down from 20 MB to 2 MB suggests that kswapd0 is probably evicting the page cache. Is there a way to stop kswapd0 from doing this even when there is no swap configured? This is very easily reproducible for me, and I would like a way out of this situation. Do I need to adjust VM memory-management settings like pagecache or vfs_cache_pressure, things like that? Just some extra information: JNA is installed, mlockall is successful, and there is no compaction running. I would appreciate any help on this. Thanks, Gurpreet
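The setting Ruslan refers to lives in cassandra.yaml; a sketch of the relevant fragment for Cassandra 1.x (the option is not always present in the shipped yaml, so add the line if missing; the comments summarize the modes as discussed in this thread):

```yaml
# cassandra.yaml: how SSTables are read.
#   auto            - mmap data and index files (the default)
#   mmap_index_only - mmap only the index files
#   standard        - plain buffered I/O, no mmap at all
disk_access_mode: mmap_index_only
```

A node restart is needed for the change to take effect.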
Re: kswapd0 causing read timeouts
2012/6/8 aaron morton aa...@thelastpickle.com: Ruslan, why did you suggest changing the disk_access_mode?

Because it brings problems out of the blue; in any case, mmap brought a similar problem for me, and I haven't found any solution to resolve it other than changing disk_access_mode :-(. It will also be interesting to hear the results from the author of this thread.

Gurpreet, I would leave the disk_access_mode at the default until you have a reason to change it. 8 core, 16 GB RAM, 6 data disks RAID0, no swap configured. Is swap disabled? Gradually, the system CPU becomes high, almost 70%, and the client starts getting continuous timeouts. 70% of one core or 70% of all cores? Check the server logs: is there GC activity? Check nodetool cfstats to see the read latency for the CF. Take a look at vmstat to see if you are swapping, and look at iostat to see if I/O is the problem: http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 8/06/2012, at 9:00 PM, Gurpreet Singh wrote: Thanks Ruslan. I will try the mmap_index_only. Is there any guideline as to when to leave it at auto and when to use mmap_index_only? /G

On Fri, Jun 8, 2012 at 1:21 AM, ruslan usifov ruslan.usi...@gmail.com wrote: disk_access_mode: mmap? Set it to disk_access_mode: mmap_index_only in cassandra.yaml.

2012/6/8 Gurpreet Singh gurpreet.si...@gmail.com: Hi, I am testing Cassandra 1.1 on a 1-node cluster: 8 core, 16 GB RAM, 6 data disks RAID0, no swap configured, cassandra 1.1.1, heap size 8 gigs, key cache size in MB: 800 (used only 200 MB till now), memtable_total_space_in_mb: 2048. I am running a read workload, about 30 reads/second, no writes at all. The system runs fine for roughly 12 hours. jconsole shows that my heap size has hardly touched 4 gigs.
Re: nodetool repair -- should I schedule a weekly one ?
Yes. With CL ONE you can get inconsistent reads when one of your nodes dies and the dynamic snitch doesn't do its job.

2012/6/7 Oleg Dulin oleg.du...@gmail.com: We have a 3-node cluster. We use RF of 3 and CL of ONE for both reads and writes… Is there a reason I should schedule a regular nodetool repair job? Thanks, Oleg
Re: nodetool repair -- should I schedule a weekly one ?
Sorry, not the dynamic snitch but hinted handoff. Remember, Cassandra is eventually consistent.

2012/6/8 ruslan usifov ruslan.usi...@gmail.com: Yes. With CL ONE you can get inconsistent reads when one of your nodes dies and the dynamic snitch doesn't do its job.

2012/6/7 Oleg Dulin oleg.du...@gmail.com: We have a 3-node cluster. We use RF of 3 and CL of ONE for both reads and writes… Is there a reason I should schedule a regular nodetool repair job? Thanks, Oleg
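The underlying rule here is the usual quorum-overlap condition: a read is guaranteed to see the latest write only when read CL + write CL > RF. A tiny sketch with the numbers from this thread (RF=3, reads and writes at ONE):

```shell
RF=3   # replication factor
R=1    # read consistency level ONE
W=1    # write consistency level ONE
if [ $((R + W)) -gt "$RF" ]; then
  echo "overlap guaranteed: reads always see the latest write"
else
  echo "no overlap: stale reads possible, so regular repair matters"
fi
```

With ONE/ONE and RF 3 the test fails (1 + 1 is not greater than 3), which is exactly why hinted handoff, read repair, and scheduled nodetool repair are the mechanisms that converge the replicas.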
Re: row_cache_provider = 'SerializingCacheProvider'
I have set up a 5 GB Java heap with the following tuning:

MAX_HEAP_SIZE=5G
HEAP_NEWSIZE=800M
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=5"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=65"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:CMSFullGCsBeforeCompaction=1"

Also I set up 2 GB for memtables (memtable_total_space_in_mb: 2048). My avg heap usage (nodetool -h localhost info) is 3 GB. Based on nodetool -h localhost cfhistograms I calculated an avg row size of 70 KB. I set up the row cache for only one CF, with the following settings: update column family building with rows_cached=1 and row_cache_provider='SerializingCacheProvider'; When I set up the row cache I got a promotion failure in GC (with a stop-the-world pause of about 30 seconds) with the heap almost filled. I am very confused by this behavior. PS: I use Cassandra 1.0.10, with JNA 3.4.0, on Ubuntu Lucid (kernel 2.6.32-41).

2012/6/4 aaron morton aa...@thelastpickle.com: Yes, SerializingCacheProvider is the off-heap caching provider. Can you do some more digging into what is using the heap? Cheers, A - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 1/06/2012, at 9:52 PM, ruslan usifov wrote: Hello. I began using SerializingCacheProvider for row caching and got extreme Java heap growth. But I thought that this cache provider doesn't use the Java heap.
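One plausible explanation, an assumption rather than something confirmed in the thread: the serializing cache stores rows off-heap, but every cache hit deserializes the row back onto the heap, so large rows plus a decent hit rate still generate heavy short-lived heap garbage. A back-of-the-envelope sketch with the thread's 70 KB average row size and an assumed read rate of 500 reads/s (illustrative only):

```shell
row_kb=70        # avg row size from cfhistograms (from the thread)
reads_per_s=500  # assumed read rate, for illustration only
churn_mb_per_s=$(( row_kb * reads_per_s / 1024 ))
echo "~${churn_mb_per_s} MB/s of short-lived heap garbage from deserialization"
```

With an 800 MB new generation, tens of MB/s of allocation fills survivor space quickly; large objects then get promoted and can trigger exactly the CMS promotion failures described above.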
Re: row_cache_provider = 'SerializingCacheProvider'
I think SerializingCacheProvider has a bigger Java heap footprint than I expected.

2012/6/4 ruslan usifov ruslan.usi...@gmail.com: I have set up a 5 GB Java heap with the following tuning:

MAX_HEAP_SIZE=5G
HEAP_NEWSIZE=800M
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=5"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=65"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:CMSFullGCsBeforeCompaction=1"

Also I set up 2 GB for memtables (memtable_total_space_in_mb: 2048). My avg heap usage (nodetool -h localhost info) is 3 GB. Based on nodetool -h localhost cfhistograms I calculated an avg row size of 70 KB. I set up the row cache for only one CF, with the following settings: update column family building with rows_cached=1 and row_cache_provider='SerializingCacheProvider'; When I set up the row cache I got a promotion failure in GC (with a stop-the-world pause of about 30 seconds) with the heap almost filled. I am very confused by this behavior. PS: I use Cassandra 1.0.10, with JNA 3.4.0, on Ubuntu Lucid (kernel 2.6.32-41).

2012/6/4 aaron morton aa...@thelastpickle.com: Yes, SerializingCacheProvider is the off-heap caching provider. Can you do some more digging into what is using the heap? Cheers, A - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 1/06/2012, at 9:52 PM, ruslan usifov wrote: Hello. I began using SerializingCacheProvider for row caching and got extreme Java heap growth. But I thought that this cache provider doesn't use the Java heap.
row_cache_provider = 'SerializingCacheProvider'
Hello. I began using SerializingCacheProvider for row caching and got extreme Java heap growth. But I thought that this cache provider doesn't use the Java heap.
Re: Exception when truncate
It looks very strange, but yes. Now I can't reproduce this.

2012/5/22 aaron morton aa...@thelastpickle.com: The first part of the name is the current system time in milliseconds. If you run it twice, do you get log messages about failing to create the same directory twice? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 21/05/2012, at 5:09 AM, ruslan usifov wrote: I think as you do, but this is not true; there is no permissions issue. And as I said before, Cassandra tries to create a snapshot directory that already exists.

2012/5/19 Jonathan Ellis jbel...@gmail.com: Sounds like you have a permissions problem. Cassandra creates a subdirectory for each snapshot.

On Thu, May 17, 2012 at 4:57 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Hello. I have the following situation on our test server: from cassandra-cli I tried truncate purchase_history; three times I got:

[default@township_6waves] truncate purchase_history;
null
UnavailableException()
at org.apache.cassandra.thrift.Cassandra$truncate_result.read(Cassandra.java:20212)
at org.apache.cassandra.thrift.Cassandra$Client.recv_truncate(Cassandra.java:1077)
at org.apache.cassandra.thrift.Cassandra$Client.truncate(Cassandra.java:1052)
at org.apache.cassandra.cli.CliClient.executeTruncate(CliClient.java:1445)
at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:272)
at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:220)
at org.apache.cassandra.cli.CliMain.main(CliMain.java:348)

So it looks like truncate runs very slowly and takes longer than rpc_timeout_in_ms: 1 (this can happen because we have a very slow disk on the test machine). But in the Cassandra system log I see the following exception:

ERROR [MutationStage:7022] 2012-05-17 12:19:14,356 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MutationStage:7022,5,main]
java.io.IOError: java.io.IOException: unable to mkdirs
/home/cassandra/1.0.0/data/township_6waves/snapshots/1337242754356-purchase_history at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433) at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462) at org.apache.cassandra.db.ColumnFamilyStore.truncate(ColumnFamilyStore.java:1657) at org.apache.cassandra.db.TruncateVerbHandler.doVerb(TruncateVerbHandler.java:50) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: unable to mkdirs /home/cassandra/1.0.0/data/township_6waves/snapshots/1337242754356-purchase_history at org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:140) at org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:131) at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1409) ... 7 more Also i see that in snapshort dir already exists 1337242754356-purchase_history directory, so i think that snapshort names that generate cassandra not uniquely. PS: We use cassandra 1.0.10 on Ubuntu 10.0.4-LTS -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
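Aaron's explanation above is the key: the snapshot directory name begins with the current system time in milliseconds, so two snapshot attempts landing in the same millisecond compute the same path, and the second mkdirs fails. A minimal sketch of that collision (plain Python for illustration, not Cassandra's actual naming code):

```python
import os
import tempfile

def snapshot_dir(base, millis, cf_name):
    """Build a snapshot path the way the log suggests: <millis>-<cf>."""
    return os.path.join(base, f"{millis}-{cf_name}")

base = tempfile.mkdtemp()
millis = 1337242754356  # the timestamp seen in the error message

first = snapshot_dir(base, millis, "purchase_history")
os.makedirs(first)  # first attempt succeeds

# A retried truncate in the same millisecond computes the identical path...
second = snapshot_dir(base, millis, "purchase_history")
assert os.path.exists(second)  # ...which already exists,

# ...so creating it again raises, analogous to the mkdirs failure in the log.
try:
    os.makedirs(second)
    collided = False
except FileExistsError:
    collided = True
assert collided
```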
Re: Exception when truncate
I think so too, but that is not it: there is no permissions issue. And as I said before, Cassandra tries to create a snapshot directory that already exists.

2012/5/19 Jonathan Ellis jbel...@gmail.com: Sounds like you have a permissions problem. Cassandra creates a subdirectory for each snapshot. [quoted original post and stack trace trimmed; identical to the "Exception when truncate" message below] -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Exception when truncate
Hello, I have the following situation on our test server. From cassandra-cli I ran truncate purchase_history; three times and got:

[default@township_6waves] truncate purchase_history; null UnavailableException() at org.apache.cassandra.thrift.Cassandra$truncate_result.read(Cassandra.java:20212) at org.apache.cassandra.thrift.Cassandra$Client.recv_truncate(Cassandra.java:1077) at org.apache.cassandra.thrift.Cassandra$Client.truncate(Cassandra.java:1052) at org.apache.cassandra.cli.CliClient.executeTruncate(CliClient.java:1445) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:272) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:220) at org.apache.cassandra.cli.CliMain.main(CliMain.java:348)

So it looks like truncate runs very slowly, longer than rpc_timeout_in_ms: 1 (this can happen because we have a very slow disk on the test machine). But in the Cassandra system log I see the following exception:

ERROR [MutationStage:7022] 2012-05-17 12:19:14,356 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MutationStage:7022,5,main] java.io.IOError: java.io.IOException: unable to mkdirs /home/cassandra/1.0.0/data/township_6waves/snapshots/1337242754356-purchase_history at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433) at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462) at org.apache.cassandra.db.ColumnFamilyStore.truncate(ColumnFamilyStore.java:1657) at org.apache.cassandra.db.TruncateVerbHandler.doVerb(TruncateVerbHandler.java:50) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: unable to mkdirs /home/cassandra/1.0.0/data/township_6waves/snapshots/1337242754356-purchase_history at org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:140) at org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:131) at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1409) ... 7 more

I also see that a 1337242754356-purchase_history directory already exists in the snapshots dir, so I think the snapshot names Cassandra generates are not unique. PS: We use Cassandra 1.0.10 on Ubuntu 10.04 LTS.
Re: Exception when truncate
Also, I don't understand why truncate loads the disk so heavily on an empty CF (no SSTables at all).

2012/5/17 ruslan usifov ruslan.usi...@gmail.com: [quoted original post and stack trace trimmed; identical to the "Exception when truncate" message above]
Re: Exception when truncate
Maybe something changed in the truncate mechanism in Cassandra 1.0.x, because in Cassandra 0.8 truncate ran much faster on the same data.

2012/5/17 Viktor Jevdokimov viktor.jevdoki...@adform.com: Truncate flushes all memtables to free up commit logs, and does so on all nodes, so it takes time. Discussed on this list not long ago. Watch for: https://issues.apache.org/jira/browse/CASSANDRA-3651 https://issues.apache.org/jira/browse/CASSANDRA-4006 Best regards / Pagarbiai Viktor Jevdokimov Senior Developer Email: viktor.jevdoki...@adform.com Phone: +370 5 212 3063 Fax: +370 5 261 0453 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania

Disclaimer: The information contained in this message and attachments is intended solely for the attention and use of the named addressee and may be confidential. If you are not the intended recipient, you are reminded that the information remains the property of the sender. You must not use, disclose, distribute, copy, print or rely on this e-mail. If you have received this message in error, please contact the sender immediately and irrevocably delete this message and any copies.

-Original Message- From: ruslan usifov Sent: Thursday, May 17, 2012 13:06 To: user@cassandra.apache.org Subject: Re: Exception when truncate [quoted messages and stack trace trimmed; identical to the messages above]
Re: Exception when truncate
It's our test machine, with one node in the cluster :-)

2012/5/17 Jeremy Hanna jeremy.hanna1...@gmail.com: When doing a truncate, it has to talk to all of the nodes in the ring to perform the operation. By the error, it looks like one of the nodes was unreachable for some reason. You might run nodetool ring, or do a 'describe cluster;' in the cli, and see if your ring is okay. So I think the operation is just as fast; it just looks like it times out (20 seconds or something) when trying to perform the command against all of the nodes in the cluster.

On May 17, 2012, at 9:36 AM, ruslan usifov wrote: Maybe something changed in the truncate mechanism in Cassandra 1.0.x, because in Cassandra 0.8 truncate ran much faster on the same data. [rest of quoted thread, corporate disclaimer, and stack trace trimmed; identical to the messages above]
get dynamic snitch info from PHP
Hello. I want to route requests from a PHP client to the least-loaded node, so I need dynamic snitch and gossip info. How can I get this info from PHP? Perhaps I need a daemon that can talk to Cassandra's gossip and expose this info to PHP (over a socket, for example)?
Re: get dynamic snitch info from PHP
Sorry for my bad English. The problem I want to solve is the following. Say we take one node down for maintenance for a long time (30 min). We currently use TSocketPool to pool connections to Cassandra, but I think this pool implementation is not very good. It has a setRetryInterval parameter for taking a broken node out of rotation (we set it to 10 sec), but that means every 10 sec the pool will try to connect to the down node (again, we shut the node down on purpose for maintenance), because the pool doesn't know whether the node is dead while the Cassandra cluster does know, so those connection attempts are pointless. Also, when a node is compacting it can be heavily loaded and serve client requests poorly (at such moments we see a small increase in average backend response time).

2012/5/14 Viktor Jevdokimov viktor.jevdoki...@adform.com: I'm not sure that selecting a node based on the dynamic snitch (DS) is a good idea. First of all, every node has values for every node, including itself, and its own DS values are always better than the others'. For example, 3 nodes, RF=2:

        N1     N2     N3
N1   0.5ms    2ms    2ms
N2     2ms  0.5ms    2ms
N3     2ms    2ms  0.5ms

We have monitored many Cassandra counters, including DS values for every node, and the graphs show that latency is not only about load. So the strategy should be based on the use case, node count, RF, replica placement strategy, read repair chance, and more. What do you want to achieve?

Best regards / Pagarbiai Viktor Jevdokimov Senior Developer Email: viktor.jevdoki...@adform.com Phone: +370 5 212 3063, Fax +370 5 261 0453 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania Follow us on Twitter: @adforminsider What is Adform: watch this short video http://vimeo.com/adform/display

[corporate disclaimer, image attachments, and quoted original message trimmed]
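One client-side workaround for the blind setRetryInterval reconnects is a pool that marks a host down and backs off exponentially, so a node taken out for a 30-minute maintenance window is probed only rarely. The sketch below is illustrative Python (the pool in question, TSocketPool, is PHP; the class name, host addresses, and timings here are invented):

```python
import time

class BackoffPool:
    """Round-robin over hosts, skipping ones marked down until their
    retry deadline (exponential backoff, capped) has passed."""

    def __init__(self, hosts, base_retry=10.0, max_retry=300.0):
        self.hosts = list(hosts)
        self.base_retry = base_retry
        self.max_retry = max_retry
        self.down = {}   # host -> (retry_at, current_backoff_seconds)
        self.i = 0

    def mark_down(self, host, now=None):
        now = time.time() if now is None else now
        # Double the backoff on each consecutive failure, up to max_retry.
        _, backoff = self.down.get(host, (0.0, self.base_retry / 2))
        backoff = min(backoff * 2, self.max_retry)
        self.down[host] = (now + backoff, backoff)

    def mark_up(self, host):
        self.down.pop(host, None)

    def next_host(self, now=None):
        now = time.time() if now is None else now
        for _ in range(len(self.hosts)):
            host = self.hosts[self.i % len(self.hosts)]
            self.i += 1
            entry = self.down.get(host)
            if entry is None or entry[0] <= now:
                return host
        raise RuntimeError("all hosts down")

pool = BackoffPool(["10.0.0.1", "10.0.0.2"])
pool.mark_down("10.0.0.1", now=0.0)       # first failure: retry after 10 s
assert pool.next_host(now=1.0) == "10.0.0.2"   # down host is skipped
pool.mark_down("10.0.0.1", now=12.0)      # fails again: backoff doubles to 20 s
assert pool.next_host(now=20.0) == "10.0.0.2"  # still skipped until 32.0
```

Under this scheme a node that stays down for 30 minutes quickly reaches the backoff cap and is probed every few minutes instead of every 10 seconds.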
Re: Thrift error occurred during processing of message
Looks like you used TBufferedTransport, but since 1.0.x Cassandra supports only the framed transport.

2011/12/19 Tamil selvan R.S tamil.3...@gmail.com: Hi, we are using phpcassa to connect to Cassandra 1.0.2. After we installed the Thrift extension we started noticing the following in the error logs (we didn't notice this when running the raw Thrift library without the extension): ERROR [pool-2-thread-5314] 2011-12-05 20:26:47,729 CustomTThreadPoolServer.java (line 201) Thrift error occurred during processing of message. org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client? at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Is there any issue with Thrift protocol compatibility? Regards, Tamil
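The framed transport simply length-prefixes every Thrift message with a 4-byte big-endian size, while a buffered transport sends the payload bytes raw; when client and server disagree on framing, the receiver misinterprets the first bytes of the stream, which can surface as errors like "Missing version in readMessageBegin". A rough sketch of the framing itself (illustrative Python, not a real Thrift client):

```python
import struct

def frame(payload: bytes) -> bytes:
    """Client side of TFramedTransport: prefix the message with its
    4-byte big-endian length."""
    return struct.pack(">i", len(payload)) + payload

def unframe(data: bytes) -> bytes:
    """Server side: read the length prefix, then exactly that many bytes."""
    (length,) = struct.unpack(">i", data[:4])
    payload = data[4:4 + length]
    assert len(payload) == length, "short read"
    return payload

msg = b"\x80\x01\x00\x01..."   # a TBinaryProtocol call begins with a version word
assert unframe(frame(msg)) == msg

# An unframed (buffered) client would send `msg` bytes directly; a server
# expecting frames would read those bytes as a frame length instead,
# and the two sides lose sync.
```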
MapReduce without HDFS
Hello to all! Is it possible to launch only the Hadoop MapReduce TaskTracker and JobTracker against a Cassandra cluster, without launching HDFS (using something else for shared storage)? Thanks
Re: swap grows
Thanks for the link. But I still have a question about free memory. Our cluster peaks at 200 IOPS, yet each server still has about 3GB of free memory (the cluster has 6 nodes, so 3*6 = 18GB of unused memory). I would expect the OS to fill all memory with page cache of the SSTables (we take backups through Direct I/O), but it doesn't, and I don't understand why. I can't find any sysctl that tunes page cache thresholds or ratios. Any suggestions?

2012/4/18 Jonathan Ellis jbel...@gmail.com: what-is-the-linux-kernel-parameter-vm-swappiness http://www.linuxvox.com/2009/10/what-is-the-linux-kernel-parameter-vm-swappiness
Re: swap grows
No, I'm not sure about this :-) I don't know Linux VM management very well. It looked very strange to me that swap grows but there is no swap activity (I monitor it with vmstat -s | grep 'pages swapped out' | awk '{ print $1 }' and vmstat -s | grep 'pages swapped in' | awk '{ print $1 }'). So it looks like you are right, and I have learned something :-)

2012/4/15 Віталій Тимчишин tiv...@gmail.com: BTW, are you sure the system is doing something wrong? The system may save some pages to swap without removing them from RAM, simply to be able to drop them quickly later if needed.

2012/4/14 ruslan usifov ruslan.usi...@gmail.com: Hello. We have a 6-node cluster (Cassandra 0.8.10). On one node I increased the Java heap size to 6GB, and now swap is growing on that node, although the system has about 3GB of free memory:

root@6wd003:~# free
             total       used       free     shared    buffers     cached
Mem:      24733664   21702812    3030852          0       6792   13794724
-/+ buffers/cache:    7901296   16832368
Swap:      1998840       2352    1996488

And swap space slowly grows, and I don't understand why. PS: We have JNA mlock, and vm.swappiness = 0. PS: OS Ubuntu 10.04 (2.6.32-40-generic). -- Best regards, Vitalii Tymchyshyn
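The two vmstat pipelines above read the kernel's cumulative swap-in/out page counters; comparing two samples over time shows whether any paging is actually happening, regardless of how large the swap usage number looks. A small Python equivalent of that check, run here against canned sample text:

```python
def swap_counters(vmstat_s_output: str):
    """Extract cumulative 'pages swapped in/out' from `vmstat -s` output."""
    swapped_in = swapped_out = None
    for line in vmstat_s_output.splitlines():
        line = line.strip()
        if line.endswith("pages swapped in"):
            swapped_in = int(line.split()[0])
        elif line.endswith("pages swapped out"):
            swapped_out = int(line.split()[0])
    return swapped_in, swapped_out

# Two samples taken some time apart (canned text standing in for real output).
sample_before = "  123 pages swapped in\n  456 pages swapped out\n"
sample_after = "  123 pages swapped in\n  456 pages swapped out\n"

si0, so0 = swap_counters(sample_before)
si1, so1 = swap_counters(sample_after)

# No delta between samples: swap *usage* may grow, but nothing is being
# actively paged in or out, which matches the observation in this thread.
assert (si1 - si0, so1 - so0) == (0, 0)
```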
Re: swap grows
I know :-) but this is not an answer :-( I found that the other nodes also have about 3GB free (the node with JAVA_HEAP=6GB has 3GB free too), even though they run with JAVA_HEAP=5G. So this looks like some sysctl ratio (/proc/sys/vm?) of about 10% (3 / 24 * 100); I don't know which one. Can anybody explain this situation?

2012/4/14 R. Verlangen ro...@us2.nl: It's recommended to disable swap entirely when you run Cassandra on a server.

2012/4/14 ruslan usifov ruslan.usi...@gmail.com: I forgot to say that the system has 24GB of physical memory. [quoted original "swap grows" message and free output trimmed; identical to the post below] -- With kind regards, Robin Verlangen www.robinverlangen.nl
swap grows
Hello. We have a 6-node cluster (Cassandra 0.8.10). On one node I increased the Java heap size to 6GB, and now swap is growing on that node, although the system has about 3GB of free memory:

root@6wd003:~# free
             total       used       free     shared    buffers     cached
Mem:      24733664   21702812    3030852          0       6792   13794724
-/+ buffers/cache:    7901296   16832368
Swap:      1998840       2352    1996488

And swap space slowly grows, and I don't understand why. PS: We have JNA mlock, and vm.swappiness = 0. PS: OS Ubuntu 10.04 (2.6.32-40-generic).
Re: swap grows
I forgot to say that the system has 24GB of physical memory.

2012/4/14 ruslan usifov ruslan.usi...@gmail.com: [quoted original "swap grows" message trimmed; identical to the post above]
need of regular nodetool repair
Hello. I have the following question: if we read and write to the Cassandra cluster at QUORUM consistency level, does that allow us to skip running nodetool repair regularly (i.e. every GCGraceSeconds)?
Re: need of regular nodetool repair
Sorry for my bad English. So does QUORUM let us skip running repair regularly? That does not follow from your answer.

2012/4/11 R. Verlangen ro...@us2.nl: Yes, I personally have configured it to perform a repair once a week, as GCGraceSeconds is at 10 days. This is also what's in the manual: http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data (point 2)

2012/4/11 ruslan usifov ruslan.usi...@gmail.com: [quoted original question trimmed; identical to the post above] -- With kind regards, Robin Verlangen www.robinverlangen.nl
Re: need of regular nodetool repair
HH, is that hinted handoff?

2012/4/11 Igor i...@4friends.od.ua: On 04/11/2012 11:49 AM, R. Verlangen wrote: Not everything, just HH :) I hope this works for me for the following reasons: I have quite a large RF (6 datacenters, each carrying one replica of the full dataset), read and write at CL ONE, a relatively small TTL of 10 days, no deletes, and servers almost never go down for an hour. So I expect that even if I lose some hints, some other replica will reply with the data. Is that correct? This works for me, but may not work for others.

R. Verlangen: Well, if everything works 100% at any time there should be nothing to repair; however, with a distributed cluster it would be pretty rare for that to occur. At least that is how I interpret it.

2012/4/11 Igor i...@4friends.od.ua: BTW, I heard that you don't need to run repair if all your data has a TTL, hinted handoff works, and you never delete data.

[rest of quoted thread trimmed; identical to the messages above]
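The arithmetic behind the QUORUM question earlier in this thread: with replication factor N, every read set of size R is guaranteed to overlap every write set of size W whenever W + R > N, which QUORUM/QUORUM always satisfies, while Igor's CL ONE setup does not. A quick check (note that overlap guarantees seeing the latest write, but repair is still needed for other reasons, e.g. so deletes are not resurrected after GCGraceSeconds):

```python
def quorum(n: int) -> int:
    """Quorum for replication factor n: floor(n/2) + 1."""
    return n // 2 + 1

def overlaps(n: int, w: int, r: int) -> bool:
    """True if every read set of size r must intersect every write set of size w
    drawn from the same n replicas."""
    return w + r > n

for rf in (1, 2, 3, 5, 6):
    q = quorum(rf)
    # QUORUM writes + QUORUM reads always intersect, for any RF...
    assert overlaps(rf, q, q)

# ...while CL ONE with 6 replicas (one per datacenter, as above) does not
# guarantee that the replica you read from is one that took the write:
assert not overlaps(6, 1, 1)
```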
Re: Resident size growth
mmap doesn't depend on JNA.

2012/4/9 Jeremiah Jordan jeremiah.jor...@morningstar.com: He says he disabled JNA. You can't mmap without JNA, can you?

On Apr 9, 2012, at 4:52 AM, aaron morton wrote: see http://wiki.apache.org/cassandra/FAQ#mmap Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 9/04/2012, at 5:09 AM, ruslan usifov wrote: mmap'd SSTables? That's normal.

2012/4/5 Omid Aladini omidalad...@gmail.com: Hi, I'm experiencing steady growth in the resident size of a JVM running Cassandra 1.0.7. I disabled JNA and the off-heap row cache, tested with and without mlockall disabling paging, and upgraded to JRE 1.6.0_31 to prevent this bug [1] from leaking memory. Still, the JVM's resident set size grows steadily. A process with Xmx=2048M has grown to 6GB resident size, and one with Xmx=8192M to 16GB, in a few hours and increasing. Has anyone experienced this? Any idea how to deal with this issue? Thanks, Omid [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7066129
Re: Resident size growth
Also, I suggest setting disk_access_mode: mmap_index_only.

2012/4/9 Omid Aladini omidalad...@gmail.com: Thanks. Yes, it's due to mmapped SSTable pages counting toward resident size. Jeremiah: mmap isn't through JNA; it's via java.nio.MappedByteBuffer, I think. -- Omid

[rest of quoted thread trimmed; identical to the messages above]
Re: Resident size growth
mmap'd SSTables? That's normal.

2012/4/5 Omid Aladini omidalad...@gmail.com: [quoted original question trimmed; identical to the message quoted above]
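Why mmapped SSTables make resident size look scary: file pages mapped into the process address space count toward RSS once they are touched, even though they are clean page cache the kernel can reclaim at any time, and they live entirely outside the Java heap. A small illustration using Python's mmap (which, like java.nio.MappedByteBuffer, needs no JNA); the file size here is arbitrary:

```python
import mmap
import os
import tempfile

# Create a 1 MiB file and map it read-only, the way Cassandra maps SSTables.
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * (1 << 20))
os.close(fd)

with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Touching one byte per page faults those pages in; they now count in the
    # process RSS, although they are file-backed page cache, not heap memory.
    touched = sum(mapped[i] for i in range(0, len(mapped), 4096))
    assert touched == 0            # the file is all zero bytes
    assert len(mapped) == 1 << 20
    mapped.close()

os.remove(path)
```

The same reasoning explains why a JVM with Xmx=2048M can show a 6GB resident size: the difference is largely mapped SSTable data, not a leak.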
upgrade from cassandra 0.8 to 1.0
Hello. It looks like Cassandra 1.0.x is stable and has interesting features like off-heap memtables and row caches, so we want to upgrade from 0.8 to 1.0. Is it possible to do this without cluster downtime (while we upgrade all the nodes)? I mean the following: when we begin the upgrade, at some point the working cluster will contain a mix of 0.8 nodes (not yet upgraded) and 1.0 nodes (already upgraded), so I am concerned about this situation, i.e. communication between nodes could break because of incompatibilities in the inter-node protocol version.
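For what it's worth, the usual answer is a rolling upgrade, one node at a time, so the mixed-version window exists but every node is drained cleanly before it is touched. A sketch of the per-node plan; the package/service command strings are illustrative assumptions, not exact commands for any particular distribution:

```python
# Ordered per-node steps for a rolling 0.8 -> 1.0 upgrade (command strings are
# illustrative assumptions; adapt them to your packaging and init system).
UPGRADE_STEPS = [
    "nodetool -h {host} drain",  # flush memtables, stop accepting writes
    "service cassandra stop",
    "# install the 1.0.x package, carrying over cassandra.yaml settings",
    "service cassandra start",
    "nodetool -h {host} ring",   # verify the node rejoined before moving on
]

def rolling_upgrade_plan(hosts):
    """One node at a time, never in parallel, so the cluster stays available."""
    return [step.format(host=host) for host in hosts for step in UPGRADE_STEPS]

plan = rolling_upgrade_plan(["10.0.0.1", "10.0.0.2"])
```

The key point is that reads and writes keep flowing through the remaining replicas while each node is down, which is why a replication factor above 1 is a prerequisite.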
Re: repair broke TTL based expiration
Do you run major compactions? 2012/3/19 Radim Kolar h...@filez.com: I suspect that running cluster-wide repair interferes with TTL-based expiration. I am running repair every 7 days and using a TTL expiration time of 7 days too. Data are never deleted. Stored data in Cassandra are always growing (I have been watching them for 3 months) but they should not. If I run a manual cleanup, some data are deleted, but only about 5%. Currently there are about 3-5 times more rows than I estimate. I suspect that running repair on data with TTL can cause: 1. the time check for expired records is ignored and these data are streamed to the other node, where they become alive again, or 2. streamed data are propagated with the full TTL. Let's say I have a TTL of 7 days: if data are stored for 5 days and then repaired, they should be sent to the other node with a TTL of 2 days, not 7. Can someone test this case? I cannot play with the production cluster too much.
Re: repair broke TTL based expiration
Cleanup doesn't make any sense in your case. You write that repair works for you, so you can stop the cassandra daemon, delete all data files from the folder that contains the problem data, start the daemon again, and run nodetool repair, but for this you must have a replication factor of 3 for the keyspace and use consistency level QUORUM for data manipulation. 2012/3/20 Radim Kolar h...@filez.com: On 19.3.2012 23:33, ruslan usifov wrote: Do you run major compactions? No, I do cleanups only. Major compactions kill my node with OOM.
Re: slow read
2012/3/5 Jeesoo Shin bsh...@gmail.com: Hi all. I have very SLOW READ here. :-( I made a cluster with three nodes (AWS xlarge, replication = 3). Cassandra version is 1.0.6. I have inserted 1,000,000 rows (standard columns). Each row has 200 columns. Each column has a 16-byte key and a 512-byte value. I used Hector's createSliceQuery to get one column from a row. This basic query (random row, fixed column) is issued from multiple threads against Cassandra. I only get up to 140 requests per second. Is this all I can get for reads? Or am I doing something wrong? Interestingly, when I request rows which don't exist, it goes up to 1600 per second. -- You must test read performance with a parallel test (i.e. multiple threads). The result being much faster for non-existent rows comes from the bloom filter. -- ANY insight you can share will be extremely helpful. Thank you. Regards, Jeesoo.
Re: slow read
And the sum of requests/sec across all threads is 160? 2012/3/5 Jeesoo Shin bsh...@gmail.com: Thank you for the reply. :) Yes, I did use multiple threads. 160 and 320 gave me the same result. On 3/5/12, ruslan usifov ruslan.usi...@gmail.com wrote: [...]
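The shape of such a parallel read test is just N threads hammering the same query in a loop and summing completed requests. A self-contained sketch; the placeholder lambda stands in for the real Hector slice query, which is an assumption here:

```python
import threading
import time

def benchmark(do_read, n_threads=4, duration=0.5):
    """Run do_read() in n_threads tight loops for `duration` seconds,
    then return the aggregate requests per second."""
    counts = [0] * n_threads  # one slot per thread, so no locking is needed
    deadline = time.monotonic() + duration

    def worker(i):
        while time.monotonic() < deadline:
            do_read()
            counts[i] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts) / duration

# placeholder ~1ms read; replace with the real single-column slice query
rps = benchmark(lambda: time.sleep(0.001))
```

If the aggregate rate stays flat as you add threads while the nodes sit idle, the bottleneck is usually the client side or connection pooling rather than Cassandra itself.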
Wrong version in debian repository
Hello. I think the http://www.apache.org/dist/cassandra/debian repo has an incorrect version for the 0.8 branch. There is 0.8.8, but the latest version is 0.8.9. Maybe this repository is abandoned?
Re: Disable Nagle algorithm in thrift, i.e. TCP_NODELAY
2012/1/26 Jeffrey Kesselman jef...@gmail.com: Most operating systems have a way to do this at the OS level. Could you please describe this way for Linux, for a particular application? Maybe some sysctl? On Thu, Jan 26, 2012 at 8:17 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Hello. Is it possible to set TCP_NODELAY on the thrift socket in Cassandra? -- It's always darkest just before you are eaten by a grue.
Re: Disable Nagle algorithm in thrift, i.e. TCP_NODELAY
Sorry, but you misunderstood me. I am asking whether Cassandra has any option to control the TCP_NODELAY behaviour, so we don't need to patch Cassandra or thrift code. I found this article: https://wiki.cs.columbia.edu:8443/pages/viewpage.action?pageId=12585536, where coreTransport.TcpClient.NoDelay is mentioned, but I don't understand what it is. 2012/1/26 Jeffrey Kesselman jef...@gmail.com: To set or get a TCP socket option, call getsockopt(2) to read or setsockopt(2) to write the option, with the option level argument set to SOL_TCP. In addition, most SOL_IP socket options are valid on TCP sockets. For more information see ip(7). ... TCP_NODELAY If set, disable the Nagle algorithm. This means that segments are always sent as soon as possible, even if there is only a small amount of data. When not set, data is buffered until there is a sufficient amount to send out, thereby avoiding the frequent sending of small packets, which results in poor utilization of the network. This option cannot be used at the same time as the option TCP_CORK. http://bit.ly/zpvLbP On Thu, Jan 26, 2012 at 12:10 PM, ruslan usifov ruslan.usi...@gmail.com wrote: [...] -- It's always darkest just before you are eaten by a grue.
Re: Disable Nagle algorithm in thrift, i.e. TCP_NODELAY
On 27 January 2012 at 1:19, aaron morton aa...@thelastpickle.com wrote: Outgoing TCP connections between nodes have TCP_NODELAY on, and so do server-side Thrift sockets. Thanks for the exhaustive answer. I would assume your client will be setting it as well. No, the PHP client doesn't have TCP_NODELAY, because PHP stream sockets don't allow setting socket options, i.e. there is no such API. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 27/01/2012, at 6:54 AM, sridhar basam wrote: There is no global setting in Linux to turn off Nagle. Sridhar 2012/1/26 Jeffrey Kesselman jef...@gmail.com: You know... there ought to be a command-line command to set it. There is in Solaris and Windows, but I'm having trouble finding it for Linux. 2012/1/26 ruslan usifov ruslan.usi...@gmail.com: [...]
Re: Disable Nagle algorithm in thrift, i.e. TCP_NODELAY
On 27 January 2012 at 2:44, sridhar basam s...@basam.org wrote: Which socket API? http://www.php.net/manual/en/function.socket-set-option.php It is possible to do the appropriate setsockopt call to disable Nagle. No, you are wrong: the PHP thrift implementation doesn't use the sockets extension, it uses PHP streams (http://ru.php.net/manual/en/book.stream.php), aka fsockopen, stream_socket_recvfrom, etc., but PHP streams don't allow setting any socket options :-(. Sridhar 2012/1/26 ruslan usifov ruslan.usi...@gmail.com: [...]
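To make the thread concrete: at the socket level TCP_NODELAY is a one-line setsockopt call, which is all any client or server has to do; the PHP streams limitation is simply that this call isn't reachable. A sketch in Python, purely for illustration:

```python
import socket

def open_nodelay_connection(host, port):
    """Open a TCP connection with the Nagle algorithm disabled (TCP_NODELAY)."""
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

# demo against a throwaway local listener instead of a real Cassandra node
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
client = open_nodelay_connection("127.0.0.1", server.getsockname()[1])
nodelay = client.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
client.close()
server.close()
```

With the option set, small writes (like individual Thrift calls) go out immediately instead of being coalesced by the kernel.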
Enable thrift logging
Hello. I am trying to log thrift messages (we need this to solve a communication problem between the Cassandra daemon and a PHP client), so in log4j-server.properties I wrote the following lines: log4j.logger.org.apache.thrift.transport=DEBUG,THRIFT log4j.appender.THRIFT=org.apache.log4j.RollingFileAppender log4j.appender.THRIFT.maxFileSize=20MB log4j.appender.THRIFT.maxBackupIndex=50 log4j.appender.THRIFT.layout=org.apache.log4j.PatternLayout log4j.appender.THRIFT.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n log4j.appender.THRIFT.File=/var/log/cassandra/8.0/thrift.log But no messages appear in this log (though they should, i.e. exception traces). If we enable DEBUG on the rootLogger, i.e.: log4j.rootLogger=DEBUG,stdout,R then thrift log messages appear in system.log as expected, but how can we split them out into a separate log? PS: cassandra 0.8.9
Re: Enable thrift logging
2012/1/25 aaron morton aa...@thelastpickle.com: Do you want to log from inside the thrift code or from the cassandra thrift classes? The exceptions happen inside thrift, so inside thrift :-))) If it's the latter, try log4j.logger.org.apache.thrift=DEBUG,THRIFT org.apache.thrift.transport is part of thrift proper. I tried this but without any result. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25/01/2012, at 11:36 AM, R. Verlangen wrote: Pick a custom loglevel and redirect them with /etc/syslog.conf? 2012/1/24 ruslan usifov ruslan.usi...@gmail.com: [...]
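One detail that may be missing, assuming stock log4j semantics: a child logger forwards its events up to the root appenders unless additivity is disabled, and the logger level has to let DEBUG through. A sketch of the separated config:

```properties
# log4j-server.properties: route thrift transport logging to its own file
log4j.logger.org.apache.thrift.transport=DEBUG,THRIFT
# keep these messages out of system.log
log4j.additivity.org.apache.thrift.transport=false

log4j.appender.THRIFT=org.apache.log4j.RollingFileAppender
log4j.appender.THRIFT.File=/var/log/cassandra/8.0/thrift.log
log4j.appender.THRIFT.maxFileSize=20MB
log4j.appender.THRIFT.maxBackupIndex=50
log4j.appender.THRIFT.layout=org.apache.log4j.PatternLayout
log4j.appender.THRIFT.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n
```

If messages still show up only when the root logger is at DEBUG, they are probably being emitted under a Cassandra logger name (e.g. something under org.apache.cassandra.thrift) rather than Thrift's own package, so that would be the logger name to target instead.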
Re: performance reaching plateau while the hardware is still idle
Use a parallel test :-))) 2011/12/15 Kent Tong freemant2...@yahoo.com: Hi, I am running a performance test for Cassandra 1.0.5. It can perform about 1500 business operations (one read + one write to the same row) per second. However, the CPU is still 85% idle (as shown by vmstat) and the IO utilization is less than a few percent (as shown by iostat). nodetool tpstats shows basically no active and pending threads. I can run several such test clients concurrently, each achieving the same operations per second without increasing the hardware utilization. So, why has the performance reached a plateau while there are still idle hardware resources? Thanks in advance for any idea!
Prevent create snapshot when truncate
Hello. Every time we do a truncate, Cassandra automatically creates a snapshot. How can we prevent this?
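If your version supports it, this behaviour is controlled by the auto_snapshot option in cassandra.yaml (added around the 1.0 line, if I recall correctly); the obvious caveat is that truncated or dropped data then cannot be recovered from a snapshot:

```yaml
# cassandra.yaml: skip the automatic snapshot taken before truncate and drop
auto_snapshot: false
```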
Re: Does anybody know why Twitter stop integrate Cassandra as Twitter store?
Big thanks for all your replies
Does anybody know why Twitter stop integrate Cassandra as Twitter store?
http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html As said in this post, Twitter stopped working on using Cassandra as a store for tweets, but it doesn't say why they made this decision. Does anybody have more information?
Re: Does anybody know why Twitter stop integrate Cassandra as Twitter store?
Hello 2011/10/4 Paul Loy ketera...@gmail.com: Did you read the article you posted? Yes. *We believe that this isn't the time to make large scale migration to a new technology*. We will focus our Cassandra work on new projects that we wouldn't be able to ship without a large-scale data store. There was a big buzz online that Twitter would migrate their tweets to Cassandra, but then they dropped those plans. This explanation sounds very vague. Why did they change their mind? I found only one article about this: http://highscalability.com/blog/2010/7/11/so-why-is-twitter-really-not-using-cassandra-to-store-tweets.html
Re: Problems using Thrift API in C
Do you have any error messages in the cassandra log? 2011/7/28 Aleksandrs Saveljevs aleksandrs.savelj...@zabbix.com: Dear all, We are considering using Cassandra for storing gathered data in Zabbix (see https://support.zabbix.com/browse/ZBXNEXT-844 for more details). Because Zabbix is written in C, we are considering using the Thrift API in C, too. However, we are running into problems trying to get even the basic code to work. Consider the attached source code. This is essentially a rewrite of the first part of the C++ example given at http://wiki.apache.org/cassandra/ThriftExamples#C.2B-.2B-. If we run it under strace, we see that it hangs on the call to recv() when setting the keyspace: $ strace -s 64 ./test ... socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3 connect(3, {sa_family=AF_INET, sin_port=htons(9160), sin_addr=inet_addr("127.0.0.1")}, 16) = 0 send(3, \0\0\0/\200\1\0\1\0\0\0\fset_keyspace\0\0\0\0\v\0\1\0\0\0\vmy_keyspace\0, 47, 0) = 47 recv(3, ^C unfinished ... If we run the C++ example, it passes this step successfully. Does anybody know where the problem is? We are using Thrift 0.6.1 and Cassandra 0.8.1. Also, what is the current state of the Thrift API in C? Can it be considered stable? Has anybody used it successfully? Any examples? Thanks, Aleksandrs
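A hedged observation from the strace: Cassandra 0.8 expects Thrift's framed transport, where every message is prefixed with a 4-byte big-endian length of the payload only. In the capture, 47 bytes are sent and the first four bytes (\0\0\0/) themselves declare a length of 47 (0x2f), which suggests the header is being counted in the frame length; the server would then wait for 4 more bytes that never arrive, matching the hang in recv(). A sketch of correct framing:

```python
import struct

def frame(payload: bytes) -> bytes:
    """Thrift framed transport: 4-byte big-endian payload length, then payload.
    The length must NOT include the 4 header bytes themselves."""
    return struct.pack(">i", len(payload)) + payload

# illustrative payload only: the TBinaryProtocol version header plus filler
framed = frame(b"\x80\x01\x00\x01" + b"rest-of-set_keyspace-call")
```

If the C client (or its transport setup) frames with the header included, or doesn't frame at all, the C++ example working while the C one hangs is exactly what you would see.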
Re: What will be the steps for adding new nodes
2011/4/16 Roni r...@similarweb.com: I have a 0.6.4 Cassandra cluster of two nodes in full replica (replication factor 2). I want to add two more nodes and balance the cluster (keeping replication factor 2). I want all of them to be seeds. What should be the simple steps: 1. add <AutoBootstrap>true</AutoBootstrap> to all the nodes or only the new ones? -- You must add this option only on the new nodes. 2. add <Seed>[new_node]</Seed> to the config file of the old nodes before adding the new ones? -- If you do that, bootstrap will not work. And this step is not needed: I think a few seed nodes are enough for fault tolerance. 3. do the old nodes need to be restarted (if no change is needed in their config files)? -- No, that is not needed.
Re: Cassandra constantly removes a node which no longer exists
2011/4/12 aaron morton aa...@thelastpickle.com: In JConsole go to o.a.c.db.HintedHandoffManager and try the deleteHintsForEndpoint operation. This is also called when a token is removed from the ring, or when a node is decommissioned. What process did you use to reconfigure the cluster? I decommissioned the node, then step by step restarted all nodes in the cluster. After I repeated the restart operation twice, this log entry disappeared.
Cassandra constantly removes a node which no longer exists
Hello. I use cassandra 0.7.4. After reconfiguring the cluster, on one node I constantly see the following log: INFO [GossipStage:1] 2011-04-11 17:14:13,514 StorageService.java (line 865) Removing token 56713727820156410577229101238628035242 for /10.32.59.202 INFO [ScheduledTasks:1] 2011-04-11 17:14:13,514 HintedHandOffManager.java (line 210) Deleting any stored hints for 10.32.59.202 But node 10.32.59.202 doesn't exist anymore. How can I prevent this?
Re: Flush / Snapshot Triggering Full GCs, Leaving Ring
2011/4/7 Jonathan Ellis jbel...@gmail.com: Hypothesis: it's probably the flush causing the CMS, not the snapshot linking. Confirmation possibility #1: Add a logger.warn to CLibrary.createHardLinkWithExec -- with JNA enabled it shouldn't be called, but let's rule it out. Confirmation possibility #2: Force some flushes w/o snapshot. Either way: concurrent mode failure is the easy GC problem. Hopefully you really are seeing mostly that -- this means the JVM didn't start CMS early enough, so it ran out of space before it could finish the concurrent collection, so it falls back to stop-the-world. The fix is a combination of reducing XX:CMSInitiatingOccupancyFraction and (possibly) increasing heap capacity if your heap is simply too full too much of the time. You can also mitigate it by increasing the phi threshold for the failure detector, so the node doing the GC doesn't mark everyone else as dead. (Eventually your heap will fragment and you will see STW collections due to promotion failed, but you should see that much less frequently. GC tuning to reduce fragmentation may be possible based on your workload, but that's out of scope here and in any case the real fix for that is https://issues.apache.org/jira/browse/CASSANDRA-2252.) Jonathan, do you have plans to backport this to the 0.7 branch? (It's very hard to tune CMS, and for people who are new to Java this task is much harder.)
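For reference, the two knobs Jonathan mentions would look roughly like this in a 1.0-era setup; the values are illustrative starting points, not recommendations:

```shell
# conf/cassandra-env.sh: start CMS earlier so the concurrent cycle can finish
# before the old generation fills up
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
```

The failure-detector side lives in cassandra.yaml as phi_convict_threshold (default 8); raising it makes the cluster more tolerant of a node pausing for GC before peers mark it down.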
Re: ParNew (promotion failed)
Also, after all these messages, in stdout.log I see the following: [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor3] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor2] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor1] [Unloading class sun.reflect.GeneratedConstructorAccessor3] As written here: http://anshuiitk.blogspot.com/2010/11/excessive-full-garbage-collection.html this may be a PermGen size problem, but the line "CMS Perm : 20073K->19913K(33420K)" doesn't suggest that, does it?
Re: who to contact?
This bug was fixed in thrift php trunk. 2011/3/30 William Oberman ober...@civicscience.com: I think I found a bug in the cassandra PHP client. I'm using phpcassa, but the bug is in thrift itself, which I think that library (phpcassa) just wraps. In any case, I was trying to test on my local machine, which has limited RAM, so I reduced the JVM heap size. Of course I immediately had an OOM causing my local cassandra server to crash, but that caused my unit tests to stall at 100% CPU, which seemed weird to me. I had to figure out why. It seems that TSocket doesn't test for EOF (it's only checking for a socket timeout), causing a tight infinite loop when the connection disappears. Checking for EOF in an else if seems like an easy fix, but given how deep this code is in the library I'll leave it to the experts. My diff of the file: @@ -255,6 +255,9 @@ if (true === $md['timed_out'] && false === $md['blocked']) { throw new TTransportException('TSocket: timed out reading '.$len.' bytes from '. $this->host_.':'.$this->port_); +} else if (feof($this->handle_)) { + throw new TTransportException('TSocket: EOF reading '.$len.' bytes from '. + $this->host_.':'.$this->port_); } else { $pre .= $buf; $len -= $sz; -- Will Oberman Civic Science, Inc. 3030 Penn Avenue., First Floor Pittsburgh, PA 15201 (M) 412-480-7835 (E) ober...@civicscience.com
Re: ParNew (promotion failed)
2011/3/23 ruslan usifov ruslan.usi...@gmail.com: Hello, sometimes I see the following message in the gc log: 2011-03-23T14:40:56.049+0300: 14897.104: [GC 14897.104: [ParNew (promotion failed) Desired survivor size 41943040 bytes, new threshold 2 (max 2) - age 1: 5573024 bytes, 5573024 total - age 2: 5064608 bytes, 10637632 total : 672577K->670749K(737280K), 0.1837950 secs] 14897.288: [CMS: 1602487K->779310K(2326528K), 4.7525580 secs] 2270940K->779310K(3063808K), [CMS Perm : 20073K->19913K(33420K)], 4.9365810 secs] [Times: user=5.06 sys=0.00, real=4.93 secs] Total time for which application threads were stopped: 4.9378750 seconds After investigating, I found that this happens when a memtable flush and compaction happen: at that moment the young part of the heap overflows and a full GC happens. So to resolve this, should I tune the young generation (HEAP_NEWSIZE, -Xmn) and the in_memory_compaction_limit_in_mb config parameter? Also, if memtables are flushed due to memtable_flush_after, would separating the memtable flushes in time help?
Re: Add node to balanced cluster?
2011/3/25 Eric Gilmore e...@datastax.com: Also: http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity I can do what is described there, but I am afraid that when I begin balancing the cluster with the new node, this will be a big stress for it. Maybe there are some strategies for how to do this?
Re: Add node to balanced cluster?
2011/3/25 Eric Gilmore e...@datastax.com: Ruslan, I'm not sure exactly what risks you are referring to -- can you be more specific? Do the CPU-intensive operations one at a time, including doing the cleanup when it will not interfere with other operations, and I think you should be fine, from my understanding. I am afraid about disk IO. I think the move and cleanup operations consume a lot of IO, so while they run the throughput of the cluster can degrade seriously. And this would not be a problem if these operations took 5-10 minutes, but they can run for hours (1.5 - 2), and that is on one node, so a full rebalance could take days, with serious throughput problems during that period. Or do I exaggerate?
Re: debian/ubuntu mirror down?
The Cassandra issue tracker has a ticket for this (a link to the ticket was posted on this list, but I forget where). 2011/3/25 Shashank Tiwari tsha...@gmail.com: The Ubuntu Software Update seems to complain -- Failed to fetch http://www.apache.org/dist/cassandra/debian/dists/unstable/main/binary-amd64/Packages.gz 403 Forbidden [IP: 140.211.11.131 80] Failed to fetch http://www.apache.org/dist/cassandra/debian/dists/unstable/main/source/Sources.gz 403 Forbidden [IP: 140.211.11.131 80] Has something changed or is the mirror down? Thanks, Shashank
Re: ParNew (promotion failed)
2011/3/24 Erik Onnen eon...@gmail.com It's been about 7 months now but at the time G1 would regularly segfault for me under load on Linux x64. I'd advise extra precautions in testing and make sure you test with representative load. Which java version do you use?
Re: error connecting to cassandra 0.7.3
And where is the transport creation for your thrift interface? Cassandra 0.7 uses framed transport by default. 2011/3/24 Anurag Gujral anurag.guj...@gmail.com: I am using the following code to create my client: tr = new TSocket(url, port); TProtocol proto = new TBinaryProtocol(tr); client = new Cassandra.Client(proto); client.set_keyspace(this.keyspace); I am getting the errors I mentioned below. Thanks, Anurag -- Forwarded message -- From: Anurag Gujral anurag.guj...@gmail.com Date: Thu, Mar 24, 2011 at 1:26 AM Subject: error connecting to cassandra 0.7.3 To: user@cassandra.apache.org I am using cassandra-0.7.3 and thrift-0.0.5. I wrote a java client using thrift 0.0.5; when I try to connect to the local cassandra server I get the following error: ERROR com.bluekai.cassandra.validation.ValidationThread - Failed to connect to 127.0.0.1. org.apache.thrift.transport.TTransportException: Cannot write to null outputStream I am able to connect to the local cassandra server using cassandra-cli, though. Any suggestions? Thanks, Anurag
Why was the disk_access_mode config parameter removed from the yaml in the cassandra 0.7 branch?
mmap, which is set as the default on 64-bit platforms, works badly for us (I don't know the reason why this happens, but it happens on 4 machines in my case, so I don't think it is a hardware problem).
Add node to balanced cluster?
Hello. Which strategy should I use to add a new node to a fully balanced cluster (node tokens were generated by this python script: def tokens(nodes): for x in xrange(nodes): print 2 ** 127 / nodes * x tokens(3); )? How do I get a balanced cluster after adding the new node, without big stress on the current cluster?
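The same generator in Python 3 form (integer division; RandomPartitioner's ring spans 0..2**127), in case anyone wants to reuse it:

```python
def tokens(nodes):
    """Evenly spaced initial_token values for RandomPartitioner."""
    return [(2 ** 127 // nodes) * x for x in range(nodes)]

ring3 = tokens(3)  # tokens for a 3-node ring
```

Note that adding a node to an evenly balanced ring means recomputing the tokens for the new size and moving the existing nodes, which is exactly the stressful part the question is about.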
Re: change node IP address
2011/3/23 aaron morton aa...@thelastpickle.com: Which version are you using? It looks like with 0.7.X (and probably 0.6) versions you can just shut down the node and bring it back up with the new IP and It Just Works https://issues.apache.org/jira/browse/CASSANDRA-872 So to replace one machine with another, is it enough to simply copy the cassandra data directory to the new machine, set the previous machine's token on it, and that's all?
ParNew (promotion failed)
Hello. Sometimes I see the following message in the gc log: 2011-03-23T14:40:56.049+0300: 14897.104: [GC 14897.104: [ParNew (promotion failed) Desired survivor size 41943040 bytes, new threshold 2 (max 2) - age 1: 5573024 bytes, 5573024 total - age 2: 5064608 bytes, 10637632 total : 672577K->670749K(737280K), 0.1837950 secs] 14897.288: [CMS: 1602487K->779310K(2326528K), 4.7525580 secs] 2270940K->779310K(3063808K), [CMS Perm : 20073K->19913K(33420K)], 4.9365810 secs] [Times: user=5.06 sys=0.00, real=4.93 secs] Total time for which application threads were stopped: 4.9378750 seconds How can I minimize their frequency, or disable them? My current workload is many small objects (about 200 bytes long), and the sum of my memtables is about 300 MB (16 CFs). My heap is 3G.
Re: ParNew (promotion failed)
2011/3/23 Narendra Sharma narendra.sha...@gmail.com: I think it is due to fragmentation in the old gen, due to which the survivor area cannot be moved to the old gen. A 300MB data size for memtables looks high for a 3G heap. I learned that the in-memory overhead of a memtable can be as high as 10x the memtable data size. So either increase the heap or reduce the memtable thresholds further so that the old gen gets freed up faster. With 16 CFs, I would do both, i.e. increase the heap to say 4GB and reduce the memtable thresholds further. I think you didn't understand me: 300MB is the sum of the thresholds over all 16 CFs, so a single memtable_threshold is about 18MB. Or should I reduce memtable_threshold all the same?
Re: Pauses of GC
After some investigation I think my problem is similar to this: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/reduced-cached-mem-resident-set-size-growth-td5967110.html Now I have disabled mmap for data files by setting disk_access_mode to mmap_index_only.
Re: Pauses of GC
I mean heap fragmentation of the Linux process by malloc, so that at one critical moment all memory is held by the java process in RSS, the OS kernel can't allocate any system resource, and as a result it hangs? Is that possible?
Pauses of GC
Hello Some times i have very long GC pauses: Total time for which application threads were stopped: 0.0303150 seconds 2011-03-17T13:19:56.476+0300: 33295.671: [GC 33295.671: [ParNew: 678855K-20708K(737280K), 0.0271230 secs] 1457643K-806795K(4112384K), 0.027305 0 secs] [Times: user=0.33 sys=0.00, real=0.03 secs] Total time for which application threads were stopped: 0.0291820 seconds 2011-03-17T13:20:32.962+0300: 2.157: [GC 2.157: [ParNew: 676068K-23527K(737280K), 0.0302180 secs] 1462155K-817599K(4112384K), 0.030402 0 secs] [Times: user=0.31 sys=0.00, real=0.03 secs] Total time for which application threads were stopped: 0.1270270 seconds 2011-03-17T13:21:11.908+0300: 33371.103: [GC 33371.103: [ParNew: 678887K-21564K(737280K), 0.0268160 secs] 1472959K-823191K(4112384K), 0.027011 0 secs] [Times: user=0.28 sys=0.00, real=0.03 secs] Total time for which application threads were stopped: 0.0293330 seconds 2011-03-17T13:21:50.482+0300: 33409.677: [GC 33409.677: [ParNew: 676924K-21115K(737280K), 0.0281720 secs] 1478551K-829900K(4112384K), 0.028363 0 secs] [Times: user=0.27 sys=0.00, real=0.03 secs] Total time for which application threads were stopped: 0.0339610 seconds 2011-03-17T13:22:32.849+0300: 33452.044: [GC 33452.044: [ParNew: 676475K-25948K(737280K), 0.0317600 secs] 1485260K-842061K(4112384K), 0.031952 0 secs] [Times: user=0.22 sys=0.00, real=0.03 secs] Total time for which application threads were stopped: 0.0344430 seconds 2011-03-17T13:23:14.924+0300: 33494.119: [GC 33494.119: [ParNew: 681308K-25087K(737280K), 0.0282600 secs] 1497421K-848300K(4112384K), 0.028436 0 secs] [Times: user=0.32 sys=0.00, real=0.03 secs] Total time for which application threads were stopped: 0.0309160 seconds 2011-03-17T13:23:57.192+0300: 33536.387: [GC 33536.387: [ParNew: 680447K-24805K(737280K), 0.0299910 secs] 1503660K-855829K(4112384K), 0.030167 0 secs] [Times: user=0.29 sys=0.01, real=0.03 secs] Total time for which application threads were stopped: 0.0324200 seconds 
2011-03-17T13:24:01.553+0300: 33540.748: [GC 33540.749: [ParNew: 680165K->31886K(737280K), 0.0495620 secs] 1511189K->936503K(4112384K), 0.0497420 secs] [Times: user=0.57 sys=0.00, real=0.05 secs]
Total time for which application threads were stopped: 0.0507030 seconds
2011-03-17T13:37:56.009+0300: 34375.204: [GC 34375.204: [ParNew: 687246K->28727K(737280K), 0.0244720 secs] 1591863K->942459K(4112384K), 0.0246900 secs] [Times: user=0.18 sys=0.00, real=0.02 secs]
Total time for which application threads were stopped: 806.7442720 seconds
Total time for which application threads were stopped: 0.0006590 seconds
Total time for which application threads were stopped: 0.0004360 seconds
Total time for which application threads were stopped: 0.0004630 seconds
Total time for which application threads were stopped: 0.0008120 seconds
2011-03-17T13:37:59.018+0300: 34378.213: [GC 34378.213: [ParNew: 676678K->21640K(737280K), 0.0137740 secs] 1590410K->949991K(4112384K), 0.0139610 secs] [Times: user=0.13 sys=0.02, real=0.01 secs]
Total time for which application threads were stopped: 0.0145920 seconds
Total time for which application threads were stopped: 0.1036080 seconds
Total time for which application threads were stopped: 0.0585600 seconds
Total time for which application threads were stopped: 0.0600550 seconds
Total time for which application threads were stopped: 0.0008560 seconds
Total time for which application threads were stopped: 0.0006770 seconds
Total time for which application threads were stopped: 0.0005910 seconds
Total time for which application threads were stopped: 0.0351330 seconds
Total time for which application threads were stopped: 0.0329020 seconds
Total time for which application threads were stopped: 0.0728490 seconds
Total time for which application threads were stopped: 0.0480990 seconds
Total time for which application threads were stopped: 0.0804250 seconds
2011-03-17T13:38:04.394+0300: 34383.589: [GC 34383.589: [ParNew: 677000K->8375K(737280K), 0.0218310 secs] 1605351K->944271K(4112384K), 0.0220300 secs]

I have the following nodetool cfstats output on the hung node:

Keyspace: fishdom_tuenti
	Read Count: 4970999
	Read Latency: 1.0267005945887335 ms.
	Write Count: 1441619
	Write Latency: 0.013146585887117193 ms.
	Pending Tasks: 0
		Column Family: decor
		SSTable count: 3
		Space used (live): 1296203532
		Space used (total): 1302520037
		Memtable Columns Count: 1066
		Memtable Data Size: 121742
		Memtable Switch Count: 11
		Read Count: 108125
		Read Latency: 2.809 ms.
		Write Count: 11261
		Write Latency: 0.006 ms.
		Pending Tasks: 0
		Key cache capacity: 30
		Key cache size: 46470
		Key cache hit rate: 0.40384615384615385
		Row cache: disabled
		Compacted row minimum size: 36
		Compacted row maximum size: 73457
		Compacted row mean size: 958
		Column Family: adopt
		SSTable count: 1
		Space used
Re: Pauses of GC
2011/3/17 Narendra Sharma narendra.sha...@gmail.com
What heap size are you running with? And which version of Cassandra?
4G, with Cassandra 0.7.4.
Re: Pauses of GC
At these moments Java hangs. Only one thread is working, and it runs mostly in kernel space, with the following strace output:

[pid 1953] 0.050157 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.22
[pid 1953] 0.59 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202329, 797618000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050093
[pid 1953] 0.050152 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.21
[pid 1953] 0.67 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202329, 847838000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050090
[pid 1953] 0.050150 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.22
[pid 1953] 0.67 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202329, 898054000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050086
[pid 1953] 0.050144 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.22
[pid 1953] 0.60 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202329, 948258000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050085
[pid 1953] 0.050144 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.21
[pid 1953] 0.67 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202329, 998469000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050067
[pid 1953] 0.050127 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.21
[pid 1953] 0.67 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 48664000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050102
[pid 1953] 0.050161 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.21
[pid 1953] 0.59 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 98884000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050102
[pid 1953] 0.050160 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.22
[pid 1953] 0.67 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 149111000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050097
[pid 1953] 0.050157 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.22
[pid 1953] 0.59 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 199327000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050093
[pid 1953] 0.050153 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.22
[pid 1953] 0.67 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 249547000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050095
[pid 1953] 0.050155 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.22
[pid 1953] 0.59 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 299761000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050094
[pid 1953] 0.050154 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.21
[pid 1953] 0.67 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 349981000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050092
[pid 1953] 0.050168 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 0.23
[pid 1953] 0.66 futex(0x7fbc24023794, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1300202330, 400216000}, ) = -1 ETIMEDOUT (Connection timed out) 0.050090

This happens when mmap disk access is enabled, and in my case when the virtual address space of the Java process grows beyond 16G. In that case the whole system behaves badly: utilities launch very slowly (but with no swap activity), and when I kill the Java process the system recovers. What this is, I don't know; perhaps it is OS-dependent. I use Ubuntu 10.04 (LTS): Linux slv007 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010 x86_64 GNU/Linux
2011/3/17 Narendra Sharma narendra.sha...@gmail.com
Depending on your memtable thresholds, the heap may be too small for the deployment. At the same time, I don't see any other log statements around that long pause that you have shown in the log snippet. It looks a little odd to me. All the ParNew collections freed almost the same amount of heap and did not take a lot of time.
Check if it is due to some JVM bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6477891
-Naren
On Thu, Mar 17, 2011 at 9:47 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
2011/3/17 Narendra Sharma narendra.sha...@gmail.com
What heap size are you running with? And which version of Cassandra?
4G with Cassandra 0.7.4
swap setting on linux
Dear community! Please share your swap settings for Linux boxes.
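For context, a sketch of commonly recommended swap settings for a dedicated Cassandra box. These are generic Linux practice, not taken from any reply in this thread, and the file path below is a placeholder:

```shell
# Option 1: remove swap entirely on a dedicated database node.
sudo swapoff -a
# (also comment out the swap entries in /etc/fstab to make this permanent)

# Option 2: keep swap as a safety net, but tell the kernel to avoid it:
echo "vm.swappiness = 1" | sudo tee /etc/sysctl.d/60-no-swap.conf
sudo sysctl -p /etc/sysctl.d/60-no-swap.conf
```

Either way, the goal is the same: never let the JVM heap get paged out, since a paged-out heap turns every GC into a multi-second stall.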
replace one node to onother
Hello. For example, if we want to replace one server with another, with an IP address change as well, what is the easiest way to do that? For now we do nodetool removetoken, then set autobootstrap: true on the new server (with the token that was on the old node).
Move token to another node
Hello. I have the following task: I want to move a token from one node to another. How can I do that?
Re: Move token to another node
2011/3/15 Sasha Dolgy sdo...@gmail.com
Hi Ruslan,
nodetool -h <target node> move <newtoken>
And how do I add a node to the cluster without a token?
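To make the commands in this exchange concrete, here is a sketch of the 0.7-era workflow. The host names and token value are placeholders, and the exact flags may differ by version (check nodetool help):

```shell
# Move the token of an existing node to a new position on the ring:
nodetool -h target-node move 85070591730234615865843651857942052864

# Retiring a node and replacing it with a new server (as in the thread above):
nodetool -h any-live-node removetoken <old-node-token>
# then on the new server, in cassandra.yaml, set:
#   auto_bootstrap: true
#   initial_token: <old-node-token>
# and start Cassandra so it bootstraps into the old node's range.
```

A node started with auto_bootstrap and no initial_token will pick a token itself (bisecting the most loaded range), which answers the "without a token" question, though an explicit token gives a more balanced ring.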
Re: Strange behaivour
I found that this happened after a schema change, and that it hung on a waitpid syscall. What can I do about this?
Calculate memory used for keycache
Hello. How is it possible to calculate this value? I think the key size, if we use RandomPartitioner, will be 16 bytes, so the key cache will take 16 * (number of key cache elements) bytes??
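As a rough sanity check of that arithmetic, here is a minimal sketch. The per-entry overheads below are assumptions, not measured Cassandra internals: besides the 16-byte RandomPartitioner token, each entry also caches an SSTable position, and the JVM adds per-entry cost (object headers, references, map buckets) that usually dominates the 16 bytes:

```python
# Back-of-envelope key-cache memory estimate (a sketch; overhead
# constants are assumptions, not Cassandra internals).

def keycache_bytes(entries, token_bytes=16, position_bytes=8,
                   jvm_overhead_bytes=64):
    """Estimate memory for a key cache of `entries` elements.

    token_bytes:        16 for RandomPartitioner (MD5 token), as in the question.
    position_bytes:     the cached SSTable offset the token maps to.
    jvm_overhead_bytes: assumed per-entry cost of Java object headers,
                        references, and hash-map buckets.
    """
    return entries * (token_bytes + position_bytes + jvm_overhead_bytes)

if __name__ == "__main__":
    n = 200_000
    print(f"{n} entries ~ {keycache_bytes(n) / 1024 / 1024:.1f} MB")
```

So the raw 16 bytes per key is a lower bound; the real footprint per entry is several times larger once JVM overhead is counted.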
Strange error
Hello. In a working cluster of Cassandra 0.7.3, I see 2 similar errors in system.log:

ERROR [ReadStage:12] 2011-03-13 20:27:20,431 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[ReadStage:12,5,main]
java.lang.NullPointerException
	at org.apache.cassandra.db.Column.reconcile(Column.java:177)
	at org.apache.cassandra.db.SuperColumn.addColumn(SuperColumn.java:179)
	at org.apache.cassandra.db.SuperColumn.putColumn(SuperColumn.java:195)
	at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:220)
	at org.apache.cassandra.db.filter.QueryFilter$2.reduce(QueryFilter.java:118)
	at org.apache.cassandra.db.filter.QueryFilter$2.reduce(QueryFilter.java:108)
	at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:62)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
	at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:118)
	at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:142)
	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1326)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
	at org.apache.cassandra.db.Table.getRow(Table.java:333)
	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
	at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:69)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Re: Strange behaivour
2011/3/13 aaron morton aa...@thelastpickle.com
It's difficult to say what's causing the freeze. Was the node rejecting client connections during this time?
Yes. I think the whole JVM hung, because JMX doesn't respond either.
Did any of the other nodes log that the freezing node was down?
Yes.
Is there anything else running on the box?
No.
PS: The graph also shows that the hang is in kernel (system) time.
Re: Strange error
2011/3/14 aaron morton aa...@thelastpickle.com
Looks related to CASSANDRA-1559 (https://issues.apache.org/jira/browse/CASSANDRA-1559), which should be fixed in 0.7.4. However, everyone said it should not happen. Can you provide some more detail about what you did to cause this?
I did nothing (no maintenance work), just the typical workload of a production application.
Re: Strange error
2011/3/14 aaron morton aa...@thelastpickle.com
What sort of workload was that? All reads and writes, or could there have been some deletes as well?
Many reads, some writes, and some deletes.
Re: Poor performance on small data set
Here is the PHP Windows extension, but you must use the trunk version of Thrift.
2011/3/12 Vodnok vod...@gmail.com
Thank you all for your replies.
Nagle + delayed ACK problem: I found a way to change this via regedit, but it had no impact on response time.
THRIFT-638: it seems to be a solution, but I don't know how to apply the patch in my environment.
phpcassa has a C extension, but it's hard for me to build a PHP extension.
(Attachment: php_thrift_protocol.dll, binary data)
Re: memory utilization
2011/3/11 Chris Burroughs chris.burrou...@gmail.com
Is there a more or less constant amount of resident memory, or is it growing over a period of days?
As the Cassandra wiki says: The main argument for using mmap() instead of standard I/O is the fact that reading entails just touching memory - in the case of the memory being resident, you just read it - you don't even take a page fault (so no overhead in entering the kernel and doing a semi-context switch). So resident memory will also grow (but what happens when all physical memory is exhausted, I don't know).
Re: memory utilization
2011/3/12 Jonathan Ellis jbel...@gmail.com
Nothing happens, because it _doesn't have to be resident_.
Hm, but then why in my case does top show 10g RSS when the max HEAP_SIZE is 6G?

  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
27650 cassandr 20  0 14.9g  10g 3.8g S   51 86.6 370:15.82 jsvc
20583 zabbix   25  5 18256 1464 1400 S    4  0.0  37:59.88 zabbix_agentd
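The gap between RSS and the heap is consistent with resident pages of mmap'd SSTables being charged to the process. As an illustrative sketch (the smaps format is per Linux's proc(5); the parsing is my own, not a Cassandra tool), this splits a process's resident memory into file-backed and anonymous portions:

```python
# Split resident memory from a /proc/<pid>/smaps dump into file-backed
# pages (e.g. mmap'd SSTables) vs anonymous pages (e.g. the Java heap).

def split_rss(smaps_text):
    """Return (file_backed_kb, anonymous_kb) resident totals from smaps text."""
    file_kb = anon_kb = 0
    is_file_backed = False
    for line in smaps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "-" in fields[0] and ":" not in fields[0]:
            # Mapping header line, e.g.
            # 7fbe00000000-7fbe00100000 r--p 00000000 08:01 12345 /path/to/file
            # A 6th field starting with "/" means the mapping is file-backed.
            is_file_backed = len(fields) >= 6 and fields[5].startswith("/")
        elif fields[0] == "Rss:":
            kb = int(fields[1])  # "Rss:    1024 kB"
            if is_file_backed:
                file_kb += kb
            else:
                anon_kb += kb
    return file_kb, anon_kb

if __name__ == "__main__":
    try:
        with open("/proc/self/smaps") as f:
            file_kb, anon_kb = split_rss(f.read())
        print(f"file-backed: {file_kb} kB, anonymous: {anon_kb} kB")
    except FileNotFoundError:
        print("no /proc/<pid>/smaps on this platform")
```

Run against the Cassandra pid, the file-backed total would show how much of that 10g RSS is just page cache mapped into the process rather than heap.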
cassandra and G1 gc
Hello. Does anybody use the G1 GC in production? What are your impressions?
Re: Nodes frozen in GC
2011/3/8 Chris Goffinet c...@chrisgoffinet.com
How large are your SSTables on disk? My thought was that because you have so many on disk, we have to store the bloom filter + every 128th key from the index in memory.
0.5GB. But as I understand it, that in-memory loading only matters when reads happen, and I do only inserts. And I don't think memory is the problem, because heap usage looks like a saw (at the peak, heap allocation reaches about 5.5 GB, then it drops back to 2GB). Also, when I increased the heap size to 7GB the situation became much better, but node freezes still happen, and in gc.log I still see lines like:
Total time for which application threads were stopped: 20.0686307 seconds
(though not as often as before)
Re: Nodes frozen in GC
2011/3/8 Peter Schuller peter.schul...@infidyne.com
(1) I cannot stress this one enough: Run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output.
(2) Attach to your process with jconsole or some similar tool.
(3) Observe the behavior of the heap over time. Preferably post screenshots so others can look at them.
I'm not sure I was understood, sorry. I launch Cassandra with the following GC logging options (I didn't mention this before because this document, http://www.datastax.com/docs/0.7/troubleshooting/index#nodes-seem-to-freeze-after-some-period-of-time, makes no mention of gc.log):

JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"

and I detect the frozen nodes by log entries like:

Total time for which application threads were stopped: 30.957 seconds

and so on. Also, when I think the nodes are frozen, I get UnavailableException and TimedOutException about 20-30 times (I make a number of attempts, 300 with a 1-second sleep, before the final failure). The following fragment of code illustrates what I do:

for (; $l_i < 300; ++$l_i) {
    try {
        $client->batch_mutate($mutations, cassandra_ConsistencyLevel::QUORUM);
        $retval = true;
        break;
    } catch (cassandra_UnavailableException $e) {
        array_push($l_exceptions, get_class($e));
        sleep(1);
    } catch (cassandra_TimedOutException $e) {
        array_push($l_exceptions, get_class($e));
        sleep(1);
    } catch (Exception $e) {
        $loger->err(get_class($e) . ': ' . $e->getMessage());
        $loger->err($mutations);
        break;
    }
}
Re: Several 'TimedOutException' in stress.py
2011/3/8 A J s5a...@gmail.com
Trying out stress.py in an AWS EC2 environment (4 Large instances, each with 2 cores and 7.5GB RAM, all in the same region/zone):
python stress.py -o insert -d 10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t 10 -n 500 -S 100 -k
(I want to try a column size of about 1MB. I am assuming the above gives me 10 parallel threads, each executing 50 inserts sequentially (500/10).) I am getting several timeout errors (TimedOutException). With just 10 concurrent writes spread across 4 nodes, I'm kind of surprised to get so many timeouts. Any suggestions?
It may be EC2 disk speed degradation (the I/O speed of EC2 instances is not constant and can vary within wide limits).
Re: Nodes frozen in GC
2011/3/8 Paul Pak p...@yellowseo.com
Hi Ruslan,
Is it possible for you to tell us the details of what you have done that measurably helped your situation, so we can start a best-practices doc on growing Cassandra systems? So far, I see that under load, Cassandra is rarely ready to take heavy traffic in its default configuration, and a number of steps need to be taken to properly size memtables, flushing, and the JVM. Unfortunately, it's very difficult to gauge what the proper or appropriate settings are for a given workload. It would be helpful if you could share what happened in the default config, what steps you took that helped the situation, and how much each step helped. That way we can start a checklist of things to address as we grow in load.
It would be great if you could provide the options that need tuning for best throughput. I know only 3: in cassandra.yaml, binary_memtable_throughput_in_mb; and the JVM options -Xms with -Xmx (for heap size) and -Xmn (for the young generation size).
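A sketch of where those three knobs live in a 0.7-era install. The values below are placeholders, not recommendations; the right numbers depend entirely on workload and hardware:

```shell
# conf/cassandra.yaml (0.7-era option name, as mentioned in the thread):
#   binary_memtable_throughput_in_mb: 256

# conf/cassandra-env.sh: fix the heap size (min == max avoids resize
# pauses) and set an explicit young generation with -Xmn.
MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="800M"
JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE} -Xmx${MAX_HEAP_SIZE} -Xmn${HEAP_NEWSIZE}"
```

Pinning -Xms to -Xmx and giving ParNew a reasonably sized young generation are the usual first steps before touching anything more exotic.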
Re: Nodes frozen in GC
2011/3/6 aaron morton aa...@thelastpickle.com
Your node is under memory pressure; after the GC there is still 5.7GB in use. In fact, it looks like memory usage went up during the GC process. Can you reduce the memtable size, the caches, or the number of CFs, or increase the JVM size? Also, is this happening under heavy load?
I have the memtable size set, and I insert data into one CF with the biggest row size 1K. How is it possible that all memory is still in use after GC? Maybe this is a memory leak in Cassandra 0.7.3?
Re: Nodes frozen in GC
2011/3/8 Jonathan Ellis jbel...@gmail.com
It sounds like you're complaining that the JVM sometimes does stop-the-world GC. You can mitigate this, but not (for most workloads) eliminate it, with GC option tuning. That's simply the state of the art for Java garbage collection right now.
Hm, but what can be done in these cases? In those moments the throughput of the cluster degrades, and I don't understand what workaround I should apply to prevent these situations.
Re: Nodes frozen in GC
2011/3/8 Chris Goffinet c...@chrisgoffinet.com
Can you tell me how many SSTables are on disk when you see GC pauses? In your 3-node cluster, what's the RF factor?
About 30-40, and I use RF=2 and insert rows at QUORUM consistency level.
Re: Nodes frozen in GC
2011/3/8 Chris Goffinet c...@chrisgoffinet.com
The rows you are inserting, what is your update ratio to those rows?
I don't update them, only insert, at a rate of 16000 per second.