Re: Node join streaming stuck at 100%
There is no error in the log about the streaming. And thanks for the information; we will try 1.1 when we start the upgrade.

koji

2012/6/5 aaron morton:

> Are there any errors in the logs about failed streaming?
>
> If you are getting timeouts, 1.0.8 added a streaming socket timeout:
> https://github.com/apache/cassandra/blob/trunk/CHANGES.txt#L323
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/06/2012, at 3:12 PM, koji wrote:
>
> > aaron morton <...@thelastpickle.com> writes:
> >
> > > Did you restart? All good?
> > >
> > > Cheers
> > >
> > > -
> > > Aaron Morton
> > > Freelance Developer
> > > aaronmorton
> > > http://www.thelastpickle.com
> > >
> > > On 27/04/2012, at 9:49 AM, Bryce Godfrey wrote:
> > >
> > > > This is the second node I've joined to my cluster in the last few days,
> > > > and so far both have become stuck at 100% on a large file according to
> > > > netstats. This is on 1.0.9; is there anything I can do to make it move on
> > > > besides restarting Cassandra? I don't see any errors or warnings in the
> > > > logs for either server, and there is plenty of disk space.
> > > >
> > > > On the sender side I see this:
> > > >
> > > > Streaming to: /10.20.1.152
> > > >   /opt/cassandra/data/MonitoringData/PropertyTimeline-hc-80540-Data.db sections=1 progress=82393861085/82393861085 - 100%
> > > >
> > > > On the node joining I don't see this file in netstats, and all pending
> > > > streams are sitting at 0%.
> >
> > Hi,
> >
> > we have the same problem (1.0.7); our netstats output looks like this:
> >
> > Mode: NORMAL
> > Streaming to: /1.1.1.1
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3757-Data.db sections=1234 progress=325/325 - 100%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3641-Data.db sections=4386 progress=0/1025272214 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3761-Data.db sections=2956 progress=0/17826723 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3730-Data.db sections=3792 progress=0/56066299 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3760-Data.db sections=4384 progress=0/90941161 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3687-Data.db sections=3958 progress=0/54729557 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3762-Data.db sections=766 progress=0/2605165 - 0%
> > Streaming to: /1.1.1.2
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-709-Data.db sections=3228 progress=29175698/29175698 - 100%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-789-Data.db sections=2102 progress=0/618938 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-765-Data.db sections=3044 progress=0/1996687 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-788-Data.db sections=2773 progress=0/1374636 - 0%
> >   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-729-Data.db sections=3150 progress=0/22111512 - 0%
> > Nothing streaming from /1.1.1.1
> > Nothing streaming from /1.1.1.2
> >
> > Pool Name     Active   Pending   Completed
> > Commands      n/a      1         23825242
> > Responses     n/a      25        19644808
> >
> > After a restart the pending streams are cleared, but the next time we run
> > "nodetool repair -pr" the pending streams get stuck again, and it always
> > happens on the same node (we have 12 nodes in total).
> >
> > koji
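A minimal sketch of how the streaming socket timeout mentioned above is typically enabled; the option name (streaming_socket_timeout_in_ms), the config path, and the value are assumptions on my part rather than anything confirmed in this thread:

    # Assumed option name, path, and value; check the CHANGES.txt entry linked above.
    CONF=/etc/cassandra/conf/cassandra.yaml

    # Default is 0 (disabled); append a non-zero value if the option is missing.
    grep -n 'streaming_socket_timeout_in_ms' "$CONF" \
        || echo 'streaming_socket_timeout_in_ms: 600000' >> "$CONF"   # e.g. 10 minutes

    # Restart the node, then watch whether the stuck streams time out and clear.
    nodetool -h 127.0.0.1 netstats

With the timeout left at its default of 0, a dead stream can sit at 100%/0% indefinitely, which matches the symptom reported in this thread.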
Re: Using EC2 ephemeral 4-disk RAID0 causes high iowait trouble
Hi

Thanks for the information, we will try that.

koji

2012/5/23 Deno Vichas:

> For what it's worth, I've been having pretty good success using the
> Datastax AMIs.

--
Koji Lin
blog: http://www.javaworld.com.tw/roller/koji/
Re: Using EC2 ephemeral 4-disk RAID0 causes high iowait trouble
Hi

I think the Amazon AMI is based on RHEL.

thank you

2012/5/21 aaron morton:

> Are you using the Ubuntu operating system?
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com

--
Koji Lin
blog: http://www.javaworld.com.tw/roller/koji/
Re: Using EC2 ephemeral 4-disk RAID0 causes high iowait trouble
Hi

We use the Amazon AMI (3.2.12-3.2.4.amzn1.x86_64), and some of our data files
are larger than 10 GB.

thanks

koji

On 2012-5-16 at 6:00 PM, "aaron morton" wrote:

> On Ubuntu? Sounds like http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
Using EC2 ephemeral 4-disk RAID0 causes high iowait trouble
Hi

Our service already runs Cassandra 1.0 on xlarge EC2 instances (with EBS), and
we have seen a lot of discussion about using an ephemeral-disk RAID for better
and more consistent performance.

So we wanted to create a new instance using a RAID0 of the 4 ephemeral disks,
copy the data over from EBS, and finally replace the old instance and reduce
some ...

We created the xlarge instance with:

  -b '/dev/sdb=ephemeral0' -b '/dev/sdc=ephemeral1' -b '/dev/sdd=ephemeral2' -b '/dev/sde=ephemeral3'

and built the array with an mdadm command like this:

  mdadm --create /dev/md0 --level=0 -c256 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde

After copying the files and starting Cassandra (same token as the old instance
it replaced), we saw that reads are really fast, consistently around 2xx MB/sec,
but the system load exceeds 40 with high iowait, and a lot of clients get
timeouts. We guessed it might be a problem with that particular EC2 instance,
so we created another one with the same settings to replace a different
machine, but the result was the same. Then we rolled back to EBS with a single
disk: read speed stays around 1x MB/sec but the system behaves well. (Using EBS
with a 2-disk RAID0 stays around 2x MB/sec with higher iowait than a single
disk, but it still works.)

Has anyone met the same problem? Or did we forget to configure something?

thank you

koji
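For anyone reproducing or debugging this setup, below is a minimal sketch of the steps described above plus two generic I/O diagnostics; the filesystem type, mount point, and mount options are my assumptions and are not taken from the post.

    # Sketch of the setup described above; filesystem and mount choices are assumptions.
    mdadm --create /dev/md0 --level=0 -c256 --raid-devices=4 \
          /dev/sdb /dev/sdc /dev/sdd /dev/sde
    mkfs.ext4 /dev/md0                        # filesystem type not stated in the post
    mkdir -p /mnt/cassandra-data
    mount -o noatime /dev/md0 /mnt/cassandra-data

    # Generic diagnostics to run while the node is under load:
    blockdev --getra /dev/md0 /dev/sdb        # md arrays can default to a much larger
                                              # read-ahead than a single EBS volume
    iostat -x 5                               # per-device utilisation, await, and %iowait

Sustained ~2xx MB/sec reads together with iowait-driven load above 40 is the kind of pattern where comparing read-ahead and per-device await between /dev/md0 and a single EBS volume is a reasonable first check.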