Re: Node join streaming stuck at 100%

2012-06-05 Thread koji Lin
There are no errors in the logs about streaming.

Thanks for the information; we will try 1.1 when we start upgrading.

koji

2012/6/5 aaron morton 

> Are there any errors in the logs about failed streaming?
>
> If you are getting timeouts, 1.0.8 added a streaming socket timeout:
> https://github.com/apache/cassandra/blob/trunk/CHANGES.txt#L323
>
> Cheers
>
>  -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/06/2012, at 3:12 PM, koji wrote:
>
>
> aaron morton writes:
>
>
> Did you restart ? All good?
>
> Cheers
>
>
>
> -
>
> Aaron Morton
>
> Freelance Developer
>
> @aaronmorton
>
> http://www.thelastpickle.com
>
>
>
> On 27/04/2012, at 9:49 AM, Bryce Godfrey wrote:
>
>
> This is the second node I’ve joined to my cluster in the last few days, and
> so far both have become stuck at 100% on a large file according to netstats.
> This is on 1.0.9. Is there anything I can do to make it move on besides
> restarting Cassandra?  I don’t see any errors or warnings in the logs for
> either server, and there is plenty of disk space.
>
>
>
>
> On the sender side I see this:
>
> Streaming to: /10.20.1.152
>   /opt/cassandra/data/MonitoringData/PropertyTimeline-hc-80540-Data.db
>   sections=1 progress=82393861085/82393861085 - 100%
>
>
>
>
> On the joining node I don’t see this file in netstats, and all pending
> streams are sitting at 0%.
>
>
>
>
>
>
>
>
> Hi
> We have the same problem (on 1.0.7). Our netstats output looks like this:
>
> Mode: NORMAL
> Streaming to: /1.1.1.1
>   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3757-Data.db
>   sections=1234 progress=325/325 - 100%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3641-Data.db
>   sections=4386 progress=0/1025272214 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3761-Data.db
>   sections=2956 progress=0/17826723 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3730-Data.db
>   sections=3792 progress=0/56066299 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3760-Data.db
>   sections=4384 progress=0/90941161 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3687-Data.db
>   sections=3958 progress=0/54729557 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OfflineMessage-hc-3762-Data.db
>   sections=766 progress=0/2605165 - 0%
> Streaming to: /1.1.1.2
>   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-709-Data.db
>   sections=3228 progress=29175698/29175698 - 100%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-789-Data.db
>   sections=2102 progress=0/618938 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-765-Data.db
>   sections=3044 progress=0/1996687 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-788-Data.db
>   sections=2773 progress=0/1374636 - 0%
>   /mnt/ebs1/cassandra-data/data/NemoModel/OneWayFriend-hc-729-Data.db
>   sections=3150 progress=0/22111512 - 0%
> Nothing streaming from /1.1.1.1
> Nothing streaming from /1.1.1.2
> Pool Name                    Active   Pending      Completed
> Commands                        n/a         1       23825242
> Responses                       n/a        25       19644808
>
>
> After a restart the pending streams are cleared, but the next time we run
> "nodetool repair -pr" the streams get stuck again. This always happens on
> the same node (we have 12 nodes in total).
>
> koji
>
>
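
For anyone checking for the same symptom, here is a minimal sketch that scans
`nodetool netstats` output for peers where one file is reported complete while
other files to the same peer have not started. It assumes the layout quoted in
this thread (a "Streaming to: /<ip>" header, then a path and a
"sections=... progress=done/total - pct%" entry per file). The names
`parse_streams` and `stalled_peers` are hypothetical helpers, not part of
Cassandra or nodetool.

```python
import re

# Matches the progress entry as it appears in the netstats output quoted above.
PROGRESS_RE = re.compile(r"sections=(\d+)\s+progress=(\d+)/(\d+)\s+-\s+(\d+)%")


def parse_streams(netstats_text):
    """Return (peer, path, done, total) tuples for outgoing streams.

    Assumption: each file appears as a path followed by a progress entry,
    either on the next line or on the same line, under a "Streaming to:" header.
    """
    streams = []
    peer = None
    path = None
    for raw in netstats_text.splitlines():
        # Tolerate email quote markers in pasted output.
        line = raw.strip().lstrip(">").strip()
        match = PROGRESS_RE.search(line)
        if line.startswith("Streaming to:"):
            peer = line.split(":", 1)[1].strip().lstrip("/")
            path = None
        elif match:
            if line.startswith("/"):
                # Path and progress printed on the same line.
                path = line[:match.start()].strip()
            if peer and path:
                streams.append(
                    (peer, path, int(match.group(2)), int(match.group(3)))
                )
            path = None
        elif line.startswith("/"):
            path = line
    return streams


def stalled_peers(streams):
    """Peers with at least one finished file and at least one untouched file."""
    by_peer = {}
    for peer, _path, done, total in streams:
        by_peer.setdefault(peer, []).append((done, total))
    suspects = []
    for peer, items in by_peer.items():
        finished = any(total > 0 and done == total for done, total in items)
        untouched = any(total > 0 and done == 0 for done, total in items)
        if finished and untouched:
            suspects.append(peer)
    return sorted(suspects)
```

A healthy stream can show the same pattern for a moment between two files, so
this is only a heuristic: a peer is worth a closer look when it is still
reported after several samples taken minutes apart.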


Re: Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-22 Thread koji Lin
Hi
Thanks for your information, we will try that.

koji

2012/5/23 Deno Vichas 

>  for what it's worth i've been having pretty good success using the
> Datastax AMIs.
>
>
>
> On 5/17/2012 6:59 PM, koji Lin wrote:
>
> Hi
>
> We use amazon ami 3.2.12-3.2.4.amzn1.x86_64
>
> and some of our data files are larger than 10 GB
>
> thanks
>
> koji
> On 2012-5-16 at 6:00 PM, "aaron morton" wrote:
>
>> On Ubuntu ? Sounds like http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs
>>
>>  Cheers
>>
>>
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>>  On 16/05/2012, at 2:13 PM, koji Lin wrote:
>>
>>  Hi
>>
>> Our service already runs Cassandra 1.0 on 1x EC2 instances (with EBS), and
>> we have seen a lot of discussion about using ephemeral RAID for better and
>> more consistent performance.
>>
>> So we want to create new instances using a 4-disk ephemeral RAID0, and copy
>> the data from ebs to finally replace the old instance and reduce some .
>>
>> We created the xlarge instance with -b '/dev/sdb=ephemeral0' -b
>> '/dev/sdc=ephemeral1' -b '/dev/sdd=ephemeral2' -b '/dev/sde=ephemeral3',
>>
>> and used an mdadm command like this:  mdadm --create /dev/md0 --level=0 -c256
>> --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
>>
>> After copying the files, we started Cassandra (with the same token as the
>> old instance it replaced).
>>
>> Reads were really fast, staying at 2xx MB/sec, but the system load exceeded
>> 40 with high iowait, and lots of clients got timeouts. We guessed it might
>> be a problem with that EC2 instance, so we created another one with the same
>> settings to replace a different machine, but the result was the same. Then
>> we rolled back to EBS with a single disk: read speed stays at 1x MB/sec, but
>> the system is healthy again. (Using EBS with a 2-disk RAID0 stays at
>> 2x MB/sec with higher iowait than a single disk, but still works.)
>>
>> Has anyone else met the same problem? Or did we forget to configure
>> something?
>>
>> thank you
>>
>> koji
>>
>>
>>
>


-- 
Koji Lin
blog: http://www.javaworld.com.tw/roller/koji/


Re: Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-22 Thread koji Lin
Hi
I think the Amazon AMI is based on RHEL.

thank you

2012/5/21 aaron morton 

> Are you using the Ubuntu operating system ?
>
> Cheers
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/05/2012, at 1:59 PM, koji Lin wrote:
>
> Hi
>
> We use amazon ami 3.2.12-3.2.4.amzn1.x86_64
>
> and some of our data files are larger than 10 GB
>
> thanks
>
> koji
> On 2012-5-16 at 6:00 PM, "aaron morton" wrote:
>
>> On Ubuntu ? Sounds like http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs
>>
>> Cheers
>>
>>
>>   -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 16/05/2012, at 2:13 PM, koji Lin wrote:
>>
>> Hi
>>
>> Our service already runs Cassandra 1.0 on 1x EC2 instances (with EBS), and
>> we have seen a lot of discussion about using ephemeral RAID for better and
>> more consistent performance.
>>
>> So we want to create new instances using a 4-disk ephemeral RAID0, and copy
>> the data from ebs to finally replace the old instance and reduce some .
>>
>> We created the xlarge instance with -b '/dev/sdb=ephemeral0' -b
>> '/dev/sdc=ephemeral1' -b '/dev/sdd=ephemeral2' -b '/dev/sde=ephemeral3',
>>
>> and used an mdadm command like this:  mdadm --create /dev/md0 --level=0 -c256
>> --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
>>
>> After copying the files, we started Cassandra (with the same token as the
>> old instance it replaced).
>>
>> Reads were really fast, staying at 2xx MB/sec, but the system load exceeded
>> 40 with high iowait, and lots of clients got timeouts. We guessed it might
>> be a problem with that EC2 instance, so we created another one with the same
>> settings to replace a different machine, but the result was the same. Then
>> we rolled back to EBS with a single disk: read speed stays at 1x MB/sec, but
>> the system is healthy again. (Using EBS with a 2-disk RAID0 stays at
>> 2x MB/sec with higher iowait than a single disk, but still works.)
>>
>> Has anyone else met the same problem? Or did we forget to configure
>> something?
>>
>> thank you
>>
>> koji
>>
>>
>>
>


-- 
Koji Lin
blog: http://www.javaworld.com.tw/roller/koji/


Re: Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-17 Thread koji Lin
Hi

We use amazon ami 3.2.12-3.2.4.amzn1.x86_64

and some of our data files are larger than 10 GB

thanks

koji
On 2012-5-16 at 6:00 PM, "aaron morton" wrote:

> On Ubuntu ? Sounds like http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs
>
> Cheers
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 16/05/2012, at 2:13 PM, koji Lin wrote:
>
> Hi
>
> Our service already runs Cassandra 1.0 on 1x EC2 instances (with EBS), and
> we have seen a lot of discussion about using ephemeral RAID for better and
> more consistent performance.
>
> So we want to create new instances using a 4-disk ephemeral RAID0, and copy
> the data from ebs to finally replace the old instance and reduce some .
>
> We created the xlarge instance with -b '/dev/sdb=ephemeral0' -b
> '/dev/sdc=ephemeral1' -b '/dev/sdd=ephemeral2' -b '/dev/sde=ephemeral3',
>
> and used an mdadm command like this:  mdadm --create /dev/md0 --level=0 -c256
> --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
>
> After copying the files, we started Cassandra (with the same token as the
> old instance it replaced).
>
> Reads were really fast, staying at 2xx MB/sec, but the system load exceeded
> 40 with high iowait, and lots of clients got timeouts. We guessed it might
> be a problem with that EC2 instance, so we created another one with the same
> settings to replace a different machine, but the result was the same. Then
> we rolled back to EBS with a single disk: read speed stays at 1x MB/sec, but
> the system is healthy again. (Using EBS with a 2-disk RAID0 stays at
> 2x MB/sec with higher iowait than a single disk, but still works.)
>
> Has anyone else met the same problem? Or did we forget to configure
> something?
>
> thank you
>
> koji
>
>
>


Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-15 Thread koji Lin
Hi

Our service already runs Cassandra 1.0 on 1x EC2 instances (with EBS), and we
have seen a lot of discussion about using ephemeral RAID for better and more
consistent performance.

So we want to create new instances using a 4-disk ephemeral RAID0, and copy the
data from ebs to finally replace the old instance and reduce some .

We created the xlarge instance with -b '/dev/sdb=ephemeral0' -b
'/dev/sdc=ephemeral1' -b '/dev/sdd=ephemeral2' -b '/dev/sde=ephemeral3',

and used an mdadm command like this:  mdadm --create /dev/md0 --level=0 -c256
--raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde

After copying the files, we started Cassandra (with the same token as the old
instance it replaced).

Reads were really fast, staying at 2xx MB/sec, but the system load exceeded
40 with high iowait, and lots of clients got timeouts. We guessed it might be
a problem with that EC2 instance, so we created another one with the same
settings to replace a different machine, but the result was the same. Then we
rolled back to EBS with a single disk: read speed stays at 1x MB/sec, but the
system is healthy again. (Using EBS with a 2-disk RAID0 stays at 2x MB/sec with
higher iowait than a single disk, but still works.)

Has anyone else met the same problem? Or did we forget to configure
something?

thank you

koji
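
To put a number on the iowait described above when comparing the RAID0 and
single-disk setups, here is a minimal sketch that computes the iowait share of
CPU time between two samples of the aggregate "cpu" line in Linux /proc/stat.
The field order (user, nice, system, idle, iowait, irq, softirq, steal) is the
standard Linux layout; the function names are hypothetical, and the sketch is
an illustration rather than the measurement used in this thread.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples.

    Only the first eight counters are summed, because the guest counters that
    follow are already included in user and nice.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return iowait %."""
    def read():
        with open(path) as handle:
            return parse_cpu_line(handle.readline())

    first = read()
    time.sleep(interval)
    return iowait_percent(first, read())
```

Calling `sample_iowait()` on each node under the same client load gives
figures that can be compared directly between the ephemeral RAID0 and EBS
configurations.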