Wouldn't adding new nodes put even more pressure on the cluster? How large is your data set?

On Fri, Jul 29, 2011 at 4:16 AM, Frank Duan <fr...@aimatch.com> wrote:
> "Dropped read message" might be an indicator of a capacity issue. We
> experienced a similar issue with 0.7.6.
>
> We ended up adding two extra nodes and physically rebooting the offending
> node(s).
>
> The entire cluster then calmed down.
>
> On Thu, Jul 28, 2011 at 2:24 PM, Yan Chunlu <springri...@gmail.com> wrote:
>
>> I have three nodes and RF=3. Here is the current ring:
>>
>> Address  Status  State   Load      Owns    Token
>>                                            84944475733633104818662955375549269696
>> node1    Up      Normal  15.32 GB  81.09%  52773518586096316348543097376923124102
>> node2    Up      Normal  22.51 GB  10.48%  70597222385644499881390884416714081360
>> node3    Up      Normal  56.1 GB   8.43%   84944475733633104818662955375549269696
>>
>> It is very unbalanced and I would like to rebalance it using
>> "nodetool move" ASAP. Unfortunately, I haven't run node repair for
>> a long time.
>>
>> Aaron suggested it's better to run node repair on every node and then
>> rebalance.
>>
>> The problem is that node3 is under heavy load currently, and the entire
>> cluster slows down if I start node repair. I have to run
>> disablegossip and disablethrift to stop the repair.
>>
>> Only Cassandra is running on that server and I have no idea what it is
>> doing. The CPU load is about 20+ currently. compactionstats and
>> netstats show it is not doing anything.
>>
>> I have changed the client not to connect to node3, but it still seems
>> to be under heavy load, and I/O utilization is 100%.
>>
>> The log seems normal (although I'm not sure about the "Dropped read
>> message" thing):
>>
>>  INFO 13:21:38,191 GC for ParNew: 345 ms, 627003992 reclaimed leaving 2563726360 used; max is 4248829952
>>  WARN 13:21:38,560 Dropped 826 READ messages in the last 5000ms
>>  INFO 13:21:38,560 Pool Name                    Active   Pending
>>  INFO 13:21:38,560 ReadStage                         8      7555
>>  INFO 13:21:38,561 RequestResponseStage              0         0
>>  INFO 13:21:38,561 ReadRepairStage                   0         0
>>
>> Is there any way to tell what node3 was doing? Or at least is there any
>> way to keep it from slowing down the whole cluster?
>>
>
> --
> Frank Duan
> aiMatch
> fr...@aimatch.com
> c: 703.869.9951
> www.aiMatch.com
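
On the rebalancing step mentioned in the quoted message: here is a minimal sketch of how evenly spaced target tokens for "nodetool move" could be computed. It assumes the cluster uses RandomPartitioner (token space 0 to 2**127), which the thread does not confirm. The function name balanced_tokens is made up for illustration, and the host names are simply the ones from the ring listing.

```python
# Assumption: RandomPartitioner, whose token space runs from 0 to 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return node_count evenly spaced tokens (hypothetical helper)."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    # Integer division keeps the tokens exact; floats would lose precision.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Host names taken from the ring listing in the thread.
    hosts = ["node1", "node2", "node3"]
    for host, token in zip(hosts, balanced_tokens(len(hosts))):
        print(f"nodetool -h {host} move {token}")
```

This only prints candidate commands and does not run anything. As Aaron suggested in the quoted message, the repair should probably be done first, and the moves done one node at a time.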