There's an HDFS bandwidth setting which is set to 10MB/s. That is way too low for even 1GbE.
Have you modified this setting yet?

-Mike

On Nov 3, 2012, at 2:50 PM, David Koch <ogd...@googlemail.com> wrote:

> Hello Ted,
>
> We never initiate major compaction manually. I have not looked at I/O
> balance between nodes in detail. We have noticed that after running for a
> couple of weeks HBase seems to spend hours pushing blocks between nodes in
> order to optimize things. We add data daily in one ~30gb push to several
> tables. Sometimes nodes get added to the running system.
>
> Where can I get more information on how to carry out performance related
> HBase administrative tasks?
>
> Thank you,
>
> /David
>
>
> On Sat, Nov 3, 2012 at 4:42 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Can you tell us how often you run major compaction after the import ?
>> Have you noticed imbalanced read / write requests in the cluster ? Meaning
>> subset of region servers receive bulk of the writes.
>>
>> We do some manual movement of regions when the above happens.
>>
>> Cheers
>>
>> On Sat, Nov 3, 2012 at 8:12 AM, David Koch <ogd...@googlemail.com> wrote:
>>
>>> Hello,
>>>
>>> Every now and then we need to flatten our cluster and re-import all data
>>> from log files (changes in data format, etc.) Afterwards we notice a
>>> significant increase in scan performance. As data is added and shuffled
>>> around between region servers, performance goes down again over time (say a
>>> couple of weeks). Are there any routine operations that one should run
>>> manually, or settings to activate in the HBase configuration to keep the
>>> data well distributed? We use HBase 0.92 as part of a Cloudera4 cluster.
>>>
>>> Thank you,
>>>
>>> /David