I am working on improving inter-region scan performance and already have a patch. It will be committed as soon as all tests are done. This should improve M/R over HBase performance, because you will be able to create input splits with granularities lower than a region without a loss of performance.
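To illustrate what "granularity lower than a region" means, here is a minimal sketch (not the patch itself) that cuts one region's row-key range into N roughly equal sub-ranges, each of which could back its own input split. The class and method names (SubRegionSplits, split) are hypothetical. It uses plain Java with no HBase classes, and it assumes both boundary keys are non-empty and that row keys are spread evenly enough for byte-wise interpolation to make sense.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: divides a region's [start, end) key range into n sub-ranges.
public class SubRegionSplits {

    // Right-pad the key with zero bytes to a fixed width and read it as an unsigned integer.
    static BigInteger toInt(byte[] key, int width) {
        return new BigInteger(1, Arrays.copyOf(key, width));
    }

    // Convert an unsigned integer back into a key of exactly 'width' bytes.
    static byte[] toKey(BigInteger value, int width) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte or be shorter than width
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    // Returns n {startKey, endKey} pairs that cover [start, end) without gaps or overlap.
    public static List<byte[][]> split(byte[] start, byte[] end, int n) {
        int width = Math.max(start.length, end.length);
        BigInteger lo = toInt(start, width);
        BigInteger span = toInt(end, width).subtract(lo);
        List<byte[][]> ranges = new ArrayList<>();
        byte[] prev = start;
        for (int i = 1; i <= n; i++) {
            // The last sub-range ends exactly at the region's own end key.
            byte[] next = (i == n)
                ? end
                : toKey(lo.add(span.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(n))), width);
            ranges.add(new byte[][] {prev, next});
            prev = next;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed example region: keys from 0x00 (inclusive) to 0x10 (exclusive), cut four ways.
        for (byte[][] r : split(new byte[] {0x00}, new byte[] {0x10}, 4)) {
            System.out.println(Arrays.toString(r[0]) + " -> " + Arrays.toString(r[1]));
        }
    }
}
```

Each pair would become the start and stop row of one mapper's scan. If the key range is narrower than n, some sub-ranges come out empty, so a real job would want to drop those.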
See:
https://issues.apache.org/jira/browse/HBASE-7336
https://issues.apache.org/jira/browse/HBASE-5979
for more information on the subject.

-Vladimir Rodionov

On Tue, Jul 22, 2014 at 3:31 PM, Stack <[email protected]> wrote:

> On Mon, Jul 21, 2014 at 11:11 PM, Li Li <[email protected]> wrote:
>
> > On Tue, Jul 22, 2014 at 1:57 PM, Stack <[email protected]> wrote:
> > > On Mon, Jul 21, 2014 at 10:53 PM, Li Li <[email protected]> wrote:
> > >
> > >> Sorry, I enter tab and it send my unfinished post. See the following
> > >> mail for answers of other questions.
> > >>
> > >> I forget the exception's detail. It throws exception in terminal.
> > >
> > > What exception is thrown?
> > I forget it. maybe I can retry it with 8 mapper configuration. it
> > seems like out of memory exception
> >
>
> Who OOME'd? The map task or hbase?
>
> > >> The
> > >> default io.sort.mb is 100 and I set it to 500 to speed up reducer.
> > >
> > > Do you have to have a reducer? If you could skip the shuffle...
> > I have 8 reducers
> >
>
> Do you have to reduce?
>
> Would more reducers make your job run faster?
>
> > >> So
> > >> I set mapred.child.java.opts to 1g
> > >> The datanode/regionserver has 16GB memory but free memory
> > >
> > > Does the RS use the 16G?
> > the RS use 8G and there are datanode and tasktracker in this machine
> >
>
> How much for DN and TT? They don't need much usually.
>
> > >> for
> > >> map-reduce is about 5gb. So I can't add more mappers
> > >>
> > >> How much RAM in these machines?
> > 16GB
> >
>
> These your machines or EC2? Can you get bigger machines if EC2?
>
> St.Ack
>
