Interesting idea. How would the ranges of rows be determined per region?
This reminds me of the stripe compaction feature, which is in 0.98. See HBASE-7667.

Cheers

On Apr 30, 2014, at 3:07 AM, Jan Lukavský <[email protected]> wrote:

> Hi all,
>
> I have a general idea I'd like to consult. A short description of a problem
> we are facing: during mapreduce jobs run over an HBase cluster, we very often
> see great disproportions in the run time of different map tasks (some tasks
> tend to finish in minutes or even seconds, while others might take hours).
> This causes the job to run inefficiently and the whole cluster to be
> underutilized - reducers have to wait until all the map tasks finish, at
> least before starting the sort phase. The number of long-running map tasks
> is usually low, so the whole cluster basically waits until several machines
> finish their work. We tried to get over this by sampling the regions and
> creating some statistics (one statistic per mapreduce job), which we then
> used to tune the input format splits to make the distribution of running
> time more even. This seems to work (although for the time being it might
> cause some issues with data locality, which we think we can solve).
>
> Now, the question is, would it be possible to calculate some statistics
> during major compactions and store them in the region directory on HDFS?
> What I mean by these statistics: I think it would be possible to store, for
> some reasonable ranges of rows (so that for each region there would be on
> the order of hundreds of these ranges):
> * the total number of rows between the specified rows
> * the total number of KeyValues
> * the amount of data stored on disk
>
> These statistics could be calculated per column family and subsequently
> used in the InputFormat to tune the splits to match an even distribution as
> closely as possible.
>
> Is anyone else interested in this? Does anyone have another solution to the
> problem I have described? I know we could manually split the regions that
> take a long time to process, but first, these regions are job-specific (so
> different jobs have different regions that take a long time to process),
> and second, ideally I'm looking for an automated solution.
>
> Thanks for any reply,
> Jan
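To make the compaction-time part of the proposal concrete, below is a minimal sketch against the 0.98-era coprocessor API. RangeStats (a per-range accumulator that here buckets rows by a fixed-length key prefix) and its persist() step are hypothetical helpers, not existing HBase classes; the wrapper merely counts rows, KeyValues, and approximate bytes as the major-compaction scanner drains.

import java.io.IOException;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.ScanType;
import org.apache.hadoop.hbase.regionserver.Store;
import org.apache.hadoop.hbase.util.Bytes;

public class RangeStatsObserver extends BaseRegionObserver {

  @Override
  public InternalScanner preCompact(ObserverContext<RegionCoprocessorEnvironment> ctx,
      Store store, InternalScanner scanner, ScanType scanType) {
    // Deletes are only dropped on major compactions, so this filters for them.
    if (scanType != ScanType.COMPACT_DROP_DELETES) {
      return scanner;
    }
    return new StatsCollectingScanner(scanner, new RangeStats());
  }

  /** Wraps the compaction scanner and counts everything that flows through it. */
  static class StatsCollectingScanner implements InternalScanner {
    private final InternalScanner delegate;
    private final RangeStats stats;
    private byte[] lastRow;

    StatsCollectingScanner(InternalScanner delegate, RangeStats stats) {
      this.delegate = delegate;
      this.stats = stats;
    }

    @Override
    public boolean next(List<Cell> results) throws IOException {
      int before = results.size();
      boolean more = delegate.next(results);
      count(results.subList(before, results.size()));
      return more;
    }

    @Override
    public boolean next(List<Cell> results, int limit) throws IOException {
      int before = results.size();
      boolean more = delegate.next(results, limit);
      count(results.subList(before, results.size()));
      return more;
    }

    private void count(List<Cell> cells) {
      for (Cell cell : cells) {
        byte[] row = CellUtil.cloneRow(cell);
        if (lastRow == null || !Bytes.equals(row, lastRow)) {
          stats.rowSeen(row);   // a new row started
          lastRow = row;
        }
        stats.keyValueSeen(row);
        // Rough size estimate; ignores timestamps, tags and block encoding.
        stats.bytesSeen(row, cell.getRowLength() + cell.getFamilyLength()
            + cell.getQualifierLength() + cell.getValueLength());
      }
    }

    @Override
    public void close() throws IOException {
      delegate.close();
      stats.persist();
    }
  }

  /** Hypothetical accumulator: buckets rows by a short prefix of the row key. */
  static class RangeStats {
    private static final int PREFIX = 4;   // bucket-width assumption
    // bucket key -> {rows, keyValues, bytes}
    private final NavigableMap<byte[], long[]> buckets =
        new TreeMap<byte[], long[]>(Bytes.BYTES_COMPARATOR);

    private long[] bucket(byte[] row) {
      byte[] key = Bytes.head(row, Math.min(PREFIX, row.length));
      long[] b = buckets.get(key);
      if (b == null) {
        b = new long[3];
        buckets.put(key, b);
      }
      return b;
    }

    void rowSeen(byte[] row)           { bucket(row)[0]++; }
    void keyValueSeen(byte[] row)      { bucket(row)[1]++; }
    void bytesSeen(byte[] row, long n) { bucket(row)[2] += n; }

    void persist() {
      // A real implementation would write the map into a small file next to
      // the region's data on HDFS, e.g. under the region directory; omitted.
    }
  }
}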

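On the consuming side, the split tuning the mail describes could look roughly like this. RangeStatsReader is a hypothetical loader for the statistics above (stubbed here so the sketch compiles), and the 1 GB target per split is an arbitrary tuning assumption. Each sub-split keeps its region's location string, which is one way to approach the locality concern raised in the mail.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableSplit;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;

public class BalancedTableInputFormat extends TableInputFormat {

  // Target amount of data per map task - an arbitrary tuning assumption.
  private static final long TARGET_BYTES_PER_SPLIT = 1L << 30;

  @Override
  public List<InputSplit> getSplits(JobContext context) throws IOException {
    List<InputSplit> tuned = new ArrayList<InputSplit>();
    for (InputSplit split : super.getSplits(context)) {
      TableSplit ts = (TableSplit) split;
      // Hypothetical: ordered (range end row -> bytes in range) statistics
      // covering this region, as written at major-compaction time.
      NavigableMap<byte[], Long> ranges = RangeStatsReader.load(
          getConf(), ts.getTable(), ts.getStartRow(), ts.getEndRow());
      byte[] start = ts.getStartRow();
      long acc = 0;
      for (Map.Entry<byte[], Long> range : ranges.entrySet()) {
        acc += range.getValue();
        // Close the current sub-split once enough data has accumulated, but
        // never exactly at the region boundary - the tail split below always
        // covers whatever is left of the region.
        if (acc >= TARGET_BYTES_PER_SPLIT
            && !Bytes.equals(range.getKey(), ts.getEndRow())) {
          tuned.add(new TableSplit(ts.getTable(), start, range.getKey(),
              ts.getRegionLocation()));
          start = range.getKey();
          acc = 0;
        }
      }
      // Sub-splits inherit the region's location, preserving locality.
      tuned.add(new TableSplit(ts.getTable(), start, ts.getEndRow(),
          ts.getRegionLocation()));
    }
    return tuned;
  }

  /** Hypothetical reader for the compaction-time statistics; stubbed here. */
  static class RangeStatsReader {
    static NavigableMap<byte[], Long> load(Configuration conf, TableName table,
        byte[] startRow, byte[] endRow) throws IOException {
      // An empty map degrades to one split per region (stock behavior).
      return new TreeMap<byte[], Long>(Bytes.BYTES_COMPARATOR);
    }
  }
}

Merging several light regions into one split would be the symmetric refinement; the sketch above only subdivides heavy ones.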