Andrew, I think these are all good ideas; in fact, I was thinking about a better "Region Historian 2.0" or something like that. My point is more that the current version isn't really that useful compared to reading the logs, and it has caused us a lot of trouble, so we should disable it (to prevent bugs) and build something better.
J-D

On Thu, Sep 17, 2009 at 9:45 PM, Andrew Purtell <apurt...@apache.org> wrote:
> Can I wear my user hat?
>
> I have used the Region Historian on occasion as it is much easier than
> grepping through master logs to find transitions. We also discard master logs
> after 7 days because they are large, especially when running with DEBUG from
> time to time. Obviously we don't have that problem of bulk with the historian
> data, but this still begs the question just how useful is such long term
> history? Well...
>
> Tracking region history ties into the audit aspects of HBASE-1697.
>
> I am also considering an experiment with collecting, correlating, and
> visualizing what amounts to historian data among other things to see what
> kind of things HBase can tell users about their data out of the box without
> imposing too much overhead, e.g. how fast a table grows (or shrinks), region
> split probabilities, key distribution trends, various information theoretic
> metrics. These things could be useful for capacity planning and application
> tuning. It's an interesting question to pose in general: What meaningful and
> useful things can HBase tell users about their data? An aspect of this is the
> history of where the data has been.
>
> On the issue of historians in general, there are three jiras:
>
> Region Historian, Closed, https://issues.apache.org/jira/browse/HBASE-533
> Service Historian, Open, https://issues.apache.org/jira/browse/HBASE-773
> Client Historian, Open, https://issues.apache.org/jira/browse/HBASE-1095
>
> Is there any interest in these facilities and the types of operational and
> capacity planning analyses they may support? Worth putting on the roadmap for
> 0.22?
>
> Analysis of Service Historian data can identify sub par or failing nodes.
> Accordingly they can be blacklisted.
>
> Correlating and visualizing Client Historian data can potentially reveal a
> lot about client access patterns. One could look for probabilistic motifs
> over different time scales, for example. This would be useful for debugging,
> or to system analysts, or for security officers.
>
> Thoughts?
>
> - Andy
>
> ________________________________
> From: Jean-Daniel Cryans <jdcry...@apache.org>
> To: hbase-user@hadoop.apache.org
> Sent: Thursday, September 17, 2009 5:38:08 PM
> Subject: Are you using the Region Historian? Read this
>
> Hi users,
>
> The Region Historian (the page in the web UI that you get when you
> click on a region name) has been in use since HBase 0.2.0 and it
> caused more than its share of problems. Furthermore, we had to cripple
> it in many ways to make some things work, the main issue being that
> the historian is kept in .META. so operations on that catalog table
> were sometimes blocked.
>
> We are planning to disable it for 0.20.1 and 0.21.0 until we come up
> with a better solution. Is anybody using it? If so, would losing the
> historian be a big deal for you? Your input would be much appreciated.
>
> Thx,
>
> J-D