Andrew,

I think these are all good ideas; in fact, I was thinking about a better
"Region Historian 2.0" or something like that. My point is more that
the current version isn't really that useful compared to reading logs,
and that it has caused us a lot of trouble, so we should disable it (to
prevent bugs) and build something better.

J-D

On Thu, Sep 17, 2009 at 9:45 PM, Andrew Purtell <apurt...@apache.org> wrote:
> Can I wear my user hat?
>
> I have used the Region Historian on occasion as it is much easier than 
> grepping through master logs to find transitions. We also discard master logs 
> after 7 days because they are large, especially when running with DEBUG from 
> time to time. Obviously we don't have that problem of bulk with the historian 
> data, but this still raises the question: just how useful is such long-term 
> history?  Well...
>
> Tracking region history ties into the audit aspects of HBASE-1697.
>
> I am also considering an experiment with collecting, correlating, and 
> visualizing what amounts to historian data among other things to see what 
> kind of things HBase can tell users about their data out of the box without 
> imposing too much overhead, e.g. how fast a table grows (or shrinks), region 
> split probabilities, key distribution trends, various information-theoretic 
> metrics. These things could be useful for capacity planning and application 
> tuning. It's an interesting question to pose in general: What meaningful and 
> useful things can HBase tell users about their data? An aspect of this is the 
> history of where the data has been.
>
> On the issue of historians in general, there are three jiras:
>
>    Region Historian, Closed, https://issues.apache.org/jira/browse/HBASE-533
>    Service Historian, Open, https://issues.apache.org/jira/browse/HBASE-773
>    Client Historian, Open, https://issues.apache.org/jira/browse/HBASE-1095
>
> Is there any interest in these facilities and the types of operational and 
> capacity planning analyses they may support? Worth putting on the roadmap for 
> 0.22?
>
> Analysis of Service Historian data can identify subpar or failing nodes, 
> which can then be blacklisted.
>
> Correlating and visualizing Client Historian data can potentially reveal a 
> lot about client access patterns. One could look for probabilistic motifs 
> over different time scales, for example. This would be useful for debugging, 
> or to system analysts, or for security officers.
>
> Thoughts?
>
>   - Andy
>
>
>
>
> ________________________________
> From: Jean-Daniel Cryans <jdcry...@apache.org>
> To: hbase-user@hadoop.apache.org
> Sent: Thursday, September 17, 2009 5:38:08 PM
> Subject: Are you using the Region Historian? Read this
>
> Hi users,
>
> The Region Historian (the page in the web UI that you get when you
> click on a region name) has been in use since HBase 0.2.0 and has
> caused more than its share of problems. Furthermore, we had to cripple
> it in many ways to make some things work, the main issue being that
> the historian is kept in .META., so operations on that catalog table
> were sometimes blocked.
>
> We are planning to disable it for 0.20.1 and 0.21.0 until we come up
> with a better solution. Is anybody using it? If so, would losing the
> historian be a big deal for you? Your input would be much appreciated.
>
> Thx,
>
> J-D
>
>
>
>
