Re: GSOC: Monitor Improvements

Gabe Bell Mon, 22 Apr 2013 08:10:26 -0700

Regarding RRDTool: there is rrd4j, a Java implementation licensed under
Apache 2.0 (https://code.google.com/p/rrd4j/).

On Apr 22, 2013, at 11:03 AM, Eric Newton <[email protected]> wrote:


> Presently the information is stored in memory and it certainly could be
> stored in tables.
> 
> This reminds me of an idea that I've been thinking about for a long time.
> It's a little aggressive to do in a single summer.
> 
> ----
> 
> RRDTool stores time series data in fixed-length files.  One important
> feature is the ability to compress time-series data into less-fine-grained
> results over time.
> 
> However, updating many RRD files, with periodic updates, requires making
> lots of small seeks and updates to individual files.  It works well when
> all the files fit in the disk cache.  It falls down hard when they don't.
> 
> My idea is to put updates into an Accumulo row for one collected data
> point, along with some recent version in RRD format:
> 
> Key                         Value
> row, cf:cq
> --------------------------------------------------------
> point rrd:                  [RRDTool data]
> point ts:timestamp          value
> point ts:timestamp          value
> point ts:timestamp          value
> point ts:timestamp          value
> point ts:timestamp          value
> 
> When the tablet compacts, you use a Combiner to push the updates into the
> RRD data:
> 
> Key                         Value
> row, cf:cq
> -------------------------------------------------------
> point rrd:                  [Updated RRDTool data]
> point ts:timestamp          value
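
The compaction step described above can be sketched roughly as follows. This is plain Python, not a real Accumulo Combiner, and it does not use the RRDTool file format: the RoundRobinArchive class, the compact_row function and the (cf, cq, value) tuple layout are all hypothetical names chosen for illustration. The sketch folds every "ts" point it sees into the archive; points written after a compaction would simply sit beside the updated "rrd" entry until the next one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RoundRobinArchive:
    """Hypothetical stand-in for the "[RRDTool data]" value.

    Holds a bounded window of fixed-width time buckets, each storing a
    running (sum, count) so an average can be derived per bucket.
    Assumes slots >= 1.
    """
    step: int   # bucket width in seconds
    slots: int  # number of bucket widths retained
    buckets: Dict[int, Tuple[float, int]] = field(default_factory=dict)

    def update(self, ts: int, value: float) -> None:
        start = ts - ts % self.step
        total, count = self.buckets.get(start, (0.0, 0))
        self.buckets[start] = (total + value, count + 1)
        # Discard buckets that fall outside the retained window, so the
        # archive stays bounded no matter how many points arrive.
        cutoff = max(self.buckets) - self.step * (self.slots - 1)
        for old in [b for b in self.buckets if b < cutoff]:
            del self.buckets[old]

    def averages(self) -> List[Tuple[int, float]]:
        return [(start, total / count)
                for start, (total, count) in sorted(self.buckets.items())]


def compact_row(entries, step=60, slots=3):
    """Fold the "ts" points of one row into its "rrd" entry.

    entries: list of (cf, cq, value) tuples for a single row, where
    ("rrd", "", archive) carries the archive and ("ts", timestamp, v)
    carries one raw data point.
    """
    archive = None
    points = []
    for cf, cq, value in entries:
        if cf == "rrd":
            archive = value
        elif cf == "ts":
            points.append((cq, value))
    if archive is None:
        archive = RoundRobinArchive(step=step, slots=slots)
    for ts, value in sorted(points):
        archive.update(ts, value)
    # Only the updated archive survives the compaction.
    return [("rrd", "", archive)]
```

A scan-time iterator along the lines suggested in the thread could then call something like averages() on the stored archive instead of returning raw points.
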
> 
> Further, when you scan the data, you could use an RRD iterator to perform
> queries on the RRD format, which would extract only the
> summary/graph/data you want.
> 
> This leverages the Accumulo write-ahead log, and efficiency of
> log-structured merge trees to defer RRD updates to a point where they can
> be done efficiently (with respect to disk seeks), and even the block cache
> to access recently read information quickly.  And, the data won't grow
> indefinitely due to the properties of the RRD storage format.
> 
> Sadly, RRDTool does not have a Java API.  But there appear to be Java-based
> substitutes; I have no idea if they are license-compatible.
> 
> OpenTSDB does something similar: they compress updates into blocks of
> updates in hourly chunks, converting many small records into one larger
> one.  Their scheme does not lose data, which was important to them.
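
For comparison, a lossless hourly packing of the kind described above might look like the sketch below. It is only an illustration of the general idea, not OpenTSDB's actual storage format; the function names pack_hourly and unpack are hypothetical.

```python
from collections import defaultdict

HOUR = 3600  # seconds


def pack_hourly(points):
    """Group (timestamp_seconds, value) points into one block per hour.

    Each block stores (offset_within_hour, value) pairs, so many small
    records become one larger record without discarding any data.
    """
    blocks = defaultdict(list)
    for ts, value in points:
        base = ts - ts % HOUR
        blocks[base].append((ts - base, value))
    return {base: sorted(block) for base, block in blocks.items()}


def unpack(blocks):
    """Recover the original points, sorted by timestamp."""
    return sorted((base + offset, value)
                  for base, block in blocks.items()
                  for offset, value in block)
```

Unlike the RRD approach, nothing is averaged away here, so storage still grows with the number of points collected.
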
> 
> 
> -Eric
> 
> 
> 
> On Mon, Apr 22, 2013 at 10:33 AM, Supun Kamburugamuva
> <[email protected]> wrote:
> 
>> I can see how summaries are very helpful to a user. We can introduce new
>> fields in the existing table/tablet summary tables that display problem
>> information, etc.
>> 
>> To make the JMX polling time configurable we can introduce configuration
>> parameters.
>> 
>> For the JMX statistics we can keep data at the server for a fixed period
>> to avoid memory growth. I think the stats are stored in memory (please
>> correct me if I'm wrong). If that is the case, is it possible to store them
>> in Accumulo tables?
>> 
>> Thanks,
>> Supun...
>> 
>> On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <[email protected]>
>> wrote:
>> 
>>> Another thing to consider is scale.  On large clusters (many hundreds of
>>> nodes), more data is not helpful for visualization.  Instead, summaries,
>>> averages and outliers are important.
>>> 
>>> For example, if one node is consistently slow, it is better to know that
>>> than to see one graph with low numbers in a sea of graphs.
>>> 
>>> If the monitor collects information using JMX, collection time for each
>>> node would be a good thing to know, too.
>>> 
>>> -Eric
>>> 
>>> 
>>> On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <[email protected]> wrote:
>>> 
>>>> Supun,
>>>> 
>>>> Yup, very much so. Having a way to consume any and all metrics via JMX
>>>> would simplify things for any consumers (internal or external).
>>>> 
>>>> 
>>>> 
>>>> On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
>>>> 
>>>>> Hi Josh,
>>>>> 
>>>>> Thanks for the suggestions. I'll incorporate these into the proposal.
>>>>> 
>>>>> Another area I would like to work on is JMX. There is a Jira issue that
>>>>> says to switch the Monitor calls from Thrift to JMX (Accumulo 694). Do
>>>>> you think this is a good addition to the Monitor?
>>>>> 
>>>>> Thanks,
>>>>> Supun..
>>>>> 
>>>>> 
>>>>> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <[email protected]> wrote:
>>>>> 
>>>>>> Supun,
>>>>>> 
>>>>>> Looks good! Can I make some suggestions/comments?
>>>>>> 
>>>>>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
>>>>>> compactions, major compactions, index cache hit rate, and data cache
>>>>>> hit rate per table (the same graphs that are displayed system-wide
>>>>>> when you visit http://${MONITOR_HOST}:50095/).
>>>>>> 
>>>>>> For "Per tablet [server] plots", it would be neat if you could also
>>>>>> extract some general statistics like top N least performing, top N
>>>>>> highest performing, etc. tablet servers. Ideally, this could correlate
>>>>>> with servers that may be having problems :).
>>>>>> 
>>>>>> Do you see these proposed changes as being sufficient for 3-4 months
>>>>>> of 40hrs/week work? If you plan to really dig into these changes
>>>>>> (perhaps reworking components of the monitor itself), I could perhaps
>>>>>> see this. Do you have any ideas for more lofty goals that you could
>>>>>> pursue as well? I don't want you/us to get one month into things and
>>>>>> see you complete everything we initially planned to accomplish :)
>>>>>> 
>>>>>> - Josh
>>>>>> 
>>>>>> 
>>>>>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
>>>>>> 
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> I would like to start writing the proposal for GSoC. I've put
>>>>>>> together some initial high-level goals for the project. Please let
>>>>>>> me know what I can improve.
>>>>>>> 
>>>>>>> Per table plots: Accumulo 594
>>>>>>> ---------------------
>>>>>>> 
>>>>>>> The goal of this is to display plots that explain the various
>>>>>>> activities that happen per table. When we go to the tables page of
>>>>>>> the monitor and select a specific table, it displays some information
>>>>>>> in a table format. We can augment this information by showing graphs
>>>>>>> for:
>>>>>>> 
>>>>>>> 1. Ingest entries
>>>>>>> 2. Ingest data size
>>>>>>> 3. Scan entries
>>>>>>> 4. Scan data size
>>>>>>> 
>>>>>>> Per tablet plots
>>>>>>> ----------------------
>>>>>>> 
>>>>>>> As with the table plots, we can display information regarding tablet
>>>>>>> servers on the tablet server page. The plots will display the same
>>>>>>> information as the table plots, but with data per tablet server.
>>>>>>> 
>>>>>>> Trace Visualization: Accumulo 1198
>>>>>>> ----------------------------
>>>>>>> 
>>>>>>> Since we are displaying graphs about each tablet and each table, we
>>>>>>> can add major and minor compaction graphs to each table and each
>>>>>>> tablet.
>>>>>>> 
>>>>>>> Another option is to display this in a single graph on the overview
>>>>>>> page, with different graph lines for different tables and tablets.
>>>>>>> 
>>>>>>> Server type information: Accumulo 807
>>>>>>> -------------------------------------
>>>>>>> 
>>>>>>> To display this information we can add a new page and show it as a
>>>>>>> table. The table should specify the network address of the server,
>>>>>>> the server type, whether it is active or inactive, etc.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Supun...
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Supun Kamburugamuva
>> Member, Apache Software Foundation; http://www.apache.org
>> E-mail: [email protected];  Mobile: +1 812 369 6762
>> Blog: http://supunk.blogspot.com
>>
