[ 
https://issues.apache.org/jira/browse/CHUKWA-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857286#comment-13857286
 ] 

michael yu commented on CHUKWA-680:
-----------------------------------

Hi Otis,

I may not have included a screenshot of the accuracy results.  You can refer to 
Chapter 6, Performance and Benchmarks.  In all of my testing with the data set 
I provided, I recall the accuracy being between 95% and 100%.

In general, the more training data you feed to the SVM, the better (and more 
accurate) the resulting training model tends to be.

Unfortunately, the code as implemented is specific to querying and parsing the 
metrics data from HBase in a Hadoop environment.  The code can (and should) be 
refactored and generalized to process metrics from different datasource types.
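
As a rough illustration of the generalization described above, here is a 
minimal sketch, written in Python for brevity.  It is not the project's actual 
code: the names MetricsSource, CsvMetricsSource, to_libsvm_line and export are 
hypothetical, and the +1/-1 labeling for healthy/unhealthy is an assumed 
convention.  The only fixed point is libsvm's sparse text format 
("label index:value ..."), which each datasource would be converted to.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Iterator, List, Tuple

# (label, feature values). Assumed convention: +1 = healthy, -1 = unhealthy.
Sample = Tuple[int, List[float]]


class MetricsSource(ABC):
    """Hypothetical interface: any backend that can yield labeled samples."""

    @abstractmethod
    def samples(self) -> Iterator[Sample]:
        ...


class CsvMetricsSource(MetricsSource):
    """Illustrative backend reading 'label,metric1,metric2,...' rows."""

    def __init__(self, text: str) -> None:
        self._text = text

    def samples(self) -> Iterator[Sample]:
        for row in csv.reader(io.StringIO(self._text)):
            if not row:  # skip blank lines
                continue
            yield int(row[0]), [float(v) for v in row[1:]]


def to_libsvm_line(sample: Sample) -> str:
    """Format one sample in libsvm's sparse text format (1-based indices)."""
    label, values = sample
    features = " ".join(f"{i}:{v!r}" for i, v in enumerate(values, start=1))
    return f"{label} {features}".rstrip()


def export(source: MetricsSource) -> List[str]:
    """Convert every sample from any MetricsSource into libsvm lines."""
    return [to_libsvm_line(s) for s in source.samples()]
```

With this shape, an HBase-backed reader would be just one more MetricsSource 
implementation, and the training and prediction code would only ever see the 
libsvm-formatted lines.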

> Pattern recognition of Hadoop generated metrics
> -----------------------------------------------
>
>                 Key: CHUKWA-680
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-680
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>         Environment: IBM InfoSphere BigInsights Enterprise
>            Reporter: michael yu
>            Assignee: michael yu
>            Priority: Minor
>              Labels: GSoC, GSoC2013
>         Attachments: Yu, Michael et al-project-report-draft.pdf
>
>   Original Estimate: 2,760h
>  Remaining Estimate: 2,760h
>
> Charles Lin and I are working on our IBM SJSU masters project on "Pattern 
> recognition of Hadoop generated metrics".
> The purpose of the project is to use libsvm to predict the health of the 
> cluster.
> The scope of the project includes:
> 1) gathering large scale data set of metrics for healthy and unhealthy 
> clusters
> 2) use #1 and libsvm to generate training model
> 3) periodic collection of metrics and comparing against training model using 
> libsvm to predict the cluster health
>    a) if unhealthy, send email notification to system administrator 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
