Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Chia-Hung Lin
This is just my personal viewpoint. We can refactor BSP to be more modular so that people can choose whether it fits their requirements. BSP is a generalized model, so it may be good to create a flexible framework. On 5 March 2014 12:25, Edward J. Yoon wrote: > Why not? > > Sent

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Edward J. Yoon
Why not? Sent from my iPhone > On 2014. 3. 5., at 1:09 PM, Yexi Jiang wrote: > > Yes, currently Hama does not support streaming input and streaming output. > That's why currently it is not a natural choice for people with real time > computing needs. > > Do we really need to make Hama to suppo

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Yexi Jiang
Yes, Hama currently does not support streaming input or streaming output. That is why it is not a natural choice for people with real-time computing needs. Do we really need to make Hama support real-time computing? In that case, we need to compete with Storm... 2014-03-04 22:5

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Chia-Hung Lin
I used Twitter Storm previously. Storm is an excellent framework for real-time processing. For Hama to handle real-time tasks, the framework in my opinion needs to decouple I/O from HDFS so that the input source is not restricted to HDFS. On 5 March 2014 09:30, Yexi Jiang wrote: > Please cor

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Edward J. Yoon
I'm thinking about coupling with ML (incremental) algorithms. On Wed, Mar 5, 2014 at 11:16 AM, Yexi Jiang wrote: > I have ever implemented a system monitor/log collector using ActiveMQ and a > real time anomaly detection algorithm on top of Twitter's Storm. I think > people like me may naturally

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Yexi Jiang
I once implemented a system monitor/log collector using ActiveMQ and a real-time anomaly detection algorithm on top of Twitter's Storm. I think people like me may naturally choose such a streaming computing framework to handle this scenario. For real time computation, what is the unique charact

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Edward J. Yoon
The final goal can be a real-time event processing framework for distributed event detection, filtering, and aggregation. I guess that can be done with only 3 components:
* Event processing job configuration interface.
* User-defined function that handles the stream input.
* Master Aggregator(s
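
To make the three components named above concrete, here is a minimal single-process sketch in Python. It is an illustration under stated assumptions, not Hama's API: `EventJobConfig`, `count_errors`, `MasterAggregator`, and `run` are hypothetical names, the log format (`host level message`) is invented for the example, and the BSP superstep is only simulated by processing one fixed-size window of events at a time before the master merges the partial results.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# All names below are hypothetical illustrations; none of them is Hama's API.


@dataclass
class EventJobConfig:
    """Component 1: job configuration (worker count, events per window)."""
    num_workers: int
    window_size: int


# Component 2: a user-defined function that maps one raw event to a
# (key, value) pair, or returns None to filter the event out.
EventFn = Callable[[str], Optional[Tuple[str, int]]]


def count_errors(line: str) -> Optional[Tuple[str, int]]:
    """Example UDF: keep ERROR lines only, keyed by host (assumed first token)."""
    parts = line.split()
    if len(parts) < 2 or parts[1] != "ERROR":
        return None
    return (parts[0], 1)


class MasterAggregator:
    """Component 3: merges the partial counts produced by the workers."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()

    def merge(self, partial: Counter) -> None:
        self.totals.update(partial)


def _process_window(config: EventJobConfig, udf: EventFn,
                    window: List[str], master: MasterAggregator) -> None:
    # Each simulated worker aggregates its share of the window locally.
    partials = [Counter() for _ in range(config.num_workers)]
    for i, line in enumerate(window):
        result = udf(line)
        if result is not None:
            key, value = result
            partials[i % config.num_workers][key] += value
    # Simulated barrier: every worker is done, so the master merges.
    for partial in partials:
        master.merge(partial)


def run(config: EventJobConfig, udf: EventFn, stream: Iterable[str]) -> Counter:
    """Consume the stream window by window and return the running totals."""
    master = MasterAggregator()
    window: List[str] = []
    for line in stream:
        window.append(line)
        if len(window) == config.window_size:
            _process_window(config, udf, window, master)
            window = []
    if window:  # flush the last, partially filled window
        _process_window(config, udf, window, master)
    return master.totals
```

For example, `run(EventJobConfig(num_workers=2, window_size=2), count_errors, lines)` consumes any iterable of log lines, so the input could in principle be a socket or message queue reader rather than a file. The window boundary stands in for the synchronization barrier; how a real deployment would pick the window size is left open here.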

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Yexi Jiang
Please correct me if I'm wrong. My understanding of aggregating the log is to collect the logs generated from each monitored machine in real time. The collecting procedure is continuous, like a data stream, and never ends. I know how to use Hama to aggregate the logs batch by batch (e.g. aggregate the lo

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Edward J. Yoon
The aggregators in the Graph package do similar work: monitoring, global communication, etc. On Tue, Mar 4, 2014 at 10:20 PM, Yexi Jiang wrote: > I am very interested in this topic since my research area includes event > mining, but can BSP conducts the real time computing? > > I once use

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Chia-Hung Lin
BSP is a bridging model that doesn't restrict itself to any particular usage. My understanding (I could be wrong) is that our framework needs to address such issues. [1], for example, proposes a BSP-based solution for real-time applications. [1]. Hartley J.K., Bargiela A., TPML: Parall

Re: [jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-04 Thread Yexi Jiang
I am very interested in this topic since my research area includes event mining, but can BSP conduct real-time computing? I once used a message-queue-based solution to collect event logs. 2014-03-04 1:54 GMT-05:00 Edward J. Yoon (JIRA) : > > [ > https://issues.apache.org/jira/br

[jira] [Updated] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama

2014-03-03 Thread Edward J. Yoon (JIRA)
[ https://issues.apache.org/jira/browse/HAMA-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward J. Yoon updated HAMA-883: Summary: [Research Task] Massive log event aggregation in real time using Apache Hama (was: [Researc