The final goal could be a real-time event processing framework for distributed event detection, filtering, and aggregation. I think that can be done with only three components:
* Event processing job configuration interface.
* User-defined function that handles the stream input.
* Master Aggregator(s) and its client library.

I expect this can be applied to tasks such as web clickstream log analysis (on large-scale web servers), finding hot search keywords, and detecting system errors in real time, and users will be able to program them in a few minutes.

On Wed, Mar 5, 2014 at 10:30 AM, Yexi Jiang <[email protected]> wrote:
> Please correct me if I'm wrong. My understanding of aggregating the logs is
> to collect the logs generated by each monitored machine in real time. The
> collecting procedure is continuous, like a data stream, and never ends.
>
> I know how to use Hama to aggregate the logs batch by batch (e.g. aggregate
> the logs incrementally each day), but I cannot immediately come up with a
> way of using Hama to solve this problem in real time.
>
>
> 2014-03-04 19:32 GMT-05:00 Edward J. Yoon <[email protected]>:
>
>> Aggregators of the Graph package do similar work: monitoring,
>> global communication, etc.
>>
>>
>>
>> On Tue, Mar 4, 2014 at 10:20 PM, Yexi Jiang <[email protected]> wrote:
>> > I am very interested in this topic since my research area includes event
>> > mining, but can BSP do real-time computing?
>> >
>> > I once used a message-queue-based solution to collect event logs.
>> >
>> >
>> > 2014-03-04 1:54 GMT-05:00 Edward J. Yoon (JIRA) <[email protected]>:
>> >
>> >>
>> >> [ https://issues.apache.org/jira/browse/HAMA-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>> >>
>> >> Edward J. Yoon updated HAMA-883:
>> >> --------------------------------
>> >>
>> >> Summary: [Research Task] Massive log event aggregation in real time
>> >> using Apache Hama (was: [Research Task] Massive log data aggregation in
>> >> real time using Apache Hama)
>> >>
>> >> > [Research Task] Massive log event aggregation in real time using Apache Hama
>> >> > ----------------------------------------------------------------------------
>> >> >
>> >> > Key: HAMA-883
>> >> > URL: https://issues.apache.org/jira/browse/HAMA-883
>> >> > Project: Hama
>> >> > Issue Type: Task
>> >> > Reporter: Edward J. Yoon
>> >> >
>> >> > BSP tasks can be used for aggregating log data streamed in real time.
>> >> > With this research task, we might be able to turn this kind of
>> >> > processing into a platform.
>> >>
>> >>
>> >>
>> >> --
>> >> This message was sent by Atlassian JIRA
>> >> (v6.2#6252)
>> >>
>> >
>> >
>> >
>> > --
>> > ------
>> > Yexi Jiang,
>> > ECS 251, [email protected]
>> > School of Computer and Information Science,
>> > Florida International University
>> > Homepage: http://users.cis.fiu.edu/~yjian004/
>>
>>
>>
>> --
>> Edward J. Yoon (@eddieyoon)
>> Chief Executive Officer
>> DataSayer, Inc.
>
>
>
> --
> ------
> Yexi Jiang,
> ECS 251, [email protected]
> School of Computer and Information Science,
> Florida International University
> Homepage: http://users.cis.fiu.edu/~yjian004/

--
Edward J. Yoon (@eddieyoon)
Chief Executive Officer
DataSayer, Inc.
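
To make the three components named at the top of this message a bit more concrete, here is a rough single-process sketch in Python. It is not Hama code and does not use any Hama or BSP API; every name in it (`JobConfig`, `MasterAggregator`, `AggregatorClient`, `run_worker`, `search_keyword`) is hypothetical, and the log line format is assumed purely for illustration. It only shows how a job configuration, a user-defined function, and a master aggregator with a client library might fit together for the "hot search keywords" case.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# A user-defined function maps one raw input line to an optional (key, count)
# event. Returning None filters the line out.
EventFn = Callable[[str], Optional[Tuple[str, int]]]


@dataclass
class JobConfig:
    """Hypothetical job configuration: which function to run, how to batch."""
    name: str
    event_fn: EventFn
    batch_size: int = 2  # events buffered on a worker before sending


class MasterAggregator:
    """Hypothetical master: merges partial counts sent by workers."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()

    def receive(self, partial: Counter) -> None:
        self.totals.update(partial)

    def top(self, n: int) -> List[Tuple[str, int]]:
        return self.totals.most_common(n)


class AggregatorClient:
    """Hypothetical client library a worker uses to report to the master."""

    def __init__(self, master: MasterAggregator, batch_size: int) -> None:
        self._master = master
        self._batch_size = batch_size
        self._buffer: Counter = Counter()
        self._pending = 0

    def emit(self, key: str, count: int) -> None:
        self._buffer[key] += count
        self._pending += 1
        if self._pending >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._master.receive(self._buffer)
            self._buffer = Counter()
            self._pending = 0


def run_worker(config: JobConfig, stream: Iterable[str],
               master: MasterAggregator) -> None:
    """Apply the user-defined function to each line and forward the events."""
    client = AggregatorClient(master, config.batch_size)
    for line in stream:
        event = config.event_fn(line)
        if event is not None:  # filtering step
            client.emit(*event)
    client.flush()  # send whatever is left when the stream ends


def search_keyword(line: str) -> Optional[Tuple[str, int]]:
    """Example user-defined function (assumed log format, illustration only)."""
    prefix = "GET /search?q="
    if not line.startswith(prefix):
        return None
    return line[len(prefix):].strip().lower(), 1


if __name__ == "__main__":
    config = JobConfig(name="hot-keywords", event_fn=search_keyword)
    master = MasterAggregator()
    run_worker(config, ["GET /search?q=hama", "GET /index.html",
                        "GET /search?q=bsp"], master)
    run_worker(config, ["GET /search?q=Hama"], master)
    print(master.top(1))  # prints [('hama', 2)]
```

In a real deployment the workers would run on separate machines over never-ending streams, and `receive` would be a network call rather than a method call; the sketch leaves all of that out.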
