Re: [Proposal] Dynamical Topology Manager

Zhang, Edward (GDI Hadoop) Fri, 15 Jan 2016 16:07:17 -0800

I had a short discussion with Henry about this. We probably need to discuss
a more graceful way to settle whether Eagle or Storm should do this.
Today we know that Storm also provides topology view/statistics features
but does not have a topology lifecycle management UI.
But can we ask the Storm team to build this topology lifecycle UI?
For now, can we make our implementation more pluggable and extensible, so
that later on, if Storm has this feature, we can use it directly (at least
at the API level)?


Do you want to talk to the Storm community about this, or could an Eagle
committer build this feature and contribute it back to Storm?

What do you think?

Thanks
Edward



On 1/11/16, 14:45, "Edward Zhang" <yonzhang2...@apache.org> wrote:

>Thanks for the valuable comments; you are right in the high-level
>observations :-)
>
>(I participated in some offline discussions on this proposal.) I think this
>proposal is based on the requirement that Eagle monitor not only security
>events from Hadoop but also security events from other data sources, for
>example Cassandra, and even more requirements come from Hadoop native
>metric monitoring. In the last 2 months, when we wanted to onboard new,
>diverse data sources (for example MongoDB metrics, Hadoop native metrics,
>etc.), we found it is impossible for users (mostly operations teams) to
>understand Storm topologies or write code for very simple metrics
>ingestion/alert rules. From a monitoring perspective, users usually want
>metric/log onboarding to be as simple as a turn-key operation.
>
>It is possible for Eagle developers to develop an application for each data
>source, but it might be better for Eagle to support logs/metrics with some
>general schema. Users would just need some configuration to get a new data
>source flowing into Eagle and create policies on the fly against the
>streaming data. Maybe I am wrong, but Eagle looks like an application to the
>Storm framework, while Eagle would be a framework to monitoring
>applications, e.g. security or other data activity.
>
>The point that Eagle should separate application and framework is very
>correct. In the Eagle source code, the application code and framework code
>have been separated from the beginning. For features like policy restore
>after machine failure, DSL, aggregation, etc., I think the Eagle team should
>look at contributing back to Storm if people agree that is what a streaming
>framework should have.
>
>We also found that Eagle monitoring uses a streaming framework, but the
>streaming framework is not customized for monitoring. The gap between the
>monitoring platform and the streaming framework has to be filled to make
>sure monitoring is reliable. For example, compared to real-time streaming
>analytics, monitoring cannot tolerate any false or missing alerts,
>especially for security events, which requires stronger processing semantics
>than popular streaming frameworks provide. We will actively explore help
>from streaming projects.
>
>Thanks
>
>Edward
>
>On Mon, Jan 11, 2016 at 10:55 AM, Julian Hyde <jh...@apache.org> wrote:
>
>> Can I make a high-level observation? (And, although I'm a mentor of this
>> project, I'm not speaking as a mentor, just someone who has built various
>> database and streaming systems over the years.)
>>
>> I've noticed that Eagle is taking on several problems that,
>> architecturally speaking, should be part of the underlying streaming
>> system: this topology manager, and also a DSL for declarative streaming
>> queries, and making sure that streaming queries continue where they left
>> off, even in the presence of failures of individual stream-processing
>> nodes.
>>
>> These are very hard distributed systems problems, and they are horizontal
>> problems that have nothing to do with Eagle's problem domain (security).
>> It would be analogous to an application that is selling concert tickets
>> deciding to develop HBase as part of the application.
>>
>> If the Eagle community wants to solve these problems, that's awesome, and
>> you should go for it. Apache projects are great at pulling in people with
>> diverse skills and when they gather momentum they can build some amazing
>> technology. But I think Eagle should consider putting more architectural
>> separation between your stream management technology and your
>> application. You could do that by building separate modules (and testing
>> them independently). Or you could contribute the functionality you need to
>> the underlying system (e.g. Storm/Nimbus). Your project will run much more
>> smoothly if you call out the hard problems you are trying to solve.
>>
>> And, as a side benefit, projects that have nothing to do with Hadoop
>> security will be able to use (and test, and bug-fix) the technology you
>> are developing.
>>
>> Julian
>>
>>
>> > On Jan 11, 2016, at 8:57 AM, Hao Chen <cn.haoc...@gmail.com> wrote:
>> >
>> > Currently Eagle requires users to manage topologies manually, completely
>> > independently of Eagle components, which does not make for a smooth
>> > end-to-end user and management experience across on-boarding data
>> > sources, starting topologies, defining policies, and monitoring policy
>> > and execution status. So what do you think about managing everything in
>> > a single place, and dynamically managing the topology lifecycle
>> > (starting/stopping/status/monitoring) as well as policy
>> > creation/modification/monitoring, all in the Eagle UI only? That way
>> > users don't need to touch Storm anymore, except to specify where Nimbus
>> > is when setting up Eagle.
>> >
>> >
>> > --
>> >
>> > Hao
>>
>>
