This is another way to do it.

I just created a JIRA issue for that:

https://issues.apache.org/jira/browse/FLINK-1297

If you can give me some pointers and suggest implementation strategies, I
can try to prototype something in a feature branch over the weekend and
share it for review.
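
To make the identity-mapper idea from the quoted discussion below a bit more
concrete, here is a minimal sketch in plain Python. It does not use any Flink
API; the names CollectionStats and identity_with_stats are hypothetical, and
the exact distinct set is only a stand-in for whatever approximate structure
a real implementation would probably need.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, Iterator, Optional, Set


@dataclass
class CollectionStats:
    """Running statistics for one collection (hypothetical helper)."""
    count: int = 0
    minimum: Optional[Any] = None
    maximum: Optional[Any] = None
    # Exact distinct values; assumed small enough to fit in memory here.
    distinct: Set[Any] = field(default_factory=set)

    def update(self, value: Any) -> None:
        """Fold a single observed value into the running statistics."""
        self.count += 1
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if self.maximum is None or value > self.maximum:
            self.maximum = value
        self.distinct.add(value)


def identity_with_stats(
    records: Iterable[Any],
    stats: CollectionStats,
    key: Callable[[Any], Any] = lambda record: record,
) -> Iterator[Any]:
    """Yield every record unchanged while updating stats as a side effect."""
    for record in records:
        stats.update(key(record))
        yield record


# Wrap an existing stream of records without changing what flows downstream.
stats = CollectionStats()
out = list(identity_with_stats([3, 1, 4, 1, 5], stats))
```

The wrapper leaves the records untouched, so it could sit between any two
operators; only the stats object changes. How the per-task statistics would
be merged and shipped back (e.g. via Akka) is left open here.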



2014-12-02 14:43 GMT+01:00 Ufuk Celebi <u...@apache.org>:

> Have you also thought about adding the statistics collection with the
> writers, i.e. the collector or record writer?
>
> If all you care about is the data that the user emits from her code, that
> should be fine.
>
> On Tue, Dec 2, 2014 at 2:33 PM, Robert Metzger <rmetz...@apache.org>
> wrote:
>
> > Yes. I also got the impression that you are looking for something slightly
> > different.
> >
> > It is probably easier for you right now to "hack" something into the system
> > to get these statistics.
> >
> > On Tue, Dec 2, 2014 at 2:25 PM, Alexander Alexandrov <
> > alexander.s.alexand...@gmail.com> wrote:
> >
> > > I checked the thread. I am not sure whether this is aligned with what I
> > > want to contribute.
> > >
> > > The discussion in the other thread seems to be going in the direction of
> > > general-purpose monitoring (you are talking about Disk + Network IO, input
> > > splits).
> > >
> > > I would like to have a very thin code base that can be either (1)
> > > transparently injected in UDFs (if you can manipulate the AST) or (2)
> > > wrapped in identity mappers (if you cannot) in order to gather collection
> > > statistics (min, max, distinct, maybe some histograms) to facilitate
> > > incremental optimization.
> > >
> > > I agree that this should be based on existing infrastructure (Akka) and
> > > should not be over-engineered.
> > >
> > > I will announce this in the other branch and create a JIRA ticket to fix
> > > the parameters of what has to be done and the best way to implement it with
> > > the other contributors.
> > >
> > >
> > >
> > > 2014-12-02 14:12 GMT+01:00 Kostas Tzoumas <ktzou...@apache.org>:
> > >
> > > > From the status of that thread and absence of a JIRA (as far as I could
> > > > tell), I would suggest that you start working on this and announce it on
> > > > the other thread; perhaps Nils would be interested in jumping in.
> > > >
> > > > On Tue, Dec 2, 2014 at 2:06 PM, Ufuk Celebi <u...@apache.org> wrote:
> > > >
> > > > > Very nice to hear :)
> > > > >
> > > > > See this thread:
> > > > >
> > > > > http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Enhance-Flink-s-monitoring-capabilities-td2573.html
> > > > >
> > > > > On Tue, Dec 2, 2014 at 2:00 PM, Alexander Alexandrov <
> > > > > alexander.s.alexand...@gmail.com> wrote:
> > > > >
> > > > > > Just a quick shout to check whether somebody is already working on a
> > > > > > statistics collection component?
> > > > > >
> > > > > > If yes, can you point me to previous discussions in the mailing list
> > > > > > and a WIP branch -- I want to bring myself up to date with the ongoing
> > > > > > efforts.
> > > > > >
> > > > > > If not, I would like to start working on that component and ideally
> > > > > > integrate some parts of it in the 0.8 release.
> > > > > >
> > > > > > Cheers!
> > > > > >
> > > > >
> > > >
> > >
> >
>
