It would be nice to have integration with existing tools, e.g. Ganglia [1].
These already cover system statistics (CPU, network, I/O, ...), and one
can define one's own stats to monitor.
Hadoop is nicely integrated with it.
[1] http://ganglia.sourceforge.net/
On Tue, Dec 2, 2014 at 9:37 PM, Fabian Hu
Yes, sure.
Tracking records per split and UDF exec time per call (min, max, avg, or
histogram) would be valuable information when debugging the performance of
a program.
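
A minimal sketch of what tracking UDF execution time per call (min, max, avg) could look like. The class CallTimeStats and its methods are hypothetical names chosen for illustration; they are not part of Flink, and a histogram is left out to keep it short.

```java
/** Hypothetical helper (not a Flink class): running min/max/avg of per-call durations. */
final class CallTimeStats {
    private long count;
    private long totalNanos;
    private long minNanos = Long.MAX_VALUE;
    private long maxNanos = Long.MIN_VALUE;

    /** Records the duration of one call, in nanoseconds. */
    void record(long nanos) {
        count++;
        totalNanos += nanos;
        minNanos = Math.min(minNanos, nanos);
        maxNanos = Math.max(maxNanos, nanos);
    }

    /** Runs the given call and records how long it took, even if it throws. */
    void time(Runnable call) {
        long start = System.nanoTime();
        try {
            call.run();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    long count() { return count; }

    /** Returns 0 when nothing has been recorded yet. */
    long min() { return count == 0 ? 0 : minNanos; }

    /** Returns 0 when nothing has been recorded yet. */
    long max() { return count == 0 ? 0 : maxNanos; }

    /** Average duration in nanoseconds, or 0.0 when nothing has been recorded. */
    double avg() { return count == 0 ? 0.0 : (double) totalNanos / count; }
}
```

Wrapping each invocation of the user function in time(...) would yield the per-call numbers; the same accumulator could presumably be kept per input split to get records per split as well.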
2014-12-02 22:08 GMT+01:00 Flavio Pompermaier :
> In my specific use case I was interested in understanding why the scans
> o
In my specific use case I was interested in understanding why the scans
of the splits were taking a long time, so I was interested in getting
statistics about the number of records contained in each split and the
rate/speed of its reading. Do you think this could be useful in
general?
On D
Hi Flavio,
we have a few recently started efforts to implement the collection of
monitoring and runtime/data statistics.
Counting the number of elements emitted by an operator (or data source)
will be included.
Do you want to count the number of produced tuples for monitoring the
progress or do y
I see mainly two use cases to locally collect data on TMs and send it (and
aggregate it) on the JM.
1) Monitoring of the system and running jobs: This might include system
stats (CPU, disk usage, network traffic & buffer usage, internal memory
utilization, ...) but also progress information (numbe
+1. Some quotes from my email to the PPMC on this topic:
The project has successfully made two [I stand corrected, three] Apache
releases and added new committers. The community seems to be good at
welcoming new contributors, based on the interactions on the mailing
lists. The core team mem
Aljoscha Krettek created FLINK-1298:
---
Summary: Consolidate Handling of User Code ClassLoader
Key: FLINK-1298
URL: https://issues.apache.org/jira/browse/FLINK-1298
Project: Flink
Issue Type:
This is another way to do it.
I just created a JIRA issue for that:
https://issues.apache.org/jira/browse/FLINK-1297
If you can give me some pointers and suggest implementation strategies I
can try to prototype something in a feature branch over the weekend and
share it for review.
2014-12-02
Hello Nils,
I am going to work on a similar issue related to tracking some basic
statistics of the intermediate results produced by dataflows during
execution.
I just created a JIRA issue here:
https://issues.apache.org/jira/browse/FLINK-1297
If you already have some work done on extending the
Alexander Alexandrov created FLINK-1297:
---
Summary: Add support for tracking statistics of intermediate
results
Key: FLINK-1297
URL: https://issues.apache.org/jira/browse/FLINK-1297
Project: Flin
Thanks for the update. :)
Have you also thought about adding the statistics collection with the
writers, i.e. the collector or record writer?
If all you care about is the data that the user emits from her code, that
should be fine.
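
A rough sketch of collecting statistics at the collector level: a wrapper that counts records before forwarding them. SimpleCollector is a hypothetical stand-in interface defined here for illustration, not Flink's own collector type, and CountingCollector is likewise an assumed name.

```java
/** Hypothetical stand-in for a collector interface (not Flink's actual type). */
interface SimpleCollector<T> {
    void collect(T record);
}

/** Wraps another collector and counts every record the user code emits. */
final class CountingCollector<T> implements SimpleCollector<T> {
    private final SimpleCollector<T> delegate;
    private long emitted;

    CountingCollector(SimpleCollector<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void collect(T record) {
        emitted++;                 // update the statistic first
        delegate.collect(record);  // then forward the record unchanged
    }

    /** Number of records emitted through this collector so far. */
    long emittedCount() {
        return emitted;
    }
}
```

Because the wrapper only sees what passes through collect, it captures exactly the data the user emits from her code and nothing produced elsewhere in the runtime.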
On Tue, Dec 2, 2014 at 2:33 PM, Robert Metzger wrote:
> Yes. I also got the impression th
Yes. I also got the impression that you are looking for something slightly
different.
It is probably easier for you right now to "hack" something into the system
to get these statistics.
On Tue, Dec 2, 2014 at 2:25 PM, Alexander Alexandrov <
alexander.s.alexand...@gmail.com> wrote:
> I checked t
I checked the thread. I am not sure whether this is aligned with what I
want to contribute.
The discussion in the other thread seems to be going in the direction of
general-purpose monitoring (you are talking about Disk + Network IO, input
splits).
I would like to have a very thin code base that
The thread mentioned by Ufuk is an ongoing discussion, that's why there is
no JIRA yet.
To my understanding, it's a student doing a project on Flink.
Also, I would like to give you the same advice I already gave to Nils: I
would highly recommend using Till's Akka branch for starting to work on
that.
From the status of that thread and absence of a JIRA (as far as I could
tell), I would suggest that you start working on this and announce it on
the other thread, perhaps Nils would be interested in jumping in.
On Tue, Dec 2, 2014 at 2:06 PM, Ufuk Celebi wrote:
> Very nice to hear :)
>
> See th
Very nice to hear :)
See this thread:
http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Enhance-Flink-s-monitoring-capabilities-td2573.html
On Tue, Dec 2, 2014 at 2:00 PM, Alexander Alexandrov <
alexander.s.alexand...@gmail.com> wrote:
> Just a quick shout to check whether
Just a quick shout to check whether somebody is already working on a
statistics collection component?
If yes, can you point me to previous discussions in the mailing list and a
WIP branch -- I want to bring myself up to date with the ongoing efforts.
If not, I would like to start working on that