Hi,

Much of Spark Streaming's internal status is exposed through
StreamingListener, including what you see in the web UI. You could write
your own StreamingListener, register it with the StreamingContext to
receive that internal information, and write it to a CSV file.

You could check the source code here (
https://github.com/apache/spark/blob/3c0156899dc1ec1f7dfe6d7c8af47fa6dc7d00bf/streaming/src/main/scala/org/apache/spark/streaming/scheduler/StreamingListener.scala
).
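
A minimal sketch of such a listener, under a few assumptions: the names
CsvBatchListener and BatchCsv and the column names are made up for
illustration, and "input size" is taken to be the record count in
BatchInfo.numRecords (not bytes), which may not be available in older
Spark versions. Check the BatchInfo fields in your version before relying
on it.

```scala
import java.io.{FileWriter, PrintWriter}

import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Hypothetical helper: formats one CSV row; a missing delay becomes an empty field.
object BatchCsv {
  val header = "batch_time_ms,num_records,total_delay_ms"

  def row(batchTimeMs: Long, numRecords: Long, totalDelayMs: Option[Long]): String =
    s"$batchTimeMs,$numRecords,${totalDelayMs.map(_.toString).getOrElse("")}"
}

// Hypothetical listener: writes a header, then one row per completed batch.
class CsvBatchListener(path: String) extends StreamingListener {
  // autoFlush = true so each println reaches the file immediately
  private val out = new PrintWriter(new FileWriter(path, false), true)
  out.println(BatchCsv.header)

  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    val info = batchCompleted.batchInfo
    // totalDelay is an Option[Long] in milliseconds
    out.println(BatchCsv.row(info.batchTime.milliseconds, info.numRecords, info.totalDelay))
  }

  def close(): Unit = out.close()
}
```

It would be registered before starting the context, for example
ssc.addStreamingListener(new CsvBatchListener("batches.csv")), where the
file name is again only a placeholder.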

Thanks
Saisai


On Tue, Aug 4, 2015 at 6:58 PM, allonsy <luke1...@gmail.com> wrote:

> Hi everyone,
>
> I'm working with Spark Streaming, and I need to perform some offline
> performance measures.
>
> What I'd like to have is a CSV file that reports something like this:
>
> *Batch number/timestamp        Input Size        Total Delay*
>
> which is in fact similar to what the UI outputs.
>
> I tried to get some metrics (metrics.properties), but I'm having a hard time
> getting precise information on every single batch, since they only have
> entries concerning the /last/ (completed/received) batch, and the values
> often differ from those appearing in the UI.
>
> Can anybody give me some advice on how to get metrics that are close to
> those of the UI?
>
> Thanks!
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Total-delay-per-batch-in-a-CSV-file-tp24129.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
