In my specific use case, I wanted to understand why scanning the splits was taking so long, so I was interested in statistics on the number of records contained in each split and the rate at which they are read. Do you think that could be useful in general?

On Dec 2, 2014 9:56 PM, "Fabian Hueske" <fhue...@apache.org> wrote:
> Hi Flavio,
>
> we have a few recently started efforts to implement the collection of
> monitoring and runtime/data statistics.
> Counting the number of elements emitted by an operator (or data source)
> will be included.
>
> Do you want to count the number of produced tuples for monitoring the
> progress or do you see a different use case?
>
> 2014-11-28 9:37 GMT+01:00 Flavio Pompermaier <pomperma...@okkam.it>:
>
> > Hi guys,
> >
> > I was debugging an InputFormat and I discovered that there's no way to
> > understand how many records have been processed in a split.
> > So I added a counter to my input format, incremented on every
> > nextRecord. Do you think adding something similar, like "public int
> > getProcessedRecordsCount()", to the InputFormat interface could be useful?
> > Or are you going to manage this count stat from the caller of nextRecord?
> >
> > Best,
> > Flavio
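
For illustration, here is a rough sketch of the kind of per-split counter discussed in this thread: a wrapper that counts every record returned by nextRecord and derives a read rate from the elapsed time. It is not Flink's actual InputFormat interface. RecordReader, CountingReader, ListReader and getRecordsPerSecond are hypothetical names made up for this example; getProcessedRecordsCount follows the name proposed above, but returns a long here instead of an int.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a per-split reader; NOT Flink's InputFormat.
interface RecordReader<T> {
    boolean reachedEnd();
    T nextRecord();
}

// Wraps any reader and tracks how many records were read and how fast.
class CountingReader<T> implements RecordReader<T> {
    private final RecordReader<T> inner;
    private long processedRecords = 0;
    private long startNanos = -1; // set when nextRecord is first called
    private long lastNanos = -1;  // set after each nextRecord call returns

    CountingReader(RecordReader<T> inner) {
        this.inner = inner;
    }

    @Override
    public boolean reachedEnd() {
        return inner.reachedEnd();
    }

    @Override
    public T nextRecord() {
        if (startNanos < 0) {
            startNanos = System.nanoTime();
        }
        T record = inner.nextRecord();
        lastNanos = System.nanoTime();
        if (record != null) {
            processedRecords++; // count only records actually produced
        }
        return record;
    }

    // Number of records read from this split so far.
    long getProcessedRecordsCount() {
        return processedRecords;
    }

    // Average read rate between the first and the latest nextRecord call.
    double getRecordsPerSecond() {
        long elapsed = lastNanos - startNanos;
        return elapsed > 0 ? processedRecords * 1e9 / elapsed : 0.0;
    }
}

// Trivial in-memory reader, used only to exercise the wrapper.
class ListReader<T> implements RecordReader<T> {
    private final Iterator<T> it;

    ListReader(List<T> records) {
        this.it = records.iterator();
    }

    @Override
    public boolean reachedEnd() {
        return !it.hasNext();
    }

    @Override
    public T nextRecord() {
        return it.hasNext() ? it.next() : null;
    }
}
```

Wrapping the reader keeps the counting out of each individual input format, which is roughly the "count it from the caller of nextRecord" option raised above. Whether that or an interface method is the better fit depends on how the runtime statistics end up being collected.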