1) Number of invocations of a UDF: You can use pig.udf.profile <http://pig.apache.org/docs/r0.12.0/perf.html#profiling>. Note that the count is an approximation and can be misleading. You can make it 100% accurate by configuring pig.udf.profile.frequency <https://issues.apache.org/jira/browse/PIG-3956>, but that setting is only in trunk.
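
As a rough sketch, the two properties could be set at the top of a Pig script as shown here. The frequency value of 1 is my assumption of what "100% accurate" corresponds to (sampling every invocation); check PIG-3956 for the actual semantics. Passing the properties on the command line with -D should work as well.

```
-- Enable the UDF profiling counters described in the perf docs.
set pig.udf.profile true;
-- Trunk only (PIG-3956). Assumption: 1 means every invocation is sampled.
set pig.udf.profile.frequency 1;
```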
2) Number of records getting filtered: We don't have a counter specifically for this, but you can estimate it by comparing the map/reduce input/output record counts before and after the FILTER BY. If you use a visualization tool such as Lipstick, the input/output records of each MR job are displayed in the DAG.

On Tue, Jun 10, 2014 at 7:49 AM, Abhishek Agarwal <[email protected]> wrote:

> I was wondering if pig has in-built support for counting, the number of
> invocations of a UDF and the number of records getting filtered through
> FILTER operator.
>
> This feature could be very useful especially for filters where you can't
> hook your own counters.
>
> --
> Regards,
> Abhishek Agarwal
