1) Number of invocations of a UDF: You can use pig.udf.profile
<http://pig.apache.org/docs/r0.12.0/perf.html#profiling>. Note that it is
an approximation and can be misleading. You can make it 100% accurate
by configuring pig.udf.profile.frequency
<https://issues.apache.org/jira/browse/PIG-3956>, but that option is only in
trunk.

2) Number of records getting filtered: We don't have a counter specifically for
this, but you can estimate it by comparing the map/reduce input/output records
before and after the filter-by. If you use a visualization tool such as
Lipstick, the input/output records of each MR job are displayed in the DAG.
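
As a rough sketch of 1), the settings would go at the top of the Pig script.
I'm assuming here that a frequency of 1 means every invocation is measured
rather than sampled; please check PIG-3956 for the exact semantics.

    -- turn on UDF profiling counters (approximate by default)
    set pig.udf.profile true;
    -- trunk only; the value 1 is my assumption for 'measure every call'
    set pig.udf.profile.frequency 1;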

On Tue, Jun 10, 2014 at 7:49 AM, Abhishek Agarwal <[email protected]>
wrote:

> I was wondering if Pig has built-in support for counting the number of
> invocations of a UDF and the number of records getting filtered through
> the FILTER operator.
>
> This feature could be very useful especially for filters where you can't
> hook your own counters.
>
> --
> Regards,
> Abhishek Agarwal
>
