[ 
https://issues.apache.org/jira/browse/PIG-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082556#comment-13082556
 ] 

Thejas M Nair commented on PIG-2207:
------------------------------------

bq. I also think the warning message should be printed "at least once" – just 
the counters alone aren't always sufficient. We could get this fairly cheaply 
by keeping a weakref hashmap of seen warnings, and logging the message every 
time we see a new warning.

This way the single warning line can be printed in the backend (task) logs, 
but not on the client for the user to see. It does not provide a means of 
backend-to-client communication. 
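
For reference, the "log each new warning once" idea quoted above could look roughly like the sketch below. This is plain Java and only an illustration: SeenWarnings and warnOnce are hypothetical names, not existing Pig classes or methods, and stderr stands in for the task log.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

// Hypothetical helper, not part of Pig: remembers which warnings were already logged.
final class SeenWarnings {
    // Weakly keyed set: an entry may disappear once its key string is garbage
    // collected, in which case the message is logged again (still "at least once").
    private final Set<String> seen = Collections.synchronizedSet(
            Collections.newSetFromMap(new WeakHashMap<String, Boolean>()));

    /** Logs to stderr and returns true only the first time a warning name is seen. */
    boolean warnOnce(String warnName, String warnMsg) {
        if (!seen.add(warnName)) {
            return false; // already logged by this task
        }
        System.err.println("WARN " + warnName + ": " + warnMsg);
        return true;
    }
}
```

Everything this helper prints stays in the log of the task that ran it, which is why it cannot replace counters for getting information back to the client.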

bq. Your proposal suggests custom counters via strings, so we have this problem 
either way. At least making the group match the class that issues the warning 
constrains the space of possible counters to num_udfs x 4_generic_warnings, 
whereas the space of possible strings is (for our purposes) unlimited.

I think the counter name should be based on warnName + *first* warnMsg, in 
EvalFunc.warn(String warnName, String warnMsg). That way a single warning 
message will also be displayed on the client side when warning aggregation is 
turned on. When warning aggregation is turned off, each warnMsg goes into the log.
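
A minimal sketch of that naming scheme, in plain Java with no Pig or Hadoop dependencies. UdfWarningAggregator and its methods are hypothetical names used only for illustration, and the "name (first message)" layout is an assumption; the summary line mimics the "Encountered Warning ... N time(s)." format from the issue description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical aggregator, not Pig's implementation.
final class UdfWarningAggregator {
    private final Map<String, String> firstMsg = new LinkedHashMap<>();
    private final Map<String, Long> counts = new LinkedHashMap<>();

    /** Counts the warning and remembers only the first message seen for its name. */
    void warn(String warnName, String warnMsg) {
        firstMsg.putIfAbsent(warnName, warnMsg);
        counts.merge(warnName, 1L, Long::sum);
    }

    /** Counter name built from warnName plus the first warnMsg (assumed layout). */
    String counterName(String warnName) {
        return warnName + " (" + firstMsg.get(warnName) + ")";
    }

    long count(String warnName) {
        return counts.getOrDefault(warnName, 0L);
    }

    /** One aggregated line per warning name, as the client would print it. */
    String summaryLine(String warnName) {
        return "Encountered Warning " + counterName(warnName) + " "
                + count(warnName) + " time(s).";
    }
}
```

Note that in a real job each task would see its own "first" message, so the counter name could differ between tasks unless the message part is fixed some other way; the sketch ignores that.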




> Support custom counters for aggregating warnings from different udfs
> --------------------------------------------------------------------
>
>                 Key: PIG-2207
>                 URL: https://issues.apache.org/jira/browse/PIG-2207
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Thejas M Nair
>              Labels: newbie
>             Fix For: 0.10
>
>
> Pig allows udfs to aggregate warning messages instead of writing out a 
> separate warning message each time. Udfs can do this by logging the warning 
> using the EvalFunc.warn(String msg, Enum) call. But the udfs are forced to use 
> the PigWarning class if the warning needs to be printed at the end of the pig 
> script. 
> For example, with the changes in PIG-2191, some of the builtin udfs are using 
> PigWarning.UDF_WARNING_1 as argument in calls to EvalFunc.warn. This will 
> result in the warning count being printed on STDERR -
> {code}
> 2011-08-05 22:10:29,285 [main] WARN  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Encountered Warning UDF_WARNING_1 2 time(s).
> 2011-08-05 22:10:29,285 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> {code}
> But it would be better if a udf such as the LOWER udf could use a custom 
> warning counter, so that the STDERR output looks like -
> {code}
> 2011-08-05 22:10:29,285 [main] WARN  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Encountered Warning LOWER_FUNC_INPUT_WARNING 2 time(s).
> 2011-08-05 22:10:29,285 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> {code}
> A new function could be added to support this - (something like) 
> EvalFunc.warn(String warnName, String warnMsg);  A specific counter group 
> could be used for udf warnings (see org.apache.hadoop.mapred.Counters), and 
> the counters for that group could be aggregated during the final warning 
> aggregation done in MapReduceLauncher.computeWarningAggregate(). 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

