[ 
https://issues.apache.org/jira/browse/PIG-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126857#comment-13126857
 ] 

Prashant Kommireddi commented on PIG-784:
-----------------------------------------

Here is an example syslog. Are the warnings not being aggregated because they 
come from a UDF? I do see "aggregate.warning" set to true.

2011-10-13 12:31:01,468 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2011-10-13 12:31:01,504 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2011-10-13 12:31:01,504 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2011-10-13 12:31:02,156 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,156 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,234 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,234 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,403 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,403 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,407 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,407 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,562 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,562 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,567 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,567 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,628 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,628 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,637 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
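
For comparison, here is a rough sketch of the kind of per-class summary one 
would expect in place of the repeated lines. This is not Pig's aggregation 
code; the function name aggregate_warnings and the regular expression are 
assumptions made up for illustration, and the script only post-processes a 
task log that has already been written.

```python
import re
from collections import Counter

# Hypothetical pattern for a Hadoop task-log entry:
# timestamp, level, logger name, message.
LOG_ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+(\S+):\s+(.*)$"
)
STARTS_WITH_DATE = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def aggregate_warnings(log_text):
    """Count WARN entries per (logger, message) pair."""
    # Rejoin entries that were wrapped across several lines: a line that
    # does not start with a date is treated as a continuation.
    entries = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if STARTS_WITH_DATE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line

    counts = Counter()
    for entry in entries:
        m = LOG_ENTRY.match(entry)
        if m and m.group(2) == "WARN":
            counts[(m.group(3), m.group(4))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python aggregate_warnings.py < syslog
    for (logger, message), n in aggregate_warnings(sys.stdin.read()).items():
        print(f"{n} x {logger}: {message}")
```

Run against the excerpt above, this would collapse the 19 WARN entries into 
two summary lines, one for INDEXOF (10 occurrences) and one for LASTINDEXOF 
(9 occurrences).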

                
> PigStorage() - need ability to turn off "Attempt to access field"  warnings
> ---------------------------------------------------------------------------
>
>                 Key: PIG-784
>                 URL: https://issues.apache.org/jira/browse/PIG-784
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.2.0
>            Reporter: David Ciemiewicz
>
> I want an option to PigStorage() for LOAD which will allow me to turn off the 
> "Attempt to access field" warnings.
> Something like:
> {code}
> define PigStorage PigStorage("warn_load_nonexistent_field=off");
> A = load 'mydata.txt' using PigStorage()
>         as (col1: chararray, col2_optional: int, col3_optional: float);
> {code}
> or
> {code}
> A = load 'mydata.txt' using PigStorage("warn_load_nonexistent_field=0")
>         as (col1: chararray, col2_optional: int, col3_optional: float);
> {code}
> If I have a very large data set with optional columns that are not populated 
> (and have no tab separator), I'd like to just read the file as is and not 
> generate the warnings.
> The warnings are problematic because they fill up the logging output, and 
> every System.out.println slows down the overall processing, especially if 
> the data file being processed is missing one or more columns on every 
> single row.

