[ https://issues.apache.org/jira/browse/PIG-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126857#comment-13126857 ]
Prashant Kommireddi commented on PIG-784:
-----------------------------------------

Here is an example syslog. Is it because the warning comes from a UDF? I do see "aggregate.warning" set to true.

2011-10-13 12:31:01,468 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2011-10-13 12:31:01,504 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2011-10-13 12:31:01,504 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2011-10-13 12:31:02,156 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,156 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,234 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,234 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,403 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,403 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,407 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,407 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,562 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,562 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,567 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,567 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,628 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,628 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,637 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null

> PigStorage() - need ability to turn off "Attempt to access field" warnings
> ---------------------------------------------------------------------------
>
>                 Key: PIG-784
>                 URL: https://issues.apache.org/jira/browse/PIG-784
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.2.0
>            Reporter: David Ciemiewicz
>
> I want an option to PigStorage() for LOAD which will allow me to turn off the
> "Attempt to access field" warnings.
> Something like:
> {code}
> define PigStorage PigStorage("warn_load_nonexistent_field=off");
> A = load 'mydata.txt' using PigStorage()
>     as (col1: chararray, col2_optional: int, col3_optional: float);
> {code}
> or
> {code}
> A = load 'mydata.txt' using PigStorage("warn_load_nonexistent_field=0")
>     as (col1: chararray, col2_optional: int, col3_optional: float);
> {code}
> If I have a very large data set with optional columns that are not populated
> (and have no tab separator), I'd like to just read the file as is and not
> generate the warnings.
> The warnings are problematic because they fill up the logging output, and every
> System.out.println will slow down the overall processing.
> Especially if the data file being processed is missing one or more columns on
> every single row.