[ https://issues.apache.org/jira/browse/PIG-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126857#comment-13126857 ]
Prashant Kommireddi commented on PIG-784:
-----------------------------------------
Here is an example syslog. Is it because the warning comes from a UDF? I do see
"aggregate.warning" set to true.
2011-10-13 12:31:01,468 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2011-10-13 12:31:01,504 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2011-10-13 12:31:01,504 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2011-10-13 12:31:02,156 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,156 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,234 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,234 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,403 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,403 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,407 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,407 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,562 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,562 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,564 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,567 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,567 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,628 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,628 WARN org.apache.pig.piggybank.evaluation.string.LASTINDEXOF: Failed to process input; error - null
2011-10-13 12:31:02,637 WARN org.apache.pig.piggybank.evaluation.string.INDEXOF: Failed to process input; error - null
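
For illustration only, here is a minimal sketch of what aggregating these
warnings could look like: each occurrence bumps a counter, and one summary line
per distinct warning is emitted at the end, instead of one log line per record.
The class WarningAggregator and its methods are hypothetical and are not Pig's
API. As far as I know, a real UDF would reach Pig's aggregation through the
warn() method inherited from EvalFunc, whereas the log above suggests these
piggybank UDFs write through their own logger.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for aggregated warning counters; not Pig's actual API. */
public class WarningAggregator {
    // Insertion-ordered so the summary lists warnings in first-seen order.
    private final Map<String, Long> counts = new LinkedHashMap<>();

    /** Record one occurrence of a warning instead of writing a log line. */
    public void warn(String source, String message) {
        counts.merge(source + ": " + message, 1L, Long::sum);
    }

    /** Number of times a given warning was recorded. */
    public long count(String source, String message) {
        return counts.getOrDefault(source + ": " + message, 0L);
    }

    /** One summary line per distinct warning, meant to be emitted once per task. */
    public String summary() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            sb.append(e.getKey()).append(" (").append(e.getValue()).append(" times)\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        WarningAggregator agg = new WarningAggregator();
        // Made-up inputs: a null value stands for a record the UDF cannot process.
        String[] inputs = {null, "abc", null, null};
        for (String in : inputs) {
            if (in == null) {
                agg.warn("INDEXOF", "Failed to process input; error - null");
                agg.warn("LASTINDEXOF", "Failed to process input; error - null");
            }
        }
        System.out.print(agg.summary());
    }
}
```

With counters like these, the size of the task log no longer grows with the
number of bad records, which is the behavior I expected from
"aggregate.warning".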
> PigStorage() - need ability to turn off "Attempt to access field" warnings
> ---------------------------------------------------------------------------
>
> Key: PIG-784
> URL: https://issues.apache.org/jira/browse/PIG-784
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.2.0
> Reporter: David Ciemiewicz
>
> I want an option to PigStorage() for LOAD which will allow me to turn off the
> "Attempt to access field" warnings.
> Something like:
> {code}
> define PigStorage PigStorage('warn_load_nonexistent_field=off');
> A = load 'mydata.txt' using PigStorage()
> as (col1: chararray, col2_optional: int, col3_optional: float);
> {code}
> or
> {code}
> A = load 'mydata.txt' using PigStorage('warn_load_nonexistent_field=0')
> as (col1: chararray, col2_optional: int, col3_optional: float);
> {code}
> If I have a very large data set with optional columns that are not populated
> (and have no tab separator), I'd like to just read the file as is and not
> generate the warnings.
> The warnings are problematic because they fill up the logging output, and every
> System.out.println call slows down the overall processing, especially if the
> data file being processed is missing one or more columns on every single row.