[ https://issues.apache.org/jira/browse/PIG-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath reopened PIG-513:
--------------------------------

Pig is supposed to provide nulls for columns not present in the data. For example, if a file has 3 columns (name, age, gpa), then the following statement should still work, with the column 'extra' getting nulls:
{noformat}
a = load 'input' as (name, age, gpa, extra);
{noformat}

This is broken with the current code.

> PERFORMANCE: optimize some of the code in DefaultTuple
> ------------------------------------------------------
>
>                 Key: PIG-513
>                 URL: https://issues.apache.org/jira/browse/PIG-513
>             Project: Pig
>          Issue Type: Bug
>   Affects Versions: 0.2.0
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-513.patch, pig-513_2.patch
>
>
> The following areas in DefaultTuple.java can be changed:
> The member methods get(), set(), getType() and isNull() all call
> checkBounds(), which is a redundant call since all 4 of these methods throw
> ExecException. Instead of doing a bounds check, we can catch the
> IndexOutOfBoundsException in a try-catch and rethrow it as an ExecException.
> The write() method has the following unused object (d in the code below):
> {code}
> for (int i = 0; i < sz; i++) {
>     try {
>         Object d = get(i);
>     } catch (ExecException ee) {
>         throw new RuntimeException(ee);
>     }
>     DataReaderWriter.writeDatum(out, mFields.get(i));
> }
> {code}
> {noformat}
> The try block around get(i) should be removed, leaving only the writeDatum call,
> since d is never used and the call to get() is unnecessary.
> {noformat}
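The two changes proposed in the quoted description can be sketched roughly as follows. This is a minimal illustration, not the attached patch: SketchTuple, SketchExecException and the simplified writeDatum are hypothetical stand-ins for Pig's DefaultTuple, ExecException and DataReaderWriter.writeDatum, and the sketch does not address the missing-column behavior described in the reopening comment.

```java
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Pig's checked ExecException.
class SketchExecException extends Exception {
    SketchExecException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical, stripped-down tuple; not the real DefaultTuple.
class SketchTuple {
    private final List<Object> mFields;

    SketchTuple(Object... fields) {
        mFields = new ArrayList<>(Arrays.asList(fields));
    }

    // No explicit checkBounds(): the list performs its own range check, and
    // its unchecked exception is translated into the checked one.
    Object get(int fieldNum) throws SketchExecException {
        try {
            return mFields.get(fieldNum);
        } catch (IndexOutOfBoundsException e) {
            throw outOfRange(fieldNum, e);
        }
    }

    void set(int fieldNum, Object val) throws SketchExecException {
        try {
            mFields.set(fieldNum, val);
        } catch (IndexOutOfBoundsException e) {
            throw outOfRange(fieldNum, e);
        }
    }

    boolean isNull(int fieldNum) throws SketchExecException {
        return get(fieldNum) == null;
    }

    // The loop writes each field directly; there is no throwaway get(i) call.
    void write(DataOutput out) throws IOException {
        for (Object field : mFields) {
            writeDatum(out, field);
        }
    }

    // Simplified placeholder for DataReaderWriter.writeDatum: it only writes
    // the string form of the value, which the real method does not do.
    static void writeDatum(DataOutput out, Object datum) throws IOException {
        out.writeUTF(String.valueOf(datum));
    }

    private SketchExecException outOfRange(int fieldNum, IndexOutOfBoundsException cause) {
        return new SketchExecException(
            "Field number " + fieldNum + " is out of range for tuple of size " + mFields.size(),
            cause);
    }
}

class TupleSketchDemo {
    public static void main(String[] args) throws Exception {
        SketchTuple t = new SketchTuple("alice", 20, null);
        System.out.println(t.get(0));
        System.out.println(t.isNull(2));
        try {
            t.get(3); // one past the last field
        } catch (SketchExecException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

In this sketch an out-of-range access still surfaces as the checked exception, so callers see the same exception type as with an explicit bounds check; only the in-range path skips the extra comparison.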