[ 
https://issues.apache.org/jira/browse/PARQUET-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15019000#comment-15019000
 ] 

Ryan Blue commented on PARQUET-365:
-----------------------------------

[~julienledem], since you've looked at this already, can you comment on whether 
we should include a fix in 1.9.0?

> Class Summary does not provide a getter to return inputSchema
> -------------------------------------------------------------
>
>                 Key: PARQUET-365
>                 URL: https://issues.apache.org/jira/browse/PARQUET-365
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.6.0, 1.7.0, 1.8.0
>            Reporter: li xiang
>            Priority: Critical
>              Labels: easyfix, patch
>             Fix For: 1.8.0
>
>
> In Pig's EvalFunc 
> (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/EvalFunc.java), a 
> private member "inputSchemaInternal" holds the input schema. A setter and a 
> getter are also provided:
> {code}
>     private Schema inputSchemaInternal=null;
>     // ...
>     /**
>      * This method is for internal use. It is called by Pig core in both front-end
>      * and back-end to setup the right input schema for EvalFunc
>      */
>     public void setInputSchema(Schema input){
>         this.inputSchemaInternal=input;
>     }
> 
>     /**
>      * This method is intended to be called by the user in {@link EvalFunc} to get the input
>      * schema of the EvalFunc
>      */
>     public Schema getInputSchema(){
>         return this.inputSchemaInternal;
>     }
> {code}
> In parquet-mr/parquet-pig/src/main/java/parquet/pig/summary/Summary.java, 
> class Summary extends EvalFunc. It declares its own member, inputSchema 
> (distinct from inputSchemaInternal in Pig's EvalFunc), to hold the schema and 
> overrides setInputSchema() to set it, but it does not override 
> getInputSchema() to return inputSchema.
> {code}
> public class Summary extends EvalFunc<String> implements Algebraic {
>   // ...
>   private Schema inputSchema;
>   // ...
>   @Override
>   public void setInputSchema(Schema input) {
>     try {
>       // relation.bag.tuple
>       this.inputSchema=input.getField(0).schema.getField(0).schema;
>       saveSchemaToUDFContext();
>     } catch (FrontendException e) {
>       throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from " + input, e);
>     } catch (RuntimeException e) {
>       throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from "+input, e);
>     }
>   }
> {code}
> If setInputSchema() of class Summary is called, only Summary's inputSchema is 
> set. A later call to getInputSchema() runs the inherited EvalFunc 
> implementation and returns inputSchemaInternal, which may still be null.
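
The behaviour described in the report can be reproduced in isolation. The following is a minimal sketch, not the actual Pig or Parquet code: BaseFunc, BrokenSummary and FixedSummary are hypothetical stand-ins, and a String replaces Pig's Schema type so the example compiles without any dependencies. FixedSummary shows one possible fix (overriding the getter as well); whether the real Summary should return its narrowed schema or instead also call super.setInputSchema() is a decision for the maintainers.

```java
// Stand-ins for illustration only; these are NOT the real Pig or Parquet classes.
public class InputSchemaDemo {

    // Hypothetical simplification of org.apache.pig.EvalFunc.
    static class BaseFunc {
        private String inputSchemaInternal = null;

        public void setInputSchema(String input) {
            this.inputSchemaInternal = input;
        }

        public String getInputSchema() {
            return this.inputSchemaInternal;
        }
    }

    // Mirrors the reported pattern: the setter is overridden and writes to a
    // new field, while the inherited getter still reads the base-class field.
    static class BrokenSummary extends BaseFunc {
        private String inputSchema;

        @Override
        public void setInputSchema(String input) {
            this.inputSchema = input;
        }
    }

    // One possible fix (an assumption, not the committed patch): override the
    // getter too, so that it reads the field the overridden setter writes.
    static class FixedSummary extends BaseFunc {
        private String inputSchema;

        @Override
        public void setInputSchema(String input) {
            this.inputSchema = input;
        }

        @Override
        public String getInputSchema() {
            return this.inputSchema;
        }
    }

    public static void main(String[] args) {
        BaseFunc broken = new BrokenSummary();
        broken.setInputSchema("a:int");
        System.out.println(broken.getInputSchema()); // prints "null"

        BaseFunc fixed = new FixedSummary();
        fixed.setInputSchema("a:int");
        System.out.println(fixed.getInputSchema()); // prints "a:int"
    }
}
```

The base-class field is private, so a subclass that overrides only the setter has no way to keep the inherited getter in sync unless it also delegates to the parent setter or overrides the getter.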



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
