> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > Have a few comments.

Thank you very much for reviewing my patch, Santhosh! Please find responses 
inline.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java,
> >  line 199
> > <https://reviews.apache.org/r/6884/diff/3/?file=154465#file154465line199>
> >
> >     Why aren't float and double returning float and double respectively?

I am worried about loss of precision by converting a big positive/negative int 
to float/double. Do you disagree?


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java,
> >  line 210
> > <https://reviews.apache.org/r/6884/diff/3/?file=154465#file154465line210>
> >
> >     Why aren't float and double returning float and double respectively?

Same as above.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java,
> >  line 221
> > <https://reviews.apache.org/r/6884/diff/3/?file=154465#file154465line221>
> >
> >     Why aren't int and long returning float ?

Same as above.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java,
> >  line 232
> > <https://reviews.apache.org/r/6884/diff/3/?file=154465#file154465line232>
> >
> >     Why aren't int and long returning double ?

Same as above.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java,
> >  line 273
> > <https://reviews.apache.org/r/6884/diff/3/?file=154465#file154465line273>
> >
> >     Is the equality applicable all the way through?

No, this is just an enum equality check. However, that's sufficient here since 
mergeType() is meant to merge primitive types only.

I added a comment regarding what this method is meant for.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java,
> >  line 144
> > <https://reviews.apache.org/r/6884/diff/3/?file=154467#file154467line144>
> >
> >     Are the size of t and mProtoTuple expected to match?

No, they are not always expected to match because the size of mProtoTuple is 
equal to that of merged schema while the size of t is equal to that of 
individual input schemas.

In fact, I realized that this loop is redundant after all because the 
initialization of mProtoTuple is already done inside the constructor. So I 
removed this.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java,
> >  line 93
> > <https://reviews.apache.org/r/6884/diff/3/?file=154467#file154467line93>
> >
> >     This will append to the end of the list resulting in a list size of 
> > 2*tupleSize. Was that your intention?

No, that's not my intention.

But I believe that initializing the capability of ArrayList doesn't initialize 
its size. The size seems to be 0 until an entry is actually added to it.

Nevertheless, I changed it to add(i, null), so now it's more clear.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorageUtils.java,
> >  line 184
> > <https://reviews.apache.org/r/6884/diff/3/?file=154469#file154469line184>
> >
> >     Why isn't bytes + other non-null type = bytes ?

My reasoning is that bytes is just blob, so its type is unknown; therefore, the 
merged type is unknown. I wanted to be conservative in terms of type converting 
to avoid run-time exceptions.

But please let me know if you think otherwise.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java,
> >  line 132
> > <https://reviews.apache.org/r/6884/diff/3/?file=154464#file154464line132>
> >
> >     Is it multiple_schema or multiple_schemas ?

It should be multiple_schemas. Fixed.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java,
> >  line 311
> > <https://reviews.apache.org/r/6884/diff/3/?file=154464#file154464line311>
> >
> >     Any reason for removing this here and not in other places?

I put it back.


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java,
> >  line 362
> > <https://reviews.apache.org/r/6884/diff/3/?file=154464#file154464line362>
> >
> >     The logic used here is close to the logic used in setLocation. Is it 
> > possible to refactor the code and move this to a utility method?

Their logic is very similar, but there is a difference. At a high level, 
they'are like this:

// setLocation
if (a) {
  do x;
  do y;
}

// getSchema
if (a) {
  do x;
}

where

a: getAllSubDirs()
x: initialize inputAvroSchema
y: call FileInputFormat.setInputPaths()

I factored out the code for initializing inputAvroSchema into a method 
setInputAvroSchema(), but I found it hard to refactor more than this. Please 
let me know if you have any suggestion.

By the way, while thinking about this, I eliminated getConcretePathFromGlob(). 
Originally, getConcretePathFromGlob() was called by getAvroSchema() to load the 
schema from the first file that matches the glob pattern. But since we build 
'paths' by calling getAllSubDirs() anyway, it seems better to resue 'paths' 
inside getAvroSchema().


> On Sept. 17, 2012, 11:14 p.m., Santhosh Srinivasan wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java,
> >  line 130
> > <https://reviews.apache.org/r/6884/diff/3/?file=154464#file154464line130>
> >
> >     Can these string constants be declared to make them reusable and more 
> > readable?


- Cheolsoo


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6884/#review11566
-----------------------------------------------------------


On Sept. 19, 2012, 12:10 a.m., Cheolsoo Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6884/
> -----------------------------------------------------------
> 
> (Updated Sept. 19, 2012, 12:10 a.m.)
> 
> 
> Review request for pig and Santhosh Srinivasan.
> 
> 
> Description
> -------
> 
> Add support for multiple avro schemas to AvroStorage. This patch is based on 
> Stan Rosenberg's original work.
> 
> Please see https://issues.apache.org/jira/browse/PIG-2579 for details
> 
> 
> This addresses bug PIG-2579.
>     https://issues.apache.org/jira/browse/PIG-2579
> 
> 
> Diffs
> -----
> 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
>  d7a004f 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
>  84280af 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
>  fb5cc25 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
>  75057f9 
>   
> contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  1f6e581 
>   
> contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorageUtils.java
>  0761d5a 
> 
> Diff: https://reviews.apache.org/r/6884/diff/
> 
> 
> Testing
> -------
> 
> New unit tests are added:
> - TestAvroStorageUtils.testMergeSchema
> - TestAvroStorage.testMultipleSchemas1,2
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>

Reply via email to