[jira] [Commented] (PIG-4628) Pig 0.14 job with order by fails in mapreduce mode with Oozie

2015-08-12 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693974#comment-14693974
 ] 

Viraj Bhat commented on PIG-4628:
-

Thanks Koji for your help.
Viraj

> Pig 0.14 job with order by fails in mapreduce mode with Oozie
> -
>
> Key: PIG-4628
> URL: https://issues.apache.org/jira/browse/PIG-4628
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
> Fix For: 0.15.1
>
> Attachments: pig-4628-v01.patch, pig-4628-v02.patch
>
>
> A simple pig script with order-by submitted through oozie and running with 
> mapreduce-mode 
> {code}
> A = LOAD '$input' AS (a1:CHARARRAY,a2:CHARARRAY, );
> A_sorted = ORDER A BY url DESC PARALLEL 2;
> STORE A_sorted INTO '$output';
> {code}
> failed on our hadoop cluster which had security turned on.  Part of the stack 
> trace had 
> {noformat}
> 2015-06-08 22:24:39,246 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: Exception reading file:/tmp/2/yarn-local/usercache/userA/appcache/application_1432697993142_199266/container_e06_1432697993142_199266_01_03/container_tokens
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.init(WeightedRangePartitioner.java:155)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:75)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:58)
>   at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
>   at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>   at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:135)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:281)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:274)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {noformat}
> The failing job was application_1432697993142_199305, but the path in the error 
> belongs to application_1432697993142_199266, which was an Oozie pig-launcher job.
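
The mismatch described above (a task of one application reading the container_tokens file of another) can be spotted mechanically in a log. The following is only a minimal diagnostic sketch in Python, not part of Pig, Hadoop, or Oozie; the helper names `application_ids` and `token_path_is_foreign` are hypothetical and introduced here for illustration.

```python
import re

# Hypothetical helpers for log triage; nothing here is a Pig or Hadoop API.
# A YARN application ID looks like application_<clusterTimestamp>_<sequence>.
APP_ID = re.compile(r"application_\d+_\d+")


def application_ids(text):
    """Return every application ID that appears in a path or log line."""
    return APP_ID.findall(text)


def token_path_is_foreign(running_app_id, token_path):
    """True when the path names an application other than the running one."""
    ids = application_ids(token_path)
    return bool(ids) and running_app_id not in ids


# Path copied from the stack trace in this issue.
token_path = (
    "file:/tmp/2/yarn-local/usercache/userA/appcache/"
    "application_1432697993142_199266/"
    "container_e06_1432697993142_199266_01_03/container_tokens"
)

# The failing job was application_1432697993142_199305.
foreign = token_path_is_foreign("application_1432697993142_199305", token_path)
```

With the IDs quoted in the report, `foreign` is true: the token path belongs to the launcher job rather than to the job that failed.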



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4628) Pig 0.14 job with order by fails in mapreduce mode with Oozie

2015-08-11 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692669#comment-14692669
 ] 

Viraj Bhat commented on PIG-4628:
-

Rohini, can you please commit this to trunk and/or backport it to 0.14? We are 
running Pig 0.14 in M/R mode and hit this problem.

Viraj

> Pig 0.14 job with order by fails in mapreduce mode with Oozie
> -
>
> Key: PIG-4628
> URL: https://issues.apache.org/jira/browse/PIG-4628
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
> Attachments: pig-4628-v01.patch, pig-4628-v02.patch
>
>
> A simple pig script with order-by submitted through oozie and running with 
> mapreduce-mode 
> {code}
> A = LOAD '$input' AS (a1:CHARARRAY,a2:CHARARRAY, );
> A_sorted = ORDER A BY url DESC PARALLEL 2;
> STORE A_sorted INTO '$output';
> {code}
> failed on our hadoop cluster which had security turned on.  Part of the stack 
> trace had 
> {noformat}
> 2015-06-08 22:24:39,246 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: Exception reading file:/tmp/2/yarn-local/usercache/userA/appcache/application_1432697993142_199266/container_e06_1432697993142_199266_01_03/container_tokens
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.init(WeightedRangePartitioner.java:155)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:75)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:58)
>   at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
>   at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>   at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:135)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:281)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:274)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {noformat}
> The failing job was application_1432697993142_199305, but the path in the error 
> belongs to application_1432697993142_199266, which was an Oozie pig-launcher job.





[jira] [Updated] (PIG-4498) AvroStorage in Piggbank does not handle bad records and fails

2015-04-06 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-4498:

Attachment: PIG-4498.patch

> AvroStorage in Piggbank does not handle bad records and fails
> -
>
> Key: PIG-4498
> URL: https://issues.apache.org/jira/browse/PIG-4498
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.12.0, 0.11.1, 0.13.1, 0.14.1
>    Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: piggybank
> Fix For: 0.14.1
>
> Attachments: PIG-4498.patch
>
>
> The following Pig script fails if the records within the file are corrupted.
> {code}
> DEFINE AvroLoader org.apache.pig.piggybank.storage.avro.AvroStorage('ignore_bad_files');
> DH_RAW = LOAD 'bad_data*' USING AvroLoader();
> STORE DH_RAW INTO 'output' USING PigStorage();
> {code}
> Here is the stack trace:
> {quote}
> java.lang.ArrayIndexOutOfBoundsException: -49
>   at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:230)
>   at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:407)
>   ... 12 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -49
>   at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>   at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>   at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>   at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readMap(PigAvroDatumReader.java:89)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>   at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:198)
>  ..
> {quote}





[jira] [Updated] (PIG-4498) AvroStorage in Piggbank does not handle bad records and fails

2015-04-06 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-4498:

Labels: piggybank  (was: )
Status: Patch Available  (was: Open)

> AvroStorage in Piggbank does not handle bad records and fails
> -
>
> Key: PIG-4498
> URL: https://issues.apache.org/jira/browse/PIG-4498
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1, 0.12.0, 0.13.1, 0.14.1
>    Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: piggybank
> Fix For: 0.14.1
>
> Attachments: PIG-4498.patch
>
>
> The following Pig script fails if the records within the file are corrupted.
> {code}
> DEFINE AvroLoader org.apache.pig.piggybank.storage.avro.AvroStorage('ignore_bad_files');
> DH_RAW = LOAD 'bad_data*' USING AvroLoader();
> STORE DH_RAW INTO 'output' USING PigStorage();
> {code}
> Here is the stack trace:
> {quote}
> java.lang.ArrayIndexOutOfBoundsException: -49
>   at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:230)
>   at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:407)
>   ... 12 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -49
>   at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>   at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>   at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>   at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readMap(PigAvroDatumReader.java:89)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>   at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:198)
>  ..
> {quote}





[jira] [Updated] (PIG-4498) AvroStorage in Piggbank does not handle bad records and fails

2015-04-06 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-4498:

Affects Version/s: 0.13.1
   0.12.0

> AvroStorage in Piggbank does not handle bad records and fails
> -
>
> Key: PIG-4498
> URL: https://issues.apache.org/jira/browse/PIG-4498
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.12.0, 0.11.1, 0.13.1, 0.14.1
>    Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.14.1
>
>
> The following Pig script fails if the records within the file are corrupted.
> {code}
> DEFINE AvroLoader org.apache.pig.piggybank.storage.avro.AvroStorage('ignore_bad_files');
> DH_RAW = LOAD 'bad_data*' USING AvroLoader();
> STORE DH_RAW INTO 'output' USING PigStorage();
> {code}
> Here is the stack trace:
> {quote}
> java.lang.ArrayIndexOutOfBoundsException: -49
>   at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:230)
>   at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:407)
>   ... 12 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -49
>   at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>   at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>   at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>   at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readMap(PigAvroDatumReader.java:89)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
>   at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>   at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:198)
>  ..
> {quote}





[jira] [Created] (PIG-4498) AvroStorage in Piggbank does not handle bad records and fails

2015-04-03 Thread Viraj Bhat (JIRA)
Viraj Bhat created PIG-4498:
---

 Summary: AvroStorage in Piggbank does not handle bad records and 
fails
 Key: PIG-4498
 URL: https://issues.apache.org/jira/browse/PIG-4498
 Project: Pig
  Issue Type: Bug
  Components: piggybank
Affects Versions: 0.11.1, 0.14.1
Reporter: Viraj Bhat
Assignee: Viraj Bhat
 Fix For: 0.14.1


The following Pig script fails if the records within the file are corrupted.

{code}
DEFINE AvroLoader org.apache.pig.piggybank.storage.avro.AvroStorage('ignore_bad_files');
DH_RAW = LOAD 'bad_data*' USING AvroLoader();
STORE DH_RAW INTO 'output' USING PigStorage();
{code}

Here is the stack trace:
{quote}
java.lang.ArrayIndexOutOfBoundsException: -49
  at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:230)
  at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:407)
  ... 12 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -49
  at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
  at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
  at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
  at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
  at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readMap(PigAvroDatumReader.java:89)
  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
  at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
  at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:73)
  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
  at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
  at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
  at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:198)
 ..
{quote}
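
The behaviour this report asks for (skip a corrupted record instead of failing the whole load) can be illustrated with a small reader loop. This is only a sketch in Python under stated assumptions: it is not how AvroStorage or the attached patch works, `read_tolerantly` and `decode_pair` are hypothetical names, and the toy decoder stands in for a real Avro datum reader. Recovery inside a real Avro data file may also need to resynchronize to the next block, which this sketch does not model.

```python
def read_tolerantly(raw_records, decode, max_bad=None):
    """Yield decoded records, skipping the ones that fail to decode.

    raw_records: iterable of undecoded records.
    decode:      callable that raises IndexError or ValueError on bad input.
    max_bad:     optional cap; exceeding it aborts instead of skipping forever.
    """
    bad = 0
    for raw in raw_records:
        try:
            yield decode(raw)
        except (IndexError, ValueError) as err:
            bad += 1
            if max_bad is not None and bad > max_bad:
                raise RuntimeError(f"too many bad records ({bad})") from err


def decode_pair(raw):
    """Toy decoder: 'name,age' -> (name, int(age)); raises on malformed input."""
    name, age = raw.split(",")
    return name, int(age)


# The middle record is malformed and is skipped rather than failing the read.
rows = list(read_tolerantly(["Milo,30", "garbage", "Asmya,34"], decode_pair))
```

The optional cap is a design choice of the sketch: silently skipping every record would hide a file that is corrupt from start to finish.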





[jira] [Commented] (PIG-3222) New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer

2014-01-30 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887401#comment-13887401
 ] 

Viraj Bhat commented on PIG-3222:
-

Hi Daniel,
 It seems that this patch is already in our code base for Pig 0.11, but the query still 
fails. It succeeds in Pig 0.12. I have asked Rohini whether she has any idea about this.
Thanks again
Viraj

> New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer 
> ---
>
> Key: PIG-3222
> URL: https://issues.apache.org/jira/browse/PIG-3222
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11
>Reporter: Feng Peng
>  Labels: hcatalog
> Attachments: PigStorerDemo.java, hcat.trace, hcatstorer.trace.txt
>
>
> Pig 0.11 assigns a different UDFContextSignature to different invocations of 
> the same load/store statement. This change breaks HCatStorer, which 
> assumes that all front-end and back-end invocations of the same store statement 
> have the same UDFContextSignature, so that it can read the previously stored 
> information correctly.
> The related HCatalog code is in 
> https://svn.apache.org/repos/asf/incubator/hcatalog/branches/branch-0.5/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatStorer.java
>  (the setStoreLocation() function).
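
The failure mode described above comes down to a lookup keyed by signature: state written under one signature cannot be found under another. The following Python sketch only illustrates that idea; `ContextStore` and the signature strings are hypothetical and do not reproduce Pig's UDFContext API or the HCatStorer code.

```python
class ContextStore:
    """Toy per-signature property store (hypothetical, for illustration)."""

    def __init__(self):
        self._props = {}

    def properties_for(self, signature):
        # Each distinct signature gets its own, initially empty, property map.
        return self._props.setdefault(signature, {})


store = ContextStore()

# Front end: the storer saves information under the signature it was given.
store.properties_for("store-1")["schema"] = "a:int,b:chararray"

# Back end with the same signature: the saved information is found.
same_sig = store.properties_for("store-1").get("schema")

# Back end with a different signature for the same statement: nothing is found.
other_sig = store.properties_for("store-2").get("schema")
```

Here `other_sig` is `None`, which mirrors the report: a storer that assumes one stable signature per store statement reads nothing back when the signature changes between invocations.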



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (PIG-3222) New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer

2014-01-30 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887216#comment-13887216
 ] 

Viraj Bhat commented on PIG-3222:
-

Hi Feng,
 Thanks for finding this error in Pig 0.11. It seems the limit to HCatStorer 
works fine with Pig 0.12 but is still a problem with Pig 0.11. I am not sure whether we 
need to backport whatever got this working in Pig 0.12.
Viraj

> New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer 
> ---
>
> Key: PIG-3222
> URL: https://issues.apache.org/jira/browse/PIG-3222
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11
>Reporter: Feng Peng
>  Labels: hcatalog
> Attachments: PigStorerDemo.java, hcat.trace, hcatstorer.trace.txt
>
>
> Pig 0.11 assigns a different UDFContextSignature to different invocations of 
> the same load/store statement. This change breaks HCatStorer, which 
> assumes that all front-end and back-end invocations of the same store statement 
> have the same UDFContextSignature, so that it can read the previously stored 
> information correctly.
> The related HCatalog code is in 
> https://svn.apache.org/repos/asf/incubator/hcatalog/branches/branch-0.5/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatStorer.java
>  (the setStoreLocation() function).





FW: IEEE CloudCom 2013 Call For Papers

2013-07-08 Thread Viraj Bhat
Kindly consider submitting.
Viraj

From: c...@grid.chu.edu.tw [mailto:c...@grid.chu.edu.tw]
Sent: Saturday, July 06, 2013 10:42 PM
To: Viraj Bhat
Subject: IEEE CloudCom 2013 Call For Papers

Call for Papers

IEEE CloudCom 2013 (5th IEEE International Conference on Cloud Computing, 
Technology and Science)
2-5 December 2013, Bristol, UK
2013.cloudcom.org

General Information
---
The “Cloud” is a natural evolution of distributed computing and of the 
widespread adoption of virtualization and SOA. In Cloud Computing, IT-related 
capabilities and resources are provided as services, via the Internet and 
on-demand, accessible without requiring detailed knowledge of the underlying 
technology. The IEEE International Conference and Workshops on Cloud Computing 
Technology and Science, steered by the Cloud Computing Association, aim to 
bring together researchers who work on cloud computing and related technologies.

Important Dates
---
Paper submission - July 31, 2013
Workshop, poster and demo papers – August 5, 2013
Notification – September 2, 2013
Camera-ready – September 16, 2013

Paper Submission
-
Manuscripts need to be prepared according to the IEEE CS format: 
http://www.computer.org/portal/web/cscps/formatting
For regular papers, the page limit will be 8 pages. For workshops and Ph.D. 
consortium, the page limit will be 6 pages. For poster and demo, the page limit 
will be 4 pages.

All accepted papers will be published by IEEE CS Press (IEEE Xplore) and 
Indexed by EI and ISSN. Accepted papers will be asked to present in a plenary 
session. Distinguished papers will be invited to be extended for submission in 
prestigious international journals.

IEEE Transactions on Cloud Computing (TCC: http://computer.org/TCC) is 
organising a Special Issue which encourages submission of revised and extended 
versions of best/top rated papers in the area of Cloud Computing from IEEE 
CloudCom 2013.

The IEEE CloudCom 2013 submission site is: 
https://www.easychair.org/conferences/?conf=ieeecloudcom2013

Topics of Interest
--
‧ Cloud architecture
‧ Big Data
‧ Security and Privacy in the Cloud
‧ Cloud services and Applications
‧ Virtualization
‧ HPC on Cloud
‧ IoT and Mobile on Cloud

For further details and workshop information see http://2013.cloudcom.org or 
send enquiries to ieeecloudcom2...@easychair.org






[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: Employee6.avro

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.avro, Employee4.avro, Employee6.avro, 
> PIG-3318_5.patch
>
>
> Piggybank AvroStorage: when merging multiple schemas whose fields specify 
> default values in the Avro schema, AvroStorage puts nulls in the merged 
> data set instead of those defaults.
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj
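
The expectation in this report (fields missing from a record should take the schema's declared default, not null) can be sketched as follows. This is a minimal Python illustration, not the AvroStorage implementation or the attached patch; `merge_fields` and `project` are hypothetical names, and the first-seen field order used here may differ from the order AvroStorage produces.

```python
def merge_fields(schemas):
    """Union the fields of several record schemas, keeping first-seen order."""
    merged = {}
    for schema in schemas:
        for field in schema["fields"]:
            merged.setdefault(field["name"], field)
    return list(merged.values())


def project(record, merged_fields):
    """Lay a record out on the merged schema, using defaults for absent fields."""
    return tuple(
        record.get(field["name"], field.get("default")) for field in merged_fields
    )


# Field lists taken from the Employee3 and Employee4 schemas in this issue.
employee3 = {"type": "record", "name": "employee", "fields": [
    {"name": "name", "type": "string", "default": "NU"},
    {"name": "age", "type": "int", "default": 0},
    {"name": "dept", "type": "string", "default": "DU"}]}
employee4 = {"type": "record", "name": "employee", "fields": [
    {"name": "name", "type": "string", "default": "NU"},
    {"name": "age", "type": "int", "default": 0},
    {"name": "dept", "type": "string", "default": "DU"},
    {"name": "office", "type": "string", "default": "OU"}]}

merged = merge_fields([employee3, employee4])

# A record that carries only the three Employee3 fields.
row = project({"name": "Milo", "age": 30, "dept": "DH"}, merged)
```

In this sketch the missing `office` field is filled with its default `"OU"` rather than left empty, which is the behaviour the issue title asks for.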

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: Employee4.avro

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.avro, Employee4.avro, Employee6.avro, 
> PIG-3318_5.patch
>
>
> Piggybank AvroStorage: when merging multiple schemas whose fields specify 
> default values in the Avro schema, AvroStorage puts nulls in the merged 
> data set instead of those defaults.
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: PIG-3318_5.patch

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.avro, Employee4.avro, Employee6.avro, 
> PIG-3318_5.patch
>
>
> Piggybank AvroStorage: when merging multiple schemas whose fields specify 
> default values in the Avro schema, AvroStorage puts nulls in the merged 
> data set instead of those defaults.
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: Employee3.avro

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.avro, Employee4.avro, Employee6.avro, 
> PIG-3318_5.patch
>
>
> Piggybank AvroStorage: when multiple schemas are merged and default values 
> have been specified in the Avro schemas, AvroStorage does not honor the 
> defaults and puts nulls in the merged data set. 
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj
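
To make the report above concrete, here is a minimal sketch in Python (not the AvroStorage implementation) of merging the three record schemas and laying records out in the merged field order, once leaving absent fields null as the report observes and once filling them from the schema defaults. The names merge_fields and project are hypothetical, the order in which the schemas are passed is chosen only so that the merged field order matches the Output Schema quoted above, and the default-filled rows are what the issue title implies should happen, not output quoted from the report.

```python
# Illustrative sketch only; none of these names come from Pig or Avro.

def record(*fields):
    """Build an Avro-style record schema dict from (name, type, default) triples."""
    return {"type": "record", "name": "employee",
            "fields": [{"name": n, "type": t, "default": d} for n, t, d in fields]}

EMPLOYEE3 = record(("name", "string", "NU"), ("age", "int", 0), ("dept", "string", "DU"))
EMPLOYEE4 = record(("name", "string", "NU"), ("age", "int", 0), ("dept", "string", "DU"),
                   ("office", "string", "OU"))
EMPLOYEE6 = record(("name", "string", "NU"), ("lastname", "string", "LNU"),
                   ("age", "int", 0), ("salary", "int", 0),
                   ("dept", "string", "DU"), ("office", "string", "OU"))

def merge_fields(schemas):
    """Union of the record fields, keeping first-seen order (an assumption)."""
    merged = {}
    for schema in schemas:
        for field in schema["fields"]:
            merged.setdefault(field["name"], field)
    return list(merged.values())

def project(rec, merged_fields, honor_defaults):
    """Lay a record out in merged-schema order, filling fields it lacks."""
    row = []
    for field in merged_fields:
        if field["name"] in rec:
            row.append(rec[field["name"]])
        elif honor_defaults and "default" in field:
            row.append(field["default"])   # behavior the issue asks for
        else:
            row.append(None)               # behavior the issue reports
    return tuple(row)

# Schema order 3, 6, 4 reproduces the field order of the reported Output Schema.
merged = merge_fields([EMPLOYEE3, EMPLOYEE6, EMPLOYEE4])

milo = {"name": "Milo", "age": 30, "dept": "DH"}                          # Employee3 record
praj = {"name": "Praj", "age": 54, "dept": "RMX", "office": "Champaign"}  # Employee4 record

print(project(milo, merged, honor_defaults=False))  # ('Milo', 30, 'DH', None, None, None)
print(project(milo, merged, honor_defaults=True))   # ('Milo', 30, 'DH', 'LNU', 0, 'OU')
print(project(praj, merged, honor_defaults=True))   # ('Praj', 54, 'RMX', 'LNU', 0, 'Champaign')
```

The first printed row corresponds to the reported (Milo,30,DH,,,); the other two show the same records with the missing lastname, salary and office fields taken from the "default" entries of the merged schema.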



Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-06-13 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated June 14, 2013, 12:15 a.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

Indentation and variable case changes.


Description
---

Default values are not honoured when schemas are merged in AvroStorage.


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java 1491562 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: Employee3.ser)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
>



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: Employee6.ser)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
>



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: Employee4.ser)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
>



Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-06-13 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated June 13, 2013, 6:34 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

1) Change the test case to use mock.Storage
2) Remove the condition that skips result verification in Hadoop 23
3) Add back the "usemultipleSchemas" flag to handle cases when 
schemaToMergedSchemaMap is null and multiple_schemas is invoked. Test case 
testMultipleSchema1 failed with the previous patch.
4) Testing done with Hadoop 23


Description
---

Default values are not honoured when schemas are merged in AvroStorage.


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java 1491562 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-13 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: PIG-3318_3.patch)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser
>
>



[jira] [Commented] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-12 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13681848#comment-13681848
 ] 

Viraj Bhat commented on PIG-3318:
-

Sorry for attaching the wrong patch, which made the test case write to an Avro 
file. I have modified the test to use mock.Storage() and will reattach the 
correct patch.
Viraj

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> PIG-3318_3.patch
>
>



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-12 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: expected_testMultipleSchemasWithDefaultValue.avro)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> PIG-3318_3.patch
>
>



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: PIG-3318_3.patch

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasWithDefaultValue.avro, PIG-3318_3.patch
>
>



Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-06-11 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated June 12, 2013, 2:05 a.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

Modified changes with formatting


Description
---

Default values are not honoured when schemas are merged in AvroStorage.


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java 1491556 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java 1491562 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: PIG-3318_2.patch)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasWithDefaultValue.avro
>
>



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: PIG-3318_1.patch)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasWithDefaultValue.avro, PIG-3318_2.patch
>
>



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: PIG-3318_2.patch

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasWithDefaultValue.avro, PIG-3318_2.patch
>
>


Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-06-11 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated June 11, 2013, 9:40 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

Addressing comments in diff6


Description
---

Default values are not honoured when AvroStorage merges schemas on load.


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1491556 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1491556 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1491556 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1491562 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-10 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: PIG-3118.0.11.patch)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasWithDefaultValue.avro, PIG-3318_1.patch
>
>


[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-10 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: expected_testMultipleSchemasWithDefaultValue.avro

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasWithDefaultValue.avro, PIG-3318_1.patch
>
>


[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-10 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: PIG-3318_1.patch

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> PIG-3118.0.11.patch, PIG-3318_1.patch
>
>


[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-10 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: expected_testMultipleSchemasDefault1.avro)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> PIG-3118.0.11.patch, PIG-3318_1.patch
>
>


[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-06-10 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: (was: PIG-3318.patch)

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> PIG-3118.0.11.patch, PIG-3318_1.patch
>
>


Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-06-10 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated June 11, 2013, 3:04 a.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

Updated patch


Description
---

Default values are not honoured when AvroStorage merges schemas on load.


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1491556 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1491556 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1491556 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1491562 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-06-10 Thread Viraj Bhat


> On June 10, 2013, 7:06 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java,
> >  lines 327-338
> > <https://reviews.apache.org/r/11135/diff/5/?file=295050#file295050line327>
> >
> > This can be simplified into a few lines

Fixed by moving this logic into a new function, which makes the code more 
readable.


- Viraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/#review21662
---


On May 30, 2013, 2:28 a.m., Viraj Bhat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11135/
> ---
> 
> (Updated May 30, 2013, 2:28 a.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> ---
> 
> Default values are not honoured when AvroStorage merges schemas on load.
> 
> 
> This addresses bug PIG-3318.
> https://issues.apache.org/jira/browse/PIG-3318
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
>  1484564 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
>  1484564 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
>  1484564 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
>  1484564 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  1484564 
> 
> Diff: https://reviews.apache.org/r/11135/diff/
> 
> 
> Testing
> ---
> 
> Yes
> 
> 
> Thanks,
> 
> Viraj Bhat
> 
>



[jira] [Created] (PIG-3353) Feature parity between HCatStorer and HCatLoader in Pig using Avroserde and Piggybank AvroStorage

2013-06-07 Thread Viraj Bhat (JIRA)
Viraj Bhat created PIG-3353:
---

 Summary: Feature parity between HCatStorer and HCatLoader in Pig 
using Avroserde and Piggybank AvroStorage
 Key: PIG-3353
 URL: https://issues.apache.org/jira/browse/PIG-3353
 Project: Pig
  Issue Type: Improvement
Reporter: Viraj Bhat


Currently there are two paths for accessing an Avro file in Pig: one uses 
HCatLoader and HCatStorer, and the other uses AvroStorage in piggybank.

We need to investigate the feature differences between the two access paths.

Regards
Viraj



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: PIG-3331_1.patch

Updated patch

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, PIG-3331_1.patch
>
>
> A script that stores Avro data with a predefined schema does not write the 
> schema's default values to the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> The "default" attributes are missing, so there seems to be a problem storing 
> the schema.
> Viraj
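
The symptom above can be checked mechanically by comparing the schema passed to the store call with the schema found in the output file. The following is a minimal sketch in plain Python, not part of the patch: `dropped_defaults` is a hypothetical helper, and the two schemas are copied from the report.

```python
import json

# Schema passed to AvroStorage in the STORE statement above.
REQUESTED = json.loads("""
{"name": "rmyrecord", "type": "record", "fields": [
  {"name": "id",        "type": "int", "default": 0},
  {"name": "intnum5",   "type": "int", "default": 0},
  {"name": "intnum100", "type": "int", "default": 0}]}
""")

# Schema read back from the avro.schema header of the output file.
STORED = json.loads(
    '{"type":"record","name":"rmyrecord","fields":['
    '{"name":"id","type":"int"},'
    '{"name":"intnum5","type":"int"},'
    '{"name":"intnum100","type":"int"}]}'
)


def dropped_defaults(requested, stored):
    """List the fields that declare a default in the requested schema but
    have no default (or are absent) in the stored schema."""
    stored_fields = {f["name"]: f for f in stored["fields"]}
    return [
        f["name"]
        for f in requested["fields"]
        if "default" in f and "default" not in stored_fields.get(f["name"], {})
    ]


print(dropped_defaults(REQUESTED, STORED))
```

For the two schemas in the report, every field is flagged; comparing a schema against itself returns an empty list.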



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: (was: PIG-3331_1.patch)

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, PIG-3331_1.patch
>
>

Re: Review Request: PIG-3331 Default values not written to Schema when specified in the output schema

2013-06-04 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11355/
---

(Updated June 4, 2013, 11:23 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

Updated the patch based on PIG-3322
Viraj


Description
---

Patch to write default values to the stored schema when the writer schema 
given to AvroStorage specifies them.


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigSchema2Avro.java
 1485826 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1489655 

Diff: https://reviews.apache.org/r/11355/diff/


Testing
---

Yes, against the Piggybank in Pig trunk/Pig 0.12


Thanks,

Viraj Bhat



Re: Review Request: PIG-3331 Default values not written to Schema when specified in the output schema

2013-06-04 Thread Viraj Bhat


> On June 2, 2013, 8:55 p.m., Cheolsoo Park wrote:
> > Hi Viraj,
> > 
> > I have a couple of comments:
> > - 5k records seems unnecessary for a unit test case. You need just a few 
> > records to verify your fix, don't you?
> > - In your test case, can't you use mock.Storage instead of PigStorage? Then, 
> > you won't need an extra input file (numbers.txt). Please see 
> > org.apache.pig.builtin.mock.Storage.java.
> > - Can you put code changes and test files in a single patch and attach it 
> > in the jira? It would be very helpful if I could apply everything with a 
> > single patch command.
> > 
> > Thank you!

Hi Cheolsoo,
Thanks for your comments. I fixed the test case and replaced PigStorage() with 
mock.Storage.
Viraj


- Viraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11355/#review21304
---


On June 4, 2013, 9:50 p.m., Viraj Bhat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11355/
> ---
> 
> (Updated June 4, 2013, 9:50 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> ---
> 
> Patch to write default values to the stored schema when the writer schema 
> given to AvroStorage specifies them.
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigSchema2Avro.java
>  1485826 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  1485826 
> 
> Diff: https://reviews.apache.org/r/11355/diff/
> 
> 
> Testing
> ---
> 
> Yes, against the Piggybank in Pig trunk/Pig 0.12
> 
> 
> Thanks,
> 
> Viraj Bhat
> 
>



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: PIG-3331_1.patch

Latest patch

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, PIG-3331_1.patch
>
>

[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: (was: PIG-3331_1.patch)

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, PIG-3331_1.patch
>
>
> Script which stores Avro using a predefined schema does not store the default 
> values in the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> There seems to be a problem storing the schema.
> Viraj



Re: Review Request: PIG-3331 Default values not written to Schema when specified in the output schema

2013-06-04 Thread Viraj Bhat


> On June 3, 2013, 1:20 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java,
> >  lines 629-636
> > <https://reviews.apache.org/r/11355/diff/2/?file=295976#file295976line629>
> >
> > Isn't a load and store enough to reproduce the test case? Why such a 
> > long pig script? Please try to keep the unit tests simple.

Made a smaller script to test it.


- Viraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11355/#review21315
---


On June 4, 2013, 9:50 p.m., Viraj Bhat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11355/
> ---
> 
> (Updated June 4, 2013, 9:50 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> ---
> 
> Patch to write default values to the stored schema when the writer schema 
> given to AvroStorage contains them.
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigSchema2Avro.java
>  1485826 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  1485826 
> 
> Diff: https://reviews.apache.org/r/11355/diff/
> 
> 
> Testing
> ---
> 
> Yes, against the Piggybank in Pig trunk/Pig 0.12
> 
> 
> Thanks,
> 
> Viraj Bhat
> 
>



Re: Review Request: PIG-3331 Default values not written to Schema when specified in the output schema

2013-06-04 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11355/
---

(Updated June 4, 2013, 9:50 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

1) Changed patch to use mock.Storage()
2) Smaller generated avro file


Description
---

Patch to write default values to the stored schema when the writer schema 
given to AvroStorage contains them.


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigSchema2Avro.java
 1485826 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1485826 

Diff: https://reviews.apache.org/r/11355/diff/


Testing
---

Yes, against the Piggybank in Pig trunk/Pig 0.12


Thanks,

Viraj Bhat



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: expected_DefaultSchemaWrite.avro

Expected Avro file

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, PIG-3331_1.patch
>
>
> Script which stores Avro using a predefined schema does not store the default 
> values in the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> There seems to be a problem storing the schema.
> Viraj



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: PIG-3331_1.patch

Updated Pig patch

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, PIG-3331_1.patch
>
>
> Script which stores Avro using a predefined schema does not store the default 
> values in the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> There seems to be a problem storing the schema.
> Viraj



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: (was: expected_DefaultSchemaWrite.avro)

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
>
> Script which stores Avro using a predefined schema does not store the default 
> values in the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> There seems to be a problem storing the schema.
> Viraj



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-06-04 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: (was: numbers.txt)

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
>
> Script which stores Avro using a predefined schema does not store the default 
> values in the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> There seems to be a problem storing the schema.
> Viraj



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-06-03 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: (was: test_loadavrowithnulls.avro)

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: PIG-3322_3.patch, test_loadavrowithnulls.avro
>
>
> I am getting an NPE when using AvroStorage to load a file that has a schema 
> like:
> {code}
> ["null",{"type":"record","name":"TUPLE_0","fields":[{"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},{"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},{"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}
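
The schema in the PIG-3322 report is a union at the top level, ["null", record], rather than a bare record, and the same ["null", T] pattern recurs on each field. One way a reader can cope with such a schema is to unwrap a two-branch nullable union to its non-null branch before treating it as a record. The sketch below shows that idea with Python's standard json module only; the helper name unwrap_nullable is hypothetical, and the thread does not say whether the actual patch takes this approach.

```python
import json

# Top-level schema from the report: a union of "null" and a record.
schema = json.loads(
    '["null",{"type":"record","name":"TUPLE_0","fields":['
    '{"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},'
    '{"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},'
    '{"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]'
)


def unwrap_nullable(s):
    """Return the non-null branch of a ["null", T] union; otherwise return s unchanged.

    Hypothetical helper for illustration; it is not part of Pig or Avro.
    """
    if isinstance(s, list):
        non_null = [branch for branch in s if branch != "null"]
        # Unwrap only when "null" was present and exactly one other branch remains.
        if len(non_null) == 1 and len(non_null) < len(s):
            return non_null[0]
    return s


record = unwrap_nullable(schema)
print(record["type"], record["name"])
print([(f["name"], unwrap_nullable(f["type"])) for f in record["fields"]])
```

Unions with more than one non-null branch are returned unchanged, so a caller still has to decide how to handle them.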



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-06-03 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: PIG-3322_3.patch

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: PIG-3322_3.patch, test_loadavrowithnulls.avro
>
>
> I am getting an NPE when using AvroStorage to load a file that has a schema 
> like:
> {code}
> ["null",{"type":"record","name":"TUPLE_0","fields":[{"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},{"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},{"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-06-03 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: test_loadavrowithnulls.avro

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: PIG-3322_3.patch, test_loadavrowithnulls.avro
>
>
> I am getting an NPE when using AvroStorage to load a file that has a schema 
> like:
> {code}
> ["null",{"type":"record","name":"TUPLE_0","fields":[{"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},{"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},{"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-06-03 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: (was: expected_testLoadAvrowithNulls.txt)

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: PIG-3322_3.patch, test_loadavrowithnulls.avro
>
>
> I am getting an NPE when using AvroStorage to load a file that has a schema 
> like:
> {code}
> ["null",{"type":"record","name":"TUPLE_0","fields":[{"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},{"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},{"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-06-03 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: (was: PIG-3322_2.patch)

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: PIG-3322_3.patch, test_loadavrowithnulls.avro
>
>
> I am getting an NPE when using AvroStorage to load a file that has a schema 
> like:
> {code}
> ["null",{"type":"record","name":"TUPLE_0","fields":[{"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},{"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},{"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}



Re: Review Request: PIG-3322 Fix the issue where NPE is thrown when reading a union which has nulls and add a testcase

2013-06-03 Thread Viraj Bhat


> On June 2, 2013, 9:27 p.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java,
> >  line 1104
> > <https://reviews.apache.org/r/11333/diff/5/?file=298357#file298357line1104>
> >
> > If you use mock.Storage here instead of PigStorage, you won't need the 
> > verifyTextResults method and extra output file. Can you please update your 
> > test?
> > 
> > Please see org.apache.pig.builtin.mock.Storage.java.

Added Mock Storage


- Viraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11333/#review21305
---


On June 4, 2013, 12:15 a.m., Viraj Bhat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11333/
> ---
> 
> (Updated June 4, 2013, 12:15 a.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> ---
> 
> Null pointer exception when loading a union with null in its schema. A sample 
> test case was also added to the tests.
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
>  1485358 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
>  1485358 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  1485358 
> 
> Diff: https://reviews.apache.org/r/11333/diff/
> 
> 
> Testing
> ---
> 
> Yes, all tests pass in the piggybank
> 
> 
> Thanks,
> 
> Viraj Bhat
> 
>



Re: Review Request: PIG-3322 Fix the issue where NPE is thrown when reading a union which has nulls and add a testcase

2013-06-03 Thread Viraj Bhat


> On June 3, 2013, 1:03 p.m., Rohini Palaniswamy wrote:
> > Just minor comments in the naming of the variable. Java variable names 
> > should be camel case.

Thanks, but the verifyTxtResults method is no longer used.


- Viraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11333/#review21312
---


On June 4, 2013, 12:15 a.m., Viraj Bhat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11333/
> ---
> 
> (Updated June 4, 2013, 12:15 a.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> ---
> 
> Null pointer exception when loading a union with null in its schema. A sample 
> test case was also added to the tests.
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
>  1485358 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
>  1485358 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  1485358 
> 
> Diff: https://reviews.apache.org/r/11333/diff/
> 
> 
> Testing
> ---
> 
> Yes, all tests pass in the piggybank
> 
> 
> Thanks,
> 
> Viraj Bhat
> 
>



Re: Review Request: PIG-3322 Fix the issue where NPE is thrown when reading a union which has nulls and add a testcase

2013-06-03 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11333/
---

(Updated June 4, 2013, 12:15 a.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

Now using mock.Storage instead of PigStorage and comparing results inline for 4 
records.


Description
---

Null pointer exception when loading a union with null in its schema. A sample 
test case was also added to the tests.


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1485358 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1485358 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1485358 

Diff: https://reviews.apache.org/r/11333/diff/


Testing
---

Yes, all tests pass in the piggybank


Thanks,

Viraj Bhat



Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-29 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated May 30, 2013, 2:28 a.m.)


Review request for pig and Rohini Palaniswamy.


Description
---

Default values are not honoured when merging default schema


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1484564 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



Re: Review Request: PIG-3331 Default values not written to Schema when specified in the output schema

2013-05-29 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11355/
---

(Updated May 30, 2013, 2:29 a.m.)


Review request for pig and Rohini Palaniswamy.


Description
---

Patch to write default values to the stored schema when the writer schema 
given to AvroStorage contains them.


Diffs
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigSchema2Avro.java
 1485826 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1485826 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/numbers.txt
 PRE-CREATION 

Diff: https://reviews.apache.org/r/11355/diff/


Testing
---

Yes, against the Piggybank in Pig trunk/Pig 0.12


Thanks,

Viraj Bhat



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-29 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: expected_testLoadAvrowithNulls.txt

Golden test file generated

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: expected_testLoadAvrowithNulls.txt, PIG-3322_2.patch, 
> test_loadavrowithnulls.avro
>
>
> I am getting an NPE when using AvroStorage to load a file that has a schema 
> like:
> {code}
> ["null",
>  {"type":"record","name":"TUPLE_0","fields":[
>    {"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},
>    {"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},
>    {"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-29 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: test_loadavrowithnulls.avro

Test Input Avro file

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: expected_testLoadAvrowithNulls.txt, PIG-3322_2.patch, 
> test_loadavrowithnulls.avro
>
>



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-29 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: PIG-3322_2.patch

Patch for PIG-3322

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: expected_testLoadAvrowithNulls.txt, PIG-3322_2.patch, 
> test_loadavrowithnulls.avro
>
>



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-29 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: (was: test_loadavrowithnulls.avro)

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
>



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-29 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: (was: expected_testLoadAvrowithNulls.txt)

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
>



Re: Review Request: PIG-3322 Fix the issue where NPE is thrown when reading a union which has nulls and add a testcase

2013-05-29 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11333/
---

(Updated May 29, 2013, 11:07 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
---

Smaller input files and output golden files


Description
---

Fixes a null pointer exception when loading a union with null in its schema. 
The tests were also updated with a sample test case.
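
For readers unfamiliar with the schema shape involved: an Avro union is written as a JSON array of branch schemas, and in PIG-3322 the top-level schema is itself a union of "null" and a record. The sketch below (Python, standard library only) shows one way to pick out the non-null branch before treating the schema as a record. It is an illustration under that assumption, not the code in the patch. The helper name `non_null_branches` is hypothetical, and the schema is abbreviated (the "doc" attributes are omitted).

```python
import json


def non_null_branches(schema):
    """Return the branches of a union schema that are not the "null" type.

    A schema that is not a union is returned as a single-element list.
    """
    if isinstance(schema, list):  # Avro unions are JSON arrays
        return [branch for branch in schema if branch != "null"]
    return [schema]


# Abbreviated form of the top-level schema from the PIG-3322 report.
top_level = json.loads("""
["null",
 {"type": "record", "name": "TUPLE_0", "fields": [
   {"name": "name", "type": ["null", "string"]},
   {"name": "age", "type": ["null", "int"]},
   {"name": "gpa", "type": ["null", "double"]}]}]
""")

record = non_null_branches(top_level)[0]
print(record["name"], [f["name"] for f in record["fields"]])
# → TUPLE_0 ['name', 'age', 'gpa']
```

Code that assumes the top-level schema is always a record would fail on the array form above, which is consistent with the reported NPE.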


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1485358 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1485358 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1485358 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/expected_testLoadAvrowithNulls.txt
 PRE-CREATION 

Diff: https://reviews.apache.org/r/11333/diff/


Testing
---

Yes, all tests pass in the piggybank.


Thanks,

Viraj Bhat



[jira] [Commented] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-05-23 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665713#comment-13665713
 ] 

Viraj Bhat commented on PIG-3331:
-

Patch posted on the review board.
https://reviews.apache.org/r/11355/
Viraj

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, numbers.txt
>
>
> Script which stores Avro using a predefined schema does not store the default 
> values in the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> There seems to be a problem storing the schema.
> Viraj



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-05-23 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: numbers.txt

Input text file with numbers

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, numbers.txt
>
>



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-05-23 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Attachment: expected_DefaultSchemaWrite.avro

Expected Avro file with default schema

> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
> Attachments: expected_DefaultSchemaWrite.avro, numbers.txt
>
>



Re: Review Request: PIG-3318 Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-22 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated May 23, 2013, 12:12 a.m.)


Review request for pig.


Summary (updated)
-

PIG-3318 Patch to address default values when schemas are merged in 
AvroStorage. It does this for Records containing primitive values


Description
---

Default values are not honoured when AvroStorage merges schemas.
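
To make "honouring a default" concrete: under Avro's schema resolution rules, a field that is present in the reader schema but absent from the data takes the field's default value. The sketch below (Python, standard library only) applies that rule to a record of primitive values. It illustrates the general rule rather than the patch itself. The names `fill_defaults` and `merged` are hypothetical.

```python
def fill_defaults(record, merged_schema):
    """Return a copy of record with missing fields filled from schema defaults."""
    out = {}
    for field in merged_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]  # honour the declared default
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return out


# Hypothetical merged schema: one input file carries only "id",
# another carries both fields.
merged = {
    "type": "record",
    "name": "merged",
    "fields": [
        {"name": "id", "type": "int", "default": 0},
        {"name": "intnum5", "type": "int", "default": 0},
    ],
}

print(fill_defaults({"id": 7}, merged))
# → {'id': 7, 'intnum5': 0}
```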


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1484564 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



[jira] [Commented] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-22 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664543#comment-13664543
 ] 

Viraj Bhat commented on PIG-3322:
-

Review board
https://reviews.apache.org/r/11333/

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Attachments: expected_testLoadAvrowithNulls.txt, 
> test_loadavrowithnulls.avro
>
>



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-22 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Fix Version/s: 0.12

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12
>
> Attachments: expected_testLoadAvrowithNulls.txt, 
> test_loadavrowithnulls.avro
>
>



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-22 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: expected_testLoadAvrowithNulls.txt

Expected File generated from the testcase

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Attachments: expected_testLoadAvrowithNulls.txt, 
> test_loadavrowithnulls.avro
>
>



[jira] [Updated] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-22 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3322:


Attachment: test_loadavrowithnulls.avro

Avro file used for the TestAvroStorage.java

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Attachments: expected_testLoadAvrowithNulls.txt, 
> test_loadavrowithnulls.avro
>
>



[jira] [Commented] (PIG-2330) Problem in org.apache.pig.piggybank.storage.avro.AvroStorage when storing a record with a single field.

2013-05-20 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662567#comment-13662567
 ] 

Viraj Bhat commented on PIG-2330:
-

Hi,
 The issue here is not related to PIG-3322. The one-line fix should solve the 
above problem.

Consider changing the script to add TOTUPLE. The script below works and 
generates the following output:
{code}
A = load 'input.txt' AS (name1:chararray, name2:chararray);
B = foreach A generate TOTUPLE($0);
dump B;
store B into 'singlefieldoutput' using
org.apache.pig.piggybank.storage.avro.AvroStorage('{"schema": {"type":
"record", "name": "main", "fields": [{"name": "name", "type": ["null",
"string"]}]}}')
{code}

Output
{noformat}
((Viraj))
((Roh))
((Govind))
{noformat}

The table provided at https://cwiki.apache.org/PIG/avrostorage.html shows that 
it is possible to convert from a Pig tuple to an Avro record, as both are sets 
of ordered fields. But it is not possible to convert from "chararray" to 
"record". In Pig you cannot generate a bare chararray; it is always wrapped in 
a tuple.

Now try loading the output generated by the script above.

{code}
A = load 'singlefieldoutput' using 
org.apache.pig.piggybank.storage.avro.AvroStorage();
describe A;
dump A;
{code}


Now we see the following:
{noformat}
(Viraj)
(Roh)
(Govind)
{noformat}

This is different from the output of "dump B".
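
The difference between the two dumps can be mimicked with a toy model (Python, standard library only). It assumes that storing maps the single tuple in each row onto a record by pairing the tuple's ordered fields with the schema's field names, and that loading reads the record back as a flat row. The function names are hypothetical and this is not AvroStorage code.

```python
def to_avro_record(row, field_names):
    """Map a row that wraps exactly one tuple onto a record (a dict)."""
    (inner,) = row  # e.g. (("Viraj",),) -> ("Viraj",)
    return dict(zip(field_names, inner))


def from_avro_record(record, field_names):
    """Read a record back as a flat Pig-style row."""
    return tuple(record[name] for name in field_names)


# Rows as printed by "dump B": each row holds one tuple.
dumped_b = [(("Viraj",),), (("Roh",),), (("Govind",),)]

records = [to_avro_record(row, ["name"]) for row in dumped_b]
loaded_a = [from_avro_record(rec, ["name"]) for rec in records]

print(loaded_a)
# → [('Viraj',), ('Roh',), ('Govind',)]
```

In this model the outer tuple level is consumed by the tuple-to-record conversion, which is why the rows loaded back have one level of nesting fewer than the rows that were stored.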

Viraj

> Problem in org.apache.pig.piggybank.storage.avro.AvroStorage when storing a 
> record with a single field.
> ---
>
> Key: PIG-2330
> URL: https://issues.apache.org/jira/browse/PIG-2330
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.9.0
>Reporter: Stan Rosenberg
> Attachments: AvroStorage.patch, input.txt
>
>
> Running the following script yields a RuntimeException.  If the schema is 
> changed to contain two fields, then A can be stored successfully.
> {noformat}
> REGISTER 'piggybank.jar'
> REGISTER 'avro-1.5.4.jar'
> REGISTER 'json-simple-1.1.jar'
> A = load 'input.txt' AS (name1:chararray, name2:chararray);
> B = foreach A generate $0;
> store B into './output' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage(
> '{"schema": {"type": "record", "name": "main", "fields": [{"name": "name", 
> "type": ["null", "string"]}]}}');
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-20 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662558#comment-13662558
 ] 

Viraj Bhat commented on PIG-3323:
-

Hi Scott,
 Thanks for explaining how default values work. The documentation on this is
limited. BTW, I have opened PIG-3331, which I think is valid. Please let me
know if it is not.
Regards
Viraj

> AVRO: default value not stored in file when given as paramter to AvroStorage
> 
>
> Key: PIG-3323
> URL: https://issues.apache.org/jira/browse/PIG-3323
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> A pig script like the below succeeds, but inspecting the resulting file I 
> find that the schema is stripped of the default value specification.
> {code}
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> describe c2;
> dump c2;
> store c2 into ':OUTPATH:.intermediate_2' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"debug" : 5,
>"schema" : {  
>   "name" : "schema_2",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {
> "name" : "intnum100",
> "type" : [
>"null",
>"int"
> ],
> "default" : 0
>  }
>   ]
>}
> }
> ');
> {code}
> BTW, the documentation on https://cwiki.apache.org/PIG/avrostorage.html is 
> mute on the subject of defaults, so first question is: is my expectation that 
> the default is to be written to file not correct?



Re: Review Request: Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-20 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated May 21, 2013, 1:05 a.m.)


Review request for pig.


Changes
---

Sorry for the spam. Hopefully no more white spaces have escaped my attention.


Description
---

Default values are not honoured when merging default schema


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1484564 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



Re: Review Request: Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-20 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated May 21, 2013, 1 a.m.)


Review request for pig.


Changes
---

Removed extra white spaces which escaped my attention and minor formatting 
changes.


Description
---

Default values are not honoured when merging default schema


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1484564 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



Re: Review Request: Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-20 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated May 21, 2013, 12:42 a.m.)


Review request for pig.


Changes
---

Removed extra white spaces.


Description
---

Default values are not honoured when merging default schema


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1484564 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



Re: Review Request: Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-20 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

(Updated May 21, 2013, 12:23 a.m.)


Review request for pig.


Changes
---

Removed Tabs and rebased patch with PIG-3321


Description
---

Default values are not honoured when merging default schema


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
 1484564 
  
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
 1484564 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



Re: Review Request: Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-20 Thread Viraj Bhat


> On May 15, 2013, 12:19 a.m., Rohini Palaniswamy wrote:
> > Please fix formatting - spaces instead of tabs and no extra white spaces. 
> > This patch will conflict with PIG-3321. Can you merge the changes once that 
> > is committed and upload a new patch?

Hi Rohini,
 I have removed all the tabs and merged PIG-3321. Resubmitting again.
Viraj


- Viraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/#review20552
---


On May 14, 2013, 1:09 a.m., Viraj Bhat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11135/
> ---
> 
> (Updated May 14, 2013, 1:09 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Description
> ---
> 
> Default values are not honoured when merging default schema
> 
> 
> This addresses bug PIG-3318.
> https://issues.apache.org/jira/browse/PIG-3318
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
>  1481245 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
>  1481245 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java
>  1481245 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
>  1481245 
>   
> http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  1481245 
> 
> Diff: https://reviews.apache.org/r/11135/diff/
> 
> 
> Testing
> ---
> 
> Yes
> 
> 
> Thanks,
> 
> Viraj Bhat
> 
>



[jira] [Updated] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-05-20 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3331:


Description: 
Script which stores Avro using a predefined schema does not store the default 
values in the file
{code}
a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
float,doublenum: double);

b2 = foreach a generate id, intnum5, intnum100;

c2 = filter b2 by 110 <= id and id < 120;

STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
{  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
"type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
"default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
');
{code}

Opening the file shows the following schema
{noformat} 
avro.schema
{"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
{noformat} 

There seems to be a problem storing the schema.
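
The comparison above can also be checked mechanically. What follows is a
minimal Python sketch and not part of the original report: it uses only the
standard json module, and the helper name dropped_defaults is hypothetical. It
compares the schema passed to AvroStorage with the schema found in the written
file and lists the fields whose default was lost.

```python
import json

# The "schema" value passed to AvroStorage in the script above.
declared = json.loads("""
{"name": "rmyrecord", "type": "record", "fields": [
  {"name": "id", "type": "int", "default": 0},
  {"name": "intnum5", "type": "int", "default": 0},
  {"name": "intnum100", "type": "int", "default": 0}]}
""")

# The schema found under the avro.schema key of the written file.
stored = json.loads("""
{"type":"record","name":"rmyrecord","fields":[
  {"name":"id","type":"int"},
  {"name":"intnum5","type":"int"},
  {"name":"intnum100","type":"int"}]}
""")


def dropped_defaults(declared_schema, stored_schema):
    """Return names of fields that declare a default but lost it when stored."""
    stored_fields = {f["name"]: f for f in stored_schema["fields"]}
    return [
        f["name"]
        for f in declared_schema["fields"]
        if "default" in f and "default" not in stored_fields.get(f["name"], {})
    ]


print(dropped_defaults(declared, stored))  # ['id', 'intnum5', 'intnum100']
```

All three fields are reported, which matches the observation that none of the
declared defaults made it into the file.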
Viraj

  was:
Script which stores Avro using a predefined schema does not store the default 
values in the file
{code}
a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
float,doublenum: double);

b2 = foreach a generate id, intnum5, intnum100;

c2 = filter b2 by 110 <= id and id < 120;

STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
{  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
"type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
"default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
');
{code}

Opening the file shows the following schema
{quote}
avro.schema
{"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
{quote}

There seems to be a problem storing the schema.
Viraj


> Default values not stored in avro file when using specific schemas during 
> store in AvroStorage
> --
>
> Key: PIG-3331
> URL: https://issues.apache.org/jira/browse/PIG-3331
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.1
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.11.2
>
>
> Script which stores Avro using a predefined schema does not store the default 
> values in the file
> {code}
> a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
> int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
> {  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
> "type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
> "default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
> ');
> {code}
> Opening the file shows the following schema
> {noformat} 
> avro.schema
> {"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
> {noformat} 
> There seems to be a problem storing the schema.
> Viraj



[jira] [Created] (PIG-3331) Default values not stored in avro file when using specific schemas during store in AvroStorage

2013-05-20 Thread Viraj Bhat (JIRA)
Viraj Bhat created PIG-3331:
---

 Summary: Default values not stored in avro file when using 
specific schemas during store in AvroStorage
 Key: PIG-3331
 URL: https://issues.apache.org/jira/browse/PIG-3331
 Project: Pig
  Issue Type: Bug
  Components: piggybank
Affects Versions: 0.11.1
Reporter: Viraj Bhat
Assignee: Viraj Bhat
 Fix For: 0.11.2


Script which stores Avro using a predefined schema does not store the default 
values in the file
{code}
a = LOAD 'numbers.txt' USING PigStorage (':') as (intnum1000: int,id:
int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum:
float,doublenum: double);

b2 = foreach a generate id, intnum5, intnum100;

c2 = filter b2 by 110 <= id and id < 120;

STORE c2 INTO '/tmp/TestAvroStorage/testDefaultValueWrite' USING
org.apache.pig.piggybank.storage.avro.AvroStorage (' { "debug" : 5, "schema" :
{  "name" : "rmyrecord", "type" : "record",  "fields" : [ { "name" : "id",
"type" : "int" , "default" : 0 }, {  "name" : "intnum5",  "type" : "int",
"default" : 0 }, { "name" : "intnum100", "type" : "int", "default" : 0 } ] } }
');
{code}

Opening the file shows the following schema
{quote}
avro.schema
{"type":"record","name":"rmyrecord","fields":[{"name":"id","type":"int"},{"name":"intnum5","type":"int"},{"name":"intnum100","type":"int"}]}
{quote}

There seems to be a problem storing the schema.
Viraj



[jira] [Resolved] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat resolved PIG-3323.
-

Resolution: Invalid

> AVRO: default value not stored in file when given as paramter to AvroStorage
> 
>
> Key: PIG-3323
> URL: https://issues.apache.org/jira/browse/PIG-3323
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> A pig script like the below succeeds, but inspecting the resulting file I 
> find that the schema is stripped of the default value specification.
> {code}
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> describe c2;
> dump c2;
> store c2 into ':OUTPATH:.intermediate_2' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"debug" : 5,
>"schema" : {  
>   "name" : "schema_2",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {
> "name" : "intnum100",
> "type" : [
>"null",
>"int"
> ],
> "default" : 0
>  }
>   ]
>}
> }
> ');
> {code}
> BTW, the documentation on https://cwiki.apache.org/PIG/avrostorage.html is 
> mute on the subject of defaults, so first question is: is my expectation that 
> the default is to be written to file not correct?



[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661142#comment-13661142
 ] 

Viraj Bhat commented on PIG-3323:
-

One correction on my first comment:
Default values for union fields correspond to the first schema in the union 
according to the specification. So for the above use case posted by Egil, the 
final Output Schema should not contain the default value. 

In fact there is a bug in AvroStorage which does not write the default values 
of the individual fields. I will open another Jira and close this one.
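
The rule on union defaults mentioned above can be illustrated with a small
Python sketch. It is a simplification written for this comment, not the
validation code of the Avro library: the function names are hypothetical and
only primitive type names are handled. It checks whether a field's default
matches the first branch of its union.

```python
def matches_primitive(avro_type, value):
    """Check a JSON default against a primitive Avro type name (simplified)."""
    if avro_type == "null":
        return value is None
    if avro_type == "boolean":
        return isinstance(value, bool)
    if avro_type in ("int", "long"):
        return isinstance(value, int) and not isinstance(value, bool)
    if avro_type in ("float", "double"):
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if avro_type in ("string", "bytes"):
        return isinstance(value, str)
    raise ValueError("only primitive types are handled in this sketch")


def default_is_valid(field):
    """For a union, the default must match the first branch of the union."""
    field_type = field["type"]
    first = field_type[0] if isinstance(field_type, list) else field_type
    return matches_primitive(first, field["default"])


# The field from Egil's schema: the first branch is "null", the default is 0.
egil_field = {"name": "intnum100", "type": ["null", "int"], "default": 0}
# The same field with the union reordered so that "int" comes first.
reordered = {"name": "intnum100", "type": ["int", "null"], "default": 0}
# The original union order with a null default.
null_default = {"name": "intnum100", "type": ["null", "int"], "default": None}

print(default_is_valid(egil_field))    # False
print(default_is_valid(reordered))     # True
print(default_is_valid(null_default))  # True
```

Under this reading, a default of 0 on a ["null", "int"] union is not a valid
default, while reordering the union or using a null default would be.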

Viraj

> AVRO: default value not stored in file when given as paramter to AvroStorage
> 
>
> Key: PIG-3323
> URL: https://issues.apache.org/jira/browse/PIG-3323
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> A pig script like the below succeeds, but inspecting the resulting file I 
> find that the schema is stripped of the default value specification.
> {code}
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> describe c2;
> dump c2;
> store c2 into ':OUTPATH:.intermediate_2' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"debug" : 5,
>"schema" : {  
>   "name" : "schema_2",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {
> "name" : "intnum100",
> "type" : [
>"null",
>"int"
> ],
> "default" : 0
>  }
>   ]
>}
> }
> ');
> {code}
> BTW, the documentation on https://cwiki.apache.org/PIG/avrostorage.html is 
> mute on the subject of defaults, so first question is: is my expectation that 
> the default is to be written to file not correct?



[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661019#comment-13661019
 ] 

Viraj Bhat commented on PIG-3323:
-

Spoke to Egil offline:
His original comments were:
1) Should default value be written to a file?
Ans) It should be, if it is specified for a valid complex type.

2) Should Default schema specification be written to the file's metadata?
Ans) It should be, if it is valid for that complex type. Since a union does not
support a default, it was not written out. But we need to see how default
schemas work for other data types.

Viraj

> AVRO: default value not stored in file when given as paramter to AvroStorage
> 
>
> Key: PIG-3323
> URL: https://issues.apache.org/jira/browse/PIG-3323
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> A pig script like the below succeeds, but inspecting the resulting file I 
> find that the schema is stripped of the default value specification.
> {code}
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> describe c2;
> dump c2;
> store c2 into ':OUTPATH:.intermediate_2' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"debug" : 5,
>"schema" : {  
>   "name" : "schema_2",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {
> "name" : "intnum100",
> "type" : [
>"null",
>"int"
> ],
> "default" : 0
>  }
>   ]
>}
> }
> ');
> {code}
> BTW, the documentation on https://cwiki.apache.org/PIG/avrostorage.html is 
> mute on the subject of defaults, so first question is: is my expectation that 
> the default is to be written to file not correct?



[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660959#comment-13660959
 ] 

Viraj Bhat commented on PIG-3323:
-

Hi Egil,
 I looked at the specification of unions and default values, and at the source
code in "PigAvroDatumWriter".
The field "intnum100" is a UNION of "null" and "int", so its value can be a
"null" or an "int".
That means that if Pig does not find a value for "intnum100" in the step
before the store, it will generate null, which is perfectly acceptable here. So
the default value makes no sense here if the item does not exist.
Also, if you remove "null" from the specification of "intnum100" and hope the
default value is written out, there is another problem:

According to the specification for Unions
http://avro.apache.org/docs/current/spec.html#Unions and the
section on default values
http://avro.apache.org/docs/current/spec.html#schema_complex
a union does not have any default value.

Closing as INVALID.
Regards
Viraj

> AVRO: default value not stored in file when given as paramter to AvroStorage
> 
>
> Key: PIG-3323
> URL: https://issues.apache.org/jira/browse/PIG-3323
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> A pig script like the below succeeds, but inspecting the resulting file I 
> find that the schema is stripped of the default value specification.
> {code}
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> describe c2;
> dump c2;
> store c2 into ':OUTPATH:.intermediate_2' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"debug" : 5,
>"schema" : {  
>   "name" : "schema_2",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {
> "name" : "intnum100",
> "type" : [
>"null",
>"int"
> ],
> "default" : 0
>  }
>   ]
>}
> }
> ');
> {code}
> BTW, the documentation on https://cwiki.apache.org/PIG/avrostorage.html is 
> mute on the subject of defaults, so first question is: is my expectation that 
> the default is to be written to file not correct?



[jira] [Reopened] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-16 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat reopened PIG-3322:
-


Hi Egil,
 The issue here is that the field "t" in the original data set
"studentcomplextab10k" contains nulls:
(fred hernandez,73,1.87)
(fred hernandez,20,2.11)

(calvin allen,60,2.49)
(yuri zipper,76,2.05)


So when this is stored via AvroStorage, nulls are stored for those records.

When you read back the Avro file written by the previous store, it fails with a
null pointer exception.

The snippet below works without any problems.
{code}
a = load 'studentcomplextab10k' using PigStorage() as (m:[], t:(name:chararray, 
age:int, gpa:double), b:{t:(name:chararray, age:int, gpa:double)});
b = foreach a generate t;
c = filter b by t is not null;
store c into 'singltupleavronotnull' USING 
org.apache.pig.piggybank.storage.avro.AvroStorage();
exec;
b = load 'singltupleavronotnull' USING 
org.apache.pig.piggybank.storage.avro.AvroStorage();
describe b;
dump b;
{code}

Kindly note: This issue is different from PIG-2330 


> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
>
> I am getting NPE when loading a file with AvroStorage a file that has schema 
> like:
> {code}
> ["null",{"type":"record","name":"TUPLE_0","fields":[{"name":"name","type":["null","string"],"doc":"autogenerated
>  from Pig Field 
> Schema"},{"name":"age","type":["null","int"],"doc":"autogenerated from Pig 
> Field Schema"},{"name":"gpa","type":["null","double"],"doc":"autogenerated 
> from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on 
> conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}



[jira] [Resolved] (PIG-3320) AVRO: no empty field expressed when loading with AvroStorage using reader schema with extra field that has no default

2013-05-16 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat resolved PIG-3320.
-

Resolution: Invalid

> AVRO: no empty field expressed when loading with AvroStorage using reader 
> schema with extra field that has no default
> -
>
> Key: PIG-3320
> URL: https://issues.apache.org/jira/browse/PIG-3320
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> Somewhat different use case than PIG-3318:
> Loading with AvroStorage, giving a loader schema that, relative to the schema
> in the Avro file, had an extra field w/o default, I expected to see an extra
> empty column, but the schema is as in the Avro file w/o the extra column.
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 2,
> # storing using writer schema
> # loading using reader schema with extra field that 
> has no default
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> -- Store Avro file w. schema
> b1 = foreach a generate id, intnum5;
> c1 = filter b1 by 10 <= id and id < 20;
> describe c1;
> dump c1;
> store c1 into ':OUTPATH:.intermediate_1' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"schema" : {  
>   "name" : "schema_writing",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"int"
> ]
>  }
>   ]
>}
> }
> ');
> exec;
> -- Read back what was stored with Avro adding extra field to reader schema
> u = load ':OUTPATH:.intermediate_1' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"debug" : 5,
>"schema" : {  
>   "name" : "schema_reading",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"string"
> ]
>  },
>  {
> "name" : "intnum100",
> "type" : [
>"null",
>"int"
> ]
>  }
>   ]
>}
> }
> ');
> describe u;
> dump u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> b = filter a by (10 <= id and id < 20);
> c = foreach b generate id, intnum5, '';
> store c into ':OUTPATH:';
> \,
> },
> {code}



[jira] [Commented] (PIG-3320) AVRO: no empty field expressed when loading with AvroStorage using reader schema with extra field that has no default

2013-05-16 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659803#comment-13659803
 ] 

Viraj Bhat commented on PIG-3320:
-

With PIG-3321 committed, the script above throws the error listed in 
Comment 2 of this Jira.

If we wanted AvroStorage() to return the extra field "intnum100" as null 
instead of throwing the error in Comment 2, we would have to do the following:
1) Pass a null reader schema to PigAvroDatumReader
2) Construct an mProtoTuple whose field count equals that of readerSchema
3) Reconcile the schemas manually, using the logic in 
getSchemaToMergedSchemaMap() 
4) Populate mProtoTuple using a map that tracks new-to-old field positions

Doing all of the above would undo the changes made in PIG-3321, because the 
readerSchema would no longer be passed to PigAvroDatumReader(). We want Avro to 
handle the schema merging in this case, and it does so correctly by throwing an 
error.

Currently closing this Jira as invalid.
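
The manual reconciliation described in steps 2 to 4 can be sketched as follows. This is only an illustration in Python, not piggybank code: `new_to_old_map` and `populate_proto_tuple` are hypothetical names standing in for the getSchemaToMergedSchemaMap() logic and the mProtoTuple population, and schemas are modeled as plain lists of field names.

```python
def new_to_old_map(writer_fields, reader_fields):
    """Map each reader (new) position to the writer (old) position, if any."""
    writer_pos = {name: i for i, name in enumerate(writer_fields)}
    return {new: writer_pos[name]
            for new, name in enumerate(reader_fields)
            if name in writer_pos}


def populate_proto_tuple(record, writer_fields, reader_fields):
    """Build a tuple sized to the reader schema; unmatched fields stay None."""
    proto = [None] * len(reader_fields)                      # step 2
    mapping = new_to_old_map(writer_fields, reader_fields)   # step 3
    for new, old in mapping.items():                         # step 4
        proto[new] = record[old]
    return tuple(proto)


writer = ["id", "intnum5"]
reader = ["id", "intnum5", "intnum100"]
print(populate_proto_tuple([10, 3], writer, reader))  # (10, 3, None)
```

The extra reader field simply keeps its initial None, which is the "extra empty column" the reporter expected. The sketch ignores type differences between the two schemas.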

> AVRO: no empty field expressed when loading with AvroStorage using reader 
> schema with extra field that has no default
> -
>
> Key: PIG-3320
> URL: https://issues.apache.org/jira/browse/PIG-3320
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> Somewhat different use case than PIG-3318:
> Loading with AvroStorage, I gave a loader schema that, relative to the schema 
> in the Avro file, had an extra field w/o default. I expected to see an extra 
> empty column, but the schema is as in the Avro file, w/o the extra column.
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 2,
> # storing using writer schema
> # loading using reader schema with extra field that has no default
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> -- Store Avro file w. schema
> b1 = foreach a generate id, intnum5;
> c1 = filter b1 by 10 <= id and id < 20;
> describe c1;
> dump c1;
> store c1 into ':OUTPATH:.intermediate_1' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"schema" : {  
>   "name" : "schema_writing",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"int"
> ]
>  }
>   ]
>}
> }
> ');
> exec;
> -- Read back what was stored with Avro adding extra field to reader schema
> u = load ':OUTPATH:.intermediate_1' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>"debug" : 5,
>"schema" : {  
>   "name" : "schema_reading",
>   "type" : "record",
>   "fields" : [
>  {  
> "name" : "id",
> "type" : [
>"null",
>"int"
> ]
>  },
>  {  
> "name" : "intnum5",
> "type" : [
>"null",
>"string"
> ]
>  },
>  {
> "name" : "intnum100",
> "type" : [
>"null",
>"int"
> ]
>  }
>   ]
>}
> }
> ');
> describe u;
> dump u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: 
> int,id: int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
> float,doublenum: double);
> b = filter a by (10 <= id and id < 20);
> c = foreach b generate id, intnum5, '';
> store c into ':OUTPATH:';
> \,
> },
> {code}



[jira] [Commented] (PIG-3320) AVRO: no empty field expressed when loading with AvroStorage using reader schema with extra field that has no default

2013-05-15 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658808#comment-13658808
 ] 

Viraj Bhat commented on PIG-3320:
-

Hi Rohini,
The error in the 2nd comment occurs with PIG-3321 applied.
Viraj




[jira] [Commented] (PIG-3320) AVRO: no empty field expressed when loading with AvroStorage using reader schema with extra field that has no default

2013-05-15 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658680#comment-13658680
 ] 

Viraj Bhat commented on PIG-3320:
-

Hi all, 
What I found is that if you supply a user-defined schema that differs from 
the schema the actual data contains, no reconciliation happens. In fact, we 
would have to reconcile it on a case-by-case basis, using the same logic that 
multiple_schemas uses.

After changing part of the source code to read the user-defined schema, the 
script throws the following error. I think this is valid, considering that the 
script previously passed and returned results with no extra column.

java.lang.Exception: java.io.IOException: org.apache.avro.AvroTypeException: 
Found {
  "type" : "record",
  "name" : "schema_writing",
  "fields" : [ {
"name" : "id",
"type" : [ "null", "int" ]
  }, {
"name" : "intnum5",
"type" : [ "null", "int" ]
  } ]
}, expecting {
  "type" : "record",
  "name" : "schema_reading",
  "fields" : [ {
"name" : "id",
"type" : [ "null", "int" ]
  }, {
"name" : "intnum5",
"type" : [ "null", "string" ]
  }, {
"name" : "intnum100",
"type" : [ "null", "int" ]
  } ]
}
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:399)
Caused by: java.io.IOException: org.apache.avro.AvroTypeException: Found {
  "type" : "record",
  "name" : "schema_writing",
  "fields" : [ {
"name" : "id",
"type" : [ "null", "int" ]
  }, {
"name" : "intnum5",
"type" : [ "null", "int" ]
  } ]
}, expecting {
  "type" : "record",
  "name" : "schema_reading",
  "fields" : [ {
"name" : "id",
"type" : [ "null", "int" ]
  }, {
"name" : "intnum5",
"type" : [ "null", "string" ]
  }, {
"name" : "intnum100",
"type" : [ "null", "int" ]
  } ]
}
at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:370)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:194)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:497)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:231)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
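
To see why Avro rejects this writer/reader pair, the two schemas from the error can be compared with a small static check. This is a simplified sketch in Python, not the Avro library's resolver: `resolution_problems` is a hypothetical helper, it handles only primitive types and unions of primitives, and it flags any non-readable writer branch up front, whereas Avro resolves unions per value at read time. The promotion table follows the Avro specification's schema resolution rules.

```python
# Promotions permitted by the Avro specification's schema resolution rules.
PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
    "string": {"bytes"},
    "bytes": {"string"},
}


def branches(avro_type):
    """Treat a non-union type as a one-branch union."""
    return avro_type if isinstance(avro_type, list) else [avro_type]


def readable(writer_branch, reader_type):
    """True if a value written as writer_branch fits some reader branch."""
    return any(r == writer_branch or r in PROMOTIONS.get(writer_branch, ())
               for r in branches(reader_type))


def resolution_problems(writer, reader):
    """List (field, reason) pairs that prevent reading writer data as reader."""
    written = {f["name"]: f["type"] for f in writer["fields"]}
    problems = []
    for field in reader["fields"]:
        name = field["name"]
        if name not in written:
            if "default" not in field:
                problems.append((name, "missing in writer, no default"))
            continue
        for w in branches(written[name]):
            if not readable(w, field["type"]):
                problems.append((name, f"{w} not readable"))
    return problems


writer = {"fields": [
    {"name": "id", "type": ["null", "int"]},
    {"name": "intnum5", "type": ["null", "int"]},
]}
reader = {"fields": [
    {"name": "id", "type": ["null", "int"]},
    {"name": "intnum5", "type": ["null", "string"]},
    {"name": "intnum100", "type": ["null", "int"]},
]}
for problem in resolution_problems(writer, reader):
    print(problem)
```

For the schemas above, the check reports two issues: "intnum5" is written as int but the reader only accepts null or string, and "intnum100" is absent from the writer schema and has no default in the reader schema. A union with "null" does not by itself give a field a default; the reader schema would need an explicit "default" : null for that.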

Regards
Viraj




[jira] [Commented] (PIG-3320) AVRO: no empty field expressed when loading with AvroStorage using reader schema with extra field that has no default

2013-05-14 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658093#comment-13658093
 ] 

Viraj Bhat commented on PIG-3320:
-

It seems that the schema specified at load time is stored in 
"outputAvroSchema" but is not used when reading the underlying data. It is 
used when writing out the data.
PIG-3321 will make it possible to use this schema when reading the data, but 
we still need to investigate whether it fixes the above problem. 




[jira] [Commented] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-14 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658092#comment-13658092
 ] 

Viraj Bhat commented on PIG-3322:
-

I meant PIG-3320 ..

> AVRO: AvroStorage give NPE on reading file with union as top level schema
> -
>
> Key: PIG-3322
> URL: https://issues.apache.org/jira/browse/PIG-3322
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Egil Sorensen
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> I am getting an NPE when using AvroStorage to load a file that has a schema 
> like:
> {code}
> ["null",{"type":"record","name":"TUPLE_0","fields":[
>   {"name":"name","type":["null","string"],"doc":"autogenerated from Pig Field Schema"},
>   {"name":"age","type":["null","int"],"doc":"autogenerated from Pig Field Schema"},
>   {"name":"gpa","type":["null","double"],"doc":"autogenerated from Pig Field Schema"}]}]
> {code}
> E.g. see the e2e style test, which fails on this:
> {code}
> {
> 'num' => 4,
> # storing file with Pig type tuple relying on conversion to record
> # loading using stored schemas 
> 'notmq' => 1,
> 'pig' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> exec;
> -- Read back what was stored with Avro
> u = load ':OUTPATH:.intermediate' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> describe u;
> store u into ':OUTPATH:';
> \,
> 'verify_pig_script' => q\
> a = load ':INPATH:/singlefile/studentcomplextab10k' using PigStorage() as 
> (m:[], t:(name:chararray, age:int, gpa:double), b:{t:(name:chararray, 
> age:int, gpa:double)});
> b = foreach a generate t;
> describe b;
> store b into ':OUTPATH:';
> \,
> },
> {code}
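
To make the shape of the problematic schema concrete: its top level is a union, and the record is the second branch. The sketch below, in Python, shows one way a reader that expects a top-level record could look inside such a union. It is only an illustration under that assumption; `unwrap_nullable_record` is a hypothetical helper and says nothing about how AvroStorage handles or should handle this case.

```python
import json

SCHEMA_JSON = """
["null", {"type": "record", "name": "TUPLE_0", "fields": [
  {"name": "name", "type": ["null", "string"]},
  {"name": "age",  "type": ["null", "int"]},
  {"name": "gpa",  "type": ["null", "double"]}]}]
"""


def unwrap_nullable_record(schema):
    """Return the record inside a ["null", record] union, else the schema itself."""
    if isinstance(schema, list):
        non_null = [s for s in schema if s != "null"]
        if (len(non_null) == 1 and isinstance(non_null[0], dict)
                and non_null[0].get("type") == "record"):
            return non_null[0]
    return schema


record = unwrap_nullable_record(json.loads(SCHEMA_JSON))
print(record["name"], [f["name"] for f in record["fields"]])
# TUPLE_0 ['name', 'age', 'gpa']
```

Code that indexes straight into the top-level schema's fields without such a check would find no record there, which is one plausible way a null could arise; the actual cause of the NPE is not stated in this report.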



[jira] [Commented] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-14 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658091#comment-13658091
 ] 

Viraj Bhat commented on PIG-3322:
-

Sorry the above comment was intended for PIG-3220
Viraj




[jira] [Commented] (PIG-3322) AVRO: AvroStorage give NPE on reading file with union as top level schema

2013-05-14 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657784#comment-13657784
 ] 

Viraj Bhat commented on PIG-3322:
-

It seems that the schema specified at load time is stored in 
"outputAvroSchema" but is not used when reading the underlying data. It is 
used when writing out the data. 
PIG-3321 will make it possible to use this schema when reading the data, but 
we still need to investigate whether it fixes the above problem. 




Review Request: Patch to address default values when schemas are merged in AvroStorage. It does this for Records containing primitive values

2013-05-13 Thread Viraj Bhat

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11135/
---

Review request for pig.


Description
---

Default values are not honoured when schemas are merged in AvroStorage.


This addresses bug PIG-3318.
https://issues.apache.org/jira/browse/PIG-3318


Diffs
-

  
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java 1481245 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java 1481245 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java 1481245 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java 1481245 
  http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java 1481245 

Diff: https://reviews.apache.org/r/11135/diff/


Testing
---

Yes


Thanks,

Viraj Bhat



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-05-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


  Tags: AvroStorage
Labels: patch  (was: )
Status: Patch Available  (was: Open)

Patch for adding default values for merged schemas.

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
>  Labels: patch
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasDefault1.avro, PIG-3118.0.11.patch, PIG-3318.patch
>
>
> Piggybank - AvroStorage: when merging multiple schemas in which default values 
> have been specified, AvroStorage puts nulls in the merged data set instead of 
> the defaults. 
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj
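
The behaviour this issue asks for can be sketched in a few lines of Python. This is an illustration of the expected result, not the piggybank patch: `merged_defaults` and `fill_defaults` are hypothetical helpers, records are modeled as dicts, and the merged field order is copied from the "Output Schema" in the report rather than computed.

```python
EMPLOYEE6_FIELDS = [  # the widest of the three schemas in the report
    {"name": "name", "type": "string", "default": "NU"},
    {"name": "lastname", "type": "string", "default": "LNU"},
    {"name": "age", "type": "int", "default": 0},
    {"name": "salary", "type": "int", "default": 0},
    {"name": "dept", "type": "string", "default": "DU"},
    {"name": "office", "type": "string", "default": "OU"},
]

# Field order as shown in the reported output schema.
MERGED_ORDER = ["name", "age", "dept", "lastname", "salary", "office"]


def merged_defaults(fields):
    """Collect the declared default for every field that has one."""
    return {f["name"]: f["default"] for f in fields if "default" in f}


def fill_defaults(record, order, defaults):
    """Emit a merged-schema tuple, using defaults for fields the record lacks."""
    return tuple(record.get(name, defaults.get(name)) for name in order)


defaults = merged_defaults(EMPLOYEE6_FIELDS)
milo = {"name": "Milo", "age": 30, "dept": "DH"}                          # Employee3 row
praj = {"name": "Praj", "age": 54, "dept": "RMX", "office": "Champaign"}  # Employee4 row
print(fill_defaults(milo, MERGED_ORDER, defaults))  # ('Milo', 30, 'DH', 'LNU', 0, 'OU')
print(fill_defaults(praj, MERGED_ORDER, defaults))  # ('Praj', 54, 'RMX', 'LNU', 0, 'Champaign')
```

With defaults honored, the rows that currently show empty columns, such as (Milo,30,DH,,,), would carry "LNU", 0 and "OU" instead of nulls.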



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-05-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: Employee6.ser

Avro test file




[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-05-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: expected_testMultipleSchemasDefault1.avro

Expected resulting avro file




[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-05-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: Employee4.ser

avro test file

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, Employee4.ser, Employee6.ser, 
> expected_testMultipleSchemasDefault1.avro, PIG-3118.0.11.patch, PIG-3318.patch
>
>
> Piggybank AvroStorage: when multiple schemas are merged and the Avro schemas 
> specify default values, AvroStorage puts nulls in the merged data set 
> instead of honoring those defaults. 
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj
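
For illustration, the behavior the report asks for can be sketched outside Pig. This is a minimal sketch in plain Python, not AvroStorage code: the helper names merge_fields and fill_defaults are hypothetical, the schemas are copied from the description above as dicts, and it assumes the Milo row comes from Employee3 and the Praj row from Employee4. It does not try to reproduce the column order AvroStorage emits.

```python
# Hypothetical sketch of "honor defaults when merging schemas".
# Schemas mirror the Avro JSON quoted in the issue description.
EMPLOYEE3 = {"type": "record", "name": "employee", "fields": [
    {"name": "name", "type": "string", "default": "NU"},
    {"name": "age", "type": "int", "default": 0},
    {"name": "dept", "type": "string", "default": "DU"}]}

EMPLOYEE4 = {"type": "record", "name": "employee", "fields": [
    {"name": "name", "type": "string", "default": "NU"},
    {"name": "age", "type": "int", "default": 0},
    {"name": "dept", "type": "string", "default": "DU"},
    {"name": "office", "type": "string", "default": "OU"}]}

EMPLOYEE6 = {"type": "record", "name": "employee", "fields": [
    {"name": "name", "type": "string", "default": "NU"},
    {"name": "lastname", "type": "string", "default": "LNU"},
    {"name": "age", "type": "int", "default": 0},
    {"name": "salary", "type": "int", "default": 0},
    {"name": "dept", "type": "string", "default": "DU"},
    {"name": "office", "type": "string", "default": "OU"}]}


def merge_fields(schemas):
    """Union of fields across schemas, keyed by name; first definition wins."""
    merged = {}
    for schema in schemas:
        for field in schema["fields"]:
            merged.setdefault(field["name"], field)
    return merged


def fill_defaults(record, merged):
    """Copy record, adding each missing merged field with its declared default."""
    out = {}
    for name, field in merged.items():
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            out[name] = None  # no default declared: null is all we can do
    return out


merged = merge_fields([EMPLOYEE3, EMPLOYEE4, EMPLOYEE6])
milo = fill_defaults({"name": "Milo", "age": 30, "dept": "DH"}, merged)
praj = fill_defaults(
    {"name": "Praj", "age": 54, "dept": "RMX", "office": "Champaign"}, merged)
print(milo)
print(praj)
```

Under these assumptions the Milo row would carry lastname LNU, salary 0 and office OU rather than the three empty values shown in the dump above.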



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-05-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: PIG-3118.0.11.patch

Patch for branch 0.11.2

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Viraj Bhat
> Assignee: Viraj Bhat
> Fix For: 0.12, 0.11.2
>
> Attachments: PIG-3118.0.11.patch, PIG-3318.patch
>
>
> Piggybank AvroStorage: when multiple schemas are merged and the Avro schemas 
> specify default values, AvroStorage puts nulls in the merged data set 
> instead of honoring those defaults. 
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj



[jira] [Updated] (PIG-3318) AVRO: 'default value' not honored when merging schemas on load with AvroStorage

2013-05-11 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-3318:


Attachment: Employee3.ser

Avro file

> AVRO: 'default value' not honored when merging schemas on load with 
> AvroStorage
> ---
>
> Key: PIG-3318
> URL: https://issues.apache.org/jira/browse/PIG-3318
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11.2
>Reporter: Viraj Bhat
>Assignee: Viraj Bhat
> Fix For: 0.12, 0.11.2
>
> Attachments: Employee3.ser, PIG-3118.0.11.patch, PIG-3318.patch
>
>
> Piggybank AvroStorage: when multiple schemas are merged and the Avro schemas 
> specify default values, AvroStorage puts nulls in the merged data set 
> instead of honoring those defaults. 
> ==> Employee3.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0 },
> {"name" : "dept", "type": "string", "default" : "DU"} ] }
> ==> Employee4.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "age", "type" : "int", "default" : 0},
> {"name" : "dept", "type": "string", "default" : "DU"},
> {"name" : "office", "type": "string", "default" : "OU"} ] }
> ==> Employee6.avro <==
> {
> "type" : "record",
> "name" : "employee",
> "fields":[
> {"name" : "name", "type" : "string", "default" : "NU"},
> {"name" : "lastname", "type": "string", "default" : "LNU"},
> {"name" : "age", "type" : "int","default" : 0},
> {"name" : "salary", "type": "int", "default" : 0},
> {"name" : "dept", "type": "string","default" : "DU"},
> {"name" : "office", "type": "string","default" : "OU"} ] }
> The pig script:
> employee = load 'employee{3,4,6}.ser' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> describe employee;
> dump employee;
> Output Schema:
> employee: {name: chararray,age: int,dept: chararray,lastname: 
> chararray,salary: int,office: chararray}
> (Milo,30,DH,,,)
> (Asmya,34,PQ,,,)
> (Baljit,23,RS,,,)
> (Pune,60,Astrophysics,Warriors,5466,UTA)
> (Rajsathan,20,Biochemistry,Royals,1378,Stanford)
> (Chennai,50,Microbiology,Superkings,7338,Hopkins)
> (Mumbai,20,Applied Math,Indians,4468,UAH)
> (Praj,54,RMX,,,Champaign)
> (Buba,767,HD,,,Sunnyvale)
> (Manku,375,MS,,,New York)
> Regards
> Viraj


