[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader

2009-09-15 Thread Alan Gates (JIRA)

[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755709#action_12755709 ]

Alan Gates commented on PIG-911:


I'm reviewing this patch.

 [Piggybank] SequenceFileLoader 
 ---

 Key: PIG-911
 URL: https://issues.apache.org/jira/browse/PIG-911
 Project: Pig
  Issue Type: New Feature
Reporter: Dmitriy V. Ryaboy
 Attachments: pig_911.2.patch, pig_sequencefile.patch


 The proposed piggybank contribution adds a SequenceFileLoader to the 
 piggybank.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader

2009-08-17 Thread Dmitriy V. Ryaboy (JIRA)

[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744343#action_12744343 ]

Dmitriy V. Ryaboy commented on PIG-911:
---

Concerning making this a StoreFunc as well: the StoreFunc interface is not 
very friendly to this.
All you get in the bind call is the output stream; for LoadFunc, you also get 
the name of the file (or, presumably, whatever the user passed in under the 
guise of a file name).  This means that for the LoadFunc, I was able to use 
the passed-in filename to back into a Path and a FileSystem.  I can't do the 
same for StoreFunc, where the filename is not available, only the output 
stream.  That means I can't create the appropriate SequenceFile.Writer.  Is 
there a way around this limitation that does not require special constructor 
parameters?
Is it possible to change the StoreFunc API to provide this information, or to 
make it available through some side channel (MapRedUtils or similar)?
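
To make the asymmetry concrete, here is a minimal, self-contained sketch. 
Nothing in it is taken from the patch or from Pig itself: StreamOnlyStore is a 
simplified stand-in for the store-side bind call described above, and 
LocationAwareStore and SequenceFileStoreSketch are hypothetical names showing 
one possible shape of the API change being asked about. java.net.URI is used 
only so that the sketch runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;

// Simplified stand-in for the store-side bind call described above:
// only the output stream is handed over.
interface StreamOnlyStore {
    void bindTo(OutputStream os) throws IOException;
}

// Hypothetical variant that also receives the user-supplied location.
interface LocationAwareStore {
    void bindTo(String location, OutputStream os) throws IOException;
}

// Hypothetical store function that records where it is writing. With Hadoop
// available, the location is what one would turn into a Path and a
// FileSystem in order to open a SequenceFile.Writer.
class SequenceFileStoreSketch implements LocationAwareStore {
    private URI target;
    private OutputStream out;

    @Override
    public void bindTo(String location, OutputStream os) throws IOException {
        if (location == null || location.isEmpty()) {
            throw new IOException("no output location supplied");
        }
        this.target = URI.create(location);
        this.out = os;
    }

    // The location captured in bindTo, or null if bindTo was never called.
    URI target() {
        return target;
    }
}
```

With only the StreamOnlyStore form, there is nothing from which target could 
be derived, which is the limitation described above.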




[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader

2009-08-17 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744373#action_12744373 ]

Hadoop QA commented on PIG-911:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12416830/pig_911.2.patch
  against trunk revision 804406.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/168/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/168/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/168/console

This message is automatically generated.




[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader

2009-08-12 Thread Alan Gates (JIRA)

[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742239#action_12742239 ]

Alan Gates commented on PIG-911:


Dmitriy,

First, this is great.  We've had requests to read sequence files.  Being able 
to write them as well would be great.

A few thoughts:

1) This should not extend UTF8StorageConverter.  This loader will be returning 
actual data types, not bytes that need to be interpreted.  I would think 
instead that it should implement the bytesToX() methods itself and just throw 
an exception saying it didn't expect to do any conversion.

2) getSampledTuple looks fine, provided skip gets the stream to a point where 
reading the next tuple is viable.

3) In the bindTo call, where you obtain the key and value by reflection, should 
there be a try/catch block in case the cast to Writable fails?  Similarly, in 
describe schema you're asking how to suppress warnings from the cast in 
reader.getKeyClass().  But don't you want to check that what you got really is 
a Writable, since there is no guarantee?
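
The check raised in point 3 could be sketched as follows. This is a minimal, 
self-contained illustration, not code from the patch: Writable here is a local 
stand-in for org.apache.hadoop.io.Writable, and TextKey, WritableCheck, and 
requireWritable are hypothetical names. The idea is to test the class returned 
by something like reader.getKeyClass() before narrowing it, so that neither an 
unchecked cast nor a suppressed warning is needed.

```java
import java.io.IOException;

// Local stand-in for org.apache.hadoop.io.Writable so the sketch compiles
// without Hadoop on the classpath.
interface Writable {}

// Hypothetical key type used only for illustration.
final class TextKey implements Writable {}

final class WritableCheck {
    private WritableCheck() {}

    /**
     * Returns cls narrowed to Class<? extends Writable>, or throws an
     * IOException naming the offending class if it is not a Writable.
     * The role argument ("key" or "value") only improves the message.
     */
    static Class<? extends Writable> requireWritable(Class<?> cls, String role)
            throws IOException {
        if (cls == null || !Writable.class.isAssignableFrom(cls)) {
            throw new IOException(role + " class "
                    + (cls == null ? "null" : cls.getName())
                    + " does not implement Writable");
        }
        // asSubclass is a checked narrowing, so no @SuppressWarnings is needed.
        return cls.asSubclass(Writable.class);
    }
}
```

A loader could call requireWritable once for the key class and once for the 
value class when it binds to the file, and fail early with a clear message 
instead of hitting a ClassCastException later.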






[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader

2009-08-12 Thread Dmitriy V. Ryaboy (JIRA)

[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742565#action_12742565 ]

Dmitriy V. Ryaboy commented on PIG-911:
---

Alan, 
Thanks for the feedback.

I'll add the try/catch.

Regarding the UTF8StorageConverter: I think I added that because, without it, 
the code broke if you didn't declare types in the schema at load time, that 
is, with

  a = load 'foo' using SequenceFileLoader() as (a, b)

instead of

  a = load 'foo' using SequenceFileLoader() as (a:chararray, b:double)

I'll figure out what exactly is going on with that and remove the 
UTF8StorageConverter.
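
The alternative to extending UTF8StorageConverter that was suggested in 
review, implementing the bytesToX() methods directly and having them throw, 
might look roughly like this. It is a sketch under stated assumptions rather 
than the patch's code: ByteConversions is a local stand-in showing only three 
of the conversion methods, and NoConversionLoader is a hypothetical name.

```java
import java.io.IOException;

// Local stand-in for the conversion methods of Pig's LoadFunc interface.
// Only three of the bytesToX() methods are shown.
interface ByteConversions {
    Integer bytesToInteger(byte[] b) throws IOException;
    Double bytesToDouble(byte[] b) throws IOException;
    String bytesToCharArray(byte[] b) throws IOException;
}

// Hypothetical fragment of a loader that already returns typed values and
// therefore treats any request to convert raw bytes as an error.
class NoConversionLoader implements ByteConversions {

    // Builds the exception shared by every conversion method.
    private static IOException unexpected(String target) {
        return new IOException(
                "SequenceFileLoader does not expect to convert bytes to " + target);
    }

    @Override
    public Integer bytesToInteger(byte[] b) throws IOException {
        throw unexpected("integer");
    }

    @Override
    public Double bytesToDouble(byte[] b) throws IOException {
        throw unexpected("double");
    }

    @Override
    public String bytesToCharArray(byte[] b) throws IOException {
        throw unexpected("chararray");
    }
}
```

If loading without declared types still fails after this change, the 
exception message at least shows which conversion was requested.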

Will add Store as time allows.






[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader

2009-08-06 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740290#action_12740290 ]

Hadoop QA commented on PIG-911:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12415673/pig_sequencefile.patch
  against trunk revision 801460.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/153/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/153/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/153/console

This message is automatically generated.
