> On Jan. 8, 2014, 11:27 p.m., Daniel Dai wrote:
> > Thanks Josh, this is a very well written code to make Accumulo available to 
> > Pig. I can make it work locally. I have two major comment:
> > 1. Bear in mind, Pig is good to manipulate tuple, Ok for map, and horrible 
> > for bag. A columnar data would be perfect. AccumuloWholeRowStorage does not 
> > seems pretty useful since it requires user to manipulate bag in Pig script. 
> > User will have to write a UDF for it. AccumuloKVStorage is fine, but it is 
> > also awkward to manipulate the flattened data. AccumuloStorage is the best 
> > style among these three, user can project map entries to get the columnar 
> > data. But personally, I would still prefer HBastStorage style in which user 
> > can specify the column he's interested in construct arguments. So I would 
> > suggest to improve AccumuloStorage (or a new class) to include that.
> > 
> > 2. caster is not right. You will need to return correct caster in 
> > getLoadCaster. Otherwise cast data into Pig type will fail. Eg:
> > a = load 'accumulo://....' using 
> > org.apache.pig.backend.hadoop.accumulo.AccumuloStorage() as (key:bytearray, 
> > m:map[]);
> > b = foreach a generate (double)m#'gpa';
> > I see you give Utf8StorageConverter as the converter in some of the 
> > loaders, but Utf8StorageConverter will assume numerical data in string 
> > style, and try to parse the string into numerical. I see there is 
> > objToBytes in both AbstractAccumuloStorage and Utils. These should be 
> > consolidated into a caster (refer to HBaseBinaryConverter)
> 
> Josh Elser wrote:
>     For #1, I agree. I was never really sure about using the WholeRowStorage 
> variant, I think it probably just introduces more complexity, and, like you 
> said, isn't a great example for Pig in the first place. I can probably add in 
> another constructor variant that lets you set a column projection. Perhaps 
> the "fetch_column" option should be removed from the URI completely?
>     
>     As a follow on, I want to reduce the args to that accumulo URI (probably 
> pulling some file off of the classpath) so the user doesn't always have to 
> re-type the same four args (user, pass, instance name and zookeeper hosts).
>     
>     For #2, I think I understand, but I haven't been able to generate a 
> failure given the little snippet you provided. I'll look at the HBaseStorage 
> and HBaseBinaryConverter as an example.

For #1, I would prefer put column specification in the construct, just like 
HBaseStorage did. 

For #2, the exception is trapped, you will not see failure but null value in 
the output.


- Daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16533/#review31404
-----------------------------------------------------------


On Dec. 31, 2013, 2:02 a.m., Josh Elser wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16533/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2013, 2:02 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-3573
>     https://issues.apache.org/jira/browse/PIG-3573
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Provides basic StoreFunc and LoadFunc implementations. Based off of code that 
> was in an Accumulo contrib project.
> 
> 
> Diffs
> -----
> 
>   build.xml 575c9ae 
>   ivy.xml 180eb2c 
>   ivy/libraries.properties 14abdf8 
>   src/org/apache/pig/backend/hadoop/accumulo/AbstractAccumuloStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/AccumuloKVStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/AccumuloStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/AccumuloWholeRowStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/Utils.java PRE-CREATION 
>   
> test/org/apache/pig/backend/hadoop/accumulo/AbstractAccumuloStorageTest.java 
> PRE-CREATION 
>   test/org/apache/pig/backend/hadoop/accumulo/AccumuloKVStorageTest.java 
> PRE-CREATION 
>   test/org/apache/pig/backend/hadoop/accumulo/AccumuloPigClusterTest.java 
> PRE-CREATION 
>   
> test/org/apache/pig/backend/hadoop/accumulo/AccumuloStorageConfigurationTest.java
>  PRE-CREATION 
>   test/org/apache/pig/backend/hadoop/accumulo/AccumuloStorageTest.java 
> PRE-CREATION 
>   
> test/org/apache/pig/backend/hadoop/accumulo/AccumuloWholeRowStorageTest.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/16533/diff/
> 
> 
> Testing
> -------
> 
> Local tests reading, writing and JOIN'ing Accumulo tables. Tested against 
> Hadoop-1.0.4 and 2.2.0, with Accumulo 1.5.0
> 
> 
> Thanks,
> 
> Josh Elser
> 
>

Reply via email to