> On March 1, 2013, 2:54 p.m., Jonathan Coveney wrote:
> > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/LTSVLoader.java,
> >  line 157
> > <https://reviews.apache.org/r/9685/diff/1/?file=263710#file263710line157>
> >
> >     In the case where they do not give it a schema, I think that they 
> > shouldn't have to actually specify that it is a map. It's implicit. 
> > Furthermore, I think that instead of passing the Schema as an argument, if 
> > they do an "as (schema)" THEN it should attempt to return a Schema in 
> > accordance with what you provide.
> >     
> >     Open to argument, though.
> 
> Taku Miyakawa wrote:
>     > In the case where they do not give it a schema, I think that they 
> shouldn't have to actually specify that it is a map.
>     
>     Yes ":map[]" is redundant. I will remove it.
>     
>     > if they do an "as (schema)" THEN it should attempt to return a Schema 
> in accordance with what you provide.
>     
>     Does this code meet your point?
>     
>     <code>
>     access = LOAD 'access.log' USING 
> org.apache.pig.piggybank.storage.LTSVLoader() AS (host:chararray, 
> ua:chararray);
>     ua = FOREACH access GENERATE ua;
>     dump ua;
>     </code>
>     
>     It will be good, but I could not find a way to implement the function in 
> that manner. LoadFunc seems to have no method which can get an input schema 
> specified by users.
> 
> Cheolsoo Park wrote:
>     >> It will be good, but I could not find a way to implement the function 
> in that manner. LoadFunc seems to have no method which can get an input 
> schema specified by users.
>     
>     You don't need access to the schema given by the AS clause inside 
> LoadFunc. You can just make LoadMetaData.getSchema() return null and load 
> every field as bytearray in LoadFunc.getNext(). Then, Pig inserts a FOREACH 
> operator after LOAD and does type conversion using the schema given by the AS 
> clause.
>     
>     This is what happens when PigStorage is not given a JSON schema file, but 
> the output schema is defined in the AS clause.

Cheolsoo is totally correct. That said, this should probably be documented 
better. Also, the fact that we rely on an implicit foreach is so gross... but 
this is how you do it!


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9685/#review17239
-----------------------------------------------------------


On Feb. 28, 2013, 10:53 p.m., Taku Miyakawa wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9685/
> -----------------------------------------------------------
> 
> (Updated Feb. 28, 2013, 10:53 p.m.)
> 
> 
> Review request for pig.
> 
> 
> Description
> -------
> 
> This is a review board for  https://issues.apache.org/jira/browse/PIG-3215
> 
> The patch adds LTSVLoader function and its test class.
> 
> 
> This addresses bug PIG-3215.
>     https://issues.apache.org/jira/browse/PIG-3215
> 
> 
> Diffs
> -----
> 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/LTSVLoader.java
>  PRE-CREATION 
>   
> contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestLTSVLoader.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/9685/diff/
> 
> 
> Testing
> -------
> 
> ant compile-test
> cd contrib/piggybank/java/
> ant -Dtestcase=TestLTSVLoader test
> 
> 
> Thanks,
> 
> Taku Miyakawa
> 
>

Reply via email to