> On Oct. 10, 2013, 8:50 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/impl/util/Utils.java, line 259
> > <https://reviews.apache.org/r/14552/diff/3/?file=362607#file362607line259>
> >
> >     We should throw an exception if the trimmed lowercased  string is 
> > something we don't expect.
> >     otherwise typos (seqFile, sequencefile, ...) will lead to unexpected 
> > behavior.
> >     
> >     the call becomes then:
> >     ...getProperty(...PIG_TEMP_FILE_COMPRESSION_STORAGE, "tfile");
> >     
> >     example message:
> >     PigConfiguration.PIG_TEMP_FILE_COMPRESSION_STORAGE +": expected storage 
> > seqfile or tfile (default tfile)"
> >     
> >
> 
> Aniket Mokashi wrote:
>     We need seqfile as an option. tfile can be used by default without any 
> extra properties flag.
> 
> Julien Le Dem wrote:
>     my point was that we should use the default when the property is not 
> defined not when it is set to a value we don't understand. In that case it 
> should fail.
> 
> Aniket Mokashi wrote:
>     got it. So, either property should not exist or property should be set to 
> tfile, (Instead of checking for "" string), correct?
>     Totally makes sense.

yep :)


> On Oct. 10, 2013, 8:50 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/impl/util/Utils.java, line 329
> > <https://reviews.apache.org/r/14552/diff/3/?file=362607#file362607line329>
> >
> >     else throw exception?
> >     we have a default if it's not set but we don't do anything if it's an 
> > invalid compression codec
> 
> Aniket Mokashi wrote:
>     Idea is to let user provide a codec through 
> "mapred.output.compression.codec" property. In case SequenceFile (Hadoop) 
> starts supporting more than gz, snappy, lzo and bzip2 in newer version of 
> hadoop. Note, when using seqfile, user can completely avoid setting 
> pig.tmpfilecompression.codec and use mapred.output.compression.codec if 
> wanted.
>     
>     We need this because TFileStorage uses pig's implementations of gz and 
> lzo instead of hadoop. Also, gzip codec is called gz by pig and gzip by 
> hadoop (making hard to be consistent and backwards compatible).
> 
> Julien Le Dem wrote:
>     sure. In particular my goal was to avoid doing nothing silently when 
> there is a typo in the setting. Those are hard to debug. In the current impl 
> not setting pig.tmpfilecompression.codec will force the DefaultCodec and 
> setting to something we don't understand will leave the codec unchanged. 
> hardly intuitive.
>     In general setting a property to something unknown should not fall back 
> to a default but fail with an explicit message.
> 
> Aniket Mokashi wrote:
>     Actually, thinking about it now, I think your comments make sense. I will 
> modify the code accordingly. Thanks a lot! :)
>     
>     So, new behavior:
>     codec.equals("") -> user mean to set it through the mapred property.
>     else -> misspelling etc.
>     This is definitely more intuitive.

thanks to you!


- Julien


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14552/#review26894
-----------------------------------------------------------


On Oct. 9, 2013, 6 p.m., Aniket Mokashi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14552/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2013, 6 p.m.)
> 
> 
> Review request for pig, Cheolsoo Park, Dmitriy Ryaboy, Julien Le Dem, and 
> Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3480
>     https://issues.apache.org/jira/browse/PIG-3480
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> - Added a new parameter to make SequenceFileInterStorage optional.
> - Added tests
> - Refactored apis
> 
> 
> Diffs
> -----
> 
>   trunk/conf/pig.properties 1530468 
>   trunk/src/org/apache/pig/PigConfiguration.java 1530468 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
>  1530468 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/WeightedRangePartitioner.java
>  1530468 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartitionRearrange.java
>  1530468 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/util/MapRedUtil.java 
> 1530468 
>   trunk/src/org/apache/pig/impl/io/InterStorage.java 1530468 
>   trunk/src/org/apache/pig/impl/io/SequenceFileInterStorage.java PRE-CREATION 
>   trunk/src/org/apache/pig/impl/io/TFileStorage.java 1530468 
>   trunk/src/org/apache/pig/impl/util/Utils.java 1530468 
>   trunk/test/org/apache/pig/test/TestTmpFileCompression.java 1530468 
> 
> Diff: https://reviews.apache.org/r/14552/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Aniket Mokashi
> 
>

Reply via email to