> On June 18, 2012, 6:02 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java,
> >  lines 147-150
> > <https://reviews.apache.org/r/4651/diff/4/?file=111761#file111761line147>
> >
> >     remove?

I'll put a more instructive comment, but the point is that eventually we may 
want to use Schematuple for the output as well...


> On June 18, 2012, 6:02 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java,
> >  line 155
> > <https://reviews.apache.org/r/4651/diff/4/?file=111761#file111761line155>
> >
> >     you could also have a method isFixedSize() in TupleFactory.

Not a bad idea. Although I do not think this actually solves the problem... you 
can have a TupleFactory that isn't a fixed size ie AppendableSchemaTuple, but 
appends don't start at index 0.


> On June 18, 2012, 6:02 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java,
> >  lines 194-196
> > <https://reviews.apache.org/r/4651/diff/4/?file=111761#file111761line194>
> >
> >     you could also do this at the same time you would have initialized the 
> > Schema based factory (line 144)
> >     if the assigment happens in the constructor you can even make 
> > inputTupleFactory final

The assignment doesn't happen in the constructor alas, it happens in 
instantiateFunc. This function is only called in the constructor, but I am 
dubious of refactoring it. I could, though.


> On June 18, 2012, 6:02 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/data/BinInterSedes.java, line 148
> > <https://reviews.apache.org/r/4651/diff/4/?file=111762#file111762line148>
> >
> >     where is this used?

readGenericTuple (which is in org.apache.pig.data.utils in SedesHelper) as well 
as addColstoTuple in BinInterSedes


> On June 18, 2012, 6:02 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/data/DataByteArray.java, lines 235-243
> > <https://reviews.apache.org/r/4651/diff/4/?file=111763#file111763line235>
> >
> >     
> > http://docs.oracle.com/javase/1.5.0/docs/api/java/util/Arrays.html#hashCode%28byte[]%29
> >     I'm not sure I get why these should use different primes...

Agreed


> On June 18, 2012, 6:02 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/data/TupleFactory.java, line 157
> > <https://reviews.apache.org/r/4651/diff/4/?file=111764#file111764line157>
> >
> >     I prefer just Schema as a parameter here.
> >     I'll look again when you add SchemaTupleFrontend and Backend

Yeah, take a look once it is added. There's no real getting around it if we 
want people to have control over when generated code is and is not used (ie yes 
to join optimizations, no to udf, and so on)


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4651/#review8332
-----------------------------------------------------------


On June 18, 2012, 6:05 p.m., Jonathan Coveney wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4651/
> -----------------------------------------------------------
> 
> (Updated June 18, 2012, 6:05 p.m.)
> 
> 
> Review request for pig and Julien Le Dem.
> 
> 
> Description
> -------
> 
> This work builds on Dmitriy's PrimitiveTuple work. The idea is that, knowing 
> the Schema on the frontend, we can code generate Tuples which can be used for 
> fun and profit. In rudimentary tests, the memory efficiency is 2-4x better, 
> and it's ~15% smaller serialized (heavily heavily depends on the data, 
> though). Need to do get/set tests, but assuming that it's on par (or even 
> faster) than Tuple, the memory gain is huge.
> 
> Need to clean up the code and add tests.
> 
> Right now, it generates a SchemaTuple for every inputSchema and outputSchema 
> given to UDF's. The next step is to make a SchemaBag, where I think the 
> serialization savings will be really huge.
> 
> Needs tests and comments, but I want the code to settle a bit.
> 
> 
> This addresses bug PIG-2632.
>     https://issues.apache.org/jira/browse/PIG-2632
> 
> 
> Diffs
> -----
> 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
>  1351417 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapBase.java
>  1351417 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java
>  1351417 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleDefaultRawComparator.java
>  1351417 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
>  1351417 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
>  1351417 
>   trunk/src/org/apache/pig/data/AppendableSchemaTuple.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/BinInterSedes.java 1351417 
>   trunk/src/org/apache/pig/data/DataByteArray.java 1351417 
>   trunk/src/org/apache/pig/data/FieldIsNullException.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/PBooleanTuple.java 1351417 
>   trunk/src/org/apache/pig/data/PDoubleTuple.java 1351417 
>   trunk/src/org/apache/pig/data/PFloatTuple.java 1351417 
>   trunk/src/org/apache/pig/data/PIntTuple.java 1351417 
>   trunk/src/org/apache/pig/data/PLongTuple.java 1351417 
>   trunk/src/org/apache/pig/data/PStringTuple.java 1351417 
>   trunk/src/org/apache/pig/data/PrimitiveFieldTuple.java 1351417 
>   trunk/src/org/apache/pig/data/PrimitiveTuple.java 1351417 
>   trunk/src/org/apache/pig/data/SchemaTuple.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/SchemaTupleBackend.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/SchemaTupleFactory.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/SchemaTupleFrontend.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/TupleFactory.java 1351417 
>   trunk/src/org/apache/pig/data/TypeAwareTuple.java 1351417 
>   trunk/src/org/apache/pig/data/utils/BytesHelper.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/utils/MethodHelper.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/utils/SedesHelper.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/utils/StructuresHelper.java PRE-CREATION 
>   trunk/src/org/apache/pig/impl/PigContext.java 1351417 
>   trunk/src/org/apache/pig/impl/io/NullableTuple.java 1351417 
>   trunk/src/org/apache/pig/impl/io/PigNullableWritable.java 1351417 
>   
> trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java
>  1351417 
>   trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 
> 1351417 
>   trunk/test/org/apache/pig/data/TestSchemaTuple.java PRE-CREATION 
>   trunk/test/org/apache/pig/data/utils/TestMethodHelper.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestDataBag.java 1351417 
>   trunk/test/org/apache/pig/test/TestPrimitiveFieldTuple.java 1351417 
>   trunk/test/org/apache/pig/test/TestPrimitiveTuple.java 1351417 
>   trunk/test/org/apache/pig/test/TestSchema.java 1351417 
> 
> Diff: https://reviews.apache.org/r/4651/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jonathan Coveney
> 
>

Reply via email to