-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4651/
-----------------------------------------------------------

(Updated June 29, 2012, 9:21 p.m.)


Review request for pig and Julien Le Dem.


Changes
-------

Hopefully this is close to what will make it in. Need some eyes on the new 
pieces, however: it integrates with FR joins and merge joins. We ran it on a 
job here and saw significant memory benefits!


Description
-------

This work builds on Dmitriy's PrimitiveTuple work. The idea is that, knowing 
the Schema on the frontend, we can code generate Tuples which can be used for 
fun and profit. In rudimentary tests, the memory efficiency is 2-4x better, and 
it's ~15% smaller serialized (heavily heavily depends on the data, though). 
Need to do get/set tests, but assuming that it's on par (or even faster) than 
Tuple, the memory gain is huge.

Need to clean up the code and add tests.

Right now, it generates a SchemaTuple for every inputSchema and outputSchema 
given to UDF's. The next step is to make a SchemaBag, where I think the 
serialization savings will be really huge.

Needs tests and comments, but I want the code to settle a bit.


This addresses bug PIG-2632.
    https://issues.apache.org/jira/browse/PIG-2632


Diffs (updated)
-----

  trunk/.gitignore 1355561 
  trunk/conf/pig.properties 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapBase.java
 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java
 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleDefaultRawComparator.java
 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFRJoin.java
 1355561 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java
 1355561 
  trunk/src/org/apache/pig/builtin/mock/Storage.java 1355561 
  trunk/src/org/apache/pig/data/AppendableSchemaTuple.java PRE-CREATION 
  trunk/src/org/apache/pig/data/BinInterSedes.java 1355561 
  trunk/src/org/apache/pig/data/BinSedesTupleFactory.java 1355561 
  trunk/src/org/apache/pig/data/DataByteArray.java 1355561 
  trunk/src/org/apache/pig/data/FieldIsNullException.java PRE-CREATION 
  trunk/src/org/apache/pig/data/PBooleanTuple.java 1355561 
  trunk/src/org/apache/pig/data/PDoubleTuple.java 1355561 
  trunk/src/org/apache/pig/data/PFloatTuple.java 1355561 
  trunk/src/org/apache/pig/data/PIntTuple.java 1355561 
  trunk/src/org/apache/pig/data/PLongTuple.java 1355561 
  trunk/src/org/apache/pig/data/PStringTuple.java 1355561 
  trunk/src/org/apache/pig/data/PrimitiveFieldTuple.java 1355561 
  trunk/src/org/apache/pig/data/PrimitiveTuple.java 1355561 
  trunk/src/org/apache/pig/data/SchemaTuple.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleBackend.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleFactory.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleFrontend.java PRE-CREATION 
  trunk/src/org/apache/pig/data/TupleFactory.java 1355561 
  trunk/src/org/apache/pig/data/TypeAwareTuple.java 1355561 
  trunk/src/org/apache/pig/data/utils/BytesHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/data/utils/MethodHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/data/utils/SedesHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/data/utils/StructuresHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/impl/PigContext.java 1355561 
  trunk/src/org/apache/pig/impl/io/InterRecordReader.java 1355561 
  trunk/src/org/apache/pig/impl/io/NullableTuple.java 1355561 
  
trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java
 1355561 
  trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 
1355561 
  
trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java
 1355561 
  
trunk/src/org/apache/pig/newplan/logical/relational/LogicalRelationalOperator.java
 1355561 
  
trunk/src/org/apache/pig/newplan/logical/rules/GroupByConstParallelSetter.java 
1355561 
  trunk/src/org/apache/pig/newplan/logical/rules/MergeForEach.java 1355561 
  trunk/test/org/apache/pig/data/TestSchemaTuple.java PRE-CREATION 
  trunk/test/org/apache/pig/data/utils/TestMethodHelper.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestDataBag.java 1355561 
  trunk/test/org/apache/pig/test/TestLogicalPlanBuilder.java 1355561 
  trunk/test/org/apache/pig/test/TestPrimitiveFieldTuple.java 1355561 
  trunk/test/org/apache/pig/test/TestPrimitiveTuple.java 1355561 
  trunk/test/org/apache/pig/test/TestSchema.java 1355561 

Diff: https://reviews.apache.org/r/4651/diff/


Testing
-------


Thanks,

Jonathan Coveney

Reply via email to