-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4651/#review6784
-----------------------------------------------------------



trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
<https://reviews.apache.org/r/4651/#comment15009>

    instead of checking here if the tuple is generatable, the factory could 
default to the regular TupleFactory if the generation fails. 



trunk/src/org/apache/pig/data/SchemaTuple.java
<https://reviews.apache.org/r/4651/#comment15010>

    How complex is this class ? Not sure it is worth pulling the whole mahout 
just for this.



trunk/src/org/apache/pig/data/SchemaTuple.java
<https://reviews.apache.org/r/4651/#comment15011>

    append...() methods should not be null
    (part of your TODO list?)



trunk/src/org/apache/pig/data/SchemaTuple.java
<https://reviews.apache.org/r/4651/#comment15013>

    handle the case where other == null
    Here you get NullPointerException



trunk/src/org/apache/pig/data/SchemaTuple.java
<https://reviews.apache.org/r/4651/#comment15014>

    same comment



trunk/src/org/apache/pig/data/SchemaTuple.java
<https://reviews.apache.org/r/4651/#comment15015>

    If you override equals(), you should override hashCode(). 2 object that are 
equal must return the same hashcode



trunk/src/org/apache/pig/data/TupleFactory.java
<https://reviews.apache.org/r/4651/#comment15052>

    that's where we need a proper classloader



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15016>

    try avoiding accessing the PigContext statically. Add it as a parameter and 
see if it can be passed from somewhere it is actually known.



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15017>

    possibly we want to rename this and/or add something else for this file.



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15018>

    ?
    either get rid of the empty catch block or document why it is Ok.
    
    Here I don't see why it would be Ok that we can not instantiate that class



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15019>

    "org.apache.pig.generated." + classname ?



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15020>

    "org/apache/pig/generated/" + classname + ".class" ?



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15021>

    we should probably not compile in the current dir.
    Put it in the temp dir intead



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15022>

    this could be calling a method in the parent



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15023>

    this could be in the parent



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15024>

    stuff which is not calling a generated field directly should be pulled up.



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15025>

    same comment



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15026>

    same



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15027>

    this is wasteful. Maybe DataByteArray could provide a static 
compare(byte[], byte[]) ?



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15028>

    duplicated code



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15029>

    this can be pulled up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15030>

    the exception handling can be pulled up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15031>

    "setPos_" + fieldPos + "((Boolean)val);"



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15032>

    same with a mapping for the java object for the type 



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15033>

    can be pulled up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15034>

    you can just remove the call to box, it should work



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15035>

    this can be pulled up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15036>

    can be pulled up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15037>

    can be pulled up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15038>

    pull up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15039>

    pull up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15040>

    pull up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15041>

    pull up



trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java
<https://reviews.apache.org/r/4651/#comment15042>

    pull up
    



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15043>

    Ok, temporary



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15044>

    do we need this. Let's just fall back to regular tuple when generation 
fails, so that we don't need to keep those in sync



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15045>

    The factory could be generated as well so that we don't need to use 
reflection here.
    Reflection is slower to create new instances.



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15046>

    The parameter type is enough to differentiate.
    getSchemaTupleFactory(Schema schema)



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15047>

    same



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15048>

    getTupleClass()



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15049>

    getTupleClass()



trunk/src/org/apache/pig/data/SchemaTupleFactory.java
<https://reviews.apache.org/r/4651/#comment15050>

    getTupleClass()



trunk/src/org/apache/pig/data/Tuple.java
<https://reviews.apache.org/r/4651/#comment15051>

    typo


- Julien


On 2012-04-08 22:26:29, Jonathan Coveney wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4651/
> -----------------------------------------------------------
> 
> (Updated 2012-04-08 22:26:29)
> 
> 
> Review request for pig and Julien Le Dem.
> 
> 
> Summary
> -------
> 
> This work builds on Dmitriy's PrimitiveTuple work. The idea is that, knowing 
> the Schema on the frontend, we can code generate Tuples which can be used for 
> fun and profit. In rudimentary tests, the memory efficiency is 2-4x better, 
> and it's ~15% smaller serialized (heavily heavily depends on the data, 
> though). Need to do get/set tests, but assuming that it's on par (or even 
> faster) than Tuple, the memory gain is huge.
> 
> Need to clean up the code and add tests.
> 
> Right now, it generates a SchemaTuple for every inputSchema and outputSchema 
> given to UDF's. The next step is to make a SchemaBag, where I think the 
> serialization savings will be really huge.
> 
> Needs tests and comments, but I want the code to settle a bit.
> 
> 
> This addresses bug PIG-2632.
>     https://issues.apache.org/jira/browse/PIG-2632
> 
> 
> Diffs
> -----
> 
>   trunk/bin/pig 1310666 
>   trunk/build.xml 1310666 
>   trunk/ivy.xml 1310666 
>   trunk/ivy/libraries.properties 1310666 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
>  1310666 
>   trunk/src/org/apache/pig/data/BinInterSedes.java 1310666 
>   trunk/src/org/apache/pig/data/FieldIsNullException.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/PrimitiveTuple.java 1310666 
>   trunk/src/org/apache/pig/data/SchemaTuple.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/SchemaTupleFactory.java PRE-CREATION 
>   trunk/src/org/apache/pig/data/Tuple.java 1310666 
>   trunk/src/org/apache/pig/data/TupleFactory.java 1310666 
>   trunk/src/org/apache/pig/data/TypeAwareTuple.java 1310666 
>   trunk/src/org/apache/pig/data/utils/SedesHelper.java PRE-CREATION 
>   trunk/src/org/apache/pig/impl/PigContext.java 1310666 
>   trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 
> 1310666 
> 
> Diff: https://reviews.apache.org/r/4651/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jonathan
> 
>

Reply via email to