[ 
https://issues.apache.org/jira/browse/PIG-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678990#action_12678990
 ] 

Santhosh Srinivasan commented on PIG-544:
-----------------------------------------

Review comments:

Index: test/org/apache/pig/test/TestEvalPipeline2.java
=======================================================

1. Its not clear how the constant 1634952294 is arrived at. A comment alluding 
to this fact will help. The same comment applies to other constants that are 
used in this test case.
{code}
+    @Test
+    public void testBinStorageByteArrayCastsSimple() throws IOException {
+        assertTrue( (Integer)tup.get(0) == 1634952294);
{code}

Index: 
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java
========================================================================================

1. Do we need mLog? POProject extends ExpressionOperator which in turn extends 
PhysicalOperator. PhysicalOperator has a private member variable called log. 
Its better to use log instead of creating another member variable.
{code}
+    protected final Log mLog = LogFactory.getLog(getClass());
{code}

2. I am not sure if processInputBag calls parseFromBytes(). Can you clarify?

{code}
+        Result res;
+        try{
+            res = processInputBag();
+        }
+        catch (Exception e){
+            Result r = new Result();
+            r.returnStatus = POStatus.STATUS_OK;
+            // can happen if parseFromBytes identifies it as being of 
different type
{code}


Index: src/org/apache/pig/builtin/BinStorage.java
===================================================================

1. Can we make the log private static ?
{code}
+    protected final Log mLog = LogFactory.getLog(getClass());
{code}

2. Minor comment. Use bytearray instead of byres array to be consistent with 
the bytearray type. Also, be consistent with use of upper case for types, I see 
bag and then Long, Map, etc.
{code}
+            LogUtils.warn(this, "Unable to convert bytes array to bag, " +
{code}

3. In bytesToTuple, the message should be tuple and not bag

{code}
+    public Tuple bytesToTuple(byte[] b) {
+            LogUtils.warn(this, "Unable to convert bytes array to bag, " +
{code}

4. Documentation. Now that error code 2110 has been removed,  
http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification has to be 
updated to reflect the removal of this error code.

{code}
-            int errCode = 2110;
-            String msg = "Could not convert bytearray to map.";
-            throw new ExecException(msg, errCode, PigException.BUG);
{code}

5. The bytes to (Integer/Long/Float/Double) methods are interpreting the first 
4 to 8 bytes (depending on the type) and returning the value as the value for 
the converted type. This is a semantics question. Should we return null or 
should we do a best effort conversion of bytes to basic types? For example in 
the bytesToCharArray routine, the first couple of bytes are expected to be 
representative of the length of the string. The same argument holds good for 
the tuple too. However for the numeric types, a fixed number of bytes is 
interpreted as the value.

> Utf8StorageConverter.java does not always produce NULLs when data is malformed
> ------------------------------------------------------------------------------
>
>                 Key: PIG-544
>                 URL: https://issues.apache.org/jira/browse/PIG-544
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>            Assignee: Thejas M Nair
>             Fix For: types_branch
>
>         Attachments: PIG-544.txt
>
>
> It does so for scalar types but not for complext types and not for the fields 
> inside of the complext types.
> This is because it uses different code to parse scalar types by themselves 
> and scalar types inside of a complex type. It should really use the same (its 
> own code to do so.)
> The code it is currently uses, is inside of TextDataParser.jjt and is also 
> used to parse constants so we need to be careful if we want to make changes 
> to it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to