[jira] Updated: (PIG-1575) Complete the migration of optimization rule PushUpFilter including missing test cases
[ https://issues.apache.org/jira/browse/PIG-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1575:

Attachment: jira-1575-4.patch

Added a fix for a failed test case. Only the test case itself was changed.

Complete the migration of optimization rule PushUpFilter including missing test cases

Key: PIG-1575
URL: https://issues.apache.org/jira/browse/PIG-1575
Project: Pig
Issue Type: Bug
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Fix For: 0.8.0
Attachments: jira-1575-1.patch, jira-1575-2.patch, jira-1575-3.patch, jira-1575-4.patch

The optimization rule PushUpFilter under the new logical plan covers only a subset of the optimization scenarios handled by the same rule under the old logical plan. For instance, it only considers a filter after a join, whereas the old optimization also considers other operators such as CoGroup, Union, Cross, etc. The migration of the rule should be completed. Also, the test cases created for testing the old PushUpFilter weren't migrated to the new logical plan code base. They should be migrated as well. (A few have been migrated in JIRA-1574.)

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1597) Development snapshot jar no longer picked up by bin/pig
[ https://issues.apache.org/jira/browse/PIG-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1597:

Status: Resolved (was: Patch Available)
Fix Version/s: 0.9.0
Resolution: Fixed

Committed.

Development snapshot jar no longer picked up by bin/pig

Key: PIG-1597
URL: https://issues.apache.org/jira/browse/PIG-1597
Project: Pig
Issue Type: Bug
Components: grunt
Affects Versions: 0.8.0
Reporter: Dmitriy V. Ryaboy
Assignee: Dmitriy V. Ryaboy
Fix For: 0.8.0, 0.9.0
Attachments: PIG_1597.patch

As George Stathis pointed out in PIG-1596, bin/pig no longer picks up development pig jars. This appears to have been introduced in PIG-1334, when the jar was renamed from -dev- to -SNAPSHOT-.
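The effect of the rename in PIG-1597 can be illustrated with a small glob check. This is only a sketch: bin/pig is a shell script, and Java's `PathMatcher` is used here purely to show why a `-dev-` pattern stops matching once the jar carries `-SNAPSHOT-` in its name. The jar name `pig-0.8.0-SNAPSHOT.jar` and the class `JarGlobCheck` are hypothetical and not taken from the patch.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class JarGlobCheck {
    // Returns true when the file name matches the given glob pattern.
    static boolean matches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        // Hypothetical build output name, assumed for illustration only.
        String jar = "pig-0.8.0-SNAPSHOT.jar";
        System.out.println(matches("pig-*-dev.jar", jar));      // prints "false"
        System.out.println(matches("pig-*-SNAPSHOT.jar", jar)); // prints "true"
    }
}
```

A launcher that only looks for the old pattern would silently find no jar, which is consistent with the symptom reported in the issue.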
[jira] Commented: (PIG-1596) NPE's thrown when attempting to load hbase columns containing null values
[ https://issues.apache.org/jira/browse/PIG-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906305#action_12906305 ]

Dmitriy V. Ryaboy commented on PIG-1596:

+1. Before committing, do you mind combining the new test with one of the existing ones? Trying to keep the test suite at under 24 hours :)

NPE's thrown when attempting to load hbase columns containing null values

Key: PIG-1596
URL: https://issues.apache.org/jira/browse/PIG-1596
Project: Pig
Issue Type: Bug
Components: data
Affects Versions: 0.7.0
Reporter: George P. Stathis
Fix For: 0.8.0, 0.9.0
Attachments: null_hbase_records.patch, PIG_1596.patch, PIG_1596_2.patch

I'm not a committer, but I'd like to suggest the attached patch to handle loading hbase rows containing null cell values (since hbase is all about sparsely populated data rows). As it stands, a DataByteArray can be created with a null mData if a cell has no value, which causes NPEs when simply attempting to load a row containing the null cell in question.

PS: the attached patch also contains a slight change to the bin/pig executable to point to build/pig-*-SNAPSHOT.jar and not build/pig-*-dev.jar (the latter no longer seems to exist). If you prefer a separate patch for this, I'll be happy to submit it.
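The failure mode described in PIG-1596 can be sketched without Pig on the classpath. `ByteHolder` below is a hypothetical stand-in for a byte-array wrapper, not Pig's `DataByteArray`, and the guard in `fromCell` is only one possible way to handle a missing cell. It is not claimed to be what the attached patch does.

```java
class NullCellDemo {
    // Hypothetical stand-in for a byte-array wrapper; NOT Pig's DataByteArray.
    static final class ByteHolder {
        private final byte[] data;

        ByteHolder(byte[] data) {
            this.data = data;
        }

        int size() {
            return data.length; // throws NullPointerException when data is null
        }
    }

    // Guarded factory: a missing cell becomes a null field instead of a
    // wrapper around a null payload.
    static ByteHolder fromCell(byte[] cell) {
        return cell == null ? null : new ByteHolder(cell);
    }

    // Shows the unguarded path: wrapping null and then using the wrapper.
    static boolean unguardedThrows() {
        try {
            new ByteHolder(null).size();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unguardedThrows());               // prints "true"
        System.out.println(fromCell(null) == null);          // prints "true"
        System.out.println(fromCell(new byte[] {1}).size()); // prints "1"
    }
}
```

The point of the sketch is that the null check belongs at construction time, so that downstream code never sees a wrapper whose payload is null.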
[jira] Commented: (PIG-1548) Optimize scalar to consolidate the part file
[ https://issues.apache.org/jira/browse/PIG-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906321#action_12906321 ]

Daniel Dai commented on PIG-1548:

The patch breaks TestFRJoin2.testConcatenateJobForScalar3. Comment out TestFRJoin2.testConcatenateJobForScalar3 temporarily.

Optimize scalar to consolidate the part file

Key: PIG-1548
URL: https://issues.apache.org/jira/browse/PIG-1548
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Daniel Dai
Assignee: Richard Ding
Fix For: 0.8.0
Attachments: PIG-1548.patch, PIG-1548_1.patch

The current scalar implementation writes a scalar file to DFS. When Pig needs the scalar, it opens the DFS file directly. Each scalar file contains more than one part file even though it contains only one record. This puts a huge load on the namenode. We should consolidate the part files before opening them. Another optional step is to put the consolidated file into the distributed cache, which further reduces the load on the namenode.
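The consolidation step proposed in PIG-1548 can be sketched on a local filesystem. This is an analogy under stated assumptions, not the Pig implementation: it uses `java.nio.file` instead of the Hadoop FileSystem API, and the class `PartFileMerge`, the target name `consolidated`, and the sample record `42` are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PartFileMerge {
    // Concatenates every part-* file in dir, in name order, into target.
    static void consolidate(Path dir, Path target) {
        try {
            List<Path> parts = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "part-*")) {
                for (Path p : stream) {
                    parts.add(p);
                }
            }
            Collections.sort(parts);
            Files.write(target, new byte[0]); // create or truncate the target
            for (Path p : parts) {
                Files.write(target, Files.readAllBytes(p), StandardOpenOption.APPEND);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a scalar-like output: several part files, only one holding a record.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("scalar");
            Files.write(dir.resolve("part-00000"), new byte[0]);
            Files.write(dir.resolve("part-00001"), "42\n".getBytes(StandardCharsets.UTF_8));
            Path merged = dir.resolve("consolidated");
            consolidate(dir, merged);
            return new String(Files.readAllBytes(merged), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo()); // prints "42"
    }
}
```

After consolidation a reader opens one file instead of one per part, which is the reduction in namenode requests the issue is after.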
[jira] Commented: (PIG-1595) casting relation to scalar- problem with handling of data from non PigStorage loaders
[ https://issues.apache.org/jira/browse/PIG-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906322#action_12906322 ]

Daniel Dai commented on PIG-1595:

The patch breaks TestScalarAliases.testScalarErrMultipleRowsInInput. Comment out TestScalarAliases.testScalarErrMultipleRowsInInput temporarily.

casting relation to scalar- problem with handling of data from non PigStorage loaders

Key: PIG-1595
URL: https://issues.apache.org/jira/browse/PIG-1595
Project: Pig
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Fix For: 0.8.0
Attachments: PIG-1595.1.patch

If load functions that don't follow the same bytearray format as PigStorage for other supported datatypes, or that don't implement the LoadCaster interface, are used in 'casting relation to scalar' (PIG-1434), the query can fail or produce incorrect results.

The root cause of the problem is that there is a real dependency between the ReadScalars udf that returns the scalar value and the LogicalOperator that acts as its input, but the logical plan does not capture this dependency. So in the SchemaResetter visitor used by the optimizer, the order in which the schema is reset and evaluated does not take this into consideration. If the schema of the input LogicalOperator does not get evaluated before the ReadScalars udf, the result type of the ReadScalars udf becomes bytearray. POUserFunc will convert the input to bytearray using 'new DataByteArray(inp.toString().getBytes())'. But this bytearray encoding of the other supported types might not be the same as the one used by the LoadFunction associated with the column, and that can result in problems.
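The encoding mismatch behind PIG-1595 can be shown in isolation. In this sketch `textBytes` mirrors the `inp.toString().getBytes()` conversion quoted in the issue, while the 4-byte big-endian convention in `binaryBytes` and `binaryToInt` is an assumed example of a loader that does not use PigStorage's text format. No Pig classes are involved and all names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ScalarEncodingDemo {
    // Text-style encoding, analogous to inp.toString().getBytes().
    static byte[] textBytes(Integer value) {
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Assumed loader convention: a 4-byte big-endian int.
    static byte[] binaryBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Caster for the assumed binary convention.
    static int binaryToInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        int value = 1234;
        // Round trip through the loader's own encoding is lossless.
        System.out.println(binaryToInt(binaryBytes(value))); // prints "1234"
        // Text bytes "1234" are 0x31 0x32 0x33 0x34; read as a big-endian
        // int they decode to a different number.
        System.out.println(binaryToInt(textBytes(value)));   // prints "825373492"
    }
}
```

The second result is silently wrong rather than an error, which matches the "incorrect results" outcome the issue describes when the scalar's type is left as bytearray.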