[jira] Commented: (PIG-365) Map side optimization for Limit (top k case)
[ https://issues.apache.org/jira/browse/PIG-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903955#action_12903955 ]

Gianmarco De Francisci Morales commented on PIG-365:

I took a look at PigMapBase and POLimit, and it looks to me like this is already optimized. Can you describe the idea in more detail?

Map side optimization for Limit (top k case)
Key: PIG-365
URL: https://issues.apache.org/jira/browse/PIG-365
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.2.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Priority: Minor

On the map side, collect only the top k records to improve performance.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
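For readers unfamiliar with the idea in the issue description, the following is a minimal sketch of a map-side top-k limit. It is not Pig's PigMapBase or POLimit code. The function names, the use of plain integers as records, and the in-memory "splits" are all assumptions made for illustration. Each hypothetical map task keeps only its k smallest records, so the final merge step sees at most k records per task instead of the full input.

```python
import heapq
from itertools import islice
from typing import Iterable, List


def map_side_top_k(records: Iterable[int], k: int) -> List[int]:
    """Keep only the k smallest records seen by one (hypothetical) map task.

    heapq.nsmallest returns the result sorted in ascending order.
    """
    return heapq.nsmallest(k, records)


def reduce_top_k(partials: Iterable[List[int]], k: int) -> List[int]:
    """Merge the sorted per-task partial results and apply the final limit."""
    merged = heapq.merge(*partials)  # lazy merge of already-sorted lists
    return list(islice(merged, k))


if __name__ == "__main__":
    # Three hypothetical input splits, one per map task.
    splits = [[9, 1, 7, 3], [8, 2, 6], [5, 4, 0]]
    partials = [map_side_top_k(split, 2) for split in splits]
    print(reduce_top_k(partials, 2))
```

The design point is only that the limit can be applied twice: once per map task to cut the data shipped onward, and once more after merging to produce the final k records.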
[jira] Resolved: (PIG-365) Map side optimization for Limit (top k case)
[ https://issues.apache.org/jira/browse/PIG-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-365.

Resolution: Won't Fix
[jira] Commented: (PIG-365) Map side optimization for Limit (top k case)
[ https://issues.apache.org/jira/browse/PIG-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903996#action_12903996 ]

Daniel Dai commented on PIG-365:

Hi Gianmarco,

Yes, you are right. This is quite an old Jira and it is no longer applicable, so I will close it. A more recent limit optimization that we are still looking at is [PIG-1270|https://issues.apache.org/jira/browse/PIG-1270].
[jira] Updated: (PIG-1205) Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
[ https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1205:

Attachment: hbase-0.20.6.jar
            hbase-0.20.6-test.jar

Attaching the hbase-0.20.6 jars. HBase is an Apache project, so there are no license issues.

Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
Key: PIG-1205
URL: https://issues.apache.org/jira/browse/PIG-1205
Project: Pig
Issue Type: Sub-task
Affects Versions: 0.7.0
Reporter: Jeff Zhang
Assignee: Dmitriy V. Ryaboy
Fix For: 0.8.0
Attachments: hbase-0.20.6-test.jar, hbase-0.20.6.jar, PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch, PIG_1205_4.patch, PIG_1205_5.path, PIG_1205_6.patch, PIG_1205_7.patch, PIG_1205_8.patch
[jira] Updated: (PIG-1531) Pig gobbles up error messages
[ https://issues.apache.org/jira/browse/PIG-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-1531:

Attachment: pig-1531_3.patch

I took a look at the latest patch. There are two minor problems. First, pigExec was never assigned a value and so was always null, which resulted in an NPE in certain code paths. Second, the boolean logic in PigInputFormat needs && instead of ||. I thought of correcting these and committing, but then realized Hudson hasn't come back with results yet. So I am uploading a new patch with those corrections and submitting it to Hudson again. In this patch I also refactored the code a bit so it's easier to read. Have a look, and if it looks fine to you, can you run test-patch and the unit tests and paste the results here so I can commit it?

Pig gobbles up error messages
Key: PIG-1531
URL: https://issues.apache.org/jira/browse/PIG-1531
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Ashutosh Chauhan
Assignee: niraj rai
Fix For: 0.8.0
Attachments: pig-1531_3.patch, PIG_1531.patch, PIG_1531_2.patch

Consider the following. I have my own Storer implementing StoreFunc, and I am throwing FrontEndException (and other exceptions derived from PigException) in its various methods. I expect those error messages to be shown in error scenarios. Instead, Pig gobbles up my error messages and shows its own generic error message, like:

{code}
010-07-31 14:14:25,414 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2116: Unexpected error. Could not validate the output specification for: default.partitoned
Details at logfile: /Users/ashutosh/workspace/pig/pig_1280610650690.log
{code}

I expect it to display my error messages, which it instead stores away in that log file.
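The behaviour described in this issue can be illustrated with a small sketch. This is not Pig code and does not reflect what any of the attached patches do. The class names, function names, and the specific error text are hypothetical stand-ins. It shows a framework wrapping a user exception in a generic error, and one way to recover the original message by walking the cause chain.

```python
class UserStoreError(Exception):
    """Hypothetical stand-in for an exception thrown by a user's store function."""


class GenericError(Exception):
    """Hypothetical stand-in for a framework's generic wrapper error."""


def check_output(location: str) -> None:
    # Hypothetical user code: rejects the location with a specific message.
    raise UserStoreError(f"Output location {location} already exists")


def run_store(location: str) -> None:
    # The framework catches the user error and raises a generic one,
    # chaining the original as the cause rather than discarding it.
    try:
        check_output(location)
    except UserStoreError as err:
        raise GenericError(
            "Unexpected error. Could not validate the output specification"
        ) from err


def root_cause_message(err: BaseException) -> str:
    """Follow __cause__ links to the innermost exception and return its message."""
    while err.__cause__ is not None:
        err = err.__cause__
    return str(err)


if __name__ == "__main__":
    try:
        run_store("default.example")
    except GenericError as err:
        print("generic:   ", err)
        print("root cause:", root_cause_message(err))
```

Printing only the outer error reproduces the complaint above. Printing the root cause as well is the kind of output the reporter is asking for.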
[jira] Updated: (PIG-1531) Pig gobbles up error messages
[ https://issues.apache.org/jira/browse/PIG-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-1531:

Status: Patch Available (was: Open)
[jira] Updated: (PIG-1482) Pig gets confused when more than one loader is involved
[ https://issues.apache.org/jira/browse/PIG-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1482:

Status: Open (was: Patch Available)

Pig gets confused when more than one loader is involved
Key: PIG-1482
URL: https://issues.apache.org/jira/browse/PIG-1482
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Ankur
Assignee: Xuefu Zhang
Fix For: 0.8.0
Attachments: jira-1482-final-1.patch, jira-1482-final.patch, jira-1482-final.patch, jira-1482-final.patch

When two relations are loaded using different loaders and then joined, grouped, and projected, Pig gets confused trying to find the appropriate loader for the requested cast. Consider the following script:

A = LOAD 'data1' USING PigStorage() AS (s, m, l);
B = FOREACH A GENERATE s#'k1' as v1, m#'k2' as v2, l#'k3' as v3;
C = FOREACH B GENERATE v1, (v2 == 'v2' ? 1L : 0L) as v2:long, (v3 == 'v3' ? 1 : 0) as v3:int;
D = LOAD 'data2' USING TextLoader() AS (a);
E = JOIN C BY v1, D BY a USING 'replicated';
F = GROUP E BY (v1, a);
G = FOREACH F GENERATE (chararray)group.v1, group.a;
dump G;

This throws an error; its stack trace is in the next comment.
[jira] Updated: (PIG-1482) Pig gets confused when more than one loader is involved
[ https://issues.apache.org/jira/browse/PIG-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1482:

Attachment: jira-1482-final-2.patch

Updated based on the review comments above.
[jira] Updated: (PIG-1482) Pig gets confused when more than one loader is involved
[ https://issues.apache.org/jira/browse/PIG-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1482:

Status: Patch Available (was: Open)