[jira] Commented: (PIG-365) Map side optimization for Limit (top k case)

2010-08-29 Thread Gianmarco De Francisci Morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903955#action_12903955
 ] 

Gianmarco De Francisci Morales commented on PIG-365:


I took a look at PigMapBase and POLimit, and it looks to me like this is already 
optimized.
Can you describe the idea in more detail?

 Map side optimization for Limit (top k case)
 

             Key: PIG-365
             URL: https://issues.apache.org/jira/browse/PIG-365
         Project: Pig
      Issue Type: Improvement
      Components: impl
Affects Versions: 0.2.0
        Reporter: Daniel Dai
        Assignee: Daniel Dai
        Priority: Minor

 On the map side, collect only the top k records to improve performance.
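
For readers unfamiliar with the idea in the description, here is a minimal sketch of collecting only the top k records from a stream with a bounded heap. This is not Pig code and does not reflect how PigMapBase or POLimit work; the class name TopK and the method topK are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper, not part of Pig: keeps only the k largest records seen. */
final class TopK {
    private TopK() {}

    /** Returns the k largest elements of input under order, largest first. */
    static <T> List<T> topK(Iterable<T> input, int k, Comparator<T> order) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap under `order`: the head is the smallest record currently kept,
        // so at most k records are held in memory at any time.
        PriorityQueue<T> heap = new PriorityQueue<>(k, order);
        for (T record : input) {
            if (heap.size() < k) {
                heap.add(record);
            } else if (order.compare(record, heap.peek()) > 0) {
                // The new record beats the weakest kept record, so replace it.
                heap.poll();
                heap.add(record);
            }
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(order.reversed());
        return result;
    }
}
```

Each mapper applying something like this would emit at most k records, so the downstream stage sees k records per mapper rather than the full input.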

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (PIG-365) Map side optimization for Limit (top k case)

2010-08-29 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-365.


Resolution: Won't Fix




[jira] Commented: (PIG-365) Map side optimization for Limit (top k case)

2010-08-29 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903996#action_12903996
 ] 

Daniel Dai commented on PIG-365:


Hi, Gianmarco,
Yes, you are right. This is quite an old Jira and it is no longer applicable. I 
will close it. A more recent limit optimization that we are still looking at is 
[PIG-1270|https://issues.apache.org/jira/browse/PIG-1270]. 




[jira] Updated: (PIG-1205) Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc

2010-08-29 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1205:
---

Attachment: hbase-0.20.6.jar
hbase-0.20.6-test.jar

Attaching the hbase-0.20.6 jars.

HBase is an Apache project, so there are no license issues.

 Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
 --

             Key: PIG-1205
             URL: https://issues.apache.org/jira/browse/PIG-1205
         Project: Pig
      Issue Type: Sub-task
Affects Versions: 0.7.0
        Reporter: Jeff Zhang
        Assignee: Dmitriy V. Ryaboy
         Fix For: 0.8.0

 Attachments: hbase-0.20.6-test.jar, hbase-0.20.6.jar, PIG_1205.patch, 
 PIG_1205_2.patch, PIG_1205_3.patch, PIG_1205_4.patch, PIG_1205_5.path, 
 PIG_1205_6.patch, PIG_1205_7.patch, PIG_1205_8.patch







[jira] Updated: (PIG-1531) Pig gobbles up error messages

2010-08-29 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-1531:
--

Attachment: pig-1531_3.patch

I took a look at the latest patch. There are two minor problems. First, 
pigExec was never assigned a value and so was always null, which resulted in an 
NPE in certain code paths. Second, the boolean logic in PigInputFormat needs && 
instead of ||. I thought of correcting these and committing, but then realized 
Hudson hasn't come back with results yet. So I am uploading a new patch with 
those corrections and submitting it to Hudson again. In this patch I also 
refactored the code a bit so it's easier to read. Have a look, and if it looks 
fine to you, can you run test-patch and the unit tests and paste the results 
here so I can commit it?

 Pig gobbles up error messages
 -

             Key: PIG-1531
             URL: https://issues.apache.org/jira/browse/PIG-1531
         Project: Pig
      Issue Type: Bug
Affects Versions: 0.7.0
        Reporter: Ashutosh Chauhan
        Assignee: niraj rai
         Fix For: 0.8.0

 Attachments: pig-1531_3.patch, PIG_1531.patch, PIG_1531_2.patch


 Consider the following. I have my own Storer implementing StoreFunc and I am 
 throwing FrontEndException (and other Exceptions derived from PigException) 
 in its various methods. I expect those error messages to be shown in error 
 scenarios. Instead Pig gobbles up my error messages and shows its own generic 
 error message like: 
 {code}
 010-07-31 14:14:25,414 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2116: Unexpected error. Could not validate the output specification for: 
 default.partitoned
 Details at logfile: /Users/ashutosh/workspace/pig/pig_1280610650690.log
 {code}
 Instead I expect it to display my error messages which it stores away in that 
 log file.
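
To illustrate the behavior the reporter is asking for, here is a minimal sketch of recovering a specific message from a wrapped exception by walking the cause chain. It is not taken from Pig or from the attached patches; UserFacingException is a hypothetical stand-in for exceptions such as those derived from PigException, and ErrorMessages.userMessage is an assumed helper name.

```java
import java.util.Optional;

/** Hypothetical stand-in for an exception whose message should reach the user. */
class UserFacingException extends Exception {
    UserFacingException(String message) {
        super(message);
    }
}

/** Hypothetical helper, not part of Pig. */
final class ErrorMessages {
    private ErrorMessages() {}

    /**
     * Walks the cause chain of t and returns the message of the deepest
     * UserFacingException found, or empty if there is none.
     */
    static Optional<String> userMessage(Throwable t) {
        String found = null;
        int depth = 0;
        // The depth cap guards against a cyclic cause chain.
        for (Throwable cur = t; cur != null && depth < 64; cur = cur.getCause(), depth++) {
            if (cur instanceof UserFacingException && cur.getMessage() != null) {
                found = cur.getMessage();
            }
        }
        return Optional.ofNullable(found);
    }
}
```

A caller could print the returned message when present and fall back to the generic error text otherwise.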




[jira] Updated: (PIG-1531) Pig gobbles up error messages

2010-08-29 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-1531:
--

Status: Patch Available  (was: Open)




[jira] Updated: (PIG-1482) Pig gets confused when more than one loader is involved

2010-08-29 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-1482:
-

Status: Open  (was: Patch Available)

 Pig gets confused when more than one loader is involved
 ---

             Key: PIG-1482
             URL: https://issues.apache.org/jira/browse/PIG-1482
         Project: Pig
      Issue Type: Bug
Affects Versions: 0.7.0
        Reporter: Ankur
        Assignee: Xuefu Zhang
         Fix For: 0.8.0

 Attachments: jira-1482-final-1.patch, jira-1482-final.patch, 
 jira-1482-final.patch, jira-1482-final.patch


 When two relations are loaded using different loaders and then joined, grouped 
 and projected, Pig gets confused trying to find the appropriate loader for the 
 requested cast. Consider the following script:
 A = LOAD 'data1' USING PigStorage() AS (s, m, l);
 B = FOREACH A GENERATE s#'k1' as v1, m#'k2' as v2, l#'k3' as v3;
 C = FOREACH B GENERATE v1, (v2 == 'v2' ? 1L : 0L) as v2:long, (v3 == 'v3' ? 1 : 0) as v3:int;
 D = LOAD 'data2' USING TextLoader() AS (a);
 E = JOIN C BY v1, D BY a USING 'replicated';
 F = GROUP E BY (v1, a);
 G = FOREACH F GENERATE (chararray)group.v1, group.a;
 
 dump G;
 This throws an error, the stack trace of which is in the next comment.




[jira] Updated: (PIG-1482) Pig gets confused when more than one loader is involved

2010-08-29 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-1482:
-

Attachment: jira-1482-final-2.patch

Updated based on the review comments above.




[jira] Updated: (PIG-1482) Pig gets confused when more than one loader is involved

2010-08-29 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-1482:
-

Status: Patch Available  (was: Open)
