[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1643:


Attachment: PIG-1643.2.patch

Attached a fix.

 join fails for a query with input having 'load using pigstorage without 
 schema' + 'foreach'
 ---

 Key: PIG-1643
 URL: https://issues.apache.org/jira/browse/PIG-1643
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1643.1.patch, PIG-1643.2.patch


 {code}
 l1 = load 'std.txt';
 l2 = load 'std.txt'; 
 f1 = foreach l1 generate $0 as abc, $1 as  def;
 -- j =  join f1 by $0, l2 by $0 using 'replicated';
 -- j =  join l2 by $0, f1 by $0 using 'replicated';
 j =  join l2 by $0, f1 by $0 ;
 dump j;
 {code}
 The error -
 {code}
 2010-09-22 16:24:48,584 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2044: The type null cannot be collected as a Key type
 {code}
 The MR plan from explain  -
 {code}
 #--
 # Map Reduce Plan  
 #--
 MapReduce node scope-21
 Map Plan
 Union[tuple] - scope-22
 |
 |---j: Local Rearrange[tuple]{bytearray}(false) - scope-11
 |   |   |
 |   |   Project[bytearray][0] - scope-12
 |   |
 |   |---l2: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-0
 |
 |---j: Local Rearrange[tuple]{NULL}(false) - scope-13
 |   |
 |   Project[NULL][0] - scope-14
 |
 |---f1: New For Each(false,false)[bag] - scope-6
 |   |
 |   Project[bytearray][0] - scope-2
 |   |
 |   Project[bytearray][1] - scope-4
 |
 |---l1: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-1
 Reduce Plan
 j: Store(/tmp/x:org.apache.pig.builtin.PigStorage) - scope-18
 |
 |---POJoinPackage(true,true)[tuple] - scope-23
 Global sort: false
 
 {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1643:


Attachment: PIG-1643.3.patch

PIG-1643.3.patch is more general than PIG-1643.2.patch. It solves this null 
schema issue for all expressions.

 join fails for a query with input having 'load using pigstorage without 
 schema' + 'foreach'
 ---

 Key: PIG-1643
 URL: https://issues.apache.org/jira/browse/PIG-1643
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1643.1.patch, PIG-1643.2.patch, PIG-1643.3.patch





[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-24 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

Attachment: PIG-1643.4.patch

PIG-1643.4.patch is PIG-1643.3.patch plus a test case.

 join fails for a query with input having 'load using pigstorage without 
 schema' + 'foreach'
 ---

 Key: PIG-1643
 URL: https://issues.apache.org/jira/browse/PIG-1643
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1643.1.patch, PIG-1643.2.patch, PIG-1643.3.patch, 
 PIG-1643.4.patch





[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

Attachment: PIG-1643.1.patch

PIG-1643.1.patch
There was a code path that led to fields having the NULL datatype instead of the 
default datatype of BYTEARRAY. That was causing these failures. 
Test-patch has succeeded, unit tests are running.
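
As a rough illustration of the idea described above (not the actual patch), the defaulting can be sketched as follows. FieldType, Field and default_null_types are hypothetical names invented for this sketch; they are not Pig classes or methods. The sketch assumes a field produced by 'foreach' over a schema-less load starts out with a NULL type and should fall back to BYTEARRAY before it is used as a join key.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FieldType(Enum):
    """Hypothetical stand-in for the field datatypes discussed in this issue."""
    NULL = "NULL"
    BYTEARRAY = "bytearray"


@dataclass
class Field:
    """Hypothetical schema field: an optional alias plus a datatype."""
    alias: Optional[str]
    type: FieldType = FieldType.NULL


def default_null_types(fields: List[Field]) -> List[Field]:
    """Return copies of the fields with every NULL type replaced by BYTEARRAY."""
    return [
        Field(f.alias, FieldType.BYTEARRAY if f.type is FieldType.NULL else f.type)
        for f in fields
    ]


# Models: f1 = foreach l1 generate $0 as abc, $1 as def;  (load has no schema)
f1_schema = [Field("abc"), Field("def")]
fixed = default_null_types(f1_schema)

# The join key ($0 of f1) now has a usable type instead of NULL.
join_key_type = fixed[0].type
print(join_key_type.value)
```

With a defaulting step of this kind, the second Local Rearrange in the plan would be keyed on bytearray rather than NULL, matching the l2 side of the join.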


 join fails for a query with input having 'load using pigstorage without 
 schema' + 'foreach'
 ---

 Key: PIG-1643
 URL: https://issues.apache.org/jira/browse/PIG-1643
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1643.1.patch





[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

Status: Patch Available  (was: Open)

 join fails for a query with input having 'load using pigstorage without 
 schema' + 'foreach'
 ---

 Key: PIG-1643
 URL: https://issues.apache.org/jira/browse/PIG-1643
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1643.1.patch





[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

Tests passed.
Patch committed to 0.8 branch and trunk.


 join fails for a query with input having 'load using pigstorage without 
 schema' + 'foreach'
 ---

 Key: PIG-1643
 URL: https://issues.apache.org/jira/browse/PIG-1643
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1643.1.patch

