[jira] Updated: (PIG-960) Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage

2009-10-02 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-960:
---

   Resolution: Fixed
Fix Version/s: 0.6.0
   Status: Resolved  (was: Patch Available)

Patch committed. Thanks Ankit!

 Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage 
 ---

 Key: PIG-960
 URL: https://issues.apache.org/jira/browse/PIG-960
 Project: Pig
  Issue Type: Improvement
  Components: impl
Reporter: Ankit Modi
 Fix For: 0.6.0

 Attachments: pig_rlr.patch


 PigStorage's reading of Tuples (lines) can be optimized using Hadoop's 
 {{LineRecordReader}}.
 This can help in the following areas:
 - Improving the performance of reading Tuples (lines) in {{PigStorage}}
 - Any future improvements to line reading in Hadoop's 
 {{LineRecordReader}} are automatically carried over to Pig
 Issues that are handled by this patch:
 - BZip uses internal buffers and positioning to determine the number of 
 bytes read. Hence the buffering done by {{LineRecordReader}} has to be turned off
 - The current implementation of {{LocalSeekableInputStream}} does not implement 
 the {{available}} method. This method has to be implemented.
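
 For illustration only (this is not taken from pig_rlr.patch): a minimal Java 
 sketch of what an {{available}} implementation on a seekable local stream 
 could look like. The class LocalSeekableSketch and its helper remaining() are 
 hypothetical names; the sketch assumes the stream is backed by a 
 java.io.RandomAccessFile, which may not match Pig's LocalSeekableInputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

/** Hypothetical stand-in for a seekable local stream; not Pig's actual class. */
class LocalSeekableSketch extends InputStream {
    private final RandomAccessFile file;

    LocalSeekableSketch(RandomAccessFile file) {
        this.file = file;
    }

    /** Bytes left between position and length, clamped to the int range. */
    static int remaining(long length, long position) {
        long left = length - position;
        if (left <= 0) {
            return 0;
        }
        return (int) Math.min(left, (long) Integer.MAX_VALUE);
    }

    @Override
    public int read() throws IOException {
        return file.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        return file.read(buf, off, len);
    }

    /** Reports how many bytes can still be read before end of file. */
    @Override
    public int available() throws IOException {
        return remaining(file.length(), file.getFilePointer());
    }

    /** Moves the read position; the next read starts at pos. */
    public void seek(long pos) throws IOException {
        file.seek(pos);
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

 The position arithmetic is kept in a static helper so that it can be checked 
 without touching the filesystem.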

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-987) Zebra Column Group Access Control

2009-10-02 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761645#action_12761645
 ] 

Yan Zhou commented on PIG-987:
--

The extra warnings were generated on 7 modified Java files that were produced 
by the JavaCC code generator. They should be ignored.

 Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Attachments: ColumnGroupSecurity.patch


 Access Control: when processes try to read from the column groups, Zebra 
 should be able to handle allowed vs. disallowed user/application accesses. 
 Expected behavior when column group permissions are set:
 When a user selects only columns that they do not have permission to 
 access, Zebra should return an error with the message "Error #: Permission 
 denied for accessing column <column name or names>". 
 Access control applies to an entire column group, so all columns in a column 
 group have the same permissions. 




[jira] Updated: (PIG-592) schema inferred incorrectly

2009-10-02 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-592:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed

 schema inferred incorrectly
 ---

 Key: PIG-592
 URL: https://issues.apache.org/jira/browse/PIG-592
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Christopher Olston
 Fix For: 0.6.0

 Attachments: PIG-592-1.patch, PIG-592-2.patch, PIG-592-3.patch


 A simple pig script, that never introduces any schema information:
 A = load 'foo';
 B = foreach (group A by $8) generate group, COUNT($1);
 C = load 'bar';   // ('bar' has two columns)
 D = join B by $0, C by $0;
 E = foreach D generate $0, $1, $3;
 Fails, complaining that $3 does not exist:
 java.io.IOException: Out of bound access. Trying to access non-existent 
 column: 3. Schema {B::group: bytearray,long,bytearray} has 3 column(s).
 Apparently Pig gets confused, and thinks it knows the schema for C (a single 
 bytearray column).
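
 A minimal Java sketch of the schema rule this report implies: when either 
 input of a join has no known schema, the joined schema should also be treated 
 as unknown, rather than assuming a fixed number of columns for that input. 
 JoinSchemaSketch and joinSchema are hypothetical names; this is not Pig's code 
 and is not claimed to be what the committed patch does.

```java
import java.util.ArrayList;
import java.util.List;

class JoinSchemaSketch {
    /**
     * Concatenates two input schemas (lists of column names).
     * A null schema means "unknown"; if either side is unknown,
     * the result is unknown as well.
     */
    static List<String> joinSchema(List<String> left, List<String> right) {
        if (left == null || right == null) {
            return null; // cannot know how many columns the join produces
        }
        List<String> joined = new ArrayList<>(left);
        joined.addAll(right);
        return joined;
    }
}
```

 Under this rule the script above would not be rejected at planning time, since 
 the number of columns contributed by C is not known until the data is read.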




[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-991:
-

Description: 
1) lzo2 was used as the compressor name for the LZO compression algorithm; it 
should be lzo instead;
2) the default compression is changed from lzo to gz for gzip;
3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old 
package org.apache.pig.table.types;
4) in build.xml, two new javacc targets are added to generate TableSchemaParser 
and TableStorageParser java codes;
5) Support of column group security ( 
https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the dumpinfo 
method: the groups and permissions were not displayed. Note that as a 
consequence, the patch herein must be applied after that of JIRA987.
6) and 7) a couple of issues reported in Jira917.

  was:
1) lzo2 was used as the compressor name for the LZO compression algorithm; it 
should be lzo instead;
2) the default compression is changed from lzo to gz for gzip;
3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old 
package org.apache.pig.table.types;
4) in build.xml, two new javacc targets are added to generate TableSchemaParser 
and TableStorageParser java codes;
5) Support of column group security ( 
https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the dumpinfo 
method: the groups and permissions were not displayed. Note that as a 
consequence, the patch herein must be applied after that of JIRA987.

Summary: [zebra] A few minor bugs as described in the Description 
section  (was: A few minor bugs as described in the Description section)

 [zebra] A few minor bugs as described in the Description section
 

 Key: PIG-991
 URL: https://issues.apache.org/jira/browse/PIG-991
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0


 1) lzo2 was used as the compressor name for the LZO compression algorithm; 
 it should be lzo instead;
 2) the default compression is changed from lzo to gz for gzip;
 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old 
 package org.apache.pig.table.types;
 4) in build.xml, two new javacc targets are added to generate 
 TableSchemaParser and TableStorageParser java codes;
 5) Support of column group security ( 
 https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the 
 dumpinfo method: the groups and permissions were not displayed. Note that as 
 a consequence, the patch herein must be applied after that of JIRA987.
 6) and 7) a couple of issues reported in Jira917.




[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-987:
-

Description: 
Access Control: when processes try to read from the column groups, Zebra should 
be able to handle allowed vs. disallowed user/application accesses.  The 
security is eventually granted by the corresponding HDFS security of the data 
stored.

Expected behavior when column group permissions are set:

When a user selects only columns that they do not have permission to access, 
Zebra should return an error with the message "Error #: Permission denied for 
accessing column <column name or names>". 

Access control applies to an entire column group, so all columns in a column 
group have the same permissions. 


  was:
Access Control: when processes try to read from the column groups, Zebra should 
be able to handle allowed vs. disallowed user/application accesses. 

Expected behavior when column group permissions are set:

When user selects only columns that they do not have permissions to access, 
Zebra should return error with message Error #: Permission denied for 
accessing column column name or names 

Access control applies to an entire column group, so all columns in a column 
group have same permissions. 


Summary: [zebra] Zebra Column Group Access Control  (was: Zebra Column 
Group Access Control)

 [zebra] Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Attachments: ColumnGroupSecurity.patch


 Access Control: when processes try to read from the column groups, Zebra 
 should be able to handle allowed vs. disallowed user/application accesses.  
 The security is eventually granted by the corresponding HDFS security of the 
 data stored.
 Expected behavior when column group permissions are set:
 When a user selects only columns that they do not have permission to 
 access, Zebra should return an error with the message "Error #: Permission 
 denied for accessing column <column name or names>". 
 Access control applies to an entire column group, so all columns in a column 
 group have the same permissions. 




[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-991:
-

Release Note: Patch should be applied after that of Jira987.
  Status: Patch Available  (was: Open)

 [zebra] A few minor bugs as described in the Description section
 

 Key: PIG-991
 URL: https://issues.apache.org/jira/browse/PIG-991
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0


 1) lzo2 was used as the compressor name for the LZO compression algorithm; 
 it should be lzo instead;
 2) the default compression is changed from lzo to gz for gzip;
 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old 
 package org.apache.pig.table.types;
 4) in build.xml, two new javacc targets are added to generate 
 TableSchemaParser and TableStorageParser java codes;
 5) Support of column group security ( 
 https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the 
 dumpinfo method: the groups and permissions were not displayed. Note that as 
 a consequence, the patch herein must be applied after that of JIRA987.
 6) and 7) a couple of issues reported in Jira917.




[jira] Commented: (PIG-991) [zebra] A few minor bugs as described in the Description section

2009-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761671#action_12761671
 ] 

Hadoop QA commented on PIG-991:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421138/Bugs.patch
  against trunk revision 821101.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/57/console

This message is automatically generated.

 [zebra] A few minor bugs as described in the Description section
 

 Key: PIG-991
 URL: https://issues.apache.org/jira/browse/PIG-991
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0

 Attachments: Bugs.patch


 1) lzo2 was used as the compressor name for the LZO compression algorithm; 
 it should be lzo instead;
 2) the default compression is changed from lzo to gz for gzip;
 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old 
 package org.apache.pig.table.types;
 4) in build.xml, two new javacc targets are added to generate 
 TableSchemaParser and TableStorageParser java codes;
 5) Support of column group security ( 
 https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the 
 dumpinfo method: the groups and permissions were not displayed. Note that as 
 a consequence, the patch herein must be applied after that of JIRA987.
 6) and 7) a couple of issues reported in Jira917.




[jira] Created: (PIG-992) [zebra] Separate Schema-related files into a Schema package

2009-10-02 Thread Yan Zhou (JIRA)
[zebra] Separate Schema-related files into a Schema package
-

 Key: PIG-992
 URL: https://issues.apache.org/jira/browse/PIG-992
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0


The hope is to facilitate future sharing of the Schema codes between different 
modules and/or products. 




[jira] Updated: (PIG-986) [zebra] Zebra Column Group Naming Support

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-986:
-

Fix Version/s: (was: 0.5.0)
   0.6.0
Affects Version/s: 0.4.0
 Release Note: The patch must be applied after the one in Jira991 has 
been applied.
   Status: Patch Available  (was: Open)

The patch file name is ColumnGroupName.patch

 [zebra] Zebra Column Group Naming Support
 -

 Key: PIG-986
 URL: https://issues.apache.org/jira/browse/PIG-986
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0

 Attachments: ColumnGroupName.patch


 We introduce column group names to Zebra and make them first-class citizens in 
 Zebra. This can ease management of column groups.
 We plan to introduce an AS clause for column group names in Zebra's syntax.
 Functional Specifications:
 1) Column group names are optional. For column groups that do not have a 
 user-provided name, Zebra will internally assign default names that are 
 unique within that table: CG0, CG1, CG2 ... Note: if a name of the form CGx 
 is already used by the user, it cannot be used as an internal name.
 2) We introduce an AS clause in Zebra's syntax for column group names. If 
 it occurs, it has to immediately follow [ ]. For example: [a1, a2] as PI 
 secure by user:joe group:secure perm:640; [a3, a4] as General compress by 
 lzo. Note that the keyword AS is case insensitive.
 3) Column group names are unique within one table and are case sensitive, 
 i.e., c1 and C1 are different.
 4) Column group names will be used as the physical column group directory 
 path names.
 5) Zebra V2 will support dropColumnGroup by column group names (will 
 integrate with Raghu's A29 drop column work).
 6) Zebra V2 can support backward compatibility (if there are Zebra V1-created 
 tables in production when V2 is released). More specifically, this means that 
 Zebra V2 can load from V1-created tables and do dropColumnGroup on them.
 7) Renaming is NOT supported.
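
 A minimal Java sketch of specification items 1) and 3): unnamed column groups 
 receive default names CG0, CG1, ... and any CGx name already taken by the user 
 is skipped. ColumnGroupNamer and assignNames are hypothetical names, and the 
 skip-to-next-free-index strategy is an assumption; the specification only 
 states that a user-chosen CGx cannot be reused internally.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ColumnGroupNamer {
    /**
     * userNames holds one entry per column group, in table order;
     * null means the user gave no name for that group.
     */
    static List<String> assignNames(List<String> userNames) {
        // Names are case sensitive, so a plain string set is enough.
        Set<String> used = new HashSet<>();
        for (String name : userNames) {
            if (name != null && !used.add(name)) {
                throw new IllegalArgumentException(
                        "Duplicate column group name: " + name);
            }
        }
        List<String> result = new ArrayList<>();
        int next = 0;
        for (String name : userNames) {
            if (name == null) {
                // Advance to the next CGx that no one has taken yet.
                String candidate;
                do {
                    candidate = "CG" + next++;
                } while (used.contains(candidate));
                used.add(candidate);
                name = candidate;
            }
            result.add(name);
        }
        return result;
    }
}
```

 For a storage hint with four groups where only the second and fourth are 
 named (as CG0 and PI), this sketch yields CG1, CG0, CG2, PI.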




[jira] Commented: (PIG-986) [zebra] Zebra Column Group Naming Support

2009-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761744#action_12761744
 ] 

Hadoop QA commented on PIG-986:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421161/ColumnGroupName.patch
  against trunk revision 821101.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 44 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/58/console

This message is automatically generated.

 [zebra] Zebra Column Group Naming Support
 -

 Key: PIG-986
 URL: https://issues.apache.org/jira/browse/PIG-986
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0

 Attachments: ColumnGroupName.patch


 We introduce column group names to Zebra and make them first-class citizens in 
 Zebra. This can ease management of column groups.
 We plan to introduce an AS clause for column group names in Zebra's syntax.
 Functional Specifications:
 1) Column group names are optional. For column groups that do not have a 
 user-provided name, Zebra will internally assign default names that are 
 unique within that table: CG0, CG1, CG2 ... Note: if a name of the form CGx 
 is already used by the user, it cannot be used as an internal name.
 2) We introduce an AS clause in Zebra's syntax for column group names. If 
 it occurs, it has to immediately follow [ ]. For example: [a1, a2] as PI 
 secure by user:joe group:secure perm:640; [a3, a4] as General compress by 
 lzo. Note that the keyword AS is case insensitive.
 3) Column group names are unique within one table and are case sensitive, 
 i.e., c1 and C1 are different.
 4) Column group names will be used as the physical column group directory 
 path names.
 5) Zebra V2 will support dropColumnGroup by column group names (will 
 integrate with Raghu's A29 drop column work).
 6) Zebra V2 can support backward compatibility (if there are Zebra V1-created 
 tables in production when V2 is released). More specifically, this means that 
 Zebra V2 can load from V1-created tables and do dropColumnGroup on them.
 7) Renaming is NOT supported.




[jira] Updated: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data

2009-10-02 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-953:
---

Attachment: PIG-953-3.patch

Attached a patch which has the SortColInfo implementation to convey sort column 
information in SortInfo. This patch also addresses PIG-981.

 Enable merge join in pig to work with loaders and store functions which can 
 internally index sorted data 
 -

 Key: PIG-953
 URL: https://issues.apache.org/jira/browse/PIG-953
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.3.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
 Attachments: PIG-953-2.patch, PIG-953-3.patch, PIG-953.patch


 Currently merge join implementation in pig includes construction of an index 
 on sorted data and use of that index to seek into the right input to 
 efficiently perform the join operation. Some loaders (notably the zebra 
 loader) internally implement an index on sorted data and can perform this 
 seek efficiently using their index. So the use of the index needs to be 
 abstracted in such a way that when the loader supports indexing, pig uses it 
 (indirectly through the loader) and does not construct an index. 
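
 A minimal Java sketch of the abstraction described above: the planner asks 
 whether the right-side loader can seek on its own index, and only builds 
 Pig's index when it cannot. SelfIndexingLoader, SortedArrayLoader and 
 MergeJoinPlanner are hypothetical names invented for this sketch, with integer 
 keys standing in for join keys; they are not the interfaces introduced by the 
 PIG-953 patches.

```java
/** Hypothetical contract for loaders that keep their own sorted index. */
interface SelfIndexingLoader {
    /** Returns the position of the first row whose key is >= the given key. */
    int seekNear(int key);
}

/** Toy loader over sorted keys, standing in for a loader with a built-in index. */
class SortedArrayLoader implements SelfIndexingLoader {
    private final int[] sortedKeys;

    SortedArrayLoader(int[] sortedKeys) {
        this.sortedKeys = sortedKeys;
    }

    @Override
    public int seekNear(int key) {
        // Lower-bound binary search: first index whose key is >= key.
        int lo = 0;
        int hi = sortedKeys.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedKeys[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}

class MergeJoinPlanner {
    /** Pig needs to construct its own index only if the loader cannot seek by key. */
    static boolean needsPigBuiltIndex(Object rightLoader) {
        return !(rightLoader instanceof SelfIndexingLoader);
    }
}
```

 The design point is that the join operator talks only to the seek contract, 
 so it does not need to know whether the index lives in Pig or in the loader.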




[jira] Resolved: (PIG-981) Merge join should restrict join key expressions to simple projects

2009-10-02 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath resolved PIG-981.


Resolution: Duplicate

Fixed in 
https://issues.apache.org/jira/browse/PIG-953?focusedCommentId=12761760&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12761760

 Merge join should restrict join key expressions to simple projects
 --

 Key: PIG-981
 URL: https://issues.apache.org/jira/browse/PIG-981
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath

 Currently merge join allows join key expressions to be arbitrary expressions, 
 with the assumption that the expressions keep the sort order. Since currently 
 only ascending sort order is supported, the code checks at run time for sort 
 order and catches the case where sort order is broken because the join key 
 expression is not order preserving. However, there is a reason we should 
 restrict the join keys to projections of columns only:
  PIG-953 will enable merge join in Pig to work with loaders and 
 store functions which can internally index sorted data. These store functions 
 can only create an index (and hence look up on the index) on raw data columns 
 (and not on expressions on the columns).
 Hopefully this does not reduce the usability of merge join much, since 
 the expressions can always be applied post-join on the join columns and, since 
 the expressions are order preserving, they do not affect the outcome of the 
 join. 




[jira] Created: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Raghu Angadi (JIRA)
[zebra] Ability to drop a column group in a table
--

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0



A Zebra table is stored as multiple sub tables each containing a set of columns 
called column group (CG). The user specifies how these columns are grouped 
while creating a table through the _storage hint_.

For some of the large tables, it might be necessary for users to remove a set 
of columns and retain the rest. This jira provides a way for users to delete an 
entire column group. 

The following comments will have more details on API and the semantics. 




[jira] Commented: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761766#action_12761766
 ] 

Raghu Angadi commented on PIG-993:
--


The API is pretty simple: {code}
class org.apache.hadoop.zebra.BasicTable {
  /** see the patch for JavaDoc and attached example for usage */

  public static void dropColumnGroup(Path path,
                                     Configuration conf,
                                     String cgName)
      throws IOException { ... }
}
{code}

  * The table schema is not modified.  
  * This API takes a name for a column group. PIG-986 adds explicit names for 
CGs.
  * Once a CG is deleted, NULL is returned for the fields that were stored in 
the CG. 
 ** This is the main difference between just manually deleting a directory 
on the filesystem and 'properly' deleting a CG.
 ** Many changes made in other parts of Zebra are related to handling the 
missing CGs.
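
A minimal Java sketch of the read-side semantics listed above: the row keeps 
its full shape because the schema is unchanged, and fields that belonged to a 
dropped column group come back as null. DroppedColumnGroupSketch and readRow 
are hypothetical names for illustration only; this is not Zebra's reader code.

```java
import java.util.Set;

class DroppedColumnGroupSketch {
    /**
     * groupOfField[i] names the column group that stores field i.
     * Fields whose group was dropped are returned as null; the row
     * length always matches the (unmodified) table schema.
     */
    static Object[] readRow(String[] groupOfField,
                            Object[] storedValues,
                            Set<String> droppedGroups) {
        Object[] row = new Object[groupOfField.length];
        for (int i = 0; i < groupOfField.length; i++) {
            row[i] = droppedGroups.contains(groupOfField[i])
                    ? null
                    : storedValues[i];
        }
        return row;
    }
}
```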


 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0


 A Zebra table is stored as multiple sub tables each containing a set of 
 columns called column group (CG). The user specifies how these columns are 
 grouped while creating a table through the _storage hint_.
 For some of the large tables, it might be necessary for users to remove a set 
 of columns and retain the rest. This jira provides a way for users to delete 
 an entire column group. 
 The following comments will have more details on API and the semantics. 




[jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-993:
-

Attachment: zebra-drop-cg.patch
DropColumnGroupExample.java

Attachments:

  DropColumnGroupExample.java : a simple example to illustrate the 
functionality.

  zebra-drop-cg.patch : This patch would apply only after a patch for PIG-896.

  Some of the tests included there were written by Jing Huang. Jing also helped 
with testing the patch on real clusters with various errors. Yan Zhou helped 
with correctly handling missing column groups.



 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch


 A Zebra table is stored as multiple sub tables each containing a set of 
 columns called column group (CG). The user specifies how these columns are 
 grouped while creating a table through the _storage hint_.
 For some of the large tables, it might be necessary for users to remove a set 
 of columns and retain the rest. This jira provides a way for users to delete 
 an entire column group. 
 The following comments will have more details on API and the semantics. 




[jira] Commented: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761769#action_12761769
 ] 

Raghu Angadi commented on PIG-993:
--

 zebra-drop-cg.patch : This patch would apply only after a patch for PIG-896.
I meant to say PIG-986.


 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch


 A Zebra table is stored as multiple sub tables each containing a set of 
 columns called column group (CG). The user specifies how these columns are 
 grouped while creating a table through the _storage hint_.
 For some of the large tables, it might be necessary for users to remove a set 
 of columns and retain the rest. This jira provides a way for users to delete 
 an entire column group. 
 The following comments will have more details on API and the semantics. 




[jira] Updated: (PIG-992) [zebra] Separate Schema-related files into a Schema package

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-992:
-

Release Note: The patch file needs to be applied after Jira 986's patch has 
been applied.
  Status: Patch Available  (was: Open)

 [zebra] Separate Schema-related files into a Schema package
 -

 Key: PIG-992
 URL: https://issues.apache.org/jira/browse/PIG-992
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0

 Attachments: SchemaPackageChange.patch


 The hope is to facilitate future sharing of the Schema codes between 
 different modules and/or products. 




[jira] Updated: (PIG-992) [zebra] Separate Schema-related files into a Schema package

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-992:
-

Attachment: SchemaPackageChange.patch

 [zebra] Separate Schema-related files into a Schema package
 -

 Key: PIG-992
 URL: https://issues.apache.org/jira/browse/PIG-992
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0

 Attachments: SchemaPackageChange.patch


 The hope is to facilitate future sharing of the Schema codes between 
 different modules and/or products. 




[jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-993:
-

Attachment: zebra-drop-cq.patch

This patch should be applied after the patch for Jira992 has been applied.

 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch


 A Zebra table is stored as multiple sub tables each containing a set of 
 columns called column group (CG). The user specifies how these columns are 
 grouped while creating a table through the _storage hint_.
 For some of the large tables, it might be necessary for users to remove a set 
 of columns and retain the rest. This jira provides a way for users to delete 
 an entire column group. 
 The following comments will have more details on API and the semantics. 




[jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-993:
-

Attachment: (was: zebra-drop-cq.patch)

 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch


 A Zebra table is stored as multiple sub tables each containing a set of 
 columns called column group (CG). The user specifies how these columns are 
 grouped while creating a table through the _storage hint_.
 For some of the large tables, it might be necessary for users to remove a set 
 of columns and retain the rest. This jira provides a way for users to delete 
 an entire column group. 
 The following comments will have more details on API and the semantics. 




[jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-993:
-

Attachment: zebra-drop-cg.patch

This patch should be applied after the patch for Jira992 is applied.

 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch, 
 zebra-drop-cg.patch


 A Zebra table is stored as multiple sub-tables, each containing a set of 
 columns called a column group (CG). The user specifies how these columns are 
 grouped through the _storage hint_ when creating a table.
 For some large tables, users may need to remove a set of columns and retain 
 the rest. This jira provides a way for users to delete an entire column 
 group. 
 The following comments give more details on the API and the semantics. 




[jira] Commented: (PIG-992) [zebra] Separate Schema-related files into a Schema package

2009-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12761795#action_12761795
 ] 

Hadoop QA commented on PIG-992:
---

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12421180/SchemaPackageChange.patch
  against trunk revision 821101.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 183 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/59/console

This message is automatically generated.

 [zebra] Separate Schema-related files into a Schema package
 -

 Key: PIG-992
 URL: https://issues.apache.org/jira/browse/PIG-992
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0

 Attachments: SchemaPackageChange.patch


 The goal is to make it easier to share the Schema code between different 
 modules and/or products in the future. 




Re: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Michael Bigby


- Original Message -
From: Yan Zhou (JIRA) j...@apache.org
To: pig-dev@hadoop.apache.org pig-dev@hadoop.apache.org
Sent: Fri Oct 02 18:16:23 2009
Subject: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a 
table


 [ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-993:
-

Attachment: zebra-drop-cg.patch

This patch should be applied after the patch for Jira PIG-992 is applied.

 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch, 
 zebra-drop-cg.patch


 A Zebra table is stored as multiple sub-tables, each containing a set of 
 columns called a column group (CG). The user specifies how these columns are 
 grouped through the _storage hint_ when creating a table.
 For some large tables, users may need to remove a set of columns and retain 
 the rest. This jira provides a way for users to delete an entire column 
 group. 
 The following comments give more details on the API and the semantics. 




Re: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-02 Thread Michael Bigby

- Original Message -
From: Yan Zhou (JIRA) j...@apache.org
To: pig-dev@hadoop.apache.org pig-dev@hadoop.apache.org
Sent: Fri Oct 02 18:14:23 2009
Subject: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a 
table


 [ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-993:
-

Attachment: (was: zebra-drop-cq.patch)

 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.5.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch


 A Zebra table is stored as multiple sub-tables, each containing a set of 
 columns called a column group (CG). The user specifies how these columns are 
 grouped through the _storage hint_ when creating a table.
 For some large tables, users may need to remove a set of columns and retain 
 the rest. This jira provides a way for users to delete an entire column 
 group. 
 The following comments give more details on the API and the semantics. 




[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-944:
-

Attachment: (was: zebra_pig_interface.patch)

 Zebra schema is taken from Pig through TableStorer's construct
 --

 Key: PIG-944
 URL: https://issues.apache.org/jira/browse/PIG-944
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou

 The schema should be obtained from StoreConfig in the 
 TableOutputFormat.checkOutputSpecs method, because the information is dynamic 
 in Pig's execution engine and should not be passed as a static argument to 
 the constructor.




[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-944:
-

Attachment: (was: zebra_pig_interface_1_1.patch)

 Zebra schema is taken from Pig through TableStorer's construct
 --

 Key: PIG-944
 URL: https://issues.apache.org/jira/browse/PIG-944
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou

 The schema should be obtained from StoreConfig in the 
 TableOutputFormat.checkOutputSpecs method, because the information is dynamic 
 in Pig's execution engine and should not be passed as a static argument to 
 the constructor.




[jira] Assigned: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou reassigned PIG-944:


Assignee: Yan Zhou

 Zebra schema is taken from Pig through TableStorer's construct
 --

 Key: PIG-944
 URL: https://issues.apache.org/jira/browse/PIG-944
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou

 The schema should be obtained from StoreConfig in the 
 TableOutputFormat.checkOutputSpecs method, because the information is dynamic 
 in Pig's execution engine and should not be passed as a static argument to 
 the constructor.




[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-944:
-

Attachment: SchemaConversion.patch

This patch must be applied after the patch for Jira PIG-933 has been applied.

 Zebra schema is taken from Pig through TableStorer's construct
 --

 Key: PIG-944
 URL: https://issues.apache.org/jira/browse/PIG-944
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Attachments: SchemaConversion.patch


 The schema should be obtained from StoreConfig in the 
 TableOutputFormat.checkOutputSpecs method, because the information is dynamic 
 in Pig's execution engine and should not be passed as a static argument to 
 the constructor.




[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct

2009-10-02 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-944:
-

Fix Version/s: 0.6.0

 Zebra schema is taken from Pig through TableStorer's construct
 --

 Key: PIG-944
 URL: https://issues.apache.org/jira/browse/PIG-944
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Fix For: 0.6.0

 Attachments: SchemaConversion.patch


 The schema should be obtained from StoreConfig in the 
 TableOutputFormat.checkOutputSpecs method, because the information is dynamic 
 in Pig's execution engine and should not be passed as a static argument to 
 the constructor.
