[jira] Updated: (PIG-960) Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage
[ https://issues.apache.org/jira/browse/PIG-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-960:
---
Resolution: Fixed
Fix Version/s: 0.6.0
Status: Resolved (was: Patch Available)

Patch committed. Thanks Ankit!

Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage
---
Key: PIG-960
URL: https://issues.apache.org/jira/browse/PIG-960
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Ankit Modi
Fix For: 0.6.0
Attachments: pig_rlr.patch

PigStorage's reading of Tuples (lines) can be optimized using Hadoop's {{LineRecordReader}}. This can help in the following areas:
- Improving the performance of reading Tuples (lines) in {{PigStorage}}
- Any future improvements to line reading in Hadoop's {{LineRecordReader}} are automatically carried over to Pig

Issues that are handled by this patch:
- BZip uses internal buffers and positioning for determining the number of bytes read. Hence buffering done by {{LineRecordReader}} has to be turned off.
- The current implementation of {{LocalSeekableInputStream}} does not implement the {{available}} method. This method has to be implemented.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-987) Zebra Column Group Access Control
[ https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761645#action_12761645 ] Yan Zhou commented on PIG-987: -- The extra warnings were generated on 7 modified Java files that were produced by the JavaCC code generator. They should be ignored. Zebra Column Group Access Control - Key: PIG-987 URL: https://issues.apache.org/jira/browse/PIG-987 Project: Pig Issue Type: New Feature Affects Versions: 0.6.0 Reporter: Yan Zhou Assignee: Yan Zhou Attachments: ColumnGroupSecurity.patch Access Control: when processes try to read from the column groups, Zebra should be able to handle allowed vs. disallowed user/application accesses. Expected behavior when column group permissions are set: When user selects only columns that they do not have permissions to access, Zebra should return error with message Error #: Permission denied for accessing column column name or names Access control applies to an entire column group, so all columns in a column group have same permissions.
[jira] Updated: (PIG-592) schema inferred incorrectly
[ https://issues.apache.org/jira/browse/PIG-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-592:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)

Patch committed

schema inferred incorrectly
---
Key: PIG-592
URL: https://issues.apache.org/jira/browse/PIG-592
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Christopher Olston
Fix For: 0.6.0
Attachments: PIG-592-1.patch, PIG-592-2.patch, PIG-592-3.patch

A simple pig script, that never introduces any schema information:

A = load 'foo';
B = foreach (group A by $8) generate group, COUNT($1);
C = load 'bar'; // ('bar' has two columns)
D = join B by $0, C by $0;
E = foreach D generate $0, $1, $3;

Fails, complaining that $3 does not exist:

java.io.IOException: Out of bound access. Trying to access non-existent column: 3. Schema {B::group: bytearray,long,bytearray} has 3 column(s).

Apparently Pig gets confused, and thinks it knows the schema for C (a single bytearray column).
[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section
[ https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-991: - Description: 1) lzo2 was used as the compressor name for the LZO compression algorithm; it should be lzo instead; 2) the default compression is changed from lzo to gz for gzip; 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old package org.apache.pig.table.types; 4) in build.xml, two new javacc targets are added to generate TableSchemaParser and TableStorageParser java codes; 5) Support of column group security ( https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the dumpinfo method: the groups and permissions were not displayed. Note that as a consequence, the patch herein must be applied after that of JIRA987. 6) and 7) a couple of issues reported in Jira917. was: 1) lzo2 was used as the compressor name for the LZO compression algorithm; it should be lzo instead; 2) the default compression is changed from lzo to gz for gzip; 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old package org.apache.pig.table.types; 4) in build.xml, two new javacc targets are added to generate TableSchemaParser and TableStorageParser java codes; 5) Support of column group security ( https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the dumpinfo method: the groups and permissions were not displayed. Note that as a consequence, the patch herein must be applied after that of JIRA987. 
Summary: [zebra] A few minor bugs as described in the Description section (was: A few minor bugs as described in the Description section) [zebra] A few minor bugs as described in the Description section Key: PIG-991 URL: https://issues.apache.org/jira/browse/PIG-991 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Priority: Minor Fix For: 0.6.0 1) lzo2 was used as the compressor name for the LZO compression algorithm; it should be lzo instead; 2) the default compression is changed from lzo to gz for gzip; 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old package org.apache.pig.table.types; 4) in build.xml, two new javacc targets are added to generate TableSchemaParser and TableStorageParser java codes; 5) Support of column group security ( https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the dumpinfo method: the groups and permissions were not displayed. Note that as a consequence, the patch herein must be applied after that of JIRA987. 6) and 7) a couple of issues reported in Jira917. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control
[ https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-987: - Description: Access Control: when processes try to read from the column groups, Zebra should be able to handle allowed vs. disallowed user/application accesses. The security is eventually granted by corresponding HDFS security of the data stored. Expected behavior when column group permissions are set: When user selects only columns that they do not have permissions to access, Zebra should return error with message Error #: Permission denied for accessing column column name or names Access control applies to an entire column group, so all columns in a column group have same permissions. was: Access Control: when processes try to read from the column groups, Zebra should be able to handle allowed vs. disallowed user/application accesses. Expected behavior when column group permissions are set: When user selects only columns that they do not have permissions to access, Zebra should return error with message Error #: Permission denied for accessing column column name or names Access control applies to an entire column group, so all columns in a column group have same permissions. Summary: [zebra] Zebra Column Group Access Control (was: Zebra Column Group Access Control) [zebra] Zebra Column Group Access Control - Key: PIG-987 URL: https://issues.apache.org/jira/browse/PIG-987 Project: Pig Issue Type: New Feature Affects Versions: 0.6.0 Reporter: Yan Zhou Assignee: Yan Zhou Attachments: ColumnGroupSecurity.patch Access Control: when processes try to read from the column groups, Zebra should be able to handle allowed vs. disallowed user/application accesses. The security is eventually granted by corresponding HDFS security of the data stored. 
Expected behavior when column group permissions are set: When user selects only columns that they do not have permissions to access, Zebra should return error with message Error #: Permission denied for accessing column column name or names Access control applies to an entire column group, so all columns in a column group have same permissions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
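The expected behavior above can be sketched as a small check. This is only an illustration under stated assumptions, not Zebra's implementation: the class `ColumnGroupAccess`, the method `check`, and the two lookup structures are hypothetical, the error number is left out because the issue does not give it, and the sketch deliberately does nothing when only some of the selected columns are denied, since the issue describes only the case where all of them are.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of Zebra: models the "all selected columns denied" rule. */
final class ColumnGroupAccess {
    /**
     * @param columnToGroup  column name -> name of the column group that stores it
     * @param readableGroups column groups the caller is allowed to read
     * @param projection     columns the caller selected
     */
    static void check(Map<String, String> columnToGroup,
                      Set<String> readableGroups,
                      List<String> projection) {
        // Permissions apply to whole column groups, so a column is denied
        // exactly when its group is not readable.
        List<String> denied = new ArrayList<>();
        for (String column : projection) {
            if (!readableGroups.contains(columnToGroup.get(column))) {
                denied.add(column);
            }
        }
        // Assumption: only the case "every selected column is denied" raises an error;
        // the issue text does not specify the mixed case.
        if (!projection.isEmpty() && denied.size() == projection.size()) {
            throw new SecurityException(
                "Permission denied for accessing column " + String.join(", ", denied));
        }
    }
}
```

Because the check works on groups rather than individual columns, two columns stored in the same column group can never receive different answers.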
[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section
[ https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-991: - Release Note: Patch should be applied after that of Jira987. Status: Patch Available (was: Open) [zebra] A few minor bugs as described in the Description section Key: PIG-991 URL: https://issues.apache.org/jira/browse/PIG-991 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Priority: Minor Fix For: 0.6.0 1) lzo2 was used as the compressor name for the LZO compression algorithm; it should be lzo instead; 2) the default compression is changed from lzo to gz for gzip; 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old package org.apache.pig.table.types; 4) in build.xml, two new javacc targets are added to generate TableSchemaParser and TableStorageParser java codes; 5) Support of column group security ( https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the dumpinfo method: the groups and permissions were not displayed. Note that as a consequence, the patch herein must be applied after that of JIRA987. 6) and 7) a couple of issues reported in Jira917. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-991) [zebra] A few minor bugs as described in the Description section
[ https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12761671#action_12761671 ] Hadoop QA commented on PIG-991: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12421138/Bugs.patch against trunk revision 821101. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/57/console This message is automatically generated. [zebra] A few minor bugs as described in the Description section Key: PIG-991 URL: https://issues.apache.org/jira/browse/PIG-991 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Priority: Minor Fix For: 0.6.0 Attachments: Bugs.patch 1) lzo2 was used as the compressor name for the LZO compression algorithm; it should be lzo instead; 2) the default compression is changed from lzo to gz for gzip; 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old package org.apache.pig.table.types; 4) in build.xml, two new javacc targets are added to generate TableSchemaParser and TableStorageParser java codes; 5) Support of column group security ( https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the dumpinfo method: the groups and permissions were not displayed. Note that as a consequence, the patch herein must be applied after that of JIRA987. 6) and 7) a couple of issues reported in Jira917. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-992) [zebra] Separate Schema-related files into a Schema package
[zebra] Separate Schema-related files into a Schema package - Key: PIG-992 URL: https://issues.apache.org/jira/browse/PIG-992 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Priority: Minor Fix For: 0.6.0 The hope is to facilitate future sharing of the Schema codes between different modules and/or products. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-986) [zebra] Zebra Column Group Naming Support
[ https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-986:
---
Fix Version/s: 0.6.0 (was: 0.5.0)
Affects Version/s: 0.4.0
Release Note: The patch must be applied after the one in Jira991 has been applied.
Status: Patch Available (was: Open)

The patch file name is ColumnGroupName.patch

[zebra] Zebra Column Group Naming Support
---
Key: PIG-986
URL: https://issues.apache.org/jira/browse/PIG-986
Project: Pig
Issue Type: New Feature
Components: impl
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
Fix For: 0.6.0
Attachments: ColumnGroupName.patch

We introduce column group name to Zebra and make it a first-class citizen in Zebra. This can ease management of column groups. We plan to introduce an as clause for column group name in Zebra's syntax.

Functional Specifications:
1) Column group names are optional. For column groups which do not have a user-provided name, Zebra will assign default column group names internally that are unique for that table - CG0, CG1, CG2 ... Note: If CGx is used by user, then it can not be used for internal names.
2) We introduce an AS clause in Zebra's syntax for column group names. If it occurs, it has to immediately follow [ ]. For example, [a1, a2] as PI secure by user:joe group:secure perm:640; [a3, a4] as General compress by lzo. Note that keyword AS is case insensitive.
3) Column group names are unique within one table and are case sensitive, i.e., c1 and C1 are different.
4) Column group names will be used as the physical column group directory path names.
5) Zebra V2 will support dropColumnGroup by column group names (will integrate with Raghu's A29 drop column work).
6) Zebra V2 can support backward compatibility (If there are Zebra V1 created tables in production when V2 is released). More specifically, this means that Zebra V2 can load from V1-created tables and do dropColumnGroup on it.
7) Does NOT support renaming.
[jira] Commented: (PIG-986) [zebra] Zebra Column Group Naming Support
[ https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12761744#action_12761744 ] Hadoop QA commented on PIG-986: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12421161/ColumnGroupName.patch against trunk revision 821101. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 44 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/58/console This message is automatically generated. [zebra] Zebra Column Group Naming Support - Key: PIG-986 URL: https://issues.apache.org/jira/browse/PIG-986 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0 Attachments: ColumnGroupName.patch We introduce column group name to Zebra and make it a first-class citizen in Zebra. This can ease management of column groups. We plan to introduce an as clause for column group name in Zebra's syntax. Functional Specifications: 1) Column group names are optional. For column groups which do not have a user-provided name, Zebra will assign some default column group names internally that is unique for that table - CG0, CG1, CG2 ... Note: If CGx is used by user, then it can not be used for internal names. 2) We introduce an AS clause in Zebra's syntax for column group names. If it occurs, it has to immediately follow [ ]. For example, [a1, a2] as PI secure by user:joe group:secure perm:640; [a3, a4] as General compress by lzo. Note that keyword AS is case insensitive. 3) Column group names are unique within one table and are case sensitive, i.e., c1 and C1 are different. 4) Column group names will be used as the physical column group directory path names. 
5) Zebra V2 will support dropColumnGroup by column group names (will integrate with Raghu's A29 drop column work). 6) Zebra V2 can support backward compatibility (If there are Zebra V1 created tables in production when V2 is released). More specifically, this means that Zebra V2 can load from V1-created tables and do dropColumnGroup on it. 7) Does NOT support renaming. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
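The naming rules in items 1) and 3) of the specification can be sketched as follows. This is a minimal illustration, not the code in ColumnGroupName.patch: the class `ColumnGroupNames` and the method `assignNames` are hypothetical, and the sketch assumes the user-provided names arrive as a list with `null` standing for "no name given".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of Zebra: illustrates rules 1) and 3) only. */
final class ColumnGroupNames {
    /**
     * @param userNames one entry per column group; null means "no user-provided name"
     * @return the final name of every column group, in the same order
     */
    static List<String> assignNames(List<String> userNames) {
        // Rule 3: names are unique within a table and case sensitive, so a plain
        // HashSet<String> (which compares strings case-sensitively) is enough.
        Set<String> used = new HashSet<>();
        for (String name : userNames) {
            if (name != null && !used.add(name)) {
                throw new IllegalArgumentException("Duplicate column group name: " + name);
            }
        }
        // Rule 1: unnamed groups get CG0, CG1, CG2 ..., skipping any CGx
        // that the user has already taken.
        List<String> result = new ArrayList<>();
        int next = 0;
        for (String name : userNames) {
            if (name == null) {
                String candidate;
                do {
                    candidate = "CG" + next++;
                } while (!used.add(candidate));
                name = candidate;
            }
            result.add(name);
        }
        return result;
    }
}
```

For instance, with the user names `PI`, none, `CG0`, none, the two unnamed groups would be called `CG1` and `CG2` in this sketch, because `CG0` is already in use.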
[jira] Updated: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data
[ https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-953: --- Attachment: PIG-953-3.patch Attached patch which has the SortColInfo implementation to convey sort column information in SortInfo. This patch also address PIG-981. Enable merge join in pig to work with loaders and store functions which can internally index sorted data - Key: PIG-953 URL: https://issues.apache.org/jira/browse/PIG-953 Project: Pig Issue Type: Improvement Affects Versions: 0.3.0 Reporter: Pradeep Kamath Assignee: Pradeep Kamath Attachments: PIG-953-2.patch, PIG-953-3.patch, PIG-953.patch Currently merge join implementation in pig includes construction of an index on sorted data and use of that index to seek into the right input to efficiently perform the join operation. Some loaders (notably the zebra loader) internally implement an index on sorted data and can perform this seek efficiently using their index. So the use of the index needs to be abstracted in such a way that when the loader supports indexing, pig uses it (indirectly through the loader) and does not construct an index. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
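The abstraction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the design in the attached patches: the interfaces `SortedInput` and `SelfIndexedInput` and the class `MergeJoinSeek` are hypothetical names, join keys are simplified to strings sorted in ascending order, and the fallback index samples every second key purely for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical view of the right-side input, sorted ascending by join key. */
interface SortedInput {
    List<String> keys();
}

/** Hypothetical marker for loaders that keep their own index on sorted data. */
interface SelfIndexedInput extends SortedInput {
    /** Returns the position of the first record whose key is >= the given key. */
    int seekNear(String key);
}

final class MergeJoinSeek {
    /** Position of the first record with key >= key in the right input. */
    static int seek(SortedInput right, String key) {
        if (right instanceof SelfIndexedInput) {
            // The loader supports indexing: use it and build no index here.
            return ((SelfIndexedInput) right).seekNear(key);
        }
        // Fallback: build a sparse index (every 2nd key) over the sorted data.
        List<String> keys = right.keys();
        TreeMap<String, Integer> index = new TreeMap<>();
        for (int i = 0; i < keys.size(); i += 2) {
            index.putIfAbsent(keys.get(i), i);
        }
        // Start at the nearest indexed key strictly below the target, then scan
        // forward; "strictly below" keeps duplicates of the target key in range.
        Map.Entry<String, Integer> start = index.lowerEntry(key);
        int pos = (start == null) ? 0 : start.getValue();
        while (pos < keys.size() && keys.get(pos).compareTo(key) < 0) {
            pos++;
        }
        return pos;
    }
}
```

The caller sees a single `seek` operation either way, which is the point of the proposal: whether an index is built by Pig or supplied by the loader becomes an internal detail.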
[jira] Resolved: (PIG-981) Merge join should restrict join key expressions to simple projects
[ https://issues.apache.org/jira/browse/PIG-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath resolved PIG-981. Resolution: Duplicate Fixed in https://issues.apache.org/jira/browse/PIG-953?focusedCommentId=12761760page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12761760 Merge join should restrict join key expressions to simple projects -- Key: PIG-981 URL: https://issues.apache.org/jira/browse/PIG-981 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Pradeep Kamath Assignee: Pradeep Kamath Currently merge join allows join key expressions to be arbitrary expressions with the assumption that the expressions keep the sort order. Since currently only ascending sort order is supported, the code checks at run times for sort order and catches the case where sort order is broken because the join key expression is not order preserving. However there is a reason we should restrict the join keys to projection of columns only: PIG-953 will enable pig to perform merge join to work with loaders and store functions which can internally index sorted data. These store functions can only create an index (and hence lookup on the index) on raw data columns (and not expressions on the columns). Hopefully this does not downgrade the usability of merge join much since if the expressions can always be applied post join on the join columns and since the expressions are order preserving they do not affect the outcome of the join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-993) [zebra] Abitlity to drop a column group in a table
[zebra] Abitlity to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 A Zebra table is stored as multiple sub tables each containing a set of columns called column group (CG). The user specifies how these columns are grouped while creating a table through the _storage hint_. For some of the large tables, it might be necessary for users to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on API and the semantics. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-993) [zebra] Abitlity to drop a column group in a table
[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761766#action_12761766 ]

Raghu Angadi commented on PIG-993:
---

API is pretty simple:

{code}
class org.apache.hadoop.zebra.BasicTable {
  /** see the patch for JavaDoc and attached example for usage */
  public static void dropColumnGroup(Path path, Configuration conf, String cgName)
      throws IOException {
    ...
  }
}
{code}

* Table schema is not modified.
* This API takes a name for a column group. PIG-986 adds explicit names for CGs.
* Once a CG is deleted, NULL is returned for the fields that were stored in the CG.
** This is the main difference between just manually deleting a directory on filesystem and 'properly' deleting a CG.
** Many changes made in other parts of zebra are related to handling the missing CGs.

[zebra] Abitlity to drop a column group in a table
---
Key: PIG-993
URL: https://issues.apache.org/jira/browse/PIG-993
Project: Pig
Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
Fix For: 0.5.0

A Zebra table is stored as multiple sub tables each containing a set of columns called column group (CG). The user specifies how these columns are grouped while creating a table through the _storage hint_. For some of the large tables, it might be necessary for users to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on API and the semantics.
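The semantics listed in the comment (schema unchanged, NULL returned for fields of a deleted CG) can be illustrated with a toy in-memory model. This is not Zebra code and does not use the real `BasicTable` API: the class `ToyTable` and all of its methods are hypothetical, and the table holds a single row of string values to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy in-memory stand-in for a one-row table; this is NOT Zebra's BasicTable. */
final class ToyTable {
    // Column group name -> columns stored in that group, in schema order.
    private final Map<String, List<String>> columnGroups = new LinkedHashMap<>();
    // Column name -> stored value for the single row of this toy table.
    private final Map<String, String> row = new LinkedHashMap<>();
    // Names of column groups that have been dropped.
    private final Set<String> dropped = new HashSet<>();

    void addColumnGroup(String cgName, List<String> columns) {
        columnGroups.put(cgName, new ArrayList<>(columns));
    }

    void put(String column, String value) {
        row.put(column, value);
    }

    /** Drops a column group by name; the schema itself is left untouched. */
    void dropColumnGroup(String cgName) {
        if (!columnGroups.containsKey(cgName)) {
            throw new IllegalArgumentException("Unknown column group: " + cgName);
        }
        dropped.add(cgName);
    }

    /** The schema still lists every column, including those of dropped groups. */
    List<String> schema() {
        List<String> columns = new ArrayList<>();
        for (List<String> group : columnGroups.values()) {
            columns.addAll(group);
        }
        return columns;
    }

    /** Reads the row in schema order; fields of a dropped group come back as null. */
    List<String> read() {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, List<String>> group : columnGroups.entrySet()) {
            boolean isDropped = dropped.contains(group.getKey());
            for (String column : group.getValue()) {
                values.add(isDropped ? null : row.get(column));
            }
        }
        return values;
    }
}
```

The model records the drop explicitly instead of simply discarding the data, which mirrors the distinction the comment draws between deleting a directory by hand and "properly" deleting a CG: readers keep working against the full schema and see nulls for the missing fields.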
[jira] Updated: (PIG-993) [zebra] Abitlity to drop a column group in a table
[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raghu Angadi updated PIG-993: - Attachment: zebra-drop-cg.patch DropColumnGroupExample.java Attachments: DropColumnGroupExample.java: a simple example to illustrate the functionality. zebra-drop-cg.patch: This patch would apply only after a patch for PIG-896. Some of the tests included there are written by Jing Huang. Jing also helped with testing the patch on real clusters with various errors. Yan Zhou helped with correctly handling missing column groups. [zebra] Abitlity to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch A Zebra table is stored as multiple sub tables each containing a set of columns called column group (CG). The user specifies how these columns are grouped while creating a table through the _storage hint_. For some of the large tables, it might be necessary for users to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on API and the semantics.
[jira] Commented: (PIG-993) [zebra] Abitlity to drop a column group in a table
[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761769#action_12761769 ] Raghu Angadi commented on PIG-993: -- zebra-drop-cg.patch : This patch would apply only after a patch for PIG-896. I meant to say PIG-986. [zebra] Abitlity to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch A Zebra table is stored as multiple sub tables each containing a set of columns called column group (CG). The user specifies how these columns are grouped while creating a table through the _storage hint_. For some of the large tables, it might be necessary for users to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on API and the semantics.
[jira] Updated: (PIG-992) [zebra] Separate Schema-related files into a Schema package
[ https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-992: - Release Note: The patch file needs to be applied after Jira 986's patch has been applied. Status: Patch Available (was: Open) [zebra] Separate Schema-related files into a Schema package - Key: PIG-992 URL: https://issues.apache.org/jira/browse/PIG-992 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Priority: Minor Fix For: 0.6.0 Attachments: SchemaPackageChange.patch The hope is to facilitate future sharing of the Schema codes between different modules and/or products. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-992) [zebra] Separate Schema-related files into a Schema package
[ https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-992: - Attachment: SchemaPackageChange.patch [zebra] Separate Schema-related files into a Schema package - Key: PIG-992 URL: https://issues.apache.org/jira/browse/PIG-992 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Priority: Minor Fix For: 0.6.0 Attachments: SchemaPackageChange.patch The hope is to facilitate future sharing of the Schema codes between different modules and/or products. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-993) [zebra] Abitlity to drop a column group in a table
[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-993: - Attachment: zebra-drop-cq.patch this patch should be applied after the patch for Jira992 has been applied. [zebra] Abitlity to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch A Zebra table is stored as multiple sub tables each containing a set of columns called column group (CG). The user specifies how these columns are grouped while creating a table through the _storage hint_. For some of the large tables, it might be necessary for users to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on API and the semantics. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-993) [zebra] Abitlity to drop a column group in a table
[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-993: - Attachment: (was: zebra-drop-cq.patch) [zebra] Abitlity to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch A Zebra table is stored as multiple sub tables each containing a set of columns called column group (CG). The user specifies how these columns are grouped while creating a table through the _storage hint_. For some of the large tables, it might be necessary for users to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on API and the semantics. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-993) [zebra] Abitlity to drop a column group in a table
[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-993: - Attachment: zebra-drop-cg.patch This patch should be applied after the patch for Jira992 is applied. [zebra] Abitlity to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch, zebra-drop-cg.patch A Zebra table is stored as multiple sub tables each containing a set of columns called column group (CG). The user specifies how these columns are grouped while creating a table through the _storage hint_. For some of the large tables, it might be necessary for users to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on API and the semantics.
[jira] Commented: (PIG-992) [zebra] Separate Schema-related files into a Schema package
[ https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12761795#action_12761795 ] Hadoop QA commented on PIG-992: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12421180/SchemaPackageChange.patch against trunk revision 821101. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 183 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/59/console This message is automatically generated. [zebra] Separate Schema-related files into a Schema package - Key: PIG-992 URL: https://issues.apache.org/jira/browse/PIG-992 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Priority: Minor Fix For: 0.6.0 Attachments: SchemaPackageChange.patch The hope is to facilitate future sharing of the Schema codes between different modules and/or products. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table
- Original Message - From: Yan Zhou (JIRA) j...@apache.org To: pig-dev@hadoop.apache.org pig-dev@hadoop.apache.org Sent: Fri Oct 02 18:16:23 2009 Subject: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table [ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-993: - Attachment: zebra-drop-cg.patch This patch should be applied after the patch for Jira PIG-992 is applied. [zebra] Ability to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch, zebra-drop-cg.patch A Zebra table is stored as multiple sub-tables, each containing a set of columns called a column group (CG). The user specifies how these columns are grouped when creating a table through the _storage hint_. For some large tables, users may need to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on the API and the semantics.
Re: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table
X - Original Message - From: Yan Zhou (JIRA) j...@apache.org To: pig-dev@hadoop.apache.org pig-dev@hadoop.apache.org Sent: Fri Oct 02 18:14:23 2009 Subject: [jira] Updated: (PIG-993) [zebra] Ability to drop a column group in a table [ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-993: - Attachment: (was: zebra-drop-cq.patch) [zebra] Ability to drop a column group in a table -- Key: PIG-993 URL: https://issues.apache.org/jira/browse/PIG-993 Project: Pig Issue Type: Bug Reporter: Raghu Angadi Assignee: Raghu Angadi Fix For: 0.5.0 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch A Zebra table is stored as multiple sub-tables, each containing a set of columns called a column group (CG). The user specifies how these columns are grouped when creating a table through the _storage hint_. For some large tables, users may need to remove a set of columns and retain the rest. This jira provides a way for users to delete an entire column group. The following comments will have more details on the API and the semantics.
[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-944: - Attachment: (was: zebra_pig_interface.patch) Zebra schema is taken from Pig through TableStorer's construct -- Key: PIG-944 URL: https://issues.apache.org/jira/browse/PIG-944 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou The schema should be taken from StoreConfig in the TableOutputFormat.checkOutputSpecs method, because the information is dynamic in Pig's execution engine and should not be passed as a static argument to the constructor.
[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-944: - Attachment: (was: zebra_pig_interface_1_1.patch) Zebra schema is taken from Pig through TableStorer's construct -- Key: PIG-944 URL: https://issues.apache.org/jira/browse/PIG-944 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou The schema should be taken from StoreConfig in the TableOutputFormat.checkOutputSpecs method, because the information is dynamic in Pig's execution engine and should not be passed as a static argument to the constructor.
[jira] Assigned: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou reassigned PIG-944: Assignee: Yan Zhou Zebra schema is taken from Pig through TableStorer's construct -- Key: PIG-944 URL: https://issues.apache.org/jira/browse/PIG-944 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou The schema should be taken from StoreConfig in the TableOutputFormat.checkOutputSpecs method, because the information is dynamic in Pig's execution engine and should not be passed as a static argument to the constructor.
[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-944: - Attachment: SchemaConversion.patch This patch must be applied after the patch for Jira PIG-933 has been applied. Zebra schema is taken from Pig through TableStorer's construct -- Key: PIG-944 URL: https://issues.apache.org/jira/browse/PIG-944 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Attachments: SchemaConversion.patch The schema should be taken from StoreConfig in the TableOutputFormat.checkOutputSpecs method, because the information is dynamic in Pig's execution engine and should not be passed as a static argument to the constructor.
[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-944: - Fix Version/s: 0.6.0 Zebra schema is taken from Pig through TableStorer's construct -- Key: PIG-944 URL: https://issues.apache.org/jira/browse/PIG-944 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.6.0 Attachments: SchemaConversion.patch The schema should be taken from StoreConfig in the TableOutputFormat.checkOutputSpecs method, because the information is dynamic in Pig's execution engine and should not be passed as a static argument to the constructor.