[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-07 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763361#action_12763361
 ] 

Yan Zhou commented on PIG-987:
--

I don't think the owner name is a problem because in this release it has no 
effect at all.

The log complains about "chgrp changing group ... is not permitted". Can you 
chgrp a local FS file to a group called "users" on your box?

> [zebra] Zebra Column Group Access Control
> -
>
> Key: PIG-987
> URL: https://issues.apache.org/jira/browse/PIG-987
> Project: Pig
> Issue Type: New Feature
> Affects Versions: 0.6.0
> Reporter: Yan Zhou
> Assignee: Yan Zhou
> Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
> TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
> TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch
>
>
> Access Control: when processes try to read from the column groups, Zebra 
> should be able to handle allowed vs. disallowed user/application accesses. 
> Access is ultimately governed by the corresponding HDFS security of the 
> stored data.
> Expected behavior when column group permissions are set:
> When a user selects only columns that they do not have permission to 
> access, Zebra should return an error with the message "Error #: Permission denied 
> for accessing column  
> Access control applies to an entire column group, so all columns in a column 
> group have the same permissions. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-07 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763346#action_12763346
 ] 

Raghu Angadi commented on PIG-987:
--

I finally got some time to look into this. Yes, I think it should be fixed in 
the tests. TestColumnGroup.java says:
{noformat}
ColumnGroup.Writer writer = new ColumnGroup.Writer(path, strSchema, sorted,
"pig", "gz", "gauravj", "users", (short) Short.parseShort("755", 8), 
false, conf);
{noformat}

using the local FS. How can we expect users to have a user named "gauravj" on 
their machines and to run as superusers :)? It just cannot be done.

If the test wants to run with these permissions, we should:
 a) use HDFS (MiniDFSCluster) rather than the local filesystem. The tester has 
all the permissions on a MiniDFS.
 b) minor: use a more generic name than gauravj.
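
For reference, the mode argument in the snippet above parses the string "755" as an octal number. Below is a minimal Java sketch of what that expression evaluates to and how the resulting bits map onto owner/group/other permissions. The class and method names (`OctalModeDemo`, `parseMode`, `toSymbolic`) are made up for illustration and are not part of Zebra.

```java
// Illustrative only: OctalModeDemo and its methods are hypothetical helpers, not Zebra code.
class OctalModeDemo {
    // Parse a permission string such as "755" as base 8, as the test snippet does.
    static short parseMode(String octal) {
        return Short.parseShort(octal, 8);
    }

    // Render the low nine permission bits in "rwxr-xr-x" form.
    static String toSymbolic(short mode) {
        String flags = "rwxrwxrwx";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 9; i++) {
            int bit = 1 << (8 - i); // walk from the owner-read bit down to other-execute
            sb.append((mode & bit) != 0 ? flags.charAt(i) : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        short mode = parseMode("755");
        System.out.println(mode + " " + toSymbolic(mode)); // prints "493 rwxr-xr-x"
    }
}
```

`Short.parseShort` already returns a `short`, so the extra `(short)` cast in the test snippet is redundant but harmless.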





[jira] Commented: (PIG-996) [zebra] Zebra build script does not have findbugs and clover targets.

2009-10-07 Thread Jing Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763324#action_12763324
 ] 

Jing Huang commented on PIG-996:


+1 
Patch reviewed.

> [zebra] Zebra build script does not have findbugs and clover targets.
> -
>
> Key: PIG-996
> URL: https://issues.apache.org/jira/browse/PIG-996
> Project: Pig
> Issue Type: Bug
> Components: build
> Affects Versions: 0.4.0
> Reporter: Chao Wang
> Assignee: Chao Wang
> Fix For: 0.6.0
>
> Attachments: patch_build
>
>
> Zebra build script does not have findbugs and clover targets, leading hudson 
> build process to fail on Zebra.
> This jira is to fix this by adding these two targets.




[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-07 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763325#action_12763325
 ] 

Yan Zhou commented on PIG-987:
--

I see the following errors in your attached log:

chgrp: changing group of 
`/home/raghu/h/pig-commit/build/contrib/zebra/test/data/TestColumnGroupNullSplits':
 Operation not permitted

So I believe your tests have encountered disk permission problems. Note that we 
are testing the "column group security" feature, so having proper 
permission settings is necessary for the tests to pass.




[jira] Commented: (PIG-995) Limit Optimizer throw exception "ERROR 2156: Error while fixing projections"

2009-10-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763322#action_12763322
 ] 

Hadoop QA commented on PIG-995:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421587/PIG-995-1.patch
  against trunk revision 822382.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/65/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/65/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/65/console

This message is automatically generated.

> Limit Optimizer throw exception "ERROR 2156: Error while fixing projections"
> 
>
> Key: PIG-995
> URL: https://issues.apache.org/jira/browse/PIG-995
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-995-1.patch
>
>
> The following script fails:
> A = load '1.txt' AS (a0, a1, a2);
> B = order A by a1;
> C = limit B 10;
> D = foreach C generate $0;
> dump D;
> Error log:
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2156: Error while 
> fixing projections. Projection map of node to be replaced is null.
> at 
> org.apache.pig.impl.logicalLayer.ProjectFixerUpper.visit(ProjectFixerUpper.java:138)
> at 
> org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:408)
> at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:58)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:65)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
> at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
> at 
> org.apache.pig.impl.logicalLayer.LOForEach.rewire(LOForEach.java:761)




[jira] Updated: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data

2009-10-07 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-953:
---

Attachment: PIG-953-4.patch

Thanks for the review, Ashutosh. Attached an updated patch that addresses the concerns.

> Enable merge join in pig to work with loaders and store functions which can 
> internally index sorted data 
> -
>
> Key: PIG-953
> URL: https://issues.apache.org/jira/browse/PIG-953
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.3.0
> Reporter: Pradeep Kamath
> Assignee: Pradeep Kamath
> Attachments: PIG-953-2.patch, PIG-953-3.patch, PIG-953-4.patch, 
> PIG-953.patch
>
>
> Currently merge join implementation in pig includes construction of an index 
> on sorted data and use of that index to seek into the "right input" to 
> efficiently perform the join operation. Some loaders (notably the zebra 
> loader) internally implement an index on sorted data and can perform this 
> seek efficiently using their index. So the use of the index needs to be 
> abstracted in such a way that when the loader supports indexing, pig uses it 
> (indirectly through the loader) and does not construct an index. 




[jira] Commented: (PIG-996) [zebra] Zebra build script does not have findbugs and clover targets.

2009-10-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763305#action_12763305
 ] 

Hadoop QA commented on PIG-996:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421577/patch_build
  against trunk revision 822382.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/14/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/14/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/14/console

This message is automatically generated.




[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.

2009-10-07 Thread mark meissonnier (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763294#action_12763294
 ] 

mark meissonnier commented on PIG-760:
--

Any new development on this issue? I'm finding it painful to have to modify the 
input schema of all "child" pig scripts any time I modify my "root" pig script. 
I was thinking of developing something quick, and then I figured someone might 
already have done something, or I could help the overall effort.
Please let me know.
Thanks

> Serialize schemas for PigStorage() and other storage types.
> ---
>
> Key: PIG-760
> URL: https://issues.apache.org/jira/browse/PIG-760
> Project: Pig
> Issue Type: New Feature
> Reporter: David Ciemiewicz
>
> I'm finding PigStorage() really convenient for storage and data interchange 
> because it compresses well and imports into Excel and other analysis 
> environments well.
> However, it is a pain when it comes to maintenance because the columns are in 
> fixed locations and I'd like to add columns in some cases.
> It would be great if load PigStorage() could read a default schema from a 
> .schema file stored with the data and if store PigStorage() could store a 
> .schema file with the data.
> I have tested this out and both Hadoop HDFS and Pig in -exectype local mode 
> will ignore a file called .schema in a directory of part files.
> So, for example, if I have a chain of Pig scripts I execute such as:
> A = load 'data-1' using PigStorage() as ( a: int , b: int );
> store A into 'data-2' using PigStorage();
> B = load 'data-2' using PigStorage();
> describe B;
> describe B should output something like { a: int, b: int }




[jira] Updated: (PIG-995) Limit Optimizer throw exception "ERROR 2156: Error while fixing projections"

2009-10-07 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Status: Patch Available  (was: Open)




[jira] Updated: (PIG-995) Limit Optimizer throw exception "ERROR 2156: Error while fixing projections"

2009-10-07 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Attachment: PIG-995-1.patch




[jira] Commented: (PIG-922) Logical optimizer: push up project

2009-10-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763276#action_12763276
 ] 

Hadoop QA commented on PIG-922:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421557/PIG-922-p3_7.patch
  against trunk revision 822382.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 30 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 306 release audit warnings 
(more than the trunk's current 299 warnings).

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/64/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/64/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/64/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/64/console

This message is automatically generated.

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
> PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
> PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
> PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
> PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch
>
>
> This is a continuation of 
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
> another rule to the logical optimizer: push up project, i.e., prune columns as 
> early as possible.




[jira] Updated: (PIG-996) [zebra] Zebra build script does not have findbugs and clover targets.

2009-10-07 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-996:
--


Also added a dummy checkstyle target. 




[jira] Commented: (PIG-968) findContainingJar fails when there's a + in the path

2009-10-07 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763271#action_12763271
 ] 

Todd Lipcon commented on PIG-968:
-

Yes, I think that would be a good manual test.
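
For context, here is a sketch of one way a `+` in a path can get mangled. It assumes the jar path is run through `java.net.URLDecoder`, which this thread does not itself confirm (see MAPREDUCE-714 for the actual discussion). `URLDecoder` treats `+` as an encoded space, so a literal plus sign does not survive decoding. The class name, method names and sample path below are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustrative only: PlusInPathDemo is a hypothetical class, not Pig or Hadoop code.
class PlusInPathDemo {
    // Plain URL decoding: every '+' becomes a space.
    static String decode(String path) {
        try {
            return URLDecoder.decode(path, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // One possible workaround: escape '+' as %2B first so it decodes back to '+'.
    static String decodePreservingPlus(String path) {
        return decode(path.replace("+", "%2B"));
    }

    public static void main(String[] args) {
        String jar = "/tmp/c++/pig.jar"; // made-up path containing plus signs
        System.out.println("[" + decode(jar) + "]");
        System.out.println("[" + decodePreservingPlus(jar) + "]");
    }
}
```

A manual test along these lines would place the jar in a directory whose name contains a `+` and check that the path is still found.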

> findContainingJar fails when there's a + in the path
> 
>
> Key: PIG-968
> URL: https://issues.apache.org/jira/browse/PIG-968
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.4.0, 0.5.0
> Reporter: Todd Lipcon
> Attachments: pig-968.txt
>
>
> This is the same bug as in MAPREDUCE-714. Please see discussion there.




[jira] Commented: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data

2009-10-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763256#action_12763256
 ] 

Ashutosh Chauhan commented on PIG-953:
--

..aah.. I should have dug more into the JDK sources. AbstractList, which 
ArrayList extends, does override equals and provides the correct behavior. So my 
comment is a non-issue. With the nits taken care of, +1 for the patch.
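
A small Java sketch of the point above: `ArrayList` inherits `equals` from `AbstractList`, which compares the two lists element by element rather than by reference. The class name and the sample column names are made up for illustration and have nothing to do with the actual `SortInfo` code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: ListEqualsDemo is a hypothetical class, not part of the patch.
class ListEqualsDemo {
    // Delegates to List.equals, which AbstractList implements element-wise.
    static boolean sameContents(List<String> a, List<String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("col0", "col1"));
        List<String> second = new ArrayList<>(Arrays.asList("col0", "col1"));
        List<String> reordered = new ArrayList<>(Arrays.asList("col1", "col0"));

        System.out.println(first == second);                // distinct objects
        System.out.println(sameContents(first, second));    // same items, same order
        System.out.println(sameContents(first, reordered)); // same items, different order
    }
}
```

The element comparison in turn relies on each element's own `equals`, so list items still need a proper `equals` implementation.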




[jira] Updated: (PIG-996) [zebra] Zebra build script does not have findbugs and clover targets.

2009-10-07 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-996:
--

Fix Version/s: 0.6.0
Affects Version/s: 0.4.0
   Status: Patch Available  (was: Open)




[jira] Updated: (PIG-996) [zebra] Zebra build script does not have findbugs and clover targets.

2009-10-07 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-996:
--

Attachment: patch_build




[jira] Commented: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

2009-10-07 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763236#action_12763236
 ] 

Thejas M Nair commented on PIG-999:
---


In the previous comment,
{code}
o = order f by $2;
{code}
should have been:
{code}
o = order f by $1;
{code}


> sorting on map-value fails if map-value is not of bytearray type
> 
>
> Key: PIG-999
> URL: https://issues.apache.org/jira/browse/PIG-999
> Project: Pig
> Issue Type: Sub-task
> Reporter: Thejas M Nair
>
> When query execution plan is created by pig, it assumes the type to be 
> bytearray because there is no schema information associated with map fields.
> But at run time, the loader might return the actual type. This results in a 
> ClassCastException.
> This issue points to the larger issue of the way pig is handling types for 
> map-value. 
> This issue should be fixed in the context of revisiting the frontend logic 
> and pig-latin semantics.
> This is related to PIG-880 . The patch in PIG-880 changed PigStorage to 
> always return bytearray for map values to work around this, but other loaders 
> like BinStorage can return the actual type causing this issue.




[jira] Commented: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

2009-10-07 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763234#action_12763234
 ] 

Thejas M Nair commented on PIG-999:
---

{code}
l = load 'st_attr2.bin' using BinStorage();
f = foreach l generate $1, $4#'origin';  -- $4#'origin' is stored as chararray
o = order f by $2;
dump o; 
{code}

It results in map-reduce failure with error -

java.lang.ClassCastException: org.apache.pig.impl.io.NullableText cannot be 
cast to
org.apache.pig.impl.io.NullableBytesWritable
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigBytesRawComparator.compare(PigBytesRawComparator.java:94)
at java.util.Arrays.binarySearch0(Arrays.java:2105)
at java.util.Arrays.binarySearch(Arrays.java:2043)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:64)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:53)
at 
org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:251)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
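
The failure mode in this trace can be reproduced in miniature: a comparator chosen at plan time for one type is handed values of another type at run time. The sketch below uses plain Java stand-ins (`byte[]` for bytearray, `String` for chararray). `MapValueTypeDemo` and `BYTES_ONLY` are hypothetical names; this is not Pig's `PigBytesRawComparator`, only an analogy for the cast that fails.

```java
import java.util.Comparator;

// Illustrative only: a stand-in for a comparator that assumes every key is a byte array.
class MapValueTypeDemo {
    // Picked on the assumption that map values are always bytearray.
    static final Comparator<Object> BYTES_ONLY = (a, b) -> {
        byte[] x = (byte[]) a; // throws ClassCastException if the value is really a String
        byte[] y = (byte[]) b;
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int c = Byte.compare(x[i], y[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(x.length, y.length);
    };

    // Reports whether comparing the two values trips the type assumption.
    static boolean failsWithClassCast(Object a, Object b) {
        try {
            BYTES_ONLY.compare(a, b);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithClassCast(new byte[] {1}, new byte[] {2})); // types match
        System.out.println(failsWithClassCast("sfo", "nyc"));                   // loader returned text
    }
}
```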





[jira] Commented: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data

2009-10-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763218#action_12763218
 ] 

Ashutosh Chauhan commented on PIG-953:
--

Changes look good. One comment I have:

1) In SortInfo.java#equals
We have two lists and we want to check them for equality. I quickly looked at 
the JDK sources and it seems that ArrayList doesn't override equals, so an 
equals check on the lists would result in a reference equality test, which would 
be incorrect. The correct way to do this would be to first check the sizes of the 
two lists and, if they are equal, iterate through both lists and check the 
equality of the items at the same index in the two lists.

A few nits:
1) TestMergeJoin contains a System.err.println which we can get rid of.
2) There are a few unused imports in the patch.
3) SortInfo.java#getSortColInfoList may result in a Findbugs warning, for a 
reason similar to the one we discussed earlier in this jira.
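
The element-wise check described in point 1 can be sketched as follows. The helper name `sameItems` and the class around it are hypothetical, not code from the patch. Note that the `java.util.List` contract already defines `equals` as this same size-and-pairwise comparison, so for standard lists such as `ArrayList` the manual loop and `a.equals(b)` give the same answer.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Illustrative only: a hand-written version of the size-then-pairwise comparison.
class ElementwiseListCheck {
    static <T> boolean sameItems(List<T> a, List<T> b) {
        if (a.size() != b.size()) {
            return false; // different lengths can never match
        }
        Iterator<T> ia = a.iterator();
        Iterator<T> ib = b.iterator();
        while (ia.hasNext() && ib.hasNext()) {
            // Objects.equals tolerates null elements on either side.
            if (!Objects.equals(ia.next(), ib.next())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> x = Arrays.asList("a", "b");
        List<String> y = Arrays.asList("a", "b");
        System.out.println(sameItems(x, y) + " " + x.equals(y));
    }
}
```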

> Enable merge join in pig to work with loaders and store functions which can 
> internally index sorted data 
> -
>
> Key: PIG-953
> URL: https://issues.apache.org/jira/browse/PIG-953
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.3.0
>Reporter: Pradeep Kamath
>Assignee: Pradeep Kamath
> Attachments: PIG-953-2.patch, PIG-953-3.patch, PIG-953.patch
>
>
> Currently merge join implementation in pig includes construction of an index 
> on sorted data and use of that index to seek into the "right input" to 
> efficiently perform the join operation. Some loaders (notably the zebra 
> loader) internally implement an index on sorted data and can perform this 
> seek efficiently using their index. So the use of the index needs to be 
> abstracted in such a way that when the loader supports indexing, pig uses it 
> (indirectly through the loader) and does not construct an index. 
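
One way to picture the abstraction described above is a capability interface 
that a loader may implement, with the planner checking for it before deciding 
whether to build its own index. The sketch below is an assumption made for 
illustration only: SeekableByKey, SortedArrayLoader and MergeJoinPlanSketch 
are hypothetical names and do not describe the interfaces in the attached 
patches.

```java
import java.util.Arrays;

// Hypothetical capability: a loader that can position itself near a key
// using its own internal index over sorted data.
interface SeekableByKey {
    void seekNear(String key);
}

// Toy loader over a sorted array of distinct keys; stands in for a loader
// that indexes its sorted data internally.
final class SortedArrayLoader implements SeekableByKey {
    private final String[] sortedKeys;
    private int position = 0;

    SortedArrayLoader(String[] sortedKeys) {
        this.sortedKeys = sortedKeys.clone();
    }

    @Override
    public void seekNear(String key) {
        int idx = Arrays.binarySearch(sortedKeys, key);
        // A negative result encodes the insertion point as -(point) - 1.
        position = idx >= 0 ? idx : -(idx + 1);
    }

    int position() {
        return position;
    }
}

final class MergeJoinPlanSketch {
    // Use the loader's index when it offers one, otherwise build an index.
    static String chooseStrategy(Object loader) {
        return (loader instanceof SeekableByKey)
                ? "use-loader-index"
                : "build-pig-index";
    }

    public static void main(String[] args) {
        SortedArrayLoader loader =
                new SortedArrayLoader(new String[] {"a", "c", "e"});
        loader.seekNear("d");
        System.out.println(chooseStrategy(loader) + " at " + loader.position());
    }
}
```

The point of the design is that the join code depends only on the capability, 
so loaders without an index keep working through the existing index-building 
path.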




[jira] Updated: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

2009-10-07 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-999:
--

Issue Type: Sub-task  (was: Bug)
Parent: PIG-998

> sorting on map-value fails if map-value is not of bytearray type
> 
>
> Key: PIG-999
> URL: https://issues.apache.org/jira/browse/PIG-999
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>
> When query execution plan is created by pig, it assumes the type to be 
> bytearray because there is no schema information associated with map fields.
> But at run time, the loader might return the actual type. This results in a 
> ClassCastException.
> This issue points to the larger issue of the way pig is handling types for 
> map-value. 
> This issue should be fixed in the context of revisiting the frontend logic 
> and pig-latin semantics.
> This is related to PIG-880 . The patch in PIG-880 changed PigStorage to 
> always return bytearray for map values to work around this, but other loaders 
> like BinStorage can return the actual type causing this issue.




[jira] Created: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

2009-10-07 Thread Thejas M Nair (JIRA)
sorting on map-value fails if map-value is not of bytearray type


 Key: PIG-999
 URL: https://issues.apache.org/jira/browse/PIG-999
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair


When query execution plan is created by pig, it assumes the type to be 
bytearray because there is no schema information associated with map fields.
But at run time, the loader might return the actual type. This results in a 
ClassCastException.
This issue points to the larger issue of the way pig is handling types for 
map-value. 

This issue should be fixed in the context of revisiting the frontend logic and 
pig-latin semantics.

This is related to PIG-880 . The patch in PIG-880 changed PigStorage to always 
return bytearray for map values to work around this, but other loaders like 
BinStorage can return the actual type causing this issue.
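
The failure mode described above can be reproduced in miniature without any 
Pig classes. In this sketch, which is an illustration under assumptions rather 
than Pig's actual code path, the "plan" casts every map value to byte[] while 
the "loader" is free to put an Integer into the map; MapValueCastSketch and 
sortKeyLength are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for code generated from a plan that assumed
// every map value is a bytearray.
final class MapValueCastSketch {
    static int sortKeyLength(Map<String, Object> row, String key) {
        // Plan-time assumption: the value is a byte array. If the loader
        // returned another type, this cast throws ClassCastException.
        byte[] raw = (byte[]) row.get(key);
        return raw.length;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("k", Integer.valueOf(42)); // loader returned the actual type
        try {
            sortKeyLength(row, "k");
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: plan assumed bytearray");
        }
    }
}
```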




[jira] Created: (PIG-998) revisit frontend logic and pig-latin semantics

2009-10-07 Thread Thejas M Nair (JIRA)
revisit frontend logic and pig-latin semantics
--

 Key: PIG-998
 URL: https://issues.apache.org/jira/browse/PIG-998
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair


This jira has been created to keep track of issues with current frontend logic 
and pig-latin semantics.
One example is the handling of type information for map-values. At the time of 
query plan generation pig does not know the type of map-values and assumes it 
is bytearray. This leads to problems when the loader returns map-values of 
other types.




[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-07 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-987:
-

Attachment: tmp-987-plus-991.patch
TEST-org.apache.hadoop.zebra.io.TestCheckin.txt

Attachments :
   # tmp-987-plus-991.patch : latest patch here + patch for PIG-991
   # TEST-org.apache.hadoop.zebra.io.TestCheckin.txt : output of the failed 
tests.

Yan,  looks like lzo related errors are fixed with the combined patch. But 
there are still some failures. I think some of these failures exist on trunk as 
well.

> [zebra] Zebra Column Group Access Control
> -
>
> Key: PIG-987
> URL: https://issues.apache.org/jira/browse/PIG-987
> Project: Pig
>  Issue Type: New Feature
>Affects Versions: 0.6.0
>Reporter: Yan Zhou
>Assignee: Yan Zhou
> Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
> TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
> TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch
>
>
> Access Control: when processes try to read from the column groups, Zebra 
> should be able to handle allowed vs. disallowed user/application accesses.  
> The security is eventually granted by the corresponding HDFS security of the 
> data stored.
> Expected behavior when column group permissions are set:
> When user selects only columns that they do not have permissions to 
> access, Zebra should return error with message "Error #: Permission denied 
> for accessing column  
> Access control applies to an entire column group, so all columns in a column 
> group have same permissions. 




[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-10-07 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Attachment: PIG-922-p3_8.patch

Fix the mechanism to turn off prune columns rule

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.3.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
> PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
> PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
> PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
> PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch
>
>
> This is a continuation work of 
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
> another rule to the logical optimizer: Push up project, ie, prune columns as 
> early as possible.




[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct

2009-10-07 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-944:
--


Patch reviewed. +1

> Zebra schema is taken from Pig through TableStorer's construct
> --
>
> Key: PIG-944
> URL: https://issues.apache.org/jira/browse/PIG-944
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Yan Zhou
>Assignee: Yan Zhou
> Fix For: 0.6.0
>
> Attachments: SchemaConversion.patch
>
>
> It should be from StoreConfig in TableOutputFormat.checkOutputSpecs method 
> because the information is dynamic in Pig's execution engine and should not 
> be taking a static argument to the constructor.




[jira] Updated: (PIG-992) [zebra] Separate Schema-related files into a "Schema" package

2009-10-07 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-992:
--


Patch reviewed. +1

> [zebra] Separate Schema-related files into a "Schema" package
> -
>
> Key: PIG-992
> URL: https://issues.apache.org/jira/browse/PIG-992
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Yan Zhou
>Assignee: Yan Zhou
>Priority: Minor
> Fix For: 0.6.0
>
> Attachments: SchemaPackageChange.patch, SchemaPackageChange.patch
>
>
> The hope is to facilitate future sharing of the Schema codes between 
> different modules and/or products. 




[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-10-07 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Status: Patch Available  (was: Open)

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.3.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
> PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
> PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
> PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
> PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch
>
>
> This is a continuation work of 
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
> another rule to the logical optimizer: Push up project, ie, prune columns as 
> early as possible.




[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-10-07 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Attachment: PIG-922-p3_7.patch

Address issues with hudson

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.3.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
> PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
> PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
> PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
> PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch
>
>
> This is a continuation work of 
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
> another rule to the logical optimizer: Push up project, ie, prune columns as 
> early as possible.




[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-10-07 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Status: Open  (was: Patch Available)

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.3.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
> PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
> PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
> PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
> PIG-922-p3_5.patch, PIG-922-p3_6.patch
>
>
> This is a continuation work of 
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
> another rule to the logical optimizer: Push up project, ie, prune columns as 
> early as possible.




[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs

2009-10-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763169#action_12763169
 ] 

Ashutosh Chauhan commented on PIG-948:
--

+1 
Change looks good. It should be log.info instead of log.error. In local hadoop 
mode, since it's all running in one java process, there is no port address of 
the job tracker to get. 

> [Usability] Relating pig script with MR jobs
> 
>
> Key: PIG-948
> URL: https://issues.apache.org/jira/browse/PIG-948
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.4.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Minor
> Fix For: 0.6.0
>
> Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, 
> pig-948.patch
>
>
> Currently it's hard to relate a pig script with a specific MR job. 
> In a loaded cluster with multiple simultaneous job submissions, it's not easy 
> to figure out which specific MR jobs were launched for a given pig script. If 
> Pig can provide this info, it will be useful to debug and monitor the jobs 
> resulting from a pig script.
> At the very least, Pig should be able to provide the user with the following 
> information
> 1) Job id of the launched job.
> 2) Complete web url of jobtracker running this job. 




[jira] Updated: (PIG-992) [zebra] Separate Schema-related files into a "Schema" package

2009-10-07 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-992:
-

Attachment: SchemaPackageChange.patch

Additional changes to some test files to avoid use of non-default lzo 
compression.

> [zebra] Separate Schema-related files into a "Schema" package
> -
>
> Key: PIG-992
> URL: https://issues.apache.org/jira/browse/PIG-992
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Yan Zhou
>Assignee: Yan Zhou
>Priority: Minor
> Fix For: 0.6.0
>
> Attachments: SchemaPackageChange.patch, SchemaPackageChange.patch
>
>
> The hope is to facilitate future sharing of the Schema codes between 
> different modules and/or products. 




[jira] Updated: (PIG-986) [zebra] Zebra Column Group Naming Support

2009-10-07 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-986:
-

Attachment: ColumnGroupName.patch

Additional changes to some test files to avoid use of non-default lzo 
compression.

> [zebra] Zebra Column Group Naming Support
> -
>
> Key: PIG-986
> URL: https://issues.apache.org/jira/browse/PIG-986
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.4.0
>Reporter: Chao Wang
>Assignee: Chao Wang
> Fix For: 0.6.0
>
> Attachments: ColumnGroupName.patch, ColumnGroupName.patch
>
>
> We introduce column group name to Zebra and make it a first-class citizen in 
> Zebra. This can ease management of column groups.
> We plan to introduce an "as" clause for column group name in Zebra's syntax.
> Functional Specifications:
> 1) Column group names are optional. For column groups which do not have a 
> user-provided name, Zebra will assign default column group names internally 
> that are unique for that table - CG0, CG1, CG2 ... Note: If CGx is used by 
> the user, then it cannot be used for internal names.
> 2) We introduce an "AS" clause in Zebra's syntax for column group names. If 
> it occurs, it has to immediately follow [ ]. For example, "[a1, a2] as PI 
> secure by user:joe group:secure perm:640; [a3, a4] as General compress by 
> lzo". Note that keyword "AS" is case insensitive.
> 3) Column group names are unique within one table and are case sensitive, 
> i.e., c1 and C1 are different.
> 4) Column group names will be used as the physical column group directory 
> path names.
> 5) Zebra V2 will support dropColumnGroup by column group names (will 
> integrate with Raghu's A29 drop column work).
> 6) Zebra V2 can support backward compatibility (If there are Zebra V1 created 
> tables in production when V2 is released). More specifically, this means that 
> Zebra V2 can load from V1-created tables and do dropColumnGroup on it.
> 7) Does NOT support renaming.
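
Rules 1) and 3) above can be sketched in a few lines. This is a minimal 
illustration under assumptions, not Zebra's implementation: 
ColumnGroupNamingSketch and assignNames are hypothetical names, a null entry 
stands for a column group without a user-provided name, and rejecting a 
duplicate user-provided name with an exception is an assumed policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of default column group naming.
final class ColumnGroupNamingSketch {
    // Returns one name per column group, in order. Unnamed groups (null)
    // get CG0, CG1, ... skipping any CGx the user already used.
    static List<String> assignNames(List<String> userNames) {
        Set<String> taken = new HashSet<>();
        for (String name : userNames) {
            // Names are case sensitive, so "c1" and "C1" do not collide.
            if (name != null && !taken.add(name)) {
                throw new IllegalArgumentException(
                        "duplicate column group name: " + name);
            }
        }
        List<String> result = new ArrayList<>();
        int next = 0;
        for (String name : userNames) {
            if (name != null) {
                result.add(name);
                continue;
            }
            String candidate;
            do {
                candidate = "CG" + next++;
            } while (!taken.add(candidate)); // skip names already in use
            result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [PI, CG1, CG0, CG2]: CG0 is skipped because the user took it.
        System.out.println(assignNames(Arrays.asList("PI", null, "CG0", null)));
    }
}
```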




[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-07 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763133#action_12763133
 ] 

Yan Zhou commented on PIG-987:
--

I have attached a new patch that removes the use of lzo in 6 test scripts.  
Accordingly, patches of 2 "downstream" Jiras, PIG-986 and PIG-992, will also be 
updated; while the other three "downstream" patches, PIG-991, PIG-993 and 
PIG-944, need not be changed.

> [zebra] Zebra Column Group Access Control
> -
>
> Key: PIG-987
> URL: https://issues.apache.org/jira/browse/PIG-987
> Project: Pig
>  Issue Type: New Feature
>Affects Versions: 0.6.0
>Reporter: Yan Zhou
>Assignee: Yan Zhou
> Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
> TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt
>
>
> Access Control: when processes try to read from the column groups, Zebra 
> should be able to handle allowed vs. disallowed user/application accesses.  
> The security is eventually granted by the corresponding HDFS security of the 
> data stored.
> Expected behavior when column group permissions are set:
> When user selects only columns that they do not have permissions to 
> access, Zebra should return error with message "Error #: Permission denied 
> for accessing column  
> Access control applies to an entire column group, so all columns in a column 
> group have same permissions. 




[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-07 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-987:
-

Attachment: ColumnGroupSecurity.patch

This patch has additional test scripts that do not use the non-default lzo 
compression. After applying it, apply the patch in PIG-991 as well to pass 
all Zebra-related tests.

> [zebra] Zebra Column Group Access Control
> -
>
> Key: PIG-987
> URL: https://issues.apache.org/jira/browse/PIG-987
> Project: Pig
>  Issue Type: New Feature
>Affects Versions: 0.6.0
>Reporter: Yan Zhou
>Assignee: Yan Zhou
> Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
> TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt
>
>
> Access Control: when processes try to read from the column groups, Zebra 
> should be able to handle allowed vs. disallowed user/application accesses.  
> The security is eventually granted by the corresponding HDFS security of the 
> data stored.
> Expected behavior when column group permissions are set:
> When user selects only columns that they do not have permissions to 
> access, Zebra should return error with message "Error #: Permission denied 
> for accessing column  
> Access control applies to an entire column group, so all columns in a column 
> group have same permissions. 




[jira] Commented: (PIG-922) Logical optimizer: push up project

2009-10-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762941#action_12762941
 ] 

Hadoop QA commented on PIG-922:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421493/PIG-922-p3_6.patch
  against trunk revision 822382.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 27 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 306 release audit warnings 
(more than the trunk's current 299 warnings).

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/63/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/63/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/63/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/63/console

This message is automatically generated.

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.3.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
> PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
> PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
> PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
> PIG-922-p3_5.patch, PIG-922-p3_6.patch
>
>
> This is a continuation work of 
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
> another rule to the logical optimizer: Push up project, ie, prune columns as 
> early as possible.
