[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746136#action_12746136 ]

Olga Natkovich commented on PIG-660:

Submitted:
(1) A patch to apply to the latest trunk to work with the official Hadoop 20 release.
(2) A compressed version of hadoop20.jar. It needs to be uncompressed and placed in the lib dir.

This should allow running Pig against a Hadoop 20 cluster.

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: hadoop20.jar.gz, PIG-660-for-branch-0.3.patch,
>   PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch, PIG-660_3.patch,
>   PIG-660_4.patch, PIG-660_5.patch, PIG-660_trunk.patch, pig_660_shims.patch,
>   pig_660_shims_2.patch, pig_660_shims_3.patch
>
> With Hadoop 0.20, it will be possible to query the status of each map and
> reduce in a map reduce job. This will allow better error reporting. Some of
> the other items that could be on Hadoop's feature requests/bugs are
> documented here for tracking.
> 1. Hadoop should return objects instead of strings when exceptions are thrown
> 2. The JobControl should handle all exceptions and report them appropriately.
>    For example, when the JobControl fails to launch jobs, it should handle
>    exceptions appropriately and should support APIs that query this state, i.e.,
>    failure to launch jobs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
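The second step in the comment above (uncompress the attached jar and place it in the lib dir) could be scripted roughly as follows. This is only a sketch: the function name `install_hadoop20_jar` and both arguments are assumptions about a local setup, not something taken from the issue.

```shell
# Sketch: unpack a downloaded hadoop20.jar.gz into the lib directory of a Pig checkout.
# Both paths are supplied by the caller; nothing here is specific to the real attachment.
install_hadoop20_jar() {
    archive="$1"    # path to the downloaded hadoop20.jar.gz
    lib_dir="$2"    # lib directory of the Pig checkout

    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    mkdir -p "$lib_dir" || return 1

    # gunzip -c writes to stdout, so the downloaded archive is left in place.
    if ! gunzip -c "$archive" > "$lib_dir/hadoop20.jar"; then
        # Do not leave a truncated jar behind for the build to pick up.
        rm -f "$lib_dir/hadoop20.jar"
        return 1
    fi
}
```

For example, `install_hadoop20_jar ~/Downloads/hadoop20.jar.gz ./lib` run from the top of a checkout, with the download location adjusted as needed.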
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740339#action_12740339 ]

Dmitriy V. Ryaboy commented on PIG-660:

Nate,

Your stacktrace shows hadoop.dfs calls (as opposed to hdfs), which tells me it's looking for -- and finding -- hadoop 18 classes. Can you do this:

    export PIG_HADOOP_VERSION=20
    ant clean; ant -Dhadoop.version=20

and try again? Just to be sure, try moving hadoop1* out of the lib directory (so that it fails for sure if it's trying to look for 18).

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660-for-branch-0.3.patch, PIG-660.patch,
>   PIG-660_1.patch, PIG-660_2.patch, PIG-660_3.patch, PIG-660_4.patch,
>   PIG-660_5.patch, pig_660_shims.patch, pig_660_shims_2.patch,
>   pig_660_shims_3.patch
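The suggestion to move hadoop1* out of the lib directory and rebuild could look something like the sketch below. The function names and the stash directory are hypothetical; the environment variable and the ant invocations are the ones quoted in the comment, wrapped in a function so that nothing runs until it is called.

```shell
# Sketch: move Hadoop 0.1x jars (e.g. hadoop18.jar) out of lib so that a
# build meant for Hadoop 20 fails loudly if it still looks for 18 classes.
stash_old_hadoop_jars() {
    lib_dir="$1"      # lib directory of the Pig checkout (assumption)
    stash_dir="$2"    # any directory outside lib (assumption)

    mkdir -p "$stash_dir" || return 1
    for jar in "$lib_dir"/hadoop1*; do
        # When nothing matches, the glob stays literal; skip it.
        [ -e "$jar" ] || continue
        mv "$jar" "$stash_dir"/ || return 1
    done
    return 0
}

# Rebuild with the settings quoted in the comment. Defined only; call it explicitly.
rebuild_for_hadoop20() {
    export PIG_HADOOP_VERSION=20
    ant clean && ant -Dhadoop.version=20
}
```

A typical sequence would be `stash_old_hadoop_jars lib ../old-hadoop-jars` followed by `rebuild_for_hadoop20`, both run from the top of the checkout.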
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740241#action_12740241 ]

Dmitriy V. Ryaboy commented on PIG-660:

The shim patch posted above doesn't work as cleanly as desired; the current build.xml has junit.hadoop.conf pointing to a directory in ${user.home}.

This has an undesired effect -- a hadoop config file gets created the first time you run ant, which among other things sets which class implements the FileSystem interface. When ant gets re-run with a different hadoop version, 'ant clean' does not clean out this file -- so an incorrect fs class name gets used.

Deleting the directory created by junit.hadoop.conf before rerunning fixes the problem; so does setting the value of junit.hadoop.conf relative to ${build.dir} instead of ${user.home}.

As I am not sure how the Y! developers use the pigconf directories this thing references, I do not know the appropriate way to proceed. Comments?

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660-for-branch-0.3.patch, PIG-660.patch,
>   PIG-660_1.patch, PIG-660_2.patch, PIG-660_3.patch, PIG-660_4.patch,
>   PIG-660_5.patch, pig_660_shims.patch, pig_660_shims_2.patch
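The first workaround mentioned in this comment (deleting the directory created via junit.hadoop.conf before rerunning ant) might be wrapped in a small helper like this. It is a sketch under assumptions: the comment does not give the actual path, so the directory is passed in by the caller, and the function name is made up for illustration.

```shell
# Sketch: remove the generated Hadoop test config directory before rebuilding
# against a different Hadoop version, since 'ant clean' leaves it behind.
clean_junit_hadoop_conf() {
    conf_dir="$1"   # whatever junit.hadoop.conf resolves to locally (caller-supplied)

    # Refuse obviously dangerous targets such as an empty path, / or the home directory.
    case "$conf_dir" in
        ""|"/"|"$HOME"|"$HOME/")
            echo "refusing to remove '$conf_dir'" >&2
            return 1
            ;;
    esac

    [ -d "$conf_dir" ] || return 0   # nothing to clean
    rm -rf -- "$conf_dir"
}
```

Running it before each `ant` invocation that switches Hadoop versions should have the same effect as deleting the directory by hand.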
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739348#action_12739348 ]

Daniel Dai commented on PIG-660:

Hi, Dmitriy,

I like your idea. One comment: in src/20/java/org/apache/pig/shims/HadoopShims.java, the package line is "org.apache.hadoop.hive.shims". I guess it is a typo, right?

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660-for-branch-0.3.patch, PIG-660.patch,
>   PIG-660_1.patch, PIG-660_2.patch, PIG-660_3.patch, PIG-660_4.patch,
>   PIG-660_5.patch, pig_660_shims.patch
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736319#action_12736319 ]

Olga Natkovich commented on PIG-660:

Removed the latest attachment. I think there is a bit of confusion: we don't need a new patch, just a separate hadoop jar that works with the official hadoop 20 release.

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch,
>   PIG-660_3.patch, PIG-660_4.patch, PIG-660_5.patch
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736313#action_12736313 ]

Dmitriy V. Ryaboy commented on PIG-660:

Santhosh and Olga -- could you document the differences between a version of 20 that Pig can use and the one in the Hadoop release? Links to necessary patches, etc.?

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch,
>   PIG-660_3.patch, PIG-660_4.patch, PIG-660_5.patch, PIG-660_6.patch
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736297#action_12736297 ]

Raghu Angadi commented on PIG-660:

Thanks Olga and Santhosh. The build.xml change is already in the patch. Thanks.

I will attach a hadoop20.jar that works with PIG. This is useful for anyone who wants to try out the patch. It will also be used by zebra (PIG-833).

Please commit the jar file to PIG trunk. It could be updated with a later version of the hadoop-0.20 branch.

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch,
>   PIG-660_3.patch, PIG-660_4.patch, PIG-660_5.patch
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736286#action_12736286 ]

Olga Natkovich commented on PIG-660:

Raghu, please add the hadoop20.jar that Zebra is using. We can commit it with the understanding that we will overwrite it once we commit hadoop 20 support into Pig.

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch,
>   PIG-660_3.patch, PIG-660_4.patch, PIG-660_5.patch
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736283#action_12736283 ]

Santhosh Srinivasan commented on PIG-660:

The build.xml in the patch(es) already references hadoop20.jar. The missing part is a hadoop20.jar that Pig can use to build its sources. Pig cannot use the hadoop20.jar coming from the Hadoop release.

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch,
>   PIG-660_3.patch, PIG-660_4.patch, PIG-660_5.patch
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736264#action_12736264 ]

Raghu Angadi commented on PIG-660:

Currently, the hadoop jar for 0.18 under lib/ is called hadoop18.jar. Should we change build.xml to use hadoop20.jar instead of hadoop18.jar?

I can file a jira to commit hadoop20.jar. This might be replaced by an updated jar when this jira is committed.

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: 0.4.0
> Attachments: PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch,
>   PIG-660_3.patch, PIG-660_4.patch, PIG-660_5.patch
[jira] Commented: (PIG-660) Integration with Hadoop 0.20
[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671936#action_12671936 ]

Santhosh Srinivasan commented on PIG-660:

JIRAs in Hadoop corresponding to items 1 and 2 of the description:
1. https://issues.apache.org/jira/browse/HADOOP-5201
2. https://issues.apache.org/jira/browse/HADOOP-5202

> Integration with Hadoop 0.20
>
> Key: PIG-660
> URL: https://issues.apache.org/jira/browse/PIG-660
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: types_branch
> Environment: Hadoop 0.20
> Reporter: Santhosh Srinivasan
> Fix For: 0.1.0
>
> With Hadoop 0.20, it will be possible to query the status of each map and
> reduce in a map reduce job. This will allow better error reporting. Some of
> the other items that could be on Hadoop's feature requests/bugs are
> documented here for tracking.
> 1. Hadoop should return objects instead of strings when exceptions are thrown
> 2. The JobControl should handle all exceptions and report them appropriately.
>    For example, when the JobControl fails to launch jobs, it should handle
>    exceptions appropriately and should support APIs that query this state, i.e.,
>    failure to launch jobs.