[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845104#action_12845104
 ] 

Hudson commented on MAPREDUCE-1579:
---

Integrated in Hadoop-Mapreduce-trunk #258 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/258/])


 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
 Fix For: 0.22.0

 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, archiving 
 won't work if there are spaces in the paths (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace them with some other character.
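
The two requested behaviours can be sketched as follows. This is a minimal illustration in Python, not the harchive implementation: the function names `find_paths_with_spaces` and `replace_spaces`, the sample paths, and the choice of `_` as the default replacement are all assumptions made for the example.

```python
def find_paths_with_spaces(paths):
    """Return the paths that contain at least one space character."""
    return [p for p in paths if " " in p]


def replace_spaces(path, replacement="_"):
    """Return `path` with every space replaced by `replacement`.

    The replacement must not itself contain a space, otherwise a
    space-separated index line would still be ambiguous.
    """
    if " " in replacement:
        raise ValueError("replacement must not contain a space")
    return path.replace(" ", replacement)


# Hypothetical input paths, for illustration only.
paths = ["/user/alice/my file.txt", "/user/alice/data.csv"]
offending = find_paths_with_spaces(paths)      # step 1: detect
cleaned = [replace_spaces(p) for p in paths]   # step 2: optionally replace
```

Detection and replacement are kept separate here so that a tool could fail fast on detection alone and only rewrite paths when the user asks for it.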

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1556) upgrade to Avro 1.3.0

2010-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845103#action_12845103
 ] 

Hudson commented on MAPREDUCE-1556:
---

Integrated in Hadoop-Mapreduce-trunk #258 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/258/])


 upgrade to Avro 1.3.0
 -

 Key: MAPREDUCE-1556
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1556
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Reporter: Doug Cutting
Assignee: Doug Cutting
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1556.patch, MAPREDUCE-1556.patch, 
 MAPREDUCE-1556.patch


 Avro 1.3.0 has now been released.  HADOOP-6486 and HDFS-892 require it, and 
 the version of Avro used by MapReduce should be synchronized with these 
 projects.




[jira] Created: (MAPREDUCE-1598) Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be pinned in JobTracker's memory forever

2010-03-14 Thread Amar Kamat (JIRA)
Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be 
pinned in JobTracker's memory forever


 Key: MAPREDUCE-1598
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1598
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Amar Kamat
Priority: Blocker


A wrongly configured 'hadoop.job.history.user.location' can disable job history. 
Jobs retire when JobHistory notifies the JobTracker after moving the history 
file to the done folder (i.e., 
mapreduce.jobtracker.jobhistory.completed.location). If JobHistory gets 
disabled, the JobTracker never receives that notification, and so jobs stay 
pinned in the JobTracker's memory forever.
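
The failure mode can be illustrated with a toy model. This is a sketch in Python under stated assumptions, not Hadoop code: `ToyJobTracker`, `ToyJobHistory`, and their methods are hypothetical names that only mimic the "retire on notification" dependency described above.

```python
class ToyJobTracker:
    """Toy model: a finished job stays in memory until history notifies."""

    def __init__(self):
        self.jobs_in_memory = set()

    def job_finished(self, job_id):
        # The job is complete but has not been retired yet.
        self.jobs_in_memory.add(job_id)

    def history_moved_to_done(self, job_id):
        # Notification from the history component: safe to retire the job.
        self.jobs_in_memory.discard(job_id)


class ToyJobHistory:
    """Toy model of the history component that triggers retirement."""

    def __init__(self, tracker, enabled=True):
        self.tracker = tracker
        self.enabled = enabled

    def finalize(self, job_id):
        if not self.enabled:
            return  # history disabled: no move, hence no notification
        self.tracker.history_moved_to_done(job_id)


def run(enabled):
    tracker = ToyJobTracker()
    history = ToyJobHistory(tracker, enabled=enabled)
    tracker.job_finished("job_1")
    history.finalize("job_1")
    return tracker.jobs_in_memory


healthy = run(enabled=True)    # job retired, nothing left in memory
broken = run(enabled=False)    # job never retired, stays pinned
```

Because retirement hangs entirely on the notification, anything that silently disables the notifier leaks every completed job.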




[jira] Commented: (MAPREDUCE-1598) Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be pinned in JobTracker's memory forever

2010-03-14 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845110#action_12845110
 ] 

Hemanth Yamijala commented on MAPREDUCE-1598:
-

Amar, I don't find this key or the newly mapped key, 
mapreduce.job.userhistorylocation, used anywhere in the code in trunk. Therefore, 
I think this bug does not exist in trunk at least. But we might need fixes 
for earlier versions.

 Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be 
 pinned in JobTracker's memory forever
 

 Key: MAPREDUCE-1598
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1598
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Amar Kamat
Priority: Blocker





[jira] Updated: (MAPREDUCE-1598) Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be pinned in JobTracker's memory forever

2010-03-14 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-1598:
--

Fix Version/s: 0.21.0

 Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be 
 pinned in JobTracker's memory forever
 

 Key: MAPREDUCE-1598
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1598
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Amar Kamat
Priority: Blocker
 Fix For: 0.21.0






[jira] Updated: (MAPREDUCE-1270) Hadoop C++ Extension

2010-03-14 Thread Dong Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Yang updated MAPREDUCE-1270:
-

Attachment: Overall Design of Hadoop C++ Extension.doc

Hadoop C++ Extension (HCE for short) is a framework for making MapReduce more 
stable and faster.
Here is the overall design of HCE. Comments on its practical implementation are 
welcome.

 Hadoop C++ Extension
 

 Key: MAPREDUCE-1270
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: task
Affects Versions: 0.20.1
 Environment:  hadoop linux
Reporter: Wang Shouyan
 Attachments: Overall Design of Hadoop C++ Extension.doc


 Hadoop C++ Extension is an internal project at Baidu. We started it for these 
 reasons:
 1. To provide a C++ API. We mostly used Streaming before, and we also tried 
 PIPES, but we did not find PIPES to be more efficient than Streaming. So we 
 think a new C++ extension is needed.
 2. Even with PIPES or Streaming, it is hard to control the memory of the Hadoop 
 map/reduce Child JVM.
 3. It costs too much to read, write, and sort TB/PB of data in Java. With 
 PIPES or Streaming, a pipe or socket is not efficient enough to carry that 
 much data.
 What we want to do:
 1. We do not use the map/reduce Child JVM for any data processing. It just 
 prepares the environment, starts the C++ mapper, tells the mapper which split 
 it should deal with, and reads reports from the mapper until it finishes. The 
 mapper reads records, invokes the user-defined map, partitions, writes spills, 
 combines, and merges into file.out. We think these operations can be done in 
 C++ code.
 2. The reducer is similar to the mapper. It is started after the sort has 
 finished, reads from the sorted files, invokes the user-defined reduce, and 
 writes to the user-defined record writer.
 3. We also intend to rewrite shuffle and sort in C++, for efficiency and 
 memory control.
 First 1 and 2, then 3.
 What's the difference from PIPES:
 1. Yes, we will reuse most of the PIPES code.
 2. And we should do it more completely: nothing changes in scheduling and 
 management, but everything changes in execution.




[jira] Updated: (MAPREDUCE-1598) Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be pinned in JobTracker's memory forever

2010-03-14 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1598:
---

Affects Version/s: (was: 0.21.0)
   0.20.1
Fix Version/s: (was: 0.21.0)
   0.20.3

Disabling job history when anything fails is not present in branch 0.21 or 
trunk. It was removed in the job history refactoring done through MAPREDUCE-157. 
Even the configuration mapreduce.jobtracker.jobhistory.completed.location was 
removed in the same JIRA.
So, the bug is present in 0.20.* and earlier versions.

 Wrongly configured 'hadoop.job.history.user.location' can cause jobs to be 
 pinned in JobTracker's memory forever
 

 Key: MAPREDUCE-1598
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1598
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Amar Kamat
Priority: Blocker
 Fix For: 0.20.3






[jira] Commented: (MAPREDUCE-1574) Combiners should implement a specialized Combiner interface, not the generic Reducer interface

2010-03-14 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845180#action_12845180
 ] 

Tom White commented on MAPREDUCE-1574:
--

Creating a Combiner class and _adding_ a setCombinerClass(Class<? extends 
Combiner>) method, while retaining the Reducer one, would not be incompatible. 
However, I'm not convinced it would make things clearer, as you would still be 
able to use a Reducer as a combiner, and the conditions for this would still 
need explaining (only when the types match, and when it makes sense 
semantically for a reducer to be a combiner).

Perhaps we should be aiming for better diagnostics instead. We could improve 
the error message to explain that the input and output types must match. This 
could be done at the point when the framework receives a combiner output 
key-value pair (or it might even be possible using an approach like 
MAPREDUCE-1411).
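
A minimal sketch of such a diagnostic, written in Python for brevity rather than against the Hadoop API. The function `check_combiner_output`, its parameters, and the wording of the message are assumptions made for illustration. The idea is only that the check runs when a combiner output pair is received and names the mismatch explicitly.

```python
def check_combiner_output(key, value, expected_key_type, expected_value_type):
    """Raise a descriptive error if a combiner emits the wrong types.

    The expected types are the combiner's own input (map output) types,
    since a combiner's output types must match its input types.
    """
    for label, obj, expected in (
        ("key", key, expected_key_type),
        ("value", value, expected_value_type),
    ):
        if not isinstance(obj, expected):
            raise TypeError(
                f"combiner emitted {label} of type {type(obj).__name__}, "
                f"but the map output {label} type is {expected.__name__}; "
                "a combiner's output types must match its input types"
            )


# A well-typed pair passes silently.
check_combiner_output("word", 3, str, int)

# A mistyped value produces a message that explains the rule.
try:
    check_combiner_output("word", "3", str, int)
    message = None
except TypeError as err:
    message = str(err)
```

Compared with a bare "wrong key class" error, the message states both the offending type and the rule that was violated.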

 Combiners should implement a specialized Combiner interface, not the 
 generic Reducer interface
 --

 Key: MAPREDUCE-1574
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1574
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.20.1, 0.20.2, 0.20.3
Reporter: Danny Leshem
Priority: Minor

 I just spent 30 minutes trying to figure out why my job throws 
 java.io.IOException: wrong key class when I pass my Reducer class to 
 Job.setCombinerClass. Finally, I understood that a Reducer can act as a 
 Combiner only if its output key/value types are the same as its input 
 key/value types.
 So yes, this is documented. But you can make life easier for users by 
 defining a Combiner interface (that Job.setCombinerClass will accept) to 
 enforce this at compile time. The new interface should extend the Reducer 
 interface and specialize it (is it even possible with generics?). 
 Alternatively, you can call this interface SimpleReducer.
 If the generics trick suggested above is impossible to implement, for the 
 (common?) case of having the same class acting as Combiner and Reducer you 
 can do either of the following:
 1) Thin Combiner implementation that wraps a given Reducer.
 2) Add a new method, say Job.setCombinerClassToReducer (that accepts a 
 Reducer), acting similarly to the new Job.setCombinerClass - but here the 
 name should alert the user she's doing something special.




[jira] Updated: (MAPREDUCE-1542) Deprecate mapred.permissions.supergroup in favor of hadoop.cluster.administrators

2010-03-14 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-1542:


Attachment: 1542.20S.1.patch

 So, why can't the mrowner and acls be created inside this class when it is 
 instantiated? Then, isMROwnerOrAdmin can be implemented directly in 
 JobACLsManager itself.

Moved adminsACL and the method isMROwnerOrAdmin() to JobACLsManager. Since 
mrOwner is used in a few other places in JobTracker and TaskTracker, I am keeping 
mrOwner as part of the JobTracker and TaskTracker objects instead of moving it to 
JobACLsManager.

Attaching a patch for an earlier version of Hadoop with the review comments 
incorporated. Not for commit here.

 Deprecate mapred.permissions.supergroup in favor of 
 hadoop.cluster.administrators
 -

 Key: MAPREDUCE-1542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Vinod K V
Assignee: Ravi Gummadi
 Fix For: 0.22.0

 Attachments: 1542.20S.1.patch, 1542.20S.patch, 1542.patch, 
 1542.v1.patch, mapreduce-1542-y20s.patch


 HADOOP-6568 added the configuration {{hadoop.cluster.administrators}} through 
 which admins can configure who the superusers/supergroups for the cluster 
 are. MAPREDUCE itself already has {{mapred.permissions.supergroup}} (which is 
 just a single group). As agreed upon at HADOOP-6568, this should be 
 deprecated in favor of {{hadoop.cluster.administrators}}.
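
One common way to handle such a deprecation is to prefer the new key and fall back to the old one with a warning. The sketch below is in Python and is not taken from the attached patches: the function `resolve_admins`, the fallback policy, and the sample values are assumptions. It also returns the raw setting and does not attempt to convert a single group name into whatever format the new key expects.

```python
import warnings

NEW_KEY = "hadoop.cluster.administrators"
OLD_KEY = "mapred.permissions.supergroup"


def resolve_admins(conf):
    """Return the admin setting, preferring the new key over the old one.

    `conf` is a plain dict standing in for a configuration object.
    """
    if NEW_KEY in conf:
        return conf[NEW_KEY]
    if OLD_KEY in conf:
        # Old key still honoured, but the user is told to migrate.
        warnings.warn(
            f"{OLD_KEY} is deprecated; use {NEW_KEY} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return conf[OLD_KEY]
    return None


# When both keys are set, the new key wins and no warning is issued.
admins = resolve_admins({NEW_KEY: "alice", OLD_KEY: "supergroup"})
```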
