[jira] Assigned: (MAPREDUCE-1239) Mapreduce test build is broken after HADOOP-5107

2009-11-24 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V reassigned MAPREDUCE-1239:


Assignee: Vinod K V

> Mapreduce test build is broken after HADOOP-5107
> 
>
> Key: MAPREDUCE-1239
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1239
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Vinod K V
>Assignee: Vinod K V
>Priority: Blocker
> Fix For: 0.21.0
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1185) URL to JT webconsole for running job and job history should be the same

2009-11-24 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782307#action_12782307
 ] 

Sharad Agarwal commented on MAPREDUCE-1185:
---

bq. I think the approach of including the job history file name in the URL 
since the beginning will cause more headaches, since the job history file name 
includes some things that are unparseable by humans. This will require a map 
between job ids and the file name to be kept inside the jobtracker, 
Since the URL is handed out by the JobClient API, the thinking was that users 
don't really need to care about its contents, and that a map-based approach 
would not survive JT restarts. The long-term solution is being done in 
MAPREDUCE-323. That said, it looks like (also based on an offline discussion 
with Arun) a map-based solution is acceptable in the medium term, until 
MAPREDUCE-323 is done.

bq. but that should not be too big, since the entries can be removed when job 
history is purged periodically. Makes sense ?
Yes, that should work. One problem to keep in mind: when the history cleaner 
thread runs, it lists the files in the history folder and deletes the ones 
older than 30 days (the default). In the meantime, however, operations could 
have purged files manually, and those entries would never be purged from the 
map. To address this, a timestamp can be maintained along with the history 
file name.
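
The map-with-timestamp idea above can be sketched roughly as follows. This is 
only an illustration, not code from any of the attached patches: the class 
name JobHistoryIndex, its methods, and the use of plain strings for job ids 
are all assumptions made for the example.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: maps job ids to history file names and expires entries by age. */
final class JobHistoryIndex {
    /** A history file name plus the time at which the entry was recorded. */
    private static final class Entry {
        final String fileName;
        final long recordedAtMillis;

        Entry(String fileName, long recordedAtMillis) {
            this.fileName = fileName;
            this.recordedAtMillis = recordedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long maxAgeMillis;

    JobHistoryIndex(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Records (or replaces) the history file name for a job. */
    void put(String jobId, String historyFileName, long nowMillis) {
        entries.put(jobId, new Entry(historyFileName, nowMillis));
    }

    /** Returns the history file name for the job, or null if unknown or already purged. */
    String lookup(String jobId) {
        Entry e = entries.get(jobId);
        return e == null ? null : e.fileName;
    }

    /**
     * Drops every entry older than maxAgeMillis. The decision is based only on
     * the stored timestamp, so entries whose files were deleted by hand still expire.
     * Returns the number of entries removed.
     */
    int purgeExpired(long nowMillis) {
        int removed = 0;
        for (Iterator<Entry> it = entries.values().iterator(); it.hasNext();) {
            if (nowMillis - it.next().recordedAtMillis > maxAgeMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Because expiry depends on the recorded timestamp rather than on a directory 
listing, the map cannot keep entries for files that were removed out of band.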


> URL to JT webconsole for running job and job history should be the same
> ---
>
> Key: MAPREDUCE-1185
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1185
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sharad Agarwal
>Assignee: Sharad Agarwal
> Attachments: 1185_v1.patch, 1185_v2.patch, 1185_v3.patch, 
> 1185_v4.patch
>
>
> The tracking URL for running jobs differs from the one for retired jobs. 
> This creates a problem for clients that cache the running job's URL, because 
> it becomes invalid as soon as the job is retired.




[jira] Commented: (MAPREDUCE-1186) While localizing a DistributedCache file, TT sets permissions recursively on the whole base-dir

2009-11-24 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782303#action_12782303
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-1186:


test-patch and ant test passed on my machine, for the Y!20 patch

> While localizing a DistributedCache file, TT sets permissions recursively on 
> the whole base-dir
> ---
>
> Key: MAPREDUCE-1186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Vinod K V
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-1186-ydist.txt, patch-1186-ydist.txt, 
> patch-1186.txt
>
>
> This is a performance problem.




[jira] Updated: (MAPREDUCE-1222) [Mumak] We should not include nodes with numeric ips in cluster topology.

2009-11-24 Thread Dick King (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dick King updated MAPREDUCE-1222:
-

Attachment: IPv6-predicate.patch

This is a supplement to the previous patch to expand the regexp to meet the 
requirements of RFC 4291.

The regular expression and the finite state machine that supports it are longer 
than they need to be -- about 2.6K, and probably about 500-1000 nodes in the 
FSM.  I did it this way to avoid as much backtracking as possible.  It seems to 
take about 10 microseconds to handle a full IPv6 address, a bit more for an 
address with elisions, a bit less for an IPv4 address, and about 5 microseconds 
for a non-numeric address.

I'll open the regular expression for comment before I fold it in with the patch 
and with other changes I've suggested in the main patch.
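
For readers without the attachment at hand, here is a much smaller sketch of 
what a numeric-address predicate can look like. It is not the regular 
expression from IPv6-predicate.patch: the class and method names are 
hypothetical, it only covers dotted-quad IPv4 and IPv6 in full form or with a 
single "::" elision, and it ignores IPv6 forms with an embedded IPv4 tail as 
well as escaped punctuation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified predicate; not the regexp from the attached patch. */
final class NumericAddressPredicate {
    // One IPv4 octet, 0-255, without leading zeros.
    private static final String OCTET =
        "(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";
    private static final Pattern IPV4 =
        Pattern.compile(OCTET + "(?:\\." + OCTET + "){3}");

    // One IPv6 group: 1 to 4 hex digits.
    private static final String H16 = "[0-9A-Fa-f]{1,4}";
    private static final Pattern IPV6_FULL =
        Pattern.compile(H16 + "(?::" + H16 + "){7}");

    // Groups on either side of a single "::" elision; either side may be empty.
    private static final Pattern IPV6_ELIDED = Pattern.compile(
        "((?:" + H16 + "(?::" + H16 + ")*)?)::((?:" + H16 + "(?::" + H16 + ")*)?)");

    /** Counts the colon-separated groups in one side of an elided address. */
    private static int groups(String side) {
        return side.isEmpty() ? 0 : side.split(":").length;
    }

    /** Returns true if host looks like a numeric IPv4 or IPv6 address. */
    static boolean isNumericAddress(String host) {
        if (IPV4.matcher(host).matches() || IPV6_FULL.matcher(host).matches()) {
            return true;
        }
        Matcher m = IPV6_ELIDED.matcher(host);
        // "::" stands for at least one zero group, so at most 7 groups may be written out.
        return m.matches() && groups(m.group(1)) + groups(m.group(2)) <= 7;
    }
}
```

Counting the groups in code instead of enumerating every elision position in 
the pattern keeps the expression short, at the cost of a second pass over the 
matched text.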

> [Mumak] We should not include nodes with numeric ips in cluster topology.
> -
>
> Key: MAPREDUCE-1222
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1222
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/mumak
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Hong Tang
>Assignee: Hong Tang
> Fix For: 0.21.0, 0.22.0
>
> Attachments: IPv6-predicate.patch, mapreduce-1222-20091119.patch, 
> mapreduce-1222-20091121.patch
>
>
> Rumen infers cluster topology by parsing input split locations from job 
> history logs. Due to HDFS-778, a cluster node may appear both as a numeric IP 
> and as a host name in job history logs. We should exclude nodes that appear as 
> numeric IPs from the cluster topology when we run mumak, until a solution is 
> found so that numeric IPs never appear in input split locations.




[jira] Commented: (MAPREDUCE-1222) [Mumak] We should not include nodes with numeric ips in cluster topology.

2009-11-24 Thread Dick King (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782131#action_12782131
 ] 

Dick King commented on MAPREDUCE-1222:
--

We have to use a regular expression rather than any API that Java might provide 
for parsing IP addresses, because the punctuation characters in IP addresses 
sometimes are -- and sometimes are not -- escaped.

> [Mumak] We should not include nodes with numeric ips in cluster topology.
> -
>
> Key: MAPREDUCE-1222
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1222
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/mumak
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Hong Tang
>Assignee: Hong Tang
> Fix For: 0.21.0, 0.22.0
>
> Attachments: mapreduce-1222-20091119.patch, 
> mapreduce-1222-20091121.patch
>
>
> Rumen infers cluster topology by parsing input split locations from job 
> history logs. Due to HDFS-778, a cluster node may appear both as a numeric IP 
> and as a host name in job history logs. We should exclude nodes that appear as 
> numeric IPs from the cluster topology when we run mumak, until a solution is 
> found so that numeric IPs never appear in input split locations.




[jira] Commented: (MAPREDUCE-698) Per-pool task limits for the fair scheduler

2009-11-24 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782110#action_12782110
 ] 

Matei Zaharia commented on MAPREDUCE-698:
-

Oops, I forgot to add one other important comment: The new config parameters 
should be documented in the fair scheduler's Forrest documentation in 
src/docs/src/documentation/content/xdocs/fair_scheduler.xml (in the Allocation 
File section). Also, the documentation should say what happens if the maxTasks 
of a pool is set lower than its minTasks: in this case, the maxTasks takes 
precedence. It might also be good to print a warning in PoolManager when 
loading a config file if we see a pool with maxTasks < minTasks. You should do 
this after you finish reading the element for that pool.

One other very minor thing: there seem to be some tabs in the patch; please 
replace them with spaces.

Thanks for taking the time to port this to trunk!
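
The maxTasks/minTasks rule described in this comment can be sketched as below. 
This is only an illustration under assumed names: PoolLimits, validate() and 
effectiveMinTasks() are hypothetical and are not part of PoolManager or of the 
patch under review.

```java
/** Hypothetical sketch of the "maxTasks takes precedence" rule for one pool. */
final class PoolLimits {
    final String poolName;
    final int minTasks;
    final int maxTasks;

    PoolLimits(String poolName, int minTasks, int maxTasks) {
        this.poolName = poolName;
        this.minTasks = minTasks;
        this.maxTasks = maxTasks;
    }

    /** Effective minimum: the maximum wins when it is lower than the minimum. */
    int effectiveMinTasks() {
        return Math.min(minTasks, maxTasks);
    }

    /** Returns a warning message if maxTasks < minTasks, or null if the limits are consistent. */
    String validate() {
        if (maxTasks < minTasks) {
            return "Pool " + poolName + " has maxTasks (" + maxTasks
                + ") < minTasks (" + minTasks + "); maxTasks takes precedence";
        }
        return null;
    }
}
```

A config loader would call validate() once per pool, after both limits have 
been read, and pass any non-null message to its logger.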

> Per-pool task limits for the fair scheduler
> ---
>
> Key: MAPREDUCE-698
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-698
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/fair-share
>Reporter: Matei Zaharia
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-698-prelim.patch, mapreduce-698-trunk.patch, 
> mapreduce-698-trunk.patch
>
>
> The fair scheduler could use a way to cap the share of a given pool similar 
> to MAPREDUCE-532.




[jira] Commented: (MAPREDUCE-698) Per-pool task limits for the fair scheduler

2009-11-24 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782106#action_12782106
 ] 

Matei Zaharia commented on MAPREDUCE-698:
-

The patch mostly looks good to me, but I have a few comments:
* There seem to be some extra newlines added in FairScheduler.java. You might 
want to just svn revert it.
* What is the purpose of the changes to build-contrib.xml? Are they just 
something you copied in from another build.xml file?
* In the unit test, it would be good to submit both jobs first (with an 
advanceTime between them) before making any checkAssignment calls, so that both 
jobs are initially available. Note that the scheduler will probably alternate 
between assigning tasks from the two jobs in this case, but it should still not 
let job 1 go over its max share.
* In the web UI, can you call the new "Max" column "Max Share" instead, so it 
is more consistent with the other column names?
* You can simplify Pool.numRunningTasks(TaskType type) to just return 
getSchedulable(type).getRunningTasks(). It might also be good to rename the 
method to getRunningTasks instead of numRunningTasks so that it's more 
consistent with the rest of the code.
* Instead of making PoolSchedulable.getDemand() return maxTasks if demand > 
maxTasks, it would be better to change PoolSchedulable.updateDemand() to cap 
the demand at the end of the method, so that getDemand() just returns demand. 
Otherwise, it will be a little confusing to have a variable called demand in 
PoolSchedulable whose value is not the same as that returned by getDemand().
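
The last point, capping inside updateDemand() so that getDemand() stays a 
plain getter, might look like the sketch below. The class name and the int[] 
parameter are stand-ins chosen for the example; the real PoolSchedulable 
gathers demand from its jobs and is not reproduced here.

```java
/** Hypothetical sketch: demand is capped once, when it is recomputed. */
final class CappedDemandSchedulable {
    private final int maxTasks;
    private int demand;

    CappedDemandSchedulable(int maxTasks) {
        this.maxTasks = maxTasks;
    }

    /** Recomputes demand from per-job demands and applies the cap at the end. */
    void updateDemand(int[] jobDemands) {
        int total = 0;
        for (int d : jobDemands) {
            total += d;
        }
        demand = Math.min(total, maxTasks);
    }

    /** Plain getter: the stored field already reflects the cap. */
    int getDemand() {
        return demand;
    }
}
```

With this arrangement the field and the getter can never disagree, which is 
the confusion the review comment is trying to avoid.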

> Per-pool task limits for the fair scheduler
> ---
>
> Key: MAPREDUCE-698
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-698
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/fair-share
>Reporter: Matei Zaharia
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-698-prelim.patch, mapreduce-698-trunk.patch, 
> mapreduce-698-trunk.patch
>
>
> The fair scheduler could use a way to cap the share of a given pool similar 
> to MAPREDUCE-532.




[jira] Commented: (MAPREDUCE-1229) [Mumak] Allow customization of job submission policy

2009-11-24 Thread Dick King (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782087#action_12782087
 ] 

Dick King commented on MAPREDUCE-1229:
--

1: Should {{TestSimulator*JobSubmission}} check to see whether the total 
"runtime" was reasonable for the Policy?

2: minor nit: Should {{SimulatorJobSubmissionPolicy/getPolicy(Configuration)}} 
use {{valueOf(policy.toUpperCase())}} instead of looping through the types?

3: medium sized nit: in {{SimulatorJobClient.isOverloaded()}} there are two 
literals, 0.9 and 2.0F, that ought to be {{static private}} named values.

4: Here is my biggest point.  When the policy is {{STRESS}} and the jobs were 
originally spaced more than five seconds apart, the existing code cannot submit 
jobs more often than once every five seconds.

Please consider adding code to call the {{processLoadProbingEvent}} core code 
from {{processJobCompleteEvent}} and {{processJobSubmitEvent}} as well.  That 
includes potentially adding a new {{LoadProbingEvent}} .  This can lead to an 
accumulation because each {{LoadProbingEvent}} replaces itself, so we should 
track the ones that are in flight in a {{PriorityQueue}} and only add a new 
{{LoadProbingEvent}} when the new event has a time stamp strictly earlier 
than the earliest one already in flight.  This will limit us to two events in 
flight with the current {{adjustLoadProbingInterval}} .  

If you don't do that, then if a real dreadnought of a job gets dropped into the 
system and the probing interval gets long, it could take us a while to notice 
that we're okay to submit jobs in the case where the job has many tasks 
finishing at about the same time, and we could end up submitting tiny jobs one 
at a time every five seconds when the cluster is clear enough to accommodate 
lots of jobs.  When the cluster can handle N jobs in less than 5N seconds for 
some N, we won't overload it with the existing code.
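
The in-flight bookkeeping proposed in point 4 could be sketched like this. 
ProbeTracker and its methods are hypothetical names used only for this 
illustration; the sketch tracks bare time stamps and does not touch the 
simulator's real event classes.

```java
import java.util.PriorityQueue;

/** Hypothetical sketch: tracks the time stamps of load-probing events in flight. */
final class ProbeTracker {
    // Min-heap, so peek() returns the earliest time stamp in flight.
    private final PriorityQueue<Long> inFlight = new PriorityQueue<>();

    /**
     * Decides whether a new probe at the given time should be scheduled.
     * A probe is accepted only if nothing is in flight or if it is strictly
     * earlier than the earliest probe already in flight.
     */
    boolean offerProbe(long timestamp) {
        Long earliest = inFlight.peek();
        if (earliest == null || timestamp < earliest) {
            inFlight.add(timestamp);
            return true;
        }
        return false;
    }

    /** Called when the earliest in-flight probe fires; removes it from the queue. */
    void probeFired() {
        inFlight.poll();
    }

    int inFlightCount() {
        return inFlight.size();
    }
}
```

The strict comparison is what prevents accumulation: a probe that would fire 
at or after an existing one adds no information and is dropped.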





> [Mumak] Allow customization of job submission policy
> 
>
> Key: MAPREDUCE-1229
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1229
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/mumak
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Hong Tang
>Assignee: Hong Tang
> Fix For: 0.21.0, 0.22.0
>
> Attachments: mapreduce-1229-20091121.patch, 
> mapreduce-1229-20091123.patch
>
>
> Currently, mumak replays job submission faithfully. To make mumak useful for 
> evaluation purposes, it would be great if we could support other job submission 
> policies such as sequential job submission, or stress job submission.




[jira] Commented: (MAPREDUCE-1222) [Mumak] We should not include nodes with numeric ips in cluster topology.

2009-11-24 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782073#action_12782073
 ] 

Allen Wittenauer commented on MAPREDUCE-1222:
-

Yes, you should worry about IPv6.

> [Mumak] We should not include nodes with numeric ips in cluster topology.
> -
>
> Key: MAPREDUCE-1222
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1222
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/mumak
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Hong Tang
>Assignee: Hong Tang
> Fix For: 0.21.0, 0.22.0
>
> Attachments: mapreduce-1222-20091119.patch, 
> mapreduce-1222-20091121.patch
>
>
> Rumen infers cluster topology by parsing input split locations from job 
> history logs. Due to HDFS-778, a cluster node may appear both as a numeric IP 
> and as a host name in job history logs. We should exclude nodes that appear as 
> numeric IPs from the cluster topology when we run mumak, until a solution is 
> found so that numeric IPs never appear in input split locations.




[jira] Commented: (MAPREDUCE-1185) URL to JT webconsole for running job and job history should be the same

2009-11-24 Thread Milind Bhandarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782062#action_12782062
 ] 

Milind Bhandarkar commented on MAPREDUCE-1185:
--

I think the approach of including the job history file name in the URL since 
the beginning will cause more headaches, since the job history file name 
includes some things that are unparseable by humans. It may be easier and more 
human-friendly to translate the job id internally to the history file name, and 
return the content of job history. This will require a map between job ids and 
the file name to be kept inside the jobtracker, but that should not be too big, 
since the entries can be removed when job history is purged periodically. Does 
that make sense?

In any case, Hadoop 0.21 will have a different, human-friendly file naming 
scheme, at which point this can go away.

> URL to JT webconsole for running job and job history should be the same
> ---
>
> Key: MAPREDUCE-1185
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1185
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sharad Agarwal
>Assignee: Sharad Agarwal
> Attachments: 1185_v1.patch, 1185_v2.patch, 1185_v3.patch, 
> 1185_v4.patch
>
>
> The tracking URL for running jobs differs from the one for retired jobs. 
> This creates a problem for clients that cache the running job's URL, because 
> it becomes invalid as soon as the job is retired.




[jira] Commented: (MAPREDUCE-1222) [Mumak] We should not include nodes with numeric ips in cluster topology.

2009-11-24 Thread Dick King (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782037#action_12782037
 ] 

Dick King commented on MAPREDUCE-1222:
--

1: Should we worry about IPv6 numeric addresses?  I'll write the regex if you 
want.

2: Proposed line 249 of {{SimulatorEngine.java}} reads:

 {{// ips and as host names.  We remove them from the parsed network topology}}

   and should read

 {{// ips [as opposed to host names].  We remove numeric ips from the parsed 
network topology}}

The former wording implies that we remove _both forms_ of a doubled address.

3: Do we mean to make {{isIPAddress}} part of the package local API?

  3a: If not, {{isIPAddress}} should be tested by adding addresses that 
superficially look like IPs, but aren't, to the test case in 
{{topo-with-numeric-ips.json}} .

> [Mumak] We should not include nodes with numeric ips in cluster topology.
> -
>
> Key: MAPREDUCE-1222
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1222
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/mumak
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Hong Tang
>Assignee: Hong Tang
> Fix For: 0.21.0, 0.22.0
>
> Attachments: mapreduce-1222-20091119.patch, 
> mapreduce-1222-20091121.patch
>
>
> Rumen infers cluster topology by parsing input split locations from job 
> history logs. Due to HDFS-778, a cluster node may appear both as a numeric IP 
> and as a host name in job history logs. We should exclude nodes that appear as 
> numeric IPs from the cluster topology when we run mumak, until a solution is 
> found so that numeric IPs never appear in input split locations.




[jira] Resolved: (MAPREDUCE-1047) hadoop.pipes.command.port missing from environment when using bash 4

2009-11-24 Thread Simone Leo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simone Leo resolved MAPREDUCE-1047.
---

Resolution: Duplicate

Looks like the problem is [more 
general|https://issues.apache.org/jira/browse/HADOOP-6388]. Resolving as 
duplicate.

> hadoop.pipes.command.port missing from environment when using bash 4
> 
>
> Key: MAPREDUCE-1047
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1047
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 0.20.1
> Environment: Linux 2.6.30-gentoo-r6 -- i686 Intel(R) Core(TM)2 Duo 
> CPU T8100 @ 2.10GHz
>Reporter: Simone Leo
>
> I recently upgraded to gnu bash 4.0 and found out that Hadoop Pipes 
> applications break because they cannot read the "hadoop.pipes.command.port" 
> environment variable.




[jira] Commented: (MAPREDUCE-698) Per-pool task limits for the fair scheduler

2009-11-24 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781879#action_12781879
 ] 

dhruba borthakur commented on MAPREDUCE-698:


It would be nice if somebody could review this patch.

> Per-pool task limits for the fair scheduler
> ---
>
> Key: MAPREDUCE-698
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-698
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/fair-share
>Reporter: Matei Zaharia
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-698-prelim.patch, mapreduce-698-trunk.patch, 
> mapreduce-698-trunk.patch
>
>
> The fair scheduler could use a way to cap the share of a given pool similar 
> to MAPREDUCE-532.




[jira] Updated: (MAPREDUCE-1239) Mapreduce test build is broken after HADOOP-5107

2009-11-24 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-1239:
-

Priority: Blocker  (was: Major)

> Mapreduce test build is broken after HADOOP-5107
> 
>
> Key: MAPREDUCE-1239
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1239
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Vinod K V
>Priority: Blocker
> Fix For: 0.21.0
>
>





[jira] Commented: (MAPREDUCE-1239) Mapreduce test build is broken after HADOOP-5107

2009-11-24 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781875#action_12781875
 ] 

Vinod K V commented on MAPREDUCE-1239:
--

The build fails because the contrib projects are missing their dependencies on 
the core/hdfs jars. For example, try running {{TestCapacityScheduler}}.

> Mapreduce test build is broken after HADOOP-5107
> 
>
> Key: MAPREDUCE-1239
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1239
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Vinod K V
> Fix For: 0.21.0
>
>





[jira] Created: (MAPREDUCE-1239) Mapreduce test build is broken after HADOOP-5107

2009-11-24 Thread Vinod K V (JIRA)
Mapreduce test build is broken after HADOOP-5107


 Key: MAPREDUCE-1239
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1239
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.21.0, 0.22.0
Reporter: Vinod K V
 Fix For: 0.21.0







[jira] Updated: (MAPREDUCE-1186) While localizing a DistributedCache file, TT sets permissions recursively on the whole base-dir

2009-11-24 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1186:
---

Attachment: patch-1186-ydist.txt

Patch incorporating Arun's offline review comments on Y!20 patch.

> While localizing a DistributedCache file, TT sets permissions recursively on 
> the whole base-dir
> ---
>
> Key: MAPREDUCE-1186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Vinod K V
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-1186-ydist.txt, patch-1186-ydist.txt, 
> patch-1186.txt
>
>
> This is a performance problem.




[jira] Commented: (MAPREDUCE-1020) Add more unit tests to test the queue refresh feature MAPREDUCE-893

2009-11-24 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781862#action_12781862
 ] 

Vinod K V commented on MAPREDUCE-1020:
--

Here are some of the pending tests that come to my mind:
 - verify that support for priorities cannot be changed from enabled to 
disabled and vice versa during refresh
 - verify the consistency in state of JobQueuesManager
-- what happens to the 'jobQueues' list? (Not sure, myself)
-- the actual lists of running and waiting jobs
 - JobInitializationPoller
  -- per-queue maxUsersAllowedToInitialize
  -- per-queue maxJobsPerUserToInitialize
  -- running-state of the poller
-- jobs with the poller
-- how changed ulimits affect the after-refresh behaviour of the poller
 - verify the running-state of the scheduler
-- numJobsByUser
-- numWaitingJobs
-- maxCapacityPercent
-- queue-scheduling-context objects
-- map/reduce task-scheduling information
  -- cluster-capacity
  -- numRunningTasks
  -- numSlotsOccupied
  -- maxCapacity
  -- numSlotsOccupiedByUser

> Add more unit tests to test the queue refresh feature MAPREDUCE-893
> ---
>
> Key: MAPREDUCE-1020
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1020
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.21.0
>Reporter: Vinod K V
> Fix For: 0.21.0
>
>
> MAPREDUCE-893 included unit tests verifying the sanity of the refresh feature 
> - both the queue properties' refresh as well as the scheduler properties' 
> refresh. The test suite can and should be expanded. This will help identify 
> issues that would otherwise be caught only during manual testing. For 
> example, during manual testing of MAPREDUCE-893, we identified an NPE in the 
> scheduler iteration occurring during heartbeat, which could have been easily 
> identified by unit tests.




[jira] Updated: (MAPREDUCE-698) Per-pool task limits for the fair scheduler

2009-11-24 Thread Kevin Peterson (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Peterson updated MAPREDUCE-698:
-

Attachment: mapreduce-698-trunk.patch

Update patch to display caps in the scheduler UI.

> Per-pool task limits for the fair scheduler
> ---
>
> Key: MAPREDUCE-698
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-698
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/fair-share
>Reporter: Matei Zaharia
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-698-prelim.patch, mapreduce-698-trunk.patch, 
> mapreduce-698-trunk.patch
>
>
> The fair scheduler could use a way to cap the share of a given pool similar 
> to MAPREDUCE-532.




[jira] Commented: (MAPREDUCE-1231) Distcp is very slow

2009-11-24 Thread Venkatesh S (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781830#action_12781830
 ] 

Venkatesh S commented on MAPREDUCE-1231:


Arun, this feature was added in H20 and was not available in H18. Hence, this 
change should be sufficient for us to address the slow start up times.

> Distcp is very slow
> ---
>
> Key: MAPREDUCE-1231
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1231
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 0.20.1
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
> Fix For: 0.20.2
>
> Attachments: mapred-1231.patch
>
>
> Currently distcp does a checksum check in addition to a file length check to 
> decide if a remote file has to be copied. If the number of files is high 
> (thousands), this checksum check proves to be fairly costly, leading to a 
> long delay before the copy starts.
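
The two-stage comparison in the description above can be illustrated with the 
sketch below. It is not distcp's code and does not claim to show what the 
attached patch does: CopyDecision, needsCopy and the compareChecksums flag are 
hypothetical, and checksums are modelled as plain long values supplied lazily.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch: cheap length check first, costly checksum check only on request. */
final class CopyDecision {
    /**
     * Returns true if the source file should be copied over the destination.
     * The checksum suppliers are only invoked when the lengths match and
     * compareChecksums is true, so the expensive work is skipped otherwise.
     */
    static boolean needsCopy(long srcLength, long dstLength, boolean compareChecksums,
                             LongSupplier srcChecksum, LongSupplier dstChecksum) {
        if (srcLength != dstLength) {
            return true;   // lengths differ: no need to look at checksums
        }
        if (!compareChecksums) {
            return false;  // lengths match and the caller opted out of checksums
        }
        return srcChecksum.getAsLong() != dstChecksum.getAsLong();
    }
}
```

Passing the checksums as suppliers makes the cost explicit: with thousands of 
files, every avoided supplier call is a remote checksum request that is never 
issued.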




[jira] Updated: (MAPREDUCE-1186) While localizing a DistributedCache file, TT sets permissions recursively on the whole base-dir

2009-11-24 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1186:
---

Status: Patch Available  (was: Open)

> While localizing a DistributedCache file, TT sets permissions recursively on 
> the whole base-dir
> ---
>
> Key: MAPREDUCE-1186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Vinod K V
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-1186-ydist.txt, patch-1186.txt
>
>
> This is a performance problem.




[jira] Updated: (MAPREDUCE-1186) While localizing a DistributedCache file, TT sets permissions recursively on the whole base-dir

2009-11-24 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1186:
---

Attachment: patch-1186.txt

Patch for trunk doing the one-line change. It also fixes a minor bug in the 
testcase. The patch can be improved further with some more changes in 
LinuxTaskController. 

> While localizing a DistributedCache file, TT sets permissions recursively on 
> the whole base-dir
> ---
>
> Key: MAPREDUCE-1186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Vinod K V
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-1186-ydist.txt, patch-1186.txt
>
>
> This is a performance problem.




[jira] Updated: (MAPREDUCE-1186) While localizing a DistributedCache file, TT sets permissions recursively on the whole base-dir

2009-11-24 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1186:
---

Attachment: patch-1186-ydist.txt

Patch for Y! distribution

> While localizing a DistributedCache file, TT sets permissions recursively on 
> the whole base-dir
> ---
>
> Key: MAPREDUCE-1186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Vinod K V
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-1186-ydist.txt
>
>
> This is a performance problem.




[jira] Updated: (MAPREDUCE-1009) Forrest documentation needs to be updated to describes features provided for supporting hierarchical queues

2009-11-24 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-1009:
-

Attachment: MAPREDUCE-1009-20091124.txt

Uploading patch that addresses Rahul's review comments.

> Forrest documentation needs to be updated to describes features provided for 
> supporting hierarchical queues
> ---
>
> Key: MAPREDUCE-1009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.21.0
>Reporter: Hemanth Yamijala
>Assignee: Vinod K V
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-1009-20091008.txt, 
> MAPREDUCE-1009-20091116.txt, MAPREDUCE-1009-20091124.txt
>
>
> Forrest documentation must be updated for describing how to set up and use 
> hierarchical queues in the framework and the capacity scheduler.
