[jira] Commented: (MAPREDUCE-2162) speculative execution does not handle cases where stddev > mean well

2010-12-01 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965936#action_12965936
 ] 

Joydeep Sen Sarma commented on MAPREDUCE-2162:
--

here's the reasoning behind capping stddev at mean/3. we speculate if:
* rate < mean - stddev

implies
* 1/rate > 1/(mean - stddev)

implies
* 1/rate > 1/mean + (1/(mean - stddev) - 1/mean)

implies
# projectedTime > meanTime + Delta

where
* Delta = (1/(mean - stddev) - 1/mean)

if
* stddev <= mean/3 // for example

then
* Delta <= (1/(mean - mean/3) - 1/mean) ==>
* Delta <= 0.5/mean = 0.5 * MeanTime

so our equation _1_ is guaranteed to hold whenever:
# projectedTime > MeanTime + 0.5*MeanTime

two observations:

* by capping stddev - we have converted the rate check into a meaningful check 
on the running time of a task - tasks that run longer than a certain time 
(relative to the mean) will be guaranteed to be speculated.
* the MeanTime + 0.5*MeanTime slack over the mean is the same as the heuristic 
discussed in the jira, which had two rules:
** don't speculate if runningTime <= MeanTime * 0.5
** don't speculate if remainingTime < MeanTime
* if we add these two together - since runningTime + remainingTime == 
projectedTime - this becomes (roughly): 
** speculate only if projectedTime > MeanTime + MeanTime*0.5

so the heuristics in the jira are structurally similar to capping the stddev at 
mean/3.
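
A minimal Python sketch of the argument above, not the actual Hadoop code. The function names and the `cap_fraction` parameter are hypothetical, the rates are made-up numbers, and MeanTime is taken to be 1/mean as in the derivation.

```python
def speculation_rate_threshold(mean, stddev, cap_fraction=1.0 / 3.0):
    """Progress rate below which a task is speculated.

    stddev is capped at cap_fraction * mean before it is subtracted.
    """
    capped = min(stddev, cap_fraction * mean)
    return mean - capped


def should_speculate(rate, mean, stddev, cap_fraction=1.0 / 3.0):
    """The rate check: rate < mean - (capped) stddev."""
    return rate < speculation_rate_threshold(mean, stddev, cap_fraction)


def projected_time_bound(mean, stddev, cap_fraction=1.0 / 3.0):
    """Projected time (1/rate) above which a task is speculated: MeanTime + Delta."""
    return 1.0 / speculation_rate_threshold(mean, stddev, cap_fraction)


mean = 2.0    # hypothetical mean progress rate
stddev = 5.0  # stddev > mean, the problem case in this issue
mean_time = 1.0 / mean

# Uncapped (cap_fraction = infinity) the threshold is negative, so even a
# task with zero progress rate is never speculated.
stuck_without_cap = not should_speculate(0.0, mean, stddev, float("inf"))

# Capped at mean/3, the same task is speculated, and the time threshold
# is at most 1.5 * MeanTime.
speculated_with_cap = should_speculate(0.0, mean, stddev)
bound = projected_time_bound(mean, stddev)
```

With the cap in place the bound depends only on the mean, so it can be read as a limit on projected running time rather than as a rate check.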

as explained earlier - the percentile stuff is actually (approximately) being 
done by speculativeCap (no more than 10% of the tasks can be speculated and 
tasks are sorted (by latest finish time) before speculating).

> speculative execution does not handle cases where stddev > mean well
> 
>
> Key: MAPREDUCE-2162
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2162
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Joydeep Sen Sarma
>Assignee: Joydeep Sen Sarma
>
> the new speculation code only speculates tasks whose progress rate deviates 
> from the mean progress rate of a job by more than some multiple (typically 
> 1.0) of stddev. stddev can be larger than mean. which means that if we ever 
> get into a situation where this condition holds true - then a task with even 
> 0 progress rate will not be speculated.
> it's not clear that this condition is self-correcting. if a job has thousands 
> of tasks - then one laggard task, in spite of not being speculated for a long 
> time, may not be able to fix the condition of stddev > mean.
> we have seen jobs where tasks have not been speculated for hours and this 
> seems one explanation why this may have happened. here's an example job with 
> stddev > mean:
> DataStatistics: count is 6, sum is 1.7141054797775723E-8, sumSquares is 
> 2.9381575958035014E-16 mean is 2.8568424662959537E-9 std() is 
> 6.388093955645905E-9
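
The logged numbers can be checked with a short Python sketch. It assumes that std() is the population standard deviation derived from count, sum and sumSquares; that formula is an assumption here, although it does reproduce the logged mean and std() values. The variable names are illustrative.

```python
import math

# Values copied from the DataStatistics log line above.
count = 6
total = 1.7141054797775723e-8
sum_squares = 2.9381575958035014e-16

mean = total / count
# Population variance: E[x^2] - (E[x])^2, clamped at zero against rounding.
variance = max(sum_squares / count - mean * mean, 0.0)
std = math.sqrt(variance)

# Speculation threshold with a multiple of 1.0 on stddev.
threshold = mean - 1.0 * std

# A task with zero progress rate is speculated only if 0 < threshold.
zero_rate_speculated = 0.0 < threshold
```

Because std is more than twice the mean here, the threshold is negative and no progress rate, not even zero, falls below it.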

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2028) streaming should support MultiFileInputFormat

2010-12-01 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965849#action_12965849
 ] 

Allen Wittenauer commented on MAPREDUCE-2028:
-

Actually, what should probably happen is that MultiFileWordCount's 
"MyInputFormat" and "MultiFileLineRecordReader" should get promoted out of examples 
and officially into the mapred(uce) APIs. 

The following appears to implement exactly what we streaming users want/need:

$HADOOP_HOME/bin/hadoop  \
jar \
`ls $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar` \
-libjars `ls $HADOOP_HOME/hadoop-*-examples.jar` \
-inputformat org.apache.hadoop.examples.MultiFileWordCount\$MyInputFormat \
-inputreader org.apache.hadoop.examples.MultiFileWordCount\$MultiFileLineRecordReader \



> streaming should support MultiFileInputFormat
> -
>
> Key: MAPREDUCE-2028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2028
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.20.2
>Reporter: Allen Wittenauer
> Fix For: 0.21.1, 0.22.0
>
>
> There should be a way to call MultiFileInputFormat from streaming without 
> having to write Java code...




[jira] Commented: (MAPREDUCE-2156) Raid-aware FSCK

2010-12-01 Thread Ramkumar Vadali (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965744#action_12965744
 ] 

Ramkumar Vadali commented on MAPREDUCE-2156:


+1, looks good.

> Raid-aware FSCK
> ---
>
> Key: MAPREDUCE-2156
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2156
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.23.0
>Reporter: Patrick Kling
>Assignee: Patrick Kling
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2156.2.patch, MAPREDUCE-2156.patch
>
>
> Currently, FSCK reports files as corrupt even if they can be fixed using 
> parity blocks. We need a tool that only reports files that are irreparably 
> corrupt (i.e., files for which too many data or parity blocks belonging to 
> the same stripe have been lost or corrupted).
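
The "irreparably corrupt" condition can be sketched in Python as follows. This is not the contrib/raid code: the function names are hypothetical, and it assumes an erasure code that can rebuild a stripe as long as the number of lost or corrupt blocks in it does not exceed the number of parity blocks (as with single-parity XOR or Reed-Solomon).

```python
def stripe_is_recoverable(missing_blocks, parity_blocks):
    """A stripe can be rebuilt if it lost no more blocks than it has parity blocks."""
    return missing_blocks <= parity_blocks


def file_is_irreparably_corrupt(missing_per_stripe, parity_blocks=1):
    """missing_per_stripe: count of lost/corrupt blocks (data + parity) per stripe.

    The file is irreparable if any single stripe cannot be rebuilt.
    """
    return any(
        not stripe_is_recoverable(missing, parity_blocks)
        for missing in missing_per_stripe
    )


# Hypothetical file with three stripes and one parity block per stripe (XOR).
fixable = file_is_irreparably_corrupt([1, 0, 1])      # one loss per stripe
unfixable = file_is_irreparably_corrupt([0, 2, 0])    # two losses in one stripe
```

Plain FSCK would flag both files, while a raid-aware check would report only the second.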




[jira] Commented: (MAPREDUCE-2162) speculative execution does not handle cases where stddev > mean well

2010-12-01 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965624#action_12965624
 ] 

Joydeep Sen Sarma commented on MAPREDUCE-2162:
--

spent a lot of time coding and thinking about this. i am more inclined to make a simple 
change to cap the standardDeviation at some maximum value (say Mean/3).

i did a detailed analysis that seems to suggest that doing so would be roughly 
equivalent to the scheme discussed above. we already have the notion of a 
'speculative cap' - putting a speculative cap of 10% of the currently running 
tasks would be roughly equivalent to speculating the bottom 10%. (The 
LateComparator currently sorts speculatable tasks by remaining time (instead of 
progress rate). if it were to sort based on progress rate - it would be very 
similar to speculating the bottom 10%)
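
A simplified Python sketch of how a speculative cap combined with sorting approximates "speculate the bottom 10%". The function name, the tuple layout and the flat 10% cap are assumptions for illustration; the real cap computation and LateComparator are not reproduced here.

```python
import math


def pick_speculation_candidates(running, speculatable, cap_fraction=0.10):
    """Choose which speculatable tasks to launch.

    running: number of currently running tasks
    speculatable: list of (task_id, remaining_time) that passed the rate check
    Returns at most cap_fraction * running task ids, longest remaining time first.
    """
    limit = int(math.floor(cap_fraction * running))
    ordered = sorted(speculatable, key=lambda task: task[1], reverse=True)
    return [task_id for task_id, _ in ordered[:limit]]


# Hypothetical job: 20 running tasks, 3 of which passed the rate check.
chosen = pick_speculation_candidates(
    running=20,
    speculatable=[("a", 5.0), ("b", 50.0), ("c", 20.0)],
)
```

Sorting on progress rate instead of remaining time would only change the `key` function, which is why the two schemes end up so close.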

the conditions discussed here (runningTime >= mean/2 and remainingTime

> speculative execution does not handle cases where stddev > mean well
> 
>
> Key: MAPREDUCE-2162
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2162
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Joydeep Sen Sarma
>Assignee: Joydeep Sen Sarma
>
> the new speculation code only speculates tasks whose progress rate deviates 
> from the mean progress rate of a job by more than some multiple (typically 
> 1.0) of stddev. stddev can be larger than mean. which means that if we ever 
> get into a situation where this condition holds true - then a task with even 
> 0 progress rate will not be speculated.
> it's not clear that this condition is self-correcting. if a job has thousands 
> of tasks - then one laggard task, in spite of not being speculated for a long 
> time, may not be able to fix the condition of stddev > mean.
> we have seen jobs where tasks have not been speculated for hours and this 
> seems one explanation why this may have happened. here's an example job with 
> stddev > mean:
> DataStatistics: count is 6, sum is 1.7141054797775723E-8, sumSquares is 
> 2.9381575958035014E-16 mean is 2.8568424662959537E-9 std() is 
> 6.388093955645905E-9
