> But *LAUNCHED_TASK* stays always one?

The number of BSP tasks is determined by the InputFormat. Basically, the number 
of tasks equals the number of blocks of a single input file, or the number of 
input files when there are several. So you can’t force the number of tasks 
without partitioning the input.
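
To make the rule above concrete, here is a rough sketch of how the task count 
follows from the input layout. It is not Hama code: the class and method names 
are made up for illustration, the 64 MB block size is only an assumed value, 
and summing blocks over several files is my simplification of the rule.

```java
// Hypothetical illustration only; none of these names exist in Hama.
class SplitCountSketch {

    // Number of blocks a file of the given size occupies (ceiling division).
    static long blocksFor(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes <= 0) {
            return 0;
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    // Expected task count: one task per block, summed over all input files.
    // Small files occupy one block each, so this reduces to the file count.
    static long expectedTasks(long blockSizeBytes, long... fileSizesBytes) {
        long total = 0;
        for (long size : fileSizesBytes) {
            total += blocksFor(size, blockSizeBytes);
        }
        return total;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed block size
        long kb = 1024L;
        long mb = 1024L * 1024;

        // One small file: a single block, so a single task,
        // whatever value was passed to setNumBspTask.
        System.out.println(expectedTasks(blockSize, 10 * kb));

        // One 200 MB file: four blocks, so four tasks.
        System.out.println(expectedTasks(blockSize, 200 * mb));

        // Three small files: one block each, so three tasks.
        System.out.println(expectedTasks(blockSize, 10 * kb, 10 * kb, 10 * kb));
    }
}
```

Under these assumptions a small single-file test input always yields one task, 
which matches the LAUNCHED_TASK value you are seeing.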

Meanwhile, in the GraphJob case, PartitioningRunner creates as many partitions 
as the user requests, and it runs before GraphJobRunner. So you can set the 
number of tasks for a graph job.
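
The idea of partitioning the input up front can be sketched as follows. This 
is only a generic hash-partitioning example, not the actual PartitioningRunner 
implementation; the class, the method names, and the choice of hashing are all 
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Hama's PartitioningRunner.
class HashPartitionSketch {

    // Map a key to one of numTasks partitions. floorMod keeps the result
    // non-negative even when hashCode() is negative.
    static int partitionOf(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Distribute the keys over exactly numTasks partitions, so that each
    // task can later read its own partition.
    static List<List<String>> partition(List<String> keys, int numTasks) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String key : keys) {
            partitions.get(partitionOf(key, numTasks)).add(key);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> vertexIds = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            vertexIds.add("vertex-" + i);
        }
        // Three partitions are created regardless of how the input was stored.
        List<List<String>> parts = partition(vertexIds, 3);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("partition " + i + ": " + parts.get(i));
        }
    }
}
```

Because the partition count is chosen before the job proper starts, it no 
longer depends on how many blocks or files the original input has.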

On Jan 4, 2014, at 7:51 PM, Martin Illecker (JIRA) <[email protected]> wrote:

> 
>     [ https://issues.apache.org/jira/browse/HAMA-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
> Martin Illecker updated HAMA-834:
> ---------------------------------
> 
>    Attachment: HAMA-834.patch
> 
> Please see the updated patch.
> You have to compare double values:
> {code}
> -      assertTrue(doubleVector.get(0) >= 50 && doubleVector.get(0) < 51);
> -      assertTrue(doubleVector.get(1) >= 50 && doubleVector.get(1) < 51);
> +      assertEquals(Double.valueOf(50), doubleVector.get(0));
> +      assertEquals(Double.valueOf(50), doubleVector.get(1));
> {code}
> 
> BTW why do we need 101 input vectors not 100?
> {code}
> -      for (int i = 0; i < 100; i++) {
> +      for (int i = 0; i < 101; i++) {
> {code}
> The resulting center of 100 input vectors would be (49.5, 49.5).
> 
> {quote}
> What do you mean exactly?
> {quote}
> 
> Finally I want to verify the result for a different amount of *NumBspTask*.
> Therefore I set the *NumBspTask* within TestKMeansBSP.
> {code}
> +      job.setNumBspTask(3);
> +      System.out.println("NumBspTask: " + job.getNumBspTask());
> {code}
> But *LAUNCHED_TASK* stays always one?
> 
> 
>> Fix KMeans example
>> ------------------
>> 
>>                Key: HAMA-834
>>                URL: https://issues.apache.org/jira/browse/HAMA-834
>>            Project: Hama
>>         Issue Type: Bug
>>         Components: examples, machine learning
>>   Affects Versions: 0.6.3
>>           Reporter: Martin Illecker
>>             Labels: example
>>            Fix For: 0.7.0
>> 
>>        Attachments: HAMA-834.patch
>> 
>> 
>> Fix problems in KMeans example and revise test case.
>> 1) Typo \[1] and input path issue
>> 2) Wrong *summationCount* in assignCentersInternal
>> *summationCount* should also be incremented if \[2] 
>> {code}
>> if (clusterCenter == null) {
>>  newCenterArray[lowestDistantCenter] = key;
>> }
>> {code}
>> Otherwise *summationCount* may stay zero when only one value is assigned. 
>> Then this zero will be propagated to *incrementSum* \[3] and might cause a 
>> divide by zero in \[4]. 
>> By the way if we add three vectors and the *summationCount* would only be 
>> two, this will lead to wrong results. Because later we are dividing the 
>> vector by the amount of increments.
>> 3) Results depend on the amount *numBspTask*
>> (results vary if *numBspTask* is changed)
>> \[1]
>> https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L518-519
>> \[2] 
>> https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L249
>> \[3]
>> https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L161
>> \[4] 
>> https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L172
