[ 
https://issues.apache.org/jira/browse/MAHOUT-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977956#action_12977956
 ] 

Shige Takeda commented on MAHOUT-578:
-------------------------------------

With the patch, the previous command works without problems.

$MAHOUT_HOME/bin/mahout canopy \
-i input \
-o output \
-dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
-t1 2.0 \
-t2 0.05 \
-cl \
-Dmapred.job.queue.name=unfunded
Running on hadoop, using HADOOP_HOME=/grid/0/gs/hadoop/current
HADOOP_CONF_DIR=/grid/0/gs/conf/current
11/01/05 20:12:35 INFO common.AbstractJob: Command line arguments: 
{--clustering=null, 
--distanceMeasure=org.apache.mahout.common.distance.EuclideanDistanceMeasure, 
--endPhase=2147483647, --input=input, --method=mapreduce, --output=output, 
--startPhase=0, --t1=2.0, --t2=0.05, --tempDir=temp}
11/01/05 20:12:35 INFO canopy.CanopyDriver: Build Clusters Input: input Out: 
output Measure: 
org.apache.mahout.common.distance.euclideandistancemeas...@39443f t1: 2.0 t2: 
0.05
11/01/05 20:12:36 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
781064 for stakeda
11/01/05 20:12:36 INFO security.TokenCache: Got dt for 
hdfs://tiberiumtan-nn1.tan.ygrid.yahoo.com/user/stakeda/.staging/job_201012140711_64094;uri=98.138.108.184:8020;t.service=98.138.108.184:8020
11/01/05 20:12:36 INFO input.FileInputFormat: Total input paths to process : 1
11/01/05 20:12:39 INFO mapred.JobClient: Running job: job_201012140711_64094
11/01/05 20:12:40 INFO mapred.JobClient:  map 0% reduce 0%
11/01/05 20:13:06 INFO mapred.JobClient:  map 11% reduce 0%
11/01/05 20:13:09 INFO mapred.JobClient:  map 14% reduce 0%
...


> canopy clustering fails if --Dmapred.job.queue.name=unfunded is specified to 
> mahout driver command line
> -------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-578
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-578
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.4
>         Environment: Linux 2.6.18-164.el5 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Shige Takeda
>         Attachments: 
> 0001-CanopyDriver-calls-ToolRunner.run-without-new-Config.patch
>
>
> Hi, I would like to demonstrate a problem with the -D option using one 
> concrete example, and to propose a fix.
> When I run canopy clustering using the mahout driver, the command line looks 
> something like this. NOTE: -Dmapred.job.queue.name is a required job 
> configuration setting in the company's environment.
> $MAHOUT_HOME/bin/mahout canopy \
>         -i input \
>         -o output \
>         -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
>         -t1 2.0 \
>         -t2 0.05 \
>         -cl \
>         -Dmapred.job.queue.name=unfunded
> and I get the error:
> Running on hadoop, using HADOOP_HOME=/grid/0/gs/hadoop/current
> HADOOP_CONF_DIR=/grid/0/gs/conf/current
> 11/01/05 20:19:15 ERROR common.AbstractJob: Unexpected 
> -Dmapred.job.queue.name=unfunded while processing Job-Specific Options:
> This is because the -D parameter is NOT parsed properly by ToolRunner.run but 
> is passed through to CanopyDriver's command-line option parsers.
> ToolRunner.run(Tool,String[]) should be used rather than 
> ToolRunner.run(Configuration,Tool,String[]) to get the -D parameter processed.
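
To illustrate the separation the report describes, here is a minimal, self-contained sketch of pulling -Dkey=value settings out of a command line before the job-specific parser sees it. It is a simplified stand-in and not Hadoop's actual parser: the class GenericOptionSketch and its split method are hypothetical names, and the sketch makes no claim about how ToolRunner handles argument order or any other generic options.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT Hadoop's option parser.
final class GenericOptionSketch {

    // Copies every -Dkey=value argument into conf and returns the
    // remaining (job-specific) arguments in their original order.
    static List<String> split(String[] args, Map<String, String> conf) {
        List<String> remaining = new ArrayList<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                // Text between "-D" and "=" is the key, the rest is the value.
                conf.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                remaining.add(arg);
            }
        }
        return remaining;
    }

    public static void main(String[] unused) {
        // Arguments taken from the command line in the report.
        String[] args = {
            "-i", "input", "-o", "output",
            "-t1", "2.0", "-t2", "0.05", "-cl",
            "-Dmapred.job.queue.name=unfunded"
        };
        Map<String, String> conf = new LinkedHashMap<>();
        List<String> jobArgs = split(args, conf);
        System.out.println("conf    = " + conf);
        System.out.println("jobArgs = " + jobArgs);
    }
}
```

In this sketch the queue name ends up in the configuration map, and the job-specific parser only receives the options it knows about, which is the behaviour the error message above shows to be missing.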

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
