[ https://issues.apache.org/jira/browse/MAHOUT-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12984537#action_12984537 ]

Hudson commented on MAHOUT-535:
-------------------------------

Integrated in Mahout-Quality #575 (See [https://hudson.apache.org/hudson/job/Mahout-Quality/575/])
    MAHOUT-535 Now operates in terms of Path, so it can support a local or HDFS file as input
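
To illustrate what operating in terms of Path changes, here is a minimal sketch of how an unqualified input such as myurls-dfs could be resolved against a default filesystem. It is a plain-Java model built on java.net.URI, not the MAHOUT-535 patch and not Hadoop's own code. The class InputPathSketch, the method qualify, the namenode address hdfs://namenode:8020 and the directories /user/matt and /home/matt are all hypothetical. In Hadoop itself the corresponding lookup is done by Path.getFileSystem(Configuration), which picks the filesystem from the path's scheme or, if there is none, from the configured default.

```java
import java.net.URI;

public class InputPathSketch {

    /**
     * Qualify a user-supplied input against a default filesystem URI and a
     * working directory. Hypothetical helper that only models the rule:
     * an explicit scheme wins, otherwise the default filesystem is used.
     */
    static URI qualify(String input, URI defaultFs, String workingDir) {
        URI raw = URI.create(input);
        if (raw.getScheme() != null) {
            // The input already names its filesystem (e.g. file: or hdfs:).
            return raw;
        }
        String path = raw.getPath();
        if (!path.startsWith("/")) {
            // Relative inputs are resolved against the working directory.
            path = workingDir + "/" + path;
        }
        String authority = defaultFs.getAuthority() == null ? "" : defaultFs.getAuthority();
        return URI.create(defaultFs.getScheme() + "://" + authority + path);
    }

    public static void main(String[] args) {
        URI local = URI.create("file:///");            // no Hadoop configuration
        URI dfs = URI.create("hdfs://namenode:8020");  // assumed default filesystem

        // Without Hadoop configured, a bare name refers to the local disk.
        System.out.println(qualify("myurls-local", local, "/home/matt"));
        // With Hadoop configured, the same kind of bare name refers to HDFS.
        System.out.println(qualify("myurls-dfs", dfs, "/user/matt"));
        // An explicit scheme overrides the default filesystem.
        System.out.println(qualify("file:///tmp/myurls-local", dfs, "/user/matt"));
    }
}
```

Under this model, a bare "-i myurls-dfs" refers to HDFS once a Hadoop configuration is in place, and a local directory would have to be named explicitly with a file: URI.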


> mahout seqdirectory reads only from the local filesystem, even when running over Hadoop
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-535
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-535
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Utils
>    Affects Versions: 0.5
>         Environment: local and hadoop
>            Reporter: Matt Spitz
>            Assignee: Isabel Drost
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-added-HDFS-support-to-seqdirectory.patch
>
>
> It seems that seqdirectory reads only from the local filesystem, though it writes correctly to HDFS.
> Consider 'myurls-local' and 'myurls-dfs', the former existing in the working directory and the latter in the home directory on HDFS.
> Running:
> MAHOUT_HOME=. ./bin/mahout seqdirectory -i myurls-local -o myurls-seqdir -c UTF-8 -chunk
> acts as expected (myurls-seqdir is created on the local filesystem).
> Running:
> MAHOUT_HOME=. HADOOP_HOME=/usr/lib/hadoop-0.20 HADOOP_CONF_DIR=/etc/hadoop-0.20/conf ./bin/mahout seqdirectory -i myurls-dfs -o myurls-seqdir -c UTF-8 -chunk
> creates a 12 KB myurls-seqdir directory on the DFS. Presumably, it couldn't read myurls-dfs from the DFS and ended up creating a nearly empty sequence directory.
> Running:
> MAHOUT_HOME=. HADOOP_HOME=/usr/lib/hadoop-0.20 HADOOP_CONF_DIR=/etc/hadoop-0.20/conf ./bin/mahout seqdirectory -i myurls-local -o myurls-seqdir -c UTF-8 -chunk
> acts as expected, creating a substantial myurls-seqdir on the DFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
