Hi Giacomo,

Congratulations on setting up a Flink cluster with HDFS :) To run the
WordCount example provided with Flink, you should first upload your
input file to HDFS. If you have not done so, please run

> hdfs dfs -put -p file:///home/user/yourinputfile hdfs:///wc_input

Then, you can use the Flink command-line tool to submit the WordCount job.

> ./bin/flink run -v examples/flink-java-examples-*-WordCount.jar 
> hdfs:///wc_input hdfs:///wc_output
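For reference, the job just computes word frequencies. Here is a minimal local sketch in plain Python (not the Flink API) of roughly what the example's tokenizer does, so you can sanity-check the counts you find under hdfs:///wc_output:

```python
# Plain-Python sketch of the WordCount logic -- for sanity-checking output only.
# The tokenization (lowercase, split on non-letters) mirrors the Flink example's
# tokenizer approximately; the real job's exact output format may differ.
from collections import Counter
import re

def word_count(text):
    # Lowercase the input and extract runs of letters as words.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)

counts = word_count("To be, or not to be")
print(sorted(counts.items()))
# -> [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```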


This should work if you configured HDFS correctly. If you haven't set
the default HDFS name (fs.default.name), you might have to use the
full HDFS URL. For example, if your namenode's address is
namenode.example.com at port 7777, then use
hdfs://namenode.example.com:7777/wc_input.
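In case it helps: on Hadoop 1.x, fs.default.name lives in conf/core-site.xml. A minimal fragment would look like the following (host and port here are just examples, substitute your namenode's):

```xml
<!-- conf/core-site.xml (Hadoop 1.x); example host/port, adjust to your cluster -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:7777</value>
  </property>
</configuration>
```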


Kind regards,
Max

On Tue, Feb 24, 2015 at 11:13 AM, Giacomo Licari
<giacomo.lic...@gmail.com> wrote:
> Hi guys,
> I'm Giacomo from Italy, I'm newbie with Flink.
>
> I set up a cluster with Hadoop 1.2 and Flink.
>
> I would like to ask you how to run the WordCount example, taking the input
> file from HDFS (for example myuser/testWordCount/hamlet.txt) and putting
> the output also into HDFS (for example myuser/testWordCount/output.txt).
>
> I successfully ran the example on my local filesystem; I would like to test
> it with HDFS.
>
> Thanks a lot guys,
> Giacomo
