I'm not sure why that change would cause Maven to hang; however, there are
a couple of alternatives to your original pipeline that should work:

The first is to remove the leading 'file://' from the paths, which should
be sufficient to get the Pipeline functioning (see the pipeline below,
copied from your earlier mail with 'file://' removed). The issue here is
that we haven't registered a handler for the 'file://' scheme; instead, we
assume a local file when the location doesn't conform to URI syntax. Hence
the failure at IOChannelUtils:187.
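
To make the behaviour described above concrete, here is a minimal sketch of
that kind of lookup. It is not the actual IOChannelUtils code: the class
name, the handler names, and the registry contents are all hypothetical and
exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the Beam implementation.
public class SchemeLookupSketch {
    // Matches locations of the form "<scheme>://...".
    private static final Pattern SCHEME =
        Pattern.compile("^([a-zA-Z][-a-zA-Z0-9+.]*)://.*$");

    // Assumed registry of scheme -> handler name. Note: no entry for "file".
    private static final Map<String, String> HANDLERS = new HashMap<>();
    static {
        HANDLERS.put("gs", "GcsHandler"); // hypothetical handler name
    }

    static String handlerFor(String location) {
        Matcher m = SCHEME.matcher(location);
        if (!m.matches()) {
            // No scheme present: fall back to treating it as a local file.
            return "LocalFileHandler";
        }
        String handler = HANDLERS.get(m.group(1));
        if (handler == null) {
            // A scheme was given but nothing is registered for it.
            throw new IllegalArgumentException(
                "No handler registered for scheme: " + m.group(1));
        }
        return handler;
    }

    public static void main(String[] args) {
        // A bare path has no scheme, so the local-file fallback is used.
        System.out.println(handlerFor("/home/punit/factordb_setup.txt"));
        try {
            // An explicit "file://" scheme is looked up and not found.
            handlerFor("file:///home/punit/factordb_setup.txt");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a bare path resolves through the fallback, while an
explicit 'file://' prefix fails the lookup.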

If you would like to use HDFS instead, you can use the Read transform to
read from an appropriately configured HDFSFileSource, located in the
org.apache.beam:hdfs artifact. This needs additional configuration:
HDFSFileSource reads KVs as specified by a FileInputFormat, which you must
provide, whereas TextIO reads lines of text with no configuration.

Pipeline p = Pipeline.create(PipelineOptionsFactory.create());
p.apply(TextIO.Read.named("ReadLines").from("/home/punit/factordb_setup.txt"))
     .apply(new CountWords())
     .apply(MapElements.via(new FormatAsTextFn()))
     .apply(TextIO.Write.named("WriteCounts").to("/home/punit/beam-out"));
p.run();


On Tue, May 24, 2016 at 11:45 PM, Punit Naik <[email protected]> wrote:

> I copied my input file to a directory called /in in HDFS and ran the
> WordCount example with the following command:
>
> mvn compile exec:java -pl examples/java
> -Dexec.mainClass=org.apache.beam.examples.WordCount
> -Dexec.args="--runner=DirectPipelineRunner --inputFile=/in
> --output=/beam-out"
>
>
> But it has hung at "[INFO] --- exec-maven-plugin:1.4.0:java (default-cli)
> @ java-examples-all ---" and is not showing me any error also.
>
> On Mon, May 9, 2016 at 8:02 PM, Frances Perry <[email protected]> wrote:
>
>> The whole goal of Beam is that you won't need to change your pipeline
>> code to swap between runners. So like JB said, you should look in the
>> examples <https://github.com/apache/incubator-beam/tree/master/examples>
>> module. The idea is that you can use the --runner option to select from any
>> runner currently on your classpath. (Note that the Flink runner currently
>> has its own copy for legacy reasons -- we'll be removing that.)
>>
>> So for example, you can run with the direct runner like this:
>>
>>     $ mvn compile exec:java -pl examples/java
>> -Dexec.mainClass=org.apache.beam.examples.WordCount
>> -Dexec.args="--runner=DirectPipelineRunner --output=output"
>>
>> (We still need to fix the pom a bit to be runner-agnostic, because
>> currently it links in the original Dataflow runners by default.)
>>
>> You can also take a look at this Word Count Walkthrough
>> <https://cloud.google.com/dataflow/examples/wordcount-example> that
>> we'll be porting from Dataflow to Beam soon.
>>
>> Frances
>>
>>
>>
>> On Mon, May 9, 2016 at 4:36 AM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>>> Hi
>>>
>>> You have a word count sample in the examples module.
>>>
>>> Regards
>>> JB
>>>
>>>
>>> -------- Original message --------
>>> From: Punit Naik <[email protected]>
>>> Date: 09/05/2016 12:56 (GMT+01:00)
>>> To: [email protected]
>>> Subject: Direct Runner Example
>>>
>>> Can I get a wordcount direct runner example (batch)?
>>>
>>> --
>>> Thank You
>>>
>>> Regards
>>>
>>> Punit Naik
>>>
>>
>>
>
>
> --
> Thank You
>
> Regards
>
> Punit Naik
>
