I tried to change the default DataFlow pipeline to a DirectRunner pipeline
by doing this:
DirectPipelineOptions options =
PipelineOptionsFactory.as(DirectPipelineOptions.class);
Pipeline p = Pipeline.create(options);
p.apply(TextIO.Read.named("ReadLines").from("file:///home/punit/factordb_setup.txt"))
.apply(new CountWords())
.apply(MapElements.via(new FormatAsTextFn()))
.apply(TextIO.Write.named("WriteCounts").to("file:///home/punit/beam-out"));
p.run();
But when I ran this it threw this error:
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
Exception in thread "main" java.lang.IllegalStateException: Failed to
validate file:///home/user/file.txt
at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:307)
at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:204)
at
org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
at
org.apache.beam.sdk.runners.DirectPipelineRunner.apply(DirectPipelineRunner.java:253)
at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:369)
at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:276)
at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:48)
at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158)
at WordCount.main(WordCount.java:188)
Caused by: java.io.IOException: Unable to find handler for
file:///home/punit/factordb_setup.txt
at
org.apache.beam.sdk.util.IOChannelUtils.getFactory(IOChannelUtils.java:187)
at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:302)
... 8 more
How do I circumvent this problem?
On Wed, May 11, 2016 at 5:56 AM, Dan Halperin <[email protected]> wrote:
> +Max and Kenn explicitly
>
> It looks like the Flink pipeline runner currently depends on the
> DataflowPipelineOptions, which may add some complicated runtime
> dependencies.
>
>
> https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java#L21
>
> It seems conceivable that this could cause issues like this.
>
> Dan
>
> On Mon, May 9, 2016 at 12:51 PM, amir bahmanyari <[email protected]>
> wrote:
>
>> "because currently it links in the original Dataflow runners by default"
>> Which jar(s) should be on the classpath to resolve "the original
>> Dataflow runners"?
>> In case of using FlinkPipelineRunner, I am getting this exception that
>> clearly references Dataflow runners but the Dataflow runners classes are
>> not fond.
>> Thanks for your help.
>>
>> Caused by: java.lang.NoClassDefFoundError:
>> org/apache/beam/runners/dataflow/*DataflowPipelineRunner*
>> at org.apache.beam.runners.flink.FlinkPipelineRunner.fromOptions(
>> *FlinkPipelineRunner*.java:82)
>> ... 20 more
>>
>>
>>
>> ------------------------------
>> *From:* Frances Perry <[email protected]>
>> *To:* [email protected]
>> *Sent:* Monday, May 9, 2016 7:32 AM
>> *Subject:* Re: Direct Runner Example
>>
>> The whole goal of Beam is that you won't need to change your pipeline
>> code to swap between runners. So like JB said, you should look in the
>> examples <https://github.com/apache/incubator-beam/tree/master/examples>
>> module. The idea is that you can use the --runner option to select from any
>> runner currently on your classpath. (Note that the Flink runner currently
>> has its own copy for legacy reasons -- we'll be removing that.)
>>
>> So for example, you can run with the direct runner like this:
>>
>> $ mvn compile exec:java -pl examples/java
>> -Dexec.mainClass=org.apache.beam.examples.WordCount
>> -Dexec.args="--runner=DirectPipelineRunner --output=output"
>>
>> (We still need to fix the pom a bit to be runner-agnostic, because
>> currently it links in the original Dataflow runners by default.)
>>
>> You can also take a look at this Word Count Walkthrough
>> <https://cloud.google.com/dataflow/examples/wordcount-example> that
>> we'll be porting from Dataflow to Beam soon.
>>
>> Frances
>>
>>
>>
>> On Mon, May 9, 2016 at 4:36 AM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>> Hi
>>
>> You have a word count sample in the examples module.
>>
>> Regards
>> JB
>>
>>
>> -------- Original message --------
>> From: Punit Naik <[email protected]>
>> Date: 09/05/2016 12:56 (GMT+01:00)
>> To: [email protected]
>> Subject: Direct Runner Example
>>
>> Can I get a wordcount direct runner example (batch)?
>>
>> --
>> Thank You
>>
>> Regards
>>
>> Punit Naik
>>
>>
>>
>>
>>
>
--
Thank You
Regards
Punit Naik