I was trying the WordCount example of course.
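
For context on the trace quoted below: the "Unable to find handler" message suggests that the scheme prefix of the path is looked up in a registry of I/O handlers, and that nothing is registered for `file`. The sketch below is a simplified stand-in for that kind of lookup, not the SDK's actual code. The class name, the handler names, and the choice to register only `gs` are assumptions made for illustration, and it throws an unchecked exception for brevity where the trace shows an IOException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of a scheme-based handler lookup (not the Beam SDK's code).
class SchemeLookupSketch {
    // Assumed registry: only the "gs" scheme has a handler registered.
    private static final Map<String, String> HANDLERS = new HashMap<>();
    static {
        HANDLERS.put("gs", "gcs-handler");
    }

    // Matches specs that start with "<scheme>://".
    private static final Pattern SCHEME =
        Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*)://.*");

    // Returns the name of the handler that would serve the given file spec.
    static String handlerFor(String spec) {
        Matcher m = SCHEME.matcher(spec);
        if (!m.matches()) {
            // No scheme prefix: treat the spec as a plain local path.
            return "local-file-handler";
        }
        String handler = HANDLERS.get(m.group("scheme"));
        if (handler != null) {
            return handler;
        }
        // A scheme is present but nothing is registered for it.
        throw new IllegalStateException("Unable to find handler for " + spec);
    }

    public static void main(String[] args) {
        System.out.println(handlerFor("/home/punit/factordb_setup.txt"));
        System.out.println(handlerFor("gs://some-bucket/input.txt"));
        try {
            handlerFor("file:///home/punit/factordb_setup.txt");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the SDK resolves paths along these lines, passing a plain local path without the `file://` prefix to `from(...)` and `to(...)` may be worth trying.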
On Wed, May 11, 2016 at 11:59 AM, Punit Naik <[email protected]> wrote:
> I tried to change the default Dataflow pipeline to a DirectRunner pipeline
> by doing this:
>
> DirectPipelineOptions options =
>     PipelineOptionsFactory.as(DirectPipelineOptions.class);
> Pipeline p = Pipeline.create(options);
>
> p.apply(TextIO.Read.named("ReadLines").from("file:///home/punit/factordb_setup.txt"))
>  .apply(new CountWords())
>  .apply(MapElements.via(new FormatAsTextFn()))
>  .apply(TextIO.Write.named("WriteCounts").to("file:///home/punit/beam-out"));
> p.run();
>
> But when I ran this it threw this error:
>
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> Exception in thread "main" java.lang.IllegalStateException: Failed to validate file:///home/user/file.txt
>     at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:307)
>     at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:204)
>     at org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
>     at org.apache.beam.sdk.runners.DirectPipelineRunner.apply(DirectPipelineRunner.java:253)
>     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:369)
>     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:276)
>     at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:48)
>     at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158)
>     at WordCount.main(WordCount.java:188)
> Caused by: java.io.IOException: Unable to find handler for file:///home/punit/factordb_setup.txt
>     at org.apache.beam.sdk.util.IOChannelUtils.getFactory(IOChannelUtils.java:187)
>     at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:302)
>     ... 8 more
>
> How do I circumvent this problem?
>
> On Wed, May 11, 2016 at 5:56 AM, Dan Halperin <[email protected]> wrote:
>
>> +Max and Kenn explicitly
>>
>> It looks like the Flink pipeline runner currently depends on the
>> DataflowPipelineOptions, which may add some complicated runtime
>> dependencies.
>>
>>
>> https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java#L21
>>
>> It seems conceivable that this could cause issues like this.
>>
>> Dan
>>
>> On Mon, May 9, 2016 at 12:51 PM, amir bahmanyari <[email protected]>
>> wrote:
>>
>>> "because currently it links in the original Dataflow runners by default"
>>> Which jar(s) should be on the classpath to resolve "the original
>>> Dataflow runners"?
>>> When using FlinkPipelineRunner, I get the exception below, which clearly
>>> references the Dataflow runner, but the Dataflow runner classes are not
>>> found.
>>> Thanks for your help.
>>>
>>> Caused by: java.lang.NoClassDefFoundError: org/apache/beam/runners/dataflow/DataflowPipelineRunner
>>>     at org.apache.beam.runners.flink.FlinkPipelineRunner.fromOptions(FlinkPipelineRunner.java:82)
>>>     ... 20 more
>>>
>>>
>>>
>>> ------------------------------
>>> *From:* Frances Perry <[email protected]>
>>> *To:* [email protected]
>>> *Sent:* Monday, May 9, 2016 7:32 AM
>>> *Subject:* Re: Direct Runner Example
>>>
>>> The whole goal of Beam is that you won't need to change your pipeline
>>> code to swap between runners. So like JB said, you should look in the
>>> examples <https://github.com/apache/incubator-beam/tree/master/examples>
>>> module. The idea is that you can use the --runner option to select from any
>>> runner currently on your classpath. (Note that the Flink runner currently
>>> has its own copy for legacy reasons -- we'll be removing that.)
>>>
>>> So for example, you can run with the direct runner like this:
>>>
>>> $ mvn compile exec:java -pl examples/java \
>>>     -Dexec.mainClass=org.apache.beam.examples.WordCount \
>>>     -Dexec.args="--runner=DirectPipelineRunner --output=output"
>>>
>>> (We still need to fix the pom a bit to be runner-agnostic, because
>>> currently it links in the original Dataflow runners by default.)
>>>
>>> You can also take a look at this Word Count Walkthrough
>>> <https://cloud.google.com/dataflow/examples/wordcount-example> that
>>> we'll be porting from Dataflow to Beam soon.
>>>
>>> Frances
>>>
>>>
>>>
>>> On Mon, May 9, 2016 at 4:36 AM, Jean-Baptiste Onofré <[email protected]>
>>> wrote:
>>>
>>> Hi
>>>
>>> You have a word count sample in the examples module.
>>>
>>> Regards
>>> JB
>>>
>>>
>>> -------- Original message --------
>>> From: Punit Naik <[email protected]>
>>> Date: 09/05/2016 12:56 (GMT+01:00)
>>> To: [email protected]
>>> Subject: Direct Runner Example
>>>
>>> Can I get a wordcount direct runner example (batch)?
>>>
>>> --
>>> Thank You
>>>
>>> Regards
>>>
>>> Punit Naik
>>>
>>>
>>>
>>>
>>>
>>
>
>
> --
> Thank You
>
> Regards
>
> Punit Naik
>
--
Thank You
Regards
Punit Naik