Hi Dan,

Thanks for the notice. I've filed an issue [1] to remove the dependency of the Flink Runner on the Dataflow Runner. It seems to have slipped in during the project restructuring.
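
In the meantime, for anyone hitting the NoClassDefFoundError quoted below, a plain reflection lookup can confirm whether the Dataflow runner class is visible at runtime. This is only a rough sketch: `ClasspathCheck` and `isOnClasspath` are hypothetical names for illustration, and the only API used is the JDK's `Class.forName`.

```java
public class ClasspathCheck {
    // Hypothetical helper (not part of Beam): returns true if the named class
    // can be located by this class's loader. Static initializers are not run
    // because the second argument to Class.forName is false.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but one of its dependencies failed to link.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace in this thread.
        String name = "org.apache.beam.runners.dataflow.DataflowPipelineRunner";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

Running it with the same classpath as the failing job should show whether the jar that provides the class is actually present.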
Still curious, why was the Dataflow Runner code not bundled in the jar? Some shading magic inherited from the root pom?

Cheers,
Max

[1] https://issues.apache.org/jira/browse/BEAM-272

On Wed, May 11, 2016 at 8:30 AM, Punit Naik <[email protected]> wrote:
> I was trying the WordCount example of course.
>
> On Wed, May 11, 2016 at 11:59 AM, Punit Naik <[email protected]> wrote:
>>
>> I tried to change the default DataFlow pipeline to a DirectRunner pipeline
>> by doing this:
>>
>> DirectPipelineOptions options =
>>     PipelineOptionsFactory.as(DirectPipelineOptions.class);
>> Pipeline p = Pipeline.create(options);
>>
>> p.apply(TextIO.Read.named("ReadLines").from("file:///home/punit/factordb_setup.txt"))
>>  .apply(new CountWords())
>>  .apply(MapElements.via(new FormatAsTextFn()))
>>  .apply(TextIO.Write.named("WriteCounts").to("file:///home/punit/beam-out"));
>> p.run();
>>
>> But when I ran this it threw this error:
>>
>> SLF4J: Defaulting to no-operation (NOP) logger implementation
>> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
>> details.
>> Exception in thread "main" java.lang.IllegalStateException: Failed to
>> validate file:///home/user/file.txt
>>     at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:307)
>>     at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:204)
>>     at org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
>>     at org.apache.beam.sdk.runners.DirectPipelineRunner.apply(DirectPipelineRunner.java:253)
>>     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:369)
>>     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:276)
>>     at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:48)
>>     at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158)
>>     at WordCount.main(WordCount.java:188)
>> Caused by: java.io.IOException: Unable to find handler for
>> file:///home/punit/factordb_setup.txt
>>     at org.apache.beam.sdk.util.IOChannelUtils.getFactory(IOChannelUtils.java:187)
>>     at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:302)
>>     ... 8 more
>>
>> How do I circumvent this problem?
>>
>> On Wed, May 11, 2016 at 5:56 AM, Dan Halperin <[email protected]> wrote:
>>>
>>> +Max and Kenn explicitly
>>>
>>> It looks like the Flink pipeline runner currently depends on the
>>> DataflowPipelineOptions, which may add some complicated runtime
>>> dependencies.
>>>
>>> https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java#L21
>>>
>>> It seems conceivable that this could cause issues like this.
>>>
>>> Dan
>>>
>>> On Mon, May 9, 2016 at 12:51 PM, amir bahmanyari <[email protected]>
>>> wrote:
>>>>
>>>> "because currently it links in the original Dataflow runners by default"
>>>> Which jar(s) should be on the classpath to resolve "the original
>>>> Dataflow runners"?
>>>> In case of using FlinkPipelineRunner, I am getting this exception that
>>>> clearly references Dataflow runners, but the Dataflow runner classes are
>>>> not found.
>>>> Thanks for your help.
>>>>
>>>> Caused by: java.lang.NoClassDefFoundError:
>>>> org/apache/beam/runners/dataflow/DataflowPipelineRunner
>>>>     at org.apache.beam.runners.flink.FlinkPipelineRunner.fromOptions(FlinkPipelineRunner.java:82)
>>>>     ... 20 more
>>>>
>>>> ________________________________
>>>> From: Frances Perry <[email protected]>
>>>> To: [email protected]
>>>> Sent: Monday, May 9, 2016 7:32 AM
>>>> Subject: Re: Direct Runner Example
>>>>
>>>> The whole goal of Beam is that you won't need to change your pipeline
>>>> code to swap between runners. So like JB said, you should look in the
>>>> examples module. The idea is that you can use the --runner option to select
>>>> from any runner currently on your classpath. (Note that the Flink runner
>>>> currently has its own copy for legacy reasons -- we'll be removing that.)
>>>>
>>>> So for example, you can run with the direct runner like this:
>>>>
>>>> $ mvn compile exec:java -pl examples/java
>>>> -Dexec.mainClass=org.apache.beam.examples.WordCount
>>>> -Dexec.args="--runner=DirectPipelineRunner --output=output"
>>>>
>>>> (We still need to fix the pom a bit to be runner-agnostic, because
>>>> currently it links in the original Dataflow runners by default.)
>>>>
>>>> You can also take a look at this Word Count Walkthrough that we'll be
>>>> porting from Dataflow to Beam soon.
>>>>
>>>> Frances
>>>>
>>>> On Mon, May 9, 2016 at 4:36 AM, Jean-Baptiste Onofré <[email protected]>
>>>> wrote:
>>>>
>>>> Hi
>>>>
>>>> You have a word count sample in the examples module.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> -------- Original message --------
>>>> From: Punit Naik <[email protected]>
>>>> Date: 09/05/2016 12:56 (GMT+01:00)
>>>> To: [email protected]
>>>> Subject: Direct Runner Example
>>>>
>>>> Can I get a wordcount direct runner example (batch)?
>>>>
>>>> --
>>>> Thank You
>>>>
>>>> Regards
>>>>
>>>> Punit Naik
