[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221485#comment-16221485 ]

ASF GitHub Bot commented on BEAM-664:

asfgit closed pull request #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/src/get-started/wordcount-example.md b/src/get-started/wordcount-example.md
index c40a2c706..8b379377a 100644
--- a/src/get-started/wordcount-example.md
+++ b/src/get-started/wordcount-example.md
@@ -50,46 +50,10 @@ input and output sources and show other best practices.

 **To run this example in Java:**

-{:.runner-direct}
 ```
 $ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount
 ```

-{:.runner-apex}
-```
-$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
-     -Dexec.args="--inputFile=pom.xml --output=counts --runner=ApexRunner" -Papex-runner
-```
-
-{:.runner-flink-local}
-```
-$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
-     -Dexec.args="--runner=FlinkRunner --inputFile=pom.xml --output=counts" -Pflink-runner
-```
-
-{:.runner-flink-cluster}
-```
-$ mvn package exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
-     -Dexec.args="--runner=FlinkRunner --flinkMaster= --filesToStage=target/word-count-beam-bundled-0.1.jar \
-                  --inputFile=/path/to/quickstart/pom.xml --output=/tmp/counts" -Pflink-runner
-
-You can monitor the running job by visiting the Flink dashboard at http://:8081
-```
-
-{:.runner-spark}
-```
-$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
-     -Dexec.args="--runner=SparkRunner --inputFile=pom.xml --output=counts" -Pspark-runner
-```
-
-{:.runner-dataflow}
-```
-$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
-     -Dexec.args="--runner=DataflowRunner --gcpTempLocation=gs:///tmp \
-                  --inputFile=gs://apache-beam-samples/shakespeare/* --output=gs:///counts" \
-     -Pdataflow-runner
-```
-
 To view the full code in Java, see
 **[MinimalWordCount](https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java).**

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Port Dataflow SDK WordCount walkthrough to Beam site
>
> Key: BEAM-664
> URL: https://issues.apache.org/jira/browse/BEAM-664
> Project: Beam
> Issue Type: Task
> Components: website
> Reporter: Hadar Hod
> Assignee: Hadar Hod
> Fix For: Not applicable
>
> Port the WordCount walkthrough from Dataflow docs to Beam website.
> * Copy prose (translate from html to md, remove Dataflow references, etc)
> * Add accurate "How to Run" instructions for each of the WC examples
> * Include code snippets from real examples

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221449#comment-16221449 ]

ASF GitHub Bot commented on BEAM-664:

melap commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339829871

LGTM
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219809#comment-16219809 ]

ASF GitHub Bot commented on BEAM-664:

aaltay commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339519961

I agree that this PR is an improvement. I can merge it unless there is an objection. cc: @melap
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219792#comment-16219792 ]

ASF GitHub Bot commented on BEAM-664:

kennknowles commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339516755

I'm actually not sure. FWIW, this PR is clearly an improvement, as the bits it deletes pass command-line args that don't exist. I expect they would actually cause the command to fail.
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219666#comment-16219666 ]

ASF GitHub Bot commented on BEAM-664:

aaltay commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339491626

Does the hardcoded input/output work for all runners as it is?
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219593#comment-16219593 ]

ASF GitHub Bot commented on BEAM-664:

kennknowles commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339480668

Actually, it has [hardcoded input](https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java#L78) and [hardcoded output](https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java#L114). I do think that you could set the runner via pipeline options and it would work.
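
The idea in this comment (input and output stay hardcoded, only the runner is chosen at launch) can be illustrated with a small Beam-free sketch. Everything named here is an assumption made for illustration: `RunnerArgSketch`, `parseRunner`, the placeholder `INPUT` and `OUTPUT` values, and the `DirectRunner` default string are hypothetical and are not taken from `MinimalWordCount` or from Beam's API. In Beam itself this role is played by pipeline options.

```java
// Hypothetical sketch only: none of these names come from Beam or MinimalWordCount.
final class RunnerArgSketch {
    // Hardcoded on purpose: to experiment with them, you edit the code.
    static final String INPUT = "input.txt";    // placeholder value (assumption)
    static final String OUTPUT = "wordcounts";  // placeholder value (assumption)
    static final String DEFAULT_RUNNER = "DirectRunner"; // assumed default for this sketch

    private static final String RUNNER_FLAG = "--runner=";

    // Returns the value of the last non-empty --runner=<name> argument,
    // or DEFAULT_RUNNER when no such argument is present.
    static String parseRunner(String[] args) {
        String runner = DEFAULT_RUNNER;
        for (String arg : args) {
            if (arg.startsWith(RUNNER_FLAG)) {
                String value = arg.substring(RUNNER_FLAG.length());
                if (!value.isEmpty()) {
                    runner = value;
                }
            }
        }
        return runner;
    }

    public static void main(String[] args) {
        // Only the runner varies between invocations; input and output do not.
        System.out.println(
            "runner=" + parseRunner(args) + " input=" + INPUT + " output=" + OUTPUT);
    }
}
```

Launching this with `--runner=FlinkRunner` changes only the reported runner; any other argument is ignored here, which is a simplification of this sketch and not a statement about how Beam treats unknown options.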
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219545#comment-16219545 ]

ASF GitHub Bot commented on BEAM-664:

aaltay commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339476137

@kennknowles My understanding so far is that {{MinimalWordCount}} would need pipeline options for runner, input, and output to be able to run with other runners. We do not want to add them, because that would complicate things. In that case, making this documentation change makes sense. Is this accurate?
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219032#comment-16219032 ]

ASF GitHub Bot commented on BEAM-664:

kennknowles commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339393568

That was back at a time when runners needed more special consideration than they do now. As long as the code for `MinimalWordCount` stays very concise, I am happy to have command lines that run on different runners.
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219031#comment-16219031 ]

ASF GitHub Bot commented on BEAM-664:

kennknowles commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339393231

The context is that, to keep it truly "minimal" as a demonstration of how pithy we could get things with Java 8, we hardcode everything. The idea is that if you want to experiment with it, you edit the code. It is just the tiniest example.
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218444#comment-16218444 ]

ASF GitHub Bot commented on BEAM-664:

alex-filatov commented on issue #336: [BEAM-664] Update Java MinimalWordCount instructions
URL: https://github.com/apache/beam-site/pull/336#issuecomment-339304173

@aaltay @kennknowles The current instructions are broken: you'll get an exception if you try to run Java MinimalWordCount using, e.g., the Flink runner.

Checking history, it appears this was a deliberate change. Commit message: "This makes it easy to immediately run, and removes various non-portable instructions and others that aren't the easiest for a "Getting Started" scenario." This makes sense to me.

> Should we instead change the example to not-hardcode to work only one runner.

Not sure. The Beam model is independent of the runners and could be explained separately (using the Direct Runner) from the real runners. I would change _all_ examples to use the Direct Runner only and provide separate instructions on how to specify and configure different runners. IMO this decoupling would make the docs simpler.
[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site
[ https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903563#comment-15903563 ]

Davor Bonaci commented on BEAM-664:

[~hadarhg] / [~frances], what's the accurate status of this task and its sub-tasks? Perhaps this is done?

--
This message was sent by Atlassian JIRA (v6.3.15#6346)