Re: [DISCUSS] Beam MapReduce Runner One-Pager

2017-07-12 Thread Jean-Baptiste Onofré
Hi, I will push my branch with the current state of the mapreduce runner. Regards JB On 07/13/2017 04:47 AM, Pei HE wrote: Thanks guys! I replied to Kenn's comments, and I'm looking forward to more feedback and suggestions. Also, could we add a mapreduce-runner branch? Thanks -- Pei On Sat, Jul

Re: [DISCUSS] Beam MapReduce Runner One-Pager

2017-07-12 Thread Pei HE
Thanks guys! I replied to Kenn's comments, and I'm looking forward to more feedback and suggestions. Also, could we add a mapreduce-runner branch? Thanks -- Pei On Sat, Jul 8, 2017 at 12:42 AM, Kenneth Knowles wrote: > Very cool to see this. Commenting a little on the doc. > > On Fri, Jul 7, 2017

Re: MergeBot is here!

2017-07-12 Thread Kenneth Knowles
On Wed, Jul 12, 2017 at 12:08 PM, Robert Bradshaw < rober...@google.com.invalid> wrote: > On Tue, Jul 11, 2017 at 7:14 PM, Kenneth Knowles > wrote: > > > The thing is that "fixup! " indicates that this fixup > > should be reordered and applied to the referenced commit. Squashing in > > order is n

Re: [beam-site] branch asf-site2 deleted (was 4e9082b)

2017-07-12 Thread Lukasz Cwik
FYI, I opened the branch on the wrong repo and have cleaned it up. On Wed, Jul 12, 2017 at 12:10 PM, wrote: > This is an automated email from the ASF dual-hosted git repository. > > lcwik pushed a change to branch asf-site2 > in repository https://gitbox.apache.org/repos/asf/beam-site.git. > > > w

Re: MergeBot is here!

2017-07-12 Thread Robert Bradshaw
On Tue, Jul 11, 2017 at 7:14 PM, Kenneth Knowles wrote: > > On Tue, Jul 11, 2017 at 4:25 PM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > > > On Tue, Jul 11, 2017 at 8:51 AM, Kenneth Knowles > > wrote: > > > I like the idea of controlling squashing or not explicitly in the > > merge

Re: Streaming support available on Beam Python DIrectRunner

2017-07-12 Thread Dmitry Demeshchuk
Awesome work! Please sign me up for the Python Streaming Alpha; we've been looking forward to SDoFn support coming in. In fact, we would be glad to help complete it faster, if there's some grunt work you can hand off. On Wed, Jul 12, 2017 at 10:07 AM, Charles Chen wrote: > We recently ch

Streaming support available on Beam Python DIrectRunner

2017-07-12 Thread Charles Chen
We recently checked in the last few changes needed to support streaming pipelines on the Beam Python DirectRunner (BEAM-1265). As of HEAD (1-2 weeks ago) and the 2.1.0 RC, Python SDK users can now write their pipelines in streaming mode and run the

Re: Proposal and plan: new TextIO features based on SDF

2017-07-12 Thread Ben Chambers
Regarding changing the coder -- keep in mind that there may be persisted state somewhere, so we can't just change the coder once this is used. If the process of scanning for modified and new files reported the last-modified-time, could we use that and have the SDF report KV with the last-modifi

Re: Proposal and plan: new TextIO features based on SDF

2017-07-12 Thread Reuven Lax
Yes, you still need SDF to do the root expansion. However it means that the state storage is now distributed. Garbage collection might be trickier with Distinct. On Tue, Jul 11, 2017 at 10:19 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > Yes, I thought of this, but: > - The disti

Re: does beam work with spark 2, any plans when is the support for spark 2 will be available ?

2017-07-12 Thread Jean-Baptiste Onofré
Hi, I'm working on Spark 2 support (right now, I have a completely new runner, but I'm trying to have a single runner for both Spark 1.x and 2.x). I created the following Jira: https://issues.apache.org/jira/browse/BEAM-1920 I should be able to create the pull request next week. Regards JB

Re: does beam work with spark 2, any plans when is the support for spark 2 will be available ?

2017-07-12 Thread Jyotirmoy Sundi
Found the JIRA: https://issues.apache.org/jira/browse/BEAM-1920. Please ignore. On Wed, Jul 12, 2017 at 3:08 AM, Jyotirmoy Sundi wrote: > Working trace of same code in spark 1.6.3 is below. > > > /Library/Java/JavaVirtualMachines/jdk1.8.0_92.jdk/Contents/Home/bin/java > -Didea.launcher.port=

does beam work with spark 2, any plans when is the support for spark 2 will be available ?

2017-07-12 Thread Jyotirmoy Sundi
Hi folks, I tried Beam with Spark 2. Although it compiles and runs without errors, no data gets passed through the pipeline. Below is a trace of a pipeline run with Spark 2.0; the same code works with Spark 1.6.3. Any pointers would be helpful. Thanks *Trace:* /Library/Java/JavaVirtualMachi

Re: does beam work with spark 2, any plans when is the support for spark 2 will be available ?

2017-07-12 Thread Jyotirmoy Sundi
Working trace of same code in spark 1.6.3 is below. /Library/Java/JavaVirtualMachines/jdk1.8.0_92.jdk/Contents/Home/bin/java -Didea.launcher.port=7551 "-Didea.launcher.bin.path=/Applications/IntelliJ IDEA 15 CE.app/Contents/bin" -Dfile.encoding=UTF-8 -classpath "/Library/Java/JavaVirtualMachi

Re: Proposal and plan: new TextIO features based on SDF

2017-07-12 Thread Eugene Kirpichov
Yes, I thought of this, but: - The distinct transform needs to apply per input (probably easy) - You still need an SDF to run the set expansion repeatedly - It's not clear when to terminate the repeated expansion in this implementation On Tue, Jul 11, 2017 at 10:14 PM Reuven Lax wrote: > As a th

Jenkins build is back to stable : beam_Release_NightlySnapshot #475

2017-07-12 Thread Apache Jenkins Server
See