Re: [VOTE] Release 0.3.0-incubating, release candidate #1

2016-10-27 Thread Jean-Baptiste Onofré
No problem for the vote. For graduation, we are already thinking about it yes. Regards JB On Oct 27, 2016, at 08:54, "Sergio Fernández" wrote: >Hi JB, > >On Tue, Oct 25, 2016 at 12:00 PM, Jean-Baptiste Onofré > >wrote: > >> Thanks Sergio ;) >> > >You are welcome. > > >> Just tried to

Re: [PROPOSAL] New Beam website design?

2016-10-27 Thread Abdullah Bashir
Thank you very much for taking the time to respond, Davor :) Regarding BEAM-752, I can work on that; I have already built some Dataflow pipelines on Google Cloud in Python. Again, can you tell me where to start for BEAM-752? I am new to ASF contribution, so onboarding steps are kind of a black

Re: [PROPOSAL] New Beam website design?

2016-10-27 Thread Jean-Baptiste Onofré
Hi You can propose a PR on this Jira. We will be more than happy to review it. Thanks Regards JB On Oct 27, 2016, at 11:26, Abdullah Bashir wrote: >Thank you very much for taking the time to respond, Davor :) > >Regarding BEAM-752, I can work on that; I have already built some >Dataflow

Re: [PROPOSAL] New Beam website design?

2016-10-27 Thread Minudika Malshan
Hi all, I would like to join for the development of the new site. Is there any issue tracking method for this? (Are there any JIRA issues?) Thank you! On Thu, Oct 27, 2016 at 4:01 PM, Jean-Baptiste Onofré wrote: > Hi > > You can propose a PR on this Jira. > > We will be more than happy to re

Re: [PROPOSAL] New Beam website design?

2016-10-27 Thread Jean-Baptiste Onofré
Great !! Thanks. You can take a look at BEAM-500 and 501 and also the PR I did last week. I plan to submit new PRs during the weekend. So please let me know how we can sync. Thanks Regards JB On Oct 27, 2016, at 14:04, Minudika Malshan wrote: >Hi all, > >I would like to join for

Can we have more quick start examples ?

2016-10-27 Thread Manu Zhang
Hey guys, I find Beam examples under the examples folder are not easy to run due to dependency on Google specific services. Even the MinimalWordCount requires input and

Re: Can we have more quick start examples ?

2016-10-27 Thread Thomas Weise
The Beam tutorials seem to address this: https://github.com/eljefe6a/beamexample/blob/master/README.md On Thu, Oct 27, 2016 at 8:04 AM, Manu Zhang wrote: > Hey guys, > > I find Beam examples under the examples folder are not easy to run due to > dependency on Google specific services. Even the

Re: Can we have more quick start examples ?

2016-10-27 Thread Jesse Anderson
Those tutorials help. I was going through the example code and had the same thought. We need to take a pass through the examples and remove some of the Google Cloud dependencies. On Thu, Oct 27, 2016, 5:13 PM Thomas Weise wrote: > The Beam tutorials seem to address this: > > https://github.com/e

Re: [VOTE] Release 0.3.0-incubating, release candidate #1

2016-10-27 Thread Neelesh Salian
+1 (non-binding) Thank you for putting this together On Thu, Oct 27, 2016 at 12:00 AM, Jean-Baptiste Onofré wrote: > No problem for the vote. > > For graduation, we are already thinking about it yes. > > Regards > JB > > On Oct 27, 2016, at 08:54, "Sergio Fernández" > wrote: > >H

Re: Tracking backward-incompatible changes for Beam

2016-10-27 Thread Robert Bradshaw
If the API/semantics are sufficiently well tested, backwards incompatibility should manifest as test failures. The corollary is that one should look closely at any test changes that get proposed. On Mon, Oct 24, 2016 at 1:52 PM, Davor Bonaci wrote: > I don't think we have it right now. We should,

Re: Can we have more quick start examples ?

2016-10-27 Thread Davor Bonaci
Indeed -- this is a clear area for improvement. Sources are usually not as big of an issue -- these resources are publicly accessible regardless where/how you run the pipeline (locally, or with any runner). On the other hand, Sinks require write access, which is often more problematic. One correct

Re: Apex runner status and next steps

2016-10-27 Thread Dan Halperin
I would add (explicitly, though this may be implicit or already supported) that Apex should also be able to run the precommit WordCountIT/WindowedWordCountIT that execute on all runners. https://github.com/apache/incubator-beam/blob/master/examples/java/pom.xml#L42 and https://github.com/apache/in

Re: [PROPOSAL] New Beam website design?

2016-10-27 Thread Davor Bonaci
The best place to learn how to get started is the Contribution Guide [1]. The list of pending JIRA issues related to the website is also available [2]. I think BEAM-752 would be the best to get your feet wet. Other good candidates are 516, 268, 776. If someone knows a good (non-fragile) solution t

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Dan Halperin
Folks, I don't think this needs to be a "vote". This is just not that big a deal :). It is important to be transparent and have these discussions on the list, which is why we brought it here from GitHub/JIRA, but at the end of the day I hope that a small group of committers and developers can asses

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Jesse Anderson
Sure On Thu, Oct 27, 2016, 8:04 PM Dan Halperin wrote: > Folks, I don't think this needs to be a "vote". This is just not that big a > deal :). It is important to be transparent and have these discussions on > the list, which is why we brought it here from GitHub/JIRA, but at the end > of the da

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Jean-Baptiste Onofré
It sounds good to me. So basically you did a kind of vote by proposing a solution ;) Regards JB On Oct 27, 2016, at 20:04, Dan Halperin wrote: >Folks, I don't think this needs to be a "vote". This is just not that >big a >deal :). It is important to be transparent and have these dis

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Robert Bradshaw
+1 to all Dan says. I only brought this up because it seemed new contributors (yay) jumping in and renaming a core transform based on "Something to consider" deserved a couple more eyeballs, but didn't intend for it to become a big deal. On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin wrote:

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Jean-Baptiste Onofré
You did well ! It's an interesting discussion we have and it's great to have it on the mailing list (better than in Jira or PR comments IMHO). Thanks ! Regards JB On Oct 27, 2016, at 20:39, Robert Bradshaw wrote: >+1 to all Dan says. > >I only brought this up because it seemed new

Re: Can we have more quick start examples ?

2016-10-27 Thread Jean-Baptiste Onofré
Yes it sounds good to me. I would love to see this as part of the examples. Ismael and I also started the beam-samples (http://github.com/jbonofre/beam-samples) that could be part of the examples. The purpose is to have more real use case implementations with real data. Regards JB On Oct 27

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Neelesh Salian
Thanks everyone for all the inputs. It's really encouraging for a new contributor, as myself, to get valuable input and mentoring (like on this thread) and, in turn, help make the community better. On Thu, Oct 27, 2016 at 11:41 AM, Jean-Baptiste Onofré wrote: > You did well ! It's an interesti

Re: Can we have more quick start examples ?

2016-10-27 Thread Neelesh Salian
+1 to this. I liked the guides for the setup for GC and Storage. The documentation is by far better than any guide I have seen. I also provided feedback on the documentation where it could use improvement. But certainly a more abstract and user friendly example would be encouraging for new users a

Re: [DISCUSS] Merging master -> feature branch

2016-10-27 Thread Robert Bradshaw
My concern was mostly about what to do in the face of conflicts, but it sounds like the consensus is that for a clean merge, with no conflicts or test breakage (or other concerns) a committer is free to push without any oversight, which is fine by me. [If/when the Mergebot comes into action, and run

Re: [DISCUSS] Merging master -> feature branch

2016-10-27 Thread Kenneth Knowles
In the spirit of explicitly summarizing and concluding threads on list: I think we have affirmative consensus to go for it when a downstream integration is completely conflict-free and fixup-free. On Thu, Oct 27, 2016 at 12:43 PM Robert Bradshaw wrote: > My concern was mostly about what to do in

Re: Placement of temporary files by FileBasedSink

2016-10-27 Thread Eugene Kirpichov
Getting back to this. I noticed that the original user's job mentioned in http://stackoverflow.com/questions/39822859/temp-files-remain-in-gcs-after-a-dataflow-job-succeeded is configured to write to /path/to/$date/foo-x-of-y and another job then reads from /path/to/$date/*, so sibling file

Re: Placement of temporary files by FileBasedSink

2016-10-27 Thread Chamikara Jayalath
On Thu, Oct 27, 2016 at 1:27 PM Eugene Kirpichov wrote: > Getting back to this. I noticed that the original user's job mentioned in > > http://stackoverflow.com/questions/39822859/temp-files-remain-in-gcs-after-a-dataflow-job-succeeded > is > configured to write to /path/to/$date/foo-x-of-yyy

Re: Placement of temporary files by FileBasedSink

2016-10-27 Thread Chamikara Jayalath
BTW I'm in favor of using a sub-directory and possibly asking users to update their glob pattern while also allowing users to optionally specify a temporary path in the future, as you propose. Thanks, Cham On Thu, Oct 27, 2016 at 1:45 PM Chamikara Jayalath wrote: > On Thu, Oct 27, 2016 at 1:27

Re: Placement of temporary files by FileBasedSink

2016-10-27 Thread Eugene Kirpichov
I don't think your assessment of the behavior of glob patterns is correct, per https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames#directory-by-directory-vs-recursive-wildcards . I believe (and hope) that behavior of IOChannelFactory.match() matches the behavior of gsutil. On Thu, Oct 27

Re: Placement of temporary files by FileBasedSink

2016-10-27 Thread Eugene Kirpichov
Indeed IOChannelFactory uses GcsUtil for GCS, and GcsUtil in fact does not recurse into subdirectories inside a "*" pattern (see https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/GcsUtil.java#L598) , and it does not support "**" patterns. How
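
The non-recursive "*" behavior described in this message can be illustrated with a small sketch. This is plain Python, not GcsUtil's actual implementation; the helper names (glob_to_regex, match_paths) and the example file names are hypothetical and exist only to show why temporary files in a sub-directory would not be picked up by a reader globbing /path/to/$date/*.

```python
import re

def glob_to_regex(pattern):
    """Translate a glob into a regex where '*' and '?' never cross a '/'.

    Hypothetical helper: it models only the single-directory wildcard
    semantics discussed in the thread, with no '**' support.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")   # any run of characters within one path segment
        elif ch == "?":
            parts.append("[^/]")    # exactly one character within one path segment
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z")

def match_paths(pattern, paths):
    """Return the paths that match the glob under the semantics above."""
    rx = glob_to_regex(pattern)
    return [p for p in paths if rx.match(p)]

# Hypothetical layout: final output files next to a temp sub-directory.
paths = [
    "/path/to/2016-10-27/foo-00000-of-00002",
    "/path/to/2016-10-27/foo-00001-of-00002",
    "/path/to/2016-10-27/.temp/part-0",
]
matched = match_paths("/path/to/2016-10-27/*", paths)
print(matched)
```

Under these assumed semantics only the two foo-* files match, because the path under .temp/ contains an extra "/" that "*" cannot cross.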

Re: Placement of temporary files by FileBasedSink

2016-10-27 Thread Chamikara Jayalath
Yeah, you are right. I was testing using 'gsutil' which behaves differently. Thanks, Cham On Thu, Oct 27, 2016 at 2:06 PM Eugene Kirpichov wrote: > Indeed IOChannelFactory uses GcsUtil for GCS, and GcsUtil in fact does not > recurse into subdirectories inside a "*" pattern (see > > https://gith

Re: [DISCUSS] Merging master -> feature branch

2016-10-27 Thread Frances Perry
Great, let's document that in the feature branch section of the contribution guide: http://beam.incubator.apache.org/contribute/contribution-guide/#feature-branches Anyone want to take that? On Thu, Oct 27, 2016 at 1:01 PM, Kenneth Knowles wrote: > In the spirit of explicitly summarizing and co

Re: Why does `Combine.perKey(SerializableFunction)` require same input and output type

2016-10-27 Thread Manu Zhang
Thanks for the thorough explanation. I see the benefits for such a function. My follow-up question is whether this is a hard requirement. There are computations that don't satisfy this (I think it's monoid rule) but possible and easier to write with Combine.perKey(SerializableFunction, OutputT>). I
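
As a rough illustration of the constraint discussed in this thread, the sketch below is plain Python rather than the Beam API, and it assumes that a combining function may be applied first to bundles of values and then again to the partial results. The helper combine_in_bundles and the bundle size are hypothetical. Under that assumption the function's output must be usable as its own input, and the result is only stable for functions such as sum.

```python
def combine_in_bundles(fn, values, bundle_size=2):
    """Apply fn to each bundle of values, then to the list of partial results.

    Hypothetical model of hierarchical combining: fn sees its own outputs
    as inputs in the second step, so input and output types must agree.
    """
    bundles = [values[i:i + bundle_size] for i in range(0, len(values), bundle_size)]
    partials = [fn(bundle) for bundle in bundles]
    return fn(partials)

def mean(xs):
    """Arithmetic mean; the types line up, but it is not safe to re-apply."""
    return sum(xs) / len(xs)

values = [1, 2, 3, 4, 5]

# sum is associative and commutative, so bundling does not change the result.
bundled_sum = combine_in_bundles(sum, values)

# mean of per-bundle means generally differs from the mean of all values.
bundled_mean = combine_in_bundles(mean, values)
direct_mean = mean(values)

print(bundled_sum, bundled_mean, direct_mean)
```

Computations like the mean are the kind that do not satisfy the rule directly; in this model they would need an intermediate accumulator (for example a running sum and count) rather than a single same-type function.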

Re: Podling Report Reminder - November 2016

2016-10-27 Thread Jean-Baptiste Onofré
Perfect. Thanks James ! Regards JB On Oct 27, 2016, at 01:05, James Malone wrote: >Hello everyone! > >Unless anyone disagrees or wants to do it, I am happy to volunteer to >draft >this podling report for review before we submit it. I can get it done >for a >review this Friday (US-Pa