Re: Introduction + Support in Comms for Beam!

2019-10-01 Thread Jesse Anderson
Excellent and welcome!

Jesse Anderson
Managing Director
Big Data Institute
(775) 393 9122 | je...@bigdatainstitute.io
bigdatainstitute.io <https://www.bigdatainstitute.io/>


On Tue, Oct 1, 2019 at 10:46 AM Łukasz Gajowy  wrote:

> Welcome! :)
>
> On Tue, Oct 1, 2019 at 11:30, Maximilian Michels wrote:
>
>> Welcome Maria! Looking forward to your proposal.
>>
>> Cheers,
>> Max
>>
>> On 01.10.19 00:33, Reza Rokni wrote:
>> > Welcome!
>> >
>> > On Tue, 1 Oct 2019 at 11:18, Lukasz Cwik <lc...@google.com> wrote:
>> >
>> > Welcome to the community.
>> >
>> > On Mon, Sep 30, 2019 at 3:15 PM María Cruz <macruz...@gmail.com> wrote:
>> >
>> > Hi everyone,
>> > my name is María Cruz, I am from Buenos Aires but I live in the
>> > Bay Area. I recently became acquainted with the Apache Beam project,
>> > and I got a chance to meet some of the Beam community at ApacheCon
>> > North America this past September. I'm testing out a
>> > communications framework
>> > <https://medium.com/@marianarra_/designing-a-communications-framework-for-community-engagement-e087312f9b83>
>> > for Open Source communities. I'm emailing the list now because
>> > I'd like to work on a communications strategy for Beam, to make
>> > the most of the content you produce during Beam Summits.
>> >
>> > A little bit more about me. I am a communications strategist
>> > with 11 years of experience in the field, 8 of which are in the
>> > non-profit sector. I started working in Open Source in 2013,
>> > when I joined Wikimedia, the social movement behind Wikipedia. I
>> > now work to support Google Open Source projects, and I also
>> > volunteer in the communications team of the Apache Software
>> > Foundation, working closely with Sally (for those of you who
>> > know her).
>> >
>> > I will be sending the list a proposal in the coming days.
>> > Looking forward to hearing from you!
>> >
>> > Best,
>> >
>> > María
>> >
>> >
>> >
>>
>


Re: Beam Samza Runner status update

2018-10-10 Thread Jesse Anderson
Interesting

On Wed, Oct 10, 2018, 3:49 PM Kenneth Knowles  wrote:

> Welcome, Hai!
>
> On Wed, Oct 10, 2018 at 3:46 PM Hai Lu  wrote:
>
>> Hi, all
>>
>> This is Hai from LinkedIn. As Xinyu mentioned, I have been working on
>> portable API for Samza runner and made some solid progress. It's been a
>> very smooth process (although not effortless for sure) and I'm really
>> grateful for the great platform that you all have built. I'm very
>> impressed. Bravo!
>>
>> Excited to work with everyone on Beam. Do expect more questions from me
>> down the road.
>>
>> Thanks,
>> Hai
>>
>> On Wed, Oct 10, 2018 at 12:36 PM Kenneth Knowles  wrote:
>>
>>> Clarification: Thomas Groh wrote the fuser, not me!
>>>
>>> Thanks for sharing all this. Really cool.
>>>
>>> Kenn
>>>
>>> On Wed, Oct 10, 2018 at 11:17 AM Rui Wang  wrote:
>>>
 Thanks for sharing! It's so exciting to hear that Beam is being used on
 Samza in production @LinkedIn! Your feedback will be helpful to the Beam
 community!

 Besides, Beam supports SQL right now and hopefully the Beam community could
 also receive feedback on BeamSQL in the future.

 -Rui

 On Wed, Oct 10, 2018 at 11:10 AM Jean-Baptiste Onofré 
 wrote:

> Thanks for sharing and congrats for this great work !
>
> Regards
> JB
> On Oct 10, 2018, at 20:23, Xinyu Liu <xinyuliu...@gmail.com> wrote:
>>
>> Hi, All,
>>
>> It's been over four months since we added the Samza Runner to Beam,
>> and we've been making a lot of progress since then. Here I would like to
>> update you guys and share some really good news happening here at
>> LinkedIn:
>>
>> 1) First Beam job in production @LinkedIn!
>> After a few rounds of testing and benchmarking, we finally rolled out
>> our first Beam job here! The job uses quite a few features, such as event
>> time, fixed/session windowing, early triggering, and stateful processing.
>> Our first customer is very happy and highly praises the easy-to-use
>> Beam API as well as the powerful processing model. Due to the limited
>> resources here, we put our full trust in the work you guys are doing, and
>> we didn't run into any surprises. We see extreme attention to detail as
>> well as an uncompromising approach to user experience everywhere in the
>> code base. We would like to thank everyone in the Beam community for
>> contributing to such an amazing framework!
>>
>> 2) A portable Samza Runner prototype
>> We are also starting the work on making the Samza Runner portable. So far
>> we have just got the Python word count example working using the portable
>> Samza Runner. Please look out for the PR for this very soon :). Again, this
>> work would not be possible without the great Beam portability framework, and
>> the developers like Luke and Ahmet, just to name a few, behind it. The
>> ReferenceRunner has been extremely useful to us in figuring out what's
>> needed and how it works. Kudos to Thomas Groh, Ben Sidhom and all the others
>> who made this available to us. And to Kenn, your fuse work rocks.
>>
>> 3) More contributors in Samza Runner
>> The runner was a personal project of Chris's and mine for a while, and now
>> that's no longer the case. We got Hai Lu and Boris Shkolnik from the Samza
>> team to contribute. Hai has been focusing on the portability work as
>> mentioned in #2, and Boris will work mostly on supporting our use cases. We
>> will send more emails discussing our use cases, like the "Update state after
>> firing" email I sent out earlier.
>>
>> Finally, a shout-out to our very own Chris Pettitt. Without you, none
>> of the above would have happened!
>>
>> Thanks,
>> Xinyu
>>
>
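
For readers unfamiliar with the windowing features mentioned in the update
above (fixed and session windows over event time), here is a minimal
plain-Python sketch of how events could be assigned to such windows. It is an
illustration only and does not use the Beam API; the function names
`fixed_window` and `session_windows`, and the use of integer seconds for
timestamps, are assumptions made for this example.

```python
def fixed_window(timestamp, size):
    """Return the [start, end) fixed window that contains an event timestamp."""
    start = timestamp - (timestamp % size)
    return (start, start + size)


def session_windows(timestamps, gap):
    """Group event timestamps into sessions.

    Each event opens a proto-window [ts, ts + gap); proto-windows that
    overlap are merged into one session (hypothetical helper, not Beam code).
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # The event lands before the current session closes: extend it.
            sessions[-1] = (sessions[-1][0], ts + gap)
        else:
            # Enough inactivity has passed: start a new session.
            sessions.append((ts, ts + gap))
    return sessions


if __name__ == "__main__":
    print(fixed_window(125, 60))
    print(session_windows([0, 5, 30, 32], 10))
```

In a real pipeline the runner performs this assignment and merging per key,
and triggers decide when (possibly early) results are emitted for each window.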


Re: Community Examples Repository

2018-08-01 Thread Jesse Anderson
The examples have to be separate from the main beam repository. This way,
they serve as an example of how to use them in your code instead of how to
do it as part of Beam. It would also allow you to show the dependencies in sbt or
Maven.

On Wed, Aug 1, 2018, 3:16 PM Charles Chen  wrote:

> The examples we have right now serve both as examples to users and, along
> with their unit tests, as tests of functionality.  If we move the examples
> out, what is a good way to make sure that we continue to have visibility
> and test coverage?  Can we address this in a section of the doc?
>
> On Wed, Aug 1, 2018 at 3:12 PM David Cavazos  wrote:
>
>> Hi everyone!
>>
>> We wanted to migrate the examples from the core repository to a new Beam
>> community examples repository. As the number of examples grows, it makes
>> sense to modularize and decouple the core functionality from the examples.
>>
>> We will also create some guidelines with the best practices for new
>> examples to be submitted.
>>
>> For more details, feel free to take a look and comment on the proposal.
>>
>> Cheers,
>> David
>>
>


Re: An update on Eugene

2018-07-16 Thread Jesse Anderson
Thanks for all your work!

On Mon, Jul 16, 2018, 9:17 PM Eugene Kirpichov  wrote:

> Hi beamers,
>
> After 5.5 years working on data processing systems at Google, several of
> these years working on Dataflow and Beam, I am moving on to do something
> new (also at Google) in the area of programming models for machine
> learning. Anybody who worked with me closely knows how much I love building
> programming models, so I could not pass up on the opportunity to build a
> new one - I expect to have a lot of fun there!
>
> On the new team we very much plan to make things open-source when the time
> is right, and make use of Beam, just as TensorFlow does - so I will stay in
> touch with the community, and I expect that we will still work together on
> some things. However, Beam will no longer be the main focus of my work.
>
> I've made the decision a couple months ago and have spent the time since
> then getting things into a good state and handing over the community
> efforts in which I have played a particularly active role - they are in
> very capable hands:
> - Robert Bradshaw and Ankur Goenka on Google side are taking charge of
> Portable Runners (e.g. the Portable Flink runner).
> - Luke Cwik will be in charge of the future of Splittable DoFn. Ismael
> Mejia has also been involved in the effort and actively helping, and I
> believe he continues to do so.
> - The Beam IO ecosystem in general is in very good shape (perhaps the best
> in the industry) and does not need a lot of constant direction; and it has
> a great community (thanks JB, Ismael, Etienne and many others!) - however,
> on Google side, Chamikara Jayalath will take it over.
>
> It was a great pleasure working with you all. My last day formally on Beam
> will be this coming Friday, then I'll take a couple weeks of vacation and
> jump right in on the new team.
>
> Of course, if my involvement in something is necessary, I'm still
> available on all the same channels as always (email, Slack, Hangouts) -
> but, in general, please contact the folks mentioned above instead of me
> about the respective matters from now on.
>
> Thanks!
>


Re: [ANNOUNCEMENT] New committers, May 2018 edition!

2018-06-01 Thread Jesse Anderson
Welcome!

On Fri, Jun 1, 2018, 2:02 AM Etienne Chauchot  wrote:

> Congrats to all !
> On Thursday, May 31, 2018 at 19:08 -0700, Davor Bonaci wrote:
>
> Please join me and the rest of Beam PMC in welcoming the following
> contributors as our newest committers. They have significantly contributed
> to the project in different ways, and we look forward to many more
> contributions in the future.
>
> * Griselda Cuevas
> * Pablo Estrada
> * Jason Kuster
>
> (Apologies for a delayed announcement, and the lack of the usual
> paragraph summarizing individual contributions.)
>
> Congratulations to all three! Welcome!
>
>


Re: I'm back and ready to help grow our community!

2018-05-17 Thread Jesse Anderson
Congrats!

On Thu, May 17, 2018, 6:44 PM Robert Burke  wrote:

> Congrats & welcome back!
>
> On Thu, May 17, 2018, 5:44 PM Huygaa Batsaikhan  wrote:
>
>> Welcome back, Gris! Congratulations!
>>
>> On Thu, May 17, 2018 at 4:24 PM Robert Bradshaw 
>> wrote:
>>
>>> Congratulations, Gris! And welcome back!
>>> On Thu, May 17, 2018 at 3:30 PM Robin Qiu  wrote:
>>>
>>> > Congratulations! Welcome back!
>>>
>>> > On Thu, May 17, 2018 at 3:23 PM Reuven Lax  wrote:
>>>
>>> >> Congratulations! Good to see you back!
>>>
>>> >> Reuven
>>>
>>> >> On Thu, May 17, 2018 at 2:24 PM Griselda Cuevas 
>>> wrote:
>>>
>>> >>> Hi Everyone,
>>>
>>>
>>> >>> I was absent from the mailing list, slack channel and our Beam
>>> community for the past six weeks, the reason was that I took a leave to
>>> focus on finishing my Masters Degree, which I finally did on May 15th.
>>>
>>>
>>> >>> I graduated as a Masters of Engineering in Operations Research with a
>>> concentration in Data Science from UC Berkeley. I'm glad to be part of this
>>> community and I'd like to share this accomplishment with you so I'm adding
>>> two pictures of that day :)
>>>
>>>
>>> >>> Given that I've seen so many new folks around, I'd like to use this
>>> opportunity to re-introduce myself. I'm Gris Cuevas and I work at Google.
>>> Now that I'm back, I'll continue to work on supporting our community in two
>>> main streams: Contribution Experience & Events, Meetups, and Conferences.
>>>
>>>
>>> >>> It's good to be back and I look forward to collaborating with you.
>>>
>>>
>>> >>> Cheers,
>>>
>>> >>> Gris
>>>
>>


Re: Beam high level directions (was "Graal instead of docker?")

2018-05-16 Thread Jesse Anderson
This -> "I'd like that each time you think that you ask yourself "does it
need?"."

On Wed, May 16, 2018 at 4:53 PM Robert Bradshaw  wrote:

> Thanks for your email, Romain. It helps understand your goals and where
> you're coming from. I'd also like to see a thinner core, and agree it's
> beneficial to reduce dependencies where possible, especially when
> supporting the use case where the pipeline is constructed in an environment
> other than an end-user's main.
>
> It seems a lot of the portability work, despite being on the surface driven
> by multi-language, aligns well with many of these goals. For example, all
> the work going on in runners-core to provide a rich library that all (Java,
> and perhaps non-Java) runners can leverage to do DAG preprocessing (fusion,
> combiner lifting, ...) and handle the low-level details of managing worker
> subprocesses. As you state, the more we can put into these libraries, the
> more all runners can get "for free" by interacting with them, while still
> providing the flexibility to adapt to their differing models and strengths.
>
> Getting this right is, for me at least, one of the highest priorities for
> Beam.
>
> - Robert
> On Wed, May 16, 2018 at 11:51 AM Kenneth Knowles  wrote:
>
> > Hi Romain,
>
> > This gives a clear view of your perspective. I also recommend you ask
> around to those who have been working on Beam and big data processing for a
> long time to learn more about their perspective.
>
> > Your "Beam Analysis" is pretty accurate about what we've been trying to
> build. I would say (a) & (b) as "any language on any runner" and (c) is our
> plan of how to do it: define primitives which are fundamental to parallel
> processing and formalize a language-independent representation, with
> adapters for each language and data processing engine.
>
> > Of course anyone in the community may have their own particular goal. We
> don't control what they work on, and we are grateful for their efforts.
>
> > Technically, there is plenty to agree with. I think as you learn about
> Beam you will find that many of your suggestions are already handled in
> some way. You may also continue to learn sometimes about the specific
> reasons things are done in a different way than you expected. These should
> help you find how to build what you want to build.
>
> > Kenn
>
> > On Wed, May 16, 2018 at 1:14 AM Romain Manni-Bucau <
> rmannibu...@gmail.com>
> wrote:
>
> >> Hi guys,
>
> >> Since it is not the first time we have a thread where we end up not
> understanding each other, I'd like to take this as an opportunity to
> clarify what I'm looking for, in a more formal way. This assumes our
> misunderstandings come from the fact I mainly tried to fix issues one by
> one, instead of painting the big picture I'm after. (My rationale
> was I was not able to invest more time in that but I start to think it was
> not a good choice). I really hope it helps.
>
> >> 1. Beam analysis
>
> >> Beam has three main goals:
>
> >> a. Being a portable API across runners (I also call them
> "implementations" as opposed to "api")
> >> b. Bringing some interoperability between languages and therefore users
> >> c. Provide primitives (groupby for instance), I/O and generic processing
> items
>
> >> Indeed it doesn't cover all beam's features but, high level, it is what
> it brings.
>
> >> In terms of advantages and why choosing beam instead of spark, for
> instance, the benefit is mainly to not be vendor locked on one side and to
> enable more users on the other side (you note that point c is just catching
> up on vendors ecosystems with these statements).
>
> >> 2. Portable API across environments
>
> >> It is key, here, to keep in mind Beam is not an environment or a runner.
> It is, by design, a library *embedded* in other environments.
>
> >> a. This means that Beam must keep its stack as clean as possible. If it
> is still ambiguous: beam must be dependency free.
>
> >> Until now the workaround has been to shade dependencies. This is not a
> solution since it leads to big jobs of hundreds of megabytes, which prevents
> scaling since we deploy over the network. It makes all deployment,
> management, and storage a pain on the ops side. The other pitfall of shading
> (or shadowing since we are on Gradle now) is that it completely breaks any
> company tooling and prevents vulnerability scanning or dependency upgrades -
> not handled by the dev team - from working correctly. This is a major issue
> for any software targeting some professional level, which should not be
> underestimated.
>
> >> From that point we can get scared, but with Java 8 there is no real
> point in having tons of dependencies for the sdk core - this is for Java but
> should be true for most languages since Beam requirements are light here.
>
> >> However it can also require to rethink the sdk core modularity: why is
> there some IO here? Do we need a big fat sdk core?
>
> >> b. API or "put it all"?
>
> >> Current API is in sdk-co

Re: Samza Runner

2018-01-25 Thread Jesse Anderson
Excellent!

On Fri, Jan 26, 2018, 5:37 AM Kenneth Knowles  wrote:

> Hi all,
>
> In case you haven't noticed or followed, there's a new runner in PR: Samza!
>
> https://github.com/apache/beam/pull/4340
>
> It has been under review and revision for some time. In local mode it
> passes a solid suite of ValidatesRunner tests (I don't have a Samza
> deployment handy to test non-local).
>
> Given all this, I am ready to put it on a feature branch where it can
> mature further, and we can build out our CI for it, etc, until we agree it
> is ready for master.
>
> Kenn
>


Strata Conference this March 6-8

2018-01-16 Thread Jesse Anderson
+1 to BoF. I don't know if any Beam talks will be on the schedule.

> We could do an informal BoF at the Philz nearby or similar?


Re: [DISCUSS] State of the project

2018-01-15 Thread Jesse Anderson
I think a focus on the runners is what's key to Beam's adoption. The
runners are the foundation on which Beam sits. If the runners don't work
properly, Beam won't work.

A focus on improved unit tests is a good start, but isn't what's needed.
Compatibility matrices will help see how your runner of choice stacks up,
but that requires too much knowledge of Beam's internals to be
interpretable.

Imagine you're an (enterprise) architect looking at adopting Beam. What do
you look at or what do you look for before going deeper? What would make
you stick your neck out to adopt Beam? In my experience, there are
several pass/fail checks along the way.

Here are a few of the common ones I've seen:

   - Will this dramatically improve the problems I'm trying to solve? (not
   writing APIs/better programming model/Beam's better handling of windowing)
   - Can I get commercial support for Beam? (This is changing soon)
   - Are other people using Beam with the same configuration and use case as me?
   (e.g. I'm using Spark with Beam to process imagery. Are others doing this
   in production?)
   - Is there good documentation and books on the subject? (Tyler's and
   others' book will improve this)
   - Can I get my team trained on this new technology? (I have Beam
   training and Google has some cursory training)

I think the one the community can improve on the most is the social proof
of Beam. I've tried to do this (
http://www.jesse-anderson.com/2017/06/beam-2-0-q-and-a/ and
http://www.jesse-anderson.com/2016/07/question-and-answers-with-the-apache-beam-team/).
We need to get the message out more about people using Beam in production,
which configuration they have, and what their results were. I think we have
the social proof on Dataflow, but not as much on Spark/Flink/Apex.

I think it's important to note that these checks don't look at the hardcore
language or API semantics that we're working on. These are much later stage
issues, if they're ever used at all.

In my experience with other open source adoption at enterprises, it starts
with architects and works its way around the organization from there.

Thanks,

Jesse

On Mon, Jan 15, 2018 at 8:14 AM Ted Yu  wrote:

> bq. are hard to detect in our unit-test framework
>
> Looks like more integration tests would help discover bugs / regressions
> more quickly. If the committer reviewing the PR has a concern in this regard,
> the concern should be stated on the PR so that the contributor (and reviewer)
> can spend more time in solidifying the solution.
>
> bq. I've gone and fixed these issues myself when merging
>
> We can make stricter checkstyle rules so that the code wouldn't pass build
> without addressing commonly known issues.
>
> Cheers
>
> On Sun, Jan 14, 2018 at 12:37 PM, Reuven Lax  wrote:
>
>> I agree with the sentiment, but I don't completely agree with the
>> criteria.
>>
>> I think we need to be much better about reviewing PRs. Some PRs languish
>> for too long before the reviewer gets to it (and I've been guilty of this
>> too), which does not send a good message. Also new PRs sometimes languish
>> because there is no reviewer assigned; maybe we could write a gitbot to
>> automatically assign a reviewer to every new PR?
>>
>> Also, I think that the bar for merging a PR from a contributor should not
>> be "the PR is perfect." It's perfectly fine to merge a PR that still has
>> some issues (especially if the issues are stylistic). In the past when I've
>> done this, I've gone and fixed these issues myself when merging. It was a
>> bit more work for me to fix these things myself, but it was a small price
>> to pay in order to portray Beam as a welcoming place for contributions.
>>
>> On the other hand, "the build does not break" is - in my opinion - too
>> weak of a criterion for merging. A few reasons for this:
>>
>>   * Beam is a data-processing framework, and data integrity is paramount.
>> If a reviewer sees an issue that could lead to data loss (or duplication,
>> or corruption), I don't think that PR should be merged. Historically many
>> such issues only actually manifest at scale, and are hard to detect in our
>> unit-test framework. (we also need to invest in more at-scale tests to
>> catch such issues).
>>
>>   * Beam guarantees backwards compatibility for users (except across
>> major versions). If a bad API gets merged and released (and the chances of
>> "forgetting" about it before the release is cut is unfortunately high), we
>> are stuck with it. This is less of an issue for many other open-source
>> projects that do not make such a compatibility guarantee, as they are able
>> to simply remove or fix the API in the next version.
>>
>> I think we still need honest review of PRs, with the criteria being
>> stronger than "the build doesn't break." However reviewers also need to be
>> reasonable about what they ask for.
>>
>> Reuven
>>
>> On Sun, Jan 14, 2018 at 11:19 AM, Ted Yu  wrote:
>>
>>> bq. if a PR is basically right (it does what it should) without
>>> br

Re: Happy new year

2018-01-01 Thread Jesse Anderson
Happy New Year!

On Sun, Dec 31, 2017, 11:09 PM Jean-Baptiste Onofré  wrote:

> Hi beamers,
>
> I wish you a great and happy new year !
>
> Regards
> JB
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: A personal update

2017-12-12 Thread Jesse Anderson
Congrats!

On Wed, Dec 13, 2017, 5:54 AM Jean-Baptiste Onofré  wrote:

> Hi Davor,
>
> welcome back !!
>
> It's really great to see you back active in the Beam community. We really
> need you !
>
> I'm so happy !
>
> Regards
> JB
>
> On 12/13/2017 05:51 AM, Davor Bonaci wrote:
> > My dear friends,
> > As many of you have noticed, I’ve been visibly absent from the project for a
> > little while. During this time, a great number of you kept reaching out, and
> > for that I’m deeply humbled and grateful to each and every one of you.
> >
> > I needed some time for personal reflection, which led to a transition in my
> > professional life. As things have settled, I’m happy to again be working among
> > all of you, as we propel this project forward. I plan to be active in the
> > future, but perhaps not quite full-time as I was before.
> >
> > In the near term, I’m working on getting the report to the Board completed, as
> > well as framing the discussion about the project state and vision going
> > forwards. Additionally, I’ll make sure that we foster healthy community culture
> > and operate in the Apache Way.
> >
> > For those who are curious, I’m happy to say that I’m starting a company building
> > products related to Beam, along with several other members of this community and
> > authors of this technology. I’ll share more on this next year, but until then if
> > you have a data processing problem or an Apache Beam question, I’d love to hear
> > from you ;-).
> >
> > Thanks -- and so happy to be back!
> >
> > Davor
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: Apache Beam Workshop in Guadalajara Mexico

2017-11-29 Thread Jesse Anderson
That's great!

On Wed, Nov 29, 2017, 4:00 AM Etienne Chauchot  wrote:

> Very nice Griselda!
>
> Looking forward to getting feedback!
>
> Thanks
>
> Etienne
>
>
> On 29/11/2017 at 09:11, Jean-Baptiste Onofré wrote:
> > Hi Gris,
> >
> > That's really great ! Thanks for sharing.
> >
> > By the way, next week I will be at Strata Singapore. I know some
> > beamers will be around.
> >
> > Regards
> > JB
> >
> > On 11/29/2017 08:31 AM, Griselda Cuevas wrote:
> >> Hi Everyone,
> >>
> >> I wanted to share with you that on December 2nd, Wizeline Academy [1]
> >> will host an Apache Beam workshop in Guadalajara Mexico. The
> >> objective of this workshop is to identify adoption barriers and
> >> improvement opportunities for the project through the observation and
> >> documentation of the experience of new Beam users. We hope that the
> >> findings of this workshop can provide supportive information to shape
> >> the direction of our project, especially now that we have started
> >> conversations about the next releases.
> >>
> >> If you are in the area and are interested in joining us, please sign
> >> up [2]. If you're interested in running similar efforts reach out to
> >> me and I'll be happy to share resources and connect with you. I'll
> >> report back with findings after the workshop.
> >>
> >> Cheers,
> >> Gris
> >>
> >> [1] https://academy.wizeline.com/about/
> >> [2] https://academy.wizeline.com/apache-beam/
> >
>
>


Re: New Contributor

2017-11-14 Thread Jesse Anderson
Welcome!

On Tue, Nov 14, 2017, 10:03 PM Ben Sidhom  wrote:

> Hey all,
>
> My name is Ben Sidhom. I'm an engineer at Google working on open source
> data processing on top of GCP. I hope to contribute to the runner
> portability effort along with Axel.
>
>
> On 2017-11-14 11:38, Axel Magnuson  wrote:
> > Hello All,
> >
> > My name is Axel Magnuson and I intend to start contributing to the Beam
> > project.
> >
> > I work as a Software Engineer at Google, with a background in Spark and
> > Hadoop.  I am hoping to make myself useful in particular around portability
> > efforts and the open source engine runners.
> >
> > Best,
> > Axel
> >
> > --
> > Axel Magnuson | Software Engineer | axelm...@google.com |   1 (425) 893-4624
> >
>
-- 
Thanks,

Jesse


Re: python3 support schedule

2017-11-02 Thread Jesse Anderson
Holden is being modest in her contributions to Python frameworks,
especially Apache Spark.

On Thu, Nov 2, 2017 at 12:55 PM Holden Karau  wrote:

> Hi! So this is something I'm currently working on (e.g. in between checking
> my e-mails :p). If you want to help join in we can split up the work into
> smaller components and parallelize the process a bit :) Always happy to see
> more folks who care about Python 3 support.
>
> On Thu, Nov 2, 2017 at 12:44 PM, Lukasz Cwik 
> wrote:
>
> > Contributions are always welcome to improve progress.
> >
> > You can always vote/watch the Python 3 JIRA issue as this helps people know
> > what others are looking for.
> >
> > On Thu, Nov 2, 2017 at 10:33 AM, Yue Yang  wrote:
> >
> > > Hello,
> > >   I wonder what is the schedule to support python 3. It seems that the
> > > progress is very slow.
> > >   Thanks.
> > >
> >
>
>
>
> --
> Twitter: https://twitter.com/holdenkarau
>
-- 
Thanks,

Jesse


Re: New contributor

2017-09-13 Thread Jesse Anderson
Welcome!

On Wed, Sep 13, 2017 at 2:24 PM Daniel Oliveira
 wrote:

> Hi everyone,
>
> My name's Daniel Oliveira. I work at Google and I'd like to start
> contributing to this project so I wanted to introduce myself.
>
> I've already read through the contribution guide and I'm excited to start
> making contributions soon!
>
> Thank you,
> Daniel Oliveira
>
-- 
Thanks,

Jesse


Re: [DISCUSS] Capability Matrix revamp

2017-08-20 Thread Jesse Anderson
It'd be awesome to see these updated. I'd add two more:

   1. A plain English summary of the runner's support in Beam. People who
   are new to Beam won't understand the in-depth coverage and need a general
   idea of how it is supported.
   2. The production readiness of the runner. Does the maintainer think
   this runner is production ready?



On Sun, Aug 20, 2017 at 8:03 AM Kenneth Knowles 
wrote:

> Hi all,
>
> I want to revamp
> https://beam.apache.org/documentation/runners/capability-matrix/
>
> When Beam first started, we didn't work on feature branches for the core
> runners, and they had a lot more gaps compared to what goes on `master`
> today, so this tracked our progress in a way that was easy for users to
> read. Now it is still our best/only comparison page for users, but I think
> we could improve its usefulness.
>
> For the benefit of the thread, let me inline all the capabilities fully
> here:
>
> 
>
> "What is being computed?"
>  - ParDo
>  - GroupByKey
>  - Flatten
>  - Combine
>  - Composite Transforms
>  - Side Inputs
>  - Source API
>  - Splittable DoFn
>  - Metrics
>  - Stateful Processing
>
> "Where in event time?"
>  - Global windows
>  - Fixed windows
>  - Sliding windows
>  - Session windows
>  - Custom windows
>  - Custom merging windows
>  - Timestamp control
>
> "When in processing time?"
>  - Configurable triggering
>  - Event-time triggers
>  - Processing-time triggers
>  - Count triggers
>  - [Meta]data driven triggers
>  - Composite triggers
>  - Allowed lateness
>  - Timers
>
> "How do refinements relate?"
>  - Discarding
>  - Accumulating
>  - Accumulating & Retracting
>
> 
>
> Here are some issues I'd like to improve:
>
>  - Rows that are impossible to not support (ParDo)
>  - Rows where "support" doesn't really make sense (Composite transforms)
>  - Rows that are actually the same model feature (non-merging windowfns)
>  - Rows that represent optimizations (Combine)
>  - Rows in the wrong place (Timers)
>  - Rows that have not been designed ([Meta]Data driven triggers)
>  - Rows with names that appear nowhere else (Timestamp control)
>  - No place to compare non-model differences between runners
>
> I'm still pondering how to improve this, but I thought I'd send the notion
> out for discussion. Some imperfect ideas I've had:
>
> 1. Lump all the basic stuff (ParDo, GroupByKey, Read, Window) into one row
> 2. Make sections as users see them, like "ParDo" / "side Inputs" not
> "What?" / "side inputs"
> 3. Add rows for non-model things, like portability framework support,
> metrics backends, etc
> 4. Drop rows that are not informative, like Composite transforms, or not
> designed
> 5. Reorganize the windowing section to be just support for merging /
> non-merging windowing.
> 6. Switch to a more distinct color scheme than the solid vs faded colors
> currently used.
> 7. Find a web design to get short descriptions into the foreground to make
> it easier to grok.
>
> These are just a few thoughts, and not necessarily compatible with each
> other. What do you think?
>
> Kenn
>
-- 
Thanks,

Jesse


Re: contrib package for beam?

2017-08-16 Thread Jesse Anderson
I've had this discussion before. I'd love to see one so that there's a
consistent home for things that don't belong in the API.

On Wed, Aug 16, 2017, 2:55 PM Pablo Estrada 
wrote:

> Hi all,
> What would be an appropriate medium for contributions such as utility
> Pipelines or PTransforms? Perhaps it's different for each kind of
> contribution (sources/sinks, PTransforms, or utility pipelines).
>
> The question comes from an active user on Stack Overflow[1], and it seems
> pertinent. What's the standard practice in other projects for keeping these
> kinds of contributions shared and available? Perhaps keep a list with links in our
> readme, or the beam site, or something else?
>
> Best
> -P.
>
> 1 -
>
> https://stackoverflow.com/questions/45603814/any-contrib-package-for-apache-beam-where-i-can-commit-a-dataflow-pipeline
>
-- 
Thanks,

Jesse


Re: [ANNOUNCEMENT] New PMC members, August 2017 edition!

2017-08-11 Thread Jesse Anderson
Welcome!

On Fri, Aug 11, 2017, 10:43 AM Ted Yu  wrote:

> Congratulations to Ahmet and Aviem.
>
> On Fri, Aug 11, 2017 at 10:40 AM, Davor Bonaci  wrote:
>
> > Please join me and the rest of Beam PMC in welcoming the following
> > committers as our newest PMC members. They have significantly contributed
> > to the project in different ways, and we look forward to many more
> > contributions in the future.
> >
> > * Ahmet Altay
> > Beyond significant work to drive the Python SDK to the master branch, Ahmet
> > has worked project-wide, driving releases, improving processes and testing,
> > and growing the community.
> >
> > * Aviem Zur
> > Beyond significant work in the Spark runner, Aviem has worked to improve
> > how the project operates, leading discussions on inclusiveness and
> > openness.
> >
> > Congratulations to both! Welcome!
> >
> > Davor
> >
>
-- 
Thanks,

Jesse


Re: [ANNOUNCEMENT] New committers, August 2017 edition!

2017-08-11 Thread Jesse Anderson
Welcome!

On Fri, Aug 11, 2017, 10:48 AM Jason Kuster 
wrote:

> Congrats to all, many thanks for the great contributions.
>
> On Fri, Aug 11, 2017 at 10:46 AM, Ahmet Altay 
> wrote:
>
> > Congratulations to all of you. Well deserved and thank you for your
> > contributions.
> >
> > On Fri, Aug 11, 2017 at 10:43 AM, tarush grover wrote:
> >
> > > Congratulations!!
> > >
> > > Regards,
> > > Tarush
> > >
> > > On Fri, 11 Aug 2017 at 11:11 PM, Davor Bonaci wrote:
> > > >
> > > > > Please join me and the rest of Beam PMC in welcoming the following
> > > > > contributors as our newest committers. They have significantly contributed
> > > > > to the project in different ways, and we look forward to many more
> > > > > contributions in the future.
> > > > >
> > > > > * Reuven Lax
> > > > > Reuven has been with the project since the very beginning, contributing
> > > > > mostly to the core SDK and the GCP IO connectors. He accumulated 52 commits
> > > > > (19,824 ++ / 12,039 --). Most recently, Reuven re-wrote several IO
> > > > > connectors that significantly expanded their functionality. Additionally,
> > > > > Reuven authored important new design documents relating to update and
> > > > > snapshot functionality.
> > > > >
> > > > > * Jingsong Lee
> > > > > Jingsong has been contributing to Apache Beam since the beginning of the
> > > > > year, particularly to the Flink runner. He has accumulated 34 commits
> > > > > (11,214 ++ / 6,314 --) of deep, fundamental changes that significantly
> > > > > improved the quality of the runner. Additionally, Jingsong has contributed
> > > > > to the project in other ways too -- reviewing contributions, and
> > > > > participating in discussions on the mailing list, design documents, and
> > > > > JIRA issue tracker.
> > > > >
> > > > > * Mingmin Xu
> > > > > Mingmin started the SQL DSL effort, and has driven it to the point of
> > > > > merging to the master branch. In this effort, he extended the project to
> > > > > a significant new user community.
> > > > >
> > > > > * Mingming (James) Xu
> > > > > James joined the SQL DSL effort, contributing some of the trickier parts,
> > > > > such as the Join functionality. Additionally, he's consistently shown
> > > > > himself to be an insightful code reviewer, significantly impacting the
> > > > > project’s code quality and ensuring the success of the new major component.
> > > > >
> > > > > * Manu Zhang
> > > > > Manu initiated and developed a runner for the Apache Gearpump (incubating)
> > > > > engine, and has driven it to the point of merging to the master branch. In
> > > > > this effort, he accumulated 65 commits (7,812 ++ / 4,882 --) and extended
> > > > > the project to a new user community.
> > > >
> > > > Congratulations to all five! Welcome!
> > > >
> > > > Davor
> > > >
> > >
> >
>
>
>
> --
> ---
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>
-- 
Thanks,

Jesse


Re: [DISCUSS] Beam MapReduce Runner One-Pager

2017-07-07 Thread Jesse Anderson
Basing this on Crunch's approach is a good way to go. I'd really love to
see this happen.

On Fri, Jul 7, 2017 at 6:11 AM Pei HE  wrote:

> Hi all,
> While JB is working on MapReduce Runner BEAM-165, I have spent time reading
> Apache Crunch code and drafted Beam MapReduce Runner One-Pager
> <https://docs.google.com/document/d/10jJ8pBTZ10rNr_IO5YnggmZZG1MU-F47sWg8N6xkBM0/edit#heading=h.bewnehqnt4zd>
> (mostly around ParDo/Flatten fusion support, and with many missing details).
>
> I would like to start the discussion, and draw people's attention to
> supporting MapReduce in Beam.
>
> Feel free to make comments and suggestions on that doc.
>
> Thanks
> --
> Pei
>
-- 
Thanks,

Jesse


Re: BeamSQL status and merge to master

2017-07-05 Thread Jesse Anderson
So excited to start using this!

On Wed, Jul 5, 2017, 3:34 PM Mingmin Xu  wrote:

> Thanks for everybody's effort, we're very close to finishing the existing tasks.
> Here's a status update on the SQL DSL. Feel free to try it and share any
> comments:
>
> *1. what's done*
>   DSL feature is done, with basic filter/project/aggregation/union/join,
> built-in functions/UDF/UDAF(pending on #3491)
>
> *2. what's on-going*
>   more unit tests, and documentation of README/Beam web.
>
> *3. open questions*
>   BEAM-2441: I'd like to hear any suggestions on the proper module name for
> the SQL work. As mentioned in the task, '*dsl/sql* is for the Java SDK and
> also prevents alternative language implementations, however there's another
> SQL client and not good to be included as Java SDK extension'.
>
> ---
> *How to run the example* beam/dsls/sql/example/BeamSqlExample.java
> <
> https://github.com/apache/beam/blob/DSL_SQL/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java
> >
> 1. run 'mvn install' to avoid the error in #3439
> 
> 2. run 'mvn -pl dsls/sql compile exec:java
> -Dexec.mainClass=org.apache.beam.dsls.sql.example.BeamSqlExample
> -Dexec.args="--runner=DirectRunner" -Pdirect-runner'
>
> FYI:
> 1. burn-down list in google doc
>
> https://docs.google.com/document/d/1EHZgSu4Jd75iplYpYT_K_JwSZxL2DWG8kv_EmQzNXFc/edit?usp=sharing
> 2. JIRA tasks with label 'dsl_sql_merge'
>
> https://issues.apache.org/jira/browse/BEAM-2555?jql=labels%20%3D%20dsl_sql_merge
>
>
> Mingmin
>
> On Tue, Jun 13, 2017 at 8:51 AM, Lukasz Cwik 
> wrote:
>
> > Nevermind, I merged it into #2 about usability.
> >
> > On Tue, Jun 13, 2017 at 8:50 AM, Lukasz Cwik  wrote:
> >
> > > I added a section about maven module structure/packaging (#6).
> > >
> > > On Tue, Jun 13, 2017 at 8:30 AM, Tyler Akidau wrote:
> > >
> > >> Thanks Mingmin. I've copied your list into a doc[1] to make it easier to
> > >> collaborate on comments and edits.
> > >>
> > >> [1] https://s.apache.org/beam-dsl-sql-burndown
> > >>
> > >> -Tyler
> > >>
> > >>
> > >> On Mon, Jun 12, 2017 at 10:09 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > >> wrote:
> > >>
> > >> > Hi Mingmin
> > >> >
> > >> > Sorry, the meeting was in the middle of the night for me and I wasn't able to
> > >> > make it.
> > >> >
> > >> > The timing and checklist look good to me.
> > >> >
> > >> > We plan to do a Beam release end of June, so, merging in July means we can
> > >> > include it in the next release.
> > >> >
> > >> > Thanks !
> > >> > Regards
> > >> > JB
> > >> >
> > >> > On 06/13/2017 03:06 AM, Mingmin Xu wrote:
> > >> > > Hi all,
> > >> > >
> > >> > > Thanks for joining the meeting. As discussed, we're planning to merge the
> > >> > > DSL_SQL branch back to master, targeting the middle of July. A tag
> > >> > > 'dsl_sql_merge'[1] has been created to track all todo tasks.
> > >> > >
> > >> > > *What's added in Beam SQL?*
> > >> > > BeamSQL provides the capability to execute SQL queries with the Beam Java
> > >> > > SDK, either by translating SQL to a PTransform, or by running them with a
> > >> > > standalone CLI client.
> > >> > >
> > >> > > *Checklist for merge:*
> > >> > > 1. functionality
> > >> > >    1.1. SQL grammar:
> > >> > >      1.1.1. basic query with SELECT/FILTER/PROJECT;
> > >> > >      1.1.2. AGGREGATION with global window;
> > >> > >      1.1.3. AGGREGATION with FIX_TIME/SLIDING_TIME/SESSION window;
> > >> > >      1.1.4. JOIN
> > >> > >    1.2. UDF/UDAF support;
> > >> > >    1.3. support predefined String/Math/Date functions, see[2];
> > >> > >
> > >> > > 2. DSL interface to convert SQL as PTransform;
> > >> > >
> > >> > > 3. junit test;
> > >> > >
> > >> > > 4. Java document;
> > >> > >
> > >> > > 5. Document of SQL feature in website;
> > >> > >
> > >> > > Any comments/suggestions are very welcome.
> > >> > >
> > >> > > Note:
> > >> > > [1].
> > >> > > https://issues.apache.org/jira/browse/BEAM-2436?jql=labels%20%3D%20dsl_sql_merge
> > >> > >
> > >> > > [2]. https://calcite.apache.org/docs/reference.html
> > >> > >
> > >> >
> > >> > --
> > >> > Jean-Baptiste Onofré
> > >> > jbono...@apache.org
> > >> > http://blog.nanthrax.net
> > >> > Talend - http://www.talend.com
> > >> >
> > >>
> > >
> > >
> >
>
>
>
> --
> 
> Mingmin
>
-- 
Thanks,

Jesse


Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread Jesse Anderson
If I'm understanding correctly, Hive does that with an INSERT INTO followed
by a SELECT statement that does the aggregation.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries

On Thu, Jun 22, 2017 at 1:32 AM James  wrote:

> Hi team,
>
> I am thinking about a problem related to SQL and stream computing, and want to
> hear your opinions.
>
> In stream computing, there is a typical case like this:
>
> We want to calculate a big, wide result table, which has one rowkey and ten
> value columns:
>
> create table result (
>     rowkey varchar(127) PRIMARY KEY,
>     col1 int,
>     col2 int,
>     ...
>     col10 int
> );
>
> Each of the value columns is calculated by a complex query, so there will
> be ten SQL statements to calculate data for this table. For each statement:
>
> * First check whether there is a row for the specified `rowkey`.
> * If yes, then `update`, otherwise `insert`.
>
> There is actually a dedicated SQL syntax called `MERGE` designed for
> this (SQL:2008). A sample usage is:
>
> MERGE INTO result D
>    USING (SELECT rowkey, col1 FROM input WHERE flag = 80) S
>    ON (D.rowkey = S.rowkey)
>    WHEN MATCHED THEN UPDATE SET D.col1 = S.col1
>    WHEN NOT MATCHED THEN INSERT (D.rowkey, D.col1) VALUES (S.rowkey, S.col1)
>
>
> The semantics fit perfectly, but it is very verbose, and ordinary users
> rarely use this syntax.
>
> So my colleagues invented a new syntax for this scenario (or more
> precisely, a new interpretation of the INSERT statement). For the above
> scenario, the user always writes an `insert` statement:
>
> insert into result(rowkey, col1) values(...)
> insert into result(rowkey, col2) values(...)
>
> The SQL interpreter does a trick behind the scenes: if the `rowkey`
> exists, it updates the row, otherwise it inserts one. This solution is very
> concise, but it violates the semantics of `insert`: with this solution, INSERT
> behaves differently in batch and stream processing.
>
> What do you think? Which do you prefer? What's your reasoning?
>
> Looking forward to your opinions, thanks in advance.
>
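
The update-or-insert behavior that both options in the quoted question aim for can be tried outside any streaming engine. What follows is only a sketch under stated assumptions: it uses Python's standard sqlite3 module, a cut-down two-column version of the `result` table, and a hypothetical helper named `upsert_column` that is not part of Beam, Hive, or either syntax discussed in the thread.

```python
import sqlite3

ALLOWED_COLUMNS = {"col1", "col2"}  # assumption: cut-down version of the ten-column table


def upsert_column(conn, rowkey, column, value):
    """Update `column` for `rowkey` if the row exists, otherwise insert a new row."""
    # Column names cannot be bound as SQL parameters, so check them against a whitelist.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column}")
    cur = conn.execute(
        f"UPDATE result SET {column} = ? WHERE rowkey = ?", (value, rowkey)
    )
    if cur.rowcount == 0:  # no existing row matched, so fall back to INSERT
        conn.execute(
            f"INSERT INTO result (rowkey, {column}) VALUES (?, ?)", (rowkey, value)
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result (rowkey VARCHAR(127) PRIMARY KEY, col1 INT, col2 INT)"
)

upsert_column(conn, "k1", "col1", 10)  # row is missing: inserted
upsert_column(conn, "k1", "col2", 20)  # row exists: updated in place
upsert_column(conn, "k1", "col1", 11)  # row exists: col1 overwritten

rows = conn.execute("SELECT rowkey, col1, col2 FROM result").fetchall()
print(rows)
```

Each call touches only one value column, which mirrors the "one statement per column" setup in the question. Whether this behavior is spelled `MERGE` or hidden behind `INSERT` is exactly the design choice being debated.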
-- 
Thanks,

Jesse


Re: Beam 2.0 Release Q and A

2017-06-15 Thread Jesse Anderson
I've published the Q and A. You can find it here
<http://www.jesse-anderson.com/2017/06/beam-2-0-q-and-a/>.

Thanks,

Jesse

On Wed, Jun 7, 2017 at 7:52 AM Jesse Anderson 
wrote:

> One last call to fill out the Q and A. I'm going to publish it on
> Wednesday June 14. Please fill it out before Tuesday June 13.
>
> One thing to note, you don't have to be a committer or developer to add
> something to the doc. If you're using Beam in production, that will help
> others decide to use Beam.
>
> Here's the document again
> https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#heading=h.o2a3kt7gof9g
> .
>
>
> On Mon, May 22, 2017 at 2:26 AM Jesse Anderson 
> wrote:
>
>> Gentle reminder to fill out the Q and A
>> https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#.
>> As James said, this is very helpful to new Beam users.
>>
>> On Thu, May 18, 2017 at 3:12 AM James  wrote:
>>
>>> Thanks a lot Jesse! Learned a lot from your previous Q & A blog (and
>>> surely will from this new one too).
>>>
>>> On Thu, May 18, 2017 at 2:37 AM, Jean-Baptiste Onofré wrote:
>>>
>>> > Awesome ! Great work Jesse !
>>> >
>>> > Regards
>>> > JB
>>> > On May 17, 2017, at 14:26, Jesse Anderson wrote:
>>> >>
>>> >> After the first release of Beam, I did a Q and A
>>> >> <http://www.jesse-anderson.com/2016/07/question-and-answers-with-the-apache-beam-team/>
>>> >> with the users and developers of Beam. Now that we've done the first stable
>>> >> release, I want to update the Q and A. This will help us promote Beam and
>>> >> how people are using it in production.
>>> >>
>>> >> I've created a Google Doc
>>> >> <https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#>
>>> >> with questions I often get about Beam. I'd love to get your answers to have as
>>> >> many points of view as possible. You can answer as many questions as you'd
>>> >> like.
>>> >>
>>> >> Once we're done, I'll publish the responses to a new blog post and send out
>>> >> the URL.
>>> >> Thanks,
>>> >>
>>> >> Jesse
>>> >>
>>> >>
>>>
>> --
>> Thanks,
>>
>> Jesse
>>
> --
> Thanks,
>
> Jesse
>
-- 
Thanks,

Jesse


Re: Beam 2.0 Release Q and A

2017-06-07 Thread Jesse Anderson
One last call to fill out the Q and A. I'm going to publish it on Wednesday
June 14. Please fill it out before Tuesday June 13.

One thing to note, you don't have to be a committer or developer to add
something to the doc. If you're using Beam in production, that will help
others decide to use Beam.

Here's the document again
https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#heading=h.o2a3kt7gof9g
.

On Mon, May 22, 2017 at 2:26 AM Jesse Anderson 
wrote:

> Gentle reminder to fill out the Q and A
> https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#.
> As James said, this is very helpful to new Beam users.
>
> On Thu, May 18, 2017 at 3:12 AM James  wrote:
>
>> Thanks a lot Jesse! Learned a lot from your previous Q & A blog (and surely
>> will from this new one too).
>>
>> On Thu, May 18, 2017 at 2:37 AM, Jean-Baptiste Onofré wrote:
>>
>> > Awesome ! Great work Jesse !
>> >
>> > Regards
>> > JB
>> > On May 17, 2017, at 14:26, Jesse Anderson wrote:
>> >>
>> >> After the first release of Beam, I did a Q and A
>> >> <http://www.jesse-anderson.com/2016/07/question-and-answers-with-the-apache-beam-team/>
>> >> with the users and developers of Beam. Now that we've done the first stable
>> >> release, I want to update the Q and A. This will help us promote Beam and
>> >> how people are using it in production.
>> >>
>> >> I've created a Google Doc
>> >> <https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#>
>> >> with questions I often get about Beam. I'd love to get your answers to have as
>> >> many points of view as possible. You can answer as many questions as you'd
>> >> like.
>> >>
>> >> Once we're done, I'll publish the responses to a new blog post and send out
>> >> the URL.
>> >>
>> >> Thanks,
>> >>
>> >> Jesse
>> >>
>> >>
>>
> --
> Thanks,
>
> Jesse
>
-- 
Thanks,

Jesse


Re: Beam 2.0 Release Q and A

2017-05-22 Thread Jesse Anderson
Gentle reminder to fill out the Q and A
https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#.
As James said, this is very helpful to new Beam users.

On Thu, May 18, 2017 at 3:12 AM James  wrote:

> Thanks a lot Jesse! Learned a lot from your previous Q & A blog (and surely
> will from this new one too).
>
> On Thu, May 18, 2017 at 2:37 AM, Jean-Baptiste Onofré wrote:
>
> > Awesome ! Great work Jesse !
> >
> > Regards
> > JB
> > On May 17, 2017, at 14:26, Jesse Anderson wrote:
> >>
> >> After the first release of Beam, I did a Q and A
> >> <http://www.jesse-anderson.com/2016/07/question-and-answers-with-the-apache-beam-team/>
> >> with the users and developers of Beam. Now that we've done the first stable
> >> release, I want to update the Q and A. This will help us promote Beam and
> >> how people are using it in production.
> >>
> >> I've created a Google Doc
> >> <https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#>
> >> with questions I often get about Beam. I'd love to get your answers to have as
> >> many points of view as possible. You can answer as many questions as you'd
> >> like.
> >>
> >> Once we're done, I'll publish the responses to a new blog post and send out
> >> the URL.
> >>
> >> Thanks,
> >>
> >> Jesse
> >>
> >>
>
-- 
Thanks,

Jesse


Beam Example 2.0 Update

2017-05-18 Thread Jesse Anderson
Could I get a pair of eyeballs (or more) to look over the 2.0 updates I
made to the Beam example? This is the commit

.

Thanks,

Jesse
-- 
Thanks,

Jesse


Beam 2.0 Release Q and A

2017-05-17 Thread Jesse Anderson
After the first release of Beam, I did a Q and A
<http://www.jesse-anderson.com/2016/07/question-and-answers-with-the-apache-beam-team/>
with the users and developers of Beam. Now that we've done the first stable
release, I want to update the Q and A. This will help us promote Beam and
how people are using it in production.

I've created a Google Doc
<https://docs.google.com/document/d/1vyel3XRfdeGyqLvXiy1C3mrw9QBbveoKjjVLuRxMw4k/edit#>
with questions I often get about Beam. I'd love to get your answers to have as
many points of view as possible. You can answer as many questions as you'd
like.

Once we're done, I'll publish the responses to a new blog post and send out
the URL.

Thanks,

Jesse
-- 
Thanks,

Jesse


Re: First stable release completed!

2017-05-17 Thread Jesse Anderson
Awesome!

On Wed, May 17, 2017, 8:30 AM Ahmet Altay  wrote:

> Congratulations everyone, this is great!
>
> On Wed, May 17, 2017 at 7:26 AM, Kenneth Knowles 
> wrote:
>
> > Awesome. A huge step.
> >
> > On Wed, May 17, 2017 at 6:30 AM, Andrew Psaltis <psaltis.and...@gmail.com>
> > wrote:
> >
> > > This is fantastic.  Great job!
> > > On Wed, May 17, 2017 at 08:20 Jean-Baptiste Onofré
> > > wrote:
> > >
> > > > Huge congrats to everyone who helped reach this important milestone !
> > > >
> > > > Honestly, we are a great team, WE ROCK ! ;)
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 05/17/2017 01:28 PM, Davor Bonaci wrote:
> > > > > The first stable release is now complete!
> > > > >
> > > > > > Release artifacts are available through various repositories, including
> > > > > > dist.apache.org, Maven Central, and PyPI. The website is updated, and
> > > > > > announcements are published.
> > > > > >
> > > > > > Apache Software Foundation press release:
> > > > > > http://globenewswire.com/news-release/2017/05/17/986839/0/en/The-Apache-Software-Foundation-Announces-Apache-Beam-v2-0-0.html
> > > > > >
> > > > > > Beam blog:
> > > > > > https://beam.apache.org/blog/2017/05/17/beam-first-stable-release.html
> > > > > >
> > > > > > Congratulations to everyone -- this is a really big milestone for the
> > > > > > project, and I'm proud to be a part of this great community.
> > > > >
> > > > > Davor
> > > > >
> > > >
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbono...@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > >
> > > --
> > > Thanks,
> > > Andrew
> > >
> > > Subscribe to my book: Streaming Data
> > > twitter: @itmdata
> > >
> >
>
-- 
Thanks,

Jesse


Re: Website homepage visual refresh

2017-05-16 Thread Jesse Anderson
Nice work!

On Tue, May 16, 2017 at 10:09 AM Davor Bonaci  wrote:

> I think it is great too -- since it is an obvious improvement, let's merge
> and iterate!
>
> On Tue, May 16, 2017 at 6:06 AM, Jean-Baptiste Onofré 
> wrote:
>
> > Hi Jeremy,
> >
> > great job ! I like the new look'n feel.
> >
> > Thanks !
> > Regards
> > JB
> >
> >
> > On 05/16/2017 07:44 AM, Jeremy Weinstein wrote:
> >
> >> Hi Beam community! fran...@apache.org and I have been working on a project
> >> to refresh the visual design of the Beam website. We have the following few
> >> goals:
> >>
> >> a) Breathe some life into the website homepage
> >> b) Simplify and clean up the project's CSS and various supporting files
> >> c) Make it a little more fun and engaging for new developers to start
> >> learning about Beam and enter into the content
> >> d) Help explain Beam to passive and interested non-users
> >>
> >> I'd like the community's help on a few things.
> >>
> >> 1) First and foremost, any feedback on the design update is welcome.
> >> 2) Secondly, there is a section on the homepage for testimonials/quotes
> >> from Beam users and/or organizations about their usage of Beam. We could
> >> set this up on a rotational basis to cycle through quotes, but to start, if
> >> anyone knows of any good quotes, posts, or tweets about Beam, I'd like to
> >> source those and place them into the "A collaborative effort" section.
> >> Please send them over to me and I can flow them into the build.
> >>
> >> We're hoping to refresh the site before or soon after the first stable
> >> release. For this first pass we've focused on the main landing page, but
> >> next up we'd like to improve several of the inside pages, as well as update
> >> the code toggles, and simplify a bit of the navigational structure.
> >>
> >> Sending this PR [1] out now as an FYI and to solicit feedback. We'll make a
> >> few more improvements based on suggestions, as well as a few tweaks to
> >> TODOs in the header and footer. Feedback is welcome - thanks everyone!
> >>
> >> [1] https://github.com/apache/beam-site/pull/244 +
> >> http://apache-beam-website-pull-requests.storage.googleapis.com/244/index.html
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
-- 
Thanks,

Jesse


Re: Towards a spec for robust streaming SQL, Part 1

2017-05-08 Thread Jesse Anderson
-Other dev lists

I'm just coming off speaking about Beam at GOTO Chicago and QCON Sao Paulo.
There was a ton of interest in Beam with SQL as a cross-framework way of
doing SQL.

There's some confusion where people think we're just doing a pass through
to the framework's SQL engine. We'll have to make sure we're clear on how
Beam's SQL works in the docs.

Thanks,

Jesse

On Mon, May 8, 2017 at 3:34 PM Tyler Akidau 
wrote:

> Any thoughts here Fabian? I'm planning to start sending out some more
> emails towards the end of the week.
>
> -Tyler
>
>
> On Wed, Apr 26, 2017 at 8:18 AM Tyler Akidau  wrote:
>
> > No worries, thanks for the heads up. Good luck wrapping all that stuff up.
> >
> > -Tyler
> >
> > On Tue, Apr 25, 2017 at 12:07 AM Fabian Hueske wrote:
> >
> >> Hi Tyler,
> >>
> >> thanks for pushing this effort and including the Flink list.
> >> I haven't managed to read the doc yet, but just wanted to thank you for the
> >> write-up and let you know that I'm very interested in this discussion.
> >>
> >> We are very close to the feature freeze of Flink 1.3 and I'm quite busy
> >> getting as many contributions as possible merged before the release is
> >> forked off. Once that has happened, I'll have more time to read and comment.
> >>
> >> Thanks,
> >> Fabian
> >>
> >>
> >> 2017-04-22 0:16 GMT+02:00 Tyler Akidau :
> >>
> >> > Good point, when you start talking about anything less than a full join,
> >> > triggers get involved to describe how one actually achieves the desired
> >> > semantics, and they may end up being tied to just one of the inputs (e.g.,
> >> > you may only care about the watermark for one side of the join). Am
> >> > expecting us to address these sorts of details more precisely in doc #2.
> >> >
> >> > -Tyler
> >> >
> >> > On Fri, Apr 21, 2017 at 2:26 PM Kenneth Knowles wrote:
> >> >
> >> > > There's something to be said about having different triggering depending on
> >> > > which side of a join data comes from, perhaps?
> >> > >
> >> > > (delightful doc, as usual)
> >> > >
> >> > > Kenn
> >> > >
> >> > > On Fri, Apr 21, 2017 at 1:33 PM, Tyler Akidau wrote:
> >> > >
> >> > > > Thanks for reading, Luke. The simple answer is that CoGBK is basically
> >> > > > flatten + GBK. Flatten is a non-grouping operation that merges the input
> >> > > > streams into a single output stream. GBK then groups the data within that
> >> > > > single union stream as you might otherwise expect, yielding a single table.
> >> > > > So I think it doesn't really impact things much. Grouping, aggregation,
> >> > > > window merging etc all just act upon the merged stream and generate what is
> >> > > > effectively a merged table.
> >> > > >
> >> > > > -Tyler
> >> > > >
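
The "CoGBK is basically flatten + GBK" description above can be illustrated with plain in-memory collections. This is a minimal sketch, not Beam code: the function names `flatten`, `group_by_key`, and `co_group_by_key` are hypothetical stand-ins for the Beam transforms, the sample data is invented, and windowing and triggering are left out entirely.

```python
from collections import defaultdict


def flatten(*tagged_streams):
    """Merge several (tag, stream) inputs into one stream of (key, (tag, value))."""
    for tag, stream in tagged_streams:
        for key, value in stream:
            yield key, (tag, value)


def group_by_key(stream):
    """Group a single stream of (key, value) pairs into {key: [values]}."""
    grouped = defaultdict(list)
    for key, value in stream:
        grouped[key].append(value)
    return dict(grouped)


def co_group_by_key(**inputs):
    """Flatten the tagged inputs, group once, then split each group by tag."""
    merged = group_by_key(flatten(*inputs.items()))
    result = {}
    for key, tagged_values in merged.items():
        per_tag = {tag: [] for tag in inputs}  # every input gets a (maybe empty) list
        for tag, value in tagged_values:
            per_tag[tag].append(value)
        result[key] = per_tag
    return result


emails = [("amy", "amy@example.com"), ("bob", "bob@example.com")]
phones = [("amy", "555-0100"), ("carl", "555-0101")]

joined = co_group_by_key(emails=emails, phones=phones)
print(joined["amy"])
```

Because the grouping step sees only the single merged stream, anything layered on top of grouping (aggregation, window merging) would act on that union, which is the point made in the reply.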
> >> > > > On Fri, Apr 21, 2017 at 12:36 PM Lukasz Cwik wrote:
> >> > > >
> >> > > > > The doc is a good read.
> >> > > > >
> >> > > > > I think you do a great job of explaining table -> stream, stream -> stream,
> >> > > > > and stream -> table when there is only one stream.
> >> > > > > But when there are multiple streams reading/writing to a table, how does
> >> > > > > that impact what occurs?
> >> > > > > For example, with CoGBK you have multiple streams writing to a table, how
> >> > > > > does that impact window merging?
> >> > > > >
> >> > > > > On Thu, Apr 20, 2017 at 5:57 PM, Tyler Akidau wrote:
> >> > > > >
> >> > > > > > Hello Beam, Calcite, and Flink dev lists!
> >> > > > > >
> >> > > > > > Apologies for the big cross post, but I thought this might be something all
> >> > > > > > three communities would find relevant.
> >> > > > > >
> >> > > > > > Beam is finally making progress on a SQL DSL utilizing Calcite, thanks to
> >> > > > > > Mingmin Xu. As you can imagine, we need to come to some conclusion about
> >> > > > > > how to elegantly support the full suite of streaming functionality in the
> >> > > > > > Beam model via Calcite SQL. You folks in the Flink community have been
> >> > > > > > pushing on this (e.g., adding windowing constructs, amongst others, thank
> >> > > > > > you! :-), but from my understanding we still don't have a full spec for how
> >> > > > > > to support robust streaming in SQL (including but not limited to, e.g., a
> >> > > > > > triggers analogue such as EMIT).
> >> > > > > >
> >> > > > > > I've been spending a lot of time thinking about this and have some opinions
> >> > > > > > about how I think it should look that I've already written down, so I
> >> > > > > > volunteered to try to drive forward agreement on a general streaming SQL
> >> > > > > > spec between our three communities (well, tec

Slack Invites

2017-05-04 Thread Jesse Anderson
Is it possible to change how Slack invites are handled? This might encourage
our community contributions.

Right now, people have to email in (causing extra dev@/user@ emails). I did
a quick search and found this  so people
can invite themselves.

Thanks,

Jesse
-- 
Thanks,

Jesse


Re: Congratulations Davor!

2017-05-04 Thread Jesse Anderson
Congrats!

On Thu, May 4, 2017, 6:20 AM Aljoscha Krettek  wrote:

> Congrats! :-)
> > On 4. May 2017, at 14:34, Kenneth Knowles wrote:
> >
> > Awesome!
> >
> > On Thu, May 4, 2017 at 1:19 AM, Ted Yu  wrote:
> >
> >> Congratulations, Davor!
> >>
> >> On Thu, May 4, 2017 at 12:45 AM, Aviem Zur  wrote:
> >>
> >>> Congrats Davor! :)
> >>>
> >>> On Thu, May 4, 2017 at 10:42 AM Jean-Baptiste Onofré wrote:
> >>>
> >>>> Congrats ! Well deserved ;)
> >>>>
> >>>> Regards
> >>>> JB
> >>>>
> >>>> On 05/04/2017 09:30 AM, Jason Kuster wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> The ASF has just published a blog post[1] welcoming new members of the
> >>>>> Apache Software Foundation, and our own Davor Bonaci is among them!
> >>>>> Congratulations and thank you to Davor for all of your work for the Beam
> >>>>> community, and the ASF at large. Well deserved.
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Jason
> >>>>>
> >>>>> [1] https://blogs.apache.org/foundation/entry/the-apache-software-foundation-welcomes
> >>>>>
> >>>>> P.S. I dug through the list to make sure I wasn't missing any other Beam
> >>>>> community members; if I have, my sincerest apologies and please recognize
> >>>>> them on this or a new thread.
> >>>>>
> >>>>
> >>>> --
> >>>> Jean-Baptiste Onofré
> >>>> jbono...@apache.org
> >>>> http://blog.nanthrax.net
> >>>> Talend - http://www.talend.com
> >>>>
> >>>
> >>
>
> --
Thanks,

Jesse


Re: What's the easiest way for an application to convert an Iterable to an UnboundedSource

2017-04-29 Thread Jesse Anderson
Here's some code that's similar to what you're asking for
https://github.com/eljefe6a/beamexample/blob/master/BeamTutorial/src/main/java/org/apache/beam/examples/tutorial/game/injector/InjectorBoundedSource.java

On Sat, Apr 29, 2017 at 1:23 PM Shen Li  wrote:

> Thanks!
>
> Shen
>
> On Sat, Apr 29, 2017 at 4:08 PM, Eugene Kirpichov <kirpic...@google.com.invalid> wrote:
>
> > Hi Shen,
> >
> > This is a very nice suggestion. Currently there is no way to do this,
> > probably because nobody thought of this before, but here are a few thoughts
> > anyway.
> >
> > - Both the Iterable and its Iterator will need to be Serializable, because
> > an UnboundedSource must be able to checkpoint and resume, to provide fault
> > tolerance in case the worker reading from it crashes. Do your iterables
> > satisfy this constraint?
> > - Reading will, of course, be sequential rather than parallel; processing
> > can still be parallelized, though. I suppose that's fine for your use case.
> > - Once you have that - wrapping it in an UnboundedSource will be possible and
> > an interesting exercise. And, I believe, wrapping it with a splittable DoFn
> > http://s.apache.org/splittable-do-fn will be much easier, though SDF
> > support is still inconsistent between runners (Direct works, Flink works,
> > Apex and Dataflow in review). It'd actually be a good test case of the ease
> > of use of the API.
> > On Sat, Apr 29, 2017 at 12:50 PM Shen Li  wrote:
> >
> > > It seems that Create.of(Iterable) can only create a BoundedSource. Is there
> > > a convenient way to read from an unbounded Iterable object without writing
> > > application code to wrap it into an UnboundedSource object?
> > >
> > >
> > > Thanks,
> > >
> > > Shen
> > >
> >
>
-- 
Thanks,

Jesse


Re: [PROPOSAL]: a new feature branch for SQL DSL

2017-04-05 Thread Jesse Anderson
That will be awesome!

On Wed, Apr 5, 2017, 2:05 PM Mingmin Xu  wrote:

> Hi all,
>
> I'm working on https://issues.apache.org/jira/browse/BEAM-301 (Add a Beam
> SQL DSL). The skeleton is already in
> https://github.com/XuMingmin/beam/tree/BEAM-301, using the Java SDK in the
> back-end. The goal is to provide a SQL interface over Beam, based on
> Calcite, including:
> 1). a translator to create a Beam pipeline from SQL
> (SELECT/INSERT/FILTER/GROUP-BY/JOIN/...);
> 2). an interactive client to submit queries; (All-SQL mode)
> 3). a SQL API which reduces the work needed to create a Pipeline; (Semi-SQL mode)
>
> As many folks are interested in this feature, I would like to create a
> feature branch to get more involvement.
> Looking for comments and feedback.
>
> Thanks!
> 
> Mingmin
>
-- 
Thanks,

Jesse


Re: [ANNOUNCEMENT] New committers, March 2017 edition!

2017-03-17 Thread Jesse Anderson
Welcome!

On Fri, Mar 17, 2017 at 2:19 PM Ted Yu  wrote:

> Congratulations!
>
> On Fri, Mar 17, 2017 at 2:13 PM, Davor Bonaci  wrote:
>
> > Please join me and the rest of Beam PMC in welcoming the following
> > contributors as our newest committers. They have significantly contributed
> > to the project in different ways, and we look forward to many more
> > contributions in the future.
> >
> > * Chamikara Jayalath
> > Chamikara has been contributing to Beam since inception, and previously to
> > Google Cloud Dataflow, accumulating a total of 51 commits (8,301 ++ / 3,892
> > --) since February 2016 [1]. He contributed broadly to the project, but
> > most significantly to the Python SDK, building the IO framework in this SDK
> > [2], [3].
> >
> > * Eugene Kirpichov
> > Eugene has been contributing to Beam since inception, and previously to
> > Google Cloud Dataflow, accumulating a total of 95 commits (22,122 ++ /
> > 18,407 --) since February 2016 [1]. In recent months, he’s been driving the
> > Splittable DoFn effort [4]. A true expert on IO subsystem, Eugene has
> > reviewed nearly every IO contributed to Beam. Finally, Eugene contributed
> > the Beam Style Guide, and is championing it across the project.
> >
> > * Ismaël Mejia
> > Ismaël has been contributing to Beam since mid-2016, accumulating a total
> > of 35 commits (3,137 ++ / 1,328 --) [1]. He authored the HBaseIO connector,
> > helped on the Spark runner, and contributed in other areas as well,
> > including cross-project collaboration with Apache Zeppelin. Ismaël reported
> > 24 Jira issues.
> >
> > * Aviem Zur
> > Aviem has been contributing to Beam since early fall, accumulating a total
> > of 49 commits (6,471 ++ / 3,185 --) [1]. He reported 43 Jira issues, and
> > resolved ~30 issues. Aviem improved the stability of the Spark runner a
> > lot, and introduced support for metrics. Finally, Aviem is championing
> > dependency management across the project.
> >
> > Congratulations to all four! Welcome!
> >
> > Davor
> >
> > [1]
> > https://github.com/apache/beam/graphs/contributors?from=2016-02-01&to=2017-03-17&type=c
> > [2]
> > https://github.com/apache/beam/blob/v0.6.0/sdks/python/apache_beam/io/iobase.py#L70
> > [3]
> > https://github.com/apache/beam/blob/v0.6.0/sdks/python/apache_beam/io/iobase.py#L561
> > [4] https://s.apache.org/splittable-do-fn
> >
>
-- 
Thanks,

Jesse


Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Jesse Anderson
Excellent!

On Wed, Mar 15, 2017, 6:13 AM Jean-Baptiste Onofré  wrote:

> Hi Ahmet,
>
> it seems Jira is not up to date: 0.6.0 version is not flagged as
> "Released".
>
> Can you fix that please ?
>
> Thanks !
> Regards
> JB
>
> On 03/15/2017 05:22 AM, Ahmet Altay wrote:
> > I'm happy to announce that we have unanimously approved this release.
> >
> > There are 7 approving votes, 4 of which are binding:
> > * Aljoscha Krettek
> > * Davor Bonaci
> > * Ismaël Mejía
> > * Jean-Baptiste Onofré
> > * Robert Bradshaw
> > * Ted Yu
> > * Tibor Kiss
> >
> > There are no disapproving votes.
> >
> > Thanks everyone!
> >
> > Ahmet
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
-- 
Thanks,

Jesse


Re: First stable release: version designation?

2017-03-01 Thread Jesse Anderson
I think 1.0 makes the most sense.

On Wed, Mar 1, 2017, 10:57 AM Davor Bonaci  wrote:

> The first stable release is our next major project-wide goal; see
> discussion in [1]. I've been referring to it as "the first stable release"
> for a long time, not "1.0.0" or "2.0.0" or "2017" or something else, to
> make sure we have an unbiased discussion and a consensus-based decision on
> this matter.
>
> I think that now is the time to consider the appropriate designation for
> our first stable release, and formally make a decision on it. Reasonable
> choices could be "1.0.0" or "2.0.0"; perhaps there are others.
>
> 1.0.0:
> * It logically comes after the current series, 0.x.y.
> * Most people would expect it, I suppose.
> * A possible confusion between Dataflow SDKs and Beam SDKs carrying the
> same number.
>
> 2.0.0:
> * Follows the pattern some other projects have taken -- continuing their
> version numbering scheme from their previous origin.
> * Better communicates project's roots, and degree of maturity.
> * May be unexpected to some users.
>
> I'd invite everyone to share their thoughts and preferences -- names are
> important and well correlated with success. Thanks!
>
> Davor
>
> [1]
> https://lists.apache.org/thread.html/c35067071aec9029d9100ae973c6299aa919c31d0de623ac367128e2@%3Cdev.beam.apache.org%3E
>


Re: Interest in a (virtual) contributor meeting?

2017-02-21 Thread Jesse Anderson
Sounds good.

On Tue, Feb 21, 2017, 7:19 PM Davor Bonaci  wrote:

> In the early days of the project, we held a few meetings for the
> initial community to get to know each other. Since then, the community has
> grown a huge amount, but we haven't organized any get-togethers.
>
> I wanted to gauge interest in a potential video conference call in the near
> future. No specific agenda -- simply a chance for everyone to meet others
> and see the faces of people we share a common passion with. Of course, an
> open discussion on any topic of interest to the contributor community is
> welcome. This would be strictly informal -- any decisions are reserved for
> the mailing list discussions.
>
> If you'd be interested in attending, please reply back. If there's
> sufficient interest, I'd be happy to try to organize something in the near
> future.
>
> Thanks!
>
> Davor
>


Re: ToString Method Name

2017-02-10 Thread Jesse Anderson
Ok will create a JIRA and PR.

On Fri, Feb 10, 2017 at 11:23 AM Eugene Kirpichov 
wrote:

> Yup, this is the part of the style guide I had in mind. As you probably
> know from the PR, I'm in favor of "elements" :)
> Note that here we're naming a family of transforms - elements, kvs and
> iterables (potentially more in the future), so the naming should be
> consistent between the different versions and future-proof.
>
> On Fri, Feb 10, 2017 at 10:16 AM Jesse Anderson 
> wrote:
>
> The ToString.of() violates the new transform rules and we need to choose a
> new name.
>
> Here is the method for reference:
>   /**
>    * Returns a {@code PTransform<PCollection, PCollection<String>>} which
>    * transforms each element of the input {@link PCollection} to a
>    * {@link String} using the {@link Object#toString} method.
>    */
>   public static PTransform<PCollection<?>, PCollection<String>> of() {
>     return new SimpleToString();
>   }
>
> Here are the possibilities we've had so far:
>
>    - elements
>    - default
>    - simple
>    - asString
>    - simpleString
>    - stringValue
>    - toString
>    - strings
>    - make
>
> I think default shouldn't be used as that's a keyword for Lambdas.
>
> Here is the guide that I think of() is violating (@eugene is that
> correct?):
> Name factory functions so that either the function name is a verb, or
> referring to the transform reads like a verb: e.g. MongoDbIO.read(),
> Flatten.iterables().
>
> What are everyone's thoughts? I'm thinking of going back to elements, make,
> or strings.
>
> Thanks,
>
> Jesse
>
>


ToString Method Name

2017-02-10 Thread Jesse Anderson
The ToString.of() violates the new transform rules and we need to choose a
new name.

Here is the method for reference:
  /**
   * Returns a {@code PTransform<PCollection, PCollection<String>>} which
   * transforms each element of the input {@link PCollection} to a
   * {@link String} using the {@link Object#toString} method.
   */
  public static PTransform<PCollection<?>, PCollection<String>> of() {
    return new SimpleToString();
  }

Here are the possibilities we've had so far:

   - elements
   - default
   - simple
   - asString
   - simpleString
   - stringValue
   - toString
   - strings
   - make

I think default shouldn't be used as that's a keyword for Lambdas.

Here is the guide that I think of() is violating (@eugene is that correct?):
Name factory functions so that either the function name is a verb, or
referring to the transform reads like a verb: e.g. MongoDbIO.read(),
Flatten.iterables().

What are everyone's thoughts? I'm thinking of going back to elements, make,
or strings.

Thanks,

Jesse
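
The naming rule quoted above can be tried out without Beam. In the sketch
below, ToStringSketch is a hypothetical stand-in built on
java.util.function.Function, not the real ToString transform, and the kvs
factory with its delimiter parameter is an assumption added only to show a
second member of the same family. It illustrates how a name like elements()
reads at the call site compared with of().

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for a transform family; not the Beam ToString class.
final class ToStringSketch {
  private ToStringSketch() {} // static factories only, no instances

  // Reads as "ToString of elements" at the call site.
  static Function<Object, String> elements() {
    return Object::toString;
  }

  // Same family, same naming pattern: "ToString of kvs".
  static Function<Map.Entry<?, ?>, String> kvs(String delimiter) {
    return kv -> kv.getKey() + delimiter + kv.getValue();
  }

  public static void main(String[] args) {
    List<String> out =
        Arrays.asList(1, 2, 3).stream().map(elements()).collect(Collectors.toList());
    System.out.println(out);
    System.out.println(kvs(",").apply(new SimpleEntry<>("a", 1)));
  }
}
```

A name such as default would not even compile here, since default is a
reserved word in Java.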


Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Jesse Anderson
I'm not. There was a decent amount of time between the first 0.8 and 0.9
release.

On Wed, Feb 8, 2017, 12:08 PM Raghu Angadi 
wrote:

> True.
>
> I was commenting on Kafka developers. I am surprised the api breakages
> didn't have any deprecation period at all.
>
> On Wed, Feb 8, 2017 at 12:02 PM, Xu Mingmin  wrote:
>
> > I tend to prefer having more versions supported. In our prod environment
> > there are 0.8, 0.9 and 0.10 for different teams. We should take care of
> > users who are on old versions.
> >
> >
> > On Wed, Feb 8, 2017 at 10:56 AM, Raghu Angadi wrote:
> >
> > > If we let the user pick their kafka version in their dependencies, the
> > > simplest fix is to broaden KafkaIO kafka-client dependency to something
> > > like [0.9.1, 0.11) (and handle the api incompatibility at runtime).
> > >
> > > It might not be long before we could drop 0.9 support. Looking at these
> > > api changes in Kafka client api without any deprecation warnings, I think
> > > Kafka does not expect older versions to linger much longer either.
> > >
> > > On Wed, Feb 8, 2017 at 10:31 AM, Raghu Angadi wrote:
> > >
> > > > What is the recommended way for users to bundle their app? The fix could
> > > > be as simple as letting the user set version in mvn property
> > > > ('kafka.client.version').
> > >
> >
>
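
The range [0.9.1, 0.11) quoted above uses Maven's interval notation: the
square bracket includes the lower bound and the round bracket excludes the
upper bound. The Java sketch below only illustrates that half-open check on
purely numeric dotted versions. VersionRangeSketch and its methods are
hypothetical names, and the comparison is deliberately simplified; Maven's
real version ordering also handles qualifiers and other cases.

```java
import java.util.Arrays;

// Simplified illustration only: compares purely numeric dotted versions.
final class VersionRangeSketch {
  private VersionRangeSketch() {}

  // Compares component by component; missing components count as 0.
  static int compare(String a, String b) {
    int[] x = parse(a);
    int[] y = parse(b);
    int n = Math.max(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int xi = i < x.length ? x[i] : 0;
      int yi = i < y.length ? y[i] : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  private static int[] parse(String version) {
    return Arrays.stream(version.split("\\.")).mapToInt(Integer::parseInt).toArray();
  }

  // True when lower <= version < upper, i.e. the range "[lower, upper)".
  static boolean inRange(String version, String lower, String upper) {
    return compare(version, lower) >= 0 && compare(version, upper) < 0;
  }

  public static void main(String[] args) {
    for (String v : new String[] {"0.8.2.2", "0.9.1", "0.10.1.0", "0.11.0.0"}) {
      System.out.println(v + " -> " + inRange(v, "0.9.1", "0.11"));
    }
  }
}
```

With such a range the user's own kafka-clients version wins as long as it
falls inside the interval, which is why the incompatible calls would then
have to be handled at runtime.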


Re: [PROPOSAL] New way of passing lambdas

2017-02-03 Thread Jesse Anderson
Excellent! On the redefinition of #2, people are used to it. With Hadoop
MapReduce, you had to define types in 1-3 different places.

While you're there, we also need a lambda that has access to the
Context object.

On Fri, Feb 3, 2017 at 11:03 AM Kenneth Knowles 
wrote:

> Hi all,
>
> Right now when you want to use MapElements (and friends) you have two
> options:
>
> 1. Use a SimpleFunction Java 7 style
>
> MapElements.via(new SimpleFunction<A, B>() {
>   @Override
>   public B apply(A input) {
>     return ...expr...;
>   }
> })
>
> and the type descriptors are automatically inferred
>
> 2. Use a lambda and withOutputType
>
> MapElements.via((A input) -> ...expr...)
>     .withOutputType(new TypeDescriptor<B>() {})
>
> MapElements.via((A input) -> ...expr...)
>     .withOutputType(TypeDescriptors.bs())
>
> and you might have a handy helper in TypeDescriptors (note the plural) or
> you might have to create your own, which is a weird pattern if you haven't
> seen it before. Both shown above.
>
> [PROPOSAL] Here is a neat trick for getting type information like in #1 but
> with a lambda like #2 and a bit less verbosity:
>
> MapElements.via(new SimpleFunction<A, B>((A input) -> ...expr...) {})
>
> I think we can add this. I lean towards this just being a third option, but
> could be easily swayed to drop #2.
>
> This is https://github.com/apache/beam/pull/1855 where you can see some
> unit tests demonstrating it more, and take a look at what it means for
> error checking, etc. It is backwards-compatible but still a change to a
> core API so deserves a thread on list.
>
> Thoughts?
>
> Kenn
>
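
Why the trailing {} in the proposal matters can be seen in plain Java: an
anonymous subclass records its generic superclass in class metadata, so the
type arguments can be read back by reflection, whereas a bare lambda does not
reliably expose them. The sketch below is a simplified illustration and not
Beam's implementation; TypedFn is a hypothetical class standing in for
SimpleFunction.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

// Hypothetical stand-in for SimpleFunction: wraps a lambda, and is abstract so
// callers must create an (anonymous) subclass that pins down A and B.
abstract class TypedFn<A, B> {
  private final Function<A, B> fn;

  protected TypedFn(Function<A, B> fn) {
    this.fn = fn;
  }

  B apply(A input) {
    return fn.apply(input);
  }

  // Reads the type arguments recorded on the anonymous subclass.
  // Assumes the instance is a direct subclass of TypedFn.
  Type[] typeArguments() {
    ParameterizedType superType =
        (ParameterizedType) getClass().getGenericSuperclass();
    return superType.getActualTypeArguments();
  }
}

class TypedFnDemo {
  static TypedFn<String, Integer> length() {
    // The trailing {} creates an anonymous subclass of TypedFn<String, Integer>.
    return new TypedFn<String, Integer>((String s) -> s.length()) {};
  }

  public static void main(String[] args) {
    TypedFn<String, Integer> fn = length();
    System.out.println(fn.apply("beam"));
    for (Type t : fn.typeArguments()) {
      System.out.println(t.getTypeName());
    }
  }
}
```

The cost is that the types are written out once on the constructor call,
which is the redefinition mentioned in the reply at the top of this thread.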


Re: PTransform style guide PR

2017-01-30 Thread Jesse Anderson
Thanks for putting that together. Does this mean you've volunteered to
referee bikeshedding?

On Mon, Jan 30, 2017 at 5:21 PM Eugene Kirpichov
 wrote:

> The initial PR has been merged and the style guide is live
> https://beam.apache.org/contribute/ptransform-style-guide/ - let us
> continue discussing and tweaking on this thread and via smaller PRs
> modifying the document.
>
> On Mon, Jan 30, 2017 at 7:50 AM Aljoscha Krettek 
> wrote:
>
> > Wow, that's a long read. But quite informative +1
> >
> > On Sat, 28 Jan 2017 at 06:54 Jean-Baptiste Onofré 
> wrote:
> >
> > > Hi Eugene,
> > >
> > > As said in the PR: great work and thanks a lot !
> > >
> > > I will take a complete look during the week end. I'm pretty sure it's a
> > > great guide as it's basically the result of our discussions and reviews
> > ;)
> > >
> > > Thanks again !
> > > Regards
> > > JB
> > >
> > > On 01/28/2017 06:21 AM, Eugene Kirpichov wrote:
> > > > Hello all,
> > > >
> > > > I just sent a pull request with a style guide for developers of new
> > > > PTransforms - intended for library writers, e.g. people who contribute
> > > > new connectors and other transforms to Beam. The guide is mainly based
> > > > on experience from reviewing connectors contributed by JB and others,
> > > > but it's intended to be generally applicable.
> > > >
> > > > It covers a variety of points - from code organization, to overall API
> > > > design, to error handling and so on. I expect most of it to be
> > > > non-controversial and just reflect the style of existing transforms in
> > > > Beam - however all of it is, of course, up to debate.
> > > >
> > > > https://github.com/apache/beam-site/pull/134/
> > > >
> > > > I'm hoping that this documentation will help guide new transform authors
> > > > in the right direction from the start, as well as make the job of
> > > > reviewers easier by providing a source they can link to and helping
> > > > focus the review on resolving more ambiguous points.
> > > >
> > > > (Note that, like all other documentation, this will evolve, so the goal
> > > > of the current PR is not to be complete, but to be a starting point)
> > > >
> > > > When the guide is ratified, I think it'll make sense to file JIRAs to
> > > > bring Beam in accordance with it - there are a few transforms that were
> > > > written before the best practices shaped up.
> > > >
> > > > Thanks!
> > > >
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> >
>


Re: Consistent Placement

2017-01-27 Thread Jesse Anderson
@dan I thought you were talking about the transform class definition:
  public static class GroupedValues<K, InputT, OutputT>
      extends PTransform<
          PCollection<? extends KV<K, ? extends Iterable<InputT>>>,
          PCollection<KV<K, OutputT>>> {


On Fri, Jan 27, 2017 at 11:30 AM Dan Halperin 
wrote:

> Hi Jesse, can you specifically say which functions on Combine and Count
> you're thinking of? I believe these transforms are consistent with the
> "principle of least visibility" -- make nothing more public than it needs
> to be.
>
> Look at Combine.globally
> <
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java#L124
> >.
> It returns a Globally, but that is because Globally has a useful public API
> surface, adding functions like asSingletonView. I believe similar reasoning
> applies to Count.
>
> However, for cases where the user will not further configure the return
> value, it makes sense to return the most public type we can.
>
> On Fri, Jan 27, 2017 at 6:39 AM, Jesse Anderson 
> wrote:
>
> > One con to making transform classes be private would be that it is a
> > breaking change. If anyone uses that class directly or extends that class,
> > we'd be breaking that.
> >
> > On Fri, Jan 27, 2017 at 9:37 AM Jesse Anderson 
> > wrote:
> >
> > > Continuing a discussion <https://github.com/apache/beam/pull/1830> Dan,
> > > Kenn, and I were having here since the bug is closed. They pointed out
> > > three things:
> > >
> > >    - Where the private constructor gets placed in the class
> > >    - Where the code samples of how to use the class get placed (in the
> > >    Transform versus in the static method)
> > >    - Are transform classes public or private
> > >
> > > I noted that those were inconsistent in the code. When I write a new
> > > transform, I use one of the already written transforms as the basis.
> > >
> > > Looking at Combine and Count:
> > >
> > >    - The private constructor is at the top of the class
> > >    - The code sample is in the Transform class
> > >    - The transform class is marked as public
> > >
> > > I don't have a strong opinion on private constructor and transform being
> > > marked as private/public. I think we should standardize on placing code
> > > samples in the static helper methods. That's where people are looking
> > > when using these methods.
> > >
> > > I think we need to do a general pass to make these consistent after we
> > > decide on how they should be done.
> > >
> > > Thanks,
> > >
> > > Jesse
> > >
> >
>


Re: Consistent Placement

2017-01-27 Thread Jesse Anderson
One con to making transform classes be private would be that it is a
breaking change. If anyone uses that class directly or extends that class,
we'd be breaking that.

On Fri, Jan 27, 2017 at 9:37 AM Jesse Anderson 
wrote:

> Continuing a discussion <https://github.com/apache/beam/pull/1830> Dan,
> Kenn, and I were having here since the bug is closed. They pointed out
> three things:
>
>    - Where the private constructor gets placed in the class
>    - Where the code samples of how to use the class get placed (in the
>    Transform versus in the static method)
>    - Are transform classes public or private
>
> I noted that those were inconsistent in the code. When I write a new
> transform, I use one of the already written transforms as the basis.
>
> Looking at Combine and Count:
>
>    - The private constructor is at the top of the class
>    - The code sample is in the Transform class
>    - The transform class is marked as public
>
> I don't have a strong opinion on private constructor and transform being
> marked as private/public. I think we should standardize on placing code
> samples in the static helper methods. That's where people are looking when
> using these methods.
>
> I think we need to do a general pass to make these consistent after we
> decide on how they should be done.
>
> Thanks,
>
> Jesse
>


Consistent Placement

2017-01-27 Thread Jesse Anderson
Continuing a discussion <https://github.com/apache/beam/pull/1830> Dan,
Kenn, and I were having here since the bug is closed. They pointed out
three things:

   - Where the private constructor gets placed in the class
   - Where the code samples of how to use the class get placed (in the
   Transform versus in the static method)
   - Are transform classes public or private

I noted that those were inconsistent in the code. When I write a new
transform, I use one of the already written transforms as the basis.

Looking at Combine and Count:

   - The private constructor is at the top of the class
   - The code sample is in the Transform class
   - The transform class is marked as public

I don't have a strong opinion on private constructor and transform being
marked as private/public. I think we should standardize on placing code
samples in the static helper methods. That's where people are looking when
using these methods.

I think we need to do a general pass to make these consistent after we
decide on how they should be done.

Thanks,

Jesse
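
The three placement questions above can be pictured with a small sketch that
does not depend on Beam. Words and Upper are hypothetical names, not classes
from the Beam codebase; the layout shown (private constructor first, usage
sample in the Javadoc of the static helper, nested class kept visible) is
only one possible convention, not a decision from this thread.

```java
import java.util.Locale;
import java.util.function.Function;

// Hypothetical container class; not taken from the Beam codebase.
final class Words {
  // Private constructor at the top: the container only hosts static factories.
  private Words() {}

  /**
   * Returns a function that upper-cases its input.
   *
   * <p>Usage sample, placed on the static helper that callers look at first:
   *
   * <pre>{@code
   * String shout = Words.upper().apply("beam");
   * }</pre>
   */
  static Upper upper() {
    return new Upper();
  }

  /** Kept visible so the factory can return a type callers may use directly. */
  static final class Upper implements Function<String, String> {
    private Upper() {}

    @Override
    public String apply(String input) {
      return input.toUpperCase(Locale.ROOT);
    }
  }

  public static void main(String[] args) {
    System.out.println(Words.upper().apply("beam"));
  }
}
```

Hiding Upper entirely would change the return type of upper(), which is the
kind of breaking change raised in the follow-up message.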


Re: [ANNOUNCEMENT] New committers, January 2017 edition!

2017-01-26 Thread Jesse Anderson
Welcome!

On Thu, Jan 26, 2017, 7:27 PM Davor Bonaci  wrote:

> Please join me and the rest of Beam PMC in welcoming the following
> contributors as our newest committers. They have significantly contributed
> to the project in different ways, and we look forward to many more
> contributions in the future.
>
> * Stas Levin
> Stas has contributed across the breadth of the project, from the Spark
> runner to the core pieces and Java SDK. Looking at code contributions
> alone, he authored 43 commits and reported 25 issues. Stas is very active
> on the mailing lists too, contributing to good discussions and proposing
> improvements to the Beam model.
>
> * Ahmet Altay
> Ahmet is a major contributor to the Python SDK, both in terms of design and
> code contribution. Looking at code contributions alone, he authored 98
> commits and reviewed dozens of pull requests. With Python SDK’s imminent
> merge to the master branch, Ahmet contributed towards establishing a new
> major component in Beam.
>
> * Pei He
> Pei has been contributing to Beam since its inception, accumulating a total
> of 118 commits since February. He has made several major contributions,
> most recently by redesigning IOChannelFactory / FileSystem APIs (in
> progress), which would extend Beam’s portability to many additional file
> systems and cloud providers.
>
> Congratulations to all three! Welcome!
>
> Davor
>


Re: On my activity at the project

2017-01-15 Thread Jesse Anderson
Thanks for all your hard work.

On Sun, Jan 15, 2017, 10:16 AM Jean-Baptiste Onofré  wrote:

> Hi Max,
>
> thanks for your commitment and your work on the project.
>
> Enjoy your time off.
>
> Regards
> JB
>
> On 01/14/2017 09:04 AM, Maximilian Michels wrote:
> > Dear Beamers,
> >
> > Thank you for the past year where we built this amazing community! It's
> > been exciting times.
> >
> > For the beginning of this year, I decided to take some time off. I'd
> > love to stay with the project and I think I'm going to be committing
> > more in the future. For the meantime, I'd like to pass on the component
> > lead of the Flink Runner to either Aljoscha or Stephan who are the most
> > experienced Flink committers of the Beam community.
> >
> > Please feel free to reach out to me in case anything pops up. It's great
> > to see Beam as an established top level project. Everyone at the Beam
> > community can be really proud!
> >
> > Best,
> > Max
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: Graduation!

2017-01-10 Thread Jesse Anderson
Excellent!

On Tue, Jan 10, 2017, 7:12 AM Jacky Li  wrote:

> Great work! Congratulations!
>
> Regards,
> Jacky
>
> > On Jan 10, 2017, at 7:11 PM, Sergio Fernández wrote:
> >
> > Congrats, guys!
> >
> > On Tue, Jan 10, 2017 at 12:07 PM, Davor Bonaci  wrote:
> >
> >> The ASF has publicly announced our graduation!
> >>
> >>
> >> https://blogs.apache.org/foundation/entry/the-apache-software-foundation-announces
> >>
> >> https://beam.apache.org/blog/2017/01/10/beam-graduates.html
> >>
> >> Graduation is a recognition of the community that we have built together.
> >> I am humbled to be part of this group and this project, and so excited for
> >> what we can accomplish together going forward.
> >>
> >> Davor
> >>
> >
> >
> > --
> > Sergio Fernández
> > Partner Technology Manager
> > Redlink GmbH
> > m: +43 6602747925
> > e: sergio.fernan...@redlink.co
> > w: http://redlink.co
>
>
>
>


Re: Better developer instructions for using Maven?

2017-01-05 Thread Jesse Anderson
@dan are you saying that mvn verify isn't doing checkstyle anymore? Some of
the checkstyles are still running for a few modules. Also, the contribution
docs will need to change. They say to run mvn verify before commits.

On Thu, Jan 5, 2017 at 9:25 AM Dan Halperin 
wrote:

> Several folks seem to have been confused after BEAM-246, where we moved the
> "slow things" into the release profile. I've started a discussion with
> https://github.com/apache/beam/pull/1740 to see if there are things we can
> do to fill these gaps.
>
> Would love folks to chime in with opinions.
>
> Dan
>
> On Wed, Jan 4, 2017 at 1:34 PM, Jesse Anderson 
> wrote:
>
> > @Eugene, yes that failed on the checkstyle.
> >
> > On Wed, Jan 4, 2017 at 1:27 PM Eugene Kirpichov
> >  wrote:
> >
> > > Try just -Prelease.
> > > On Wed, Jan 4, 2017 at 1:21 PM Jesse Anderson 
> > > wrote:
> > >
> > > > Fails because I don't have a secret key.
> > > >
> > > > > On Wed, Jan 4, 2017 at 1:03 PM Jean-Baptiste Onofré wrote:
> > > >
> > > > > Hi Jesse,
> > > > >
> > > > > Could you try the same with:
> > > > >
> > > > > mvn verify -Prelease,apache-release
> > > > >
> > > > > ?
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 01/04/2017 09:53 PM, Jesse Anderson wrote:
> > > > > > For some reason, running "mvn verify" isn't running checkstyle on
> > > > > > everything. I had checkstyle errors in beam-sdks-java-core that
> > > weren't
> > > > > > being found.
> > > > > >
> > > > > > I thought this was due to the extra parameters. I reran with the
> > > plain
> > > > > "mvn
> > > > > > verify" and it still didn't find them. From the output, it
> doesn't
> > > look
> > > > > > like they're being run at all.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jesse
> > > > > >
> > > > >
> > > > > --
> > > > > Jean-Baptiste Onofré
> > > > > jbono...@apache.org
> > > > > http://blog.nanthrax.net
> > > > > Talend - http://www.talend.com
> > > > >
> > > >
> > >
> >
>


Re: Checkstyle Errors

2017-01-04 Thread Jesse Anderson
@Eugene, yes that failed on the checkstyle.

On Wed, Jan 4, 2017 at 1:27 PM Eugene Kirpichov
 wrote:

> Try just -Prelease.
> On Wed, Jan 4, 2017 at 1:21 PM Jesse Anderson 
> wrote:
>
> > Fails because I don't have a secret key.
> >
> > On Wed, Jan 4, 2017 at 1:03 PM Jean-Baptiste Onofré 
> > wrote:
> >
> > > Hi Jesse,
> > >
> > > Could you try the same with:
> > >
> > > mvn verify -Prelease,apache-release
> > >
> > > ?
> > >
> > > Regards
> > > JB
> > >
> > > On 01/04/2017 09:53 PM, Jesse Anderson wrote:
> > > > For some reason, running "mvn verify" isn't running checkstyle on
> > > > everything. I had checkstyle errors in beam-sdks-java-core that
> weren't
> > > > being found.
> > > >
> > > > I thought this was due to the extra parameters. I reran with the
> plain
> > > "mvn
> > > > verify" and it still didn't find them. From the output, it doesn't
> look
> > > > like they're being run at all.
> > > >
> > > > Thanks,
> > > >
> > > > Jesse
> > > >
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> >
>


Re: Checkstyle Errors

2017-01-04 Thread Jesse Anderson
Fails because I don't have a secret key.

On Wed, Jan 4, 2017 at 1:03 PM Jean-Baptiste Onofré  wrote:

> Hi Jesse,
>
> Could you try the same with:
>
> mvn verify -Prelease,apache-release
>
> ?
>
> Regards
> JB
>
> On 01/04/2017 09:53 PM, Jesse Anderson wrote:
> > For some reason, running "mvn verify" isn't running checkstyle on
> > everything. I had checkstyle errors in beam-sdks-java-core that weren't
> > being found.
> >
> > I thought this was due to the extra parameters. I reran with the plain
> "mvn
> > verify" and it still didn't find them. From the output, it doesn't look
> > like they're being run at all.
> >
> > Thanks,
> >
> > Jesse
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Checkstyle Errors

2017-01-04 Thread Jesse Anderson
For some reason, running "mvn verify" isn't running checkstyle on
everything. I had checkstyle errors in beam-sdks-java-core that weren't
being found.

I thought this was due to the extra parameters. I reran with the plain "mvn
verify" and it still didn't find them. From the output, it doesn't look
like they're being run at all.

Thanks,

Jesse


Re: Running a Specific Test

2017-01-04 Thread Jesse Anderson
I just found out that running:
mvn verify -pl org.apache.beam:beam-sdks-java-core

Doesn't run the checkstyle for some reason. I'm not sure why and I had a
bunch of checkstyle errors.

On Wed, Jan 4, 2017 at 8:18 AM Jesse Anderson  wrote:

> The blog post <http://www.jesse-anderson.com/2017/01/maven-tips/> is up.
> It covers some of the common Maven commands I've needed when developing
> Beam. One is how to run a verify on a single module instead of everything.
> The second is how to run unit tests both in the IDE and from the command
> line in Maven.
>
> Thanks again to everyone for your help.
>
> On Thu, Dec 29, 2016 at 1:39 PM Dan Halperin 
> wrote:
>
> If you'd like early eyes on the blog post, let us know. Happy to review!
>
> One thing worth noting: we've tried to structure Beam so that the pain is
> mostly limited to the core. Many modules have module-specific unit tests
> that use DirectRunner directly. The module simply has a test dependency on
> DirectRunner, and unit tests that expect the DirectRunner to be there "just
> work". It's only the 2 modules the DirectRunner depends on directly
> (sdk-core and runners-core) that have this pain.
>
> Now for tests that should work on *any* runner, there is similar
> customization -- @RunnableOnService (today, some better name tomorrow) and
> runnable-on-service-tests, etc. etc.
>
> Dan
>
> On Thu, Dec 29, 2016 at 12:42 PM, Jesse Anderson 
> wrote:
>
> > Thanks to everyone for their help. I'm writing a blog about the various
> > Maven things you need to know with Beam.
> >
> > @Dan that command line worked. Thanks!
> >
> > On Thu, Dec 29, 2016 at 11:23 AM Stas Levin  wrote:
> >
> > > I believe you raise a good point :)
> > >
> > > On Thu, Dec 29, 2016 at 9:00 PM Dan Halperin wrote:
> > >
> > > > I suspect -- but may be wrong -- that the command line Stas gives will
> > > > use the *installed* version of beam-sdks-java-core. If you are iterating
> > > > on a @NeedsRunner test in the SDK core, you will either need to reinstall
> > > > it over and over again, or use `-am` to force recompilation of the core.
> > > >
> > > > Here is a command that works for me. Please criticize :)
> > > >
> > > > mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest -DfailIfNoTests=false
> > > > -pl runners/direct-java -am integration-test
> > > >
> > > > Note that this is an `integration-test`, not a `test` because it tests
> > > > the integration of the SDK with the DirectRunner:
> > > >
> > > > https://github.com/apache/beam/blob/master/runners/direct-java/pom.xml#L64
> > > >
> > > > Dan
> > > >
> > > > On Thu, Dec 29, 2016 at 10:53 AM, Stas Levin 
> > > wrote:
> > > >
> > > > > P.S
> > > > > You can also do this from the main directory (without cd-ing into the
> > > > > direct-runner):
> > > > >
> > > > > "mvn test -Dtest=RegexTest
> > > > > -DdependenciesToScan=org.apache.beam:beam-sdks-java-core -pl
> > > > > runners/direct-java"
> > > > >
> > > > > On Thu, Dec 29, 2016 at 8:50 PM Stas Levin 
> > > wrote:
> > > > >
> > > > > > Once you "cd" into "runners/direct-java" you can use:
> > > > > >
> > > > > > "mvn test -Dtest=RegexTest
> > > > > > -DdependenciesToScan=org.apache.beam:beam-sdks-java-core"
> > > > > >
> > > > > > -Stas
> > > > > >
> > > > > > On Thu, Dec 29, 2016 at 8:27 PM Jesse Anderson <je...@smokinghand.com>
> > > > > > wrote:
> > > > > >
> > > > > > I tried that one already. It gives a no tests run error. If you bypass
> > > > > > that error with -DfailIfNoTests=false, no tests get run at all.
> > > > > >
> > > > > > On Thu, Dec 29, 2016 at 10:20 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jesse
> > > > > > >
> > > > > > > mvn test -Dtest=RegexTest
> > > > > > >
> > > > > > > Should work
> > > > > > >
> > > > > > > Don't forget the test goal. And no need to provide the fqcn.
> > > > > > >
> > > > > > > Regards
> > > > > > > JB
> > > > > > >
> > > > > > > On Dec 29, 2016, 18:55, at 18:55, Jesse Anderson <je...@smokinghand.com>
> > > > > > > wrote:
> > > > > > > >Does anyone know the Maven way to run a specific unit test with Beam?
> > > > > > > >I've tried:
> > > > > > > >mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest
> > > > > > > >-DfailIfNoTests=false
> > > > > > > >-Dgroups="org.apache.beam.sdk.testing.NeedsRunner" -pl
> > > > > > > >org.apache.beam:beam-sdks-java-core test
> > > > > > > >
> > > > > > > >The test still doesn't run. Does anyone know what I'm missing?
> > > > > > > >Thanks,
> > > > > > > >
> > > > > > > >Jesse
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>


Re: Running a Specific Test

2017-01-04 Thread Jesse Anderson
The blog post <http://www.jesse-anderson.com/2017/01/maven-tips/> is up. It
covers some of the common Maven commands I've needed when developing Beam.
The first is how to run `verify` on a single module instead of the whole
project. The second is how to run unit tests both in the IDE and from the
command line with Maven.

Thanks again to everyone for your help.

On Thu, Dec 29, 2016 at 1:39 PM Dan Halperin 
wrote:

> If you'd like early eyes on the blog post, let us know. Happy to review!
>
> One thing worth noting: we've tried to structure Beam so that the pain is
> mostly limited to the core. Many modules have module-specific unit tests
> that use DirectRunner directly. The module simply has a test dependency on
> DirectRunner, and unit tests that expect the DirectRunner to be there "just
> work". It's only the 2 modules the DirectRunner depends on directly
> (sdk-core and runners-core) that have this pain.
>
> Now for tests that should work on *any* runner, there is similar
> customization -- @RunnableOnService (today, some better name tomorrow) and
> runnable-on-service-tests, etc. etc.
>
> Dan
>
> On Thu, Dec 29, 2016 at 12:42 PM, Jesse Anderson 
> wrote:
>
> > Thanks to everyone for their help. I'm writing a blog about the various
> > Maven things you need to know with Beam.
> >
> > @Dan that command line worked. Thanks!
> >
> > On Thu, Dec 29, 2016 at 11:23 AM Stas Levin  wrote:
> >
> > > I believe you raise a good point :)
> > >
> > > On Thu, Dec 29, 2016 at 9:00 PM Dan Halperin
> > > wrote:
> > >
> > > > I suspect -- but may be wrong -- that the command line Stas gives
> will
> > > use
> > > > the *installed* version of beam-sdks-java-core. If you are iterating
> > on a
> > > > @NeedsRunner test in the SDK core, you will either need to reinstall
> it
> > > > over and over again, or use `-am` to force recompilation of the core.
> > > >
> > > > Here is a command that works for me. Please criticize :)
> > > >
> > > > mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest
> > -DfailIfNoTests=false
> > > > -pl runners/direct-java -am integration-test
> > > >
> > > > Note that this is an `integration-test`, not a `test` because it
> tests
> > > the
> > > > integration of the SDK with the DirectRunner:
> > > >
> > > https://github.com/apache/beam/blob/master/runners/direct-
> > java/pom.xml#L64
> > > >
> > > > Dan
> > > >
> > > > On Thu, Dec 29, 2016 at 10:53 AM, Stas Levin 
> > > wrote:
> > > >
> > > > > P.S
> > > > > You can also do this from the main directory (without cd-ing into
> the
> > > > > direct-runner):
> > > > >
> > > > > "mvn test -Dtest=RegexTest
> > > > > -DdependenciesToScan=org.apache.beam:beam-sdks-java-core -pl
> > > > > runners/direct-java"
> > > > >
> > > > > On Thu, Dec 29, 2016 at 8:50 PM Stas Levin 
> > > wrote:
> > > > >
> > > > > > Once you "cd" into "runners/direct-java" you can use:
> > > > > >
> > > > > > "mvn test -Dtest=RegexTest
> > > > > > -DdependenciesToScan=org.apache.beam:beam-sdks-java-core"
> > > > > >
> > > > > > -Stas
> > > > > >
> > > > > > On Thu, Dec 29, 2016 at 8:27 PM Jesse Anderson <
> > > je...@smokinghand.com>
> > > > > > wrote:
> > > > > >
> > > > > > I tried that one already. It gives a no tests run error. If you
> > > bypass
> > > > > that
> > > > > > error with -DfailIfNoTests=false, no tests get run at all.
> > > > > >
> > > > > > On Thu, Dec 29, 2016 at 10:20 AM Jean-Baptiste Onofré <
> > > j...@nanthrax.net
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jesse
> > > > > > >
> > > > > > > Mvn test -Dtest=RegexTest
> > > > > > >
> > > > > > > Should work
> > > > > > >
> > > > > > > Don't forget the test goal. And no need to provide the fqcn.
> > > > > > >
> > > > > > > Regards
> > > > > > > > JB
> > > > > > >
> > > > > > > On Dec 29, 2016, 18:55, at 18:55, Jesse Anderson <
> > > > > je...@smokinghand.com>
> > > > > > > wrote:
> > > > > > > >Does anyone know the Maven way to run a specific unit test
> with
> > > > Beam?
> > > > > > > >I've
> > > > > > > >tried:
> > > > > > > >mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest
> > > > > > > >-DfailIfNoTests=false
> > > > > > > >-Dgroups="org.apache.beam.sdk.testing.NeedsRunner" -pl
> > > > > > > >org.apache.beam:beam-sdks-java-core test
> > > > > > > >
> > > > > > > >The test still doesn't run. Does anyone know what I'm missing?
> > > > > > > >
> > > > > > > >Thanks,
> > > > > > > >
> > > > > > > >Jesse
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Apache News Round-up

2016-12-30 Thread Jesse Anderson
Congrats JB!

On Fri, Dec 30, 2016, 9:23 PM Davor Bonaci  wrote:

> I stumbled across Apache News Round-up for this week [1], and our own
> Jean-Baptiste Onofré is noted as one of the top five committers in 2016
> with 1,825 commits across all Apache projects.
>
> Congratulations JB -- this is awesome!
>
> Davor
>
> [1] https://blogs.apache.org/foundation/entry/the_apache_news_round_up118
>


Re: PCollection to PCollection Conversion

2016-12-29 Thread Jesse Anderson
Sounds good to me too. @vikas, can you start modifying the PR's code along
the lines Dan suggested? That is, clean up the PR to be more future-proof
for now: make `ToString` itself not a PTransform, and instead have
ToString.create() return ToString.Default, a private class implementing
what ToString is now (a PTransform wrapping MapElements).
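
For concreteness, here is a rough sketch of that shape in plain Java with no
Beam dependency. `java.util.function.Function` is only a stand-in for
PTransform, and the body of `Default` is my assumption based on the "simply
does input.toString()" description; the actual PR may look different.

```java
import java.util.function.Function;

// Sketch only: Function<T, String> is a stand-in for Beam's PTransform.
final class ToString {
  private ToString() {} // not instantiable; callers go through create()

  // Callers see only the interface type, so the implementation class
  // can be replaced or joined by other variants later.
  static <T> Function<T, String> create() {
    return new Default<>();
  }

  // Private implementation: converts each element with toString().
  private static final class Default<T> implements Function<T, String> {
    @Override
    public String apply(T input) {
      return input.toString();
    }
  }
}
```

Because callers only ever see the return type of create(), Default can later
be swapped out or joined by other variants without breaking user code.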

On Thu, Dec 29, 2016 at 4:00 PM Ben Chambers 
wrote:

> Dan's proposal to move forward with a simple (future-proofed) version of
> the ToString transform and Javadoc, and add specific features via follow-up
> PRs.
>
> On Thu, Dec 29, 2016 at 3:53 PM Jesse Anderson 
> wrote:
>
> > @Ben which idea do you like?
> >
> > > On Thu, Dec 29, 2016 at 3:20 PM Ben Chambers
> > > wrote:
> >
> > > I like that idea, with the caveat that we should probably come up with
> a
> > > better name. Perhaps "ToString.elements()" and ToString.Elements or
> > > something? Calling one the "default" and using "create" for it seems
> > > moderately non-future proof.
> > >
> > > > On Thu, Dec 29, 2016 at 3:17 PM Dan Halperin
> > > > wrote:
> > >
> > > > On Thu, Dec 29, 2016 at 2:10 PM, Jesse Anderson <
> je...@smokinghand.com
> > >
> > > > wrote:
> > > >
> > > > > I agree MapElements isn't hard to use. I think there is a demand
> for
> > > this
> > > > > built-in conversion.
> > > > >
> > > > > My thought on the formatter is that, worst case, we could do
> runtime
> > > type
> > > > > checking. It would be ugly and not as performant, but it should
> work.
> > > As
> > > > > we've said, we'd point them to MapElements for better code. We'd
> > write
> > > > the
> > > > > JavaDoc accordingly.
> > > > >
> > > >
> > > > I think it will be good to see these proposals in PR form. I would
> stay
> > > far
> > > > away from reflection and varargs if possible, but properly-typed bits
> > of
> > > > code (possibly exposed as SerializableFunctions in ToString?) would
> > > > probably make sense.
> > > >
> > > > In the short-term, I can't find anyone arguing against a
> > > ToString.create()
> > > > that simply does input.toString().
> > > >
> > > > To get started, how about we ask Vikas to clean up the PR to be more
> > > > future-proof for now? Aka make `ToString` itself not a PTransform,
> but
> > > > instead ToString.create() returns ToString.Default which is a private
> > > class
> > > > implementing what ToString is now (PTransform, wrapping
> > > > MapElements).
> > > >
> > > > Then we can send PRs adding new features to that.
> > > >
> > > > IME and to Ben's point, these will mostly be used in development.
> Some
> > of
> > > > > our assumptions will break down when programmers aren't the ones
> > using
> > > > > Beam. I can see from the user traffic already that not everyone
> using
> > > > Beam
> > > > > is a programmer and they'll need classes like this to be
> productive.
> > > >
> > > >
> > > > > > On Thu, Dec 29, 2016 at 1:46 PM Dan Halperin
> > > > > > wrote:
> > > > >
> > > > > On Thu, Dec 29, 2016 at 1:36 PM, Jesse Anderson <
> > je...@smokinghand.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > I prefer JB's take. I think there should be three overloaded
> > methods
> > > on
> > > > > the
> > > > > > class. I like Vikas' name ToString. The methods for a simple
> > > conversion
> > > > > > should be:
> > > > > >
> > > > > > ToString.strings() - Outputs the .toString() of the objects in
> the
> > > > > > PCollection
> > > > > > ToString.strings(String delimiter) - Outputs the .toString() of
> > KVs,
> > > > > Lists,
> > > > > > etc with the delimiter between every entry
> > > > > > ToString.formatted(String format) - Outputs the formatted
> > > > > > <
> > https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html>
> > > > > > string
> > > > > > with the object passed in. For obje

Re: PCollection to PCollection Conversion

2016-12-29 Thread Jesse Anderson
@Ben which idea do you like?

On Thu, Dec 29, 2016 at 3:20 PM Ben Chambers 
wrote:

> I like that idea, with the caveat that we should probably come up with a
> better name. Perhaps "ToString.elements()" and ToString.Elements or
> something? Calling one the "default" and using "create" for it seems
> moderately non-future proof.
>
> On Thu, Dec 29, 2016 at 3:17 PM Dan Halperin 
> wrote:
>
> > On Thu, Dec 29, 2016 at 2:10 PM, Jesse Anderson 
> > wrote:
> >
> > > I agree MapElements isn't hard to use. I think there is a demand for
> this
> > > built-in conversion.
> > >
> > > My thought on the formatter is that, worst case, we could do runtime
> type
> > > checking. It would be ugly and not as performant, but it should work.
> As
> > > we've said, we'd point them to MapElements for better code. We'd write
> > the
> > > JavaDoc accordingly.
> > >
> >
> > I think it will be good to see these proposals in PR form. I would stay
> far
> > away from reflection and varargs if possible, but properly-typed bits of
> > code (possibly exposed as SerializableFunctions in ToString?) would
> > probably make sense.
> >
> > In the short-term, I can't find anyone arguing against a
> ToString.create()
> > that simply does input.toString().
> >
> > To get started, how about we ask Vikas to clean up the PR to be more
> > future-proof for now? Aka make `ToString` itself not a PTransform,  but
> > instead ToString.create() returns ToString.Default which is a private
> class
> > implementing what ToString is now (PTransform, wrapping
> > MapElements).
> >
> > Then we can send PRs adding new features to that.
> >
> > IME and to Ben's point, these will mostly be used in development. Some of
> > > our assumptions will break down when programmers aren't the ones using
> > > Beam. I can see from the user traffic already that not everyone using
> > Beam
> > > is a programmer and they'll need classes like this to be productive.
> >
> >
> > > > On Thu, Dec 29, 2016 at 1:46 PM Dan Halperin
> > > > wrote:
> > > >
> > > > On Thu, Dec 29, 2016 at 1:36 PM, Jesse Anderson
> > > > wrote:
> > >
> > > > I prefer JB's take. I think there should be three overloaded methods
> on
> > > the
> > > > class. I like Vikas' name ToString. The methods for a simple
> conversion
> > > > should be:
> > > >
> > > > ToString.strings() - Outputs the .toString() of the objects in the
> > > > PCollection
> > > > ToString.strings(String delimiter) - Outputs the .toString() of KVs,
> > > Lists,
> > > > etc with the delimiter between every entry
> > > > ToString.formatted(String format) - Outputs the formatted
> > > > <https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html>
> > > > string
> > > > with the object passed in. For objects made up of different parts
> like
> > > KVs,
> > > > each one is passed in as separate toString() of a varargs.
> > > >
> > >
> > > Riffing a little, with some types:
> > >
> > > ToString.of() -- PTransform that is equivalent to a ParDo
> > > that takes in a T and outputs T.toString().
> > >
> > > ToString.kv(String delimiter) -- PTransform, String> that
> > is
> > > equivalent to a ParDo that takes in a KV and outputs
> > > kv.getKey().toString() + delimiter + kv.getValue().toString()
> > >
> > > ToString.iterable(String delimiter) -- PTransform > Iterable,
> > > String> that is equivalent to a ParDo that takes in an Iterable and
> > > outputs the iterable[0] + delimiter + iterable[1] + delimiter + ... +
> > > delimiter + iterable[N-1]
> > >
> > > ToString.custom(SerializableFunction formatter) ?
> > >
> > > The last one is just MapElement.via, except you don't need to set the
> > > output type.
> > >
> > > I don't see a way to make the generic .formatted() that you propose
> that
> > > just works with anything "made of different parts".
> > >
> > > I think this adding too many overrides beyond "of" and "custom" is
> > opening
> > > up a Pandora's Box. the KV one might want to have left and right
> > > delimiters, might want to take custom formatters for K and V, etc. etc.
> > The

Re: PCollection to PCollection Conversion

2016-12-29 Thread Jesse Anderson
I agree MapElements isn't hard to use. I think there is a demand for this
built-in conversion.

My thought on the formatter is that, worst case, we could do runtime type
checking. It would be ugly and not as performant, but it should work. As
we've said, we'd point them to MapElements for better code. We'd write the
JavaDoc accordingly.
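
To make the trade-off concrete, here is a rough plain-Java sketch (no Beam
dependency). `FormatterTradeoff`, `formatted`, and `typed` are made-up names
for illustration, and `Map.Entry` stands in for KV. The first method discovers
the element's shape at runtime; the second is the typed style we'd point
people to with MapElements.

```java
import java.util.Map;
import java.util.function.Function;

// Sketch only: hypothetical names, Map.Entry stands in for Beam's KV.
final class FormatterTradeoff {
  private FormatterTradeoff() {}

  // Runtime-checked: the element's shape is inspected on every call.
  static String formatted(String format, Object element) {
    if (element instanceof Map.Entry) {
      Map.Entry<?, ?> kv = (Map.Entry<?, ?>) element;
      // Key and value are passed as two separate format arguments.
      return String.format(
          format, String.valueOf(kv.getKey()), String.valueOf(kv.getValue()));
    }
    // Anything else is passed as a single format argument.
    return String.format(format, String.valueOf(element));
  }

  // Typed alternative: the compiler checks the element type up front.
  static Function<Map.Entry<String, Integer>, String> typed() {
    return kv -> kv.getKey() + "=" + kv.getValue();
  }
}
```

A format string that expects more parts than the element has, such as "%s=%s"
applied to a plain string, only fails when the element is processed
(String.format throws MissingFormatArgumentException), which is the kind of
thing the JavaDoc would need to warn about.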

IME and to Ben's point, these will mostly be used in development. Some of
our assumptions will break down when programmers aren't the ones using
Beam. I can see from the user traffic already that not everyone using Beam
is a programmer and they'll need classes like this to be productive.

On Thu, Dec 29, 2016 at 1:46 PM Dan Halperin 
wrote:

On Thu, Dec 29, 2016 at 1:36 PM, Jesse Anderson 
wrote:

> I prefer JB's take. I think there should be three overloaded methods on
the
> class. I like Vikas' name ToString. The methods for a simple conversion
> should be:
>
> ToString.strings() - Outputs the .toString() of the objects in the
> PCollection
> ToString.strings(String delimiter) - Outputs the .toString() of KVs,
Lists,
> etc with the delimiter between every entry
> ToString.formatted(String format) - Outputs the formatted
> <https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html>
> string
> with the object passed in. For objects made up of different parts like
KVs,
> each one is passed in as separate toString() of a varargs.
>

Riffing a little, with some types:

ToString.of() -- PTransform<T, String> that is equivalent to a ParDo
that takes in a T and outputs T.toString().

ToString.kv(String delimiter) -- PTransform<KV<K, V>, String> that is
equivalent to a ParDo that takes in a KV<K, V> and outputs
kv.getKey().toString() + delimiter + kv.getValue().toString()

ToString.iterable(String delimiter) -- PTransform<Iterable<T>,
String> that is equivalent to a ParDo that takes in an Iterable<T> and
outputs the iterable[0] + delimiter + iterable[1] + delimiter + ... +
delimiter + iterable[N-1]

ToString.custom(SerializableFunction<T, String> formatter) ?

The last one is just MapElements.via, except you don't need to set the
output type.
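
A rough plain-Java sketch of those four signatures, with no Beam dependency.
Assumptions: `Function` stands in for both PTransform and
SerializableFunction, `Map.Entry` stands in for KV, and the class name
`ToStringRiff` is made up.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.function.Function;

// Sketch only: Function stands in for PTransform, Map.Entry for Beam's KV.
final class ToStringRiff {
  private ToStringRiff() {}

  // of(): T -> T.toString()
  static <T> Function<T, String> of() {
    return element -> element.toString();
  }

  // kv(delimiter): key.toString() + delimiter + value.toString()
  static <K, V> Function<Map.Entry<K, V>, String> kv(String delimiter) {
    return kv -> kv.getKey().toString() + delimiter + kv.getValue().toString();
  }

  // iterable(delimiter): items joined by the delimiter.
  // An empty iterable yields the empty string in this sketch.
  static <T> Function<Iterable<T>, String> iterable(String delimiter) {
    return items -> {
      StringJoiner joiner = new StringJoiner(delimiter);
      for (T item : items) {
        joiner.add(item.toString());
      }
      return joiner.toString();
    };
  }

  // custom(formatter): the caller supplies the whole conversion.
  static <T> Function<T, String> custom(Function<T, String> formatter) {
    return formatter;
  }
}
```

Even in this small form, the empty-iterable behavior is a choice somebody has
to make, which is the sort of configuration question each extra variant
brings with it.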

I don't see a way to make the generic .formatted() that you propose that
just works with anything "made of different parts".

I think adding too many overrides beyond "of" and "custom" is opening
up a Pandora's Box. The KV one might want to have left and right
delimiters, might want to take custom formatters for K and V, etc. etc. The
iterable one might want to have a special configuration for an empty
iterable. So I'm inclined towards simplicity with the awareness that
MapElements.via is just not that hard to use.

Dan


>
> I think doing these three methods would cover every simple and advanced
> "simple conversions." As JB says, we'll need other specific converters for
> other formats like XML.
>
> I'd really like to see this class in the next version of Beam. What does
> everyone think of the class name, methods name, and method operations so
we
> can have Vikas finish up?
>
> Thanks,
>
> Jesse
>
> On Wed, Dec 28, 2016 at 12:28 PM Jean-Baptiste Onofré 
> wrote:
>
> > Hi Vikas,
> >
> > did you take a look at:
> >
> >
> > https://github.com/jbonofre/beam/tree/DATAFORMAT/sdks/
> java/extensions/dataformat
> >
> > You can see KV2String and ToString could be part of this extension.
> > I'm also using JAXB for XML and Jackson for JSON
> > marshalling/unmarshalling. I'm planning to deal with Avro
> (IndexedRecord).
> >
> > Regards
> > JB
> >
> > On 12/28/2016 08:37 PM, Vikas Kedigehalli wrote:
> > > Hi All,
> > >
> > >   Not being aware of the discussion here, I sent out a PR
> > > <https://github.com/apache/beam/pull/1704> but JB and others directed
> > me to
> > > this thread. Having converted PCollection to PCollection
> > several
> > > times, I feel something like 'ToString' transform is common enough to
> be
> > > part of the core. What do you all think?
> > >
> > > Also, if someone else is already working on or interested in tackling
> > this,
> > > then I am happy to discard the PR.
> > >
> > > Regards,
> > > Vikas
> > >
> > > On Tue, Dec 13, 2016 at 1:56 AM, Amit Sela 
> wrote:
> > >
> > >> It seems that there were a lot of good points raised here, and I tend
> to
> > >> agree that something as trivial and lean as "ToString" should be a
> part
> > of
> > >> core.
> > >> I'm particularly fond of makeString(prefix, toString, suffix) in
> various
> > >> combinations (Scala-like).

Re: PCollection to PCollection Conversion

2016-12-29 Thread Jesse Anderson
I prefer JB's take. I think there should be three overloaded methods on the
class. I like Vikas' name ToString. The methods for a simple conversion
should be:

ToString.strings() - Outputs the .toString() of the objects in the
PCollection
ToString.strings(String delimiter) - Outputs the .toString() of KVs, Lists,
etc with the delimiter between every entry
ToString.formatted(String format) - Outputs the formatted
<https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html> string
with the object passed in. For objects made up of different parts like KVs,
each one is passed in as separate toString() of a varargs.
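
As a rough illustration of what I mean, here is a plain-Java sketch with no
Beam dependency. Assumptions: `ToStringProposal` is a made-up name,
`Function` stands in for the transform, `Map.Entry` stands in for KV, and
splitting an element into parts at runtime is just one possible way to
implement it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch only: plain functions stand in for transforms, Map.Entry for KV.
final class ToStringProposal {
  private ToStringProposal() {}

  // strings(): the element's own toString().
  static Function<Object, String> strings() {
    return Object::toString;
  }

  // strings(delimiter): join the parts of a KV or list with the delimiter.
  static Function<Object, String> strings(String delimiter) {
    return element -> String.join(delimiter, parts(element));
  }

  // formatted(format): pass each part's string form to String.format.
  static Function<Object, String> formatted(String format) {
    return element -> String.format(format, parts(element).toArray());
  }

  // Splits an element into the string form of its parts, decided at runtime.
  private static List<String> parts(Object element) {
    List<String> out = new ArrayList<>();
    if (element instanceof Map.Entry) {
      Map.Entry<?, ?> kv = (Map.Entry<?, ?>) element;
      out.add(String.valueOf(kv.getKey()));
      out.add(String.valueOf(kv.getValue()));
    } else if (element instanceof Iterable) {
      for (Object item : (Iterable<?>) element) {
        out.add(String.valueOf(item));
      }
    } else {
      out.add(String.valueOf(element));
    }
    return out;
  }
}
```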

I think these three methods would cover both the basic and the more advanced
"simple conversions." As JB says, we'll need other specific converters for
other formats like XML.

I'd really like to see this class in the next version of Beam. What does
everyone think of the class name, method names, and method behavior so we
can have Vikas finish up?

Thanks,

Jesse

On Wed, Dec 28, 2016 at 12:28 PM Jean-Baptiste Onofré 
wrote:

> Hi Vikas,
>
> did you take a look at:
>
>
> https://github.com/jbonofre/beam/tree/DATAFORMAT/sdks/java/extensions/dataformat
>
> You can see KV2String and ToString could be part of this extension.
> I'm also using JAXB for XML and Jackson for JSON
> marshalling/unmarshalling. I'm planning to deal with Avro (IndexedRecord).
>
> Regards
> JB
>
> On 12/28/2016 08:37 PM, Vikas Kedigehalli wrote:
> > Hi All,
> >
> >   Not being aware of the discussion here, I sent out a PR
> > <https://github.com/apache/beam/pull/1704> but JB and others directed
> me to
> > this thread. Having converted PCollection to PCollection
> several
> > times, I feel something like 'ToString' transform is common enough to be
> > part of the core. What do you all think?
> >
> > Also, if someone else is already working on or interested in tackling
> this,
> > then I am happy to discard the PR.
> >
> > Regards,
> > Vikas
> >
> > On Tue, Dec 13, 2016 at 1:56 AM, Amit Sela  wrote:
> >
> >> It seems that there were a lot of good points raised here, and I tend to
> >> agree that something as trivial and lean as "ToString" should be a part
> of
> >> core.
> >> I'm particularly fond of makeString(prefix, toString, suffix) in various
> >> combinations (Scala-like).
> >> For "fromString", I think JB has a good point leveraging JAXB and
> Jackson -
> >> though I think this should be in extensions as it is not as lean as
> >> toString.
> >>
> >> Thanks,
> >> Amit
> >>
> >> On Wed, Nov 30, 2016 at 5:13 AM Jean-Baptiste Onofré 
> >> wrote:
> >>
> >>> Hi Jesse,
> >>>
> >>> yes, I started something there (using JAXB and Jackson). Let me polish
> >>> and push.
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 11/29/2016 10:00 PM, Jesse Anderson wrote:
> >>>> I went through the string conversions. Do you have an example of
> >> writing
> >>>> out XML/JSON/etc too?
> >>>>
> >>>> On Tue, Nov 29, 2016 at 3:46 PM Jean-Baptiste Onofré  >
> >>>> wrote:
> >>>>
> >>>>> Hi Jesse,
> >>>>>
> >>>>>
> >>>>>
> >>> https://github.com/jbonofre/incubator-beam/tree/DATAFORMAT/sdks/java/
> >> extensions/dataformat
> >>>>>
> >>>>> it's very simple and stupid and of course not complete at all (I have
> >>>>> other commits but not merged as they need some polishing), but as I
> >>>>> said, it's a base of discussion.
> >>>>>
> >>>>> Regards
> >>>>> JB
> >>>>>
> >>>>> On 11/29/2016 09:23 PM, Jesse Anderson wrote:
> >>>>>> @jb Sounds good. Just let us know once you've pushed.
> >>>>>>
> >>>>>> On Tue, Nov 29, 2016 at 2:54 PM Jean-Baptiste Onofré <
> >> j...@nanthrax.net>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Good point Eugene.
> >>>>>>>
> >>>>>>> Right now, it's a DoFn collection to experiment a bit (a pure
> >>>>>>> extension). It's pretty stupid ;)
> >>>>>>>
> >>>>>>> But, you are right, depending the direction of such extension, it
>

Re: Running a Specific Test

2016-12-29 Thread Jesse Anderson
Thanks to everyone for their help. I'm writing a blog post about the
various Maven things you need to know when working with Beam.

@Dan that command line worked. Thanks!

On Thu, Dec 29, 2016 at 11:23 AM Stas Levin  wrote:

> I believe you raise a good point :)
>
> On Thu, Dec 29, 2016 at 9:00 PM Dan Halperin 
> wrote:
>
> > I suspect -- but may be wrong -- that the command line Stas gives will
> use
> > the *installed* version of beam-sdks-java-core. If you are iterating on a
> > @NeedsRunner test in the SDK core, you will either need to reinstall it
> > over and over again, or use `-am` to force recompilation of the core.
> >
> > Here is a command that works for me. Please criticize :)
> >
> > mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest -DfailIfNoTests=false
> > -pl runners/direct-java -am integration-test
> >
> > Note that this is an `integration-test`, not a `test` because it tests
> the
> > integration of the SDK with the DirectRunner:
> >
> https://github.com/apache/beam/blob/master/runners/direct-java/pom.xml#L64
> >
> > Dan
> >
> > On Thu, Dec 29, 2016 at 10:53 AM, Stas Levin 
> wrote:
> >
> > > P.S
> > > You can also do this from the main directory (without cd-ing into the
> > > direct-runner):
> > >
> > > "mvn test -Dtest=RegexTest
> > > -DdependenciesToScan=org.apache.beam:beam-sdks-java-core -pl
> > > runners/direct-java"
> > >
> > > On Thu, Dec 29, 2016 at 8:50 PM Stas Levin 
> wrote:
> > >
> > > > Once you "cd" into "runners/direct-java" you can use:
> > > >
> > > > "mvn test -Dtest=RegexTest
> > > > -DdependenciesToScan=org.apache.beam:beam-sdks-java-core"
> > > >
> > > > -Stas
> > > >
> > > > On Thu, Dec 29, 2016 at 8:27 PM Jesse Anderson <
> je...@smokinghand.com>
> > > > wrote:
> > > >
> > > > I tried that one already. It gives a no tests run error. If you
> bypass
> > > that
> > > > error with -DfailIfNoTests=false, no tests get run at all.
> > > >
> > > > On Thu, Dec 29, 2016 at 10:20 AM Jean-Baptiste Onofré <
> j...@nanthrax.net
> > >
> > > > wrote:
> > > >
> > > > > Hi Jesse
> > > > >
> > > > > Mvn test -Dtest=RegexTest
> > > > >
> > > > > Should work
> > > > >
> > > > > Don't forget the test goal. And no need to provide the fqcn.
> > > > >
> > > > > Regards
> > > > > > JB
> > > > >
> > > > > On Dec 29, 2016, 18:55, at 18:55, Jesse Anderson <
> > > je...@smokinghand.com>
> > > > > wrote:
> > > > > >Does anyone know the Maven way to run a specific unit test with
> > Beam?
> > > > > >I've
> > > > > >tried:
> > > > > >mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest
> > > > > >-DfailIfNoTests=false
> > > > > >-Dgroups="org.apache.beam.sdk.testing.NeedsRunner" -pl
> > > > > >org.apache.beam:beam-sdks-java-core test
> > > > > >
> > > > > >The test still doesn't run. Does anyone know what I'm missing?
> > > > > >
> > > > > >Thanks,
> > > > > >
> > > > > >Jesse
> > > > >
> > > >
> > > >
> > >
> >
>


Re: Running a Specific Test

2016-12-29 Thread Jesse Anderson
I tried that one already. It gives a no tests run error. If you bypass that
error with -DfailIfNoTests=false, no tests get run at all.

On Thu, Dec 29, 2016 at 10:20 AM Jean-Baptiste Onofré 
wrote:

> Hi Jesse
>
> Mvn test -Dtest=RegexTest
>
> Should work
>
> Don't forget the test goal. And no need to provide the fqcn.
>
> Regards
> JB
>
> On Dec 29, 2016, 18:55, at 18:55, Jesse Anderson 
> wrote:
> >Does anyone know the Maven way to run a specific unit test with Beam?
> >I've
> >tried:
> >mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest
> >-DfailIfNoTests=false
> >-Dgroups="org.apache.beam.sdk.testing.NeedsRunner" -pl
> >org.apache.beam:beam-sdks-java-core test
> >
> >The test still doesn't run. Does anyone know what I'm missing?
> >
> >Thanks,
> >
> >Jesse
>


Running a Specific Test

2016-12-29 Thread Jesse Anderson
Does anyone know the Maven way to run a specific unit test with Beam? I've
tried:
mvn -Dtest=org.apache.beam.sdk.transforms.RegexTest -DfailIfNoTests=false
-Dgroups="org.apache.beam.sdk.testing.NeedsRunner" -pl
org.apache.beam:beam-sdks-java-core test

The test still doesn't run. Does anyone know what I'm missing?

Thanks,

Jesse