On Tue, Jul 23, 2019 at 10:26 PM Chamikara Jayalath <chamik...@google.com> wrote:
>
> On Tue, Jul 23, 2019 at 1:10 PM Kyle Weaver <kcwea...@google.com> wrote:
>>
>> I agree with David that at least clearer log statements should be added.
>>
>> Udi, that's an interesting idea, but I imagine the sheer number of existing
>> flags (including many SDK-specific flags) would make it difficult to
>> implement. In addition, uniform argument names wouldn't necessarily ensure
>> uniform implementation.
>>
>> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>>
>>
>> On Tue, Jul 23, 2019 at 11:56 AM Udi Meiri <eh...@google.com> wrote:
>>>
>>> Java SDK creates one regional bucket per project and region combination.
>>> So it's not a lot of buckets - no need to auto-clean.
>
>
> Agree that cleanup is not a big issue if we are only creating a single bucket
> per project and region. I assume we are creating temporary folders for each
> pipeline with the same region and project so that they don't conflict (which
> we clean up).
> As others mentioned we should clearly document this (including the naming of
> the bucket) and produce a log during pipeline creation.
>
>>>
>>>
>>> I agree with Robert that having fewer flags is better.
>>> Perhaps what we need is a unifying interface for SDKs that simplifies
>>> launching?
>>>
>>> So instead of:
>>> mvn compile exec:java -Dexec.mainClass=<class>
>>> -Dexec.args="--runner=DataflowRunner --project=<project>
>>> --gcpTempLocation=gs://<bucket>/tmp <user flags>" -Pdataflow-runner
>>> or
>>> python -m <module> --runner DataflowRunner --project <project>
>>> --temp_location gs://<bucket>/tmp/ <user flags>
>
> Interesting, probably this should be extended to a generalized CLI for Beam
> that can be easily installed to execute Beam pipelines?

This is starting to get somewhat off-topic from the original question, but I'm not sure the benefits of providing a wrapper to the end user would outweigh the costs of having to learn the wrapper. For Python developers, python -m module, or even python path/to/script.py, is pretty standard. Java is a bit harder, because one needs to coordinate a build as well, but I don't know how a "./beam java ..." script would gloss over whether one is using maven, gradle, ant, or just has a pile of pre-compiled jars (and it would probably have to know a bit about the project layout as well to invoke the right commands).
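
To make the Java concern concrete, here is a rough sketch of just the first step such a wrapper would need: guessing the build tool from marker files in the project directory. Everything in it is hypothetical (the `detect_build_tool` name, the marker table, the order of precedence) and nothing like it exists in Beam today. It only illustrates that the wrapper has to encode project-layout knowledge before it can invoke anything.

```python
from pathlib import Path

# Hypothetical marker-file table; the order decides which tool wins
# when a project contains more than one build file.
BUILD_MARKERS = [
    ("pom.xml", "maven"),
    ("build.gradle", "gradle"),
    ("build.gradle.kts", "gradle"),
    ("build.xml", "ant"),
]


def detect_build_tool(project_dir):
    """Return the build tool guessed from marker files, or None.

    None covers the "pile of pre-compiled jars" case, where the wrapper
    would need some other way to find the classpath and main class.
    """
    root = Path(project_dir)
    for marker, tool in BUILD_MARKERS:
        if (root / marker).is_file():
            return tool
    return None
```

Even with this in place, the wrapper would still need to know which module, profile, or task to run for each tool, which is the part I don't see how to hide from the user.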