Based on
https://stackoverflow.com/questions/44423769/how-to-use-google-cloud-storage-in-dataflow-pipeline-run-from-datalab
I tried this:
options = PipelineOptions(flags=["--requirements_file", "./requirements.txt"])
The requirements file was generated by:
pip freeze > requirements.txt
But it fi
Bundle boundaries are unspecified, dependent on the runner and the
particular circumstances during this particular execution, and are
generally unrelated to windowing or to the data contents itself. They have
no semantic meaning - everything would still work exactly the same way even
if every eleme
Thanks. It is much clearer now...
However, the code comments don't mention how often {Start,Finish}Bundle are
called. What constitutes a batch?
If I am using a window of 1 minute, can I expect {Start,Finish}Bundle to be
called every minute? In other words, will the window produce a batch of my data?
On
Thank you. Where do I add the reference to requirements.txt? Can I do it
from the pipeline options code?
On Tue, Jul 3, 2018 at 5:13 PM, Lukasz Cwik wrote:
> Take a look at https://beam.apache.org/documentation/sdks/python-
> pipeline-dependencies/
>
> On Tue, Jul 3, 2018 at 2:09 PM OrielResearch
Take a look at
https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
On Tue, Jul 3, 2018 at 2:09 PM OrielResearch Eila Arich-Landkof <
e...@orielresearch.org> wrote:
> Hello all,
>
>
> I am using the python code to run my pipeline. similar to the following:
>
> options = Pipeli
Hello all,
I am using the python code to run my pipeline. similar to the following:
options = PipelineOptions()
google_cloud_options = options.view_as(GoogleCloudOptions)
google_cloud_options.project = 'my-project-id'
google_cloud_options.job_name = 'myjob'
google_cloud_options.staging_location = 'g
Hi Eduardo,
These differences are described by the link I sent (
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L465-L666)
- it documents what kind of things it's best to do in each method. Please
let me know if something is still un
FinishBundle() does the job.
Should I keep using Setup()? What is the difference between Setup() and
StartBundle()?
Thanks again.
On 2018/07/03 20:10:21, Henning Rohde wrote:
> Teardown has very loose guarantees on when it's called and you essentially
> can't rely on it. Currently, for Go on
Hi Eduardo,
Henning is right - the specific guarantees around Setup/Teardown vs.
StartBundle/FinishBundle are currently described best in the Java SDK
documentation:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
(see
documentation t
Teardown has very loose guarantees on when it's called and you essentially
can't rely on it. Currently, for Go on non-direct runners, we hang on to
the bundle descriptors forever and never destroy them (and in turn never
call Teardown). Even if we didn't, failures/restarts could cause Teardown
to n
Wow, what a nice thing to read! Thanks for sharing, Peter!
On Tue, Jul 3, 2018 at 9:50 AM Peter Mueller wrote:
> Hi Gris and everyone,
> Not a very techie post, but thought I'd contribute a little on the
> economic case for open-source, and Apache Beam in particular.
>
> TLDR; We're finding tha
Essentially I have the following code:
type Writer struct {
	Pool WriterPool
}

func (w *Writer) Setup() {
	w.Pool = Init()
}

func (w *Writer) ProcessElement(ctx context.Context, elem Elem) {
	w.Pool.Add(elem)
}

func (w *Writer) Teardown() {
	w.Pool.Write()
	w.Pool.Close()
}

beam.ParDo0(scope, &Writer{}, e
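
The replies in this thread say that Teardown cannot be relied on and that FinishBundle "does the job". A minimal sketch of that change follows, under loud assumptions: it does not import the Beam SDK, Elem and WriterPool are hypothetical stand-ins for the types in the post, and runBundles is a toy driver, not a real runner. Only the method names (Setup, ProcessElement, FinishBundle) mirror the DoFn lifecycle discussed above. It also illustrates the point that bundle boundaries carry no semantic meaning: splitting the same elements into one bundle or three produces the same output.

```go
package main

import "fmt"

// Elem is a hypothetical element type standing in for the one in the post.
type Elem string

// WriterPool is a hypothetical stand-in: it buffers elements and records
// everything that has been flushed so far.
type WriterPool struct {
	pending []Elem
	Written []Elem
}

// Add buffers one element until the next Write.
func (p *WriterPool) Add(e Elem) { p.pending = append(p.pending, e) }

// Write flushes all pending elements and clears the buffer.
func (p *WriterPool) Write() {
	p.Written = append(p.Written, p.pending...)
	p.pending = nil
}

// Writer mirrors the DoFn from the post, but flushes in FinishBundle
// instead of Teardown.
type Writer struct {
	Pool *WriterPool
}

// Setup creates the pool once per Writer instance.
func (w *Writer) Setup() { w.Pool = &WriterPool{} }

// ProcessElement only buffers; nothing is written here.
func (w *Writer) ProcessElement(elem Elem) { w.Pool.Add(elem) }

// FinishBundle flushes whatever the current bundle buffered.
func (w *Writer) FinishBundle() { w.Pool.Write() }

// runBundles is a toy driver (an assumption, not Beam's runner logic):
// Setup once, then ProcessElement per element and FinishBundle per bundle.
// Teardown is deliberately never called.
func runBundles(w *Writer, bundles [][]Elem) []Elem {
	w.Setup()
	for _, bundle := range bundles {
		for _, e := range bundle {
			w.ProcessElement(e)
		}
		w.FinishBundle()
	}
	return w.Pool.Written
}

func main() {
	one := runBundles(&Writer{}, [][]Elem{{"a", "b", "c"}})
	many := runBundles(&Writer{}, [][]Elem{{"a"}, {"b"}, {"c"}})
	fmt.Println(one, many) // prints "[a b c] [a b c]"
}
```

In a real pipeline the flush in FinishBundle would need to be safe to repeat, since a failed bundle may be retried; that concern is outside this sketch.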
Hi Gris and everyone,
Not a very techie post, but thought I'd contribute a little on the economic
case for open-source, and Apache Beam in particular.
TLDR; We're finding that, at critical points in our sales cycle, our
customers are choosing us precisely because we offer a 'call option' on
futur
Awesome!! Thanks for the heads up, very exciting, this is going to make a
lot of people happy :)
On Tue, Jul 3, 2018, 3:40 AM Carlos Alonso wrote:
> + d...@beam.apache.org
>
> Just a quick email to let you know that I'm starting developing this.
>
> On Fri, Apr 20, 2018 at 10:30 PM Eugene Kirpic
+ d...@beam.apache.org
Just a quick email to let you know that I'm starting developing this.
On Fri, Apr 20, 2018 at 10:30 PM Eugene Kirpichov
wrote:
> Hi Carlos,
>
> Thank you for expressing interest in taking this on! Let me give you a few
> pointers to start, and I'll be happy to help everyw