Re: Go SDK: Teardown() not being called with dataflow runner

2018-07-03 Thread eduardo . morales
Thanks. It is much clearer now... However, the code comments don't mention how often are {Start,Finish}Bundle called. What constitutes a batch? If I am using a window of 1 minute, can I expect for {Start,Finish}Bundle every minute? In other words, will the window produce a batch of my data? On

Re: Go SDK: Teardown() not being called with dataflow runner

2018-07-03 Thread eduardo . morales
FinishBundle() does the job. Should I keep using Setup()? What is the difference between Setup() and StartBundle()? Thanks again. On 2018/07/03 20:10:21, Henning Rohde wrote: > Teardown has very loose guarantees on when it's called and you essentially > can't rely on it. Currently, for Go on

Go SDK: Teardown() not being called with dataflow runner

2018-07-03 Thread eduardo . morales
Essentially I have the following code: type Writer struct { Pool WriterPool } func (w *Writer) Setup() { w.Pool = Init() } func (w* Writer) ProcessElement(ctx, elem Elem) { w.Pool.Add(elem) } func (w* Writer) Teardown() { w.Pool.Write() w.Pool.Close() } beam.ParDo0(scope, &Writer{}, e

Re: Go SDK: How are re-starts handled?

2018-06-27 Thread Eduardo Morales
BEAM-4665 created. On Wed, Jun 27, 2018 at 4:32 PM Ismaël Mejía wrote: > Eduardo can you please create a JIra on the Go SDK to track this issue. > Thanks. > > On Mon, Jun 25, 2018 at 10:22 PM Lukasz Cwik wrote: > >> Ah, sorry for the confusion. The SDK is meant to handle that for you as I >> d

Re: Go SDK: How are re-starts handled?

2018-06-25 Thread eduardo . morales
Nope. It returns an error. 2018/06/25 20:10:46 Failed to execute job: googleapi: Error 409: (acbae89877e14d87): The workflow could not be created. Causes: (590297f494c27357): There is already an active job named xxx-yyy_zzz. If you want to submit a second job, try again by setting a different

Re: Go SDK: How are re-starts handled?

2018-06-25 Thread eduardo . morales
I am sorry. I am not expressing myself correctly. Let me do it though code: if err := beamx.Run(ctx, pipe); err != nil { // How do I know 'err' is the result of a pipeline already running, as opposed to some // other problem that may need special attention. } On 2018/06/22 23:10:13, Lukasz

Re: Go SDK: How are re-starts handled?

2018-06-22 Thread eduardo . morales
On 2018/06/22 21:35:29, Lukasz Cwik wrote: > There can only be one pipeline in Dataflow with the same job name so if you > attempt to submit another job with the same job name you'll get back an > identifier for the currently executing pipeline. But beam.Run() only returns an error. How do I

Go SDK: How are re-starts handled?

2018-06-22 Thread eduardo . morales
If I have a k8s process launching dataflow pipelines, what happens when the process is restarted? Can Apache Beam detect a running pipeline and join accordingly? or will the pipeline be duplicated? Thanks in advance.

Re: Go SDK: Bigquery and nullable field types.

2018-06-22 Thread eduardo . morales
On 2018/06/22 01:20:19, Henning Rohde wrote: > The Go SDK can't actually serialize named types -- we serialize the > structural information and recreate assignment-compatible isomorphic > unnamed types at runtime for convenience. This usually works fine, but > perhaps not if inspected reflecti

Go SDK: Biquery and Legacy SQL

2018-06-21 Thread eduardo . morales
I am trying to read a column of type TIMESTAMP, this type is mapped by the bigquery client to time.Time. Unfortunately, it is not possible to use time.Time structs because this type contains slices which are not supported by the beam Go SDK (run fine in the direct runner, but panic on dataflow)

Go SDK: Bigquery and nullable field types.

2018-06-21 Thread eduardo . morales
I am using the bigqueryio transform and I am using the following struct to collect a data row: type Record { source_service biquery.NullString .. etc... } This works fine with the direct runner, but when I try it with the dataflow runner, then I get the following exception: java.util.con

Re: Go SDK: bad parameter type for reflect.methodValueCall: func(Record)

2018-06-18 Thread eduardo . morales
Got it. Thanks for the explanation. On 2018/06/18 21:40:48, Robert Burke wrote: > Hello, and thanks for trying out the Go SDK! > > tl;dr; You can bypass the beam correctness checks by pretending your type > is a Protocol Buffer. See create_test.go >

Go SDK: bad parameter type for reflect.methodValueCall: func(Record)

2018-06-18 Thread eduardo . morales
I have the following code: type Record struct { Timestamp time.Time Payload string } type processFn struct { // etc... } func (f *processFn) ProcessElement(ctx context.Context, data []byte, emit func(Record)) { // etc.. emit(someRecord) // etc.