Re: Spring with Apache Beam
1. won't work since it is happening at pipeline construction time and not pipeline execution time.
2. only works if your application context is scoped to the DoFn instance and doesn't have things you want to possibly share across DoFn instances.

You could also try and make it a PipelineOption that is tagged with @JsonIgnore and also has a @Default.InstanceFactory like this[1]. This way when it is accessed by your DoFn it will be initialized for the first time and shared within your process. Making it a PipelineOption would also allow you to pass in preinitialized versions for testing.

1: https://github.com/apache/beam/blob/8267c223425bc201be700babbe596d133b79686e/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcpOptions.java#L127

On Wed, Oct 9, 2019 at 1:10 PM Jitendra kumavat wrote:

> Hi Luke,
>
> Thanks a lot for your reply.
> I tried a couple of options, which are as follows.
>
> 1. Initialise the context in the main method only, and use it. Creating the
> context:
> new AnnotationConfigApplicationContext(AppConfig.class);
> 2. Creating the context on DoFn.Startup method.
>
> Unfortunately none of them worked perfectly; the latter works but it has an
> issue with @ComponentScan.
> Please let me know your comments on the same.
>
> I will also try this JvmInitializer for context initialisation.
>
> Thanks,
> Jitendra
>
> On Wed, Oct 9, 2019 at 12:48 PM Luke Cwik wrote:
>
>> -d...@beam.apache.org, +user@beam.apache.org
>>
>> How are you trying to inject your application context?
>> Have you looked at the JvmInitializer.beforeProcessing[1] to create your
>> application context?
>>
>> 1:
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java
>>
>> On Fri, Oct 4, 2019 at 12:32 PM Jitendra kumavat wrote:
>>
>>> Hi,
>>>
>>> I want to add Spring framework in my apache beam project. Somehow I am
>>> unable to inject the Spring Application context to executing ParDo
>>> functions. I couldn't find a way to do so. Can you please let me know how
>>> to integrate Spring runtime application context with Apache Beam pipeline.
>>>
>>> Thanks,
>>> Jitendra
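
The two points in the reply above can be illustrated in plain Java, without Beam or Spring on the classpath. This is only a sketch under stated assumptions: `FakeContext`, `ContextHolder`, `MyFn` and `SerializationDemo` are hypothetical names, `FakeContext` stands in for a Spring application context, and the Java serialization round trip stands in for a DoFn being shipped from the launching program to a worker. A context assigned at construction time does not survive the trip, while a holder that creates the context on first access at execution time is shared by every instance in the same process. In a real pipeline the holder role would be played by the PipelineOption with @Default.InstanceFactory that the reply suggests.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a Spring ApplicationContext (deliberately not Serializable). */
final class FakeContext {
  /** Counts how many contexts were ever constructed in this process. */
  static final AtomicInteger CREATED = new AtomicInteger();

  FakeContext() {
    CREATED.incrementAndGet();
  }
}

/** Process-wide holder: the context is built on first access and then shared. */
final class ContextHolder {
  private ContextHolder() {}

  /** The JVM initializes this nested class lazily and thread-safely on first use. */
  private static final class Lazy {
    static final FakeContext INSTANCE = new FakeContext();
  }

  static FakeContext get() {
    return Lazy.INSTANCE;
  }
}

/** Hypothetical stand-in for a DoFn: serializable, holding a context that cannot be shipped. */
final class MyFn implements Serializable {
  private static final long serialVersionUID = 1L;

  /** Transient, so it is skipped by serialization and is null after deserialization. */
  transient FakeContext context;

  /** Plays the role of a setup hook that runs in the executing process. */
  void setup() {
    context = ContextHolder.get();
  }
}

final class SerializationDemo {
  private SerializationDemo() {}

  /** Serializes and deserializes the function, mimicking shipping it to a worker. */
  static MyFn roundTrip(MyFn fn) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(fn);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (MyFn) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }
}
```

Assigning `context` in the launching code and then calling `SerializationDemo.roundTrip` yields a copy whose `context` is null; calling `setup()` on two such copies gives both the same `FakeContext`, constructed once.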
Re: Spring with Apache Beam
-d...@beam.apache.org, +user@beam.apache.org

How are you trying to inject your application context?
Have you looked at the JvmInitializer.beforeProcessing[1] to create your application context?

1: https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java

On Fri, Oct 4, 2019 at 12:32 PM Jitendra kumavat wrote:

> Hi,
>
> I want to add Spring framework in my apache beam project. Somehow I am
> unable to inject the Spring Application context to executing ParDo
> functions. I couldn't find a way to do so. Can you please let me know how
> to integrate Spring runtime application context with Apache Beam pipeline.
>
> Thanks,
> Jitendra
Re: Feedback on how we use Apache Beam in my company
Very nice! Thanks.

Ccing dev list.

Etienne

On 09/10/2019 16:55, Pierre Vanacker wrote:

> Hi Apache Beam community,
>
> We’ve been working with Apache Beam in production for a few years now in
> my company (Dailymotion).
>
> If you’re interested to know how we use Apache Beam in combination with
> Google Dataflow, we shared this experience in the following article:
> https://medium.com/dailymotion/realtime-data-processing-with-apache-beam-and-google-dataflow-at-dailymotion-7d1b994dc816
>
> Thanks to the developers for your great work!
>
> Regards,
> Pierre
Feedback on how we use Apache Beam in my company
Hi Apache Beam community,

We’ve been working with Apache Beam in production for a few years now in my company (Dailymotion).

If you’re interested to know how we use Apache Beam in combination with Google Dataflow, we shared this experience in the following article:
https://medium.com/dailymotion/realtime-data-processing-with-apache-beam-and-google-dataflow-at-dailymotion-7d1b994dc816

Thanks to the developers for your great work!

Regards,
Pierre
Re: Beam discarding massive amount of events due to Window object or inner processing
Hi,

When inserting into PubSub, can you set message metadata with the timestamp from the event?

If yes, then you can make use of:
https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.Read.html#withTimestampAttribute-java.lang.String-

Cheers
Reza

On Wed, 9 Oct 2019 at 16:31, Eddy G wrote:

> Thanks a lot for the quick response!
>
> I can recall having already played with this when I first deployed this
> consumer and couldn't get around the following issue that I'm getting now
> again...
>
> java.lang.IllegalArgumentException: Cannot output with timestamp
> 2019-10-09T03:12:04.250Z. Output timestamps must be no earlier than the
> timestamp of the current input (2019-10-09T03:12:04.292Z) minus the allowed
> skew (0 milliseconds). See the DoFn#getAllowedTimestampSkew() Javadoc for
> details on changing the allowed skew.
>
> How can I manage skew? Wouldn't it increase as it's happening with the
> current version which uses processing time?
>
> The timestamp that I'm inferring comes straight from the JSON object
> (which is the one I'm looking to use) and not from PubSub itself.
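
The suggestion above depends on the publisher attaching the event's own timestamp to each message as an attribute. A minimal sketch of that publisher-side step follows, using only `java.time` and no Pub/Sub client. The class name `TimestampAttribute` and the attribute name `event_ts` are hypothetical; the value is written as milliseconds since the Unix epoch, which is one of the formats the linked Javadoc describes for this attribute (please check it for the Beam version in use). On the reading side the same name would then be passed to `withTimestampAttribute`.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class TimestampAttribute {
  private TimestampAttribute() {}

  /** Hypothetical attribute name; it must match the name given to the reader. */
  static final String ATTRIBUTE = "event_ts";

  /**
   * Builds the message attributes for an event whose timestamp is taken from
   * its JSON payload as an RFC 3339 / ISO-8601 instant, e.g. "2019-10-09T03:12:04.250Z".
   */
  static Map<String, String> attributesFor(String eventTime) {
    Instant instant = Instant.parse(eventTime);
    Map<String, String> attributes = new HashMap<>();
    // Encode the event time as milliseconds since the Unix epoch.
    attributes.put(ATTRIBUTE, Long.toString(instant.toEpochMilli()));
    return Collections.unmodifiableMap(attributes);
  }
}
```

The returned map would be set as the attributes of the outgoing message by whatever client publishes it; that client call is left out here because it is not described in the thread.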
Re: Beam discarding massive amount of events due to Window object or inner processing
Thanks a lot for the quick response!

I can recall having already played with this when I first deployed this consumer and couldn't get around the following issue that I'm getting now again...

java.lang.IllegalArgumentException: Cannot output with timestamp 2019-10-09T03:12:04.250Z. Output timestamps must be no earlier than the timestamp of the current input (2019-10-09T03:12:04.292Z) minus the allowed skew (0 milliseconds). See the DoFn#getAllowedTimestampSkew() Javadoc for details on changing the allowed skew.

How can I manage skew? Wouldn't it increase as it's happening with the current version which uses processing time?

The timestamp that I'm inferring comes straight from the JSON object (which is the one I'm looking to use) and not from PubSub itself.
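
The rule stated in the exception message can be worked through with the two timestamps it reports. The sketch below re-creates that comparison with `java.time` only; it is not Beam's implementation, and `SkewCheck` and its method names are hypothetical. With the values from the message, the output timestamp is 42 milliseconds earlier than the input timestamp, so an allowed skew of 0 milliseconds rejects it.

```java
import java.time.Duration;
import java.time.Instant;

final class SkewCheck {
  private SkewCheck() {}

  /** How far the output timestamp lies before the input timestamp (negative if it is later). */
  static Duration backwardShift(Instant input, Instant output) {
    return Duration.between(output, input);
  }

  /** The rule from the error message: output must be no earlier than input minus the allowed skew. */
  static boolean isAllowed(Instant input, Instant output, Duration allowedSkew) {
    return !output.isBefore(input.minus(allowedSkew));
  }
}
```

For input `2019-10-09T03:12:04.292Z` and output `2019-10-09T03:12:04.250Z`, `backwardShift` is 42 ms, and `isAllowed` is false for a skew of 0 ms and true once the skew is at least 42 ms.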