Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-09-29 Thread Satyaa Dixit
Hi Guys, I need one help, any leads will be highly appreciated. I have written a Flink streaming job to read data from an S3 bucket and push it to Kafka. Below is the working source that deals with a single S3 path: TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path("s3a://dir
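The quoted code is cut off, but a minimal single-path reader of this shape might look as follows. This is a sketch, not the poster's actual job: the bucket path, topic name, and broker address are placeholders, and it assumes the `flink-s3-fs-hadoop` filesystem plugin and the Kafka connector are on the classpath.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

import java.util.Properties;

public class SinglePathS3ToKafka {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Read every line under one S3 prefix (path is a placeholder).
        TextInputFormat format = new TextInputFormat(new Path("s3a://my-bucket/dir1"));
        DataStream<String> lines = env.readFile(format, "s3a://my-bucket/dir1");

        // Push each line to Kafka (topic and brokers are placeholders).
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        lines.addSink(new FlinkKafkaProducer<>("my-topic", new SimpleStringSchema(), props));

        env.execute("single-path-s3-to-kafka");
    }
}
```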

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-09-29 Thread Satyaa Dixit
Hi Guys, Sorry to bother you again, but could someone help me here? Any help in this regard will be much appreciated. Regards, Satya On Tue, Sep 29, 2020 at 2:57 PM Satyaa Dixit wrote: > Hi Guys, > I need one help, any leads will be highly appreciated. I have written a > flink streami

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-01 Thread Satyaa Dixit
Hi Guys, I got stuck with this, please help me here. Regards, Satya On Wed, 30 Sep 2020 at 11:09 AM, Satyaa Dixit wrote: > Hi Guys, > > Sorry to bother you again, but could someone help me here? Any help in > this regard will be much appreciated. > > Regards, > Satya > > O

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-01 Thread Satyaa Dixit
joinedStreams.union(inputStream); } >} >// add a sanity check that there was at least 1 directory > >joinedStreams.addSink(kafka); } > > > > On 10/1/2020 9:08 AM, Satyaa Dixit wrote: > > Hi Guys, > > Got stuck with it please help me here > Reg
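The loop-and-union suggestion quoted above, reassembled as a fuller sketch. Only the union-in-a-loop shape and the "at least 1 directory" sanity check come from the quoted reply; `env`, `kafkaProducer`, `createInputFormat`, and the path list are assumptions for illustration.

```java
// Build one source per S3 directory and union them into a single stream.
List<String> s3PathList = Arrays.asList("s3a://bucket/dir1", "s3a://bucket/dir2");

DataStream<String> joinedStreams = null;
for (String dir : s3PathList) {
    TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path(dir));
    DataStream<String> inputStream = env.readFile(format, dir);
    joinedStreams = (joinedStreams == null) ? inputStream : joinedStreams.union(inputStream);
}

// Sanity check: there must have been at least one directory.
if (joinedStreams == null) {
    throw new IllegalStateException("No S3 directories configured");
}

// Attach the sink once, to the unioned stream.
joinedStreams.addSink(kafkaProducer);
```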

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-04 Thread Satyaa Dixit
, Satya On Thu, Oct 1, 2020 at 7:46 PM Satyaa Dixit wrote: > Thank you @Chesnay, let me try this change. > > On Thu, Oct 1, 2020 at 1:21 PM Chesnay Schepler > wrote: > >> You could also try using streams to make it a little more concise: >> >> director

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-09 Thread Satyaa Dixit
to " + kafkaTopicName)); Request your support on the same. Regards, Satya On Mon, Oct 5, 2020 at 12:16 PM Satyaa Dixit wrote: > Hi @ches...@apache.org , > > Thanks for your support, it was really helpful. > Do you know the list of directories when you submit

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-11 Thread Satyaa Dixit
Hi Team, Could you please help me here? I’m sorry for asking on such short notice but my work has stopped due to this. Regards, Satya On Fri, 9 Oct 2020 at 8:53 PM, Satyaa Dixit wrote: > Hi Shesnay/Team, > > Thank you so much for the reply. In the continuation of the previous email

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-12 Thread Satyaa Dixit
> 2) You're mixing up the Java Streams and Finks DataStream API. > > Try this: > > s3PathList.stream() > .map(...) > .reduce(...) > .map(joinedStream -> stream.map(new FlatMapFunction...)) > .map(joinedStream-> joinedStream.addSink...) > > On 10/12/2020
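The correction here is to keep the whole pipeline inside one Java Stream: map each path to its `DataStream`, `reduce` with `DataStream::union`, then attach the sink to the single result via `Optional.map`. The shape of that pipeline can be tried out with plain lists standing in for `DataStream`s; all names below are illustrative, not from the original job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class ReduceUnionDemo {
    // Stand-in for DataStream::union: concatenate two lists of records.
    static List<String> union(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("s3a://bucket/dir1", "s3a://bucket/dir2");

        // Map each path to its "stream" of records, then reduce by union.
        Optional<List<String>> joined = paths.stream()
                .map(p -> List.of(p + "/file-a", p + "/file-b"))
                .reduce(ReduceUnionDemo::union);

        // reduce returns Optional: an empty path list means nothing to sink.
        System.out.println(joined.orElseThrow().size()); // 4 records total
    }
}
```

Note that `Stream.reduce` without an identity returns an `Optional`, which is why the original thread chains `.map(joinedStream -> joinedStream.addSink(...))` after it.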

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-14 Thread Satyaa Dixit
t(value); } } Also just to clarify one doubt , How to handle *FileNotFoundException* as part of flink reader during runtime if in case directory is not available in s3. How to avoid job failure in that use case. Regards, Satya On Tue, Oct 13, 2020 at 11:15 AM Satyaa Dixit wrote: > Thanks, I

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-15 Thread Satyaa Dixit
-> S3Service.createInputStream(environment, directory, readerParallelism)) .reduce(DataStream::union).map(joinedStream -> joinedStream.addSink(kafkaProducer)); Regards, Satya On Wed, Oct 14, 2020 at 8:57 PM Satyaa Dixit wrote: > Hi Chesnay/Team > > Thank you so much. I have tried wi

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-22 Thread Satyaa Dixit
reate a custom ContinuousFileMonitoringFunction that ignores missing > files in its run() method. > > Alternatively, use some library to check the existence of the S3 > directories before creating the sources. > > On 10/15/2020 11:49 AM, Satyaa Dixit wrote: > > Hi Chesnay/T
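The second suggestion, filtering out missing directories before building any sources, can be sketched with a local filesystem stand-in. For real S3 one would use an S3 client listing call instead of `Files.isDirectory`; everything below is illustrative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class ExistingDirsDemo {
    public static void main(String[] args) throws IOException {
        Path present = Files.createTempDirectory("dir1");
        Path missing = present.resolveSibling("definitely-missing-dir");

        List<Path> candidates = List.of(present, missing);

        // Keep only directories that exist, so no source is ever pointed
        // at a missing path (avoiding FileNotFoundException at runtime).
        List<Path> usable = candidates.stream()
                .filter(Files::isDirectory)
                .collect(Collectors.toList());

        System.out.println(usable.size()); // only the existing directory survives
    }
}
```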

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-27 Thread Satyaa Dixit
Chesnay Schepler wrote: > The existing ContinuousFileMonitoringFunction and > ContinuousFileReaderOperator already take care of that. > Unless you are re-implementing them from scratch, you shouldn't have > to do anything. > > On 10/22/2020 1:47 PM, Satyaa Dixit wrote: &

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-27 Thread Satyaa Dixit
Time.of(30, TimeUnit.MINUTES), // time interval for measuring failure rate Time.of(60, TimeUnit.SECONDS) // delay )); Please have a look into this as well. Regards, Satya On Tue, Oct 27, 2020 at 4:46 PM Satyaa Dixit wrote: > Hi Chesnay & Team, > > I'm u
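Reassembled, the failure-rate restart strategy quoted above would be configured roughly as follows. The snippet is cut off, so the max-failures count here is a placeholder, not the poster's actual value.

```java
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import java.util.concurrent.TimeUnit;

// Restart the job unless it fails more than N times within the interval.
env.setRestartStrategy(RestartStrategies.failureRateRestart(
        3,                             // max failures per interval (placeholder value)
        Time.of(30, TimeUnit.MINUTES), // time interval for measuring failure rate
        Time.of(60, TimeUnit.SECONDS)  // delay between restart attempts
));
```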

Re: Need help in creating Flink Streaming s3 Job for multiple path reader one by one

2020-10-27 Thread Satyaa Dixit
e job fail before the first checkpoint has occurred? > > What sink are you using? > > On 10/27/2020 12:45 PM, Satyaa Dixit wrote: > > In continuation of the above email, I have tried below code also but it is > restarting the job from the begin
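The question about the first checkpoint matters because a restart can only resume from a completed checkpoint; without one, the sources start over from the beginning. A minimal checkpointing setup sketch (the interval is illustrative):

```java
import org.apache.flink.streaming.api.CheckpointingMode;

// Enable periodic checkpoints so a restart resumes from the last completed
// checkpoint instead of re-reading all S3 paths from the start.
env.enableCheckpointing(60_000); // every 60 seconds (illustrative)
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
```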

Re: Resource Optimization for Flink Job in AWS EMR Cluster

2020-11-04 Thread Satyaa Dixit
Hi Deep, Thanks for bringing this to the table; I'm also facing a similar kind of issue while deploying my Flink job w.r.t. resource optimization. Hi Team, It would be much appreciated if someone could help us here. Regards, Satya On Wed, Nov 4, 2020 at 6:33 PM DEEP NARAYAN Singh wrote: > Hi All,