From the use case description, it sounds like you are looking to aggregate files based on either a size threshold or a time threshold and ship them to S3. Is that correct?
Flink might be overkill here; you could look at frameworks like Apache NiFi, which has pre-built (and configurable) processors that do just what you are describing.

On Fri, Jul 22, 2016 at 3:00 PM, Suma Cherukuri <suma_cheruk...@symantec.com> wrote:
> Hi,
>
> Good afternoon!
>
> I work as an engineer at Symantec. My team works on a multi-tenant event
> processing system. As high-level background: our customers write data to
> Kafka brokers through agents like Logstash, and we process the events and
> save the log data in Elasticsearch and S3.
>
> Use case: We write batches of events to S3 when a file size limit of 1 MB
> (specific to our case) or a certain time threshold is reached. We are
> planning on merging the files in a folder into one single file based on a
> time limit, such as every 24 hrs.
>
> We are considering the various options available today and would like to
> know whether Apache Flink can be used to serve this purpose.
>
> Looking forward to hearing from you.
>
> Thank you,
> Suma Cherukuri
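For what it's worth, the core "flush on size OR age" logic is framework-agnostic. Here is a minimal Python sketch of that batching pattern; the class and parameter names are hypothetical, and the 1 MB / 24 h defaults are just the thresholds mentioned in the use case. A real pipeline would plug an S3 upload into `flush_fn` (e.g. via boto3 or a NiFi/Flink sink):

```python
import time

class SizeOrTimeBatcher:
    """Accumulates records and flushes when either a byte-size
    threshold or an age threshold is reached (hypothetical sketch)."""

    def __init__(self, flush_fn, max_bytes=1_000_000, max_age_s=86_400):
        self.flush_fn = flush_fn      # called with the merged batch, e.g. an S3 upload
        self.max_bytes = max_bytes    # 1 MB default, per the use case
        self.max_age_s = max_age_s    # 24 h default, per the use case
        self.buffer = []
        self.size = 0
        self.started = None           # time the current batch was opened

    def add(self, record: bytes):
        if self.started is None:
            self.started = time.monotonic()
        self.buffer.append(record)
        self.size += len(record)
        self._maybe_flush()

    def _maybe_flush(self):
        too_big = self.size >= self.max_bytes
        too_old = (self.started is not None
                   and time.monotonic() - self.started >= self.max_age_s)
        if too_big or too_old:
            self.flush()

    def flush(self):
        """Merge buffered records into one object and hand it off."""
        if self.buffer:
            self.flush_fn(b"".join(self.buffer))
        self.buffer = []
        self.size = 0
        self.started = None
```

In NiFi terms this is roughly what a merge-style processor does for you out of the box, with the thresholds exposed as configuration rather than code.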