Re: S3 Bucket Source

2019-04-15 Thread Steven Nelson
That looks like exactly what I needed. Thanks!

-Steve

On Mon, Apr 15, 2019 at 3:42 PM Addison Higham wrote:
> Hi Steven,
>
> Usually, what you want to do is something like this:
>
> Instead of a `SourceFunction` use a `RichParallelSourceFunction` and as an
> argument to that function, you might

Re: S3 Bucket Source

2019-04-15 Thread Addison Higham
Hi Steven,

Usually, what you want to do is something like this:

Instead of a `SourceFunction`, use a `RichParallelSourceFunction`, and as an argument to that function you might have a list of prefixes you want to consume in parallel. The `RichParallelSourceFunction` has a method called `getRunt
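
The idea of giving each parallel source instance its own share of the prefix list can be sketched in plain Java, without any Flink dependency. This is only an illustration under stated assumptions: the class `PrefixAssignment`, the method `prefixesForSubtask`, and the single-hex-character prefixes are hypothetical names chosen here, not something taken from the thread. In a real Flink source, the subtask index and the parallelism would typically come from the function's `RuntimeContext` (`getIndexOfThisSubtask()` and `getNumberOfParallelSubtasks()`); whether the message above goes on to describe exactly that is an assumption, since the excerpt is cut off.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decides which S3 key prefixes one parallel subtask reads.
class PrefixAssignment {

    // Round-robin split: prefix i goes to subtask (i % parallelism).
    // Every prefix is assigned to exactly one subtask.
    static List<String> prefixesForSubtask(List<String> allPrefixes,
                                           int subtaskIndex,
                                           int parallelism) {
        if (parallelism <= 0 || subtaskIndex < 0 || subtaskIndex >= parallelism) {
            throw new IllegalArgumentException("invalid subtask index or parallelism");
        }
        List<String> mine = new ArrayList<>();
        for (int i = 0; i < allPrefixes.size(); i++) {
            if (i % parallelism == subtaskIndex) {
                mine.add(allPrefixes.get(i));
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Assumption: keys are GUIDs, so the first hex character is a usable prefix.
        List<String> prefixes = new ArrayList<>();
        for (char c : "0123456789abcdef".toCharArray()) {
            prefixes.add(String.valueOf(c));
        }
        int parallelism = 4; // hypothetical source parallelism
        for (int subtask = 0; subtask < parallelism; subtask++) {
            System.out.println("subtask " + subtask + " -> "
                    + prefixesForSubtask(prefixes, subtask, parallelism));
        }
    }
}
```

Each subtask would then list and emit only the objects under its own prefixes, so no object is read twice and no coordination between subtasks is needed.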

S3 Bucket Source

2019-04-15 Thread Steven Nelson
I am working on a process to do some compaction of files in S3. I read a bucket full of files, key them, pull them all into a window, then remove older versions of the file. The files are not organized inside the bucket; they are simply named by GUID. I can iterate them using a custom Source that jus