subject:"Issues with Flink Batch and Hadoop dependency"

Re: Issues with Flink Batch and Hadoop dependency

2020-08-31 Thread Arvid Heise

Hi Dan, your approach is good in general. You might want to use the bundled Hadoop uber jar [1] to save some time if you find the appropriate version. You can also build your own version and then include it in lib/. In general, I'd recommend moving away from sequence files. As soon as you change

Re: Issues with Flink Batch and Hadoop dependency

2020-08-29 Thread Dan Hill

I was able to get a basic version to work by including a bunch of Hadoop and S3 dependencies in the job jar and hacking in some Hadoop config values. It's probably not optimal, but it looks like I'm unblocked.

On Fri, Aug 28, 2020 at 12:11 PM Dan Hill wrote:
> I'm assuming I have a simple,
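
Whether the Hadoop and S3 dependencies come from the job jar or from a jar placed in lib/, it can help to confirm that the relevant classes are actually loadable before debugging further. The following is a minimal sketch using only the Java standard library. The class name `HadoopClasspathCheck` and its helper methods are hypothetical, and the list of expected classes is an assumption about what a SequenceFile-on-S3 job typically needs, not something stated in this thread. Note that it checks the class loader that loaded the check itself, which may differ from the class loader Flink uses for user code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Flink or Hadoop.
final class HadoopClasspathCheck {

    // Class names that exist in Hadoop; that this particular job needs
    // exactly these is an assumption made for illustration.
    static final String[] EXPECTED = {
        "org.apache.hadoop.conf.Configuration",   // hadoop-common
        "org.apache.hadoop.io.SequenceFile",      // hadoop-common
        "org.apache.hadoop.fs.s3a.S3AFileSystem"  // hadoop-aws
    };

    // Returns true if the class can be located without initializing it.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, HadoopClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers a class that is present but whose own
            // dependencies are missing or incompatible.
            return false;
        }
    }

    // Returns the subset of the given class names that cannot be loaded.
    static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isLoadable(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> notFound = missing(EXPECTED);
        if (notFound.isEmpty()) {
            System.out.println("All expected Hadoop classes are loadable.");
        } else {
            for (String name : notFound) {
                System.out.println("Missing: " + name);
            }
        }
    }
}
```

Running this with the same classpath as the job gives a quick indication of which dependency is absent. It does not detect version conflicts between Hadoop jars that are all present.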

Issues with Flink Batch and Hadoop dependency

2020-08-28 Thread Dan Hill

I'm assuming I have a simple, common setup problem. I've spent 6 hours debugging and haven't been able to figure it out. Any help would be greatly appreciated.

*Problem*
I have a Flink Streaming job set up that writes SequenceFiles to S3. When I try to create a Flink Batch job to read these