Streaming job monitoring

2017-06-08 Thread Flavio Pompermaier
Hi to all, we've successfully ran our first straming job on a Flink cluster (with some problems with the shading of guava..) and it really outperforms Logstash, from the point of view of indexing speed and easiness of use. However there's only one problem: when the job is running, in the Job Monit

Re: Streaming job monitoring

2017-06-08 Thread Chesnay Schepler
Hello Flavio, I'm not sure what source you are using, but it looks like the ContinouosFileMonitoringSource which works with 2 operators. The first operator (what is displayed as the actual Source) emits input splits (chunks of files that should be read) and passes these to the second operator

Re: Streaming job monitoring

2017-06-08 Thread Flavio Pompermaier
Hi Chesnay, this is basically my job: TextInputFormat input = new TextInputFormat(new Path(jsonDir, fileName)); DataStream json = env.createInput(input, BasicTypeInfo.STRING_TYPE_INFO); json.addSink(new ElasticsearchSink<>(userConf, transportAddr, sink)); JobExecutionResult jobInfo = env.execute("