Accessing source table data from Hive/Presto

2018-08-06 Thread srimugunthan dhandapani
Hi all, I read the Flink documentation and came across the supported connectors: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/index.html#bundled-connectors We have some data that resides in Hive/Presto that needs to be made available to the Flink job. The data in the …

Re: Accessing source table data from Hive/Presto

2018-08-06 Thread Hequn Cheng
Hi srimugunthan, I found a related link [1]. Hope it helps. [1] https://stackoverflow.com/questions/41683108/flink-1-1-3-interact-with-hive-2-1-0

Re: Accessing source table data from Hive/Presto

2018-08-07 Thread Fabian Hueske
Hi Mugunthan, this depends on the type of your job. Is it a batch or a streaming job? Some queries could be ported to Flink's SQL API, as suggested by the link that Hequn posted. In that case, the query would be executed in Flink. Other options are to use a JDBC InputFormat or to persist the results …
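
As a rough sketch of the JDBC InputFormat option Fabian mentions, using Flink's JDBCInputFormat (flink-jdbc module) against Presto's JDBC driver. The host, catalog, table, and column names below are made up, and the Presto driver jar is assumed to be on the classpath:

    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
    import org.apache.flink.api.java.typeutils.RowTypeInfo;
    import org.apache.flink.types.Row;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // Field types must match the columns selected by the query below.
    RowTypeInfo rowType = new RowTypeInfo(
        BasicTypeInfo.LONG_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);

    JDBCInputFormat jdbcInput = JDBCInputFormat.buildJDBCInputFormat()
        .setDrivername("com.facebook.presto.jdbc.PrestoDriver")   // hypothetical setup
        .setDBUrl("jdbc:presto://presto-host:8080/hive/default")  // hypothetical host/catalog
        .setQuery("SELECT id, name FROM my_table")                // hypothetical table
        .setRowTypeInfo(rowType)
        .finish();

    // Each row of the query result becomes one Row element of the DataSet.
    DataSet<Row> rows = env.createInput(jdbcInput);

The resulting DataSet<Row> could then be registered in the Table API if the rest of the query should run as Flink SQL.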

Re: Accessing source table data from Hive/Presto

2018-08-07 Thread srimugunthan dhandapani
Thanks for the reply. I was mainly thinking of the streaming use case. In the approach of porting to Flink's SQL API, is it possible to read Parquet data from S3 and register it as a table in Flink?
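
For the batch case, one possible sketch (there is no official Parquet connector in this Flink version): wrap parquet-avro's AvroParquetInputFormat via the flink-hadoop-compatibility module and register the result as a table. The bucket path, field names, and types are made up, and the S3 filesystem is assumed to be set up in the Hadoop configuration:

    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.hadoopcompatibility.HadoopInputs;
    import org.apache.flink.table.api.TableEnvironment;
    import org.apache.flink.table.api.java.BatchTableEnvironment;
    import org.apache.parquet.avro.AvroParquetInputFormat;

    public class ParquetS3TableJob {
      public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        BatchTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

        // Read Parquet files as Avro GenericRecords through the Hadoop input format.
        DataSet<Tuple2<Void, GenericRecord>> parquet = env.createInput(
            HadoopInputs.readHadoopFile(
                new AvroParquetInputFormat<GenericRecord>(),
                Void.class, GenericRecord.class,
                "s3://my-bucket/path/to/parquet"));  // hypothetical bucket/path

        // Project each record into typed fields before registering it as a table.
        DataSet<Tuple2<Long, String>> typed = parquet.map(
            new MapFunction<Tuple2<Void, GenericRecord>, Tuple2<Long, String>>() {
              @Override
              public Tuple2<Long, String> map(Tuple2<Void, GenericRecord> in) {
                GenericRecord r = in.f1;
                return Tuple2.of((Long) r.get("id"), r.get("name").toString());
              }
            });

        // The table is then queryable via tEnv.sqlQuery(...).
        tEnv.registerDataSet("my_table", typed, "id, name");
      }
    }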

Re: Accessing source table data from Hive/Presto

2018-08-08 Thread Fabian Hueske
Do you want to read the data once, or monitor a directory and process new files as they appear? Reading from S3 with Flink's current MonitoringFileSource implementation is not working reliably due to S3's eventually consistent list operation (see FLINK-9940 [1]). Reading a directory also has some issues …
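
For reference, the directory-monitoring API in question looks roughly like this (bucket path and scan interval are made up); FLINK-9940 concerns the list operation this source relies on for S3 paths:

    import org.apache.flink.api.java.io.TextInputFormat;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    TextInputFormat format = new TextInputFormat(new Path("s3://my-bucket/input"));

    // PROCESS_CONTINUOUSLY re-scans the directory every 60 seconds and ingests
    // files as they appear. On S3 the scan uses an eventually consistent LIST,
    // so newly written files can be missed (the problem tracked in FLINK-9940).
    DataStream<String> lines = env.readFile(
        format, "s3://my-bucket/input",
        FileProcessingMode.PROCESS_CONTINUOUSLY, 60_000L);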