Flink and Presto integration

2020-01-27 Thread Flavio Pompermaier
Hi all, is there any integration between Presto and Flink? I'd like to use Presto for the UI part (preview and so on) while using Flink for the batch processing. Do you suggest something else otherwise? Best, Flavio

Re: Flink and Presto integration

2020-01-27 Thread Itamar Syn-Hershko
Hi Flavio, Presto contributor and Starburst Partners here. Presto and Flink are solving completely different challenges. Flink is about processing data streams as they come in; Presto is about ad-hoc / periodic querying of data sources. A typical architecture would use Flink to process data stre

Re: Flink and Presto integration

2020-01-27 Thread Flavio Pompermaier
Both Presto and Flink make use of a Catalog in order to read/write data from a source/sink. I don't agree that "Flink is about processing data streams", because Flink is also competitive for batch workloads (and this will be further improved in the next releases). I'd like to regist

Re: Flink and Presto integration

2020-01-27 Thread Itamar Syn-Hershko
Yes, Flink does batch processing by "reevaluating a stream", so to speak. Presto doesn't have sources and sinks, only catalogs (which always allow reads, and sometimes also writes). Presto catalogs are configuration: they are managed on the node filesystem as a configuration file and nowh
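To make the point about file-based catalogs concrete, here is a minimal sketch in Python that renders the contents of a Presto catalog properties file. The catalog name `mysql`, the host, and the credentials are placeholder assumptions, not values from this thread. The property keys are the ones documented for the Presto MySQL connector, and the file would normally sit under `etc/catalog/` on each node.

```python
from pathlib import PurePosixPath


def render_catalog_properties(props):
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


def catalog_file_path(catalog_name, etc_dir="etc"):
    """Build the conventional path: the catalog is named after the file."""
    return PurePosixPath(etc_dir) / "catalog" / f"{catalog_name}.properties"


# Placeholder values, assumed for illustration only.
mysql_catalog = {
    "connector.name": "mysql",
    "connection-url": "jdbc:mysql://example.net:3306",
    "connection-user": "presto_user",
    "connection-password": "secret",
}

if __name__ == "__main__":
    print(catalog_file_path("mysql"))
    print(render_catalog_properties(mysql_catalog), end="")
```

Because the catalog is just a file on disk, adding one means writing such a file to every node, which is consistent with the restart requirement mentioned later in the thread.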

Re: Flink and Presto integration

2020-01-27 Thread Jingsong Li
Hi Flavio, Is your requirement to use Blink batch to read the tables in Presto? I'm not familiar with Presto's catalog. Is it like the Hive Metastore? If so, what needs to be done is similar to the Hive connector. You need to implement a Presto catalog, which translates the Presto table int

Re: Flink and Presto integration

2020-01-28 Thread Piotr Nowojski
Hi, Yes, Presto (in the presto-hive connector) is just using the Hive Metastore to get the table definitions/metadata. If you connect to the same Hive Metastore with Flink, both systems should be able to see the same tables. Piotrek
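As a rough illustration of "connecting to the same Hive Metastore", the Python sketch below checks that a Presto Hive catalog file and a `hive-site.xml` (the file a Flink Hive catalog would typically load from its Hive configuration directory) name the same metastore URI. The host `example.net`, the port, and the helper names are hypothetical. `hive.metastore.uri` is the Presto Hive connector property and `hive.metastore.uris` is the Hive configuration key.

```python
import xml.etree.ElementTree as ET


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def metastore_uris_from_hive_site(xml_text):
    """Return the comma-separated hive.metastore.uris entries as a list."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "hive.metastore.uris":
            value = prop.findtext("value", "")
            return [uri.strip() for uri in value.split(",") if uri.strip()]
    return []


def same_metastore(presto_props_text, hive_site_xml):
    """True if Presto's metastore URI is among those listed in hive-site.xml."""
    presto_uri = parse_properties(presto_props_text).get("hive.metastore.uri")
    return presto_uri is not None and presto_uri in metastore_uris_from_hive_site(hive_site_xml)


# Placeholder configuration, assumed for illustration only.
PRESTO_HIVE_PROPERTIES = (
    "connector.name=hive-hadoop2\n"
    "hive.metastore.uri=thrift://example.net:9083\n"
)
HIVE_SITE_XML = (
    "<configuration><property>"
    "<name>hive.metastore.uris</name>"
    "<value>thrift://example.net:9083</value>"
    "</property></configuration>"
)

if __name__ == "__main__":
    print(same_metastore(PRESTO_HIVE_PROPERTIES, HIVE_SITE_XML))
```

This only compares configuration strings; it does not contact the metastore or verify that either engine can actually read the tables.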

Re: Flink and Presto integration

2020-01-28 Thread Flavio Pompermaier
The Hive Metastore is the de facto standard for Hadoop, but in my use case I have to query other databases (like MySQL, Oracle and SQL Server). So Presto would be a good choice (apart from the fact that you need to restart it when you add a new catalog), and I'd like to have an easy translation of the