Hi :)

What do you mean by “a database”? A SQL-like query engine? Flink is already 
that [1]. A place where you store the data? Flink is kind of that as well [2], 
and many users use Flink as the source of truth, not just as a data 
processing framework.
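
To make the second point a bit more concrete, here is a minimal, untested sketch 
of keyed state in the DataStream API (the class name and Long types are mine, 
purely for illustration): a per-key counter that lives in Flink’s fault-tolerant 
state backend instead of an external store.

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Per-key running count kept in Flink's managed, checkpointed state.
// Use it after a keyBy(...), e.g. keyedStream.flatMap(new CountPerKey()).
public class CountPerKey extends RichFlatMapFunction<Long, Long> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void flatMap(Long value, Collector<Long> out) throws Exception {
        Long current = count.value();
        long updated = (current == null ? 0L : current) + 1;
        count.update(updated);
        out.collect(updated);
    }
}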

With the Flink Table API/SQL [1] you can easily query data from other systems 
(for example, read tables registered in the Hive Metastore). By extension, you 
can do the same with the DataStream API or the DataSet API.
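
For example, something along these lines (an untested sketch against the 1.9+ 
Blink planner; the catalog name, database, hive-conf path, Hive version and the 
“orders” table are all made up):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class HiveBatchQuery {
    public static void main(String[] args) {
        // Batch execution with the Blink planner.
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inBatchMode()
                .build();
        TableEnvironment tEnv = TableEnvironment.create(settings);

        // Register the Hive Metastore as a catalog (names/paths are placeholders).
        HiveCatalog hive = new HiveCatalog("myhive", "default", "/opt/hive-conf", "2.3.4");
        tEnv.registerCatalog("myhive", hive);
        tEnv.useCatalog("myhive");

        // Ad-hoc SQL over a table that lives in the Hive Metastore.
        Table result = tEnv.sqlQuery(
                "SELECT user_id, COUNT(*) AS cnt FROM orders GROUP BY user_id");
        // ... then write `result` to a sink or collect it.
    }
}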

Each of those APIs (Table API/SQL, DataStream API, DataSet API) comes with 
different advantages and trade-offs. Table API/SQL, being pretty high level, 
gives you automatic optimisations and ease of use. The DataStream/DataSet APIs, 
being lower level, give you more fine-grained control over what’s happening, at 
the expense of requiring more knowledge from you.
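
To illustrate the lower-level end, here is a rough DataSet API sketch of the 
same kind of aggregation, wired by hand instead of planned for you (the file 
path and the “userId,amount” record format are made up):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class OrdersPerUser {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical input: one order per line, "userId,amount".
        DataSet<Tuple2<String, Integer>> counts = env
                .readTextFile("hdfs:///data/orders.csv")
                .map(line -> Tuple2.of(line.split(",")[0], 1))
                // Lambdas lose generic types to erasure, so declare them explicitly.
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                .groupBy(0)
                .sum(1);

        counts.print();
    }
}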

As for how Flink Table API/SQL compares to other systems, I guess it would be 
better if someone from the Table API/SQL team responded.

Piotrek

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/
[2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html

> On 4 Nov 2019, at 14:05, Hanan Yehudai <hanan.yehu...@radcom.com> wrote:
> 
> This seems like a controversial subject... on purpose 😊
> 
> I have my data lake in Parquet files – should I use Flink in batch mode to run 
> ad hoc queries over the historical data?
> Or should I use a dedicated “database”, e.g. Drill / Dremio / Hive and the like?
> What advantage would Flink give me for querying this type of batch data?
