Re: Iceberg or deltalake table as input for drill queries

Christian Pfarr Sat, 03 Jul 2021 11:54:06 -0700

Hi luoc,

thanks for the information.

I think this kind of storage format is used more and more in cloud
architectures because IT departments want to use as few tools as possible to
provide a big data product. With Iceberg they can build consistent and scalable
big data structures for stream and batch processing on the same storage layer
with a single tool, Spark.

The problem is how to provide the data to customers. In my opinion, Spark itself
is too slow for interactive querying by a lot of people or BI tools. That's the
point where tools like Presto, Drill or Dremio enter the stage.

I would like to see Drill as a competitor in this area, especially because of
its brilliant, flexible and schemaless design.

If the Iceberg implementation is already done for the Metastore and you are
already experienced with its internals, it sounds worth investing the time and
energy in a new format plugin.

Just the opinion of a consultant who wants to recommend Drill for these
use cases ;)

Regards

z0ltrix

-------- Original Message --------
On 3 July 2021, 16:55, luoc wrote:

> Hello,
> Thanks for your interest. Drill’s Metastore can use a storage engine
> based on Iceberg tables. Right now, however, it seems that Drill does not
> support querying Iceberg data. Drill can definitely support Iceberg, for both
> reading and writing. The condition is that we need to develop the format
> plugin using the "Easy framework based on EVF". Please let me know if you are
> interested in that.
>
> > On 3 July 2021, at 2:41 AM, Christian Pfarr <[email protected]> wrote:
> >
> > Hello everyone,
> >
> >
> > it looks like more and more people are using Delta Lake or Iceberg in Spark
> > for transactional work with big tables.
> >
> > Additionally, I saw that Drill is using Iceberg as the storage engine for
> > metadata.
> >
> > So I wonder if it's possible to query Iceberg tables stored in HDFS or S3
> > directly via Drill, so that I can process my data with Spark Iceberg tables
> > and present them with Drill to my data scientists.
> >
> >
> > Regards,
> >
> > z0ltrix
