Hi,

That makes perfect sense so far. Obviously, you understand the difference between batch computation and ad-hoc querying. At the same time, Drill is a high-performance, schema-free MPP query layer for self-describing data, with ANSI SQL support.

Would you mind helping me open an issue on GitHub? It is a good way to initiate the technical discussion.

> On July 4, 2021, at 02:54, Christian Pfarr <[email protected]> wrote:
>
> Hi luoc,
>
> thanks for the information.
>
> I think this kind of storage format is used more and more in cloud
> architectures because IT departments want to use as few tools as possible
> to provide a big data product. With Iceberg they can build consistent and
> scalable big data structures for stream and batch processing at the same
> storage layer with a single tool, Spark.
>
> The problem is how to provide the data to customers. In my opinion Spark
> itself is too slow for interactive querying by a lot of people or BI tools.
> That's the point where tools like Presto, Drill or Dremio enter the stage.
>
> I would like to see Drill as a competitor in this area, especially because
> of its brilliant, flexible and schemaless design.
>
> If the Iceberg implementation is already done for the Metastore and you are
> already experienced with its internals, it sounds worth investing the time
> and energy in a new format plugin.
>
> Just the opinion of a consultant who wants to recommend Drill for these
> use cases ;)
>
> Regards
> z0ltrix
>
> -------- Original Message --------
> On July 3, 2021, 16:55, luoc wrote:
>
> Hello,
> Thanks for the interest. Drill's Metastore can use a storage engine based
> on Iceberg tables. For now, though, it seems that Drill does not support
> querying Iceberg data. Drill can definitely support Iceberg, for both
> reading and writing, provided that we develop a format plugin using the
> "Easy framework based on EVF". Please let me know if you are interested in
> that.
>
> > On July 3, 2021, at 2:41 AM, Christian Pfarr <[email protected]> wrote:
> >
> > Hello everyone,
> >
> > it looks like more and more people are using Delta Lake or Iceberg in
> > Spark for transactional work with big tables.
> >
> > Additionally, I saw that Drill is using Iceberg as the storage engine for
> > metadata.
> >
> > So, I wonder if it's possible to query Iceberg tables stored in HDFS or S3
> > directly via Drill, so that I can process my data with Spark Iceberg
> > tables and present them with Drill to my data scientists.
> >
> > Regards,
> > z0ltrix
