On 2/27/2020 1:04:07 PM, Nicolas PARIS <[email protected]> wrote:
> However, updating parquet files can be a bit troublesome. You might be
> interested in delta-lake, which provides an implementation of the SQL
> MERGE statement on top of parquet files. Implementing a drill connector
> on this should be feasible. This could be used together with the hybrid
> design described by Ted and Paul, and makes parquet more than a static
> archive. https://docs.delta.io/latest/delta-intro.html

I noticed Drill already has some support for Iceberg, but I am not familiar enough with Spark to figure out whether Delta Lake and Iceberg can be run without Hadoop HDFS. I was hoping to avoid a full Hadoop deployment, since Drill itself runs fine without one.
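
For what it's worth, here is a minimal sketch of the kind of smoke test I have in mind: writing a Delta table to a plain local path with no HDFS in the picture. The package version and the /tmp path below are placeholders, not something I have verified against a particular Spark build:

    # Launch with: pyspark --packages io.delta:delta-core_2.11:0.5.0
    # (the version is an assumption; match it to your Spark/Scala build)
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("delta-no-hdfs").getOrCreate()

    # Write a small Delta table to a plain local directory, no HDFS involved.
    spark.range(0, 5).write.format("delta").save("/tmp/delta-table")

    # Read it back to confirm the _delta_log was created on local disk.
    spark.read.format("delta").load("/tmp/delta-table").show()

If something like that works, the same pattern should presumably point at S3 or another non-HDFS store, which is what I would want alongside Drill.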
