Related work in Iceberg. Worth a read:
https://docs.google.com/document/d/1Pk34C3diOfVCRc-sfxfhXZfzvxwum1Odo-6Jj9mwK38/edit#
On Tue, May 28, 2019 at 2:17 PM Aman Sinha wrote:
> The description I sent is for the planner but there's of course a run-time
> component which would consist of a
The description I sent is for the planner but there's of course a run-time
component which would consist of a 'RecordWriter' for the underlying DB.
In case of MapR-DB, this RecordWriter would simply call the underlying PUT
or the Bulk PUT API. In addition, we need to figure out the
Yes, Calcite already supports the INSERT/UPSERT syntax. Within Drill, you
would need to 'unblock' this syntax (not all of it but whatever variation
we may want to support). You can take a look at DrillParserImpl.java
(SqlInsert() method) which is actually a generated file from JavaCC.
We would
Yes. CTAS should be a similar problem to unsafe inserts.
We have a few people interested in the work. What is needed more is
pointers to where to find out about the details.
1. How can we enable the syntax?
2. What operators are really necessary?
3. How should writers inject insert optimizer
Hi Ted,
Drill can do a CTAS today, which uses a writer provided by the format plugin.
One would think this same structure could work for an INSERT operation, with a
writer provided by the storage plugin. The devil, of course, is always in the
details. And in finding resources to do the work...
And I should point out that Drill already has the problem of data that
changes. It just ignores the problem. If somebody appends to one CSV or
JSON file or another, some changes might get picked up, some might be seen
mid-change (causing a data syntax error, possibly) or if DB rows are
inserted
I have in mind the ability to push rows to an underlying DB without any
transactional support.
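
To make the run-time side of this concrete, here is a minimal sketch of a writer that buffers rows and pushes them to a DB through a bulk PUT call, with no transactional guarantees. Everything named here is an assumption for illustration only: `BatchingRecordWriter`, the `bulk_put` callable, and `batch_size` are hypothetical and are not Drill's RecordWriter interface or the MapR-DB API.

```python
from typing import Callable, Dict, List

# A row is modeled as a plain dict; a real writer would receive Drill vectors.
Row = Dict[str, object]


class BatchingRecordWriter:
    """Hypothetical writer: buffers rows and hands them to a bulk-put callable."""

    def __init__(self, bulk_put: Callable[[List[Row]], None], batch_size: int = 100):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._bulk_put = bulk_put      # stand-in for the DB's bulk PUT API
        self._batch_size = batch_size
        self._buffer: List[Row] = []
        self.rows_written = 0

    def write(self, row: Row) -> None:
        """Buffer one row and flush once the batch is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send buffered rows in one call. There is no rollback of earlier batches."""
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self._bulk_put(batch)
        self.rows_written += len(batch)

    def close(self) -> None:
        """Push any remaining rows."""
        self.flush()
```

Since nothing is transactional, a failure in a later `bulk_put` call leaves the earlier batches in the DB, which is the "unsafe insert" behavior discussed in this thread.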
On Mon, May 27, 2019 at 2:16 PM Paul Rogers wrote:
> Hi Ted,
>
> From item 3, it sounds like you are focusing on using Drill to front a DB
> system, rather than proposing to use Drill to update
Hi Ted,
From item 3, it sounds like you are focusing on using Drill to front a DB
system, rather than proposing to use Drill to update files in a distributed
file system (DFS).
Turns out that, for the DFS case, the former Hortonworks put quite a bit into
working out viable insert/update
I would like to start a discussion about how to add insert capabilities to
Drill.
It seems that the basic outline is:
1) making sure Calcite will parse it (almost certain)
2) defining an upsert operator in the logical plan
3) pushing rules into Drill from the DB driver to allow Drill to push down