Re: adding insert

2019-05-29 Thread Parth Chandra
Related work in Iceberg. Worth a read: https://docs.google.com/document/d/1Pk34C3diOfVCRc-sfxfhXZfzvxwum1Odo-6Jj9mwK38/edit# On Tue, May 28, 2019 at 2:17 PM Aman Sinha wrote: > The description I sent is for the planner but there's of course a run-time > component which would consist of a

Re: adding insert

2019-05-28 Thread Aman Sinha
The description I sent is for the planner but there's of course a run-time component which would consist of a 'RecordWriter' for the underlying DB. In case of MapR-DB, this RecordWriter would simply call the underlying PUT or the Bulk PUT API. In addition, we need to figure out the

Re: adding insert

2019-05-28 Thread Aman Sinha
Yes, Calcite already supports the INSERT/UPSERT syntax. Within Drill, you would need to 'unblock' this syntax (not all of it but whatever variation we may want to support). You can take a look at DrillParserImpl.java (SqlInsert() method) which is actually a generated file from JavaCC. We would

Re: adding insert

2019-05-28 Thread Ted Dunning
Yes. CTAS should be a similar problem to unsafe inserts. We have a few people interested in the work. What is needed more is pointers to where to find out about the details. 1. How can we enable the syntax? 2. What operators are really necessary? 3. How should writers inject insert optimizer

Re: adding insert

2019-05-27 Thread Paul Rogers
Hi Ted, Drill can do a CTAS today, which uses a writer provided by the format plugin. One would think this same structure could work for an INSERT operation, with a writer provided by the storage plugin. The devil, of course, is always in the details. And in finding resources to do the work...
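
To make the writer idea concrete, here is a minimal sketch of an INSERT-side writer supplied by a storage plugin. It is not Drill's actual writer interface: `KeyValueStore`, `bulk_put`, `InMemoryStore`, and `BatchingInsertWriter` are hypothetical names invented for illustration, and the in-memory store merely stands in for a real DB client. The sketch buffers incoming rows and pushes them to the store in bulk, with no transactional guarantees.

```python
from typing import Dict, List, Protocol

Row = Dict[str, object]


class KeyValueStore(Protocol):
    """Hypothetical DB client interface; not a real Drill or MapR-DB API."""

    def bulk_put(self, rows: List[Row]) -> None: ...


class InMemoryStore:
    """Stand-in for a real DB so the sketch can run on its own."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self.bulk_calls = 0

    def bulk_put(self, rows: List[Row]) -> None:
        self.rows.extend(rows)
        self.bulk_calls += 1


class BatchingInsertWriter:
    """Hypothetical writer: buffers rows and flushes them in batches.

    There is no transactional support here. Rows already flushed stay
    in the store even if a later batch fails.
    """

    def __init__(self, store: KeyValueStore, batch_size: int = 2) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._store = store
        self._batch_size = batch_size
        self._buffer: List[Row] = []

    def write(self, row: Row) -> None:
        # Buffer the row; push a full batch to the store as soon as we have one.
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered; do nothing if the buffer is empty.
        if self._buffer:
            self._store.bulk_put(list(self._buffer))
            self._buffer.clear()

    def close(self) -> None:
        # Flush the final partial batch at end of input.
        self.flush()
```

A caller would invoke `write()` once per row and `close()` at end of input. Batching is one possible design choice for amortizing round trips to the DB; a single-row PUT per `write()` would be the simpler alternative.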

Re: adding insert

2019-05-27 Thread Ted Dunning
And I should point out that Drill already has the problem of data that changes. It just ignores the problem. If somebody appends to one CSV or JSON file or another, some changes might get picked up, some might be seen mid-change (causing a data syntax error, possibly) or if DB rows are inserted

Re: adding insert

2019-05-27 Thread Ted Dunning
I have in mind the ability to push rows to an underlying DB without any transactional support. On Mon, May 27, 2019 at 2:16 PM Paul Rogers wrote: > Hi Ted, > > From item 3, it sounds like you are focusing on using Drill to front a DB > system, rather than proposing to use Drill to update

Re: adding insert

2019-05-27 Thread Paul Rogers
Hi Ted, From item 3, it sounds like you are focusing on using Drill to front a DB system, rather than proposing to use Drill to update files in a distributed file system (DFS). Turns out that, for the DFS case, the former Hortonworks put quite a bit into working out viable insert/update

adding insert

2019-05-27 Thread Ted Dunning
I would like to start a discussion about how to add insert capabilities to Drill. It seems that the basic outline is: 1) making sure Calcite will parse it (almost certain) 2) defining an upsert operator in the logical plan 3) push rules into Drill from the DB driver to allow Drill to push down