Re: Dynamic UDFs support

Arina Yelchiyeva Fri, 29 Jul 2016 06:54:02 -0700

Hi all!

I have prepared an updated design document (it describes the approach that
uses DFS, ZooKeeper, lazy-init, and the CREATE and DROP notions).
Please use the new link:
https://docs.google.com/document/d/1FfyJtWae5TLuyheHCfldYUpCdeIezR2RlNsrOTYyAB4/edit?usp=sharing


The link to the previous document can be found in the References section.

Kind regards
Arina

On Tue, Jul 26, 2016 at 8:35 PM, yuliya Feldman <[email protected]
> wrote:

> Thank you Arina
> Yuliya
>
>       From: Arina Yelchiyeva <[email protected]>
>  To: [email protected]; yuliya Feldman <[email protected]>
>  Sent: Tuesday, July 26, 2016 10:11 AM
>  Subject: Re: Dynamic UDFs support
>
> Sure, I'll add this option. I'll send a link to final document once it's
> done.
>
> On Tue, Jul 26, 2016 at 8:06 PM Keys Botzum <[email protected]> wrote:
>
> > +1
> >
> > Keys
> > _______________________________
> > Keys Botzum
> > Senior Principal Technologist
> > [email protected] <mailto:[email protected]>
> > 443-718-0098
> > MapR Technologies
> > http://www.mapr.com <http://www.mapr.com/>
> > > On Jul 26, 2016, at 1:05 PM, yuliya Feldman
> <[email protected]>
> > wrote:
> > >
> > > I want to make sure (I will also make a note in the design doc) that we
> > > have an option to disable dynamic loading/unloading of UDFs until we are
> > > able to do proper authentication AND authorization of the user(s).
> > >
> > >      From: Arina Yelchiyeva <[email protected] <mailto:
> > [email protected]>>
> > > To: [email protected] <mailto:[email protected]>
> > > Sent: Monday, July 25, 2016 9:09 AM
> > > Subject: Re: Dynamic UDFs support
> > >
> > > My fault, agree, DROP is more appropriate.
> > > Thanks Julian!
> > >
> > > On Mon, Jul 25, 2016 at 7:07 PM Julian Hyde <[email protected]
> > <mailto:[email protected]>> wrote:
> > >
> > >> But don't call it DELETE. In SQL the opposite of CREATE is DROP.
> > >>
> > >> Julian
> > >>
> > >>> On Jul 25, 2016, at 8:48 AM, Keys Botzum <[email protected]
> > <mailto:[email protected]>> wrote:
> > >>>
> > >>> I like the approach to handling DELETE. This is very useful. I think an
> > >>> implementation that does not guarantee consistent behavior is perfectly
> > >>> fine for use that is targeted at developers who are working on UDFs. As
> > >>> long as the docs make the intent clear, this makes me very happy.
> > >>>
> > >>> I'll defer to others more expert than I on the remainder of the design.
> > >>>
> > >>> Keys
> > >>> _______________________________
> > >>> Keys Botzum
> > >>> Senior Principal Technologist
> > >>> [email protected]
> > >>> 443-718-0098
> > >>> MapR Technologies
> > >>> http://www.mapr.com
> > >>>> On Jul 25, 2016, at 9:55 AM, Arina Yelchiyeva <
> > >> [email protected] <mailto:[email protected]>>
> wrote:
> > >>>>
> > >>>> Taking into account all previous comments and the discussion we had
> > >>>> with Parth and Paul, please find below my design notes (I am going to
> > >>>> prepare a proper design document; I just want to see if all agree with
> > >>>> the raw version).
> > >>>> I propose we use lazy-init for dynamically loaded UDFs. In that case,
> > >>>> when a user issues the CREATE UDF command, the foreman will only
> > >>>> validate the jar and update the ZK function registry; the function will
> > >>>> be loaded to the appropriate drillbit only when it is needed (during
> > >>>> the planning stage or fragment execution). We might add listeners (as
> > >>>> Paul proposed) to pre-load UDFs. I didn't include this in the current
> > >>>> release to keep the solution simple, but we might reconsider.
> > >>>> I have looked at the issue with class loading and unloading: if we ship
> > >>>> each jar with its own classloader, DELETE functionality can be
> > >>>> introduced in the current release, at least marked as experimental or
> > >>>> for developer use only, to ease the UDF development process.
> > >>>>
> > >>>> Any comments are welcome.
> > >>>>
> > >>>> *Invariants*
> > >>>>
> > >>>> 1. DFS staging area where the user copies the jar to be loaded
> > >>>>
> > >>>> 2. DFS udf area (former registration area) where all validated jars are
> > >>>> present
> > >>>>
> > >>>> 3. ZK function registry - contains the list of all dynamically loaded
> > >>>> UDFs and their jars. A UDF name will be represented as the combination
> > >>>> of its name and input parameters.
> > >>>>
> > >>>> 4. Lazy-init - all dynamically loaded UDFs will be loaded to a drillbit
> > >>>> upon request, i.e. if the drillbit receives a query or fragment that
> > >>>> contains such a UDF
> > >>>>
> > >>>> 5. Currently only CREATE and DELETE statements are supported
> > >>>>
> > >>>>
> > >>>> *Adding UDFs*
> > >>>>
> > >>>> 1. User copies source and binary (hereinafter jar) to DFS staging area
> > >>>> 2. User issues CREATE UDF command
> > >>>> 3. Foreman receives request to create UDF:
> > >>>>    a) checks if jar is present in staging area
> > >>>>    b) copies jar to temporary DFS location
> > >>>>    c) validates UDFs present in jar locally:
> > >>>>       1) copies jar to temporary local fs
> > >>>>       2) scans jar using temporary classloader
> > >>>>       3) checks if there are any duplicates in local function registry
> > >>>>       4) returns list of UDFs to be registered
> > >>>>    d) validates UDFs present in jar in ZK:
> > >>>>       1) takes list of dynamically loaded UDFs from ZK
> > >>>>       2) checks that there are no duplicates either by jar name or among
> > >>>>          UDFs
> > >>>>       3) moves jar from DFS temporary area to DFS udf area
> > >>>>       4) updates ZK with list of new dynamic UDFs
> > >>>>       5) removes jar from staging area
> > >>>>       6) returns confirmation to user that UDFs were registered
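
The CREATE steps above can be sketched roughly as follows. This is only an illustration in Python, not Drill code: the dicts stand in for the DFS areas and the ZK function registry, and create_udf, signature, scan_jar and DuplicateUdfError are hypothetical names. The separate temporary DFS and local copies are collapsed into one variable.

```python
class DuplicateUdfError(Exception):
    """Raised when a jar name or UDF signature is already registered."""


def signature(name, param_types):
    # A UDF is identified by its name combined with its input parameters.
    return f"{name}({','.join(param_types)})"


def create_udf(jar_name, staging, udf_area, zk_registry, local_registry, scan_jar):
    """Register all UDFs found in jar_name and return their signatures.

    staging, udf_area: dicts (jar name -> jar bytes) standing in for DFS areas.
    zk_registry: dict (jar name -> set of signatures) standing in for ZK.
    local_registry: set of signatures already known to this drillbit.
    scan_jar: hypothetical callable returning [(name, param_types), ...].
    """
    # a) the jar must be present in the staging area
    if jar_name not in staging:
        raise FileNotFoundError(jar_name)
    # b) copy the jar to a temporary location (one variable in this sketch)
    tmp_jar = staging[jar_name]
    # c) local validation: scan the jar, check the local function registry
    udfs = {signature(name, params) for name, params in scan_jar(tmp_jar)}
    if udfs & local_registry:
        raise DuplicateUdfError(sorted(udfs & local_registry))
    # d) validation against ZK: no duplicate jar name, no duplicate UDFs
    if jar_name in zk_registry:
        raise DuplicateUdfError(jar_name)
    registered = set().union(*zk_registry.values())
    if udfs & registered:
        raise DuplicateUdfError(sorted(udfs & registered))
    udf_area[jar_name] = tmp_jar   # move jar to the DFS udf area
    zk_registry[jar_name] = udfs   # update ZK with the new UDFs
    del staging[jar_name]          # remove jar from the staging area
    return sorted(udfs)            # confirmation returned to the user
```

Note that the last three statements are not atomic, which is the gap described under Limitations below.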
> > >>>>
> > >>>>
> > >>>> *Lazy-init*
> > >>>>
> > >>>> 1. User issues query with dynamically loaded UDF.
> > >>>>
> > >>>> 2. During the planning stage or fragment execution, if the UDF is not
> > >>>> present in the local function registry, the drillbit:
> > >>>>
> > >>>> a) checks if such a UDF is present in the ZK function registry
> > >>>>
> > >>>> b) if present, loads the UDF using the jar name, otherwise returns an error
> > >>>>
> > >>>> c) proceeds with the planning stage or fragment execution
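
A rough Python sketch of the lazy-init lookup described above, under the same assumptions as before: plain dicts stand in for the local and ZK function registries, and resolve_udf, load_jar and UnknownFunctionError are hypothetical names, not Drill APIs.

```python
class UnknownFunctionError(Exception):
    """Raised when a UDF is in neither the local nor the ZK registry."""


def resolve_udf(sig, local_registry, zk_registry, load_jar):
    """Return the implementation for signature sig, loading its jar on demand.

    local_registry: dict (signature -> callable) held by this drillbit.
    zk_registry: dict (jar name -> set of signatures) standing in for ZK.
    load_jar: hypothetical callable (jar name -> dict of signature -> callable).
    """
    # Fast path: the UDF is already in the local function registry.
    if sig in local_registry:
        return local_registry[sig]
    # a) check whether the UDF is present in the ZK function registry
    for jar_name, sigs in zk_registry.items():
        if sig in sigs:
            # b) load every UDF from that jar into the local registry
            local_registry.update(load_jar(jar_name))
            # c) planning or fragment execution proceeds with the function
            return local_registry[sig]
    # b) not registered anywhere: return an error
    raise UnknownFunctionError(sig)
```

The jar is loaded at most once per drillbit; later lookups for any UDF from the same jar hit the local registry.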
> > >>>>
> > >>>>
> > >>>> *New drillbit registration / Drillbit re-start*
> > >>>>
> > >>>> The local udf directory is re-created to clean up previously loaded
> > >>>> jars, if any
> > >>>>
> > >>>> *Delete UDF*
> > >>>>
> > >>>> Each jar that is going to be loaded dynamically will have its own
> > >>>> classloader, which will solve the problem of loading and unloading
> > >>>> classes with the same name.
> > >>>>
> > >>>>
> > >>>> 1. User issues DELETE command (delete will operate at the jar name level)
> > >>>>
> > >>>> 2. Foreman receives DELETE request:
> > >>>>
> > >>>> a) checks if such jar is present in ZK function registry
> > >>>>
> > >>>> b) creates ephemeral znode /udf/delete/jar_name
> > >>>>
> > >>>> c) removes record in ZK function registry
> > >>>>
> > >>>> d) removes jar from DFS udf area
> > >>>>
> > >>>> e) removes ephemeral znode from /udf/delete/jar_name
> > >>>>
> > >>>> f) returns confirmation to user that UDFs were deleted
> > >>>>
> > >>>> 3. Drillbits are subscribed to the /udf/delete znode; when a new znode
> > >>>> with a jar name appears, each drillbit:
> > >>>>
> > >>>> a) removes all UDFs associated with the jar name from its local function
> > >>>> registry
> > >>>>
> > >>>> b) removes jar from local udf directory
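
The DELETE flow above (called DROP later in the thread) might look roughly like this in Python. It is a simplified sketch: the ZK watch is modelled as a direct, synchronous callback, whereas real notifications would be asynchronous, and Drillbit, on_delete_znode and delete_jar are hypothetical names.

```python
class Drillbit:
    """Holds the per-drillbit state touched by a delete notification."""

    def __init__(self):
        self.local_registry = {}  # signature -> jar name it was loaded from
        self.local_jars = set()   # jar names in the local udf directory

    def on_delete_znode(self, jar_name):
        # 3a) remove all UDFs associated with the jar from the local registry
        self.local_registry = {
            sig: jar for sig, jar in self.local_registry.items() if jar != jar_name
        }
        # 3b) remove the jar from the local udf directory
        self.local_jars.discard(jar_name)


def delete_jar(jar_name, zk_registry, udf_area, delete_znodes, drillbits):
    """Delete a jar and all its UDFs; returns the removed signatures."""
    # a) the jar must be present in the ZK function registry
    if jar_name not in zk_registry:
        raise KeyError(jar_name)
    # b) create the /udf/delete/jar_name znode; subscribers are notified
    delete_znodes.add(jar_name)
    for bit in drillbits:
        bit.on_delete_znode(jar_name)
    # c) remove the record from the ZK function registry
    removed = sorted(zk_registry.pop(jar_name))
    # d) remove the jar from the DFS udf area
    udf_area.pop(jar_name, None)
    # e) remove the znode again
    delete_znodes.discard(jar_name)
    # f) confirmation returned to the user
    return removed
```

Nothing here waits for running queries, so a fragment that still needs a deleted UDF can fail, as noted under Limitations.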
> > >>>>
> > >>>>
> > >>>> *Limitations*
> > >>>>
> > >>>> 1. When a user runs the DELETE command, some queries that are using the
> > >>>> deleted UDFs may fail during fragment execution if by that time the UDF
> > >>>> has been deleted from the local registry. Ideally, before submitting
> > >>>> the DELETE command, the user needs to make sure no one is running
> > >>>> queries using UDFs from that particular jar.
> > >>>>
> > >>>>
> > >>>> 2. We encourage users not to delete any jars from the DFS udf area
> > >>>> manually, as it may lead to inconsistency between the ZK function
> > >>>> registry and the DFS udf area.
> > >>>>
> > >>>>
> > >>>> 3. The CREATE statement is not atomic between copying the validated jar
> > >>>> to the DFS udf area and updating the ZK function registry with the list
> > >>>> of new UDFs. In case of failure between these two steps, some unused
> > >>>> jars may be left in the DFS udf area, but they won’t harm the current
> > >>>> process. A LIST JARS command can be introduced to show used jars.
> > >>>>
> > >>>>
> > >>>> Kind regards
> > >>>> Arina
> > >>>>
> > >>>>> On Fri, Jul 22, 2016 at 7:15 PM Keys Botzum <[email protected]
> > <mailto:[email protected]>>
> > >> wrote:
> > >>>>>
> > >>>>> No disagreement on deferral, but I raised my initial concern precisely
> > >>>>> because I'm concerned about the practicality of the "restart the
> > >>>>> cluster" option. I cited my concerns about laptops and development
> > >>>>> clusters. I was wondering if there might be some small things Drill
> > >>>>> could do to help. If there is nothing that can be done to make this
> > >>>>> easier, so be it, but I think that's going to be a big impediment.
> > >>>>>
> > >>>>> Keys
> > >>>>> _______________________________
> > >>>>> Keys Botzum
> > >>>>> Senior Principal Technologist
> > >>>>> [email protected]
> > >>>>> 443-718-0098
> > >>>>> MapR Technologies
> > >>>>> http://www.mapr.com
> > >>>>>>> On Jul 22, 2016, at 1:37 AM, Neeraja Rentachintala <
> > >>>>>> [email protected] <mailto:[email protected]>>
> > wrote:
> > >>>>>>
> > >>>>>> It seems like we are reaching a conclusion here in terms of starting
> > >>>>>> with a simpler implementation, i.e. being able to deploy UDFs
> > >>>>>> dynamically without Drillbit restarts, based on jars in a DFS
> > >>>>>> location. Dropping functions dynamically is out of scope for version 1
> > >>>>>> of this feature (we assume development of UDFs is happening on a user
> > >>>>>> laptop or a dev cluster where it's OK to have a restart).
> > >>>>>>
> > >>>>>> -Neeraja
> > >>>>>>
> > >>>>>>> On Thu, Jul 21, 2016 at 11:56 AM, Keys Botzum <
> > [email protected] <mailto:[email protected]>>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> I recognize the difficulty. I'm not suggesting this be addressed in the
> > >>>>>>> first version, just suggesting some thought about how a real user will
> > >>>>>>> work around it. Maybe some doc and/or small changes can make this
> > >>>>>>> easier.
> > >>>>>>>
> > >>>>>>> Keys
> > >>>>>>> _______________________________
> > >>>>>>> Keys Botzum
> > >>>>>>> Senior Principal Technologist
> > >>>>>>> [email protected] <mailto:[email protected]>
> > >>>>>>> 443-718-0098
> > >>>>>>> MapR Technologies
> > >>>>>>> http://www.mapr.com
> > >>>>>>>> On Jul 21, 2016 1:45 PM, "Paul Rogers" <[email protected]>
> > >> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi All,
> > >>>>>>>>
> > >>>>>>>> Adding a dynamic DROP would, of course, be a great addition! The
> > >>>>>>>> reason for suggesting we skip that was to control project scope.
> > >>>>>>>>
> > >>>>>>>> Dynamic DROP requires a synchronization step. Here’s the scenario:
> > >>>>>>>>
> > >>>>>>>> * Foreman A starts a query using UDF U.
> > >>>>>>>> * Foreman B receives a request to drop UDF U, followed by a request
> > >>>>>>>> to add a new version of U, U’.
> > >>>>>>>>
> > >>>>>>>> How do we drop a function that may be in use? There are some tricky
> > >>>>>>>> bits to work out, which seemed too overwhelming to consider all in
> > >>>>>>>> one go.
> > >>>>>>>>
> > >>>>>>>> Clearly just dropping U and adding a new version of U with the same
> > >>>>>>>> name leads to issues if not synchronized. If a Drillbit D is running
> > >>>>>>>> a query with U when it receives notice to drop U, should D complete
> > >>>>>>>> the query or fail it? If the query completes, then how does D deal
> > >>>>>>>> with the request to register U’, which has the same name?
> > >>>>>>>>
> > >>>>>>>> Do we globally synchronize function deletion? (The foreman B that
> > >>>>>>>> receives the drop request waits for all queries using U to finish.)
> > >>>>>>>> But, how do we know which queries use U?
> > >>>>>>>>
> > >>>>>>>> An eventually consistent approach is to track the age of the oldest
> > >>>>>>>> running query. Suppose B drops U at time T. Any query received after
> > >>>>>>>> T that uses U will fail in planning. A new U’ can’t be registered
> > >>>>>>>> until all queries that started before T complete.
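
The eventually consistent rule described above can be sketched as follows. This is a single-process Python illustration with explicit logical timestamps; DropTracker and its methods are hypothetical, and a real cluster would also need a way to share this state between foremen, which the sketch does not address.

```python
class DropTracker:
    """Tracks running queries and drop times to gate re-registration."""

    def __init__(self):
        self.running = {}  # query id -> start time
        self.dropped = {}  # UDF name -> time T at which it was dropped

    def start_query(self, query_id, udfs, now):
        # Any query received after T that uses a dropped UDF fails in planning.
        blocked = [u for u in udfs if u in self.dropped]
        if blocked:
            raise LookupError(f"dropped UDFs: {blocked}")
        self.running[query_id] = now

    def finish_query(self, query_id):
        self.running.pop(query_id, None)

    def drop(self, udf, now):
        self.dropped[udf] = now

    def can_register(self, udf):
        # A new version may be registered only once every query that started
        # before the drop time T has completed.
        t = self.dropped.get(udf)
        if t is None:
            return True
        return all(start >= t for start in self.running.values())

    def register(self, udf):
        if not self.can_register(udf):
            raise RuntimeError(f"queries older than the drop of {udf} still run")
        self.dropped.pop(udf, None)
```

For example, if q1 starts at time 1 and U is dropped at time 2, register("U") is refused until finish_query("q1") has been called. The check is conservative: it waits for every query older than T, not only those that use U, since the thread notes it is unclear how to know which queries use U.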
> > >>>>>>>>
> > >>>>>>>> The primary challenge we face in both the CREATE and DROP cases is
> > >>>>>>>> that Drill is distributed with little central coordination. That’s
> > >>>>>>>> great for scale, but makes it hard to design features that require
> > >>>>>>>> coordination. Some other tools solve this problem with a data
> > >>>>>>>> dictionary (or “metastore”). Alas, Drill does not have such a
> > >>>>>>>> concept. So a seemingly simple feature like dynamic UDF becomes a
> > >>>>>>>> major design challenge to get right.
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>>
> > >>>>>>>> - Paul
> > >>>>>>>>
> > >>>>>>>>>> On Jul 21, 2016, at 7:21 AM, Neeraja Rentachintala <
> > >>>>>>>>> [email protected]> wrote:
> > >>>>>>>>>
> > >>>>>>>>> The whole point of this feature is to avoid Drill cluster restarts, as
> > >>>>>>>>> the name 'Dynamic' UDFs indicates.
> > >>>>>>>>> So any design that requires restarts would, I think, defeat the
> > >>>>>>>>> purpose.
> > >>>>>>>>>
> > >>>>>>>>> I also think this is an example of a feature where we start with a
> > >>>>>>>>> simple design to serve the purpose, take feedback on how it is being
> > >>>>>>>>> deployed/used in real user situations, and improve it in subsequent
> > >>>>>>>>> releases.
> > >>>>>>>>>
> > >>>>>>>>> -thanks
> > >>>>>>>>> Neeraja
> > >>>>>>>>>
> > >>>>>>>>>> On Thu, Jul 21, 2016 at 6:32 AM, Keys Botzum <
> > >> [email protected]>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> I think there are a lot of great ideas here. My one concern is the
> > >>>>>>>>>> lack of unload and thus presumably replace functionality. I'm just
> > >>>>>>>>>> thinking about typical actual usage.
> > >>>>>>>>>>
> > >>>>>>>>>> In a typical development cycle someone writes something, tries it,
> > >>>>>>>>>> learns, changes it, and tries again. Assuming I understand the
> > >>>>>>>>>> design, that change step requires a full Drill cluster restart. That
> > >>>>>>>>>> is going to be very disruptive and will make UDF work nearly
> > >>>>>>>>>> impossible without a dedicated "private" cluster for Drill. I
> > >>>>>>>>>> realize that people should have access to the data they need and to
> > >>>>>>>>>> Drill in a development cluster, but even then restarts can be hard
> > >>>>>>>>>> since development clusters are often shared - and that's assuming
> > >>>>>>>>>> such a cluster exists. I realize of course Drill can be run as a
> > >>>>>>>>>> standalone Drillbit, but I'm not convinced that desktops will have
> > >>>>>>>>>> adequate access to the needed data.
> > >>>>>>>>>>
> > >>>>>>>>>> Having dealt with Java classloading over the years, I'm not claiming
> > >>>>>>>>>> class replacement is an easy thing, so I'll defer to others on the
> > >>>>>>>>>> priority of that, but I'm wondering if there isn't some way to make
> > >>>>>>>>>> UDF experimentation a bit easier/practical.
> > >>>>>>>>>>
> > >>>>>>>>>> Given the above, let me toss out some possibly naive ideas that
> > >>>>>>>>>> maybe are workable:
> > >>>>>>>>>> * Can I easily run a standalone Drillbit on a Hadoop cluster node
> > >>>>>>>>>> that is already running Drill servers? I'm sure this can be done,
> > >>>>>>>>>> but is it easy? Could we perhaps make this clearer as an explicit
> > >>>>>>>>>> kind of thing?
> > >>>>>>>>>> * Is there a way that when I deploy a UDF I can constrain the # of
> > >>>>>>>>>> bits it is loaded into and perhaps even specify the bits?
> > >>>>>>>>>> * The obvious corollary is that I'd want my query to run on those
> > >>>>>>>>>> bits, and a not-too-disruptive way to restart just those bits
> > >>>>>>>>>> The above may be obvious to Drill experts. If it is, then perhaps the
> > >>>>>>>>>> UDF docs could just point out how to easily develop UDFs in an
> > >>>>>>>>>> iterative fashion.
> > >>>>>>>>>>
> > >>>>>>>>>> Keys
> > >>>>>>>>>> _______________________________
> > >>>>>>>>>> Keys Botzum
> > >>>>>>>>>> Senior Principal Technologist
> > >>>>>>>>>> [email protected] <mailto:[email protected]>
> > >>>>>>>>>> 443-718-0098
> > >>>>>>>>>> MapR Technologies
> > >>>>>>>>>> http://www.mapr.com <http://www.mapr.com/>
> >
> >
>
>
>
