Re: On coprocessor API evolution

Andrew Purtell Sat, 17 May 2014 19:56:23 -0700

A related issue is HBASE-2396, which envisions building a sandbox inside an
embedded scripting environment on top of the coprocessor hooks, for 'safer'
execution of what we might call user code triggers. See the description on
the issue and then the large-ish comment 6 or 7 down from the top.
HBASE-2396 could never fully address isolation concerns if done in process,
so it should be built on top of HBASE-4047.


Either or both are a fair amount of full time work.


On Sat, May 17, 2014 at 8:02 AM, Kevin O'Dell <kevin.od...@cloudera.com> wrote:

> Andrew,
>
>    HBASE-4047 is a great idea (even if it is three years old).  I have had
> numerous customers implement Co-Procs and take down every RS in
> spectacular fashion, from JVM crashes to performance crawling so slowly that
> jobs fail out.  I will raise this internally and see if we can get some
> extra traction.
>
>
> On Sat, May 17, 2014 at 9:33 AM, Andrew Purtell <apurt...@apache.org>
> wrote:
>
> > Great, see HBASE-4047. In the best of the open source tradition, there
> > hasn't been anyone sufficiently motivated to do the work necessary
> > (current use cases are "good enough"), but that someone can always come
> > along. Perhaps that is yourself.
> >
> >
> > On Sat, May 17, 2014 at 5:39 AM, Michael Segel <michael_se...@hotmail.com> wrote:
> >
> > > You have to understand…
> > >
> > > I do see the importance of the hook to allow for a trigger to implement
> > > 3rd party code on the server side.
> > > No argument there.
> > >
> > > > It’s just that the current implementation doesn’t sandbox the code so as
> > > > to limit the potential for harm to the RS.
> > >
> > > > In simple terms you can isolate the code into a separate JVM and use IPC
> > > > to connect the sandbox to the RS when a trigger occurs.
> > >
> > > > In C/C++ you’d have shared memory segments, something you don’t really
> > > > have in Java.  (You could use C and then put a JNI wrapper around this…)
> > >
> > > > Which goes to my point… this is something that is solvable. You just need
> > > > to think about it…
> > >
> > > > You talk about RDBMSs. Triggers themselves are not an equivalent analogy.
> > > > You can have a trigger that then calls some code written in an SPL and
> > > > you’re ok. You can control the SPL environment so that you limit the risk
> > > > of the server crashing.
> > > > (SPL == Stored Procedure Language)
> > >
> > > > If you’re running third party code from your trigger that is written in
> > > > C/C++ or Java, then you have other issues.
> > >
> > > > Sybase’s Adaptive Server had some serious issues, and poorly written
> > > > C/C++ code could cause serious performance issues… Informix IDS took a
> > > > different approach and didn’t have those issues.  And I’m aging myself
> > > > because most here probably never worked with either Sybase or Informix… ;-)
> > >
> > > > So using your RDBMS analogy… you have two different approaches. One worked
> > > > … well enough, but was problematic.  The other worked better and had fewer
> > > > issues and was more secure.
> > >
> > > > One of the reasons why this is important… the longer the current
> > > > implementation is in the wild, the longer it will take and the harder it
> > > > will be to fix.
> > >
> > >
> > > On May 17, 2014, at 11:44 AM, qiang tian <tian...@gmail.com> wrote:
> > >
> > > > My small 2 cents...:-)
> > > >
> > > > > Hook/coprocessor is a useful mechanism for interacting with a system for
> > > > > things that cannot be done via the API.  For the end user, tradeoff factors
> > > > > like performance, security, reliability etc. can be controlled by
> > > > > upper-layer policy.
> > > > > e.g. In an RDBMS, the end user has limited use cases for triggers, which
> > > > > eliminates the security factor entirely, and the performance tradeoff is
> > > > > left to the end user to decide. So from an evolution perspective,
> > > > > hook/coprocessor for the end user could be controlled by a query engine layer
> > > > > like Phoenix.
> > > >
> > > > > For internal users, hooks are better not used widely unless it is a MUST or
> > > > > strong flexibility/pluggability is required, e.g. things that can be part of
> > > > > the core are better off not using them.
> > > >
> > > > thanks.
> > > >
> > > >
> > > >
> > > > > On Sat, May 17, 2014 at 4:04 PM, Michael Segel <michael_se...@hotmail.com> wrote:
> > > >
> > > >> Andrew,
> > > >>
> > > > >> Is ‘magical fairy dust’ a reference to some new synthetic drug you take at
> > > > >> raves?
> > > > >> But let’s get back to reality.
> > > >>
> > > >>
> > > > >> Let’s try this again; simply put… the coprocessor runs in the same JVM as
> > > > >> the RS, therefore you have an unacceptable level of risk.
> > > > >> That inherent risk means that you cannot run HBase with end-user
> > > > >> coprocessors enabled when you want to have a stable and somewhat secure
> > > > >> environment.
> > > >>
> > > > >> The simple truth is that you need to decouple the end-user code
> > > > >> (coprocessor) from the RS.
> > > > >> It’s not a difficult concept to understand, and while reasonable, it would
> > > > >> mean a major rewrite and work done on coprocessors.
> > > >>
> > > > >> Will de-coupling the user-space from the RS remove all risk? No.  And no,
> > > > >> I’m not suggesting that.
> > > > >> But it’s a critical piece of the puzzle.
> > > > >>
> > > > >> It’s not just security, but also reliability.
> > > >>
> > > >>
> > > > >> On May 17, 2014, at 4:43 AM, Andrew Purtell <apurt...@apache.org> wrote:
> > > >>
> > > >>> Michael,
> > > >>>
> > > > >>> As you know, we have implemented security features with coprocessors
> > > > >>> precisely because they can be interposed on internal actions to make
> > > > >>> authoritative decisions in-process. Coprocessors are a way to have
> > > > >>> composable internal extensions. They don't have and probably never will
> > > > >>> have magic fairy security dust. We do trust the security coprocessor code
> > > > >>> because it was developed by the project. That is not the same thing as
> > > > >>> saying you can have 'security' and execute arbitrary user code in-process
> > > > >>> as a coprocessor. Just want to clear that up for you.
> > > >>>
> > > > >>>> will want to allow system coprocessors but then write a coprocessor that
> > > > >>>> rejects user coprocessors.
> > > >>>
> > > >>> That's a reasonable point.
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > > >>> On Sat, May 17, 2014 at 12:13 AM, Michael Segel
> > > > >>> <michael_se...@hotmail.com> wrote:
> > > >>>
> > > > >>>> Until you move the coprocessor out of the RS space and into its own
> > > > >>>> sandbox… saying security and coprocessor in the same sentence is a joke.
> > > >>>> Oh wait… you were serious… :-(
> > > >>>>
> > > > >>>> I’d say there’s a significant rethink on coprocessors that’s required.
> > > >>>>
> > > > >>>> Anyone running a secure (Kerberos) cluster will want to allow system
> > > > >>>> coprocessors but then write a coprocessor that rejects user coprocessors.
> > > >>>>
> > > >>>> Just putting it out there…
> > > >>>>
> > > > >>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <apurt...@apache.org> wrote:
> > > >>>>
> > > > >>>>> Because coprocessor APIs are so tightly bound with internals, if we apply
> > > > >>>>> suggested rules like the one mentioned on HBASE-11054:
> > > >>>>>
> > > > >>>>>    I'd say policy should be no changes to method apis across minor
> > > > >>>>>    versions
> > > >>>>>
> > > > >>>>> this will lock coprocessor based components to the limitations of the API
> > > > >>>>> as we encounter them. Core code does not suffer this limitation; we are
> > > > >>>>> otherwise free to refactor and change internal methods. For example, if we
> > > > >>>>> apply this policy to the 0.98 branch, then we will have to abandon further
> > > > >>>>> security feature development there and move to trunk only. This is because
> > > > >>>>> we are already aware that coprocessor APIs as they stand are still
> > > > >>>>> insufficient.
> > > >>>>>
> > > > >>>>> Coprocessor APIs are a special class of internal method. For a while we
> > > > >>>>> have had a tension between allowing freedom of movement for developing them
> > > > >>>>> out and providing some measure of stability for implementors.
> > > >>>>>
> > > > >>>>> It is my belief that the way forward is something like HBASE-11125. Perhaps
> > > > >>>>> we can take this discussion to that JIRA and have this long overdue
> > > > >>>>> conversation.
> > > >>>>>
> > > > >>>>> Regarding security features specifically, I would also like to call your
> > > > >>>>> attention to HBASE-11127. I think security has been an optional feature
> > > > >>>>> long enough; it is becoming a core requirement for the project, so should
> > > > >>>>> be moved into core. Sure, we can therefore sidestep any issues with
> > > > >>>>> coprocessor API sufficiency for hosting security features. However, in my
> > > > >>>>> opinion we should pursue both HBASE-11125 and HBASE-11127; the first to
> > > > >>>>> provide the relative stability long asked for by coprocessor API users, the
> > > > >>>>> latter to cleanly solve emerging issues with concurrency and versioning.
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Best regards,
> > > >>>>>
> > > >>>>> - Andy
> > > >>>>>
> > > > >>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > >>>>> (via Tom White)
> > > >>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Best regards,
> > > >>>
> > > >>>  - Andy
> > > >>>
> > > > >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > >>> (via Tom White)
> > > >>
> > > >>
> > >
> > >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> Kevin O'Dell
> Systems Engineer, Cloudera
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
