Re: On coprocessor API evolution

Kevin O'dell Sat, 17 May 2014 08:46:28 -0700

Andrew,

   HBASE-4047 is a great idea (even if it is three years old).  I have had
numerous customers implement coprocessors and take down every RS in
spectacular fashion, from JVM crashes to performance so slow that jobs
fail.  I will raise this internally and see if we can get some extra
traction.



On Sat, May 17, 2014 at 9:33 AM, Andrew Purtell <[email protected]> wrote:

> Great, see HBASE-4047. In the best of the open source tradition, there
> hasn't been anyone sufficiently motivated to do the work necessary (current
> use cases are "good enough"), but that someone can always come along.
> Perhaps that is yourself.
>
>
> On Sat, May 17, 2014 at 5:39 AM, Michael Segel <[email protected]> wrote:
>
> > You have to understand…
> >
> > I do see the importance of the hook to allow for a trigger to implement
> > 3rd party code on the server side.
> > No argument there.
> >
> > It’s just that the current implementation doesn’t sandbox the code to
> > limit the potential for harm to the RS.
> >
> > In simple terms, you can isolate the code in a separate JVM and use IPC
> > to connect the sandbox to the RS when a trigger occurs.
> >
> > In C/C++ you’d have shared memory segments, something you don’t really
> > have in Java.  (You could use C and then put a JNI wrapper around this…)
> >
> > Which goes to my point… this is something that is solvable. You just need
> > to think about it…
> >
> > You talk about RDBMSs. Triggers themselves are not an equivalent analogy.
> > You can have a trigger that then calls some code written in an SPL and
> > you’re ok. You can control the SPL environment so that you limit the risk
> > of the server crashing.
> > (SPL == Stored Procedure Language)
> >
> > If you’re running  third party code from your trigger that is written in
> > C/C++ or Java, then you have other issues.
> >
> > Sybase’s Adaptive Server had some serious issues, and poorly written
> > C/C++ code could cause serious performance issues… Informix IDS took a
> > different approach and didn’t have those issues.  And I’m dating myself
> > because most here probably never worked with either Sybase or Informix… ;-)
> >
> > So using your RDBMS analogy… you have two different approaches. One
> > worked… well enough, but was problematic.  The other worked better, had
> > fewer issues, and was more secure.
> >
> > One of the reasons why this is important… the longer the current
> > implementation is in the wild, the longer it will take and the harder it
> > will be to fix.
> >
> >
> > On May 17, 2014, at 11:44 AM, qiang tian <[email protected]> wrote:
> >
> > > My small 2 cents...:-)
> > >
> > > Hooks/coprocessors are a useful mechanism for interacting with a system
> > > for things that cannot be done via the API.  For end users, the tradeoff
> > > factors like performance, security, reliability, etc. can be controlled
> > > by an upper layer's policy.
> > > e.g. in an RDBMS, end users have limited use cases for triggers, which
> > > eliminates the security factor entirely, and the performance tradeoff is
> > > left to the end user to decide. So from an evolution perspective,
> > > hooks/coprocessors for end users could be controlled by a query engine
> > > layer like Phoenix.
> > >
> > > For internal users, hooks are better not used widely unless it is a MUST
> > > or strong flexibility/pluggability is required, e.g. things that can be
> > > part of the core had better not use them.
> > >
> > > thanks.
> > >
> > >
> > >
> > > On Sat, May 17, 2014 at 4:04 PM, Michael Segel <[email protected]> wrote:
> > >
> > >> Andrew,
> > >>
> > >> Is ‘magical fairy dust’ a reference to some new synthetic drug you take
> > >> at raves?
> > >> But let’s get back to reality.
> > >>
> > >>
> > >> Let’s try this again; simply put… the coprocessor runs in the same JVM as
> > >> the RS, therefore you have an unacceptable level of risk.
> > >> That inherent risk means that you cannot run HBase with end-user
> > >> coprocessors enabled when you want to have a stable and somewhat secure
> > >> environment.
> > >>
> > >> The simple truth is that you need to decouple the end-user code
> > >> (coprocessor) from the RS.
> > >> It’s not a difficult concept to understand, and while reasonable, it would
> > >> mean a major rewrite of, and work on, coprocessors.
> > >>
> > >> Will decoupling the user space from the RS remove all risk? No.  And no,
> > >> I’m not suggesting that.
> > >> But it’s a critical piece of the puzzle.
> > >>
> > >> It’s not just security, but also reliability.
> > >>
> > >>
> > >> On May 17, 2014, at 4:43 AM, Andrew Purtell <[email protected]> wrote:
> > >>
> > >>> Michael,
> > >>>
> > >>> As you know, we have implemented security features with coprocessors
> > >>> precisely because they can be interposed on internal actions to make
> > >>> authoritative decisions in-process. Coprocessors are a way to have
> > >>> composable internal extensions. They don't have and probably never will
> > >>> have magic fairy security dust. We do trust the security coprocessor code
> > >>> because it was developed by the project. That is not the same thing as
> > >>> saying you can have 'security' and execute arbitrary user code in-process
> > >>> as a coprocessor. Just want to clear that up for you.
> > >>>
> > >>>> will want to allow system coprocessors but then write a coprocessor that
> > >>>> rejects user coprocessors.
> > >>>
> > >>> That's a reasonable point.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Sat, May 17, 2014 at 12:13 AM, Michael Segel <[email protected]> wrote:
> > >>>
> > >>>> Until you move the coprocessor out of the RS space and into its own
> > >>>> sandbox… saying security and coprocessor in the same sentence is a joke.
> > >>>> Oh wait… you were serious… :-(
> > >>>>
> > >>>> I’d say there’s a significant rethink on coprocessors that’s required.
> > >>>>
> > >>>> Anyone running a secure (Kerberos) cluster will want to allow system
> > >>>> coprocessors but then write a coprocessor that rejects user coprocessors.
> > >>>>
> > >>>> Just putting it out there…
> > >>>>
> > >>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <[email protected]> wrote:
> > >>>>
> > >>>>> Because coprocessor APIs are so tightly bound with internals, if we apply
> > >>>>> suggested rules like the one mentioned on HBASE-11054:
> > >>>>>
> > >>>>>    I'd say policy should be no changes to method apis across minor
> > >>>>>    versions
> > >>>>>
> > >>>>> this will lock coprocessor based components to the limitations of the API
> > >>>>> as we encounter them. Core code does not suffer this limitation; we are
> > >>>>> otherwise free to refactor and change internal methods. For example, if we
> > >>>>> apply this policy to the 0.98 branch, then we will have to abandon further
> > >>>>> security feature development there and move to trunk only. This is because
> > >>>>> we already are aware that coprocessor APIs as they stand are still
> > >>>>> insufficient.
> > >>>>>
> > >>>>> Coprocessor APIs are a special class of internal method. We have had a
> > >>>>> tension between allowing freedom of movement for developing them out and
> > >>>>> providing some measure of stability for implementors for a while.
> > >>>>>
> > >>>>> It is my belief that the way forward is something like HBASE-11125. Perhaps
> > >>>>> we can take this discussion to that JIRA and have this long overdue
> > >>>>> conversation.
> > >>>>>
> > >>>>> Regarding security features specifically, I would also like to call your
> > >>>>> attention to HBASE-11127. I think security has been an optional feature
> > >>>>> long enough, it is becoming a core requirement for the project, so should
> > >>>>> be moved into core. Sure, we can therefore sidestep any issues with
> > >>>>> coprocessor API sufficiency for hosting security features. However, in my
> > >>>>> opinion we should pursue both HBASE-11125 and HBASE-11127; the first to
> > >>>>> provide the relative stability long asked for by coprocessor API users, the
> > >>>>> latter to cleanly solve emerging issues with concurrency and versioning.
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Best regards,
> > >>>>>
> > >>>>> - Andy
> > >>>>>
> > >>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >>>>> (via Tom White)
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> Best regards,
> > >>>
> > >>>  - Andy
> > >>>
> > >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >>> (via Tom White)
> > >>
> > >>
> >
> >
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>



-- 
Kevin O'Dell
Systems Engineer, Cloudera
