Andrew, HBASE-4047 is a great idea (even if it is three years old). I have had numerous customers implement Co-Procs and take down every RS in spectacular fashion, from JVM crashes to performance crawling so slowly that jobs fail out. I will raise this internally and see if we can get some extra traction.
On Sat, May 17, 2014 at 9:33 AM, Andrew Purtell <[email protected]> wrote:
> Great, see HBASE-4047. In the best of the open source tradition, there
> hasn't been anyone sufficiently motivated to do the work necessary (current
> use cases are "good enough"), but that someone can always come along.
> Perhaps that is yourself.
>
>
> On Sat, May 17, 2014 at 5:39 AM, Michael Segel <[email protected]> wrote:
>
> > You have to understand…
> >
> > I do see the importance of the hook to allow for a trigger to implement
> > 3rd party code on the server side.
> > No argument there.
> >
> > It’s just that the current implementation doesn’t sandbox the code so that
> > it limits the potential for harm to the RS.
> >
> > In simple terms you can isolate the code into a separate JVM and use IPC
> > to connect the sandbox to the RS when a trigger occurs.
> >
> > In C/C++ you’d have shared memory segments, something you don’t really
> > have in Java. (You could use C and then put a JNI wrapper around this…)
> >
> > Which goes to my point… this is something that is solvable. You just need
> > to think about it…
> >
> > You talk about RDBMSs. Triggers themselves are not an equivalent analogy.
> > You can have a trigger that then calls some code written in an SPL and
> > you’re ok. You can control the SPL environment so that you limit the risk
> > of the server crashing.
> > (SPL == Stored Procedure Language)
> >
> > If you’re running third party code from your trigger that is written in
> > C/C++ or Java, then you have other issues.
> >
> > Sybase’s Adaptive Server had some serious issues and poorly written
> > C/C++ code could cause serious performance issues… Informix IDS took a
> > different approach and didn’t have those issues. And I’m aging myself
> > because most here probably never worked with either Sybase or Informix … ;-)
> >
> > So using your RDBMS analogy… you have two different approaches. One worked
> > … well enough, but was problematic.
> > The other worked better and had fewer
> > issues and was more secure.
> >
> > One of the reasons why this is important… the longer the current
> > implementation is in the wild, the longer and harder it will take to fix.
> >
> >
> > On May 17, 2014, at 11:44 AM, qiang tian <[email protected]> wrote:
> >
> > > My small 2 cents...:-)
> > >
> > > Hook/coprocessor is a useful mechanism for interacting with a system for
> > > things that cannot be done via API. For the end user, tradeoff factors
> > > like performance, security, reliability etc. can be controlled by the
> > > upper layer's policy.
> > > e.g. In RDBMS, the end user has limited use cases for triggers, which
> > > eliminates the security factor entirely, and the performance tradeoff is
> > > given to the end user to decide. So from evolution's perspective,
> > > hook/coprocessor for the end user could be controlled by a query engine
> > > layer like Phoenix.
> > >
> > > For internal users, hooks are better not used widely unless it is a MUST
> > > or strong flexibility/pluggability is required. e.g. things that can be
> > > part of the core better not use it.
> > >
> > > thanks.
> > >
> > >
> > >
> > > On Sat, May 17, 2014 at 4:04 PM, Michael Segel <[email protected]> wrote:
> > >
> > >> Andrew,
> > >>
> > >> Is ‘magical fairy dust’ a reference to some new synthetic drug you take
> > >> at raves?
> > >> But let’s get back to reality.
> > >>
> > >>
> > >> Let’s try this again; simply put… the coprocessor runs on the same JVM as
> > >> the RS, therefore you have an unacceptable level of risk.
> > >> That inherent risk means that you cannot run HBase with end-user
> > >> coprocessors enabled when you want to have a stable and somewhat secure
> > >> environment.
> > >>
> > >> The simple truth is that you need to decouple the end-user code
> > >> (coprocessor) from the RS.
> > >> It’s not a difficult concept to understand, and while reasonable, it
> > >> would mean a major rewrite and work done on co-processors.
> > >>
> > >> Will de-coupling the user-space from the RS remove all risk? No. And no,
> > >> I’m not suggesting that.
> > >> But it’s a critical piece of the puzzle.
> > >>
> > >> It’s not just security, but also reliability.
> > >>
> > >>
> > >> On May 17, 2014, at 4:43 AM, Andrew Purtell <[email protected]> wrote:
> > >>
> > >>> Michael,
> > >>>
> > >>> As you know, we have implemented security features with coprocessors
> > >>> precisely because they can be interposed on internal actions to make
> > >>> authoritative decisions in-process. Coprocessors are a way to have
> > >>> composable internal extensions. They don't have and probably never will
> > >>> have magic fairy security dust. We do trust the security coprocessor code
> > >>> because it was developed by the project. That is not the same thing as
> > >>> saying you can have 'security' and execute arbitrary user code in-process
> > >>> as a coprocessor. Just want to clear that up for you.
> > >>>
> > >>>> will want to allow system coprocessors but then write a coprocessor that
> > >>>> rejects user coprocessors.
> > >>>
> > >>> That's a reasonable point.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Sat, May 17, 2014 at 12:13 AM, Michael Segel <[email protected]> wrote:
> > >>>
> > >>>> Until you move the coprocessor out of the RS space and into its own
> > >>>> sandbox… saying security and coprocessor in the same sentence is a joke.
> > >>>> Oh wait… you were serious… :-(
> > >>>>
> > >>>> I’d say there’s a significant rethink on coprocessors that’s required.
> > >>>>
> > >>>> Anyone running a secure (kerberos) cluster will want to allow system
> > >>>> coprocessors but then write a coprocessor that rejects user coprocessors.
> > >>>>
> > >>>> Just putting it out there…
> > >>>>
> > >>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <[email protected]> wrote:
> > >>>>
> > >>>>> Because coprocessor APIs are so tightly bound with internals, if we apply
> > >>>>> suggested rules like the one mentioned on HBASE-11054:
> > >>>>>
> > >>>>>   I'd say policy should be no changes to method apis across minor
> > >>>>>   versions
> > >>>>>
> > >>>>> This will lock coprocessor based components to the limitations of the API
> > >>>>> as we encounter them. Core code does not suffer this limitation, we are
> > >>>>> otherwise free to refactor and change internal methods. For example, if we
> > >>>>> apply this policy to the 0.98 branch, then we will have to abandon further
> > >>>>> security feature development there and move to trunk only. This is because
> > >>>>> we already are aware that coprocessor APIs as they stand are insufficient
> > >>>>> still.
> > >>>>>
> > >>>>> Coprocessor APIs are a special class of internal method. We have had a
> > >>>>> tension between allowing freedom of movement for developing them out and
> > >>>>> providing some measure of stability for implementors for a while.
> > >>>>>
> > >>>>> It is my belief that the way forward is something like HBASE-11125. Perhaps
> > >>>>> we can take this discussion to that JIRA and have this long overdue
> > >>>>> conversation.
> > >>>>>
> > >>>>> Regarding security features specifically, I would also like to call your
> > >>>>> attention to HBASE-11127. I think security has been an optional feature
> > >>>>> long enough, it is becoming a core requirement for the project, so should
> > >>>>> be moved into core. Sure, we can therefore sidestep any issues with
> > >>>>> coprocessor API sufficiency for hosting security features.
> > >>>>> However, in my
> > >>>>> opinion we should pursue both HBASE-11125 and HBASE-11127; the first to
> > >>>>> provide the relative stability long asked for by coprocessor API users, the
> > >>>>> latter to cleanly solve emerging issues with concurrency and versioning.
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Best regards,
> > >>>>>
> > >>>>> - Andy
> > >>>>>
> > >>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >>>>> (via Tom White)
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> Best regards,
> > >>>
> > >>> - Andy
> > >>>
> > >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >>> (via Tom White)
> > >>
> > >>
> > >
> >
>
> --
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

--
Kevin O'Dell
Systems Engineer, Cloudera
