Todd,

I’m saying that you’re raising a straw man.

Now correct me if I’m wrong… but coprocessor code is split into two camps. One is System coprocessors, which are defined in hbase-site.xml, right? What do you call the other group of coprocessor code? (Sorry, my memory is going. Killed too many brain cells when I was a young lad…)

Let’s be clear: I’m not talking about letting some guy named Andrew come in and drop any old code on the system, but more about stopping James, a developer on a different team, from writing an app with server-side functionality that I don’t control.

I think we all know this to be the case.

Again, I’d suggest that the committers put on a product owner’s hat and think about the issue…

But it really doesn’t matter. Spark is going to supplant all this tech… ;-)
(Do I really need to add <sarcasm> </sarcasm> tags?)

On May 21, 2014, at 6:50 PM, Todd Lipcon <t...@cloudera.com> wrote:

> On Wed, May 21, 2014 at 5:16 AM, Michael Segel
> <michael_se...@hotmail.com>wrote:
>
>> And they accuse me of raising a straw man. ;-)
>>
>
> I wasn't arguing for or against coprocessors being out of process. I think
> both sides of this "argument" are in fact in agreement: if you think of
> coprocessors as a place to run *user* code, they need to run out of
> process, or otherwise be sandboxed (eg by having a stripped down DSL, as
> many SQL DBs do with procedural SQL variants). Another option might be to
> use embeddable Javascript, which is fairly doable in the context of the JVM.
>
> But today, coprocessors are not a place to run arbitrary user code. I doubt
> you can find examples where we suggest non-sophisticated users to drop in
> arbitrary code into their clusters and expect full stability. The analogy
> I've always used is Linux kernel modules: you can extend the kernel in all
> sorts of fun ways, but you have to trust whoever gives you the code to run,
> and a bad module can kill your whole system. Similar to kernel modules, I
> expect only sanctioned "vendors" (eg other open source projects like
> Phoenix) or highly sophisticated users to ever drop in a CP.
>
>
>> Todd, really? A parent/child relationship can be secured… how depends on
>> how you communicate.
>> You could always encrypt the data… in the messaging… ;-)
>>
>
> I think you are confusing confidentiality and security, and not sure you
> read the article I linked to. Shared memory does not imply better
> performance for all applications, exactly because of the issue I linked to.
> If you plan to map the shared memory, and you want to treat that memory
> like some kind of structure (eg in which there are length prefixes,
> pointers, etc), you must either trust your peer or you must copy the data
> before validating it. So, it's no magic bullet for faster sharing of data
> between a CP and HBase.
>
>
>>
>>
>> On May 19, 2014, at 11:37 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>
>>> On Mon, May 19, 2014 at 12:05 PM, Vladimir Rodionov
>>> <vladrodio...@gmail.com>wrote:
>>>
>>>> Michael S.
>>>>>> To the best of my knowledge, MapR’s M7 doesn’t have coprocessors. I’ll
>>>> wager that when they do, it will work and not have these issues. I believe
>>>> that they are writing their stuff in C/C++, if so, then they’d have an
>>>> advantage of using shared memory. Apache would have write C/C++ code and
>>>> wrap it in JNI… which you may not want to do…<<
>>>>
>>>> MapR M7 does not support coprocessors and custom Filters as well. I
>>>> consider this to be a serious limitation of the product.
>>>> Shared memory communication can be done in Java w/o single line of C/C++
>>>> code, Michael by means of memory-mapped files.
>>>>
>>>> -Vladimir
>>>>
>>>>
>>> And even in native code, shared memory for communication between untrusted
>>> peers can be pretty tricky to do securely (read
>>> http://lwn.net/Articles/593918/ for details)
>>>
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
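
For reference, here is a minimal Java sketch of the "copy the data before validating it" point in the quoted thread, using a memory-mapped file as the shared region. The class name SharedRegionReader, the 4-byte length-prefix layout, and the MAX_PAYLOAD limit are all assumptions made for illustration; none of this is HBase API or how any coprocessor is actually wired up.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: reads one length-prefixed record from a shared region.
final class SharedRegionReader {
    // Assumed upper bound on a record; not taken from the thread.
    static final int MAX_PAYLOAD = 1024;

    static byte[] readRecord(ByteBuffer shared) {
        // 1. Snapshot the region into private memory. An untrusted peer can
        //    keep rewriting the shared bytes, so nothing is validated in place.
        ByteBuffer view = shared.duplicate();
        view.clear(); // position = 0, limit = capacity
        byte[] snapshot = new byte[view.remaining()];
        view.get(snapshot);

        // 2. Validate and parse only the private copy.
        ByteBuffer local = ByteBuffer.wrap(snapshot);
        if (local.remaining() < Integer.BYTES) {
            throw new IllegalArgumentException("region too small");
        }
        int length = local.getInt();
        if (length < 0 || length > MAX_PAYLOAD || length > local.remaining()) {
            throw new IllegalArgumentException("bad length prefix: " + length);
        }
        byte[] payload = new byte[length];
        local.get(payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("shared-region", ".bin");
        file.toFile().deleteOnExit();
        try (FileChannel ch = FileChannel.open(
                file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping past the end of the file grows it to 64 bytes.
            MappedByteBuffer shared = ch.map(FileChannel.MapMode.READ_WRITE, 0, 64);

            // Play the role of a well-behaved peer writing a record.
            byte[] msg = "hello".getBytes(StandardCharsets.US_ASCII);
            shared.putInt(0, msg.length);
            for (int i = 0; i < msg.length; i++) {
                shared.put(Integer.BYTES + i, msg[i]);
            }
            System.out.println(new String(readRecord(shared), StandardCharsets.US_ASCII));

            // Play the role of a hostile peer corrupting the length prefix.
            shared.putInt(0, Integer.MAX_VALUE);
            try {
                readRecord(shared);
            } catch (IllegalArgumentException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

The snapshot in step 1 is the cost being argued about: once the reader has to copy before it can trust a length prefix or pointer, much of the zero-copy advantage of shared memory is gone.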