Weiqing, do you have JIRA issues filed for your progress that folks can follow along?
Sean, another item to watch might be the state of HBase+PySpark. Our friends at Huawei have done this work [1][2], and maybe it is another candidate for inclusion and first-class support? I'm not sure who the best contact for this would be, however.

Mike

[1]: http://huaweibigdata.github.io/astro/
[2]: https://github.com/Huawei-Spark/Spark-SQL-on-HBase

On Fri, Jun 23, 2017 at 2:42 PM, Weiqing Yang <[email protected]> wrote:
> Thanks, Sean!
>
> You are right, SHC has its own pluggable system for data encoding/decoding.
> Only phoenix encoding is not in Apache Hbase Spark.
>
> We have shepherded almost all of SHC changes from SHC Git repo to Apache
> Hbase except the features of supporting multiple secure Hbase clusters and
> Phoenix data coder. Next week we’ll have a discussion to decide whether
> these latest code changes will be shepherded to Apache Hbase since SHC has
> used Phoenix encoding/decoding to support Phoenix data. I'll update the
> next steps here after the discussion next week.
>
> For Composite Key, the current patch is still under review, but it
> brings some concerns. That's also one of the reasons to bring the Phoenix
> encoding/decoding in SHC.
>
> Regards,
>
> Weiqing
>
> On Fri, Jun 23, 2017 at 12:20 PM, Stack <[email protected]> wrote:
>
> > On Fri, Jun 23, 2017 at 10:30 AM, Sean Busbey <[email protected]> wrote:
> >
> > > On Fri, Jun 23, 2017 at 12:06 PM, Stack <[email protected]> wrote:
> > > > On Wed, Jun 21, 2017 at 9:31 AM, Sean Busbey <[email protected]> wrote:
> > > > > ....
> > > > I don't know enough about the integration but is the 'handling of
> > > > Phoenix encoded data' about mapping spark types to a serialization
> > > > in hbase? If not, where is the need for seamless transforms between
> > > > spark types and a natural hbase serialization listed? We need this
> > > > IIRC.
> > >
> > > It's a subtask, really. We already have a pluggable system for mapping
> > > between spark types and a couple of serialization options (the docs
> > > need improvement?).
> > >
> > > SHC has its own pluggable system and has the addition of a phoenix
> > > encoding. The set seems like the most likely out-of-the-box formats
> > > folks might have something in. (Maybe Kite? I think it's
> > > different than the rest.)
> > >
> > > Or are you saying we can just map all of it to the hbase-common
> > > "types" and then do the pluggable part under it?
> >
> > Not making any prescription. Was just worried about type marshalling in
> > and out of spark, concerned that the serialization would be other than
> > something 'natural' for hbase, that it's not performant, and that we
> > might have a profusion of mechanisms.
> >
> > If a noted subtask, that's grand.
> >
> > Thanks,
> > S
