bq. HBase's current support, even if there are bugs or things that still need to be done, is much better than the Spark example
In my opinion, a simple example that works is better than a buggy package.

I hope before long the hbase-spark module in HBase can arrive at a state which we can advertise as mature - but we're not there yet.

On Tue, Apr 19, 2016 at 10:50 AM, Marcelo Vanzin <van...@cloudera.com> wrote:

> You're completely missing my point. I'm saying that HBase's current
> support, even if there are bugs or things that still need to be done,
> is much better than the Spark example, which is basically a call to
> "SparkContext.hadoopRDD".
>
> Spark's example is not helpful in learning how to build an HBase
> application on Spark, and clashes head on with how the HBase
> developers think it should be done. That, and because it brings too
> many dependencies for something that is not really useful, is why I'm
> suggesting removing it.
>
> On Tue, Apr 19, 2016 at 10:47 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > There is an Open JIRA for fixing the documentation: HBASE-15473
> >
> > I would say the refguide link you provided should not be considered as
> > complete.
> >
> > Note it is marked as Blocker by Sean B.
> >
> > On Tue, Apr 19, 2016 at 10:43 AM, Marcelo Vanzin <van...@cloudera.com> wrote:
> >>
> >> You're entitled to your own opinions.
> >>
> >> While you're at it, here's some much better documentation, from the
> >> HBase project themselves, than what the Spark example provides:
> >> http://hbase.apache.org/book.html#spark
> >>
> >> On Tue, Apr 19, 2016 at 10:41 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> > bq. it's actually in use right now in spite of not being in any
> >> > upstream HBase release
> >> >
> >> > If it is not in upstream, then it is not relevant for discussion on
> >> > Apache mailing list.
> >> >
> >> > On Tue, Apr 19, 2016 at 10:38 AM, Marcelo Vanzin <van...@cloudera.com> wrote:
> >> >>
> >> >> Alright, if you prefer, I'll say "it's actually in use right now in
> >> >> spite of not being in any upstream HBase release", and it's more
> >> >> useful than a single example file in the Spark repo for those who
> >> >> really want to integrate with HBase.
> >> >>
> >> >> Spark's example is really very trivial (just uses one of HBase's
> >> >> input formats), which makes it not very useful as a blueprint for
> >> >> developing HBase apps with Spark.
> >> >>
> >> >> On Tue, Apr 19, 2016 at 10:28 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> >> > bq. I wouldn't call it "incomplete".
> >> >> >
> >> >> > I would call it incomplete.
> >> >> >
> >> >> > Please see HBASE-15333 'Enhance the filter to handle short,
> >> >> > integer, long, float and double' which is a bug fix.
> >> >> >
> >> >> > Please exclude presence of related of module in vendor distro
> >> >> > from this discussion.
> >> >> >
> >> >> > Thanks
> >> >> >
> >> >> > On Tue, Apr 19, 2016 at 10:23 AM, Marcelo Vanzin <van...@cloudera.com> wrote:
> >> >> >>
> >> >> >> On Tue, Apr 19, 2016 at 10:20 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> >> >> > I want to note that the hbase-spark module in HBase is
> >> >> >> > incomplete. Zhan has several patches pending review.
> >> >> >>
> >> >> >> I wouldn't call it "incomplete". Lots of functionality is there,
> >> >> >> which doesn't mean new ones, or more efficient implementations
> >> >> >> of existing ones, can't be added.
> >> >> >>
> >> >> >> > hbase-spark module is currently only in master branch which
> >> >> >> > would be released as 2.0
> >> >> >>
> >> >> >> Just as a side note, it's part of CDH 5.7.0, not that it matters
> >> >> >> much for upstream HBase.
> >> >> >>
> >> >> >> --
> >> >> >> Marcelo
> >> >>
> >> >> --
> >> >> Marcelo
> >>
> >> --
> >> Marcelo
>
> --
> Marcelo