The same question can be asked w.r.t. the examples for other projects, such as Flume and Kafka.

On Tue, Apr 19, 2016 at 11:01 AM, Marcin Tustin <mtus...@handybook.com> wrote:

> Let's posit that the spark example is much better than what is available
> in HBase. Why is that a reason to keep it within Spark?
>
> On Tue, Apr 19, 2016 at 1:59 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> bq. HBase's current support, even if there are bugs or things that still
>> need to be done, is much better than the Spark example
>>
>> In my opinion, a simple example that works is better than a buggy package.
>>
>> I hope before long the hbase-spark module in HBase can arrive at a state
>> which we can advertise as mature - but we're not there yet.
>>
>> On Tue, Apr 19, 2016 at 10:50 AM, Marcelo Vanzin <van...@cloudera.com>
>> wrote:
>>
>>> You're completely missing my point. I'm saying that HBase's current
>>> support, even if there are bugs or things that still need to be done,
>>> is much better than the Spark example, which is basically a call to
>>> "SparkContext.hadoopRDD".
>>>
>>> Spark's example is not helpful in learning how to build an HBase
>>> application on Spark, and clashes head on with how the HBase
>>> developers think it should be done. That, and because it brings too
>>> many dependencies for something that is not really useful, is why I'm
>>> suggesting removing it.
>>>
>>> On Tue, Apr 19, 2016 at 10:47 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>> > There is an Open JIRA for fixing the documentation: HBASE-15473
>>> >
>>> > I would say the refguide link you provided should not be considered as
>>> > complete.
>>> >
>>> > Note it is marked as Blocker by Sean B.
>>> >
>>> > On Tue, Apr 19, 2016 at 10:43 AM, Marcelo Vanzin <van...@cloudera.com>
>>> > wrote:
>>> >>
>>> >> You're entitled to your own opinions.
>>> >>
>>> >> While you're at it, here's some much better documentation, from the
>>> >> HBase project themselves, than what the Spark example provides:
>>> >> http://hbase.apache.org/book.html#spark
>>> >>
>>> >> On Tue, Apr 19, 2016 at 10:41 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>> >> > bq. it's actually in use right now in spite of not being in any
>>> >> > upstream HBase release
>>> >> >
>>> >> > If it is not in upstream, then it is not relevant for discussion on
>>> >> > Apache mailing list.
>>> >> >
>>> >> > On Tue, Apr 19, 2016 at 10:38 AM, Marcelo Vanzin
>>> >> > <van...@cloudera.com> wrote:
>>> >> >>
>>> >> >> Alright, if you prefer, I'll say "it's actually in use right now in
>>> >> >> spite of not being in any upstream HBase release", and it's more
>>> >> >> useful than a single example file in the Spark repo for those who
>>> >> >> really want to integrate with HBase.
>>> >> >>
>>> >> >> Spark's example is really very trivial (just uses one of HBase's
>>> >> >> input formats), which makes it not very useful as a blueprint for
>>> >> >> developing HBase apps with Spark.
>>> >> >>
>>> >> >> On Tue, Apr 19, 2016 at 10:28 AM, Ted Yu <yuzhih...@gmail.com>
>>> >> >> wrote:
>>> >> >> > bq. I wouldn't call it "incomplete".
>>> >> >> >
>>> >> >> > I would call it incomplete.
>>> >> >> >
>>> >> >> > Please see HBASE-15333 'Enhance the filter to handle short,
>>> >> >> > integer, long, float and double' which is a bug fix.
>>> >> >> >
>>> >> >> > Please exclude presence of related of module in vendor distro
>>> >> >> > from this discussion.
>>> >> >> >
>>> >> >> > Thanks
>>> >> >> >
>>> >> >> > On Tue, Apr 19, 2016 at 10:23 AM, Marcelo Vanzin
>>> >> >> > <van...@cloudera.com> wrote:
>>> >> >> >>
>>> >> >> >> On Tue, Apr 19, 2016 at 10:20 AM, Ted Yu <yuzhih...@gmail.com>
>>> >> >> >> wrote:
>>> >> >> >> > I want to note that the hbase-spark module in HBase is
>>> >> >> >> > incomplete. Zhan has several patches pending review.
>>> >> >> >>
>>> >> >> >> I wouldn't call it "incomplete". Lots of functionality is there,
>>> >> >> >> which doesn't mean new ones, or more efficient implementations
>>> >> >> >> of existing ones, can't be added.
>>> >> >> >>
>>> >> >> >> > hbase-spark module is currently only in master branch which
>>> >> >> >> > would be released as 2.0
>>> >> >> >>
>>> >> >> >> Just as a side note, it's part of CDH 5.7.0, not that it matters
>>> >> >> >> much for upstream HBase.
>>> >> >> >>
>>> >> >> >> --
>>> >> >> >> Marcelo
>>> >> >>
>>> >> >> --
>>> >> >> Marcelo
>>> >>
>>> >> --
>>> >> Marcelo
>>>
>>> --
>>> Marcelo