I'd suggest that the hbase-downstreamer project[1] is a better place for folks to see these examples. There's already an example for Spark Streaming that does not rely on any of the new goodness in the hbase-spark module[2].
Granted, it uses the Spark Java APIs[3], but we'd be glad to have a Scala-based example if someone wanted to translate.

[1]: https://github.com/saintstack/hbase-downstreamer
[2]: https://github.com/saintstack/hbase-downstreamer#spark-streaming-test-application
[3]: https://s.apache.org/apvQ

On Tue, Apr 19, 2016 at 12:59 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> bq. HBase's current support, even if there are bugs or things that still
> need to be done, is much better than the Spark example
>
> In my opinion, a simple example that works is better than a buggy package.
>
> I hope before long the hbase-spark module in HBase can arrive at a state
> which we can advertise as mature - but we're not there yet.
>
> On Tue, Apr 19, 2016 at 10:50 AM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
>>
>> You're completely missing my point. I'm saying that HBase's current
>> support, even if there are bugs or things that still need to be done,
>> is much better than the Spark example, which is basically a call to
>> "SparkContext.hadoopRDD".
>>
>> Spark's example is not helpful in learning how to build an HBase
>> application on Spark, and clashes head on with how the HBase
>> developers think it should be done. That, and because it brings too
>> many dependencies for something that is not really useful, is why I'm
>> suggesting removing it.
>>
>>
>> On Tue, Apr 19, 2016 at 10:47 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>> > There is an Open JIRA for fixing the documentation: HBASE-15473
>> >
>> > I would say the refguide link you provided should not be considered as
>> > complete.
>> >
>> > Note it is marked as Blocker by Sean B.
>> >
>> > On Tue, Apr 19, 2016 at 10:43 AM, Marcelo Vanzin <van...@cloudera.com>
>> > wrote:
>> >>
>> >> You're entitled to your own opinions.
>> >>
>> >> While you're at it, here's some much better documentation, from the
>> >> HBase project themselves, than what the Spark example provides:
>> >> http://hbase.apache.org/book.html#spark
>> >>
>> >> On Tue, Apr 19, 2016 at 10:41 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>> >> > bq. it's actually in use right now in spite of not being in any
>> >> > upstream
>> >> > HBase release
>> >> >
>> >> > If it is not in upstream, then it is not relevant for discussion on
>> >> > Apache
>> >> > mailing list.
>> >> >
>> >> > On Tue, Apr 19, 2016 at 10:38 AM, Marcelo Vanzin
>> >> > <van...@cloudera.com>
>> >> > wrote:
>> >> >>
>> >> >> Alright, if you prefer, I'll say "it's actually in use right now in
>> >> >> spite of not being in any upstream HBase release", and it's more
>> >> >> useful than a single example file in the Spark repo for those who
>> >> >> really want to integrate with HBase.
>> >> >>
>> >> >> Spark's example is really very trivial (just uses one of HBase's
>> >> >> input
>> >> >> formats), which makes it not very useful as a blueprint for
>> >> >> developing
>> >> >> HBase apps with Spark.
>> >> >>
>> >> >> On Tue, Apr 19, 2016 at 10:28 AM, Ted Yu <yuzhih...@gmail.com>
>> >> >> wrote:
>> >> >> > bq. I wouldn't call it "incomplete".
>> >> >> >
>> >> >> > I would call it incomplete.
>> >> >> >
>> >> >> > Please see HBASE-15333 'Enhance the filter to handle short,
>> >> >> > integer,
>> >> >> > long,
>> >> >> > float and double' which is a bug fix.
>> >> >> >
>> >> >> > Please exclude presence of related of module in vendor distro from
>> >> >> > this
>> >> >> > discussion.
>> >> >> >
>> >> >> > Thanks
>> >> >> >
>> >> >> > On Tue, Apr 19, 2016 at 10:23 AM, Marcelo Vanzin
>> >> >> > <van...@cloudera.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> On Tue, Apr 19, 2016 at 10:20 AM, Ted Yu <yuzhih...@gmail.com>
>> >> >> >> wrote:
>> >> >> >> > I want to note that the hbase-spark module in HBase is
>> >> >> >> > incomplete.
>> >> >> >> > Zhan
>> >> >> >> > has
>> >> >> >> > several patches pending review.
>> >> >> >>
>> >> >> >> I wouldn't call it "incomplete". Lots of functionality is there,
>> >> >> >> which
>> >> >> >> doesn't mean new ones, or more efficient implementations of
>> >> >> >> existing
>> >> >> >> ones, can't be added.
>> >> >> >>
>> >> >> >> > hbase-spark module is currently only in master branch which
>> >> >> >> > would
>> >> >> >> > be
>> >> >> >> > released as 2.0
>> >> >> >>
>> >> >> >> Just as a side note, it's part of CDH 5.7.0, not that it matters
>> >> >> >> much
>> >> >> >> for upstream HBase.
>> >> >> >>
>> >> >> >> --
>> >> >> >> Marcelo
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Marcelo
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Marcelo
>> >
>> >
>>
>>
>>
>> --
>> Marcelo
>
>

--
busbey

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org