Re: RDD equivalent of HBase Scan
An RDD is a very different creature from a NoSQL store, so I would not think of them as being in the same ballpark for NoSQL-like workloads. An RDD is not built for point queries or range scans, since any request launches a distributed job that scans all partitions. Nor is it built for, say, thousands of concurrent jobs (queries).

On Thu, Mar 26, 2015 at 1:57 PM, Stuart Layton wrote:
> Thanks but I'm hoping to get away from hbase all together. I was wondering
> if there is a way to get similar scan performance directly on cached rdd's
> or data frames
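To make the contrast in the reply above concrete, here is a minimal sketch in plain Java. It is not HBase or Spark code: a `TreeMap` stands in for a key-sorted store that can seek to a start key, and a list of lists stands in for the partitions of an unindexed dataset. The names `ScanComparison`, `ScanResult`, `rangeScan`, and `filterScan` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch only: plain Java collections stand in for HBase and Spark.
class ScanComparison {

    // Keys returned by a scan, plus how many records had to be examined.
    static final class ScanResult {
        final List<String> keys;
        final int examined;

        ScanResult(List<String> keys, int examined) {
            this.keys = keys;
            this.examined = examined;
        }
    }

    // Sorted-store style: seek to startKey and read until stopKey (exclusive).
    static ScanResult rangeScan(TreeMap<String, String> store, String startKey, String stopKey) {
        List<String> keys =
            new ArrayList<>(store.subMap(startKey, true, stopKey, false).keySet());
        // The sub-map view only visits entries inside the range,
        // so in this model the examined count equals the result size.
        return new ScanResult(keys, keys.size());
    }

    // Unindexed partitioned data: every record in every partition is tested.
    static ScanResult filterScan(List<List<String>> partitions, String startKey, String stopKey) {
        List<String> keys = new ArrayList<>();
        int examined = 0;
        for (List<String> partition : partitions) {
            for (String key : partition) {
                examined++;
                if (key.compareTo(startKey) >= 0 && key.compareTo(stopKey) < 0) {
                    keys.add(key);
                }
            }
        }
        return new ScanResult(keys, examined);
    }

    public static void main(String[] args) {
        // Three unsorted "partitions" of row keys (made-up sample data).
        List<List<String>> partitions = Arrays.asList(
            Arrays.asList("row-07", "row-02", "row-11"),
            Arrays.asList("row-05", "row-09", "row-01"),
            Arrays.asList("row-03", "row-10", "row-06"));

        // The same keys loaded into a sorted map.
        TreeMap<String, String> store = new TreeMap<>();
        for (List<String> partition : partitions) {
            for (String key : partition) {
                store.put(key, "value");
            }
        }

        ScanResult seek = rangeScan(store, "row-05", "row-08");
        ScanResult full = filterScan(partitions, "row-05", "row-08");
        System.out.println("range scan:  " + seek.keys + " examined " + seek.examined);
        System.out.println("filter scan: " + full.keys + " examined " + full.examined);
    }
}
```

With this sample data both scans return the same three keys, but the range scan touches 3 records while the filter touches all 9. The sketch deliberately ignores caching, partition pruning, and job scheduling overhead, so it shows only the shape of the difference and not real performance numbers.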
Re: RDD equivalent of HBase Scan
Thanks, but I'm hoping to get away from HBase altogether. I was wondering if there is a way to get similar scan performance directly on cached RDDs or DataFrames.

On Thu, Mar 26, 2015 at 9:54 AM, Ted Yu wrote:
> [...]
> You can use TableMapReduceUtil#convertScanToString() to convert a Scan
> which has filter(s) and pass to TableInputFormat

--
Stuart Layton
Re: RDD equivalent of HBase Scan
In examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala, TableInputFormat is used.

TableInputFormat accepts the parameter

    public static final String SCAN = "hbase.mapreduce.scan";

If it is specified, the Scan object is created from its String form:

    if (conf.get(SCAN) != null) {
      try {
        scan = TableMapReduceUtil.convertStringToScan(conf.get(SCAN));

You can use TableMapReduceUtil#convertScanToString() to convert a Scan that has filter(s) into a String and pass it to TableInputFormat through that parameter.

Cheers

On Thu, Mar 26, 2015 at 6:46 AM, Stuart Layton wrote:
> Do RDD's or Spark DataFrames offer anything similar or would I be required
> to use a NoSQL db like HBase to do something like this?
RDD equivalent of HBase Scan
HBase scans come with the ability to specify filters that make scans very fast and efficient (as they let you seek to the keys that pass the filter).

Do RDDs or Spark DataFrames offer anything similar, or would I be required to use a NoSQL DB like HBase to do something like this?

--
Stuart Layton