You should make HBase a data source (it seems we already have an HBase connector?),
create a DataFrame from HBase, and do the join in Spark SQL.
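
A minimal sketch of that approach, assuming the Hortonworks Spark-HBase connector (SHC) is on the classpath; the table name, namespace, column family, and catalog layout below are placeholders, and the Hive query stands in for your actual SQL:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

    val spark = SparkSession.builder()
      .appName("hbase-mvcc-join")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical SHC catalog: maps the HBase row key to "id" and the
    // "cf:mvcc" qualifier to "mvcc"; adjust to your real table layout.
    val catalog =
      s"""{
         |  "table":   {"namespace":"default", "name":"my_hbase_table"},
         |  "rowkey":  "key",
         |  "columns": {
         |    "id":   {"cf":"rowkey", "col":"key",  "type":"string"},
         |    "mvcc": {"cf":"cf",     "col":"mvcc", "type":"string"}
         |  }
         |}""".stripMargin

    // Expose the HBase table as a DataFrame through the connector.
    val hbaseDf = spark.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()

    // The Hive/Carbon side of the join; replace with your real query.
    val hiveDf = spark.sql("SELECT id, mvcc, payload FROM my_hive_table")

    // An equi-join on (id, mvcc) keeps exactly the rows whose mvcc
    // matches the value stored in HBase, replacing the per-row gets.
    val result = hiveDf.join(hbaseDf, Seq("id", "mvcc"))

If the HBase side is small enough to broadcast, Catalyst can plan this as a broadcast join, so you avoid both the extra mapPartitions stage and the per-row round trip to HBase.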

> On 21 Jun 2017, at 10:17 AM, [email protected] wrote:
> 
> Hello,
> My scenario is like this:
>         1. val df = hiveContext/carbonContext.sql("sql....")
>         2. iterate over the rows, extract two columns (id and mvcc), and use id as the key
> to scan HBase for the corresponding value;
>             if mvcc == value, the row passes, else it is dropped.
> Is there a better way than dataframe.mapPartitions? It causes an
> extra stage and takes more time.
> I put two DAGs in the appendix, please check!
> 
> Thanks!!
> [email protected]
