It does not matter to Spark; you just put the fully qualified HDFS URL of each 
namenode there. Of course the issue is that you lose data locality, but this 
would also be the case for Oracle.

> On 15. Jun 2017, at 18:03, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> 
> Hi,
> 
> With Spark, how easy is it to fetch data from two different clusters and do 
> a join in Spark?
> 
> I can use two JDBC connections to join two tables from two different Oracle 
> instances in Spark through creating two Data Frames and joining them together.
> 
> Would that be possible for data residing on two different HDFS clusters?
> 
> thanks
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
> damage or destruction of data or any other property which may arise from 
> relying on this email's technical content is explicitly disclaimed. The 
> author will in no case be liable for any monetary damages arising from such 
> loss, damage or destruction.
>  
