ght?
Thanks,
-Mike
From: Nicholas Hakobian [mailto:nicholas.hakob...@rallyhealth.com]
Sent: Friday, December 30, 2016 5:50 PM
To: Sesterhenn, Mike
Cc: ayan guha; user@spark.apache.org
Subject: Re: Best way to process lookup ETL with Dataframes
Yep, sequential joins are what I have done in the past.
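As a minimal sketch of the sequential-join pattern mentioned above (plain Python dicts standing in for the lookup tables; the table/column names and sample values are invented for illustration, not from the thread): join against the first lookup, then run only the misses through the second lookup, falling back to a default.

```python
# Hypothetical in-memory version of the "sequential join" approach:
# try lookup1 first, and only for rows that missed, try lookup2.
lookup1 = {1: "from_lookup1"}
lookup2 = {2: "from_lookup2"}
driving = [{"id": 1}, {"id": 2}, {"id": 3}]

# Pass 1: left-join the driving rows against lookup1.
for row in driving:
    row["col"] = lookup1.get(row["id"])

# Pass 2: only rows that missed lookup1 are joined against lookup2,
# with a default for rows that miss both.
for row in driving:
    if row["col"] is None:
        row["col"] = lookup2.get(row["id"], "some default")

print(driving)
# [{'id': 1, 'col': 'from_lookup1'},
#  {'id': 2, 'col': 'from_lookup2'},
#  {'id': 3, 'col': 'some default'}]
```

In Spark this corresponds to two separate joins (the second filtered to rows where the first lookup produced null), which is the pattern the single coalesce query later in the thread collapses into one pass.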
> *From:* Nicholas Hakobian <nicholas.hakob...@rallyhealth.com>
> *Sent:* Friday, December 30, 2016 2:12:40 PM
> *To:* Sesterhenn, Mike
> *Cc:* ayan guha; user@spark.apache.org
>
> *Subject:* Re: Best way to process lookup ETL with Dataframes
>
> It looks like Spark 1.5 has the coalesce function, which is like NVL, but
> a bit more flexible. From Ayan's example you shou
data will result.
Any other thoughts?
From: Nicholas Hakobian <nicholas.hakob...@rallyhealth.com>
Sent: Friday, December 30, 2016 2:12:40 PM
To: Sesterhenn, Mike
Cc: ayan guha; user@spark.apache.org
Subject: Re: Best way to process lookup ETL with Dataframes
> *From:* ayan guha <guha.a...@gmail.com>
> *Sent:* Thursday, December 29, 2016 11:06 PM
> *To:* Sesterhenn, Mike
> *Cc:* user@spark.apache.org
> *Subject:* Re: Best way to process lookup ETL with Dataframes
>
> How about this -
>
> select a.*, nvl(b.col, nvl(c.col, 'some default'))
> from driving_table a
> left outer join lookup1 b on a.id = b.id
> left outer join lookup2 c on a.id = c.id
need is to join after
the first join fails.
From: ayan guha <guha.a...@gmail.com>
Sent: Thursday, December 29, 2016 11:06 PM
To: Sesterhenn, Mike
Cc: user@spark.apache.org
Subject: Re: Best way to process lookup ETL with Dataframes
How about this -
select a.*, nvl(b.col,nvl(c.col,'some default'))
from driving_table a
left outer join lookup1 b on a.id=b.id
left outer join lookup2 c on a.id=c.id
?
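The query above can be exercised end to end with plain SQLite (a stand-in for Hive/Spark SQL here; SQLite has no nvl(), so coalesce() — which Spark also supports — is used instead). The table and column names come from the example; the sample rows are invented to show each branch of the fallback.

```python
import sqlite3

# In-memory database standing in for the tables in the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driving_table (id INTEGER, payload TEXT);
CREATE TABLE lookup1 (id INTEGER, col TEXT);
CREATE TABLE lookup2 (id INTEGER, col TEXT);

INSERT INTO driving_table VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO lookup1 VALUES (1, 'from_lookup1');  -- hit in the first lookup
INSERT INTO lookup2 VALUES (2, 'from_lookup2');  -- miss in lookup1, hit in lookup2
-- id 3 misses both lookups and falls through to the default
""")

# Same shape as the query in the thread, with coalesce in place of nvl:
# take lookup1's value if present, else lookup2's, else the default.
rows = conn.execute("""
SELECT a.*, coalesce(b.col, c.col, 'some default') AS looked_up
FROM driving_table a
LEFT OUTER JOIN lookup1 b ON a.id = b.id
LEFT OUTER JOIN lookup2 c ON a.id = c.id
ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
# (1, 'a', 'from_lookup1')
# (2, 'b', 'from_lookup2')
# (3, 'c', 'some default')
```

One caveat worth checking against your data: because both lookups are joined unconditionally, a row whose lookup1 value is NULL in the table itself (as opposed to an unmatched row) will also fall through to lookup2, which may or may not be what you want.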
On Fri, Dec 30, 2016 at 9:55 AM, Sesterhenn, Mike wrote:
> Hi all,
>
>
> I'm writing an ETL process with