ght?
Thanks,
-Mike
From: Nicholas Hakobian [mailto:nicholas.hakob...@rallyhealth.com]
Sent: Friday, December 30, 2016 5:50 PM
To: Sesterhenn, Mike
Cc: ayan guha; user@spark.apache.org
Subject: Re: Best way to process lookup ETL with Dataframes
Yep, sequential joins is what I have done in the past.
data
will result.
Any other thoughts?
From: Nicholas Hakobian <nicholas.hakob...@rallyhealth.com>
Sent: Friday, December 30, 2016 2:12:40 PM
To: Sesterhenn, Mike
Cc: ayan guha; user@spark.apache.org
Subject: Re: Best way to process lookup ETL with Dataframes
need is to join after
the first join fails.
From: ayan guha <guha.a...@gmail.com>
Sent: Thursday, December 29, 2016 11:06 PM
To: Sesterhenn, Mike
Cc: user@spark.apache.org
Subject: Re: Best way to process lookup ETL with Dataframes
How about this -
Hi all,
I'm writing an ETL process with Spark 1.5, and I was wondering the best way to
do something.
A lot of the fields I am processing require an algorithm similar to this:
Join input dataframe to a lookup table.
if that lookup fails (the joined fields are null) {
Lookup into some
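The pattern being described (try a primary lookup table, and fall back to a secondary lookup only for the rows where the first join produced nulls) can be sketched in plain Python as below. The table contents and key names here are hypothetical, purely for illustration; in Spark DataFrames the same effect is typically expressed as a left outer join against each lookup table followed by coalesce() over the joined columns, so only rows that missed the first table pick up values from the second.

```python
# Minimal pure-Python sketch of the fallback-lookup pattern.
# primary_lookup and secondary_lookup are made-up example tables;
# in Spark this corresponds to two left outer joins plus coalesce().

primary_lookup = {"A1": "apple"}                       # first-choice table
secondary_lookup = {"A1": "aardvark", "B2": "banana"}  # fallback table

def lookup(key):
    """Try the primary table first; fall back only on a miss,
    mirroring 'join, then re-join the rows that came back null'."""
    value = primary_lookup.get(key)
    if value is None:
        value = secondary_lookup.get(key)
    return value

rows = ["A1", "B2", "C3"]
resolved = [lookup(k) for k in rows]
# "A1" resolves from the primary table, "B2" from the fallback,
# and "C3" stays None because neither table contains it.
print(resolved)
```

Note that "A1" takes the primary value ("apple"), not the fallback one, which is exactly the semantics coalesce(primary_col, secondary_col) gives you after two left joins.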