Sorry, actually my last message is not true for anti join; I was thinking of
semi join.
-TJ
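
For anyone following along, here is a minimal sketch of the distinction, using Python's built-in sqlite3 rather than Spark. The table names `existing` and `incoming` are made up for illustration, and since SQLite has no LEFT ANTI JOIN / LEFT SEMI JOIN syntax, NOT EXISTS and EXISTS stand in for them (an assumption of this sketch, not what the original query used). The join key is deliberately duplicated on the `existing` side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE existing (id INTEGER);
    CREATE TABLE incoming (id INTEGER);
    -- id 1 appears twice in existing, so the join key is NOT unique there
    INSERT INTO existing VALUES (1), (1), (2);
    INSERT INTO incoming VALUES (1), (3);
""")


def ids(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# Anti join semantics: incoming rows with no match in existing.
anti = ids("""
    SELECT i.id FROM incoming i
    WHERE NOT EXISTS (SELECT 1 FROM existing e WHERE e.id = i.id)
""")

# Left join with null filter: unmatched rows come out exactly once,
# so duplicates on the right side cannot change the result.
left_null = ids("""
    SELECT i.id FROM incoming i
    LEFT JOIN existing e ON e.id = i.id
    WHERE e.id IS NULL
""")

# Semi join semantics: incoming rows with at least one match, each kept once.
semi = ids("""
    SELECT i.id FROM incoming i
    WHERE EXISTS (SELECT 1 FROM existing e WHERE e.id = i.id)
""")

# Inner join: a matched row is repeated once per match on the right side.
inner = ids("""
    SELECT i.id FROM incoming i
    JOIN existing e ON e.id = i.id
""")

print("anti:", anti, "left join + null filter:", left_null)
print("semi:", semi, "inner join:", inner)
```

With duplicate keys in `existing`, the anti join and the null-filtered left join agree, while the inner join returns the matched row twice and the semi join returns it once. So the uniqueness caveat applies to replacing a semi join with an inner join, not to the anti join case.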
On Sun, Jun 3, 2018 at 14:57 Tayler Lawrence Jones
wrote:
> A left join with null filter is only the same as a left anti join if the
> join keys can be guaranteed unique in the existing data. Sinc
On Mon, 4 Jun 2018 at 6:42 am, Tayler Lawrence Jones <
> t.jonesd...@gmail.com> wrote:
>
>> The issue is not the append vs overwrite - perhaps those responders do
>> not know Anti join semantics. Further, Overwrite on s3 is a bad pattern due
>> to s3 eventual consiste
The issue is not append vs. overwrite - perhaps those responders do not
know anti join semantics. Further, overwrite on S3 is a bad pattern due to
S3 eventual consistency issues.
First, your SQL query is wrong, as you don’t close the parenthesis of the
CTE (the “with” part). In fact, it looks like
It is an open issue with the Hadoop file committer, not Spark. The simple
workaround is to write to HDFS and then copy to S3. Netflix gave a talk about
their custom output committer at the last Spark Summit, which is a clever,
efficient way of doing that - I’d check it out on YouTube. They have open
sourced