Hello,
I am trying to filter records in a Hive table. The table has more than 4 billion rows, and I do a left semi join between it and a small table with about 1k rows. However, after running for 3 hours, the job ended with a failed status. My questions are: 1. How can I diagnose this problem and ultimately solve it? 2. Are there other good ways to filter records by given conditions? The following picture is a snapshot of the failed job.
<<image003.jpg>>
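For concreteness, here is a minimal sketch of the kind of query described above. The names big_table, small_table and id are hypothetical placeholders, not the real schema, and the actual query may differ in its details.

```sql
-- Hypothetical names: big_table (4 billion+ rows), small_table (about 1k rows),
-- and id as the join column. None of these come from the real job.
-- In Hive, the right-hand table of a LEFT SEMI JOIN may only be referenced
-- in the ON clause, not in the SELECT list or the WHERE clause.
SELECT b.*
FROM big_table b
LEFT SEMI JOIN small_table s
  ON (b.id = s.id);
```

A query of this shape keeps only the rows of big_table whose id appears in small_table, which is the filtering behaviour I am after.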