I think you need some sort of fuzzy join?
Is it always the case that one title is a substring of the other?
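
To make the question concrete, here is a minimal sketch of what a substring-based fuzzy join followed by a group-by could look like. It is plain Python rather than Spark, and everything in it is an assumption for illustration: the helper names (normalize, fuzzy_join, group_by_shorter_title), the sample rows, and the choice of the shorter title as the grouping key are all hypothetical, not something taken from the original tables.

```python
from collections import defaultdict


def normalize(title):
    # Lowercase and collapse whitespace so "Doctor " and "doctor" compare equal.
    return " ".join(title.lower().split())


def fuzzy_join(left_rows, right_rows):
    # Pair rows whose normalized titles are equal, or where one title
    # is a substring of the other. This is a nested loop, so O(n * m).
    pairs = []
    for left in left_rows:
        for right in right_rows:
            a = normalize(left["title"])
            b = normalize(right["title"])
            if a and b and (a in b or b in a):
                pairs.append((left, right))
    return pairs


def group_by_shorter_title(pairs):
    # Assumption: the shorter of the two matched titles is the group key.
    groups = defaultdict(list)
    for left, right in pairs:
        key = min(normalize(left["title"]), normalize(right["title"]), key=len)
        groups[key].append((left, right))
    return dict(groups)


# Hypothetical sample data, not from the original post.
table1 = [{"id": 1, "title": "doctor"}, {"id": 2, "title": "nurse"}]
table2 = [
    {"id": 10, "title": "Doctor "},
    {"id": 11, "title": "senior doctor"},
    {"id": 12, "title": "pilot"},
]

pairs = fuzzy_join(table1, table2)
groups = group_by_shorter_title(pairs)
```

Note that plain substring matching is loose (a short title would also match every longer title that happens to contain it), so this only makes sense if the answer to the question above is yes. In Spark the equivalent would be a join on a non-equality condition, which is considerably more expensive than an equi-join.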
On Tue, Mar 15, 2016 at 6:46 AM, Suniti Singh
wrote:
> Hi All,
>
> I have two tables with same schema but different data. I have to join the
> tables based on one column and then do a grou
Hi All,
I have two tables with the same schema but different data. I have to join the
tables on one column and then do a group by on that same column.
Now, the data in that column might or might not match exactly between the two
tables. (Ex: the column name is "title". Table1.title = "doctor" and Table2. ti
right.
However, I think it's the developer's choice to purposely drop the guarantee,
as when they use the existing DStream.repartition, where the original
per-topic-partition in-order processing is no longer preserved either.
Do you agree?
On Thu, Mar 10, 2016 at 12:12 PM, Cody Koeninger wrote:
> The c