Re: Compare a column in two different tables/find the distance between column data

2016-03-14 Thread Wail Alkowaileet
I think you need some sort of fuzzy join? Is it always the case that one title is a substring of another? On Tue, Mar 15, 2016 at 6:46 AM, Suniti Singh wrote: > Hi All, > > I have two tables with same schema but different data. I have to join the > tables based on one column and then do a grou

Compare a column in two different tables/find the distance between column data

2016-03-14 Thread Suniti Singh
Hi All, I have two tables with the same schema but different data. I have to join the tables on one column and then group by that same column. The data in that column might or might not match exactly between the two tables. (Ex - column name is "title". Table1.title = "doctor" and Table2.ti
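
For illustration only, here is a minimal sketch of the kind of fuzzy join suggested in the reply, written in plain Python rather than Spark. It assumes an edit-distance threshold and a substring check are acceptable notions of "close enough"; the helper names (`levenshtein`, `fuzzy_join`), the threshold of 2, and the sample titles are hypothetical and do not come from the thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_join(left, right, max_distance=2):
    """Pair titles that are within max_distance edits, or where one
    normalized title contains the other. max_distance is an assumed
    threshold, not a value taken from the thread."""
    pairs = []
    for l in left:
        for r in right:
            ln, rn = l.strip().lower(), r.strip().lower()
            if not ln or not rn:
                continue  # skip empty titles; "" is a substring of everything
            d = levenshtein(ln, rn)
            if d <= max_distance or ln in rn or rn in ln:
                pairs.append((l, r, d))
    return pairs


# Hypothetical sample data, not the poster's actual tables.
table1_titles = ["doctor", "teacher"]
table2_titles = ["doctors", "nurse"]
matches = fuzzy_join(table1_titles, table2_titles)
```

This compares every pair of titles, so it only suits small inputs. In Spark itself, `org.apache.spark.sql.functions.levenshtein` computes the same distance between two string columns and can be used in a join condition, though whether that fits this use case depends on the data.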

Re: DynamicPartitionKafkaRDD - 1:n mapping between kafka and RDD partition

2016-03-14 Thread Renyi Xiong
Right. However, I think it's the developer's choice to purposely drop the guarantee, as when they use the existing DStream.repartition, where the original per-topicpartition in-order processing is also no longer observed. Do you agree? On Thu, Mar 10, 2016 at 12:12 PM, Cody Koeninger wrote: > The c