Good day everyone!

Has anyone tried to de-duplicate records based on Avro-generated classes? These
classes extend SpecificRecordBase (which implements SpecificRecord) and so
inherit its equals and hashCode implementations. However, when I call .distinct
on my PairRDD (both key and value are Avro classes), it eliminates records that
are NOT duplicates. Any help or suggestions would be appreciated!

I am using Spark 2.0 with Kafka 0.8.2.0 (the Scala 2.10 build, 2.10-0.8.2.0).

Thanks and have a nice day!
