Replication join = broadcast join. Search for that term on Google; there are many examples.
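To make the idea concrete, here is a minimal plain-Python sketch of what a broadcast (replicated) join does. It is not Spark code: the function name `broadcast_join` and the sample data are made up for illustration. The small table is turned into a lookup once and "replicated" to every partition of the large table, so the large table never has to be shuffled. In Spark itself the usual way to request this is the `broadcast` hint in `org.apache.spark.sql.functions`.

```python
from collections import defaultdict


def broadcast_join(large_partitions, small_rows, key):
    """Inner-join each partition of the large side against a replicated small side."""
    # Build the lookup once; on a cluster, this is the object that would be
    # shipped (broadcast) to every worker.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined = []
    for partition in large_partitions:  # each partition is handled independently
        for left in partition:
            # .get() avoids creating empty entries for keys with no match
            for right in lookup.get(left[key], []):
                joined.append({**right, **left})
    return joined


# Hypothetical data: a large "orders" table in two partitions, a small lookup table.
orders = [
    [{"country": "IN", "order_id": 1}, {"country": "US", "order_id": 2}],
    [{"country": "IN", "order_id": 3}, {"country": "FR", "order_id": 4}],
]
countries = [
    {"country": "IN", "name": "India"},
    {"country": "US", "name": "United States"},
]

result = broadcast_join(orders, countries, "country")
for row in result:
    print(row)
```

The order from "FR" is dropped because this sketch is an inner join and the small table has no matching row.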
A semi join can be done on DataFrames/Datasets by passing "leftsemi" as the join type (the third parameter) to the join function; the string "semi join" itself is not a valid join type. joinWith also takes a join type, but check that your Spark version accepts a semi join there. Not sure about the other two.

From: Aakash Basu<mailto:aakash.spark....@gmail.com>
Sent: Thursday, November 17, 2016 3:41 PM
To: user@spark.apache.org<mailto:user@spark.apache.org>
Subject: HDPCD SPARK Certification Queries

Hi all,

I want to know more about this examination - http://hortonworks.com/training/certification/exam-objectives/#hdpcdspark

If anyone here has taken the examination, could you kindly help?

1) What kind of questions come up,
2) Samples,
3) All the other details.

Thanks,
Aakash.
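On the semi join mentioned in the reply above: the semantics can be sketched in a few lines of plain Python. This is not Spark code, and the function name `left_semi_join` and the sample rows are hypothetical. A left semi join keeps each left-side row that has at least one match on the right, returns only the left side's columns, and never duplicates a left row even when several right rows match.

```python
def left_semi_join(left_rows, right_rows, key):
    """Keep left rows whose key appears on the right; return left columns only."""
    right_keys = {row[key] for row in right_rows}  # only key membership matters
    return [row for row in left_rows if row[key] in right_keys]


# Hypothetical data: customer 1 has two purchases, customer 2 has none.
customers = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Ben"},
    {"id": 3, "name": "Chen"},
]
purchases = [
    {"id": 1, "item": "book"},
    {"id": 1, "item": "pen"},
    {"id": 3, "item": "lamp"},
]

matched = left_semi_join(customers, purchases, "id")
for row in matched:
    print(row)
```

An inner join on the same data would return Ana twice, once per purchase, and would include the `item` column; the semi join does neither.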