SparkSQL AVRO

2015-12-07 Thread Test One
I'm using spark-avro with SparkSQL to process and output avro files. My data has the following schema:
root
 |-- memberUuid: string (nullable = true)
 |-- communityUuid: string (nullable = true)
 |-- email: string (nullable = true)
 |-- firstName: string (nullable = true)
 |-- lastName: string

Merging two avro RDD/DataFrames

2015-09-28 Thread TEST ONE
I have a daily update of modified users (~100s) output as avro from ETL. I need to find these users and merge them with the corresponding existing members in a master avro file (~100,000s). The merge operation involves merging a ‘profiles’ Map between the matching records. What would be the
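
For illustration only, here is a minimal plain-Python sketch of the merge semantics described in the snippet above. It does not use Spark or spark-avro. It assumes records are keyed by `memberUuid` (borrowed from the schema in the other thread, so hypothetical here), that unmatched updates are added as new members, and that on a key collision inside `profiles` the update wins. The function name `merge_members` and the sample data are made up for the example. None of these choices are confirmed by the post.

```python
def merge_members(master, updates, key="memberUuid"):
    """Return a new list of records: master with updates merged in by key."""
    # Index shallow copies of the master records by key for O(1) lookup.
    merged = {rec[key]: dict(rec) for rec in master}
    for upd in updates:
        existing = merged.get(upd[key])
        if existing is None:
            # Assumption: an update with no match in master becomes a new member.
            merged[upd[key]] = dict(upd)
            continue
        # Merge the two 'profiles' maps into a fresh dict.
        # Assumption: entries from the update override entries in master.
        profiles = dict(existing.get("profiles") or {})
        profiles.update(upd.get("profiles") or {})
        # Assumption: other fields are taken from the update where present.
        merged[upd[key]] = {**existing, **upd, "profiles": profiles}
    return list(merged.values())


# Hypothetical sample data, far smaller than the sizes mentioned in the post.
master = [
    {"memberUuid": "a1", "email": "a@example.com", "profiles": {"web": "v1"}},
    {"memberUuid": "b2", "email": "b@example.com", "profiles": {"web": "v1"}},
]
updates = [
    {"memberUuid": "a1", "email": "a@example.com", "profiles": {"mobile": "v2"}},
]
result = merge_members(master, updates)
```

With a small update set and a much larger master set, the same idea would typically be expressed in Spark as a join on the key followed by a per-row map merge. The sketch above only pins down what "merging" means for a single matched pair.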