Hi All,

Processing streaming JSON files with Spark Streaming and Spark SQL is very efficient and works like a charm.
Below is the code snippet I use to process the JSON files:

    windowDStream.foreachRDD(IncomingFiles => {
      // Infer a schema from the JSON records in this batch and expose it as a table
      val IncomingFilesTable = sqlContext.jsonRDD(IncomingFiles)
      IncomingFilesTable.registerAsTable("IncomingFilesTable")
      // Select the "text" field, collect it to the driver, and write it back out
      val result = sqlContext.sql("select text from IncomingFilesTable").collect
      sc.parallelize(result).saveAsTextFile("filepath")
    })

However, I find it difficult to use Spark features efficiently with streaming XML files (each compressed file is about 4 MB). What is the best approach for processing compressed XML files?

Regards,
Vijay