Hi All,

Processing streaming JSON files with Spark features (Spark Streaming and
Spark SQL) is very efficient and works like a charm.

Below is the code snippet to process JSON files.

        windowDStream.foreachRDD(IncomingFiles => {
          // Infer a schema from the incoming JSON and register it as a temp table
          val IncomingFilesTable = sqlContext.jsonRDD(IncomingFiles)
          IncomingFilesTable.registerAsTable("IncomingFilesTable")
          // Pull out the "text" column and write it back out as text files
          val result = sqlContext.sql("select text from IncomingFilesTable").collect
          sc.parallelize(result).saveAsTextFile("filepath")
        })


But I feel it's difficult to use Spark features efficiently with streaming
XML files (each compressed file would be around 4 MB).

What is the best approach for processing compressed XML files?
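
For reference, here is a minimal batch-style sketch of the kind of approach
I am considering (assuming the files are gzip-compressed; the paths and the
<text> element name are only placeholders, and scala.xml is used just for
illustration):

        import java.util.zip.GZIPInputStream

        import org.apache.spark.{SparkConf, SparkContext}

        import scala.xml.XML

        object CompressedXmlSketch {
          def main(args: Array[String]): Unit = {
            val conf = new SparkConf().setAppName("CompressedXmlSketch")
            val sc = new SparkContext(conf)

            // binaryFiles yields one (path, stream) pair per file, so each
            // small compressed file is handled as a whole document by one task.
            val xmlTexts = sc.binaryFiles("hdfs:///incoming/xml/*.gz") // placeholder path
              .map { case (path, stream) =>
                // Decompress and parse the whole XML document in memory
                // (fine here, since each compressed file is only ~4 MB).
                val in = new GZIPInputStream(stream.open())
                try {
                  val doc = XML.load(in)
                  // "text" is a placeholder element, analogous to the JSON field above.
                  (doc \\ "text").text
                } finally {
                  in.close()
                }
              }

            xmlTexts.saveAsTextFile("hdfs:///output/xml-text") // placeholder path
          }
        }

This could presumably be run against each batch of new files, but I am not
sure it is the right way to use the streaming APIs here.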

Regards
Vijay
