Hi all
I'm trying to parse JSON using an existing schema, but I get rows where every column is NULL.

    // get the schema from an existing table
    val df_schema = spark.sqlContext.sql("select c1,c2,…cn from t1 limit 1")

    // read the JSON file
    val f = sc.textFile("/tmp/x")

    // load the JSON into a DataFrame using that schema
    val df = spark.sqlContext.read
      .option("columnNameOfCorruptRecord", "xxx")
      .option("mode", "PERMISSIVE")
      .schema(df_schema.schema)
      .json(f)

The documentation says you can query the corrupted rows through the columnNameOfCorruptRecord column:

    "columnNameOfCorruptRecord (default is the value specified in
    spark.sql.columnNameOfCorruptRecord): allows renaming the new field
    having malformed string created by PERMISSIVE mode. This overrides
    spark.sql.columnNameOfCorruptRecord."

The question is: how do I fetch those corrupted rows?

Thanks,
Yehuda
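For what it's worth, here is a sketch of one way this is usually done — not tested against your data, and it assumes the file path /tmp/x and the corrupt-column name "xxx" from your snippet. The key point is that when you pass an explicit schema, Spark only keeps the malformed text if the schema itself contains the corrupt-record column as a nullable StringType field; otherwise you just see all-NULL rows:

```scala
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{StringType, StructField}

// Append the corrupt-record column to the existing schema so PERMISSIVE
// mode has somewhere to put the raw malformed text.
val schemaWithCorrupt = df_schema.schema
  .add(StructField("xxx", StringType, nullable = true))

val df = spark.read
  .option("columnNameOfCorruptRecord", "xxx")
  .option("mode", "PERMISSIVE")
  .schema(schemaWithCorrupt)
  .json("/tmp/x")

// Corrupted rows are the ones where the corrupt-record column is non-null.
val corrupted = df.filter(col("xxx").isNotNull)
corrupted.show(truncate = false)
```

One caveat: on some Spark versions (2.3+), queries that reference only the corrupt-record column are rejected; calling df.cache() before filtering is the usual workaround if you hit that error.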