When Spark reads files it can infer the schema, and we have the option to disable that inference. Is there a way to ask Spark to re-infer the schema of a column after the data has already been read, the same way it infers it when reading JSON?
The reason we want this is a problem in our data files. We have a JSON file containing lines like {"a": NESTED_JSON_VALUE} and {"a": "null"}. The value should have been an empty JSON object, but due to a bug it became the string "null" instead. Now, when we read the file, the column "a" is inferred as a string. What we want instead is to ask Spark to read the file treating "a" as a string, filter out the "null" values (or replace them with an empty JSON object), and then have Spark re-infer the schema of "a" after the fix so we can access the nested JSON properly.