Re: Spark 2.1 - Inferring schema of dataframe after reading json files not during

2017-06-02 Thread vaquar khan
You can add a filter, or replace null with a value such as 0 or a string:
df.na.fill(0, Seq("y"))
Regards, Vaquar khan

On Jun 2, 2017 11:25 AM, "Alonso Isidoro Roman" wrote:
not sure if this can help you, but you can infer programmatically the schema providing a json schema file, val path: Path = new P
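The null-replacement suggestion in this reply can be sketched as follows. This is a minimal illustration only: the local SparkSession, the sample rows, and the column names "x" and "y" are assumptions made for the example, not details taken from the thread. It is written as top-level statements, as one would paste into spark-shell.

```scala
import org.apache.spark.sql.SparkSession

// Assumed local session, for illustration only.
val spark = SparkSession.builder().master("local[*]").appName("na-fill-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: the second row has a null in column "y".
val df = Seq[(Int, Option[Long])]((1, Some(10L)), (2, None)).toDF("x", "y")

// Option 1: replace nulls in "y" with 0.
val filled = df.na.fill(0, Seq("y"))

// Option 2: filter out the rows where "y" is null.
val filtered = df.filter($"y".isNotNull)
```

Which option fits depends on whether a default value is meaningful for the column; dropping rows loses data, while filling hides the fact that a value was missing.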

Re: Spark 2.1 - Inferring schema of dataframe after reading json files not during

2017-06-02 Thread Alonso Isidoro Roman
Not sure if this helps, but you can supply the schema programmatically by providing a JSON schema file:
val path: Path = new Path(schema_parquet)
val fileSystem = path.getFileSystem(sc.hadoopConfiguration)
val inputStream: FSDataInputStream = fileSystem.open(path)
val schema_json = Stream.con
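The message is cut off before it shows how the schema file is used, so the following is only one way the idea could look, not the author's code. It assumes the file holds the JSON form that `StructType.json` produces and rebuilds the schema with `DataType.fromJson`. The in-memory schema string, the sample records, and the local SparkSession are hypothetical stand-ins for the file contents and the cluster setup.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DataType, LongType, StringType, StructField, StructType}

// Assumed local session, for illustration only.
val spark = SparkSession.builder().master("local[*]").appName("schema-json-sketch").getOrCreate()

// Hypothetical schema; in practice this JSON string would be read from the schema file.
val original = StructType(Seq(StructField("x", LongType), StructField("y", StringType)))
val schemaJson: String = original.json

// Rebuild the StructType from its JSON representation.
val restored = DataType.fromJson(schemaJson).asInstanceOf[StructType]

// Apply the restored schema explicitly instead of letting Spark infer one.
val records = spark.sparkContext.parallelize(Seq("""{"x": 1, "y": "a"}""", """{"x": 2}"""))
val df = spark.read.schema(restored).json(records)
```

Supplying the schema this way skips inference entirely, so fields missing from a record come back as null rather than changing the column types.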

Spark 2.1 - Inferring schema of dataframe after reading json files not during

2017-06-02 Thread Aseem Bansal
When we read files in Spark, it infers the schema. We also have the option not to infer the schema. Is there a way to ask Spark to infer the schema again, just as it does when reading JSON? We want this because we have a problem in our data files. We have a JSON file containing this
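One possible way to run inference after the read rather than during it is to load the records as plain strings first and hand them to the JSON reader afterwards. This is a sketch under assumptions, not an answer from the thread: the sample lines, the field names, and the local SparkSession are hypothetical, and the actual problem in the data files is not shown in this preview. It uses `DataFrameReader.json(RDD[String])`, which is available in Spark 2.1.

```scala
import org.apache.spark.sql.SparkSession

// Assumed local session, for illustration only.
val spark = SparkSession.builder().master("local[*]").appName("reinfer-sketch").getOrCreate()
import spark.implicits._

// Hypothetical raw records, held as strings so no schema is inferred yet.
// With real files, spark.read.textFile(...) would yield a similar Dataset[String].
val lines = Seq("""{"x": 1, "y": 2.5}""", """{"x": 2, "y": null}""").toDS()

// Any cleanup of the raw strings would go here, before inference.

// Ask Spark to infer the schema from the strings, as it does when reading JSON files.
val inferred = spark.read.json(lines.rdd)
inferred.printSchema()
```

The same call also works on an existing DataFrame by passing `df.toJSON.rdd`, though values that are already strings are inferred as strings again.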