Re: schema change for structured spark streaming using jsonl files

2018-04-25 Thread Michael Segel
Hi, This is going to sound complicated. Taken as an individual JSON document, because it’s a self-contained schema doc, it’s structured. However, there isn’t a persistent schema that has to be consistent across multiple documents. So you can consider it semi-structured. If you’re parsing the
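A minimal sketch of the structured vs. semi-structured distinction, in plain Python rather than Spark. Each line parses to a complete document with its own fields, but nothing forces two lines to agree. The sample lines, the field name `field2`, and the helpers `infer_schema` and `merge_schemas` are hypothetical, made up for illustration only.

```python
import json


def infer_schema(doc):
    """Map each top-level field of one document to its Python type name."""
    return {key: type(value).__name__ for key, value in doc.items()}


def merge_schemas(schemas):
    """Union per-document schemas: field -> sorted list of type names seen."""
    merged = {}
    for schema in schemas:
        for field, type_name in schema.items():
            merged.setdefault(field, set()).add(type_name)
    return {field: sorted(types) for field, types in merged.items()}


# Hypothetical jsonl content: the second document carries an extra field.
lines = [
    '{"field1": "a"}',
    '{"field1": "b", "field2": 7}',
]

# Each document is fully described by its own contents.
per_doc = [infer_schema(json.loads(line)) for line in lines]

# A schema for the whole collection only exists once the reader computes one.
merged = merge_schemas(per_doc)
```

The per-document schemas differ here, so any fixed schema for the collection is a choice made by the reader, not something the files guarantee.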

Re: schema change for structured spark streaming using jsonl files

2018-04-24 Thread Lian Jiang
Thanks for any help!
On Mon, Apr 23, 2018 at 11:46 AM, Lian Jiang wrote:
> Hi,
>
> I am using Spark Structured Streaming, which reads jsonl files and writes
> into Parquet files. I am wondering what the process is if the jsonl files'
> schema changes.
>
> Suppose jsonl files are

schema change for structured spark streaming using jsonl files

2018-04-23 Thread Lian Jiang
Hi, I am using Spark Structured Streaming, which reads jsonl files and writes into Parquet files. I am wondering what the process is if the jsonl files' schema changes. Suppose the jsonl files are generated in the \jsonl folder and the old schema is { "field1": String }. My proposal is: 1. write the jsonl files