Re: Source.getBatch and schema vs qe.analyzed.schema?

2021-04-03 Thread Jacek Laskowski
Hi Bartosz, This is not a question about whether the data source supports a fixed or a user-defined schema, but about which schema to use when a streaming batch is requested in Source.getBatch. Regards, Jacek Laskowski https://about.me/JacekLaskowski "The Internals Of" Online Books

Re: Source.getBatch and schema vs qe.analyzed.schema?

2021-03-31 Thread Bartosz Konieczny
Hi Jacek, An interesting question! I don't know the exact answer and will be happy to learn it along the way :) Below is my understanding of these two things; I hope it helps a little. As I see it, we can distinguish two different source categories. The first of them is a source with some fixed

Source.getBatch and schema vs qe.analyzed.schema?

2021-03-29 Thread Jacek Laskowski
Hi, I've been developing a data source with a source and sink for Spark Structured Streaming. I've got a question about Source.getBatch [1]:
  def getBatch(start: Option[Offset], end: Offset): DataFrame
getBatch returns a streaming DataFrame between the offsets so the idiom (?) is to have a