"parquet")
> > .option("path", path.toString)
> > .outputMode("append")
> > .start()
> > .processAllAvailable()
> > spark.read.format("parquet").load(path.toString).count mustBe 1159
> >
> > logLinesDF.write.format("parquet").mode("append").save(path.toStrin
> > g)
> > spark.read.format("parquet").load(path.toString).count mustBe
> > 2*1159
> > }
> >
> > Does anyone have an idea what I am doing wrong here?
> >
> > thanks in advance
> > Eugen Wintersberger
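For reference, a self-contained reconstruction of the quoted test; the names (logLinesDF, path, checkpointPath) are taken from or assumed around the fragment, and the checkpointLocation option, which the file sink requires but the fragment does not show, is an assumption:

```scala
// Stream the DataFrame into a parquet directory, then verify the count.
val query = logLinesDF.writeStream
  .format("parquet")
  .option("path", path.toString)
  .option("checkpointLocation", checkpointPath.toString) // required by the file sink
  .outputMode("append")
  .start()
query.processAllAvailable()
spark.read.format("parquet").load(path.toString).count mustBe 1159

// Append the same rows again with a plain batch write.
logLinesDF.write.format("parquet").mode("append").save(path.toString)
spark.read.format("parquet").load(path.toString).count mustBe 2 * 1159
```

One thing worth noting: the file-stream sink keeps a `_spark_metadata` log in the output directory, and when Spark reads a directory containing that log it only considers the files recorded there. That can make files added by a plain batch write invisible to the second read, which would explain the failing assertion.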
> On 21, 2021 at 2:49 AM, wrote:
> > Hi all,
> > I stumbled upon an interesting problem. I have an existing
> > Delta Lake table with data recovered from a backup and would like to
> > append to this table using Spark Structured Streaming. This
> > does not work
file with structured streaming, then appending
to this file with a streaming job (at least with the same job) works
flawlessly. Did I misunderstand something here?
best regards
Eugen Wintersberger
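For context, appending to an existing Delta table from a stream normally looks like the following minimal sketch; the names eventsDF, deltaPath, and ckptPath are hypothetical, and it assumes the delta-spark package is on the classpath:

```scala
// Append a streaming DataFrame to an existing Delta table at deltaPath.
// The checkpoint directory tracks progress across restarts.
val query = eventsDF.writeStream
  .format("delta")
  .outputMode("append")
  .option("checkpointLocation", ckptPath)
  .start(deltaPath)

query.processAllAvailable()
```

If the table already exists, the streaming write must match its schema; a mismatch between the recovered backup's schema and the stream is one common reason such an append fails.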
Hi folks,
I am trying to read the message headers from a Kafka structured
stream which should be stored in a column named ``headers``.
I try something like this:
val stream = sparkSession.readStream.format("kafka").load()
stream.map(row => {
...
val headers =
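A sketch of one way to surface the headers column, assuming Spark 3.0 or later (where the Kafka source accepts an includeHeaders option) and placeholder connection options; without that option the source does not emit a headers column at all:

```scala
import org.apache.spark.sql.functions.col

// Hypothetical broker and topic; replace with real values.
val stream = sparkSession.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "my-topic")
  .option("includeHeaders", "true") // required to get the headers column
  .load()

// headers arrives as array<struct<key: string, value: binary>>
val withHeaders = stream.select(col("key"), col("value"), col("headers"))
```

Inside a typed map, each header value is a byte array, so it usually needs an explicit decode (e.g. new String(bytes, "UTF-8")) before use.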
Hi,
I was wondering if it would be possible to fit only the intercept on
a LinearRegression instance by providing a known coefficient?
Here is some background information: we have a problem where linear
regression is well suited as a predictor. However, the model requires
continuous adaptation.
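Spark ML's LinearRegression has no way to pin coefficients, but for a single feature with a known slope the intercept-only fit has a closed form: with the slope fixed, the least-squares intercept is just the mean residual. A sketch under that assumption, with hypothetical column names label and x and a hypothetical known slope w:

```scala
import org.apache.spark.sql.functions.{avg, col, lit}

val w = 2.5 // known coefficient (assumed)

// With the slope fixed at w, minimizing sum((y - w*x - b)^2) over b
// gives b = avg(y - w*x).
val intercept = df
  .select(avg(col("label") - lit(w) * col("x")).as("b"))
  .first()
  .getDouble(0)
```

For multiple known coefficients the same idea applies: subtract the known linear term from the label and average the residuals, so no iterative refit is needed as new data arrives.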