+1, I'd like to get a release out with SPARK-23852 fixed. The Parquet
community is about to release 1.8.3 - the voting period closes tomorrow -
and I've tested it with Spark 2.3 and confirmed the bug is fixed. Hopefully
it is released and I can post the version change to branch-2.3 before you
Parquet has a Java patch release, 1.8.3, that should pass tomorrow morning.
I think the plan is to get that in to fix a bug with Parquet data written
by Impala.
On Thu, May 10, 2018 at 11:09 AM, Marcelo Vanzin
wrote:
Hello all,
It's been a while since we shipped 2.3.0 and lots of important bug
fixes have gone into the branch since then. I took a look at Jira and
it seems there aren't many things explicitly targeted at 2.3.1 -
the only potential blocker (a Parquet issue) is being worked on since
a new
Huge +1 on this!
From: holden.ka...@gmail.com on behalf of Holden Karau
Sent: Thursday, May 10, 2018 9:39:26 AM
To: Joseph Bradley
Cc: dev
Subject: Re: Revisiting Online serving of Spark models?
On Thu, May 10, 2018 at 9:25 AM, Joseph Bradley
wrote:
> Thanks for bringing this up Holden! I'm a strong supporter of this.
>
Awesome! I'm glad other folks think something like this belongs in Spark.
It would be fantastic if we could make it easier to debug Spark programs
without needing to rely on eager execution.
I agree, it would be great if we could make the errors clearer about
where the failure happened (in user code or in Spark code) and what assumption
was violated. The problem is
Thanks for bringing this up Holden! I'm a strong supporter of this.
This was one of the original goals for mllib-local: to have local versions
of MLlib models which could be deployed without the big Spark JARs and
without a SparkContext or SparkSession. There are related commercial
offerings
If they are struggling to find bugs in their program because of the lazy execution
model of Spark, they are going to struggle to debug issues when the program
runs into problems in production. Learning how to debug Spark is part of
learning Spark. It’s better that they run into issues in the
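The lazy-execution point above can be sketched without a cluster. This is a hedged, plain-Python analogy (a generator standing in for a Spark transformation, `sum` standing in for an action; the names `transform`, `records`, and `pipeline` are illustrative, not Spark APIs): the bad record causes no error when the pipeline is defined, only when it is finally consumed, which is exactly why failures can surface far from the code that caused them.

```python
# Analogy to Spark's lazy execution, using a plain Python generator.
# Like rdd.map(...), building the generator does no work and raises nothing.

def transform(records):
    # "Transformation": lazily parse each record; nothing runs yet,
    # so the malformed record is not touched here.
    return (int(r) for r in records)

records = ["1", "2", "not-a-number"]
pipeline = transform(records)  # no error at this point

# "Action": only now is each record evaluated, and only now does
# the failure appear - far from where the bad data was introduced.
try:
    total = sum(pipeline)
except ValueError as e:
    print("failed during the action, not the transformation:", e)
```

The traceback the user sees points at the action (`sum`, or in Spark a `collect`/`count`), not at the transformation that contained the faulty logic, which is the debugging skill the paragraph above argues users need to learn.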