Hi, Bobby:
Do you have a design doc? I'm also interested in this topic and want to
contribute.
On Tue, Apr 2, 2019 at 10:00 PM Bobby Evans wrote:
> Thanks to everyone for the feedback.
>
> Overall the feedback has been really positive for exposing columnar as a
> processing option to users.
Hi Steve,
Thanks for your feedback. From your email, I could gather the following two
important points:
1. Report failures to something (cluster manager) which can opt to
destroy the node and request a new one
2. Pluggable failure detection algorithms
Regarding #1, current blacklisting
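Sketching point #2 in isolation: a pluggable failure-detection policy could be a small interface the scheduler consults on each task failure, with "replace" meaning "report to the cluster manager and request a new node" (point #1). Everything below — the `FailureDetector` interface, the `ignore`/`blacklist`/`replace` actions, and the thresholds — is a hypothetical illustration, not Spark's actual blacklisting API.

```python
from abc import ABC, abstractmethod


class NodeHealth:
    """Rolling per-node failure counts, as reported by the scheduler."""
    def __init__(self):
        self.failures = {}


class FailureDetector(ABC):
    """Hypothetical pluggable policy: decide what to do with a failing node."""

    @abstractmethod
    def on_task_failure(self, node: str, health: NodeHealth) -> str:
        """Return 'ignore', 'blacklist', or 'replace'."""


class ThresholdDetector(FailureDetector):
    """One possible policy: blacklist after a few failures, replace after many."""

    def __init__(self, blacklist_after=2, replace_after=5):
        self.blacklist_after = blacklist_after
        self.replace_after = replace_after

    def on_task_failure(self, node, health):
        count = health.failures.get(node, 0) + 1
        health.failures[node] = count
        if count >= self.replace_after:
            return "replace"    # ask the cluster manager for a fresh node
        if count >= self.blacklist_after:
            return "blacklist"  # stop scheduling on this node for now
        return "ignore"
```

A cloud deployment could swap in a time-decaying policy here without touching the scheduler, which is the appeal of making the algorithm pluggable.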
On Tue, Apr 2, 2019 at 12:23 PM Vinoo Ganesh wrote:
> @Sean – To the point that Ryan made, it feels wrong that stopping a session
> force stops the global context. Building in the logic to only stop the
> context when the last session is stopped also feels like a solution, but the
> best way I
I am totally fine with waiting a few days for the latest Arrow release... not
at all a problem.
On Tue, Apr 2, 2019 at 9:14 AM Bryan Cutler wrote:
> Nice work Shane! That all sounds good to me. We might want to use pyarrow
> 0.12.1 though, there is a major bug that was fixed, but we can discuss
// Merging threads
Thanks everyone for your thoughts. I’m very much in sync with Ryan here.
@Sean – To the point that Ryan made, it feels wrong that stopping a session
force stops the global context. Building in the logic to only stop the context
when the last session is stopped also feels
Dear Spark developers,
We noticed that the cache name can change upon table refreshing. This is
because CatalogImpl.refreshTable first uncaches and then recaches (lazily)
without preserving the cache name (or its storage level). IMHO, this is not
what a user would expect.
I submitted a
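A minimal sketch of the fix being described: capture the cache's name and storage level before uncaching, and re-apply them when recaching. The `CacheRegistry` below is a toy stand-in for illustration only — it is not `CatalogImpl` or Spark's real cache manager.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    name: Optional[str]
    storage_level: str


class CacheRegistry:
    """Toy stand-in for a catalog's cache manager."""

    def __init__(self):
        self._entries = {}

    def cache(self, table, name=None, storage_level="MEMORY_AND_DISK"):
        self._entries[table] = CacheEntry(name, storage_level)

    def uncache(self, table):
        return self._entries.pop(table, None)

    def refresh(self, table):
        # Preserve the existing cache name and storage level across the
        # uncache/recache cycle, instead of recaching with defaults.
        old = self.uncache(table)
        if old is not None:
            self.cache(table, name=old.name, storage_level=old.storage_level)
```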
I am not sure how it would cause a leak, though. When a Spark session or the
underlying context is stopped, it should clean up everything. getOrCreate is
supposed to return the active thread-local or the global session. Maybe if
you keep creating new sessions after explicitly clearing the
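The getOrCreate resolution order described above — the thread's active session first, falling back to the global default — can be sketched roughly like this. The `Session` class is a simplified stand-in, not Spark's implementation:

```python
import threading


class Session:
    _global_lock = threading.Lock()
    _default = None                  # process-wide default session
    _active = threading.local()      # per-thread active session

    @classmethod
    def get_or_create(cls):
        # 1) Return the thread's active session, if one was set.
        active = getattr(cls._active, "session", None)
        if active is not None:
            return active
        # 2) Otherwise return the global default, creating it on first use.
        with cls._global_lock:
            if cls._default is None:
                cls._default = cls()
            return cls._default

    def set_active(self):
        Session._active.session = self
```

Under this model, repeatedly clearing the active session and creating new ones leaves the old sessions unreferenced by the registry, so anything else still holding them (e.g. a listener) would keep them alive.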
Nice work Shane! That all sounds good to me. We might want to use pyarrow
0.12.1 though, there is a major bug that was fixed, but we can discuss in
the PR. I will put up the code changes in the next few days.
Felix, I think you're right about Python 3.5; they just list one upcoming
release and
I think Vinoo is right about the intended behavior. If we support multiple
sessions in one context, then stopping any one session shouldn't stop the
shared context. The last session to be stopped should stop the context, but
not any before that. We don't typically run multiple sessions in the same
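One way to sketch the "last session stopped stops the context" semantics is a reference count on the shared context. This is a sketch of the proposed behavior, not how stopping a session works today:

```python
class Context:
    """Stand-in for a shared context used by multiple sessions."""

    def __init__(self):
        self.stopped = False
        self._sessions = 0

    def _retain(self):
        self._sessions += 1

    def _release(self):
        self._sessions -= 1
        if self._sessions == 0:
            self.stopped = True  # last session out stops the shared context


class Session:
    def __init__(self, ctx):
        self.ctx = ctx
        ctx._retain()

    def stop(self):
        self.ctx._release()
```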
Yeah there's one global default session, but it's possible to create
others and set them as the thread's active session, to allow for
different configurations in the SparkSession within one app. I think
you're asking why closing one of them would effectively shut all of
them down by stopping the
Hey Sean - Cool, maybe I'm misunderstanding the intent of clearing a session
vs. stopping it.
The cause of the leak looks to be because of this line here
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala#L131.
The
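If the leak is a per-session listener left registered on the context-wide bus — my reading of the line linked above, so treat it as an assumption — the shape of the problem and of a fix can be sketched like this:

```python
class ListenerBus:
    """Stand-in for a context-wide listener bus shared by all sessions."""

    def __init__(self):
        self.listeners = []

    def add(self, listener):
        self.listeners.append(listener)

    def remove(self, listener):
        self.listeners.remove(listener)


class Session:
    def __init__(self, bus):
        self.bus = bus
        # The per-session listener holds a reference back to its session;
        # while it stays on the shared bus, the session can never be GC'd.
        self.listener = ("query-execution-listener", self)
        bus.add(self.listener)

    def stop(self):
        # Possible fix: deregister from the shared bus on stop, so the bus
        # no longer pins the stopped session in memory.
        self.bus.remove(self.listener)
```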
What are you expecting there ... that sounds correct? Something else
needs to be closed?
On Tue, Apr 2, 2019 at 9:45 AM Vinoo Ganesh wrote:
>
> Hi All -
>
> I’ve been digging into the code and looking into what appears to be a
> memory leak (https://jira.apache.org/jira/browse/SPARK-27337)
Hi All -
I’ve been digging into the code and looking into what appears to be a memory
leak (https://jira.apache.org/jira/browse/SPARK-27337) and have noticed
something kind of peculiar about the way closing a SparkSession is handled.
Despite being marked as Closeable, closing/stopping a
Thanks to everyone for the feedback.
Overall the feedback has been really positive for exposing columnar as a
processing option to users. I'll write up a SPIP on the proposed changes
to support columnar processing (not necessarily implement it) and then ping
the list again for more feedback and
On Fri, Mar 29, 2019 at 6:18 PM Reynold Xin wrote:
> We tried enabling blacklisting for some customers and in the cloud, very
> quickly they end up having 0 executors due to various transient errors. So
> unfortunately I think the current implementation is terrible for cloud
> deployments, and
Thanks for the feedback!
As I haven't received any comments recently, and I hope I have addressed the
previous ones, I'll advance to the next step and open the related JIRAs for
both Spark and Hive.
Cheers,
Gabor
On Thu, Mar 21, 2019 at 12:00 PM Gabor Kaszab
wrote:
> Thanks for the quick