Hi Dev community,
A large data skew is causing memory problems in my cluster. I was
wondering whether anyone has tackled this with their own hash function, and
whether it worked on the same cluster configuration.
Thanks,
Sejal
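
For readers following the question above, here is a minimal sketch of one common way to apply a custom hash to skewed keys, usually called key salting. It is plain Python rather than Spark code, and the names NUM_PARTITIONS, SALT_BUCKETS, plain_partition, salted_partition and partition_sizes are hypothetical, chosen only for illustration. The idea is that hashing the key together with a small random salt spreads one hot key over several partitions instead of one.

```python
import random
from collections import Counter
from zlib import crc32

NUM_PARTITIONS = 8  # hypothetical partition count
SALT_BUCKETS = 8    # hypothetical number of salt values per key


def plain_partition(key: str) -> int:
    """Partition by a stable hash of the key alone."""
    return crc32(key.encode("utf-8")) % NUM_PARTITIONS


def salted_partition(key: str, salt: int) -> int:
    """Partition by a stable hash of the key combined with a salt."""
    return crc32(f"{key}#{salt}".encode("utf-8")) % NUM_PARTITIONS


def partition_sizes(keys, salted: bool, seed: int = 0) -> Counter:
    """Count how many records land in each partition."""
    rng = random.Random(seed)
    sizes = Counter()
    for key in keys:
        if salted:
            salt = rng.randrange(SALT_BUCKETS)
            sizes[salted_partition(key, salt)] += 1
        else:
            sizes[plain_partition(key)] += 1
    return sizes


# Skewed input (made up for this sketch): one hot key dominates.
keys = ["hot"] * 9000 + [f"k{i}" for i in range(1000)]

plain = partition_sizes(keys, salted=False)
salted = partition_sizes(keys, salted=True)

print("largest partition, plain :", max(plain.values()))
print("largest partition, salted:", max(salted.values()))
```

Without the salt, every record for the hot key hashes to the same partition, so that partition holds at least 9000 of the 10000 records. With the salt, the hot key is spread over several partitions. The trade-off is that any per-key aggregation then needs a second step that strips the salt and merges the partial results, and a join needs the other side replicated once per salt value.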
Hello,
Looking at BroadcastHashJoinExec, it seems to me that it never destroys the
broadcast variables, and I think this can cause problems like SPARK-22575.
However, when I tried to add a "cleanup" step to destroy the variable, I saw
some test failures because it was trying to access the destroyed
Hi,
Why does Spark SQL need the Nondeterministic trait [1] and property? That
must be confusing for others too, not only me, right?
[1]
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L299
[2]
https://github.com/apache/spa