AFAICT, we can use spark.sql(s"select $name ..."), where name is a value
in the Scala context [1].
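
A minimal sketch of what I mean, showing only how the s-interpolator builds
the query string. The column name and table name below are hypothetical,
picked just for illustration, and no SparkSession is created here:

```scala
object InterpolationSketch {
  // Hypothetical column and table names, used only for illustration.
  val name: String = "broadcastId"
  val table: String = "events"

  // The s-interpolator substitutes the current values of `name` and `table`
  // into the string literal, producing an ordinary String.
  val query: String = s"select $name from $table"

  def main(args: Array[String]): Unit = {
    // With a SparkSession named `spark` in scope, the same string could be
    // passed on as spark.sql(query).
    println(query)
  }
}
```

Since the values are spliced into the SQL text as-is, this is only
reasonable when they come from trusted code rather than user input.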
--
Cheers,
-z
[1] https://docs.scala-lang.org/overviews/core/string-interpolation.html
On Fri, 17 Apr 2020 00:10:59 +0100
Mich Talebzadeh wrote:
> Thanks Patrick,
>
> The partition broadcastId is static
I would like to know why it is faster to write out an RDD that has 30,000
partitions as 30,000 files sized 1K-2M rather than coalescing it to 1000
partitions and writing out 1000 S3 files of roughly 26MB each, or even 100
partitions and 100 S3 files of 260MB each.
The coalescing takes a long time.
Hi Jinxin,
Thanks for your suggestions, I will try to use foreachPartition later.
Best regards,
maqy
From: Tang Jinxin
Sent: April 23, 2020 7:31
To: maqy
Cc: Andrew Melo; user@spark.apache.org
Subject: Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?
Hi maqy,
Thanks for your
Yea, please report the bug on a supported Spark version like 2.4.
On Thu, Apr 23, 2020 at 3:40 PM Dhrubajyoti Hati
wrote:
> FYI we are using Spark 2.2.0. Should the change be present in this Spark
> version? I wanted to check before opening a JIRA ticket.
>
> *Regards, Dhrubajyoti Hati.*
That's not a deadlock. There are 3 threads trying to acquire the same monitor
lock: one has acquired it, and the others are waiting for it to be released.
This is a common scenario. You have to find the monitor lock object in the
source code referenced by the call stack. There should be some operations after
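
To illustrate the difference between contention and a deadlock, here is a
small sketch with plain JVM threads. It is not taken from the Spark code in
question; the object, lock, and thread names are hypothetical. Three threads
compete for one monitor, two of them show up as waiting in a thread dump
while the holder works, and all three still finish:

```scala
object MonitorContention {
  // Hypothetical shared monitor; in a real thread dump this is the object
  // reported as "waiting to lock" / "locked".
  private val lock = new Object
  private var counter = 0

  def run(): Int = {
    counter = 0
    val threads = (1 to 3).map { i =>
      new Thread(new Runnable {
        def run(): Unit = lock.synchronized {
          // Only one thread is in here at a time; the others are blocked
          // on the monitor until it is released.
          Thread.sleep(50)
          counter += 1
        }
      }, s"worker-$i")
    }
    threads.foreach(_.start())
    // join() returns for every thread, so nothing is deadlocked.
    threads.foreach(_.join())
    counter
  }

  def main(args: Array[String]): Unit =
    println(s"threads completed: ${run()}")
}
```

If the thread holding the lock never leaves the synchronized block, the
waiters stay blocked, which is why the work done while holding the monitor is
the thing to inspect.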
FYI we are using Spark 2.2.0. Should the change be present in this Spark
version? I wanted to check before opening a JIRA ticket.
*Regards, Dhrubajyoti Hati.*
On Thu, Apr 23, 2020 at 10:12 AM Wenchen Fan wrote:
> This looks like a bug: the path filter doesn't work when reading Hive
> tables. Can
Thank you Wei.
I will look into #1. Option 2 seems to push the complexity to the
application: the application would need to write multiple queries and merge
the final results.
Regards,
Stone
On Mon, Apr 20, 2020 at 7:39 AM ZHANG Wei wrote:
> There might be 3 options:
>
> 1. Just as you expect,