Hi Sandeep,
Any inputs on this?
Regards
Surya
From: Garlapati, Suryanarayana (Nokia - IN/Bangalore)
Sent: Saturday, July 21, 2018 6:50 PM
To: Sandeep Katta
Cc: d...@spark.apache.org; user@spark.apache.org
Subject: RE: Query on Spark Hive with kerberos Enabled on Kubernetes
Hi Sandeep,
Thx for
Hi Susan,
This is exactly what we have used. Thank you for your interest!
- Thodoris
> On 23 Jul 2018, at 20:55, Susan X. Huynh wrote:
>
> Hi Thodoris,
>
> Maybe setting "spark.scheduler.minRegisteredResourcesRatio" to > 0 would
> help? Default value is 0 with Mesos.
>
> "The minimum
That does sound like it could be it - I checked our libmesos version and it
is 1.4.1. I'll try upgrading libmesos.
Thanks.
On Mon, Jul 23, 2018 at 12:13 PM Susan X. Huynh
wrote:
> Hi Nimi,
>
> This sounds similar to a bug I have come across before. See:
>
Hi Nimi,
This sounds similar to a bug I have come across before. See:
https://jira.apache.org/jira/browse/SPARK-22342?focusedCommentId=16429950&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16429950
It turned out to be a bug in libmesos (the client library used to
There's some discussion and proposal of supporting GPUs in this Spark JIRA:
https://jira.apache.org/jira/browse/SPARK-24615 "Accelerator-aware task
scheduling for Spark"
Susan
On Thu, Jul 12, 2018 at 11:17 AM, Mich Talebzadeh wrote:
> I agree.
>
> Adding GPU capability to Spark in my opinion
https://docs.databricks.com/spark/latest/spark-sql/skew-join.html
The above might help, in case you are using a join.
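The usual trick behind that skew-join advice is key salting: spread the hot key across several artificial sub-keys on the big side, and replicate the small side once per salt so every sub-key still finds its match. A minimal pure-Python sketch of the idea (the names and data here are illustrative, not from the Spark API):

```python
import random

def salt_key(key, num_salts):
    """Append a random salt so rows sharing a hot key spread
    across num_salts distinct join keys."""
    return (key, random.randrange(num_salts))

def explode_small_side(rows, num_salts):
    """Replicate each row of the small (dimension) side once per
    salt value so every salted key on the big side finds a match."""
    return [((key, s), value) for key, value in rows for s in range(num_salts)]

# Big, skewed side: many rows for the hot key "acme", one for "tiny".
big = [("acme", i) for i in range(6)] + [("tiny", 0)]
small = [("acme", "Acme Corp"), ("tiny", "Tiny Inc")]

NUM_SALTS = 3
salted_big = [(salt_key(k, NUM_SALTS), v) for k, v in big]
salted_small = dict(explode_small_side(small, NUM_SALTS))

# Join on the salted keys; the result matches the unsalted join,
# but the hot key's rows are now spread over NUM_SALTS partitions.
joined = [(k[0], v, salted_small[k]) for k, v in salted_big]
```

In Spark you would do the same with a random salt column on the large DataFrame and an exploded salt range on the small one before joining.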
On Mon, Jul 23, 2018 at 4:49 AM, 崔苗 wrote:
> but how to get count(distinct userId) group by company from count(distinct
> userId) group by company+x?
> count(userId) is
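The core issue in the quoted question is that distinct counts do not compose: you cannot sum `count(distinct userId)` over the finer `(company, x)` groups to get the count per company, because the same user may appear under several `x` values. You have to dedupe `(company, userId)` pairs first. A small pure-Python illustration (sample data is made up):

```python
from collections import defaultdict

# Sample events: (company, x, userId). User "u1" appears under two
# different x values for the same company.
events = [
    ("c1", "a", "u1"),
    ("c1", "a", "u2"),
    ("c1", "b", "u1"),   # u1 again, different x
    ("c2", "a", "u3"),
]

# Per-(company, x) distinct counts -- summing these overcounts u1.
per_group = defaultdict(set)
for company, x, user in events:
    per_group[(company, x)].add(user)
naive_c1 = sum(len(users) for (c, _), users in per_group.items() if c == "c1")

# Correct approach: dedupe (company, userId) pairs first, then count.
per_company = defaultdict(set)
for company, _, user in events:
    per_company[company].add(user)
exact_c1 = len(per_company["c1"])

print(naive_c1, exact_c1)  # 3 vs 2: the naive sum counts u1 twice
```

In Spark SQL the equivalent is `SELECT company, count(DISTINCT userId) ... GROUP BY company` over the raw rows, rather than aggregating the per-`(company, x)` counts.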
Hi Thodoris,
Maybe setting "spark.scheduler.minRegisteredResourcesRatio" to > 0 would
help? Default value is 0 with Mesos.
"The minimum ratio of registered resources (registered resources / total
expected resources) (resources are executors in yarn mode and Kubernetes
mode, CPU cores in
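For reference, a sketch of how that setting might be passed on submission (the master URL and ratio value here are placeholders; `spark.scheduler.maxRegisteredResourcesWaitingTime` is the companion timeout setting):

```shell
spark-submit \
  --master mesos://<master-url> \
  --conf spark.scheduler.minRegisteredResourcesRatio=0.8 \
  --conf spark.scheduler.maxRegisteredResourcesWaitingTime=60s \
  ...
```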
Hello Dev!
A Spark Structured Streaming job with a simple window aggregation is leaking file descriptors when Kubernetes is used as the cluster manager. It appears to be a bug.
I am using HDFS as the filesystem for checkpointing.
Has anyone observed the same? Thanks for any help.
Please find more details in trailing email.
For
Using the current Kafka sink that supports routing based on topic column, you
could just duplicate the rows (e.g. explode rows with different topic, key
values). That way you’re only reading and processing the source once and not
having to resort to custom sinks, foreachWriter, or multiple
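The row-duplication idea above can be sketched without Spark: given one source record and the topics it should reach, emit one `(topic, key, value)` row per topic. In Spark this is what `explode()` over an array-of-topics column produces before the Kafka sink routes each row by its `topic` column (the record and topic names below are hypothetical):

```python
def fan_out(record, topics):
    """Duplicate one (key, value) record into one routed row per topic,
    mirroring what explode() on a topics array does in a DataFrame."""
    key, value = record
    return [{"topic": t, "key": key, "value": value} for t in topics]

rows = fan_out(("user-42", '{"event": "click"}'), ["raw-events", "audit"])
for r in rows:
    print(r["topic"], r["key"])
```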
We are trying to create a cluster consisting of 4 machines, to be used by
multiple users. How can we configure it so that users can submit jobs from
their personal computers, and is there any free tool you can suggest to
streamline this procedure?
--
Uğur Sopaoğlu
I understand each row has a topic column, but can we write one row to multiple
topics?
On Thu, Jul 12, 2018 at 11:00 AM, Arun Mahadevan wrote:
> What I meant was the number of partitions cannot be varied with
> ForeachWriter v/s if you were to write to each sink using independent
> queries. Maybe