Hi,
Yes, I know that enabling the shuffle-tracking property lets us use DRA.
But it is marked as experimental. Is it advisable to use?
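For context, the shuffle-tracking approach comes down to a few spark-defaults.conf entries; a minimal sketch (property names per Spark 3.x docs, executor counts are placeholder values):

```properties
# Sketch: enable dynamic allocation via shuffle tracking (no external shuffle service)
spark.dynamicAllocation.enabled                 true
spark.dynamicAllocation.shuffleTracking.enabled true
spark.dynamicAllocation.minExecutors            1
spark.dynamicAllocation.maxExecutors            10
```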
Also, regarding HPA: we do not have a separate HPA for Spark as such, right?
Kind Regards,
Sachit Murarka
On Mon, Jan 11, 2021 at 2:17 AM Sandish Kumar HN wrote:
Hi,
I run some SQL using both Hive and Spark, and usually we get the same results.
However, when a window function is in the script, Hive and Spark can produce
different results. Is this intended behavior, or does either Hive or Spark have a bug?
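One common cause (not necessarily yours) is a tie in the window's ORDER BY: when rows compare equal, each engine is free to number them in its own order. A small sketch of the effect, using SQLite instead of Hive/Spark and a hypothetical sales table, showing the usual fix of adding a unique tiebreaker column:

```python
# Illustration (sqlite3 stand-in for Hive/Spark): ROW_NUMBER over tied
# ORDER BY keys is engine-dependent; a unique tiebreaker makes it deterministic.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER, item TEXT, amount INTEGER);
INSERT INTO sales VALUES (1, 'a', 10), (1, 'b', 10), (1, 'c', 20);
""")

# Ambiguous: 'a' and 'b' tie on amount, so which one gets rn=1 is unspecified.
ambiguous = conn.execute("""
    SELECT item, ROW_NUMBER() OVER (PARTITION BY id ORDER BY amount) AS rn
    FROM sales
""").fetchall()

# Deterministic: break ties with a unique column in the ORDER BY.
deterministic = conn.execute("""
    SELECT item, ROW_NUMBER() OVER (PARTITION BY id ORDER BY amount, item) AS rn
    FROM sales
""").fetchall()
print(deterministic)  # [('a', 1), ('b', 2), ('c', 3)]
```

If both engines agree once every ORDER BY is a total order, the difference was tie-breaking, not a bug.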
Thanks,
Ying
Sachit,
K8s-based Spark dynamic allocation is only available on Spark 3.0.x+, and
only without an external shuffle service.
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/user-guide.md#dynamic-allocation
Hi All,
I have read about HPA (Horizontal Pod Autoscaler) for pod scaling.
I understand it can be achieved by setting the resource requests and limits
in YAML:
kubectl autoscale deploy/application-cpu --cpu-percent=95 --min=1 --max=10  # example command
But does Kubernetes actually work
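For what it's worth, the --cpu-percent target is measured against the pod's CPU request, so the Deployment spec needs requests set; a hypothetical container fragment (names and values are illustrative, not from the thread):

```yaml
# Illustrative resources for a Deployment such as application-cpu;
# HPA computes utilization as (actual CPU usage) / (requests.cpu).
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"
```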
Hi Pradyumn,
It seems you did not configure the spark-defaults.conf file correctly.
The configurations below are needed to use Hive 2.1.1 as the metastore and
execution engine.
spark.sql.hive.metastore.version=2.1.1
spark.sql.hive.metastore.jars=/opt/cloudera/parcels/CDH/lib/hive/lib/*
Thanks.
Michael Yang