I don't think there is a definitive right or wrong approach here. The SLS
feature would not have been added to Spark if there were no real need for it,
and AFAIK it required quite a bit of refactoring of Spark internals. So I'm
sure this discussion already took place in the developer community :)
May I ask why the ETL job and the DL task (assuming you mean deep learning
here) cannot be run as two separate Spark jobs?
IMHO it is better practice to split the entire pipeline into logical steps
and orchestrate them. That way you can pick the resource profile you need
for each of these two very different types of workload.
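
For readers who haven't used it, here is a minimal sketch of what
stage-level scheduling allows within a single job, assuming Spark 3.1+ on
YARN or Kubernetes with dynamic allocation enabled; the paths, resource
amounts, and GPU discovery script below are placeholders, not a
recommendation:

import org.apache.spark.resource.{ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests}
import org.apache.spark.sql.SparkSession

object SlsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sls-sketch").getOrCreate()
    val sc = spark.sparkContext

    // ETL stages run under the default (CPU-oriented) resource profile.
    val features = sc.textFile("/data/input").map(_.length)

    // Hypothetical GPU profile for the DL stages.
    val execReqs = new ExecutorResourceRequests()
      .cores(4)
      .memory("16g")
      .resource("gpu", 1, discoveryScript = "/opt/spark/getGpus.sh")
    val taskReqs = new TaskResourceRequests().cpus(1).resource("gpu", 1)
    val dlProfile = new ResourceProfileBuilder()
      .require(execReqs)
      .require(taskReqs)
      .build()

    // Only the stages computing this RDD request GPU executors;
    // everything else keeps the default profile.
    val scored = features.withResources(dlProfile).map(x => x * 2) // stand-in for inference
    scored.saveAsTextFile("/data/output")
    spark.stop()
  }
}

The two-job alternative is simpler: two spark-submit runs with different
executor configs, wired together by whatever orchestrator you already use.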
Hi all,
I am trying to read data (using Spark SQL) via a Hive metastore table which
has a column of type bigint. The underlying Parquet data has int as the
datatype for the same column. I am getting the following error while trying
to read the data using Spark SQL:
java.lang.ClassCastException: org.apache.
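
In case it helps anyone reproduce, here is a minimal sketch of the setup as
described above, assuming Hive support is enabled; the table name, column
name, and location are hypothetical:

import org.apache.spark.sql.SparkSession

object BigintIntMismatch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("bigint-int-mismatch")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Physical Parquet type for the column is int.
    Seq(1, 2, 3).toDF("id").write.mode("overwrite").parquet("/tmp/t_parquet")

    // The metastore declares the same column as bigint.
    spark.sql(
      "CREATE EXTERNAL TABLE t (id bigint) STORED AS PARQUET LOCATION '/tmp/t_parquet'")

    // Reading through the metastore schema is where the error appears.
    spark.sql("SELECT id FROM t").show()
  }
}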