On Mon, Oct 7, 2019 at 20:49 ayan guha wrote:
Hi,

I think you are mixing terminologies here. Loosely speaking, the Master
manages worker machines. Each worker machine can run one or more processes,
and a process can be a driver or an executor. You submit applications to
the master, and each application will have a driver and executors. The
Master will decide
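The terminology above can be pictured with a toy model (plain Python, not
Spark's API; all host, application, and function names here are made up for
illustration):

```python
# Toy model of the hierarchy: the master manages workers, each worker
# runs processes, and every process is either a driver or an executor
# belonging to one application.
cluster = {
    "master": "spark-master",
    "workers": {
        "worker-1": [("app-1", "driver"), ("app-1", "executor")],
        "worker-2": [("app-1", "executor"), ("app-2", "driver")],
    },
}

def processes_for(app_id):
    # gather every process of one application across all workers
    return [(worker, role)
            for worker, procs in cluster["workers"].items()
            for app, role in procs
            if app == app_id]

print(processes_for("app-1"))
# -> [('worker-1', 'driver'), ('worker-1', 'executor'), ('worker-2', 'executor')]
```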
On Mon, Oct 7, 2019 at 19:20 Amit Sharma wrote:
Thanks Andrew, but I am asking specifically about driver memory, not
executor memory. We have just one master, and if each job's
driver.memory=4g and the master node's total memory is 16 GB, then we
cannot execute more than 4 jobs at a time.
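The arithmetic behind this can be sketched as a back-of-the-envelope
calculation (toy numbers taken from the message; real capacity would also
be reduced by OS and daemon overhead on the node):

```python
# Toy capacity check: if every driver lands on the same 16 GB node and
# each requests driver.memory=4g, at most 16 // 4 = 4 drivers fit at once.
node_memory_gb = 16
driver_memory_gb = 4
max_concurrent_drivers = node_memory_gb // driver_memory_gb
print(max_concurrent_drivers)  # -> 4
```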
On Monday, October 7, 2019, Andrew Melo wrote:
Hi Amit
On Mon, Oct 7, 2019 at 18:33 Amit Sharma wrote:
This depends on what master/deploy mode you are using.
Can you please help me understand this? I believe the driver program runs
on the master node. If we are running 4 Spark jobs and the driver memory
config is 4g, then a total of 16 GB would be used on the master node. So if
we run more jobs, we need more memory on the master. Please correct me if I
am wrong.
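Andrew's point that this depends on the deploy mode can be sketched with a
toy function (plain Python, not Spark code; the host names are invented):

```python
# Toy sketch of why "where does the driver run?" depends on deploy mode.
def driver_host(deploy_mode, submit_host, worker_hosts):
    if deploy_mode == "client":
        # client mode: the driver runs inside the submitting process,
        # so its memory comes from the machine you submit from
        return submit_host
    # cluster mode: the driver is scheduled onto a worker, so driver
    # memory is consumed on a worker node, not on the master daemon's node
    return worker_hosts[0]

print(driver_host("client", "edge-node", ["worker-1", "worker-2"]))   # -> edge-node
print(driver_host("cluster", "edge-node", ["worker-1", "worker-2"]))  # -> worker-1
```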
I figured it out. Thanks.
On Mon, Oct 7, 2019 at 9:55 AM Lian Jiang wrote:
Hi,

from pyspark.sql.functions import pandas_udf, PandasUDFType
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [(1, True, 1.0, 'aa'), (1, False, 2.0, 'aa'), (2, True, 3.0, 'aa'),
     (2, True, 5.0, 'aa'), (2, True, 10.0,
Hi Manish,

Is this issue resolved? If not, please check the overlay network of your
cluster. We faced similar issues when we had problems with overlay
networking.

In our case, the executors had spawned, but communication between the
driver and the executors failed (due to issues with the overlay network).