Re: read dataset from only one node in YARN cluster

2023-08-18 Thread Mich Talebzadeh
Hi, Where do you see this? In the Spark UI? If I understand correctly, the data is most probably skewed: one node gets all the data and the others get nothing. HTH Mich Talebzadeh, Solutions Architect/Engineering Lead, London, United Kingdom
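
The skew suspected above can also be checked outside the Spark UI by counting rows per partition. The following is a minimal sketch, not something from the thread: `partition_row_counts` and `skew_summary` are hypothetical helper names, and `df` is assumed to be an already loaded PySpark DataFrame. It measures skew across partitions, not across nodes, so it only approximates what the UI shows per executor.

```python
def partition_row_counts(df):
    """Return the number of rows in each partition of a PySpark DataFrame.

    `df` is assumed to be a pyspark.sql.DataFrame; only its `.rdd` is used.
    """
    # Count rows by streaming over each partition instead of materialising it.
    return df.rdd.mapPartitions(lambda rows: [sum(1 for _ in rows)]).collect()


def skew_summary(counts):
    """Summarise a list of per-partition row counts (plain Python, no Spark needed)."""
    total = sum(counts)
    return {
        "partitions": len(counts),
        "empty_partitions": sum(1 for c in counts if c == 0),
        "total_rows": total,
        # Share of all rows held by the largest partition (0.0 for an empty dataset).
        "max_share": max(counts) / total if total else 0.0,
    }
```

For example, `skew_summary(partition_row_counts(df))` on a dataset where a single partition holds every row would report a `max_share` of 1.0 and all other partitions as empty, which would be consistent with one node doing all the work.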

read dataset from only one node in YARN cluster

2023-08-18 Thread marc nicole
Hi, With Spark 3.2 and Hadoop 3.2 in YARN cluster mode, if one wants to read a dataset that exists on only one node of the cluster and not on the others, how does one tell Spark that? I expect it would be through DataFrameReader, using a path like *IP:port/pathOnLocalNode*. PS: loading the dataset in HDFS is not an