Re: How to read multiple HDFS directories

2021-05-05 Thread Kapil Garg
> ... have to union multiple RDDs. You can read files from multiple directories in a single read call. Spark will manage partitioning of the data across directories.
>
> From: Kapil Garg
> Date: Wednesday, May 5, 2021 at 10:45 AM
> To: spark users
> Subject: [EXTER
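A minimal sketch of the single-read approach described in the quoted reply, assuming PySpark and an existing SparkContext passed in as `sc`. The helper names `join_input_paths` and `read_directories` are hypothetical, as are any directory paths you pass in. It relies on `SparkContext.textFile` accepting a comma-separated list of paths, so it would not suit paths that themselves contain commas.

```python
from typing import Iterable


def join_input_paths(directories: Iterable[str]) -> str:
    """Return one comma-separated path string covering every input directory."""
    cleaned = [d.strip() for d in directories if d and d.strip()]
    if not cleaned:
        raise ValueError("at least one input directory is required")
    return ",".join(cleaned)


def read_directories(sc, directories: Iterable[str], min_partitions=None):
    """Read all directories into a single RDD with one textFile call.

    `sc` is assumed to be an existing pyspark SparkContext (not created here).
    """
    # One call, no per-directory RDDs and no union step afterwards.
    return sc.textFile(join_input_paths(directories), minPartitions=min_partitions)
```

This only changes how the input is declared; how evenly the resulting read tasks land on executors is a separate question.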

Re: How to read multiple HDFS directories

2021-05-05 Thread Kapil Garg
> On Wed, 5 May 2021 at 17:03, Kapil Garg wrote:
>> Sorry, but I didn't get the question. It is possible that 1 record is present in multiple directories. That's why we do a reduceByKey a
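The message is cut off before it says how the duplicates are merged, so the following is only a sketch of one common pattern, assuming PySpark: key each record, then collapse records that share a key with `reduceByKey`. The function `dedupe_by_key`, its `key_fn` and `merge_fn` parameters, and the default keep-one policy are assumptions for illustration, not the poster's actual code.

```python
def dedupe_by_key(rdd, key_fn, merge_fn=lambda left, right: left):
    """Collapse records that share a key into a single record.

    rdd      -- assumed to be a pyspark RDD of records
    key_fn   -- hypothetical function extracting the dedup key from a record
    merge_fn -- how two records with the same key are combined; the default
                keeps one of them (which one is not deterministic on a cluster)
    """
    keyed = rdd.map(lambda record: (key_fn(record), record))
    return keyed.reduceByKey(merge_fn).values()
```

If the choice between duplicates matters, for example keeping the newest copy, pass a `merge_fn` that compares the two records explicitly instead of relying on the default.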

How to read multiple HDFS directories

2021-05-05 Thread Kapil Garg
... instead of spreading across all. Is there a way to avoid this data skew? I couldn't find any RDD API or Spark config that could distribute the data-reading tasks evenly across all executors.

--
Regards
Kapil Garg

Re: Single executor processing all tasks in spark structured streaming kafka

2021-03-08 Thread Kapil Garg
> .option("startingOffsets", START_OFFSET).load().selectExpr("CAST(value AS STRING)")
>
> query = df.writeStream.foreach(process_events).option("checkpointLo

Re: Spark Version 3.0.1 Gui Display Query

2021-03-04 Thread Kapil Garg
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org

--
Regards
Kapil Garg

Re: Spark Version 3.0.1 Gui Display Query

2021-03-01 Thread Kapil Garg
> ab during application start
> 2. Executors Tab during application lifetime
>
> I need to tune my application, and this Executor info would be a great help for tuning the parameters. But currently it is shown blank.
>
> Regards
> Ranju
>
> From:

Re: Spark Version 3.0.1 Gui Display Query

2021-03-01 Thread Kapil Garg
, 2021 at 11:04 AM Ranju Jain wrote:
> Hi,
>
> I started using Spark 3.0.1 recently and noticed that the Executors tab on the Spark GUI appears blank.
>
> Please suggest what could be the reason for this type of display.
>
> Regards