Re: Question about installing Apache Spark [PySpark] computer requirements

2024-07-29 Thread Meena Rajani
You probably have to increase the JVM/JDK memory size: https://stackoverflow.com/questions/1565388/increase-heap-size-in-java On Mon, Jul 29, 2024 at 9:36 PM mike Jadoo wrote: > Thanks. I just downloaded Corretto but I got this error message, > which was the same as before. [It was shared wi

Re: [Issue] Spark SQL - broadcast failure

2024-07-16 Thread Meena Rajani
Can you try disabling the broadcast join and see what happens? On Mon, Jul 8, 2024 at 12:03 PM Sudharshan V wrote: > Hi all, > > Been facing a weird issue lately. > In our production code base, we have an explicit broadcast for a small > table. > It is just a lookup table that is around 1 GB in siz
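A minimal sketch of what "disabling broadcast join" could look like in PySpark, assuming an existing SparkSession is passed in. The helper name `disable_auto_broadcast` is hypothetical; `spark.sql.autoBroadcastJoinThreshold` is the standard Spark SQL setting, and -1 turns automatic broadcasting off. Note that this setting does not override an explicit broadcast hint, so for the explicit broadcast mentioned in the quoted message the hint itself would also have to be removed from the query.

```python
def disable_auto_broadcast(spark):
    """Turn off automatic broadcast joins for this session.

    `spark` is assumed to be an existing pyspark.sql.SparkSession
    (anything exposing `conf.set(key, value)` works here).
    """
    # -1 disables size-based automatic broadcasting of join sides.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
    # An explicit broadcast hint in the query is still honored,
    # so it has to be removed separately to fully test this suggestion.
    return spark
```

After calling the helper, re-running the job with the hint removed should make Spark fall back to a non-broadcast join, which helps show whether the broadcast is what fails.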

Re: OOM concern

2024-05-27 Thread Meena Rajani
What exactly is the error? Is it erroring out while reading the data from the DB? How are you partitioning the data? How much memory do you currently have? What is the network timeout? Regards, Meena On Mon, May 27, 2024 at 4:22 PM Perez wrote: > Hi Team, > > I want to extract the data from DB a

Re: Python for the kids and now PySpark

2024-04-28 Thread Meena Rajani
Mich, you are right, attention spans are getting shorter these days. Christian could work on a completely new thing for 3 hours and is proud to explain it. It is amazing. Thanks for sharing. On Sat, Apr 27, 2024 at 9:40 PM Farshid Ashouri wrote: > Mich, this is absolutely amazing. > > Thanks f

Re: Spark join produce duplicate rows in resultset

2023-10-27 Thread Meena Rajani
.* from rev >> inner join customer c >> on rev.custumer_id =c.id >> inner join product p >> on rev.sys = p.sys >> and rev.prin = p.prin >> and rev.scode= p.bcode >> >> left join item I >> on rev.sys = I.sys >> and rev.custumer_id = I.cust

Spark join produce duplicate rows in resultset

2023-10-21 Thread Meena Rajani
Hello all: I am using Spark SQL to join two tables. To my surprise, I am getting redundant rows. What could be the cause? select rev.* from rev inner join customer c on rev.custumer_id =c.id inner join product p on rev.sys = p.sys and rev.prin = p.prin and rev.scode= p.bcode left join item I on rev.sys = i
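The cause in this thread is not visible from the snippet, but one common reason for redundant rows is a join key that is not unique on the joined side. The sketch below illustrates that effect with Python's built-in sqlite3 rather than Spark. The table names `rev` and `item` echo the query above, while the columns and rows are made up for illustration.

```python
import sqlite3

# In-memory database with two tiny, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rev  (id INTEGER, sys TEXT);
    CREATE TABLE item (sys TEXT, name TEXT);
    INSERT INTO rev  VALUES (1, 'A'), (2, 'B');
    -- Two item rows share sys = 'A', so the join key is not unique.
    INSERT INTO item VALUES ('A', 'x'), ('A', 'y'), ('B', 'z');
""")

# Selecting only rev.* hides the item columns, so the fan-out
# shows up as what look like duplicate rev rows.
joined = conn.execute(
    "SELECT rev.* FROM rev LEFT JOIN item i ON rev.sys = i.sys ORDER BY rev.id"
).fetchall()

# Diagnostic: list join-key values that occur more than once in item.
dup_keys = conn.execute(
    "SELECT sys, COUNT(*) FROM item GROUP BY sys HAVING COUNT(*) > 1"
).fetchall()

print(joined)
print(dup_keys)
```

Running the same kind of GROUP BY ... HAVING COUNT(*) > 1 check on each joined table, using the full set of join columns, is one way to find which join multiplies the rows.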