Re: Use case idea

2022-07-31 Thread pengyh
I don't think so. We were using Spark integrated with Kafka for streaming computation and real-time reports. That just works. > SPARK is now just an overhyped and overcomplicated ETL tool, nothing more, there is another distributed AI called Ray, which should be the next billion dollar company

Re: Use case idea

2022-07-31 Thread Gourav Sengupta
Hi, SPARK is now just an overhyped and overcomplicated ETL tool, nothing more. There is another distributed AI called Ray, which should be the next billion dollar company instead of just building those features in SPARK natively using a different computation engine :) So the only promise of

Re: Use case idea

2022-07-31 Thread pengyh
I am afraid that most of the SQL functions Spark has are also available in other BI tools. Spark is used for high-performance computing, not for SQL function comparison. Thanks. > In other terms: what analytics functionality does Spark offer that no ERP has?

Use case idea

2022-07-31 Thread Gioele Sal. Perri
Hi, I have a database of sales data that is managed by the Odoo ERP. I am studying Apache Spark (beginner) and I have to show a particular analysis that Spark can do and that is not supported by the ERP. In other terms: what analytics functionality does Spark offer that no ERP has? Thanks

Re: Salting technique doubt

2022-07-31 Thread Vinod KC
Hi Sid, This example code with output will add some more clarity:
    spark-shell --conf spark.sql.shuffle.partitions=3 --conf spark.sql.autoBroadcastJoinThreshold=-1
    scala> import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.DataFrame
    scala> import

Re: Salting technique doubt

2022-07-31 Thread ayan guha
One option is to create a separate salted column in table A. Use it as the partition key, and use the original column for joining. Ayan On Sun, 31 Jul 2022 at 6:45 pm, Jacob Lynn wrote: > The key is this line from Amit's email (emphasis added): > > > Change the join_col to *all possible values* of the

Re: Salting technique doubt

2022-07-31 Thread Jacob Lynn
The key is this line from Amit's email (emphasis added): > Change the join_col to *all possible values* of the salt. The two tables are treated asymmetrically: 1. The skewed table gets random salts appended to the join key. 2. The other table gets all possible salts appended to the join key
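The asymmetry described above can be sketched without Spark at all. The following is a minimal plain-Python illustration, not code from the thread: the salt count `NUM_SALTS`, the helper names, and the sample rows are all assumptions chosen for the example.

```python
import random

NUM_SALTS = 3  # hypothetical number of salts, chosen only for illustration


def salt_skewed(rows, num_salts, rng):
    """Skewed table: append ONE random salt to each row's join key."""
    return [(f"{key}_{rng.randrange(num_salts)}", value) for key, value in rows]


def explode_other(rows, num_salts):
    """Other table: replicate each row once for EVERY possible salt."""
    return [(f"{key}_{s}", value) for key, value in rows for s in range(num_salts)]


def inner_join(left, right):
    """Simple hash-based inner join on the first tuple element."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


rng = random.Random(0)
skewed = [("x", i) for i in range(6)] + [("y", 100)]  # key "x" is the hot key
other = [("x", "X-info"), ("y", "Y-info")]

joined = inner_join(
    salt_skewed(skewed, NUM_SALTS, rng),
    explode_other(other, NUM_SALTS),
)

# Strip the salt suffix again to recover the original join key.
result = sorted((key.rsplit("_", 1)[0], lv, rv) for key, lv, rv in joined)
print(result)
```

Whatever salt a skewed row draws, the exploded table contains a row with the same salted key, so the salted join returns the same rows as the unsalted join while spreading the hot key over several distinct keys.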

Re: Salting technique doubt

2022-07-31 Thread Amit Joshi
Hi Sid, I am not sure I understood your question. But the keys cannot be different after salting in both tables; this is what I have shown in the explanation. You salt Table A and then explode Table B to create all possible values. In your case, I do not understand why Table B has x_8/9. It

Re: Salting technique doubt

2022-07-31 Thread Sid
Hi Amit, Thanks for your reply. However, your answer doesn't seem different from what I have explained. My question is: after salting, if the keys are different, as in my example, then there would be no results after the join (assuming an inner join), because even though the keys are
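The concern raised here can be checked with a small plain-Python sketch (no Spark). The keys `x_8` and `x_9` echo the values mentioned in the thread; the salt range of 10 and the row contents are assumptions made only for illustration.

```python
def inner_join(left, right):
    """Simple hash-based inner join on the first tuple element."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


# Table A row whose key "x" happened to receive salt 8.
table_a = [("x_8", "a-row")]

# If Table B were ALSO given a single salt (here 9), the keys would differ.
table_b_single_salt = [("x_9", "b-row")]

# If Table B is exploded to every possible salt (assumed range 0..9),
# one of its copies always carries the salt that Table A drew.
table_b_exploded = [(f"x_{s}", "b-row") for s in range(10)]

no_match = inner_join(table_a, table_b_single_salt)
match = inner_join(table_a, table_b_exploded)

print(no_match)
print(match)
```

With a single salt on both sides the inner join is empty, which is the situation the question describes; with Table B exploded, the row is matched exactly once.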