Here is the information that was missing:

1. Spark 3.2.0
2. It is Scala based.
3. Size of the tables will be ~60 GB.
4. The explain plan (Catalyst) shows a lot of time being spent creating the plan.
5. Number of unioned tables is 2, then another 2, and finally 2.
The slowness in producing results increases as the data size and the number of columns grow. (A sketch of one common mitigation is below the quoted thread.)

On Wed, Feb 22, 2023 at 11:07 AM Enrico Minack <i...@enrico.minack.dev> wrote:

> Plus the number of unioned tables would be helpful, as well as which
> downstream operations are performed on the unioned tables.
>
> And what "performance issues" do you exactly measure?
>
> Enrico
>
> On 22.02.23 at 16:50, Mich Talebzadeh wrote:
>
> Hi,
>
> A few details will help:
>
> 1. Spark version
> 2. Spark SQL, Scala or PySpark
> 3. Size of the tables in the join
> 4. What does explain() on the joining operation show?
>
> HTH
>
> On Wed, 22 Feb 2023 at 15:42, Prem Sahoo <prem.re...@gmail.com> wrote:
>
>> Hello Team,
>> We are observing Spark Union performance issues when unioning big
>> tables with lots of rows. Do we have any option apart from the Union?
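Since the explain output points at time spent creating the plan, below is a minimal Scala sketch of one common mitigation: truncating the lineage between unions with Dataset.localCheckpoint(), so Catalyst analyzes a short tree at each step instead of one deep union tree. The loader, paths, and DataFrame names (df1a, df1b, ...) are hypothetical stand-ins for the three pairs of unioned tables, not the actual job.

import org.apache.spark.sql.{DataFrame, SparkSession}

object UnionPlanSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("union-plan-sketch").getOrCreate()

    // Hypothetical loader and paths; replace with the real ~60 GB tables.
    def load(t: String): DataFrame = spark.read.parquet(s"/data/$t")
    val (df1a, df1b) = (load("t1a"), load("t1b"))
    val (df2a, df2b) = (load("t2a"), load("t2b"))
    val (df3a, df3b) = (load("t3a"), load("t3b"))

    // Naive chained union: each union() grows the logical plan, and
    // Catalyst re-analyzes the whole tree, which is where plan-creation
    // time can add up as tables and columns grow.
    val naive = Seq(df1a, df1b, df2a, df2b, df3a, df3b).reduce(_ union _)
    naive.explain()

    // Mitigation sketch: cut the lineage after each pairwise union with
    // localCheckpoint(), so every later plan starts from a short tree.
    val step1 = df1a.union(df1b).localCheckpoint()
    val step2 = step1.union(df2a.union(df2b)).localCheckpoint()
    val result = step2.union(df3a.union(df3b))

    result.explain() // compare plan-creation time against the naive chain

    spark.stop()
  }
}

localCheckpoint() eagerly materializes the intermediate result on the executors, trading extra storage and compute for shorter plans; checkpoint() with a configured checkpoint directory is the fault-tolerant variant. Whether either helps depends on whether the time really goes into plan analysis rather than execution, which explain() and the Spark UI timings should confirm.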