zhztheplayer closed issue #5136: [VL] Vanilla Spark broadcast exchange + R2C is
slow sometimes
URL: https://github.com/apache/incubator-gluten/issues/5136
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
zhztheplayer commented on issue #5136:
URL:
https://github.com/apache/incubator-gluten/issues/5136#issuecomment-2022119518
Fixed in https://github.com/apache/incubator-gluten/pull/5141. I assume we
can close this now.
zhztheplayer commented on issue #5136:
URL:
https://github.com/apache/incubator-gluten/issues/5136#issuecomment-2021919347
The major issue I have found is that the `flatMap` approach would cause
`UnsafeHashedRelation` to produce duplicated rows in my case (TPCDS q14a with
current version
zhztheplayer commented on issue #5136:
URL:
https://github.com/apache/incubator-gluten/issues/5136#issuecomment-2021916024
I don't have dedicated UTs for it, so it was incorporated into the other PR.
Still, I can open one for it if you think it's needed:
ulysses-you commented on issue #5136:
URL:
https://github.com/apache/incubator-gluten/issues/5136#issuecomment-2021873724
Thank you @zhztheplayer. It's a good point: columnar broadcast would
broadcast the original binary data, but vanilla Spark would broadcast the hash
relation. So I think this