Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18805

> Why does this need to be in Spark?

@srowen you already asked that question, and it has been answered on the JIRA as well as on the old PR. A user cannot add zstd compression to the internal Spark parts (`spark.io.compression.codec`). In this particular case he is saying it's the shuffle output where it is making a big difference.

zstd is already included in other open source projects like Hadoop, but again we don't get that for Spark's internal compression code. zstd itself is BSD licensed. It looks like this PR is using https://github.com/luben/zstd-jni, which also appears to be BSD licensed. We need to decide whether using that directly is OK for us. Hadoop wrote its own version, but I would say if the zstd-jni version is working we should use it. Worst case, if something happens where that maintainer won't fix an issue, we could fork it and be no worse off than if we had started with our own copy.
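For context, a rough sketch of what the comment is getting at: today `spark.io.compression.codec` accepts the built-in names (`lz4`, `lzf`, `snappy`), and if this PR landed, `zstd` would presumably be selectable the same way. This is a hypothetical config fragment, not confirmed behavior of the merged change:

```
# spark-defaults.conf (sketch, assuming the PR wires zstd into the
# internal codec registry; the short name "zstd" is an assumption)
spark.io.compression.codec  zstd
```

The same setting governs shuffle output compression, which is where the reporter saw the big difference.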