Dongjoon Hyun created SPARK-53213:
-------------------------------------

             Summary: Use Java `Base64` instead of `Base64.(en|decodeBase64)*`
                 Key: SPARK-53213
                 URL: https://issues.apache.org/jira/browse/SPARK-53213
             Project: Spark
          Issue Type: Sub-task
          Components: Kubernetes, Spark Core, SQL
    Affects Versions: 4.1.0
            Reporter: Dongjoon Hyun


The Java native `Base64` API is nearly **9x faster** than `Commons Codec` in the round-trip benchmark below (1156 ms vs. 10121 ms).

{code}
scala> val a = new Array[Byte](1_000_000_000)

scala> spark.time(org.apache.commons.codec.binary.Base64.decodeBase64(org.apache.commons.codec.binary.Base64.encodeBase64String(a)).length)
Time taken: 10121 ms
val res0: Int = 1000000000

scala> spark.time(java.util.Base64.getDecoder().decode(java.util.Base64.getEncoder().encodeToString(a)).length)
Time taken: 1156 ms
val res1: Int = 1000000000
{code}
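
For illustration, the replacement could look roughly like the sketch below. The `Base64Migration` object and its method names are hypothetical and not taken from the Spark code base; only the `java.util.Base64` calls are the ones used in the benchmark above. One caveat to keep in mind when migrating: the JDK basic decoder throws `IllegalArgumentException` on characters outside the Base64 alphabet, while Commons Codec's `decodeBase64` is, to my knowledge, lenient about them, so call sites that receive untrusted or line-wrapped input may need a closer look.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical helper (not part of Spark): wraps the JDK calls that would
// stand in for the Commons Codec ones.
object Base64Migration {
  // JDK counterpart of Commons Codec's Base64.encodeBase64String(bytes)
  def encode(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  // JDK counterpart of Commons Codec's Base64.decodeBase64(str).
  // Strict: throws IllegalArgumentException on non-Base64 characters.
  def decode(s: String): Array[Byte] =
    Base64.getDecoder.decode(s)
}

// Small round trip with the JDK API only
val input: Array[Byte] = "Apache Spark".getBytes(StandardCharsets.UTF_8)
val encoded: String = Base64Migration.encode(input)
val decoded: Array[Byte] = Base64Migration.decode(encoded)
```

If a call site has to accept line-wrapped input, `Base64.getMimeDecoder` may be the closer match, since it ignores line separators and other characters outside the Base64 alphabet.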


