[ 
https://issues.apache.org/jira/browse/SPARK-25317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602506#comment-16602506
 ] 

Kazuaki Ishizaki commented on SPARK-25317:
------------------------------------------

Let me run this on 2.3 and master.
One question: this benchmark does not have a warm-up loop, so the measured 
time may also include execution in the interpreter, before the JIT compiler 
has compiled the hot code. Is this behavior intentional?
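
To illustrate what a warm-up loop means here, below is a minimal sketch in 
plain Java. It is not the benchmark from this issue: `hashBytes` is a 
hypothetical stand-in workload (an FNV-1a style hash, not `Murmur3_x86_32`) 
so that the example runs without Spark on the classpath, and the class name 
and iteration counts are arbitrary assumptions.

```java
import java.util.Arrays;

final class WarmupBenchmarkSketch {
    // Written after every call so the JIT cannot discard the hash computation.
    static volatile int sink;

    // Hypothetical stand-in workload (FNV-1a style); NOT Spark's Murmur3_x86_32.
    static int hashBytes(byte[] data, int seed) {
        int h = seed ^ 0x811C9DC5;
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 16777619;
        }
        return h;
    }

    // Runs the workload warmupIters times untimed, then returns the average
    // nanoseconds per call over measuredIters timed calls.
    static double measure(byte[] data, int warmupIters, int measuredIters) {
        for (int i = 0; i < warmupIters; i++) {
            sink = hashBytes(data, 0);
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredIters; i++) {
            sink = hashBytes(data, 0);
        }
        return (System.nanoTime() - start) / (double) measuredIters;
    }

    public static void main(String[] args) {
        // Same input size as the string in the issue: 10001 bytes of 'b'.
        byte[] data = new byte[10001];
        Arrays.fill(data, (byte) 'b');

        // Iteration counts are assumptions chosen only for illustration.
        double withoutWarmup = measure(data, 0, 10_000);
        double withWarmup = measure(data, 20_000, 10_000);
        System.out.printf("no warm-up: %.1f ns/call, after warm-up: %.1f ns/call%n",
                withoutWarmup, withWarmup);
    }
}
```

The first measurement is only "cold" when it is the first thing run in a 
fresh JVM; the second one also benefits from whatever the first has already 
compiled. For real measurements, a harness such as JMH handles warm-up and 
dead-code elimination more carefully than a hand-written loop.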

> MemoryBlock performance regression
> ----------------------------------
>
>                 Key: SPARK-25317
>                 URL: https://issues.apache.org/jira/browse/SPARK-25317
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Wenchen Fan
>            Priority: Blocker
>
> There is a performance regression when calculating the hash code for UTF8String:
> {code:java}
>   test("hashing") {
>     import org.apache.spark.unsafe.hash.Murmur3_x86_32
>     import org.apache.spark.unsafe.types.UTF8String
>     val hasher = new Murmur3_x86_32(0)
>     val str = UTF8String.fromString("b" * 10001)
>     val numIter = 100000
>     val start = System.nanoTime
>     for (i <- 0 until numIter) {
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>       Murmur3_x86_32.hashUTF8String(str, 0)
>     }
>     val duration = (System.nanoTime() - start) / 1000 / numIter
>     println(s"duration $duration us")
>   }
> {code}
> To run this test in 2.3, we need to add
> {code:java}
> public static int hashUTF8String(UTF8String str, int seed) {
>   return hashUnsafeBytes(str.getBaseObject(), str.getBaseOffset(), str.numBytes(), seed);
> }
> {code}
> to `Murmur3_x86_32`.
> On my laptop, the result for master vs 2.3 is: 120 us vs 40 us



