[ https://issues.apache.org/jira/browse/HDFS-14304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sahil Takiar updated HDFS-14304:
--------------------------------
    Status: Open  (was: Patch Available)

> High lock contention on hdfsHashMutex in libhdfs
> ------------------------------------------------
>
>                 Key: HDFS-14304
>                 URL: https://issues.apache.org/jira/browse/HDFS-14304
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client, libhdfs, native
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> While profiling the performance of an application that uses libhdfs, we
> noticed heavy lock contention on the {{hdfsHashMutex}} defined in
> {{hadoop-hdfs-native-client/src/main/native/libhdfs/os/mutexes.h}}.
> The issue is that every JNI method invocation done by {{hdfs.c}} goes through
> a helper method called {{invokeMethod}}. {{invokeMethod}} calls
> {{globalClassReference}}, which acquires {{hdfsHashMutex}} while performing a
> lookup in an {{htable}} (a custom hash table that lives in {{libhdfs/common}}).
> The lock is acquired for both reads and writes. The hash table maps
> {{char *className}} to {{jclass}} objects; it seems the goal of the hash table
> is to avoid repeatedly creating {{jclass}} objects for each JNI call.
> For multi-threaded applications, this lock severely limits the rate at which
> Java methods can be invoked.
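The locking pattern described above can be sketched in a few lines of C. This is only an illustration, not the libhdfs source: {{fake_class_t}}, {{class_table}}, {{lookup_class}} and {{simulate_calls}} are hypothetical names, a small array stands in for the {{htable}}, and an integer stands in for a {{jclass}}. The point it tries to show is that the mutex is taken on every call, including the common case where the class is already cached.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a jclass reference (assumption, not libhdfs code). */
typedef int fake_class_t;

#define MAX_CLASSES 16

struct class_entry {
    const char *name;   /* caller must keep the string alive */
    fake_class_t cls;
};

/* A tiny array plays the role of the className -> jclass hash table. */
static struct class_entry class_table[MAX_CLASSES];
static size_t class_count = 0;

/* One global mutex guards the table, analogous to hdfsHashMutex. */
static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Counters that make the cost of the pattern visible. */
static unsigned long lock_acquisitions = 0;
static unsigned long class_creations = 0;

/*
 * Look up a class handle by name, creating it on first use.
 * The mutex is acquired for hits (reads) as well as misses (writes).
 * Returns -1 if the table is full.
 */
fake_class_t lookup_class(const char *name)
{
    fake_class_t result = -1;
    size_t i;

    pthread_mutex_lock(&table_mutex);
    lock_acquisitions++;

    for (i = 0; i < class_count; i++) {
        if (strcmp(class_table[i].name, name) == 0) {
            result = class_table[i].cls;
            break;
        }
    }
    if (result == -1 && class_count < MAX_CLASSES) {
        /* Miss: "create" the class handle and remember it. */
        class_table[class_count].name = name;
        class_table[class_count].cls = (fake_class_t)class_count;
        result = class_table[class_count].cls;
        class_count++;
        class_creations++;
    }

    pthread_mutex_unlock(&table_mutex);
    return result;
}

/* Simulate n method invocations that each resolve the same class name.
 * Returns how many times the global mutex was acquired. */
unsigned long simulate_calls(const char *name, unsigned long n)
{
    unsigned long before = lock_acquisitions;
    unsigned long i;

    for (i = 0; i < n; i++) {
        fake_class_t c = lookup_class(name);
        assert(c >= 0);
        (void)c;
    }
    return lock_acquisitions - before;
}
```

In this sketch, 1000 simulated invocations against one class name create the handle once but acquire the mutex 1000 times. With many threads issuing such calls concurrently, each of those acquisitions is a potential point of contention.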
> pstacks show a lot of time being spent on {{hdfsHashMutex}}:
> {code:java}
> #0  0x00007fba2dbc242d in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00007fba2dbbddcb in _L_lock_812 () from /lib64/libpthread.so.0
> #2  0x00007fba2dbbdc98 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000027d8386 in mutexLock ()
> #4  0x00000000027d0e7b in globalClassReference ()
> #5  0x00000000027d1160 in invokeMethod ()
> #6  0x00000000027d4176 in readDirect ()
> #7  0x00000000027d4325 in hdfsRead ()
> {code}
> Same with {{perf report}}:
> {code:java}
> +   63.36%     0.01%  [k] system_call_fastpath
> +   61.60%     0.12%  [k] sys_futex
> +   61.45%     0.13%  [k] do_futex
> +   57.54%     0.49%  [k] _raw_qspin_lock
> +   57.07%     0.01%  [k] queued_spin_lock_slowpath
> +   55.47%    55.47%  [k] native_queued_spin_lock_slowpath
> -   35.68%     0.00%  [k] 0x6f6f6461682f6568
>    - 0x6f6f6461682f6568
>       - 30.55% __lll_lock_wait
>          - 29.40% system_call_fastpath
>             - 29.39% sys_futex
>                - 29.35% do_futex
>                   - 29.27% futex_wait
>                      - 28.17% futex_wait_setup
>                         - 27.05% _raw_qspin_lock
>                            - 27.05% queued_spin_lock_slowpath
>                                 26.30% native_queued_spin_lock_slowpath
>                         + 0.67% ret_from_intr
>                      + 0.71% futex_wait_queue_me
>       - 2.00% methodIdFromClass
>          - 1.94% jni_GetMethodID
>             - 1.71% get_method_id
>                  0.96% SymbolTable::lookup_only
>       - 1.61% invokeMethod
>          - 0.62% jni_CallLongMethodV
>               0.52% jni_invoke_nonstatic
>         0.75% pthread_mutex_lock
> {code}