[ https://issues.apache.org/jira/browse/GEODE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142734#comment-17142734 ]
ASF GitHub Bot commented on GEODE-8213: --------------------------------------- albertogpz commented on a change in pull request #611: URL: https://github.com/apache/geode-native/pull/611#discussion_r437398002 ########## File path: cppcache/src/SerializationRegistry.cpp ########## @@ -409,44 +411,56 @@ void TheTypeMap::bindDataSerializable(TypeFactoryMethod func, int32_t id) { "TheTypeMap::bind: Serialization type not implemented."); } - std::lock_guard<util::concurrent::spinlock_mutex> guard( - m_dataSerializableMapLock); - const auto& result = m_dataSerializableMap.emplace(id, func); + auto tempDataSerializableMap = + std::make_shared<std::unordered_map<int32_t, TypeFactoryMethod>>( + *m_dataSerializableMap); + const auto& result = tempDataSerializableMap->emplace(id, func); if (!result.second) { LOGERROR("A class with ID %d is already registered.", id); throw IllegalStateException("A class with given ID is already registered."); } + std::atomic_store(&m_dataSerializableMap, tempDataSerializableMap); Review comment: I changed the implementation to boost::shared_mutex and the performance measured by the benchmark introduced was way worse than with the spinlock. I had read that the implementation of the boost::shared_lock was suboptimal on Linux. These are the figures: ``` Benchmark Time CPU Time Old Time New CPU Old CPU New ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:1 +1.0440 +1.0438 213 435 213 434 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:2 +0.9090 +0.9091 156 297 312 595 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:4 +1.0325 +1.0136 65 132 254 512 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:8 +0.7878 +0.7826 43 78 348 620 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:16 +0.9026 +1.0066 31 59 357 717 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:32 +0.9935 +0.9967 28 57 365 729 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:64 +1.0535 +1.0519 26 54 358 735 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:96 +1.1182 +1.0243 25 52 360 729 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:1 +0.7410 +0.7410 231 403 231 403 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:2 +1.4190 +1.3271 114 276 228 531 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:4 +0.9838 +0.9709 59 116 234 462 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:8 +0.9452 +0.9369 42 82 335 649 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:16 +1.0843 +0.9737 30 62 368 725 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:32 +1.0204 +1.0018 28 57 367 735 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:64 +0.9670 +0.9939 28 55 368 734 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:96 +0.8644 +1.0207 29 54 362 732 SerializationRegistryBM_findDataSerializable/real_time/threads:1 +0.9817 +0.9817 211 418 211 418 SerializationRegistryBM_findDataSerializable/real_time/threads:2 +0.3399 +0.3401 157 211 314 421 SerializationRegistryBM_findDataSerializable/real_time/threads:4 +1.2335 +1.2339 59 133 237 530 SerializationRegistryBM_findDataSerializable/real_time/threads:8 +0.9273 +0.9344 42 81 335 648 SerializationRegistryBM_findDataSerializable/real_time/threads:16 +1.1570 +1.0856 29 63 360 750 SerializationRegistryBM_findDataSerializable/real_time/threads:32 +1.2448 +1.0935 28 62 364 762 SerializationRegistryBM_findDataSerializable/real_time/threads:64 +1.0317 +1.0715 27 56 363 752 SerializationRegistryBM_findDataSerializable/real_time/threads:96 +1.0391 +1.0796 26 54 361 751 SerializationRegistryBM_findPdxSerializable/real_time/threads:1 +0.8144 +0.8144 234 425 234 425 SerializationRegistryBM_findPdxSerializable/real_time/threads:2 +0.4180 +0.4181 154 218 308 436 SerializationRegistryBM_findPdxSerializable/real_time/threads:4 +0.6402 +0.6284 73 120 293 476 SerializationRegistryBM_findPdxSerializable/real_time/threads:8 +1.3334 +1.3390 39 90 307 717 SerializationRegistryBM_findPdxSerializable/real_time/threads:16 +0.9053 +0.9381 34 65 407 789 SerializationRegistryBM_findPdxSerializable/real_time/threads:32 +1.0597 +0.9980 31 64 414 828 SerializationRegistryBM_findPdxSerializable/real_time/threads:64 +0.9588 +0.9441 32 63 413 803 SerializationRegistryBM_findPdxSerializable/real_time/threads:96 +0.7951 +0.9451 32 57 411 800 ``` I tried with C++14 std::shared_timed_mutex and the performance was better in most cases although the improvement was not spectacular: ``` Benchmark Time CPU Time Old Time New CPU Old CPU New ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:1 -0.0320 -0.0320 213 206 213 206 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:2 -0.2957 -0.2957 156 110 312 219 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:4 -0.1130 -0.0923 65 58 254 231 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:8 -0.0796 -0.0811 43 40 348 319 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:16 -0.0082 +0.0533 31 31 357 376 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:32 +0.0372 +0.0424 28 30 365 381 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:64 +0.0669 +0.0692 26 28 358 383 SerializationRegistryBM_findDataSerializablePrimitive/real_time/threads:96 +0.1285 +0.0457 25 28 360 377 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:1 -0.0929 -0.0929 231 210 231 210 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:2 -0.0190 -0.0191 114 112 228 224 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:4 -0.0091 -0.0092 59 58 234 232 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:8 -0.1483 -0.1501 42 36 335 285 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:16 +0.0480 +0.0159 30 31 368 373 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:32 +0.0776 +0.0321 28 30 367 379 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:64 +0.0035 +0.0254 28 28 368 378 SerializationRegistryBM_findDataSerializableFixedId/real_time/threads:96 +0.0029 +0.0447 29 29 362 378 SerializationRegistryBM_findDataSerializable/real_time/threads:1 +0.0412 +0.0412 211 219 211 219 SerializationRegistryBM_findDataSerializable/real_time/threads:2 -0.3122 -0.3121 157 108 314 216 SerializationRegistryBM_findDataSerializable/real_time/threads:4 -0.0301 -0.0299 59 58 237 230 SerializationRegistryBM_findDataSerializable/real_time/threads:8 -0.0128 -0.0112 42 42 335 331 SerializationRegistryBM_findDataSerializable/real_time/threads:16 +0.0404 +0.0318 29 30 360 371 SerializationRegistryBM_findDataSerializable/real_time/threads:32 -0.0093 +0.0292 28 27 364 374 SerializationRegistryBM_findDataSerializable/real_time/threads:64 -0.0119 +0.0465 27 27 363 380 SerializationRegistryBM_findDataSerializable/real_time/threads:96 -0.0511 +0.0467 26 25 361 378 SerializationRegistryBM_findPdxSerializable/real_time/threads:1 -0.0410 -0.0410 234 225 234 225 SerializationRegistryBM_findPdxSerializable/real_time/threads:2 -0.2185 -0.2185 154 120 308 240 SerializationRegistryBM_findPdxSerializable/real_time/threads:4 -0.1662 -0.1662 73 61 293 244 SerializationRegistryBM_findPdxSerializable/real_time/threads:8 +0.0004 -0.0001 39 39 307 306 SerializationRegistryBM_findPdxSerializable/real_time/threads:16 -0.0075 +0.0106 34 34 407 412 SerializationRegistryBM_findPdxSerializable/real_time/threads:32 +0.0701 +0.0034 31 33 414 416 SerializationRegistryBM_findPdxSerializable/real_time/threads:64 -0.0072 +0.0068 32 32 413 416 SerializationRegistryBM_findPdxSerializable/real_time/threads:96 -0.0614 +0.0182 32 30 411 419 ``` As a result, I do not see the point in changing the implementation until Geode is upgraded to a later C++ version. Do you know if there are plans for it? I think at least we should leave the benchmark tests for future improvements. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > C++ native client performance bottleneck in access to serialization registry > ---------------------------------------------------------------------------- > > Key: GEODE-8213 > URL: https://issues.apache.org/jira/browse/GEODE-8213 > Project: Geode > Issue Type: Improvement > Components: native client > Reporter: Alberto Gomez > Assignee: Alberto Gomez > Priority: Major > Fix For: 1.14.0 > > > It's been observed that when the number of threads used in a Geode client > using PdxSerialization is greater than 8, there is an important drop in > performance. > Analysing the client process behavior with perf, it has been observed a very > high CPU consumption by a spinlock > (apache::geode::util::concurrent::spinlock_mutex::lock) used when accessing > the serialization registry . -- This message was sent by Atlassian Jira (v8.3.4#803005)