[ https://issues.apache.org/jira/browse/SPARK-34573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabriele Nizzoli updated SPARK-34573:
-------------------------------------
    Description:
The SQLConf object has a sqlConfEntries map guarded by a global lock (it is a Collections.synchronizedMap). Every operation (such as get or set) locks the whole map, so concurrent threads may wait on the lock.

An example is the DataType.sameType method, which queries the SQLConf entries map:
{code:scala}
if (SQLConf.get.caseSensitiveAnalysis)
...
{code}
If this data type check runs in custom code on an executor with many cores (e.g. 40), a lot of time is lost waiting on the lock.

An easy fix is to use a ConcurrentHashMap, which does not lock on read (SQLConf.get): "... retrieval operations do not entail locking ..."

NOTE: originally discovered by Benson Hon <benso...@taboola.com>

  was:
The SQLConf object has a sqlConfEntries map guarded by a global lock (it is a Collections.synchronizedMap). Every operation (such as get or set) locks the whole map, so concurrent threads may wait on the lock.

An example is the DataType.sameType method, which queries the SQLConf entries map:
{code:scala}
if (SQLConf.get.caseSensitiveAnalysis)
...
{code}
If this data type check runs in custom code on an executor with many cores (e.g. 40), a lot of time is lost waiting on the lock.

An easy fix is to use a ConcurrentHashMap, which does not lock on read (SQLConf.get): "... retrieval operations do not entail locking ..."

NOTE: originally discovered by benso...@taboola.com

> SQLConf sqlConfEntries map has a global lock, should not lock on get
> --------------------------------------------------------------------
>
>                 Key: SPARK-34573
>                 URL: https://issues.apache.org/jira/browse/SPARK-34573
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.7, 3.0.2
>            Reporter: Gabriele Nizzoli
>            Priority: Major
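
The contention described in this issue can be illustrated with a small standalone sketch. It does not use Spark at all: `ConfMapSketch`, `timeReads`, the string key and the thread and iteration counts are hypothetical names and values chosen for illustration, and the two maps merely stand in for the current and the proposed backing map of sqlConfEntries. Timings depend on the machine and JVM, so none are quoted here.

```scala
import java.util.{Collections, HashMap => JHashMap, Map => JMap}
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicLong

object ConfMapSketch {
  // Hypothetical key, used only for illustration.
  val Key = "spark.sql.caseSensitive"

  // Behaviour described in the issue: get and put all synchronize on one monitor.
  val synchronizedEntries: JMap[String, String] =
    Collections.synchronizedMap(new JHashMap[String, String]())

  // Proposed alternative: retrieval operations do not entail locking.
  val concurrentEntries: JMap[String, String] =
    new ConcurrentHashMap[String, String]()

  /** Performs `readsPerThread` lookups on each of `threads` threads.
    * Returns (elapsed milliseconds, number of successful lookups). */
  def timeReads(entries: JMap[String, String], threads: Int, readsPerThread: Int): (Long, Long) = {
    val pool = Executors.newFixedThreadPool(threads)
    val hits = new AtomicLong(0L)
    val start = System.nanoTime()
    (1 to threads).foreach { _ =>
      pool.execute(new Runnable {
        override def run(): Unit = {
          var found = 0L
          var i = 0
          while (i < readsPerThread) {
            // Counting hits keeps the lookup from being optimized away.
            if (entries.get(Key) != null) found += 1
            i += 1
          }
          hits.addAndGet(found)
        }
      })
    }
    pool.shutdown()
    pool.awaitTermination(1, TimeUnit.MINUTES)
    ((System.nanoTime() - start) / 1000000L, hits.get())
  }

  def main(args: Array[String]): Unit = {
    synchronizedEntries.put(Key, "false")
    concurrentEntries.put(Key, "false")
    val (syncMs, _) = timeReads(synchronizedEntries, threads = 8, readsPerThread = 200000)
    val (chmMs, _) = timeReads(concurrentEntries, threads = 8, readsPerThread = 200000)
    println(s"synchronizedMap:   $syncMs ms")
    println(s"ConcurrentHashMap: $chmMs ms")
  }
}
```

One thing to keep in mind when swapping the map type: unlike a synchronized HashMap, ConcurrentHashMap rejects null keys and null values, so any caller that stores null would need to be adjusted.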