[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046323#comment-16046323 ]

Rui Li commented on HIVE-13308:
---
Fixed via HIVE-16854. Closing this one as a duplicate.

> HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
> ---
>
> Key: HIVE-13308
> URL: https://issues.apache.org/jira/browse/HIVE-13308
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Affects Versions: 1.2.1
> Reporter: wangwenli
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Background: Hive on Spark, YARN cluster mode.
> Details:
> When hundreds of beeline clients submit queries at the same time, we found that YARN gets applications very slowly, and hiveserver is blocked in the SparkClientFactory.createClient method.
> After analysis, we think the synchronization on SparkClientFactory.createClient can be removed.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281186#comment-15281186 ]

wangwenli commented on HIVE-13308:
--
[~lirui] Generating the secret has no concurrency issue, since nothing is shared, and registering the RemoteDriver also seems to be thread safe. I'm not 100% sure, though; maybe [~vanzin] can give some input? Thank you all very much.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281147#comment-15281147 ]

Rui Li commented on HIVE-13308:
---
Thanks very much [~vanzin] for your input. As [~wenli] said, SparkClientFactory is stopped only when hiveserver2 shuts down, so we don't have to worry about {{server}} being null while we are creating the client. What I'm not sure about is that {{server}} is used to generate the secret and register the RemoteDriver while the SparkClientImpl is being created. Without synchronization, we need {{server}} to be thread safe.
[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281141#comment-15281141 ]

wangwenli commented on HIVE-13308:
--
[~vanzin] Considering that the server is initialized once and stopped only when hiveserver shuts down, there seem to be no concurrent scenarios. Do we still need to mark the server variable volatile?
[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280475#comment-15280475 ]

Marcelo Vanzin commented on HIVE-13308:
---
There needs to be a synchronization block to access the {{server}} field; otherwise, it's probably safe to call the {{SparkClientImpl}} constructor outside of a synchronized block, as long as {{SparkClientFactory.stop()}} is not called before it finishes. Something like:

{code}
public static SparkClient createClient(Map<String, String> sparkConf, HiveConf hiveConf)
    throws IOException, SparkException {
  RpcServer _server;
  // Hold the class lock only long enough to read the shared field.
  synchronized (SparkClientFactory.class) {
    Preconditions.checkState(server != null, "initialize() not called.");
    _server = server;
  }
  // The slow client construction runs outside the lock.
  return new SparkClientImpl(_server, sparkConf, hiveConf);
}
{code}

Or maybe just making the {{server}} variable volatile would suffice, too (and then no synchronization is needed in {{createClient}}).
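The volatile alternative mentioned in the comment above could look roughly like the following. This is only a minimal, self-contained sketch, not the Hive code: {{VolatileFactorySketch}}, {{FakeRpcServer}} and {{FakeSparkClient}} are hypothetical stand-ins for {{SparkClientFactory}}, {{RpcServer}} and {{SparkClientImpl}}, and it assumes, as discussed in this thread, that {{stop()}} is not called while a client is still being created and that the server object itself is thread safe.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for SparkClientFactory; names are illustrative only.
final class VolatileFactorySketch {

    /** Hypothetical stand-in for RpcServer. */
    static final class FakeRpcServer {}

    /** Hypothetical stand-in for SparkClientImpl. */
    static final class FakeSparkClient {
        final FakeRpcServer server;
        final Map<String, String> sparkConf;

        FakeSparkClient(FakeRpcServer server, Map<String, String> sparkConf) {
            this.server = server;
            this.sparkConf = sparkConf;
        }
    }

    // volatile makes the write in initialize()/stop() visible to all
    // threads that later read the field in createClient().
    private static volatile FakeRpcServer server;

    static synchronized void initialize() {
        if (server == null) {
            server = new FakeRpcServer();
        }
    }

    static synchronized void stop() {
        server = null;
    }

    // No lock here: concurrent callers do not queue behind each other.
    static FakeSparkClient createClient(Map<String, String> sparkConf) {
        // Read the volatile field once so the null check and the
        // constructor call see the same reference.
        FakeRpcServer local = server;
        if (local == null) {
            throw new IllegalStateException("initialize() not called.");
        }
        return new FakeSparkClient(local, sparkConf);
    }

    public static void main(String[] args) {
        initialize();
        FakeSparkClient client =
            createClient(Collections.singletonMap("spark.master", "yarn"));
        System.out.println("client created: " + (client.server != null));
        stop();
    }
}
```

Copying the field into a local variable plays the same role as {{_server}} in the synchronized version: a concurrent {{stop()}} cannot turn the reference into null between the check and the constructor call, although the server it points to may already have been shut down.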
[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279638#comment-15279638 ]

Rui Li commented on HIVE-13308:
---
I think we can remove the synchronization if {{RpcServer}} is thread safe, which I'm not sure about. [~vanzin] do you have any ideas on this?
[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279338#comment-15279338 ]

wangwenli commented on HIVE-13308:
--
[~lirui], could you check on this?
[jira] [Commented] (HIVE-13308) HiveOnSpark sumbit query very slow when hundred of beeline exectue at same time
[ https://issues.apache.org/jira/browse/HIVE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208405#comment-15208405 ]

Xuefu Zhang commented on HIVE-13308:
---
[~ruili], any thoughts on this?