[jira] [Created] (HIVE-19888) Misleading "METASTORE_FILTER_HOOK will be ignored" warning from SessionState
Marcelo Vanzin created HIVE-19888:
Summary: Misleading "METASTORE_FILTER_HOOK will be ignored" warning from SessionState
Key: HIVE-19888
URL: https://issues.apache.org/jira/browse/HIVE-19888
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 3.0.0
Reporter: Marcelo Vanzin

When I run things on my test cluster I see things like this in my logs:

{noformat}
18/03/14 13:35:20 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
18/03/14 13:35:21 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
{noformat}

That's because the code in SessionState.java is wrong:

{code}
String metastoreHook = sessionConf.get(ConfVars.METASTORE_FILTER_HOOK.name());
if (!ConfVars.METASTORE_FILTER_HOOK.getDefaultValue().equals(metastoreHook) &&
    !AuthorizationMetaStoreFilterHook.class.getName().equals(metastoreHook)) {
  LOG.warn(ConfVars.METASTORE_FILTER_HOOK.name() + " will be ignored, since hive.security.authorization.manager" +
      " is set to instance of HiveAuthorizerFactory.");
}
{code}

It's using {{.name()}}, which is the enum constant's name, not the actual config key.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
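The distinction behind the bug can be sketched in a few lines. `ConfVarsDemo` below is an illustrative stand-in, not Hive's actual `ConfVars` enum; the point is that `name()` returns the enum constant's identifier, while the real config key lives in a field:

```java
import java.util.HashMap;
import java.util.Map;

public class NameVsVarname {
    /** Stand-in for Hive's ConfVars: the enum constant's name() differs
     *  from the config key users actually set (held in a field). */
    enum ConfVarsDemo {
        METASTORE_FILTER_HOOK("hive.metastore.filter.hook");

        final String varname; // the real config key

        ConfVarsDemo(String varname) {
            this.varname = varname;
        }
    }

    public static void main(String[] args) {
        Map<String, String> sessionConf = new HashMap<>();
        // Users set the config key, never the enum constant name:
        sessionConf.put("hive.metastore.filter.hook", "com.example.MyHook");

        // Buggy lookup: uses the enum name ("METASTORE_FILTER_HOOK"), which
        // is never a key in the conf, so this is always null and the
        // "will be ignored" warning fires regardless of what is set.
        String buggy = sessionConf.get(ConfVarsDemo.METASTORE_FILTER_HOOK.name());

        // Fixed lookup: use the actual key.
        String fixed = sessionConf.get(ConfVarsDemo.METASTORE_FILTER_HOOK.varname);

        System.out.println("buggy = " + buggy);  // null
        System.out.println("fixed = " + fixed);  // com.example.MyHook
    }
}
```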
Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81520
---

Ship it!

- Marcelo Vanzin

On April 23, 2015, 6:54 p.m., Chao Sun wrote:
> (Updated April 23, 2015, 6:54 p.m.)
>
> Review request for hive and Marcelo Vanzin.
>
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
>
> Repository: hive-git
>
> Description
> ---
> This patch cancels the connection from HS2 to the remote process once the latter has failed and exited with an error code, to avoid a potentially long timeout. It adds a new public method, cancelClient, to the RpcServer class - not sure whether there's an easier way to do this.
>
> Diffs
> ---
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46
>
> Diff: https://reviews.apache.org/r/33422/diff/
>
> Testing
> ---
> Tested on my own cluster, and it worked.
>
> Thanks,
> Chao Sun
Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81328
---

Ship it!

Just a minor thing left to fix.

spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java
https://reviews.apache.org/r/33422/#comment131664

To avoid races, I'd do:

    final ClientInfo cinfo = pendingClients.remove(clientId);
    if (cinfo == null) { /* nothing to do */ }

- Marcelo Vanzin

On April 22, 2015, 1:25 a.m., Chao Sun wrote:
> This patch cancels the connection from HS2 to the remote process once the latter has failed and exited with an error code, to avoid a potentially long timeout. It adds a new public method, cancelClient, to the RpcServer class - not sure whether there's an easier way to do this.
>
> Diff: https://reviews.apache.org/r/33422/diff/
Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]
On April 23, 2015, 6:22 p.m., Xuefu Zhang wrote:
> spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java, line 176
> https://reviews.apache.org/r/33422/diff/2/?file=939013#file939013line176
>
> I'm wondering if cinfo can be null here. After the contains() check above, things might have changed, so cinfo is not guaranteed to be non-null.

Yeah, that was my suggestion above. Don't use `containsKey`; instead just remove and check for null.

- Marcelo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81361
---

On April 23, 2015, 6:11 p.m., Chao Sun wrote:
> This patch cancels the connection from HS2 to the remote process once the latter has failed and exited with an error code, to avoid a potentially long timeout. It adds a new public method, cancelClient, to the RpcServer class - not sure whether there's an easier way to do this.
>
> Diff: https://reviews.apache.org/r/33422/diff/
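The remove-then-null-check pattern being recommended here can be sketched as follows. `ClientInfo`, `register`, and the boolean return are simplified stand-ins for the RpcServer internals, and the error-message parameter reflects the earlier review feedback about keeping `cancelClient` generic:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CancelClientSketch {
    static final class ClientInfo {
        final String id;
        ClientInfo(String id) { this.id = id; }
    }

    private final ConcurrentMap<String, ClientInfo> pendingClients =
        new ConcurrentHashMap<>();

    void register(String clientId) {
        pendingClients.put(clientId, new ClientInfo(clientId));
    }

    // Racy shape: between containsKey() and remove(), another thread may
    // have removed the entry, so the result can still be null despite the
    // check:
    //
    //   if (pendingClients.containsKey(clientId)) {
    //       ClientInfo cinfo = pendingClients.remove(clientId); // may be null!
    //       ...
    //   }

    /** Race-free: remove() is atomic; null simply means another thread
     *  already removed the client, so there is nothing left to do. */
    boolean cancelClient(String clientId, String msg) {
        final ClientInfo cinfo = pendingClients.remove(clientId);
        if (cinfo == null) {
            return false; // nothing to do
        }
        // here: fail the client's pending promise with msg, close channels, etc.
        return true;
    }
}
```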
Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33422/#review81103 --- spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java https://reviews.apache.org/r/33422/#comment131349 This will throw an exception if the child process exits with a non-zero status after the RSC connects back to HS2. I don't think you want that. spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java https://reviews.apache.org/r/33422/#comment131351 While the only current call site reflects the error message, this method seems more generic than that. Maybe pass the error message as a parameter to the method? - Marcelo Vanzin On April 22, 2015, 12:30 a.m., Chao Sun wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33422/ --- (Updated April 22, 2015, 12:30 a.m.) Review request for hive and Marcelo Vanzin. Bugs: HIVE-10434 https://issues.apache.org/jira/browse/HIVE-10434 Repository: hive-git Description --- This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to avoid potential long timeout. It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this.. Diffs - spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 Diff: https://reviews.apache.org/r/33422/diff/ Testing --- Tested on my own cluster, and it worked. Thanks, Chao Sun
[jira] [Created] (HIVE-10143) HS2 fails to clean up Spark client state on timeout
Marcelo Vanzin created HIVE-10143:
Summary: HS2 fails to clean up Spark client state on timeout
Key: HIVE-10143
URL: https://issues.apache.org/jira/browse/HIVE-10143
Project: Hive
Issue Type: Bug
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

When a new client is registered with the Spark client and fails to connect back in time, the code will time out the future and HS2 will give up on that client. But the RSC backend does not clean up all the state, and the client is still allowed to connect back. That can leave the client alive indefinitely, holding on to cluster resources, since HS2 doesn't know it's alive but the connection still exists.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 32631: [HIVE-10143] Properly clean up client state when client times out.
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32631/
---

Review request for hive, Szehon Ho and Xuefu Zhang.

Repository: hive-git

Description
---
Clean up needs to occur whenever the client future fails, not just when it's explicitly cancelled.

Diffs
---
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java b923acf78c8459cf49d47268233b328957a1ae6e
  spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java 8207514342bed544e1a01fc41c892825f330cf3c

Diff: https://reviews.apache.org/r/32631/diff/

Testing
---

Thanks,
Marcelo Vanzin
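The idea in the description, cleanup on any failure of the client future rather than only on explicit cancellation, can be sketched with java.util.concurrent. The real code uses the spark-client RPC promises, so the names here are illustrative:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class ClientFutureCleanup {
    /** Attach cleanup to every failure path: timeout, error, or cancel().
     *  Cleaning up only inside an explicit cancel handler misses the
     *  timeout path, which is exactly the gap described in HIVE-10143. */
    static <T> CompletableFuture<T> withCleanup(CompletableFuture<T> clientFuture,
                                                Runnable cleanup) {
        clientFuture.whenComplete((value, error) -> {
            if (error != null) {
                // runs for TimeoutException and CancellationException alike
                cleanup.run();
            }
        });
        return clientFuture;
    }

    public static void main(String[] args) {
        AtomicBoolean cleaned = new AtomicBoolean(false);
        CompletableFuture<String> f = withCleanup(new CompletableFuture<>(),
            () -> cleaned.set(true));
        // Simulate the client never connecting back in time:
        f.completeExceptionally(new TimeoutException("client did not connect back"));
        System.out.println("cleaned = " + cleaned.get()); // true
    }
}
```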
[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305668#comment-14305668 ]

Marcelo Vanzin commented on HIVE-9410:

Ah, I see, didn't notice that. Thanks for clarifying! Depending on the exact semantics expected by Hive, I'd make one suggestion though: instead of building a growing hierarchy of class loaders every time you call addJar() and run a job (then call addJar(), then run another job, then...), I'd have a simple subclass of URLClassLoader that exposes the protected addURL() method. That way you don't need to keep this jar list, and you don't need to create this chain of class loaders - you just modify the existing context class loader by adding the new URL.

ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]
--
Key: HIVE-9410
URL: https://issues.apache.org/jira/browse/HIVE-9410
Project: Hive
Issue Type: Sub-task
Components: Spark
Environment: CentOS 6.5 JDK1.7
Reporter: Xin Hao
Assignee: Chengxiang Li
Fix For: spark-branch, 1.1.0
Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, HIVE-9410.3-spark.patch, HIVE-9410.4-spark.patch, HIVE-9410.4-spark.patch

We have a Hive query case with a UDF defined (i.e. BigBench cases Q10, Q18, etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark mode (both Standalone and Yarn-Client). Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue still exists. BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
Detail error message is as below (NOTE: de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in the jar bigbenchqueriesmr.jar, and we have added a command like 'add jar /location/to/bigbenchqueriesmr.jar;' into the .sql explicitly):

{code}
INFO [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
right (org.apache.commons.lang3.tuple.ImmutablePair)
edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
    ...
Caused by: java.lang.ClassNotFoundException: de.bankmark.bigbench.queries.q10.SentimentUDF
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method
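The loader Marcelo suggests above is only a few lines. The class name is illustrative; the point is widening the protected `addURL()` so one loader can grow in place:

```java
import java.net.URL;
import java.net.URLClassLoader;

/** One loader that grows in place: each addJar() becomes an addURL() call
 *  on the existing context class loader, instead of wrapping it in yet
 *  another URLClassLoader and keeping a separate jar list. */
public class MutableURLClassLoader extends URLClassLoader {

    public MutableURLClassLoader(ClassLoader parent) {
        super(new URL[0], parent);
    }

    @Override
    public void addURL(URL url) { // widen protected -> public
        super.addURL(url);
    }
}
```

Install it once as the context class loader; subsequent addJar() calls mutate it in place, so every thread already holding a reference to it sees the new jars.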
[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303659#comment-14303659 ]

Marcelo Vanzin commented on HIVE-9410:

Hi [~chengxiang li], I think this patch has a serious bug. It adds the new jar to the driver by modifying the current thread's context class loader. But the driver runs a thread pool, so if a later request is serviced by a different thread, it will not see that jar. To properly fix this you'd have to somehow change all the threads managed by the pool, or directly modify the context class loader (without replacing it).

ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]
--
Key: HIVE-9410
URL: https://issues.apache.org/jira/browse/HIVE-9410
Project: Hive
Issue Type: Sub-task
Components: Spark
Environment: CentOS 6.5 JDK1.7
Reporter: Xin Hao
Assignee: Chengxiang Li
Fix For: spark-branch, 1.1.0
Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, HIVE-9410.3-spark.patch, HIVE-9410.4-spark.patch, HIVE-9410.4-spark.patch

We have a Hive query case with a UDF defined (i.e. BigBench cases Q10, Q18, etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark mode (both Standalone and Yarn-Client). Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue still exists. BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
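The scoping problem Marcelo describes, that a context class loader replacement is visible only to the thread that set it, can be demonstrated standalone (this is a minimal demo, not the driver code):

```java
public class ContextLoaderScope {
    public static void main(String[] args) throws InterruptedException {
        // An empty delegating loader standing in for the "loader + new jar"
        // replacement the patch installs.
        ClassLoader replacement =
            new ClassLoader(ClassLoader.getSystemClassLoader()) { };

        // Thread 1 replaces its own context class loader.
        Thread t1 = new Thread(
            () -> Thread.currentThread().setContextClassLoader(replacement));
        t1.start();
        t1.join();

        // Thread 2 (think: another thread in the driver's pool) still sees
        // its original loader - the setting is per-thread state.
        final ClassLoader[] seen = new ClassLoader[1];
        Thread t2 = new Thread(
            () -> seen[0] = Thread.currentThread().getContextClassLoader());
        t2.start();
        t2.join();

        System.out.println("other thread sees replacement: "
            + (seen[0] == replacement)); // false
    }
}
```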
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297200#comment-14297200 ] Marcelo Vanzin commented on HIVE-9487: -- Hmm, weird. I definitely did not touch those. Maybe some merge issue, I'll take a look. Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297457#comment-14297457 ] Marcelo Vanzin commented on HIVE-9487: -- I failed git branch management 101. New patch should be correct. Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9487: - Attachment: HIVE-9487.2-spark.patch Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9487: - Status: Patch Available (was: Open) Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 30385: Use SASL to establish the remote context connection.
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30385/
---

Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang.

Bugs: HIVE-9487
    https://issues.apache.org/jira/browse/HIVE-9487

Repository: hive-git

Description
---
Instead of the insecure, ad-hoc auth mechanism currently used, perform a SASL negotiation to establish trust. This requires the secret to be distributed through some secure channel (just like before).

Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and probably wouldn't work well here) also allows us to add encryption without the need for SSL (yay?).

Only DIGEST-MD5 has been really tested. Supporting other mechanisms will probably mean adding new callback handlers in the client and server portions, but shouldn't be hard if desired.

Diffs
---
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d4d98d7c0c28cdb1d19c700e20537ef405be2e01
  spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java ce2f9b6b132dc47f899798e47d18a1f6b0dd707f
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java 3a7149341bac086e5efe931595143d3bebbdb5db
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5f9be658a855cc15c576f1a98376fcd85475e3b7
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java 0c29c9441fb3e9daf690510a2c9b5716671e2571
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md 2c858a121aaeca6af20f5e332de207694348a030
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java fffe24b3cbe6a5d7387e751adbc65f5b140c9089
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java eff640f7b24348043dbce734510698d9294579c6
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 5e18a3c0b5ea4f1b9c83f78faa3408e2dd479c2c
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/SaslHandler.java PRE-CREATION
  spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java af534375a3ed86a3a9ad57c2f21a9a8bf6113714
  spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java ec7842398d3c4112f83f00e8cd3e5d4f9fdf8ca9

Diff: https://reviews.apache.org/r/30385/diff/

Testing
---
Unit tests.

Thanks,
Marcelo Vanzin
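The kind of negotiation described can be sketched with the JDK's javax.security.sasl API. This is a loopback demo of a server-first DIGEST-MD5 handshake over a shared secret, not Hive's actual SaslHandler; the protocol and server names ("rsc", "localhost") and the user/secret values are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslServer;

public class SaslLoopback {
    static final String USER = "rsc-client";
    static final char[] SECRET = "shared-secret".toCharArray();

    // One handler serves both ends here: it supplies the shared secret
    // and authorizes the authenticated user.
    static final CallbackHandler HANDLER = callbacks -> {
        for (Callback cb : callbacks) {
            if (cb instanceof NameCallback) {
                ((NameCallback) cb).setName(USER);
            } else if (cb instanceof PasswordCallback) {
                ((PasswordCallback) cb).setPassword(SECRET);
            } else if (cb instanceof RealmCallback) {
                RealmCallback rc = (RealmCallback) cb;
                rc.setText(rc.getDefaultText());
            } else if (cb instanceof AuthorizeCallback) {
                ((AuthorizeCallback) cb).setAuthorized(true);
            }
        }
    };

    /** Run the full server-first DIGEST-MD5 exchange in-process; true
     *  when both sides finish the handshake. */
    public static boolean negotiate() throws Exception {
        Map<String, String> props = new HashMap<>();
        SaslServer server = Sasl.createSaslServer(
            "DIGEST-MD5", "rsc", "localhost", props, HANDLER);
        SaslClient client = Sasl.createSaslClient(
            new String[] {"DIGEST-MD5"}, null, "rsc", "localhost", props, HANDLER);

        byte[] token = server.evaluateResponse(new byte[0]); // initial challenge
        while (!client.isComplete() || !server.isComplete()) {
            token = client.evaluateChallenge(token);
            if (token != null && !server.isComplete()) {
                token = server.evaluateResponse(token);
            }
        }
        return client.isComplete() && server.isComplete();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("negotiated: " + negotiate());
    }
}
```

In the real patch the tokens would travel over the Netty channel instead of a local loop, and a negotiated QOP of auth-conf is what provides encryption without SSL.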
Re: Review Request 30385: Use SASL to establish the remote context connection.
On Jan. 29, 2015, 12:36 a.m., Xuefu Zhang wrote:
> spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java, line 20
> https://reviews.apache.org/r/30385/diff/1/?file=839319#file839319line20
>
> Nit: if you need to submit another patch, let's not auto-reorg the imports.

I changed this because someone broke it... now it's in line with the usual order you see in the rest of the Hive code.

- Marcelo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30385/#review70119
---

On Jan. 28, 2015, 11:22 p.m., Marcelo Vanzin wrote:
> (Updated Jan. 28, 2015, 11:22 p.m.)
>
> Instead of the insecure, ad-hoc auth mechanism currently used, perform a SASL negotiation to establish trust. This requires the secret to be distributed through some secure channel (just like before). Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and probably wouldn't work well here) also allows us to add encryption without the need for SSL (yay?). Only DIGEST-MD5 has been really tested. Supporting other mechanisms will probably mean adding new callback handlers in the client and server portions, but shouldn't be hard if desired.
>
> Diff: https://reviews.apache.org/r/30385/diff/
>
> Testing: Unit tests.
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9487: - Attachment: HIVE-9487.1-spark.patch Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9493) Failed job may not throw exceptions [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295544#comment-14295544 ] Marcelo Vanzin commented on HIVE-9493: -- Looks good, thanks for fixing it. Failed job may not throw exceptions [Spark Branch] -- Key: HIVE-9493 URL: https://issues.apache.org/jira/browse/HIVE-9493 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-9493.1-spark.patch Currently remote driver assumes exception will be thrown when job fails to run. This may not hold since job is submitted asynchronously. And we have to check the futures before we decide the job is successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
Marcelo Vanzin created HIVE-9487:
Summary: Make Remote Spark Context secure [Spark Branch]
Key: HIVE-9487
URL: https://issues.apache.org/jira/browse/HIVE-9487
Project: Hive
Issue Type: Sub-task
Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29954: HIVE-9179. Add listener API to JobHandle.
On Jan. 17, 2015, 12:19 a.m., Xuefu Zhang wrote:
> spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java, line 179
> https://reviews.apache.org/r/29954/diff/1-2/?file=823286#file823286line179
>
> Sorry I didn't get it, but why? Clarity, not perf, is my concern. Here we are notifying listeners of a new Spark job ID, which is done in the for loop, which is synchronized. This means no listener may be added or removed while the loop runs. On the other hand, sparkJobIds.add(sparkJobId) seems irrelevant to any changes to listeners, unless I missed anything. I don't understand why either of the two cases might happen as you suggested.

Marcelo Vanzin wrote:

Threads: T1 updating the job handle, T2 adding a listener.

Case 1:
    Statement 1 (S1): sparkJobIds.add(sparkJobId);
    Statement 2 (S2): synchronized (listeners) { /* call onSparkJobStarted(newSparkJobId) on every listener */ }

    Timeline:
    T1: executes S1
    T2: calls addListener(); the new listener is notified of the sparkJobId added above
    T1: executes S2. The new listener is notified again of the new Spark job ID.

Case 2: invert S1 and S2.
    T2: calls addListener()
    T1: executes S2. The listener is called with the current state of the handle and the new Spark job ID. The listener checks `handle.getSparkJobIDs().contains(newSparkJobId)`; the check fails.

Those seem pretty easy to understand to me. The current code avoids both of them.

Xuefu Zhang wrote:

I see. So the shared state of the job handle consists of the state, the listeners, and sparkJobIds, which needs to be protected. Thus, I'd suggest we change synchronized(listeners) to synchronized(this) or declare the method as synchronized. No essential difference, but better clarity.

The synchronization is *only* needed because of the listeners. It's there so that when you add a listener, you never miss an event - if the listeners didn't exist, you wouldn't need any synchronization anywhere in this class. So it makes better sense to synchronize on the listeners.

- Marcelo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29954/#review68513
---

On Jan. 16, 2015, 11:24 p.m., Marcelo Vanzin wrote:
> (Updated Jan. 16, 2015, 11:24 p.m.)
>
> Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang.
>
> Bugs: HIVE-9179
>     https://issues.apache.org/jira/browse/HIVE-9179
>
> Repository: hive-git
>
> Description
> ---
> HIVE-9179. Add listener API to JobHandle.
>
> Diffs
> ---
>   spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14
>   spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca
>   spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0
>   spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573
>   spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545
>   spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION
>   spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab
>
> Diff: https://reviews.apache.org/r/29954/diff/
>
> Testing
> ---
>
> Thanks,
> Marcelo Vanzin
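The invariant being defended in the thread above, that a listener never misses and never double-sees a Spark job ID, falls out of doing both the record and the notification under the same listeners lock. A simplified sketch (not the actual JobHandleImpl):

```java
import java.util.ArrayList;
import java.util.List;

public class JobHandleSketch {
    public interface Listener {
        void onSparkJobStarted(int sparkJobId);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private final List<Integer> sparkJobIds = new ArrayList<>();

    /** New listeners are replayed the IDs recorded so far, under the same
     *  lock that guards notification, so no event is missed. */
    public void addListener(Listener l) {
        synchronized (listeners) {
            listeners.add(l);
            for (int id : sparkJobIds) {
                l.onSparkJobStarted(id);
            }
        }
    }

    /** Record and notify atomically. If the add happened before taking the
     *  lock, a listener registered in between would see the ID twice (once
     *  via replay, once here); with the statements inverted, it could be
     *  notified before the ID is visible via the handle's state. */
    public void onSparkJobStarted(int sparkJobId) {
        synchronized (listeners) {
            sparkJobIds.add(sparkJobId);
            for (Listener l : listeners) {
                l.onSparkJobStarted(sparkJobId);
            }
        }
    }
}
```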
[jira] [Commented] (HIVE-9370) Enable Hive on Spark for BigBench and run Query 8, the test failed [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14280595#comment-14280595 ] Marcelo Vanzin commented on HIVE-9370: -- For the curious: SPARK-1021. HIVE-9179 should allow the client to be smarter about when to time out things. Enable Hive on Spark for BigBench and run Query 8, the test failed [Spark Branch] - Key: HIVE-9370 URL: https://issues.apache.org/jira/browse/HIVE-9370 Project: Hive Issue Type: Sub-task Components: Spark Reporter: yuyun.chen enable hive on spark and run BigBench Query 8 then got the following exception: 2015-01-14 11:43:46,057 INFO [main]: impl.RemoteSparkJobStatus (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted after 30s. Aborting it. 2015-01-14 11:43:46,061 INFO [main]: impl.RemoteSparkJobStatus (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted after 30s. Aborting it. 2015-01-14 11:43:46,061 ERROR [main]: status.SparkJobMonitor (SessionState.java:printError(839)) - Status: Failed 2015-01-14 11:43:46,062 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - /PERFLOG method=SparkRunJob start=1421206996052 end=1421207026062 duration=30010 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor 2015-01-14 11:43:46,071 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) - 15/01/14 11:43:46 INFO RemoteDriver: Failed to run job 0a9a7782-0e0b-4561-8468-959a6d8df0a3 2015-01-14 11:43:46,071 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) - java.lang.InterruptedException 2015-01-14 11:43:46,071 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at java.lang.Object.wait(Native Method) 2015-01-14 11:43:46,071 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at java.lang.Object.wait(Object.java:503) 2015-01-14 11:43:46,071 INFO [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(436)) -at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73) 2015-01-14 11:43:46,071 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:514) 2015-01-14 11:43:46,071 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.SparkContext.runJob(SparkContext.scala:1282) 2015-01-14 11:43:46,072 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.SparkContext.runJob(SparkContext.scala:1300) 2015-01-14 11:43:46,072 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.SparkContext.runJob(SparkContext.scala:1314) 2015-01-14 11:43:46,072 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.SparkContext.runJob(SparkContext.scala:1328) 2015-01-14 11:43:46,072 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.rdd.RDD.collect(RDD.scala:780) 2015-01-14 11:43:46,072 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:262) 2015-01-14 11:43:46,072 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.RangePartitioner.init(Partitioner.scala:124) 2015-01-14 11:43:46,072 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.rdd.OrderedRDDFunctions.sortByKey(OrderedRDDFunctions.scala:63) 2015-01-14 11:43:46,073 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:894) 2015-01-14 11:43:46,073 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:864) 2015-01-14 
11:43:46,073 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.hadoop.hive.ql.exec.spark.SortByShuffler.shuffle(SortByShuffler.java:48) 2015-01-14 11:43:46,073 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -at org.apache.hadoop.hive.ql.exec.spark.ShuffleTran.transform(ShuffleTran.java:45) 2015-01-14 11:43:46,073 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436
Re: Review Request 29954: HIVE-9179. Add listener API to JobHandle.
On Jan. 16, 2015, 7:14 p.m., Xuefu Zhang wrote: One additional question for my understanding: Originally Hive has to poll to get the job ID after submitting a spark job, in RemoteSparkJobStatus.getSparkJobInfo(). With this patch, do we still need to do this? Yeah, that's still needed. I thought about adding an `onSparkJobStarted` callback or something. If there's interest in that, I can add it; it should be easy. On Jan. 16, 2015, 7:14 p.m., Xuefu Zhang wrote: spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java, line 442 https://reviews.apache.org/r/29954/diff/1/?file=823288#file823288line442 This method, together with other existing handle() methods, is invoked using reflection, which makes the code hard to understand. I'm wondering if this can be improved. The alternative is having cascading `if..else if..else` blocks with a bunch of `instanceof` checks, as was done in the akka-based code before. I think that's much uglier and harder to read. - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/#review68430 --- On Jan. 16, 2015, 1:05 a.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/ --- (Updated Jan. 16, 2015, 1:05 a.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9179 https://issues.apache.org/jira/browse/HIVE-9179 Repository: hive-git Description --- HIVE-9179. Add listener API to JobHandle. 
Diffs - spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14 spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0 spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545 spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29954/diff/ Testing --- Thanks, Marcelo Vanzin
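The reflection-based dispatch asked about above can be illustrated with a stripped-down sketch. This is hypothetical code, not the actual RemoteDriver/SparkClientImpl implementation: each message type gets its own handle(...) overload, and the dispatcher selects the overload from the message's runtime class instead of walking a chain of instanceof checks.

```java
import java.lang.reflect.Method;

// Hypothetical sketch of per-message-type dispatch via reflection.
class Dispatcher {
    static class JobStarted {
        final String id;
        JobStarted(String id) { this.id = id; }
    }
    static class JobResult {
        final boolean ok;
        JobResult(boolean ok) { this.ok = ok; }
    }

    String last;  // records the most recently handled message, for illustration

    public void handle(JobStarted msg) { last = "started:" + msg.id; }
    public void handle(JobResult msg)  { last = "result:" + msg.ok; }

    // Looks up the handle() overload whose parameter type matches the
    // message's runtime class, then invokes it.
    public void dispatch(Object msg) throws Exception {
        Method m = getClass().getDeclaredMethod("handle", msg.getClass());
        m.invoke(this, msg);
    }
}
```

Adding a new message type then means adding one overload, with no change to the dispatch code; the trade-off is that the link between message and handler is invisible to the compiler, which is the readability concern raised in the review.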
[jira] [Updated] (HIVE-9179) Add listeners on JobHandle so job status change can be notified to the client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9179: - Attachment: HIVE-9179.2-spark.patch Add listeners on JobHandle so job status change can be notified to the client [Spark Branch] Key: HIVE-9179 URL: https://issues.apache.org/jira/browse/HIVE-9179 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9179.1-spark.patch, HIVE-9179.2-spark.patch Based on discussion in HIVE-8972, it seems nice to add listeners on a job handle such that state changes of a submitted job can be notified instead of the current approach of client polling for such changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29954: HIVE-9179. Add listener API to JobHandle.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/ --- (Updated Jan. 16, 2015, 9:22 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9179 https://issues.apache.org/jira/browse/HIVE-9179 Repository: hive-git Description --- HIVE-9179. Add listener API to JobHandle. Diffs (updated) - spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14 spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0 spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545 spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29954/diff/ Testing --- Thanks, Marcelo Vanzin
Re: Review Request 29954: HIVE-9179. Add listener API to JobHandle.
On Jan. 16, 2015, 10:35 p.m., Xuefu Zhang wrote: spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java, line 179 https://reviews.apache.org/r/29954/diff/1-2/?file=823286#file823286line179 Here sparkJobIds.add() is in the synchronized block. However, we have code accessing the same variable (sparkJobIds) such as in the RemoteSparkJobStatus class. Does that also need protection? No, we don't. The job id list itself is thread-safe. The synchronization happens here so that we notify all listeners of everything. We don't want a listener being registered concurrently with a new spark job arriving to miss that event. (That reminds me that I probably should switch the order of events around if a listener is added after the handle is in a final state. Stay tuned.) - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/#review68492 --- On Jan. 16, 2015, 9:22 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/ --- (Updated Jan. 16, 2015, 9:22 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9179 https://issues.apache.org/jira/browse/HIVE-9179 Repository: hive-git Description --- HIVE-9179. Add listener API to JobHandle. 
Diffs - spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14 spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0 spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545 spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29954/diff/ Testing --- Thanks, Marcelo Vanzin
Re: Review Request 29954: HIVE-9179. Add listener API to JobHandle.
On Jan. 16, 2015, 10:35 p.m., Xuefu Zhang wrote: spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java, line 179 https://reviews.apache.org/r/29954/diff/1-2/?file=823286#file823286line179 Here sparkJobIds.add() is in the synchronized block. However, we have code accessing the same variable (sparkJobIds) such as in the RemoteSparkJobStatus class. Does that also need protection? Marcelo Vanzin wrote: No, we don't. The job id list itself is thread-safe. The synchronization happens here so that we notify all listeners of everything. We don't want a listener being registered concurrently with a new spark job arriving to miss that event. (That reminds me that I probably should switch the order of events around if a listener is added after the handle is in a final state. Stay tuned.) Xuefu Zhang wrote: In that case, can we move sparkJobIds.add() outside the sync block? I don't think that works well. That can cause two different conditions depending on what "outside" means: - if you do it before the synchronized block, the listener may be notified twice of the same Spark job - if you do it after the synchronized block, the listener will be called with a Spark job that is not yet listed in `handle.getSparkJobIds()`. Since I don't believe this will cause any performance issue at all, I'd rather keep the behavior consistent. - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/#review68492 --- On Jan. 16, 2015, 9:22 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/ --- (Updated Jan. 16, 2015, 9:22 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9179 https://issues.apache.org/jira/browse/HIVE-9179 Repository: hive-git Description --- HIVE-9179. Add listener API to JobHandle. 
Diffs - spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14 spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0 spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545 spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29954/diff/ Testing --- Thanks, Marcelo Vanzin
Re: Review Request 29954: HIVE-9179. Add listener API to JobHandle.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/ --- (Updated Jan. 16, 2015, 11:24 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9179 https://issues.apache.org/jira/browse/HIVE-9179 Repository: hive-git Description --- HIVE-9179. Add listener API to JobHandle. Diffs (updated) - spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14 spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0 spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545 spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29954/diff/ Testing --- Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-9179) Add listeners on JobHandle so job status change can be notified to the client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9179: - Attachment: HIVE-9179.3-spark.patch Add listeners on JobHandle so job status change can be notified to the client [Spark Branch] Key: HIVE-9179 URL: https://issues.apache.org/jira/browse/HIVE-9179 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9179.1-spark.patch, HIVE-9179.2-spark.patch, HIVE-9179.3-spark.patch Based on discussion in HIVE-8972, it seems nice to add listeners on a job handle such that state changes of a submitted job can be notified instead of the current approach of client polling for such changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29954: HIVE-9179. Add listener API to JobHandle.
On Jan. 17, 2015, 12:19 a.m., Xuefu Zhang wrote: spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java, line 179 https://reviews.apache.org/r/29954/diff/1-2/?file=823286#file823286line179 Sorry I didn't get it, but why? Clarity but not perf is my concern. Here we are notifying listeners with a new Spark job ID, which is done in the for loop, which is synchronized. This means no listener may be added or removed from the listeners. On the other hand, sparkJobIds.add(sparkJobId) seems irrelevant to any changes to listeners, unless I missed anything. I don't understand why either of the two cases might happen as you suggested. Threads: T1 updating the job handle, T2 adding a listener Case 1: Statement 1 (S1): sparkJobIds.add(sparkJobId); Statement 2 (S2): synchronized (listeners) { /* call onSparkJobStarted(newSparkJobId) on every listener */ } Timeline: T1: executes S1 T2: calls addListener(), new listener is notified of the sparkJobId added above T1: executes S2. New listener is notified again of new spark job ID. Case 2: Invert S1 and S2. T2: calls addListener() T1: executes S1. Listener is called with the current state of the handle and new Spark job ID. Listener checks `handle.getSparkJobIDs().contains(newSparkJobId)`, check fails. Those seem pretty easy to understand to me. The current code avoids both of them. - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/#review68513 --- On Jan. 16, 2015, 11:24 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/ --- (Updated Jan. 16, 2015, 11:24 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9179 https://issues.apache.org/jira/browse/HIVE-9179 Repository: hive-git Description --- HIVE-9179. Add listener API to JobHandle. 
Diffs - spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14 spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0 spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545 spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29954/diff/ Testing --- Thanks, Marcelo Vanzin
Review Request 29954: HIVE-9179. Add listener API to JobHandle.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29954/ --- Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9179 https://issues.apache.org/jira/browse/HIVE-9179 Repository: hive-git Description --- HIVE-9179. Add listener API to JobHandle. Diffs - spark-client/pom.xml 77016df61a0bcbd94058bcbd2825c6c210a70e14 spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/JobHandle.java e760ce35d92bedf4d301b08ec57d1c2dc37a39f0 spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java 1b8feedb0b23aa7897dc6ac37ea5c0209e71d573 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java a30d8cbbaae9d25b1cffdc286b546f549e439545 spark-client/src/test/java/org/apache/hive/spark/client/TestJobHandle.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29954/diff/ Testing --- Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-9179) Add listeners on JobHandle so job status change can be notified to the client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9179: - Status: Patch Available (was: Open) Add listeners on JobHandle so job status change can be notified to the client [Spark Branch] Key: HIVE-9179 URL: https://issues.apache.org/jira/browse/HIVE-9179 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9179.1-spark.patch Based on discussion in HIVE-8972, it seems nice to add listeners on a job handle such that state changes of a submitted job can be notified instead of the current approach of client polling for such changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9179) Add listeners on JobHandle so job status change can be notified to the client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9179: - Attachment: HIVE-9179.1-spark.patch Add listeners on JobHandle so job status change can be notified to the client [Spark Branch] Key: HIVE-9179 URL: https://issues.apache.org/jira/browse/HIVE-9179 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9179.1-spark.patch Based on discussion in HIVE-8972, it seems nice to add listeners on a job handle such that state changes of a submitted job can be notified instead of the current approach of client polling for such changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9178) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9178: - Attachment: HIVE-9178.2-spark.patch Create a separate API for remote Spark Context RPC other than job submission [Spark Branch] --- Key: HIVE-9178 URL: https://issues.apache.org/jira/browse/HIVE-9178 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9178.1-spark.patch, HIVE-9178.1-spark.patch, HIVE-9178.2-spark.patch, HIVE-9178.2-spark.patch Based on discussions in HIVE-8972, it seems to make sense to create a separate API for RPCs, such as addJar and getExecutorCounter. These jobs are different from a query submission in that they don't need to be queued in the backend and can be executed right away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29832: HIVE-9178. Add a synchronous RPC API to the remote Spark context.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29832/ --- (Updated Jan. 14, 2015, 8:45 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9178 https://issues.apache.org/jira/browse/HIVE-9178 Repository: hive-git Description (updated) --- Fix return value of synchronous RPCs. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java 5c3ca018bb177ef9fd9fb24b054a9db29274b31e spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 5e767ef5eb47e493a332607204f4c522028d7d0e spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java f8b2202a465bb8abe3d2c34e49ade6387482738c spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29832/diff/ Testing --- Thanks, Marcelo Vanzin
Re: Review Request 29832: HIVE-9178. Add a synchronous RPC API to the remote Spark context.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29832/ --- (Updated Jan. 14, 2015, 8:47 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9178 https://issues.apache.org/jira/browse/HIVE-9178 Repository: hive-git Description (updated) --- Add a synchronous RPC API to the remote Spark context. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java 5c3ca018bb177ef9fd9fb24b054a9db29274b31e spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 5e767ef5eb47e493a332607204f4c522028d7d0e spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java f8b2202a465bb8abe3d2c34e49ade6387482738c spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java 795d62c776cec5e9da2a24b7d40bc749a03186ab Diff: https://reviews.apache.org/r/29832/diff/ Testing --- Thanks, Marcelo Vanzin
[jira] [Commented] (HIVE-9178) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276514#comment-14276514 ] Marcelo Vanzin commented on HIVE-9178: -- [~chengxiang li] ah, good catch. This method: {code} private void handle(ChannelHandlerContext ctx, SyncJobRequest msg) throws Exception { {code} should actually return the result of the RPC instead of void. I'll update the patch tomorrow and add a unit test (d'oh). Create a separate API for remote Spark Context RPC other than job submission [Spark Branch] --- Key: HIVE-9178 URL: https://issues.apache.org/jira/browse/HIVE-9178 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9178.1-spark.patch, HIVE-9178.1-spark.patch, HIVE-9178.2-spark.patch Based on discussions in HIVE-8972, it seems to make sense to create a separate API for RPCs, such as addJar and getExecutorCounter. These jobs are different from a query submission in that they don't need to be queued in the backend and can be executed right away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
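The bug described in this comment can be shown in a tiny, hypothetical sketch (names invented; this is not the real spark-client code): a handler for a synchronous RPC has to return its result so the dispatch layer can ship the value back to the caller, rather than computing it and returning void.

```java
// Hypothetical illustration of a sync-RPC handler returning a result.
class SyncRpcSketch {
    static class SyncJobRequest {
        final int input;
        SyncJobRequest(int input) { this.input = input; }
    }

    // Buggy shape, as in the comment above: the result is computed but
    // dropped, so the waiting client can never receive it.
    public void handleAndDrop(SyncJobRequest msg) { compute(msg); }

    // Fixed shape: the handler returns the RPC's result so it can be
    // serialized and sent back over the channel.
    public int handle(SyncJobRequest msg) { return compute(msg); }

    private int compute(SyncJobRequest msg) { return msg.input * 2; }
}
```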
Re: Review Request 29832: HIVE-9178. Add a synchronous RPC API to the remote Spark context.
On Jan. 13, 2015, 6:47 a.m., chengxiang li wrote: spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java, line 55 https://reviews.apache.org/r/29832/diff/1/?file=818434#file818434line55 At the API level, it's still an asynchronous RPC API, given the use case of this API described in the javadoc; do you think it would be cleaner to supply a synchronous API like `<T> T run(Job<T> job)`? No. With a client-side synchronous API, it's awkward to specify things like timeouts - you either need explicit parameters which are not really part of the RPC, or extra configuration. Here, you just say `client.run().get(someTimeout)` if you want the call to be synchronous on the client side. - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29832/#review67813 --- On Jan. 13, 2015, 12:31 a.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29832/ --- (Updated Jan. 13, 2015, 12:31 a.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9178 https://issues.apache.org/jira/browse/HIVE-9178 Repository: hive-git Description --- HIVE-9178. Add a synchronous RPC API to the remote Spark context. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java 5c3ca018bb177ef9fd9fb24b054a9db29274b31e spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 5e767ef5eb47e493a332607204f4c522028d7d0e spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java f8b2202a465bb8abe3d2c34e49ade6387482738c Diff: https://reviews.apache.org/r/29832/diff/ Testing --- Thanks, Marcelo Vanzin
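The design point argued above - keep the API asynchronous and let the caller block on a Future with its own timeout - can be sketched as follows. This is a hypothetical illustration (SketchClient and its methods are invented, not the real SparkClient API):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical client whose API stays asynchronous: the caller receives a
// Future and decides at the call site whether (and how long) to block.
class SketchClient {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public <T> Future<T> run(Callable<T> job) {
        return executor.submit(job);
    }

    public void stop() {
        executor.shutdown();
    }
}
```

A caller wanting synchronous behavior writes something like `client.run(job).get(30, TimeUnit.SECONDS)`: the timeout lives at the call site instead of being an extra parameter or configuration knob in the RPC API itself, which is the point made in the reply.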
[jira] [Commented] (HIVE-9360) TestSparkClient throws Timeoutexception
[ https://issues.apache.org/jira/browse/HIVE-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275783#comment-14275783 ] Marcelo Vanzin commented on HIVE-9360: -- Yeah, I dislike timeouts in tests but in this case it's kinda hard to avoid them. Feel free to increase them if that makes things better. TestSparkClient throws Timeoutexception --- Key: HIVE-9360 URL: https://issues.apache.org/jira/browse/HIVE-9360 Project: Hive Issue Type: Test Components: Tests Affects Versions: 0.15.0 Reporter: Szehon Ho Attachments: HIVE-9360.patch TestSparkClient has been throwing TimeoutException in some test runs. The exception looks like: {noformat} java.util.concurrent.TimeoutException: null at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) at org.apache.hive.spark.client.JobHandleImpl.get(JobHandleImpl.java:74) at org.apache.hive.spark.client.JobHandleImpl.get(JobHandleImpl.java:35) at org.apache.hive.spark.client.TestSparkClient$5.call(TestSparkClient.java:130) at org.apache.hive.spark.client.TestSparkClient.runTest(TestSparkClient.java:224) at org.apache.hive.spark.client.TestSparkClient.testMetricsCollection(TestSparkClient.java:126) {noformat} but for each of the tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9178) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275735#comment-14275735 ] Marcelo Vanzin commented on HIVE-9178: -- Should I worry about those test failures? I ran a subset of qtests locally and they passed. Create a separate API for remote Spark Context RPC other than job submission [Spark Branch] --- Key: HIVE-9178 URL: https://issues.apache.org/jira/browse/HIVE-9178 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9178.1-spark.patch Based on discussions in HIVE-8972, it seems to make sense to create a separate API for RPCs, such as addJar and getExecutorCounter. These jobs are different from a query submission in that they don't need to be queued in the backend and can be executed right away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9178) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276297#comment-14276297 ] Marcelo Vanzin commented on HIVE-9178: -- I'll take a closer look at the code later, but I wonder if this is just a side effect of the test machine being slower for some reason (e.g. HIVE-9360). The new code shouldn't really slow down anything... Create a separate API for remote Spark Context RPC other than job submission [Spark Branch] --- Key: HIVE-9178 URL: https://issues.apache.org/jira/browse/HIVE-9178 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9178.1-spark.patch, HIVE-9178.1-spark.patch Based on discussions in HIVE-8972, it seems to make sense to create a separate API for RPCs, such as addJar and getExecutorCounter. These jobs are different from a query submission in that they don't need to be queued in the backend and can be executed right away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9364) Document non-Public API's which Spark SQL uses
[ https://issues.apache.org/jira/browse/HIVE-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276100#comment-14276100 ] Marcelo Vanzin commented on HIVE-9364: -- Here are three APIs that changed between 0.13 and trunk which are used by SparkSQL: https://github.com/apache/hive/commit/cf7028289dbe39cdbda208286aa4b53086958a8e
{code}
// old
public boolean dropIndex(String baseTableName, String index_name,
    boolean deleteData) throws HiveException {
// new
public boolean dropIndex(String baseTableName, String index_name,
    boolean throwException, boolean deleteData) throws HiveException {
{code}
https://github.com/apache/hive/commit/29b88a117ca3900a83eaab44c74b27164493fc56
{code}
// old
public Path getExternalTmpPath(URI extURI) {
// new
public Path getExternalTmpPath(Path path) {
{code}
https://github.com/apache/hive/commit/9243cbbfe2f22b7ab0453d8766c6d5ae333be368
{code}
// old
public ArrayList<LinkedHashMap<String, String>> loadDynamicPartitions(Path loadPath,
    String tableName, Map<String, String> partSpec, boolean replace, int numDP,
    boolean holdDDLTime, boolean listBucketingEnabled, boolean isAcid) throws HiveException {
// new
public Map<Map<String, String>, Partition> loadDynamicPartitions(Path loadPath,
    String tableName, Map<String, String> partSpec, boolean replace, int numDP,
    boolean holdDDLTime, boolean listBucketingEnabled, boolean isAcid) throws HiveException {
{code}
{code}
// old
public void loadPartition(Path loadPath, String tableName, Map<String, String> partSpec,
    boolean replace, boolean holdDDLTime, boolean inheritTableSpecs,
    boolean isSkewedStoreAsSubdir, boolean isSrcLocal, boolean isAcid) throws HiveException {
// new
public Partition loadPartition(Path loadPath, Table tbl, Map<String, String> partSpec,
    boolean replace, boolean holdDDLTime, boolean inheritTableSpecs,
    boolean isSkewedStoreAsSubdir, boolean isSrcLocal, boolean isAcid) throws HiveException {
{code}
(For loadDynamicPartitions, there's an extra change,
https://github.com/apache/hive/commit/2568066d, which added the isAcid argument.) Document non-Public API's which Spark SQL uses -- Key: HIVE-9364 URL: https://issues.apache.org/jira/browse/HIVE-9364 Project: Hive Issue Type: Sub-task Components: API Reporter: Brock Noland Fix For: 1.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9178) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned HIVE-9178: Assignee: Marcelo Vanzin Create a separate API for remote Spark Context RPC other than job submission [Spark Branch] --- Key: HIVE-9178 URL: https://issues.apache.org/jira/browse/HIVE-9178 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Based on discussions in HIVE-8972, it seems to make sense to create a separate API for RPCs, such as addJar and getExecutorCounter. These jobs are different from a query submission in that they don't need to be queued in the backend and can be executed right away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9179) Add listeners on JobHandle so job status change can be notified to the client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned HIVE-9179: Assignee: Marcelo Vanzin Add listeners on JobHandle so job status change can be notified to the client [Spark Branch] Key: HIVE-9179 URL: https://issues.apache.org/jira/browse/HIVE-9179 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Based on discussion in HIVE-8972, it seems nice to add listeners on a job handle such that state changes of a submitted job can be notified, instead of the current approach of the client polling for such changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 29832: HIVE-9178. Add a synchronous RPC API to the remote Spark context.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29832/ --- Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9178 https://issues.apache.org/jira/browse/HIVE-9178 Repository: hive-git Description --- HIVE-9178. Add a synchronous RPC API to the remote Spark context. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java 5c3ca018bb177ef9fd9fb24b054a9db29274b31e spark-client/src/main/java/org/apache/hive/spark/client/BaseProtocol.java f9c10b196ab47b5b4f4c0126ad455869ab68f0ca spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 0d49ed3d9e33ca08d6a7526c1c434a0dd0a06a67 spark-client/src/main/java/org/apache/hive/spark/client/SparkClient.java 5e767ef5eb47e493a332607204f4c522028d7d0e spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java f8b2202a465bb8abe3d2c34e49ade6387482738c Diff: https://reviews.apache.org/r/29832/diff/ Testing --- Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-9178) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9178: - Status: Patch Available (was: Open) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch] --- Key: HIVE-9178 URL: https://issues.apache.org/jira/browse/HIVE-9178 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9178.1-spark.patch Based on discussions in HIVE-8972, it seems to make sense to create a separate API for RPCs, such as addJar and getExecutorCounter. These jobs are different from a query submission in that they don't need to be queued in the backend and can be executed right away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9178) Create a separate API for remote Spark Context RPC other than job submission [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9178: - Attachment: HIVE-9178.1-spark.patch Create a separate API for remote Spark Context RPC other than job submission [Spark Branch] --- Key: HIVE-9178 URL: https://issues.apache.org/jira/browse/HIVE-9178 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Marcelo Vanzin Attachments: HIVE-9178.1-spark.patch Based on discussions in HIVE-8972, it seems to make sense to create a separate API for RPCs, such as addJar and getExecutorCounter. These jobs are different from a query submission in that they don't need to be queued in the backend and can be executed right away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29145: HIVE-9094 TimeoutException when trying get executor count from RSC [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29145/#review65348 --- Ship it! +1 to Xuefu's comments. The config name also looks very generic, since it's only applied to a couple of jobs submitted to the client. But I don't have a good suggestion here. - Marcelo Vanzin On Dec. 17, 2014, 6:28 a.m., chengxiang li wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29145/ --- (Updated Dec. 17, 2014, 6:28 a.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-9094 https://issues.apache.org/jira/browse/HIVE-9094 Repository: hive-git Description --- RemoteHiveSparkClient::getExecutorCount timeout after 5s as Spark cluster has not launched yet 1. set the timeout value configurable. 2. set default timeout value 60s. 3. enable timeout for get spark job info and get spark stage info. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22f052a ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 5d6a02c ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java e1946d5 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java 6217de4 Diff: https://reviews.apache.org/r/29145/diff/ Testing --- Thanks, chengxiang li
[jira] [Commented] (HIVE-8972) Implement more fine-grained remote client-level events [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250353#comment-14250353 ] Marcelo Vanzin commented on HIVE-8972: -- The patch looks ok to me. I thought about creating a separate API for these kinds of RPCs - these wouldn't be queued in the backend but executed right away. My only concern is that this could be abused (e.g. a caller using these calls to run a Spark job before the queued ones), but perhaps that's an app-level concern and the client shouldn't care if someone uses it that way. The netty framework we're using now could also make some things easier, like adding listeners to JobHandle and reporting job state changes to the client side when they happen (instead of the current poll-like approach). We could also add client-level listeners so that interesting events are reported (e.g. spark context up and things like that). If there's interest in these things we could create a new task and I'll try to find some time to work on it. Implement more fine-grained remote client-level events [Spark Branch] - Key: HIVE-8972 URL: https://issues.apache.org/jira/browse/HIVE-8972 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-8972.1-spark.patch, HIVE-8972.2-spark.patch, HIVE-8972.3-spark.patch, HIVE-8972.3-spark.patch, HIVE-8972.4-spark.patch, HIVE-8972.5-spark.patch Follow up task of HIVE-8956. Fine-grained events are useful for better job monitoring and failure handling. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
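The listener idea floated in that comment can be sketched in a few lines. This is a hedged illustration only - {{Listener}} and {{Handle}} are invented names, not the eventual spark-client API - contrasting push-style notification with the poll-like approach the comment describes:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ListenerSketch {
    // Hypothetical callback interface: invoked whenever the job's state changes.
    interface Listener { void onStateChanged(String newState); }

    static class Handle {
        private final List<Listener> listeners = new CopyOnWriteArrayList<>();
        private volatile String state = "QUEUED";

        // Replays the current state on registration so a late listener
        // doesn't miss transitions that already happened.
        void addListener(Listener l) {
            l.onStateChanged(state);
            listeners.add(l);
        }

        // Would be invoked by the RPC dispatcher when the remote driver
        // reports a change; clients no longer need to poll getState().
        void setState(String newState) {
            state = newState;
            for (Listener l : listeners) l.onStateChanged(newState);
        }

        String getState() { return state; }
    }
}
```

The replay-on-register step matters: without it, a client that attaches a listener after the job already started would never learn the current state.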
[jira] [Commented] (HIVE-9094) TimeoutException when trying get executor count from RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248560#comment-14248560 ] Marcelo Vanzin commented on HIVE-9094: -- 60s sounds reasonable. This initial timeout will always be hard to figure out, since launching the app will depend a lot on the cluster being used and a bunch of other things... :-/ Perhaps we could add some kind of context status event that the client side can listen to, and the driver side can periodically send the update... but that would probably still need some kind of timeout. Anyway, raising the timeout sounds fine for now. TimeoutException when trying get executor count from RSC [Spark Branch] --- Key: HIVE-9094 URL: https://issues.apache.org/jira/browse/HIVE-9094 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chengxiang Li In http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport, join25.q failed because: {code} 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get spark memory/core info: java.util.concurrent.TimeoutException org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark memory/core info: java.util.concurrent.TimeoutException at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at 
org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837) at org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234) at org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:176) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at 
junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129) at junit.framework.TestSuite.runTest(TestSuite.java:255) at junit.framework.TestSuite.run(TestSuite.java:250) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4
[jira] [Commented] (HIVE-9017) Clean up temp files of RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244944#comment-14244944 ] Marcelo Vanzin commented on HIVE-9017: -- These files are created by Spark when downloading resources for the app (e.g. application jars). In standalone mode, by default, these files will end up in /tmp (java.io.tmpdir). The problem is that the app doesn't clean up these files; in fact, it can't, because they are supposed to be shared in case multiple executors run on the same host - so one executor cannot unilaterally decide to delete them. (That's not entirely true; I guess it could, but then it would cause other executors to re-download the file when needed, so more overhead.) This is not a problem in Yarn mode, since the temp dir is under a Yarn-managed directory that is deleted when the app shuts down. So, while I think of a clean way to fix this in Spark, the following can be done on the Hive side: - create an app-specific temp directory before launching the Spark app - set {{spark.local.dir}} to that location - delete the directory when the client shuts down Clean up temp files of RSC [Spark Branch] - Key: HIVE-9017 URL: https://issues.apache.org/jira/browse/HIVE-9017 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Currently RSC will leave a lot of temp files in {{/tmp}}, including {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc. We should clean up these files or it will exhaust disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
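The three-step workaround proposed in that comment can be sketched roughly as follows. These are illustrative names only - {{createAppLocalDir}} and {{deleteRecursively}} are not actual Hive methods, and in the real client {{spark.local.dir}} would be passed on the spark-submit command line rather than set as a JVM system property:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class LocalDirSketch {
    // Steps 1 and 2: create an app-specific scratch directory and point
    // spark.local.dir at it before launching the Spark app.
    static Path createAppLocalDir() throws IOException {
        Path dir = Files.createTempDirectory("hive-spark-client-");
        // Stand-in for passing --conf spark.local.dir=<dir> to spark-submit.
        System.setProperty("spark.local.dir", dir.toString());
        return dir;
    }

    // Step 3: remove the directory when the client shuts down, e.g. from a
    // shutdown hook. Children are deleted before parents (reverse walk order).
    static void deleteRecursively(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder())
                 .forEach(p -> p.toFile().delete());
        }
    }
}
```

Because every file Spark downloads for the app lands under {{spark.local.dir}}, deleting the one directory at shutdown cleans up the {{*_lock}}, {{*_cache}}, and properties files in a single step.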
[jira] [Commented] (HIVE-9017) Clean up temp files of RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244948#comment-14244948 ] Marcelo Vanzin commented on HIVE-9017: -- P.S.: that solution will probably not work very well in real standalone mode, since {{spark.local.dir}} would have to be created / deleted on every node in the cluster, and the client probably doesn't have the means to do that. Clean up temp files of RSC [Spark Branch] - Key: HIVE-9017 URL: https://issues.apache.org/jira/browse/HIVE-9017 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Currently RSC will leave a lot of temp files in {{/tmp}}, including {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc. We should clean up these files or it will exhaust disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9017) Clean up temp files of RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244951#comment-14244951 ] Marcelo Vanzin commented on HIVE-9017: -- Correct. All files written by spark will end up under that directory (right now they all end up in /tmp since it's not set). Clean up temp files of RSC [Spark Branch] - Key: HIVE-9017 URL: https://issues.apache.org/jira/browse/HIVE-9017 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Currently RSC will leave a lot of temp files in {{/tmp}}, including {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc. We should clean up these files or it will exhaust disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9017) Clean up temp files of RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244990#comment-14244990 ] Marcelo Vanzin commented on HIVE-9017: -- You can't do that, because these files are meant to be referenced by multiple processes. So if one of them just deletes the files, you kinda break the protocol. Clean up temp files of RSC [Spark Branch] - Key: HIVE-9017 URL: https://issues.apache.org/jira/browse/HIVE-9017 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Currently RSC will leave a lot of temp files in {{/tmp}}, including {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc. We should clean up these files or it will exhaust disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9017) Clean up temp files of RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244992#comment-14244992 ] Marcelo Vanzin commented on HIVE-9017: -- See https://github.com/apache/spark/commit/7aacb7bf for more details of original Spark change. Clean up temp files of RSC [Spark Branch] - Key: HIVE-9017 URL: https://issues.apache.org/jira/browse/HIVE-9017 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Currently RSC will leave a lot of temp files in {{/tmp}}, including {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc. We should clean up these files or it will exhaust disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9017) Clean up temp files of RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245029#comment-14245029 ] Marcelo Vanzin commented on HIVE-9017: -- I filed https://issues.apache.org/jira/browse/SPARK-4834 to fix this in Spark. bq. To clarify, when Spark launches multiple executors in one host for one application, these executors share the same JVM, right? No, each executor is its own process. Clean up temp files of RSC [Spark Branch] - Key: HIVE-9017 URL: https://issues.apache.org/jira/browse/HIVE-9017 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Currently RSC will leave a lot of temp files in {{/tmp}}, including {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc. We should clean up these files or it will exhaust disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9017) Clean up temp files of RSC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245051#comment-14245051 ] Marcelo Vanzin commented on HIVE-9017: -- In Spark-speak, an executor is the JVM that executes tasks. There's no name by which the individual threads an executor has are referred to; I guess you could say task runner, but it's rare to see someone even talk about those. As for whether you can run more than one executor per host, the answer is yes, but it's a little more complicated than that. In Yarn mode, it's definitely possible, but then Yarn doesn't suffer from this issue. In standalone mode, it's unusual. You can achieve that in two ways: - run with a local-cluster master, which HoS uses for testing. But people shouldn't use that in production. - run multiple Worker daemons on the same host; I don't know if that's possible, but right now Spark standalone has a 1:1 relationship between Worker daemons and executors. But, long story short, you can't delete these files when the executor goes down. That could break Yarn mode, and even in standalone mode that is kinda sketchy (let's say the executor dies and is restarted; having these files around could avoid having to re-download a large jar from the driver node). Clean up temp files of RSC [Spark Branch] - Key: HIVE-9017 URL: https://issues.apache.org/jira/browse/HIVE-9017 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Currently RSC will leave a lot of temp files in {{/tmp}}, including {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc. We should clean up these files or it will exhaust disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9085) Spark Client RPC should have larger default max message size [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243516#comment-14243516 ] Marcelo Vanzin commented on HIVE-9085: -- LGTM (as discussed by e-mail). Spark Client RPC should have larger default max message size [Spark Branch] --- Key: HIVE-9085 URL: https://issues.apache.org/jira/browse/HIVE-9085 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9085-spark.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9085) Spark Client RPC should have larger default max message size [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243535#comment-14243535 ] Marcelo Vanzin commented on HIVE-9085: -- When an exception is thrown in the write path, it's not safe to use the RPC channel anymore. Partial data may have been written to the socket and may cause both endpoints to get out of sync. Right now the approach the code has taken is to close the socket on any error. If, in the long term, we'd prefer a more resilient approach, more modifications will have to be made. Spark Client RPC should have larger default max message size [Spark Branch] --- Key: HIVE-9085 URL: https://issues.apache.org/jira/browse/HIVE-9085 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9085-spark.1.patch, HIVE-9085-spark.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
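To see why a partial write poisons the connection, consider a toy length-prefixed framing (plain byte buffers here, not the actual Kryo/Netty codec that spark-client uses): once only part of a frame reaches the socket, the reader interprets leftover bytes as the next frame's length header, and both endpoints are out of sync.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FrameSketch {
    // Toy length-prefixed framing: a 4-byte big-endian length, then the payload.
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // What a reader does at a frame boundary: interpret the next 4 bytes as a
    // length. After a partial write, those bytes are garbage.
    static int nextFrameLength(byte[] stream, int offset) {
        return ByteBuffer.wrap(stream, offset, 4).getInt();
    }

    public static void main(String[] args) {
        byte[] ok = frame("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(nextFrameLength(ok, 0)); // 5: reader is in sync

        // Simulate a write failing after flushing only 2 bytes of the length
        // prefix, followed by a naive retry that sends a fresh "ok" frame.
        byte[] partial = Arrays.copyOf(ok, 2);
        byte[] retry = frame("ok".getBytes(StandardCharsets.UTF_8));
        byte[] wire = ByteBuffer.allocate(partial.length + retry.length)
                .put(partial).put(retry).array();
        // The reader now decodes a bogus length (0, not the expected 2), which
        // is why the code simply closes the socket on any write-path error.
        System.out.println(nextFrameLength(wire, 0));
    }
}
```

Resynchronizing after such corruption would require something like per-frame checksums or a handshake protocol, which is the "more resilient approach" the comment defers.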
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: HIVE-9036.5-spark.patch Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch, HIVE-9036.3-spark.patch, HIVE-9036.4-spark.patch, HIVE-9036.5-spark.patch, rsc-problem-1.tar.gz We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 9, 2014, 6:49 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description --- This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty Diffs (updated) - pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION spark-client/pom.xml PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION 
spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION Diff: https://reviews.apache.org/r/28779/diff/ Testing --- spark-client unit tests, plus some qtests. Thanks, Marcelo Vanzin
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
On Dec. 9, 2014, 7:05 p.m., Xuefu Zhang wrote: pom.xml, line 152 https://reviews.apache.org/r/28779/diff/7/?file=786238#file786238line152 Is there a reason that we cannot keep 3.7.0? Upgrading a dep version usually gives some headaches. This version is not used anywhere in the Hive build. In fact, there is no version 3.7.0.Final of io.netty (that's for the old org.jboss.netty package). - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/#review64411 --- On Dec. 9, 2014, 6:49 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 9, 2014, 6:49 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description --- This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. 
The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty Diffs - pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION spark-client/pom.xml PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION 
spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION Diff: https://reviews.apache.org/r/28779/diff/ Testing --- spark-client unit tests, plus some qtests. Thanks, Marcelo Vanzin
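[Editor's note: the patch description above mentions "future listeners" as a possible follow-up feature. A minimal sketch of what such an API could look like, built on `CompletableFuture` — the `JobHandle` interface here is hypothetical and is not the actual spark-client API:]

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

public class ListenerSketch {
    // Hypothetical job handle exposing a completion callback, in the
    // spirit of the "future listeners" idea from the description.
    interface JobHandle<T> {
        void addListener(Consumer<T> onComplete);
    }

    public static void main(String[] args) {
        CompletableFuture<String> result = new CompletableFuture<>();

        // Adapt the future to the listener interface: each registered
        // listener fires when the job's result becomes available.
        JobHandle<String> handle = onComplete -> result.thenAccept(onComplete);
        handle.addListener(r -> System.out.println("job finished: " + r));

        // Simulate the remote driver completing the job.
        result.complete("42");
    }
}
```

With plain Java futures (rather than Scala's), listeners like this become straightforward to express, which is the awkwardness with akka the description alludes to.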
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: HIVE-9036.6-spark.patch Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch, HIVE-9036.3-spark.patch, HIVE-9036.4-spark.patch, HIVE-9036.5-spark.patch, HIVE-9036.6-spark.patch, rsc-problem-1.tar.gz We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 9, 2014, 9:17 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description --- This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty Diffs (updated) - pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION spark-client/pom.xml PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION 
spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION Diff: https://reviews.apache.org/r/28779/diff/ Testing --- spark-client unit tests, plus some qtests. Thanks, Marcelo Vanzin
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION Diff: https://reviews.apache.org/r/28779/diff/ Testing --- spark-client unit tests, plus some qtests. Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: HIVE-9036.2-spark.patch Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 8, 2014, 7:40 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description (updated) --- This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty Diffs - pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION spark-client/pom.xml PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION 
spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION Diff: https://reviews.apache.org/r/28779/diff/ Testing --- spark-client unit tests, plus some qtests. Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: (was: HIVE-9036.2-spark.patch) Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: HIVE-9036.2-spark.patch Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 8, 2014, 7:47 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description --- This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty Diffs (updated) - pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION spark-client/pom.xml PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION 
spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION Diff: https://reviews.apache.org/r/28779/diff/ Testing --- spark-client unit tests, plus some qtests. Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Status: Open (was: Patch Available) Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Status: Patch Available (was: Open) Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
On Dec. 8, 2014, 9:03 p.m., Brock Noland wrote: Hey Marcelo, When I send an HTTP request to the port where RSC is listening, the message below is printed. Thus it's doing a good job in that it's checking the max message size, which is awesome, but I feel we need to: 1) Add a small header so that when junk data is sent to this port we can log a better exception than the one below. As I mentioned, we've had massive problems with this in Flume, which also uses netty for communication. 2) Ensure the incoming size is not negative. 2014-12-08 20:56:41,070 WARN [RPC-Handler-7]: rpc.RpcDispatcher (RpcDispatcher.java:exceptionCaught(154)) - [HelloDispatcher] Caught exception in channel pipeline. io.netty.handler.codec.DecoderException: java.lang.IllegalArgumentException: Message exceeds maximum allowed size (10485760 bytes). at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149) at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:108) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Message
exceeds maximum allowed size (10485760 bytes). at org.apache.hive.spark.client.rpc.KryoMessageCodec.checkSize(KryoMessageCodec.java:117) at org.apache.hive.spark.client.rpc.KryoMessageCodec.decode(KryoMessageCodec.java:77) at io.netty.handler.codec.ByteToMessageCodec$1.decode(ByteToMessageCodec.java:42) at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249) ... 12 more I can add the check for negative sizes, but I still don't understand why you want a header. It doesn't serve any practical purposes. The protocol itself has a handshake that needs to be successful for the connection to be established; adding a header will add nothing to the process, just complexity. - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/#review64279 --- On Dec. 8, 2014, 7:47 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 8, 2014, 7:47 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description --- This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. 
The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty Diffs - pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION spark-client/pom.xml PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java
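[Editor's note: the size-check discussion above can be illustrated with a standalone sketch. This is not the actual `KryoMessageCodec.checkSize` code; the class and constant names are illustrative, and only the 10485760-byte limit comes from the log in the thread:]

```java
import java.nio.ByteBuffer;

public class FrameSizeCheck {
    static final int MAX_MESSAGE_SIZE = 10 * 1024 * 1024; // 10485760 bytes, as in the log above

    // Validates a length prefix decoded from an untrusted peer. Both
    // checks matter: an oversized value is likely junk data, and a
    // negative value must be rejected before any buffer handling.
    static void checkSize(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("Invalid message size: " + size);
        }
        if (size > MAX_MESSAGE_SIZE) {
            throw new IllegalArgumentException(
                "Message exceeds maximum allowed size (" + MAX_MESSAGE_SIZE + " bytes).");
        }
    }

    public static void main(String[] args) {
        // An HTTP request hitting the RPC port: "GET " read as a
        // big-endian int is ~1.2 billion, far above the limit.
        int bogus = ByteBuffer.wrap("GET ".getBytes()).getInt();
        try {
            checkSize(bogus);
            System.out.println("accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This is why Brock's HTTP-request experiment produced the "Message exceeds maximum allowed size" exception: arbitrary bytes interpreted as a length prefix almost always fail the bounds check.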
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 8, 2014, 9:11 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description --- This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty Diffs (updated) - pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION spark-client/pom.xml PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION 
spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION Diff: https://reviews.apache.org/r/28779/diff/ Testing --- spark-client unit tests, plus some qtests. Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: HIVE-9036.3-spark.patch Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch, HIVE-9036.3-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
On Dec. 8, 2014, 9:03 p.m., Brock Noland wrote: Hey Marcelo, When I send an HTTP request to the port where RSC is listening, the message below is printed. Thus it's doing a good job in that it's checking the max message size, which is awesome, but I feel we need to: 1) Add a small header so that when junk data is sent to this port we can log a better exception than the one below. As I mentioned, we've had massive problems with this in Flume, which also uses netty for communication. 2) Ensure the incoming size is not negative. 2014-12-08 20:56:41,070 WARN [RPC-Handler-7]: rpc.RpcDispatcher (RpcDispatcher.java:exceptionCaught(154)) - [HelloDispatcher] Caught exception in channel pipeline. io.netty.handler.codec.DecoderException: java.lang.IllegalArgumentException: Message exceeds maximum allowed size (10485760 bytes). at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149) at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:108) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Message
exceeds maximum allowed size (10485760 bytes). at org.apache.hive.spark.client.rpc.KryoMessageCodec.checkSize(KryoMessageCodec.java:117) at org.apache.hive.spark.client.rpc.KryoMessageCodec.decode(KryoMessageCodec.java:77) at io.netty.handler.codec.ByteToMessageCodec$1.decode(ByteToMessageCodec.java:42) at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249) ... 12 more Marcelo Vanzin wrote: I can add the check for negative sizes, but I still don't understand why you want a header. It doesn't serve any practical purposes. The protocol itself has a handshake that needs to be successful for the connection to be established; adding a header will add nothing to the process, just complexity. Brock Noland wrote: The only thing I would add is that it's easy for engineers who work on this to look at the exception and know that it's not related, but it's not easy for operations folks. When they turn on debug logging and see these exceptions they will get taken off the trail of the real problem they are trying to debug. Ops folks should not turn on debug logging unless they're told to; otherwise they'll potentially see a lot of these kinds of things. If they do turn on debug logging by themselves, then they shouldn't be surprised to see things they may not fully understand. There's a reason why it's called debug, and not just print the log messages specific to the problem I'm having. - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/#review64279 --- On Dec. 8, 2014, 9:11 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 8, 2014, 9:11 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git Description --- This patch replaces akka with a simple netty-based RPC layer. 
It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty
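[Editor's note: Marcelo's argument rests on the protocol's handshake rejecting junk connections. A generic challenge-response sketch of that pattern — this is purely illustrative and not the actual RSC handshake code:]

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;
import java.util.Arrays;

public class HandshakeSketch {
    // The client proves knowledge of a shared secret by returning an
    // HMAC of the server's random challenge.
    static byte[] respond(byte[] secret, byte[] challenge) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret, "HmacSHA256"));
        return mac.doFinal(challenge);
    }

    public static void main(String[] args) throws Exception {
        byte[] secret = "client-id-secret".getBytes(); // illustrative secret
        byte[] challenge = new byte[16];
        new SecureRandom().nextBytes(challenge);

        byte[] response = respond(secret, challenge);

        // Server recomputes the expected response and compares. A peer
        // sending junk data (or an HTTP request) can never pass this,
        // which is why an extra header adds nothing but complexity.
        boolean ok = Arrays.equals(response, respond(secret, challenge));
        System.out.println(ok ? "handshake ok" : "rejected");
    }
}
```

Under this pattern, the handshake itself acts as the connection validator, so malformed traffic is dropped at connection setup regardless of any framing header.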
Re: Review Request 28779: [spark-client] Netty-based RPC implementation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28779/ --- (Updated Dec. 8, 2014, 9:52 p.m.) Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9036 https://issues.apache.org/jira/browse/HIVE-9036 Repository: hive-git

Description
---
This patch replaces akka with a simple netty-based RPC layer. It doesn't add any features on top of the existing spark-client API, which is unchanged (except for the need to add empty constructors in some places). With the new backend we can think about adding some nice features such as future listeners (which were awkward with akka because of Scala), but those are left for a different time. The full change set, with more detailed descriptions, can be seen here: https://github.com/vanzin/hive/commits/spark-client-netty

Diffs (updated)
---
pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION
spark-client/pom.xml PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION
spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION
spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION
spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION
spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION

Diff: https://reviews.apache.org/r/28779/diff/

Testing
---
spark-client unit tests, plus some qtests.

Thanks,
Marcelo Vanzin
[jira] [Commented] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238670#comment-14238670 ] Marcelo Vanzin commented on HIVE-9036: -- I have a live job in that state, should be better for debugging. Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch, HIVE-9036.3-spark.patch, rsc-problem-1.tar.gz We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: HIVE-9036.4-spark.patch Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch, HIVE-9036.2-spark.patch, HIVE-9036.3-spark.patch, HIVE-9036.4-spark.patch, rsc-problem-1.tar.gz We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8574) Enhance metrics gathering in Spark Client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved HIVE-8574. -- Resolution: Not a Problem I'll close this as not a problem for now. If we decide the overhead is too much, we can revisit it. As for the ugly API, currently I couldn't think of a way to avoid it. Spark's API is just not very friendly in this area. Enhance metrics gathering in Spark Client [Spark Branch] Key: HIVE-8574 URL: https://issues.apache.org/jira/browse/HIVE-8574 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin The current implementation of metrics gathering in the Spark client is a little hacky. First, it's awkward to use (and the implementation is also pretty ugly). Second, it will just collect metrics indefinitely, so in the long term it turns into a huge memory leak. We need a simplified interface and some mechanism for disposing of old metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
Marcelo Vanzin created HIVE-9036: Summary: Replace akka for remote spark client RPC [Spark Branch] Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Attachment: HIVE-9036.1-spark.patch Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9036: - Status: Patch Available (was: Open) Patch is rather large but shouldn't be too complicated; and there are unit tests! (Plus I've run some of the qtests.) Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9036) Replace akka for remote spark client RPC [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236447#comment-14236447 ] Marcelo Vanzin commented on HIVE-9036: -- I'll look at why the patch isn't applying later... probably need to rebase my branch. Replace akka for remote spark client RPC [Spark Branch] --- Key: HIVE-9036 URL: https://issues.apache.org/jira/browse/HIVE-9036 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9036.1-spark.patch We've had weird issues with akka, especially when something goes wrong and it becomes a little hard to debug. Let's replace it with a simple(r) RPC system built on top of netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8991) Fix custom_input_output_format [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231866#comment-14231866 ] Marcelo Vanzin commented on HIVE-8991: -- I didn't mean to stop you guys from checking in this patch. I just said that while this may fix the test, it's an indication of something that we need to understand better (i.e. how to properly add jars to the Spark job's classpath without causing conflicts). Fix custom_input_output_format [Spark Branch] - Key: HIVE-8991 URL: https://issues.apache.org/jira/browse/HIVE-8991 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-8991.1-spark.patch After HIVE-8836, {{custom_input_output_format}} fails because of missing hive-it-util in remote driver's class path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8991) Fix custom_input_output_format [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230227#comment-14230227 ] Marcelo Vanzin commented on HIVE-8991: -- Hi [~lirui], the patch looks good if it unblocks the unit tests. I have to think a bit about whether it would work in a real deployment scenario, since IIRC hive-exec shades a lot of dependencies and it might cause problems with Spark. But the main one (Guava) should be solved in Spark, so hopefully there won't be other cases like that. Fix custom_input_output_format [Spark Branch] - Key: HIVE-8991 URL: https://issues.apache.org/jira/browse/HIVE-8991 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-8991.1-spark.patch After HIVE-8836, {{custom_input_output_format}} fails because of missing hive-it-util in remote driver's class path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8995) Find thread leak in RSC Tests
[ https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230254#comment-14230254 ] Marcelo Vanzin commented on HIVE-8995: -- The three threads are from akka; I wonder if the test code is failing to properly shut down clients or the library itself (i.e. call {{SparkClientFactory.stop()}}). Find thread leak in RSC Tests - Key: HIVE-8995 URL: https://issues.apache.org/jira/browse/HIVE-8995 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland I was regenerating output as part of the merge: {noformat} mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q 
bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q 
ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjoin_4.q,smb_mapjoin_5.q,smb_mapjoin_6.q,smb_mapjoin_7.q,sort_merge_join_desc_1.q,sort_merge_join_desc_2.q,sort_merge_join_desc_3.q,sort_merge_join_desc_4.q,sort_merge_join_desc_5.q,sort_merge_join_desc_6.q
[jira] [Commented] (HIVE-8995) Find thread leak in RSC Tests
[ https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230273#comment-14230273 ] Marcelo Vanzin commented on HIVE-8995: -- You don't need to call that method for every session. The pattern here is: * Call {{SparkClientFactory.initialize()}} once * Create / use as many clients as you want * When app shuts down, call {{SparkClientFactory.stop()}} So this should work nicely for HS2 (call initialize during bring up, call stop during shut down). I see {{RemoteHiveSparkClient}} calls initialize; that seems wrong, if my understanding of that class is correct (that it will be instantiated once for each session). Another option is to make {{initialize}} idempotent; right now it will just leak the old akka actor system, which is bad. This should be a trivial change (just add a check for {{initialized}}). Find thread leak in RSC Tests - Key: HIVE-8995 URL: https://issues.apache.org/jira/browse/HIVE-8995 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland I was regenerating output as part of the merge: {noformat} mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q 
auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q 
join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q
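The lifecycle pattern described in the comment above (initialize once, create many clients, stop on shutdown) and the suggested idempotency fix ("just add a check for {{initialized}}") can be sketched as follows. This is a hypothetical illustration, not the actual Hive code; the class mirrors the real {{SparkClientFactory}} name, but the body and the {{startCount}} field exist only to make the sketch self-contained and testable.

```java
// Hypothetical sketch of an idempotent SparkClientFactory.initialize(),
// as suggested in the comment above. Not the actual Hive implementation.
class SparkClientFactorySketch {
    static boolean initialized = false;
    static int startCount = 0; // stands in for "times the actor system was started"

    public static synchronized void initialize() {
        if (initialized) {
            return; // idempotent: a second call is a no-op, so nothing leaks
        }
        startCount++; // stands in for "start the RPC/actor system"
        initialized = true;
    }

    public static synchronized void stop() {
        if (!initialized) {
            return;
        }
        // stands in for "shut down the RPC/actor system"
        initialized = false;
    }
}
```

With this shape, a per-session caller like {{RemoteHiveSparkClient}} calling initialize repeatedly becomes harmless, which is the cheaper of the two options the comment proposes.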
[jira] [Commented] (HIVE-8957) Remote spark context needs to clean up itself in case of connection timeout [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230478#comment-14230478 ] Marcelo Vanzin commented on HIVE-8957: -- If you don't mind the bug remaining unattended for several days, sure. I have my hands full with all sorts of other things at the moment. Remote spark context needs to clean up itself in case of connection timeout [Spark Branch] -- Key: HIVE-8957 URL: https://issues.apache.org/jira/browse/HIVE-8957 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-8957.1-spark.patch In the current SparkClient implementation (class SparkClientImpl), the constructor does some initialization and in the end waits for the remote driver to connect. In case of timeout, it just throws an exception without cleaning up after itself. The cleanup is necessary to release system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8956) Hive hangs while some error/exception happens beyond job execution [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226504#comment-14226504 ] Marcelo Vanzin commented on HIVE-8956: -- I haven't looked at akka in that much detail to see if there is some API to catch those. You can enable akka logging (set {{spark.akka.logLifecycleEvents}} to true) and that will print these errors to the logs. Spark tries to serialize data before sending it to akka, to try to catch serialization issues, but that adds overhead, and it also doesn't help in the deserialization path... Hive hangs while some error/exception happens beyond job execution [Spark Branch] - Key: HIVE-8956 URL: https://issues.apache.org/jira/browse/HIVE-8956 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Rui Li Labels: Spark-M3 Fix For: spark-branch Attachments: HIVE-8956.1-spark.patch The remote Spark client communicates with the remote Spark context asynchronously; if an error/exception is thrown during job execution in the remote Spark context, it is wrapped and sent back to the remote Spark client, but if an error/exception is thrown outside job execution, such as a job failing to serialize, the remote Spark client never learns what is going on in the remote Spark context, and it hangs. Setting a timeout on the remote Spark client side may not be a great idea, as we are not sure how long the query will run on the Spark cluster. We need to find a way to check whether a job has failed (over its whole life cycle) in the remote Spark context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
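For reference, the lifecycle-event logging mentioned above is set through a regular Spark configuration key. Assuming a standard Spark setup (this exact deployment isn't described in the thread), it can be enabled in spark-defaults.conf:

```
# spark-defaults.conf fragment — enables the akka lifecycle logging
# named in the comment above (key applies to akka-based Spark versions)
spark.akka.logLifecycleEvents  true
```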
[jira] [Commented] (HIVE-8957) Remote spark context needs to clean up itself in case of connection timeout [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226515#comment-14226515 ] Marcelo Vanzin commented on HIVE-8957: -- I think a fix here will be a little more complicated than that. Let me look at the code and think about it. Remote spark context needs to clean up itself in case of connection timeout [Spark Branch] -- Key: HIVE-8957 URL: https://issues.apache.org/jira/browse/HIVE-8957 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-8957.1-spark.patch In the current SparkClient implementation (class SparkClientImpl), the constructor does some initialization and in the end waits for the remote driver to connect. In case of timeout, it just throws an exception without cleaning up after itself. The cleanup is necessary to release system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
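The shape of the cleanup being asked for in HIVE-8957 can be sketched as below. This is illustrative only: the method names ({{acquireResources}}, {{waitForDriverToConnect}}, {{releaseResources}}) are hypothetical, not the real {{SparkClientImpl}} API. The point is that whatever the constructor acquired is released before the timeout exception propagates.

```java
import java.util.concurrent.TimeoutException;

// Illustrative sketch, not Hive code: a constructor that waits for a remote
// driver to connect, and releases its resources if the wait times out.
class TimeoutCleanupSketch {
    static boolean cleanedUp = false;

    TimeoutCleanupSketch(long timeoutMs) throws TimeoutException {
        acquireResources();
        try {
            waitForDriverToConnect(timeoutMs);
        } catch (TimeoutException e) {
            releaseResources(); // clean up before propagating the failure
            throw e;
        }
    }

    private void acquireResources() {
        // stands in for e.g. opening a server socket, spawning the driver process
    }

    private void waitForDriverToConnect(long timeoutMs) throws TimeoutException {
        // Simulates a driver that never connects within the timeout.
        throw new TimeoutException("remote driver did not connect in " + timeoutMs + " ms");
    }

    private void releaseResources() {
        cleanedUp = true; // stands in for closing sockets, killing the child, etc.
    }
}
```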
[jira] [Commented] (HIVE-8574) Enhance metrics gathering in Spark Client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226517#comment-14226517 ] Marcelo Vanzin commented on HIVE-8574: -- Actually, after a quick look at the code again, this might not be a problem. Metrics are kept per-job handle. Job handles are managed by the code submitting jobs - leave them for garbage collection and metrics go away. So unless we're worried about a single job creating so many tasks that it will run the driver out of memory with all the metrics data, this shouldn't really be an issue. Enhance metrics gathering in Spark Client [Spark Branch] Key: HIVE-8574 URL: https://issues.apache.org/jira/browse/HIVE-8574 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin The current implementation of metrics gathering in the Spark client is a little hacky. First, it's awkward to use (and the implementation is also pretty ugly). Second, it will just collect metrics indefinitely, so in the long term it turns into a huge memory leak. We need a simplified interface and some mechanism for disposing of old metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8574) Enhance metrics gathering in Spark Client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226668#comment-14226668 ] Marcelo Vanzin commented on HIVE-8574: -- Rounding up, each task metrics data structure will take around 256 bytes. So ~25MB? Enhance metrics gathering in Spark Client [Spark Branch] Key: HIVE-8574 URL: https://issues.apache.org/jira/browse/HIVE-8574 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin The current implementation of metrics gathering in the Spark client is a little hacky. First, it's awkward to use (and the implementation is also pretty ugly). Second, it will just collect metrics indefinitely, so in the long term it turns into a huge memory leak. We need a simplified interface and some mechanism for disposing of old metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
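As a back-of-the-envelope check of the "~25MB" figure above: the task count is not stated in this thread, so 100,000 tasks is our assumption, chosen because it makes the quoted numbers line up (256 bytes × 100,000 tasks ≈ 25.6 MB).

```java
// Sanity-check of the ~25MB estimate. The 100,000-task figure is an
// assumption for illustration; only the 256 bytes/task comes from the thread.
class MetricsMemoryEstimate {
    static long estimateBytes(long bytesPerTask, long tasks) {
        return bytesPerTask * tasks;
    }

    public static void main(String[] args) {
        long total = estimateBytes(256, 100_000);
        System.out.println(total + " bytes, i.e. about " + (total / 1_000_000) + " MB");
    }
}
```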
[jira] [Commented] (HIVE-8836) Enable automatic tests with remote spark client.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225049#comment-14225049 ] Marcelo Vanzin commented on HIVE-8836: -- I talked briefly with Brock about this, but the main thing here is that, right now, Spark is not very friendly to applications that are trying to embed it. As you've noticed, the assembly jar, which contains almost everything you need to run Spark, is not published to Maven or anywhere else. And not all artifacts used to build the assembly are published either - for example, the Yarn backend cannot be found anywhere in Maven, so without the assembly you cannot submit jobs to Yarn. I've suggested changing this in the past, but until Spark becomes more friendly to such use cases, I think Hive should require a full Spark install to work. If desired, we could use the hacks I added to the remote client to avoid needing the full install for unit tests, but even those are very limited; as some of you may have noticed, they probably only work with a local master. Enable automatic tests with remote spark client.[Spark Branch] -- Key: HIVE-8836 URL: https://issues.apache.org/jira/browse/HIVE-8836 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Rui Li Labels: Spark-M3 Attachments: HIVE-8836-brock-1.patch, HIVE-8836-brock-2.patch, HIVE-8836-brock-3.patch, HIVE-8836.1-spark.patch, HIVE-8836.2-spark.patch, HIVE-8836.3-spark.patch In a real production environment, the remote spark client will mostly be used to submit Spark jobs for Hive, so we should enable automatic tests with the remote spark client to make sure Hive features work with it.
[jira] [Commented] (HIVE-8956) Hive hangs while some error/exception happens beyond job execution[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225082#comment-14225082 ] Marcelo Vanzin commented on HIVE-8956: -- This is ok if it unblocks something right now. For the code, I'd suggest using {{System.nanoTime()}} to calculate durations, since it's monotonic. And use {{long}} instead of {{int}}. But I think a better approach is needed here. Currently the {{JobSubmitted}} message seems to only be sent when you use Spark's async APIs to submit a Spark job. A remote client job that does not use those APIs would never generate that message. Also, the backend uses a thread pool to execute jobs, so if you're queueing up multiple jobs, you may hit this timeout. I think we need more fine-grained remote-client-level events for tracking job progress. e.g., adding {{JobReceived}} and {{JobStarted}} messages would be a good start ({{JobResult}} already covers the job-finished case). I think these two extra messages should be enough to cover the problems described in this bug. Hive hangs while some error/exception happens beyond job execution[Spark Branch] Key: HIVE-8956 URL: https://issues.apache.org/jira/browse/HIVE-8956 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Rui Li Labels: Spark-M3 Attachments: HIVE-8956.1-spark.patch The remote spark client communicates with the remote spark context asynchronously. If an error/exception is thrown during job execution in the remote spark context, it is wrapped and sent back to the remote spark client; but if an error/exception is thrown outside job execution, such as job serialization failing, the remote spark client would never know what's going on in the remote spark context, and it would hang there. Setting a timeout on the remote spark client side may not be a great idea, as we are not sure how long the query will run on the spark cluster. We need to find a way to check whether a job has failed (over its whole life cycle) in the remote spark context.
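The {{System.nanoTime()}} suggestion above can be sketched as follows (a minimal illustration, not the actual patch; the class and variable names are made up):

```java
// Sketch of the suggestion above: use the monotonic System.nanoTime()
// for measuring durations, stored in a long (not an int) to avoid overflow.
// System.currentTimeMillis() can jump when the wall clock is adjusted,
// so it is unsafe for elapsed-time checks such as timeouts.
public class Durations {
    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(20); // stand-in for waiting on a job message
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("elapsed ms: " + elapsedMs);
    }
}
```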
[jira] [Commented] (HIVE-8574) Enhance metrics gathering in Spark Client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225276#comment-14225276 ] Marcelo Vanzin commented on HIVE-8574: -- Hey [~chengxiang li], I'd like to have a better understanding of how these metrics will be used by Hive, to come up with the proper fix here. I see two approaches: * Add an API to clean up the metrics. This keeps the current collect-all-metrics approach, but adds an API to delete metrics once they're processed. This assumes that Hive will always process the metrics of finished jobs, even if just to ask for them to be deleted. * Suggested by [~xuefuz]: add a timeout after a job finishes for cleaning up its metrics. This means Hive has some window after a job finishes during which the data is available, but after that, it's gone. I could also add some internal checks so that the collection doesn't keep accumulating data indefinitely if data is never deleted; e.g., track only the last x finished jobs, evicting the oldest when a new job starts. What do you think?
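The last option above (track only the last N finished jobs, evicting the oldest) can be sketched with an insertion-ordered {{LinkedHashMap}}; this is an illustration with made-up names, not the actual Spark client code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: keep metrics for at most maxFinishedJobs finished jobs, keyed by
// job ID, evicting the oldest entry whenever a newer job's metrics arrive.
public class FinishedJobMetrics<M> extends LinkedHashMap<String, M> {
    private final int maxFinishedJobs;

    public FinishedJobMetrics(int maxFinishedJobs) {
        super(16, 0.75f, false); // false = iterate in insertion (finish) order
        this.maxFinishedJobs = maxFinishedJobs;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, M> eldest) {
        // Called after each put(); returning true drops the oldest entry.
        return size() > maxFinishedJobs;
    }
}
```

For example, with {{maxFinishedJobs = 2}}, recording metrics for jobs "a", "b", and "c" in that order leaves only "b" and "c" in the map.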
[jira] [Commented] (HIVE-8574) Enhance metrics gathering in Spark Client [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223383#comment-14223383 ] Marcelo Vanzin commented on HIVE-8574: -- Haven't had a chance to look at this yet. Hopefully this week.
[jira] [Commented] (HIVE-8951) Spark remote context doesn't work with local-cluster [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223388#comment-14223388 ] Marcelo Vanzin commented on HIVE-8951: -- `SparkClientImpl` has `stop()`, which should be cleaning things up and properly stopping the driver. Are you calling it? Spark remote context doesn't work with local-cluster [Spark Branch] --- Key: HIVE-8951 URL: https://issues.apache.org/jira/browse/HIVE-8951 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang What I did: {code} set spark.home=/home/xzhang/apache/spark; set spark.master=local-cluster[2,1,2048]; set hive.execution.engine=spark; set spark.executor.memory=2g; set spark.serializer=org.apache.spark.serializer.KryoSerializer; set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec; select name, avg(value) as v from dec group by name order by v; {code} Exceptions seen: {code} 14/11/23 10:42:15 INFO Worker: Spark home: /home/xzhang/apache/spark 14/11/23 10:42:15 INFO AppClient$ClientActor: Connecting to master spark://xzdt.local:55151... 
14/11/23 10:42:15 INFO Master: Registering app Hive on Spark 14/11/23 10:42:15 INFO Master: Registered app Hive on Spark with ID app-20141123104215- 14/11/23 10:42:15 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20141123104215- 14/11/23 10:42:15 INFO NettyBlockTransferService: Server created on 41676 14/11/23 10:42:15 INFO BlockManagerMaster: Trying to register BlockManager 14/11/23 10:42:15 INFO BlockManagerMasterActor: Registering block manager xzdt.local:41676 with 265.0 MB RAM, BlockManagerId(driver, xzdt.local, 41676) 14/11/23 10:42:15 INFO BlockManagerMaster: Registered BlockManager 14/11/23 10:42:15 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0 14/11/23 10:42:20 WARN AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:174) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77) at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187) at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316) at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.server.Server.doStart(Server.java:293) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:194) at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204) at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204) at 
org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1676) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667) at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:204) at org.apache.spark.ui.WebUI.bind(WebUI.scala:102) at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267) at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267) at scala.Option.foreach(Option.scala:236) at org.apache.spark.SparkContext.init(SparkContext.scala:267) at org.apache.spark.api.java.JavaSparkContext.init(JavaSparkContext.scala:61) at org.apache.hive.spark.client.RemoteDriver.init(RemoteDriver.java:106) at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:362) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 14/11/23 10:42:20 WARN AbstractLifeCycle: FAILED org.eclipse.jetty.server.Server@4c9fd062: java.net.BindException: Address already in use
[jira] [Commented] (HIVE-8951) Spark remote context doesn't work with local-cluster [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223507#comment-14223507 ] Marcelo Vanzin commented on HIVE-8951: -- That `BindException` should not be fatal; Spark just retries on a different port when it happens. So something else must be going wrong.
[jira] [Commented] (HIVE-8951) Spark remote context doesn't work with local-cluster [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223661#comment-14223661 ] Marcelo Vanzin commented on HIVE-8951: -- Not from just those logs. Is this easily reproduced via some unit test? (Feel free to send me an e-mail with reproduction steps so I can try it myself.)