[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/22186

This will eliminate a race condition between FS shutdown (in the Hadoop shutdown manager) and the Hive callback. There's a risk today that the filesystems will be closed before the event log close()/rename() is called, so things don't get saved, and this can happen with any FS.

Registering the shutdown hook via the Spark APIs, with a priority higher than that of the FS shutdown, guarantees that it will be called before the FS shutdown. But it doesn't guarantee that the operation will complete within the 10s time limit hard coded into Hadoop 2.8.x+ for any single shutdown hook to complete. It is going to work in HDFS except in the special case of an HDFS NN lock or GC pause.

The Hadoop configurable delay of [HADOOP-15679](https://issues.apache.org/jira/browse/HADOOP-15679) needs to go in. I've increased the default timeout to 30s there for more forgiveness with HDFS, and for object stores with O(data) renames people should configure it with a timeout of minutes, or, if they want to turn it off altogether, hours.

I'm backporting HADOOP-15679 to all branches 2.8.x+, so all Hadoop versions with that timeout will have the timeout configurable and the default time extended.
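To make the ordering and timeout argument in the comment above concrete, here is a minimal sketch in plain Java. It is a toy model, not Hadoop's or Spark's `ShutdownHookManager`: the class `PriorityHookDemo`, the hook names, the priorities and the timeout value are all assumptions chosen for illustration. It shows the two properties discussed: a higher-priority hook runs before the filesystem hook, yet any hook that overruns the per-hook timeout is abandoned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of a priority-ordered hook runner with a per-hook timeout (illustrative only). */
public class PriorityHookDemo {

    /** A named hook; hooks with a higher priority value run first. */
    static final class Hook {
        final String name;
        final int priority;
        final Runnable action;

        Hook(String name, int priority, Runnable action) {
            this.name = name;
            this.priority = priority;
            this.action = action;
        }
    }

    private final List<Hook> hooks = new ArrayList<>();

    void add(String name, int priority, Runnable action) {
        hooks.add(new Hook(name, priority, action));
    }

    /** Runs hooks from highest to lowest priority, waiting at most timeoutMillis for each. */
    List<String> runAll(long timeoutMillis) {
        List<Hook> ordered = new ArrayList<>(hooks);
        ordered.sort(Comparator.comparingInt((Hook h) -> h.priority).reversed());

        List<String> log = new ArrayList<>();
        // A cached pool lets the next hook start even if an abandoned one is still winding down.
        ExecutorService executor = Executors.newCachedThreadPool();
        for (Hook hook : ordered) {
            Future<?> future = executor.submit(hook.action);
            try {
                future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                log.add(hook.name + ": completed");
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the overrunning hook and move on
                log.add(hook.name + ": timed out");
            } catch (ExecutionException e) {
                log.add(hook.name + ": failed");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                log.add(hook.name + ": interrupted");
                break;
            }
        }
        executor.shutdownNow();
        return log;
    }

    /** Registers three hypothetical hooks and runs them with a 500 ms per-hook timeout. */
    static List<String> demo() {
        PriorityHookDemo manager = new PriorityHookDemo();
        // Registered first but with the lowest priority, so it runs last.
        manager.add("filesystem-close", 10, () -> { });
        manager.add("event-log-flush", 50, () -> { });
        // Stands in for a slow rename that cannot finish within the timeout.
        manager.add("slow-rename", 40, () -> {
            try {
                Thread.sleep(5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return manager.runAll(500);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch the event log hook finishes before the filesystem hook regardless of registration order, while the slow hook is cut off at the timeout. That is the gap a configurable, longer timeout is meant to close.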
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22186 The fix itself LGTM, but I don't think this could solve the STS shutdown hook conflict problem with Hadoop.
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22186 Merged build finished. Test PASSed.
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22186 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95385/
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22186

**[Test build #95385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95385/testReport)** for PR 22186 at commit [`fbced52`](https://github.com/apache/spark/commit/fbced52e5687cd5eb6a06c3b9bca5cbeb9343002).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/22186

The latest patch builds locally.

Maven test outcome:
* lots of JSON missing-method errors, clearly Jackson version problems of some kind
* I don't see log messages of the Hive shutdown appearing in the output, though after all the tests finish I do get a log showing that the FS cleanup is going on

```
18/08/28 22:09:58 INFO ShutdownHookManager: Shutdown hook called
18/08/28 22:09:58 INFO ShutdownHookManager: Deleting directory ...spark/sql/hive-thriftserver/target/tmp/
```

I think it might be possible to test whether the shutdown hook was added by calling remove(hook) in a test and verifying that the hook was found, that is, that it was registered. That would need some caching of the hook and a package-level removeHook method in the HiveServer, though wiring it all the way up to a test case would be tricky...
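The "remove the hook to prove it was registered" idea can be sketched as follows. Everything here is hypothetical: `Registry` stands in for a hook manager whose remove call reports whether the hook was found, and `Server` stands in for a server that caches its hook and exposes a package-level `removeHook()`. Neither is the actual HiveServer2 or Spark code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrates verifying hook registration by removing the hook (hypothetical types only). */
public class HookRegistrationCheck {

    /** Minimal registry whose remove() reports whether the hook was present. */
    static final class Registry {
        private final Set<Runnable> hooks = new LinkedHashSet<>();

        void add(Runnable hook) {
            hooks.add(hook);
        }

        boolean remove(Runnable hook) {
            return hooks.remove(hook);
        }
    }

    /** Hypothetical server that caches its hook so a test can reach it later. */
    static final class Server {
        private final Registry registry;
        private Runnable shutdownHook; // cached at start() so it can be removed by identity

        Server(Registry registry) {
            this.registry = registry;
        }

        void start() {
            shutdownHook = () -> System.out.println("server stopping");
            registry.add(shutdownHook);
        }

        /** Package-level hook removal for tests; true only if the hook had been registered. */
        boolean removeHook() {
            return shutdownHook != null && registry.remove(shutdownHook);
        }
    }

    /** Starting the server should leave a removable hook behind. */
    static boolean hookRegisteredAfterStart() {
        Server server = new Server(new Registry());
        server.start();
        return server.removeHook();
    }

    /** Without start(), there is nothing to remove. */
    static boolean hookRegisteredWithoutStart() {
        Server server = new Server(new Registry());
        return server.removeHook();
    }

    public static void main(String[] args) {
        System.out.println("after start: " + hookRegisteredAfterStart());
        System.out.println("without start: " + hookRegisteredWithoutStart());
    }
}
```

The design choice is to cache the exact hook instance: removal is by identity, so a test cannot rebuild an equivalent lambda and expect the registry to find it.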
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22186 Merged build finished. Test PASSed.
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22186 **[Test build #95385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95385/testReport)** for PR 22186 at commit [`fbced52`](https://github.com/apache/spark/commit/fbced52e5687cd5eb6a06c3b9bca5cbeb9343002).
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22186 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2646/
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/22186

My local build wasn't including that module; it now does, and the link works with a subclass of `AbstractFunction0`. The local tests are failing under Maven with a Hive/Jackson mismatch, though. I'm going to consider that a separate issue.

```
#c=cvalue;d=dvalue
- SPARK-16563 ThriftCLIService FetchResults repeat fetching result *** FAILED ***
  java.sql.SQLException: java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse$default$3()Z
  at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest$$anonfun$withMultipleConnectionJdbcStatement$2.apply(HiveThriftServer2Suites.scala:814)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest$$anonfun$withMultipleConnectionJdbcStatement$2.apply(HiveThriftServer2Suites.scala:813)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest.withMultipleConnectionJdbcStatement(HiveThriftServer2Suites.scala:813)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest.withJdbcStatement(HiveThriftServer2Suites.scala:822)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(HiveThriftServer2Suites.scala:100)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(HiveThriftServer2Suites.scala:96)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite.org$apache$spark$sql$hive$thriftserver$HiveThriftBinaryServerSuite$$withCLIServiceClient(HiveThriftServer2Suites.scala:71)
```
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/22186 My local Maven build *did* work, so maybe it's a javac/JVM version thing. Will move back to a Java class callback.
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22186 My local Maven build also failed. I think the problem is that `ShutdownHookManager` is implemented in Scala, so the compiled method signature may be different when invoked from Java. I'm not sure how a Scala anonymous function is translated to Java, but the failure seems to be due to this issue. (Maven has some detailed failure information, whereas SBT doesn't have anything.)
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/22186 Not sure what is up with the build here; it worked with mvn locally. Possibly my use of a Java 8 lambda expression as the hook?