[jira] [Commented] (SPARK-3685) Spark's local dir should accept only local paths
[ https://issues.apache.org/jira/browse/SPARK-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365087#comment-14365087 ] Steve Loughran commented on SPARK-3685: --- YARN-1197 covers resizing existing YARN containers: that would be the real solution to altering the memory footprint of an executor in a container -at least if that JVM can change its heap size down. But that JIRA is dormant; I don't know if anyone is going to pick it up in the near-term. SPARK-1529 looks at switching to the Hadoop FS APIs, but doesn't mandate remote storage: it just makes it possible. Switching to HDFS storage, as Andrew proposes, risks hurting performance: # network traffic, unless the replication factor == 1 (though if you do that, there's only one preferred location for the new container) # disk IO conflict with other HDFS work going on on the local host. # the overhead of going via the TCP stack, unless it is bypassed via unix domain sockets (as HBase does). There's a risk, therefore, that the performance of all work will suffer just to support a single use case: flexing executor container JVM size. That's also ignoring the scheduling risk of the smaller container not being allocated resources. Hooking up the YARN NM shuffle would be the better way to do this. If that shuffle can't handle the wiring-up, it's probably easier to fix that than the whole YARN container-resize problem. Spark's local dir should accept only local paths Key: SPARK-3685 URL: https://issues.apache.org/jira/browse/SPARK-3685 Project: Spark Issue Type: Bug Components: Spark Core, YARN Affects Versions: 1.1.0 Reporter: Andrew Or When you try to set local dirs to hdfs:/tmp/foo it doesn't work. What it will try to do is create a folder called hdfs: and put tmp inside it. This is because in Util#getOrCreateLocalRootDirs we use java.io.File instead of Hadoop's file system to parse this path. We also need to resolve the path appropriately. This may not have an urgent use case, but it fails silently and does what is least expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
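A minimal sketch of the parsing difference Andrew describes (the object name is made up; this is not the actual Utils code, just the behaviour of the two APIs on the same string):
{code}
import java.io.File
import org.apache.hadoop.fs.Path

object LocalDirParsing {
  def main(args: Array[String]): Unit = {
    val dir = "hdfs:/tmp/foo"

    // java.io.File knows nothing about URI schemes: "hdfs:" is just the first
    // component of a relative path, so mkdirs() would create ./hdfs:/tmp/foo on
    // the local disk; this is the silent failure described above.
    val asFile = new File(dir)
    println(asFile.getAbsolutePath)

    // Hadoop's Path parses the scheme out, which is what a "local paths only"
    // check (or a proper resolution via Path.getFileSystem(conf)) would work from.
    val asPath = new Path(dir)
    println(asPath.toUri.getScheme)  // hdfs
    println(asPath.toUri.getPath)    // /tmp/foo
  }
}
{code}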
[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371299#comment-14371299 ] Steve Loughran commented on SPARK-6433: --- Similarly, the original sql package's {{QueryTest}} is ahead of the hive one.
{code}
$ diff ./sql/hive/src/test/scala/org/apache/spark/sql/QueryTest.scala ./sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala
19a20,21
> import java.util.{Locale, TimeZone}
23a26
> import org.apache.spark.sql.columnar.InMemoryRelation
25,31d27
< /**
<  * *** DUPLICATED FROM sql/core. ***
<  *
<  * It is hard to have maven allow one subproject depend on another subprojects test code.
<  * So, we duplicate this code here.
<  */
33a30,34
>   // Timezone is fixed to America/Los_Angeles for those timezone sensitive tests (timestamp_*)
>   TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))
>   // Add Locale setting
>   Locale.setDefault(Locale.US)
37c38
<    * @param rdd the [[DataFrame]] to be executed
---
>    * @param df the [[DataFrame]] to be executed
42,43c43,44
<   def checkExistence(rdd: DataFrame, exists: Boolean, keywords: String*) {
<     val outputs = rdd.collect().map(_.mkString).mkString
---
>   def checkExistence(df: DataFrame, exists: Boolean, keywords: String*) {
>     val outputs = df.collect().map(_.mkString).mkString
46c47
<         assert(outputs.contains(key), s"Failed for $rdd ($key doens't exist in result)")
---
>         assert(outputs.contains(key), s"Failed for $df ($key doesn't exist in result)")
48c49
<         assert(!outputs.contains(key), s"Failed for $rdd ($key existed in the result)")
---
>         assert(!outputs.contains(key), s"Failed for $df ($key existed in the result)")
55c56
<    * @param rdd the [[DataFrame]] to be executed
---
>    * @param df the [[DataFrame]] to be executed
58,59c59,60
<   protected def checkAnswer(rdd: DataFrame, expectedAnswer: Seq[Row]): Unit = {
<     QueryTest.checkAnswer(rdd, expectedAnswer) match {
---
>   protected def checkAnswer(df: DataFrame, expectedAnswer: Seq[Row]): Unit = {
>     QueryTest.checkAnswer(df, expectedAnswer) match {
65,66c66,67
<   protected def checkAnswer(rdd: DataFrame, expectedAnswer: Row): Unit = {
<     checkAnswer(rdd, Seq(expectedAnswer))
---
>   protected def checkAnswer(df: DataFrame, expectedAnswer: Row): Unit = {
>     checkAnswer(df, Seq(expectedAnswer))
73a75,89
>   /**
>    * Asserts that a given [[DataFrame]] will be executed using the given number of cached results.
>    */
>   def assertCached(query: DataFrame, numCachedTables: Int = 1): Unit = {
>     val planWithCaching = query.queryExecution.withCachedData
>     val cachedData = planWithCaching collect {
>       case cached: InMemoryRelation => cached
>     }
>     assert(
>       cachedData.size == numCachedTables,
>       s"Expected query to contain $numCachedTables, but it actually had ${cachedData.size}\n" + planWithCaching)
>   }
82c98
<    * @param rdd the [[DataFrame]] to be executed
---
>    * @param df the [[DataFrame]] to be executed
85,86c101,102
<   def checkAnswer(rdd: DataFrame, expectedAnswer: Seq[Row]): Option[String] = {
<     val isSorted = rdd.logicalPlan.collect { case s: logical.Sort => s }.nonEmpty
---
>   def checkAnswer(df: DataFrame, expectedAnswer: Seq[Row]): Option[String] = {
>     val isSorted = df.logicalPlan.collect { case s: logical.Sort => s }.nonEmpty
97c113
<       if (!isSorted) converted.sortBy(_.toString) else converted
---
>       if (!isSorted) converted.sortBy(_.toString()) else converted
99c115
<     val sparkAnswer = try rdd.collect().toSeq catch {
---
>     val sparkAnswer = try df.collect().toSeq catch {
104c120
<           |${rdd.queryExecution}
---
>           |${df.queryExecution}
116c132
<           |${rdd.logicalPlan}
---
>           |${df.logicalPlan}
118c134
<           |${rdd.queryExecution.analyzed}
---
>           |${df.queryExecution.analyzed}
120c136
<           |${rdd.queryExecution.executedPlan}
---
>           |${df.queryExecution.executedPlan}
124c140
<           prepareAnswer(expectedAnswer).map(_.toString),
---
>           prepareAnswer(expectedAnswer).map(_.toString()),
126c142
<           prepareAnswer(sparkAnswer).map(_.toString)).mkString("\n")}
---
>           prepareAnswer(sparkAnswer).map(_.toString())).mkString("\n")}
134,135c150,151
<   def checkAnswer(rdd: DataFrame, expectedAnswer: java.util.List[Row]): String = {
<     checkAnswer(rdd, expectedAnswer.toSeq) match {
---
>   def checkAnswer(df: DataFrame, expectedAnswer: java.util.List[Row]): String = {
>     checkAnswer(df, expectedAnswer.toSeq) match {
{code}
hive tests to import spark-sql test JAR for QueryTest access Key: SPARK-6433 URL: https://issues.apache.org/jira/browse/SPARK-6433 Project: Spark Issue Type: Improvement Components: Build, SQL Affects Versions: 1.4.0
[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371283#comment-14371283 ] Steve Loughran commented on SPARK-6433: --- They have diverged; the original sql one has added a new method. Switching the hive tests to that version is going to pick up this new method, and eliminate the divergence problem/maintenance work in future.
{code}
diff ./sql/hive/src/test/scala/org/apache/spark/sql/catalyst/plans/PlanTest.scala ./sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/PlanTest.scala
20,22d19
< import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, ExprId}
< import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
< import org.apache.spark.sql.catalyst.util._
24a22,25
> import org.apache.spark.sql.catalyst.expressions._
> import org.apache.spark.sql.catalyst.plans.logical.{NoRelation, Filter, LogicalPlan}
> import org.apache.spark.sql.catalyst.util._
26,29c27
<  * *** DUPLICATED FROM sql/catalyst/plans. ***
<  *
<  * It is hard to have maven allow one subproject depend on another subprojects test code.
<  * So, we duplicate this code here.
---
>  * Provides helper methods for comparing plans.
56a55,59
>   /** Fails the test if the two expressions do not match */
>   protected def compareExpressions(e1: Expression, e2: Expression): Unit = {
>     comparePlans(Filter(e1, NoRelation), Filter(e2, NoRelation))
>   }
{code}
hive tests to import spark-sql test JAR for QueryTest access Key: SPARK-6433 URL: https://issues.apache.org/jira/browse/SPARK-6433 Project: Spark Issue Type: Improvement Components: Build, SQL Affects Versions: 1.4.0 Reporter: Steve Loughran Priority: Minor Original Estimate: 0.5h Remaining Estimate: 0.5h The hive module has its own clone of {{org.apache.spark.sql.QueryPlan}} and {{org.apache.spark.sql.catalyst.plans.PlanTest}} which are copied from the spark-sql module because it's hard to have maven allow one subproject depend on another subproject's test code. It's actually relatively straightforward: # tell maven to build and publish the test JARs # import them in your other sub projects There is one consequence: the JARs will also end up being published to mvn central. This is not really a bad thing; it does help downstream projects pick up the JARs too. It does become an issue if a test run depends on a custom file under {{src/test/resources}} containing things like EC2 authentication keys, or even just log4j.properties files which can interfere with each other. These need to be excluded - the simplest way is to exclude all of the resources from test JARs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371350#comment-14371350 ] Steve Loughran commented on SPARK-6433: --- There's one interesting question: whether or not to shade the test JARs? spark-streaming already generates its test JAR and does shade it. Presumably anything else wanting to use the test JARs downstream is going to expect the same bindings as the release JARs, which implies shading the test JARs everywhere. hive tests to import spark-sql test JAR for QueryTest access Key: SPARK-6433 URL: https://issues.apache.org/jira/browse/SPARK-6433 Project: Spark Issue Type: Improvement Components: Build, SQL Affects Versions: 1.4.0 Reporter: Steve Loughran Priority: Minor Original Estimate: 0.5h Remaining Estimate: 0.5h The hive module has its own clone of {{org.apache.spark.sql.QueryPlan}} and {{org.apache.spark.sql.catalyst.plans.PlanTest}} which are copied from the spark-sql module because it's hard to have maven allow one subproject depend on another subproject's test code. It's actually relatively straightforward: # tell maven to build and publish the test JARs # import them in your other sub projects There is one consequence: the JARs will also end up being published to mvn central. This is not really a bad thing; it does help downstream projects pick up the JARs too. It does become an issue if a test run depends on a custom file under {{src/test/resources}} containing things like EC2 authentication keys, or even just log4j.properties files which can interfere with each other. These need to be excluded - the simplest way is to exclude all of the resources from test JARs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371280#comment-14371280 ] Steve Loughran commented on SPARK-6433: --- ..sorry, I'd missed that previous report. Will push up a patch here once the local build/test is happy. This patch will cull the duplicate spark-hive source files, provided they haven't diverged. hive tests to import spark-sql test JAR for QueryTest access Key: SPARK-6433 URL: https://issues.apache.org/jira/browse/SPARK-6433 Project: Spark Issue Type: Improvement Components: Build, SQL Affects Versions: 1.4.0 Reporter: Steve Loughran Priority: Minor Original Estimate: 0.5h Remaining Estimate: 0.5h The hive module has its own clone of {{org.apache.spark.sql.QueryPlan}} and {{org.apache.spark.sql.catalyst.plans.PlanTest}} which are copied from the spark-sql module because it's hard to have maven allow one subproject depend on another subproject's test code. It's actually relatively straightforward: # tell maven to build and publish the test JARs # import them in your other sub projects There is one consequence: the JARs will also end up being published to mvn central. This is not really a bad thing; it does help downstream projects pick up the JARs too. It does become an issue if a test run depends on a custom file under {{src/test/resources}} containing things like EC2 authentication keys, or even just log4j.properties files which can interfere with each other. These need to be excluded - the simplest way is to exclude all of the resources from test JARs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1200) Make it possible to use unmanaged AM in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14369339#comment-14369339 ] Steve Loughran commented on SPARK-1200: --- You know, we could benefit all YARN apps if the AM launch request could also include a list of container requests to satisfy. This would remove the pipeline of client-AM-containers, and start requesting containers earlier. The allocated container list could just come in as callbacks once the AM is live, as container losses do on AM restart. It means the client would have to come up with an initial assessment of the priority containers to escalate. But publishing that information at launch time would help with the (proposed) Gang scheduling. Make it possible to use unmanaged AM in yarn-client mode Key: SPARK-1200 URL: https://issues.apache.org/jira/browse/SPARK-1200 Project: Spark Issue Type: Improvement Components: YARN Affects Versions: 0.9.0 Reporter: Sandy Pérez González Assignee: Sandy Ryza Using an unmanaged AM in yarn-client mode would allow apps to start up faster, but not requiring the container launcher AM to be launched on the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access
Steve Loughran created SPARK-6433: - Summary: hive tests to import spark-sql test JAR for QueryTest access Key: SPARK-6433 URL: https://issues.apache.org/jira/browse/SPARK-6433 Project: Spark Issue Type: Improvement Components: Build, SQL Affects Versions: 1.4.0 Reporter: Steve Loughran Priority: Minor The hive module has its own clone of {{org.apache.spark.sql.QueryPlan}} and {{org.apache.spark.sql.catalyst.plans.PlanTest}} which are copied from the spark-sql module because it's hard to have maven allow one subproject depend on another subproject's test code. It's actually relatively straightforward: # tell maven to build and publish the test JARs # import them in your other sub projects There is one consequence: the JARs will also end up being published to mvn central. This is not really a bad thing; it does help downstream projects pick up the JARs too. It does become an issue if a test run depends on a custom file under {{src/test/resources}} containing things like EC2 authentication keys, or even just log4j.properties files which can interfere with each other. These need to be excluded - the simplest way is to exclude all of the resources from test JARs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-6389) YARN app diagnostics report doesn't report NPEs
Steve Loughran created SPARK-6389: - Summary: YARN app diagnostics report doesn't report NPEs Key: SPARK-6389 URL: https://issues.apache.org/jira/browse/SPARK-6389 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.3.0 Reporter: Steve Loughran Priority: Trivial {{ApplicationMaster.run()}} catches exceptions and calls {{toMessage()}} to get their message included in the YARN diagnostics report visible in the RM UI. Except, NPEs don't have a message —if one is raised their report becomes {{Uncaught exception: null}}, which isn't that useful. The full text stack trace is logged correctly in the AM. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
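A minimal sketch of the kind of fix this needs: fall back to the exception's {{toString()}} (which always includes the class name) when {{getMessage()}} is null. This is just the idea, not the actual ApplicationMaster code:
{code}
object Diagnostics {
  /** Build a diagnostics string that stays useful even for message-less exceptions like NPEs. */
  def toMessage(e: Throwable): String = {
    // getMessage is null for a bare NullPointerException; toString still names the class.
    val detail = Option(e.getMessage).getOrElse(e.toString)
    s"Uncaught exception: $detail"
  }

  def main(args: Array[String]): Unit = {
    println(toMessage(new NullPointerException()))    // Uncaught exception: java.lang.NullPointerException
    println(toMessage(new RuntimeException("boom")))  // Uncaught exception: boom
  }
}
{code}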
[jira] [Commented] (SPARK-3351) Yarn YarnRMClientImpl.shutdown can be called before register - NPE
[ https://issues.apache.org/jira/browse/SPARK-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365419#comment-14365419 ] Steve Loughran commented on SPARK-3351: --- Thomas: what Yarn version was this against? There's been a guard against this in {{FinishApplicationMasterRequestPBImpl.setTrackingUrl()}} since YARN-918 2.1.0-beta. Yarn YarnRMClientImpl.shutdown can be called before register - NPE -- Key: SPARK-3351 URL: https://issues.apache.org/jira/browse/SPARK-3351 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.2.0 Reporter: Thomas Graves If the SparkContext exits while its in the applicationmaster.waitForSparkContextInitialized then the YarnRMClientImpl.shutdown can be called before register and you get a null pointer exception on the uihistoryAddress. 14/09/02 18:59:21 INFO ApplicationMaster: Finishing ApplicationMaster with FAILED (diag message: Timed out waiting for SparkContext.) Exception in thread main java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnServiceProtos$FinishApplicationMasterRequestProto$Builder.setTrackingUrl(YarnServiceProtos.java:2312) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.FinishApplicationMasterRequestPBImpl.setTrackingUrl(FinishApplicationMasterRequestPBImpl.java:121) at org.apache.spark.deploy.yarn.YarnRMClientImpl.shutdown(YarnRMClientImpl.scala:73) at org.apache.spark.deploy.yarn.ApplicationMaster.finish(ApplicationMaster.scala:140) at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:178) at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:113) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3351) Yarn YarnRMClientImpl.shutdown can be called before register - NPE
[ https://issues.apache.org/jira/browse/SPARK-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365547#comment-14365547 ] Steve Loughran commented on SPARK-3351: --- The NPE is certainly gone. I don't know about any other side-effects. Yarn YarnRMClientImpl.shutdown can be called before register - NPE -- Key: SPARK-3351 URL: https://issues.apache.org/jira/browse/SPARK-3351 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.2.0 Reporter: Thomas Graves If the SparkContext exits while its in the applicationmaster.waitForSparkContextInitialized then the YarnRMClientImpl.shutdown can be called before register and you get a null pointer exception on the uihistoryAddress. 14/09/02 18:59:21 INFO ApplicationMaster: Finishing ApplicationMaster with FAILED (diag message: Timed out waiting for SparkContext.) Exception in thread main java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnServiceProtos$FinishApplicationMasterRequestProto$Builder.setTrackingUrl(YarnServiceProtos.java:2312) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.FinishApplicationMasterRequestPBImpl.setTrackingUrl(FinishApplicationMasterRequestPBImpl.java:121) at org.apache.spark.deploy.yarn.YarnRMClientImpl.shutdown(YarnRMClientImpl.scala:73) at org.apache.spark.deploy.yarn.ApplicationMaster.finish(ApplicationMaster.scala:140) at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:178) at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:113) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6568) spark-shell.cmd --jars option does not accept the jar that has space in its path
[ https://issues.apache.org/jira/browse/SPARK-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386552#comment-14386552 ] Steve Loughran commented on SPARK-6568: --- Can you show the full stack trace? spark-shell.cmd --jars option does not accept the jar that has space in its path Key: SPARK-6568 URL: https://issues.apache.org/jira/browse/SPARK-6568 Project: Spark Issue Type: Bug Components: Spark Core, Windows Affects Versions: 1.3.0 Environment: Windows 8.1 Reporter: Masayoshi TSUZUKI spark-shell.cmd --jars option does not accept a jar that has a space in its path. The path of a jar sometimes contains spaces on Windows.
{code}
bin\spark-shell.cmd --jars C:\Program Files\some\jar1.jar
{code}
this gets
{code}
Exception in thread "main" java.net.URISyntaxException: Illegal character in path at index 10: C:/Program Files/some/jar1.jar
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
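Not Spark code, just a small illustration of why a path with a space blows up when parsed directly as a URI, and how the same path becomes legal once percent-encoded (the jar path below is made up):
{code}
import java.io.File
import java.net.URI

object JarPathWithSpaces {
  def main(args: Array[String]): Unit = {
    val jar = "C:/Program Files/some/jar1.jar"   // hypothetical path containing a space

    // Parsing the raw string as a URI fails: the space is an illegal character,
    // at exactly the index reported in the bug above.
    try { new URI(jar) } catch {
      case e: java.net.URISyntaxException => println(s"raw URI parse fails: ${e.getMessage}")
    }

    // Going via java.io.File percent-encodes the space, giving a legal file: URI
    // (on Windows this prints file:/C:/Program%20Files/some/jar1.jar).
    println(new File(jar).toURI)
  }
}
{code}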
[jira] [Commented] (SPARK-799) Windows versions of the deploy scripts
[ https://issues.apache.org/jira/browse/SPARK-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386571#comment-14386571 ] Steve Loughran commented on SPARK-799: -- Providing python versions of the launcher scripts is probably a better approach to supporting windows than .cmd or .ps1 files: # {{cmd}} is a painfully dated shell language, as you note. # Powershell is better, but not widely known in the java/scala dev space. # Neither ps1 nor cmd files can be tested except on Windows. Python is cross-platform-ish enough that it can be tested on Unix systems too, and is more likely to be maintained. It's not seamlessly cross-platform; propagating stdout/stderr from spawned java processes is one of the rough edges. It also provides the option of becoming the Unix entry point (the bash script simply invoking it), so that maintenance effort is shared, and testing becomes even more implicit. Windows versions of the deploy scripts -- Key: SPARK-799 URL: https://issues.apache.org/jira/browse/SPARK-799 Project: Spark Issue Type: Bug Components: Deploy, Windows Reporter: Matei Zaharia Labels: Starter Although the Spark daemons run fine on Windows with run.cmd, the deploy scripts (bin/start-all.sh and such) don't do so unless you have Cygwin. It would be nice to make .cmd versions of those. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2356) Exception: Could not locate executable null\bin\winutils.exe in the Hadoop
[ https://issues.apache.org/jira/browse/SPARK-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386585#comment-14386585 ] Steve Loughran commented on SPARK-2356: --- It's coming from {{UserGroupInformation.setConfiguration(conf)}}; UGI is using Hadoop's {{StringUtils}} to do something, which then inits a static variable
{code}
public static final Pattern ENV_VAR_PATTERN = Shell.WINDOWS ? WIN_ENV_VAR_PATTERN : SHELL_ENV_VAR_PATTERN;
{code}
And Hadoop's {{Shell}} utility does some work in its static initialization which depends on winutils.exe being on the path. Convoluted, but there you go. HADOOP-11293 proposes factoring out the {{Shell.WINDOWS}} code into something standalone... if that can be pushed into Hadoop 2.8 then this problem will go away from then on. Exception: Could not locate executable null\bin\winutils.exe in the Hadoop --- Key: SPARK-2356 URL: https://issues.apache.org/jira/browse/SPARK-2356 Project: Spark Issue Type: Bug Components: Windows Affects Versions: 1.0.0 Reporter: Kostiantyn Kudriavtsev Priority: Critical I'm trying to run some transformation on Spark, it works fine on cluster (YARN, linux machines). However, when I'm trying to run it on local machine (Windows 7) under unit test, I got errors (I don't use Hadoop, I'm reading the file from the local filesystem):
{code}
14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
    at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
    at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
    at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
{code}
This happens because the Hadoop config is initialized each time a spark context is created, regardless of whether hadoop is required or not. I propose to add some special flag to indicate if the hadoop config is required (or to start this configuration manually) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
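For anyone hitting this in local unit tests before the Hadoop-side changes land, a sketch of the usual workaround: point the {{hadoop.home.dir}} system property (or the HADOOP_HOME environment variable) at a directory whose {{bin/}} contains winutils.exe before the first SparkContext is created. The local path below is made up:
{code}
import org.apache.spark.{SparkConf, SparkContext}

object WinutilsWorkaround {
  def main(args: Array[String]): Unit = {
    // Hypothetical local install: C:\hadoop\bin\winutils.exe must exist.
    if (System.getProperty("os.name").toLowerCase.contains("windows")) {
      System.setProperty("hadoop.home.dir", "C:\\hadoop")
    }

    // Hadoop's Shell class reads hadoop.home.dir/HADOOP_HOME in its static init,
    // so the property has to be set before Spark touches any Hadoop code.
    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("winutils-test"))
    try {
      println(sc.parallelize(1 to 10).count())
    } finally {
      sc.stop()
    }
  }
}
{code}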
[jira] [Commented] (SPARK-6646) Spark 2.0: Rearchitecting Spark for Mobile Platforms
[ https://issues.apache.org/jira/browse/SPARK-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14390373#comment-14390373 ] Steve Loughran commented on SPARK-6646: --- Obviously the barrier will be data source access; talking to remote data is going to run up bills. # couchdb has an offline mode, so its RDD/Dataframe support would allow spark-mobile to work in embedded mode. # Hadoop 2.8 adds hardware CRC on ARM parts for HDFS (HADOOP-11660). A {{MiniHDFSCluster}} could be instantiated locally to benefit from this. # alternatively, mDNS could be used to discover and dynamically build up an HDFS cluster from nearby devices, MANET-style. The limited connectivity guarantees of moving devices mean that a block size of 1536 bytes would be appropriate; probably 1KB blocks are safest. # Those nodes on the network with limited CPU power but access to external power supplies, such as toasters and coffee machines, could have a role as the persistent co-ordinators of work and HDFS Namenodes, as well as being used as the preferred routers of wifi packets. # It may be necessary to extend the hadoop {{s3://}} filesystem with the notion of monthly data quotas. Possibly even roaming and non-roaming quotas. The S3 client would need to query the runtime to determine whether it was at home vs roaming, and use the relevant quota. Apps could then set something like
{code}
fs.s3.quota.home=15GB
fs.s3.quota.roaming=2GB
{code}
Dealing with use abroad would be more complex, as if a cost value were to be included, exchange rates would have to be dynamically assessed. # It may be interesting to consider the notion of having devices publish some of their data (photos, healthkit history, movement history) to other devices nearby. If one phone could enumerate those nearby **and submit work to them**, the bandwidth problems could be addressed. Spark 2.0: Rearchitecting Spark for Mobile Platforms Key: SPARK-6646 URL: https://issues.apache.org/jira/browse/SPARK-6646 Project: Spark Issue Type: Improvement Components: Project Infra Reporter: Reynold Xin Assignee: Reynold Xin Priority: Blocker Attachments: Spark on Mobile - Design Doc - v1.pdf Mobile computing is quickly rising to dominance, and by the end of 2017, it is estimated that 90% of CPU cycles will be devoted to mobile hardware. Spark’s project goal can be accomplished only when Spark runs efficiently for the growing population of mobile users. Designed and optimized for modern data centers and Big Data applications, Spark is unfortunately not a good fit for mobile computing today. In the past few months, we have been prototyping the feasibility of a mobile-first Spark architecture, and today we would like to share with you our findings. This ticket outlines the technical design of Spark’s mobile support, and shares results from several early prototypes. Mobile friendly version of the design doc: https://databricks.com/blog/2015/04/01/spark-2-rearchitecting-spark-for-mobile.html -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385890#comment-14385890 ] Steve Loughran commented on SPARK-1537: --- # I've just tried to see where YARN-2444 stands; I can't replicate it in trunk but I've submitted the tests to verify that it isn't there. # for YARN-2423 Spark seems kind of trapped. It needs an API tagged as public/stable; Robert's patch has the API, except it's being rejected on the basis that ATSv2 will break it. So it can't be tagged as stable. So there's no API for GET operations until some undefined time {{t1 > now()}} —and then, only for Hadoop versions with it. Which implies it won't get picked up by Spark for a long time. I think we need to talk to the YARN dev team and see what can be done here. Even if there's no API client bundled into YARN, unless the v1 API and its paths beginning {{/ws/v1/timeline/}} are going to go away, then a REST client is possible; it may just have to be done spark-side, where at least it can be made resilient to hadoop versions. Add integration with Yarn's Application Timeline Server --- Key: SPARK-1537 URL: https://issues.apache.org/jira/browse/SPARK-1537 Project: Spark Issue Type: New Feature Components: YARN Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: SPARK-1537.txt, spark-1573.patch It would be nice to have Spark integrate with Yarn's Application Timeline Server (see YARN-321, YARN-1530). This would allow users running Spark on Yarn to have a single place to go for all their history needs, and avoid having to manage a separate service (Spark's built-in server). At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, although there is still some ongoing work. But the basics are there, and I wouldn't expect them to change (much) at this point. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
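To illustrate the "do it spark-side" option: a REST client against the v1 paths doesn't need any YARN client library, just an HTTP GET. A minimal sketch, assuming an unsecured timeline server at a made-up host/port and the {{/ws/v1/timeline/}} prefix; entity types, JSON parsing and SPNEGO/token handling are left out:
{code}
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object TimelineGet {
  def main(args: Array[String]): Unit = {
    // Hypothetical ATS address; a real client would read it from yarn-site.xml
    // (yarn.timeline-service.webapp.address) and deal with security.
    val url = new URL("http://timelineserver.example.org:8188/ws/v1/timeline/")
    val conn = url.openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("GET")
    conn.setRequestProperty("Accept", "application/json")
    try {
      val body = Source.fromInputStream(conn.getInputStream).mkString
      println(s"HTTP ${conn.getResponseCode}: $body")
    } finally {
      conn.disconnect()
    }
  }
}
{code}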
[jira] [Commented] (SPARK-6479) Create off-heap block storage API (internal)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384520#comment-14384520 ] Steve Loughran commented on SPARK-6479: --- Henry: utterly unrelated. I was merely offering to help define this API more formally and derive tests from it. Create off-heap block storage API (internal) Key: SPARK-6479 URL: https://issues.apache.org/jira/browse/SPARK-6479 Project: Spark Issue Type: Improvement Components: Block Manager, Spark Core Reporter: Reynold Xin Attachments: SparkOffheapsupportbyHDFS.pdf Would be great to create APIs for off-heap block stores, rather than doing a bunch of if statements everywhere. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
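Purely as a strawman of what "defining the API more formally" could mean: this is not the proposed Spark interface, just a hypothetical sketch (all names invented) of the kind of contract an off-heap block store would need to pin down and be tested against:
{code}
import java.nio.ByteBuffer

/**
 * Hypothetical contract for an external/off-heap block store, written only to show
 * the operations such an API would have to specify precisely (atomicity of put,
 * behaviour of get on a missing block, what remove returns, and so on).
 */
trait OffHeapBlockStore {
  /** Store the buffer under the given block id, replacing any existing value. */
  def put(blockId: String, data: ByteBuffer): Unit

  /** Return the block's bytes, or None if it was never stored or has been removed. */
  def get(blockId: String): Option[ByteBuffer]

  /** Remove the block; returns true iff something was actually removed. */
  def remove(blockId: String): Boolean

  /** Total bytes currently held, for reporting and eviction decisions. */
  def size: Long
}
{code}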
[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503752#comment-14503752 ] Steve Loughran commented on SPARK-7009: --- most of the others seemed fixed by documentation patches... Build assembly JAR via ant to avoid zip64 problems -- Key: SPARK-7009 URL: https://issues.apache.org/jira/browse/SPARK-7009 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.0 Environment: Java 7+ Reporter: Steve Loughran Original Estimate: 2h Remaining Estimate: 2h SPARK-1911 shows the problem that JDK7+ is using zip64 to build large JARs; a format incompatible with Java and pyspark. Provided the total number of .class files+resources is under 64K, ant can be used to make the final JAR instead, perhaps by unzipping the maven-generated JAR then rezipping it with zip64=never, before publishing the artifact via maven. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502657#comment-14502657 ] Steve Loughran commented on SPARK-7009: --- It's only 30 lines of diff including the antrun plugin config; trivial compared to the shade plugin itself. As you note though, it's not enough: there are more than 64K .class files. Which means that the "use java6 to compile" warning note of SPARK-1911 probably isn't going to work either, unless a java6 build includes fewer classes in the shaded jar. Build assembly JAR via ant to avoid zip64 problems -- Key: SPARK-7009 URL: https://issues.apache.org/jira/browse/SPARK-7009 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.0 Environment: Java 7+ Reporter: Steve Loughran Original Estimate: 2h Remaining Estimate: 2h SPARK-1911 shows the problem that JDK7+ is using zip64 to build large JARs; a format incompatible with Java and pyspark. Provided the total number of .class files+resources is under 64K, ant can be used to make the final JAR instead, perhaps by unzipping the maven-generated JAR then rezipping it with zip64=never, before publishing the artifact via maven. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
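A quick way to check whether an assembly has crossed the limit (a sketch; the jar path is a placeholder). The classic zip end-of-central-directory record only has a 16-bit entry count, so anything over 65535 entries forces the builder into zip64:
{code}
import java.util.zip.ZipFile

object AssemblyEntryCount {
  def main(args: Array[String]): Unit = {
    // Placeholder path to the assembly under test.
    val jar = new ZipFile("assembly/target/spark-assembly.jar")
    try {
      val entries = jar.size()   // number of entries in the archive
      println(s"$entries entries; zip64 needed: ${entries > 0xFFFF}")
    } finally {
      jar.close()
    }
  }
}
{code}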
[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502675#comment-14502675 ] Steve Loughran commented on SPARK-7009: --- Looking at the [openJDK issue|https://bugs.openjdk.java.net/browse/JDK-4828461], Java6 appears to be generating a header/footer that stops at 64K, and doesn't bother reading that header when enumerating the zip file. Java 7 (presumably) handles reads the same way, but uses zip64 to generate the artifacts. Ant can be told not to generate zip64 files, but it does zip16 properly, rejecting source filesets that are too large. There isn't an obvious/immediate solution for this on Java7+, except to extend Ant to generate the same hacked zip files, then wait for that to trickle into the maven ant-run plugin, which would be about 3+ months after ant 1.9.x ships. That's a long term project, though something to consider starting now, to get the feature later in 2015. Build assembly JAR via ant to avoid zip64 problems -- Key: SPARK-7009 URL: https://issues.apache.org/jira/browse/SPARK-7009 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.0 Environment: Java 7+ Reporter: Steve Loughran Original Estimate: 2h Remaining Estimate: 2h SPARK-1911 shows the problem that JDK7+ is using zip64 to build large JARs; a format incompatible with Java and pyspark. Provided the total number of .class files+resources is under 64K, ant can be used to make the final JAR instead, perhaps by unzipping the maven-generated JAR then rezipping it with zip64=never, before publishing the artifact via maven. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1911) Warn users if their assembly jars are not built with Java 6
[ https://issues.apache.org/jira/browse/SPARK-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502589#comment-14502589 ] Steve Loughran commented on SPARK-1911: --- This doesn't fix the problem, merely documents it. It should be doable by using Ant's zip task, which doesn't use the JDK zip routines. The assembly would be unzipped first, then rezipped with the zip64 option set to never; see [https://ant.apache.org/manual/Tasks/zip.html] Warn users if their assembly jars are not built with Java 6 --- Key: SPARK-1911 URL: https://issues.apache.org/jira/browse/SPARK-1911 Project: Spark Issue Type: Bug Components: Documentation Affects Versions: 1.1.0 Reporter: Andrew Or Assignee: Sean Owen Fix For: 1.2.2, 1.3.0 The root cause of the problem is detailed in: https://issues.apache.org/jira/browse/SPARK-1520. In short, an assembly jar built with Java 7+ is not always accessible by Python or other versions of Java (especially Java 6). If the assembly jar is not built on the cluster itself, this problem may manifest itself in strange exceptions that are not trivial to debug. This is an issue especially for PySpark on YARN, which relies on the python files included within the assembly jar. Currently we warn users only in make-distribution.sh, but most users build the jars directly. At the very least we need to emphasize this in the docs (currently missing entirely). The next step is to add a warning prompt in the mvn scripts whenever Java 7+ is detected. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504639#comment-14504639 ] Steve Loughran commented on SPARK-7009: --- yes, what we are trying to do is come up with some solution to the task of: building on java7 while producing a JAR that pyspark can handle. The #5592 patch does not do it, because ant's {{zip}} task does not (currently) generate the java6-style zip file. Build assembly JAR via ant to avoid zip64 problems -- Key: SPARK-7009 URL: https://issues.apache.org/jira/browse/SPARK-7009 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.0 Environment: Java 7+ Reporter: Steve Loughran Original Estimate: 2h Remaining Estimate: 2h SPARK-1911 shows the problem that JDK7+ is using zip64 to build large JARs; a format incompatible with Java and pyspark. Provided the total number of .class files+resources is under 64K, ant can be used to make the final JAR instead, perhaps by unzipping the maven-generated JAR then rezipping it with zip64=never, before publishing the artifact via maven. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503510#comment-14503510 ] Steve Loughran commented on SPARK-7009: --- ...so, open a JIRA for that and close this as WONTFIX? Build assembly JAR via ant to avoid zip64 problems -- Key: SPARK-7009 URL: https://issues.apache.org/jira/browse/SPARK-7009 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.0 Environment: Java 7+ Reporter: Steve Loughran Original Estimate: 2h Remaining Estimate: 2h SPARK-1911 shows the problem that JDK7+ is using zip64 to build large JARs; a format incompatible with Java and pyspark. Provided the total number of .class files+resources is under 64K, ant can be used to make the final JAR instead, perhaps by unzipping the maven-generated JAR then rezipping it with zip64=never, before publishing the artifact via maven. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1437) Jenkins should build with Java 6
[ https://issues.apache.org/jira/browse/SPARK-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525761#comment-14525761 ] Steve Loughran commented on SPARK-1437: --- ..be good for the pull request test runs to use java6 too; I've already had to purge some java7-isms that didn't get picked up Jenkins should build with Java 6 Key: SPARK-1437 URL: https://issues.apache.org/jira/browse/SPARK-1437 Project: Spark Issue Type: Bug Components: Build, Project Infra Affects Versions: 0.9.0 Reporter: Sean Owen Priority: Minor Labels: javac, jenkins Attachments: Screen Shot 2014-04-07 at 22.53.56.png Apologies if this was already on someone's to-do list, but I wanted to track this, as it bit two commits in the last few weeks. Spark is intended to work with Java 6, and so compiles with source/target 1.6. Java 7 can correctly enforce Java 6 language rules and emit Java 6 bytecode. However, unless otherwise configured with -bootclasspath, javac will use its own (Java 7) library classes. This means code that uses classes in Java 7 will be allowed to compile, but the result will fail when run on Java 6. This is why you get warnings like ... Using /usr/java/jdk1.7.0_51 as default JAVA_HOME. ... [warn] warning: [options] bootstrap class path not set in conjunction with -source 1.6 The solution is just to tell Jenkins to use Java 6. This may be stating the obvious, but it should just be a setting under Configure for SparkPullRequestBuilder. In our Jenkinses, JDK 6/7/8 are set up; if it's not an option already I'm guessing it's not too hard to get Java 6 configured on the Amplab machines. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533315#comment-14533315 ] Steve Loughran commented on SPARK-5754: --- I had a look at what we did for slider via our [JavaCommandLineBuilder|https://github.com/apache/incubator-slider/blob/develop/slider-core/src/main/java/org/apache/slider/core/launch/JavaCommandLineBuilder.java] We just seem to be building up the full command line for %JAVA_HOME%/bin/java without needing spaces between things. (what we do have is SLIDER-87, commas in JVM args on unix...) Hadoop on windows starts a container process in the native file {{service.c}}, which uses the windows {{CreateProcess(}} call, there's some coverage of its quotation logic in [MSDN|https://msdn.microsoft.com/en-us/library/windows/desktop/ms682425%28v=vs.85%29.aspx] one possibility here is that java home has a space in and that is causing confusion. [~goiri] -what is your JAVA_HOME value? Spark AM not launching on Windows - Key: SPARK-5754 URL: https://issues.apache.org/jira/browse/SPARK-5754 Project: Spark Issue Type: Bug Components: Windows, YARN Affects Versions: 1.1.1, 1.2.0 Environment: Windows Server 2012, Hadoop 2.4.1. Reporter: Inigo I'm trying to run Spark Pi on a YARN cluster running on Windows and the AM container fails to start. The problem seems to be in the generation of the YARN command which adds single quotes (') surrounding some of the java options. In particular, the part of the code that is adding those is the escapeForShell function in YarnSparkHadoopUtil. Apparently, Windows does not like the quotes for these options. Here is an example of the command that the container tries to execute: @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp '-Dspark.yarn.secondary.jars=' '-Dspark.app.name=org.apache.spark.examples.SparkPi' '-Dspark.master=yarn-cluster' org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar' --executor-memory 1024 --executor-cores 1 --num-executors 2 Once I transform it into: @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp -Dspark.yarn.secondary.jars= -Dspark.app.name=org.apache.spark.examples.SparkPi -Dspark.master=yarn-cluster org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar' --executor-memory 1024 --executor-cores 1 --num-executors 2 Everything seems to start. How should I deal with this? Creating a separate function like escapeForShell for Windows and call it whenever I detect this is for Windows? Or should I add some sanity check on YARN? I checked a little and there seems to be people that is able to run Spark on YARN on Windows, so it might be something else. I didn't find anything related on Jira either. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5034) Spark on Yarn launch failure on HDInsight on Windows
[ https://issues.apache.org/jira/browse/SPARK-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533349#comment-14533349 ] Steve Loughran commented on SPARK-5034: --- looks related to SPARK-5754 ; linking Spark on Yarn launch failure on HDInsight on Windows Key: SPARK-5034 URL: https://issues.apache.org/jira/browse/SPARK-5034 Project: Spark Issue Type: Bug Components: Windows, YARN Affects Versions: 1.1.0, 1.1.1, 1.2.0 Environment: Spark on Yarn within HDInsight on Windows Azure Reporter: Rice Windows Environment I'm trying to run JavaSparkPi example on YARN with master = yarn-client but I have a problem. It runs smoothly with submitting application, first container for Application Master works too. When job is starting and there are some tasks to do I'm getting this warning on console (I'm using windows cmd if this makes any difference): WARN cluster.YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory When I'm checking logs for container with Application Masters it is launching containers for executors properly, then goes with: INFO YarnAllocationHandler: Completed container container_1409217202587_0003_01_02 (state: COMPLETE, exit status: 1) INFO YarnAllocationHandler: Container marked as failed: container_1409217202587_0003_01_02 And tries to re-launch them. On failed container log there is only this: Error: Could not find or load main class pwd..sp...@gbv06758291.my.secret.address.net:63680.user.CoarseGrainedScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533360#comment-14533360 ] Steve Loughran commented on SPARK-5754: --- I think windows will need its own escaping logic -once we can work out what it is. Something like that should really go into hadoop common (with a test); then it can be ported into spark until the versions are in sync. Spark AM not launching on Windows - Key: SPARK-5754 URL: https://issues.apache.org/jira/browse/SPARK-5754 Project: Spark Issue Type: Bug Components: Windows, YARN Affects Versions: 1.1.1, 1.2.0 Environment: Windows Server 2012, Hadoop 2.4.1. Reporter: Inigo I'm trying to run Spark Pi on a YARN cluster running on Windows and the AM container fails to start. The problem seems to be in the generation of the YARN command which adds single quotes (') surrounding some of the java options. In particular, the part of the code that is adding those is the escapeForShell function in YarnSparkHadoopUtil. Apparently, Windows does not like the quotes for these options. Here is an example of the command that the container tries to execute: @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp '-Dspark.yarn.secondary.jars=' '-Dspark.app.name=org.apache.spark.examples.SparkPi' '-Dspark.master=yarn-cluster' org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar' --executor-memory 1024 --executor-cores 1 --num-executors 2 Once I transform it into: @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp -Dspark.yarn.secondary.jars= -Dspark.app.name=org.apache.spark.examples.SparkPi -Dspark.master=yarn-cluster org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar' --executor-memory 1024 --executor-cores 1 --num-executors 2 Everything seems to start. How should I deal with this? Creating a separate function like escapeForShell for Windows and call it whenever I detect this is for Windows? Or should I add some sanity check on YARN? I checked a little and there seems to be people that is able to run Spark on YARN on Windows, so it might be something else. I didn't find anything related on Jira either. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
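To make "its own escaping logic" concrete, here is a hypothetical sketch of what a Windows-side counterpart to {{escapeForShell}} might look like; cmd.exe quoting has plenty of corner cases, so this is illustrative rather than a drop-in fix, and the function name is invented:
{code}
object WindowsCommandEscaping {
  /**
   * Hypothetical Windows variant of YarnSparkHadoopUtil.escapeForShell: wrap the
   * argument in double quotes (which cmd.exe accepts, unlike single quotes) and
   * escape embedded quotes and carets. Not exhaustive; just the shape of the fix.
   */
  def escapeForWindowsShell(arg: String): String = {
    val escaped = arg.flatMap {
      case '"' => "\\\""
      case '^' => "^^"
      case c   => c.toString
    }
    "\"" + escaped + "\""
  }

  def main(args: Array[String]): Unit = {
    // Prints the option wrapped in double quotes instead of the single quotes
    // that the YARN launch command currently emits.
    println(escapeForWindowsShell("-Dspark.app.name=org.apache.spark.examples.SparkPi"))
  }
}
{code}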
[jira] [Created] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors
Steve Loughran created SPARK-7481: - Summary: Add Hadoop 2.6+ profile to pull in object store FS accessors Key: SPARK-7481 URL: https://issues.apache.org/jira/browse/SPARK-7481 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.1 Reporter: Steve Loughran To keep the s3n classpath right, and to add s3a, swift and azure, the dependencies of spark in a 2.6+ profile need to add the relevant object store packages (hadoop-aws, hadoop-openstack, hadoop-azure). This adds more stuff to the client bundle, but will mean a single spark package can talk to all of the stores. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14534174#comment-14534174 ] Steve Loughran commented on SPARK-7481: --- This doesn't contain any endorsement of the use of s3a in Hadoop 2.6; see HADOOP-11571. I'm not planning to add any tests for this, but it's something to consider for regression testing all the object stores —the tests just need to: * be skipped if there's no credentials * make a best effort to stop anyone accidentally checking in their credentials * work on desktop/jenkins rather than just in the cloud * not run up massive bills * not take forever AWS publishes some free-to-read datasets, such as [this one|http://datasets.elasticmapreduce.s3.amazonaws.com/] which won't need credentials, works remotely and doesn't ring up bills for the read part of the process, but would take a long time to complete on a single executor. Add Hadoop 2.6+ profile to pull in object store FS accessors Key: SPARK-7481 URL: https://issues.apache.org/jira/browse/SPARK-7481 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.1 Reporter: Steve Loughran To keep the s3n classpath right, and to add s3a, swift and azure, the dependencies of spark in a 2.6+ profile need to add the relevant object store packages (hadoop-aws, hadoop-openstack, hadoop-azure). This adds more stuff to the client bundle, but will mean a single spark package can talk to all of the stores. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
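On the "skipped if there's no credentials" point, a sketch of how such a test could gate itself, assuming ScalaTest and the standard AWS environment variables; the bucket and object names are made up and this is not an actual Spark test:
{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.FunSuite

class S3AReadSuite extends FunSuite {

  test("read a small object store dataset when credentials are present") {
    // Cancel (not fail) the test when no credentials are configured.
    assume(sys.env.contains("AWS_ACCESS_KEY_ID") && sys.env.contains("AWS_SECRET_ACCESS_KEY"),
      "AWS credentials not set; skipping object store test")

    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("s3a-read"))
    try {
      sc.hadoopConfiguration.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
      sc.hadoopConfiguration.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
      // Hypothetical small object: keep it tiny so the run is cheap and fast.
      val lines = sc.textFile("s3a://some-test-bucket/small-sample.csv").count()
      assert(lines > 0)
    } finally {
      sc.stop()
    }
  }
}
{code}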
[jira] [Commented] (SPARK-1423) Add scripts for launching Spark on Windows Azure
[ https://issues.apache.org/jira/browse/SPARK-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533351#comment-14533351 ] Steve Loughran commented on SPARK-1423: --- should this be resolved as WONTFIX? Add scripts for launching Spark on Windows Azure Key: SPARK-1423 URL: https://issues.apache.org/jira/browse/SPARK-1423 Project: Spark Issue Type: Improvement Components: Windows Reporter: Matei Zaharia -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5034) Spark on Yarn launch failure on HDInsight on Windows
[ https://issues.apache.org/jira/browse/SPARK-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-5034: -- Summary: Spark on Yarn launch failure on HDInsight on Windows (was: Spark on Yarn within HDInsight on Windows) Spark on Yarn launch failure on HDInsight on Windows Key: SPARK-5034 URL: https://issues.apache.org/jira/browse/SPARK-5034 Project: Spark Issue Type: Bug Components: Windows, YARN Affects Versions: 1.1.0, 1.1.1, 1.2.0 Environment: Spark on Yarn within HDInsight on Windows Azure Reporter: Rice Windows Environment I'm trying to run JavaSparkPi example on YARN with master = yarn-client but I have a problem. It runs smoothly with submitting application, first container for Application Master works too. When job is starting and there are some tasks to do I'm getting this warning on console (I'm using windows cmd if this makes any difference): WARN cluster.YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory When I'm checking logs for container with Application Masters it is launching containers for executors properly, then goes with: INFO YarnAllocationHandler: Completed container container_1409217202587_0003_01_02 (state: COMPLETE, exit status: 1) INFO YarnAllocationHandler: Container marked as failed: container_1409217202587_0003_01_02 And tries to re-launch them. On failed container log there is only this: Error: Could not find or load main class pwd..sp...@gbv06758291.my.secret.address.net:63680.user.CoarseGrainedScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6961) Cannot save data to parquet files when executing from Windows from a Maven Project
[ https://issues.apache.org/jira/browse/SPARK-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533340#comment-14533340 ] Steve Loughran commented on SPARK-6961: --- I know what the cause is; it's something where Hadoop isn't reporting the problem properly. Hadoop on windows uses a native library to work with the native FS (there are plans to move to nativeio with the switch to java7, but it's not in yet). If winutils.exe isn't on the path the command doesn't work (it's a separate bug that hadoop doesn't report this properly). Workarounds: # get the bin/ dir off an HDP install and stick it in a directory, set HADOOP_HOME to one directory above. # build hadoop locally (not recommended unless you really, really want to) # do a complete HDP install and then disable everything from starting # fix the move-to-nio JIRA ( HADOOP-9590 ) (A code sketch of the first workaround appears after this message.) I've just stuck a copy of the Hadoop 2.6 windows binaries [on github|https://github.com/steveloughran/clusterconfigs/tree/master/clusters/morzine] if that helps. I don't have any hadoop 2.4/2.5 binaries to hand; I could do a 2.7.0 build if you want. Cannot save data to parquet files when executing from Windows from a Maven Project -- Key: SPARK-6961 URL: https://issues.apache.org/jira/browse/SPARK-6961 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.3.0 Reporter: Bogdan Niculescu Priority: Blocker I have set up a project where I am trying to save a DataFrame into a parquet file. My project is a Maven one with Spark 1.3.0 and Scala 2.11.5:
{code:xml}
<spark.version>1.3.0</spark.version>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>${spark.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>${spark.version}</version>
</dependency>
{code}
A simple version of my code that reproduces consistently the problem that I am seeing is:
{code}
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

case class Person(name: String, age: Int)

object DataFrameTest extends App {
  val conf = new SparkConf().setMaster("local[4]").setAppName("DataFrameTest")
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)
  val persons = List(Person("a", 1), Person("b", 2))
  val rdd = sc.parallelize(persons)
  val dataFrame = sqlContext.createDataFrame(rdd)
  dataFrame.saveAsParquetFile("test.parquet")
}
{code}
All the time the exception that I am getting is:
{code} Exception in thread "main" java.lang.NullPointerException at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010) at org.apache.hadoop.util.Shell.runCommand(Shell.java:404) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.util.Shell.execCommand(Shell.java:678) at org.apache.hadoop.util.Shell.execCommand(Shell.java:661) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639) at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772) at 
parquet.hadoop.ParquetFileWriter.writeMetadataFile(ParquetFileWriter.java:409) at parquet.hadoop.ParquetFileWriter.writeMetadataFile(ParquetFileWriter.java:401) at org.apache.spark.sql.parquet.ParquetTypesConverter$.writeMetaData(ParquetTypes.scala:443) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.prepareMetadata(newParquet.scala:240) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:256) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:251) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) at
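For the first workaround above, a minimal sketch of how a Windows test program can point Hadoop at a winutils.exe install before any filesystem call is made; the C:\hadoop path and the object name are illustrative assumptions, not part of the original report:
{code}
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object WindowsParquetTest extends App {
  // Hadoop's Shell class consults the "hadoop.home.dir" system property (falling back to the
  // HADOOP_HOME environment variable), so set it before anything touches the Hadoop FS classes.
  // The directory is expected to contain bin\winutils.exe.
  System.setProperty("hadoop.home.dir", "C:\\hadoop")

  val sc = new SparkContext(new SparkConf().setMaster("local[4]").setAppName("WindowsParquetTest"))
  val sqlContext = new SQLContext(sc)
  // ... build a DataFrame and call saveAsParquetFile() as in the report above
}
{code}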
[jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14534622#comment-14534622 ] Steve Loughran commented on SPARK-7481: --- Approximate sizes of the extra dependencies: * hadoop-openstack: 100K, plus httpclient (400K) * hadoop-aws: 85K, plus jets3t (500K); s3a also needs the AWS toolkit @ 11.5MB, so it's the big one * hadoop-azure: 500K To retain s3n in spark, the hadoop-aws and jets3t dependencies need to go in; s3a is a fairly large addition. Add Hadoop 2.6+ profile to pull in object store FS accessors Key: SPARK-7481 URL: https://issues.apache.org/jira/browse/SPARK-7481 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.1 Reporter: Steve Loughran To keep the s3n classpath right, and to add s3a, swift and azure, the dependencies of spark in a 2.6+ profile need to add the relevant object store packages (hadoop-aws, hadoop-openstack, hadoop-azure). This adds more stuff to the client bundle, but will mean a single spark package can talk to all of the stores. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
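For illustration only, an sbt-style fragment of what such a profile pulls in (Spark's real build is Maven; the version value is an assumption, and artifact availability varies across Hadoop 2.x releases):
{code}
// illustrative only: the object store connector modules a Hadoop 2.6+ profile would add
val hadoopVersion = "2.6.0"   // assumed; use the Hadoop version the build targets

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-aws"       % hadoopVersion,  // s3n (jets3t) and s3a (AWS SDK)
  "org.apache.hadoop" % "hadoop-openstack" % hadoopVersion,  // swift
  "org.apache.hadoop" % "hadoop-azure"     % hadoopVersion   // wasb; only present in later 2.x releases
)
{code}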
[jira] [Created] (SPARK-7508) JettyUtils-generated servlets to log & report all errors
Steve Loughran created SPARK-7508: - Summary: JettyUtils-generated servlets to log & report all errors Key: SPARK-7508 URL: https://issues.apache.org/jira/browse/SPARK-7508 Project: Spark Issue Type: Improvement Components: Web UI Affects Versions: 1.3.1 Reporter: Steve Loughran Priority: Minor The servlets created by JettyUtils to render pages do handle {{IllegalArgumentException}} exceptions, but all others are just thrown up. This stops them being logged or converted to meaningful error messages. At the very least, server-side logging via log4j means that problems can be identified. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
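A sketch of the kind of handling this asks for, assuming a plain {{javax.servlet}} servlet and log4j; it is illustrative only, not Spark's actual JettyUtils code:
{code}
import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}
import org.apache.log4j.Logger

// Wraps a page-rendering function so that unexpected failures are logged server-side
// and surfaced as a meaningful 500 instead of an unlogged stack trace.
class LoggingServlet(render: HttpServletRequest => String) extends HttpServlet {
  private val log = Logger.getLogger(getClass)

  override def doGet(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
    try {
      resp.setStatus(HttpServletResponse.SC_OK)
      resp.getWriter.print(render(req))
    } catch {
      case e: IllegalArgumentException =>
        resp.sendError(HttpServletResponse.SC_BAD_REQUEST, e.getMessage)
      case e: Exception =>
        // log via log4j so the problem can be identified, then report it to the caller
        log.error(s"Error rendering ${req.getRequestURI}", e)
        resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, e.getMessage)
    }
  }
}
{code}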
[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541589#comment-14541589 ] Steve Loughran commented on SPARK-1537: --- + YARN-3539 is resolved; the [v1 timeline |https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md#Timeline_Server_REST_API_v1] is now defined and declared one of the supported REST APIs. I'm also removing YARN-2423 as a dependency; the latest patch does this itself Add integration with Yarn's Application Timeline Server --- Key: SPARK-1537 URL: https://issues.apache.org/jira/browse/SPARK-1537 Project: Spark Issue Type: New Feature Components: YARN Reporter: Marcelo Vanzin Attachments: SPARK-1537.txt, spark-1573.patch It would be nice to have Spark integrate with Yarn's Application Timeline Server (see YARN-321, YARN-1530). This would allow users running Spark on Yarn to have a single place to go for all their history needs, and avoid having to manage a separate service (Spark's built-in server). At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, although there is still some ongoing work. But the basics are there, and I wouldn't expect them to change (much) at this point. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-7669) Builds against Hadoop 2.6+ get inconsistent curator dependencies
[ https://issues.apache.org/jira/browse/SPARK-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545636#comment-14545636 ] Steve Loughran commented on SPARK-7669: --- Snippet of the maven dependencies with {{-Phadoop-2.4 -Dhadoop.version=2.6.0 }}
{code}
[INFO] | | | +- org.apache.curator:curator-client:jar:2.6.0:compile
[INFO] | | | | +- (org.slf4j:slf4j-api:jar:1.7.10:compile - version managed from 1.7.6; omitted for duplicate)
[INFO] | | | | +- (org.apache.zookeeper:zookeeper:jar:3.4.5:compile - version managed from 3.4.6; omitted for duplicate)
[INFO] | | | | \- (com.google.guava:guava:jar:14.0.1:provided - version managed from 11.0.2; scope managed from compile; omitted for duplicate)
[INFO] | | | +- (org.apache.curator:curator-recipes:jar:2.4.0:compile - version managed from 2.6.0; omitted for duplicate)
{code}
What's happened is that the curator-recipes version is being set by spark, but no version of curator-client is set. Nor can you fix the versions on the CLI, as the curator version isn't yet property-driven. I have a patch to make things consistent, with a hadoop-2.6 profile which sets curator.version=2.6.0 and zookeeper.version=3.4.6, and drives the curator-recipes version off that curator.version property. There's one possible enhancement to this: declare the version of curator-client directly, putting the spark build in control of which version gets picked up. Builds against Hadoop 2.6+ get inconsistent curator dependencies Key: SPARK-7669 URL: https://issues.apache.org/jira/browse/SPARK-7669 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.3.1, 1.4.0 Environment: Hadoop 2.6 Reporter: Steve Loughran Priority: Minor If you build spark against Hadoop 2.6 you end up with an inconsistent set of curator dependencies -curator-recipe 2.4.0 with curator 2.6.0. A dedicated hadoop-2.6 profile along with extraction of curator version into a property can keep the curator versions in sync, along with ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
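As a sketch of the same idea in sbt terms (Spark's build is Maven, so this is only an analogy; the versions shown are the ones the comment proposes): pin every curator artifact and ZooKeeper from a single place so they can't drift apart:
{code}
// illustrative sbt fragment: one value drives all curator artifacts, mirroring a curator.version property
val curatorVersion   = "2.6.0"
val zookeeperVersion = "3.4.6"

dependencyOverrides ++= Set(
  "org.apache.curator"   % "curator-recipes" % curatorVersion,
  "org.apache.curator"   % "curator-client"  % curatorVersion,
  "org.apache.zookeeper" % "zookeeper"       % zookeeperVersion
)
{code}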
[jira] [Created] (SPARK-7669) Builds against Hadoop 2.6+ get inconsistent curator dependencies
Steve Loughran created SPARK-7669: - Summary: Builds against Hadoop 2.6+ get inconsistent curator dependencies Key: SPARK-7669 URL: https://issues.apache.org/jira/browse/SPARK-7669 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.3.1, 1.4.0 Environment: Hadoop 2.6 Reporter: Steve Loughran Priority: Minor If you build spark against Hadoop 2.6 you end up with an inconsistent set of curator dependencies -curator-recipe 2.4.0 with curator 2.6.0. A dedicated hadoop-2.6 profile along with extraction of curator version into a property can keep the curator versions in sync, along with ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492303#comment-14492303 ] Steve Loughran commented on SPARK-1537: --- HADOOP-11826 patches the hadoop compatibility document to add timeline server to the list of stable APIs. Add integration with Yarn's Application Timeline Server --- Key: SPARK-1537 URL: https://issues.apache.org/jira/browse/SPARK-1537 Project: Spark Issue Type: New Feature Components: YARN Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: SPARK-1537.txt, spark-1573.patch It would be nice to have Spark integrate with Yarn's Application Timeline Server (see YARN-321, YARN-1530). This would allow users running Spark on Yarn to have a single place to go for all their history needs, and avoid having to manage a separate service (Spark's built-in server). At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, although there is still some ongoing work. But the basics are there, and I wouldn't expect them to change (much) at this point. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6907) Create an isolated classloader for the Hive Client.
[ https://issues.apache.org/jira/browse/SPARK-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550808#comment-14550808 ] Steve Loughran commented on SPARK-6907: --- Having just looked at this code, I'm a bit worried about the implications of the dynamic JAR download. # What happens if the first attempt to use the client takes place while the caller is off the internet (i.e. an isolated cluster)? # When Ivy pulls down JARs over HTTP, it checks the MD5 sums from the same server. That's not secure; it merely verifies that the SHA1 and the JAR have been tampered with consistently, if at all. Maybe I've misunderstood something; if I haven't, this strikes me as insecure. Create an isolated classloader for the Hive Client. --- Key: SPARK-6907 URL: https://issues.apache.org/jira/browse/SPARK-6907 Project: Spark Issue Type: Sub-task Components: SQL Reporter: Michael Armbrust Assignee: Michael Armbrust Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8064) Upgrade Hive to 1.2
[ https://issues.apache.org/jira/browse/SPARK-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575070#comment-14575070 ] Steve Loughran commented on SPARK-8064: --- I'm working on this Upgrade Hive to 1.2 --- Key: SPARK-8064 URL: https://issues.apache.org/jira/browse/SPARK-8064 Project: Spark Issue Type: Sub-task Components: SQL Reporter: Reynold Xin Assignee: Steve Loughran Priority: Blocker -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579152#comment-14579152 ] Steve Loughran commented on SPARK-1537: --- Full application log. Application hasn't actually stopped, which is interesting. {code} $ dist/bin/spark-submit \ --class org.apache.spark.examples.SparkPi \ --properties-file ../clusterconfigs/clusters/devix/spark/spark-defaults.conf \ --master yarn-client \ --executor-memory 128m \ --num-executors 1 \ --executor-cores 1 \ --driver-memory 128m \ dist/lib/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar 12 2015-06-09 17:01:59,596 [main] INFO spark.SparkContext (Logging.scala:logInfo(59)) - Running Spark version 1.5.0-SNAPSHOT 2015-06-09 17:02:01,309 [sparkDriver-akka.actor.default-dispatcher-2] INFO slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started 2015-06-09 17:02:01,359 [sparkDriver-akka.actor.default-dispatcher-2] INFO Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting 2015-06-09 17:02:01,542 [sparkDriver-akka.actor.default-dispatcher-2] INFO Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.86:51476] 2015-06-09 17:02:01,549 [main] INFO util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 51476. 2015-06-09 17:02:01,568 [main] INFO spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker 2015-06-09 17:02:01,587 [main] INFO spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster 2015-06-09 17:02:01,831 [main] INFO spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server 2015-06-09 17:02:01,891 [main] INFO util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 51477. 2015-06-09 17:02:01,905 [main] INFO spark.SparkEnv (Logging.scala:logInfo(59)) - Registering OutputCommitCoordinator 2015-06-09 17:02:02,038 [main] INFO util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040. 
2015-06-09 17:02:02,039 [main] INFO ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://192.168.1.86:4040 2015-06-09 17:02:03,071 [main] INFO spark.SparkContext (Logging.scala:logInfo(59)) - Added JAR file:/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark/dist/lib/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar at http://192.168.1.86:51477/jars/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar with timestamp 1433865723062 2015-06-09 17:02:03,691 [main] INFO impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://devix.cotham.uk:8188/ws/v1/timeline/ 2015-06-09 17:02:03,808 [main] INFO client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at devix.cotham.uk/192.168.1.134:8050 2015-06-09 17:02:04,577 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Requesting a new application from cluster with 1 NodeManagers 2015-06-09 17:02:04,637 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Verifying our application has not requested more than the maximum memory capability of the cluster (2048 MB per container) 2015-06-09 17:02:04,637 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Will allocate AM container, with 896 MB memory including 384 MB overhead 2015-06-09 17:02:04,638 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Setting up container launch context for our AM 2015-06-09 17:02:04,643 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Preparing resources for our AM container 2015-06-09 17:02:05,096 [main] WARN shortcircuit.DomainSocketFactory (DomainSocketFactory.java:init(116)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded. 2015-06-09 17:02:05,106 [main] DEBUG yarn.YarnSparkHadoopUtil (Logging.scala:logDebug(63)) - delegation token renewer is: rm/devix.cotham.uk@COTHAM 2015-06-09 17:02:05,107 [main] INFO yarn.YarnSparkHadoopUtil (Logging.scala:logInfo(59)) - getting token for namenode: hdfs://devix.cotham.uk:8020/user/stevel/.sparkStaging/application_1433777033372_0005 2015-06-09 17:02:06,129 [main] DEBUG yarn.Client (Logging.scala:logDebug(63)) - HiveMetaStore configured in localmode 2015-06-09 17:02:06,130 [main] DEBUG yarn.Client (Logging.scala:logDebug(63)) - HBase Class not found: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration 2015-06-09 17:02:06,225
[jira] [Commented] (SPARK-8275) HistoryServer caches incomplete App UIs
[ https://issues.apache.org/jira/browse/SPARK-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579134#comment-14579134 ] Steve Loughran commented on SPARK-8275: --- I have a test for this as part of the SPARK-1537 WiP; I haven't yet done a fix. HistoryServer caches incomplete App UIs --- Key: SPARK-8275 URL: https://issues.apache.org/jira/browse/SPARK-8275 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.3.1 Reporter: Steve Loughran The history server caches applications retrieved from the {{ApplicationHistoryProvider.getAppUI()}} call for performance: it's expensive to rebuild. However, this cache also includes incomplete applications, as well as completed ones —and it never attempts to refresh the incomplete application. As a result, if you do a GET of the history of a running application, even after the application is finished, you'll still get the web UI/history as it was when that first GET was issued. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-8275) HistoryServer caches incomplete App UIs
Steve Loughran created SPARK-8275: - Summary: HistoryServer caches incomplete App UIs Key: SPARK-8275 URL: https://issues.apache.org/jira/browse/SPARK-8275 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.3.1 Reporter: Steve Loughran The history server caches applications retrieved from the {{ApplicationHistoryProvider.getAppUI()}} call for performance: it's expensive to rebuild. However, this cache also includes incomplete applications, as well as completed ones —and it never attempts to refresh the incomplete application. As a result, if you do a GET of the history of a running application, even after the application is finished, you'll still get the web UI/history as it was when that first GET was issued. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-8276) NPE in YarnClientSchedulerBackend.stop
Steve Loughran created SPARK-8276: - Summary: NPE in YarnClientSchedulerBackend.stop Key: SPARK-8276 URL: https://issues.apache.org/jira/browse/SPARK-8276 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.5.0 Reporter: Steve Loughran Priority: Minor NPE seen in {{YarnClientSchedulerBackend.stop()}} after problem setting up job; on the line {{monitorThread.interrupt()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
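A toy reproduction of the guard such a fix needs; the class and field names are illustrative, not the actual Spark patch:
{code}
// stop() can be called after setup failed before the monitor thread was ever created,
// so the field may still be null and a bare monitorThread.interrupt() throws an NPE.
abstract class Backend { def stop(): Unit = () }

class MonitoredBackend extends Backend {
  private var monitorThread: Thread = null   // stays null when startup fails early

  override def stop(): Unit = {
    if (monitorThread != null) {
      monitorThread.interrupt()
    }
    super.stop()
  }
}
{code}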
[jira] [Commented] (SPARK-8275) HistoryServer caches incomplete App UIs
[ https://issues.apache.org/jira/browse/SPARK-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579273#comment-14579273 ] Steve Loughran commented on SPARK-8275: --- The cache code is from Google; the history server provides a method to get the data for an entry, but there's no logic in the cache itself to have a refresh time on entries. One solution would be # cache entries to include a timestamp and completed flag alongside the SparkUI instances # direct all cache.get operations through a single method in HistoryServer # have that method do something like
{code}
def getUI(id: String): SparkUI = {
  var cacheEntry = cache.get(id)
  if (!cacheEntry.completed && (cacheEntry.timestamp + expiryTime) < now()) {
    cache.release(id)
    cacheEntry = cache.get(id)
  }
  cacheEntry
}
{code}
This will leave out-of-date entries in the cache, but on any retrieval trigger the rebuild. HistoryServer caches incomplete App UIs --- Key: SPARK-8275 URL: https://issues.apache.org/jira/browse/SPARK-8275 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.3.1 Reporter: Steve Loughran The history server caches applications retrieved from the {{ApplicationHistoryProvider.getAppUI()}} call for performance: it's expensive to rebuild. However, this cache also includes incomplete applications, as well as completed ones —and it never attempts to refresh the incomplete application. As a result, if you do a GET of the history of a running application, even after the application is finished, you'll still get the web UI/history as it was when that first GET was issued. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-8275) HistoryServer caches incomplete App UIs
[ https://issues.apache.org/jira/browse/SPARK-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579273#comment-14579273 ] Steve Loughran edited comment on SPARK-8275 at 6/10/15 11:43 AM: - The cache code is from Google; the history server provides a method to get the data for an entry, but there's no logic in the cache itself to have a refresh time on entries. One solution would be # cache entries to include a timestamp and completed flag alongside the SparkUI instances # direct all cache.get operations through a single method in HistoryServer # have that method do something like
{code}
def getUI(id: String): SparkUI = {
  var cacheEntry = cache.get(id)
  if (!cacheEntry.completed && (cacheEntry.timestamp + expiryTime) < now()) {
    cache.release(id)
    cacheEntry = cache.get(id)
  }
  cacheEntry
}
{code}
This will leave out-of-date entries in the cache, but on any retrieval trigger the rebuild. was (Author: ste...@apache.org): The cache code is from Google; the history server provides a method to get the data for an entry, but there's no logic in the cache itself to have a refresh time on entries. One solution would be # cache entries to include a timestamp and completed flag alongside the SparkUI instances # direct all cache.get operations through a single method in HistoryServer # have that method do something like
{code}
def getUI(id: String): SparkUI = {
  var cacheEntry = cache.get(id)
  if (!cacheEntry.completed && (cacheEntry.timestamp + expiryTime) < now()) {
    cache.release(id)
    cacheEntry = cache.get(id)
  }
  cacheEntry
}
This will leave out-of-date entries in the cache, but on any retrieval trigger the rebuild. HistoryServer caches incomplete App UIs --- Key: SPARK-8275 URL: https://issues.apache.org/jira/browse/SPARK-8275 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.3.1 Reporter: Steve Loughran The history server caches applications retrieved from the {{ApplicationHistoryProvider.getAppUI()}} call for performance: it's expensive to rebuild. However, this cache also includes incomplete applications, as well as completed ones —and it never attempts to refresh the incomplete application. As a result, if you do a GET of the history of a running application, even after the application is finished, you'll still get the web UI/history as it was when that first GET was issued. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-1537) Add integration with Yarn's Application Timeline Server
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-1537: -- Comment: was deleted (was: Full application log. Application hasn't actually stopped, which is interesting. {code} $ dist/bin/spark-submit \ --class org.apache.spark.examples.SparkPi \ --properties-file ../clusterconfigs/clusters/devix/spark/spark-defaults.conf \ --master yarn-client \ --executor-memory 128m \ --num-executors 1 \ --executor-cores 1 \ --driver-memory 128m \ dist/lib/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar 12 2015-06-09 17:01:59,596 [main] INFO spark.SparkContext (Logging.scala:logInfo(59)) - Running Spark version 1.5.0-SNAPSHOT 2015-06-09 17:02:01,309 [sparkDriver-akka.actor.default-dispatcher-2] INFO slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started 2015-06-09 17:02:01,359 [sparkDriver-akka.actor.default-dispatcher-2] INFO Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting 2015-06-09 17:02:01,542 [sparkDriver-akka.actor.default-dispatcher-2] INFO Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.86:51476] 2015-06-09 17:02:01,549 [main] INFO util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 51476. 2015-06-09 17:02:01,568 [main] INFO spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker 2015-06-09 17:02:01,587 [main] INFO spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster 2015-06-09 17:02:01,831 [main] INFO spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server 2015-06-09 17:02:01,891 [main] INFO util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 51477. 2015-06-09 17:02:01,905 [main] INFO spark.SparkEnv (Logging.scala:logInfo(59)) - Registering OutputCommitCoordinator 2015-06-09 17:02:02,038 [main] INFO util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040. 
2015-06-09 17:02:02,039 [main] INFO ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://192.168.1.86:4040 2015-06-09 17:02:03,071 [main] INFO spark.SparkContext (Logging.scala:logInfo(59)) - Added JAR file:/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark/dist/lib/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar at http://192.168.1.86:51477/jars/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar with timestamp 1433865723062 2015-06-09 17:02:03,691 [main] INFO impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://devix.cotham.uk:8188/ws/v1/timeline/ 2015-06-09 17:02:03,808 [main] INFO client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at devix.cotham.uk/192.168.1.134:8050 2015-06-09 17:02:04,577 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Requesting a new application from cluster with 1 NodeManagers 2015-06-09 17:02:04,637 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Verifying our application has not requested more than the maximum memory capability of the cluster (2048 MB per container) 2015-06-09 17:02:04,637 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Will allocate AM container, with 896 MB memory including 384 MB overhead 2015-06-09 17:02:04,638 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Setting up container launch context for our AM 2015-06-09 17:02:04,643 [main] INFO yarn.Client (Logging.scala:logInfo(59)) - Preparing resources for our AM container 2015-06-09 17:02:05,096 [main] WARN shortcircuit.DomainSocketFactory (DomainSocketFactory.java:init(116)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded. 2015-06-09 17:02:05,106 [main] DEBUG yarn.YarnSparkHadoopUtil (Logging.scala:logDebug(63)) - delegation token renewer is: rm/devix.cotham.uk@COTHAM 2015-06-09 17:02:05,107 [main] INFO yarn.YarnSparkHadoopUtil (Logging.scala:logInfo(59)) - getting token for namenode: hdfs://devix.cotham.uk:8020/user/stevel/.sparkStaging/application_1433777033372_0005 2015-06-09 17:02:06,129 [main] DEBUG yarn.Client (Logging.scala:logDebug(63)) - HiveMetaStore configured in localmode 2015-06-09 17:02:06,130 [main] DEBUG yarn.Client (Logging.scala:logDebug(63)) - HBase Class not found: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration 2015-06-09 17:02:06,225 [main] INFO yarn.Client
[jira] [Commented] (SPARK-7889) Jobs progress of apps on complete page of HistoryServer shows uncompleted
[ https://issues.apache.org/jira/browse/SPARK-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595998#comment-14595998 ] Steve Loughran commented on SPARK-7889: --- Added a [new pull request|https://github.com/apache/spark/pull/6935]; it notes which UIs do not contain a completed attempt, and will refresh those. Jobs progress of apps on complete page of HistoryServer shows uncompleted - Key: SPARK-7889 URL: https://issues.apache.org/jira/browse/SPARK-7889 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: meiyoula Priority: Minor When running a SparkPi with 2000 tasks, clicking into the app on the incomplete page, the job progress shows 400/2000. After the app is completed, the app goes to the complete page from incomplete, and now clicking into the app, the job progress still shows 400/2000. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-3284) saveAsParquetFile not working on windows
[ https://issues.apache.org/jira/browse/SPARK-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-3284. --- Resolution: Duplicate Closing as duplicate of SPARK-6961. saveAsParquetFile not working on windows Key: SPARK-3284 URL: https://issues.apache.org/jira/browse/SPARK-3284 Project: Spark Issue Type: Bug Components: Windows Affects Versions: 1.0.2 Environment: Windows Reporter: Pravesh Jain Priority: Minor
{code}
object parquet {
  case class Person(name: String, age: Int)

  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setMaster("local").setAppName("HdfsWordCount")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    // createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
    import sqlContext.createSchemaRDD
    val people = sc.textFile("C:/Users/pravesh.jain/Desktop/people/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
    people.saveAsParquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")
    val parquetFile = sqlContext.parquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")
  }
}
{code}
gives the error Exception in thread "main" java.lang.NullPointerException at org.apache.spark.parquet$.main(parquet.scala:16), which is the line saveAsParquetFile. This works fine on linux, but running it from eclipse on windows gives the error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6961) Cannot save data to parquet files when executing from Windows from a Maven Project
[ https://issues.apache.org/jira/browse/SPARK-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581886#comment-14581886 ] Steve Loughran commented on SPARK-6961: --- issue here is WINUTILS.EXE isn't on the path. That's an installation-side issue, but it doesn't justify Hadoop common failing with an NPE, as that is utterly uninformative and only creates support calls elsewhere Cannot save data to parquet files when executing from Windows from a Maven Project -- Key: SPARK-6961 URL: https://issues.apache.org/jira/browse/SPARK-6961 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.3.0 Reporter: Bogdan Niculescu Priority: Critical I have setup a project where I am trying to save a DataFrame into a parquet file. My project is a Maven one with Spark 1.3.0 and Scala 2.11.5 : {code:xml} spark.version1.3.0/spark.version dependency groupIdorg.apache.spark/groupId artifactIdspark-core_2.11/artifactId version${spark.version}/version /dependency dependency groupIdorg.apache.spark/groupId artifactIdspark-sql_2.11/artifactId version${spark.version}/version /dependency {code} A simple version of my code that reproduces consistently the problem that I am seeing is : {code} import org.apache.spark.sql.SQLContext import org.apache.spark.{SparkConf, SparkContext} case class Person(name: String, age: Int) object DataFrameTest extends App { val conf = new SparkConf().setMaster(local[4]).setAppName(DataFrameTest) val sc = new SparkContext(conf) val sqlContext = new SQLContext(sc) val persons = List(Person(a, 1), Person(b, 2)) val rdd = sc.parallelize(persons) val dataFrame = sqlContext.createDataFrame(rdd) dataFrame.saveAsParquetFile(test.parquet) } {code} All the time the exception that I am getting is : {code} Exception in thread main java.lang.NullPointerException at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010) at org.apache.hadoop.util.Shell.runCommand(Shell.java:404) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.util.Shell.execCommand(Shell.java:678) at org.apache.hadoop.util.Shell.execCommand(Shell.java:661) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639) at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772) at parquet.hadoop.ParquetFileWriter.writeMetadataFile(ParquetFileWriter.java:409) at parquet.hadoop.ParquetFileWriter.writeMetadataFile(ParquetFileWriter.java:401) at org.apache.spark.sql.parquet.ParquetTypesConverter$.writeMetaData(ParquetTypes.scala:443) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.prepareMetadata(newParquet.scala:240) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:256) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:251) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) at 
scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) at scala.collection.immutable.List.map(List.scala:285) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.refresh(newParquet.scala:251) at org.apache.spark.sql.parquet.ParquetRelation2.init(newParquet.scala:370) at org.apache.spark.sql.parquet.DefaultSource.createRelation(newParquet.scala:96) at org.apache.spark.sql.parquet.DefaultSource.createRelation(newParquet.scala:125) at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:308) at org.apache.spark.sql.DataFrame.save(DataFrame.scala:1123) at org.apache.spark.sql.DataFrame.saveAsParquetFile(DataFrame.scala:922) at sparkTest.DataFrameTest$.delayedEndpoint$sparkTest$DataFrameTest$1(DataFrameTest.scala:17) at
[jira] [Commented] (SPARK-7889) Jobs progress of apps on complete page of HistoryServer shows uncompleted
[ https://issues.apache.org/jira/browse/SPARK-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14590215#comment-14590215 ] Steve Loughran commented on SPARK-7889: --- Is this JIRA about (a) the status on the listing of complete/incomplete being wrong in some way, or (b) the actual job view (history/some-app-id) being stale when a job completes? (b) is consistent with what I observed in SPARK-8275. Looking at your patch, and comparing it with my proposal, I prefer mine. All I'm proposing is invalidating the cache on work in progress, so that it is retrieved again. Thinking about it some more, we can go one better: rely on the {{ApplicationHistoryInfo.lastUpdated}} field to tell us when the UI was last updated. If we cache the update time with the UI, then on any GET of an app UI we can look to see if the previous UI was not completed and if the lastUpdated time has changed; if so, that triggers a refresh. With this approach the entry you see will always be the one most recently published to the history store (of any implementation), and picked up by the history provider in its getListing()/background refresh operation. Jobs progress of apps on complete page of HistoryServer shows uncompleted - Key: SPARK-7889 URL: https://issues.apache.org/jira/browse/SPARK-7889 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: meiyoula Priority: Minor When running a SparkPi with 2000 tasks, clicking into the app on the incomplete page, the job progress shows 400/2000. After the app is completed, the app goes to the complete page from incomplete, and now clicking into the app, the job progress still shows 400/2000. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
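As a toy model of that rule (the names are illustrative, not Spark's actual HistoryServer API):
{code}
// A cached UI needs rebuilding only while the app was incomplete at caching time
// and the history provider has since published a newer lastUpdated value.
case class CachedEntry(completed: Boolean, lastUpdated: Long)

def needsRefresh(cached: CachedEntry, providerLastUpdated: Long): Boolean =
  !cached.completed && providerLastUpdated > cached.lastUpdated
{code}
The GET path would then invalidate and rebuild the cached UI whenever that check returns true, so completed applications are never rebuilt needlessly.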
[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14589659#comment-14589659 ] Steve Loughran commented on SPARK-7009: --- The issue isn't that python can't read large JARs, it's that it can't read the header for large JARs. Build assembly JAR via ant to avoid zip64 problems -- Key: SPARK-7009 URL: https://issues.apache.org/jira/browse/SPARK-7009 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.0 Environment: Java 7+ Reporter: Steve Loughran Original Estimate: 2h Remaining Estimate: 2h SPARK-1911 shows the problem that JDK7+ is using zip64 to build large JARs; a format incompatible with Java and pyspark. Provided the total number of .class files+resources is under 64K, ant can be used to make the final JAR instead, perhaps by unzipping the maven-generated JAR then rezipping it with zip64=never, before publishing the artifact via maven. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8275) HistoryServer caches incomplete App UIs
[ https://issues.apache.org/jira/browse/SPARK-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587753#comment-14587753 ] Steve Loughran commented on SPARK-8275: --- You're right, tagging as duplicate. HistoryServer caches incomplete App UIs --- Key: SPARK-8275 URL: https://issues.apache.org/jira/browse/SPARK-8275 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.3.1 Reporter: Steve Loughran The history server caches applications retrieved from the {{ApplicationHistoryProvider.getAppUI()}} call for performance: it's expensive to rebuild. However, this cache also includes incomplete applications, as well as completed ones —and it never attempts to refresh the incomplete application. As a result, if you do a GET of the history of a running application, even after the application is finished, you'll still get the web UI/history as it was when that first GET was issued. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-8394) HistoryServer doesn't read kerberos opts from config
Steve Loughran created SPARK-8394: - Summary: HistoryServer doesn't read kerberos opts from config Key: SPARK-8394 URL: https://issues.apache.org/jira/browse/SPARK-8394 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.1 Reporter: Steve Loughran Priority: Minor Fix For: 1.4.0 The history server calls {{initSecurity()}} before it reads in the configuration. As a result you can't configure kerberos options in {{spark-defaults.conf}}, but only in {{SPARK_HISTORY_OPTS}}. (This has already been fixed; I'm filing the JIRA as it wasn't there and I'd just hit the same problem in branch-1.3.) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8394) HistoryServer doesn't read kerberos opts from config
[ https://issues.apache.org/jira/browse/SPARK-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587882#comment-14587882 ] Steve Loughran commented on SPARK-8394: --- Fixed in commit [9042f](https://github.com/apache/spark/commit/9042f8f3784f10f695cba6b80c054695b1c152c5) HistoryServer doesn't read kerberos opts from config Key: SPARK-8394 URL: https://issues.apache.org/jira/browse/SPARK-8394 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.1 Reporter: Steve Loughran Priority: Minor Fix For: 1.4.0 the history server calls {{initSecurity()}} before it reads in the configuration. As a result you can't configure kerberos options in {{spark-defaults.conf}}, but only in {{SPARK_HISTORY_OPTS}} (this has already been fixed; I'm filing the JIRA as it wasn't there I'd just hit the same problem in branch-1.3) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-8394) HistoryServer doesn't read kerberos opts from config
[ https://issues.apache.org/jira/browse/SPARK-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-8394. --- Resolution: Fixed fixed by Marcelo HistoryServer doesn't read kerberos opts from config Key: SPARK-8394 URL: https://issues.apache.org/jira/browse/SPARK-8394 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.1 Reporter: Steve Loughran Priority: Minor Fix For: 1.4.0 the history server calls {{initSecurity()}} before it reads in the configuration. As a result you can't configure kerberos options in {{spark-defaults.conf}}, but only in {{SPARK_HISTORY_OPTS}} (this has already been fixed; I'm filing the JIRA as it wasn't there I'd just hit the same problem in branch-1.3) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4352) Incorporate locality preferences in dynamic allocation requests
[ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14567345#comment-14567345 ] Steve Loughran commented on SPARK-4352: --- As usual, when YARN-1042 is done, life gets easier: the AM asks YARN for the anti-affine placement. If you look at how other YARN clients have implemented anti-affinity (TWILL-82), the blacklist is used to block off all nodes in use, with a request-at-a-time ramp-up to avoid 1 outstanding request being granted on the same node. As well as anti-affinity, life would be even better with dynamic container resize: if a single executor could expand/relax CPU capacity on demand, you'd only need one per node and then handle multiple tasks by running more work there. (This does nothing for RAM consumption though) now, for some other fun, # you may want to consider which surplus containers to release, both outstanding requests and actually granted. In particular, if you want to cancel 1 outstanding request, which to choose? Any of them? The newest? The oldest? The node with the worst reliability statistics? Killing the newest works if you assume that the older containers have generated more host-local data that you wish to reuse. # history may also be a factor in placement. If you are starting a session which continues/extends previous work, the previous location of the executors may be the first locality clue. Ask for containers on those nodes and there's a high likelihood that all the output data from the previous session will be stored locally on one of the nodes a container is assigned. # Testing. There aren't any, are there? It's possible to simulate some of the basic operations, you just need to isolate the code which examines the application state and generates container request/release events from the actual interaction with the RM. I've done this before with the request to allocate/cancel [generating a list of operations to be submitted or simulated|https://github.com/apache/incubator-slider/blob/develop/slider-core/src/main/java/org/apache/slider/server/appmaster/state/AppState.java#L1908]. When combined with a [mock YARN engine|https://github.com/apache/incubator-slider/tree/develop/slider-core/src/test/groovy/org/apache/slider/server/appmaster/model/mock], let us do things like [test historical placement logic|https://github.com/apache/incubator-slider/tree/develop/slider-core/src/test/groovy/org/apache/slider/server/appmaster/model/history] as well as whether to re-request containers on nodes where containers have just recently failed. While that mock stuff isn't that realistic, it can be used to test basic placement and failure handling logic. More succinctly: you can write tests for this stuff by splitting request generation from the API calls testing the request/release logic standalone Incorporate locality preferences in dynamic allocation requests --- Key: SPARK-4352 URL: https://issues.apache.org/jira/browse/SPARK-4352 Project: Spark Issue Type: Improvement Components: Spark Core, YARN Affects Versions: 1.2.0 Reporter: Sandy Ryza Assignee: Saisai Shao Priority: Critical Attachments: Supportpreferrednodelocationindynamicallocation.pdf Currently, achieving data locality in Spark is difficult unless an application takes resources on every node in the cluster. preferredNodeLocalityData provides a sort of hacky workaround that has been broken since 1.0. With dynamic executor allocation, Spark requests executors in response to demand from the application. 
When this occurs, it would be useful to look at the pending tasks and communicate their location preferences to the cluster resource manager. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
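To make the testing suggestion in the SPARK-4352 comment above concrete, a purely illustrative sketch of "splitting request generation from the API calls" so that the request/release logic can be unit-tested standalone; none of these types exist in Spark, they simply show placement decisions being returned as data for a test to assert on:
{code}
sealed trait AllocationOp
case class RequestContainer(preferredHost: Option[String]) extends AllocationOp
case object CancelOneRequest extends AllocationOp

// Pure decision function: no ResourceManager calls, just data in and operations out.
def plan(target: Int, outstanding: Int, running: Int,
         preferredHosts: Seq[String]): Seq[AllocationOp] = {
  val delta = target - (outstanding + running)
  if (delta > 0) {
    // ask on preferred hosts first, then anywhere for the remainder
    val onHosts = preferredHosts.take(delta).map(h => RequestContainer(Some(h)))
    onHosts ++ Seq.fill(delta - onHosts.size)(RequestContainer(None))
  } else {
    // cancel surplus outstanding requests; which ones to pick (newest? oldest?) is policy
    Seq.fill(math.min(-delta, outstanding))(CancelOneRequest)
  }
}
{code}
A test can then feed {{plan()}} different cluster states and check the returned operations, while a thin layer elsewhere translates them into the actual AM/RM calls.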
[jira] [Created] (SPARK-8789) improve SQLQuerySuite resilience by dropping tables in setup
Steve Loughran created SPARK-8789: - Summary: improve SQLQuerySuite resilience by dropping tables in setup Key: SPARK-8789 URL: https://issues.apache.org/jira/browse/SPARK-8789 Project: Spark Issue Type: Improvement Components: SQL, Tests Affects Versions: 1.4.0 Reporter: Steve Loughran Priority: Minor When some of the tests in {{SQLQuerySuite}} have problems, follow-up test runs fail because the tables are still present. This can be addressed by some table dropping at startup, and some try/finally clauses. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
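A sketch of the shape this suggests, assuming it sits inside a suite that already exposes {{sql}} and {{checkAnswer}}; the table name and query are illustrative:
{code}
test("query over a scratch table") {
  // tolerate debris left behind by an earlier, failed run
  sql("DROP TABLE IF EXISTS scratch_jt")
  try {
    sql("CREATE TABLE scratch_jt AS SELECT * FROM src")
    checkAnswer(
      sql("SELECT count(*) FROM scratch_jt"),
      sql("SELECT count(*) FROM src").collect().toSeq)
  } finally {
    // guarantee cleanup even when the assertions above fail
    sql("DROP TABLE IF EXISTS scratch_jt")
  }
}
{code}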
[jira] [Commented] (SPARK-6152) Spark does not support Java 8 compiled Scala classes
[ https://issues.apache.org/jira/browse/SPARK-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14646699#comment-14646699 ] Steve Loughran commented on SPARK-6152: --- Chill and Kryo need to be in sync; there's also the need to be compatible with the version Hive uses, (which has historically been addressed with custom versions of Hive). If spark could jump to Kryo 3.x, classpath conflict with hive would go away, provided the wire formats of serialized classes were compatible: hive's spark-client JAR uses kryo 2.2.x to talk to spark. Spark does not support Java 8 compiled Scala classes Key: SPARK-6152 URL: https://issues.apache.org/jira/browse/SPARK-6152 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.1 Environment: Java 8+ Scala 2.11 Reporter: Ronald Chen Priority: Minor Spark uses reflectasm to check Scala closures which fails if the *user defined Scala closures* are compiled to Java 8 class version The cause is reflectasm does not support Java 8 https://github.com/EsotericSoftware/reflectasm/issues/35 Workaround: Don't compile Scala classes to Java 8, Scala 2.11 does not support nor require any Java 8 features Stack trace: {code} java.lang.IllegalArgumentException at com.esotericsoftware.reflectasm.shaded.org.objectweb.asm.ClassReader.init(Unknown Source) at com.esotericsoftware.reflectasm.shaded.org.objectweb.asm.ClassReader.init(Unknown Source) at com.esotericsoftware.reflectasm.shaded.org.objectweb.asm.ClassReader.init(Unknown Source) at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$getClassReader(ClosureCleaner.scala:41) at org.apache.spark.util.ClosureCleaner$.getInnerClasses(ClosureCleaner.scala:84) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:107) at org.apache.spark.SparkContext.clean(SparkContext.scala:1478) at org.apache.spark.rdd.RDD.map(RDD.scala:288) at ...my Scala 2.11 compiled to Java 8 code calling into spark {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8064) Upgrade Hive to 1.2
[ https://issues.apache.org/jira/browse/SPARK-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652730#comment-14652730 ] Steve Loughran commented on SPARK-8064: --- Also: we had to produce a custom release of hive-exec 1.2.1 with # The same version of Kryo as that used in Chill (2.21) # protobuf shaded (needed for co-existence with protobuf 2.4 on Hadoop 1.x) The source for this is at https://github.com/pwendell/hive/tree/release-1.2.1-spark Upgrade Hive to 1.2 --- Key: SPARK-8064 URL: https://issues.apache.org/jira/browse/SPARK-8064 Project: Spark Issue Type: Sub-task Components: SQL Reporter: Reynold Xin Assignee: Steve Loughran Priority: Blocker Fix For: 1.5.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8064) Upgrade Hive to 1.2
[ https://issues.apache.org/jira/browse/SPARK-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652699#comment-14652699 ] Steve Loughran commented on SPARK-8064: --- While assigned to me, lots of people helped with this and deserve credit too —as well as everyone who submitted PRs (Michael, Patrick, and Cheng Lian). Marcelo Vanzin did a lot of the heavy lifting of updating the {{sql/hive}} module to work with Hive 1.x. I also got lots of support from my colleagues. Now it's time for the follow-on JIRAs. Upgrade Hive to 1.2 --- Key: SPARK-8064 URL: https://issues.apache.org/jira/browse/SPARK-8064 Project: Spark Issue Type: Sub-task Components: SQL Reporter: Reynold Xin Assignee: Steve Loughran Priority: Blocker Fix For: 1.5.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-9417) sbt-launch to fetch sbt binaries over https not http
Steve Loughran created SPARK-9417: - Summary: sbt-launch to fetch sbt binaries over https not http Key: SPARK-9417 URL: https://issues.apache.org/jira/browse/SPARK-9417 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.5.0 Reporter: Steve Loughran Priority: Minor The current {{build/sbt-launch-lib.bash}} uses two URLs to try and fetch sbt from
{code}
URL1=http://typesafe.artifactoryonline.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
URL2=http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
{code}
Using HTTP means that the artifacts are downloaded without any auth, and without any checksum validation. Yet the actual URL currently just redirects to https://repo.typesafe.com/typesafe/ivy-releases/. Switching to that directly would reduce vulnerability to MITM publishing of subverted artifacts -or at least postpone it to the maven/ivy phase. An alternative strategy would be to have the SHA1 checksum in the script, and explicitly validate the download. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-9417) sbt-launch to fetch sbt binaries over https not http
[ https://issues.apache.org/jira/browse/SPARK-9417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644959#comment-14644959 ] Steve Loughran commented on SPARK-9417: --- Marking as related to SPARK-9254, which added the redirect handling to the script. This JIRA doesn't supplement it; it just advocates making the original URL the HTTPS one. sbt-launch to fetch sbt binaries over https not http Key: SPARK-9417 URL: https://issues.apache.org/jira/browse/SPARK-9417 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.5.0 Reporter: Steve Loughran Priority: Minor The current {{build/sbt-launch-lib.bash}} uses two URLs to try and fetch sbt from
{code}
URL1=http://typesafe.artifactoryonline.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
URL2=http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
{code}
Using HTTP means that the artifacts are downloaded without any auth, and without any checksum validation. Yet the actual URL currently just redirects to https://repo.typesafe.com/typesafe/ivy-releases/. Switching to that directly would reduce vulnerability to MITM publishing of subverted artifacts -or at least postpone it to the maven/ivy phase. An alternative strategy would be to have the SHA1 checksum in the script, and explicitly validate the download. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled
[ https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662211#comment-14662211 ] Steve Loughran commented on SPARK-9019: --- # If this problem exists (I don't have a test setup right now for various reasons), then it is a regression from 1.3 # Like Thomas says, RM client tokens should get down to the AM automatically. If, however, these tokens are needed in the containers, then a delegation token is going to be needed; presumably that is what this patch does. However, that token will expire, and then a new one is needed; SPARK-5342 was meant to address that: it should be creating the tokens and providing them on demand. Something is playing up there. Regarding the patch, I don't know how well it would work in an RM-HA environment. Someone who understands the details of HA YARN would need to look at it. spark-submit fails on yarn with kerberos enabled Key: SPARK-9019 URL: https://issues.apache.org/jira/browse/SPARK-9019 Project: Spark Issue Type: Bug Components: Spark Submit Affects Versions: 1.5.0 Environment: Hadoop 2.6 with YARN and kerberos enabled Reporter: Bolke de Bruin Labels: kerberos, spark-submit, yarn Attachments: debug-log-spark-1.5-fail, spark-submit-log-1.5.0-fail It is not possible to run jobs using spark-submit on yarn with a kerberized cluster. Commandline: /usr/hdp/2.2.0.0-2041/spark-1.5.0/bin/spark-submit --principal sparkjob --keytab sparkjob.keytab --num-executors 3 --executor-cores 5 --executor-memory 5G --master yarn-cluster /tmp/get_peers.py Fails with: 15/07/13 22:48:31 INFO server.Server: jetty-8.y.z-SNAPSHOT 15/07/13 22:48:31 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:58380 15/07/13 22:48:31 INFO util.Utils: Successfully started service 'SparkUI' on port 58380. 15/07/13 22:48:31 INFO ui.SparkUI: Started SparkUI at http://10.111.114.9:58380 15/07/13 22:48:31 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler 15/07/13 22:48:31 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set. 15/07/13 22:48:32 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43470. 15/07/13 22:48:32 INFO netty.NettyBlockTransferService: Server created on 43470 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Trying to register BlockManager 15/07/13 22:48:32 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.111.114.9:43470 with 265.1 MB RAM, BlockManagerId(driver, 10.111.114.9, 43470) 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Registered BlockManager 15/07/13 22:48:32 INFO impl.TimelineClientImpl: Timeline service address: http://lxhnl002.ad.ing.net:8188/ws/v1/timeline/ 15/07/13 22:48:33 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS] 15/07/13 22:48:33 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2 15/07/13 22:48:33 INFO retry.RetryInvocationHandler: Exception while invoking getClusterNodes of class ApplicationClientProtocolPBClientImpl over rm2 after 1 fail over attempts. Trying to fail over after sleeping for 32582ms. 
java.net.ConnectException: Call From lxhnl006.ad.ing.net/10.111.114.9 to lxhnl013.ad.ing.net:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731) at org.apache.hadoop.ipc.Client.call(Client.java:1472) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at com.sun.proxy.$Proxy24.getClusterNodes(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
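As a rough illustration of the delegation-token flow discussed in the comment above (standard Hadoop security APIs only, not the patch under review), a Kerberos-authenticated client can gather HDFS delegation tokens into a {{Credentials}} object for shipping to containers that hold no keytab:
{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.Credentials

object DelegationTokenSketch {
  /** Collect HDFS delegation tokens for the given renewer (e.g. the YARN RM principal). */
  def collectHdfsTokens(conf: Configuration, renewer: String): Credentials = {
    val creds = new Credentials()
    val fs = FileSystem.get(conf)
    // Only succeeds when the caller is already Kerberos-authenticated; the resulting
    // tokens eventually expire and must be refreshed, which is what SPARK-5342 is about.
    fs.addDelegationTokens(renewer, creds)
    creds
  }
}
{code}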
[jira] [Commented] (SPARK-7789) sql on security hbase:Token generation only allowed for Kerberos authenticated clients
[ https://issues.apache.org/jira/browse/SPARK-7789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662221#comment-14662221 ] Steve Loughran commented on SPARK-7789: --- With SPARK-8064, Hive 1.2.1 is the hive lib used. It will incorporate the HIVE-8874 patch. When the 1.5 beta comes out, would you be able to test this? sql on security hbase:Token generation only allowed for Kerberos authenticated clients --- Key: SPARK-7789 URL: https://issues.apache.org/jira/browse/SPARK-7789 Project: Spark Issue Type: Bug Components: SQL Reporter: meiyoula After creating a hbase table in beeline, then execute select sql statement, Executor occurs the exception: {quote} java.lang.IllegalStateException: Error while configuring input job properties at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureTableJobProperties(HBaseStorageHandler.java:343) at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureInputJobProperties(HBaseStorageHandler.java:279) at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobPropertiesForStorageHandler(PlanUtils.java:804) at org.apache.hadoop.hive.ql.plan.PlanUtils.configureInputJobPropertiesForStorageHandler(PlanUtils.java:774) at org.apache.spark.sql.hive.HadoopTableReader$.initializeLocalJobConfFunc(TableReader.scala:300) at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$12.apply(TableReader.scala:276) at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$12.apply(TableReader.scala:276) at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176) at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176) at scala.Option.map(Option.scala:145) at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176) at org.apache.spark.rdd.HadoopRDD$$anon$1.init(HadoopRDD.scala:220) at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216) at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277) at org.apache.spark.rdd.RDD.iterator(RDD.scala:244) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277) at org.apache.spark.rdd.RDD.iterator(RDD.scala:244) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277) at org.apache.spark.rdd.RDD.iterator(RDD.scala:244) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277) at org.apache.spark.rdd.RDD.iterator(RDD.scala:244) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) at org.apache.spark.scheduler.Task.run(Task.scala:70) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:124) at org.apache.hadoop.hbase.protobuf.generated.AuthenticationProtos$AuthenticationService$1.getAuthenticationToken(AuthenticationProtos.java:4267) at 
org.apache.hadoop.hbase.protobuf.generated.AuthenticationProtos$AuthenticationService.callMethod(AuthenticationProtos.java:4387) at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7696) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1877) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1859) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2131) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at
[jira] [Updated] (SPARK-7252) Add support for creating new Hive and HBase delegation tokens
[ https://issues.apache.org/jira/browse/SPARK-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-7252: -- Affects Version/s: (was: 1.3.1) 1.5.0 Add support for creating new Hive and HBase delegation tokens - Key: SPARK-7252 URL: https://issues.apache.org/jira/browse/SPARK-7252 Project: Spark Issue Type: Improvement Components: YARN Affects Versions: 1.5.0 Reporter: Hari Shreedharan In SPARK-5342, support is being added for long running apps to be able to write to HDFS, but this does not work for Hive and HBase. We need to add the same support for these too. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6882) Spark ThriftServer2 Kerberos failed encountering java.lang.IllegalArgumentException: Unknown auth type: null Allowed values are: [auth-int, auth-conf, auth]
[ https://issues.apache.org/jira/browse/SPARK-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662117#comment-14662117 ] Steve Loughran commented on SPARK-6882: --- Spark 1.5 will use Hive 1.2.1; this problem shouldn't occur then Spark ThriftServer2 Kerberos failed encountering java.lang.IllegalArgumentException: Unknown auth type: null Allowed values are: [auth-int, auth-conf, auth] Key: SPARK-6882 URL: https://issues.apache.org/jira/browse/SPARK-6882 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.2.1, 1.3.0, 1.4.0 Environment: * Apache Hadoop 2.4.1 with Kerberos Enabled * Apache Hive 0.13.1 * Spark 1.2.1 git commit b6eaf77d4332bfb0a698849b1f5f917d20d70e97 * Spark 1.3.0 rc1 commit label 0dcb5d9f31b713ed90bcec63ebc4e530cbb69851 Reporter: Andrew Lee When Kerberos is enabled, I get the following exceptions. {code} 2015-03-13 18:26:05,363 ERROR org.apache.hive.service.cli.thrift.ThriftCLIService (ThriftBinaryCLIService.java:run(93)) - Error: java.lang.IllegalArgumentException: Unknown auth type: null Allowed values are: [auth-int, auth-conf, auth] {code} I tried it in * Spark 1.2.1 git commit b6eaf77d4332bfb0a698849b1f5f917d20d70e97 * Spark 1.3.0 rc1 commit label 0dcb5d9f31b713ed90bcec63ebc4e530cbb69851 with * Apache Hive 0.13.1 * Apache Hadoop 2.4.1 Build command {code} mvn -U -X -Phadoop-2.4 -Pyarn -Phive -Phive-0.13.1 -Phive-thriftserver -Dhadoop.version=2.4.1 -Dyarn.version=2.4.1 -Dhive.version=0.13.1 -DskipTests install {code} When starting Spark ThriftServer in {{yarn-client}} mode, the command to start thriftserver looks like this {code} ./start-thriftserver.sh --hiveconf hive.server2.thrift.port=2 --hiveconf hive.server2.thrift.bind.host=$(hostname) --master yarn-client {code} {{hostname}} points to the current hostname of the machine I'm using. Error message in {{spark.log}} from Spark 1.2.1 (1.2 rc1) {code} 2015-03-13 18:26:05,363 ERROR org.apache.hive.service.cli.thrift.ThriftCLIService (ThriftBinaryCLIService.java:run(93)) - Error: java.lang.IllegalArgumentException: Unknown auth type: null Allowed values are: [auth-int, auth-conf, auth] at org.apache.hive.service.auth.SaslQOP.fromString(SaslQOP.java:56) at org.apache.hive.service.auth.HiveAuthFactory.getSaslProperties(HiveAuthFactory.java:118) at org.apache.hive.service.auth.HiveAuthFactory.getAuthTransFactory(HiveAuthFactory.java:133) at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:43) at java.lang.Thread.run(Thread.java:744) {code} I'm wondering if this is due to the same problem described in HIVE-8154 HIVE-7620 due to an older code based for the Spark ThriftServer? Any insights are appreciated. Currently, I can't get Spark ThriftServer2 to run against a Kerberos cluster (Apache 2.4.1). My hive-site.xml looks like the following for spark/conf. The kerberos keytab and tgt are configured correctly, I'm able to connect to metastore, but the subsequent steps failed due to the exception. 
{code}
<property>
  <name>hive.semantic.analyzer.factory.impl</name>
  <value>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</value>
</property>
<property>
  <name>hive.metastore.execute.setugi</name>
  <value>true</value>
</property>
<property>
  <name>hive.stats.autogather</name>
  <value>false</value>
</property>
<property>
  <name>hive.session.history.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/tmp/home/hive/log/${user.name}</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive/scratch/${user.name}</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://somehostname:9083</value>
</property>
<!-- HIVE SERVER 2 -->
<property>
  <name>hive.server2.authentication</name>
  <value>KERBEROS</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.principal</name>
  <value>***</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.keytab</name>
  <value>***</value>
</property>
<property>
  <name>hive.server2.thrift.sasl.qop</name>
  <value>auth</value>
  <description>Sasl QOP value; one of 'auth', 'auth-int' and 'auth-conf'</description>
</property>
<property>
  <name>hive.server2.enable.impersonation</name>
  <description>Enable user impersonation for HiveServer2</description>
  <value>true</value>
</property>
<!-- HIVE METASTORE -->
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.keytab.file</name>
  <value>***</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>***</value>
[jira] [Resolved] (SPARK-5111) HiveContext and Thriftserver cannot work in secure cluster beyond hadoop2.5
[ https://issues.apache.org/jira/browse/SPARK-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-5111. --- Resolution: Done Fix Version/s: 1.5.0 HiveContext and Thriftserver cannot work in secure cluster beyond hadoop2.5 --- Key: SPARK-5111 URL: https://issues.apache.org/jira/browse/SPARK-5111 Project: Spark Issue Type: Bug Components: SQL Reporter: Zhan Zhang Fix For: 1.5.0 Due to java.lang.NoSuchFieldError: SASL_PROPS error. Need to backport some hive-0.14 fix into spark, since there is no effort to upgrade hive to 0.14 support in spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reopened SPARK-8385: --- I can reproduce this too, and have actually managed to do it in the spark tests, somehow {code} 2015-07-27 16:28:13,267 [IPC Server handler 9 on 61736] INFO resourcemanager.ClientRMService (ClientRMService.java:getNewApplicationId(286)) - Allocated new applicationId: 2 2015-07-27 16:28:13,341 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - Exception in thread main java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation 2015-07-27 16:28:13,341 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:214) 2015-07-27 16:28:13,342 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2365) 2015-07-27 16:28:13,342 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375) 2015-07-27 16:28:13,342 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392) 2015-07-27 16:28:13,342 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89) 2015-07-27 16:28:13,342 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431) 2015-07-27 16:28:13,342 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413) 2015-07-27 16:28:13,342 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:216) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:384) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:102) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.spark.deploy.yarn.Client.run(Client.scala:619) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at org.apache.spark.deploy.yarn.Client.main(Client.scala) 2015-07-27 16:28:13,343 [stderr] INFO util.Utils (Logging.scala:logInfo(59)) - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation - Key: SPARK-8385 URL: https://issues.apache.org/jira/browse/SPARK-8385 Project: Spark Issue Type: Bug Components: Input/Output Affects Versions: 1.4.0 Environment: RHEL 7.1 Reporter: Peter Haumer I used to be able to debug my Spark apps in Eclipse. With Spark 1.3.1 I created a launch and just set the vm var -Dspark.master=local[4]. With 1.4 this stopped working when reading files from the OS filesystem. Running the same apps with spark-submit works fine. 
Loosing the ability to debug that way has a major impact on the usability of Spark. The following exception is thrown: Exception in thread main java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213) at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401) at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166) at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653) at
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643652#comment-14643652 ] Steve Loughran commented on SPARK-8385: --- What's happening? Hadoop is trying to enumerate all the filesystems via the service loader mechanism (HADOOP-7549), auto-registering all filesystems listed in any JAR's resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}} in a map indexed by the filesystem scheme as returned by {{FileSystem.getScheme()}}. The default implementation of that method raises an exception, so the FS init fails, and the user gets to see a stack trace. # every filesystem needs to implement this method # hadoop's FS contract tests need to explicitly call the method and verify it is non null, non empty, so that anyone who implements those tests gets to find the problem (there's an implicit probe already) # maybe, hadoop should be more forgiving of filesystems which don't know their own name, yet have metadata entries. That's a tough call: it'd be more forgiving at startup time, but less intuitive downstream when things simply don't work if a filesystem is named but not found (i.e. there's no fallback to an fs.*.impl=classname entry in the cluster configs). Spark does ship with Tachyon 0.6.4, which has the method, but Tachyon 0.5.0 does not. Except Tachyon 0.5.0 does not have a resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}}; that is new with 0.6.0. Which leads to the following hypothesis about what is going wrong: # There are two versions of tachyon on the classpath # tachyon 0.6.4+ explicitly declares the FS in the metadata file, triggering an auto instantiate/load # tachyon 0.5.0's version of the FS class is the one being loaded by Hadoop (i.e. that JAR comes first in the classpath) It's OK to have 0.5.0 on the classpath, or 0.6.4: it's the combination which is triggering the problem. This isn't something Spark can fix, nor can Hadoop: duplicate, inconsistent JAR versions are always a disaster. This stack trace is how the specific case of more than one tachyon JAR on the classpath surfaces if v 0.5.0 comes first. Closing as a WONTFIX as it's an installation-side problem, not anything that is fixable in source. h2. For anyone seeing this: # Check your SPARK_HOME environment variable and make sure it's not pointing to an older one than the rest of your code is trying to use. # Check your build to make sure you aren't explicitly pulling in a tachyon JAR; there's one packaged up in spark-assembly # Make sure that you aren't pulling in another assembly module with its own tachyon version # Make sure no tachyon JAR has been copied into any of your hadoop directories (i.e. {{HADOOP_HOME/lib}}) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation - Key: SPARK-8385 URL: https://issues.apache.org/jira/browse/SPARK-8385 Project: Spark Issue Type: Bug Components: Input/Output Affects Versions: 1.4.0 Environment: RHEL 7.1 Reporter: Peter Haumer I used to be able to debug my Spark apps in Eclipse. With Spark 1.3.1 I created a launch and just set the vm var -Dspark.master=local[4]. With 1.4 this stopped working when reading files from the OS filesystem. Running the same apps with spark-submit works fine. Losing the ability to debug that way has a major impact on the usability of Spark. 
The following exception is thrown: Exception in thread main java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213) at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401) at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166) at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653) at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389) at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362) at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762) at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762) at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172) at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172) at
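As a minimal sketch of the service-loader behaviour described in the comment above (an illustration only, not Spark or Hadoop source), the snippet below enumerates every registered Hadoop filesystem and indexes it by scheme; a {{FileSystem}} class that does not override {{getScheme()}} throws {{UnsupportedOperationException}} at exactly this point, which is the failure shown in the stack traces.
{code}
import java.util.ServiceLoader
import scala.collection.JavaConverters._
import org.apache.hadoop.fs.FileSystem

object ListRegisteredFileSystems {
  def main(args: Array[String]): Unit = {
    // Every FileSystem named in a JAR's META-INF/services/org.apache.hadoop.fs.FileSystem
    // resource is instantiated here; getScheme() supplies the key for the lookup map.
    val byScheme = ServiceLoader.load(classOf[FileSystem]).asScala
      .map(fs => fs.getScheme -> fs.getClass.getName)
      .toMap
    byScheme.foreach { case (scheme, impl) => println(s"$scheme -> $impl") }
  }
}
{code}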
[jira] [Resolved] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-8385. --- Resolution: Won't Fix java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation - Key: SPARK-8385 URL: https://issues.apache.org/jira/browse/SPARK-8385 Project: Spark Issue Type: Bug Components: Input/Output Affects Versions: 1.4.0 Environment: RHEL 7.1 Reporter: Peter Haumer I used to be able to debug my Spark apps in Eclipse. With Spark 1.3.1 I created a launch and just set the vm var -Dspark.master=local[4]. With 1.4 this stopped working when reading files from the OS filesystem. Running the same apps with spark-submit works fine. Loosing the ability to debug that way has a major impact on the usability of Spark. The following exception is thrown: Exception in thread main java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213) at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401) at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166) at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653) at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389) at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362) at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762) at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762) at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172) at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172) at scala.Option.map(Option.scala:145) at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:172) at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:196) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) at 
org.apache.spark.SparkContext.runJob(SparkContext.scala:1535) at org.apache.spark.rdd.RDD.reduce(RDD.scala:900) at org.apache.spark.api.java.JavaRDDLike$class.reduce(JavaRDDLike.scala:357) at org.apache.spark.api.java.AbstractJavaRDDLike.reduce(JavaRDDLike.scala:46) at com.databricks.apps.logs.LogAnalyzer.main(LogAnalyzer.java:60) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643652#comment-14643652 ] Steve Loughran edited comment on SPARK-8385 at 7/28/15 1:01 AM: What's happening? Hadoop is trying to enum all the filesystems via the service loader mechanism (HADOOP-7549), auto-registering all filsystems listed in any JAR's resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}} —in a map indexed by the filesystem scheme as returned by {{FileSystem.getScheme()}} The default value for that raises an exception, so the FS init fails, and the user gets to see a stack trace # every filesystem needs to implement this method # hadoop's FS contract tests need to explicitly call the method and verify it is non null, non empty, so at anyone who implements those tests gets to find the problem. (there's an implicit probe already) # maybe, hadoop should be more forgiving of filesystems which don't know their own name, yet have metadata entries. That's a tough call: it'd be more forgiving at startup time, but less intuitive downstream when things simply don't work if a filesystem is named but not found (i.e. there's no fallback to fs.*.impl=classname entry in the cluster configs. Spark does ship with Tachyon 0.6.4, which has the method, but Tachyon 0.5.0 does not. Except Tachyon 0.5.0 does not have a resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}} -that is new with 0.6.x. Which leads to the following hypothesis about what is going wrong: # There are two versions of tachyon on the classpath # tachyon 0.6.4+ explicitly declares the FS in the metadata file, triggering an auto instantiate/load # tachyon 0.5.0's version of the FS class is the one being loaded by Hadoop (i.e. that JAR comes first in the classpath) It's OK to have 0.50 on the classpath; or 0.6.4: it's the combination which is triggering the problem. This isn't something Spark can fix, nor can Hadoop: duplicate, inconsistent JAR versions is always a disaster. This stack trace is how the specific case of 1 tachyon JAR on the classpath surfaces if v 0.5.0 comes first. Closing as a WONTFIX as its an installation side problem, not anything that is fixable in source. h2. For anyone seeing this: # Check your SPARK_HOME environment variable and make sure its not pointing to an older one than the rest of your code is trying to use. # Check your build to make sure you aren't explicitly pulling in a tachyon JAR —theres one packaged up in spark-assembly # Make sure that you aren't pulling in another assembly module with its own tachyon version # Make sure no tachyon JAR has been copied into any of your hadoop directories (i.e. {{HADOOP_HOME/lib}} was (Author: ste...@apache.org): What's happening? Hadoop is trying to enum all the filesystems via the service loader mechanism (HADOOP-7549), auto-registering all filsystems listed in any JAR's resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}} —in a map indexed by the filesystem scheme as returned by {{FileSystem.getScheme()}} The default value for that raises an exception, so the FS init fails, and the user gets to see a stack trace # every filesystem needs to implement this method # hadoop's FS contract tests need to explicitly call the method and verify it is non null, non empty, so at anyone who implements those tests gets to find the problem. (there's an implicit probe already) # maybe, hadoop should be more forgiving of filesystems which don't know their own name, yet have metadata entries. 
That's a tough call: it'd be more forgiving at startup time, but less intuitive downstream when things simply don't work if a filesystem is named but not found (i.e. there's no fallback to fs.*.impl=classname entry in the cluster configs. Spark does ship with Tachyon 0.6.4, which has the method, but Tachyon 0.5.0 does not. Except Tachyon 0.5.0 does no have a resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}} -that is new with 0.60. Which leads to the following hypothesis about what is going wrong: # There are two versions of tachyon on the classpath # tachyon 0.6.4+ explicitly declares the FS in the metadata file, triggering an auto instantiate/load # tachyon 0.5.0's version of the FS class is the one being loaded by Hadoop (i.e. that JAR comes first in the classpath) It's OK to have 0.50 on the classpath; or 0.6.4: it's the combination which is triggering the problem. This isn't something Spark can fix, nor can Hadoop: duplicate, inconsistent JAR versions is always a disaster. This stack trace is how the specific case of 1 tachyon JAR on the classpath surfaces if v 0.5.0 comes first. Closing as a WONTFIX as its an installation side problem, not anything that is fixable in source. h2. For anyone seeing this: # Check your SPARK_HOME environment variable and make sure its
[jira] [Created] (SPARK-9070) JavaDataFrameSuite teardown NPEs if setup failed
Steve Loughran created SPARK-9070: - Summary: JavaDataFrameSuite teardown NPEs if setup failed Key: SPARK-9070 URL: https://issues.apache.org/jira/browse/SPARK-9070 Project: Spark Issue Type: Bug Components: SQL, Tests Affects Versions: 1.5.0 Reporter: Steve Loughran Priority: Trivial The hive test {{JavaDataFrameSuite}} NPEs in teardown if setup failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8898) Jets3t hangs with more than 1 core
[ https://issues.apache.org/jira/browse/SPARK-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631133#comment-14631133 ] Steve Loughran commented on SPARK-8898: --- which Hadoop version are you using? Hadoop 2.6 has the most up to date jets3t JARs Jets3t hangs with more than 1 core -- Key: SPARK-8898 URL: https://issues.apache.org/jira/browse/SPARK-8898 Project: Spark Issue Type: Bug Components: Input/Output Affects Versions: 1.4.0 Environment: S3 Reporter: Daniel Darabos If I have an RDD that reads from S3 ({{newAPIHadoopFile}}), and try to write this to S3 ({{saveAsNewAPIHadoopFile}}), it hangs if I have more than 1 core per executor. It sounds like a race condition, but so far I have seen it trigger 100% of the time. From a race for taking a limited number of connections I would expect it to succeed at least on 1 task at least some of the time. But I never saw a single completed task, except when running with 1-core executors. All executor threads hang with one of the following two stack traces: {noformat:title=Stack trace 1} java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007759cae70 (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518) - locked 0x0007759cae70 (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:342) at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:718) at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1599) at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1535) at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1987) at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1332) at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:107) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source) at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:414) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1332) at org.apache.hadoop.fs.s3native.NativeS3FileSystem.create(NativeS3FileSystem.java:341) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:851) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:832) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:731) at 
org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:128) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1030) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1014) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) at org.apache.spark.scheduler.Task.run(Task.scala:70) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} {noformat:title=Stack trace 2} java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007759cae70 (a
[jira] [Commented] (SPARK-8898) Jets3t hangs with more than 1 core
[ https://issues.apache.org/jira/browse/SPARK-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633404#comment-14633404 ] Steve Loughran commented on SPARK-8898: --- OK. The last jets3t update (0.90) in Hadoop was in 2.3, HADOOP-9623; Hadoop 2.4 fixed some major regressions. There's no later version currently in Hadoop, with some pending 0.94 patches. I agree it's a jet3t issue. It *may* be possible to build against/bundle 0.94 simply by setting the jets3t version = 0.94 in the spark build Jets3t hangs with more than 1 core -- Key: SPARK-8898 URL: https://issues.apache.org/jira/browse/SPARK-8898 Project: Spark Issue Type: Bug Components: EC2 Affects Versions: 1.4.0 Environment: S3 Reporter: Daniel Darabos If I have an RDD that reads from S3 ({{newAPIHadoopFile}}), and try to write this to S3 ({{saveAsNewAPIHadoopFile}}), it hangs if I have more than 1 core per executor. It sounds like a race condition, but so far I have seen it trigger 100% of the time. From a race for taking a limited number of connections I would expect it to succeed at least on 1 task at least some of the time. But I never saw a single completed task, except when running with 1-core executors. All executor threads hang with one of the following two stack traces: {noformat:title=Stack trace 1} java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007759cae70 (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518) - locked 0x0007759cae70 (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:342) at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:718) at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1599) at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1535) at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1987) at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1332) at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:107) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source) at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:414) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1332) at org.apache.hadoop.fs.s3native.NativeS3FileSystem.create(NativeS3FileSystem.java:341) at 
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:851) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:832) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:731) at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:128) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1030) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1014) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) at org.apache.spark.scheduler.Task.run(Task.scala:70) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[jira] [Commented] (SPARK-8276) NPE in YarnClientSchedulerBackend.stop
[ https://issues.apache.org/jira/browse/SPARK-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633479#comment-14633479 ] Steve Loughran commented on SPARK-8276: --- Afraid I can't, sorry. I'm assuming it's from the case where {{monitorThread}} is null. NPE in YarnClientSchedulerBackend.stop -- Key: SPARK-8276 URL: https://issues.apache.org/jira/browse/SPARK-8276 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.5.0 Reporter: Steve Loughran Priority: Minor NPE seen in {{YarnClientSchedulerBackend.stop()}} after a problem setting up the job, on the line {{monitorThread.interrupt()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
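A minimal sketch of the defensive check implied by the comment above; this is not the actual Spark patch, just the shape of the guard:
{code}
object MonitorStopSketch {
  // monitorThread may be null if scheduler start-up failed before the thread was created.
  def safeInterrupt(monitorThread: Thread): Unit = {
    if (monitorThread != null) {
      monitorThread.interrupt()
    }
  }
}
{code}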
[jira] [Commented] (SPARK-11265) YarnClient can't get tokens to talk to Hive in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971188#comment-14971188 ] Steve Loughran commented on SPARK-11265: Pull request is : https://github.com/apache/spark/pull/9232 > YarnClient can't get tokens to talk to Hive in a secure cluster > --- > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. This appears to be because the > constructor of the {{ org.apache.hadoop.hive.ql.metadata.Hive}} class was > made private and replaced with a factory method. The YARN client uses > reflection to get the tokens, so the signature changes weren't picked up in > SPARK-8064. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11265) YarnClient can't get tokens to talk to Hive in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970930#comment-14970930 ] Steve Loughran commented on SPARK-11265: I can trigger a failure in a unit test now, once you get pass Hive failing to load (classpath issue), the {{get()}} operation fails {code} obtain Tokens For HiveMetastore *** FAILED *** java.lang.IllegalArgumentException: wrong number of arguments at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokenForHiveMetastoreInner(YarnSparkHadoopUtil.scala:203) at org.apache.spark.deploy.yarn.YarnSparkHadoopUtilSuite$$anonfun$22.apply(YarnSparkHadoopUtilSuite.scala:254) at org.apache.spark.deploy.yarn.YarnSparkHadoopUtilSuite$$anonfun$22.apply(YarnSparkHadoopUtilSuite.scala:249) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) {code} > YarnClient can't get tokens to talk to Hive in a secure cluster > --- > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. This appears to be because the > constructor of the {{ org.apache.hadoop.hive.ql.metadata.Hive}} class was > made private and replaced with a factory method. The YARN client uses > reflection to get the tokens, so the signature changes weren't picked up in > SPARK-8064. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-6270) Standalone Master hangs when streaming job completes and event logging is enabled
[ https://issues.apache.org/jira/browse/SPARK-6270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-6270: -- Affects Version/s: 1.5.1 > Standalone Master hangs when streaming job completes and event logging is > enabled > - > > Key: SPARK-6270 > URL: https://issues.apache.org/jira/browse/SPARK-6270 > Project: Spark > Issue Type: Bug > Components: Deploy, Streaming >Affects Versions: 1.2.0, 1.2.1, 1.3.0, 1.5.1 >Reporter: Tathagata Das >Priority: Critical > > If the event logging is enabled, the Spark Standalone Master tries to > recreate the web UI of a completed Spark application from its event logs. > However if this event log is huge (e.g. for a Spark Streaming application), > then the master hangs in its attempt to read and recreate the web ui. This > hang causes the whole standalone cluster to be unusable. > Workaround is to disable the event logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-11265) YarnClient cant get tokens to talk to Hive in a secure cluster
Steve Loughran created SPARK-11265: -- Summary: YarnClient cant get tokens to talk to Hive in a secure cluster Key: SPARK-11265 URL: https://issues.apache.org/jira/browse/SPARK-11265 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.5.1 Environment: Kerberized Hadoop cluster Reporter: Steve Loughran As reported on the dev list, trying to run a YARN client which wants to talk to Hive in a Kerberized hadoop cluster fails. This appears to be because the constructor of the {{ org.apache.hadoop.hive.ql.metadata.Hive}} class was made private and replaced with a factory method. The YARN client uses reflection to get the tokens, so the signature changes weren't picked up in SPARK-8064. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-11265) YarnClient can't get tokens to talk to Hive in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-11265: --- Summary: YarnClient can't get tokens to talk to Hive in a secure cluster (was: YarnClient cant get tokens to talk to Hive in a secure cluster) > YarnClient can't get tokens to talk to Hive in a secure cluster > --- > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. This appears to be because the > constructor of the {{ org.apache.hadoop.hive.ql.metadata.Hive}} class was > made private and replaced with a factory method. The YARN client uses > reflection to get the tokens, so the signature changes weren't picked up in > SPARK-8064. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11265) YarnClient cant get tokens to talk to Hive in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969818#comment-14969818 ] Steve Loughran commented on SPARK-11265: Initial report from Chester Chen {noformat} This is tested against the spark 1.5.1 ( branch 1.5 with label 1.5.2-SNAPSHOT with commit on Tue Oct 6, 84f510c4fa06e43bd35e2dc8e1008d0590cbe266) Spark deployment mode : Spark-Cluster Notice that if we enable Kerberos mode, the spark yarn client fails with the following: Could not initialize class org.apache.hadoop.hive.ql.metadata.Hive java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.ql.metadata.Hive at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.yarn.Client$.org$apache$spark$deploy$yarn$Client$$obtainTokenForHiveMetastore(Client.scala:1252) at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:271) at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:629) at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:119) at org.apache.spark.deploy.yarn.Client.run(Client.scala:907) Diving in Yarn Client.scala code and tested against different dependencies and notice the followings: if the kerberos mode is enabled, Client.obtainTokenForHiveMetastore() will try to use scala reflection to get Hive and HiveConf and method on these method. val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive") val hive = hiveClass.getMethod("get").invoke(null) val hiveConf = hiveClass.getMethod("getConf").invoke(hive) val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf") val hiveConfGet = (param: String) => Option(hiveConfClass .getMethod("get", classOf[java.lang.String]) .invoke(hiveConf, param)) If the "org.spark-project.hive" % "hive-exec" % "1.2.1.spark" is used, then you will get above exception. But if we use the "org.apache.hive" % "hive-exec" "0.13.1-cdh5.2.0" The above method will not throw exception. {noformat} > YarnClient cant get tokens to talk to Hive in a secure cluster > -- > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. This appears to be because the > constructor of the {{ org.apache.hadoop.hive.ql.metadata.Hive}} class was > made private and replaced with a factory method. The YARN client uses > reflection to get the tokens, so the signature changes weren't picked up in > SPARK-8064. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11317) YARN HBase token code shouldn't swallow invocation target exceptions
[ https://issues.apache.org/jira/browse/SPARK-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974905#comment-14974905 ] Steve Loughran commented on SPARK-11317: I'll do this as soon as SPARK-11265 is in; I've factored the code there to make it straightforward. This is actually a very important patch, because the current code *will not log any authentication problems*. All you get is an "invocation target exception" message in the log, which isn't enough to fix things > YARN HBase token code shouldn't swallow invocation target exceptions > > > Key: SPARK-11317 > URL: https://issues.apache.org/jira/browse/SPARK-11317 > Project: Spark > Issue Type: Bug >Reporter: Steve Loughran > > As with SPARK-11265; the HBase token retrieval code of SPARK-6918 > 1. swallows exceptions it should be rethrowing as serious problems (e.g > NoSuchMethodException) > 1. Swallows any exception raised by the HBase client, without even logging > the details (it logs that an `InvocationTargetException` was caught, but not > the contents) > As such it is potentially brittle to changes in the HDFS client code, and > absolutely not going to provide any assistance if HBase won't actually issue > tokens to the caller. > The code in SPARK-11265 can be re-used to provide consistent and better > exception processing -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
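A minimal sketch of the exception handling argued for above, assuming nothing about the eventual Spark patch: unwrap the {{InvocationTargetException}} so the underlying Hive/HBase failure is logged and rethrown rather than swallowed.
{code}
import java.lang.reflect.InvocationTargetException

object ReflectionErrorSketch {
  /** Run a reflective call, surfacing the real cause of any InvocationTargetException. */
  def invokeRethrowingCause[T](operation: String)(call: => T): T =
    try {
      call
    } catch {
      case e: InvocationTargetException =>
        val cause = if (e.getCause != null) e.getCause else e
        // A real implementation would use Spark's Logging trait; println keeps the sketch standalone.
        println(s"$operation failed: $cause")
        throw cause
    }
}
{code}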
[jira] [Created] (SPARK-11373) Add metrics to the History Server and providers
Steve Loughran created SPARK-11373: -- Summary: Add metrics to the History Server and providers Key: SPARK-11373 URL: https://issues.apache.org/jira/browse/SPARK-11373 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 1.5.1 Reporter: Steve Loughran The History server doesn't publish metrics about JVM load or anything from the history provider plugins. This means that performance problems from massive job histories aren't visible to management tools, and nor are any provider-generated metrics such as time to load histories, failed history loads, the number of connectivity failures talking to remote services, etc. If the history server set up a metrics registry and offered the option to publish its metrics, then management tools could view this data. # the metrics registry would need to be passed down to the instantiated {{ApplicationHistoryProvider}}, in order for it to register its metrics. # if the codahale metrics servlet were registered under a path such as {{/metrics}}, the values would be visible as HTML and JSON, without the need for management tools. # Integration tests could also retrieve the JSON-formatted data and use it as part of the test suites. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11373) Add metrics to the History Server and providers
[ https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978322#comment-14978322 ] Steve Loughran commented on SPARK-11373: # This has tangible benefit for the SPARK-1537 YARN ATS binding, because connectivity failures, GET performance and similar do surface. There are some {{AtomicLong}} counters in its {{YarnHistoryProvider}}, but I'm not planning to add counters and metrics until after that is checked in. # All providers will benefit from the standard JVM performance counters and GC statistics. # the FS history provider could also track time to list and load histories; time of last refresh, time to load most recent history, etc. —information needed to identify where an unresponsive UI is getting its problems from. > Add metrics to the History Server and providers > --- > > Key: SPARK-11373 > URL: https://issues.apache.org/jira/browse/SPARK-11373 > Project: Spark > Issue Type: New Feature > Components: Spark Core >Affects Versions: 1.5.1 >Reporter: Steve Loughran > > The History server doesn't publish metrics about JVM load or anything from > the history provider plugins. This means that performance problems from > massive job histories aren't visible to management tools, and nor are any > provider-generated metrics such as time to load histories, failed history > loads, the number of connectivity failures talking to remote services, etc. > If the history server set up a metrics registry and offered the option to > publish its metrics, then management tools could view this data. > # the metrics registry would need to be passed down to the instantiated > {{ApplicationHistoryProvider}}, in order for it to register its metrics. > # if the codahale metrics servlet were registered under a path such as > {{/metrics}}, the values would be visible as HTML and JSON, without the need > for management tools. > # Integration tests could also retrieve the JSON-formatted data and use it as > part of the test suites. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
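To make the provider-side point concrete, a sketch of the kind of instrumentation meant here; the metric names and the {{FsProviderMetrics}} class are made up for illustration.
{code}
import java.util.concurrent.atomic.AtomicLong
import com.codahale.metrics.{Gauge, MetricRegistry, Timer}

// Illustrative metrics a filesystem-based provider might register.
class FsProviderMetrics(registry: MetricRegistry) {
  val listTimer: Timer = registry.timer("history.fs.list.duration")
  val loadTimer: Timer = registry.timer("history.fs.load.duration")
  private val lastRefresh = new AtomicLong(System.currentTimeMillis())

  // Seconds since the last successful listing -- useful for spotting a stalled provider.
  registry.register("history.fs.last.refresh.age.seconds", new Gauge[Long] {
    override def getValue: Long = (System.currentTimeMillis() - lastRefresh.get()) / 1000
  })

  // Time a listing operation; only a successful run updates the refresh timestamp.
  def timedListing[T](listing: => T): T = {
    val ctx = listTimer.time()
    try {
      val result = listing
      lastRefresh.set(System.currentTimeMillis())
      result
    } finally {
      ctx.stop()
    }
  }
}
{code}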
[jira] [Created] (SPARK-11375) History Server "no histories" message to be dynamically generated by ApplicationHistoryProviders
Steve Loughran created SPARK-11375: -- Summary: History Server "no histories" message to be dynamically generated by ApplicationHistoryProviders Key: SPARK-11375 URL: https://issues.apache.org/jira/browse/SPARK-11375 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.5.1 Reporter: Steve Loughran Priority: Minor When there are no histories, the {{HistoryPage}} displays an error text which assumes that the provider is the {{FsHistoryProvider}}, and its sole failure mode is "directory not found" {code} Did you specify the correct logging directory? Please verify your setting of spark.history.fs.logDirectory {code} Different providers have different failure modes, and even the filesystem provider has some, such as an access control exception, or the specified directory path actually being a file. If the {{ApplicationHistoryProvider}} was itself asked to provide an error message, then it could * be dynamically generated to show the current state of the history provider * potentially include any exceptions to list * display the actual values of settings such as the log directory property. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11375) History Server "no histories" message to be dynamically generated by ApplicationHistoryProviders
[ https://issues.apache.org/jira/browse/SPARK-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978329#comment-14978329 ] Steve Loughran commented on SPARK-11375: This could be implemented with a new method on {{ApplicationHistoryProvider}}; something like {code} getDiagnosticsInfo(): (String, Option[String]) = { ... } {code} which would return a plain-text summary plus optional formatted text for insertion into a {{<pre>}} section. That would allow stack traces to be displayed readably. A default implementation would simply return the current message. Note that the HTML would have to be sanitized before display, with angle brackets escaped. > History Server "no histories" message to be dynamically generated by > ApplicationHistoryProviders > > > Key: SPARK-11375 > URL: https://issues.apache.org/jira/browse/SPARK-11375 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.5.1 >Reporter: Steve Loughran >Priority: Minor > > When there are no histories, the {{HistoryPage}} displays an error text which > assumes that the provider is the {{FsHistoryProvider}}, and its sole failure > mode is "directory not found" > {code} > Did you specify the correct logging directory? > Please verify your setting of spark.history.fs.logDirectory > {code} > Different providers have different failure modes, and even the filesystem > provider has some, such as an access control exception, or the specified > directory path actually being a file. > If the {{ApplicationHistoryProvider}} was itself asked to provide an error > message, then it could > * be dynamically generated to show the current state of the history provider > * potentially include any exceptions to list > * display the actual values of settings such as the log directory property. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
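A sketch of how that could look, assuming the signature above; the trait name, default message and rendering helper here are illustrative only, not the existing {{ApplicationHistoryProvider}} API.
{code}
import org.apache.commons.lang3.StringEscapeUtils

// Sketch: only the proposed method and its default are shown; the real provider
// class lives in the history server package.
trait DiagnosticsProvider {
  /**
   * Summary text plus optional preformatted detail (e.g. a stack trace),
   * explaining why no histories are visible.
   */
  def getDiagnosticsInfo(): (String, Option[String]) =
    ("Did you specify the correct logging directory? " +
      "Please verify your setting of spark.history.fs.logDirectory", None)
}

// The page rendering the message must escape both parts before emitting HTML.
def render(provider: DiagnosticsProvider): String = {
  val (summary, detail) = provider.getDiagnosticsInfo()
  val pre = detail
    .map(d => s"<pre>${StringEscapeUtils.escapeHtml4(d)}</pre>")
    .getOrElse("")
  s"<p>${StringEscapeUtils.escapeHtml4(summary)}</p>$pre"
}
{code}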
[jira] [Updated] (SPARK-11317) YARN HBase token code shouldn't swallow invocation target exceptions
[ https://issues.apache.org/jira/browse/SPARK-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-11317: --- Affects Version/s: 1.5.1 > YARN HBase token code shouldn't swallow invocation target exceptions > > > Key: SPARK-11317 > URL: https://issues.apache.org/jira/browse/SPARK-11317 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 >Reporter: Steve Loughran > > As with SPARK-11265; the HBase token retrieval code of SPARK-6918 > 1. swallows exceptions it should be rethrowing as serious problems (e.g > NoSuchMethodException) > 1. Swallows any exception raised by the HBase client, without even logging > the details (it logs that an `InvocationTargetException` was caught, but not > the contents) > As such it is potentially brittle to changes in the HBase client code, and > absolutely not going to provide any assistance if HBase won't actually issue > tokens to the caller. > The code in SPARK-11265 can be re-used to provide consistent and better > exception processing -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-11317) YARN HBase token code shouldn't swallow invocation target exceptions
[ https://issues.apache.org/jira/browse/SPARK-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-11317: --- Description: As with SPARK-11265; the HBase token retrieval code of SPARK-6918 1. swallows exceptions it should be rethrowing as serious problems (e.g NoSuchMethodException) 1. Swallows any exception raised by the HBase client, without even logging the details (it logs that an `InvocationTargetException` was caught, but not the contents) As such it is potentially brittle to changes in the HBase client code, and absolutely not going to provide any assistance if HBase won't actually issue tokens to the caller. The code in SPARK-11265 can be re-used to provide consistent and better exception processing was: As with SPARK-11265; the HBase token retrieval code of SPARK-6918 1. swallows exceptions it should be rethrowing as serious problems (e.g NoSuchMethodException) 1. Swallows any exception raised by the HBase client, without even logging the details (it logs that an `InvocationTargetException` was caught, but not the contents) As such it is potentially brittle to changes in the HDFS client code, and absolutely not going to provide any assistance if HBase won't actually issue tokens to the caller. The code in SPARK-11265 can be re-used to provide consistent and better exception processing > YARN HBase token code shouldn't swallow invocation target exceptions > > > Key: SPARK-11317 > URL: https://issues.apache.org/jira/browse/SPARK-11317 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 >Reporter: Steve Loughran > > As with SPARK-11265; the HBase token retrieval code of SPARK-6918 > 1. swallows exceptions it should be rethrowing as serious problems (e.g > NoSuchMethodException) > 1. Swallows any exception raised by the HBase client, without even logging > the details (it logs that an `InvocationTargetException` was caught, but not > the contents) > As such it is potentially brittle to changes in the HBase client code, and > absolutely not going to provide any assistance if HBase won't actually issue > tokens to the caller. > The code in SPARK-11265 can be re-used to provide consistent and better > exception processing -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2089) With YARN, preferredNodeLocalityData isn't honored
[ https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980278#comment-14980278 ] Steve Loughran commented on SPARK-2089: --- Bear in mind that if you start specifying locations, it takes longer for placement requests to be granted. With the normal Yarn schedulers that's only 10+ seconds before it downgrades (if relax-placement==true), but it can impact short-lived work. For long-lived apps, placement policy is often driven by people wanting anti-affinity in container placements, so that you get independent failure patterns. YARN doesn't support this (and doesn't have it on its roadmap), so people get to do it themselves. For the curious, SLIDER-82 is my homework there. There's also "where I was last time", with or without fallback to rest-of-cluster. Here the rapid fallback of YARN pushes long-lived services to implement their own escalation/relaxation logic if they want to give YARN more time to assign it to an explicit node. Trying to do clever/complex placement is [a hard problem|http://slider.apache.org/design/rolehistory.html]. I'd generally advocate avoiding it. Having a list of nodes to request, and perhaps a list of nodes to avoid, would be simpler. Maybe also add some option of placement policy, with two: "none", "configured" for now, leaving room for future options "anti-affinity", "historical", ... if someone ever wanted this stuff. Oh, and have I mentioned testing all of this... > With YARN, preferredNodeLocalityData isn't honored > --- > > Key: SPARK-2089 > URL: https://issues.apache.org/jira/browse/SPARK-2089 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.0.0 >Reporter: Sandy Ryza >Assignee: Sandy Ryza >Priority: Critical > > When running in YARN cluster mode, apps can pass preferred locality data when > constructing a Spark context that will dictate where to request executor > containers. > This is currently broken because of a race condition. The Spark-YARN code > runs the user class and waits for it to start up a SparkContext. During its > initialization, the SparkContext will create a YarnClusterScheduler, which > notifies a monitor in the Spark-YARN code that . The Spark-Yarn code then > immediately fetches the preferredNodeLocationData from the SparkContext and > uses it to start requesting containers. > But in the SparkContext constructor that takes the preferredNodeLocationData, > setting preferredNodeLocationData comes after the rest of the initialization, > so, if the Spark-YARN code comes around quickly enough after being notified, > the data that's fetched is the empty unset version. This occurred during all > of my runs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
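To illustrate what a "configured" placement policy means at the YARN API level (resource sizes and node names below are arbitrary examples; this is not Spark code), a container request pinned to specific nodes with locality relaxation switched off:
{code}
import org.apache.hadoop.yarn.api.records.{Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

// Arbitrary example values.
val capability = Resource.newInstance(4096, 2)   // 4 GB, 2 vcores
val priority = Priority.newInstance(1)
val preferredNodes = Array("node1.example.org", "node2.example.org")

// relaxLocality = false: wait for the named nodes rather than falling back to
// rack or cluster placement after the scheduler's node-locality delay.
val request = new ContainerRequest(
  capability,
  preferredNodes,
  null,          // racks: let YARN derive them from the nodes
  priority,
  false)
{code}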
[jira] [Created] (SPARK-11314) Add service API and test service for Yarn Cluster schedulers
Steve Loughran created SPARK-11314: -- Summary: Add service API and test service for Yarn Cluster schedulers Key: SPARK-11314 URL: https://issues.apache.org/jira/browse/SPARK-11314 Project: Spark Issue Type: Sub-task Components: YARN Affects Versions: 1.5.1 Environment: Hadoop 2.2+ cluster Reporter: Steve Loughran Provide an extension model to load and run implementations of {{SchedulerExtensionService}} in the yarn cluster scheduler process —and to stop them afterwards. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
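As a sketch of the shape this might take (the trait methods, config key and loading logic here are all assumptions made for illustration, not a settled API):
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId
import org.apache.spark.SparkContext

// Hypothetical extension point: started in the YARN cluster scheduler process,
// stopped on shutdown.
trait SchedulerExtensionService {
  def start(sc: SparkContext, appId: ApplicationId): Unit
  def stop(): Unit
}

// Services listed in a (hypothetical) config option are instantiated reflectively
// and started; the caller would stop them in reverse order at shutdown.
def loadServices(sc: SparkContext, appId: ApplicationId): Seq[SchedulerExtensionService] = {
  sc.getConf.get("spark.yarn.services", "")
    .split(",").map(_.trim).filter(_.nonEmpty)
    .map { name =>
      val svc = Class.forName(name).newInstance().asInstanceOf[SchedulerExtensionService]
      svc.start(sc, appId)
      svc
    }
    .toSeq
}
{code}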
[jira] [Updated] (SPARK-11265) YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-11265: --- Summary: YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster (was: YarnClient can't get tokens to talk to Hive in a secure cluster) > YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster > - > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. This appears to be because the > constructor of the {{ org.apache.hadoop.hive.ql.metadata.Hive}} class was > made private and replaced with a factory method. The YARN client uses > reflection to get the tokens, so the signature changes weren't picked up in > SPARK-8064. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11265) YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974239#comment-14974239 ] Steve Loughran commented on SPARK-11265: What's changed? The spark code uses reflection to get the method {{("org.apache.hadoop.hive.ql.metadata.Hive#get"), then invokes it with a single argument: {{hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])}} Hive 0.13 has >1 method with this name, even in Hive 0.31.1; it has, in order, {{get(HiveConf}}, {{get(HiveConf, boolean)}}, and {{get()}}. Hive 1.2.1 adds one new method {{get(Configuration c, Class clazz)}} *before* the others, and now invoke is failing as the returned method doesn't take a HiveConf. What could have been happening here is that the {{Class.get()}} method was returning the {{get(HiveConf}} method because it was first in the file, and on 1.2.1 the new method returned the new one, which didn't take a single {{HiveConf}}, hence the stack trace The fix, under all of it, is simply getting the method {{get(HiveConf.class)}}, and invoking it with the configuration created by reflection. That's all: explicitly asking for a method that's always been there. The code probably worked before just because nobody was looking at it. > YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster > - > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
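Based on the reflection code quoted in the initial report, the fix described above would look roughly like this (a sketch only; error handling and the surrounding {{Client}} code are omitted):
{code}
import scala.reflect.runtime.{universe => ru}

val mirror = ru.runtimeMirror(getClass.getClassLoader)
val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
val hiveConf = hiveConfClass.newInstance().asInstanceOf[Object]

// Ask explicitly for the get(HiveConf) overload rather than getMethod("get"),
// so the lookup is unambiguous whatever other overloads Hive adds.
val getHive = hiveClass.getMethod("get", hiveConfClass)
val hive = getHive.invoke(null, hiveConf)
{code}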
[jira] [Comment Edited] (SPARK-11265) YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974239#comment-14974239 ] Steve Loughran edited comment on SPARK-11265 at 10/26/15 2:08 PM: -- What's changed? The spark code uses reflection to get the method {{("org.apache.hadoop.hive.ql.metadata.Hive#get")}}, then invokes it with a single argument: {{hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])}} Hive 0.13 has >1 method; it has, in order, {{get(HiveConf}}, {{get(HiveConf, boolean)}}, and {{get()}}. Hive 1.2.1 adds one new method {{get(Configuration c, Class clazz)}} *before* the others, and now invoke is failing as the returned method doesn't take a HiveConf. What could have been happening here is that the {{Class.get()}} method was returning the {{get(HiveConf}} method because it was first in the file, and on 1.2.1 the new method returned the new one, which didn't take a single {{HiveConf}}, hence the stack trace The fix, under all of it, is simply getting the method {{get(HiveConf.class)}}, and invoking it with the configuration created by reflection. That's all: explicitly asking for a method that's always been there. The code probably worked before just because nobody was looking at it. was (Author: ste...@apache.org): What's changed? The spark code uses reflection to get the method {{("org.apache.hadoop.hive.ql.metadata.Hive#get")}}, then invokes it with a single argument: {{hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])}} Hive 0.13 has >1 method with this name, even in Hive 0.31.1; it has, in order, {{get(HiveConf}}, {{get(HiveConf, boolean)}}, and {{get()}}. Hive 1.2.1 adds one new method {{get(Configuration c, Class clazz)}} *before* the others, and now invoke is failing as the returned method doesn't take a HiveConf. What could have been happening here is that the {{Class.get()}} method was returning the {{get(HiveConf}} method because it was first in the file, and on 1.2.1 the new method returned the new one, which didn't take a single {{HiveConf}}, hence the stack trace The fix, under all of it, is simply getting the method {{get(HiveConf.class)}}, and invoking it with the configuration created by reflection. That's all: explicitly asking for a method that's always been there. The code probably worked before just because nobody was looking at it. > YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster > - > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-11315) Add YARN extension service to publish Spark events to YARN timeline service
Steve Loughran created SPARK-11315: -- Summary: Add YARN extension service to publish Spark events to YARN timeline service Key: SPARK-11315 URL: https://issues.apache.org/jira/browse/SPARK-11315 Project: Spark Issue Type: Sub-task Components: YARN Affects Versions: 1.5.1 Environment: Hadoop 2.6+ Reporter: Steve Loughran Add an extension service (using SPARK-11314) to subscribe to Spark lifecycle events, batch them and forward them to the YARN Application Timeline Service. This data can then be retrieved by a new back end for the Spark History Service, and by other analytics tools. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
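A very rough sketch of the shape this could take; the listener, batching policy and entity type below are assumptions made for illustration, while the timeline client calls are the Hadoop 2.6 API.
{code}
import java.util.concurrent.LinkedBlockingQueue
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
import org.apache.hadoop.yarn.client.api.TimelineClient
import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent, SparkListenerJobStart}

// Queue Spark events and flush them to the timeline service in batches.
class TimelinePublishingListener(hadoopConf: Configuration) extends SparkListener {
  private val queue = new LinkedBlockingQueue[SparkListenerEvent]()
  private val client = TimelineClient.createTimelineClient()
  client.init(hadoopConf)
  client.start()

  // One handler shown; the real service would enqueue every event type it cares about.
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = queue.put(jobStart)

  // Called from a background thread: turn the queued events into one timeline entity.
  def flush(): Unit = {
    if (!queue.isEmpty) {
      val entity = new TimelineEntity()
      entity.setEntityType("spark_event_batch")            // made-up type name
      entity.setEntityId(System.currentTimeMillis().toString)
      // ... map each drained SparkListenerEvent onto TimelineEvent instances here ...
      client.putEntities(entity)
    }
  }
}
{code}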
[jira] [Updated] (SPARK-11265) YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-11265: --- Description: As reported on the dev list, trying to run a YARN client which wants to talk to Hive in a Kerberized hadoop cluster fails. (was: As reported on the dev list, trying to run a YARN client which wants to talk to Hive in a Kerberized hadoop cluster fails. This appears to be because the constructor of the {{ org.apache.hadoop.hive.ql.metadata.Hive}} class was made private and replaced with a factory method. The YARN client uses reflection to get the tokens, so the signature changes weren't picked up in SPARK-8064.) > YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster > - > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-11265) YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974239#comment-14974239 ] Steve Loughran edited comment on SPARK-11265 at 10/26/15 2:08 PM: -- What's changed? The spark code uses reflection to get the method {{("org.apache.hadoop.hive.ql.metadata.Hive#get")}}, then invokes it with a single argument: {{hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])}} Hive 0.13 has >1 method with this name, even in Hive 0.31.1; it has, in order, {{get(HiveConf}}, {{get(HiveConf, boolean)}}, and {{get()}}. Hive 1.2.1 adds one new method {{get(Configuration c, Class clazz)}} *before* the others, and now invoke is failing as the returned method doesn't take a HiveConf. What could have been happening here is that the {{Class.get()}} method was returning the {{get(HiveConf}} method because it was first in the file, and on 1.2.1 the new method returned the new one, which didn't take a single {{HiveConf}}, hence the stack trace The fix, under all of it, is simply getting the method {{get(HiveConf.class)}}, and invoking it with the configuration created by reflection. That's all: explicitly asking for a method that's always been there. The code probably worked before just because nobody was looking at it. was (Author: ste...@apache.org): What's changed? The spark code uses reflection to get the method {{("org.apache.hadoop.hive.ql.metadata.Hive#get"), then invokes it with a single argument: {{hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])}} Hive 0.13 has >1 method with this name, even in Hive 0.31.1; it has, in order, {{get(HiveConf}}, {{get(HiveConf, boolean)}}, and {{get()}}. Hive 1.2.1 adds one new method {{get(Configuration c, Class clazz)}} *before* the others, and now invoke is failing as the returned method doesn't take a HiveConf. What could have been happening here is that the {{Class.get()}} method was returning the {{get(HiveConf}} method because it was first in the file, and on 1.2.1 the new method returned the new one, which didn't take a single {{HiveConf}}, hence the stack trace The fix, under all of it, is simply getting the method {{get(HiveConf.class)}}, and invoking it with the configuration created by reflection. That's all: explicitly asking for a method that's always been there. The code probably worked before just because nobody was looking at it. > YarnClient can't get tokens to talk to Hive 1.2.1 in a secure cluster > - > > Key: SPARK-11265 > URL: https://issues.apache.org/jira/browse/SPARK-11265 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.1 > Environment: Kerberized Hadoop cluster >Reporter: Steve Loughran > > As reported on the dev list, trying to run a YARN client which wants to talk > to Hive in a Kerberized hadoop cluster fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-11317) YARN HBase token code shouldn't swallow invocation target exceptions
Steve Loughran created SPARK-11317: -- Summary: YARN HBase token code shouldn't swallow invocation target exceptions Key: SPARK-11317 URL: https://issues.apache.org/jira/browse/SPARK-11317 Project: Spark Issue Type: Bug Reporter: Steve Loughran As with SPARK-11265; the HBase token retrieval code of SPARK-6918 1. swallows exceptions it should be rethrowing as serious problems (e.g NoSuchMethodException) 1. Swallows any exception raised by the HBase client, without even logging the details (it logs that an `InvocationTargetException` was caught, but not the contents) As such it is potentially brittle to changes in the HDFS client code, and absolutely not going to provide any assistance if HBase won't actually issue tokens to the caller. The code in SPARK-11265 can be re-used to provide consistent and better exception processing -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-11323) Add History Service Provider to service application histories from YARN timeline server
Steve Loughran created SPARK-11323: -- Summary: Add History Service Provider to service application histories from YARN timeline server Key: SPARK-11323 URL: https://issues.apache.org/jira/browse/SPARK-11323 Project: Spark Issue Type: Sub-task Components: YARN Affects Versions: 1.5.1 Reporter: Steve Loughran Add an {{ApplicationHistoryProvider}} for enumerating and viewing application histories from the YARN timeline server. As the provider will only run in a YARN cluster, it can take advantage of the Yarn Client API to identify those applications which have terminated without explicitly declaring this in their event histories. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
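For the termination check mentioned above, the YARN client call is roughly as follows (a sketch; how the provider would create, cache and reuse the client is left out):
{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.yarn.api.records.{ApplicationId, YarnApplicationState}
import org.apache.hadoop.yarn.client.api.YarnClient

// Ask the ResourceManager whether an application has actually terminated.
def isFinished(conf: Configuration, appId: ApplicationId): Boolean = {
  val yarn = YarnClient.createYarnClient()
  yarn.init(conf)
  yarn.start()
  try {
    val state = yarn.getApplicationReport(appId).getYarnApplicationState
    state == YarnApplicationState.FINISHED ||
      state == YarnApplicationState.FAILED ||
      state == YarnApplicationState.KILLED
  } finally {
    yarn.stop()
  }
}
{code}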
[jira] [Commented] (SPARK-11373) Add metrics to the History Server and providers
[ https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997215#comment-14997215 ] Steve Loughran commented on SPARK-11373: [~charlesyeh] I've just put up a pull request of what I had in mind, with those basic fs metrics, and JVM & thread info. I couldn't hook this up to the spark metrics system as there wasn't one that could be used ... for now I've just gone direct to the codahale servlets and classes for registration. Your suggestion of a new history metrics system would be the right thing to do ... but I would really like those metrics to be fetchable as bits of JSON at the end of URLs —that's both enumerating the whole set and reading specific values. Why? # lets me ask for performance stats from anyone with a web browser to hand, you can say "do a curl history:1800/metrics/metrics > metrics.json" and I've got something I can attach to bug reports. # lets me write tests which query the metrics for the state of the provider, e.g. probe a counter of seconds-since-successful update to be between 0 and 60 before trying to list the applications and expecting them to be found. Or, after mocking a connectivity failure, verify that the failure counts have gone up. Anyway: the draft is up, I won't be working on it again for the next couple of weeks —if, after reviewing my patch you could take it and do a real spark history metrics system, that'd really progress it. And again, that's where the servlets would help: testing the metrics system itself. > Add metrics to the History Server and providers > --- > > Key: SPARK-11373 > URL: https://issues.apache.org/jira/browse/SPARK-11373 > Project: Spark > Issue Type: New Feature > Components: Spark Core >Affects Versions: 1.5.1 >Reporter: Steve Loughran > > The History server doesn't publish metrics about JVM load or anything from > the history provider plugins. This means that performance problems from > massive job histories aren't visible to management tools, and nor are any > provider-generated metrics such as time to load histories, failed history > loads, the number of connectivity failures talking to remote services, etc. > If the history server set up a metrics registry and offered the option to > publish its metrics, then management tools could view this data. > # the metrics registry would need to be passed down to the instantiated > {{ApplicationHistoryProvider}}, in order for it to register its metrics. > # if the codahale metrics servlet were registered under a path such as > {{/metrics}}, the values would be visible as HTML and JSON, without the need > for management tools. > # Integration tests could also retrieve the JSON-formatted data and use it as > part of the test suites. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
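To illustrate the test-side use (the endpoint path, port and metric name here are assumptions, not a defined interface): fetch the servlet's JSON output and assert on a gauge before proceeding.
{code}
import java.net.URL
import scala.io.Source
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

implicit val formats = DefaultFormats

// Read one gauge value from the Codahale JSON servlet output.
def gaugeValue(historyServer: String, metric: String): Long = {
  val body = Source.fromURL(new URL(s"$historyServer/metrics/json")).mkString
  (parse(body) \ "gauges" \ metric \ "value").extract[Long]
}

// e.g. in a test: require a recent refresh before listing applications.
// assert(gaugeValue("http://localhost:18080", "history.fs.last.refresh.age.seconds") < 60)
{code}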
[jira] [Updated] (SPARK-10793) Make spark's use/subclassing of hive more maintainable
[ https://issues.apache.org/jira/browse/SPARK-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-10793: --- Summary: Make spark's use/subclassing of hive more maintainable (was: Make sparks use/subclassing of hive more maintainable) > Make spark's use/subclassing of hive more maintainable > -- > > Key: SPARK-10793 > URL: https://issues.apache.org/jira/browse/SPARK-10793 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 1.5.0 >Reporter: Steve Loughran > > The latest spark/hive integration round has closed the gap with Hive > versions, but the integration is still pretty complex > # SparkSQL has deep hooks into the parser > # hivethriftserver uses "aggressive reflection" to inject spark classes into > the Hive base classes. > # there's a separate org.sparkproject.hive JAR to isolate Kryo versions while > avoiding the hive uberjar with all its dependencies getting into the spark > uberjar. > We can improve this with some assistance from the other projects, even though > no guarantees of stability of things like the parser and thrift server APIs > are likely in the near future -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org