[jira] [Reopened] (SPARK-2742) The variables inputFormatInfo and inputFormatMap are never used
[ https://issues.apache.org/jira/browse/SPARK-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-2742: Oops - I closed it accidentally. The variables inputFormatInfo and inputFormatMap are never used -- Key: SPARK-2742 URL: https://issues.apache.org/jira/browse/SPARK-2742 Project: Spark Issue Type: Bug Components: YARN Reporter: meiyoula Priority: Minor The ClientArguments class has two unused variables: one is inputFormatInfo, the other is inputFormatMap. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-1779) Warning when spark.storage.memoryFraction is not between 0 and 1
[ https://issues.apache.org/jira/browse/SPARK-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1779. Resolution: Fixed Fixed via: https://github.com/apache/spark/pull/714 Warning when spark.storage.memoryFraction is not between 0 and 1 Key: SPARK-1779 URL: https://issues.apache.org/jira/browse/SPARK-1779 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 0.9.0, 1.0.0 Reporter: wangfei Fix For: 1.1.0 There should be a warning when memoryFraction is lower than 0 or greater than 1
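The requested check amounts to validating a fraction-valued config entry at read time. A minimal sketch (hypothetical Python, not Spark's actual Scala configuration code; the function and dict-based config are illustrative assumptions):

```python
import warnings

def validate_memory_fraction(conf: dict,
                             key: str = "spark.storage.memoryFraction",
                             default: float = 0.6) -> float:
    """Read a fraction-valued config entry and warn when it falls
    outside [0, 1], as SPARK-1779 requests."""
    value = float(conf.get(key, default))
    if not 0.0 <= value <= 1.0:
        warnings.warn(f"{key} should be between 0 and 1, got {value}")
    return value
```

Emitting a warning rather than failing keeps existing (misconfigured) jobs running while still surfacing the mistake.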
[jira] [Resolved] (SPARK-2859) Update url of Kryo project in related docs
[ https://issues.apache.org/jira/browse/SPARK-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2859. Resolution: Fixed Fix Version/s: 1.1.0 1.0.3 Issue resolved by pull request 1782 [https://github.com/apache/spark/pull/1782] Update url of Kryo project in related docs -- Key: SPARK-2859 URL: https://issues.apache.org/jira/browse/SPARK-2859 Project: Spark Issue Type: Documentation Components: Documentation Reporter: Guancheng Chen Priority: Trivial Fix For: 1.0.3, 1.1.0 The Kryo project has been migrated from Google Code to GitHub, hence we need to update its URL in related docs such as tuning.md.
[jira] [Updated] (SPARK-2380) Support displaying accumulator contents in the web UI
[ https://issues.apache.org/jira/browse/SPARK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2380: --- Fix Version/s: 1.1.0 Support displaying accumulator contents in the web UI - Key: SPARK-2380 URL: https://issues.apache.org/jira/browse/SPARK-2380 Project: Spark Issue Type: Improvement Components: Spark Core, Web UI Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Critical Fix For: 1.1.0
[jira] [Resolved] (SPARK-2380) Support displaying accumulator contents in the web UI
[ https://issues.apache.org/jira/browse/SPARK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2380. Resolution: Fixed Resolved by: https://github.com/apache/spark/pull/1309 Support displaying accumulator contents in the web UI - Key: SPARK-2380 URL: https://issues.apache.org/jira/browse/SPARK-2380 Project: Spark Issue Type: Improvement Components: Spark Core, Web UI Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Critical
[jira] [Created] (SPARK-2868) Support named accumulators in Python
Patrick Wendell created SPARK-2868: -- Summary: Support named accumulators in Python Key: SPARK-2868 URL: https://issues.apache.org/jira/browse/SPARK-2868 Project: Spark Issue Type: New Feature Components: PySpark Reporter: Patrick Wendell SPARK-2380 added this for Java/Scala. To allow this in Python we'll need to make some additional changes. One potential path is to have a 1:1 correspondence with Scala accumulators (instead of a one-to-many mapping). A challenge is exposing the stringified values of the accumulators to the Scala code.
[jira] [Created] (SPARK-2882) Spark build no longer checks local maven cache for dependencies
Patrick Wendell created SPARK-2882: -- Summary: Spark build no longer checks local maven cache for dependencies Key: SPARK-2882 URL: https://issues.apache.org/jira/browse/SPARK-2882 Project: Spark Issue Type: Sub-task Reporter: Patrick Wendell Assignee: Gregory Owen
[jira] [Commented] (SPARK-2678) `Spark-submit` overrides user application options
[ https://issues.apache.org/jira/browse/SPARK-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088103#comment-14088103 ] Patrick Wendell commented on SPARK-2678: Fixed in 1.1.0 via: https://github.com/apache/spark/pull/1801 `Spark-submit` overrides user application options - Key: SPARK-2678 URL: https://issues.apache.org/jira/browse/SPARK-2678 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 1.0.1, 1.0.2 Reporter: Cheng Lian Assignee: Cheng Lian Priority: Blocker Fix For: 1.1.0 Here is an example: {code} ./bin/spark-submit --class Foo some.jar --help {code} Since {{--help}} appears after the primary resource (i.e. {{some.jar}}), it should be recognized as a user application option. But it's actually overridden by {{spark-submit}}, which shows the {{spark-submit}} help message instead. When directly invoking {{spark-submit}}, the constraints are: # Options before the primary resource should be recognized as {{spark-submit}} options # Options after the primary resource should be recognized as user application options The tricky part is how to handle scripts like {{spark-shell}} that delegate to {{spark-submit}}. These scripts allow users to specify both {{spark-submit}} options like {{--master}} and user-defined application options together. For example, say we'd like to write a new script {{start-thriftserver.sh}} to start the Hive Thrift server; basically we may do this: {code} $SPARK_HOME/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal $@ {code} Then users may call this script like: {code} ./sbin/start-thriftserver.sh --master spark://some-host:7077 --hiveconf key=value {code} Notice that all options are captured by {{$@}}.
If we put {{$@}} before {{spark-internal}}, all the options are recognized as {{spark-submit}} options, so {{--hiveconf}} won't be passed to {{HiveThriftServer2}}; if we put it after {{spark-internal}}, they *should* all be recognized as options of {{HiveThriftServer2}}, but because of this bug, {{--master}} is still recognized as a {{spark-submit}} option, which happens to produce the right behavior. Although all scripts currently using {{spark-submit}} work correctly, we should still fix this bug, because it causes option name collisions between {{spark-submit}} and user applications, and every time we add a new option to {{spark-submit}}, some existing user applications may break. However, fixing this bug may introduce incompatible changes. The suggested solution is to use {{--}} as a separator between {{spark-submit}} options and user application options. For the Hive Thrift server example above, users would call it this way: {code} ./sbin/start-thriftserver.sh --master spark://some-host:7077 -- --hiveconf key=value {code} {{SparkSubmitArguments}} would then be responsible for splitting the two sets of options and passing them on correctly.
[jira] [Updated] (SPARK-2678) `Spark-submit` overrides user application options
[ https://issues.apache.org/jira/browse/SPARK-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2678: --- Target Version/s: 1.1.0, 1.0.3 (was: 1.1.0) `Spark-submit` overrides user application options - Key: SPARK-2678 URL: https://issues.apache.org/jira/browse/SPARK-2678 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 1.0.1, 1.0.2 Reporter: Cheng Lian Assignee: Cheng Lian Priority: Blocker Fix For: 1.1.0 Here is an example: {code} ./bin/spark-submit --class Foo some.jar --help {code} Since {{--help}} appears after the primary resource (i.e. {{some.jar}}), it should be recognized as a user application option. But it's actually overridden by {{spark-submit}}, which shows the {{spark-submit}} help message instead. When directly invoking {{spark-submit}}, the constraints are: # Options before the primary resource should be recognized as {{spark-submit}} options # Options after the primary resource should be recognized as user application options The tricky part is how to handle scripts like {{spark-shell}} that delegate to {{spark-submit}}. These scripts allow users to specify both {{spark-submit}} options like {{--master}} and user-defined application options together. For example, say we'd like to write a new script {{start-thriftserver.sh}} to start the Hive Thrift server; basically we may do this: {code} $SPARK_HOME/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal $@ {code} Then users may call this script like: {code} ./sbin/start-thriftserver.sh --master spark://some-host:7077 --hiveconf key=value {code} Notice that all options are captured by {{$@}}.
If we put {{$@}} before {{spark-internal}}, all the options are recognized as {{spark-submit}} options, so {{--hiveconf}} won't be passed to {{HiveThriftServer2}}; if we put it after {{spark-internal}}, they *should* all be recognized as options of {{HiveThriftServer2}}, but because of this bug, {{--master}} is still recognized as a {{spark-submit}} option, which happens to produce the right behavior. Although all scripts currently using {{spark-submit}} work correctly, we should still fix this bug, because it causes option name collisions between {{spark-submit}} and user applications, and every time we add a new option to {{spark-submit}}, some existing user applications may break. However, fixing this bug may introduce incompatible changes. The suggested solution is to use {{--}} as a separator between {{spark-submit}} options and user application options. For the Hive Thrift server example above, users would call it this way: {code} ./sbin/start-thriftserver.sh --master spark://some-host:7077 -- --hiveconf key=value {code} {{SparkSubmitArguments}} would then be responsible for splitting the two sets of options and passing them on correctly.
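The proposed {{--}} convention amounts to a simple split rule over the argument list. A minimal sketch (hypothetical Python illustration of the rule, not the actual {{SparkSubmitArguments}} code):

```python
def split_args(argv):
    """Split a command line at the first bare '--' into
    (submit_options, app_options). Everything after '--' belongs to
    the user application, even if it looks like a submit flag."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return argv, []  # no separator: all arguments go to spark-submit
```

For example, `split_args(["--master", "spark://h:7077", "--", "--hiveconf", "k=v"])` keeps {{--master}} on the submit side and hands {{--hiveconf k=v}} to the application, with no possibility of name collisions.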
[jira] [Created] (SPARK-2884) Create binary builds in parallel with release script
Patrick Wendell created SPARK-2884: -- Summary: Create binary builds in parallel with release script Key: SPARK-2884 URL: https://issues.apache.org/jira/browse/SPARK-2884 Project: Spark Issue Type: Bug Components: Build, Project Infra Reporter: Patrick Wendell Assignee: Patrick Wendell
[jira] [Resolved] (SPARK-2566) Update ShuffleWriteMetrics as data is written
[ https://issues.apache.org/jira/browse/SPARK-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2566. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1481 [https://github.com/apache/spark/pull/1481] Update ShuffleWriteMetrics as data is written - Key: SPARK-2566 URL: https://issues.apache.org/jira/browse/SPARK-2566 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Sandy Ryza Fix For: 1.1.0 This will allow reporting incremental progress once we have SPARK-2099.
[jira] [Resolved] (SPARK-2879) Use HTTPS to access Maven Central and other repos
[ https://issues.apache.org/jira/browse/SPARK-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2879. Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Sean Owen Closed via: https://github.com/apache/spark/pull/1805 Use HTTPS to access Maven Central and other repos - Key: SPARK-2879 URL: https://issues.apache.org/jira/browse/SPARK-2879 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.0.1 Reporter: Sean Owen Assignee: Sean Owen Priority: Minor Fix For: 1.1.0 Maven Central has just enabled HTTPS access for everyone (http://central.sonatype.org/articles/2014/Aug/03/https-support-launching-now/). This is timely, as a reminder of how easily an attacker can slip malicious code into a build that's downloading artifacts over HTTP (http://blog.ontoillogical.com/blog/2014/07/28/how-to-take-over-any-java-developer/). In the meantime, it looks like the Spring repo also now supports HTTPS, and can be used this way too. I propose to use HTTPS to access these repos.
[jira] [Updated] (SPARK-2887) RDD.countApproxDistinct() is wrong when RDD has more than one partition
[ https://issues.apache.org/jira/browse/SPARK-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2887: --- Fix Version/s: 1.1.0 RDD.countApproxDistinct() is wrong when RDD has more than one partition -- Key: SPARK-2887 URL: https://issues.apache.org/jira/browse/SPARK-2887 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1, 1.0.0, 1.0.2 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker Fix For: 1.1.0 Original Estimate: 1h Remaining Estimate: 1h {code} scala> sc.makeRDD(1 to 1000, 10).countApproxDistinct(0.01) res0: Long = 101 {code}
[jira] [Commented] (SPARK-2887) RDD.countApproxDistinct() is wrong when RDD has more than one partition
[ https://issues.apache.org/jira/browse/SPARK-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088814#comment-14088814 ] Patrick Wendell commented on SPARK-2887: I merged this into 1.1 - [~davies] could you submit a PR for 1.0 and/or 0.9? It didn't merge cleanly into those branches. RDD.countApproxDistinct() is wrong when RDD has more than one partition -- Key: SPARK-2887 URL: https://issues.apache.org/jira/browse/SPARK-2887 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1, 1.0.0, 1.0.2 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker Fix For: 1.1.0 Original Estimate: 1h Remaining Estimate: 1h {code} scala> sc.makeRDD(1 to 1000, 10).countApproxDistinct(0.01) res0: Long = 101 {code}
[jira] [Commented] (SPARK-2887) RDD.countApproxDistinct() is wrong when RDD has more than one partition
[ https://issues.apache.org/jira/browse/SPARK-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088819#comment-14088819 ] Patrick Wendell commented on SPARK-2887: Actually, I looked back and I don't think this bug is relevant for 1.0 or 0.9. RDD.countApproxDistinct() is wrong when RDD has more than one partition -- Key: SPARK-2887 URL: https://issues.apache.org/jira/browse/SPARK-2887 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker Fix For: 1.1.0 Original Estimate: 1h Remaining Estimate: 1h {code} scala> sc.makeRDD(1 to 1000, 10).countApproxDistinct(0.01) res0: Long = 101 {code}
[jira] [Resolved] (SPARK-2887) RDD.countApproxDistinct() is wrong when RDD has more than one partition
[ https://issues.apache.org/jira/browse/SPARK-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2887. Resolution: Fixed RDD.countApproxDistinct() is wrong when RDD has more than one partition -- Key: SPARK-2887 URL: https://issues.apache.org/jira/browse/SPARK-2887 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker Fix For: 1.1.0 Original Estimate: 1h Remaining Estimate: 1h {code} scala> sc.makeRDD(1 to 1000, 10).countApproxDistinct(0.01) res0: Long = 101 {code}
[jira] [Updated] (SPARK-2887) RDD.countApproxDistinct() is wrong when RDD has more than one partition
[ https://issues.apache.org/jira/browse/SPARK-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2887: --- Affects Version/s: (was: 1.0.2) (was: 0.9.1) (was: 1.0.0) 1.1.0 RDD.countApproxDistinct() is wrong when RDD has more than one partition -- Key: SPARK-2887 URL: https://issues.apache.org/jira/browse/SPARK-2887 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker Fix For: 1.1.0 Original Estimate: 1h Remaining Estimate: 1h {code} scala> sc.makeRDD(1 to 1000, 10).countApproxDistinct(0.01) res0: Long = 101 {code}
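The general pitfall behind multi-partition distinct counting can be illustrated without HyperLogLog (a hypothetical Python sketch using exact sets, not Spark's implementation or the specific merge bug fixed here): per-partition state must be merged before the final count is taken; combining finished per-partition counts overcounts duplicates that span partitions.

```python
def count_distinct(partitions):
    """Merge per-partition state (exact sets here; HyperLogLog
    registers in Spark), then count once at the end."""
    seen = set()
    for part in partitions:
        seen |= set(part)  # union the state across partitions
    return len(seen)

def broken_count_distinct(partitions):
    """Anti-pattern: summing finished per-partition counts
    double-counts values that appear in several partitions."""
    return sum(len(set(part)) for part in partitions)
```

With partitions `[[1, 2, 3], [2, 3, 4]]`, merging state gives 4 distinct values while summing counts gives 6.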
[jira] [Created] (SPARK-2899) Doc generation is not working in new SBT Build
Patrick Wendell created SPARK-2899: -- Summary: Doc generation is not working in new SBT Build Key: SPARK-2899 URL: https://issues.apache.org/jira/browse/SPARK-2899 Project: Spark Issue Type: Sub-task Components: Build Reporter: Patrick Wendell Assignee: Prashant Sharma I noticed there are some errors when building the docs: {code} [error] /home/ubuntu/release/spark/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala:120: type mismatch; [error] found : org.apache.spark.ui.jobs.TaskUIData [error] required: org.apache.spark.ui.jobs.UIData.TaskUIData [error] stageData.taskData.put(taskInfo.taskId, new TaskUIData(taskInfo)) [error] ^ [error] /home/ubuntu/release/spark/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala:142: type mismatch; [error] found : org.apache.spark.ui.jobs.ExecutorSummary [error] required: org.apache.spark.ui.jobs.UIData.ExecutorSummary [error] val execSummary = execSummaryMap.getOrElseUpdate(info.executorId, new ExecutorSummary) [error] ^ [error] /home/ubuntu/release/spark/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala:171: type mismatch; [error] found : org.apache.spark.ui.jobs.TaskUIData [error] required: org.apache.spark.ui.jobs.UIData.TaskUIData [error] val taskData = stageData.taskData.getOrElseUpdate(info.taskId, new TaskUIData(info)) {code}
[jira] [Resolved] (SPARK-2905) spark-sql shows 'sbin' instead of 'bin' in the 'usage' string
[ https://issues.apache.org/jira/browse/SPARK-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2905. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1835 [https://github.com/apache/spark/pull/1835] spark-sql shows 'sbin' instead of 'bin' in the 'usage' string - Key: SPARK-2905 URL: https://issues.apache.org/jira/browse/SPARK-2905 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 1.1.0 Reporter: Oleg Danilov Priority: Trivial Fix For: 1.1.0 Usage: ./sbin/spark-sql [options] [cli option] Should be ./bin/spark-sql [options] [cli option]
[jira] [Updated] (SPARK-2905) spark-sql shows 'sbin' instead of 'bin' in the 'usage' string
[ https://issues.apache.org/jira/browse/SPARK-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2905: --- Assignee: Oleg Danilov spark-sql shows 'sbin' instead of 'bin' in the 'usage' string - Key: SPARK-2905 URL: https://issues.apache.org/jira/browse/SPARK-2905 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 1.1.0 Reporter: Oleg Danilov Assignee: Oleg Danilov Priority: Trivial Fix For: 1.1.0 Usage: ./sbin/spark-sql [options] [cli option] Should be ./bin/spark-sql [options] [cli option]
[jira] [Resolved] (SPARK-2880) spark-submit processes app cmdline options
[ https://issues.apache.org/jira/browse/SPARK-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2880. Resolution: Duplicate I believe this is a dup of SPARK-2687. Feel free to re-open it if it's not :) spark-submit processes app cmdline options -- Key: SPARK-2880 URL: https://issues.apache.org/jira/browse/SPARK-2880 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Environment: Cloudera 5.1 on Ubuntu precise Reporter: Shay Rojansky Priority: Minor Labels: newbie The usage for spark-submit is: Usage: spark-submit [options] <app jar | python file> [app options] However, when running my Python app thus: spark-submit test.py -v The -v gets picked up by spark-submit, which enters verbose mode. The correct behavior seems to be for test.py to receive this parameter. This is my first time using Spark and submitting an issue; I'll be happy to contribute a patch if this is validated as a bug.
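The rule the usage string implies — stop treating tokens as {{spark-submit}} options once the primary resource appears — can be sketched as follows (hypothetical Python; real submit options may take values such as {{--master URL}}, which this bare-flag sketch deliberately ignores):

```python
def parse_submit_args(argv):
    """Classify tokens: flags before the primary resource belong to
    spark-submit; the first non-flag token is the primary resource;
    everything after it belongs to the user application."""
    submit_opts = []
    i = 0
    while i < len(argv) and argv[i].startswith("-"):
        submit_opts.append(argv[i])
        i += 1
    primary = argv[i] if i < len(argv) else None
    return submit_opts, primary, argv[i + 1:]
```

Under this rule, `spark-submit test.py -v` would hand {{-v}} to {{test.py}} rather than enabling {{spark-submit}}'s own verbose mode.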
[jira] [Created] (SPARK-2924) Remove use of default arguments where disallowed by 2.11
Patrick Wendell created SPARK-2924: -- Summary: Remove use of default arguments where disallowed by 2.11 Key: SPARK-2924 URL: https://issues.apache.org/jira/browse/SPARK-2924 Project: Spark Issue Type: Sub-task Components: Streaming Reporter: Patrick Wendell Assignee: Anand Avati
[jira] [Updated] (SPARK-2924) Remove use of default arguments where disallowed by 2.11
[ https://issues.apache.org/jira/browse/SPARK-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2924: --- Priority: Blocker (was: Major) Target Version/s: 1.1.0 Remove use of default arguments where disallowed by 2.11 Key: SPARK-2924 URL: https://issues.apache.org/jira/browse/SPARK-2924 Project: Spark Issue Type: Sub-task Components: Streaming Reporter: Patrick Wendell Assignee: Anand Avati Priority: Blocker
[jira] [Resolved] (SPARK-2635) Fix race condition at SchedulerBackend.isReady in standalone mode
[ https://issues.apache.org/jira/browse/SPARK-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2635. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1525 [https://github.com/apache/spark/pull/1525] Fix race condition at SchedulerBackend.isReady in standalone mode - Key: SPARK-2635 URL: https://issues.apache.org/jira/browse/SPARK-2635 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Zhihui Fix For: 1.1.0 In SPARK-1946 (PR #900), the configuration spark.scheduler.minRegisteredExecutorsRatio was introduced. However, in standalone mode, there is a race condition where isReady() can return true because totalExpectedExecutors has not been correctly set. Because the number of expected executors is uncertain in standalone mode, the PR tries to use CPU cores (--total-executor-cores) as the expected resources to judge whether the SchedulerBackend is ready.
[jira] [Updated] (SPARK-2635) Fix race condition at SchedulerBackend.isReady in standalone mode
[ https://issues.apache.org/jira/browse/SPARK-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2635: --- Assignee: Zhihui Fix race condition at SchedulerBackend.isReady in standalone mode - Key: SPARK-2635 URL: https://issues.apache.org/jira/browse/SPARK-2635 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Zhihui Assignee: Zhihui Fix For: 1.1.0 In SPARK-1946 (PR #900), the configuration spark.scheduler.minRegisteredExecutorsRatio was introduced. However, in standalone mode, there is a race condition where isReady() can return true because totalExpectedExecutors has not been correctly set. Because the number of expected executors is uncertain in standalone mode, the PR tries to use CPU cores (--total-executor-cores) as the expected resources to judge whether the SchedulerBackend is ready.
[jira] [Resolved] (SPARK-2861) Doc comment of DoubleRDDFunctions.histogram is incorrect
[ https://issues.apache.org/jira/browse/SPARK-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2861. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1786 [https://github.com/apache/spark/pull/1786] Doc comment of DoubleRDDFunctions.histogram is incorrect Key: SPARK-2861 URL: https://issues.apache.org/jira/browse/SPARK-2861 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0, 0.9.1, 1.0.0 Reporter: Chandan Kumar Priority: Trivial Fix For: 1.1.0 The documentation comment of the histogram method of the DoubleRDDFunctions class in source file DoubleRDDFunctions.scala is inconsistent. This might confuse somebody reading the documentation. Comment in question: {code} /** * Compute a histogram using the provided buckets. The buckets are all open * to the left except for the last which is closed * e.g. for the array * [1, 10, 20, 50] the buckets are [1, 10) [10, 20) [20, 50] * e.g. 1<=x<10, 10<=x<20, 20<=x<50 * And on the input of 1 and 50 we would have a histogram of 1, 0, 0 {code} The buckets are all open to the right (NOT left) except for the last, which is closed. For the example quoted, the last bucket should be 20<=x<=50. Also, the histogram result on input of 1 and 50 would be 1, 0, 1 (NOT 1, 0, 0). This works correctly in Spark, but the doc comment is incorrect.
[jira] [Updated] (SPARK-2861) Doc comment of DoubleRDDFunctions.histogram is incorrect
[ https://issues.apache.org/jira/browse/SPARK-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2861: --- Assignee: Chandan Kumar Doc comment of DoubleRDDFunctions.histogram is incorrect Key: SPARK-2861 URL: https://issues.apache.org/jira/browse/SPARK-2861 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0, 0.9.1, 1.0.0 Reporter: Chandan Kumar Assignee: Chandan Kumar Priority: Trivial Fix For: 1.1.0 The documentation comment of the histogram method of the DoubleRDDFunctions class in source file DoubleRDDFunctions.scala is inconsistent. This might confuse somebody reading the documentation. Comment in question: {code} /** * Compute a histogram using the provided buckets. The buckets are all open * to the left except for the last which is closed * e.g. for the array * [1, 10, 20, 50] the buckets are [1, 10) [10, 20) [20, 50] * e.g. 1<=x<10, 10<=x<20, 20<=x<50 * And on the input of 1 and 50 we would have a histogram of 1, 0, 0 {code} The buckets are all open to the right (NOT left) except for the last, which is closed. For the example quoted, the last bucket should be 20<=x<=50. Also, the histogram result on input of 1 and 50 would be 1, 0, 1 (NOT 1, 0, 0). This works correctly in Spark, but the doc comment is incorrect.
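The corrected semantics (right-open buckets, with only the last bucket closed on the right) can be stated directly in code. A hypothetical Python illustration, not the Scala implementation:

```python
def histogram(values, buckets):
    """Count values into buckets [b0, b1), [b1, b2), ..., where only
    the final bucket is closed on the right: [b_{n-1}, b_n]."""
    counts = [0] * (len(buckets) - 1)
    for v in values:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if buckets[i] <= v < buckets[i + 1] or (last and v == buckets[-1]):
                counts[i] += 1
                break
    return counts
```

For buckets `[1, 10, 20, 50]` and inputs 1 and 50, this yields `[1, 0, 1]`, matching the corrected doc comment.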
[jira] [Commented] (SPARK-2944) sc.makeRDD doesn't distribute partitions evenly
[ https://issues.apache.org/jira/browse/SPARK-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091699#comment-14091699 ] Patrick Wendell commented on SPARK-2944: Hey [~mengxr], do you know how the behavior differs from Spark 1.0? Also, if there is a clear difference, could you see if the behavior is modified by this patch? https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=63bdb1f41b4895e3a9444f7938094438a94d3007 sc.makeRDD doesn't distribute partitions evenly --- Key: SPARK-2944 URL: https://issues.apache.org/jira/browse/SPARK-2944 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Xiangrui Meng Assignee: Xiangrui Meng Priority: Critical 16-node EC2 cluster: {code} val rdd = sc.makeRDD(0 until 1e9.toInt, 1000).cache() rdd.count() {code} Saw 156 partitions on one node while only 8 partitions on another.
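The even spread one would expect here — each of 16 nodes caching roughly 1000/16 partitions — corresponds to round-robin placement. A hypothetical Python illustration of that baseline (Spark's scheduler additionally weighs data locality and offer order, which is where skew like 156-vs-8 can arise):

```python
def assign_partitions(num_partitions, nodes):
    """Round-robin placement: no node ends up with more than one
    partition above any other. Illustration only, not Spark's logic."""
    placement = {n: [] for n in nodes}
    for p in range(num_partitions):
        placement[nodes[p % len(nodes)]].append(p)
    return placement
```

With 1000 partitions over 16 nodes, every node receives 62 or 63 partitions under this rule.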
[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091700#comment-14091700 ] Patrick Wendell commented on SPARK-2931: [~matei] Hey Matei - IIRC you looked at this patch a bunch. Do you have any guesses as to what is causing this? getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException --- Key: SPARK-2931 URL: https://issues.apache.org/jira/browse/SPARK-2931 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: Spark EC2, spark-1.1.0-snapshot1, sort-by-key spark-perf benchmark Reporter: Josh Rosen Priority: Blocker Fix For: 1.1.0 When running Spark Perf's sort-by-key benchmark on EC2 with v1.1.0-snapshot, I get the following errors (one per task): {code} 14/08/08 18:54:22 INFO scheduler.TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, ip-172-31-14-30.us-west-2.compute.internal, PROCESS_LOCAL, 1003 bytes) 14/08/08 18:54:22 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@ip-172-31-9-213.us-west-2.compute.internal:58901/user/Executor#1436065036] with ID 0 14/08/08 18:54:22 ERROR actor.OneForOneStrategy: 1 java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.spark.scheduler.TaskSetManager.getAllowedLocalityLevel(TaskSetManager.scala:475) at org.apache.spark.scheduler.TaskSetManager.resourceOffer(TaskSetManager.scala:409) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7$$anonfun$apply$2.apply$mcVI$sp(TaskSchedulerImpl.scala:261) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:257) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:254) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at
scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.TaskSchedulerImpl.resourceOffers(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor.makeOffers(CoarseGrainedSchedulerBackend.scala:153) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor$$anonfun$receive$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:103) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} This causes the job to hang. I can deterministically reproduce this by re-running the test, either in isolation or as part of the full performance testing suite. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2945) Allow specifying num of executors in the context configuration
[ https://issues.apache.org/jira/browse/SPARK-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14091701#comment-14091701 ] Patrick Wendell commented on SPARK-2945: Hey [~roji] I believe this already exists - there is an option spark.executor.instances. I think in the past we didn't document this, but we probably should. [~sandyr] should be able to confirm this as well. Allow specifying num of executors in the context configuration -- Key: SPARK-2945 URL: https://issues.apache.org/jira/browse/SPARK-2945 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.0.0 Environment: Ubuntu precise, on YARN (CDH 5.1.0) Reporter: Shay Rojansky Running on YARN, the only way to specify the number of executors seems to be on the command line of spark-submit, via the --num-executors switch. In many cases this is too early. Our Spark app receives some cmdline arguments which determine the amount of work that needs to be done - and that affects the number of executors it ideally requires. Ideally, the Spark context configuration would support specifying this like any other config param. Our current workaround is a wrapper script that determines how much work is needed, and which itself launches spark-submit with the number passed to --num-executors - it's a shame to have to do this. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-1766) Move reduceByKey definitions next to each other in PairRDDFunctions
[ https://issues.apache.org/jira/browse/SPARK-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1766. Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Chris Cope Target Version/s: 1.1.0 Resolved by: https://github.com/apache/spark/pull/1859 Move reduceByKey definitions next to each other in PairRDDFunctions --- Key: SPARK-1766 URL: https://issues.apache.org/jira/browse/SPARK-1766 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Sandy Ryza Assignee: Chris Cope Priority: Trivial Fix For: 1.1.0 Sorry, I know this is pedantic, but I've been browsing the source multiple times and gotten fooled into thinking reduceByKey always requires a partitioner. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2894) spark-shell doesn't accept flags
[ https://issues.apache.org/jira/browse/SPARK-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2894. Resolution: Duplicate spark-shell doesn't accept flags Key: SPARK-2894 URL: https://issues.apache.org/jira/browse/SPARK-2894 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Sandy Ryza Priority: Blocker {code} spark-shell --executor-memory 5G bad option '--executor-cores' {code} This is a regression. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2894) spark-shell doesn't accept flags
[ https://issues.apache.org/jira/browse/SPARK-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14091996#comment-14091996 ] Patrick Wendell commented on SPARK-2894: I closed this in favor of SPARK-2678 since it had a more thorough description. spark-shell doesn't accept flags Key: SPARK-2894 URL: https://issues.apache.org/jira/browse/SPARK-2894 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Sandy Ryza Priority: Blocker {code} spark-shell --executor-memory 5G bad option '--executor-cores' {code} This is a regression. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2678) `Spark-submit` overrides user application options
[ https://issues.apache.org/jira/browse/SPARK-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2678: --- Assignee: Kousuke Saruta (was: Cheng Lian) `Spark-submit` overrides user application options - Key: SPARK-2678 URL: https://issues.apache.org/jira/browse/SPARK-2678 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 1.0.1, 1.0.2 Reporter: Cheng Lian Assignee: Kousuke Saruta Priority: Blocker Fix For: 1.1.0 Here is an example: {code} ./bin/spark-submit --class Foo some.jar --help {code} Since {{--help}} appears behind the primary resource (i.e. {{some.jar}}), it should be recognized as a user application option. But it's actually overridden by {{spark-submit}} and will show the {{spark-submit}} help message. When directly invoking {{spark-submit}}, the constraints here are: # Options before the primary resource should be recognized as {{spark-submit}} options # Options after the primary resource should be recognized as user application options The tricky part is how to handle scripts like {{spark-shell}} that delegate to {{spark-submit}}. These scripts allow users to specify both {{spark-submit}} options like {{--master}} and user-defined application options together. For example, say we'd like to write a new script {{start-thriftserver.sh}} to start the Hive Thrift server, basically we may do this: {code} $SPARK_HOME/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal $@ {code} Then users may call this script like: {code} ./sbin/start-thriftserver.sh --master spark://some-host:7077 --hiveconf key=value {code} Notice that all options are captured by {{$@}}. 
If we put it before {{spark-internal}}, they are all recognized as {{spark-submit}} options, thus {{--hiveconf}} won't be passed to {{HiveThriftServer2}}; if we put it after {{spark-internal}}, they *should* all be recognized as options of {{HiveThriftServer2}}, but because of this bug, {{--master}} is still recognized as a {{spark-submit}} option and happens to lead to the right behavior. Although currently all scripts using {{spark-submit}} work correctly, we should still fix this bug, because it causes option name collisions between {{spark-submit}} and user applications, and every time we add a new option to {{spark-submit}}, some existing user applications may break. However, solving this bug may cause some incompatible changes. The suggested solution here is to use {{--}} as the separator between {{spark-submit}} options and user application options. For the Hive Thrift server example above, users should call it this way: {code} ./sbin/start-thriftserver.sh --master spark://some-host:7077 -- --hiveconf key=value {code} And {{SparkSubmitArguments}} should be responsible for splitting the two sets of options and passing them correctly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
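The proposed {{--}} separator amounts to a single scan over the argument list: everything before the first bare {{--}} goes to spark-submit, everything after it goes to the user application. A minimal sketch in Java (the class and method names here are illustrative, not taken from {{SparkSubmitArguments}}):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: split command-line arguments at the first bare "--"
// into spark-submit options and user application options.
public class ArgSplitter {
    // Returns a two-element list {submitArgs, appArgs}; if no "--" is
    // present, all arguments are treated as spark-submit options.
    public static List<List<String>> split(String[] args) {
        List<String> submitArgs = new ArrayList<>();
        List<String> appArgs = new ArrayList<>();
        int sep = Arrays.asList(args).indexOf("--");
        for (int i = 0; i < args.length; i++) {
            if (sep >= 0 && i > sep) {
                appArgs.add(args[i]);     // after the separator: app options
            } else if (i != sep) {
                submitArgs.add(args[i]);  // before it: spark-submit options
            }
        }
        return Arrays.asList(submitArgs, appArgs);
    }

    public static void main(String[] args) {
        List<List<String>> parts = split(new String[]{
            "--master", "spark://some-host:7077", "--", "--hiveconf", "key=value"});
        System.out.println(parts.get(0)); // spark-submit options
        System.out.println(parts.get(1)); // user application options
    }
}
```

With this split, {{--hiveconf key=value}} reaches {{HiveThriftServer2}} regardless of what new options {{spark-submit}} grows later, which is exactly the collision the issue describes.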
[jira] [Commented] (SPARK-2912) Jenkins should include the commit hash in his messages
[ https://issues.apache.org/jira/browse/SPARK-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092435#comment-14092435 ] Patrick Wendell commented on SPARK-2912: Hey Michael - I believe [~nchammas] is already working on it actually, so I assigned him. Jenkins should include the commit hash in his messages -- Key: SPARK-2912 URL: https://issues.apache.org/jira/browse/SPARK-2912 Project: Spark Issue Type: Sub-task Components: Build Reporter: Nicholas Chammas Assignee: Nicholas Chammas When there are multiple test cycles within a PR, it is not obvious which cycle applies to which set of changes. This makes it more likely for committers to merge a PR that has had new commits added since the last test cycle. Requirements: * Add the commit hash to Jenkins's messages so it's clear what the test cycle corresponds to. * While you're at it, polish the formatting a bit. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2717) BasicBlockFetchIterator#next should log when it gets stuck
[ https://issues.apache.org/jira/browse/SPARK-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2717: --- Priority: Major (was: Blocker) BasicBlockFetchIterator#next should log when it gets stuck -- Key: SPARK-2717 URL: https://issues.apache.org/jira/browse/SPARK-2717 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell Assignee: Josh Rosen If this is stuck for a long time waiting for blocks, we should log what nodes it is waiting for to help debugging. One way to do this is to call take() with a timeout (e.g. 60 seconds) and when the timeout expires log a message for the blocks it is still waiting for. This could all happen in a loop so that the wait just restarts after the message is logged. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2717) BasicBlockFetchIterator#next should log when it gets stuck
[ https://issues.apache.org/jira/browse/SPARK-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2717: --- Priority: Critical (was: Major) BasicBlockFetchIterator#next should log when it gets stuck -- Key: SPARK-2717 URL: https://issues.apache.org/jira/browse/SPARK-2717 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell Assignee: Josh Rosen Priority: Critical If this is stuck for a long time waiting for blocks, we should log what nodes it is waiting for to help debugging. One way to do this is to call take() with a timeout (e.g. 60 seconds) and when the timeout expires log a message for the blocks it is still waiting for. This could all happen in a loop so that the wait just restarts after the message is logged. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
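The loop suggested in SPARK-2717 above - a bounded take() that logs and then restarts the wait - can be sketched with a plain {{BlockingQueue}}; the queue, timeout, and log message here are stand-ins, not the actual {{BasicBlockFetchIterator}} internals:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch: wait for a result with a timeout, log while stuck, and restart the
// wait so the caller still blocks until data actually arrives.
public class TimedTake {
    public static <T> T takeWithLogging(BlockingQueue<T> queue,
                                        long timeoutSeconds) throws InterruptedException {
        while (true) {
            // poll() returns null when the timeout expires with nothing available.
            T result = queue.poll(timeoutSeconds, TimeUnit.SECONDS);
            if (result != null) {
                return result;
            }
            // In the real iterator this would name the blocks/nodes still pending.
            System.err.println("Still waiting for remote blocks after "
                + timeoutSeconds + "s; continuing to wait...");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("block-0");
        System.out.println(takeWithLogging(queue, 60)); // prints block-0
    }
}
```

The behavior is unchanged from a plain {{take()}} - the caller still blocks indefinitely - but each expired timeout produces a diagnostic message, which is exactly what the issue asks for.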
[jira] [Updated] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2931: --- Target Version/s: 1.1.0 getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException --- Key: SPARK-2931 URL: https://issues.apache.org/jira/browse/SPARK-2931 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: Spark EC2, spark-1.1.0-snapshot1, sort-by-key spark-perf benchmark Reporter: Josh Rosen Priority: Blocker Attachments: scala-sort-by-key.err, test.patch When running Spark Perf's sort-by-key benchmark on EC2 with v1.1.0-snapshot, I get the following errors (one per task): {code} 14/08/08 18:54:22 INFO scheduler.TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, ip-172-31-14-30.us-west-2.compute.internal, PROCESS_LOCAL, 1003 bytes) 14/08/08 18:54:22 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@ip-172-31-9-213.us-west-2.compute.internal:58901/user/Executor#1436065036] with ID 0 14/08/08 18:54:22 ERROR actor.OneForOneStrategy: 1 java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.spark.scheduler.TaskSetManager.getAllowedLocalityLevel(TaskSetManager.scala:475) at org.apache.spark.scheduler.TaskSetManager.resourceOffer(TaskSetManager.scala:409) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7$$anonfun$apply$2.apply$mcVI$sp(TaskSchedulerImpl.scala:261) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:257) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:254) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at 
org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.TaskSchedulerImpl.resourceOffers(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor.makeOffers(CoarseGrainedSchedulerBackend.scala:153) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor$$anonfun$receive$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:103) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} This causes the job to hang. I can deterministically reproduce this by re-running the test, either in isolation or as part of the full performance testing suite. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2931: --- Fix Version/s: (was: 1.1.0) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException --- Key: SPARK-2931 URL: https://issues.apache.org/jira/browse/SPARK-2931 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: Spark EC2, spark-1.1.0-snapshot1, sort-by-key spark-perf benchmark Reporter: Josh Rosen Priority: Blocker Attachments: scala-sort-by-key.err, test.patch When running Spark Perf's sort-by-key benchmark on EC2 with v1.1.0-snapshot, I get the following errors (one per task): {code} 14/08/08 18:54:22 INFO scheduler.TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, ip-172-31-14-30.us-west-2.compute.internal, PROCESS_LOCAL, 1003 bytes) 14/08/08 18:54:22 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@ip-172-31-9-213.us-west-2.compute.internal:58901/user/Executor#1436065036] with ID 0 14/08/08 18:54:22 ERROR actor.OneForOneStrategy: 1 java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.spark.scheduler.TaskSetManager.getAllowedLocalityLevel(TaskSetManager.scala:475) at org.apache.spark.scheduler.TaskSetManager.resourceOffer(TaskSetManager.scala:409) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7$$anonfun$apply$2.apply$mcVI$sp(TaskSchedulerImpl.scala:261) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:257) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:254) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at 
org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.TaskSchedulerImpl.resourceOffers(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor.makeOffers(CoarseGrainedSchedulerBackend.scala:153) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor$$anonfun$receive$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:103) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} This causes the job to hang. I can deterministically reproduce this by re-running the test, either in isolation or as part of the full performance testing suite. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3020) Print completed indices rather than tasks in web UI
Patrick Wendell created SPARK-3020: -- Summary: Print completed indices rather than tasks in web UI Key: SPARK-3020 URL: https://issues.apache.org/jira/browse/SPARK-3020 Project: Spark Issue Type: Bug Components: Web UI Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Blocker When speculation is used, it's confusing to print the number of completed tasks, since it can exceed the number of total tasks. Instead we should just report the number of unique indices that are completed. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
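The problem SPARK-3020 describes is that with speculative execution, two attempts of the same task index can both finish, so a raw completion count can exceed the total task count. Counting distinct indices cannot. A toy illustration (not the actual web UI code):

```java
import java.util.HashSet;
import java.util.Set;

// Toy illustration: with speculation, completed-task events can repeat an
// index, so the UI should report distinct completed indices instead.
public class CompletedIndices {
    public static int uniqueCompleted(int[] completedTaskIndices) {
        Set<Integer> indices = new HashSet<>();
        for (int index : completedTaskIndices) {
            indices.add(index);
        }
        return indices.size();
    }

    public static void main(String[] args) {
        // Task index 2 finished twice (original attempt + speculative copy).
        int[] events = {0, 1, 2, 2, 3};
        System.out.println(events.length);           // 5 completions reported
        System.out.println(uniqueCompleted(events)); // 4 unique task indices
    }
}
```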
[jira] [Commented] (SPARK-3024) CLI interface to Driver
[ https://issues.apache.org/jira/browse/SPARK-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096541#comment-14096541 ] Patrick Wendell commented on SPARK-3024: Hey Jeff - mind giving a bit more color on what you mean here? CLI interface to Driver --- Key: SPARK-3024 URL: https://issues.apache.org/jira/browse/SPARK-3024 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Jeff Hammerbacher -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3025) Allow JDBC clients to set a fair scheduler pool
Patrick Wendell created SPARK-3025: -- Summary: Allow JDBC clients to set a fair scheduler pool Key: SPARK-3025 URL: https://issues.apache.org/jira/browse/SPARK-3025 Project: Spark Issue Type: Bug Components: SQL Reporter: Patrick Wendell Assignee: Patrick Wendell -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3026) Provide a good error message if JDBC server is used but Spark is not compiled with -Pthriftserver
[ https://issues.apache.org/jira/browse/SPARK-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3026: --- Priority: Critical (was: Major) Provide a good error message if JDBC server is used but Spark is not compiled with -Pthriftserver - Key: SPARK-3026 URL: https://issues.apache.org/jira/browse/SPARK-3026 Project: Spark Issue Type: Bug Components: SQL Reporter: Patrick Wendell Assignee: Cheng Lian Priority: Critical Instead of giving a ClassNotFoundException we should detect this case and just tell the user to build with -Phiveserver. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3026) Provide a good error message if JDBC server is used but Spark is not compiled with -Pthriftserver
Patrick Wendell created SPARK-3026: -- Summary: Provide a good error message if JDBC server is used but Spark is not compiled with -Pthriftserver Key: SPARK-3026 URL: https://issues.apache.org/jira/browse/SPARK-3026 Project: Spark Issue Type: Bug Components: SQL Reporter: Patrick Wendell Assignee: Cheng Lian Instead of giving a ClassNotFoundException we should detect this case and just tell the user to build with -Phiveserver. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3028) sparkEventToJson should support SparkListenerExecutorMetricsUpdate
[ https://issues.apache.org/jira/browse/SPARK-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096689#comment-14096689 ] Patrick Wendell commented on SPARK-3028: I think we do not intend to ever serialize an update event as JSON. However, I think we should be more explicit about this in two ways. 1. In EventLoggingListener we should explicitly put a no-op implementation of onExecutorMetricsUpdate with a comment - I actually thought one version of the PR had this, but maybe I'm misremembering. 2. We should also include SparkListenerExecutorMetricsUpdate in the match block and just have a no-op there as well, with a comment. sparkEventToJson should support SparkListenerExecutorMetricsUpdate -- Key: SPARK-3028 URL: https://issues.apache.org/jira/browse/SPARK-3028 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Reynold Xin Priority: Blocker SparkListenerExecutorMetricsUpdate was added without updating org.apache.spark.util.JsonProtocol.sparkEventToJson. This can crash the listener. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-3028) sparkEventToJson should support SparkListenerExecutorMetricsUpdate
[ https://issues.apache.org/jira/browse/SPARK-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096689#comment-14096689 ] Patrick Wendell edited comment on SPARK-3028 at 8/14/14 7:20 AM: - I think we currently do not intend to ever serialize an update event as JSON. However, I think we should be more explicit about this in two ways. 1. In EventLoggingListener we should explicitly put a no-op implementation of onExecutorMetricsUpdate with a comment - I actually thought one version of the PR had this, but maybe I'm misremembering. 2. We should also include SparkListenerExecutorMetricsUpdate in the match block and just have a no-op there as well, with a comment. was (Author: pwendell): I think we explicitly do not intend to ever serialize an update event as JSON. However, I think we should be more explicit about this in two ways. 1. In EventLoggingListener we should explicitly put a no-op implementation of onExecutorMetricsUpdate with a comment - I actually thought one version of the PR had this, but maybe I'm misremembering. 2. We should also include SparkListenerExecutorMetricsUpdate in the match block and just have a no-op there as well, with a comment. sparkEventToJson should support SparkListenerExecutorMetricsUpdate -- Key: SPARK-3028 URL: https://issues.apache.org/jira/browse/SPARK-3028 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Reynold Xin Priority: Blocker SparkListenerExecutorMetricsUpdate was added without updating org.apache.spark.util.JsonProtocol.sparkEventToJson. This can crash the listener. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
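The "explicit no-op with a comment" pattern recommended in SPARK-3028 above can be sketched generically. The event names below mirror the listener API, but the dispatcher itself is illustrative, not JsonProtocol:

```java
// Illustrative dispatcher: an event that is deliberately never serialized gets
// an explicit, commented no-op case instead of falling through to an error.
public class EventToJson {
    interface SparkEvent {}
    static class TaskEnd implements SparkEvent {}
    static class ExecutorMetricsUpdate implements SparkEvent {}

    // Returns a JSON string, or null for events we intentionally skip.
    public static String toJson(SparkEvent event) {
        if (event instanceof TaskEnd) {
            return "{\"Event\":\"TaskEnd\"}";
        } else if (event instanceof ExecutorMetricsUpdate) {
            // Intentionally not serialized: metrics updates are high-volume
            // and excluded from the event log by design.
            return null;
        }
        // Only genuinely unknown events are treated as errors.
        throw new IllegalArgumentException("Unknown event: " + event);
    }

    public static void main(String[] args) {
        System.out.println(toJson(new TaskEnd()));
        System.out.println(toJson(new ExecutorMetricsUpdate())); // null (skipped)
    }
}
```

The point of the explicit branch is that adding a new event type still fails loudly, while the deliberately skipped one is documented in code rather than implied by its absence.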
[jira] [Commented] (SPARK-3050) Spark program running with 1.0.2 jar cannot run against a 1.0.1 cluster
[ https://issues.apache.org/jira/browse/SPARK-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097947#comment-14097947 ] Patrick Wendell commented on SPARK-3050: Hi [~mkim] - when you launch jobs in client mode (i.e. the driver is running where you launch the job), you will need to use an updated version of the spark-submit launch script. You don't need to recompile your job though, you just need to update spark-submit in tandem with your cluster upgrade. Does that work? Spark program running with 1.0.2 jar cannot run against a 1.0.1 cluster --- Key: SPARK-3050 URL: https://issues.apache.org/jira/browse/SPARK-3050 Project: Spark Issue Type: Bug Affects Versions: 1.0.2 Reporter: Mingyu Kim Priority: Critical I ran the following code with the Spark 1.0.2 jar against a cluster that runs Spark 1.0.1 (i.e. localhost:7077 is running 1.0.1). {code} import java.util.ArrayList; import java.util.List; import org.apache.spark.api.java.JavaRDD; import org.apache.spark.api.java.JavaSparkContext; public class TestTest { public static void main(String[] args) { JavaSparkContext sc = new JavaSparkContext("spark://localhost:7077", "Test"); List<Integer> list = new ArrayList<Integer>(); list.add(1); list.add(2); list.add(3); JavaRDD<Integer> rdd = sc.parallelize(list); System.out.println(rdd.collect()); } } {code} This throws an InvalidClassException. 
{code} Exception in thread main org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:1 failed 4 times, most recent failure: Exception failure in TID 6 on host 10.100.91.90: java.io.InvalidClassException: org.apache.spark.rdd.RDD; local class incompatible: stream classdesc serialVersionUID = -6766554341038829528, local class serialVersionUID = 385418487991259089 java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604) java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515) java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1769) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63) org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61) org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141) java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1835) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1794) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63) org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:165) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:722) Driver stacktrace: at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:635) at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1234) at
[jira] [Resolved] (SPARK-3050) Spark program running with 1.0.2 jar cannot run against a 1.0.1 cluster
[ https://issues.apache.org/jira/browse/SPARK-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3050. Resolution: Not a Problem I think the issue here is just needing to use the newer version of the launch script. Feel free to re-open this though if I am misunderstanding. Spark program running with 1.0.2 jar cannot run against a 1.0.1 cluster --- Key: SPARK-3050 URL: https://issues.apache.org/jira/browse/SPARK-3050 Project: Spark Issue Type: Bug Affects Versions: 1.0.2 Reporter: Mingyu Kim Priority: Critical I ran the following code with the Spark 1.0.2 jar against a cluster that runs Spark 1.0.1 (i.e. localhost:7077 is running 1.0.1). {code} import java.util.ArrayList; import java.util.List; import org.apache.spark.api.java.JavaRDD; import org.apache.spark.api.java.JavaSparkContext; public class TestTest { public static void main(String[] args) { JavaSparkContext sc = new JavaSparkContext("spark://localhost:7077", "Test"); List<Integer> list = new ArrayList<Integer>(); list.add(1); list.add(2); list.add(3); JavaRDD<Integer> rdd = sc.parallelize(list); System.out.println(rdd.collect()); } } {code} This throws an InvalidClassException. 
{code} Exception in thread main org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:1 failed 4 times, most recent failure: Exception failure in TID 6 on host 10.100.91.90: java.io.InvalidClassException: org.apache.spark.rdd.RDD; local class incompatible: stream classdesc serialVersionUID = -6766554341038829528, local class serialVersionUID = 385418487991259089 java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604) java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515) java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1769) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63) org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61) org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141) java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1835) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1794) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63) org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:165) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:722) Driver stacktrace: at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:635) at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1234) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at
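The InvalidClassException above is standard Java serialization behavior: when a class declares no explicit serialVersionUID, the JVM derives one from the class's structure, so any structural difference between the 1.0.1 `RDD` class on the cluster and the 1.0.2 class in the driver jar yields the "local class incompatible" failure. A minimal sketch of the mechanism (the `Payload` class is hypothetical, not Spark code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialVersionDemo {
    // Pinning an explicit serialVersionUID keeps the stream ID stable across
    // compatible class changes; without it, the JVM recomputes the ID from the
    // class shape, and sender/receiver running different versions disagree.
    static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
        Payload(int value) { this.value = value; }
    }

    public static void main(String[] args) throws Exception {
        Payload original = new Payload(42);

        // Round-trip through Java serialization, as Spark's JavaSerializer does.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Payload copy = (Payload) in.readObject();
            System.out.println(copy.value); // prints 42
        }
    }
}
```

The round trip succeeds here because the writing and reading sides share one class definition; the report above is exactly the case where they do not.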
[jira] [Updated] (SPARK-3050) Spark program running with 1.0.2 jar cannot run against a 1.0.1 cluster
[ https://issues.apache.org/jira/browse/SPARK-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3050: --- Priority: Major (was: Critical) Spark program running with 1.0.2 jar cannot run against a 1.0.1 cluster --- Key: SPARK-3050 URL: https://issues.apache.org/jira/browse/SPARK-3050 Project: Spark Issue Type: Bug Affects Versions: 1.0.2 Reporter: Mingyu Kim I ran the following code with the Spark 1.0.2 jar against a cluster that runs Spark 1.0.1 (i.e. localhost:7077 is running 1.0.1). {code} import java.util.ArrayList; import java.util.List; import org.apache.spark.api.java.JavaRDD; import org.apache.spark.api.java.JavaSparkContext; public class TestTest { public static void main(String[] args) { JavaSparkContext sc = new JavaSparkContext("spark://localhost:7077", "Test"); List<Integer> list = new ArrayList<Integer>(); list.add(1); list.add(2); list.add(3); JavaRDD<Integer> rdd = sc.parallelize(list); System.out.println(rdd.collect()); } } {code} This throws an InvalidClassException. 
{code} Exception in thread main org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:1 failed 4 times, most recent failure: Exception failure in TID 6 on host 10.100.91.90: java.io.InvalidClassException: org.apache.spark.rdd.RDD; local class incompatible: stream classdesc serialVersionUID = -6766554341038829528, local class serialVersionUID = 385418487991259089 java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604) java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515) java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1769) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63) org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61) org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141) java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1835) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1794) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63) org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:165) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:722) Driver stacktrace: at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:635) at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1234) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
[jira] [Commented] (SPARK-2858) Default log4j file no longer seems to work
[ https://issues.apache.org/jira/browse/SPARK-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098148#comment-14098148 ] Patrick Wendell commented on SPARK-2858: What I mean is that when I don't include any log4j.properties file, Spark fails to use its own default configuration. Default log4j file no longer seems to work -- Key: SPARK-2858 URL: https://issues.apache.org/jira/browse/SPARK-2858 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell For reasons unknown this doesn't seem to be working anymore. I deleted my log4j.properties file, did a fresh build, and noticed it still gave me a verbose stack trace when port 4040 was contended (which is a log we silence in the conf). I actually think this was an issue even before [~sowen]'s changes, so not sure what's up. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2858) Default log4j configuration no longer seems to work
[ https://issues.apache.org/jira/browse/SPARK-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2858: --- Summary: Default log4j configuration no longer seems to work (was: Default log4j file no longer seems to work) Default log4j configuration no longer seems to work --- Key: SPARK-2858 URL: https://issues.apache.org/jira/browse/SPARK-2858 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell For reasons unknown this doesn't seem to be working anymore. I deleted my log4j.properties file, did a fresh build, and noticed it still gave me a verbose stack trace when port 4040 was contended (which is a log we silence in the conf). I actually think this was an issue even before [~sowen]'s changes, so not sure what's up. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2912) Jenkins should include the commit hash in his messages
[ https://issues.apache.org/jira/browse/SPARK-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2912. Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1816 [https://github.com/apache/spark/pull/1816] Jenkins should include the commit hash in his messages -- Key: SPARK-2912 URL: https://issues.apache.org/jira/browse/SPARK-2912 Project: Spark Issue Type: Sub-task Components: Build Reporter: Nicholas Chammas Assignee: Nicholas Chammas Fix For: 1.2.0 When there are multiple test cycles within a PR, it is not obvious what cycle applies to what set of changes. This makes it more likely for committers to merge a PR that has had new commits added since the last PR. Requirements: * Add the commit hash to Jenkins's messages so it's clear what the test cycle corresponds to. * While you're at it, polish the formatting a bit. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2955) Test code fails to compile with mvn compile without install
[ https://issues.apache.org/jira/browse/SPARK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2955. Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1879 [https://github.com/apache/spark/pull/1879] Test code fails to compile with mvn compile without install Key: SPARK-2955 URL: https://issues.apache.org/jira/browse/SPARK-2955 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.0.2 Reporter: Sean Owen Assignee: Sean Owen Priority: Minor Labels: build, compile, scalatest, test, test-compile Fix For: 1.2.0 (This is the corrected follow-up to https://issues.apache.org/jira/browse/SPARK-2903 ) Right now, mvn compile test-compile fails to compile Spark. (Don't worry; mvn package works, so this is not major.) The issue stems from test code in some modules depending on test code in other modules. That is perfectly fine and supported by Maven. It takes extra work to get this to work with scalatest, and this has been attempted: https://github.com/apache/spark/blob/master/sql/catalyst/pom.xml#L86 This formulation is not quite enough, since the SQL Core module's tests fail to compile for lack of finding test classes in SQL Catalyst, and likewise for most Streaming integration modules depending on core Streaming test code. Example: {code} [error] /Users/srowen/Documents/spark/sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala:23: not found: type PlanTest [error] class QueryTest extends PlanTest { [error] ^ [error] /Users/srowen/Documents/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:28: package org.apache.spark.sql.test is not a value [error] test("SPARK-1669: cacheTable should be idempotent") { [error] ^ ... {code} The issue, I believe, is that generation of a test-jar is bound here to the compile phase, but the test classes are not being compiled in this phase. It should bind to the test-compile phase. 
It works when executing mvn package or mvn install since test-jar artifacts are actually generated and made available through normal Maven mechanisms as each module is built. They are then found normally, regardless of scalatest configuration. It would be nice for a simple mvn compile test-compile to work since the test code is perfectly compilable given the Maven declarations. On the plus side, this change is low-risk as it only affects tests. [~yhuai] made the original scalatest change and has glanced at this and thinks it makes sense. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
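The phase-binding fix described here, generating the test-jar during test-compile rather than compile so the test classes exist when the jar is built, looks roughly like this in a module's pom.xml (a sketch of the idea, not the exact Spark patch):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>test-jar</goal>
      </goals>
      <!-- Bind to test-compile: by this phase the test classes have been
           compiled. Binding to compile runs before they exist, producing
           an empty test-jar and the "not found: type" errors above. -->
      <phase>test-compile</phase>
    </execution>
  </executions>
</plugin>
```

With this binding, a plain `mvn compile test-compile` can resolve cross-module test dependencies without a full `mvn package` or `mvn install`.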
[jira] [Updated] (SPARK-2955) Test code fails to compile with mvn compile without install
[ https://issues.apache.org/jira/browse/SPARK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2955: --- Assignee: Sean Owen Test code fails to compile with mvn compile without install Key: SPARK-2955 URL: https://issues.apache.org/jira/browse/SPARK-2955 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.0.2 Reporter: Sean Owen Assignee: Sean Owen Priority: Minor Labels: build, compile, scalatest, test, test-compile Fix For: 1.2.0 (This is the corrected follow-up to https://issues.apache.org/jira/browse/SPARK-2903 ) Right now, mvn compile test-compile fails to compile Spark. (Don't worry; mvn package works, so this is not major.) The issue stems from test code in some modules depending on test code in other modules. That is perfectly fine and supported by Maven. It takes extra work to get this to work with scalatest, and this has been attempted: https://github.com/apache/spark/blob/master/sql/catalyst/pom.xml#L86 This formulation is not quite enough, since the SQL Core module's tests fail to compile for lack of finding test classes in SQL Catalyst, and likewise for most Streaming integration modules depending on core Streaming test code. Example: {code} [error] /Users/srowen/Documents/spark/sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala:23: not found: type PlanTest [error] class QueryTest extends PlanTest { [error] ^ [error] /Users/srowen/Documents/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:28: package org.apache.spark.sql.test is not a value [error] test("SPARK-1669: cacheTable should be idempotent") { [error] ^ ... {code} The issue, I believe, is that generation of a test-jar is bound here to the compile phase, but the test classes are not being compiled in this phase. It should bind to the test-compile phase. 
It works when executing mvn package or mvn install since test-jar artifacts are actually generated and made available through normal Maven mechanisms as each module is built. They are then found normally, regardless of scalatest configuration. It would be nice for a simple mvn compile test-compile to work since the test code is perfectly compilable given the Maven declarations. On the plus side, this change is low-risk as it only affects tests. [~yhuai] made the original scalatest change and has glanced at this and thinks it makes sense. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3061) Maven build fails in Windows OS
[ https://issues.apache.org/jira/browse/SPARK-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098223#comment-14098223 ] Patrick Wendell commented on SPARK-3061: At this time, I'm not sure we intend to support building Spark on Windows. Maven build fails in Windows OS --- Key: SPARK-3061 URL: https://issues.apache.org/jira/browse/SPARK-3061 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.0.2 Environment: Windows Reporter: Masayoshi TSUZUKI Priority: Minor Maven build fails in Windows OS with this error message. {noformat} [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:exec (default) on project spark-core_2.10: Command execution failed. Cannot run program "unzip" (in directory C:\path\to\gitofspark\python): CreateProcess error=2, 指定されたファ... ("The specified file...", truncated) - [Help 1] {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2931. Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Josh Rosen This was fixed in: https://github.com/apache/spark/pull/1896 getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException --- Key: SPARK-2931 URL: https://issues.apache.org/jira/browse/SPARK-2931 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: Spark EC2, spark-1.1.0-snapshot1, sort-by-key spark-perf benchmark Reporter: Josh Rosen Assignee: Josh Rosen Priority: Blocker Fix For: 1.1.0 Attachments: scala-sort-by-key.err, test.patch When running Spark Perf's sort-by-key benchmark on EC2 with v1.1.0-snapshot, I get the following errors (one per task): {code} 14/08/08 18:54:22 INFO scheduler.TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, ip-172-31-14-30.us-west-2.compute.internal, PROCESS_LOCAL, 1003 bytes) 14/08/08 18:54:22 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@ip-172-31-9-213.us-west-2.compute.internal:58901/user/Executor#1436065036] with ID 0 14/08/08 18:54:22 ERROR actor.OneForOneStrategy: 1 java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.spark.scheduler.TaskSetManager.getAllowedLocalityLevel(TaskSetManager.scala:475) at org.apache.spark.scheduler.TaskSetManager.resourceOffer(TaskSetManager.scala:409) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7$$anonfun$apply$2.apply$mcVI$sp(TaskSchedulerImpl.scala:261) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:257) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3$$anonfun$apply$7.apply(TaskSchedulerImpl.scala:254) at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$3.apply(TaskSchedulerImpl.scala:254) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.TaskSchedulerImpl.resourceOffers(TaskSchedulerImpl.scala:254) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor.makeOffers(CoarseGrainedSchedulerBackend.scala:153) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor$$anonfun$receive$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:103) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} This causes the job to hang. I can deterministically reproduce this by re-running the test, either in isolation or as part of the full performance testing suite. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2865) Potential deadlock: tasks could hang forever waiting to fetch a remote block even though most tasks finish
[ https://issues.apache.org/jira/browse/SPARK-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2865. Resolution: Fixed I believe this has been resolved by virtue of other patches to the connection manager and other components. Potential deadlock: tasks could hang forever waiting to fetch a remote block even though most tasks finish -- Key: SPARK-2865 URL: https://issues.apache.org/jira/browse/SPARK-2865 Project: Spark Issue Type: Bug Components: Shuffle, Spark Core Affects Versions: 1.0.1, 1.1.0 Environment: 16-node EC2 r3.2xlarge cluster Reporter: Zongheng Yang Priority: Blocker In the application I tested, most of the tasks out of 128 tasks could finish, but sometimes (pretty deterministically) either 1 or 3 tasks would just hang forever (> 5 hrs with no progress at all) with the following stack trace. There were no apparent failures from the UI; also the nodes where the stuck tasks were running had no apparent memory/CPU/disk pressures. 
{noformat} Executor task launch worker-0 daemon prio=10 tid=0x7f32ec003800 nid=0xaac waiting on condition [0x7f33f4428000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 0x7f3e0d7198e8 (a scala.concurrent.impl.Promise$CompletionLatch) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303) at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:202) at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218) at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107) at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) at scala.concurrent.Await$.result(package.scala:107) at org.apache.spark.network.ConnectionManager.sendMessageReliablySync(ConnectionManager.scala:832) at org.apache.spark.storage.BlockManagerWorker$.syncGetBlock(BlockManagerWorker.scala:122) at org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:497) at org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:495) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.storage.BlockManager.doGetRemote(BlockManager.scala:495) at org.apache.spark.storage.BlockManager.getRemote(BlockManager.scala:481) at org.apache.spark.storage.BlockManager.get(BlockManager.scala:524) at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:44) at 
org.apache.spark.rdd.RDD.iterator(RDD.scala:227) at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} This behavior does *not* appear on 1.0 (reusing the same cluster), but appears on the master branch as of Aug 4, 2014 *and* 1.0.1. Further, I tried out [this patch|https://github.com/apache/spark/pull/1758], and
[jira] [Resolved] (SPARK-2924) Remove use of default arguments where disallowed by 2.11
[ https://issues.apache.org/jira/browse/SPARK-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2924. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1704 [https://github.com/apache/spark/pull/1704] Remove use of default arguments where disallowed by 2.11 Key: SPARK-2924 URL: https://issues.apache.org/jira/browse/SPARK-2924 Project: Spark Issue Type: Sub-task Components: Streaming Reporter: Patrick Wendell Assignee: Anand Avati Priority: Blocker Fix For: 1.1.0 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1828) Created forked version of hive-exec that doesn't bundle other dependencies
[ https://issues.apache.org/jira/browse/SPARK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098752#comment-14098752 ] Patrick Wendell commented on SPARK-1828: Maxim - I think what you are pointing out is unrelated to this exact issue. Spark hard-codes a specific version of Hive in our build. This is true whether or not we are pointing to a slightly modified version of Hive 0.12 or the actual Hive 0.12. The issue is that Hive does not have stable APIs, so we can't provide a version of Spark that is cross-compatible with different versions of Hive. We are trying to simplify our dependency on Hive to fix this. Are you proposing a specific change here? Created forked version of hive-exec that doesn't bundle other dependencies -- Key: SPARK-1828 URL: https://issues.apache.org/jira/browse/SPARK-1828 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.0.0 Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Blocker Fix For: 1.0.0 The hive-exec jar includes a bunch of Hive's dependencies in addition to hive itself (protobuf, guava, etc). See HIVE-5733. This breaks any attempt in Spark to manage those dependencies. The only solution to this problem is to publish our own version of hive-exec 0.12.0 that behaves correctly. While we are doing this, we might as well re-write the protobuf dependency to use the shaded version of protobuf 2.4.1 that we already have for Akka. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3028) sparkEventToJson should support SparkListenerExecutorMetricsUpdate
[ https://issues.apache.org/jira/browse/SPARK-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3028: --- Assignee: Sandy Ryza sparkEventToJson should support SparkListenerExecutorMetricsUpdate -- Key: SPARK-3028 URL: https://issues.apache.org/jira/browse/SPARK-3028 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Reynold Xin Assignee: Sandy Ryza Priority: Blocker Fix For: 1.1.0 SparkListenerExecutorMetricsUpdate was added without updating org.apache.spark.util.JsonProtocol.sparkEventToJson. This can crash the listener. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
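The crash pattern described in SPARK-3028, a hand-enumerated event-to-JSON dispatcher that throws on any event type it does not recognize, so adding a new event type crashes the listener, can be sketched as follows (hypothetical class names; Spark's actual JsonProtocol differs):

```java
public class EventJsonDemo {
    interface SparkEvent {}
    static class JobStart implements SparkEvent {}
    // A newly added event type the dispatcher was never taught about.
    static class MetricsUpdate implements SparkEvent {}

    // Fragile version: unknown subtypes throw, crashing the caller's thread.
    static String eventToJson(SparkEvent e) {
        if (e instanceof JobStart) return "{\"event\":\"JobStart\"}";
        throw new IllegalArgumentException("Unknown event: " + e.getClass());
    }

    // Fixed version: every known subtype is handled, and anything else
    // degrades gracefully instead of throwing.
    static String eventToJsonSafe(SparkEvent e) {
        if (e instanceof JobStart) return "{\"event\":\"JobStart\"}";
        if (e instanceof MetricsUpdate) return "{\"event\":\"MetricsUpdate\"}";
        return "{}";
    }

    public static void main(String[] args) {
        // prints {"event":"MetricsUpdate"}
        System.out.println(eventToJsonSafe(new MetricsUpdate()));
    }
}
```

In Scala this is where an exhaustive match over a sealed trait would turn the missing case into a compile-time warning rather than a runtime crash.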
[jira] [Resolved] (SPARK-3028) sparkEventToJson should support SparkListenerExecutorMetricsUpdate
[ https://issues.apache.org/jira/browse/SPARK-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3028. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1961 [https://github.com/apache/spark/pull/1961] sparkEventToJson should support SparkListenerExecutorMetricsUpdate -- Key: SPARK-3028 URL: https://issues.apache.org/jira/browse/SPARK-3028 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Reynold Xin Priority: Blocker Fix For: 1.1.0 SparkListenerExecutorMetricsUpdate was added without updating org.apache.spark.util.JsonProtocol.sparkEventToJson. This can crash the listener. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2858) Default log4j configuration no longer seems to work
[ https://issues.apache.org/jira/browse/SPARK-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098148#comment-14098148 ] Patrick Wendell edited comment on SPARK-2858 at 8/15/14 8:48 PM: - What I mean is that when I don't include any log4j.properties file, Spark fails to use its own default configuration. was (Author: pwendell): What I mean is that when I don't include any log4j.properties file, Spark fails to use it's own default configuration. Default log4j configuration no longer seems to work --- Key: SPARK-2858 URL: https://issues.apache.org/jira/browse/SPARK-2858 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell For reasons unknown this doesn't seem to be working anymore. I deleted my log4j.properties file, did a fresh build, and noticed it still gave me a verbose stack trace when port 4040 was contended (which is a log we silence in the conf). I actually think this was an issue even before [~sowen]'s changes, so not sure what's up. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3075) Expose a way for users to parse event logs
[ https://issues.apache.org/jira/browse/SPARK-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3075: --- Target Version/s: 1.2.0 Fix Version/s: (was: 1.2.0) Expose a way for users to parse event logs -- Key: SPARK-3075 URL: https://issues.apache.org/jira/browse/SPARK-3075 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.0.2 Reporter: Andrew Or Both ReplayListenerBus and util.JsonProtocol are private[spark], so if the user wants to parse the event logs themselves for analytics, they will have to write their own JSON deserializers (or do some crazy reflection to access these methods). We should expose an easy way for them to do this. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2532) Fix issues with consolidated shuffle
[ https://issues.apache.org/jira/browse/SPARK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2532: --- Target Version/s: 1.2.0 (was: 1.1.0) Fix issues with consolidated shuffle Key: SPARK-2532 URL: https://issues.apache.org/jira/browse/SPARK-2532 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: All Reporter: Mridul Muralidharan Assignee: Mridul Muralidharan Priority: Critical Will file PR with changes as soon as merge is done (earlier merge became outdated in 2 weeks unfortunately :) ). Consolidated shuffle is broken in multiple ways in Spark: a) Task failure(s) can cause the state to become inconsistent. b) Multiple reverts or a combination of close/revert/close can cause the state to become inconsistent (as part of exception/error handling). c) Some of the API in the block writer causes implementation issues - for example: a revert is always followed by a close, but the implementation tries to keep them separate, resulting in surface for errors. d) Fetching data from consolidated shuffle files can go badly wrong if the file is being actively written to: it computes length by subtracting the next offset from the current offset (or the file length if this is the last offset); the latter fails when a fetch happens in parallel with a write. Note, this happens even if there are no task failures of any kind! This usually results in stream corruption or decompression errors. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
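Point (d) above can be made concrete: in a consolidated file, a segment's length is derived from the next segment's offset (or the file length for the last segment), so a reader that races a writer computes a wrong length. A small sketch of that arithmetic (Python illustration; all names invented, not Spark's actual code):

```python
def segment_length(offsets, file_length, i):
    """Length of segment i in a consolidated file: next offset minus
    this one, falling back to the file length for the last segment."""
    if i + 1 < len(offsets):
        return offsets[i + 1] - offsets[i]
    return file_length - offsets[i]

# Stable file: three segments starting at 0, 100, 250 in a 400-byte file.
offsets = [0, 100, 250]
assert segment_length(offsets, 400, 1) == 150

# While segment 2 is still being appended, the file length a reader sees
# keeps moving, so the last segment's computed length is wrong (too short)
# until the write completes.
in_progress_file_length = 260  # writer has emitted only 10 bytes so far
assert segment_length(offsets, in_progress_file_length, 2) == 10
```

This is why the report notes that fetches racing an active write corrupt streams even with zero task failures.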
[jira] [Updated] (SPARK-2977) Fix handling of short shuffle manager names in ShuffleBlockManager
[ https://issues.apache.org/jira/browse/SPARK-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2977: --- Priority: Critical (was: Major) Fix handling of short shuffle manager names in ShuffleBlockManager -- Key: SPARK-2977 URL: https://issues.apache.org/jira/browse/SPARK-2977 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Josh Rosen Priority: Critical Since we allow short names for {{spark.shuffle.manager}}, all code that reads that configuration property should be prepared to handle the short names. See my comment at https://github.com/apache/spark/pull/1799#discussion_r16029607 (opening this as a JIRA so we don't forget to fix it). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
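The defensive lookup the issue asks for amounts to resolving an alias table before comparing or instantiating. A sketch (Python for illustration; the short names `sort` and `hash` are from the issue, but the fully qualified class names are written from memory and may not match Spark's exactly):

```python
# Hypothetical alias table mapping short spark.shuffle.manager values
# to fully qualified class names (class names approximate).
SHUFFLE_MANAGER_ALIASES = {
    "sort": "org.apache.spark.shuffle.sort.SortShuffleManager",
    "hash": "org.apache.spark.shuffle.hash.HashShuffleManager",
}

def resolve_shuffle_manager(configured):
    """Accept either a short alias or a fully qualified class name."""
    return SHUFFLE_MANAGER_ALIASES.get(configured.lower(), configured)

# Short names and full class names both resolve consistently.
assert resolve_shuffle_manager("SORT").endswith("SortShuffleManager")
assert resolve_shuffle_manager("com.example.MyShuffleManager") == "com.example.MyShuffleManager"
```

Centralizing the resolution in one helper is what keeps every consumer of the property consistent, which is the point of the linked review comment.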
[jira] [Updated] (SPARK-2044) Pluggable interface for shuffles
[ https://issues.apache.org/jira/browse/SPARK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2044: --- Target Version/s: 1.2.0 (was: 1.1.0) Pluggable interface for shuffles Key: SPARK-2044 URL: https://issues.apache.org/jira/browse/SPARK-2044 Project: Spark Issue Type: Improvement Components: Shuffle, Spark Core Reporter: Matei Zaharia Assignee: Matei Zaharia Attachments: Pluggableshuffleproposal.pdf Given that a lot of the current activity in Spark Core is in shuffles, I wanted to propose factoring out shuffle implementations in a way that will make experimentation easier. Ideally we will converge on one implementation, but for a while, this could also be used to have several implementations coexist. I'm suggesting this because I'm aware of at least three efforts to look at shuffle (from Yahoo!, Intel and Databricks). Some of the things people are investigating are: * Push-based shuffle where data moves directly from mappers to reducers * Sorting-based instead of hash-based shuffle, to create fewer files (helps a lot with file handles and memory usage on large shuffles) * External spilling within a key * Changing the level of parallelism or even algorithm for downstream stages at runtime based on statistics of the map output (this is a thing we had prototyped in the Shark research project but never merged in core) I've attached a design doc with a proposed interface. It's not too crazy because the interface between shuffles and the rest of the code is already pretty narrow (just some iterators for reading data and a writer interface for writing it). Bigger changes will be needed in the interaction with DAGScheduler and BlockManager for some of the ideas above, but we can handle those separately, and this interface will allow us to experiment with some short-term stuff sooner. If things go well I'd also like to send a sort-based shuffle implementation for 1.1, but we'll see how the timing on that works out.
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2044) Pluggable interface for shuffles
[ https://issues.apache.org/jira/browse/SPARK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099136#comment-14099136 ] Patrick Wendell commented on SPARK-2044: A lot of this has been fixed in 1.1, so I moved the target version to 1.2. [~matei] we can also close this with fixVersion=1.1.0 if you consider the initial issue fixed. Pluggable interface for shuffles Key: SPARK-2044 URL: https://issues.apache.org/jira/browse/SPARK-2044 Project: Spark Issue Type: Improvement Components: Shuffle, Spark Core Reporter: Matei Zaharia Assignee: Matei Zaharia Attachments: Pluggableshuffleproposal.pdf Given that a lot of the current activity in Spark Core is in shuffles, I wanted to propose factoring out shuffle implementations in a way that will make experimentation easier. Ideally we will converge on one implementation, but for a while, this could also be used to have several implementations coexist. I'm suggesting this because I'm aware of at least three efforts to look at shuffle (from Yahoo!, Intel and Databricks). Some of the things people are investigating are: * Push-based shuffle where data moves directly from mappers to reducers * Sorting-based instead of hash-based shuffle, to create fewer files (helps a lot with file handles and memory usage on large shuffles) * External spilling within a key * Changing the level of parallelism or even algorithm for downstream stages at runtime based on statistics of the map output (this is a thing we had prototyped in the Shark research project but never merged in core) I've attached a design doc with a proposed interface. It's not too crazy because the interface between shuffles and the rest of the code is already pretty narrow (just some iterators for reading data and a writer interface for writing it). 
Bigger changes will be needed in the interaction with DAGScheduler and BlockManager for some of the ideas above, but we can handle those separately, and this interface will allow us to experiment with some short-term stuff sooner. If things go well I'd also like to send a sort-based shuffle implementation for 1.1, but we'll see how the timing on that works out. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
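The "pretty narrow" interface described in SPARK-2044 above (iterators for reading data, a writer interface for writing it) can be sketched as follows. This is a Python illustration of the shape of such an interface, not the proposed Scala API from the attached design doc, and every name here is invented:

```python
from abc import ABC, abstractmethod
from collections import defaultdict

class ShuffleWriter(ABC):
    @abstractmethod
    def write(self, records):
        """Write an iterable of (key, value) records for one map task."""

class ShuffleReader(ABC):
    @abstractmethod
    def read(self):
        """Return an iterator over (key, value) records for a reduce side."""

# Toy in-memory backend, only to show how small the surface is; a
# sort-based or push-based manager would plug in behind the same API.
class InMemoryWriter(ShuffleWriter):
    def __init__(self, store, shuffle_id):
        self._store, self._shuffle_id = store, shuffle_id

    def write(self, records):
        self._store[self._shuffle_id].extend(records)

class InMemoryReader(ShuffleReader):
    def __init__(self, store, shuffle_id):
        self._store, self._shuffle_id = store, shuffle_id

    def read(self):
        return iter(self._store[self._shuffle_id])

class InMemoryShuffleManager:
    def __init__(self):
        self._store = defaultdict(list)

    def get_writer(self, shuffle_id, map_id):
        return InMemoryWriter(self._store, shuffle_id)

    def get_reader(self, shuffle_id):
        return InMemoryReader(self._store, shuffle_id)

manager = InMemoryShuffleManager()
manager.get_writer(shuffle_id=0, map_id=0).write([("a", 1), ("b", 2)])
manager.get_writer(shuffle_id=0, map_id=1).write([("a", 3)])
print(sorted(manager.get_reader(0).read()))  # [('a', 1), ('a', 3), ('b', 2)]
```

Because callers only touch the reader/writer abstractions, swapping the backend requires no changes in the rest of the engine, which is exactly the experimentation the proposal is after.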
[jira] [Updated] (SPARK-2585) Remove special handling of Hadoop JobConf
[ https://issues.apache.org/jira/browse/SPARK-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2585: --- Target Version/s: 1.2.0 (was: 1.1.0) Remove special handling of Hadoop JobConf - Key: SPARK-2585 URL: https://issues.apache.org/jira/browse/SPARK-2585 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Patrick Wendell Assignee: Josh Rosen Priority: Critical This is a follow up to SPARK-2521 and should close SPARK-2546 (provided the implementation does not use shared conf objects). We no longer need to specially broadcast the Hadoop configuration since we are broadcasting RDD data anyways. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2546) Configuration object thread safety issue
[ https://issues.apache.org/jira/browse/SPARK-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2546: --- Target Version/s: 1.2.0 (was: 1.1.0) Configuration object thread safety issue Key: SPARK-2546 URL: https://issues.apache.org/jira/browse/SPARK-2546 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1 Reporter: Andrew Ash Assignee: Josh Rosen Priority: Critical // observed in 0.9.1 but expected to exist in 1.0.1 as well This ticket is copy-pasted from a thread on the dev@ list: {quote} We discovered a very interesting bug in Spark at work last week in Spark 0.9.1 — that the way Spark uses the Hadoop Configuration object is prone to thread safety issues. I believe it still applies in Spark 1.0.1 as well. Let me explain: Observations - Was running a relatively simple job (read from Avro files, do a map, do another map, write back to Avro files) - 412 of 413 tasks completed, but the last task was hung in RUNNING state - The 412 successful tasks completed in median time 3.4s - The last hung task didn't finish even in 20 hours - The executor with the hung task was responsible for 100% of one core of CPU usage - Jstack of the executor attached (relevant thread pasted below) Diagnosis After doing some code spelunking, we determined the issue was concurrent use of a Configuration object for each task on an executor. In Hadoop each task runs in its own JVM, but in Spark multiple tasks can run in the same JVM, so the single-threaded access assumptions of the Configuration object no longer hold in Spark. The specific issue is that the AvroRecordReader actually _modifies_ the JobConf it's given when it's instantiated! It adds a key for the RPC protocol engine in the process of connecting to the Hadoop FileSystem. When many tasks start at the same time (like at the start of a job), many tasks are adding this configuration item to the one Configuration object at once. 
Internally Configuration uses a java.util.HashMap, which isn't threadsafe… The below post is an excellent explanation of what happens in the situation where multiple threads insert into a HashMap at the same time. http://mailinator.blogspot.com/2009/06/beautiful-race-condition.html The gist is that you have a thread following a cycle of linked list nodes indefinitely. This exactly matches our observations of the 100% CPU core and also the final location in the stack trace. So it seems the way Spark shares a Configuration object between task threads in an executor is incorrect. We need some way to prevent concurrent access to a single Configuration object. Proposed fix We can clone the JobConf object in HadoopRDD.getJobConf() so each task gets its own JobConf object (and thus Configuration object). The optimization of broadcasting the Configuration object across the cluster can remain, but on the other side I think it needs to be cloned for each task to allow for concurrent access. I'm not sure of the performance implications, but the comments suggest that the Configuration object is ~10KB so I would expect a clone on the object to be relatively speedy. Has this been observed before? Does my suggested fix make sense? I'd be happy to file a Jira ticket and continue discussion there for the right way to fix. Thanks! Andrew P.S. For others seeing this issue, our temporary workaround is to enable spark.speculation, which retries failed (or hung) tasks on other machines. 
{noformat} Executor task launch worker-6 daemon prio=10 tid=0x7f91f01fe000 nid=0x54b1 runnable [0x7f92d74f1000] java.lang.Thread.State: RUNNABLE at java.util.HashMap.transfer(HashMap.java:601) at java.util.HashMap.resize(HashMap.java:581) at java.util.HashMap.addEntry(HashMap.java:879) at java.util.HashMap.put(HashMap.java:505) at org.apache.hadoop.conf.Configuration.set(Configuration.java:803) at org.apache.hadoop.conf.Configuration.set(Configuration.java:783) at org.apache.hadoop.conf.Configuration.setClass(Configuration.java:1662) at org.apache.hadoop.ipc.RPC.setProtocolEngine(RPC.java:193) at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:343) at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:168) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125) at
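The proposed fix above (clone the JobConf in HadoopRDD.getJobConf() so each task gets its own copy) can be sketched with a stand-in for the shared Configuration. This is a Python illustration of the pattern only, not the Scala/Hadoop code, and every name is invented:

```python
import copy
import threading

# Stand-in for the broadcast Hadoop Configuration (~10KB in practice).
shared_conf = {"fs.defaultFS": "hdfs://namenode:8020"}

def get_job_conf():
    # Each task receives its own deep copy, so a record reader that
    # mutates the conf (e.g. adding an RPC engine key on connect)
    # cannot race against other tasks sharing one object.
    return copy.deepcopy(shared_conf)

def run_task(task_id, results):
    conf = get_job_conf()
    conf["rpc.engine"] = f"engine-{task_id}"  # task-local mutation
    results[task_id] = conf

results = {}
threads = [threading.Thread(target=run_task, args=(i, results)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert "rpc.engine" not in shared_conf  # the shared original is untouched
assert len(results) == 8
```

The broadcast optimization survives unchanged; only the per-task handoff copies, which is what makes the concurrent mutation safe.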
[jira] [Commented] (SPARK-2585) Remove special handling of Hadoop JobConf
[ https://issues.apache.org/jira/browse/SPARK-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099234#comment-14099234 ] Patrick Wendell commented on SPARK-2585: Unfortunately after a lot of effort we still can't get the test times down on this one and it's still unclear whether it will cause performance regressions. Since this isn't particularly critical from a user perspective (it's mostly about simplifying internals) I think it's best to punt this to 1.2. One unfortunate thing is that it means SPARK-2546 will remain broken in 1.1. Remove special handling of Hadoop JobConf - Key: SPARK-2585 URL: https://issues.apache.org/jira/browse/SPARK-2585 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Patrick Wendell Assignee: Josh Rosen Priority: Critical This is a follow up to SPARK-2521 and should close SPARK-2546 (provided the implementation does not use shared conf objects). We no longer need to specially broadcast the Hadoop configuration since we are broadcasting RDD data anyways. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2546) Configuration object thread safety issue
[ https://issues.apache.org/jira/browse/SPARK-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099239#comment-14099239 ] Patrick Wendell commented on SPARK-2546: Hey Andrew I think due to us cutting SPARK-2585 from this release it will remain broken in Spark 1.1. We could look into a solution based on clone()'ing the conf for future patch releases in the 1.1 branch. Configuration object thread safety issue Key: SPARK-2546 URL: https://issues.apache.org/jira/browse/SPARK-2546 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1 Reporter: Andrew Ash Assignee: Josh Rosen Priority: Critical // observed in 0.9.1 but expected to exist in 1.0.1 as well This ticket is copy-pasted from a thread on the dev@ list: {quote} We discovered a very interesting bug in Spark at work last week in Spark 0.9.1 — that the way Spark uses the Hadoop Configuration object is prone to thread safety issues. I believe it still applies in Spark 1.0.1 as well. Let me explain: Observations - Was running a relatively simple job (read from Avro files, do a map, do another map, write back to Avro files) - 412 of 413 tasks completed, but the last task was hung in RUNNING state - The 412 successful tasks completed in median time 3.4s - The last hung task didn't finish even in 20 hours - The executor with the hung task was responsible for 100% of one core of CPU usage - Jstack of the executor attached (relevant thread pasted below) Diagnosis After doing some code spelunking, we determined the issue was concurrent use of a Configuration object for each task on an executor. In Hadoop each task runs in its own JVM, but in Spark multiple tasks can run in the same JVM, so the single-threaded access assumptions of the Configuration object no longer hold in Spark. The specific issue is that the AvroRecordReader actually _modifies_ the JobConf it's given when it's instantiated! 
It adds a key for the RPC protocol engine in the process of connecting to the Hadoop FileSystem. When many tasks start at the same time (like at the start of a job), many tasks are adding this configuration item to the one Configuration object at once. Internally Configuration uses a java.util.HashMap, which isn't threadsafe… The below post is an excellent explanation of what happens in the situation where multiple threads insert into a HashMap at the same time. http://mailinator.blogspot.com/2009/06/beautiful-race-condition.html The gist is that you have a thread following a cycle of linked list nodes indefinitely. This exactly matches our observations of the 100% CPU core and also the final location in the stack trace. So it seems the way Spark shares a Configuration object between task threads in an executor is incorrect. We need some way to prevent concurrent access to a single Configuration object. Proposed fix We can clone the JobConf object in HadoopRDD.getJobConf() so each task gets its own JobConf object (and thus Configuration object). The optimization of broadcasting the Configuration object across the cluster can remain, but on the other side I think it needs to be cloned for each task to allow for concurrent access. I'm not sure of the performance implications, but the comments suggest that the Configuration object is ~10KB so I would expect a clone on the object to be relatively speedy. Has this been observed before? Does my suggested fix make sense? I'd be happy to file a Jira ticket and continue discussion there for the right way to fix. Thanks! Andrew P.S. For others seeing this issue, our temporary workaround is to enable spark.speculation, which retries failed (or hung) tasks on other machines. 
{noformat} Executor task launch worker-6 daemon prio=10 tid=0x7f91f01fe000 nid=0x54b1 runnable [0x7f92d74f1000] java.lang.Thread.State: RUNNABLE at java.util.HashMap.transfer(HashMap.java:601) at java.util.HashMap.resize(HashMap.java:581) at java.util.HashMap.addEntry(HashMap.java:879) at java.util.HashMap.put(HashMap.java:505) at org.apache.hadoop.conf.Configuration.set(Configuration.java:803) at org.apache.hadoop.conf.Configuration.set(Configuration.java:783) at org.apache.hadoop.conf.Configuration.setClass(Configuration.java:1662) at org.apache.hadoop.ipc.RPC.setProtocolEngine(RPC.java:193) at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:343) at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:168) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436) at
[jira] [Updated] (SPARK-2914) spark.*.extraJavaOptions are evaluated too many times
[ https://issues.apache.org/jira/browse/SPARK-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2914: --- Priority: Blocker (was: Major) spark.*.extraJavaOptions are evaluated too many times - Key: SPARK-2914 URL: https://issues.apache.org/jira/browse/SPARK-2914 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.2 Reporter: Andrew Or Priority: Blocker Fix For: 1.1.0 If we pass the following to spark.executor.extraJavaOptions, {code} -Dthem.quotes=the \best\ joke ever -Dthem.backslashes= \\ \\ {code} These will first be escaped once when the SparkSubmit JVM is launched. This becomes the following string. {code} scala> sc.getConf.get("spark.driver.extraJavaOptions") res0: String = -Dthem.quotes=the best joke ever -Dthem.backslashes= \ \ \\ {code} This will be split incorrectly by Utils.splitCommandString. Of course, this also affects spark.driver.extraJavaOptions. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2914) spark.*.extraJavaOptions are evaluated too many times
[ https://issues.apache.org/jira/browse/SPARK-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2914: --- Priority: Critical (was: Blocker) spark.*.extraJavaOptions are evaluated too many times - Key: SPARK-2914 URL: https://issues.apache.org/jira/browse/SPARK-2914 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.2 Reporter: Andrew Or Priority: Critical Fix For: 1.1.0 If we pass the following to spark.executor.extraJavaOptions, {code} -Dthem.quotes=the \best\ joke ever -Dthem.backslashes= \\ \\ {code} These will first be escaped once when the SparkSubmit JVM is launched. This becomes the following string. {code} scala> sc.getConf.get("spark.driver.extraJavaOptions") res0: String = -Dthem.quotes=the best joke ever -Dthem.backslashes= \ \ \\ {code} This will be split incorrectly by Utils.splitCommandString. Of course, this also affects spark.driver.extraJavaOptions. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
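The failure mode (one round of shell-style unescaping applied per JVM launch) is easy to reproduce with any shell-style splitter. A Python illustration using shlex; the option string is invented for the demo and this is not the exact behavior of Spark's Utils.splitCommandString:

```python
import shlex

# A Java option whose value legitimately contains quotes.
value = '-Dthem.quotes="the \\"best\\" joke ever"'

# First launch: one round of shell-style splitting, as intended.
first_pass = shlex.split(value)
print(first_pass)   # ['-Dthem.quotes=the "best" joke ever']

# A second launch re-splits the already-unescaped string: the inner
# quotes are consumed and the single option shatters into four tokens.
second_pass = shlex.split(first_pass[0])
print(second_pass)  # ['-Dthem.quotes=the', 'best', 'joke', 'ever']
```

Each extra evaluation strips one layer of quoting, which is why values containing quotes or backslashes come out mangled by the time the executor JVM sees them.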
[jira] [Updated] (SPARK-3025) Allow JDBC clients to set a fair scheduler pool
[ https://issues.apache.org/jira/browse/SPARK-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3025: --- Priority: Blocker (was: Major) Allow JDBC clients to set a fair scheduler pool --- Key: SPARK-3025 URL: https://issues.apache.org/jira/browse/SPARK-3025 Project: Spark Issue Type: Bug Components: SQL Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Blocker -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3079) Hive build should depend on parquet serdes
Patrick Wendell created SPARK-3079: -- Summary: Hive build should depend on parquet serdes Key: SPARK-3079 URL: https://issues.apache.org/jira/browse/SPARK-3079 Project: Spark Issue Type: Bug Components: SQL Reporter: Patrick Wendell Assignee: Patrick Wendell This will allow people to read parquet hive tables out of the box. Also, I think there are no transitive dependencies to worry about (I need to audit this). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3015) Removing broadcast in quick successions causes Akka timeout
[ https://issues.apache.org/jira/browse/SPARK-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3015: --- Assignee: Andrew Or Removing broadcast in quick successions causes Akka timeout --- Key: SPARK-3015 URL: https://issues.apache.org/jira/browse/SPARK-3015 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.2 Environment: Standalone EC2 Spark shell Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Fix For: 1.1.0 This issue is originally reported in SPARK-2916 in the context of MLLib, but we were able to reproduce it using a simple Spark shell command: {code} (1 to 1).foreach { i => sc.parallelize(1 to 1000, 48).sum } {code} We still do not have a full understanding of the issue, but we have gleaned the following information so far. When the driver runs a GC, it attempts to clean up all the broadcast blocks that go out of scope at once. This causes the driver to send out many blocking RemoveBroadcast messages to the executors, which in turn send out blocking UpdateBlockInfo messages back to the driver. Both of these calls block until they receive the expected responses. We suspect that the high frequency at which we send these blocking messages is the cause of either dropped messages or internal deadlock somewhere. Unfortunately, it is highly difficult to reproduce depending on the environment. We have been able to reproduce it on a 6-node cluster in us-west-2, but not in us-west-1, for instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-3015) Removing broadcast in quick successions causes Akka timeout
[ https://issues.apache.org/jira/browse/SPARK-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3015. Resolution: Fixed Issue resolved by pull request 1931 [https://github.com/apache/spark/pull/1931] Removing broadcast in quick successions causes Akka timeout --- Key: SPARK-3015 URL: https://issues.apache.org/jira/browse/SPARK-3015 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.2 Environment: Standalone EC2 Spark shell Reporter: Andrew Or Priority: Blocker Fix For: 1.1.0 This issue is originally reported in SPARK-2916 in the context of MLLib, but we were able to reproduce it using a simple Spark shell command: {code} (1 to 1).foreach { i => sc.parallelize(1 to 1000, 48).sum } {code} We still do not have a full understanding of the issue, but we have gleaned the following information so far. When the driver runs a GC, it attempts to clean up all the broadcast blocks that go out of scope at once. This causes the driver to send out many blocking RemoveBroadcast messages to the executors, which in turn send out blocking UpdateBlockInfo messages back to the driver. Both of these calls block until they receive the expected responses. We suspect that the high frequency at which we send these blocking messages is the cause of either dropped messages or internal deadlock somewhere. Unfortunately, it is highly difficult to reproduce depending on the environment. We have been able to reproduce it on a 6-node cluster in us-west-2, but not in us-west-1, for instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3015) Removing broadcast in quick successions causes Akka timeout
[ https://issues.apache.org/jira/browse/SPARK-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099517#comment-14099517 ] Patrick Wendell commented on SPARK-3015: [~andrewor] I changed Affects version/s to 1.1.0 instead of 1.0.2 because I don't think this issue was ever seen in Spark 1.0.2. Is that correct? Removing broadcast in quick successions causes Akka timeout --- Key: SPARK-3015 URL: https://issues.apache.org/jira/browse/SPARK-3015 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: Standalone EC2 Spark shell Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Fix For: 1.1.0 This issue is originally reported in SPARK-2916 in the context of MLLib, but we were able to reproduce it using a simple Spark shell command: {code} (1 to 1).foreach { i => sc.parallelize(1 to 1000, 48).sum } {code} We still do not have a full understanding of the issue, but we have gleaned the following information so far. When the driver runs a GC, it attempts to clean up all the broadcast blocks that go out of scope at once. This causes the driver to send out many blocking RemoveBroadcast messages to the executors, which in turn send out blocking UpdateBlockInfo messages back to the driver. Both of these calls block until they receive the expected responses. We suspect that the high frequency at which we send these blocking messages is the cause of either dropped messages or internal deadlock somewhere. Unfortunately, it is highly difficult to reproduce depending on the environment. We have been able to reproduce it on a 6-node cluster in us-west-2, but not in us-west-1, for instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2916) [MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations
[ https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2916. Resolution: Fixed Fixed by virtue of SPARK-3015 [MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations -- Key: SPARK-2916 URL: https://issues.apache.org/jira/browse/SPARK-2916 Project: Spark Issue Type: Bug Components: MLlib, Spark Core Reporter: Burak Yavuz Priority: Blocker While running any of the regression algorithms with gradient descent, the treeAggregate blows up after several iterations. Observed on an EC2 cluster with 16 nodes and matrix dimensions of 1,000,000 x 5,000. In order to replicate the problem, use aggregate multiple times, maybe over 50-60 times. Testing led to a possible workaround: setting `spark.cleaner.referenceTracking false` seems to help. So the problem is most probably related to the cleanup. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3015) Removing broadcast in quick successions causes Akka timeout
[ https://issues.apache.org/jira/browse/SPARK-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3015: --- Affects Version/s: (was: 1.0.2) 1.1.0 Removing broadcast in quick successions causes Akka timeout --- Key: SPARK-3015 URL: https://issues.apache.org/jira/browse/SPARK-3015 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: Standalone EC2 Spark shell Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Fix For: 1.1.0 This issue is originally reported in SPARK-2916 in the context of MLLib, but we were able to reproduce it using a simple Spark shell command: {code} (1 to 1).foreach { i => sc.parallelize(1 to 1000, 48).sum } {code} We still do not have a full understanding of the issue, but we have gleaned the following information so far. When the driver runs a GC, it attempts to clean up all the broadcast blocks that go out of scope at once. This causes the driver to send out many blocking RemoveBroadcast messages to the executors, which in turn send out blocking UpdateBlockInfo messages back to the driver. Both of these calls block until they receive the expected responses. We suspect that the high frequency at which we send these blocking messages is the cause of either dropped messages or internal deadlock somewhere. Unfortunately, it is highly difficult to reproduce depending on the environment. We have been able to reproduce it on a 6-node cluster in us-west-2, but not in us-west-1, for instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-3076) Gracefully report build timeouts in Jenkins
[ https://issues.apache.org/jira/browse/SPARK-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3076. Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1974 [https://github.com/apache/spark/pull/1974] Gracefully report build timeouts in Jenkins --- Key: SPARK-3076 URL: https://issues.apache.org/jira/browse/SPARK-3076 Project: Spark Issue Type: Sub-task Components: Build Reporter: Nicholas Chammas Priority: Minor Fix For: 1.2.0 Copy of dev list thread: {quote} Jenkins runs for this PR https://github.com/apache/spark/pull/1960 timed out without notification. The relevant Jenkins logs are at https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18588/consoleFull https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18592/consoleFull https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18597/consoleFull On Fri, Aug 15, 2014 at 11:44 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Shivaram, Can you point us to an example of that happening? The Jenkins console output, that is. Nick On Fri, Aug 15, 2014 at 2:28 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: Also I think Jenkins doesn't post build timeouts to github. Is there any way we can fix that? On Aug 15, 2014 9:04 AM, Patrick Wendell pwend...@gmail.com wrote: Hi All, I noticed that all PR tests run overnight had failed due to timeouts. The patch that updates the netty shuffle I believe somehow inflated the build time significantly. That patch had been tested, but one change was made before it was merged that was not tested. I've reverted the patch for now to see if it brings the build times back down. - Patrick {quote} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3076) Gracefully report build timeouts in Jenkins
[ https://issues.apache.org/jira/browse/SPARK-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3076: --- Assignee: Nicholas Chammas Gracefully report build timeouts in Jenkins --- Key: SPARK-3076 URL: https://issues.apache.org/jira/browse/SPARK-3076 Project: Spark Issue Type: Sub-task Components: Build Reporter: Nicholas Chammas Assignee: Nicholas Chammas Priority: Minor Fix For: 1.2.0 Copy of dev list thread: {quote} Jenkins runs for this PR https://github.com/apache/spark/pull/1960 timed out without notification. The relevant Jenkins logs are at https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18588/consoleFull https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18592/consoleFull https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18597/consoleFull On Fri, Aug 15, 2014 at 11:44 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Shivaram, Can you point us to an example of that happening? The Jenkins console output, that is. Nick On Fri, Aug 15, 2014 at 2:28 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: Also I think Jenkins doesn't post build timeouts to github. Is there any way we can fix that? On Aug 15, 2014 9:04 AM, Patrick Wendell pwend...@gmail.com wrote: Hi All, I noticed that all PR tests run overnight had failed due to timeouts. The patch that updates the netty shuffle, I believe, somehow inflated the build time significantly. That patch had been tested, but one change was made before it was merged that was not tested. I've reverted the patch for now to see if it brings the build times back down. - Patrick {quote}
[jira] [Resolved] (SPARK-2977) Fix handling of short shuffle manager names in ShuffleBlockManager
[ https://issues.apache.org/jira/browse/SPARK-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2977. Resolution: Fixed Fix handling of short shuffle manager names in ShuffleBlockManager -- Key: SPARK-2977 URL: https://issues.apache.org/jira/browse/SPARK-2977 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Josh Rosen Assignee: Josh Rosen Priority: Critical Since we allow short names for {{spark.shuffle.manager}}, all code that reads that configuration property should be prepared to handle the short names. See my comment at https://github.com/apache/spark/pull/1799#discussion_r16029607 (opening this as a JIRA so we don't forget to fix it).
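The kind of normalization the issue asks for can be sketched as a small lookup table. This is an assumption-laden sketch, not the actual ShuffleBlockManager code; the two short names and their target classes reflect what Spark 1.1 shipped, but the helper name is invented.

```scala
// Map the short names accepted for spark.shuffle.manager to fully
// qualified class names; any other value is assumed to already be a
// user-supplied class name and is passed through unchanged.
val shortShuffleNames: Map[String, String] = Map(
  "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
  "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")

// Hypothetical helper every reader of the property would call.
def resolveShuffleManager(configured: String): String =
  shortShuffleNames.getOrElse(configured.toLowerCase, configured)
```

Centralizing the mapping in one helper is the point of the bug: each call site that read the raw property had to duplicate this logic, and ShuffleBlockManager didn't.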
[jira] [Resolved] (SPARK-2878) Inconsistent Kryo serialisation with custom Kryo Registrator
[ https://issues.apache.org/jira/browse/SPARK-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2878. Resolution: Fixed Fixed by SPARK-3046 via: https://github.com/apache/spark/pull/1972 Inconsistent Kryo serialisation with custom Kryo Registrator Key: SPARK-2878 URL: https://issues.apache.org/jira/browse/SPARK-2878 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0, 1.0.2 Environment: Linux RedHat EL 6, 4-node Spark cluster. Reporter: Graham Dennis Assignee: Graham Dennis Priority: Critical The custom Kryo Registrator (a class with the org.apache.spark.serializer.KryoRegistrator trait) is not used with every Kryo instance created, and this causes inconsistent serialisation and deserialisation. The Kryo Registrator is sometimes not used because of a ClassNotFound exception that only occurs if it *isn't* the Worker thread (of an Executor) that tries to create the KryoRegistrator. A complete description of the problem and a project reproducing the problem can be found at https://github.com/GrahamDennis/spark-kryo-serialisation I have currently only tested this with Spark 1.0.0, but will try to test against 1.0.2.
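The ClassNotFound symptom described here is the classic class-loader visibility problem: only the executor's worker threads carry a context class loader that can see the user's jar. A hedged sketch of the usual workaround (not the actual KryoSerializer fix, and `loadRegistratorClass` is an invented name):

```scala
// Resolve a registrator class via the thread's context class loader when
// one is set (e.g. on an Executor worker thread, where user jars are
// visible), falling back to the loader that loaded this code itself.
def loadRegistratorClass(name: String): Class[_] = {
  val loader = Option(Thread.currentThread.getContextClassLoader)
    .getOrElse(getClass.getClassLoader)
  Class.forName(name, true, loader)
}
```

With only a bare `Class.forName(name)`, threads created outside the worker pool resolve against the wrong loader and silently fall back to an unregistered Kryo instance, which is exactly the inconsistency the reporter observed.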
[jira] [Updated] (SPARK-2878) Inconsistent Kryo serialisation with custom Kryo Registrator
[ https://issues.apache.org/jira/browse/SPARK-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2878: --- Fix Version/s: 1.1.0 Inconsistent Kryo serialisation with custom Kryo Registrator Key: SPARK-2878 URL: https://issues.apache.org/jira/browse/SPARK-2878 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0, 1.0.2 Environment: Linux RedHat EL 6, 4-node Spark cluster. Reporter: Graham Dennis Assignee: Graham Dennis Priority: Critical Fix For: 1.1.0 The custom Kryo Registrator (a class with the org.apache.spark.serializer.KryoRegistrator trait) is not used with every Kryo instance created, and this causes inconsistent serialisation and deserialisation. The Kryo Registrator is sometimes not used because of a ClassNotFound exception that only occurs if it *isn't* the Worker thread (of an Executor) that tries to create the KryoRegistrator. A complete description of the problem and a project reproducing the problem can be found at https://github.com/GrahamDennis/spark-kryo-serialisation I have currently only tested this with Spark 1.0.0, but will try to test against 1.0.2.
[jira] [Updated] (SPARK-2878) Inconsistent Kryo serialisation with custom Kryo Registrator
[ https://issues.apache.org/jira/browse/SPARK-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2878: --- Fix Version/s: 1.0.3 Inconsistent Kryo serialisation with custom Kryo Registrator Key: SPARK-2878 URL: https://issues.apache.org/jira/browse/SPARK-2878 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0, 1.0.2 Environment: Linux RedHat EL 6, 4-node Spark cluster. Reporter: Graham Dennis Assignee: Graham Dennis Priority: Critical Fix For: 1.1.0, 1.0.3 The custom Kryo Registrator (a class with the org.apache.spark.serializer.KryoRegistrator trait) is not used with every Kryo instance created, and this causes inconsistent serialisation and deserialisation. The Kryo Registrator is sometimes not used because of a ClassNotFound exception that only occurs if it *isn't* the Worker thread (of an Executor) that tries to create the KryoRegistrator. A complete description of the problem and a project reproducing the problem can be found at https://github.com/GrahamDennis/spark-kryo-serialisation I have currently only tested this with Spark 1.0.0, but will try to test against 1.0.2.
[jira] [Commented] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099849#comment-14099849 ] Patrick Wendell commented on SPARK-2881: Actually since this uses a static system property it might be better to just do [~mridul]'s suggestion and keep this simple. One possibility is just a static code block in the Snappy compression codec itself that sets it to a random subdirectory of /tmp/ Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Priority: Blocker I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.
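The "static code block" idea from the comment could look roughly like this. The `org.xerial.snappy.tempdir` property is real snappy-java configuration; everything else (the placement, the prefix) is a sketch, not the patch that was merged.

```scala
import java.nio.file.{Files, Paths}

// Point snappy-java at a fresh per-process subdirectory of /tmp, so one
// user's restrictively-permissioned /tmp/snappy-*.so can't block another
// user's job. Only set the property if the user hasn't set it already.
if (System.getProperty("org.xerial.snappy.tempdir") == null) {
  val dir = Files.createTempDirectory(Paths.get("/tmp"), "spark-snappy-")
  System.setProperty("org.xerial.snappy.tempdir", dir.toString)
}
```

Because the property is a process-wide static, this has to run before the first Snappy native-library load; that ordering constraint is why the comment suggests putting it in a static block of the codec class itself.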
[jira] [Commented] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099857#comment-14099857 ] Patrick Wendell commented on SPARK-2881: I filed a bug to have snappy deal with this better out of the box. https://github.com/xerial/snappy-java/issues/84 Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Patrick Wendell Priority: Blocker I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.
[jira] [Updated] (SPARK-2608) scheduler backend create executor launch command not correctly
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Fix Version/s: (was: 1.1.0) scheduler backend create executor launch command not correctly -- Key: SPARK-2608 URL: https://issues.apache.org/jira/browse/SPARK-2608 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: wangfei Priority: Minor The Mesos scheduler backend uses spark-class/spark-executor to launch the executor backend, which leads to two problems: 1. when spark.executor.extraJavaOptions is set, CoarseMesosSchedulerBackend throws an error; 2. spark.executor.extraJavaOptions and spark.executor.extraLibraryPath set in SparkConf do not take effect.
[jira] [Updated] (SPARK-2608) scheduler backend create executor launch command not correctly
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Component/s: (was: Spark Core) Mesos scheduler backend create executor launch command not correctly -- Key: SPARK-2608 URL: https://issues.apache.org/jira/browse/SPARK-2608 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: wangfei Priority: Minor The Mesos scheduler backend uses spark-class/spark-executor to launch the executor backend, which leads to two problems: 1. when spark.executor.extraJavaOptions is set, CoarseMesosSchedulerBackend throws an error; 2. spark.executor.extraJavaOptions and spark.executor.extraLibraryPath set in SparkConf do not take effect.
[jira] [Updated] (SPARK-2608) scheduler backend create executor launch command not correctly
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Target Version/s: 1.1.0 scheduler backend create executor launch command not correctly -- Key: SPARK-2608 URL: https://issues.apache.org/jira/browse/SPARK-2608 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: wangfei Priority: Minor The Mesos scheduler backend uses spark-class/spark-executor to launch the executor backend, which leads to two problems: 1. when spark.executor.extraJavaOptions is set, CoarseMesosSchedulerBackend throws an error; 2. spark.executor.extraJavaOptions and spark.executor.extraLibraryPath set in SparkConf do not take effect.
[jira] [Updated] (SPARK-2608) Mesos scheduler backend create executor launch command not correctly
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Summary: Mesos scheduler backend create executor launch command not correctly (was: scheduler backend create executor launch command not correctly) Mesos scheduler backend create executor launch command not correctly Key: SPARK-2608 URL: https://issues.apache.org/jira/browse/SPARK-2608 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: wangfei Priority: Minor The Mesos scheduler backend uses spark-class/spark-executor to launch the executor backend, which leads to two problems: 1. when spark.executor.extraJavaOptions is set, CoarseMesosSchedulerBackend throws an error; 2. spark.executor.extraJavaOptions and spark.executor.extraLibraryPath set in SparkConf do not take effect.
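The second problem amounts to the launch command being assembled without consulting the executor-side configuration. A rough sketch of the shape of the fix, with invented helper names and a plain Map standing in for SparkConf (the real CoarseMesosSchedulerBackend code differs):

```scala
// Build an executor launch command that honors the user's extra JVM
// options and native library path. The two config keys are Spark's real
// property names; everything else here is illustrative.
def executorCommand(conf: Map[String, String], mainClass: String): Seq[String] = {
  val javaOpts = conf.get("spark.executor.extraJavaOptions")
    .map(_.trim.split("\\s+").toSeq).getOrElse(Seq.empty)
  val libPath = conf.get("spark.executor.extraLibraryPath")
    .map(p => Seq(s"-Djava.library.path=$p")).getOrElse(Seq.empty)
  Seq("java") ++ javaOpts ++ libPath ++ Seq(mainClass)
}
```

Going through spark-class instead of assembling the command this way is what dropped the options: spark-class computes its own JVM flags and never sees the executor-specific settings in SparkConf.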
[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Target Version/s: 1.2.0 (was: 1.1.0) Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Patrick Wendell Priority: Blocker Fix For: 1.1.0 I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.
[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Fix Version/s: 1.1.0 Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Patrick Wendell Priority: Blocker Fix For: 1.1.0 I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.
[jira] [Commented] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100164#comment-14100164 ] Patrick Wendell commented on SPARK-2881: Okay, I've merged a change in branch-1.1 updating the version to snappy-java 1.0.5.3, so this is no longer blocking Spark 1.1. I've also submitted a patch to the master branch updating to 1.1.1.3. We can merge that when tests pass. Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Patrick Wendell Priority: Blocker Fix For: 1.1.0 I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.
[jira] [Updated] (SPARK-3092) Always include the thriftserver when -Phive is enabled.
[ https://issues.apache.org/jira/browse/SPARK-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3092: --- Target Version/s: 1.1.0 Always include the thriftserver when -Phive is enabled. --- Key: SPARK-3092 URL: https://issues.apache.org/jira/browse/SPARK-3092 Project: Spark Issue Type: Improvement Components: SQL Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Blocker Currently we have a separate profile called hive-thriftserver. I originally suggested this in case users did not want to bundle the thriftserver, but it's ultimately led to a lot of confusion. Since the thriftserver is only a few classes, I don't see a really good reason to isolate it from the rest of Hive. So let's go ahead and just include it in the same profile to simplify things.
[jira] [Created] (SPARK-3092) Always include the thriftserver when -Phive is enabled.
Patrick Wendell created SPARK-3092: -- Summary: Always include the thriftserver when -Phive is enabled. Key: SPARK-3092 URL: https://issues.apache.org/jira/browse/SPARK-3092 Project: Spark Issue Type: Improvement Components: SQL Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Blocker Currently we have a separate profile called hive-thriftserver. I originally suggested this in case users did not want to bundle the thriftserver, but it's ultimately led to a lot of confusion. Since the thriftserver is only a few classes, I don't see a really good reason to isolate it from the rest of Hive. So let's go ahead and just include it in the same profile to simplify things.
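In Maven terms, the proposed simplification is just folding the thriftserver module into the hive profile so `-Phive` alone pulls it in. A hedged pom.xml sketch; the module paths follow Spark's sql/ layout but are illustrative, not copied from the actual build file:

```xml
<!-- Before: the thriftserver required both -Phive and -Phive-thriftserver.
     After: activating -Phive builds both Hive modules. -->
<profile>
  <id>hive</id>
  <modules>
    <module>sql/hive</module>
    <module>sql/hive-thriftserver</module>
  </modules>
</profile>
```

The trade-off named in the issue: users lose the ability to build Hive support without the thriftserver classes, but since those classes are few, the simpler single-profile build wins.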
[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Fix Version/s: 1.2.0 Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Patrick Wendell Priority: Blocker Fix For: 1.1.0, 1.2.0 I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.
[jira] [Resolved] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2881. Resolution: Fixed Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Patrick Wendell Priority: Blocker Fix For: 1.1.0, 1.2.0 I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.
[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Target Version/s: (was: 1.2.0) Snappy is now default codec - could lead to conflicts since uses /tmp - Key: SPARK-2881 URL: https://issues.apache.org/jira/browse/SPARK-2881 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Patrick Wendell Priority: Blocker Fix For: 1.1.0, 1.2.0 I was using the Spark master branch and I ran into an issue with Snappy since it's now the default codec for shuffle. The issue was that someone else had run with snappy and it created /tmp/snappy-*.so but it had restrictive permissions so I was not able to use it or remove it. This caused my spark job to not start. I was running in yarn client mode at the time. Yarn cluster mode shouldn't have this issue since we change the java.io.tmpdir. I assume this would also affect standalone mode. I'm not sure if this is a true blocker but wanted to file it as one at first and let us decide.