[jira] [Updated] (SPARK-11638) Run Spark on Mesos with bridge networking
[ https://issues.apache.org/jira/browse/SPARK-11638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-11638: -- Summary: Run Spark on Mesos with bridge networking (was: Apache Spark in Docker with Bridge networking / run Spark on Mesos, in Docker with Bridge networking) > Run Spark on Mesos with bridge networking > - > > Key: SPARK-11638 > URL: https://issues.apache.org/jira/browse/SPARK-11638 > Project: Spark > Issue Type: Improvement > Components: Mesos, Spark Core >Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2, 1.6.0 >Reporter: Radoslaw Gruchalski > Attachments: 1.4.0.patch, 1.4.1.patch, 1.5.0.patch, 1.5.1.patch, > 1.5.2.patch, 1.6.0.patch, 2.3.11.patch, 2.3.4.patch > > > h4. Summary > Provides the {{spark.driver.advertisedPort}}, > {{spark.fileserver.advertisedPort}}, {{spark.broadcast.advertisedPort}} and > {{spark.replClassServer.advertisedPort}} settings to enable running Spark on > Mesos in Docker with bridge networking. Provides patches for Akka Remote to > enable Spark driver advertisement using an alternative host and port. > With these settings, it is possible to run the Spark Master in a Docker container > and have the executors running on Mesos talk back correctly to such a Master. > The problem is discussed on the Mesos mailing list here: > https://mail-archives.apache.org/mod_mbox/mesos-user/201510.mbox/%3CCACTd3c9vjAMXk=bfotj5ljzfrh5u7ix-ghppfqknvg9mkkc...@mail.gmail.com%3E > h4. Running Spark on Mesos - LIBPROCESS_ADVERTISE_IP opens the door > In order for the framework to receive offers in the bridged container, Mesos > in the container has to register for offers using the IP address of the > Agent. Offers are sent by the Mesos Master to the Docker container running on a > different host, an Agent. Normally, prior to Mesos 0.24.0, {{libprocess}} > would advertise itself using the IP address of the container, something like > {{172.x.x.x}}. 
Obviously, the Mesos Master can't reach that address; it's a > different host, a different machine. Mesos 0.24.0 introduced two new > properties for {{libprocess}} - {{LIBPROCESS_ADVERTISE_IP}} and > {{LIBPROCESS_ADVERTISE_PORT}}. These allow the container to use the Agent's > address to register for offers. This was provided mainly for running Mesos in > Docker on Mesos. > h4. Spark - how does the above relate and what is being addressed here? > Similar to Mesos, out of the box, Spark does not allow advertising its > services on ports different from the bind ports. Consider the following scenario: > Spark is running inside a Docker container on Mesos, in bridge networking > mode. Assume port {{}} for {{spark.driver.port}}, {{6677}} for > {{spark.fileserver.port}}, {{6688}} for {{spark.broadcast.port}} and > {{23456}} for {{spark.replClassServer.port}}. If such a task is posted to > Marathon, Mesos will give 4 ports in the {{31000-32000}} range mapping to the > container ports. Starting the executors from such a container results in > executors not being able to communicate back to the Spark Master. > This happens because of two things: > The Spark driver is effectively an {{akka-remote}} system with {{akka.tcp}} > transport. {{akka-remote}} prior to version {{2.4}} can't advertise a port > different from the one it is bound to. The settings discussed are here: > https://github.com/akka/akka/blob/f8c1671903923837f22d0726a955e0893add5e9f/akka-remote/src/main/resources/reference.conf#L345-L376. > These do not exist in Akka {{2.3.x}}. The Spark driver will always advertise > port {{}} as this is the one {{akka-remote}} is bound to. > Any URIs the executors contact the Spark Master on are prepared by the Spark > Master and handed over to the executors. These always contain the port number > used by the Master to find the service on. 
The services are: > - {{spark.broadcast.port}} > - {{spark.fileserver.port}} > - {{spark.replClassServer.port}} > All of the above ports default to {{0}} (random assignment) but can be specified > using Spark configuration ({{-Dspark...port}}). However, they are limited > in the same way as {{spark.driver.port}}; in the above example, an > executor should not contact the file server on port {{6677}} but rather on > the respective 31xxx port assigned by Mesos. > Spark currently does not allow any of that. > h4. Taking on the problem, step 1: Spark Driver > As mentioned above, the Spark Driver is based on {{akka-remote}}. In order to > take on the problem, the {{akka.remote.net.tcp.bind-hostname}} and > {{akka.remote.net.tcp.bind-port}} settings are a must. Spark does not compile > with Akka 2.4.x yet. > What we want is the backport of the mentioned {{akka-remote}} settings to the > {{2.3.x}} versions. These patches are attached to this ticket - the > {{2.3.4.patch}} and {{2.3.11.patch}} files provide patches for
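To make the intent concrete, here is a hedged sketch of the configuration this ticket proposes. Only the {{advertisedPort}} property names come from the ticket; the in-container port values and the 31xxx host ports below are hypothetical examples standing in for whatever Mesos/Marathon actually assigns:

```properties
# Ports Spark binds to inside the container (hypothetical values)
spark.driver.port                 6655
spark.fileserver.port             6677
spark.broadcast.port              6688
spark.replClassServer.port        23456

# Host ports assigned by Mesos/Marathon (hypothetical 31xxx values),
# which executors must use to reach back into the bridged container
spark.driver.advertisedPort           31000
spark.fileserver.advertisedPort       31001
spark.broadcast.advertisedPort        31002
spark.replClassServer.advertisedPort  31003
```

With such a split, Spark would bind on the container-local ports but advertise the mapped host ports, mirroring what {{LIBPROCESS_ADVERTISE_IP}}/{{LIBPROCESS_ADVERTISE_PORT}} do for Mesos itself.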
[jira] [Resolved] (SPARK-12790) Remove HistoryServer old multiple files format
[ https://issues.apache.org/jira/browse/SPARK-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12790. --- Resolution: Fixed Fix Version/s: 2.0.0 > Remove HistoryServer old multiple files format > -- > > Key: SPARK-12790 > URL: https://issues.apache.org/jira/browse/SPARK-12790 > Project: Spark > Issue Type: Sub-task > Components: Deploy >Reporter: Andrew Or >Assignee: Felix Cheung > Fix For: 2.0.0 > > > HistoryServer has 2 formats. The old one makes a directory and puts multiple > files in there (APPLICATION_COMPLETE, EVENT_LOG1 etc.). The new one has just > 1 file called local_2593759238651.log or something. > It's been a nightmare to maintain both code paths. We should just remove the > old legacy format (which has been out of use for many versions now) when we > still have the chance. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12637) Print stage info of finished stages properly
[ https://issues.apache.org/jira/browse/SPARK-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12637: -- Assignee: Navis > Print stage info of finished stages properly > > > Key: SPARK-12637 > URL: https://issues.apache.org/jira/browse/SPARK-12637 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Reporter: Navis >Assignee: Navis >Priority: Trivial > Fix For: 2.0.0 > > > Currently it prints the hashcode of the stage info, which is not very useful. > {noformat} > INFO scheduler.StatsReportListener: Finished stage: > org.apache.spark.scheduler.StageInfo@2eb47d79 > {noformat}
[jira] [Resolved] (SPARK-12637) Print stage info of finished stages properly
[ https://issues.apache.org/jira/browse/SPARK-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12637. --- Resolution: Fixed Fix Version/s: 2.0.0 > Print stage info of finished stages properly > > > Key: SPARK-12637 > URL: https://issues.apache.org/jira/browse/SPARK-12637 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Reporter: Navis >Priority: Trivial > Fix For: 2.0.0 > > > Currently it prints the hashcode of the stage info, which is not very useful. > {noformat} > INFO scheduler.StatsReportListener: Finished stage: > org.apache.spark.scheduler.StageInfo@2eb47d79 > {noformat}
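The `StageInfo@2eb47d79` output is just the default `Object.toString` (`ClassName@hashcode`). A minimal self-contained illustration of the kind of fix involved — note `StageInfoDemo` is a hypothetical stand-in, not Spark's actual `StageInfo` class:

```java
// Hypothetical stand-in for org.apache.spark.scheduler.StageInfo
public class StageInfoDemo {
    final int stageId;
    final String name;

    StageInfoDemo(int stageId, String name) {
        this.stageId = stageId;
        this.name = name;
    }

    // Without this override, logging the object prints e.g. StageInfoDemo@2eb47d79
    @Override
    public String toString() {
        return "Stage " + stageId + " (" + name + ")";
    }

    public static void main(String[] args) {
        System.out.println(new StageInfoDemo(3, "reduceByKey"));  // Stage 3 (reduceByKey)
    }
}
```

Overriding `toString` with the stage's identifying fields is what makes the listener's log line readable.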
[jira] [Resolved] (SPARK-13088) DAG viz does not work with latest version of chrome
[ https://issues.apache.org/jira/browse/SPARK-13088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-13088. --- Resolution: Fixed Fix Version/s: 2.0.0 1.6.1 1.5.3 1.4.2 > DAG viz does not work with latest version of chrome > --- > > Key: SPARK-13088 > URL: https://issues.apache.org/jira/browse/SPARK-13088 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 1.4.0, 1.5.0, 1.6.0, 2.0.0 >Reporter: Andrew Or >Assignee: Andrew Or >Priority: Blocker > Fix For: 1.4.2, 1.5.3, 1.6.1, 2.0.0 > > Attachments: Screen Shot 2016-01-29 at 10.54.14 AM.png > > > See screenshot. This is because dagre-d3.js is using a function that chrome > no longer supports: > {code} > Uncaught TypeError: elem.getTransformToElement is not a function > {code} > We need to upgrade it.
[jira] [Resolved] (SPARK-13096) Make AccumulatorSuite#verifyPeakExecutionMemorySet less flaky
[ https://issues.apache.org/jira/browse/SPARK-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-13096. --- Resolution: Fixed Fix Version/s: 2.0.0 > Make AccumulatorSuite#verifyPeakExecutionMemorySet less flaky > - > > Key: SPARK-13096 > URL: https://issues.apache.org/jira/browse/SPARK-13096 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 1.6.0 >Reporter: Andrew Or >Assignee: Andrew Or > Fix For: 2.0.0 > > > We have a method in AccumulatorSuite called verifyPeakExecutionMemorySet. > This is used in a variety of other test suites, including (but not limited > to): > - ExternalAppendOnlyMapSuite > - ExternalSorterSuite > - sql.SQLQuerySuite > Lately it's been flaky ever since https://github.com/apache/spark/pull/10835 > was merged. Note: this was an existing problem even before that patch, but it > was uncovered there because previously we never failed the test even if an > assertion failed!
[jira] [Updated] (SPARK-13096) Make AccumulatorSuite#verifyPeakExecutionMemorySet less flaky
[ https://issues.apache.org/jira/browse/SPARK-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13096: -- Affects Version/s: 1.6.0 Target Version/s: 2.0.0 > Make AccumulatorSuite#verifyPeakExecutionMemorySet less flaky > - > > Key: SPARK-13096 > URL: https://issues.apache.org/jira/browse/SPARK-13096 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 1.6.0 >Reporter: Andrew Or >Assignee: Andrew Or > > We have a method in AccumulatorSuite called verifyPeakExecutionMemorySet. > This is used in a variety of other test suites, including (but not limited > to): > - ExternalAppendOnlyMapSuite > - ExternalSorterSuite > - sql.SQLQuerySuite > Lately it's been flaky ever since https://github.com/apache/spark/pull/10835 > was merged. Note: this was an existing problem even before that patch, but it > was uncovered there because previously we never failed the test even if an > assertion failed!
[jira] [Created] (SPARK-13096) Make AccumulatorSuite#verifyPeakExecutionMemorySet less flaky
Andrew Or created SPARK-13096: - Summary: Make AccumulatorSuite#verifyPeakExecutionMemorySet less flaky Key: SPARK-13096 URL: https://issues.apache.org/jira/browse/SPARK-13096 Project: Spark Issue Type: Bug Components: Tests Reporter: Andrew Or Assignee: Andrew Or We have a method in AccumulatorSuite called verifyPeakExecutionMemorySet. This is used in a variety of other test suites, including (but not limited to): - ExternalAppendOnlyMapSuite - ExternalSorterSuite - sql.SQLQuerySuite Lately it's been flaky ever since https://github.com/apache/spark/pull/10835 was merged. Note: this was an existing problem even before that patch, but it was uncovered there because previously we never failed the test even if an assertion failed!
[jira] [Resolved] (SPARK-10620) Look into whether accumulator mechanism can replace TaskMetrics
[ https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-10620. --- Resolution: Fixed Fix Version/s: 2.0.0 > Look into whether accumulator mechanism can replace TaskMetrics > --- > > Key: SPARK-10620 > URL: https://issues.apache.org/jira/browse/SPARK-10620 > Project: Spark > Issue Type: Task > Components: Spark Core >Reporter: Patrick Wendell >Assignee: Andrew Or > Fix For: 2.0.0 > > Attachments: accums-and-task-metrics.pdf > > > This task is simply to explore whether the internal representation used by > TaskMetrics could be performed by using accumulators rather than having two > separate mechanisms. Note that we need to continue to preserve the existing > "Task Metric" data structures that are exposed to users through event logs > etc. The question is whether we can use a single internal codepath and perhaps make > this easier to extend in the future. > I think a full exploration would answer the following questions: > - How do the semantics of accumulators on stage retries differ from aggregate > TaskMetrics for a stage? Could we implement clearer retry semantics for > internal accumulators to allow them to be the same - for instance, zeroing > accumulator values if a stage is retried (see discussion here: SPARK-10042)? > - Are there metrics that do not fit well into the accumulator model, or would > be difficult to update as an accumulator? > - If we expose metrics through accumulators in the future rather than > continuing to add fields to TaskMetrics, what is the best way to preserve > compatibility? > - Are there any other considerations? > - Is it worth it to do this, or is the consolidation too complicated to > justify?
[jira] [Created] (SPARK-13088) DAG viz does not work with latest version of chrome
Andrew Or created SPARK-13088: - Summary: DAG viz does not work with latest version of chrome Key: SPARK-13088 URL: https://issues.apache.org/jira/browse/SPARK-13088 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.6.0, 1.5.0, 1.4.0, 2.0.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker See screenshot. This is because dagre-d3.js is using a function that chrome no longer supports: {code} Uncaught TypeError: elem.getTransformToElement is not a function {code} We need to upgrade it.
[jira] [Updated] (SPARK-13088) DAG viz does not work with latest version of chrome
[ https://issues.apache.org/jira/browse/SPARK-13088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13088: -- Attachment: Screen Shot 2016-01-29 at 10.54.14 AM.png > DAG viz does not work with latest version of chrome > --- > > Key: SPARK-13088 > URL: https://issues.apache.org/jira/browse/SPARK-13088 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 1.4.0, 1.5.0, 1.6.0, 2.0.0 >Reporter: Andrew Or >Assignee: Andrew Or >Priority: Blocker > Attachments: Screen Shot 2016-01-29 at 10.54.14 AM.png > > > See screenshot. This is because dagre-d3.js is using a function that chrome > no longer supports: > {code} > Uncaught TypeError: elem.getTransformToElement is not a function > {code} > We need to upgrade it.
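For context, `getTransformToElement` was removed from `SVGGraphicsElement` in Chrome 48. The ticket's fix is upgrading dagre-d3, but a commonly cited browser-side stopgap is a polyfill along these lines (a sketch only; it touches the browser's `SVGElement`, so it is not runnable outside a browser):

```javascript
// Polyfill for SVGElement.getTransformToElement (removed in Chrome 48).
// Recovers the old behavior from the two elements' screen CTMs.
SVGElement.prototype.getTransformToElement =
  SVGElement.prototype.getTransformToElement ||
  function (elem) {
    return elem.getScreenCTM().inverse().multiply(this.getScreenCTM());
  };
```

Loading a snippet like this before dagre-d3 restores the missing function without touching the vendored library itself.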
[jira] [Created] (SPARK-13065) streaming-twitter pass twitter4j.FilterQuery argument to TwitterUtils.createStream()
Andrew Davidson created SPARK-13065: --- Summary: streaming-twitter pass twitter4j.FilterQuery argument to TwitterUtils.createStream() Key: SPARK-13065 URL: https://issues.apache.org/jira/browse/SPARK-13065 Project: Spark Issue Type: Improvement Components: Streaming Affects Versions: 1.6.0 Environment: all Reporter: Andrew Davidson Priority: Minor The twitter streaming API is very powerful and provides a lot of support for twitter.com-side filtering of status objects. Whenever possible we want to let twitter do as much work as possible for us. Currently the Spark twitter API only allows you to configure a small subset of the possible filters: String[] filters = {"tag1", "tag2"}; JavaDStream tweets = TwitterUtils.createStream(ssc, twitterAuth, filters); The current implementation does private[streaming] class TwitterReceiver( twitterAuth: Authorization, filters: Seq[String], storageLevel: StorageLevel ) extends Receiver[Status](storageLevel) with Logging { . . . val query = new FilterQuery if (filters.size > 0) { query.track(filters.mkString(",")) newTwitterStream.filter(query) } else { newTwitterStream.sample() } ... Rather than constructing the FilterQuery object in TwitterReceiver.onStart(), we should be able to pass a FilterQuery object. Looks like an easy fix. See the source code links below. kind regards Andy https://github.com/apache/spark/blob/master/external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala#L60 https://github.com/apache/spark/blob/master/external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala#L89
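A sketch of what the requested API could look like from the Java side. The `createStream` overload taking a `FilterQuery` is the proposal, not an existing Spark method, and the filter values are made-up examples; `FilterQuery.track`, `language` and `locations` are real twitter4j methods (the latter two in recent twitter4j versions) that the current `String[]` filters argument cannot express:

```java
// Hypothetical API: pass a twitter4j.FilterQuery to createStream directly,
// so twitter.com does the filtering server-side.
FilterQuery query = new FilterQuery()
    .track("tag1", "tag2")               // what the current String[] API already supports
    .language("en")                      // not expressible via the current API
    .locations(new double[][] { {-122.75, 36.8}, {-121.75, 37.8} });

// Proposed overload (does not exist yet; this ticket requests it):
JavaDStream<Status> tweets = TwitterUtils.createStream(jssc, twitterAuth, query);
```

If accepted, TwitterReceiver.onStart() would simply call newTwitterStream.filter(query) with the caller-supplied query instead of building its own from the track strings.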
[jira] [Updated] (SPARK-13055) SQLHistoryListener throws ClassCastException
[ https://issues.apache.org/jira/browse/SPARK-13055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13055: -- Component/s: SQL > SQLHistoryListener throws ClassCastException > > > Key: SPARK-13055 > URL: https://issues.apache.org/jira/browse/SPARK-13055 > Project: Spark > Issue Type: Bug > Components: Spark Core, SQL >Affects Versions: 1.5.0 >Reporter: Andrew Or >Assignee: Andrew Or > > {code} > 16/01/27 18:46:28 ERROR ReplayListenerBus: Listener SQLHistoryListener threw > an exception > java.lang.ClassCastException: java.lang.Integer cannot be cast to > java.lang.Long > at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:110) > at > org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1$$anonfun$5.apply(SQLListener.scala:334) > at > org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1$$anonfun$5.apply(SQLListener.scala:334) > at scala.Option.map(Option.scala:145) > at > org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1.apply(SQLListener.scala:334) > at > org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1.apply(SQLListener.scala:332) > {code} > SQLHistoryListener listens on SparkListenerTaskEnd events, which contain > non-SQL accumulators as well. We try to cast all accumulators we encounter to > Long, resulting in an error like this one. > Note: this was a problem even before internal accumulators were introduced. > If the task used a user accumulator of a type other than Long, we would > still see this.
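The failure mode is plain Java unboxing: a boxed `Integer` can never be cast to `Long`. A minimal stand-alone illustration (class and method names here are hypothetical, not Spark code), including the defensive `Number` conversion that avoids the exception:

```java
public class UnboxDemo {
    // Mirrors the failing pattern: a blind cast of a boxed value to Long
    static long unsafeToLong(Object value) {
        return (Long) value;  // throws ClassCastException if value is an Integer
    }

    // Defensive alternative: accept any boxed numeric type
    static long safeToLong(Object value) {
        return ((Number) value).longValue();
    }

    public static void main(String[] args) {
        System.out.println(safeToLong(Integer.valueOf(42)));  // prints 42
        try {
            unsafeToLong(Integer.valueOf(42));
        } catch (ClassCastException e) {
            // Same class of error as the SQLHistoryListener stack trace above
            System.out.println("ClassCastException, as in SQLHistoryListener");
        }
    }
}
```

Since the listener receives accumulators of arbitrary value types, any cast-to-Long path needs either a type check or a `Number`-style conversion.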
[jira] [Updated] (SPARK-13071) Coalescing HadoopRDD overwrites existing input metrics
[ https://issues.apache.org/jira/browse/SPARK-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13071: -- Description: Currently if we do the following in Hadoop 2.5+: {code} sc.textFile(..., 4).coalesce(2).collect() {code} We'll get incorrect `InputMetrics#bytesRead`. This is because HadoopRDD's compute method overwrites any existing value of `bytesRead`, and in the case of coalesce we'll end up calling `compute` multiple times in the same task. This is why InputOutputMetricsSuite is failing in master, but only for hadoop2.6/2.7: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/67/ This was caused by https://github.com/apache/spark/pull/10835. was: Currently if we do the following in Hadoop 2.5+: {code} sc.textFile(..., 4).coalesce(2).collect() {code} We'll get incorrect `InputMetrics#bytesRead`. This is because HadoopRDD's compute method overwrites any existing value of `bytesRead`, and in the case of coalesce we'll end up calling `compute` multiple times in the same task. This is why InputOutputMetricsSuite is failing in master, but only for hadoop2.6/2.7: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/67/ > Coalescing HadoopRDD overwrites existing input metrics > -- > > Key: SPARK-13071 > URL: https://issues.apache.org/jira/browse/SPARK-13071 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.0.0 >Reporter: Andrew Or >Assignee: Andrew Or > > Currently if we do the following in Hadoop 2.5+: > {code} > sc.textFile(..., 4).coalesce(2).collect() > {code} > We'll get incorrect `InputMetrics#bytesRead`. This is because HadoopRDD's > compute method overwrites any existing value of `bytesRead`, and in the case > of coalesce we'll end up calling `compute` multiple times in the same task. 
> This is why InputOutputMetricsSuite is failing in master, but only for > hadoop2.6/2.7: > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/67/ > This was caused by https://github.com/apache/spark/pull/10835.
[jira] [Created] (SPARK-13071) Coalescing HadoopRDD overwrites existing input metrics
Andrew Or created SPARK-13071: - Summary: Coalescing HadoopRDD overwrites existing input metrics Key: SPARK-13071 URL: https://issues.apache.org/jira/browse/SPARK-13071 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 2.0.0 Reporter: Andrew Or Assignee: Andrew Or Currently if we do the following in Hadoop 2.5+: {code} sc.textFile(..., 4).coalesce(2).collect() {code} We'll get incorrect `InputMetrics#bytesRead`. This is because HadoopRDD's compute method overwrites any existing value of `bytesRead`, and in the case of coalesce we'll end up calling `compute` multiple times in the same task. This is why InputOutputMetricsSuite is failing in master, but only for hadoop2.6/2.7: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/67/
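The bug pattern can be shown without Spark: if each `compute` call *sets* the metric, a task that computes several coalesced partitions keeps only the last partition's byte count, whereas *incrementing* accumulates correctly. A self-contained sketch (the `InputMetrics` class below is a hypothetical stand-in, not Spark's):

```java
public class InputMetricsDemo {
    // Hypothetical stand-in for Spark's InputMetrics
    static class InputMetrics {
        private long bytesRead;
        void setBytesRead(long v) { bytesRead = v; }   // overwrite: the buggy pattern
        void incBytesRead(long v) { bytesRead += v; }  // accumulate: the correct pattern
        long bytesRead() { return bytesRead; }
    }

    // Simulate one task computing two coalesced partitions of 100 bytes each
    static long simulate(boolean overwrite) {
        InputMetrics m = new InputMetrics();
        for (int partition = 0; partition < 2; partition++) {
            if (overwrite) {
                m.setBytesRead(100);  // second compute clobbers the first
            } else {
                m.incBytesRead(100);
            }
        }
        return m.bytesRead();
    }

    public static void main(String[] args) {
        System.out.println(simulate(true));   // prints 100: bytes from the first partition lost
        System.out.println(simulate(false));  // prints 200: correct total
    }
}
```

With `coalesce(2)` over 4 Hadoop partitions, each task computes two input splits, so the overwrite semantics halve the reported `bytesRead`.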
[jira] [Created] (SPARK-13054) Always post TaskEnd event for tasks in cancelled stages
Andrew Or created SPARK-13054: - Summary: Always post TaskEnd event for tasks in cancelled stages Key: SPARK-13054 URL: https://issues.apache.org/jira/browse/SPARK-13054 Project: Spark Issue Type: Bug Components: Scheduler Affects Versions: 1.0.0 Reporter: Andrew Or Assignee: Andrew Or {code} // The success case is dealt with separately below. // TODO: Why post it only for failed tasks in cancelled stages? Clarify semantics here. if (event.reason != Success) { val attemptId = task.stageAttemptId listenerBus.post(SparkListenerTaskEnd( stageId, attemptId, taskType, event.reason, event.taskInfo, taskMetrics)) } {code} Today we only post task end events for canceled stages if the task failed. There is no reason why we shouldn't just post it for all the tasks, including the ones that succeeded. If we do that we will be able to simplify another branch in the DAGScheduler, which needs a lot of simplification.
[jira] [Created] (SPARK-13053) Rectify ignored tests in InternalAccumulatorSuite
Andrew Or created SPARK-13053: - Summary: Rectify ignored tests in InternalAccumulatorSuite Key: SPARK-13053 URL: https://issues.apache.org/jira/browse/SPARK-13053 Project: Spark Issue Type: Bug Components: Spark Core, Tests Reporter: Andrew Or Assignee: Andrew Or
[jira] [Updated] (SPARK-13053) Rectify ignored tests in InternalAccumulatorSuite
[ https://issues.apache.org/jira/browse/SPARK-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13053: -- Target Version/s: 2.0.0 > Rectify ignored tests in InternalAccumulatorSuite > - > > Key: SPARK-13053 > URL: https://issues.apache.org/jira/browse/SPARK-13053 > Project: Spark > Issue Type: Bug > Components: Spark Core, Tests >Affects Versions: 2.0.0 >Reporter: Andrew Or >Assignee: Andrew Or > > {code} > // TODO: these two tests are incorrect; they don't actually trigger stage > retries. > ignore("internal accumulators in fully resubmitted stages") { > testInternalAccumulatorsWithFailedTasks((i: Int) => true) // fail all tasks > } > ignore("internal accumulators in partially resubmitted stages") { > testInternalAccumulatorsWithFailedTasks((i: Int) => i % 2 == 0) // fail a > subset > } > {code}
[jira] [Updated] (SPARK-13053) Rectify ignored tests in InternalAccumulatorSuite
[ https://issues.apache.org/jira/browse/SPARK-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13053: -- Description: {code} // TODO: these two tests are incorrect; they don't actually trigger stage retries. ignore("internal accumulators in fully resubmitted stages") { testInternalAccumulatorsWithFailedTasks((i: Int) => true) // fail all tasks } ignore("internal accumulators in partially resubmitted stages") { testInternalAccumulatorsWithFailedTasks((i: Int) => i % 2 == 0) // fail a subset } {code} > Rectify ignored tests in InternalAccumulatorSuite > - > > Key: SPARK-13053 > URL: https://issues.apache.org/jira/browse/SPARK-13053 > Project: Spark > Issue Type: Bug > Components: Spark Core, Tests >Reporter: Andrew Or >Assignee: Andrew Or > > {code} > // TODO: these two tests are incorrect; they don't actually trigger stage > retries. > ignore("internal accumulators in fully resubmitted stages") { > testInternalAccumulatorsWithFailedTasks((i: Int) => true) // fail all tasks > } > ignore("internal accumulators in partially resubmitted stages") { > testInternalAccumulatorsWithFailedTasks((i: Int) => i % 2 == 0) // fail a > subset > } > {code}
[jira] [Created] (SPARK-13051) Do not maintain global singleton map for accumulators
Andrew Or created SPARK-13051: - Summary: Do not maintain global singleton map for accumulators Key: SPARK-13051 URL: https://issues.apache.org/jira/browse/SPARK-13051 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Andrew Or Right now we register accumulators through `Accumulators.register`, and then read these accumulators in the DAGScheduler through `Accumulators.get`. This design is very awkward and makes it hard to associate a list of accumulators with a particular SparkContext. A global singleton is not a good pattern in general.
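To illustrate the complaint (a sketch only; the class names are hypothetical and Spark's eventual fix may look different): a per-context registry makes ownership explicit, whereas the global singleton ties every context in the JVM to one static map:

```java
import java.util.HashMap;
import java.util.Map;

public class RegistryDemo {
    // The criticized pattern: one static map shared by every context in the JVM
    static final Map<Long, String> GLOBAL_ACCUMULATORS = new HashMap<>();

    // The suggested direction: each context owns its own registry,
    // so accumulators can be associated with (and torn down with) one context
    static class ContextScopedRegistry {
        private final Map<Long, String> accumulators = new HashMap<>();
        void register(long id, String accum) { accumulators.put(id, accum); }
        String get(long id) { return accumulators.get(id); }
        int size() { return accumulators.size(); }
    }

    public static void main(String[] args) {
        ContextScopedRegistry ctxA = new ContextScopedRegistry();
        ContextScopedRegistry ctxB = new ContextScopedRegistry();
        ctxA.register(1L, "shuffleBytes");
        // ctxB is unaffected: no cross-context leakage through shared static state
        System.out.println(ctxA.size() + " " + ctxB.size());  // prints "1 0"
    }
}
```

With the static map, two contexts in one JVM see each other's accumulators and nothing cleans the map up when a context stops; a context-scoped registry avoids both issues.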
[jira] [Updated] (SPARK-13053) Rectify ignored tests in InternalAccumulatorSuite
[ https://issues.apache.org/jira/browse/SPARK-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13053: -- Affects Version/s: 2.0.0 > Rectify ignored tests in InternalAccumulatorSuite > - > > Key: SPARK-13053 > URL: https://issues.apache.org/jira/browse/SPARK-13053 > Project: Spark > Issue Type: Bug > Components: Spark Core, Tests >Affects Versions: 2.0.0 >Reporter: Andrew Or >Assignee: Andrew Or > > {code} > // TODO: these two tests are incorrect; they don't actually trigger stage > retries. > ignore("internal accumulators in fully resubmitted stages") { > testInternalAccumulatorsWithFailedTasks((i: Int) => true) // fail all tasks > } > ignore("internal accumulators in partially resubmitted stages") { > testInternalAccumulatorsWithFailedTasks((i: Int) => i % 2 == 0) // fail a > subset > } > {code}
[jira] [Created] (SPARK-13055) SQLHistoryListener throws ClassCastException
Andrew Or created SPARK-13055: - Summary: SQLHistoryListener throws ClassCastException Key: SPARK-13055 URL: https://issues.apache.org/jira/browse/SPARK-13055 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.5.0 Reporter: Andrew Or Assignee: Andrew Or {code} 16/01/27 18:46:28 ERROR ReplayListenerBus: Listener SQLHistoryListener threw an exception java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:110) at org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1$$anonfun$5.apply(SQLListener.scala:334) at org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1$$anonfun$5.apply(SQLListener.scala:334) at scala.Option.map(Option.scala:145) at org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1.apply(SQLListener.scala:334) at org.apache.spark.sql.execution.ui.SQLHistoryListener$$anonfun$onTaskEnd$1.apply(SQLListener.scala:332) {code} SQLHistoryListener listens on SparkListenerTaskEnd events, which contain non-SQL accumulators as well. We try to cast all accumulators we encounter to Long, resulting in an error like this one. Note: this was a problem even before internal accumulators were introduced. If the task used a user accumulator of a type other than Long, we would still see this.
[jira] [Created] (SPARK-13009) spark-streaming-twitter_2.10 does not make it possible to access the raw twitter json
Andrew Davidson created SPARK-13009: --- Summary: spark-streaming-twitter_2.10 does not make it possible to access the raw twitter json Key: SPARK-13009 URL: https://issues.apache.org/jira/browse/SPARK-13009 Project: Spark Issue Type: Improvement Components: Streaming Affects Versions: 1.6.0 Reporter: Andrew Davidson Priority: Blocker The streaming-twitter package makes it easy for Java programmers to work with twitter. The implementation returns the raw twitter data, in JSON format, as a twitter4j StatusJSONImpl object: JavaDStream tweets = TwitterUtils.createStream(ssc, twitterAuth); The Status class is different from the raw JSON, i.e. serializing the Status object will not produce the same JSON as the original. I have downstream systems that can only process raw tweets, not twitter4j Status objects. Here is my bug/RFE request made to Twitter4J. They asked that I create a Spark tracking issue. On Thursday, January 21, 2016 at 6:27:25 PM UTC, Andy Davidson wrote: Hi All Quick problem summary: My system uses the Status objects to do some analysis, however I need to store the raw JSON. There are other systems that process that data that are not written in Java. Currently we are serializing the Status object. The JSON is going to break downstream systems. I am using the Apache Spark Streaming spark-streaming-twitter_2.10 http://spark.apache.org/docs/latest/streaming-programming-guide.html#advanced-sources Request For Enhancement: I imagine easy access to the raw JSON is a common requirement. Would it be possible to add a member function getRawJson() to StatusJSONImpl? By default the returned value would be null unless jsonStoreEnabled=True is set in the config. Alternative implementations: It should be possible to modify spark-streaming-twitter_2.10 to provide this support. The solution is not very clean: it would require Apache Spark to define its own Status POJO. The current StatusJSONImpl class is marked final. The wrapper is not going to work nicely with existing code. 
spark-streaming-twitter_2.10 does not expose all of the twitter streaming API, so many developers are writing their own implementations of org.apache.spark.streaming.twitter.TwitterInputDStream. This makes maintenance difficult: it's not easy to know when the Spark implementation for twitter has changed. Code listing for spark-1.6.0/external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala:
{code}
private[streaming] class TwitterReceiver(
    twitterAuth: Authorization,
    filters: Seq[String],
    storageLevel: StorageLevel
  ) extends Receiver[Status](storageLevel) with Logging {

  @volatile private var twitterStream: TwitterStream = _
  @volatile private var stopped = false

  def onStart() {
    try {
      val newTwitterStream = new TwitterStreamFactory().getInstance(twitterAuth)
      newTwitterStream.addListener(new StatusListener {
        def onStatus(status: Status): Unit = {
          store(status)
        }
{code}
Ref: https://forum.processing.org/one/topic/saving-json-data-from-twitter4j.html What do people think? Kind regards Andy From: on behalf of Igor Brigadir Reply-To: Date: Tuesday, January 19, 2016 at 5:55 AM To: Twitter4J Subject: Re: [Twitter4J] trouble writing unit test Main issue is that the Json object is in the wrong json format. eg: "createdAt": 1449775664000 should be "created_at": "Thu Dec 10 19:27:44 +0000 2015", ... It looks like the json you have was serialized from a java Status object, which makes json objects different to what you get from the API. TwitterObjectFactory expects json from Twitter (I haven't had any problems using TwitterObjectFactory instead of the deprecated DataObjectFactory). You could "fix" it by matching the keys & values you have with the correct twitter API json - it should look like the example here: https://dev.twitter.com/rest/reference/get/statuses/show/%3Aid But it might be easier to download the tweets again, but this time use TwitterObjectFactory.getRawJSON(status) to get the original Json from the Twitter API, and save that for later. 
(You must have jsonStoreEnabled=True in your config, and call getRawJSON in the same thread as .showStatus() or lookup() or whatever you're using to load tweets.) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
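Igor's workaround — calling TwitterObjectFactory.getRawJSON(status) at receive time, on the same thread, with jsonStoreEnabled=True — suggests the pattern a modified receiver could use: store the raw JSON alongside the Status. The sketch below is illustrative only; Status and rawJsonOf here are hypothetical stand-ins for twitter4j's Status and TwitterObjectFactory.getRawJSON, whose real types and per-thread caching behavior differ:

```scala
// Sketch only: "Status" and "rawJsonOf" stand in for twitter4j's
// Status and TwitterObjectFactory.getRawJSON(status). In a real
// TwitterReceiver you would capture the raw JSON inside onStatus,
// on the same thread, with jsonStoreEnabled=true in the config.
case class Status(id: Long, text: String)

// Hypothetical lookup: twitter4j caches the raw JSON per thread,
// keyed by the Status instance, when jsonStoreEnabled is set.
def rawJsonOf(s: Status): String =
  s"""{"id":${s.id},"text":"${s.text}"}"""

// Store the (Status, raw JSON) pair instead of the Status alone,
// so downstream non-Java systems can consume the original JSON.
def onStatus(s: Status): (Status, String) = (s, rawJsonOf(s))

val (status, rawJson) = onStatus(Status(42L, "hello"))
println(rawJson)
```

A modified TwitterReceiver would do this inside its StatusListener.onStatus callback, storing the pair (or just the raw string) instead of the Status alone.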
[jira] [Commented] (SPARK-12911) Cacheing a dataframe causes array comparisons to fail (in filter / where) after 1.6
[ https://issues.apache.org/jira/browse/SPARK-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109269#comment-15109269 ] Andrew Ray commented on SPARK-12911: In the current master this happens even without caching. The generated code calls UnsafeArrayData.equals for each row with an argument of type GenericArrayData (the filter value) which returns false. Presumably the generated code needs to convert the filter value to an UnsafeArrayData but I'm at a loss for where to do so. [~sdicocco] This is a bug, not something a developer should have to worry about. > Cacheing a dataframe causes array comparisons to fail (in filter / where) > after 1.6 > --- > > Key: SPARK-12911 > URL: https://issues.apache.org/jira/browse/SPARK-12911 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.0 > Environment: OSX 10.11.1, Scala 2.11.7, Spark 1.6.0 >Reporter: Jesse English > > When doing a *where* operation on a dataframe and testing for equality on an > array type, after 1.6 no valid comparisons are made if the dataframe has been > cached. If it has not been cached, the results are as expected. > This appears to be related to the underlying unsafe array data types. 
> {code:title=test.scala|borderStyle=solid} > test("test array comparison") { > val vectors: Vector[Row] = Vector( > Row.fromTuple("id_1" -> Array(0L, 2L)), > Row.fromTuple("id_2" -> Array(0L, 5L)), > Row.fromTuple("id_3" -> Array(0L, 9L)), > Row.fromTuple("id_4" -> Array(1L, 0L)), > Row.fromTuple("id_5" -> Array(1L, 8L)), > Row.fromTuple("id_6" -> Array(2L, 4L)), > Row.fromTuple("id_7" -> Array(5L, 6L)), > Row.fromTuple("id_8" -> Array(6L, 2L)), > Row.fromTuple("id_9" -> Array(7L, 0L)) > ) > val data: RDD[Row] = sc.parallelize(vectors, 3) > val schema = StructType( > StructField("id", StringType, false) :: > StructField("point", DataTypes.createArrayType(LongType, false), > false) :: > Nil > ) > val sqlContext = new SQLContext(sc) > val dataframe = sqlContext.createDataFrame(data, schema) > val targetPoint:Array[Long] = Array(0L,9L) > //Cacheing is the trigger to cause the error (no cacheing causes no error) > dataframe.cache() > //This is the line where it fails > //java.util.NoSuchElementException: next on empty iterator > //However we know that there is a valid match > val targetRow = dataframe.where(dataframe("point") === > array(targetPoint.map(value => lit(value)): _*)).first() > assert(targetRow != null) > } > {code}
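The failure described above is easier to picture with plain JVM array equality, which has the same shape as the UnsafeArrayData-vs-GenericArrayData mismatch: equals between two different containers of identical elements returns false unless something performs an element-wise comparison. A minimal, Spark-free illustration:

```scala
val a = Array(0L, 9L)
val b = Array(0L, 9L)

// JVM arrays inherit Object.equals, which is reference equality,
// so two structurally identical arrays are "not equal":
assert(!(a equals b))
assert(a != b)

// An element-wise comparison is required instead:
assert(a.sameElements(b))
assert(a.toSeq == b.toSeq)
```

In the bug above, the generated code ends up in the first situation: UnsafeArrayData.equals is handed a GenericArrayData, the representations differ, and the comparison returns false for every row even though the elements match.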
[jira] [Updated] (SPARK-12790) Remove HistoryServer old multiple files format
[ https://issues.apache.org/jira/browse/SPARK-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12790: -- Assignee: Felix Cheung > Remove HistoryServer old multiple files format > -- > > Key: SPARK-12790 > URL: https://issues.apache.org/jira/browse/SPARK-12790 > Project: Spark > Issue Type: Sub-task > Components: Deploy >Reporter: Andrew Or >Assignee: Felix Cheung > > HistoryServer has 2 formats. The old one makes a directory and puts multiple > files in there (APPLICATION_COMPLETE, EVENT_LOG1 etc.). The new one has just > 1 file called local_2593759238651.log or something. > It's been a nightmare to maintain both code paths. We should just remove the > old legacy format (which has been out of use for many versions now) when we > still have the chance.
[jira] [Commented] (SPARK-12790) Remove HistoryServer old multiple files format
[ https://issues.apache.org/jira/browse/SPARK-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107167#comment-15107167 ] Andrew Or commented on SPARK-12790: --- also updating all the tests that rely on the old format. If you look under core/src/test/resources there are a bunch of those > Remove HistoryServer old multiple files format > -- > > Key: SPARK-12790 > URL: https://issues.apache.org/jira/browse/SPARK-12790 > Project: Spark > Issue Type: Sub-task > Components: Deploy >Reporter: Andrew Or > > HistoryServer has 2 formats. The old one makes a directory and puts multiple > files in there (APPLICATION_COMPLETE, EVENT_LOG1 etc.). The new one has just > 1 file called local_2593759238651.log or something. > It's been a nightmare to maintain both code paths. We should just remove the > old legacy format (which has been out of use for many versions now) when we > still have the chance.
[jira] [Commented] (SPARK-12790) Remove HistoryServer old multiple files format
[ https://issues.apache.org/jira/browse/SPARK-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107169#comment-15107169 ] Andrew Or commented on SPARK-12790: --- I've assigned this to you. > Remove HistoryServer old multiple files format > -- > > Key: SPARK-12790 > URL: https://issues.apache.org/jira/browse/SPARK-12790 > Project: Spark > Issue Type: Sub-task > Components: Deploy >Reporter: Andrew Or >Assignee: Felix Cheung > > HistoryServer has 2 formats. The old one makes a directory and puts multiple > files in there (APPLICATION_COMPLETE, EVENT_LOG1 etc.). The new one has just > 1 file called local_2593759238651.log or something. > It's been a nightmare to maintain both code paths. We should just remove the > old legacy format (which has been out of use for many versions now) when we > still have the chance.
[jira] [Resolved] (SPARK-12887) Do not expose var's in TaskMetrics
[ https://issues.apache.org/jira/browse/SPARK-12887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12887. --- Resolution: Fixed Fix Version/s: 2.0.0 > Do not expose var's in TaskMetrics > -- > > Key: SPARK-12887 > URL: https://issues.apache.org/jira/browse/SPARK-12887 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Andrew Or > Fix For: 2.0.0 > > > TaskMetrics has a bunch of var's, some are fully public, some are > private[spark]. This is bad coding style that makes it easy to accidentally > overwrite previously set metrics. This has happened a few times in the past > and caused bugs that were difficult to debug. > Instead, we should have get-or-create semantics, which are more readily > understandable. This makes sense in the case of TaskMetrics because these are > just aggregated metrics that we want to collect throughout the task, so it > doesn't matter *who*'s incrementing them.
[jira] [Commented] (SPARK-12485) Rename "dynamic allocation" to "elastic scaling"
[ https://issues.apache.org/jira/browse/SPARK-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107151#comment-15107151 ] Andrew Or commented on SPARK-12485: --- [~srowen] to answer your question: no, I don't feel super strongly about changing it. Naming is difficult in general, and I think both "dynamic allocation" and "elastic scaling" do mean roughly the same thing. It's just that I slightly prefer the latter (or something shorter) after giving a few talks on this topic and chatting with a few people about it in real life. I'm also totally cool with closing this as a Won't Fix if you or [~markhamstra] prefer. > Rename "dynamic allocation" to "elastic scaling" > > > Key: SPARK-12485 > URL: https://issues.apache.org/jira/browse/SPARK-12485 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Andrew Or > > Fewer syllables, sounds more natural.
[jira] [Updated] (SPARK-10620) Look into whether accumulator mechanism can replace TaskMetrics
[ https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-10620: -- Attachment: AccumulatorsandTaskMetricsinSpark2.0.pdf > Look into whether accumulator mechanism can replace TaskMetrics > --- > > Key: SPARK-10620 > URL: https://issues.apache.org/jira/browse/SPARK-10620 > Project: Spark > Issue Type: Task > Components: Spark Core >Reporter: Patrick Wendell >Assignee: Andrew Or > Attachments: AccumulatorsandTaskMetricsinSpark2.0.pdf > > > This task is simply to explore whether the internal representation used by > TaskMetrics could be performed by using accumulators rather than having two > separate mechanisms. Note that we need to continue to preserve the existing > "Task Metric" data structures that are exposed to users through event logs > etc. The question is can we use a single internal codepath and perhaps make > this easier to extend in the future. > I think a full exploration would answer the following questions: > - How do the semantics of accumulators on stage retries differ from aggregate > TaskMetrics for a stage? Could we implement clearer retry semantics for > internal accumulators to allow them to be the same - for instance, zeroing > accumulator values if a stage is retried (see discussion here: SPARK-10042). > - Are there metrics that do not fit well into the accumulator model, or would > be difficult to update as an accumulator. > - If we expose metrics through accumulators in the future rather than > continuing to add fields to TaskMetrics, what is the best way to coerce > compatibility? > - Are there any other considerations? > - Is it worth it to do this, or is the consolidation too complicated to > justify?
[jira] [Updated] (SPARK-10620) Look into whether accumulator mechanism can replace TaskMetrics
[ https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-10620: -- Attachment: accums-and-task-metrics.pdf > Look into whether accumulator mechanism can replace TaskMetrics > --- > > Key: SPARK-10620 > URL: https://issues.apache.org/jira/browse/SPARK-10620 > Project: Spark > Issue Type: Task > Components: Spark Core >Reporter: Patrick Wendell >Assignee: Andrew Or > Attachments: accums-and-task-metrics.pdf > > > This task is simply to explore whether the internal representation used by > TaskMetrics could be performed by using accumulators rather than having two > separate mechanisms. Note that we need to continue to preserve the existing > "Task Metric" data structures that are exposed to users through event logs > etc. The question is can we use a single internal codepath and perhaps make > this easier to extend in the future. > I think a full exploration would answer the following questions: > - How do the semantics of accumulators on stage retries differ from aggregate > TaskMetrics for a stage? Could we implement clearer retry semantics for > internal accumulators to allow them to be the same - for instance, zeroing > accumulator values if a stage is retried (see discussion here: SPARK-10042). > - Are there metrics that do not fit well into the accumulator model, or would > be difficult to update as an accumulator. > - If we expose metrics through accumulators in the future rather than > continuing to add fields to TaskMetrics, what is the best way to coerce > compatibility? > - Are there any other considerations? > - Is it worth it to do this, or is the consolidation too complicated to > justify?
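To make the first question above concrete, here is a toy accumulator with an explicit zero-on-retry operation, one of the retry semantics floated in the discussion (SPARK-10042). Names and structure are illustrative, not Spark's actual accumulator internals:

```scala
// Toy internal accumulator (illustrative, not Spark's implementation).
// Tasks call add(); the driver reads value; a stage retry calls zero().
class LongAccumulator(val name: String) {
  private var _value: Long = 0L
  def add(v: Long): Unit = { _value += v }
  def value: Long = _value
  // One possible retry semantic from the discussion: zero the
  // accumulator when the stage is retried, so the retried attempt's
  // contributions replace (rather than double-count) the first attempt.
  def zero(): Unit = { _value = 0L }
}

val bytesRead = new LongAccumulator("internal.metrics.input.bytesRead")
bytesRead.add(1024L)
bytesRead.add(2048L)
assert(bytesRead.value == 3072L)

// Stage retry: reset instead of accumulating across attempts.
bytesRead.zero()
assert(bytesRead.value == 0L)
```

Under this model, each field of TaskMetrics would become one such named accumulator, and the retry question reduces to deciding when zero() is called.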
[jira] [Updated] (SPARK-10620) Look into whether accumulator mechanism can replace TaskMetrics
[ https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-10620: -- Attachment: (was: AccumulatorsandTaskMetricsinSpark2.0.pdf) > Look into whether accumulator mechanism can replace TaskMetrics > --- > > Key: SPARK-10620 > URL: https://issues.apache.org/jira/browse/SPARK-10620 > Project: Spark > Issue Type: Task > Components: Spark Core >Reporter: Patrick Wendell >Assignee: Andrew Or > > This task is simply to explore whether the internal representation used by > TaskMetrics could be performed by using accumulators rather than having two > separate mechanisms. Note that we need to continue to preserve the existing > "Task Metric" data structures that are exposed to users through event logs > etc. The question is can we use a single internal codepath and perhaps make > this easier to extend in the future. > I think a full exploration would answer the following questions: > - How do the semantics of accumulators on stage retries differ from aggregate > TaskMetrics for a stage? Could we implement clearer retry semantics for > internal accumulators to allow them to be the same - for instance, zeroing > accumulator values if a stage is retried (see discussion here: SPARK-10042). > - Are there metrics that do not fit well into the accumulator model, or would > be difficult to update as an accumulator. > - If we expose metrics through accumulators in the future rather than > continuing to add fields to TaskMetrics, what is the best way to coerce > compatibility? > - Are there any other considerations? > - Is it worth it to do this, or is the consolidation too complicated to > justify?
[jira] [Created] (SPARK-12895) Implement TaskMetrics using accumulators
Andrew Or created SPARK-12895: - Summary: Implement TaskMetrics using accumulators Key: SPARK-12895 URL: https://issues.apache.org/jira/browse/SPARK-12895 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Andrew Or Assignee: Andrew Or We need to first do this before we can avoid sending TaskMetrics from the executors to the driver. After we do this, we can send only accumulator updates instead of both that AND TaskMetrics. But first, we need to express everything in TaskMetrics as accumulators.
[jira] [Updated] (SPARK-12895) Implement TaskMetrics using accumulators
[ https://issues.apache.org/jira/browse/SPARK-12895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12895: -- Description: We need to first do this before we can avoid sending TaskMetrics from the executors to the driver. After we do this, we can send only accumulator updates instead of both that AND TaskMetrics. By the end of this issue TaskMetrics will be a wrapper of accumulators. It will be only syntactic sugar for setting these accumulators. But first, we need to express everything in TaskMetrics as accumulators. was: We need to first do this before we can avoid sending TaskMetrics from the executors to the driver. After we do this, we can send only accumulator updates instead of both that AND TaskMetrics. But first, we need to express everything in TaskMetrics as accumulators. > Implement TaskMetrics using accumulators > > > Key: SPARK-12895 > URL: https://issues.apache.org/jira/browse/SPARK-12895 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Andrew Or > > We need to first do this before we can avoid sending TaskMetrics from the > executors to the driver. After we do this, we can send only accumulator > updates instead of both that AND TaskMetrics. > By the end of this issue TaskMetrics will be a wrapper of accumulators. It > will be only syntactic sugar for setting these accumulators. > But first, we need to express everything in TaskMetrics as accumulators.
[jira] [Updated] (SPARK-12896) Send only accumulator updates, not TaskMetrics, to the driver
[ https://issues.apache.org/jira/browse/SPARK-12896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12896: -- Description: Currently in Spark there are two different things that the executors send to the driver, accumulators and TaskMetrics. It would allow us to clean up a lot of things if we send ONLY accumulator updates instead of both that and TaskMetrics. Eventually we will be able to make TaskMetrics private and maybe even remove it. This blocks on SPARK-12895. was:This blocks on > Send only accumulator updates, not TaskMetrics, to the driver > - > > Key: SPARK-12896 > URL: https://issues.apache.org/jira/browse/SPARK-12896 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Andrew Or > > Currently in Spark there are two different things that the executors send to > the driver, accumulators and TaskMetrics. It would allow us to clean up a lot > of things if we send ONLY accumulator updates instead of both that and > TaskMetrics. Eventually we will be able to make TaskMetrics private and maybe > even remove it. > This blocks on SPARK-12895.
[jira] [Created] (SPARK-12896) Send only accumulator updates, not TaskMetrics, to the driver
Andrew Or created SPARK-12896: - Summary: Send only accumulator updates, not TaskMetrics, to the driver Key: SPARK-12896 URL: https://issues.apache.org/jira/browse/SPARK-12896 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Andrew Or Assignee: Andrew Or This blocks on
[jira] [Created] (SPARK-12885) Rename 3 fields in ShuffleWriteMetrics
Andrew Or created SPARK-12885: - Summary: Rename 3 fields in ShuffleWriteMetrics Key: SPARK-12885 URL: https://issues.apache.org/jira/browse/SPARK-12885 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Andrew Or Assignee: Andrew Or Priority: Minor Today we have:
{code}
inputMetrics.recordsRead
outputMetrics.bytesWritten
shuffleReadMetrics.localBlocksFetched
...
shuffleWriteMetrics.shuffleRecordsWritten
shuffleWriteMetrics.shuffleBytesWritten
shuffleWriteMetrics.shuffleWriteTime
{code}
The shuffle write ones are kind of redundant. We can drop the shuffle part in the method names.
[jira] [Resolved] (SPARK-10985) Avoid passing evicted blocks throughout BlockManager / CacheManager
[ https://issues.apache.org/jira/browse/SPARK-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-10985. --- Resolution: Fixed Assignee: Josh Rosen Fix Version/s: 2.0.0 Target Version/s: 2.0.0 > Avoid passing evicted blocks throughout BlockManager / CacheManager > --- > > Key: SPARK-10985 > URL: https://issues.apache.org/jira/browse/SPARK-10985 > Project: Spark > Issue Type: Improvement > Components: Block Manager, Spark Core >Reporter: Andrew Or >Assignee: Josh Rosen >Priority: Minor > Fix For: 2.0.0 > > > This is a minor refactoring task. > Currently when we attempt to put a block in, we get back an array buffer of > blocks that are dropped in the process. We do this to propagate these blocks > back to our TaskContext, which will add them to its TaskMetrics so we can see > them in the SparkUI storage tab properly. > Now that we have TaskContext.get, we can just use that to propagate this > information. This simplifies a lot of the signatures and gets rid of weird > return types like the following everywhere: > {code} > ArrayBuffer[(BlockId, BlockStatus)] > {code}
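The refactoring above leans on TaskContext.get: any code running on the task's thread can reach the current context and record dropped blocks where they happen, instead of threading ArrayBuffer[(BlockId, BlockStatus)] through every signature. The essence of the pattern, with deliberately simplified names (Spark's real TaskContext carries far more state), is a thread-local:

```scala
// Illustrative sketch of the TaskContext.get pattern (names are
// simplified; this is not Spark's actual TaskContext API surface).
class TaskContext {
  val updatedBlocks =
    scala.collection.mutable.ArrayBuffer.empty[(String, String)]
}

object TaskContext {
  private val local = new ThreadLocal[TaskContext]
  def setTaskContext(tc: TaskContext): Unit = local.set(tc)
  def get(): TaskContext = local.get()
}

// Instead of returning ArrayBuffer[(BlockId, BlockStatus)] up the
// call stack, record the eviction directly on the current context.
def dropBlock(blockId: String): Unit =
  TaskContext.get().updatedBlocks += (blockId -> "DROPPED")

TaskContext.setTaskContext(new TaskContext)
dropBlock("rdd_0_1")
println(TaskContext.get().updatedBlocks)
```

Because the context travels implicitly with the thread, intermediate methods no longer need the eviction list in their signatures at all.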
[jira] [Issue Comment Deleted] (SPARK-10620) Look into whether accumulator mechanism can replace TaskMetrics
[ https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-10620: -- Comment: was deleted (was: User 'andrewor14' has created a pull request for this issue: https://github.com/apache/spark/pull/10811) > Look into whether accumulator mechanism can replace TaskMetrics > --- > > Key: SPARK-10620 > URL: https://issues.apache.org/jira/browse/SPARK-10620 > Project: Spark > Issue Type: Task > Components: Spark Core >Reporter: Patrick Wendell > > This task is simply to explore whether the internal representation used by > TaskMetrics could be performed by using accumulators rather than having two > separate mechanisms. Note that we need to continue to preserve the existing > "Task Metric" data structures that are exposed to users through event logs > etc. The question is can we use a single internal codepath and perhaps make > this easier to extend in the future. > I think a full exploration would answer the following questions: > - How do the semantics of accumulators on stage retries differ from aggregate > TaskMetrics for a stage? Could we implement clearer retry semantics for > internal accumulators to allow them to be the same - for instance, zeroing > accumulator values if a stage is retried (see discussion here: SPARK-10042). > - Are there metrics that do not fit well into the accumulator model, or would > be difficult to update as an accumulator. > - If we expose metrics through accumulators in the future rather than > continuing to add fields to TaskMetrics, what is the best way to coerce > compatibility? > - Are there any other considerations? > - Is it worth it to do this, or is the consolidation too complicated to > justify?
[jira] [Issue Comment Deleted] (SPARK-10620) Look into whether accumulator mechanism can replace TaskMetrics
[ https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-10620: -- Comment: was deleted (was: User 'andrewor14' has created a pull request for this issue: https://github.com/apache/spark/pull/10810) > Look into whether accumulator mechanism can replace TaskMetrics > --- > > Key: SPARK-10620 > URL: https://issues.apache.org/jira/browse/SPARK-10620 > Project: Spark > Issue Type: Task > Components: Spark Core >Reporter: Patrick Wendell > > This task is simply to explore whether the internal representation used by > TaskMetrics could be performed by using accumulators rather than having two > separate mechanisms. Note that we need to continue to preserve the existing > "Task Metric" data structures that are exposed to users through event logs > etc. The question is can we use a single internal codepath and perhaps make > this easier to extend in the future. > I think a full exploration would answer the following questions: > - How do the semantics of accumulators on stage retries differ from aggregate > TaskMetrics for a stage? Could we implement clearer retry semantics for > internal accumulators to allow them to be the same - for instance, zeroing > accumulator values if a stage is retried (see discussion here: SPARK-10042). > - Are there metrics that do not fit well into the accumulator model, or would > be difficult to update as an accumulator. > - If we expose metrics through accumulators in the future rather than > continuing to add fields to TaskMetrics, what is the best way to coerce > compatibility? > - Are there any other considerations? > - Is it worth it to do this, or is the consolidation too complicated to > justify?
[jira] [Created] (SPARK-12887) Do not expose var's in TaskMetrics
Andrew Or created SPARK-12887: - Summary: Do not expose var's in TaskMetrics Key: SPARK-12887 URL: https://issues.apache.org/jira/browse/SPARK-12887 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Andrew Or Assignee: Andrew Or TaskMetrics has a bunch of var's, some are fully public, some are private[spark]. This is bad coding style that makes it easy to accidentally overwrite previously set metrics. This has happened a few times in the past and caused bugs that were difficult to debug. Instead, we should have get-or-create semantics, which are more readily understandable. This makes sense in the case of TaskMetrics because these are just aggregated metrics that we want to collect throughout the task, so it doesn't matter *who*'s incrementing them.
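The get-or-create semantics described above can be sketched without Spark's internals; the class and method names below are illustrative only, not Spark's actual API. The point is that callers can obtain and increment the one metrics instance but can never replace it:

```scala
// Illustrative sketch of get-or-create semantics (not Spark's API).
class ShuffleReadMetrics {
  var recordsRead: Long = 0L // incremented, never replaced
}

class TaskMetrics {
  private var _shuffleReadMetrics: Option[ShuffleReadMetrics] = None

  // Get-or-create: the first caller creates the metrics object; every
  // later caller gets the same instance. Nobody can clobber a metrics
  // object that another code path has already been incrementing.
  def registerShuffleReadMetrics(): ShuffleReadMetrics = {
    if (_shuffleReadMetrics.isEmpty) {
      _shuffleReadMetrics = Some(new ShuffleReadMetrics)
    }
    _shuffleReadMetrics.get
  }
}

val tm = new TaskMetrics
val m1 = tm.registerShuffleReadMetrics()
m1.recordsRead += 10
val m2 = tm.registerShuffleReadMetrics() // same instance, not a reset
m2.recordsRead += 5
println(m1.recordsRead) // 15
```

Contrast this with a public `var shuffleReadMetrics` that any caller could reassign, silently discarding counts accumulated so far — exactly the class of bug the issue describes.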
[jira] [Created] (SPARK-12884) Move *Metrics and *Accum* classes to their own files
Andrew Or created SPARK-12884: - Summary: Move *Metrics and *Accum* classes to their own files Key: SPARK-12884 URL: https://issues.apache.org/jira/browse/SPARK-12884 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Andrew Or Assignee: Andrew Or Priority: Minor TaskMetrics.scala and Accumulators.scala are big files. The former contains 8 classes and objects, and the latter contains 6. When working through this code I found that it was difficult to navigate these files because of all the unrelated things in them. We should move most of the classes to their own files.
[jira] [Resolved] (SPARK-12174) Slow test: BlockManagerSuite."SPARK-9591: getRemoteBytes from another location when Exception throw"
[ https://issues.apache.org/jira/browse/SPARK-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12174. --- Resolution: Fixed Fix Version/s: 2.0.0 Target Version/s: 2.0.0 > Slow test: BlockManagerSuite."SPARK-9591: getRemoteBytes from another > location when Exception throw" > > > Key: SPARK-12174 > URL: https://issues.apache.org/jira/browse/SPARK-12174 > Project: Spark > Issue Type: Sub-task > Components: Tests >Reporter: Josh Rosen >Assignee: Josh Rosen > Fix For: 2.0.0 > > > BlockManagerSuite's "SPARK-9591: getRemoteBytes from another location when > Exception throw" test seems to take roughly 45 seconds to run on my laptop, > which seems way too long. We should see whether we can change timeout > configurations to reduce this test's time.
[jira] [Created] (SPARK-12790) Remove HistoryServer old multiple files format
Andrew Or created SPARK-12790: - Summary: Remove HistoryServer old multiple files format Key: SPARK-12790 URL: https://issues.apache.org/jira/browse/SPARK-12790 Project: Spark Issue Type: Sub-task Components: Deploy Reporter: Andrew Or HistoryServer has 2 formats. The old one makes a directory and puts multiple files in there (APPLICATION_COMPLETE, EVENT_LOG1 etc.). The new one has just 1 file called local_2593759238651.log or something. It's been a nightmare to maintain both code paths. We should just remove the old legacy format (which has been out of use for many versions now) when we still have the chance.
[jira] [Commented] (SPARK-12627) Remove SparkContext preferred locations constructor
[ https://issues.apache.org/jira/browse/SPARK-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15084438#comment-15084438 ] Andrew Or commented on SPARK-12627: --- Resolved through https://github.com/apache/spark/pull/10569/files > Remove SparkContext preferred locations constructor > --- > > Key: SPARK-12627 > URL: https://issues.apache.org/jira/browse/SPARK-12627 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Andrew Or >Priority: Blocker > > In the code: > {code} > logWarning("Passing in preferred locations has no effect at all, see > SPARK-8949") > {code} > We kept it there only for backward compatibility. Now is the time to remove > it.
[jira] [Resolved] (SPARK-12627) Remove SparkContext preferred locations constructor
[ https://issues.apache.org/jira/browse/SPARK-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12627. --- Resolution: Fixed Assignee: Reynold Xin (was: Andrew Or) Fix Version/s: 2.0.0 > Remove SparkContext preferred locations constructor > --- > > Key: SPARK-12627 > URL: https://issues.apache.org/jira/browse/SPARK-12627 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Reynold Xin >Priority: Blocker > Fix For: 2.0.0 > > > In the code: > {code} > logWarning("Passing in preferred locations has no effect at all, see > SPARK-8949") > {code} > We kept it there only for backward compatibility. Now is the time to remove > it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-12627) Remove SparkContext preferred locations constructor
Andrew Or created SPARK-12627: - Summary: Remove SparkContext preferred locations constructor Key: SPARK-12627 URL: https://issues.apache.org/jira/browse/SPARK-12627 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Andrew Or Priority: Blocker In the code: {code} logWarning("Passing in preferred locations has no effect at all, see SPARK-8949") {code} We kept it there only for backward compatibility. Now is the time to remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-12627) Remove SparkContext preferred locations constructor
[ https://issues.apache.org/jira/browse/SPARK-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reassigned SPARK-12627: - Assignee: Andrew Or > Remove SparkContext preferred locations constructor > --- > > Key: SPARK-12627 > URL: https://issues.apache.org/jira/browse/SPARK-12627 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Andrew Or >Priority: Blocker > > In the code: > {code} > logWarning("Passing in preferred locations has no effect at all, see > SPARK-8949") > {code} > We kept it there only for backward compatibility. Now is the time to remove > it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12628) SparkUI: weird formatting on additional metrics tooltip
[ https://issues.apache.org/jira/browse/SPARK-12628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12628: -- Attachment: Screen Shot 2016-01-04 at 1.18.16 PM.png > SparkUI: weird formatting on additional metrics tooltip > --- > > Key: SPARK-12628 > URL: https://issues.apache.org/jira/browse/SPARK-12628 > Project: Spark > Issue Type: Bug > Components: Web UI >Reporter: Andrew Or > Attachments: Screen Shot 2016-01-04 at 1.18.16 PM.png > > > See screenshot. I'm on chrome 47.0.2526, OSX 10.9.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-12628) SparkUI: weird formatting on additional metrics tooltip
Andrew Or created SPARK-12628: - Summary: SparkUI: weird formatting on additional metrics tooltip Key: SPARK-12628 URL: https://issues.apache.org/jira/browse/SPARK-12628 Project: Spark Issue Type: Bug Components: Web UI Reporter: Andrew Or See screenshot. I'm on chrome 47.0.2526, OSX 10.9.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-12606) Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ?
Andrew Davidson created SPARK-12606: --- Summary: Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ? Key: SPARK-12606 URL: https://issues.apache.org/jira/browse/SPARK-12606 Project: Spark Issue Type: Bug Components: ML Affects Versions: 1.5.2 Environment: Java 8, Mac OS, Spark-1.5.2 Reporter: Andrew Davidson Hi Andy, I suspect that you hit the Scala/Java compatibility issue; I can also reproduce this issue, so could you file a JIRA to track it? Yanbo 2016-01-02 3:38 GMT+08:00 Andy Davidson: I am trying to write a trivial transformer to use in my pipeline. I am using Java and Spark 1.5.2. It was suggested that I use the Tokenize.scala class as an example. This should be very easy; however, I do not understand Scala and am having trouble debugging the following exception. Any help would be greatly appreciated. Happy New Year Andy java.lang.IllegalArgumentException: requirement failed: Param null__inputCol does not belong to Stemmer_2f3aa96d-7919-4eaa-ad54-f7c620b92d1c. 
at scala.Predef$.require(Predef.scala:233)
at org.apache.spark.ml.param.Params$class.shouldOwn(params.scala:557)
at org.apache.spark.ml.param.Params$class.set(params.scala:436)
at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:37)
at org.apache.spark.ml.param.Params$class.set(params.scala:422)
at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:37)
at org.apache.spark.ml.UnaryTransformer.setInputCol(Transformer.scala:83)
at com.pws.xxx.ml.StemmerTest.test(StemmerTest.java:30)

public class StemmerTest extends AbstractSparkTest {
    @Test
    public void test() {
        Stemmer stemmer = new Stemmer()
            .setInputCol("raw")      // line 30
            .setOutputCol("filtered");
    }
}

/**
 * @see spark-1.5.1/mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala
 * @see https://chimpler.wordpress.com/2014/06/11/classifiying-documents-using-naive-bayes-on-apache-spark-mllib/
 * @see http://www.tonytruong.net/movie-rating-prediction-with-apache-spark-and-hortonworks/
 *
 * @author andrewdavidson
 */
public class Stemmer extends UnaryTransformer<List<String>, List<String>, Stemmer> implements Serializable {
    static Logger logger = LoggerFactory.getLogger(Stemmer.class);
    private static final long serialVersionUID = 1L;
    private static final ArrayType inputType = DataTypes.createArrayType(DataTypes.StringType, true);
    private final String uid = Stemmer.class.getSimpleName() + "_" + UUID.randomUUID().toString();

    @Override
    public String uid() {
        return uid;
    }

    /*
    override protected def validateInputType(inputType: DataType): Unit = {
        require(inputType == StringType, s"Input type must be string type but got $inputType.")
    }
    */
    @Override
    public void validateInputType(DataType inputTypeArg) {
        String msg = "inputType must be " + inputType.simpleString() + " but got " + inputTypeArg.simpleString();
        assert (inputType.equals(inputTypeArg)) : msg;
    }

    @Override
    public Function1<List<String>, List<String>> createTransformFunc() {
        // http://stackoverflow.com/questions/6545066/using-scala-from-java-passing-functions-as-parameters
        Function1<List<String>, List<String>> f = new AbstractFunction1<List<String>, List<String>>() {
            public List<String> apply(List<String> words) {
                for (String word : words) {
                    logger.error("AEDWIP input word: {}", word);
                }
                return words;
            }
        };
        return f;
    }

    @Override
    public DataType outputDataType() {
        return DataTypes.createArrayType(DataTypes.StringType, true);
    }
}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-12414) Remove closure serializer
[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075814#comment-15075814 ] Andrew Or edited comment on SPARK-12414 at 12/31/15 7:51 AM: - It's also for code cleanup. Right now SparkEnv has a "closure serializer" and a "serializer", which is kind of confusing. We should just use Java serializer since it's worked for such a long time. I don't know much about Kryo 3.0 but I'm not sure if upgrading would be sufficient. was (Author: andrewor14): It's also for code cleanup. Right now SparkEnv has a "closure serializer" and a "serializer", which is kind of confusing. We should just use Java serializer since it's worked for such a long time. Not sure about Kryo 3.0 but I'm not sure if upgrading would be sufficient. > Remove closure serializer > - > > Key: SPARK-12414 > URL: https://issues.apache.org/jira/browse/SPARK-12414 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 1.0.0 >Reporter: Andrew Or >Assignee: Andrew Or > > There is a config `spark.closure.serializer` that accepts exactly one value: > the java serializer. This is because there are currently bugs in the Kryo > serializer that make it not a viable candidate. This was uncovered by an > unsuccessful attempt to make it work: SPARK-7708. > My high level point is that the Java serializer has worked well for at least > 6 Spark versions now, and it is an incredibly complicated task to get other > serializers (not just Kryo) to work with Spark's closures. IMO the effort is > not worth it and we should just remove this documentation and all the code > associated with it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12414) Remove closure serializer
[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075814#comment-15075814 ] Andrew Or commented on SPARK-12414: --- It's also for code cleanup. Right now SparkEnv has a "closure serializer" and a "serializer", which is kind of confusing. We should just use Java serializer since it's worked for such a long time. Not sure about Kryo 3.0 but I'm not sure if upgrading would be sufficient. > Remove closure serializer > - > > Key: SPARK-12414 > URL: https://issues.apache.org/jira/browse/SPARK-12414 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 1.0.0 >Reporter: Andrew Or >Assignee: Andrew Or > > There is a config `spark.closure.serializer` that accepts exactly one value: > the java serializer. This is because there are currently bugs in the Kryo > serializer that make it not a viable candidate. This was uncovered by an > unsuccessful attempt to make it work: SPARK-7708. > My high level point is that the Java serializer has worked well for at least > 6 Spark versions now, and it is an incredibly complicated task to get other > serializers (not just Kryo) to work with Spark's closures. IMO the effort is > not worth it and we should just remove this documentation and all the code > associated with it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-12582) IndexShuffleBlockResolverSuite fails in windows
[ https://issues.apache.org/jira/browse/SPARK-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reassigned SPARK-12582: - Assignee: Andrew Or > IndexShuffleBlockResolverSuite fails in windows > --- > > Key: SPARK-12582 > URL: https://issues.apache.org/jira/browse/SPARK-12582 > Project: Spark > Issue Type: Bug > Components: Tests, Windows >Reporter: yucai >Assignee: Andrew Or > > IndexShuffleBlockResolverSuite fails in my windows develop machine. > {code} > [info] IndexShuffleBlockResolverSuite: > [info] - commit shuffle files multiple times *** FAILED *** (388 milliseconds) > [info] Array(10, 0, 20) equaled Array(10, 0, 20) > (IndexShuffleBlockResolverSuite.scala:108) > [info] org.scalatest.exceptions.TestFailedException: > . > . > [info] Exception encountered when attempting to run a suite with class name: > org.apache.spark.shuffle.sort.IndexShuffleB > lockResolverSuite *** ABORTED *** (2 seconds, 234 milliseconds) > [info] java.io.IOException: Failed to delete: > C:\Users\yyu29\Documents\codes.next\spark\target\tmp\spark-0e81a15a-e712 > -4b1c-a089-f421db149e65 > [info] at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:940) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 60) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:205) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:220) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > {code} > Root cause is when "afterEach" wants to clean up data, some files are still > open. 
For example: > {code} > // The dataFile should be the previous one > val in = new FileInputStream(dataFile) > val firstByte = new Array[Byte](1) > in.read(firstByte) > assert(firstByte(0) === 0) > {code} > Lack of "in.close()". > In Linux, it is not a problem: you can still delete a file even if it is open, > but this does not work on Windows, which will report "resource is busy". > Another issue is that IndexShuffleBlockResolverSuite.scala is a Scala file > but it is placed in "test/java". -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
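A minimal sketch of the fix the report calls for: close the stream (here via try-with-resources) so the temp file can be deleted afterwards, which Windows otherwise refuses while a handle is open. The class and helper names below are illustrative, not Spark's actual test code.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

public class ReadFirstByte {
    // Mirrors the test's read of the data file's first byte, but closes the
    // stream so the file can be deleted afterwards. On Windows, deleting a
    // file with an open handle fails ("resource is busy").
    static int readFirstByte(File dataFile) throws IOException {
        try (FileInputStream in = new FileInputStream(dataFile)) {
            return in.read(); // first byte as 0-255, or -1 if the file is empty
        } // stream is closed here even if read() throws
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("shuffle", ".data");
        Files.write(f.toPath(), new byte[] {0, 10, 20});
        System.out.println(readFirstByte(f)); // 0
        System.out.println(f.delete());       // true: no open handle remains
    }
}
```

The same pattern applies to the suite's other unclosed streams; try-with-resources keeps cleanup correct on both Linux and Windows.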
[jira] [Updated] (SPARK-12582) IndexShuffleBlockResolverSuite fails in windows
[ https://issues.apache.org/jira/browse/SPARK-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12582: -- Component/s: (was: Shuffle) Windows Tests > IndexShuffleBlockResolverSuite fails in windows > --- > > Key: SPARK-12582 > URL: https://issues.apache.org/jira/browse/SPARK-12582 > Project: Spark > Issue Type: Bug > Components: Tests, Windows >Reporter: yucai > > IndexShuffleBlockResolverSuite fails in my windows develop machine. > {code} > [info] IndexShuffleBlockResolverSuite: > [info] - commit shuffle files multiple times *** FAILED *** (388 milliseconds) > [info] Array(10, 0, 20) equaled Array(10, 0, 20) > (IndexShuffleBlockResolverSuite.scala:108) > [info] org.scalatest.exceptions.TestFailedException: > . > . > [info] Exception encountered when attempting to run a suite with class name: > org.apache.spark.shuffle.sort.IndexShuffleB > lockResolverSuite *** ABORTED *** (2 seconds, 234 milliseconds) > [info] java.io.IOException: Failed to delete: > C:\Users\yyu29\Documents\codes.next\spark\target\tmp\spark-0e81a15a-e712 > -4b1c-a089-f421db149e65 > [info] at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:940) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 60) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:205) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:220) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > {code} > Root cause is when "afterEach" wants to clean up data, some files are still > open. 
For example: > {code} > // The dataFile should be the previous one > val in = new FileInputStream(dataFile) > val firstByte = new Array[Byte](1) > in.read(firstByte) > assert(firstByte(0) === 0) > {code} > Lack of "in.close()". > In Linux, it is not a problem, you can still delete a file even it is open, > but this does not work in windows, which will report "resource is busy". > Another issue is this IndexShuffleBlockResolverSuite.scala is a scala file > but it is placed in "test/java". -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12582) IndexShuffleBlockResolverSuite fails in windows
[ https://issues.apache.org/jira/browse/SPARK-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12582: -- Assignee: (was: Apache Spark) Target Version/s: 1.6.1, 2.0.0 > IndexShuffleBlockResolverSuite fails in windows > --- > > Key: SPARK-12582 > URL: https://issues.apache.org/jira/browse/SPARK-12582 > Project: Spark > Issue Type: Bug > Components: Tests, Windows >Reporter: yucai > > IndexShuffleBlockResolverSuite fails in my windows develop machine. > {code} > [info] IndexShuffleBlockResolverSuite: > [info] - commit shuffle files multiple times *** FAILED *** (388 milliseconds) > [info] Array(10, 0, 20) equaled Array(10, 0, 20) > (IndexShuffleBlockResolverSuite.scala:108) > [info] org.scalatest.exceptions.TestFailedException: > . > . > [info] Exception encountered when attempting to run a suite with class name: > org.apache.spark.shuffle.sort.IndexShuffleB > lockResolverSuite *** ABORTED *** (2 seconds, 234 milliseconds) > [info] java.io.IOException: Failed to delete: > C:\Users\yyu29\Documents\codes.next\spark\target\tmp\spark-0e81a15a-e712 > -4b1c-a089-f421db149e65 > [info] at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:940) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 60) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:205) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:220) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > {code} > Root cause is when "afterEach" wants to clean up data, some files are still > open. 
For example: > {code} > // The dataFile should be the previous one > val in = new FileInputStream(dataFile) > val firstByte = new Array[Byte](1) > in.read(firstByte) > assert(firstByte(0) === 0) > {code} > Lack of "in.close()". > In Linux, it is not a problem, you can still delete a file even it is open, > but this does not work in windows, which will report "resource is busy". > Another issue is this IndexShuffleBlockResolverSuite.scala is a scala file > but it is placed in "test/java". -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12582) IndexShuffleBlockResolverSuite fails in windows
[ https://issues.apache.org/jira/browse/SPARK-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12582: -- Assignee: yucai (was: Andrew Or) > IndexShuffleBlockResolverSuite fails in windows > --- > > Key: SPARK-12582 > URL: https://issues.apache.org/jira/browse/SPARK-12582 > Project: Spark > Issue Type: Bug > Components: Tests, Windows >Reporter: yucai >Assignee: yucai > > IndexShuffleBlockResolverSuite fails in my windows develop machine. > {code} > [info] IndexShuffleBlockResolverSuite: > [info] - commit shuffle files multiple times *** FAILED *** (388 milliseconds) > [info] Array(10, 0, 20) equaled Array(10, 0, 20) > (IndexShuffleBlockResolverSuite.scala:108) > [info] org.scalatest.exceptions.TestFailedException: > . > . > [info] Exception encountered when attempting to run a suite with class name: > org.apache.spark.shuffle.sort.IndexShuffleB > lockResolverSuite *** ABORTED *** (2 seconds, 234 milliseconds) > [info] java.io.IOException: Failed to delete: > C:\Users\yyu29\Documents\codes.next\spark\target\tmp\spark-0e81a15a-e712 > -4b1c-a089-f421db149e65 > [info] at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:940) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 60) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:205) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > [info] at > org.scalatest.BeforeAndAfterEach$class.afterEach(BeforeAndAfterEach.scala:220) > [info] at > org.apache.spark.shuffle.sort.IndexShuffleBlockResolverSuite.afterEach(IndexShuffleBlockResolverSuite.scala: > 36) > {code} > Root cause is when "afterEach" wants to clean up data, some files are still > open. 
For example: > {code} > // The dataFile should be the previous one > val in = new FileInputStream(dataFile) > val firstByte = new Array[Byte](1) > in.read(firstByte) > assert(firstByte(0) === 0) > {code} > Lack of "in.close()". > In Linux, it is not a problem, you can still delete a file even it is open, > but this does not work in windows, which will report "resource is busy". > Another issue is this IndexShuffleBlockResolverSuite.scala is a scala file > but it is placed in "test/java". -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12554) Standalone mode may hang if max cores is not a multiple of executor cores
[ https://issues.apache.org/jira/browse/SPARK-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12554: -- Priority: Minor (was: Major) > Standalone mode may hang if max cores is not a multiple of executor cores > - > > Key: SPARK-12554 > URL: https://issues.apache.org/jira/browse/SPARK-12554 > Project: Spark > Issue Type: Bug > Components: Deploy, Scheduler >Affects Versions: 1.5.2 >Reporter: Lijie Xu >Priority: Minor > > In scheduleExecutorsOnWorker() in Master.scala, > {{val keepScheduling = coresToAssign >= minCoresPerExecutor}} should be > changed to {{val keepScheduling = coresToAssign > 0}} > Case 1: > Suppose that an app's requested cores is 10 (i.e., {{spark.cores.max = 10}}) > and app.coresPerExecutor is 4 (i.e., {{spark.executor.cores = 4}}). > After allocating two executors (each has 4 cores) to this app, the > {{app.coresToAssign = 2}} and {{minCoresPerExecutor = coresPerExecutor = 4}}, > so {{keepScheduling = false}} and no extra executor will be allocated to this > app. If {{spark.scheduler.minRegisteredResourcesRatio}} is set to a large > number (e.g., > 0.8 in this case), the app will hang and never finish. > Case 2: if a small app's coresPerExecutor is larger than its requested cores > (e.g., {{spark.cores.max = 10}}, {{spark.executor.cores = 16}}), {{val > keepScheduling = coresToAssign >= minCoresPerExecutor}} is always FALSE. As a > result, this app will never get an executor to run. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
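The stranded-cores arithmetic in the two cases above can be sketched with a toy model. This simulates only the keepScheduling condition, not the real scheduleExecutorsOnWorker in Master.scala:

```java
public class SchedulingSketch {
    // Simplified model of the standalone Master's per-app core assignment:
    // keep granting whole executors of coresPerExecutor cores while enough
    // unassigned cores remain. Just enough to show where cores get stranded.
    static int assignCores(int maxCores, int coresPerExecutor) {
        int coresToAssign = maxCores;
        int assigned = 0;
        // keepScheduling = coresToAssign >= minCoresPerExecutor
        while (coresToAssign >= coresPerExecutor) {
            coresToAssign -= coresPerExecutor;
            assigned += coresPerExecutor;
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Case 1: spark.cores.max = 10, spark.executor.cores = 4
        System.out.println(assignCores(10, 4));  // 8: two executors, 2 cores never granted
        // Case 2: spark.cores.max = 10, spark.executor.cores = 16
        System.out.println(assignCores(10, 16)); // 0: no executor ever launches
    }
}
```

With minRegisteredResourcesRatio > 0.8, case 1 waits for 10 registered cores but only 8 can ever register, hence the hang.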
[jira] [Updated] (SPARK-12554) Standalone mode may hang if max cores is not a multiple of executor cores
[ https://issues.apache.org/jira/browse/SPARK-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12554: -- Summary: Standalone mode may hang if max cores is not a multiple of executor cores (was: Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor) > Standalone mode may hang if max cores is not a multiple of executor cores > - > > Key: SPARK-12554 > URL: https://issues.apache.org/jira/browse/SPARK-12554 > Project: Spark > Issue Type: Bug > Components: Deploy, Scheduler >Affects Versions: 1.5.2 >Reporter: Lijie Xu > > In scheduleExecutorsOnWorker() in Master.scala, > {{val keepScheduling = coresToAssign >= minCoresPerExecutor}} should be > changed to {{val keepScheduling = coresToAssign > 0}} > Case 1: > Suppose that an app's requested cores is 10 (i.e., {{spark.cores.max = 10}}) > and app.coresPerExecutor is 4 (i.e., {{spark.executor.cores = 4}}). > After allocating two executors (each has 4 cores) to this app, the > {{app.coresToAssign = 2}} and {{minCoresPerExecutor = coresPerExecutor = 4}}, > so {{keepScheduling = false}} and no extra executor will be allocated to this > app. If {{spark.scheduler.minRegisteredResourcesRatio}} is set to a large > number (e.g., > 0.8 in this case), the app will hang and never finish. > Case 2: if a small app's coresPerExecutor is larger than its requested cores > (e.g., {{spark.cores.max = 10}}, {{spark.executor.cores = 16}}), {{val > keepScheduling = coresToAssign >= minCoresPerExecutor}} is always FALSE. As a > result, this app will never get an executor to run. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12554) Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor
[ https://issues.apache.org/jira/browse/SPARK-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075353#comment-15075353 ] Andrew Or commented on SPARK-12554: --- [~jerrylead] The change you are proposing changes the semantics incorrectly. The right fix here is to adjust the wait behavior. Right now it doesn't take into account the fact that executors can have a fixed number of cores, such that we never reach `spark.cores.max`. All we need to do is use the right max when waiting for resources; e.g. in your example, instead of waiting for all 10 cores, we wait for the nearest multiple of 4, i.e. 8. Your case 2 is not a bug at all: the user chose settings that are impossible to fulfill. Although there's nothing to fix, we could throw an exception to fail the application quickly; but no one really runs into this, so it's probably not worth doing. > Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor > -- > > Key: SPARK-12554 > URL: https://issues.apache.org/jira/browse/SPARK-12554 > Project: Spark > Issue Type: Bug > Components: Deploy, Scheduler >Affects Versions: 1.5.2 >Reporter: Lijie Xu > > In scheduleExecutorsOnWorker() in Master.scala, > {{val keepScheduling = coresToAssign >= minCoresPerExecutor}} should be > changed to {{val keepScheduling = coresToAssign > 0}} > Case 1: > Suppose that an app's requested cores is 10 (i.e., {{spark.cores.max = 10}}) > and app.coresPerExecutor is 4 (i.e., {{spark.executor.cores = 4}}). > After allocating two executors (each has 4 cores) to this app, the > {{app.coresToAssign = 2}} and {{minCoresPerExecutor = coresPerExecutor = 4}}, > so {{keepScheduling = false}} and no extra executor will be allocated to this > app. If {{spark.scheduler.minRegisteredResourcesRatio}} is set to a large > number (e.g., > 0.8 in this case), the app will hang and never finish. 
> Case 2: if a small app's coresPerExecutor is larger than its requested cores > (e.g., {{spark.cores.max = 10}}, {{spark.executor.cores = 16}}), {{val > keepScheduling = coresToAssign >= minCoresPerExecutor}} is always FALSE. As a > result, this app will never get an executor to run. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
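The adjustment Andrew describes in his comment, waiting for the nearest achievable multiple of the executor size rather than the raw spark.cores.max, amounts to rounding down. The names below are illustrative, not from the actual Spark patch:

```java
public class EffectiveMaxCores {
    // Round the requested core cap down to the largest core count that whole
    // executors can actually occupy, e.g. a cap of 10 with 4-core executors
    // means waiting for 8 registered cores, not 10.
    static int effectiveMax(int maxCores, int coresPerExecutor) {
        return (maxCores / coresPerExecutor) * coresPerExecutor;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMax(10, 4));  // 8
        System.out.println(effectiveMax(10, 16)); // 0: impossible request, could fail fast
    }
}
```

A result of 0 corresponds to case 2, where the configuration can never be satisfied and the app could be failed immediately instead of waiting.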
[jira] [Updated] (SPARK-12554) Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor
[ https://issues.apache.org/jira/browse/SPARK-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12554: -- Priority: Major (was: Critical) > Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor > -- > > Key: SPARK-12554 > URL: https://issues.apache.org/jira/browse/SPARK-12554 > Project: Spark > Issue Type: Bug > Components: Deploy, Scheduler >Affects Versions: 1.5.2 >Reporter: Lijie Xu > > In scheduleExecutorsOnWorker() in Master.scala, > *val keepScheduling = coresToAssign >= minCoresPerExecutor* should be changed > to *val keepScheduling = coresToAssign > 0* > Suppose that an app's requested cores is 10 (i.e., spark.cores.max = 10) and > app.coresPerExecutor is 4 (i.e., spark.executor.cores = 4). > After allocating two executors (each has 4 cores) to this app, the > *app.coresToAssign = 2* and *minCoresPerExecutor = coresPerExecutor = 4*, so > *keepScheduling = false* and no extra executor will be allocated to this app. > If *spark.scheduler.minRegisteredResourcesRatio* is set to a large number > (e.g., > 0.8 in this case), the app will hang and never finish. > Another case: if a small app's coresPerExecutor is larger than its requested > cores (e.g., spark.cores.max = 10, spark.executor.cores = 16), *val > keepScheduling = coresToAssign >= minCoresPerExecutor* is always FALSE. As a > result, this app will never get an executor to run. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-12559) Standalone cluster mode doesn't work with --packages
Andrew Or created SPARK-12559: - Summary: Standalone cluster mode doesn't work with --packages Key: SPARK-12559 URL: https://issues.apache.org/jira/browse/SPARK-12559 Project: Spark Issue Type: Bug Components: Spark Submit Affects Versions: 1.3.0 Reporter: Andrew Or >From the mailing list: {quote} Another problem I ran into that you also might is that --packages doesn't work with --deploy-mode cluster. It downloads the packages to a temporary location on the node running spark-submit, then passes those paths to the node that is running the Driver, but since that isn't the same machine, it can't find anything and fails. The driver process *should* be the one doing the downloading, but it isn't. I ended up having to create a fat JAR with all of the dependencies to get around that one. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12554) Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor
[ https://issues.apache.org/jira/browse/SPARK-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074189#comment-15074189 ] Andrew Or commented on SPARK-12554: --- I've downgraded the issue. I also think the behavior is by design. The semantics of `spark.executor.cores` is that each executor has exactly N number of cores. Your cluster "hanging" means it doesn't have enough resources to launch the expected number of executors, so you should provision more resources. > Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor > -- > > Key: SPARK-12554 > URL: https://issues.apache.org/jira/browse/SPARK-12554 > Project: Spark > Issue Type: Bug > Components: Deploy, Scheduler >Affects Versions: 1.5.2 >Reporter: Lijie Xu > > In scheduleExecutorsOnWorker() in Master.scala, > *val keepScheduling = coresToAssign >= minCoresPerExecutor* should be changed > to *val keepScheduling = coresToAssign > 0* > Suppose that an app's requested cores is 10 (i.e., spark.cores.max = 10) and > app.coresPerExecutor is 4 (i.e., spark.executor.cores = 4). > After allocating two executors (each has 4 cores) to this app, the > *app.coresToAssign = 2* and *minCoresPerExecutor = coresPerExecutor = 4*, so > *keepScheduling = false* and no extra executor will be allocated to this app. > If *spark.scheduler.minRegisteredResourcesRatio* is set to a large number > (e.g., > 0.8 in this case), the app will hang and never finish. > Another case: if a small app's coresPerExecutor is larger than its requested > cores (e.g., spark.cores.max = 10, spark.executor.cores = 16), *val > keepScheduling = coresToAssign >= minCoresPerExecutor* is always FALSE. As a > result, this app will never get an executor to run. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12559) Standalone cluster mode doesn't work with --packages
[ https://issues.apache.org/jira/browse/SPARK-12559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12559: -- Description: From the mailing list: {quote} Another problem I ran into that you also might is that --packages doesn't work with --deploy-mode cluster. It downloads the packages to a temporary location on the node running spark-submit, then passes those paths to the node that is running the Driver, but since that isn't the same machine, it can't find anything and fails. The driver process *should* be the one doing the downloading, but it isn't. I ended up having to create a fat JAR with all of the dependencies to get around that one. {quote} The problem is that we currently don't upload jars to the cluster. It seems that to fix this we should either (1) actually upload jars, or (2) just run the packages code on the driver side. I slightly prefer (2) because it's simpler. was: From the mailing list: {quote} Another problem I ran into that you also might is that --packages doesn't work with --deploy-mode cluster. It downloads the packages to a temporary location on the node running spark-submit, then passes those paths to the node that is running the Driver, but since that isn't the same machine, it can't find anything and fails. The driver process *should* be the one doing the downloading, but it isn't. I ended up having to create a fat JAR with all of the dependencies to get around that one. {quote} > Standalone cluster mode doesn't work with --packages > > > Key: SPARK-12559 > URL: https://issues.apache.org/jira/browse/SPARK-12559 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 1.3.0 >Reporter: Andrew Or > > From the mailing list: > {quote} > Another problem I ran into that you also might is that --packages doesn't > work with --deploy-mode cluster.
It downloads the packages to a temporary > location on the node running spark-submit, then passes those paths to the > node that is running the Driver, but since that isn't the same machine, it > can't find anything and fails. The driver process *should* be the one > doing the downloading, but it isn't. I ended up having to create a fat JAR > with all of the dependencies to get around that one. > {quote} > The problem is that we currently don't upload jars to the cluster. It seems > that to fix this we should either (1) actually upload jars, or (2) just run the packages code > on the driver side. I slightly prefer (2) because it's simpler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
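For reference, the failing combination is simply {{--packages}} together with {{--deploy-mode cluster}}; a sketch of such an invocation (the master URL, package coordinates and jar name are illustrative):

```
spark-submit \
  --master spark://master:7077 \
  --deploy-mode cluster \
  --packages com.databricks:spark-csv_2.10:1.3.0 \
  my-app.jar
```

The Ivy resolution for {{--packages}} happens on the machine running spark-submit, so the resolved local paths are meaningless on whichever node is later chosen to run the driver.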
[jira] [Created] (SPARK-12565) Document standalone mode jar distribution more clearly
Andrew Or created SPARK-12565: - Summary: Document standalone mode jar distribution more clearly Key: SPARK-12565 URL: https://issues.apache.org/jira/browse/SPARK-12565 Project: Spark Issue Type: Improvement Components: Documentation Affects Versions: 1.0.0 Reporter: Andrew Or There's a lot of confusion on the mailing list about how jars are distributed. The existing docs are partially responsible for this: {quote} If your application is launched through Spark submit, then the application jar is automatically distributed to all worker nodes. {quote} This makes it seem like even in cluster mode, the jars are uploaded from the client (machine running Spark submit) to the cluster, which is not true. We should clarify the documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-12415) Do not use closure serializer to serialize task result
[ https://issues.apache.org/jira/browse/SPARK-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-12415. - Resolution: Won't Fix > Do not use closure serializer to serialize task result > -- > > Key: SPARK-12415 > URL: https://issues.apache.org/jira/browse/SPARK-12415 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.0.0 >Reporter: Andrew Or > > As the name suggests, closure serializer is for closures. We should be able > to use the generic serializer for task results. If we want to do this we need > to register `org.apache.spark.scheduler.TaskResult` if we use Kryo. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12415) Do not use closure serializer to serialize task result
[ https://issues.apache.org/jira/browse/SPARK-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074201#comment-15074201 ] Andrew Or commented on SPARK-12415: --- Closed as Won't Fix. We can't use Kryo to serialize task result because there are all these dependent classes in TaskMetrics. If the user adds one but forgets to register it with Kryo then suddenly all apps with Kryo will fail. This issue will be superseded by SPARK-12414. > Do not use closure serializer to serialize task result > -- > > Key: SPARK-12415 > URL: https://issues.apache.org/jira/browse/SPARK-12415 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.0.0 >Reporter: Andrew Or > > As the name suggests, closure serializer is for closures. We should be able > to use the generic serializer for task results. If we want to do this we need > to register `org.apache.spark.scheduler.TaskResult` if we use Kryo. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
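The fragility described here is essentially Kryo's required-registration behavior; a toy model (plain Java, not the real Kryo API) of why a single unregistered class reachable from TaskMetrics would break serialization:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of a serializer that requires up-front class registration
// (as Kryo does with registrationRequired): serializing an object whose
// class was never registered throws immediately.
public class RegistrySketch {
    private final Set<Class<?>> registered = new HashSet<>();

    public void register(Class<?> c) { registered.add(c); }

    public void serialize(Object o) {
        if (!registered.contains(o.getClass())) {
            throw new IllegalArgumentException(
                "Class is not registered: " + o.getClass().getName());
        }
        // real byte-level serialization elided
    }
}
```

Plain Java serialization needs no such registration, which is what the closure serializer relies on today: adding a new field type to TaskMetrics cannot break existing apps.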
[jira] [Commented] (SPARK-12554) Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor
[ https://issues.apache.org/jira/browse/SPARK-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074198#comment-15074198 ] Andrew Or commented on SPARK-12554: --- If anything I would say it's the calculation of {{spark.scheduler.maxRegisteredResourcesWaitingTime}} that's a problem. It shouldn't use {{spark.cores.max}} as the threshold max; it should round {{spark.cores.max}} to the nearest multiple of {{spark.executor.cores}} if the latter is set. > Standalone app scheduler will hang when app.coreToAssign < minCoresPerExecutor > -- > > Key: SPARK-12554 > URL: https://issues.apache.org/jira/browse/SPARK-12554 > Project: Spark > Issue Type: Bug > Components: Deploy, Scheduler >Affects Versions: 1.5.2 >Reporter: Lijie Xu > > In scheduleExecutorsOnWorker() in Master.scala, > *val keepScheduling = coresToAssign >= minCoresPerExecutor* should be changed > to *val keepScheduling = coresToAssign > 0* > Suppose that an app's requested cores is 10 (i.e., spark.cores.max = 10) and > app.coresPerExecutor is 4 (i.e., spark.executor.cores = 4). > After allocating two executors (each has 4 cores) to this app, the > *app.coresToAssign = 2* and *minCoresPerExecutor = coresPerExecutor = 4*, so > *keepScheduling = false* and no extra executor will be allocated to this app. > If *spark.scheduler.minRegisteredResourcesRatio* is set to a large number > (e.g., > 0.8 in this case), the app will hang and never finish. > Another case: if a small app's coresPerExecutor is larger than its requested > cores (e.g., spark.cores.max = 10, spark.executor.cores = 16), *val > keepScheduling = coresToAssign >= minCoresPerExecutor* is always FALSE. As a > result, this app will never get an executor to run. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
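One plausible reading of that rounding (rounding down, since the scheduler can never allocate a partial executor) as an illustrative helper, not actual Spark code:

```java
// Round maxCores (spark.cores.max) down to the nearest multiple of
// coresPerExecutor (spark.executor.cores), i.e. the number of cores the
// scheduler can actually hand out, so the registered-resources ratio is
// computed against an achievable total rather than the raw maximum.
public class CoreRounding {
    public static int achievableCores(int maxCores, int coresPerExecutor) {
        if (coresPerExecutor <= 0) return maxCores; // spark.executor.cores unset
        return (maxCores / coresPerExecutor) * coresPerExecutor;
    }
}
```

With the example from the report, {{achievableCores(10, 4)}} is 8, so a ratio of 0.8 of 8 cores is reachable with two executors and the app no longer waits forever.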
[jira] [Created] (SPARK-12561) Remove JobLogger
Andrew Or created SPARK-12561: - Summary: Remove JobLogger Key: SPARK-12561 URL: https://issues.apache.org/jira/browse/SPARK-12561 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 1.0.0 Reporter: Andrew Or Assignee: Andrew Or It was research code and has been deprecated since 1.0.0. No one really uses it since they can just use event logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12411) Reconsider executor heartbeats rpc timeout
[ https://issues.apache.org/jira/browse/SPARK-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12411: -- Target Version/s: 1.6.1, 2.0.0 (was: 2.0.0) > Reconsider executor heartbeats rpc timeout > -- > > Key: SPARK-12411 > URL: https://issues.apache.org/jira/browse/SPARK-12411 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Reporter: Nong Li >Assignee: Nong Li > Fix For: 1.6.1, 2.0.0 > > > Currently, the timeout for checking when an executor is failed is the same as > the timeout of the sender ("spark.network.timeout") which defaults to 120s. > This means if there is a network issue, the executor only gets one try to > heartbeat which probably causes the failure detection to be flaky. > The executor has a config to control how often to heartbeat > (spark.executor.heartbeatInterval) which defaults to 10s. This combination of > configs doesn't seem to make sense. The heartbeat rpc timeout should probably > be less than or equal to the heartbeatInterval. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12411) Reconsider executor heartbeats rpc timeout
[ https://issues.apache.org/jira/browse/SPARK-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12411: -- Fix Version/s: 1.6.1 > Reconsider executor heartbeats rpc timeout > -- > > Key: SPARK-12411 > URL: https://issues.apache.org/jira/browse/SPARK-12411 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Reporter: Nong Li >Assignee: Nong Li > Fix For: 1.6.1, 2.0.0 > > > Currently, the timeout for checking when an executor is failed is the same as > the timeout of the sender ("spark.network.timeout") which defaults to 120s. > This means if there is a network issue, the executor only gets one try to > heartbeat which probably causes the failure detection to be flaky. > The executor has a config to control how often to heartbeat > (spark.executor.heartbeatInterval) which defaults to 10s. This combination of > configs doesn't seem to make sense. The heartbeat rpc timeout should probably > be less than or equal to the heartbeatInterval. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
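The two defaults in question, in spark-defaults.conf syntax (values are the stated defaults, shown only to make the mismatch concrete):

```
spark.executor.heartbeatInterval   10s
spark.network.timeout              120s
```

A heartbeat rpc timeout bounded by the interval would give an executor roughly a dozen attempts per network-timeout window instead of a single blocking one.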
[jira] [Created] (SPARK-12483) Data Frame as() does not work in Java
Andrew Davidson created SPARK-12483: --- Summary: Data Frame as() does not work in Java Key: SPARK-12483 URL: https://issues.apache.org/jira/browse/SPARK-12483 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.5.2 Environment: Mac El Cap 10.11.2 Java 8 Reporter: Andrew Davidson Following unit test demonstrates a bug in as(). The column name for aliasDF was not changed @Test public void bugDataFrameAsTest() { DataFrame df = createData(); df.printSchema(); df.show(); DataFrame aliasDF = df.select("id").as("UUID"); aliasDF.printSchema(); aliasDF.show(); } DataFrame createData() { Features f1 = new Features(1, category1); Features f2 = new Features(2, category2); ArrayList<Features> data = new ArrayList<Features>(2); data.add(f1); data.add(f2); //JavaRDD<Features> rdd = javaSparkContext.parallelize(Arrays.asList(f1, f2)); JavaRDD<Features> rdd = javaSparkContext.parallelize(data); DataFrame df = sqlContext.createDataFrame(rdd, Features.class); return df; } This is the output I got (without the spark log msgs) root |-- id: integer (nullable = false) |-- labelStr: string (nullable = true) +---+------------+ | id|labelStr| +---+------------+ | 1| noise| | 2|questionable| +---+------------+ root |-- id: integer (nullable = false) +---+ | id| +---+ | 1| | 2| +---+ -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
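This is arguably behavior-as-designed rather than a bug: {{DataFrame.as(String)}} assigns an alias to the whole DataFrame (useful in joins), it does not rename a column. A sketch of the intended rename against the 1.5 Java API (not compiled here; {{df}} is the frame from the test above):

```java
import org.apache.spark.sql.DataFrame;

// Alias the Column inside select to rename it...
DataFrame renamed = df.select(df.col("id").as("UUID"));
// ...or, equivalently, rename in place:
DataFrame renamed2 = df.withColumnRenamed("id", "UUID");
```

Either form yields a schema whose single column is named UUID, which is what the test expected from {{as()}}.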
[jira] [Commented] (SPARK-12484) DataFrame withColumn() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15068696#comment-15068696 ] Andrew Davidson commented on SPARK-12484: - What I am really trying to do is rewrite the following python code in Java. Ideally I would implement this code as an MLlib Transformer, however that does not seem possible at this point in time using the Java API. Kind regards Andy def convertMultinomialLabelToBinary(dataFrame): newColName = "binomialLabel" binomial = udf(lambda labelStr: labelStr if (labelStr == "noise") else "signal", StringType()) ret = dataFrame.withColumn(newColName, binomial(dataFrame["label"])) return ret > DataFrame withColumn() does not work in Java > > > Key: SPARK-12484 > URL: https://issues.apache.org/jira/browse/SPARK-12484 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: mac El Cap. 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: UDFTest.java > > > DataFrame transformerdDF = df.withColumn(fieldName, newCol); raises > org.apache.spark.sql.AnalysisException: resolved attribute(s) _c0#2 missing > from id#0,labelStr#1 in operator !Project [id#0,labelStr#1,_c0#2 AS > transformedByUDF#3]; > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:154) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44) > at >
org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:914) > at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:132) > at > org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154) > at org.apache.spark.sql.DataFrame.select(DataFrame.scala:691) > at org.apache.spark.sql.DataFrame.withColumn(DataFrame.scala:1150) > at com.pws.fantasySport.ml.UDFTest.test(UDFTest.java:75) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
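The Python above translates to the 1.5 Java API roughly as follows (a sketch, not compiled here; {{sqlContext}} and {{df}} are as in the attached test, and the registered name "binomial" is arbitrary):

```java
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;
import static org.apache.spark.sql.functions.callUDF;

// Register the UDF under a name, then reference it with callUDF so the new
// Column is built against df's own plan. The AnalysisException above
// ("resolved attribute(s) _c0#2 missing") typically comes from passing
// withColumn a Column that was resolved against a different DataFrame.
sqlContext.udf().register("binomial", new UDF1<String, String>() {
    @Override public String call(String labelStr) {
        return "noise".equals(labelStr) ? labelStr : "signal";
    }
}, DataTypes.StringType);

DataFrame ret = df.withColumn("binomialLabel", callUDF("binomial", df.col("label")));
```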
[jira] [Updated] (SPARK-12485) Rename "dynamic allocation" to "elastic scaling"
[ https://issues.apache.org/jira/browse/SPARK-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12485: -- Target Version/s: 2.0.0 > Rename "dynamic allocation" to "elastic scaling" > > > Key: SPARK-12485 > URL: https://issues.apache.org/jira/browse/SPARK-12485 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Reporter: Andrew Or >Assignee: Andrew Or > > Fewer syllables, sounds more natural. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12483) Data Frame as() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Davidson updated SPARK-12483: Attachment: SPARK_12483_unitTest.java added a unit test file > Data Frame as() does not work in Java > - > > Key: SPARK-12483 > URL: https://issues.apache.org/jira/browse/SPARK-12483 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: Mac El Cap 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: SPARK_12483_unitTest.java > > > Following unit test demonstrates a bug in as(). The column name for aliasDF > was not changed >@Test > public void bugDataFrameAsTest() { > DataFrame df = createData(); > df.printSchema(); > df.show(); > > DataFrame aliasDF = df.select("id").as("UUID"); > aliasDF.printSchema(); > aliasDF.show(); > } > DataFrame createData() { > Features f1 = new Features(1, category1); > Features f2 = new Features(2, category2); > ArrayList data = new ArrayList(2); > data.add(f1); > data.add(f2); > //JavaRDD rdd = > javaSparkContext.parallelize(Arrays.asList(f1, f2)); > JavaRDD rdd = javaSparkContext.parallelize(data); > DataFrame df = sqlContext.createDataFrame(rdd, Features.class); > return df; > } > This is the output I got (without the spark log msgs) > root > |-- id: integer (nullable = false) > |-- labelStr: string (nullable = true) > +---++ > | id|labelStr| > +---++ > | 1| noise| > | 2|questionable| > +---++ > root > |-- id: integer (nullable = false) > +---+ > | id| > +---+ > | 1| > | 2| > +---+ -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12484) DataFrame withColumn() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Davidson updated SPARK-12484: Attachment: UDFTest.java Add a unit test file > DataFrame withColumn() does not work in Java > > > Key: SPARK-12484 > URL: https://issues.apache.org/jira/browse/SPARK-12484 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: mac El Cap. 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: UDFTest.java > > > DataFrame transformerdDF = df.withColumn(fieldName, newCol); raises > org.apache.spark.sql.AnalysisException: resolved attribute(s) _c0#2 missing > from id#0,labelStr#1 in operator !Project [id#0,labelStr#1,_c0#2 AS > transformedByUDF#3]; > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:154) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:914) > at org.apache.spark.sql.DataFrame.(DataFrame.scala:132) > at > org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154) > at org.apache.spark.sql.DataFrame.select(DataFrame.scala:691) > at org.apache.spark.sql.DataFrame.withColumn(DataFrame.scala:1150) > at com.pws.fantasySport.ml.UDFTest.test(UDFTest.java:75) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: 
issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12484) DataFrame withColumn() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15068698#comment-15068698 ] Andrew Davidson commented on SPARK-12484: - related issue https://issues.apache.org/jira/browse/SPARK-12483 > DataFrame withColumn() does not work in Java > > > Key: SPARK-12484 > URL: https://issues.apache.org/jira/browse/SPARK-12484 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: mac El Cap. 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: UDFTest.java > > > DataFrame transformerdDF = df.withColumn(fieldName, newCol); raises > org.apache.spark.sql.AnalysisException: resolved attribute(s) _c0#2 missing > from id#0,labelStr#1 in operator !Project [id#0,labelStr#1,_c0#2 AS > transformedByUDF#3]; > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:154) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:914) > at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:132) > at > org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154) > at org.apache.spark.sql.DataFrame.select(DataFrame.scala:691) > at org.apache.spark.sql.DataFrame.withColumn(DataFrame.scala:1150) > at com.pws.fantasySport.ml.UDFTest.test(UDFTest.java:75) -- This message was sent by Atlassian 
JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12483) Data Frame as() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069027#comment-15069027 ] Andrew Davidson commented on SPARK-12483: - Hi Xiao thanks for looking at the issue is there a way to change a column name? If you do a select() using a data frame, the column name is really strange, see attachment for https://issues.apache.org/jira/browse/SPARK-12484 // get column from data frame call df.withColumnName Column newCol = udfDF.col("_c0"); renaming data frame columns is very common in R Kind regards Andy > Data Frame as() does not work in Java > - > > Key: SPARK-12483 > URL: https://issues.apache.org/jira/browse/SPARK-12483 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: Mac El Cap 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: SPARK_12483_unitTest.java > > > Following unit test demonstrates a bug in as(). The column name for aliasDF > was not changed >@Test > public void bugDataFrameAsTest() { > DataFrame df = createData(); > df.printSchema(); > df.show(); > > DataFrame aliasDF = df.select("id").as("UUID"); > aliasDF.printSchema(); > aliasDF.show(); > } > DataFrame createData() { > Features f1 = new Features(1, category1); > Features f2 = new Features(2, category2); > ArrayList data = new ArrayList(2); > data.add(f1); > data.add(f2); > //JavaRDD rdd = > javaSparkContext.parallelize(Arrays.asList(f1, f2)); > JavaRDD rdd = javaSparkContext.parallelize(data); > DataFrame df = sqlContext.createDataFrame(rdd, Features.class); > return df; > } > This is the output I got (without the spark log msgs) > root > |-- id: integer (nullable = false) > |-- labelStr: string (nullable = true) > +---++ > | id|labelStr| > +---++ > | 1| noise| > | 2|questionable| > +---++ > root > |-- id: integer (nullable = false) > +---+ > | id| > +---+ > | 1| > | 2| > +---+ -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: 
issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12484) DataFrame withColumn() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069061#comment-15069061 ] Andrew Davidson commented on SPARK-12484: - Hi Xiao thanks for looking into this Andy > DataFrame withColumn() does not work in Java > > > Key: SPARK-12484 > URL: https://issues.apache.org/jira/browse/SPARK-12484 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: mac El Cap. 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: UDFTest.java > > > DataFrame transformerdDF = df.withColumn(fieldName, newCol); raises > org.apache.spark.sql.AnalysisException: resolved attribute(s) _c0#2 missing > from id#0,labelStr#1 in operator !Project [id#0,labelStr#1,_c0#2 AS > transformedByUDF#3]; > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:154) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:914) > at org.apache.spark.sql.DataFrame.(DataFrame.scala:132) > at > org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154) > at org.apache.spark.sql.DataFrame.select(DataFrame.scala:691) > at org.apache.spark.sql.DataFrame.withColumn(DataFrame.scala:1150) > at com.pws.fantasySport.ml.UDFTest.test(UDFTest.java:75) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To 
unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12483) Data Frame as() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Davidson updated SPARK-12483: Attachment: SPARK_12483_unitTest.java add a unit test file > Data Frame as() does not work in Java > - > > Key: SPARK-12483 > URL: https://issues.apache.org/jira/browse/SPARK-12483 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: Mac El Cap 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: SPARK_12483_unitTest.java > > > Following unit test demonstrates a bug in as(). The column name for aliasDF > was not changed >@Test > public void bugDataFrameAsTest() { > DataFrame df = createData(); > df.printSchema(); > df.show(); > > DataFrame aliasDF = df.select("id").as("UUID"); > aliasDF.printSchema(); > aliasDF.show(); > } > DataFrame createData() { > Features f1 = new Features(1, category1); > Features f2 = new Features(2, category2); > ArrayList data = new ArrayList(2); > data.add(f1); > data.add(f2); > //JavaRDD rdd = > javaSparkContext.parallelize(Arrays.asList(f1, f2)); > JavaRDD rdd = javaSparkContext.parallelize(data); > DataFrame df = sqlContext.createDataFrame(rdd, Features.class); > return df; > } > This is the output I got (without the spark log msgs) > root > |-- id: integer (nullable = false) > |-- labelStr: string (nullable = true) > +---++ > | id|labelStr| > +---++ > | 1| noise| > | 2|questionable| > +---++ > root > |-- id: integer (nullable = false) > +---+ > | id| > +---+ > | 1| > | 2| > +---+ -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-12484) DataFrame withColumn() does not work in Java
Andrew Davidson created SPARK-12484: --- Summary: DataFrame withColumn() does not work in Java Key: SPARK-12484 URL: https://issues.apache.org/jira/browse/SPARK-12484 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.5.2 Environment: mac El Cap. 10.11.2 Java 8 Reporter: Andrew Davidson DataFrame transformerdDF = df.withColumn(fieldName, newCol); raises org.apache.spark.sql.AnalysisException: resolved attribute(s) _c0#2 missing from id#0,labelStr#1 in operator !Project [id#0,labelStr#1,_c0#2 AS transformedByUDF#3]; at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37) at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:154) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49) at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:49) at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44) at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:914) at org.apache.spark.sql.DataFrame.(DataFrame.scala:132) at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154) at org.apache.spark.sql.DataFrame.select(DataFrame.scala:691) at org.apache.spark.sql.DataFrame.withColumn(DataFrame.scala:1150) at com.pws.fantasySport.ml.UDFTest.test(UDFTest.java:75) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12484) DataFrame withColumn() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15068684#comment-15068684 ] Andrew Davidson commented on SPARK-12484: - you can find some more background on the email thread 'should I file a bug? Re: trouble implementing Transformer and calling DataFrame.withColumn()' > DataFrame withColumn() does not work in Java > > > Key: SPARK-12484 > URL: https://issues.apache.org/jira/browse/SPARK-12484 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: mac El Cap. 10.11.2 > Java 8 >Reporter: Andrew Davidson > Attachments: UDFTest.java > > > DataFrame transformerdDF = df.withColumn(fieldName, newCol); raises > org.apache.spark.sql.AnalysisException: resolved attribute(s) _c0#2 missing > from id#0,labelStr#1 in operator !Project [id#0,labelStr#1,_c0#2 AS > transformedByUDF#3]; > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:154) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:49) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44) > at > org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:914) > at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:132) > at > org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154) > at org.apache.spark.sql.DataFrame.select(DataFrame.scala:691) > at org.apache.spark.sql.DataFrame.withColumn(DataFrame.scala:1150) > at 
com.pws.fantasySport.ml.UDFTest.test(UDFTest.java:75) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
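An "resolved attribute(s) ... missing" AnalysisException like the one above is usually raised when the Column handed to withColumn() was built against a different DataFrame than the one being transformed, so the analyzer cannot resolve it. A minimal sketch of the pattern that avoids this in the Spark 1.5.x Java API, building the new Column from the target DataFrame's own columns via callUDF — the names `myUDF`, `transformedByUDF`, and the UDF body are illustrative, not from the report, and this assumes a Spark 1.5.x dependency on the classpath:

```java
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

public class WithColumnSketch {
    // Sketch: assumes `df` has columns id, labelStr as in the reported schema.
    static DataFrame addColumn(SQLContext sqlContext, DataFrame df) {
        sqlContext.udf().register("myUDF",
                (UDF1<String, String>) s -> "udf:" + s,
                DataTypes.StringType);
        // Build the new Column from this DataFrame's own column so the
        // analyzer can resolve it; a Column carried over from another
        // DataFrame produces the "resolved attribute(s) ... missing" error.
        return df.withColumn("transformedByUDF", callUDF("myUDF", col("labelStr")));
    }
}
```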
[jira] [Issue Comment Deleted] (SPARK-12483) Data Frame as() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Davidson updated SPARK-12483: Comment: was deleted (was: add a unit test file ) > Data Frame as() does not work in Java > - > > Key: SPARK-12483 > URL: https://issues.apache.org/jira/browse/SPARK-12483 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: Mac El Cap 10.11.2 > Java 8 >Reporter: Andrew Davidson > > Following unit test demonstrates a bug in as(). The column name for aliasDF > was not changed >@Test > public void bugDataFrameAsTest() { > DataFrame df = createData(); > df.printSchema(); > df.show(); > > DataFrame aliasDF = df.select("id").as("UUID"); > aliasDF.printSchema(); > aliasDF.show(); > } > DataFrame createData() { > Features f1 = new Features(1, category1); > Features f2 = new Features(2, category2); > ArrayList data = new ArrayList(2); > data.add(f1); > data.add(f2); > //JavaRDD rdd = > javaSparkContext.parallelize(Arrays.asList(f1, f2)); > JavaRDD rdd = javaSparkContext.parallelize(data); > DataFrame df = sqlContext.createDataFrame(rdd, Features.class); > return df; > } > This is the output I got (without the spark log msgs) > root > |-- id: integer (nullable = false) > |-- labelStr: string (nullable = true) > +---++ > | id|labelStr| > +---++ > | 1| noise| > | 2|questionable| > +---++ > root > |-- id: integer (nullable = false) > +---+ > | id| > +---+ > | 1| > | 2| > +---+ -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12483) Data Frame as() does not work in Java
[ https://issues.apache.org/jira/browse/SPARK-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Davidson updated SPARK-12483: Attachment: (was: SPARK_12483_unitTest.java) > Data Frame as() does not work in Java > - > > Key: SPARK-12483 > URL: https://issues.apache.org/jira/browse/SPARK-12483 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 > Environment: Mac El Cap 10.11.2 > Java 8 >Reporter: Andrew Davidson > > Following unit test demonstrates a bug in as(). The column name for aliasDF > was not changed >@Test > public void bugDataFrameAsTest() { > DataFrame df = createData(); > df.printSchema(); > df.show(); > > DataFrame aliasDF = df.select("id").as("UUID"); > aliasDF.printSchema(); > aliasDF.show(); > } > DataFrame createData() { > Features f1 = new Features(1, category1); > Features f2 = new Features(2, category2); > ArrayList data = new ArrayList(2); > data.add(f1); > data.add(f2); > //JavaRDD rdd = > javaSparkContext.parallelize(Arrays.asList(f1, f2)); > JavaRDD rdd = javaSparkContext.parallelize(data); > DataFrame df = sqlContext.createDataFrame(rdd, Features.class); > return df; > } > This is the output I got (without the spark log msgs) > root > |-- id: integer (nullable = false) > |-- labelStr: string (nullable = true) > +---++ > | id|labelStr| > +---++ > | 1| noise| > | 2|questionable| > +---++ > root > |-- id: integer (nullable = false) > +---+ > | id| > +---+ > | 1| > | 2| > +---+ -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
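The behavior in this report is arguably as designed: DataFrame.as() assigns an alias to the whole frame (useful for disambiguating self-joins), it does not rename the selected column, which is why aliasDF's schema still shows `id`. A hedged sketch of the column rename the test appears to expect, using Column.as inside select — this assumes a Spark 1.5.x dependency and an existing DataFrame `df` with an integer `id` column:

```java
import org.apache.spark.sql.DataFrame;

public class AliasSketch {
    // Column.as renames the selected column; DataFrame.as only aliases the
    // frame itself and leaves column names untouched.
    static DataFrame renameId(DataFrame df) {
        return df.select(df.col("id").as("UUID"));
    }
}
```

Equivalently, `df.withColumnRenamed("id", "UUID")` renames a column in place without a select.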
[jira] [Created] (SPARK-12485) Rename "dynamic allocation" to "elastic scaling"
Andrew Or created SPARK-12485: - Summary: Rename "dynamic allocation" to "elastic scaling" Key: SPARK-12485 URL: https://issues.apache.org/jira/browse/SPARK-12485 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Andrew Or Assignee: Andrew Or Fewer syllables, sounds more natural. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-12414) Remove closure serializer
[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reassigned SPARK-12414: - Assignee: Andrew Or > Remove closure serializer > - > > Key: SPARK-12414 > URL: https://issues.apache.org/jira/browse/SPARK-12414 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 1.0.0 >Reporter: Andrew Or >Assignee: Andrew Or > > There is a config `spark.closure.serializer` that accepts exactly one value: > the java serializer. This is because there are currently bugs in the Kryo > serializer that make it not a viable candidate. This was uncovered by an > unsuccessful attempt to make it work: SPARK-7708. > My high level point is that the Java serializer has worked well for at least > 6 Spark versions now, and it is an incredibly complicated task to get other > serializers (not just Kryo) to work with Spark's closures. IMO the effort is > not worth it and we should just remove this documentation and all the code > associated with it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
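The "closure serializer" discussed above is what ships the body of a map/filter function, together with its captured variables, from the driver to executors. One reason plain Java serialization has held up for this job is that JVM closures can simply be made Serializable, captured state and all. A plain-Java sketch (no Spark dependency) of round-tripping a state-capturing closure through Java serialization, which is essentially what the Java closure serializer relies on:

```java
import java.io.*;
import java.util.function.Function;

public class ClosureSerDemo {
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static <T> T deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        int offset = 42; // captured state travels inside the serialized closure
        // The intersection cast makes the lambda java.io.Serializable.
        Function<Integer, Integer> f =
                (Function<Integer, Integer> & Serializable) x -> x + offset;
        Function<Integer, Integer> g = deserialize(serialize(f));
        System.out.println(g.apply(1)); // prints 43
    }
}
```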
[jira] [Commented] (SPARK-12464) Remove spark.deploy.mesos.zookeeper.url and use spark.deploy.zookeeper.url
[ https://issues.apache.org/jira/browse/SPARK-12464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067062#comment-15067062 ] Andrew Or commented on SPARK-12464: --- By the way for future reference you probably don't need a separate issue for each config. Just have an issue that says `Remove spark.deploy.mesos.* and use spark.deploy.* instead`. Since you already opened these we can just keep them. > Remove spark.deploy.mesos.zookeeper.url and use spark.deploy.zookeeper.url > -- > > Key: SPARK-12464 > URL: https://issues.apache.org/jira/browse/SPARK-12464 > Project: Spark > Issue Type: Task > Components: Mesos >Reporter: Timothy Chen >Assignee: Apache Spark > > Remove spark.deploy.mesos.zookeeper.url and use existing configuration > spark.deploy.zookeeper.url for Mesos cluster mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-12466) Harmless Master NPE in tests
Andrew Or created SPARK-12466: - Summary: Harmless Master NPE in tests Key: SPARK-12466 URL: https://issues.apache.org/jira/browse/SPARK-12466 Project: Spark Issue Type: Bug Components: Deploy, Tests Affects Versions: 1.6.0 Reporter: Andrew Or Assignee: Andrew Or Fix For: 1.6.1, 2.0.0 {code} [info] ReplayListenerSuite: [info] - Simple replay (58 milliseconds) java.lang.NullPointerException at org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:982) at org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:980) at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:117) at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:115) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293) at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:133) at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40) at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248) at scala.concurrent.Promise$class.complete(Promise.scala:55) at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:153) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:23) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) [info] - End-to-end replay (10 seconds, 755 milliseconds) {code} https://amplab.cs.berkeley.edu/jenkins/view/Spark-QA-Test/job/Spark-Master-SBT/4316/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=spark-test/consoleFull caused by https://github.com/apache/spark/pull/10284 Thanks to [~ted_yu] for reporting. 
[jira] [Resolved] (SPARK-5882) Add a test for GraphLoader.edgeListFile
[ https://issues.apache.org/jira/browse/SPARK-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-5882. -- Resolution: Fixed Assignee: Takeshi Yamamuro Fix Version/s: 2.0.0 Target Version/s: 2.0.0 > Add a test for GraphLoader.edgeListFile > --- > > Key: SPARK-5882 > URL: https://issues.apache.org/jira/browse/SPARK-5882 > Project: Spark > Issue Type: Test > Components: GraphX >Reporter: Takeshi Yamamuro >Assignee: Takeshi Yamamuro >Priority: Trivial > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12414) Remove closure serializer
[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12414: -- Issue Type: Sub-task (was: Bug) Parent: SPARK-11806 > Remove closure serializer > - > > Key: SPARK-12414 > URL: https://issues.apache.org/jira/browse/SPARK-12414 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 1.0.0 >Reporter: Andrew Or > > There is a config `spark.closure.serializer` that accepts exactly one value: > the java serializer. This is because there are currently bugs in the Kryo > serializer that make it not a viable candidate. This was uncovered by an > unsuccessful attempt to make it work: SPARK-7708. > My high level point is that the Java serializer has worked well for at least > 6 Spark versions now, and it is an incredibly complicated task to get other > serializers (not just Kryo) to work with Spark's closures. IMO the effort is > not worth it and we should just remove this documentation and all the code > associated with it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12473) Reuse serializer instances for performance
[ https://issues.apache.org/jira/browse/SPARK-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12473: -- Description: After commit de02782 of page rank regressed from 242s to 260s, about 7%. Although currently it's only 7%, we will likely register more classes in the future so this will only increase. The commit added 26 types to register every time we create a Kryo serializer instance. I ran a small microbenchmark to prove that this is noticeably expensive: {code} import org.apache.spark.serializer._ import org.apache.spark.SparkConf def makeMany(num: Int): Long = { val start = System.currentTimeMillis (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } System.currentTimeMillis - start } // before commit de02782, averaged over multiple runs makeMany(5000) == 1500 // after commit de02782, averaged over multiple runs makeMany(5000) == 2750 {code} Since we create multiple serializer instances per partition, this means a 5000-partition stage will unconditionally see an increase of > 1s for the stage. In page rank, we may run many such stages. We should explore the alternative of reusing thread-local serializer instances, which would lead to much fewer calls to `kryo.register`. was: After commit de02782 of page rank regressed from 242s to 260s, about 7%. Although currently it's only 7%, we will likely register more classes in the future so we should do this the right way. The commit added 26 types to register every time we create a Kryo serializer instance. 
I ran a small microbenchmark to prove that this is noticeably expensive: {code} import org.apache.spark.serializer._ import org.apache.spark.SparkConf def makeMany(num: Int): Long = { val start = System.currentTimeMillis (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } System.currentTimeMillis - start } // before commit de02782, averaged over multiple runs makeMany(5000) == 1500 // after commit de02782, averaged over multiple runs makeMany(5000) == 2750 {code} Since we create multiple serializer instances per partition, this means a 5000-partition stage will unconditionally see an increase of > 1s for the stage. In page rank, we may run many such stages. We should explore the alternative of reusing thread-local serializer instances, which would lead to much fewer calls to `kryo.register`. > Reuse serializer instances for performance > -- > > Key: SPARK-12473 > URL: https://issues.apache.org/jira/browse/SPARK-12473 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Andrew Or >Assignee: Andrew Or > > After commit de02782 of page rank regressed from 242s to 260s, about 7%. > Although currently it's only 7%, we will likely register more classes in the > future so this will only increase. > The commit added 26 types to register every time we create a Kryo serializer > instance. 
I ran a small microbenchmark to prove that this is noticeably > expensive: > {code} > import org.apache.spark.serializer._ > import org.apache.spark.SparkConf > def makeMany(num: Int): Long = { > val start = System.currentTimeMillis > (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } > System.currentTimeMillis - start > } > // before commit de02782, averaged over multiple runs > makeMany(5000) == 1500 > // after commit de02782, averaged over multiple runs > makeMany(5000) == 2750 > {code} > Since we create multiple serializer instances per partition, this means a > 5000-partition stage will unconditionally see an increase of > 1s for the > stage. In page rank, we may run many such stages. > We should explore the alternative of reusing thread-local serializer > instances, which would lead to much fewer calls to `kryo.register`. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-12339) NullPointerException on stage kill from web UI
[ https://issues.apache.org/jira/browse/SPARK-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067154#comment-15067154 ] Andrew Or commented on SPARK-12339: --- I've updated the affected version to 2.0 since SPARK-11206 was merged only there. Please let me know if this is not the case. > NullPointerException on stage kill from web UI > -- > > Key: SPARK-12339 > URL: https://issues.apache.org/jira/browse/SPARK-12339 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.0.0 >Reporter: Jacek Laskowski >Assignee: Alex Bozarth > Fix For: 2.0.0 > > > The following message is in the logs after killing a stage: > {code} > scala> INFO Executor: Executor killed task 1.0 in stage 7.0 (TID 33) > INFO Executor: Executor killed task 0.0 in stage 7.0 (TID 32) > WARN TaskSetManager: Lost task 1.0 in stage 7.0 (TID 33, localhost): > TaskKilled (killed intentionally) > WARN TaskSetManager: Lost task 0.0 in stage 7.0 (TID 32, localhost): > TaskKilled (killed intentionally) > INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, > from pool > ERROR LiveListenerBus: Listener SQLListener threw an exception > java.lang.NullPointerException > at > org.apache.spark.sql.execution.ui.SQLListener.onTaskEnd(SQLListener.scala:167) > at > org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:42) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55) > at > org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80) > at > 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64) > at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1169) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63) > ERROR LiveListenerBus: Listener SQLListener threw an exception > java.lang.NullPointerException > at > org.apache.spark.sql.execution.ui.SQLListener.onTaskEnd(SQLListener.scala:167) > at > org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:42) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55) > at > org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64) > at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1169) > at > 
org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63) > {code} > To reproduce, start a job and kill the stage from web UI, e.g.: > {code} > val rdd = sc.parallelize(0 to 9, 2) > rdd.mapPartitionsWithIndex { case (n, it) => Thread.sleep(10 * 1000); it > }.count > {code} > Go to web UI and in Stages tab click "kill" for the stage. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12339) NullPointerException on stage kill from web UI
[ https://issues.apache.org/jira/browse/SPARK-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12339: -- Affects Version/s: (was: 1.6.0) 2.0.0 > NullPointerException on stage kill from web UI > -- > > Key: SPARK-12339 > URL: https://issues.apache.org/jira/browse/SPARK-12339 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.0.0 >Reporter: Jacek Laskowski >Assignee: Alex Bozarth > Fix For: 2.0.0 > > > The following message is in the logs after killing a stage: > {code} > scala> INFO Executor: Executor killed task 1.0 in stage 7.0 (TID 33) > INFO Executor: Executor killed task 0.0 in stage 7.0 (TID 32) > WARN TaskSetManager: Lost task 1.0 in stage 7.0 (TID 33, localhost): > TaskKilled (killed intentionally) > WARN TaskSetManager: Lost task 0.0 in stage 7.0 (TID 32, localhost): > TaskKilled (killed intentionally) > INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, > from pool > ERROR LiveListenerBus: Listener SQLListener threw an exception > java.lang.NullPointerException > at > org.apache.spark.sql.execution.ui.SQLListener.onTaskEnd(SQLListener.scala:167) > at > org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:42) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55) > at > org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at > 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64) > at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1169) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63) > ERROR LiveListenerBus: Listener SQLListener threw an exception > java.lang.NullPointerException > at > org.apache.spark.sql.execution.ui.SQLListener.onTaskEnd(SQLListener.scala:167) > at > org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:42) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55) > at > org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64) > at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1169) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63) > {code} > To reproduce, start a job and kill the stage from web UI, e.g.: > 
{code} > val rdd = sc.parallelize(0 to 9, 2) > rdd.mapPartitionsWithIndex { case (n, it) => Thread.sleep(10 * 1000); it > }.count > {code} > Go to web UI and in Stages tab click "kill" for the stage. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-12392) Optimize a location order of broadcast blocks by considering preferred local hosts
[ https://issues.apache.org/jira/browse/SPARK-12392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12392. --- Resolution: Fixed Assignee: Takeshi Yamamuro Fix Version/s: 2.0.0 Target Version/s: 2.0.0 > Optimize a location order of broadcast blocks by considering preferred local > hosts > -- > > Key: SPARK-12392 > URL: https://issues.apache.org/jira/browse/SPARK-12392 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.5.2 >Reporter: Takeshi Yamamuro >Assignee: Takeshi Yamamuro > Fix For: 2.0.0 > > > When multiple workers exist in a host, we can bypass unnecessary remote > access for broadcasts; block managers fetch broadcast blocks from the same > host instead of remote hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
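The idea in SPARK-12392 is a locality-aware ordering of candidate block locations: when several block managers hold a broadcast block, ones on the same host should be tried before remote ones. A hypothetical plain-Java sketch of that ordering (the `host:port` string representation is an assumption for illustration, not Spark's actual BlockManagerId type):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PreferLocalHost {
    // Stably move locations whose host matches the local host to the front,
    // so a block fetch tries same-host block managers before remote ones.
    static List<String> sortByLocality(List<String> hostPorts, String localHost) {
        List<String> local = new ArrayList<>();
        List<String> remote = new ArrayList<>();
        for (String hp : hostPorts) {
            if (hp.split(":")[0].equals(localHost)) {
                local.add(hp);
            } else {
                remote.add(hp);
            }
        }
        local.addAll(remote);
        return local;
    }

    public static void main(String[] args) {
        List<String> locs = Arrays.asList("hostB:7001", "hostA:7001", "hostA:7002");
        // prints [hostA:7001, hostA:7002, hostB:7001]
        System.out.println(sortByLocality(locs, "hostA"));
    }
}
```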
[jira] [Updated] (SPARK-12473) Reuse serializer instances for performance
[ https://issues.apache.org/jira/browse/SPARK-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12473: -- Description: After commit de02782 of page rank regressed from 242s to 260s, about 7%. Although currently it's only 7%, we will likely register more classes in the future so we should do this the right way. The commit added 26 types to register every time we create a Kryo serializer instance. I ran a small microbenchmark to prove that this is noticeably expensive: {code} import org.apache.spark.serializer._ import org.apache.spark.SparkConf def makeMany(num: Int): Long = { val start = System.currentTimeMillis (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } System.currentTimeMillis - start } // before commit de02782, averaged over multiple runs makeMany(5000) == 1500 // after commit de02782, averaged over multiple runs makeMany(5000) == 2750 {code} Since we create multiple serializer instances per partition, this means a 5000-partition stage will unconditionally see an increase of > 1s for the stage. In page rank, we may run many such stages. We should explore the alternative of reusing thread-local serializer instances, which would lead to much fewer calls to `kryo.register`. was: After commit de02782 of page rank regressed from 242s to 260s, about 7%. The commit added 26 types to register every time we create a Kryo serializer instance. 
I ran a small microbenchmark to prove that this is noticeably expensive: {code} import org.apache.spark.serializer._ import org.apache.spark.SparkConf def makeMany(num: Int): Long = { val start = System.currentTimeMillis (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } System.currentTimeMillis - start } // before commit de02782, averaged over multiple runs makeMany(5000) == 1500 // after commit de02782, averaged over multiple runs makeMany(5000) == 2750 {code} Since we create multiple serializer instances per partition, this means a 5000-partition stage will unconditionally see an increase of > 1s for the stage. In page rank, we may run many such stages. We should explore the alternative of reusing thread-local serializer instances, which would lead to much fewer calls to `kryo.register`. > Reuse serializer instances for performance > -- > > Key: SPARK-12473 > URL: https://issues.apache.org/jira/browse/SPARK-12473 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Andrew Or >Assignee: Andrew Or > > After commit de02782 of page rank regressed from 242s to 260s, about 7%. > Although currently it's only 7%, we will likely register more classes in the > future so we should do this the right way. > The commit added 26 types to register every time we create a Kryo serializer > instance. 
I ran a small microbenchmark to prove that this is noticeably > expensive: > {code} > import org.apache.spark.serializer._ > import org.apache.spark.SparkConf > def makeMany(num: Int): Long = { > val start = System.currentTimeMillis > (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } > System.currentTimeMillis - start > } > // before commit de02782, averaged over multiple runs > makeMany(5000) == 1500 > // after commit de02782, averaged over multiple runs > makeMany(5000) == 2750 > {code} > Since we create multiple serializer instances per partition, this means a > 5000-partition stage will unconditionally see an increase of > 1s for the > stage. In page rank, we may run many such stages. > We should explore the alternative of reusing thread-local serializer > instances, which would lead to much fewer calls to `kryo.register`. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
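The thread-local reuse proposed above can be sketched in plain Java (no Spark dependency); `CostlySerializer` is a stand-in for KryoSerializer.newKryo(), whose per-call class registration is what the microbenchmark measures. Each thread pays the construction cost once and every subsequent task on that thread reuses the cached instance:

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

public class SerializerPool {
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    // Stand-in for an expensive-to-build serializer instance.
    static class CostlySerializer {
        CostlySerializer() { CONSTRUCTED.incrementAndGet(); }
        byte[] serialize(String s) { return s.getBytes(StandardCharsets.UTF_8); }
    }

    // The proposed fix: one instance per thread, reused across partitions,
    // instead of a fresh instance (and re-registration) per partition.
    static final ThreadLocal<CostlySerializer> CACHED =
            ThreadLocal.withInitial(CostlySerializer::new);

    public static void main(String[] args) {
        for (int i = 0; i < 5000; i++) {
            CACHED.get().serialize("record-" + i); // same instance every time
        }
        System.out.println(CONSTRUCTED.get()); // prints 1 on a single thread
    }
}
```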
[jira] [Resolved] (SPARK-12466) Harmless Master NPE in tests
[ https://issues.apache.org/jira/browse/SPARK-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12466. --- Resolution: Fixed > Harmless Master NPE in tests > > > Key: SPARK-12466 > URL: https://issues.apache.org/jira/browse/SPARK-12466 > Project: Spark > Issue Type: Bug > Components: Deploy, Tests >Affects Versions: 1.6.0 >Reporter: Andrew Or >Assignee: Andrew Or > Fix For: 1.6.1, 2.0.0 > > > {code} > [info] ReplayListenerSuite: > [info] - Simple replay (58 milliseconds) > java.lang.NullPointerException > at > org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:982) > at > org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:980) > at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:117) > at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:115) > at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) > at > com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293) > at > scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:133) > at > scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40) > at > scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248) > at scala.concurrent.Promise$class.complete(Promise.scala:55) > at > scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:153) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:23) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > [info] - End-to-end replay (10 seconds, 755 milliseconds) > {code} > 
https://amplab.cs.berkeley.edu/jenkins/view/Spark-QA-Test/job/Spark-Master-SBT/4316/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=spark-test/consoleFull > caused by https://github.com/apache/spark/pull/10284 > Thanks to [~ted_yu] for reporting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2331) SparkContext.emptyRDD should return RDD[T] not EmptyRDD[T]
[ https://issues.apache.org/jira/browse/SPARK-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-2331. -- Resolution: Fixed Fix Version/s: 2.0.0 > SparkContext.emptyRDD should return RDD[T] not EmptyRDD[T] > -- > > Key: SPARK-2331 > URL: https://issues.apache.org/jira/browse/SPARK-2331 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 1.0.0 >Reporter: Ian Hummel >Assignee: Reynold Xin >Priority: Minor > Fix For: 2.0.0 > > > The return type for SparkContext.emptyRDD is EmptyRDD[T]. > It should be RDD[T]. That means you have to add extra type annotations on > code like the below (which creates a union of RDDs over some subset of paths > in a folder) > {code} > val rdds = Seq("a", "b", "c").foldLeft[RDD[String]](sc.emptyRDD[String]) { > (rdd, path) ⇒ > rdd.union(sc.textFile(path)) > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-12339) NullPointerException on stage kill from web UI
[ https://issues.apache.org/jira/browse/SPARK-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12339. --- Resolution: Fixed Assignee: Alex Bozarth (was: Apache Spark) Fix Version/s: 2.0.0 Target Version/s: 2.0.0 > NullPointerException on stage kill from web UI > -- > > Key: SPARK-12339 > URL: https://issues.apache.org/jira/browse/SPARK-12339 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 1.6.0 >Reporter: Jacek Laskowski >Assignee: Alex Bozarth > Fix For: 2.0.0 > > > The following message is in the logs after killing a stage: > {code} > scala> INFO Executor: Executor killed task 1.0 in stage 7.0 (TID 33) > INFO Executor: Executor killed task 0.0 in stage 7.0 (TID 32) > WARN TaskSetManager: Lost task 1.0 in stage 7.0 (TID 33, localhost): > TaskKilled (killed intentionally) > WARN TaskSetManager: Lost task 0.0 in stage 7.0 (TID 32, localhost): > TaskKilled (killed intentionally) > INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, > from pool > ERROR LiveListenerBus: Listener SQLListener threw an exception > java.lang.NullPointerException > at > org.apache.spark.sql.execution.ui.SQLListener.onTaskEnd(SQLListener.scala:167) > at > org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:42) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55) > at > org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at > 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64) > at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1169) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63) > ERROR LiveListenerBus: Listener SQLListener threw an exception > java.lang.NullPointerException > at > org.apache.spark.sql.execution.ui.SQLListener.onTaskEnd(SQLListener.scala:167) > at > org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:42) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) > at > org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55) > at > org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64) > at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1169) > at > org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63) > {code} > To reproduce, start a job and kill the stage from web UI, e.g.: > 
{code} > val rdd = sc.parallelize(0 to 9, 2) > rdd.mapPartitionsWithIndex { case (n, it) => Thread.sleep(10 * 1000); it > }.count > {code} > Go to web UI and in Stages tab click "kill" for the stage.
[jira] [Updated] (SPARK-12473) Reuse serializer instances for performance
[ https://issues.apache.org/jira/browse/SPARK-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12473: -- Description: After commit de02782, the performance of page rank regressed from 242s to 260s, about 7%. The commit added 26 types to register every time we create a Kryo serializer instance. I ran a small microbenchmark to prove that this is noticeably expensive: {code} import org.apache.spark.serializer._ import org.apache.spark.SparkConf def makeMany(num: Int): Long = { val start = System.currentTimeMillis (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } System.currentTimeMillis - start } // before commit de02782, averaged over multiple runs makeMany(5000) == 1500 // after commit de02782, averaged over multiple runs makeMany(5000) == 2750 {code} Since we create multiple serializer instances per partition, this means a 5000-partition stage will unconditionally see an increase of > 1s for the stage. In page rank, we may run many such stages. We should explore the alternative of reusing thread-local serializer instances, which would lead to far fewer calls to `kryo.register`. was: After commit de02782, the performance of page rank regressed from 242s to 260s, about 7%. The commit added 26 types to register every time we create a Kryo serializer instance. I ran a small microbenchmark to prove that this is noticeably expensive: {code} import org.apache.spark.serializer._ import org.apache.spark.SparkConf def makeMany(num: Int): Long = { val start = System.currentTimeMillis (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } System.currentTimeMillis - start } // before commit de02782, averaged over multiple runs makeMany(5000) == 1500 // after commit de02782, averaged over multiple runs makeMany(5000) == 2750 {code} Since we create multiple serializer instances per partition, this means a 5000-partition stage will unconditionally see an increase of > 1s for the stage.
In page rank, we may run many such stages. > Reuse serializer instances for performance > -- > > Key: SPARK-12473 > URL: https://issues.apache.org/jira/browse/SPARK-12473 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Andrew Or >Assignee: Andrew Or > > After commit de02782, the performance of page rank regressed from 242s to 260s, about 7%. > The commit added 26 types to register every time we create a Kryo serializer > instance. I ran a small microbenchmark to prove that this is noticeably > expensive: > {code} > import org.apache.spark.serializer._ > import org.apache.spark.SparkConf > def makeMany(num: Int): Long = { > val start = System.currentTimeMillis > (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } > System.currentTimeMillis - start > } > // before commit de02782, averaged over multiple runs > makeMany(5000) == 1500 > // after commit de02782, averaged over multiple runs > makeMany(5000) == 2750 > {code} > Since we create multiple serializer instances per partition, this means a > 5000-partition stage will unconditionally see an increase of > 1s for the > stage. In page rank, we may run many such stages. > We should explore the alternative of reusing thread-local serializer > instances, which would lead to far fewer calls to `kryo.register`.
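The thread-local reuse the issue proposes can be sketched as follows. This is a hypothetical illustration, not Spark's eventual implementation: {{CostlySerializer}} stands in for {{KryoSerializer}}, and its constructor counter models the 26 per-instance {{kryo.register}} calls the benchmark measures.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch of the proposed fix: keep one serializer per thread so
// the expensive setup (standing in for the kryo.register calls) runs once
// per thread rather than once per use. Names are illustrative, not Spark's.
object SerializerCache {
  val constructions = new AtomicInteger(0)

  final class CostlySerializer {
    constructions.incrementAndGet()  // models the costly registration work
    def serialize(x: Any): Array[Byte] = x.toString.getBytes("UTF-8")
  }

  private val perThread = new ThreadLocal[CostlySerializer] {
    override def initialValue(): CostlySerializer = new CostlySerializer
  }

  def get(): CostlySerializer = perThread.get()
}

object CacheDemo {
  def main(args: Array[String]): Unit = {
    // 5000 uses on one thread pay the construction cost exactly once,
    // instead of 5000 times with a fresh instance per use.
    (1 to 5000).foreach { i => SerializerCache.get().serialize(i) }
    println(SerializerCache.constructions.get())
  }
}
```

The trade-off is that the cached instance must never be shared across threads (Kryo instances are not thread-safe) and must hold no per-task state, which is exactly what the thread-local confinement provides.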
[jira] [Created] (SPARK-12473) Reuse serializer instances for performance
Andrew Or created SPARK-12473: - Summary: Reuse serializer instances for performance Key: SPARK-12473 URL: https://issues.apache.org/jira/browse/SPARK-12473 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.6.0 Reporter: Andrew Or Assignee: Andrew Or After commit de02782, the performance of page rank regressed from 242s to 260s, about 7%. The commit added 26 types to register every time we create a Kryo serializer instance. I ran a small microbenchmark to prove that this is noticeably expensive: {code} import org.apache.spark.serializer._ import org.apache.spark.SparkConf def makeMany(num: Int): Long = { val start = System.currentTimeMillis (1 to num).foreach { _ => new KryoSerializer(new SparkConf).newKryo() } System.currentTimeMillis - start } // before commit de02782, averaged over multiple runs makeMany(5000) == 1500 // after commit de02782, averaged over multiple runs makeMany(5000) == 2750 {code} Since we create multiple serializer instances per partition, this means a 5000-partition stage will unconditionally see an increase of > 1s for the stage. In page rank, we may run many such stages.