[jira] [Assigned] (SPARK-13844) Generate better code for filters with a non-nullable column
[ https://issues.apache.org/jira/browse/SPARK-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13844: Assignee: (was: Apache Spark) > Generate better code for filters with a non-nullable column > --- > > Key: SPARK-13844 > URL: https://issues.apache.org/jira/browse/SPARK-13844 > Project: Spark > Issue Type: Improvement > Components: SQL > Reporter: Kazuaki Ishizaki > > We can generate smaller code for > # and/or with a non-nullable column > # divide/remainder with non-zero values > # whole code generation with non-nullable columns -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
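Background for readers following SPARK-13844: a minimal Scala sketch of why nullability matters for and/or code generation. This is only an illustration of the idea, not Spark's actual generated code; both function names are hypothetical.
{code}
// Three-valued AND over nullable inputs: the code must carry a null flag
// and branch on it (illustration only, not Spark codegen output).
def nullableAnd(a: java.lang.Boolean, b: java.lang.Boolean): java.lang.Boolean = {
  if (a != null && !a.booleanValue) java.lang.Boolean.FALSE        // false AND x == false
  else if (b != null && !b.booleanValue) java.lang.Boolean.FALSE   // x AND false == false
  else if (a != null && b != null) java.lang.Boolean.valueOf(a.booleanValue && b.booleanValue)
  else null                                                        // result unknown (null)
}

// With both columns statically non-nullable, all null handling disappears:
def nonNullableAnd(a: Boolean, b: Boolean): Boolean = a && b
{code}
The same reasoning applies to divide/remainder with provably non-zero divisors: the divide-by-zero branch can be dropped from the generated code.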
[jira] [Commented] (SPARK-13844) Generate better code for filters with a non-nullable column
[ https://issues.apache.org/jira/browse/SPARK-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192223#comment-15192223 ] Apache Spark commented on SPARK-13844: -- User 'kiszk' has created a pull request for this issue: https://github.com/apache/spark/pull/11684 > Generate better code for filters with a non-nullable column > --- > > Key: SPARK-13844 > URL: https://issues.apache.org/jira/browse/SPARK-13844 > Project: Spark > Issue Type: Improvement > Components: SQL > Reporter: Kazuaki Ishizaki > > We can generate smaller code for > # and/or with a non-nullable column > # divide/remainder with non-zero values > # whole code generation with non-nullable columns -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-13844) Generate better code for filters with a non-nullable column
[ https://issues.apache.org/jira/browse/SPARK-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13844: Assignee: Apache Spark > Generate better code for filters with a non-nullable column > --- > > Key: SPARK-13844 > URL: https://issues.apache.org/jira/browse/SPARK-13844 > Project: Spark > Issue Type: Improvement > Components: SQL > Reporter: Kazuaki Ishizaki > Assignee: Apache Spark > > We can generate smaller code for > # and/or with a non-nullable column > # divide/remainder with non-zero values > # whole code generation with non-nullable columns -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-13845) Driver OOM after few days when running streaming
jeanlyn created SPARK-13845: --- Summary: Driver OOM after few days when running streaming Key: SPARK-13845 URL: https://issues.apache.org/jira/browse/SPARK-13845 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.6.1, 1.5.2 Reporter: jeanlyn We have a streaming job using *FlumePollInputStream* whose driver always OOMs after a few days. Here is part of a driver heap histogram taken before the OOM:
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:      13845916      553836640  org.apache.spark.storage.BlockStatus
   2:      14020324      336487776  org.apache.spark.storage.StreamBlockId
   3:      13883881      333213144  scala.collection.mutable.DefaultEntry
   4:          8907       89043952  [Lscala.collection.mutable.HashEntry;
   5:         62360       65107352  [B
   6:        163368       24453904  [Ljava.lang.Object;
   7:        293651       20342664  [C
 ...
{noformat}
*BlockStatus* and *StreamBlockId* keep growing, and the driver eventually OOMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
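A hedged way to watch this growth from the driver before it OOMs. The sketch below assumes Spark 1.x, where {{SparkContext.getExecutorStorageStatus}} is available as a DeveloperApi, and assumes {{ssc}} is the application's running StreamingContext:
{code}
// Periodically count how many stream blocks the driver still tracks, to
// confirm the leak while the job runs. `ssc` is assumed to be the
// application's StreamingContext; getExecutorStorageStatus is Spark 1.x API.
import org.apache.spark.storage.StreamBlockId

val trackedStreamBlocks = ssc.sparkContext.getExecutorStorageStatus
  .iterator
  .flatMap(_.blocks.keysIterator)
  .count(_.isInstanceOf[StreamBlockId])
println(s"driver currently tracks $trackedStreamBlocks stream blocks")
{code}
A heap histogram like the one in the report is typically captured with {{jmap -histo <driver-pid>}}.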
[jira] [Assigned] (SPARK-13845) Driver OOM after few days when running streaming
[ https://issues.apache.org/jira/browse/SPARK-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13845: Assignee: (was: Apache Spark) > Driver OOM after few days when running streaming > > > Key: SPARK-13845 > URL: https://issues.apache.org/jira/browse/SPARK-13845 > Project: Spark > Issue Type: Bug > Components: Spark Core > Affects Versions: 1.5.2, 1.6.1 > Reporter: jeanlyn > > We have a streaming job using *FlumePollInputStream* whose driver always OOMs after a few days. Here is part of a driver heap histogram taken before the OOM: > {noformat} > num #instances #bytes class name > -- > 1: 13845916 553836640 org.apache.spark.storage.BlockStatus > 2: 14020324 336487776 org.apache.spark.storage.StreamBlockId > 3: 13883881 333213144 scala.collection.mutable.DefaultEntry > 4: 8907 89043952 [Lscala.collection.mutable.HashEntry; > 5: 62360 65107352 [B > 6: 163368 24453904 [Ljava.lang.Object; > 7: 293651 20342664 [C > ... > {noformat} > *BlockStatus* and *StreamBlockId* keep growing, and the driver eventually OOMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13845) Driver OOM after few days when running streaming
[ https://issues.apache.org/jira/browse/SPARK-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192227#comment-15192227 ] Apache Spark commented on SPARK-13845: -- User 'jeanlyn' has created a pull request for this issue: https://github.com/apache/spark/pull/11679 > Driver OOM after few days when running streaming > > > Key: SPARK-13845 > URL: https://issues.apache.org/jira/browse/SPARK-13845 > Project: Spark > Issue Type: Bug > Components: Spark Core > Affects Versions: 1.5.2, 1.6.1 > Reporter: jeanlyn > > We have a streaming job using *FlumePollInputStream* whose driver always OOMs after a few days. Here is part of a driver heap histogram taken before the OOM: > {noformat} > num #instances #bytes class name > -- > 1: 13845916 553836640 org.apache.spark.storage.BlockStatus > 2: 14020324 336487776 org.apache.spark.storage.StreamBlockId > 3: 13883881 333213144 scala.collection.mutable.DefaultEntry > 4: 8907 89043952 [Lscala.collection.mutable.HashEntry; > 5: 62360 65107352 [B > 6: 163368 24453904 [Ljava.lang.Object; > 7: 293651 20342664 [C > ... > {noformat} > *BlockStatus* and *StreamBlockId* keep growing, and the driver eventually OOMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-13845) Driver OOM after few days when running streaming
[ https://issues.apache.org/jira/browse/SPARK-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13845: Assignee: Apache Spark > Driver OOM after few days when running streaming > > > Key: SPARK-13845 > URL: https://issues.apache.org/jira/browse/SPARK-13845 > Project: Spark > Issue Type: Bug > Components: Spark Core > Affects Versions: 1.5.2, 1.6.1 > Reporter: jeanlyn > Assignee: Apache Spark > > We have a streaming job using *FlumePollInputStream* whose driver always OOMs after a few days. Here is part of a driver heap histogram taken before the OOM: > {noformat} > num #instances #bytes class name > -- > 1: 13845916 553836640 org.apache.spark.storage.BlockStatus > 2: 14020324 336487776 org.apache.spark.storage.StreamBlockId > 3: 13883881 333213144 scala.collection.mutable.DefaultEntry > 4: 8907 89043952 [Lscala.collection.mutable.HashEntry; > 5: 62360 65107352 [B > 6: 163368 24453904 [Ljava.lang.Object; > 7: 293651 20342664 [C > ... > {noformat} > *BlockStatus* and *StreamBlockId* keep growing, and the driver eventually OOMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-10775) add search keywords in history page ui
[ https://issues.apache.org/jira/browse/SPARK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-10775. Resolution: Duplicate Fix Version/s: 2.0.0 This issue is resolved by SPARK-10873. > add search keywords in history page ui > -- > > Key: SPARK-10775 > URL: https://issues.apache.org/jira/browse/SPARK-10775 > Project: Spark > Issue Type: Improvement > Components: Web UI > Reporter: Lianhui Wang > Priority: Minor > Fix For: 2.0.0 > > > Add a search button to the history page UI so that we can search applications by keyword, for example by date or time. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13845) Driver OOM after few days when running streaming
[ https://issues.apache.org/jira/browse/SPARK-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192259#comment-15192259 ] Sean Owen commented on SPARK-13845: --- [~jeanlyn] You didn't write a description, and the title does not reflect the cause you identified. Please update it, and read https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark > Driver OOM after few days when running streaming > > > Key: SPARK-13845 > URL: https://issues.apache.org/jira/browse/SPARK-13845 > Project: Spark > Issue Type: Bug > Components: Spark Core > Affects Versions: 1.5.2, 1.6.1 > Reporter: jeanlyn > > We have a streaming job using *FlumePollInputStream* whose driver always OOMs after a few days. Here is part of a driver heap histogram taken before the OOM: > {noformat} > num #instances #bytes class name > -- > 1: 13845916 553836640 org.apache.spark.storage.BlockStatus > 2: 14020324 336487776 org.apache.spark.storage.StreamBlockId > 3: 13883881 333213144 scala.collection.mutable.DefaultEntry > 4: 8907 89043952 [Lscala.collection.mutable.HashEntry; > 5: 62360 65107352 [B > 6: 163368 24453904 [Ljava.lang.Object; > 7: 293651 20342664 [C > ... > {noformat} > *BlockStatus* and *StreamBlockId* keep growing, and the driver eventually OOMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-13846) VectorIndexer output on unknown feature should be more descriptive
Dmitry Spikhalskiy created SPARK-13846: -- Summary: VectorIndexer output on unknown feature should be more descriptive Key: SPARK-13846 URL: https://issues.apache.org/jira/browse/SPARK-13846 Project: Spark Issue Type: Bug Components: ML Affects Versions: 1.6.1 Reporter: Dmitry Spikhalskiy Priority: Minor I got the following exception, which looks like it's related to an unknown categorical variable value being passed to indexing:
{noformat}
java.util.NoSuchElementException: key not found: 20.0
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:58)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:58)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:316)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:315)
    at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:315)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:309)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.evalExpr2$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51)
    at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
The VectorIndexer was created like this:
{code}
val featureIndexer = new VectorIndexer()
  .setInputCol(DataFrameColumns.FEATURES)
  .setOutputCol("indexedFeatures")
  .setMaxCategories(5)
  .fit(trainingDF)
{code}
The output should not be just the default java.util.NoSuchElementException, but something specific like UnknownCategoricalValue, with information that could help find the source element of the vector (the element index in the vector, maybe). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
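A sketch of the error reporting the reporter is asking for. The exception class and helper function below are hypothetical, not part of the Spark API; the point is only that the lookup failure should name the unseen value and the feature's position in the vector:
{code}
// Hypothetical: a descriptive replacement for the bare NoSuchElementException.
class UnknownCategoricalValueException(message: String)
  extends RuntimeException(message)

// Hypothetical helper: look up a category index, failing with context.
def indexCategory(categoryMap: Map[Double, Int],
                  value: Double,
                  featureIndex: Int): Int =
  categoryMap.getOrElse(value, throw new UnknownCategoricalValueException(
    s"Unknown categorical value $value for feature at vector index $featureIndex " +
    s"(known values: ${categoryMap.keys.toSeq.sorted.mkString(", ")})"))
{code}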
[jira] [Updated] (SPARK-13846) VectorIndexer output on unknown feature should be more descriptive
[ https://issues.apache.org/jira/browse/SPARK-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Spikhalskiy updated SPARK-13846: --- Description: I got the following exception, which looks like it's related to an unknown categorical variable value being passed to indexing:
{noformat}
java.util.NoSuchElementException: key not found: 20.0
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:58)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:58)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:316)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:315)
    at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:315)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:309)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.evalExpr2$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51)
    at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
The VectorIndexer was created like this:
{code}
val featureIndexer = new VectorIndexer()
  .setInputCol(DataFrameColumns.FEATURES)
  .setOutputCol("indexedFeatures")
  .setMaxCategories(5)
  .fit(trainingDF)
{code}
The output should not be just the default java.util.NoSuchElementException, but something specific like UnknownCategoricalValue, with information that could help find the source element of the vector (the element index in the vector, maybe).

was: I got an exception and looks like it's related to unknown categorical variable value passed indexing.
{noformat}
java.util.NoSuchElementException: key not found: 20.0
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:58)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:58)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:316)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:315)
    at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:315)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:309)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
    at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.evalExpr2$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51)
    at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
VectorIndexer created like
{code}
val featureIndexer = new VectorIndexer()
  .setInputCol(DataFrameColumns.FEATURES)
  .setOutputCol("indexedFeatures")
  .setMaxCategories(5)
  .fit(trainingDF)
{code}
Output should be not just default java.util.NoSuchElementException, but something specific like UnknownCategoricalValue with information, that could help to find the source element of vector (element index in vector maybe). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-13846) VectorIndexer output on unknown feature should be more descriptive
[ https://issues.apache.org/jira/browse/SPARK-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13846: -- Issue Type: Improvement (was: Bug) > VectorIndexer output on unknown feature should be more descriptive > -- > > Key: SPARK-13846 > URL: https://issues.apache.org/jira/browse/SPARK-13846 > Project: Spark > Issue Type: Improvement > Components: ML > Affects Versions: 1.6.1 > Reporter: Dmitry Spikhalskiy > Priority: Minor > >
> I got the following exception, which looks like it's related to an unknown categorical variable value being passed to indexing:
> {noformat}
> java.util.NoSuchElementException: key not found: 20.0
>     at scala.collection.MapLike$class.default(MapLike.scala:228)
>     at scala.collection.AbstractMap.default(Map.scala:58)
>     at scala.collection.MapLike$class.apply(MapLike.scala:141)
>     at scala.collection.AbstractMap.apply(Map.scala:58)
>     at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:316)
>     at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10$$anonfun$apply$4.apply(VectorIndexer.scala:315)
>     at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
>     at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
>     at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:315)
>     at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$10.apply(VectorIndexer.scala:309)
>     at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
>     at org.apache.spark.ml.feature.VectorIndexerModel$$anonfun$11.apply(VectorIndexer.scala:351)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.evalExpr2$(Unknown Source)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>     at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51)
>     at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:89)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The VectorIndexer was created like this:
> {code}
> val featureIndexer = new VectorIndexer()
>   .setInputCol(DataFrameColumns.FEATURES)
>   .setOutputCol("indexedFeatures")
>   .setMaxCategories(5)
>   .fit(trainingDF)
> {code}
> The output should not be just the default java.util.NoSuchElementException, but something specific like UnknownCategoricalValue, with information that could help find the source element of the vector (the element index in the vector, maybe). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-13810) Add Port Configuration Suggestions on Bind Exceptions
[ https://issues.apache.org/jira/browse/SPARK-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13810: -- Assignee: Bjorn Jonsson > Add Port Configuration Suggestions on Bind Exceptions > - > > Key: SPARK-13810 > URL: https://issues.apache.org/jira/browse/SPARK-13810 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Reporter: Bjorn Jonsson >Assignee: Bjorn Jonsson >Priority: Minor > Labels: supportability > Fix For: 1.6.2, 2.0.0 > > > When a java.net.BindException is thrown in startServiceOnPort in Utils.scala, > it displays the following message, irrespective of the port being used: > java.net.BindException: Address already in use: Service '$serviceName' failed > after 16 retries! > For supportability, it would be useful to add port configuration suggestions > for the related port being used. For example, if the Spark UI exceeds > spark.port.maxRetries, users should see: > java.net.BindException: Address already in use: Service 'SparkUI' failed > after 16 retries! Consider setting spark.ui.port to an available port or > increasing spark.port.maxRetries. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-13810) Add Port Configuration Suggestions on Bind Exceptions
[ https://issues.apache.org/jira/browse/SPARK-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13810. --- Resolution: Fixed Fix Version/s: 1.6.2 2.0.0 Issue resolved by pull request 11644 [https://github.com/apache/spark/pull/11644] > Add Port Configuration Suggestions on Bind Exceptions > - > > Key: SPARK-13810 > URL: https://issues.apache.org/jira/browse/SPARK-13810 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Reporter: Bjorn Jonsson >Priority: Minor > Labels: supportability > Fix For: 2.0.0, 1.6.2 > > > When a java.net.BindException is thrown in startServiceOnPort in Utils.scala, > it displays the following message, irrespective of the port being used: > java.net.BindException: Address already in use: Service '$serviceName' failed > after 16 retries! > For supportability, it would be useful to add port configuration suggestions > for the related port being used. For example, if the Spark UI exceeds > spark.port.maxRetries, users should see: > java.net.BindException: Address already in use: Service 'SparkUI' failed > after 16 retries! Consider setting spark.ui.port to an available port or > increasing spark.port.maxRetries. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
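A minimal sketch of the idea behind this fix. The service-name-to-configuration mapping below is an assumption for illustration only; see the merged pull request for the actual change:
{code}
// Map a failing service to the configuration key that controls its port,
// so the BindException message can suggest a concrete fix. These map
// entries are illustrative, not the full mapping from the PR.
def bindFailureMessage(serviceName: String, maxRetries: Int): String = {
  val portConf = Map(
    "SparkUI"     -> "spark.ui.port",
    "sparkDriver" -> "spark.driver.port"
  ).getOrElse(serviceName, s"the port configuration for '$serviceName'")
  s"Address already in use: Service '$serviceName' failed after $maxRetries retries! " +
    s"Consider setting $portConf to an available port or increasing spark.port.maxRetries."
}
{code}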
[jira] [Commented] (SPARK-12216) Spark failed to delete temp directory
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192281#comment-15192281 ] Guram Savinov commented on SPARK-12216: --- I have the same problem when exiting spark-shell on Windows 7. It seems it's not a permissions problem, because I start the console as admin and have no problem removing these directories manually. Maybe the directory is locked by some thread when the shutdown hook executes. > Spark failed to delete temp directory > -- > > Key: SPARK-12216 > URL: https://issues.apache.org/jira/browse/SPARK-12216 > Project: Spark > Issue Type: Bug > Components: Spark Shell > Environment: windows 7 64 bit > Spark 1.52 > Java 1.8.0.65 > PATH includes: > C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6\bin > C:\ProgramData\Oracle\Java\javapath > C:\Users\Stefan\scala\bin > SYSTEM variables set are: > JAVA_HOME=C:\Program Files\Java\jre1.8.0_65 > HADOOP_HOME=C:\Users\Stefan\hadoop-2.6.0\bin > (where the bin\winutils resides) > both \tmp and \tmp\hive have permissions > drwxrwxrwx as detected by winutils ls > Reporter: stefan > Priority: Minor > > The mailing list archives have no obvious solution to this: > scala> :q > Stopping spark context. > 15/12/08 16:24:22 ERROR ShutdownHookManager: Exception while deleting Spark > temp dir: > C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff > java.io.IOException: Failed to delete: > C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff > at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60) > at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60) > at > org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) > at scala.util.Try$.apply(Try.scala:161) > at > org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
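The locked-file theory is easy to check in isolation. A self-contained sketch (the behavior shown is Windows-specific; on Unix the first delete would succeed because the OS allows deleting open files):
{code}
// Demonstrates the suspected cause: on Windows, java.io.File#delete
// returns false while any handle to the file is still open, which is
// what a shutdown hook would hit if some thread still holds a stream
// into the temp directory.
import java.io.{File, FileOutputStream}

object DeleteWhileOpen extends App {
  val f = File.createTempFile("spark-lock-demo", ".tmp")
  val out = new FileOutputStream(f)                    // keep a handle open
  println(s"delete while handle open: ${f.delete()}")  // false on Windows
  out.close()
  println(s"delete after close: ${f.delete()}")        // true
}
{code}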
[jira] [Comment Edited] (SPARK-12216) Spark failed to delete temp directory
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192281#comment-15192281 ] Guram Savinov edited comment on SPARK-12216 at 3/13/16 10:49 AM: - I have the same problem when exiting spark-shell on Windows 7. It seems it's not a permissions problem, because I start the console as admin and have no problem removing these directories manually. Maybe the directory is locked by some thread when the shutdown hook executes. Take a look at this post; it has details about a possible directory lock: http://jakzaprogramowac.pl/pytanie/12478,how-to-find-which-java-scala-thread-has-locked-a-file was (Author: gsavinov): I have the same problem when exiting spark-shell on Windows 7. It seems it's not a permissions problem, because I start the console as admin and have no problem removing these directories manually. Maybe the directory is locked by some thread when the shutdown hook executes. > Spark failed to delete temp directory > -- > > Key: SPARK-12216 > URL: https://issues.apache.org/jira/browse/SPARK-12216 > Project: Spark > Issue Type: Bug > Components: Spark Shell > Environment: windows 7 64 bit > Spark 1.52 > Java 1.8.0.65 > PATH includes: > C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6\bin > C:\ProgramData\Oracle\Java\javapath > C:\Users\Stefan\scala\bin > SYSTEM variables set are: > JAVA_HOME=C:\Program Files\Java\jre1.8.0_65 > HADOOP_HOME=C:\Users\Stefan\hadoop-2.6.0\bin > (where the bin\winutils resides) > both \tmp and \tmp\hive have permissions > drwxrwxrwx as detected by winutils ls > Reporter: stefan > Priority: Minor > > The mailing list archives have no obvious solution to this: > scala> :q > Stopping spark context. > 15/12/08 16:24:22 ERROR ShutdownHookManager: Exception while deleting Spark > temp dir: > C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff > java.io.IOException: Failed to delete: > C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff > at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60) > at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60) > at > org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) > at scala.util.Try$.apply(Try.scala:161) > at > org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192290#comment-15192290 ] Mohamed Baddar commented on SPARK-13073: [~josephkb] After more investigation in the code, and in order to keep the changes minimal, my previous suggestion may not be suitable. I think we can implement a toString version for BinaryLogisticRegressionSummary that gives different information than R's summary. It will create a string representation of the following members: precision, recall, fmeasure. [~josephkb] Is there any comment before I start the PR? > creating R like summary for logistic Regression in Spark - Scala > > > Key: SPARK-13073 > URL: https://issues.apache.org/jira/browse/SPARK-13073 > Project: Spark > Issue Type: New Feature > Components: ML, MLlib > Reporter: Samsudhin > Priority: Minor > > Currently Spark ML provides coefficients for logistic regression. To evaluate the trained model, tests like the Wald test and chi-square test should be run, and their results summarized and displayed like R's GLM summary. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192290#comment-15192290 ] Mohamed Baddar edited comment on SPARK-13073 at 3/13/16 11:03 AM: -- [~josephkb] After more investigation in the code, and in order to keep the changes minimal, my previous suggestion may not be suitable. I think we can implement a toString version for BinaryLogisticRegressionSummary that gives different information than R's summary. It will create a string representation of the following members: precision, recall, fmeasure. Is there any comment before I start the PR? was (Author: mbaddar1): [~josephkb] After more investigation in the code, and in order to keep the changes minimal, my previous suggestion may not be suitable. I think we can implement a toString version for BinaryLogisticRegressionSummary that gives different information than R's summary. It will create a string representation of the following members: precision, recall, fmeasure. [~josephkb] Is there any comment before I start the PR? > creating R like summary for logistic Regression in Spark - Scala > > > Key: SPARK-13073 > URL: https://issues.apache.org/jira/browse/SPARK-13073 > Project: Spark > Issue Type: New Feature > Components: ML, MLlib > Reporter: Samsudhin > Priority: Minor > > Currently Spark ML provides coefficients for logistic regression. To evaluate the trained model, tests like the Wald test and chi-square test should be run, and their results summarized and displayed like R's GLM summary. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
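A sketch of what the proposed toString could print. The case class below is a stand-in for illustration only; BinaryLogisticRegressionSummary itself exposes these metrics differently (e.g. as per-threshold DataFrames):
{code}
// Hypothetical container showing one possible toString layout for the
// three metrics named above; not the actual summary class.
case class SummarySketch(precision: Double, recall: Double, fMeasure: Double) {
  override def toString: String =
    f"""BinaryLogisticRegressionSummary
       |  precision: $precision%.4f
       |  recall:    $recall%.4f
       |  F-measure: $fMeasure%.4f""".stripMargin
}

// Example: println(SummarySketch(0.91, 0.87, 0.89))
{code}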
[jira] [Commented] (SPARK-13718) Scheduler "creating" straggler node
[ https://issues.apache.org/jira/browse/SPARK-13718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192309#comment-15192309 ] Sean Owen commented on SPARK-13718: --- What do you mean that it tries to assign without an available core (slot?). I understand that if the data-local nodes are all busy, and the task is I/O-intensive, the reading remotely is not only going to be slower but put more load on the busy nodes. But, they may not be I/O-bound, just have all slots occupied. Or the job may not be I/O-intensive at all in which case data locality doesn't help. In this case, not scheduling the task is suboptimal. But, when is it better to not schedule the task at all? you're saying it creates a straggler, but all you're saying is things take a while when resources are constrained. What is the better scheduling decision, even given omniscience? > Scheduler "creating" straggler node > > > Key: SPARK-13718 > URL: https://issues.apache.org/jira/browse/SPARK-13718 > Project: Spark > Issue Type: Improvement > Components: Scheduler, Spark Core >Affects Versions: 1.3.1 > Environment: Spark 1.3.1 > MapR-FS > Single Rack > Standalone mode scheduling > 8 node cluster > 48 cores & 512 RAM / node > Data Replication factor of 3 > Each Node has 4 Spark executors configured with 12 cores each and 22GB of RAM. >Reporter: Ioannis Deligiannis >Priority: Minor > Attachments: TestIssue.java, spark_struggler.jpg > > > *Data:* > * Assume an even distribution of data across the cluster with a replication > factor of 3. > * In-memory data are partitioned in 128 chunks (384 cores in total, i.e. 3 > requests can be executed concurrently(-ish) ) > *Action:* > * Action is a simple sequence of map/filter/reduce. > * The action operates upon and returns a small subset of data (following the > full map over the data). > * Data are 1 x cached serialized in memory (Kryo), so calling the action > should not hit the disk under normal conditions. > * Action network usage is low as it returns a small number of aggregated > results and does not require excessive shuffling > * Under low or moderate load, each action is expected to complete in less > than 2 seconds > *H/W Outlook* > When the action is called in high numbers, initially the cluster CPU gets > close to 100% (which is expected & intended). > After a while, the cluster utilization reduces significantly with only one > (struggler) node having 100% CPU and fully utilized network. > *Diagnosis:* > 1. Attached a profiler to the driver and executors to monitor GC or I/O > issues and everything is normal under low or heavy usage. > 2. Cluster network usage is very low > 3. No issues on Spark UI except that tasks begin to move from LOCAL to ANY > *Cause (Corrected as found details in code):* > 1. Node 'H' is doing marginally more work than the rest (being a little > slower and at almost 100% CPU) > 2. Scheduler hits the default 3000 millis spark.locality.wait and assigns the > task to other nodes > 3. One of the nodes 'X' that accepted the task will try to access the data > from node 'H' HDD. This adds Network I/O to node and also some extra CPU for > I/O. > 4. 'X' time to complete increases ~5x as it goes over Network > 5. 
Eventually, every node will have a task that is waiting to fetch that > specific partition from node 'H', so the cluster is basically blocked on a single > node. > What I managed to figure out from the code is that this is because if an RDD > is cached, it will make use of BlockManager.getRemote() and will not > recompute the DAG part that resulted in this RDD, and hence always hits the > node that has cached the RDD. > *Proposed Fix* > I have not worked with the Scala & Spark source code enough to propose a code > fix, but on a high level, when a task hits the 'spark.locality.wait' timeout, > it could make use of a new configuration, e.g. > recomputeRddAfterLocalityTimeout, instead of always trying to get the cached > RDD. This would be very useful if it could also be manually set on the RDD. > *Workaround* > Playing with 'spark.locality.wait' values, there is a deterministic value > depending on partitions and config where the problem ceases to exist. > *PS1*: Don't have enough Scala skills to follow the issue or propose a fix, > but I hope that this has enough information to make sense. > *PS2*: Debugging this issue made me realize that there can be a lot of > use-cases that trigger this behaviour -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
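For anyone hitting this, the workaround described in the report is driven by standard configuration keys. A sketch (the values are examples, not recommendations; the right setting depends on partition count and cluster configuration, as the reporter notes):
{code}
// spark.locality.wait governs how long the scheduler waits before
// relaxing a task's locality level; it can also be tuned per level.
// Values are in milliseconds in Spark 1.3.x.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.locality.wait", "1000")       // all locality levels
  .set("spark.locality.wait.node", "1000")  // NODE_LOCAL step specifically
{code}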
[jira] [Commented] (SPARK-13298) DAG visualization does not render correctly for jobs
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192324#comment-15192324 ] Todd Leo commented on SPARK-13298: -- Pls also see SPARK-13645: DAG Diagram not shown properly in Chrome > DAG visualization does not render correctly for jobs > > > Key: SPARK-13298 > URL: https://issues.apache.org/jira/browse/SPARK-13298 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 1.6.0 >Reporter: Lucas Woltmann >Assignee: Shixiong Zhu > Fix For: 1.6.1, 2.0.0 > > Attachments: dag_full.png, dag_viz.png, no-dag.png > > > Whenever I try to open the DAG for a job, I get something like this: > !dag_viz.png! > Obviously the svg doesn't get resized, but if I resize it manually, only the > first of four stages in the DAG is shown. > The js console says (variable v is null in peg$c34): > {code:javascript} > Uncaught TypeError: Cannot read property '3' of null > peg$c34 @ graphlib-dot.min.js:1 > peg$parseidDef @ graphlib-dot.min.js:1 > peg$parseaList @ graphlib-dot.min.js:1 > peg$parseattrListBlock @ graphlib-dot.min.js:1 > peg$parseattrList @ graphlib-dot.min.js:1 > peg$parsenodeStmt @ graphlib-dot.min.js:1 > peg$parsestmt @ graphlib-dot.min.js:1 > peg$parsestmtList @ graphlib-dot.min.js:1 > peg$parsesubgraphStmt @ graphlib-dot.min.js:1 > peg$parsenodeIdOrSubgraph @ graphlib-dot.min.js:1 > peg$parseedgeStmt @ graphlib-dot.min.js:1 > peg$parsestmt @ graphlib-dot.min.js:1 > peg$parsestmtList @ graphlib-dot.min.js:1 > peg$parsesubgraphStmt @ graphlib-dot.min.js:1 > peg$parsenodeIdOrSubgraph @ graphlib-dot.min.js:1 > peg$parseedgeStmt @ graphlib-dot.min.js:1 > peg$parsestmt @ graphlib-dot.min.js:1 > peg$parsestmtList @ graphlib-dot.min.js:1 > peg$parsegraphStmt @ graphlib-dot.min.js:1 > parse @ graphlib-dot.min.js:2 > readOne @ graphlib-dot.min.js:2 > renderDot @ spark-dag-viz.js:281 > (anonymous function) @ spark-dag-viz.js:248 > (anonymous function) @ d3.min.js: > 3Y @ d3.min.js:1 > _a.each @ d3.min.js:3 > renderDagVizForJob @ spark-dag-viz.js:207 > renderDagViz @ spark-dag-viz.js:163 > toggleDagViz @ spark-dag-viz.js:100 > onclick @ ?id=2:153 > {code} > (tested in FIrefox 44.0.1 and Chromium 48.0.2564.103) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-6378) srcAttr in graph.triplets don't update when the size of graph is huge
[ https://issues.apache.org/jira/browse/SPARK-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhaokang Wang updated SPARK-6378: - Attachment: TripletsViewDonotUpdate.scala I have met a similar problem with triplets updates in GraphX. I think I have a code demo that reproduces the situation in this issue, on a small toy graph with only 3 vertices. My demo code is attached as [^TripletsViewDonotUpdate.scala]. The steps to reproduce the issue:
1. Construct a small graph ({{purGraph}} in the code) with only 3 vertices. The edges of the graph are: 2->1, 3->1, 2->3.
2. Conduct the collect-neighbors operation to get the {{inNeighborGraph}} of the {{purGraph}}.
3. Outer join the {{inNeighborGraph}} vertices on {{purGraph}} to get the {{dataGraph}}. In {{dataGraph}}, each vertex stores an ArrayBuffer of its in-neighbors' vertex ids.
4. Now examine the {{inNeighbor}} attribute in the {{dataGraph.vertices}} view and the {{dataGraph.triplets}} view. We can see from the output that the two views are inconsistent on vertex 3's {{inNeighbor}} property:
{quote}
> dataGraph.vertices
vid: 1, inNeighbor:2,3
vid: 3, inNeighbor:2
vid: 2, inNeighbor:
> dataGraph.triplets.srcAttr
vid: 2, inNeighbor:
vid: 2, inNeighbor:
vid: 3, inNeighbor:
{quote}
5. If we comment out the {{purGraph.triplets.count()}} statement in the code, the bug disappears:
{code}
val purGraph = Graph(dataVertex, dataEdge).persist()
// purGraph.triplets.count() // !!!comment this
val inNeighborGraph = purGraph.collectNeighbors(EdgeDirection.In)
// Now join the in-neighbor vertex id list to every vertex's property
val dataGraph = purGraph.outerJoinVertices(inNeighborGraph)((vid, property, inNeighborList) => {
  val inNeighborVertexIds = inNeighborList.getOrElse(Array[(VertexId, VertexProperty)]()).map(t => t._1)
  property.inNeighbor ++= inNeighborVertexIds.toBuffer
  property
})
{code}
It seems that the triplets view and the vertex view of the same graph may be inconsistent in some situations. > srcAttr in graph.triplets don't update when the size of graph is huge > - > > Key: SPARK-6378 > URL: https://issues.apache.org/jira/browse/SPARK-6378 > Project: Spark > Issue Type: Bug > Components: GraphX > Affects Versions: 1.2.1 > Reporter: zhangzhenyue > Attachments: TripletsViewDonotUpdate.scala > > > when the size of the graph is huge (0.2 billion vertices, 6 billion edges), the > srcAttr and dstAttr in graph.triplets don't update when using > Graph.outerJoinVertices (when the data in the vertex is changed). > the code and the log are as follows: > {quote} > g = graph.outerJoinVertices()... > g.vertices.count() > g.edges.count() > println("example edge " + g.triplets.filter(e => e.srcId == > 51L).collect() > .map(e =>(e.srcId + ":" + e.srcAttr + ", " + e.dstId + ":" + > e.dstAttr)).mkString("\n")) > println("example vertex " + g.vertices.filter(e => e._1 == > 51L).collect() > .map(e => (e._1 + "," + e._2)).mkString("\n")) > {quote} > the result: > {quote} > example edge 51:0, 2467451620:61 > 51:0, 1962741310:83 // attr of vertex 51 is 0 in > Graph.triplets > example vertex 51,2 // attr of vertex 51 is 2 in > Graph.vertices > {quote} > when the graph is smaller (10 million vertices), the code is OK; the triplets > will update when the vertex is changed -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6378) srcAttr in graph.triplets don't update when the size of graph is huge
[ https://issues.apache.org/jira/browse/SPARK-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192333#comment-15192333 ] Zhaokang Wang commented on SPARK-6378: -- I have reproduced this problem in a small toy-graph demo of about 50 lines of code. I have described the details in my previous comment; please take a look at it if you have time. Thanks a lot! > srcAttr in graph.triplets don't update when the size of graph is huge > - > > Key: SPARK-6378 > URL: https://issues.apache.org/jira/browse/SPARK-6378 > Project: Spark > Issue Type: Bug > Components: GraphX > Affects Versions: 1.2.1 > Reporter: zhangzhenyue > Attachments: TripletsViewDonotUpdate.scala > > > when the size of the graph is huge (0.2 billion vertices, 6 billion edges), the > srcAttr and dstAttr in graph.triplets don't update when using > Graph.outerJoinVertices (when the data in the vertex is changed). > the code and the log are as follows: > {quote} > g = graph.outerJoinVertices()... > g.vertices.count() > g.edges.count() > println("example edge " + g.triplets.filter(e => e.srcId == > 51L).collect() > .map(e =>(e.srcId + ":" + e.srcAttr + ", " + e.dstId + ":" + > e.dstAttr)).mkString("\n")) > println("example vertex " + g.vertices.filter(e => e._1 == > 51L).collect() > .map(e => (e._1 + "," + e._2)).mkString("\n")) > {quote} > the result: > {quote} > example edge 51:0, 2467451620:61 > 51:0, 1962741310:83 // attr of vertex 51 is 0 in > Graph.triplets > example vertex 51,2 // attr of vertex 51 is 2 in > Graph.vertices > {quote} > when the graph is smaller (10 million vertices), the code is OK; the triplets > will update when the vertex is changed -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-12216) Spark failed to delete temp directory
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12216. --- Resolution: Invalid > Spark failed to delete temp directory > -- > > Key: SPARK-12216 > URL: https://issues.apache.org/jira/browse/SPARK-12216 > Project: Spark > Issue Type: Bug > Components: Spark Shell > Environment: windows 7 64 bit > Spark 1.52 > Java 1.8.0.65 > PATH includes: > C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6\bin > C:\ProgramData\Oracle\Java\javapath > C:\Users\Stefan\scala\bin > SYSTEM variables set are: > JAVA_HOME=C:\Program Files\Java\jre1.8.0_65 > HADOOP_HOME=C:\Users\Stefan\hadoop-2.6.0\bin > (where the bin\winutils resides) > both \tmp and \tmp\hive have permissions > drwxrwxrwx as detected by winutils ls >Reporter: stefan >Priority: Minor > > The mailing list archives have no obvious solution to this: > scala> :q > Stopping spark context. > 15/12/08 16:24:22 ERROR ShutdownHookManager: Exception while deleting Spark > temp dir: > C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff > java.io.IOException: Failed to delete: > C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff > at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60) > at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) > at > org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60) > at > org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) > at scala.util.Try$.apply(Try.scala:161) > at > org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234) > at > org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-6378) srcAttr in graph.triplets don't update when the size of graph is huge
[ https://issues.apache.org/jira/browse/SPARK-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192332#comment-15192332 ] Zhaokang Wang edited comment on SPARK-6378 at 3/13/16 1:08 PM: --- I have met a similar problem with triplets updates in GraphX. I think I have a code demo that reproduces the situation in this issue, on a small toy graph with only 3 vertices. My demo code is attached as [^TripletsViewDonotUpdate.scala]. The key code is the following:
{code}
// purGraph is a toy graph with edges: 2->1, 3->1, 2->3.
val purGraph = Graph(dataVertex, dataEdge).persist()
purGraph.triplets.count() // this operation will cause the bug.
val inNeighborGraph = purGraph.collectNeighbors(EdgeDirection.In)
val dataGraph = purGraph.outerJoinVertices(inNeighborGraph)((vid, property, inNeighborList) => {... })
// dataGraph's vertices view and triplets view will be inconsistent on vertex 3's inNeighbor attribute.
dataGraph.vertices.foreach {...}
dataGraph.triplets.foreach {...}
{code}
We can see from the output that the two views are inconsistent on *vertex 3*'s {{inNeighbor}} property:
{quote}
> dataGraph.vertices
vid: 1, inNeighbor:2,3
vid: 3, inNeighbor:2
vid: 2, inNeighbor:
> dataGraph.triplets.srcAttr
vid: 2, inNeighbor:
vid: 2, inNeighbor:
vid: 3, inNeighbor:
{quote}
If we comment out the {{purGraph.triplets.count()}} statement in the code, the bug disappears:
{code}
val purGraph = Graph(dataVertex, dataEdge).persist()
// purGraph.triplets.count() // !!!comment this
val inNeighborGraph = purGraph.collectNeighbors(EdgeDirection.In)
// Now join the in-neighbor vertex id list to every vertex's property
val dataGraph = purGraph.outerJoinVertices(inNeighborGraph)((vid, property, inNeighborList) => {...})
{code}
It seems that the triplets view and the vertex view of the same graph may be inconsistent in some situations.

was (Author: bsidb): I have met a similar problem with triplets updates in GraphX. I think I have a code demo that reproduces the situation in this issue, on a small toy graph with only 3 vertices. My demo code is attached as [^TripletsViewDonotUpdate.scala]. The steps to reproduce the issue:
1. Construct a small graph ({{purGraph}} in the code) with only 3 vertices. The edges of the graph are: 2->1, 3->1, 2->3.
2. Conduct the collect-neighbors operation to get the {{inNeighborGraph}} of the {{purGraph}}.
3. Outer join the {{inNeighborGraph}} vertices on {{purGraph}} to get the {{dataGraph}}. In {{dataGraph}}, each vertex stores an ArrayBuffer of its in-neighbors' vertex ids.
4. Now examine the {{inNeighbor}} attribute in the {{dataGraph.vertices}} view and the {{dataGraph.triplets}} view. We can see from the output that the two views are inconsistent on vertex 3's {{inNeighbor}} property:
{quote}
> dataGraph.vertices
vid: 1, inNeighbor:2,3
vid: 3, inNeighbor:2
vid: 2, inNeighbor:
> dataGraph.triplets.srcAttr
vid: 2, inNeighbor:
vid: 2, inNeighbor:
vid: 3, inNeighbor:
{quote}
5. If we comment out the {{purGraph.triplets.count()}} statement in the code, the bug disappears:
{code}
val purGraph = Graph(dataVertex, dataEdge).persist()
// purGraph.triplets.count() // !!!comment this
val inNeighborGraph = purGraph.collectNeighbors(EdgeDirection.In)
// Now join the in-neighbor vertex id list to every vertex's property
val dataGraph = purGraph.outerJoinVertices(inNeighborGraph)((vid, property, inNeighborList) => {
  val inNeighborVertexIds = inNeighborList.getOrElse(Array[(VertexId, VertexProperty)]()).map(t => t._1)
  property.inNeighbor ++= inNeighborVertexIds.toBuffer
  property
})
{code}
It seems that the triplets view and the vertex view of the same graph may be inconsistent in some situations. > srcAttr in graph.triplets don't update when the size of graph is huge > - > > Key: SPARK-6378 > URL: https://issues.apache.org/jira/browse/SPARK-6378 > Project: Spark > Issue Type: Bug > Components: GraphX > Affects Versions: 1.2.1 > Reporter: zhangzhenyue > Attachments: TripletsViewDonotUpdate.scala > > > when the size of the graph is huge (0.2 billion vertices, 6 billion edges), the > srcAttr and dstAttr in graph.triplets don't update when using > Graph.outerJoinVertices (when the data in the vertex is changed). > the code and the log are as follows: > {quote} > g = graph.outerJoinVertices()... > g.vertices.count() > g.edges.count() > println("example edge " + g.triplets.filter(e => e.srcId == > 51L).collect() > .map(e =>(e.srcId + ":" + e.srcAttr + ", " + e.dstId + ":" + > e.dstAttr)).mkString("\n")) > println("example vertex " + g.vertices.filter(e => e._1 == > 51L).collect() > .map(e => (e._1 + "," + e._2)).mkString("\n")) > {quote} > the result: > {quote} > example edge 51:0, 2467451620:61 > 51:0, 1962741310:83 // attr of vertex 51 is 0 in > Graph.triplets > example vertex 51,2 // attr of vertex 51 is 2 in > Graph.vertices > {quote} > when the graph is smaller (10 million vertices), the code is OK; the triplets > will update when the vertex is changed -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-13847) Defer the variable evaluation for Limit codegen
Liang-Chi Hsieh created SPARK-13847: --- Summary: Defer the variable evaluation for Limit codegen Key: SPARK-13847 URL: https://issues.apache.org/jira/browse/SPARK-13847 Project: Spark Issue Type: Improvement Components: SQL Reporter: Liang-Chi Hsieh We can defer the variable evaluation for Limit codegen as we did in Project. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-13847) Defer the variable evaluation for Limit codegen
[ https://issues.apache.org/jira/browse/SPARK-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13847: Assignee: Apache Spark > Defer the variable evaluation for Limit codegen > --- > > Key: SPARK-13847 > URL: https://issues.apache.org/jira/browse/SPARK-13847 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Liang-Chi Hsieh >Assignee: Apache Spark > > We can defer the variable evaluation for Limit codegen as we did in Project. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-13847) Defer the variable evaluation for Limit codegen
[ https://issues.apache.org/jira/browse/SPARK-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13847: Assignee: (was: Apache Spark) > Defer the variable evaluation for Limit codegen > --- > > Key: SPARK-13847 > URL: https://issues.apache.org/jira/browse/SPARK-13847 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Liang-Chi Hsieh > > We can defer the variable evaluation for Limit codegen as we did in Project. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13847) Defer the variable evaluation for Limit codegen
[ https://issues.apache.org/jira/browse/SPARK-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192371#comment-15192371 ] Apache Spark commented on SPARK-13847: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/11685 > Defer the variable evaluation for Limit codegen > --- > > Key: SPARK-13847 > URL: https://issues.apache.org/jira/browse/SPARK-13847 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Liang-Chi Hsieh > > We can defer the variable evaluation for Limit codegen as we did in Project. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-13847) Defer the variable evaluation for Limit codegen
[ https://issues.apache.org/jira/browse/SPARK-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-13847. --- Resolution: Not A Problem > Defer the variable evaluation for Limit codegen > --- > > Key: SPARK-13847 > URL: https://issues.apache.org/jira/browse/SPARK-13847 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Liang-Chi Hsieh > > We can defer the variable evaluation for Limit codegen as we did in Project. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13825) Upgrade to Scala 2.11.8
[ https://issues.apache.org/jira/browse/SPARK-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192389#comment-15192389 ] Sean Owen commented on SPARK-13825: --- While you're at it, I think it's fine to update the 2.10 line to 2.10.6. It includes only a very minor fix: http://www.scala-lang.org/news/2.10.6/ > Upgrade to Scala 2.11.8 > --- > > Key: SPARK-13825 > URL: https://issues.apache.org/jira/browse/SPARK-13825 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.0.0 >Reporter: Jacek Laskowski >Priority: Minor > > Scala 2.11.8 is out so...time to upgrade before 2.0.0 is out -> > http://www.scala-lang.org/news/2.11.8/. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13825) Upgrade to Scala 2.11.8
[ https://issues.apache.org/jira/browse/SPARK-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192496#comment-15192496 ] Jacek Laskowski commented on SPARK-13825: - I can do it in a separate task. The title would not reflect the change (and changing it to Upgrade to Scala 2.10.6 and 2.11.8 would only confuse users). WDYT? > Upgrade to Scala 2.11.8 > --- > > Key: SPARK-13825 > URL: https://issues.apache.org/jira/browse/SPARK-13825 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.0.0 >Reporter: Jacek Laskowski >Priority: Minor > > Scala 2.11.8 is out so...time to upgrade before 2.0.0 is out -> > http://www.scala-lang.org/news/2.11.8/. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13825) Upgrade to Scala 2.11.8
[ https://issues.apache.org/jira/browse/SPARK-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192498#comment-15192498 ] Sean Owen commented on SPARK-13825: --- I don't think it's confusing to bump maintenance releases of two Scala versions instead of one? titles can be changed. > Upgrade to Scala 2.11.8 > --- > > Key: SPARK-13825 > URL: https://issues.apache.org/jira/browse/SPARK-13825 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.0.0 >Reporter: Jacek Laskowski >Priority: Minor > > Scala 2.11.8 is out so...time to upgrade before 2.0.0 is out -> > http://www.scala-lang.org/news/2.11.8/. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-13848) Upgrade to Py4J 0.9.2
Josh Rosen created SPARK-13848: -- Summary: Upgrade to Py4J 0.9.2 Key: SPARK-13848 URL: https://issues.apache.org/jira/browse/SPARK-13848 Project: Spark Issue Type: Bug Components: PySpark Reporter: Josh Rosen Assignee: Josh Rosen We should upgrade to Py4J 0.9.2 so that we can fix SPARK-6047 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6722) Model import/export for StreamingKMeansModel
[ https://issues.apache.org/jira/browse/SPARK-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192533#comment-15192533 ] Furkan KAMACI commented on SPARK-6722: -- Is there a suggested roadmap for how to implement this? > Model import/export for StreamingKMeansModel > > > Key: SPARK-6722 > URL: https://issues.apache.org/jira/browse/SPARK-6722 > Project: Spark > Issue Type: Sub-task > Components: MLlib >Affects Versions: 1.3.0 >Reporter: Joseph K. Bradley > > CC: [~freeman-lab] Is this API stable enough to merit adding import/export > (which will require supporting the model format version from now on)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13784) Model export/import for spark.ml: RandomForests
[ https://issues.apache.org/jira/browse/SPARK-13784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192537#comment-15192537 ] Gayathri Murali commented on SPARK-13784: - I can work on this, if no one else has started > Model export/import for spark.ml: RandomForests > --- > > Key: SPARK-13784 > URL: https://issues.apache.org/jira/browse/SPARK-13784 > Project: Spark > Issue Type: Sub-task > Components: ML >Reporter: Joseph K. Bradley > > This JIRA is for both RandomForestClassifier and RandomForestRegressor. The > implementation should reuse the one for DecisionTree*. > It should augment NodeData with a tree ID so that all nodes can be stored in > a single DataFrame. It should reconstruct the trees in a distributed fashion. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
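For reference, the augmentation the SPARK-13784 description asks for could look roughly like the following. The names are a hypothetical sketch of the idea, not the final implementation:

{code}
// Sketch of the proposed storage layout: tag each node with the tree it belongs
// to so an entire forest fits in one DataFrame. Field names are illustrative.
case class NodeData(id: Int, prediction: Double, impurity: Double,
                    split: Option[(Int, Double)], leftChild: Int, rightChild: Int)
case class EnsembleNodeData(treeID: Int, node: NodeData)

// On save: flatten every tree into a Seq[EnsembleNodeData] and write one DataFrame.
// On load: group the rows by treeID and rebuild each tree, which can be done in a
// distributed fashion as the description suggests.
{code}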
[jira] [Resolved] (SPARK-13812) Fix SparkR lint-r test errors
[ https://issues.apache.org/jira/browse/SPARK-13812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman resolved SPARK-13812. --- Resolution: Fixed Assignee: Sun Rui Fix Version/s: 2.0.0 Resolved by https://github.com/apache/spark/pull/11652 > Fix SparkR lint-r test errors > - > > Key: SPARK-13812 > URL: https://issues.apache.org/jira/browse/SPARK-13812 > Project: Spark > Issue Type: Bug > Components: SparkR >Affects Versions: 1.6.1 >Reporter: Sun Rui >Assignee: Sun Rui > Fix For: 2.0.0 > > > After getting updated from GitHub, the lintr package can detect errors that were > not detected by previous versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6047) pyspark - class loading on driver failing with --jars and --packages
[ https://issues.apache.org/jira/browse/SPARK-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192540#comment-15192540 ] Josh Rosen commented on SPARK-6047: --- Resolving this as a duplicate of SPARK-5185, since I'm about to open a PR to fix that issue. > pyspark - class loading on driver failing with --jars and --packages > > > Key: SPARK-6047 > URL: https://issues.apache.org/jira/browse/SPARK-6047 > Project: Spark > Issue Type: Bug > Components: PySpark, Spark Submit >Affects Versions: 1.3.0 >Reporter: Burak Yavuz > > Because py4j uses the system ClassLoader instead of the contextClassLoader of > the thread, the dynamically added jars in Spark Submit can't be loaded in the > driver. > This causes `Py4JError: Trying to call a package` errors. > Usually `--packages` are downloaded from some remote repo before runtime, > adding them explicitly to `--driver-class-path` is not an option, like we can > do with `--jars`. One solution is to move the fetching of `--packages` to the > SparkSubmitDriverBootstrapper, and add it to the driver class-path there. > A more complete solution can be achieved through [SPARK-4924]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-6047) pyspark - class loading on driver failing with --jars and --packages
[ https://issues.apache.org/jira/browse/SPARK-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-6047. --- Resolution: Duplicate > pyspark - class loading on driver failing with --jars and --packages > > > Key: SPARK-6047 > URL: https://issues.apache.org/jira/browse/SPARK-6047 > Project: Spark > Issue Type: Bug > Components: PySpark, Spark Submit >Affects Versions: 1.3.0 >Reporter: Burak Yavuz > > Because py4j uses the system ClassLoader instead of the contextClassLoader of > the thread, the dynamically added jars in Spark Submit can't be loaded in the > driver. > This causes `Py4JError: Trying to call a package` errors. > Usually `--packages` are downloaded from some remote repo before runtime, > adding them explicitly to `--driver-class-path` is not an option, like we can > do with `--jars`. One solution is to move the fetching of `--packages` to the > SparkSubmitDriverBootstrapper, and add it to the driver class-path there. > A more complete solution can be achieved through [SPARK-4924]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-5185) pyspark --jars does not add classes to driver class path
[ https://issues.apache.org/jira/browse/SPARK-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-5185: - Assignee: Josh Rosen (was: Andrew Or) > pyspark --jars does not add classes to driver class path > > > Key: SPARK-5185 > URL: https://issues.apache.org/jira/browse/SPARK-5185 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 1.2.0 >Reporter: Uri Laserson >Assignee: Josh Rosen > > I have some random class I want access to from an Spark shell, say > {{com.cloudera.science.throwaway.ThrowAway}}. You can find the specific > example I used here: > https://gist.github.com/laserson/e9e3bd265e1c7a896652 > I packaged it as {{throwaway.jar}}. > If I then run {{bin/spark-shell}} like so: > {code} > bin/spark-shell --master local[1] --jars throwaway.jar > {code} > I can execute > {code} > val a = new com.cloudera.science.throwaway.ThrowAway() > {code} > Successfully. > I now run PySpark like so: > {code} > PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars > throwaway.jar > {code} > which gives me an error when I try to instantiate the class through Py4J: > {code} > In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway() > --- > Py4JError Traceback (most recent call last) > in () > > 1 sc._jvm.com.cloudera.science.throwaway.ThrowAway() > /Users/laserson/repos/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py > in __getattr__(self, name) > 724 def __getattr__(self, name): > 725 if name == '__call__': > --> 726 raise Py4JError('Trying to call a package.') > 727 new_fqn = self._fqn + '.' + name > 728 command = REFLECTION_COMMAND_NAME +\ > Py4JError: Trying to call a package. > {code} > However, if I explicitly add the {{--driver-class-path}} to add the same jar > {code} > PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars > throwaway.jar --driver-class-path throwaway.jar > {code} > it works > {code} > In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway() > Out[1]: JavaObject id=o18 > {code} > However, the docs state that {{--jars}} should also set the driver class path. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5185) pyspark --jars does not add classes to driver class path
[ https://issues.apache.org/jira/browse/SPARK-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5185: -- Target Version/s: 2.0.0 > pyspark --jars does not add classes to driver class path > > > Key: SPARK-5185 > URL: https://issues.apache.org/jira/browse/SPARK-5185 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 1.2.0 >Reporter: Uri Laserson >Assignee: Josh Rosen > > I have some random class I want access to from an Spark shell, say > {{com.cloudera.science.throwaway.ThrowAway}}. You can find the specific > example I used here: > https://gist.github.com/laserson/e9e3bd265e1c7a896652 > I packaged it as {{throwaway.jar}}. > If I then run {{bin/spark-shell}} like so: > {code} > bin/spark-shell --master local[1] --jars throwaway.jar > {code} > I can execute > {code} > val a = new com.cloudera.science.throwaway.ThrowAway() > {code} > Successfully. > I now run PySpark like so: > {code} > PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars > throwaway.jar > {code} > which gives me an error when I try to instantiate the class through Py4J: > {code} > In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway() > --- > Py4JError Traceback (most recent call last) > in () > > 1 sc._jvm.com.cloudera.science.throwaway.ThrowAway() > /Users/laserson/repos/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py > in __getattr__(self, name) > 724 def __getattr__(self, name): > 725 if name == '__call__': > --> 726 raise Py4JError('Trying to call a package.') > 727 new_fqn = self._fqn + '.' + name > 728 command = REFLECTION_COMMAND_NAME +\ > Py4JError: Trying to call a package. > {code} > However, if I explicitly add the {{--driver-class-path}} to add the same jar > {code} > PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars > throwaway.jar --driver-class-path throwaway.jar > {code} > it works > {code} > In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway() > Out[1]: JavaObject id=o18 > {code} > However, the docs state that {{--jars}} should also set the driver class path. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13848) Upgrade to Py4J 0.9.2
[ https://issues.apache.org/jira/browse/SPARK-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192543#comment-15192543 ] Apache Spark commented on SPARK-13848: -- User 'JoshRosen' has created a pull request for this issue: https://github.com/apache/spark/pull/11687 > Upgrade to Py4J 0.9.2 > - > > Key: SPARK-13848 > URL: https://issues.apache.org/jira/browse/SPARK-13848 > Project: Spark > Issue Type: Bug > Components: PySpark >Reporter: Josh Rosen >Assignee: Josh Rosen > > We should upgrade to Py4J 0.9.2 so that we can fix SPARK-6047 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5185) pyspark --jars does not add classes to driver class path
[ https://issues.apache.org/jira/browse/SPARK-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192544#comment-15192544 ] Apache Spark commented on SPARK-5185: - User 'JoshRosen' has created a pull request for this issue: https://github.com/apache/spark/pull/11687 > pyspark --jars does not add classes to driver class path > > > Key: SPARK-5185 > URL: https://issues.apache.org/jira/browse/SPARK-5185 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 1.2.0 >Reporter: Uri Laserson >Assignee: Josh Rosen > > I have some random class I want access to from an Spark shell, say > {{com.cloudera.science.throwaway.ThrowAway}}. You can find the specific > example I used here: > https://gist.github.com/laserson/e9e3bd265e1c7a896652 > I packaged it as {{throwaway.jar}}. > If I then run {{bin/spark-shell}} like so: > {code} > bin/spark-shell --master local[1] --jars throwaway.jar > {code} > I can execute > {code} > val a = new com.cloudera.science.throwaway.ThrowAway() > {code} > Successfully. > I now run PySpark like so: > {code} > PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars > throwaway.jar > {code} > which gives me an error when I try to instantiate the class through Py4J: > {code} > In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway() > --- > Py4JError Traceback (most recent call last) > in () > > 1 sc._jvm.com.cloudera.science.throwaway.ThrowAway() > /Users/laserson/repos/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py > in __getattr__(self, name) > 724 def __getattr__(self, name): > 725 if name == '__call__': > --> 726 raise Py4JError('Trying to call a package.') > 727 new_fqn = self._fqn + '.' + name > 728 command = REFLECTION_COMMAND_NAME +\ > Py4JError: Trying to call a package. > {code} > However, if I explicitly add the {{--driver-class-path}} to add the same jar > {code} > PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars > throwaway.jar --driver-class-path throwaway.jar > {code} > it works > {code} > In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway() > Out[1]: JavaObject id=o18 > {code} > However, the docs state that {{--jars}} should also set the driver class path. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13335) Optimize Data Frames collect_list and collect_set with declarative aggregates
[ https://issues.apache.org/jira/browse/SPARK-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192570#comment-15192570 ] Apache Spark commented on SPARK-13335: -- User 'mccheah' has created a pull request for this issue: https://github.com/apache/spark/pull/11688 > Optimize Data Frames collect_list and collect_set with declarative aggregates > - > > Key: SPARK-13335 > URL: https://issues.apache.org/jira/browse/SPARK-13335 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Matt Cheah >Priority: Minor > > Based on discussion from SPARK-9301, we can optimize collect_set and > collect_list with declarative aggregate expressions, as opposed to using Hive > UDAFs. The problem with Hive UDAFs is that they require converting the data > items from catalyst types back to external types repeatedly. We can get > around this by implementing declarative aggregate expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
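For context, these are the standard DataFrame calls SPARK-13335 targets (shown for illustration; {{df}} and the column names are placeholders for this example):

{code}
// Illustration: the aggregate functions this issue proposes to optimize.
import org.apache.spark.sql.functions.{collect_list, collect_set}

val grouped = df.groupBy("userId")
  .agg(collect_set("itemId"), collect_list("eventTime"))
{code}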
[jira] [Assigned] (SPARK-13335) Optimize Data Frames collect_list and collect_set with declarative aggregates
[ https://issues.apache.org/jira/browse/SPARK-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13335: Assignee: (was: Apache Spark) > Optimize Data Frames collect_list and collect_set with declarative aggregates > - > > Key: SPARK-13335 > URL: https://issues.apache.org/jira/browse/SPARK-13335 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Matt Cheah >Priority: Minor > > Based on discussion from SPARK-9301, we can optimize collect_set and > collect_list with declarative aggregate expressions, as opposed to using Hive > UDAFs. The problem with Hive UDAFs is that they require converting the data > items from catalyst types back to external types repeatedly. We can get > around this by implementing declarative aggregate expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-13335) Optimize Data Frames collect_list and collect_set with declarative aggregates
[ https://issues.apache.org/jira/browse/SPARK-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13335: Assignee: Apache Spark > Optimize Data Frames collect_list and collect_set with declarative aggregates > - > > Key: SPARK-13335 > URL: https://issues.apache.org/jira/browse/SPARK-13335 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Matt Cheah >Assignee: Apache Spark >Priority: Minor > > Based on discussion from SPARK-9301, we can optimize collect_set and > collect_list with declarative aggregate expressions, as opposed to using Hive > UDAFs. The problem with Hive UDAFs is that they require converting the data > items from catalyst types back to external types repeatedly. We can get > around this by implementing declarative aggregate expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13335) Optimize Data Frames collect_list and collect_set with declarative aggregates
[ https://issues.apache.org/jira/browse/SPARK-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192588#comment-15192588 ] Ruslan Dautkhanov commented on SPARK-13335: --- It would be great to have this optimization in. In our workflow we use Spark in 95% of cases; for COLLECT_SET, however, we switch to Hive because it works fine there, while in Spark 1.5 the executors keep running out of memory even with 8 GB each. The workload is an 11-billion-record dataset OUTER JOINed to a ~1-billion-record dataset. > Optimize Data Frames collect_list and collect_set with declarative aggregates > - > > Key: SPARK-13335 > URL: https://issues.apache.org/jira/browse/SPARK-13335 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Matt Cheah >Priority: Minor > > Based on discussion from SPARK-9301, we can optimize collect_set and > collect_list with declarative aggregate expressions, as opposed to using Hive > UDAFs. The problem with Hive UDAFs is that they require converting the data > items from catalyst types back to external types repeatedly. We can get > around this by implementing declarative aggregate expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-13849) REGEX Column Specification
Ruslan Dautkhanov created SPARK-13849: - Summary: REGEX Column Specification Key: SPARK-13849 URL: https://issues.apache.org/jira/browse/SPARK-13849 Project: Spark Issue Type: Wish Components: Spark Core Reporter: Ruslan Dautkhanov Priority: Critical It would be great to have feature parity with Hive on REGEX column specification: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification A SELECT statement can take a regex-based column specification in Hive: {code:sql} SELECT `(ds|hr)?+.+` FROM sales {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
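Until such a feature lands, a rough equivalent is possible with the DataFrame API (a sketch, under the assumption that filtering column names on the client side is acceptable; it is not a parser-level feature like Hive's):

{code}
// Emulating Hive's regex column specification by filtering df.columns.
import org.apache.spark.sql.functions.col

val pattern = "(ds|hr)?+.+" // Hive-style: select everything except ds and hr
val selected = df.select(df.columns.filter(_.matches(pattern)).map(col): _*)
{code}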
[jira] [Resolved] (SPARK-13834) Update sbt and sbt plugins for 2.x.
[ https://issues.apache.org/jira/browse/SPARK-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13834. - Resolution: Fixed Assignee: Dongjoon Hyun (was: Apache Spark) Fix Version/s: 2.0.0 > Update sbt and sbt plugins for 2.x. > --- > > Key: SPARK-13834 > URL: https://issues.apache.org/jira/browse/SPARK-13834 > Project: Spark > Issue Type: Improvement >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 2.0.0 > > > For 2.0.0, we should bring sbt and the sbt plugins up to date. This PR checks > the status of each plugin and bumps the following: > * sbt: 0.13.9 --> 0.13.11 > * sbteclipse-plugin: 2.2.0 --> 4.0.0 > * sbt-dependency-graph: 0.7.4 --> 0.8.2 > * sbt-mima-plugin: 0.1.6 --> 0.1.9 > * sbt-revolver: 0.7.2 --> 0.8.0 > All other plugins are up-to-date. (Note that sbt-avro seems to have changed from > 0.3.2 to 1.0.1, but it's not published in the repository.) > During the upgrade, this PR also updated the following MiMa exclusion. Note that the > related excluding filter is already registered correctly. This seems to be due to a > change in the problem type that MiMa reports. > {code:title=project/build.properties|borderStyle=solid} > // SPARK-12896 Send only accumulator updates to driver, not TaskMetrics > > ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.Accumulable.this"), > -ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.Accumulator.this"), > +ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.Accumulator.this"), > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-13845) BlockStatus and StreamBlockId keep on growing result driver OOM
[ https://issues.apache.org/jira/browse/SPARK-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jeanlyn updated SPARK-13845: Summary: BlockStatus and StreamBlockId keep on growing result driver OOM (was: Driver OOM after few days when running streaming) > BlockStatus and StreamBlockId keep on growing result driver OOM > --- > > Key: SPARK-13845 > URL: https://issues.apache.org/jira/browse/SPARK-13845 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.5.2, 1.6.1 >Reporter: jeanlyn > > We have a streaming job using *FlumePollInputStream* whose driver always OOMs after > a few days; here is some driver heap dump output before the OOM > {noformat} > num #instances #bytes class name > -- >1: 13845916 553836640 org.apache.spark.storage.BlockStatus >2: 14020324 336487776 org.apache.spark.storage.StreamBlockId >3: 13883881 333213144 scala.collection.mutable.DefaultEntry >4: 8907 89043952 [Lscala.collection.mutable.HashEntry; >5: 62360 65107352 [B >6:163368 24453904 [Ljava.lang.Object; >7:293651 20342664 [C > ... > {noformat} > *BlockStatus* and *StreamBlockId* keep on growing, and the driver OOMs in the > end. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13724) Parameter maxMemoryInMB has gone missing in MlLib 1.6.0 DecisionTree.trainClassifier()
[ https://issues.apache.org/jira/browse/SPARK-13724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192718#comment-15192718 ] senthil gandhi commented on SPARK-13724: I was able to replicate this; here is the exception. How does one deal with this from the Python API? --- py4j.protocol.Py4JJavaError: An error occurred while calling o163.trainDecisionTreeModel. : java.lang.IllegalArgumentException: requirement failed: RandomForest/DecisionTree given maxMemoryInMB = 256, which is too small for the given features. Minimum value = 446 at scala.Predef$.require(Predef.scala:233) at org.apache.spark.mllib.tree.RandomForest.run(RandomForest.scala:186) at org.apache.spark.mllib.tree.DecisionTree.run(DecisionTree.scala:60) at org.apache.spark.mllib.tree.DecisionTree$.train(DecisionTree.scala:86) at org.apache.spark.mllib.api.python.PythonMLLibAPI.trainDecisionTreeModel(PythonMLLibAPI.scala:748) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) at py4j.Gateway.invoke(Gateway.java:259) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133) > Parameter maxMemoryInMB has gone missing in MlLib 1.6.0 > DecisionTree.trainClassifier() > -- > > Key: SPARK-13724 > URL: https://issues.apache.org/jira/browse/SPARK-13724 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 1.6.0 >Reporter: senthil gandhi > > DecisionTree.trainClassifier() reports that maxMemoryInMB is too small during > training and stops. But when I tried to set it, I found that in MLlib of Spark > 1.6.0, pyspark.mllib.tree.DecisionTree no longer has this parameter in the > named parameter list. > (Also not sure if this is the place for this issue, kindly educate!) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
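On the Scala side, {{maxMemoryInMB}} is reachable through {{Strategy}}, so one possible workaround is to train via the strategy-based API; whether an equivalent knob can be exposed to PySpark is the open question in this ticket. A sketch of the Scala-side workaround:

{code}
// Scala-side workaround sketch: Strategy exposes maxMemoryInMB directly.
// trainingData is a placeholder RDD[LabeledPoint]; other parameters are examples.
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.configuration.{Algo, Strategy}
import org.apache.spark.mllib.tree.impurity.Gini

val strategy = new Strategy(algo = Algo.Classification, impurity = Gini,
  maxDepth = 5, numClasses = 2, maxMemoryInMB = 512)
val model = DecisionTree.train(trainingData, strategy)
{code}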
[jira] [Resolved] (SPARK-13823) Always specify Charset in String <-> byte[] conversions (and remaining Coverity items)
[ https://issues.apache.org/jira/browse/SPARK-13823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13823. - Resolution: Fixed Fix Version/s: 2.0.0 > Always specify Charset in String <-> byte[] conversions (and remaining > Coverity items) > -- > > Key: SPARK-13823 > URL: https://issues.apache.org/jira/browse/SPARK-13823 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 2.0.0 >Reporter: Sean Owen >Assignee: Sean Owen >Priority: Minor > Fix For: 2.0.0 > > > Most of the remaining items from the last Coverity scan concern using, for > example, the constructor {{new String(byte[])}} or the method > {{String.getBytes()}}, or similarly for constructors of {{InputStreamReader}} > and {{OutputStreamWriter}}. These use the platform default encoding, which > means their behavior may change in different locales, which should be > undesirable in all cases in Spark. > It makes sense to specify UTF-8 as the default everywhere; where already > specified, it's UTF-8 in 95% of cases. A few tests set US-ASCII, but UTF-8 is > a superset. > We should also consistently use {{StandardCharsets.UTF_8}} rather than > "UTF-8" or Guava's {{Charsets.UTF_8}} to specify this. > (Finally, we should touch up the other few remaining Coverity scan items, > which are trivial, while we're here.) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
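Concretely, the pattern SPARK-13823 standardizes on looks like the following (shown for illustration):

{code}
// Always pass an explicit charset so behavior does not depend on the platform locale.
import java.nio.charset.StandardCharsets

val bytes = "héllo".getBytes(StandardCharsets.UTF_8)  // not "héllo".getBytes()
val str   = new String(bytes, StandardCharsets.UTF_8) // not new String(bytes)
{code}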
[jira] [Commented] (SPARK-13249) Filter null keys for inner join
[ https://issues.apache.org/jira/browse/SPARK-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192723#comment-15192723 ] Liang-Chi Hsieh commented on SPARK-13249: - I think this can be closed? > Filter null keys for inner join > --- > > Key: SPARK-13249 > URL: https://issues.apache.org/jira/browse/SPARK-13249 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Davies Liu > > For inner join, the join key with null in it will not match each other, so we > could insert a Filter before inner join (could be pushed down), then we don't > need to check nullability of keys while joining. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
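The rewrite proposed in SPARK-13249 is conceptually the following (an illustration of the idea, not the optimizer rule itself; {{a}} and {{b}} are placeholder DataFrames with join key column "k"):

{code}
// An inner equi-join can never match rows whose key is null, so filtering
// nulls out of both sides first is semantics-preserving and can be pushed down.
import org.apache.spark.sql.functions.col

val pruned = a.filter(col("k").isNotNull)
  .join(b.filter(col("k").isNotNull), a("k") === b("k"))
// same result as a.join(b, a("k") === b("k")), minus null-key checks at join time
{code}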
[jira] [Updated] (SPARK-13764) Parse modes in JSON data source
[ https://issues.apache.org/jira/browse/SPARK-13764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-13764: - Description: Currently, the JSON data source just fails to read if some JSON documents are malformed. Therefore, if there are the two JSON documents below: {noformat} { "request": { "user": { "id": 123 } } } {noformat} {noformat} { "request": { "user": [] } } {noformat} This will fail, emitting the exception below: {noformat} Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 4 times, most recent failure: Lost task 7.3 in stage 0.0 (TID 10, 192.168.1.170): java.lang.ClassCastException: org.apache.spark.sql.types.GenericArrayData cannot be cast to org.apache.spark.sql.catalyst.InternalRow at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getStruct(rows.scala:50) at org.apache.spark.sql.catalyst.expressions.GenericMutableRow.getStruct(rows.scala:247) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source) at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:67) at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:67) at org.apache.spark.sql.execution.Filter$$anonfun$4$$anonfun$apply$4.apply(basicOperators.scala:117) at org.apache.spark.sql.execution.Filter$$anonfun$4$$anonfun$apply$4.apply(basicOperators.scala:115) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.org$apache$spark$sql$execution$aggregate$TungstenAggregate$$anonfun$$executePartition$1(TungstenAggregate.scala:97) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119) at org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:64) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300) at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300) at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:88) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} So, just like the parse modes in the CSV data source (see https://github.com/databricks/spark-csv), it would be great if there were some parse modes so that users do not have to filter or pre-process the data themselves. This happens only when a custom schema is set. When the schema is inferred, the field is inferred as {{StringType}}, which reads the data successfully anyway. was: Currently, JSON data source just fails to read if some JSON documents are malformed. 
Therefore, if there are two JSON documents below: {noformat} { "request": { "user": { "id": 123 } } } {noformat} {noformat} { "request": { "user": [] } } {noformat} This will fail emitting the exception below : {noformat} Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 4 times, most recent failure: Lost task 7.3 in stage 0.0 (TID 10, 192.168.1.170): java.lang.ClassCastException: org.apache.spark.sql.types.GenericArrayData cannot be cast to org.apache.spark.sql.catalyst.InternalRow at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getStruct(rows.scala:50) at org.apache.spark.sql.catalyst.expressions.GenericMutableRow.getStruct(rows.scala:247) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source) at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:67) at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:67)
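A sketch of what the parse-mode API requested in SPARK-13764 could look like, mirroring spark-csv's options (the option name and values below are assumptions about the proposal, not an implemented API at the time of this ticket):

{code}
// Hypothetical parse-mode option for the JSON source, modeled on spark-csv:
// PERMISSIVE (null out bad fields), DROPMALFORMED (skip bad docs), FAILFAST.
val df = sqlContext.read
  .schema(customSchema)
  .option("mode", "DROPMALFORMED")
  .json("path/to/documents.json")
{code}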
[jira] [Created] (SPARK-13850) TimSort Comparison method violates its general contract
Sital Kedia created SPARK-13850: --- Summary: TimSort Comparison method violates its general contract Key: SPARK-13850 URL: https://issues.apache.org/jira/browse/SPARK-13850 Project: Spark Issue Type: Bug Components: Shuffle Affects Versions: 1.6.0 Reporter: Sital Kedia While running a query which does a group by on a large dataset, the query fails with following stack trace. ``` Job aborted due to stage failure: Task 4077 in stage 1.3 failed 4 times, most recent failure: Lost task 4077.3 in stage 1.3 (TID 88702, hadoop3030.prn2.facebook.com): java.lang.IllegalArgumentException: Comparison method violates its general contract! at org.apache.spark.util.collection.TimSort$SortState.mergeLo(TimSort.java:794) at org.apache.spark.util.collection.TimSort$SortState.mergeAt(TimSort.java:525) at org.apache.spark.util.collection.TimSort$SortState.mergeCollapse(TimSort.java:453) at org.apache.spark.util.collection.TimSort$SortState.access$200(TimSort.java:325) at org.apache.spark.util.collection.TimSort.sort(TimSort.java:153) at org.apache.spark.util.collection.Sorter.sort(Sorter.scala:37) at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.getSortedIterator(UnsafeInMemorySorter.java:228) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:186) at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:175) at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:249) at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:112) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPageIfNecessary(UnsafeExternalSorter.java:318) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:333) at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:91) at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:168) at org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:90) at org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:64) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) ``` Please note that the same query used to succeed in Spark 1.5 so it seems like a regression in 1.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
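For readers unfamiliar with the error in SPARK-13850: TimSort throws it when the supplied ordering is internally inconsistent (violating antisymmetry or transitivity), or when the data being compared changes underneath the sort. A classic illustration, not Spark code:

{code}
// Illustration only: the subtraction comparator overflows, so
// compare(Int.MinValue, 1) > 0 ("MinValue greater than 1") while 1 > -1 and
// -1 > Int.MinValue also hold, a cycle that violates transitivity and that
// TimSort can detect at merge time with exactly this IllegalArgumentException.
val broken = new Ordering[Int] { def compare(a: Int, b: Int): Int = a - b }
// Seq(Int.MinValue, 1, -1).sorted(broken) // may throw:
// java.lang.IllegalArgumentException: Comparison method violates its general contract!
{code}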
[jira] [Updated] (SPARK-13850) TimSort Comparison method violates its general contract
[ https://issues.apache.org/jira/browse/SPARK-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sital Kedia updated SPARK-13850: Description: While running a query which does a group by on a large dataset, the query fails with following stack trace. {code} Job aborted due to stage failure: Task 4077 in stage 1.3 failed 4 times, most recent failure: Lost task 4077.3 in stage 1.3 (TID 88702, hadoop3030.prn2.facebook.com): java.lang.IllegalArgumentException: Comparison method violates its general contract! at org.apache.spark.util.collection.TimSort$SortState.mergeLo(TimSort.java:794) at org.apache.spark.util.collection.TimSort$SortState.mergeAt(TimSort.java:525) at org.apache.spark.util.collection.TimSort$SortState.mergeCollapse(TimSort.java:453) at org.apache.spark.util.collection.TimSort$SortState.access$200(TimSort.java:325) at org.apache.spark.util.collection.TimSort.sort(TimSort.java:153) at org.apache.spark.util.collection.Sorter.sort(Sorter.scala:37) at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.getSortedIterator(UnsafeInMemorySorter.java:228) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:186) at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:175) at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:249) at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:112) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPageIfNecessary(UnsafeExternalSorter.java:318) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:333) at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:91) at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:168) at org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:90) at org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:64) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {code} Please note that the same query used to succeed in Spark 1.5 so it seems like a regression in 1.6. was: While running a query which does a group by on a large dataset, the query fails with following stack trace. ``` Job aborted due to stage failure: Task 4077 in stage 1.3 failed 4 times, most recent failure: Lost task 4077.3 in stage 1.3 (TID 88702, hadoop3030.prn2.facebook.com): java.lang.IllegalArgumentException: Comparison method violates its general contract! at org.apache.spark.util.collection.TimSort$SortState.mergeLo(TimSort.java:794) at org.apache.spark.util.collection.TimSort$SortState.mergeAt(TimSort.java:525) at org.apache.spark.util.collection.TimSort$SortState.mergeCollapse(TimSort.java:453) at org.apache.spark.util.collection.TimSort$SortState.access$200(TimSort.java:325) at org.apache.spark.util.collection.TimSort.sort(TimSort.java:153) at org.apache.spark.util.col
[jira] [Created] (SPARK-13851) spark streaming web ui remains completed jobs as active jobs
t7s created SPARK-13851: --- Summary: spark streaming web ui remains completed jobs as active jobs Key: SPARK-13851 URL: https://issues.apache.org/jira/browse/SPARK-13851 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.4.1 Environment: yarn 2.3.0-cdh5.1.0 Reporter: t7s Priority: Minor !http://apache-spark-user-list.1001560.n3.nabble.com/file/n26474/34%E5%B1%8F%E5%B9%95%E6%88%AA%E5%9B%BE.png! env: yarn 2.3.0-cdh5.1.0 --master yarn-cluster --conf "spark.driver.cores=2" --conf "spark.akka.threads=16" I am sure these jobs completed according to the log. For example, job 9816. This is the info about job 9816 in the log: [2016-03-14 07:15:05,088] INFO Job 9816 finished: transform at ErrorStreaming2.scala:396, took 8.218500 s (org.apache.spark.scheduler.DAGScheduler) The stages in job 9816 are completed too, according to the log. But job 9816 is still shown as an active job in the web UI. Why? How can I clear these remaining jobs? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-13851) spark streaming web ui remains completed jobs as active jobs
[ https://issues.apache.org/jira/browse/SPARK-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t7s updated SPARK-13851: Attachment: try.png > spark streaming web ui remains completed jobs as active jobs > > > Key: SPARK-13851 > URL: https://issues.apache.org/jira/browse/SPARK-13851 > Project: Spark > Issue Type: Bug > Components: Streaming >Affects Versions: 1.4.1 > Environment: yarn 2.3.0-cdh5.1.0 >Reporter: t7s >Priority: Minor > Attachments: try.png > > > !http://apache-spark-user-list.1001560.n3.nabble.com/file/n26474/34%E5%B1%8F%E5%B9%95%E6%88%AA%E5%9B%BE.png! > env: yarn 2.3.0-cdh5.1.0 > --master yarn-cluster > --conf "spark.driver.cores=2" > --conf "spark.akka.threads=16" > I am sure these jobs completed according to the log. > For example, job 9816. > This is the info about job 9816 in the log: > [2016-03-14 07:15:05,088] INFO Job 9816 finished: transform at > ErrorStreaming2.scala:396, took 8.218500 s > (org.apache.spark.scheduler.DAGScheduler) > The stages in job 9816 are completed too, according to the log. > But job 9816 is still shown as an active job in the web UI. Why? > How can I clear these remaining jobs? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-13851) spark streaming web ui remains completed jobs as active jobs
[ https://issues.apache.org/jira/browse/SPARK-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t7s updated SPARK-13851: Description: !try.png! env: yarn 2.3.0-cdh5.1.0 --master yarn-cluster --conf "spark.driver.cores=2" --conf "spark.akka.threads=16" I am sure these jobs completed according to the log. For example, job 9816. This is the info about job 9816 in the log: [2016-03-14 07:15:05,088] INFO Job 9816 finished: transform at ErrorStreaming2.scala:396, took 8.218500 s (org.apache.spark.scheduler.DAGScheduler) The stages in job 9816 are completed too, according to the log. But job 9816 is still shown as an active job in the web UI. Why? How can I clear these remaining jobs? was: !http://apache-spark-user-list.1001560.n3.nabble.com/file/n26474/34%E5%B1%8F%E5%B9%95%E6%88%AA%E5%9B%BE.png! env: yarn 2.3.0-cdh5.1.0 --master yarn-cluster --conf "spark.driver.cores=2" --conf "spark.akka.threads=16" I am sure these jobs completed according to the log. For example, job 9816. This is the info about job 9816 in the log: [2016-03-14 07:15:05,088] INFO Job 9816 finished: transform at ErrorStreaming2.scala:396, took 8.218500 s (org.apache.spark.scheduler.DAGScheduler) The stages in job 9816 are completed too, according to the log. But job 9816 is still shown as an active job in the web UI. Why? How can I clear these remaining jobs? > spark streaming web ui remains completed jobs as active jobs > > > Key: SPARK-13851 > URL: https://issues.apache.org/jira/browse/SPARK-13851 > Project: Spark > Issue Type: Bug > Components: Streaming >Affects Versions: 1.4.1 > Environment: yarn 2.3.0-cdh5.1.0 >Reporter: t7s >Priority: Minor > Attachments: try.png > > > !try.png! > env: yarn 2.3.0-cdh5.1.0 > --master yarn-cluster > --conf "spark.driver.cores=2" > --conf "spark.akka.threads=16" > I am sure these jobs completed according to the log. > For example, job 9816. > This is the info about job 9816 in the log: > [2016-03-14 07:15:05,088] INFO Job 9816 finished: transform at > ErrorStreaming2.scala:396, took 8.218500 s > (org.apache.spark.scheduler.DAGScheduler) > The stages in job 9816 are completed too, according to the log. > But job 9816 is still shown as an active job in the web UI. Why? > How can I clear these remaining jobs? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-13851) spark streaming web ui remains completed jobs as active jobs
[ https://issues.apache.org/jira/browse/SPARK-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t7s updated SPARK-13851: Priority: Major (was: Minor) > spark streaming web ui remains completed jobs as active jobs > > > Key: SPARK-13851 > URL: https://issues.apache.org/jira/browse/SPARK-13851 > Project: Spark > Issue Type: Bug > Components: Streaming >Affects Versions: 1.4.1 > Environment: yarn 2.3.0-cdh5.1.0 >Reporter: t7s > Attachments: try.png > > > !try.png! > env: yarn 2.3.0-cdh5.1.0 > --master yarn-cluster > --conf "spark.driver.cores=2" > --conf "spark.akka.threads=16" > I am sure these jobs completed according to the log. > For example, job 9816. > This is the info about job 9816 in the log: > [2016-03-14 07:15:05,088] INFO Job 9816 finished: transform at > ErrorStreaming2.scala:396, took 8.218500 s > (org.apache.spark.scheduler.DAGScheduler) > The stages in job 9816 are completed too, according to the log. > But job 9816 is still shown as an active job in the web UI. Why? > How can I clear these remaining jobs? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-13618) Make Streaming web UI display rate-limit lines in the statistics graph
[ https://issues.apache.org/jira/browse/SPARK-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liwei Lin updated SPARK-13618: -- Description: This JIRA proposes to make the Streaming web UI display rate-limit lines in the statistics graph. Please see this design doc for details: https://docs.google.com/document/d/1kGXQEcToNglDK-AbyEJGoYX9wwODimUVf0PetoxuU5M/edit [Screenshots] without back pressure: !https://cloud.githubusercontent.com/assets/15843379/13664195/d2264c48-e6e0-11e5-85e6-f13187d4cbde.png! with back pressure: !https://cloud.githubusercontent.com/assets/15843379/13664196/d2549c7e-e6e0-11e5-9f62-d7f1458f1c27.png! was: This JIRA proposes to make the Streaming web UI display rate-limit lines in the statistics graph. Please see the enclosed design doc for details. [Screenshots] without back pressure: !https://cloud.githubusercontent.com/assets/15843379/13664195/d2264c48-e6e0-11e5-85e6-f13187d4cbde.png! with back pressure: !https://cloud.githubusercontent.com/assets/15843379/13664196/d2549c7e-e6e0-11e5-9f62-d7f1458f1c27.png! > Make Streaming web UI display rate-limit lines in the statistics graph > -- > > Key: SPARK-13618 > URL: https://issues.apache.org/jira/browse/SPARK-13618 > Project: Spark > Issue Type: Improvement > Components: Streaming, Web UI >Affects Versions: 1.6.0, 2.0.0 >Reporter: Liwei Lin > Attachments: Spark-13618_Design_Doc_v1.pdf > > > This JIRA proposes to make the Streaming web UI display rate-limit lines in the > statistics graph. > Please see this design doc for details: > https://docs.google.com/document/d/1kGXQEcToNglDK-AbyEJGoYX9wwODimUVf0PetoxuU5M/edit > [Screenshots] > without back pressure: > !https://cloud.githubusercontent.com/assets/15843379/13664195/d2264c48-e6e0-11e5-85e6-f13187d4cbde.png! > with back pressure: > !https://cloud.githubusercontent.com/assets/15843379/13664196/d2549c7e-e6e0-11e5-9f62-d7f1458f1c27.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13642) Properly handle signal kill of ApplicationMaster
[ https://issues.apache.org/jira/browse/SPARK-13642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192828#comment-15192828 ] Apache Spark commented on SPARK-13642: -- User 'jerryshao' has created a pull request for this issue: https://github.com/apache/spark/pull/11690 > Properly handle signal kill of ApplicationMaster > > > Key: SPARK-13642 > URL: https://issues.apache.org/jira/browse/SPARK-13642 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 1.6.0 >Reporter: Saisai Shao >Assignee: Saisai Shao > > Currently when running Spark on Yarn with yarn cluster mode, the default > application final state is "SUCCEEDED"; if any exception occurs, this > final state is changed to "FAILED", triggering a reattempt if > possible. > This is OK in the normal case, but there is a race condition when the AM receives a > signal (SIGTERM) and no exception has occurred. In this situation, the > shutdown hook is invoked and marks this application as finished > successfully, and there is no further attempt. > In such a situation, from Spark's perspective this application failed > and needs another attempt, but from Yarn's perspective the application finished > successfully. > This can happen in an NM failure situation: the failure of the NM sends > SIGTERM to the AM, and the AM should mark this attempt as failed and rerun, not > invoke unregister. > So to increase the chance of this race condition, here is the reproduced code: > {code} > val sc = ... > Thread.sleep(3L) > sc.parallelize(1 to 100).collect() > {code} > If the AM is killed while sleeping, no exception is thrown, so from > Yarn's point of view this application finished successfully, but from Spark's > point of view, this application should be reattempted. 
> The log normally like this: > {noformat} > 16/03/03 12:44:19 INFO ContainerManagementProtocolProxy: Opening proxy : > 192.168.0.105:45454 > 16/03/03 12:44:21 INFO YarnClusterSchedulerBackend: Registered executor > NettyRpcEndpointRef(null) (192.168.0.105:57461) with ID 2 > 16/03/03 12:44:21 INFO BlockManagerMasterEndpoint: Registering block manager > 192.168.0.105:57462 with 511.1 MB RAM, BlockManagerId(2, 192.168.0.105, 57462) > 16/03/03 12:44:23 INFO YarnClusterSchedulerBackend: Registered executor > NettyRpcEndpointRef(null) (192.168.0.105:57467) with ID 1 > 16/03/03 12:44:23 INFO BlockManagerMasterEndpoint: Registering block manager > 192.168.0.105:57468 with 511.1 MB RAM, BlockManagerId(1, 192.168.0.105, 57468) > 16/03/03 12:44:23 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready > for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 > 16/03/03 12:44:23 INFO YarnClusterScheduler: > YarnClusterScheduler.postStartHook done > 16/03/03 12:44:39 ERROR ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM > 16/03/03 12:44:39 INFO SparkContext: Invoking stop() from shutdown hook > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/metrics/json,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/stages/stage/kill,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/api,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/static,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/executors/threadDump/json,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/executors/threadDump,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/executors/json,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/executors,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/environment/json,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/environment,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/storage/rdd/json,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/storage/rdd,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/storage/json,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/storage,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/stages/pool/json,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/stages/pool,null} > 16/03/03 12:44:39 INFO ContextHandler: stopped > o.e.j.s.ServletContextHandler{/stages/stage/json,null} > 16/03/03 12:44:39 INFO ContextHandl