[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830760#comment-16830760 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:44 PM:
---
@[~dongjoon] I am wondering if structured streaming works in a similar scenario.

was (Author: skonto): @[~dongjoon] I am wondering if structured streaming works.

> DStreams checkpointing does not work with the Spark Shell
> ---------------------------------------------------------
>
> Key: SPARK-27598
> URL: https://issues.apache.org/jira/browse/SPARK-27598
> Project: Spark
> Issue Type: Bug
> Components: DStreams
> Affects Versions: 2.4.0, 2.4.1, 2.4.2, 3.0.0
> Reporter: Stavros Kontopoulos
> Priority: Major
>
> When I restarted a stream with checkpointing enabled I got this:
> {quote}19/04/29 22:45:06 WARN CheckpointReader: Error reading checkpoint from file
> [file:/tmp/checkpoint/checkpoint-155656695.bk|file:///tmp/checkpoint/checkpoint-155656695.bk]
> java.io.IOException: java.lang.ClassCastException: cannot assign instance of
> java.lang.invoke.SerializedLambda to field
> org.apache.spark.streaming.dstream.FileInputDStream.filter of type
> scala.Function1 in instance of
> org.apache.spark.streaming.dstream.FileInputDStream
> at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1322)
> at org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:314)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {quote}
> It seems that the closure is stored in serialized form and cannot be
> assigned back to a Scala Function1.
> Details of how to reproduce it here:
> [https://gist.github.com/skonto/87d5b2368b0bf7786d9dd992a710e4e6]
> Maybe this is spark-shell specific and is not expected to work anyway, as I
> don't see this being an issue with a normal jar.
> Note that with Spark 2.3.3 this still does not work, but it fails with a
> different error.
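For context (not from the ticket itself): in Scala 2.12, a function literal compiles to a lambda that is serialized via java.lang.invoke.SerializedLambda and is rebuilt on read by synthetic code in the class that defined it, so deserialization only works while that defining class is on the classpath. A minimal round-trip sketch in plain Scala, no Spark needed (hypothetical names; in the same JVM the defining class is present, so this succeeds, whereas checkpoint recovery in a fresh spark-shell session has no such class):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object LambdaRoundTrip {
  // Serialize a Scala function with plain Java serialization and read it back.
  // The written bytes contain a SerializedLambda pointing at this object's class.
  def roundTrip(f: Int => Int): Int => Int = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(f)
    out.close() // flush the ObjectOutputStream block buffer
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    in.readObject().asInstanceOf[Int => Int]
  }

  def main(args: Array[String]): Unit =
    println(roundTrip(x => x + 1)(41)) // prints 42
}
```

If the bytes were instead read in a JVM where LambdaRoundTrip does not exist (the spark-shell restart case), readObject would fail while resolving the lambda, which matches the ClassCastException/ClassNotFoundException seen in the report.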
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830758#comment-16830758 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:39 PM:
---
Btw, if I do the trick and put the mappingFunc in an object like this, with Spark 2.3.3 on restart I get:
{quote}
def createContext(checkpointDirectory: String, inputDirectory: String, outputDirectory: String): StreamingContext = {
  ...
  object T extends Serializable {
    // Update the cumulative count using mapWithState
    // This will give a DStream made of state (which is the cumulative count of the words)
    val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
      val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
      val output = (word, sum)
      state.update(sum)
      output
    }
  }
  val stateDstream = words.mapWithState(
    StateSpec.function(T.mappingFunc).initialState(initialRDD))
}
{quote}
{quote}
2019-05-01 02:36:14 WARN BatchedWriteAheadLog:66 - BatchedWriteAheadLog Writer queue interrupted.
org.apache.spark.SparkException: This RDD lacks a SparkContext. It could happen in the following cases:
(1) RDD transformations and actions are NOT invoked by the driver, but inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
(2) When a Spark Streaming job recovers from checkpoint, this exception will be hit if a reference to an RDD not defined by the streaming job is used in DStream operations. For more information, see SPARK-13758.
at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:90)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.partitionBy(PairRDDFunctions.scala:529)
at org.apache.spark.streaming.rdd.MapWithStateRDD$.createFromPairRDD(MapWithStateRDD.scala:193)
at org.apache.spark.streaming.dstream.InternalMapWithStateDStream.compute(MapWithStateDStream.scala:146)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
{quote}

was (Author: skonto): Btw if I do the trick and put the mappingFunction in an object like this with Spark 2.3.3 on restart I get:
{quote}
def createContext(checkpointDirectory: String, inputDirectory: String, outputDirectory: String): StreamingContext = {
  ...
  object T extends Serializable {
    // Update the cumulative count using mapWithState
    // This will give a DStream made of state (which is the cumulative count of the words)
    val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
      val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
      val output = (word, sum)
      state.update(sum)
      output
    }
  }
{quote}
}
{quote}
2019-05-01 02:36:14 WARN BatchedWriteAheadLog:66 - BatchedWriteAheadLog Writer queue interrupted.
org.apache.spark.SparkException: This RDD lacks a SparkContext. It could happen in the following cases:
(1) RDD transformations and actions are NOT invoked by the driver, but inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
(2) When a Spark Streaming job recovers from checkpoint, this exception will be hit if a reference to an RDD not defined by the streaming job is used in DStream operations. For more information, see SPARK-13758.
at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:90)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.partitionBy(PairRDDFunctions.scala:529)
at org.apache.spark.streaming.rdd.MapWithStateRDD$.createFromPairRDD(MapWithStateRDD.scala:193)
at org.apache.spark.streaming.dstream.InternalMapWithStateDStream.compute(MapWithStateDStream.scala:146)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
{quote}
[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830740#comment-16830740 ] Dongjoon Hyun edited comment on SPARK-27598 at 4/30/19 11:24 PM:
---
If this fails with Spark 2.3 with Scala 2.11, please remove `Scala 2.12` from the title. It's misleading.

was (Author: dongjoon): If this fails with Spark 2.3 with Scala 2.11, please remove `Scala 2.11` from the title. It's misleading.
[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830745#comment-16830745 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:21 PM:
---
I will remove the language version from the title. This seems to be specific to the shell. With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a different error:
{quote}
2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint from file [file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
java.io.IOException: java.lang.ClassNotFoundException: $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
at org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to deserialize data from the checkpoint file using classes from the Spark shell (like $line16) that will not exist when the new shell is started.

was (Author: skonto): I will remove the language version from the title. This seems to be specific to the shell. With Spark 2.3.3 which I assume has 2.11 by default it fails with a different error:
{quote}
2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint from file [file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
java.io.IOException: java.lang.ClassNotFoundException: $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
at org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to deserialize data from the checkpoint file using classes from the Spark Shell that will not exist when the new shell is started, like line16.
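The $line16.$read$$iw... prefix is the synthetic wrapper the Scala REPL generates around each shell line; a restarted shell generates fresh wrappers, so names from the previous session cannot be resolved when the checkpoint is read back. The failure mode can be sketched in plain Scala by forcing ObjectInputStream.resolveClass to fail for one class, standing in for the missing REPL wrapper (hypothetical Payload class; a simulation of the mechanism, not the actual Spark code path):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

case class Payload(words: Seq[String]) // stands in for a shell-defined class

object MissingClassDemo {
  // Serialize an object, then read it back with a stream that pretends the
  // defining class is gone, as after restarting spark-shell.
  def readBack(obj: AnyRef): Either[String, AnyRef] = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray)) {
      override def resolveClass(desc: ObjectStreamClass): Class[_] =
        if (desc.getName.contains("Payload"))
          throw new ClassNotFoundException(desc.getName) // simulate missing $line wrapper
        else super.resolveClass(desc)
    }
    try Right(in.readObject())
    catch { case e: ClassNotFoundException => Left(s"ClassNotFoundException: ${e.getMessage}") }
  }

  def main(args: Array[String]): Unit =
    println(readBack(Payload(Seq("spark"))))
}
```

The ClassNotFoundException surfaces from readObject exactly as in the FileInputDStream.readObject trace above, which is why the usual advice is to define such functions in a compiled jar rather than in the shell.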