[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830760#comment-16830760
 ] 

Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:44 PM:
---

@[~dongjoon] I am wondering if structured streaming works in a similar scenario.
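For comparison, a minimal sketch of what "a similar scenario" could look like in structured streaming; the paths and names here are illustrative assumptions, not taken from the issue:
{code:scala}
// Hedged sketch: a structured streaming word count with checkpointing.
// Structured streaming checkpoints offsets and operator state rather than
// serialized closures, so restart behaviour plausibly differs from DStreams.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("StructuredRestart").getOrCreate()
import spark.implicits._

val words = spark.readStream
  .format("text")
  .load("/tmp/input")          // illustrative input directory
  .as[String]
  .flatMap(_.split(" "))

val counts = words.groupBy("value").count()

// On restart with the same checkpointLocation the query resumes from the
// checkpointed offsets instead of deserializing a DStream graph.
counts.writeStream
  .outputMode("complete")
  .format("console")
  .option("checkpointLocation", "/tmp/ss-checkpoint") // illustrative path
  .start()
{code}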


was (Author: skonto):
@[~dongjoon] I am wondering if structured streaming works.

> DStreams checkpointing does not work with the Spark Shell
> -
>
> Key: SPARK-27598
> URL: https://issues.apache.org/jira/browse/SPARK-27598
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.4.0, 2.4.1, 2.4.2, 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> When I restarted a stream with checkpointing enabled I got this:
> {quote}19/04/29 22:45:06 WARN CheckpointReader: Error reading checkpoint from 
> file 
> [file:/tmp/checkpoint/checkpoint-155656695.bk|file:///tmp/checkpoint/checkpoint-155656695.bk]
>  java.io.IOException: java.lang.ClassCastException: cannot assign instance of 
> java.lang.invoke.SerializedLambda to field 
> org.apache.spark.streaming.dstream.FileInputDStream.filter of type 
> scala.Function1 in instance of 
> org.apache.spark.streaming.dstream.FileInputDStream
>  at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1322)
>  at 
> org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:314)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {quote}
> It seems that the closure is stored in SerializedLambda form and cannot be 
> assigned back to a Scala Function1.
> Details of how to reproduce it here: 
> [https://gist.github.com/skonto/87d5b2368b0bf7786d9dd992a710e4e6]
> Maybe this is spark-shell specific and is not expected to work anyway, as I 
> don't see this being an issue with a normal jar. 
> Note that with Spark 2.3.3 this still does not work, but with a different 
> error.
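For context, a minimal sketch of the kind of shell session that hits this, condensed and with illustrative paths; the authoritative reproduction steps are in the gist above:
{code:scala}
// Hedged sketch of a checkpointed DStreams job started from spark-shell.
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "/tmp/checkpoint"   // illustrative path

def createContext(): StreamingContext = {
  val ssc = new StreamingContext(sc, Seconds(10)) // `sc` is the shell's SparkContext
  ssc.checkpoint(checkpointDir)
  // textFileStream builds a FileInputDStream; its `filter` field is the
  // scala.Function1 that fails to deserialize in the stack trace above.
  val lines = ssc.textFileStream("/tmp/input")    // illustrative path
  lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
  ssc
}

// The first shell run writes checkpoints; a second shell session hits the
// ClassCastException here while recovering them.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
{code}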






[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830758#comment-16830758
 ] 

Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:39 PM:
---

Btw, if I apply the trick of putting the mappingFunction in an object like 
this, then with Spark 2.3.3 I get the following on restart:
{quote}def createContext(checkpointDirectory: String, inputDirectory: String, 
outputDirectory: String): StreamingContext = {

...

object T extends Serializable {
  // Update the cumulative count using mapWithState.
  // This will give a DStream made of state (which is the cumulative count 
  // of the words).
  val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
    val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
    val output = (word, sum)
    state.update(sum)
    output
  }
}

val stateDstream = words.mapWithState(
  StateSpec.function(T.mappingFunc).initialState(initialRDD))

}
{quote}
{quote}2019-05-01 02:36:14 WARN BatchedWriteAheadLog:66 - BatchedWriteAheadLog 
Writer queue interrupted.
 org.apache.spark.SparkException: This RDD lacks a SparkContext. It could 
happen in the following cases:
 (1) RDD transformations and actions are NOT invoked by the driver, but inside 
of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) 
is invalid because the values transformation and count action cannot be 
performed inside of the rdd1.map transformation. For more information, see 
SPARK-5063.
 (2) When a Spark Streaming job recovers from checkpoint, this exception will 
be hit if a reference to an RDD not defined by the streaming job is used in 
DStream operations. For more information, See SPARK-13758.
 at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:90)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
 at 
org.apache.spark.rdd.PairRDDFunctions.partitionBy(PairRDDFunctions.scala:529)
 at 
org.apache.spark.streaming.rdd.MapWithStateRDD$.createFromPairRDD(MapWithStateRDD.scala:193)
 at 
org.apache.spark.streaming.dstream.InternalMapWithStateDStream.compute(MapWithStateDStream.scala:146)
 at 
org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
{quote}
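For reference, here is a hedged, self-contained sketch of the wiring the snippet above elides. It is modeled on Spark's StatefulNetworkWordCount example, which this code closely follows; the directory names and the initial RDD contents are illustrative assumptions:
{code:scala}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

object T extends Serializable {
  // Cumulative word count carried across batches via mapWithState.
  val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
    val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
    state.update(sum)
    (word, sum)
  }
}

def createContext(checkpointDirectory: String, inputDirectory: String): StreamingContext = {
  val conf = new SparkConf().setAppName("MapWithStateRestart") // illustrative name
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint(checkpointDirectory)

  val words = ssc.textFileStream(inputDirectory).flatMap(_.split(" ")).map((_, 1))
  // Per the SPARK-13758 hint in the error above, this captured RDD is the
  // plausible culprit: on recovery the DStream graph is deserialized rather
  // than rebuilt by createContext, so the RDD comes back without a live
  // SparkContext.
  val initialRDD = ssc.sparkContext.parallelize(Seq(("hello", 1)))
  val stateDstream = words.mapWithState(
    StateSpec.function(T.mappingFunc).initialState(initialRDD))
  stateDstream.print()
  ssc
}
{code}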


was (Author: skonto):
Btw, if I apply the trick of putting the mappingFunction in an object like 
this, then with Spark 2.3.3 I get the following on restart:
{quote}
 def createContext(checkpointDirectory: String, inputDirectory: String, 
outputDirectory: String)
 : StreamingContext = {

...

object T extends Serializable {
 // Update the cumulative count using mapWithState
 // This will give a DStream made of state (which is the cumulative count of 
the words)
 val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
 val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
 val output = (word, sum)
 state.update(sum)
 output
 }
 }
}
{quote}
{quote}2019-05-01 02:36:14 WARN BatchedWriteAheadLog:66 - BatchedWriteAheadLog 
Writer queue interrupted.
org.apache.spark.SparkException: This RDD lacks a SparkContext. It could happen 
in the following cases:
(1) RDD transformations and actions are NOT invoked by the driver, but inside 
of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) 
is invalid because the values transformation and count action cannot be 
performed inside of the rdd1.map transformation. For more information, see 
SPARK-5063.
(2) When a Spark Streaming job recovers from checkpoint, this exception will be 
hit if a reference to an RDD not defined by the streaming job is used in 
DStream operations. For more information, See SPARK-13758.
 at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:90)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
 at 
org.apache.spark.rdd.PairRDDFunctions.partitionBy(PairRDDFunctions.scala:529)
 at 
org.apache.spark.streaming.rdd.MapWithStateRDD$.createFromPairRDD(MapWithStateRDD.scala:193)
 at 
org.apache.spark.streaming.dstream.InternalMapWithStateDStream.compute(MapWithStateDStream.scala:146)
 at 
org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
{quote}


[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Dongjoon Hyun (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830740#comment-16830740
 ] 

Dongjoon Hyun edited comment on SPARK-27598 at 4/30/19 11:24 PM:
-

If this fails with Spark 2.3 on Scala 2.11, please remove `Scala 2.12` from 
the title. It's misleading.


was (Author: dongjoon):
If this fails with Spark 2.3 on Scala 2.11, please remove `Scala 2.11` from 
the title. It's misleading.







[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830745#comment-16830745
 ] 

Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:21 PM:
---

 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to deserialize data from the checkpoint file 
using classes from the Spark shell (like $line16) that will not exist when 
the new shell is started.
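One hedged workaround that follows from this reading: compile the closures into a regular jar so their class names are stable across shell sessions, instead of relying on the REPL's synthetic $lineNN wrappers. The file and jar names below are illustrative:
{code:scala}
// src/main/scala/MappingFunctions.scala, built into mapping-functions.jar
package example

import org.apache.spark.streaming.State

object MappingFunctions extends Serializable {
  // Same cumulative word count as in the shell session, but with a stable,
  // compiled class name that a recovering JVM can resolve.
  val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
    val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
    state.update(sum)
    (word, sum)
  }
}
{code}
Both the original and the recovering shell would then be started with the same jar, e.g. spark-shell --jars mapping-functions.jar, so the checkpoint can resolve example.MappingFunctions on restart.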


was (Author: skonto):
 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to deserialize data from the checkpoint file 
using classes from the Spark shell (like $line16) that will not exist when 
the new shell is started.


[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830745#comment-16830745
 ] 

Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:16 PM:
---

 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to deserialize data from the checkpoint file 
using classes from the Spark shell (like $line16) that will not exist when 
the new shell is started.


was (Author: skonto):
 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to deserialize data from the checkpoint file 
using classes from the Spark shell (like $line16) that will not exist when 
the new shell is started.


[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830745#comment-16830745
 ] 

Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:15 PM:
---

 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to deserialize data from the checkpoint file 
using classes from the Spark shell (like $line16) that will not exist when 
the new shell is started.


was (Author: skonto):
 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to serialize objects from the Spark shell 
(like $line16) that will not exist when the new shell is started.


[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830745#comment-16830745
 ] 

Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:14 PM:
---

 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}
I guess in this case it tries to serialize objects from the Spark shell 
(like $line16) that will not exist when the new shell is started.


was (Author: skonto):
 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}





[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830745#comment-16830745
 ] 

Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:13 PM:
---

 I will remove the language version from the title. This seems to be specific 
to the shell.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file 
[file:/tmp/checkpoint/checkpoint-155666587.bk|file:///tmp/checkpoint/checkpoint-155666587.bk]
 java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}


was (Author: skonto):
 I will remove the language version from the title.

With Spark 2.3.3, which I assume uses Scala 2.11 by default, it fails with a 
different error:
{quote}2019-05-01 02:12:00 WARN CheckpointReader:87 - Error reading checkpoint 
from file file:/tmp/checkpoint/checkpoint-155666587.bk
java.io.IOException: java.lang.ClassNotFoundException: 
$line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$MapWithState$$anonfun$3
 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1354)
 at 
org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:316)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1975)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
{quote}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org