[jira] [Commented] (SPARK-33106) Fix sbt resolvers clash
[ https://issues.apache.org/jira/browse/SPARK-33106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17266706#comment-17266706 ] Alexander Bessonov commented on SPARK-33106:
---------------------------------------------

My bad. I had the following in my environment, which caused the issue. The build works fine without it.

{code:java}
SBT_OPTS="-Dsbt.override.build.repos=true"
{code}

> Fix sbt resolvers clash
> -----------------------
>
> Key: SPARK-33106
> URL: https://issues.apache.org/jira/browse/SPARK-33106
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.1.0
> Reporter: Denis Pyshev
> Assignee: Denis Pyshev
> Priority: Minor
> Fix For: 3.1.0
>
> During the sbt upgrade from 0.13 to 1.x, the exact resolvers list was carried over as-is.
> That leads to a local resolver name clash, observed as a warning from SBT:
> {code:java}
> [warn] Multiple resolvers having different access mechanism configured with
> same name 'local'. To avoid conflict, Remove duplicate project resolvers
> (`resolvers`) or rename publishing resolver (`publishTo`).
> {code}
> This needs to be fixed to avoid potential errors and reduce log noise.
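[Editor's note, not part of the thread: with {{-Dsbt.override.build.repos=true}}, sbt ignores the resolvers declared in the build and resolves only against the entries in the user's repositories file (typically {{~/.sbt/repositories}}, or the file given by {{-Dsbt.repository.config}}). A file like the sketch below, with its own {{local}} entry, is the kind of configuration that can interact badly with the renamed resolvers from this fix. The contents are illustrative, not taken from the reporter's machine.]

{code:java}
# Example ~/.sbt/repositories. With -Dsbt.override.build.repos=true,
# sbt resolves *only* against these entries, bypassing build resolvers.
[repositories]
  local
  maven-central
{code}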
[jira] [Commented] (SPARK-33106) Fix sbt resolvers clash
[ https://issues.apache.org/jira/browse/SPARK-33106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260903#comment-17260903 ] Alexander Bessonov commented on SPARK-33106:
---------------------------------------------

That doesn't seem to fix the issue:

{code:java}
build/sbt publishLocal
{code}

now seems to end with the error "Undefined resolver 'ivyLocal'".

> Fix sbt resolvers clash
> -----------------------
>
> Key: SPARK-33106
> URL: https://issues.apache.org/jira/browse/SPARK-33106
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.1.0
> Reporter: Denis Pyshev
> Assignee: Denis Pyshev
> Priority: Minor
> Fix For: 3.1.0
>
> During the sbt upgrade from 0.13 to 1.x, the exact resolvers list was carried over as-is.
> That leads to a local resolver name clash, observed as a warning from SBT:
> {code:java}
> [warn] Multiple resolvers having different access mechanism configured with
> same name 'local'. To avoid conflict, Remove duplicate project resolvers
> (`resolvers`) or rename publishing resolver (`publishTo`).
> {code}
> This needs to be fixed to avoid potential errors and reduce log noise.
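[Editor's note, not part of the thread: when diagnosing a resolver mismatch like this, one possible first step is to ask sbt which resolvers are actually in effect. {{resolvers}} and {{publishTo}} are standard sbt settings, invoked here through Spark's {{build/sbt}} wrapper as in the comment above.]

{code:java}
build/sbt "show resolvers" "show publishTo"
{code}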
[jira] [Created] (SPARK-29719) Converted Metastore relations (ORC, Parquet) wouldn't update InMemoryFileIndex
Alexander Bessonov created SPARK-29719:
---------------------------------------

Summary: Converted Metastore relations (ORC, Parquet) wouldn't update InMemoryFileIndex
Key: SPARK-29719
URL: https://issues.apache.org/jira/browse/SPARK-29719
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.4.0
Reporter: Alexander Bessonov

Spark attempts to convert Hive tables backed by Parquet and ORC into internal logical relations that cache file locations for the underlying data. That cache isn't invalidated when the partitioned table is re-read later on. By the time the table is re-read it might have new files, which might be ignored.

{code:java}
val spark = SparkSession.builder()
  .master("yarn")
  .enableHiveSupport()
  .config("spark.sql.hive.caseSensitiveInferenceMode", "NEVER_INFER")
  .getOrCreate()

val df1 = spark.table("my_table").filter("date=20191101")
// Do something with `df1`

// External process writes to the partition

val df2 = spark.table("my_table").filter("date=20191101")
// Do something with `df2`. Data in `df1` and `df2` should be different, but is equal.
{code}
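[Editor's note, not part of the report: a possible workaround sketch using the public {{spark.catalog.refreshTable}} API, which invalidates cached metadata (including the cached file index) for a table. {{spark}} and {{my_table}} are the names from the report's own snippet; untested against this exact scenario.]

{code:java}
// Force Spark to drop the cached logical relation (and its
// InMemoryFileIndex) for the table before reading it again.
spark.catalog.refreshTable("my_table")

// The re-read should now pick up files written by the external process.
val df2 = spark.table("my_table").filter("date=20191101")
{code}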
[jira] [Updated] (SPARK-29330) Allow users to choose the name of Spark Shuffle service
[ https://issues.apache.org/jira/browse/SPARK-29330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Bessonov updated SPARK-29330:
----------------------------------------

Description:
As of now, Spark uses the hardcoded value {{spark_shuffle}} as the name of the Shuffle Service. The HDP distribution of Spark, on the other hand, uses [{{spark2_shuffle}}|https://github.com/hortonworks/spark2-release/blob/HDP-3.1.0.0-78-tag/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala#L117]. This is done to be able to run both Spark 1.6 and Spark 2.x on the same Hadoop cluster. Running vanilla Spark on an HDP cluster with only the Spark 2.x shuffle service (HDP flavor) running becomes impossible due to the shuffle service name mismatch.

was:
As of now, Spark uses the hardcoded value {{spark_shuffle}} as the name of the Shuffle Service. The HDP distribution of Spark, on the other hand, uses [{{spark2_shuffle}}|#L117]]. This is done to be able to run both Spark 1.6 and Spark 2.x on the same Hadoop cluster. Running vanilla Spark on an HDP cluster with only the Spark 2.x shuffle service (HDP flavor) running becomes impossible due to the shuffle service name mismatch.

> Allow users to choose the name of Spark Shuffle service
> --------------------------------------------------------
>
> Key: SPARK-29330
> URL: https://issues.apache.org/jira/browse/SPARK-29330
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, YARN
> Affects Versions: 2.4.4
> Reporter: Alexander Bessonov
> Priority: Minor
>
> As of now, Spark uses the hardcoded value {{spark_shuffle}} as the name of the Shuffle Service.
> The HDP distribution of Spark, on the other hand, uses [{{spark2_shuffle}}|https://github.com/hortonworks/spark2-release/blob/HDP-3.1.0.0-78-tag/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala#L117].
> This is done to be able to run both Spark 1.6 and Spark 2.x on the same Hadoop cluster.
> Running vanilla Spark on an HDP cluster with only the Spark 2.x shuffle service (HDP flavor) running becomes impossible due to the shuffle service name mismatch.
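[Editor's note, not part of the ticket: if memory serves, later Spark releases added a configuration for exactly this purpose, {{spark.shuffle.service.name}}. A hypothetical usage sketch for running vanilla Spark against the HDP-named service; treat the property availability as an assumption to verify against your Spark version.]

{code:java}
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: point vanilla Spark at the HDP-named shuffle
// service instead of the default "spark_shuffle".
val spark = SparkSession.builder()
  .config("spark.shuffle.service.enabled", "true")
  .config("spark.shuffle.service.name", "spark2_shuffle")
  .getOrCreate()
{code}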
[jira] [Created] (SPARK-29330) Allow users to choose the name of Spark Shuffle service
Alexander Bessonov created SPARK-29330:
---------------------------------------

Summary: Allow users to choose the name of Spark Shuffle service
Key: SPARK-29330
URL: https://issues.apache.org/jira/browse/SPARK-29330
Project: Spark
Issue Type: Improvement
Components: Spark Core, YARN
Affects Versions: 2.4.4
Reporter: Alexander Bessonov

As of now, Spark uses the hardcoded value {{spark_shuffle}} as the name of the Shuffle Service. The HDP distribution of Spark, on the other hand, uses [{{spark2_shuffle}}|https://github.com/hortonworks/spark2-release/blob/HDP-3.1.0.0-78-tag/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala#L117]. This is done to be able to run both Spark 1.6 and Spark 2.x on the same Hadoop cluster. Running vanilla Spark on an HDP cluster with only the Spark 2.x shuffle service (HDP flavor) running becomes impossible due to the shuffle service name mismatch.
[jira] [Created] (SPARK-25983) spark-sql-kafka-0-10 no longer works with Kafka 0.10.0
Alexander Bessonov created SPARK-25983:
---------------------------------------

Summary: spark-sql-kafka-0-10 no longer works with Kafka 0.10.0
Key: SPARK-25983
URL: https://issues.apache.org/jira/browse/SPARK-25983
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.4.0
Reporter: Alexander Bessonov

The package {{org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0}} is no longer compatible with {{org.apache.kafka:kafka_2.11:0.10.0.1}}. When both packages are used in the same project, the following exception occurs:

{code:java}
java.lang.NoClassDefFoundError: org/apache/kafka/common/protocol/SecurityProtocol
    at kafka.server.Defaults$.<init>(KafkaConfig.scala:125)
    at kafka.server.Defaults$.<clinit>(KafkaConfig.scala)
    at kafka.log.Defaults$.<init>(LogConfig.scala:33)
    at kafka.log.Defaults$.<clinit>(LogConfig.scala)
    at kafka.log.LogConfig$.<init>(LogConfig.scala:152)
    at kafka.log.LogConfig$.<clinit>(LogConfig.scala)
    at kafka.server.KafkaConfig$.<init>(KafkaConfig.scala:265)
    at kafka.server.KafkaConfig$.<clinit>(KafkaConfig.scala)
    at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:759)
    at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:761)
{code}

This exception is caused by an incompatible dependency pulled in by Spark: {{org.apache.kafka:kafka-clients:2.0.0}}. The following workaround resolved the problem in my project:

{code:java}
dependencyOverrides += "org.apache.kafka" % "kafka-clients" % "0.10.0.1"
{code}
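[Editor's note, not part of the ticket: for completeness, a build.sbt sketch of the full wiring the report implies. The artifact coordinates are the ones from the description; this reproduces the reporter's workaround, not a general fix.]

{code:java}
// Both libraries from the report, plus the override pinning kafka-clients
// back to the broker-compatible 0.10.0.1 instead of the 2.0.0 that
// spark-sql-kafka-0-10 pulls in transitively.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.0",
  "org.apache.kafka" %% "kafka" % "0.10.0.1"
)
dependencyOverrides += "org.apache.kafka" % "kafka-clients" % "0.10.0.1"
{code}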
[jira] [Commented] (SPARK-23737) Scala API documentation leads to nonexistent pages for sources
[ https://issues.apache.org/jira/browse/SPARK-23737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414111#comment-16414111 ] Alexander Bessonov commented on SPARK-23737:
---------------------------------------------

[~sameerag], making a wild guess: the username in the URL is yours.

> Scala API documentation leads to nonexistent pages for sources
> ---------------------------------------------------------------
>
> Key: SPARK-23737
> URL: https://issues.apache.org/jira/browse/SPARK-23737
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 2.3.0
> Reporter: Alexander Bessonov
> Priority: Minor
>
> h3. Steps to reproduce:
> # Go to the [Scala API homepage|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package].
> # Click "Source: package.scala"
> h3. Result:
> The link leads to a nonexistent page:
> [https://github.com/apache/spark/tree/v2.3.0/Users/sameera/dev/spark/core/src/main/scala/org/apache/spark/package.scala]
> h3. Expected result:
> The link leads to the proper page:
> [https://github.com/apache/spark/tree/v2.3.0/core/src/main/scala/org/apache/spark/package.scala]
[jira] [Reopened] (SPARK-23737) Scala API documentation leads to nonexistent pages for sources
[ https://issues.apache.org/jira/browse/SPARK-23737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Bessonov reopened SPARK-23737:
-----------------------------------------

Okay. The bug isn't fixed, and it affects everyone who wants to jump to the source code from the ScalaDoc.

> Scala API documentation leads to nonexistent pages for sources
> ---------------------------------------------------------------
>
> Key: SPARK-23737
> URL: https://issues.apache.org/jira/browse/SPARK-23737
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 2.3.0
> Reporter: Alexander Bessonov
> Priority: Minor
>
> h3. Steps to reproduce:
> # Go to the [Scala API homepage|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package].
> # Click "Source: package.scala"
> h3. Result:
> The link leads to a nonexistent page:
> [https://github.com/apache/spark/tree/v2.3.0/Users/sameera/dev/spark/core/src/main/scala/org/apache/spark/package.scala]
> h3. Expected result:
> The link leads to the proper page:
> [https://github.com/apache/spark/tree/v2.3.0/core/src/main/scala/org/apache/spark/package.scala]
[jira] [Commented] (SPARK-23737) Scala API documentation leads to nonexistent pages for sources
[ https://issues.apache.org/jira/browse/SPARK-23737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406591#comment-16406591 ] Alexander Bessonov commented on SPARK-23737:
---------------------------------------------

Oh, thanks. Linked them.

> Scala API documentation leads to nonexistent pages for sources
> ---------------------------------------------------------------
>
> Key: SPARK-23737
> URL: https://issues.apache.org/jira/browse/SPARK-23737
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 2.3.0
> Reporter: Alexander Bessonov
> Priority: Minor
>
> h3. Steps to reproduce:
> # Go to the [Scala API homepage|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package].
> # Click "Source: package.scala"
> h3. Result:
> The link leads to a nonexistent page:
> [https://github.com/apache/spark/tree/v2.3.0/Users/sameera/dev/spark/core/src/main/scala/org/apache/spark/package.scala]
> h3. Expected result:
> The link leads to the proper page:
> [https://github.com/apache/spark/tree/v2.3.0/core/src/main/scala/org/apache/spark/package.scala]
[jira] [Updated] (SPARK-23737) Scala API documentation leads to nonexistent pages for sources
[ https://issues.apache.org/jira/browse/SPARK-23737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Bessonov updated SPARK-23737:
----------------------------------------

Description:
h3. Steps to reproduce:
# Go to the [Scala API homepage|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package].
# Click "Source: package.scala"

h3. Result:
The link leads to a nonexistent page:
[https://github.com/apache/spark/tree/v2.3.0/Users/sameera/dev/spark/core/src/main/scala/org/apache/spark/package.scala]

h3. Expected result:
The link leads to the proper page:
[https://github.com/apache/spark/tree/v2.3.0/core/src/main/scala/org/apache/spark/package.scala]

was:
h3. Steps to reproduce:
# Go to [Scala API homepage|[http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package].]
# Click "Source: package.scala"

h3. Result:
The link leads to nonexistent page:
[https://github.com/apache/spark/tree/v2.3.0/Users/sameera/dev/spark/core/src/main/scala/org/apache/spark/package.scala]

h3. Expected result:
The link leads to proper page:
[https://github.com/apache/spark/tree/v2.3.0/core/src/main/scala/org/apache/spark/package.scala]

> Scala API documentation leads to nonexistent pages for sources
> ---------------------------------------------------------------
>
> Key: SPARK-23737
> URL: https://issues.apache.org/jira/browse/SPARK-23737
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 2.3.0
> Reporter: Alexander Bessonov
> Priority: Minor
>
> h3. Steps to reproduce:
> # Go to the [Scala API homepage|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package].
> # Click "Source: package.scala"
> h3. Result:
> The link leads to a nonexistent page:
> [https://github.com/apache/spark/tree/v2.3.0/Users/sameera/dev/spark/core/src/main/scala/org/apache/spark/package.scala]
> h3. Expected result:
> The link leads to the proper page:
> [https://github.com/apache/spark/tree/v2.3.0/core/src/main/scala/org/apache/spark/package.scala]
[jira] [Created] (SPARK-23737) Scala API documentation leads to nonexistent pages for sources
Alexander Bessonov created SPARK-23737:
---------------------------------------

Summary: Scala API documentation leads to nonexistent pages for sources
Key: SPARK-23737
URL: https://issues.apache.org/jira/browse/SPARK-23737
Project: Spark
Issue Type: Bug
Components: Documentation
Affects Versions: 2.3.0
Reporter: Alexander Bessonov

h3. Steps to reproduce:
# Go to the [Scala API homepage|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package].
# Click "Source: package.scala"

h3. Result:
The link leads to a nonexistent page:
[https://github.com/apache/spark/tree/v2.3.0/Users/sameera/dev/spark/core/src/main/scala/org/apache/spark/package.scala]

h3. Expected result:
The link leads to the proper page:
[https://github.com/apache/spark/tree/v2.3.0/core/src/main/scala/org/apache/spark/package.scala]
[jira] [Commented] (SPARK-17414) Set type is not supported for creating data frames
[ https://issues.apache.org/jira/browse/SPARK-17414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130852#comment-16130852 ] Alexander Bessonov commented on SPARK-17414:
---------------------------------------------

Fixed in SPARK-21204.

> Set type is not supported for creating data frames
> ---------------------------------------------------
>
> Key: SPARK-17414
> URL: https://issues.apache.org/jira/browse/SPARK-17414
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Emre Colak
> Priority: Minor
>
> For a case class that has a field of type Set, the createDataFrame() method throws an exception saying "Schema for type Set is not supported". The exception is raised by the org.apache.spark.sql.catalyst.ScalaReflection class, where Array, Seq and Map types are supported but Set is not. It would be nice to support Set here by default instead of having to write a custom Encoder.
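[Editor's note, not part of the ticket: a minimal example of the case the description mentions, assuming an active SparkSession named {{spark}}. {{Record}} is a hypothetical name; before the fix in SPARK-21204, schema derivation for the {{Set}} field threw the "Schema for type Set is not supported" error quoted above.]

{code:java}
// Deriving a DataFrame schema from a case class with a Set field.
case class Record(id: Int, tags: Set[String])

// Prior to SPARK-21204 this line failed during schema inference;
// afterwards the Set field is handled like an array.
val df = spark.createDataFrame(Seq(Record(1, Set("a", "b"))))
df.printSchema()
{code}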
[jira] [Commented] (SPARK-21696) State Store can't handle corrupted snapshots
[ https://issues.apache.org/jira/browse/SPARK-21696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122078#comment-16122078 ] Alexander Bessonov commented on SPARK-21696:
---------------------------------------------

{{HDFSBackedStateStoreProvider.doMaintenance()}} will suppress any {{NonFatal}} exceptions. {{StateStore.startMaintenanceIfNeeded()}} won't restart the maintenance task if it crashed. The State Store can still function even when a snapshot file is corrupted, by simply falling back to deltas.

> State Store can't handle corrupted snapshots
> --------------------------------------------
>
> Key: SPARK-21696
> URL: https://issues.apache.org/jira/browse/SPARK-21696
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.2.0
> Reporter: Alexander Bessonov
> Priority: Critical
>
> The state store's asynchronous maintenance task (generation of snapshot files) is not rescheduled if it crashed, which might lead to corrupted snapshots.
> In our case, on multiple occasions, executors died during the maintenance task with an Out Of Memory error, which led to the following error on recovery:
> {code:none}
> 17/08/07 20:12:24 WARN TaskSetManager: Lost task 3.1 in stage 102.0 (TID 3314, dnj2-bach-r2n10.bloomberg.com, executor 94): java.io.EOFException
>     at java.io.DataInputStream.readInt(DataInputStream.java:392)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$readSnapshotFile(HDFSBackedStateStoreProvider.scala:436)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:314)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:313)
>     at scala.Option.getOrElse(Option.scala:121)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap(HDFSBackedStateStoreProvider.scala:313)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.getStore(HDFSBackedStateStoreProvider.scala:220)
>     at org.apache.spark.sql.execution.streaming.state.StateStore$.get(StateStore.scala:186)
>     at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:61)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>     at org.apache.spark.scheduler.Task.run(Task.scala:99)
>     at
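[Editor's note, not part of the thread: a minimal sketch of the suppression pattern the comment above describes, not the actual Spark source. Any {{NonFatal}} failure during maintenance is logged and swallowed, and nothing reschedules the task, so a partially written snapshot can be left behind.]

{code:java}
import scala.util.control.NonFatal

object MaintenanceSketch {
  // Stand-ins for the real snapshot/cleanup steps; hypothetical names.
  private def doSnapshot(): Unit = ()
  private def cleanup(): Unit = ()

  def doMaintenance(): Unit = {
    try {
      doSnapshot()
      cleanup()
    } catch {
      // The crash is swallowed here; the maintenance task is never
      // restarted, which is the failure mode reported in this ticket.
      case NonFatal(e) => println(s"maintenance failed: $e")
    }
  }
}
{code}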
[jira] [Updated] (SPARK-21696) State Store can't handle corrupted snapshots
[ https://issues.apache.org/jira/browse/SPARK-21696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Bessonov updated SPARK-21696:
----------------------------------------

Description:
The state store's asynchronous maintenance task (generation of snapshot files) is not rescheduled if it crashed, which might lead to corrupted snapshots.

In our case, on multiple occasions, executors died during the maintenance task with an Out Of Memory error, which led to the following error on recovery:

{code:none}
17/08/07 20:12:24 WARN TaskSetManager: Lost task 3.1 in stage 102.0 (TID 3314, dnj2-bach-r2n10.bloomberg.com, executor 94): java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$readSnapshotFile(HDFSBackedStateStoreProvider.scala:436)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:314)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:313)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap(HDFSBackedStateStoreProvider.scala:313)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.getStore(HDFSBackedStateStoreProvider.scala:220)
    at org.apache.spark.sql.execution.streaming.state.StateStore$.get(StateStore.scala:186)
    at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:61)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

was:
State store's asynchronous maintenance task (generation of Snapshot files) is not rescheduled if crashed which might lead to corrupted snapshots.

In our case, on multiple occasions, executors died during maintenance task with Out Of Memory error which led to following error on recovery:

{code:text}
17/08/07 20:12:24 WARN TaskSetManager: Lost task 3.1 in stage 102.0 (TID 3314, dnj2-bach-r2n10.bloomberg.com, executor 94): java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at
[jira] [Created] (SPARK-21696) State Store can't handle corrupted snapshots
Alexander Bessonov created SPARK-21696:
---------------------------------------

Summary: State Store can't handle corrupted snapshots
Key: SPARK-21696
URL: https://issues.apache.org/jira/browse/SPARK-21696
Project: Spark
Issue Type: Bug
Components: Structured Streaming
Affects Versions: 2.2.0, 2.1.1, 2.1.0, 2.0.2, 2.0.1, 2.0.0
Reporter: Alexander Bessonov
Priority: Critical

State store's asynchronous maintenance task (generation of Snapshot files) is not rescheduled if crashed which might lead to corrupted snapshots.

In our case, on multiple occasions, executors died during maintenance task with Out Of Memory error which led to following error on recovery:

{code:text}
17/08/07 20:12:24 WARN TaskSetManager: Lost task 3.1 in stage 102.0 (TID 3314, dnj2-bach-r2n10.bloomberg.com, executor 94): java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$readSnapshotFile(HDFSBackedStateStoreProvider.scala:436)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:314)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:313)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap(HDFSBackedStateStoreProvider.scala:313)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.getStore(HDFSBackedStateStoreProvider.scala:220)
    at org.apache.spark.sql.execution.streaming.state.StateStore$.get(StateStore.scala:186)
    at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:61)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Created] (SPARK-20900) ApplicationMaster crashes if SPARK_YARN_STAGING_DIR is not set
Alexander Bessonov created SPARK-20900:
---------------------------------------

Summary: ApplicationMaster crashes if SPARK_YARN_STAGING_DIR is not set
Key: SPARK-20900
URL: https://issues.apache.org/jira/browse/SPARK-20900
Project: Spark
Issue Type: Bug
Components: YARN
Affects Versions: 2.1.0, 1.6.0, 1.2.0
Environment: Spark 2.1.0
Reporter: Alexander Bessonov
Priority: Minor

When running {{ApplicationMaster}} directly, if {{SPARK_YARN_STAGING_DIR}} is not set or is set to an empty string, {{org.apache.hadoop.fs.Path}} will throw an {{IllegalArgumentException}} instead of returning {{null}}. This is not handled, and the exception crashes the job.
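[Editor's note, not part of the ticket: a minimal repro sketch of the underlying Hadoop behavior the report describes, using only the standard {{Path}} constructor.]

{code:java}
import org.apache.hadoop.fs.Path

// new Path("") throws IllegalArgumentException ("Can not create a Path
// from an empty string") instead of yielding null, so an unset or empty
// SPARK_YARN_STAGING_DIR crashes the ApplicationMaster.
val stagingDir = sys.env.getOrElse("SPARK_YARN_STAGING_DIR", "")
val stagingPath = new Path(stagingDir) // throws here when the value is empty
{code}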