[jira] [Commented] (SPARK-29046) Possible NPE on SQLConf.get when SparkContext is stopping in another thread
[ https://issues.apache.org/jira/browse/SPARK-29046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106087#comment-17106087 ]

Viacheslav Tradunsky commented on SPARK-29046:
----------------------------------------------

[~kabhwan] Do you know a lower version of Spark which does not have this issue? Maybe Spark 2.3.2?

> Possible NPE on SQLConf.get when SparkContext is stopping in another thread
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-29046
>                 URL: https://issues.apache.org/jira/browse/SPARK-29046
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 3.0.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Minor
>             Fix For: 2.4.5, 3.0.0
>
> We encountered an NPE in listener code which deals with the query plan - and according to the stack trace below, the only possible cause of the NPE is SparkContext._dagScheduler being null, which is only possible while stopping SparkContext (unless null is set from outside).
>
> {code:java}
> 19/09/11 00:22:24 INFO server.AbstractConnector: Stopped Spark@49d8c117{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
> 19/09/11 00:22:24 INFO ui.SparkUI: Stopped Spark web UI at http://:32770
> 19/09/11 00:22:24 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
> 19/09/11 00:22:24 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
> 19/09/11 00:22:24 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices(serviceOption=None, services=List(), started=false)
> 19/09/11 00:22:24 WARN sql.SparkExecutionPlanProcessor: Caught exception during parsing event
> java.lang.NullPointerException
>   at org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133)
>   at org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133)
>   at scala.Option.map(Option.scala:146)
>   at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:133)
>   at org.apache.spark.sql.types.StructType.simpleString(StructType.scala:352)
>   at com.hortonworks.spark.atlas.types.internal$.sparkTableToEntity(internal.scala:102)
>   at com.hortonworks.spark.atlas.types.AtlasEntityUtils$class.tableToEntity(AtlasEntityUtils.scala:62)
>   at com.hortonworks.spark.atlas.sql.CommandsHarvester$.tableToEntity(CommandsHarvester.scala:45)
>   at com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:240)
>   at com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:239)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
>   at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
>   at com.hortonworks.spark.atlas.sql.CommandsHarvester$.com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities(CommandsHarvester.scala:239)
>   at com.hortonworks.spark.atlas.sql.CommandsHarvester$CreateDataSourceTableAsSelectHarvester$.harvest(CommandsHarvester.scala:104)
>   at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:138)
>   at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:89)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
>   at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
>   at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:89)
>   at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:63)
>   at com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:72)
>   at
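The failure mode described in SPARK-29046 is a classic stop-time race: SQLConf.get maps over the active context while SparkContext.stop() nulls out _dagScheduler from another thread. Below is a minimal, Spark-free sketch (all class and method names here are hypothetical stand-ins, not Spark's actual code) of both the crash and the usual defensive fix, which is to snapshot the racy field once and null-check the snapshot:

```java
import java.util.Optional;

public class StopRace {
    // Hypothetical stand-ins for SparkContext and its _dagScheduler field.
    static class Scheduler { String conf() { return "sql-conf"; } }
    static class Context {
        volatile Scheduler dagScheduler = new Scheduler();
        void stop() { dagScheduler = null; }  // stop() clears the field, as in the report
    }

    // Unsafe: mirrors the SQLConf.get pattern of mapping over the active context,
    // but dereferences a field another thread may have nulled mid-stop.
    static String confUnsafe(Context active) {
        return Optional.ofNullable(active)
                .map(c -> c.dagScheduler.conf())   // NPE if stop() ran in between
                .orElse("fallback-conf");
    }

    // Safer: read the volatile field exactly once, then null-check the snapshot.
    static String confSafe(Context active) {
        if (active == null) return "fallback-conf";
        Scheduler s = active.dagScheduler;         // single read of the racy field
        return (s != null) ? s.conf() : "fallback-conf";
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        System.out.println(confSafe(ctx));   // sql-conf
        ctx.stop();                          // simulate SparkContext stopping
        System.out.println(confSafe(ctx));   // fallback-conf, no NPE
        try {
            confUnsafe(ctx);
        } catch (NullPointerException e) {
            System.out.println("NPE, as in the reported stack trace");
        }
    }
}
```

The fix that shipped in 2.4.5/3.0.0 per the issue header presumably hardens the real lookup similarly; the snapshot-then-check idiom is the general cure for any "field nulled during shutdown" NPE.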
[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-31698:
-----------------------------------------
    Environment: AWS EMR: 30 machines, 7TB RAM total.  (was: AWS EMR: 30 machine, 7TB RAM total.)

> NPE on big dataset plans
> ------------------------
>
>                 Key: SPARK-31698
>                 URL: https://issues.apache.org/jira/browse/SPARK-31698
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>         Environment: AWS EMR: 30 machines, 7TB RAM total.
>            Reporter: Viacheslav Tradunsky
>            Priority: Major
>         Attachments: Spark_NPE_big_dataset.log
>
> We have a big dataset containing 275 SQL operations and more than 275 joins. On the terminal operation to write data, it fails with a NullPointerException.
>
> I understand that such a big number of operations might not be what Spark is designed for, but a NullPointerException is not an ideal way to fail in this case.
>
> For more details, please see the stacktrace.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-31698:
-----------------------------------------
    Environment: AWS EMR: 30 machine, 7TB RAM total.  (was: AWS EMR)

> NPE on big dataset plans
> ------------------------
>
>                 Key: SPARK-31698
>                 URL: https://issues.apache.org/jira/browse/SPARK-31698
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>         Environment: AWS EMR: 30 machine, 7TB RAM total.
>            Reporter: Viacheslav Tradunsky
>            Priority: Major
>         Attachments: Spark_NPE_big_dataset.log
>
> We have a big dataset containing 275 SQL operations and more than 275 joins. On the terminal operation to write data, it fails with a NullPointerException.
>
> I understand that such a big number of operations might not be what Spark is designed for, but a NullPointerException is not an ideal way to fail in this case.
>
> For more details, please see the stacktrace.
[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viacheslav Tradunsky updated SPARK-31698: - Docs Text: (was: org.apache.spark.SparkException: Job aborted. ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream:at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:156) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at 
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:566) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at 
com.company.app.executor.spark.SparkDatasetGenerationJob.generateDataset(SparkDatasetGenerationJob.scala:51) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at com.company.app.executor.spark.SparkDatasetGenerationJob.call(SparkDatasetGenerationJob.scala:82) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at com.company.app.executor.spark.SparkDatasetGenerationJob.call(SparkDatasetGenerationJob.scala:11) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:40) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:27) ./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:64)
[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-31698:
-----------------------------------------
    Attachment: Spark_NPE_big_dataset.log

> NPE on big dataset plans
> ------------------------
>
>                 Key: SPARK-31698
>                 URL: https://issues.apache.org/jira/browse/SPARK-31698
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>         Environment: AWS EMR
>            Reporter: Viacheslav Tradunsky
>            Priority: Major
>         Attachments: Spark_NPE_big_dataset.log
>
> We have a big dataset containing 275 SQL operations and more than 275 joins. On the terminal operation to write data, it fails with a NullPointerException.
>
> I understand that such a big number of operations might not be what Spark is designed for, but a NullPointerException is not an ideal way to fail in this case.
>
> For more details, please see the stacktrace.
[jira] [Created] (SPARK-31698) NPE on big dataset plans
Viacheslav Tradunsky created SPARK-31698:
--------------------------------------------

             Summary: NPE on big dataset plans
                 Key: SPARK-31698
                 URL: https://issues.apache.org/jira/browse/SPARK-31698
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.4
         Environment: AWS EMR
            Reporter: Viacheslav Tradunsky

We have a big dataset containing 275 SQL operations and more than 275 joins. On the terminal operation to write data, it fails with a NullPointerException.

I understand that such a big number of operations might not be what Spark is designed for, but a NullPointerException is not an ideal way to fail in this case.

For more details, please see the stacktrace.
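As a side note on why plan size bites in reports like SPARK-31698: each chained operation wraps the previous logical plan in a new node, so 275 operations produce a tree hundreds of levels deep that the planner must recurse over. A common workaround is to materialize intermediate results every N operations (in real Spark that would be something like Dataset.checkpoint() or a write/re-read cycle, which truncates lineage). The toy model below contains no Spark code and only illustrates the depth arithmetic:

```java
public class PlanDepth {
    // Toy model, not Spark internals: each operation wraps the previous plan
    // in one more node, so depth grows linearly with the number of chained ops.
    static final class Plan {
        final Plan child;
        final int depth;
        Plan(Plan child) { this.child = child; this.depth = (child == null) ? 1 : child.depth + 1; }
    }

    // Build a chain of `ops` operations; every `checkpointEvery` ops, replace the
    // accumulated plan with a fresh leaf, mimicking a lineage-truncating checkpoint.
    static Plan chain(int ops, int checkpointEvery) {
        Plan p = new Plan(null);
        for (int i = 1; i <= ops; i++) {
            p = new Plan(p);
            if (checkpointEvery > 0 && i % checkpointEvery == 0) p = new Plan(null);
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(chain(275, 0).depth);   // 276: the full 275-op lineage
        System.out.println(chain(275, 50).depth);  // 26: lineage reset every 50 ops
    }
}
```

This does not excuse the NullPointerException, which the reporter is right to call a poor failure mode, but periodically truncating lineage is the usual way to keep very long job chains plannable.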
[jira] [Commented] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968753#comment-16968753 ]

Viacheslav Tradunsky commented on SPARK-29765:
----------------------------------------------

What's particularly interesting is that the error doesn't reproduce when this stage is in the completed list.

> Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-29765
>                 URL: https://issues.apache.org/jira/browse/SPARK-29765
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>         Environment: Amazon EMR 5.27
>            Reporter: Viacheslav Tradunsky
>            Priority: Major
>
> When clicking on one of the largest tasks by input, I get to
> [http://:20888/proxy/application_1572992299050_0001/stages/stage/?id=74&attempt=0|http://10.207.110.207:20888/proxy/application_1572992299050_0001/stages/stage/?id=74&attempt=0]
> with a 500 error
> {code:java}
> java.lang.IndexOutOfBoundsException: 95745
>   at scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132)
>   at scala.collection.immutable.Vector.apply(Vector.scala:122)
>   at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:255)
>   at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254)
>   at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246)
>   at org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:254)
>   at org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:287)
>   at org.apache.spark.ui.jobs.StagePage.render(StagePage.scala:321)
>   at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84)
>   at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84)
>   at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>   at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>   at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166)
>   at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>   at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.spark_project.jetty.server.Server.handle(Server.java:539)
>   at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333)
>   at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748){code}
[jira] [Commented] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968751#comment-16968751 ] Viacheslav Tradunsky commented on SPARK-29765: -- Increased the capacity to 2 as pointed out in docs: [https://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/configuration.html#scheduling] But the error still happens. > Monitoring UI throws IndexOutOfBoundsException when accessing metrics of > attempt in stage > - > > Key: SPARK-29765 > URL: https://issues.apache.org/jira/browse/SPARK-29765 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.4.4 > Environment: Amazon EMR 5.27 >Reporter: Viacheslav Tradunsky >Priority: Major > > When clicking on one of the largest tasks by input, I get to > [http://:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0|http://10.207.110.207:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0] > with 500 error > {code:java} > java.lang.IndexOutOfBoundsException: 95745 at > scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132) at > scala.collection.immutable.Vector.apply(Vector.scala:122) at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:255) > at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) > at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246) at > scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at > scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246) at > 
org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:254) > at > org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:287) > at org.apache.spark.ui.jobs.StagePage.render(StagePage.scala:321) at > org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at > org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at > org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90) at > javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at > javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at > org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166) > at > org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) > at > org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > at > org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) > at > org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) > at > org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > at > org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) > at > org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) > at > org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) > at org.spark_project.jetty.server.Server.handle(Server.java:539) at > org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333) at > org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) > at > 
org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) > at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108) > at > org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) > at > org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) > at >
[jira] [Commented] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968739#comment-16968739 ] Viacheslav Tradunsky commented on SPARK-29765: -- I think it is capacity after code review ;) spark.scheduler.listenerbus.eventqueue.capacity Thanks a lot! > Monitoring UI throws IndexOutOfBoundsException when accessing metrics of > attempt in stage > - > > Key: SPARK-29765 > URL: https://issues.apache.org/jira/browse/SPARK-29765 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.4.4 > Environment: Amazon EMR 5.27 >Reporter: Viacheslav Tradunsky >Priority: Major > > When clicking on one of the largest tasks by input, I get to > [http://:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0|http://10.207.110.207:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0] > with 500 error > {code:java} > java.lang.IndexOutOfBoundsException: 95745 at > scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132) at > scala.collection.immutable.Vector.apply(Vector.scala:122) at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:255) > at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) > at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246) at > scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at > scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246) at > org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:254) > at > 
org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:287) > at org.apache.spark.ui.jobs.StagePage.render(StagePage.scala:321) at > org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at > org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at > org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90) at > javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at > javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at > org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166) > at > org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) > at > org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > at > org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) > at > org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) > at > org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > at > org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) > at > org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) > at > org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) > at org.spark_project.jetty.server.Server.handle(Server.java:539) at > org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333) at > org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) > at > org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) > at 
org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108) > at > org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) > at > org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) > at > org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) > at
[jira] [Commented] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968720#comment-16968720 ] Viacheslav Tradunsky commented on SPARK-29765: -- Exactly as you said: {code:java} ERROR AsyncEventQueue: Dropping event from queue appStatus. This likely means one of the listeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler{code} Will increase the capacity and go back with results. Thanks! > Monitoring UI throws IndexOutOfBoundsException when accessing metrics of > attempt in stage > - > > Key: SPARK-29765 > URL: https://issues.apache.org/jira/browse/SPARK-29765 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.4.4 > Environment: Amazon EMR 5.27 >Reporter: Viacheslav Tradunsky >Priority: Major > > When clicking on one of the largest tasks by input, I get to > [http://:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0|http://10.207.110.207:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0] > with 500 error > {code:java} > java.lang.IndexOutOfBoundsException: 95745 at > scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132) at > scala.collection.immutable.Vector.apply(Vector.scala:122) at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:255) > at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) > at > org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246) at > scala.collection.TraversableLike$class.map(TraversableLike.scala:234) 
at > scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246) at > org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:254) > at > org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:287) > at org.apache.spark.ui.jobs.StagePage.render(StagePage.scala:321) at > org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at > org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at > org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90) at > javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at > javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at > org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166) > at > org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) > at > org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > at > org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) > at > org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) > at > org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > at > org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) > at > org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) > at > org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) > at org.spark_project.jetty.server.Server.handle(Server.java:539) at > org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333) at > 
org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) > at > org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) > at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108) > at > org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) > at > org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) > at >
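For reference, the queue capacity being increased here is controlled by a Spark configuration property; a minimal sketch (the value 20000 is only an illustrative guess, tune it to your event rate):

{code}
# spark-defaults.conf, or pass via --conf on spark-submit
spark.scheduler.listenerbus.eventqueue.capacity  20000
{code}

Raising the capacity trades driver memory for fewer dropped events; it does not make a slow listener faster.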
[jira] [Commented] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968704#comment-16968704 ] Viacheslav Tradunsky commented on SPARK-29765: -- [~shahid] Reproduced. Interestingly, although the stage was marked as completed, the UI showed tasks 106632/109889 (3 running).
[jira] [Commented] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968649#comment-16968649 ] Viacheslav Tradunsky commented on SPARK-29765: -- [~shahid] Sure. In case we have more than 200,000 tasks, shall we set this to our maximum? I am just trying to understand how GC of objects could influence access into an immutable collection. Are the elements somehow allowed to be collected while they are still referenced by that vector?
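The limit discussed above is `spark.ui.retainedTasks` (default 100000); a minimal sketch of raising it to cover a stage of this size (the exact value is an assumption, and higher values increase driver memory use):

{code}
# spark-defaults.conf, or pass via --conf on spark-submit
spark.ui.retainedTasks  250000
{code}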
[jira] [Commented] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968318#comment-16968318 ] Viacheslav Tradunsky commented on SPARK-29765: -- ok, got it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
[ https://issues.apache.org/jira/browse/SPARK-29765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viacheslav Tradunsky updated SPARK-29765: - Description: When clicking on one of the largest tasks by input, I get to [http://:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0|http://10.207.110.207:20888/proxy/application_1572992299050_0001/stages/stage/?id=74=0] with 500 error {code:java} java.lang.IndexOutOfBoundsException: 95745 at scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132) at scala.collection.immutable.Vector.apply(Vector.scala:122) at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:255) at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246) at org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:254) at org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:287) at org.apache.spark.ui.jobs.StagePage.render(StagePage.scala:321) at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90) at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) at 
org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166) at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) at org.spark_project.jetty.server.Server.handle(Server.java:539) at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333) at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108) at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) at 
org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) at java.lang.Thread.run(Thread.java:748){code}
[jira] [Created] (SPARK-29765) Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage
Viacheslav Tradunsky created SPARK-29765: Summary: Monitoring UI throws IndexOutOfBoundsException when accessing metrics of attempt in stage Key: SPARK-29765 URL: https://issues.apache.org/jira/browse/SPARK-29765 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 2.4.4 Environment: Amazon EMR 5.27 Reporter: Viacheslav Tradunsky {code:java} java.lang.IndexOutOfBoundsException: 95745 at scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132) at scala.collection.immutable.Vector.apply(Vector.scala:122) at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:255) at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:254) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246) at org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:254) at org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:287) at org.apache.spark.ui.jobs.StagePage.render(StagePage.scala:321) at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:84) at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90) at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) at 
org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166) at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) at org.spark_project.jetty.server.Server.handle(Server.java:539) at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333) at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108) at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) at 
org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) at java.lang.Thread.run(Thread.java:748){code}
[jira] [Commented] (SPARK-29244) ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of memory blocks
[ https://issues.apache.org/jira/browse/SPARK-29244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942295#comment-16942295 ] Viacheslav Tradunsky commented on SPARK-29244: -- Thank you! > ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of > memory blocks > -- > > Key: SPARK-29244 > URL: https://issues.apache.org/jira/browse/SPARK-29244 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.4.0 > Environment: Release label:emr-5.20.0 > Hadoop distribution:Amazon 2.8.5 > Applications:Livy 0.5.0, Spark 2.4.0 >Reporter: Viacheslav Tradunsky >Assignee: L. C. Hsieh >Priority: Major > Fix For: 2.4.5, 3.0.0 > > Attachments: executor_oom.txt > > > At the end of task completion an exception happened: > {code:java} > 19/09/25 09:03:58 ERROR TaskContextImpl: Error in > TaskCompletionListener19/09/25 09:03:58 ERROR TaskContextImpl: Error in > TaskCompletionListenerjava.lang.ArrayIndexOutOfBoundsException: -3 at > org.apache.spark.memory.TaskMemoryManager.freePage(TaskMemoryManager.java:333) > at org.apache.spark.memory.MemoryConsumer.freePage(MemoryConsumer.java:130) > at org.apache.spark.memory.MemoryConsumer.freeArray(MemoryConsumer.java:108) > at org.apache.spark.unsafe.map.BytesToBytesMap.free(BytesToBytesMap.java:803) > at > org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.free(UnsafeFixedWidthAggregationMap.java:225) > at > org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.lambda$new$0(UnsafeFixedWidthAggregationMap.java:111) > at > org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117) > at > org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117) > at > org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130) > at > org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128) > at > 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128) > at > org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116) > at org.apache.spark.scheduler.Task.run(Task.scala:131) at > org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408) at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > > Important to note that before this one, there was an OOM while allocating some > pages. It looks like everything is related, but on OOM the whole > flow goes abnormally, so no resources are freed correctly. 
> {code:java} > java.lang.NullPointerExceptionjava.lang.NullPointerException at > org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.getMemoryUsage(UnsafeInMemorySorter.java:208) > at > org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.getMemoryUsage(UnsafeExternalSorter.java:249) > at > org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.updatePeakMemoryUsed(UnsafeExternalSorter.java:253) > at > org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.freeMemory(UnsafeExternalSorter.java:296) > at > org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.cleanupResources(UnsafeExternalSorter.java:328) > at > org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.lambda$new$0(UnsafeExternalSorter.java:178) > at > org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117) > at > org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117) > at > org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130) > at > org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128) > at > org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116) > at org.apache.spark.scheduler.Task.run(Task.scala:131) at >
[jira] [Updated] (SPARK-29244) ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of memory blocks
[ https://issues.apache.org/jira/browse/SPARK-29244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-29244:
-----------------------------------------
    Attachment: executor_oom.txt

> ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of memory blocks
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-29244
>                 URL: https://issues.apache.org/jira/browse/SPARK-29244
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>         Environment: Release label: emr-5.20.0
>                      Hadoop distribution: Amazon 2.8.5
>                      Applications: Livy 0.5.0, Spark 2.4.0
>            Reporter: Viacheslav Tradunsky
>            Priority: Major
>         Attachments: executor_oom.txt
>
> At the end of task completion, an exception was thrown:
> {code:java}
> 19/09/25 09:03:58 ERROR TaskContextImpl: Error in TaskCompletionListener
> java.lang.ArrayIndexOutOfBoundsException: -3
> 	at org.apache.spark.memory.TaskMemoryManager.freePage(TaskMemoryManager.java:333)
> 	at org.apache.spark.memory.MemoryConsumer.freePage(MemoryConsumer.java:130)
> 	at org.apache.spark.memory.MemoryConsumer.freeArray(MemoryConsumer.java:108)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.free(BytesToBytesMap.java:803)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.free(UnsafeFixedWidthAggregationMap.java:225)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.lambda$new$0(UnsafeFixedWidthAggregationMap.java:111)
> 	at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
> 	at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
> 	at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
> 	at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> 	at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
> 	at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:131)
> 	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
> 	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
>
> Note that this exception was preceded by an OutOfMemoryError while allocating pages. The two appear to be related: after the OOM, the task's cleanup flow proceeds abnormally, so resources are not freed correctly.
> {code:java}
> java.lang.NullPointerException
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.getMemoryUsage(UnsafeInMemorySorter.java:208)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.getMemoryUsage(UnsafeExternalSorter.java:249)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.updatePeakMemoryUsed(UnsafeExternalSorter.java:253)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.freeMemory(UnsafeExternalSorter.java:296)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.cleanupResources(UnsafeExternalSorter.java:328)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.lambda$new$0(UnsafeExternalSorter.java:178)
> 	at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
> 	at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
> 	at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
> 	at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> 	at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
> 	at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:131)
> 	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
> 	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
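The -3 index in freePage is suggestive: Spark marks freed MemoryBlocks by overwriting their page number with a negative sentinel, so indexing the page table with it again fails fast, which would make this AIOOBE the signature of a double free during listener cleanup rather than the root fault. A minimal sketch of that mechanism (class and field names are hypothetical; this is not Spark's actual code):

```java
// Minimal sketch (hypothetical names, NOT Spark's actual code) of a page
// table whose freed pages are marked with a negative sentinel page number.
final class PageTableSketch {
    // Sentinel mimicking a "freed" marker; assumed, for illustration only.
    static final int FREED_PAGE_NUMBER = -3;

    static final class Page {
        int pageNumber;
        Page(int pageNumber) { this.pageNumber = pageNumber; }
    }

    private final Page[] pageTable = new Page[16];

    Page allocatePage(int pageNumber) {
        Page page = new Page(pageNumber);
        pageTable[pageNumber] = page;
        return page;
    }

    void freePage(Page page) {
        // A second free indexes the table with the sentinel (-3), throwing
        // ArrayIndexOutOfBoundsException: the symptom in the trace above.
        pageTable[page.pageNumber] = null;
        page.pageNumber = FREED_PAGE_NUMBER;
    }

    static String demo() {
        PageTableSketch manager = new PageTableSketch();
        Page page = manager.allocatePage(3);
        manager.freePage(page);          // first free succeeds
        try {
            manager.freePage(page);      // double free: indexes pageTable[-3]
            return "no exception";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "double free detected";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());      // prints "double free detected"
    }
}
```

Under this reading, the interesting question is why the page was freed twice, which fits the earlier OOM leaving teardown half-done.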
[jira] [Created] (SPARK-29244) ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of memory blocks
Viacheslav Tradunsky created SPARK-29244:

             Summary: ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of memory blocks
                 Key: SPARK-29244
                 URL: https://issues.apache.org/jira/browse/SPARK-29244
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.0
         Environment: Release label: emr-5.20.0
                      Hadoop distribution: Amazon 2.8.5
                      Applications: Livy 0.5.0, Spark 2.4.0
            Reporter: Viacheslav Tradunsky

[Issue description and stack traces identical to the quoted text above.]

This must be something related to job planning, but the number of stacked exceptions does not make diagnosis easier. Happy to provide more details.
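The NullPointerException fits the same pattern: cleanupResources reaches getMemoryUsage on a sorter whose backing array a failed allocation apparently left null. A small, self-contained sketch (hypothetical names, not Spark's implementation) of cleanup state left null by a failed allocation, plus a listener loop that, like TaskContextImpl.invokeListeners, collects failures so one broken listener cannot skip the rest:

```java
// Sketch (hypothetical names, NOT Spark's implementation): a cleanup hook
// dereferences state that a failed allocation left null, and the listener
// loop collects throwables instead of aborting on the first failure.
import java.util.ArrayList;
import java.util.List;

final class CleanupSketch {
    static final class Sorter {
        long[] array; // stays null when the allocation that builds it OOMs

        long getMemoryUsage() {
            return array.length * 8L; // NPEs when array == null, as in the trace
        }

        long getMemoryUsageSafe() {
            return array == null ? 0L : array.length * 8L; // defensive variant
        }
    }

    // Run every listener; collect throwables so a failure in one cleanup
    // hook does not prevent the remaining hooks from running.
    static List<Throwable> invokeListeners(List<Runnable> listeners) {
        List<Throwable> errors = new ArrayList<>();
        for (Runnable listener : listeners) {
            try {
                listener.run();
            } catch (Throwable t) {
                errors.add(t);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Sorter sorter = new Sorter(); // simulate an allocation that never happened
        List<Throwable> errors = invokeListeners(List.of(
                () -> sorter.getMemoryUsage(),    // fails with NullPointerException
                () -> sorter.getMemoryUsageSafe() // guarded variant succeeds
        ));
        // prints: 1 failure(s), first: NullPointerException
        System.out.println(errors.size() + " failure(s), first: "
                + errors.get(0).getClass().getSimpleName());
    }
}
```

This is only a model of the failure mode: it suggests the OOM is the root cause and both the NPE and the AIOOBE are secondary symptoms of cleanup running against partially initialized or already-released state.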