[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-31698:
    Environment: AWS EMR: 30 machines, 7TB RAM total.  (was: AWS EMR: 30 machine, 7TB RAM total.)

> NPE on big dataset plans
>
>                 Key: SPARK-31698
>                 URL: https://issues.apache.org/jira/browse/SPARK-31698
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>         Environment: AWS EMR: 30 machines, 7TB RAM total.
>            Reporter: Viacheslav Tradunsky
>            Priority: Major
>         Attachments: Spark_NPE_big_dataset.log
>
> We have a big dataset containing 275 SQL operations and more than 275 joins.
> On the terminal operation to write data, it fails with a NullPointerException.
>
> I understand that such a large number of operations might not be what Spark is
> designed for, but a NullPointerException is not an ideal way to fail in this
> case.
>
> For more details, please see the stack trace.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-31698:
    Environment: AWS EMR: 30 machine, 7TB RAM total.  (was: AWS EMR)
[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-31698:
    Docs Text:   (was: org.apache.spark.SparkException: Job aborted.
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:156)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:566)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at com.company.app.executor.spark.SparkDatasetGenerationJob.generateDataset(SparkDatasetGenerationJob.scala:51)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at com.company.app.executor.spark.SparkDatasetGenerationJob.call(SparkDatasetGenerationJob.scala:82)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at com.company.app.executor.spark.SparkDatasetGenerationJob.call(SparkDatasetGenerationJob.scala:11)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:40)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:27)
./livy-livy-server.out.gz:20/05/12 22:46:54 INFO LineBufferedStream: at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:64)
[jira] [Updated] (SPARK-31698) NPE on big dataset plans
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viacheslav Tradunsky updated SPARK-31698:
    Attachment: Spark_NPE_big_dataset.log