[ https://issues.apache.org/jira/browse/SPARK-35313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kaushik Muniandi updated SPARK-35313:
-------------------------------------
    Environment: Databricks runtime version 7.5 (includes Apache Spark 3.0.1, Scala 2.12)

  (was: Got an error while running code through an Airflow DAG. Data size: ~2 TB and a little over 28 billion rows in the table. The error occurred when parquet was read from S3 and written to another S3 location using spark.read.parquet, running on Databricks 7.5 on top of an EMR r5.8xlarge cluster)

> Hive MetaException attempting to get partition metadata by filter from Hive
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-35313
>                 URL: https://issues.apache.org/jira/browse/SPARK-35313
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit, SQL
>    Affects Versions: 3.0.1
>         Environment: Databricks runtime version 7.5 (includes Apache Spark 3.0.1, Scala 2.12)
>            Reporter: Kaushik Muniandi
>            Priority: Blocker
>         Attachments: spark_issue.JPG, spark_issue_databricks.JPG
>
> Got an error while running code through an Airflow DAG.
> The exception occurred while running an ETL job on an external table, created in Hive and stored as parquet in S3, with AWS Glue as the metastore. Here is the error message:
>
> java.lang.RuntimeException: Caught Hive MetaException attempting to get partition metadata by filter from Hive. You can set the Spark configuration setting spark.sql.hive.manageFilesourcePartitions to false to work around this problem, however this will result in degraded performance. Please report a bug: https://issues.apache.org/jira/browse/SPARK
>
> Caused by: MetaException(message:Unknown exception occurred. (Service: AWSGlue; Status Code: 500; Error Code: InternalServiceException; Request ID: 73267997-1795-45a3-965f-8bb2a6b7b3ac))
>
> The same issue occurred when running in a Databricks notebook as well. Screenshots are attached for both cases.
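For context, a minimal sketch of the workaround the error message itself suggests, assuming a standalone Scala job: the app name and S3 paths below are hypothetical placeholders, and the read/write pattern mirrors the failing job. Note that this config disables metastore-backed (here, Glue-backed) partition management, so file listing falls back to the client and performance degrades, as the error message warns.

    import org.apache.spark.sql.SparkSession

    object GluePartitionWorkaroundSketch {
      def main(args: Array[String]): Unit = {
        // Set the workaround config before the session is created (or pass it
        // via --conf on spark-submit) so it is guaranteed to take effect.
        val spark = SparkSession.builder()
          .appName("glue-partition-workaround")  // hypothetical app name
          .enableHiveSupport()
          .config("spark.sql.hive.manageFilesourcePartitions", "false")
          .getOrCreate()

        // Same pattern as the failing ETL job described above: read parquet
        // from one S3 location and write it to another (paths are placeholders).
        val df = spark.read.parquet("s3://source-bucket/input/")
        df.write.mode("overwrite").parquet("s3://target-bucket/output/")

        spark.stop()
      }
    }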