[ https://issues.apache.org/jira/browse/SPARK-30709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen deleted SPARK-30709:
---------------------------------

> Spark 2.3 to Spark 2.4 upgrade: problems reading Hive partitioned tables
> -------------------------------------------------------------------------
>
>          Key: SPARK-30709
>          URL: https://issues.apache.org/jira/browse/SPARK-30709
>      Project: Spark
>   Issue Type: Question
>  Environment: Pre-production
>     Reporter: Carlos Mario
>     Priority: Major
>       Labels: SQL, Spark
>
> Hello,
>
> We recently upgraded our pre-production environment from Spark 2.3 to Spark 2.4.0.
>
> Over time we have created a large number of tables in the Hive metastore, partitioned by two fields, one STRING and the other BIGINT.
>
> We were reading these tables with Spark 2.3 without any problem, but after upgrading to Spark 2.4 we get the following log every time we run our software:
>
> <log>
> log_filterBIGINT.out:
> Caused by: MetaException(message:Filtering is supported only on partition keys of type string)
> Caused by: MetaException(message:Filtering is supported only on partition keys of type string)
> Caused by: MetaException(message:Filtering is supported only on partition keys of type string)
>
> hadoop-cmf-hive-HIVEMETASTORE-isblcsmsttc0001.scisb.isban.corp.log.out.1:
>
> 2020-01-10 09:36:05,781 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-138]: MetaException(message:Filtering is supported only on partition keys of type string)
> 2020-01-10 11:19:19,208 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-187]: MetaException(message:Filtering is supported only on partition keys of type string)
> 2020-01-10 11:19:54,780 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-167]: MetaException(message:Filtering is supported only on partition keys of type string)
> </log>
>
> We know the best practice from the Spark point of view is to use the STRING type for partition columns, but we need to explore a solution we can deploy with ease, due
> to the large number of tables already created with a BIGINT partition column.
>
> As a first attempt we set the spark.sql.hive.manageFilesourcePartitions parameter to false in the spark-submit, but after rerunning the software the error persisted.
>
> Has anyone in the community experienced the same problem? What was the solution?
>
> Kind regards, and thanks in advance.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
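[Editor's note] As a sketch of what the issue describes, the failing pattern is a query that filters on a BIGINT partition key, and the attempted workaround is a spark-submit configuration flag. All database, table, column, class, and jar names below are hypothetical, not taken from the report:

```shell
# A filter on a BIGINT partition key. With Spark 2.4 defaults, this predicate
# can be pushed down to the Hive metastore for partition pruning, and some
# metastore configurations reject non-STRING partition-key filters with the
# MetaException quoted in the logs above.
spark-sql -e "SELECT * FROM demo_db.events WHERE event_day = 20200110"

# The workaround attempted in the report: disabling Spark-managed partition
# handling at submit time. Note that, per the Spark SQL configuration docs,
# spark.sql.hive.manageFilesourcePartitions governs file-source tables, which
# may be why it had no effect on these Hive-serde tables.
spark-submit \
  --conf spark.sql.hive.manageFilesourcePartitions=false \
  --class com.example.ReaderApp \
  reader-app.jar
```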