[ https://issues.apache.org/jira/browse/SPARK-26663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16753216#comment-16753216 ]
Dongjoon Hyun commented on SPARK-26663:
---------------------------------------

Hi, [~pomptuintje]. Thank you for reporting. Note that the given example contains invalid syntax such as `creat table`; it would help if you could report the exact script you used. I ran the following but cannot reproduce the issue.

{code:java}
Logging initialized using configuration in jar:file:/Users/dongjoon/APACHE/hive-release/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
hive> create table a(id int);
OK
Time taken: 1.299 seconds
hive> create table b(id int);
OK
Time taken: 0.046 seconds
hive> insert into a values(1);
Query ID = dongjoon_20190126145804_2c0252e1-d07c-4213-a387-90efe26d450b
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2019-01-26 14:58:06,272 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local2005651311_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: file:/user/hive/warehouse/a/.hive-staging_hive_2019-01-26_14-58-04_030_4426868381325183205-1/-ext-10000
Loading data to table default.a
Table default.a stats: [numFiles=1, numRows=1, totalSize=2, rawDataSize=1]
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 2.436 seconds
hive> insert into b values(1);
Query ID = dongjoon_20190126145810_034d9c36-0f23-42a6-ac0a-681839335bd6
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2019-01-26 14:58:11,941 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local966105199_0002
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: file:/user/hive/warehouse/b/.hive-staging_hive_2019-01-26_14-58-10_554_693159949912597124-1/-ext-10000
Loading data to table default.b
Table default.b stats: [numFiles=1, numRows=1, totalSize=2, rawDataSize=1]
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 1.551 seconds
hive> create table c as select id from a union all select id from b;
Query ID = dongjoon_20190126145831_c2b31651-c88b-47ab-9081-2375cf064b15
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2019-01-26 14:58:33,130 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local1725928125_0003
Stage-4 is filtered out by condition resolver.
Stage-3 is selected by condition resolver.
Stage-5 is filtered out by condition resolver.
Launching Job 3 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2019-01-26 14:58:34,449 Stage-3 map = 100%, reduce = 0%
Ended Job = job_local1246940820_0004
Moving data to: file:/user/hive/warehouse/c
Table default.c stats: [numFiles=1, numRows=2, totalSize=4, rawDataSize=2]
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Stage-Stage-3:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 2.806 seconds
{code}

{code:java}
19/01/26 14:58:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context available as 'sc' (master = local[*], app id = local-1548543527800).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_201)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sql("select * from c").show
19/01/26 14:58:57 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
+---+
| id|
+---+
|  1|
|  1|
+---+
{code}

> Cannot query a Hive table with subdirectories
> ---------------------------------------------
>
>                 Key: SPARK-26663
>                 URL: https://issues.apache.org/jira/browse/SPARK-26663
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Aäron
>            Priority: Major
>
> Hello,
>
> I want to report the following issue (my first one :) )
> When I create a table in Hive based on a union all then Spark 2.4 is unable
> to query this table.
> To reproduce:
> *Hive 1.2.1*
> {code:java}
> hive> creat table a(id int);
> insert into a values(1);
> hive> creat table b(id int);
> insert into b values(2);
> hive> create table c(id int) as select id from a union all select id from b;
> {code}
>
> *Spark 2.3.1*
>
> {code:java}
> scala> spark.table("c").show
> +---+
> | id|
> +---+
> |  1|
> |  2|
> +---+
> scala> spark.table("c").count
> res5: Long = 2
> {code}
>
> *Spark 2.4.0*
> {code:java}
> scala> spark.table("c").show
> 19/01/18 17:00:49 WARN HiveMetastoreCatalog: Unable to infer schema for table
> perftest_be.c from file format ORC (inference mode: INFER_AND_SAVE). Using
> metastore schema.
> +---+
> | id|
> +---+
> +---+
> scala> spark.table("c").count
> res3: Long = 0
> {code}
> I did not find an existing issue for this. Might be important to investigate.
>
> +Extra info:+ Spark 2.3.1 and 2.4.0 use the same spark-defaults.conf.
>
> Kind regards.
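For reference, a cleaned-up version of the reporter's script might look like the sketch below (hypothetical: the `creat` typos are corrected and the column list is dropped from the CTAS, since Hive does not accept one there). Note that the subdirectory layout that tends to trigger this class of problem comes from union results being written into per-branch subdirectories (e.g. `HIVE_UNION_SUBDIR_*` under Hive-on-Tez), which a plain local MapReduce run, as in the session above, may not produce.

{code:sql}
-- Hypothetical cleaned-up reproduction script, run e.g. via `hive -f repro.sql`.
-- "creat" -> "create"; column list removed from the CTAS (Hive rejects it).
CREATE TABLE a (id INT);
INSERT INTO a VALUES (1);
CREATE TABLE b (id INT);
INSERT INTO b VALUES (2);
-- Depending on the execution engine, this UNION ALL CTAS may write its output
-- into subdirectories of the table directory rather than flat files.
CREATE TABLE c AS SELECT id FROM a UNION ALL SELECT id FROM b;
{code}

Whether Spark 2.4 then returns zero rows for `spark.table("c")` likely depends on the resulting directory layout under `/user/hive/warehouse/c`, so including the engine used (MapReduce vs. Tez) in the report would help.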
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)