This is an interesting one, as it appears that a Hive transactional table cannot be queried from Spark until a major compaction has run on it. My environment:
1. Hive version 2
2. Hive on Spark engine 1.3.1
3. Spark 1.5.2

hive> create table default.foo(id int) clustered by (id) into 2 buckets STORED AS ORC TBLPROPERTIES ('transactional'='true');
hive> insert into default.foo values(10);
hive> select * from foo;
OK
10
Time taken: 0.067 seconds, Fetched: 1 row(s)

At this stage, if you do a simple select on foo from Spark, you will get an error which sounds like a bug:

spark-sql> select * from foo;
16/03/12 17:08:21 ERROR SparkSQLDriver: Failed in [select * from foo]
java.lang.RuntimeException: serious problem

No locks are held in Hive on that table. Let us go back and do a compaction in Hive:

hive> alter table foo compact 'major';
Compaction enqueued.

These messages appear in the Hive log. The compaction job is a MapReduce job:

2016-03-12T17:12:29,776 INFO [rhes564-31]: mapreduce.Job (Job.java:monitorAndPrintJob(1345)) - Running job: job_1457790020440_0006
2016-03-12T17:12:31,915 INFO [org.apache.hadoop.hive.ql.txn.compactor.HouseKeeperServiceBase$1-0]: txn.AcidHouseKeeperService (AcidHouseKeeperService.java:run(67)) - timeout reaper ran for 0seconds. isAliveCounter=-2147483542
2016-03-12T17:13:51,918 INFO [org.apache.hadoop.hive.ql.txn.compactor.HouseKeeperServiceBase$1-0]: txn.AcidCompactionHistoryService (AcidCompactionHistoryService.java:run(76)) - History reaper reaper ran for 0seconds. isAliveCounter=-2147483488

The Initiator then goes through every single table, including temp tables, to see whether it should be compacted:

2016-03-12T17:15:52,440 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact default.foo
2016-03-12T17:15:52,449 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact oraclehadoop.sales3
2016-03-12T17:15:52,468 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact oraclehadoop.smallsales
2016-03-12T17:15:52,480 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact test.stg_t2
2016-03-12T17:15:52,491 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact default.values__tmp__table__3
2016-03-12T17:15:52,492 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(94)) - Can't find table default.values__tmp__table__3, assuming it's a temp table or has been dropped and moving on.
2016-03-12T17:15:52,492 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact default.values__tmp__table__4
2016-03-12T17:15:52,492 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(94)) - Can't find table default.values__tmp__table__4, assuming it's a temp table or has been dropped and moving on.
2016-03-12T17:15:52,493 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact default.values__tmp__table__1
2016-03-12T17:15:52,493 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(94)) - Can't find table default.values__tmp__table__1, assuming it's a temp table or has been dropped and moving on.
2016-03-12T17:15:52,493 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact test.t2
2016-03-12T17:15:52,504 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact default.values__tmp__table__2
2016-03-12T17:15:52,505 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(94)) - Can't find table default.values__tmp__table__2, assuming it's a temp table or has been dropped and moving on.

OK, once the compaction (which Hive does in the background) is complete, one can query the table from Spark:

spark-sql> select * from foo;
10
Time taken: 4.509 seconds, Fetched 1 row(s)

I notice that if you insert a new row into foo (from Hive), you get the same error again in Spark:

scala> HiveContext.sql("select * from foo").collect.foreach(println)
java.lang.RuntimeException: serious problem

This looks like a bug, as it seems the table only works after compaction is done interactively or after Hive does it itself!

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 12 March 2016 at 08:24, @Sanjiv Singh <sanjiv.is...@gmail.com> wrote:

> Hi All,
>
> I am facing this issue on an HDP setup, on which COMPACTION is required
> only once on a transactional table before records can be fetched with
> Spark SQL. On the other hand, the Apache setup doesn't require compaction
> even once.
>
> Maybe something gets triggered on the metastore after compaction, and
> Spark SQL starts recognizing the delta files.
>
> Let me know if you need other details to get to the root cause.
>
> Try this,
>
> *See complete scenario :*
>
> hive> create table default.foo(id int) clustered by (id) into 2 buckets
> STORED AS ORC TBLPROPERTIES ('transactional'='true');
> hive> insert into default.foo values(10);
>
> scala> sqlContext.table("default.foo").count // Gives 0, which is wrong
> because data is still in delta files
>
> Now run major compaction:
>
> hive> ALTER TABLE default.foo COMPACT 'MAJOR';
>
> scala> sqlContext.table("default.foo").count // Gives 1
>
> hive> insert into foo values(20);
>
> scala> sqlContext.table("default.foo").count *// Gives 2, no compaction
> required.*
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
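[Editor's note] The behaviour in both transcripts above is consistent with how Hive lays ACID tables out on disk: each insert writes its rows into a new delta subdirectory, and a major compaction rewrites everything into a single base directory. A Spark build without delta support can only consume the compacted base. Below is a minimal sketch of that visibility rule; it is a plain-Python simulation, not Hive or Spark code, and the directory names and row counts are invented for illustration.

```python
# Simulation of ORC ACID file visibility -- illustrative only, not Hive code.
# Assumed layout: inserts create delta_<min>_<max> dirs; a major compaction
# replaces them with a single base_<txn> dir (names here are made up).

def visible_rows(dirs, acid_aware):
    """Count rows a reader can see in a table directory.

    An ACID-aware reader merges base_ and delta_ directories; a reader
    without delta support (like the failing Spark versions discussed in
    the thread) can only consume base_ directories left by compaction.
    """
    readable = {name: rows for name, rows in dirs.items()
                if acid_aware or name.startswith("base_")}
    return sum(readable.values())

# After "insert into default.foo values(10)": one delta, no base yet.
after_insert = {"delta_0000001_0000001": 1}

# After "alter table foo compact 'major'": the delta is rewritten as a base.
after_compaction = {"base_0000001": 1}

# After a further insert post-compaction: the base plus a fresh delta.
after_second_insert = {"base_0000001": 1, "delta_0000002_0000002": 1}

print(visible_rows(after_insert, acid_aware=False))         # 0 rows visible
print(visible_rows(after_compaction, acid_aware=False))     # 1
print(visible_rows(after_second_insert, acid_aware=False))  # 1 (new row hidden)
print(visible_rows(after_second_insert, acid_aware=True))   # 2
```

On this reading, whether the post-compaction insert "gives 2" would depend on whether the Spark build in use can read delta directories at all, which could account for the HDP and Apache setups behaving differently.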