Hi Team, data is returned when the table is queried from Hive, but not from Spark. Could you assist in finding the gap?
Details below.

******************************Approach 1 --- successful****************************

select * from emp_cow limit 2;

20190503171506 20190503171506_0_424 4 default 71ff4cc6-bd8e-4c48-a075-98f32efc14b2_0_20190503171506.parquet 4 13Vivian Walter -1641 1556883906604 608806001 511.63 1461868200000 401217383000
20190503171506 20190503171506_0_425 8 default 71ff4cc6-bd8e-4c48-a075-98f32efc14b2_0_20190503171506.parquet 8 13Oprah Gross -32255 1556883906604 761166471 536.4 1516473000000 816189568000

******************************Approach 2 --- successful****************************

spark.read.format("com.uber.hoodie").load("/apps/hive/warehouse/emp_cow_03/default/*").show

+-------------------+--------------------+------------------+----------------------+--------------------+------+------------------+---------+-------------+---------+---------+-------------+-------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name|emp_id|          emp_name|emp_short|           ts| emp_long|emp_float|     emp_date|emp_timestamp|
+-------------------+--------------------+------------------+----------------------+--------------------+------+------------------+---------+-------------+---------+---------+-------------+-------------+
|     20190503171506|20190503171506_0_424|                 4|               default|71ff4cc6-bd8e-4c4...|     4|   13Vivian Walter|    -1641|1556883906604|608806001|   511.63|1461868200000| 401217383000|
+----

******************************Approach 3 --- No records****************************

***To read the RO table as a Hive table using Spark***

When I read the same table from Spark as a Hive table, no records are returned:

sqlContext.sql("select * from hudi.emp_cow_03").show   ---- in the Scala console
select * from hudi.emp_cow_03                          ---- in the Spark SQL console

No result. Only the headers/column names are printed.
FYI, the table DDL:

CREATE EXTERNAL TABLE `emp_cow`(
  `_hoodie_commit_time` string,
  `_hoodie_commit_seqno` string,
  `_hoodie_record_key` string,
  `_hoodie_partition_path` string,
  `_hoodie_file_name` string,
  `emp_id` int,
  `emp_name` string,
  `emp_short` int,
  `ts` bigint,
  `emp_long` bigint,
  `emp_float` float,
  `emp_date` bigint,
  `emp_timestamp` bigint)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'com.uber.hoodie.hadoop.HoodieInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'hdfs://nn10.htrunk.com/apps/hive/warehouse/emp_cow'
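One thing worth checking, since the DDL relies on com.uber.hoodie.hadoop.HoodieInputFormat: by default Spark SQL converts Hive Parquet tables to its native Parquet reader (spark.sql.hive.convertMetastoreParquet=true), which bypasses the table's registered InputFormat. Per Hudi's query-engine guidance, that conversion should be disabled and the Hudi bundle jar put on the Spark classpath so HoodieInputFormat is actually used. A minimal sketch of the launch config, assuming spark-shell; the jar path is hypothetical and should be replaced with your actual bundle:

```shell
# Sketch only: disable the metastore Parquet conversion so Spark honours
# the table's HoodieInputFormat, and put the Hudi bundle on the classpath.
# /path/to/hoodie-spark-bundle.jar is a placeholder for your environment.
spark-shell \
  --jars /path/to/hoodie-spark-bundle.jar \
  --conf spark.sql.hive.convertMetastoreParquet=false

# then, inside the shell:
#   sqlContext.sql("select * from hudi.emp_cow_03").show
```

The same setting can be applied per session with `sqlContext.sql("set spark.sql.hive.convertMetastoreParquet=false")` before the query.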