On 2019/05/17 12:45:26, satish.sidnakoppa...@gmail.com
<satish.sidnakoppa...@gmail.com> wrote:
>
>
> On 2019/05/17 12:37:10, satish.sidnakoppa...@gmail.com
> <satish.sidnakoppa...@gmail.com> wrote:
> > Hi Team,
> >
> > Data is returned when queried from hive.
> > But not in Spark. Could you assist in finding the gap?
> >
> > Details below
> >
> > ******************************Approach 1 ---
> > successful****************************
> >
> > select * from emp_cow limit 2;
> > 20190503171506  20190503171506_0_424  4  default  71ff4cc6-bd8e-4c48-a075-98f32efc14b2_0_20190503171506.parquet  4  13Vivian Walter  -1641  1556883906604  608806001  511.63  1461868200000  401217383000
> > 20190503171506  20190503171506_0_425  8  default  71ff4cc6-bd8e-4c48-a075-98f32efc14b2_0_20190503171506.parquet  8  13Oprah Gross  -32255  1556883906604  761166471  536.4  1516473000000  816189568000
> >
> > ******************************Approach 2 ---
> > successful****************************
> >
> > spark.read.format("com.uber.hoodie").load("/apps/hive/warehouse/emp_cow_03/default/*").show
> > +-------------------+--------------------+------------------+----------------------+--------------------+------+------------------+---------+-------------+---------+---------+-------------+-------------+
> > |_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name|emp_id|          emp_name|emp_short|           ts| emp_long|emp_float|     emp_date|emp_timestamp|
> > +-------------------+--------------------+------------------+----------------------+--------------------+------+------------------+---------+-------------+---------+---------+-------------+-------------+
> > |     20190503171506|20190503171506_0_424|                 4|               default|71ff4cc6-bd8e-4c4...|     4|   13Vivian Walter|    -1641|1556883906604|608806001|   511.63|1461868200000| 401217383000|
> > +----
> >
> > ******************************Approach 3 --- No
> > records****************************
> >
> >
> > ***Reading the RO table as a Hive table using Spark***
> > But when I read it from Spark as a Hive table, no records are returned.
> >
> >
> > sqlContext.sql("select * from hudi.emp_cow").show ---- in the Scala console
> > select * from hudi.emp_cow ---- in the Spark console
> >
> > No result.
> >
> > Only headers/column names are printed.
> >
> >
> > FYI Table DDL
> >
> >
> > CREATE EXTERNAL TABLE `emp_cow`(
> > `_hoodie_commit_time` string,
> > `_hoodie_commit_seqno` string,
> > `_hoodie_record_key` string,
> > `_hoodie_partition_path` string,
> > `_hoodie_file_name` string,
> > `emp_id` int,
> > `emp_name` string,
> > `emp_short` int,
> > `ts` bigint,
> > `emp_long` bigint,
> > `emp_float` float,
> > `emp_date` bigint,
> > `emp_timestamp` bigint)
> > ROW FORMAT SERDE
> > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> > STORED AS INPUTFORMAT
> > 'com.uber.hoodie.hadoop.HoodieInputFormat'
> > OUTPUTFORMAT
> > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> > LOCATION
> > '/apps/hive/warehouse/emp_cow'
> >
>
>
>
> Fixed the typo.
>
> path is /apps/hive/warehouse/emp_cow
> table name is emp_cow
>
Issue fixed.
The path in the table creation was incorrect:
LOCATION '/apps/hive/warehouse/emp_cow'
should be
LOCATION '/apps/hive/warehouse/emp_cow/default'
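
For anyone hitting the same symptom, the fix above does not require recreating the table. A minimal sketch, assuming the table lives in the `hudi` database as in the thread (ALTER TABLE ... SET LOCATION and DESCRIBE FORMATTED are standard HiveQL):

```sql
-- Repoint the existing external table at the partition directory Hudi
-- actually writes to. Database/table names are taken from the thread;
-- adjust them to your environment.
ALTER TABLE hudi.emp_cow SET LOCATION '/apps/hive/warehouse/emp_cow/default';

-- Verify: the Location field should now include the /default partition path.
DESCRIBE FORMATTED hudi.emp_cow;
```

After the location change, re-running `select * from hudi.emp_cow` from Spark should return rows rather than only the column headers.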