[ https://issues.apache.org/jira/browse/HUDI-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raymond Xu updated HUDI-1608:
-----------------------------
    Labels: pull-request-available sev:critical  (was: pull-request-available)

> MOR fetches all records for read optimized query w/ spark sql
> -------------------------------------------------------------
>
>                 Key: HUDI-1608
>                 URL: https://issues.apache.org/jira/browse/HUDI-1608
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Spark Integration
>    Affects Versions: 0.7.0
>            Reporter: sivabalan narayanan
>            Priority: Major
>              Labels: pull-request-available, sev:critical
>
> Script to reproduce in local spark:
>
> [https://gist.github.com/nsivabalan/7250b794788516f1aec35650c2632364]
>
> ```
> scala> spark.sql("select _hoodie_commit_time, _hoodie_record_key, _hoodie_partition_path, id, __op from hudi_trips_snapshot order by _hoodie_record_key").show(false)
> +-------------------+------------------+----------------------+---+----+
> |_hoodie_commit_time|_hoodie_record_key|_hoodie_partition_path|id |__op|
> +-------------------+------------------+----------------------+---+----+
> |20210210070347     |1                 |1970-01-01            |1  |null|
> |20210210070347     |2                 |1970-01-01            |2  |null|
> |20210210070347     |3                 |2020-01-04            |3  |D   |
> |20210210070347     |4                 |1998-04-13            |4  |I   |
> |20210210070347     |5                 |2020-01-01            |5  |I   |
> |*20210210070445*   |*6*               |*1998-04-13*          |*6*|*I* |
> +-------------------+------------------+----------------------+---+----+
> ```
> After an upsert, the read optimized query returns records from both
> commits C1 and C2. Also, I don't find any log files in the partitions;
> all of them are parquet files.
>
> ls /tmp/hudi_trips_cow/1998-04-13/
> 0d1e6a84-d036-42e9-806e-a3075b6bc677-0_1-23-12025_20210210065058.parquet
> 0d1e6a84-d036-42e9-806e-a3075b6bc677-0_1-61-25595_20210210065127.parquet
> ls /tmp/hudi_trips_cow/1970-01-01/
> 7b836833-a656-485d-967a-871bdc653dc3-0_2-61-25596_20210210065127.parquet
> 7b836833-a656-485d-967a-871bdc653dc3-0_3-23-12027_20210210065058.parquet
>
> Source of the issue: [https://github.com/apache/hudi/issues/2255]
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
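
For readers of the listing above: each partition holds two parquet files that share one file ID and differ only in write token and commit time. A rough sketch of how one might pick the newest base file per file ID from such a listing follows. It assumes the names follow the pattern <fileId>_<writeToken>_<instantTime>.parquet, and the object and method names are hypothetical helpers for illustration, not part of Hudi.

```scala
// Hypothetical helper, not Hudi code: selects the newest base file per file ID.
// Assumption: names look like <fileId>_<writeToken>_<instantTime>.parquet.
object LatestBaseFiles {
  final case class BaseFile(fileId: String, writeToken: String, instantTime: String, name: String)

  // Returns None for anything that does not match the assumed naming pattern.
  def parseBaseFile(name: String): Option[BaseFile] =
    if (!name.endsWith(".parquet")) None
    else name.stripSuffix(".parquet").split("_") match {
      case Array(fileId, writeToken, instantTime) =>
        Some(BaseFile(fileId, writeToken, instantTime, name))
      case _ => None
    }

  // Instant times are fixed-width digit strings, so string order matches time order.
  def latestBaseFiles(names: Seq[String]): Seq[String] =
    names.flatMap(parseBaseFile)
      .groupBy(_.fileId)
      .values
      .map(_.maxBy(_.instantTime).name)
      .toSeq
      .sorted
}
```

Applied to the two files listed under 1998-04-13, this keeps only the file written at 20210210065127. If a read optimized query is meant to read just the latest base file of each file group, that is the only file it should scan in that partition.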