alexey-chumakov opened a new issue, #5692:
URL: https://github.com/apache/hudi/issues/5692

   **Describe the problem you faced**
   
   A Hudi time-travel query (using the "as.of.instant" parameter) returns 
incorrect results when the table path ends with /*
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Write one record to a Hudi table
   2. Write one more record
   3. Get the first commit time
   4. Fetch data from Hudi using 
   ```
   spark
       .read
       .format("hudi")
       .option(TIME_TRAVEL_AS_OF_INSTANT.key(), firstCommit)
       .load(s"$tempPath/*")
   ```
   5. Check the number of records in the result: records from all commits are 
present in the resulting dataframe 
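
   The steps above can be sketched end to end roughly as follows. This is a minimal sketch, not the code from the linked example: the object name, table name, column names, and temp directory are hypothetical, and it assumes a local Spark session with the Hudi Spark bundle on the classpath. The first commit time is read from the `_hoodie_commit_time` metadata column, which is one possible way to obtain it.

   ```scala
   import org.apache.hudi.DataSourceReadOptions.TIME_TRAVEL_AS_OF_INSTANT
   import org.apache.spark.sql.{SaveMode, SparkSession}

   // Hypothetical reproduction sketch; all names here are illustrative.
   object TimeTravelGlobSketch {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[1]")
         .appName("hudi-time-travel-glob")
         // Hudi requires Kryo serialization in Spark.
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .getOrCreate()
       import spark.implicits._

       // Assumed scratch location for the table.
       val tempPath = java.nio.file.Files.createTempDirectory("hudi_tt").toString

       // Writes a single record with the given id as one Hudi commit.
       def write(id: Int, mode: SaveMode): Unit =
         Seq((id, s"name-$id", System.currentTimeMillis(), "p1"))
           .toDF("id", "name", "ts", "part")
           .write.format("hudi")
           .option("hoodie.table.name", "tt_example")
           .option("hoodie.datasource.write.recordkey.field", "id")
           .option("hoodie.datasource.write.precombine.field", "ts")
           .option("hoodie.datasource.write.partitionpath.field", "part")
           .mode(mode)
           .save(tempPath)

       write(1, SaveMode.Overwrite) // step 1: first commit
       write(2, SaveMode.Append)    // step 2: second commit

       // Step 3: take the earliest commit time from the Hudi metadata column.
       val firstCommit = spark.read.format("hudi").load(tempPath)
         .select("_hoodie_commit_time").distinct()
         .orderBy("_hoodie_commit_time")
         .first().getString(0)

       // Steps 4-5: time-travel read against a given path, returning the row count.
       def countAsOf(path: String): Long = spark.read.format("hudi")
         .option(TIME_TRAVEL_AS_OF_INSTANT.key(), firstCommit)
         .load(path)
         .count()

       // Compare the plain table path with the glob path described in this issue.
       println(s"without glob: ${countAsOf(tempPath)}")
       println(s"with glob:    ${countAsOf(s"$tempPath/*")}")

       spark.stop()
     }
   }
   ```

   Comparing the two counts shows whether the "as.of.instant" option is honored for each path form.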
   
   **Expected behavior**
   
   Only the first commit should be present in the resulting dataframe (or paths 
ending with "/*" should be prohibited)
   
   **Environment Description**
   
   * Hudi version : 0.9.0
   
   * Spark version : 3.1.2
   
   * Running on Docker? (yes/no) : no
   
   **Additional context**
   
   Everything works properly when the same table is queried without "/*" at the 
end of the table path. Please check an example showing this issue at 
[https://github.com/alexey-chumakov/hudi-issues/blob/master/src/main/scala/HudiTimeTravelExample.scala](https://github.com/alexey-chumakov/hudi-issues/blob/master/src/main/scala/HudiTimeTravelExample.scala)
   
   Is this a correct way of accessing Hudi tables in general? If not, how can 
partitioned tables with "hoodie.datasource.write.drop.partition.columns" be 
accessed?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

