nsivabalan commented on code in PR #11514:
URL: https://github.com/apache/hudi/pull/11514#discussion_r1663011966


##########
rfc/rfc-78/rfc-78.md:
##########
@@ -179,6 +183,9 @@ Let’s reiterate what we need to support w/ 0.16.0 reader.
 On a high level, we need to ensure commit metadata in either format (avro or 
json) need to be supported. And “cluster” and completed “compaction”s need to 
be readable in 0.16.0 reader.
 - But the challenging part is, for every commit metadata, we might have to 
deserialize to avro and on exception try json. We could deduce the format using 
completion file name, but as per current code layering, deserialization methods 
does not know the file name( method takes byte[]).
 - Similarly for clustering commits, unless we have some kind of watermark, we 
have to keep considering replace commits as well in the FSV building logic to 
ensure we do not miss any clustering commits.
+- To be decided: We also need to use diff LogFileComparators depending on the 
file slice's base instant time. If the file slices's base instant time is < 
table upgrade commit time, we use older log file comparator to order log files. 
but if file slice's base instant time > table upgrade commit time, we have to 
use new log file comparator (completion time). Tricky part is if a file slice 
contains a mix of log files. 
+ This fix definitely needs to go into 1.x, but whether we wanted to port this 
change to 0.16.x or not is yet to be discussed and decided. Lets zoom in a bit 
to see what will happen if a single file slice could contain a mix of log files 
using 1.x reader(this is a basic requirement to support 0.16.x tables in 1.x). 

Review Comment:
   Which means the log file comparator logic from 1.x needs to be ported to 
the 0.16.x reader.
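
   For illustration only, a minimal sketch of the comparator selection the RFC text describes. Every name here (`LogFileOrderingSketch`, `LogFile`, `LEGACY`, `BY_COMPLETION`, `comparatorFor`) is hypothetical and not Hudi's actual API. It assumes instant times are fixed-width timestamp strings that order lexicographically, and it deliberately leaves out the still-undecided case of a file slice holding a mix of log files.

```java
import java.util.Comparator;

// Hypothetical sketch; these types and names are NOT Hudi classes.
class LogFileOrderingSketch {

  // Minimal stand-in for a log file, carrying only the fields used for ordering.
  static final class LogFile {
    final String deltaCommitTime; // instant time the log file was written under
    final int version;            // log file version within the file slice
    final String completionTime;  // assumed present for files written after the upgrade

    LogFile(String deltaCommitTime, int version, String completionTime) {
      this.deltaCommitTime = deltaCommitTime;
      this.version = version;
      this.completionTime = completionTime;
    }
  }

  // Older ordering: by delta commit time, then by log version.
  static final Comparator<LogFile> LEGACY =
      Comparator.comparing((LogFile f) -> f.deltaCommitTime)
          .thenComparingInt(f -> f.version);

  // Newer ordering: by completion time, falling back to the older ordering on ties.
  static final Comparator<LogFile> BY_COMPLETION =
      Comparator.comparing((LogFile f) -> f.completionTime)
          .thenComparing(LEGACY);

  // Picks a comparator from the file slice's base instant time.
  // Slices created before the table upgrade commit use the older ordering;
  // all other slices use completion-time ordering. A slice with a mix of
  // log files is not handled here (still to be decided in the RFC).
  static Comparator<LogFile> comparatorFor(String baseInstantTime, String upgradeCommitTime) {
    return baseInstantTime.compareTo(upgradeCommitTime) < 0 ? LEGACY : BY_COMPLETION;
  }
}
```

   If the 0.16.x reader takes this route, it would need the same selection as the 1.x reader, so both order log files in a given slice identically.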



