emkornfield commented on code in PR #48431:
URL: https://github.com/apache/arrow/pull/48431#discussion_r3003156938


##########
cpp/src/parquet/file_reader.cc:
##########
@@ -441,6 +442,35 @@ class SerializedFile : public ParquetFileReader::Contents {
         auto footer_buffer,
         source_->ReadAt(source_size_ - footer_read_size, footer_read_size));
     uint32_t metadata_len = ParseFooterLength(footer_buffer, footer_read_size);
+    if (properties_.read_flatbuffer_metadata_if_present()) {
+      // Try to extract flatbuffer metadata from footer
+      std::string flatbuffer_data;
+      auto result = ExtractFlatbuffer(footer_buffer, &flatbuffer_data);
+      if (result.ok()) {
+        uint32_t required_or_consumed = *result;
+        if (required_or_consumed > static_cast<uint32_t>(footer_buffer->size()) &&
+            static_cast<int64_t>(required_or_consumed) <= source_size_) {
+          PARQUET_ASSIGN_OR_THROW(
+              footer_buffer,
+              source_->ReadAt(source_size_ - required_or_consumed, required_or_consumed));
+          footer_read_size = required_or_consumed;
+          result = ExtractFlatbuffer(footer_buffer, &flatbuffer_data);
+        }
+        // If successfully extracted flatbuffer data, parse it and return
+        if (result.ok() && *result > 0 && !flatbuffer_data.empty()) {
+          // Get flatbuffer metadata and convert to thrift
+          const format3::FileMetaData* fb_metadata =
+              format3::GetFileMetaData(flatbuffer_data.data());
+          auto thrift_metadata =

Review Comment:
   I'm not sure this addresses the earlier comment about not requiring the
conversion to Thrift. Parquet C++ already tries to abstract the footer
metadata behind an interface; is there a reason we can't make that interface
consume the flatbuffer metadata directly?
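
   To make the suggestion concrete, here is a minimal sketch of what a
format-neutral footer abstraction could look like. Every name in it
(`FooterView`, `ThriftFooter`, `FlatbufferFooter`, `MakeFooterView`) is
hypothetical and chosen for illustration only; none of them are the actual
Parquet C++, `format`, or `format3` types. The two structs stand in for the
decoded Thrift and flatbuffer representations.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the two decoded footer representations.
// These are NOT the real parquet::format or format3 types.
struct ThriftFooter {
  int64_t num_rows;
  int32_t version;
};
struct FlatbufferFooter {
  int64_t num_rows;
  int32_t version;
};

// Hypothetical format-neutral interface that the reader would program against.
class FooterView {
 public:
  virtual ~FooterView() = default;
  virtual int64_t num_rows() const = 0;
  virtual int32_t version() const = 0;
};

// Backend that answers queries from the Thrift representation.
class ThriftFooterView : public FooterView {
 public:
  explicit ThriftFooterView(ThriftFooter footer) : footer_(std::move(footer)) {}
  int64_t num_rows() const override { return footer_.num_rows; }
  int32_t version() const override { return footer_.version; }

 private:
  ThriftFooter footer_;
};

// Backend that answers queries from the flatbuffer representation directly,
// with no intermediate conversion to Thrift.
class FlatbufferFooterView : public FooterView {
 public:
  explicit FlatbufferFooterView(FlatbufferFooter footer)
      : footer_(std::move(footer)) {}
  int64_t num_rows() const override { return footer_.num_rows; }
  int32_t version() const override { return footer_.version; }

 private:
  FlatbufferFooter footer_;
};

// Overloaded factory: callers receive the same interface either way.
std::unique_ptr<FooterView> MakeFooterView(ThriftFooter footer) {
  return std::make_unique<ThriftFooterView>(std::move(footer));
}
std::unique_ptr<FooterView> MakeFooterView(FlatbufferFooter footer) {
  return std::make_unique<FlatbufferFooterView>(std::move(footer));
}
```

   With something along these lines, the branch in the hunk above could hand
the parsed flatbuffer to the flatbuffer-backed implementation and return,
instead of first materializing a Thrift object. Whether the existing metadata
interface can absorb a second backend this cleanly is the open question.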



