paleolimbot commented on code in PR #49771:
URL: https://github.com/apache/arrow/pull/49771#discussion_r3203177106


##########
cpp/src/arrow/ipc/message.cc:
##########
@@ -565,6 +565,17 @@ Status DecodeMessage(MessageDecoder* decoder, io::InputStream* file) {
   auto metadata_length = decoder->next_required_size();
   ARROW_ASSIGN_OR_RAISE(auto metadata, file->Read(metadata_length));
   if (metadata->size() != metadata_length) {
+    // The first sizeof(int32_t) bytes of the Arrow file magic ("ARRO") may have been
+    // misread as metadata_length. Check if the remaining bytes complete the magic.

Review Comment:
   In nanoarrow we check the first few bytes for the magic string and skip them,
then attempt to read the rest of the input as an IPC stream. We've never had a
complaint about this not working, but I'm not sure how widespread the usage is
(we could add an option to turn it off, or improve the error that occurs if we
run into one). I think 1330795073 bytes of metadata (the bytes "ARRO" read as a
little-endian int32) would never reasonably occur on purpose.
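
For illustration, here is a minimal sketch of the check described above. It is not the nanoarrow or Arrow C++ implementation: `ReadLittleEndianInt32` and `LeadingMagicBytesToSkip` are hypothetical names, and the sketch assumes the Arrow file layout of the 6-byte magic "ARROW1" padded to an 8-byte boundary. It also shows where 1330795073 comes from.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Arrow file magic. Assumption for this sketch: a file starts with these
// 6 bytes followed by 2 padding bytes (8 bytes in total).
constexpr char kArrowMagic[] = "ARROW1";
constexpr std::size_t kArrowMagicLength = 6;
constexpr std::size_t kPaddedMagicLength = 8;

// Interpret 4 bytes as a little-endian int32, the way a stream reader would
// interpret a length prefix. Independent of host endianness.
inline int32_t ReadLittleEndianInt32(const unsigned char* data) {
  const uint32_t value = static_cast<uint32_t>(data[0]) |
                         (static_cast<uint32_t>(data[1]) << 8) |
                         (static_cast<uint32_t>(data[2]) << 16) |
                         (static_cast<uint32_t>(data[3]) << 24);
  return static_cast<int32_t>(value);
}

// Hypothetical helper: returns how many leading bytes to skip before trying
// to read the input as an IPC stream. Returns 0 if the input does not start
// with the file magic (or is too short to contain it).
inline std::size_t LeadingMagicBytesToSkip(const unsigned char* data,
                                           std::size_t size) {
  if (size >= kPaddedMagicLength &&
      std::memcmp(data, kArrowMagic, kArrowMagicLength) == 0) {
    return kPaddedMagicLength;
  }
  return 0;
}
```

With a buffer starting `'A','R','R','O','W','1',0,0`, `ReadLittleEndianInt32` yields 1330795073 (0x4F525241), which is the value a stream reader would otherwise take as the metadata length, and `LeadingMagicBytesToSkip` returns 8.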



