tustvold commented on issue #5332:
URL: https://github.com/apache/arrow-rs/issues/5332#issuecomment-2551785491

   > Many data services accept Parquet as input. Being an open format it has 
become a de-facto interchange format between systems.
   
   Different applications will have different threat models and should make 
their own judgements, but I would certainly encourage any applications 
accepting truly untrusted parquet data to rewrite it in some sandboxed 
environment before handing them off to other systems. This is fairly standard 
practice when it comes to other media files, e.g. images, video, etc... even 
where there are extremely mature and well-tested transcoders. 
   
   However, many systems will instead be accepting files from other internal 
systems, at which point perhaps the threat model is different.
   
   _Security concerns aside, I would recommend rewriting parquet files anyway 
because of the sheer variety of parquet implementations - two files with the 
same data may behave very differently depending on how they've been written_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]