Re: [liberationtech] Metadata Cleanup trough File Format Convertion?
Fabio Pietrosanti (naif) li...@infosecurity.ch wrote:

> I've been thinking about the topic of metadata cleanup of files from an implementation point of view. Regardless of whether it's something useful or not for a whistleblowing platform (GlobaLeaks),

In general, it is. To be responsible, any such platform must at least look at anything they are going to release and consider whether some of it needs to be redacted. Metadata needs to be considered in that process.

There are cases, though, where metadata indicating the source of a document is critical to evaluating it. Consider a document that purports to give US policy on targeting for drone strikes. Does it come from a field commander? Or Washington? Pentagon? CIA? President's office? Or is it, say, analysis by the Pakistani government? Or just speculation by some journalist?

--
Too many emails? Unsubscribe, change to digest, or change password by emailing moderator at compa...@stanford.edu or changing your settings at https://mailman.stanford.edu/mailman/listinfo/liberationtech
Re: [liberationtech] Metadata Cleanup trough File Format Convertion?
Maybe this would help:

On the Mac platform, Lemkesoft's GraphicConverter is one of the oldest and most versatile graphic media format conversion programs (AND a good photo editor). It currently works with 60+ formats and explicitly allows removing OR modifying metadata in batch mode.

www.lemkesoft.com, or write to the author, Thorsten Lemke, at supp...@lemkesoft.com

There are a dozen or more language versions of GraphicConverter, and it's modestly priced.

bruce

On Jul 17, 2013, at 12:28 PM, Fabio Pietrosanti (naif) li...@infosecurity.ch wrote:

> Hi all,
>
> I've been thinking about the topic of metadata cleanup of files from an implementation point of view.
>
> Regardless of whether it's something useful or not for a whistleblowing platform (GlobaLeaks), I've been considering whether metadata cleanup could be approached through file format conversion.
>
> If I'd like to remove metadata from various document formats (pdf, word, ppt, excel, etc) or image files, I've been thinking that rather than explicitly removing metadata, a different approach would be to do a file conversion.
>
> If a JPEG is converted to PNG, maybe all metadata is lost. (this has to be verified)
>
> If a DOC/DOCX is converted to a PDF, maybe all metadata is lost.
>
> At GlobaLeaks we've been discussing introducing metadata cleanup [1], but also file sterilization [2], with the goal of protecting Receivers of a whistleblowing site against targeted 0day attacks.
>
> Should we approach metadata cleanup by doing the file sterilization processing through the existing LibreOffice conversion API [3] to save engineering effort/time?
>
> [1] Metadata Cleanup https://github.com/globaleaks/GlobaLeaks/issues/305
> [2] File Sterilization https://github.com/globaleaks/GlobaLeaks/issues/270
> [3] LibreOffice Conversion API https://github.com/dagwieers/unoconv
>
> --
> Fabio Pietrosanti (naif)
> HERMES - Center for Transparency and Digital Human Rights
> http://logioshermes.org - http://globaleaks.org - http://tor2web.org
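The "this has to be verified" part of the JPEG-to-PNG idea can be checked mechanically. Below is a minimal sketch, using only the Python standard library, that lists the chunk types in a PNG and flags the ones that can carry text, timestamps or Exif. The function names and the choice of which chunks count as "metadata" are assumptions made for illustration; nothing here comes from GlobaLeaks. A clean result only says that these chunks are absent. It says nothing about watermarks carried in the pixels themselves.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Ancillary chunk types from the PNG specification that hold descriptive
# data (text, timestamps, Exif) rather than pixels. This selection is an
# assumption of the sketch, not an exhaustive list.
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"tIME", b"eXIf"}


def list_png_chunks(data: bytes) -> list[bytes]:
    """Return the chunk types found in a PNG byte string, in file order."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = []
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        chunks.append(chunk_type)
        pos += 8 + length + 4
        if chunk_type == b"IEND":
            break
    return chunks


def metadata_chunks(data: bytes) -> list[bytes]:
    """Return only the chunk types that may carry metadata."""
    return [c for c in list_png_chunks(data) if c in METADATA_CHUNKS]
```

Running `metadata_chunks(open("converted.png", "rb").read())` on the output of a conversion (the file name is hypothetical) gives an empty list when none of these chunks survived. The CRCs are not validated here, so this is a check on structure, not on file integrity.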
Re: [liberationtech] Metadata Cleanup trough File Format Convertion?
Hi,

Griffin Boyce wrote (17 Jul 2013 21:40:57 GMT):

> PDFs are an interesting situation, because they have metadata, and the files within have metadata, and even embedded fonts can have metadata that could reveal the source of the document.

IIRC the MAT [1] uses an interesting trick: rendering the PDF on a Cairo surface.

[1] https://mat.boum.org/ or apt-get install mat

Cheers,
--
intrigeri
| GnuPG key @ https://gaffer.ptitcanardnoir.org/intrigeri/intrigeri.asc
| OTR fingerprint @ https://gaffer.ptitcanardnoir.org/intrigeri/otr.asc
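Whatever cleanup route is taken for PDFs (re-rendering or conversion), it helps to look at a file before and after. The following is a rough sketch, standard library only, that scans raw PDF bytes for the standard document information keys and for an XMP packet. It is not how MAT works, and the function names are made up for this example. It is also deliberately crude: it misses values stored as hex strings, values containing nested parentheses, anything inside compressed object streams, and metadata in embedded files or fonts, so an empty result is not proof of a clean file.

```python
import re

# Standard keys of the PDF document information dictionary, matched only
# when the value is a simple literal string such as /Author (Alice).
INFO_PATTERN = re.compile(
    rb"/(Title|Author|Subject|Keywords|Creator|Producer|CreationDate|ModDate)"
    rb"\s*\(((?:\\.|[^\\()])*)\)"
)


def scan_pdf_info(data: bytes) -> dict[str, str]:
    """Return info-dictionary entries visible in the uncompressed bytes."""
    found = {}
    for key, value in INFO_PATTERN.findall(data):
        found[key.decode("ascii")] = value.decode("latin-1")
    return found


def has_xmp_packet(data: bytes) -> bool:
    """Report whether an uncompressed XMP metadata packet is present."""
    return b"<x:xmpmeta" in data
```

Comparing `scan_pdf_info()` on the original and on the cleaned output shows at a glance whether fields such as Author or Producer were carried across.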
Re: [liberationtech] Metadata Cleanup trough File Format Convertion?
On 17/07/2013 21:22, Nick wrote:

> Quoth Fabio Pietrosanti (naif):
>> If a JPEG is converted to PNG, maybe all metadata is lost. (this has to be verified)
>>
>> If a DOC/DOCX is converted to a PDF, maybe all metadata is lost.
>
> Interesting topic. I'd be most worried about watermarks, as depending on the format they may well remain, and be difficult to find or test for. I don't know if they're routinely used, but it's certainly something to be aware of.

Did you know about the MAT (https://mat.boum.org)?
Re: [liberationtech] Metadata Cleanup trough File Format Convertion?
On 07/17/2013 04:01 PM, jvoisin wrote:

> On 17/07/2013 21:22, Nick wrote:
>> Quoth Fabio Pietrosanti (naif):
>>> If a JPEG is converted to PNG, maybe all metadata is lost. (this has to be verified)

Our ObscuraCam app for Android strips all EXIF metadata from JPEGs. The PNG conversion trick does work as well, in my personal experience.
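For readers wondering what stripping EXIF from a JPEG involves at the byte level, here is a minimal sketch in standard-library Python. It is not ObscuraCam's implementation. It walks the segments before the image data and leaves out APP1 (Exif, XMP), APP13 (Photoshop/IPTC) and COM (comment) segments; which segments to drop is an assumption of this example. It does not handle padding bytes between segments, does not touch ICC profiles (APP2) or JFIF thumbnails (APP0), and does nothing about data appended after the end of the image.

```python
import struct

# Marker bytes (the byte after 0xFF) of the segments this sketch drops:
# APP1 (Exif, XMP), APP13 (Photoshop/IPTC) and COM (free-text comment).
DROP_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Copy a JPEG, leaving out its APP1, APP13 and COM segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker should be")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker not in DROP_MARKERS:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")
```

Because the compressed image data is copied untouched, there is no quality loss from re-encoding. The flip side is that anything embedded in the pixels, such as a watermark, survives as well.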
Re: [liberationtech] Metadata Cleanup trough File Format Convertion?
Fabio Pietrosanti (naif) li...@infosecurity.ch wrote:

> Hi all, I've been thinking about the topic of metadata cleanup of files from an implementation point of view.

Media metadata is incredibly fascinating :D

ObscuraCam does a really great job of cleaning up JPEGs, but doesn't cover the other random picture types that people tend to have around. I've been mulling over the idea of a bash/python/etc script that could be run on an entire folder of random stuff and remove all the metadata. This is one of those things that seems really easy conceptually, but has really stumped me in practice. There are so many different types of metadata that it's tricky to plot out a work plan to do it.

In any given folder there might be Microsoft Word docs (with full revision history that can reveal individuals' full names), photos (with personal EXIF/GPS data), HTML files (marked with the source of the file).

PDFs are an interesting situation, because they have metadata, and the files within have metadata, and even embedded fonts can have metadata that could reveal the source of the document. This should still be the case when exporting/converting from ODF/DOC to PDF (unless everything goes through some type of cleaning process beforehand, before the original document was created).

Depending on the document, this could be a good thing. It might be possible to prove that the origin of Evidence X was from Corrupt CEO Y using metadata. By the same token, it's just as likely to prove that Leak A came from Intel Analyst B. Even the NSA's weirded out about it [1].

best,
Griffin

[1] http://www.nsa.gov/ia/_files/app/pdf_risks.pdf

--
Just another hacker in the City of Spies.
My posts, while frequently amusing, are not representative of the thoughts of my employer.
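As one concrete piece of such a folder script, here is a minimal sketch that reports the identifying fields stored in a .docx file. A .docx is a zip archive, and the author and last-editor names normally sit in `docProps/core.xml`. The function name and the selection of fields are assumptions made for this example, and it only reads, it does not clean. Tracked changes and comments carry author names elsewhere in the archive (in the document body), so this covers only part of what a Word file can reveal.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the OOXML core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Fields reported by this sketch; the selection is illustrative.
FIELDS = ["dc:creator", "cp:lastModifiedBy", "dcterms:created", "dcterms:modified"]


def docx_core_properties(source) -> dict[str, str]:
    """Return identifying fields from a .docx file's docProps/core.xml.

    `source` may be a path or a file-like object, as accepted by ZipFile.
    """
    found = {}
    with zipfile.ZipFile(source) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return found
        root = ET.fromstring(archive.read("docProps/core.xml"))
    for field in FIELDS:
        element = root.find(field, NS)
        if element is not None and element.text:
            found[field] = element.text
    return found
```

A folder-wide script could call something like this for every `.docx` it finds and use similar per-format inspectors for images and PDFs, which is where the work plan gets long.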