Re: [liberationtech] Metadata Cleanup trough File Format Convertion?

2013-07-20 Thread Sandy Harris
Fabio Pietrosanti (naif) li...@infosecurity.ch wrote:

 I've been thinking about the topic of metadata cleanup of files from an
 implementation point of view.

 Regardless of whether it's something useful or not for a
 whistleblowing platform (GlobaLeaks),

In general, it is. To be responsible, any such platform must at
least look at anything it is going to release and consider
whether some of it needs to be redacted. Metadata needs
to be considered in that process.

There are cases, though, where metadata indicating the
source of a document is critical to evaluating it. Consider
a document that purports to give US policy on targeting
for drone strikes. Does it come from a field commander?
Or Washington? Pentagon? CIA? President's office?
Or is it, say, analysis by the Pakistani government? Or
just speculation by some journalist?
--
Too many emails? Unsubscribe, change to digest, or change password by emailing 
moderator at compa...@stanford.edu or changing your settings at 
https://mailman.stanford.edu/mailman/listinfo/liberationtech


Re: [liberationtech] Metadata Cleanup trough File Format Convertion?

2013-07-20 Thread Bruce Potter at IRF
Maybe this would help --

On the Mac platform, Lemkesoft's GraphicConverter is one of the oldest and most 
versatile graphic media format conversion programs (AND a good photo editor) -- 
it currently works with 60+ formats and explicitly allows removing OR modifying 
METADATA in batch mode.

www.lemkesoft.com
or write to the author, Thorsten Lemke at supp...@lemkesoft.com

There are a dozen or more language versions of GraphicConverter -- it's 
modestly priced.

bruce

- - - - - - - 

On Jul 17, 2013, at 12:28 PM, Fabio Pietrosanti (naif) li...@infosecurity.ch 
wrote:

 Hi all,
 
 I've been thinking about the topic of metadata cleanup of files from an
 implementation point of view.
 
 Regardless of whether it's something useful or not for a
 whistleblowing platform (GlobaLeaks), I've been considering whether
 metadata cleanup could be approached through file format conversion.
 
 If I'd like to remove metadata from various document formats (PDF, Word,
 PPT, Excel, etc.) or image files, I've been thinking that rather than
 explicitly removing metadata, a possible different approach would be
 to do a file conversion.
 
 If a JPEG is converted to PNG, maybe all metadata is lost. (This has to
 be verified.)
 If a DOC/DOCX is converted to a PDF, maybe all metadata is lost.
 
 At GlobaLeaks we've been discussing introducing metadata cleanup [1],
 but also file sterilization [2], with the goal of protecting Receivers of a
 whistleblowing site against targeted 0day attacks.
 
 Should we approach metadata cleanup by doing the file sterilization
 processing through the existing LibreOffice conversion API [3] to save
 engineering effort/time?
 
 
 [1] Metadata Cleanup https://github.com/globaleaks/GlobaLeaks/issues/305
 [2] File Sterilization https://github.com/globaleaks/GlobaLeaks/issues/270
 [3] LibreOffice conversion API https://github.com/dagwieers/unoconv
 
 -- 
 Fabio Pietrosanti (naif)
 HERMES - Center for Transparency and Digital Human Rights
 http://logioshermes.org - http://globaleaks.org - http://tor2web.org
 

Re: [liberationtech] Metadata Cleanup trough File Format Convertion?

2013-07-18 Thread intrigeri
Hi,

Griffin Boyce wrote (17 Jul 2013 21:40:57 GMT) :
   PDFs are an interesting situation, because they have metadata, and the
 files within have metadata, and even embedded fonts can have metadata that
 could reveal the source of the document.

IIRC the MAT [1] uses an interesting trick: rendering the PDF on
a Cairo surface.

[1] https://mat.boum.org/ or apt-get install mat

Cheers,
--
  intrigeri
  | GnuPG key @ https://gaffer.ptitcanardnoir.org/intrigeri/intrigeri.asc
  | OTR fingerprint @ https://gaffer.ptitcanardnoir.org/intrigeri/otr.asc


Re: [liberationtech] Metadata Cleanup trough File Format Convertion?

2013-07-17 Thread jvoisin
On 17/07/2013 21:22, Nick wrote:
 Quoth Fabio Pietrosanti (naif):
 If a JPEG is converted to PNG, maybe all metadata is lost. (This
 has to be verified.)
 If a DOC/DOCX is converted to a PDF, maybe all metadata is lost.
 
 Interesting topic. I'd be most worried about watermarks, as 
 depending on the format they may well remain, and be difficult to 
 find or test for. I don't know if they're routinely used, but it's 
 certainly something to be aware of.
 --

Did you know about the MAT (https://mat.boum.org)?




Re: [liberationtech] Metadata Cleanup trough File Format Convertion?

2013-07-17 Thread Nathan of Guardian
On 07/17/2013 04:01 PM, jvoisin wrote:
 On 17/07/2013 21:22, Nick wrote:
 Quoth Fabio Pietrosanti (naif):
 If a JPEG is converted to PNG, maybe all metadata is
 lost. (This has to be verified.)

Our ObscuraCam app for Android strips all EXIF metadata from JPEGs.
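
For illustration only, and not ObscuraCam's actual code: a minimal Python
sketch of stripping metadata segments from a JPEG without re-encoding it.
It assumes a well-formed file whose metadata segments all precede the first
start-of-scan marker. The function name and the set of dropped markers
(APP1 for EXIF/XMP, APP13 for Photoshop/IPTC, COM for comments) are choices
made for this example. It does nothing about watermarks or anything visible
in the pixels.

```python
import struct

SOI = b"\xff\xd8"                   # start-of-image marker
SOS = 0xDA                          # start-of-scan: compressed image data follows
DROP_MARKERS = {0xE1, 0xED, 0xFE}   # APP1, APP13, COM (assumed choice for this sketch)


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without the DROP_MARKERS segments."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == SOS:
            # Everything from here on is image data; copy it untouched.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker not in DROP_MARKERS:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")
```

Typical use would be reading the file in binary mode, passing the bytes
through strip_jpeg_metadata(), and writing the result to a new file.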

The PNG conversion trick does work as well, in my personal experience.
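
The result of a conversion can be checked rather than trusted. The sketch
below, which assumes a well-formed PNG, lists the ancillary chunks that
commonly carry metadata (tEXt, zTXt, iTXt, eXIf, tIME). Whether a converter
writes any of them depends on the tool and its settings; the function name
is made up for this example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Ancillary chunk types that commonly hold metadata.
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf", b"tIME"}


def png_metadata_chunks(data: bytes):
    """Return (chunk_type, payload) pairs for metadata chunks found in a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    found = []
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, payload, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        if ctype in METADATA_CHUNKS:
            found.append((ctype.decode("ascii"), data[pos + 8:pos + 8 + length]))
        pos += 12 + length
    return found
```

An empty list means none of the listed chunk types is present. It does not
rule out other carriers, such as private chunks or data hidden in the
pixels.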


Re: [liberationtech] Metadata Cleanup trough File Format Convertion?

2013-07-17 Thread Griffin Boyce
Fabio Pietrosanti (naif) li...@infosecurity.ch wrote:

 Hi all,

 I've been thinking about the topic of metadata cleanup of files from an
 implementation point of view.


  Media metadata is incredibly fascinating :D  ObscuraCam does a really
great job of cleaning up JPEGs, but doesn't cover the other random picture
types that people tend to have around.

  I've been mulling over the idea of a bash/python/etc script that could
be run on an entire folder of random stuff and remove all the metadata.
This is one of those things that seems really easy conceptually, but has
really stumped me in practice.  There are so many different types of
metadata that it's tricky to plot out a work plan to do it.  In any given
folder there might be Microsoft Word docs (with full revision history that
can reveal individuals' full names), photos (with personal EXIF/GPS data),
and HTML files (marked with the source of the file).
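
One possible first step for such a script, sketched in Python under stated
assumptions: walk the folder and sort files by their leading magic bytes
rather than by extension, so that each group can later be handed to a
format-specific cleaner. The category names and both function names are
invented for this example, and the cleaners themselves are deliberately
left out.

```python
import os
from collections import defaultdict

# (leading bytes, category) pairs; the category names are arbitrary labels.
MAGIC = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip-container"),          # DOCX, ODT and other zip-based formats
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy DOC/XLS/PPT
]


def classify(path):
    """Guess a file's category from its first bytes, ignoring the extension."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for magic, kind in MAGIC:
        if head.startswith(magic):
            return kind
    return "unknown"


def plan_cleanup(root):
    """Map each category to the list of files under root that fall into it."""
    plan = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            plan[classify(path)].append(path)
    return dict(plan)
```

Anything that lands in "unknown" (HTML and other text formats included)
would need its own handling or manual review before release.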

  PDFs are an interesting situation, because they have metadata, and the
files within have metadata, and even embedded fonts can have metadata that
could reveal the source of the document.  This should still be the case
when exporting/converting from ODF/DOC to PDF (unless everything goes
through some type of cleaning process before the original document is
created).  Depending on the document, this could be a good
thing.  It might be possible to prove that the origin of Evidence X was
from Corrupt CEO Y using metadata.  By the same token, it's just as likely
to prove that Leak A came from Intel Analyst B.  Even the NSA's weirded out
about it [1].
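
A rough way to see what a converted PDF still says about its origin is to
scan the raw bytes for common document-information keys. The Python sketch
below is only a first-pass check, not a PDF parser: it assumes uncompressed,
unencrypted objects and simple literal strings without nested parentheses,
and the function name is made up for this example. It will miss XMP
packets, compressed object streams, hex strings, and metadata inside
embedded fonts or images.

```python
import re

# Standard keys of the PDF document information dictionary.
INFO_KEYS = (b"Author", b"Creator", b"Producer", b"Title", b"CreationDate", b"ModDate")

# /Key (literal string), allowing backslash escapes but not nested parentheses.
_PATTERN = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)",
    re.DOTALL,
)


def find_pdf_info_strings(data: bytes):
    """Return (key, value) pairs for info entries visible in raw PDF bytes."""
    return [
        (key.decode("ascii"), value.decode("latin-1"))
        for key, value in _PATTERN.findall(data)
    ]
```

An empty result only means nothing was found in plain sight; it is not
evidence that the file is clean.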

best,
Griffin

[1] http://www.nsa.gov/ia/_files/app/pdf_risks.pdf

-- 
Just another hacker in the City of Spies.

My posts, while frequently amusing, are not representative of the thoughts
of my employer.