Added both the Python library libmat2 and a command-line tool called mat2, which remove metadata from various file types.
https://0xacab.org/jvoisin/mat2

Tests are disabled because the tarball at https://pypi.org/project/mat2/ doesn't include the test documents. The test documents are, however, present in https://0xacab.org/jvoisin/mat2, so cloning that repository separately and running the tests yields the attached test-results.txt file. It looks like it fails on some video files, which I'll look into, but it mostly works, at least on my own personal files!

This library can also be a building block for apps that use mat2, like https://gitlab.com/rmnvgr/metadata-cleaner.

The library also requires a couple of runtime libraries to be installed; they can be checked by running mat2 with the --check-dependencies option:

$ mat2 --check-dependencies
Dependencies for mat2 0.13.4:
- Cairo: yes
- Exiftool: yes (optional)
- Ffmpeg: yes (optional)
- GLib from PyGobject: yes
- GdkPixbuf from PyGobject: yes
- Mutagen: yes
- Poppler from PyGobject: yes
- PyGobject: yes

Please test! Works on my files on current/amd64.

OK?

-- 
jagtalon.net
weirder.earth/@jag
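For anyone who wants to check the runtime dependencies from a script, here is a minimal sketch that parses the --check-dependencies report shown above. The helper names (parse_dependencies, missing_required, run_check) are made up for illustration, and the sketch assumes a missing dependency is reported as "no", which the output above does not show.

```python
import re
import subprocess

# Sample report copied from the output quoted in this mail.
SAMPLE = """\
Dependencies for mat2 0.13.4:
- Cairo: yes
- Exiftool: yes (optional)
- Ffmpeg: yes (optional)
- GLib from PyGobject: yes
- GdkPixbuf from PyGobject: yes
- Mutagen: yes
- Poppler from PyGobject: yes
- PyGobject: yes
"""

# Matches "- Name: yes" or "- Name: yes (optional)".
# Assumption: an absent dependency is printed as "no".
LINE = re.compile(r"^\s*-\s*(.+?):\s*(yes|no)\b\s*(\(optional\))?")


def parse_dependencies(report: str) -> dict:
    """Map each dependency name to its presence and optional flags."""
    deps = {}
    for line in report.splitlines():
        match = LINE.match(line)
        if match:
            deps[match.group(1)] = {
                "present": match.group(2) == "yes",
                "optional": match.group(3) is not None,
            }
    return deps


def missing_required(deps: dict) -> list:
    """Return the names of non-optional dependencies that are absent."""
    return [name for name, info in deps.items()
            if not info["present"] and not info["optional"]]


def run_check() -> str:
    """Run the real command; requires mat2 to be installed and on PATH."""
    result = subprocess.run(["mat2", "--check-dependencies"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

With mat2 installed, missing_required(parse_dependencies(run_check())) should return an empty list on a complete setup; parse_dependencies(SAMPLE) works without mat2.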
py3-mat2.tar.gz
Description: application/gzip
jag@big ~/D/mat2 (master)> coverage run --branch -m unittest discover -s tests/
...E.....FF..FF..........EEERROR:root:Something went wrong during the processing of ./tests/data/clean.avi: Command '['/usr/local/bin/ffmpeg', '-i', './tests/data/clean.avi', '-y', '-map', '0', '-codec', 'copy', '-loglevel', 'panic', '-hide_banner', '-map_metadata', '-1', '-map_chapters', '-1', '-disposition', '0', '-fflags', '+bitexact', '-flags:v', '+bitexact', '-flags:a', '+bitexact', './tests/data/clean.cleaned.avi']' returned non-zero exit status 1.
.ERROR:root:Something went wrong during the processing of ./tests/data/--output.avi: Command '['/usr/local/bin/ffmpeg', '-i', './tests/data/--output.avi', '-y', '-map', '0', '-codec', 'copy', '-loglevel', 'panic', '-hide_banner', '-map_metadata', '-1', '-map_chapters', '-1', '-disposition', '0', '-fflags', '+bitexact', '-flags:v', '+bitexact', '-flags:a', '+bitexact', './tests/data/--output.cleaned.avi']' returned non-zero exit status 1.
...ERROR:root:Unable to parse /tmp/tmp5je1k6bq/OEBPS/content.opf in ./tests/data/clean.epub.
WARNING:root:Something went wrong during deep cleaning of OEBPS/content.opf in ./tests/data/clean.epub
..........FWARNING:root:Not a valid bencoded string: 137
WARNING:root:Not a valid bencoded string: 137
WARNING:root:Not a valid bencoded string:
WARNING:root:Not a valid bencoded string:
WARNING:root:Not a valid bencoded string:
WARNING:root:Invalid bencoded value (data after valid prefix)
..F............................[+] Testing pdf
[+] Testing png
[+] Testing jpg
[+] Testing wav
[+] Testing aiff
[+] Testing mp3
[+] Testing ogg
[+] Testing flac
[+] Testing docx
[+] Testing odt
[+] Testing tiff
Warning: [minor] Can't delete IFD0 from TIFF - ./tests/data/clean.tiff
[+] Testing bmp
[+] Testing torrent
[+] Testing odf
[+] Testing odg
[+] Testing txt
[+] Testing gif
[+] Testing css
[+] Testing svg
[+] Testing ppm
[+] Testing avi
[+] Testing mp4
WARNING:root:The format of "./tests/data/clean.mp4" (video/mp4) has some mandatory metadata fields; mat2 filled them with standard data.
WARNING:root:The format of "./tests/data/clean.cleaned.mp4" (video/mp4) has some mandatory metadata fields; mat2 filled them with standard data.
[+] Testing wmv
WARNING:root:The format of "./tests/data/clean.wmv" (video/x-ms-wmv) has some mandatory metadata fields; mat2 filled them with standard data.
WARNING:root:The format of "./tests/data/clean.cleaned.wmv" (video/x-ms-wmv) has some mandatory metadata fields; mat2 filled them with standard data.
[+] Testing heic
Warning: ICC_Profile deleted. Image colors may be affected - ./tests/data/clean.heic
Warning: ICC_Profile deleted. Image colors may be affected - ./tests/data/clean.cleaned.heic
...EEEEEWARNING:root:./tests/data/clean.pptx contains invalid cNvPr: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 20, 22, 24}
................E....FE..........ERROR:root:In file ./tests/data/clean.docx, element word/media/setup.py's format (text/x-python) isn't supported
.ERROR:root:In file ./tests/data/clean.odt, element Pictures/setup.py's format (text/x-python) isn't supported
.....Warning: [minor] Can't delete IFD0 from TIFF - ./tests/data/clean.tiff
..WARNING:root:In file ./tests/data/clean.docx, keeping unknown element word/media/setup.py (format: text/x-python)
.WARNING:root:In file ./tests/data/clean.docx, omitting unknown element word/media/setup.py (format: text/x-python)
..

======================================================================
ERROR: test_different (test_climat2.TestCommandLineParallel.test_different)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_climat2.py", line 269, in test_different
    shutil.copytree(src, dst)
  File "/usr/local/lib/python3.11/shutil.py", line 573, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/shutil.py", line 471, in _copytree
    os.makedirs(dst, exist_ok=dirs_exist_ok)
  File "<frozen os>", line 225, in makedirs
FileExistsError: [Errno 17] File exists: './tests/data/parallel'

======================================================================
ERROR: test_docx (test_corrupted_files.TestCorruptedEmbedded.test_docx)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_corrupted_files.py", line 69, in test_docx
    parser.remove_all()
    ^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'remove_all'

======================================================================
ERROR: test_odt (test_corrupted_files.TestCorruptedEmbedded.test_odt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_corrupted_files.py", line 77, in test_odt
    self.assertFalse(parser.remove_all())
                     ^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'remove_all'

======================================================================
ERROR: test_tar (test_libmat2.TestCleaningArchives.test_tar)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 679, in test_tar
    self.assertEqual(meta['./tests/data/dirty.docx']['word/media/image1.png']['Comment'], 'This is a comment, be careful!')
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'word/media/image1.png'

======================================================================
ERROR: test_tarbz2 (test_libmat2.TestCleaningArchives.test_tarbz2)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 749, in test_tarbz2
    self.assertEqual(meta['./tests/data/dirty.docx']['word/media/image1.png']['Comment'], 'This is a comment, be careful!')
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'word/media/image1.png'

======================================================================
ERROR: test_targz (test_libmat2.TestCleaningArchives.test_targz)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 714, in test_targz
    self.assertEqual(meta['./tests/data/dirty.docx']['word/media/image1.png']['Comment'], 'This is a comment, be careful!')
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'word/media/image1.png'

======================================================================
ERROR: test_tarxz (test_libmat2.TestCleaningArchives.test_tarxz)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 784, in test_tarxz
    self.assertEqual(meta['./tests/data/dirty.docx']['word/media/image1.png']['Comment'], 'This is a comment, be careful!')
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'word/media/image1.png'

======================================================================
ERROR: test_zip (test_libmat2.TestCleaningArchives.test_zip)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 649, in test_zip
    self.assertEqual(meta['tests/data/dirty.docx']['word/media/image1.png']['Comment'], 'This is a comment, be careful!')
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'word/media/image1.png'

======================================================================
ERROR: test_tar (test_libmat2.TestGetMeta.test_tar)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 241, in test_tar
    self.assertEqual(meta['./tests/data/dirty.flac']['comments'], 'Thank you for using MAT !')
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'comments'

======================================================================
ERROR: test_zip (test_libmat2.TestGetMeta.test_zip)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 189, in test_zip
    self.assertEqual(meta['tests/data/dirty.flac']['comments'], 'Thank you for using MAT !')
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'comments'

======================================================================
FAIL: test_docx (test_climat2.TestGetMeta.test_docx)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_climat2.py", line 203, in test_docx
    self.assertIn(b'Application: LibreOffice/5.4.5.1$Linux_X86_64', stdout)
AssertionError: b'Application: LibreOffice/5.4.5.1$Linux_X86_64' not found in b"[-] ./tests/data/dirty.docx's format (None) is not supported\n"

======================================================================
FAIL: test_flac (test_climat2.TestGetMeta.test_flac)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_climat2.py", line 226, in test_flac
    self.assertIn(b'comments: Thank you for using MAT !', stdout)
AssertionError: b'comments: Thank you for using MAT !' not found in b"[-] ./tests/data/dirty.flac's format (None) is not supported\n"

======================================================================
FAIL: test_odt (test_climat2.TestGetMeta.test_odt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_climat2.py", line 211, in test_odt
    self.assertIn(b'generator: LibreOffice/3.3$Unix', stdout)
AssertionError: b'generator: LibreOffice/3.3$Unix' not found in b"[-] ./tests/data/dirty.odt's format (None) is not supported\n"

======================================================================
FAIL: test_ogg (test_climat2.TestGetMeta.test_ogg)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_climat2.py", line 234, in test_ogg
    self.assertIn(b'comments: Thank you for using MAT !', stdout)
AssertionError: b'comments: Thank you for using MAT !' not found in b"[-] ./tests/data/dirty.ogg's format (None) is not supported\n"

======================================================================
FAIL: test_tar (test_corrupted_files.TestCorruptedFiles.test_tar)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_corrupted_files.py", line 320, in test_tar
    with self.assertRaises(ValueError):
AssertionError: ValueError not raised

======================================================================
FAIL: test_zip (test_corrupted_files.TestCorruptedFiles.test_zip)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_corrupted_files.py", line 242, in test_zip
    with self.assertRaises(ValueError):
AssertionError: ValueError not raised

======================================================================
FAIL: test_wmv (test_libmat2.TestGetMeta.test_wmv)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jag/Downloads/mat2/tests/test_libmat2.py", line 206, in test_wmv
    self.assertEqual(mimetype, 'video/x-ms-wmv')
AssertionError: None != 'video/x-ms-wmv'

----------------------------------------------------------------------
Ran 125 tests in 97.346s

FAILED (failures=7, errors=10)
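
To make a log like the one above easier to triage, here is a small sketch that groups the ERROR and FAIL headers by test class. It is not part of mat2 or the port; the function name summarize and the SAMPLE excerpt are only for illustration, and it assumes the "KIND: test_name (module.Class.test_name)" header layout that Python 3.11 prints in this log.

```python
import re
from collections import Counter

# Header layout seen in the log: "ERROR: test_tar (test_libmat2.TestGetMeta.test_tar)".
# The space after the colon keeps logging lines such as "ERROR:root:..." from matching.
HEADER = re.compile(r"(ERROR|FAIL): (\w+) \(([\w.]+)\)")

# Three headers copied from the log, used here as a tiny sample.
SAMPLE = """\
ERROR: test_tar (test_libmat2.TestCleaningArchives.test_tar)
ERROR: test_zip (test_libmat2.TestCleaningArchives.test_zip)
FAIL: test_wmv (test_libmat2.TestGetMeta.test_wmv)
"""


def summarize(log: str) -> Counter:
    """Count unittest failures and errors per (kind, module.Class) pair."""
    counts = Counter()
    for kind, _test_name, dotted in HEADER.findall(log):
        # Keep only "module.Class"; the trailing method name repeats _test_name.
        test_class = ".".join(dotted.split(".")[:2])
        counts[(kind, test_class)] += 1
    return counts


if __name__ == "__main__":
    for (kind, test_class), n in sorted(summarize(SAMPLE).items()):
        print(f"{kind:5} {test_class}: {n}")
```

Counting the headers in the full log by hand gives 5 errors in test_libmat2.TestCleaningArchives, 2 in test_libmat2.TestGetMeta, 2 in test_corrupted_files.TestCorruptedEmbedded and 1 in test_climat2.TestCommandLineParallel, plus 4 failures in test_climat2.TestGetMeta, 2 in test_corrupted_files.TestCorruptedFiles and 1 in test_libmat2.TestGetMeta, which matches the reported failures=7, errors=10.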