Thanks a lot to the down-under whistle-blower!

This issue reminds me of the TIFF format, seen decades ago as a good preservation format and also as an envelope for a myriad of other formats. In the end, TIFF files became badly supported by newer Microsoft OSs and we finally had to convert all of them to PDF.

In a decade, we will have to do it all over again I suppose! Millions of files...

Archivists/Librarians are supposed to at least cope with the "40 years disinterest time range": not an easy job and very difficult to fund in these days of info-obesity!

Christophe Dupriez
DESTIN-Informatique.com
Twitter @ChristopheDupri


On 2/01/2014 17:16, Hilton Gibson wrote:
Ok. I have done my awareness thing. Good luck for future researchers.

Cheers

hg

*Hilton Gibson*
Ubuntu Linux Systems Administrator
JS Gericke Library
Room 1025D
Stellenbosch University
Private Bag X5036
Stellenbosch
7599
South Africa

Tel: +27 21 808 4100 | Cell: +27 84 646 4758
http://library.sun.ac.za
http://za.linkedin.com/in/hiltongibson


On 2 January 2014 18:11, Graham Triggs <[email protected]> wrote:

    On 2 January 2014 13:59, Hilton Gibson <[email protected]> wrote:

        PDF/A-3 makes only a single, fairly monumental change. In the
        PDF/A-2 specification users were allowed to embed files, but
        only PDF/A files. PDF/A-3 now allows the embedding of any
        arbitrary file format, including XML, CSV, CAD, images and any
        others.

        At first glance this sounds like a gigantic betrayal of
        everything that the format has stood for. Why define a subset
        of PDF attributes to ensure the long-term comprehension of the
        file if you're going to turn around and allow the kitchen sink
        to be embedded within it? (You can follow some of the original
        discussion of this change here.)

        
        http://blogs.loc.gov/digitalpreservation/2012/11/all-in-embedded-files-in-pdfa/?loclr=blogsig

        This is very bad news for digital preservation because it is
        now possible to "hide" proprietary digital formats inside the PDF/A
        digital container. What will future researchers think when
        they stumble upon these "hidden" closed formats that they will
        not be able to use?

        What were they thinking??


    There are probably nice, inventive ways to abuse this. Probably by
    having a proprietary application that uses the format as a
    container, but then has all the meat of what it's doing in
    embedded files - although that wouldn't really be usable as a
    PDF/A in the standard way, anyway. But taking a step back, the
    alternative to being allowed to embed arbitrary file data is
    that all of that data must be held separately. Yes, that means you
    can easily perform preservation activities around those files. But
    it also increases the likelihood that someone will get the PDF/A
    file, and not the additional arbitrary files.

    Given the choice between not having the files at all, and having
    the files embedded in the PDF/A - albeit possibly in a 'dead'
    format - then for many people having the files will be a clear
    winner. Dead formats can generally still be resurrected by some
    means (get an emulator, run a file conversion, etc.). It's still
    more useful than having no file.

    If you are actively involved in preserving PDF/A files, then the
    "static readable" component remains the same regardless. You've
    just got the possibility of extra, arbitrary files inside the
    PDF/A - in which case, treat it like an archive (like zip, tar,
    etc.). Index the embedded files, extract the embedded files and
    run preservation tasks against them as necessary. Create new
    PDF/A-3 bundles.
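
The "treat it like an archive" workflow above could be sketched roughly as follows. This is only an illustration, not an established tool: it assumes the embedded files have already been extracted from the PDF/A-3 by a PDF library of your choice into a mapping of file name to bytes, and the `OPEN_EXTENSIONS` allow-list and both function names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical allow-list of formats treated as safe for long-term access.
# A real repository would take this from its own preservation policy.
OPEN_EXTENSIONS = {".xml", ".csv", ".txt", ".pdf", ".png", ".tif", ".tiff"}


def index_embedded_files(embedded):
    """Build an inventory of files extracted from a PDF/A-3 container.

    `embedded` maps each embedded file name to its raw bytes; how that
    mapping is obtained from the PDF is outside the scope of this sketch.
    """
    inventory = []
    for name in sorted(embedded):
        data = embedded[name]
        ext = PurePosixPath(name).suffix.lower()
        inventory.append({
            "name": name,
            "size": len(data),
            # Fixity checksum, so later audits can detect corruption.
            "sha256": hashlib.sha256(data).hexdigest(),
            # Flag anything outside the allow-list for format migration.
            "needs_migration": ext not in OPEN_EXTENSIONS,
        })
    return inventory


def migration_candidates(inventory):
    """Return the names of embedded files flagged for format migration."""
    return [entry["name"] for entry in inventory if entry["needs_migration"]]


if __name__ == "__main__":
    sample = {"data.csv": b"a,b\n1,2\n", "drawing.dwg": b"\x00binary"}
    print(migration_candidates(index_embedded_files(sample)))
```

The files flagged this way would be the ones to run through preservation tasks before a new PDF/A-3 bundle is created.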

    At no point have you degraded what is comprehensible about the
    PDF/A - you've just added stuff that might not be.

    Rule No 1 in digital preservation - capture everything. If you
    don't capture it, you can't preserve it. To that end, this should
    be a good thing for preservation. We just need to be aware of an
    extra hoop that we can / should jump through for format migration.

    G






_______________________________________________
Dspace-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-general
