Ok. I have done my awareness thing. Good luck to future researchers.

Cheers

hg

*Hilton Gibson*
Ubuntu Linux Systems Administrator
JS Gericke Library
Room 1025D
Stellenbosch University
Private Bag X5036
Stellenbosch
7599
South Africa

Tel: +27 21 808 4100 | Cell: +27 84 646 4758
http://library.sun.ac.za
http://za.linkedin.com/in/hiltongibson


On 2 January 2014 18:11, Graham Triggs <[email protected]> wrote:

> On 2 January 2014 13:59, Hilton Gibson <[email protected]> wrote:
>
>> PDF/A-3 makes only a single, fairly monumental change. In the PDF/A-2
>> specification users were allowed to embed files, but only PDF/A files.
>> PDF/A-3 now allows the embedding of any arbitrary file format, including
>> XML, CSV, CAD, images and any others.
>>
>> At first glance this sounds like a gigantic betrayal of everything that
>> the format has stood for. Why define a subset of PDF attributes to ensure
>> the long-term comprehension of the file if you’re going to turn around and
>> allow the kitchen sink to be embedded within it? (You can follow some of
>> the original discussion of this change here.)
>>
>>
>> http://blogs.loc.gov/digitalpreservation/2012/11/all-in-embedded-files-in-pdfa/?loclr=blogsig
>>
>> This is very bad news for digital preservation because it is now possible
>> to "hide" proprietary digital formats inside the PDF/A container. What will
>> future researchers think when they stumble upon these "hidden" closed
>> formats that they will not be able to use?
>>
>> What were they thinking??
>>
>
> There are probably nice, inventive ways to abuse this. Probably by having
> a proprietary application that uses the format as a container, but then has
> all the meat of what it's doing in embedded files - although that wouldn't
> really be usable as a PDF/A in the standard way, anyway. But taking a step
> back, the alternative to allowing arbitrary file data to be embedded is
> that all of that data must be held separately. Yes, that means you can
> easily perform preservation activities around those files. But it also
> increases the likelihood that someone will get the PDF/A file, and not the
> additional arbitrary files.
>
> Given the choice between not having the files at all, and having the files
> embedded in the PDF/A - albeit possibly in a 'dead' format - then for many
> people having the files will be a clear winner. Dead formats can generally
> still be resurrected by some means (get an emulator, run a file conversion,
> etc.). It's still more useful than having no file.
>
> If you are actively involved in preserving PDF/A files, then the "static
> readable" component remains the same regardless. You've just got the
> possibility of extra, arbitrary files inside the PDF/A - in which case,
> treat it like an archive (like zip, tar, etc.). Index the embedded files,
> extract the embedded files and run preservation tasks against them as
> necessary. Create new PDF/A-3 bundles.
>
> At no point have you degraded what is comprehensible about the PDF/A -
> you've just added stuff that might not be.
>
> Rule No 1 in digital preservation - capture everything. If you don't
> capture it, you can't preserve it. To that end, this should be a good thing
> for preservation. We just need to be aware of an extra hoop that we can /
> should jump through for format migration.
>
> G
>
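
The "treat it like an archive" suggestion in the quoted reply (index the embedded files, then run preservation tasks against them) could be sketched roughly as below. This is only an illustration, not anything from the thread or from DSpace: it assumes the attachments have already been extracted from the PDF/A-3 by some PDF library into a name-to-bytes mapping, and the names `IndexEntry`, `guess_format`, `index_embedded_files` and `needs_review`, as well as the tiny magic-number table, are hypothetical. A real workflow would more likely identify formats against a registry such as PRONOM.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List

# Illustrative magic-number table (an assumption for this sketch).
# It is deliberately tiny; real format identification needs a proper registry.
MAGIC_NUMBERS = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP-based container"),
    (b"<?xml", "XML"),
]


@dataclass
class IndexEntry:
    name: str
    size: int
    sha256: str        # fixity value for later integrity checks
    format_guess: str  # "unknown" means a person should look at it


def guess_format(data: bytes) -> str:
    """Return a coarse format label based on the leading bytes."""
    for magic, label in MAGIC_NUMBERS:
        if data.startswith(magic):
            return label
    return "unknown"


def index_embedded_files(embedded: Dict[str, bytes]) -> List[IndexEntry]:
    """Build a sorted index of already-extracted embedded files."""
    return [
        IndexEntry(
            name=name,
            size=len(data),
            sha256=hashlib.sha256(data).hexdigest(),
            format_guess=guess_format(data),
        )
        for name, data in sorted(embedded.items())
    ]


def needs_review(index: List[IndexEntry]) -> List[str]:
    """Names of embedded files whose format was not recognised."""
    return [entry.name for entry in index if entry.format_guess == "unknown"]
```

Anything that `needs_review` returns is a candidate for the "extra hoop" mentioned above: a possibly closed or dead format that should be identified and, where possible, migrated before the PDF/A-3 bundle is rebuilt.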
_______________________________________________
Dspace-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-general
