portable document format
http://en.wikipedia.org/wiki/Portable_Document_Format
The Portable Document Format (PDF) is a file format created by Adobe
Systems in 1993 for document exchange. PDF represents two-dimensional
documents in a fixed-layout format that is independent of both the
output device and the display resolution. Each PDF file
encapsulates a complete description of a 2-D document (and, with
Acrobat 3-D, embedded 3-D documents) that includes the text, fonts,
images, and 2-D vector graphics that compose the document.
PDF is an open standard, and is now being prepared for submission as
an ISO standard.[1]
History
When PDF first came out in the early 1990s, its general adoption
was slow.[2] At that time, both the PDF-creation tools (Acrobat) and
the viewing and printing software had to be bought. Early versions of
PDF had no support for external hyperlinks, reducing its usefulness on
the World Wide Web. Additionally, there were competing formats such as
Envoy, Common Ground Digital Paper, and even Adobe's own PostScript
format (.ps); in those early years, PDF was popular mainly in
desktop publishing workflows.
Adobe soon started free distribution of the Acrobat Reader (now Adobe
Reader) program and continued supporting the original PDF format,
which eventually became the de facto standard for printable documents.
The PDF file format has changed several times, as new versions of
Adobe Acrobat have been released. There have been eight versions of
PDF: 1.0 (1993), 1.1 (1994), 1.2 (1996), 1.3 (1999), 1.4 (2001), 1.5
(2003), 1.6 (2005), and 1.7 (2006), corresponding to Acrobat releases
1.0 to 8.0.
Technology
Anyone may create applications that read and write PDF files without
having to pay royalties to Adobe Systems; Adobe holds patents to PDF,
but licenses them for royalty-free use in developing software
complying with its PDF specification.[3]
PDF combines three technologies:
A subset of the PostScript page description programming language, for
generating the layout and graphics.
A font-embedding/replacement system to allow fonts to travel with the
documents.
A structured storage system to bundle these elements and any
associated content into a single file, with data compression where
appropriate.
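The structured-storage idea in the third item can be illustrated with a minimal sketch. It assembles a one-page file whose page content is held in a compressed stream, then writes a cross-reference table so a reader can locate each object by byte offset. The function name `build_minimal_pdf` and the 200 by 200 page size are assumptions chosen for illustration; the sketch embeds no fonts or images and is not a complete PDF writer.

```python
import zlib


def build_minimal_pdf(content_ops: bytes) -> bytes:
    """Bundle one page and its Flate-compressed content stream into a single file."""
    compressed = zlib.compress(content_ops)

    # Object bodies, numbered 1..4 in order. Object 4 is the page content stream.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] "
        b"/Resources << >> /Contents 4 0 R >>",
        b"<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(compressed)
        + compressed
        + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one fixed-width 20-byte entry per object.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Calling `build_minimal_pdf(b"10 10 m 190 190 l S")` would yield the bytes of a file describing a single diagonal line; the drawing commands themselves are stored compressed inside the file.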
PostScript
PostScript is a page description language run in an interpreter to
generate an image, a process requiring many resources. PDF is a file
format, not a programming language: flow control commands such as
if and loop are removed, while graphics commands such as lineto
remain.
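The distinction can be sketched as follows: the program that generates a page may use loops, but the page description it emits is a flat sequence of drawing operators with no control flow left in it. The function name `emit_grid_lines` and its parameters are hypothetical; `m`, `l`, and `S` are the PDF content-stream operators for moveto, lineto, and stroke.

```python
def emit_grid_lines(count: int, spacing: float, length: float) -> str:
    """Unroll a loop into a flat list of PDF-style path operators."""
    ops = []
    for i in range(count):  # the loop lives in the generator, not in the output
        y = i * spacing
        ops.append(f"0 {y:g} m")         # moveto: start of the horizontal line
        ops.append(f"{length:g} {y:g} l")  # lineto: end of the horizontal line
    ops.append("S")  # stroke every subpath built so far
    return "\n".join(ops)


if __name__ == "__main__":
    print(emit_grid_lines(3, 10, 100))
```

A viewer that reads the resulting text only has to execute the operators in order; it never evaluates a condition or a loop.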
Often, the PostScript-like PDF code is generated from a source
PostScript file. The graphics commands that are output by the
PostScript code are collected and tokenized; any files, graphics, or
fonts to which the document refers are also collected; then
everything is compressed into a single file. Therefore, the entire
PostScript world (fonts, layout, measurements) remains intact.
As a document format, PDF has several advantages over PostScript:
PDF contains the already tokenized and interpreted results of the
PostScript source code, so changes to items in the PDF page
description correspond directly to changes in the resulting page
appearance.
PDF (from version 1.4) supports true transparency; PostScript does not.
PostScript is an imperative programming language (with an implicit
global state), so instructions accompanying the description of one
page can affect the appearance of any following page. Therefore, all
preceding pages must be processed in order to determine the correct
appearance of a given page; each page in a PDF document is unaffected
by the others.
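The page-independence point can be made concrete with a toy model. It is not real PostScript or PDF: pages are lists of hypothetical instructions, where `("define", name, value)` stores a value in shared state and `("draw", name)` uses it. In the imperative model every earlier page must be replayed to render a later one; once the pages are flattened to their resolved values, any page can be read on its own.

```python
def render_imperative(pages, target):
    """Replay pages 0..target in order; definitions persist across pages."""
    state = {}
    for index, page in enumerate(pages):
        drawn = []
        for instruction in page:
            if instruction[0] == "define":
                _, name, value = instruction
                state[name] = value
            else:  # "draw": the result depends on whatever was defined earlier
                _, name = instruction
                drawn.append(state[name])
        if index == target:
            return drawn
    raise IndexError("no such page")


def flatten(pages):
    """Resolve every page once, producing self-contained pages."""
    return [render_imperative(pages, i) for i in range(len(pages))]


def render_independent(flat_pages, target):
    """Read one flattened page without touching any other page."""
    return flat_pages[target]
```

With `pages = [[("define", "width", 5), ("draw", "width")], [("draw", "width")]]`, the second page cannot be rendered imperatively without first processing the first page, whereas `render_independent(flatten(pages), 1)` looks at the second page alone.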
Accessibility
PDF files can be created so that they are accessible to disabled
people. Current PDF file formats can include tags (XML), text
equivalents, captions, audio descriptions, et cetera. Some software,
such as Adobe InDesign, can automatically produce tagged PDFs. Leading
screen readers, including JAWS, Window-Eyes, and Hal, can read tagged
PDFs; current versions of the Acrobat and Acrobat Reader programs can
also read PDFs aloud. Moreover, tagged PDFs can be re-flowed and
magnified for readers with poor eyesight. However, problems remain,
chiefly the difficulty of adding tags to existing, or legacy, PDFs.
For example, in PDFs generated from scanned documents, accessibility
tags and re-flowing are unavailable and must be created either
manually or with OCR techniques. These processes are themselves often
inaccessible to some disabled people. Nonetheless, well-made PDFs are
a valid choice as long-term accessible documents. The PDF/Universal
Accessibility (PDF/UA) Committee, an activity of AIIM, is working on a
specification for PDF accessibility based on the PDF 1.6
specification.
One of the major problems with PDF accessibility is that PDF documents
have three distinct views, which, depending on the document's
creation, can be inconsistent with each other. The three views are (i)
the physical view, (ii) the tags view, and (iii) the content view. The
phys