Leonard,
How much PDF content do you reckon is actually tagged?  I haven't seen
anything from the IRS come tagged.  Does iText support tagging?

Also a snippet from PDFPlanet:
####################################
Adobe's Acrobat 6.0 will add tags to a PDF file, but human intelligence
is still required to ensure the tagging process was performed correctly.
There is little room for error in document tagging. Even seemingly small
errors in document structure can easily render a file completely
incomprehensible.
####################################

I don't see many PDF authors tagging their files.
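
For anyone who wants to survey their own files, here is a rough sketch of how one might sniff for tagging markers. It relies on the fact that a Tagged PDF carries a /MarkInfo dictionary with /Marked true and a /StructTreeRoot entry in its document catalog. The class name TaggedPdfSniffer and the method looksTagged are made up for illustration, and the raw byte scan is only a heuristic: it will miss files whose catalog sits inside a compressed object stream, and it says nothing about whether the tags are any good. A real check should go through a proper PDF parser.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper: a heuristic scan of raw PDF bytes, not a real parser.
public class TaggedPdfSniffer {
    // A Tagged PDF sets /Marked true inside the catalog's /MarkInfo dictionary.
    private static final Pattern MARKED_TRUE = Pattern.compile("/Marked\\s+true");
    // The logical structure tree hangs off /StructTreeRoot in the catalog.
    private static final Pattern STRUCT_ROOT = Pattern.compile("/StructTreeRoot\\b");

    // Returns true only when both markers appear somewhere in the uncompressed bytes.
    static boolean looksTagged(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return MARKED_TRUE.matcher(raw).find() && STRUCT_ROOT.matcher(raw).find();
    }

    public static void main(String[] args) throws Exception {
        for (String name : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": "
                + (looksTagged(bytes) ? "tagging markers found" : "no tagging markers found"));
        }
    }
}
```

Run it as "java TaggedPdfSniffer file1.pdf file2.pdf" over a folder of downloaded forms to get a ballpark count; treat a negative result as "unknown" rather than "untagged".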



-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Leonard
Rosenthol
Sent: Tuesday, February 21, 2006 11:32 AM
To: [EMAIL PROTECTED]; itext-questions@lists.sourceforge.net;
[EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: [PDFBox-user] Re: [iText-questions] Good reading/research on PDF
text extraction


At 10:36 AM 2/21/2006, Richard Braman wrote:
>As more and more content gets "pushed" into PDF it loses its
>meaning to anyone other than a human reader or a printer.

         ONLY IF the document content is untagged.

         Tagged PDF (part of the PDF spec since 1.4) provides for the
inclusion of semantic information about the content IN ADDITION to
its visible attributes.  Extracting that content in a usable form
then becomes trivial.


Leonard

---------------------------------------------------------------------------
Leonard Rosenthol
<mailto:[EMAIL PROTECTED]>
Chief Technical Officer                      <http://www.pdfsages.com>
PDF Sages, Inc.                              215-938-7080 (voice)
                                              215-938-0880 (fax)



-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log
files for problems?  Stop!  Download the new AJAX search engine that
makes searching your log files as easy as surfing the  web.  DOWNLOAD
SPLUNK!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
_______________________________________________
PDFBox-user mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/pdfbox-user



_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions
