----------------------------------------
> Date: Tue, 10 Mar 2009 08:34:11 +0100
> From: [email protected]
> To: [email protected]
> Subject: Re: [iText-questions] modifed sample, question on PDF contents
>
> Mike Marchywka wrote:
>> Is there any information in the
>> PDF that tells me how this stuff is supposed to be organized
>> to extract the INFORMATION or is this just a bunch of hopelessly jumbled
>> text that can only be read by a human, not a computer?
>
> It's just a bunch of glyphs and lines drawn on a canvas;
> there is no structure in the content UNLESS your PDF is tagged.
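
For illustration only (this is not from the thread and does not use any iText API): a tagged PDF carries a /StructTreeRoot entry in its document catalog, usually together with a /MarkInfo dictionary. A minimal sketch in plain Java, assuming the catalog is stored uncompressed so that these keys are visible in the raw bytes, can give a first hint about whether a file is tagged. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class TaggedPdfHint {

    // Decode bytes one-to-one as Latin-1 so binary content cannot break the search.
    static boolean containsName(byte[] pdf, String name) {
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        return text.contains(name);
    }

    // Heuristic: a structure tree root must be present in a tagged PDF.
    // Assumption: the catalog is not hidden inside a compressed object stream
    // and the name is not written with #xx escapes.
    static boolean looksTagged(byte[] pdf) {
        return containsName(pdf, "/StructTreeRoot");
    }

    // /MarkInfo is reported separately; it usually accompanies /StructTreeRoot.
    static boolean hasMarkInfo(byte[] pdf) {
        return containsName(pdf, "/MarkInfo");
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java TaggedPdfHint <file.pdf>");
            return;
        }
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("StructTreeRoot found: " + looksTagged(pdf));
        System.out.println("MarkInfo found:       " + hasMarkInfo(pdf));
    }
}
```

A negative result is not conclusive, since the keys can sit inside a compressed object stream; a PDF library that parses the catalog is the reliable way to check.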

Ok, thanks. I'll try to find tags, but I was hoping there was some hierarchy to
the layout and a traversal pattern or something. Are there particular classes in
iText I should grep for?

This would seem like a very limited format in which to present INFORMATION in
things like government documents. Surely there must be some mechanism to extract
machine-readable information, so that other flexible, non-proprietary tools can
manipulate it easily, if the format is being used for public documents.

This is probably more of a marketing discussion than a technical one, but I would
be curious to understand the situation if anyone wants to talk off-list.

Thanks.

> --
> This answer is provided by 1T3XT BVBA
> http://www.1t3xt.com/ - http://www.1t3xt.info
>
> ------------------------------------------------------------------------------
> _______________________________________________
> iText-questions mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> Buy the iText book: http://www.1t3xt.com/docs/book.php
