[ https://issues.apache.org/jira/browse/PDFBOX-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025643#comment-14025643 ]
Tilman Hausherr commented on PDFBOX-18:
---------------------------------------

Indeed, there is no such thing as a "table" in a PDF. See also the similar question and its answer here
http://stackoverflow.com/q/23828463/535646

Many OCR programs attempt to identify "tables" and sometimes it works, sometimes it doesn't.

> Possible to Extractact Just Table From PDF
> ------------------------------------------
>
> Key: PDFBOX-18
> URL: https://issues.apache.org/jira/browse/PDFBOX-18
> Project: PDFBox
> Issue Type: New Feature
> Components: Text extraction
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552835&aid=1020203
> Originally submitted by nobody on 2004-08-31 23:23.
> Sir i want to know is it possible to extract Just Table
> From PDF File ,if it is possible then
> Tell me how i can identify in Streams that this Streams
> contains Table
> Sir i want to mention you also that previously i
> extracted the Text from PDF file and i know the whole
> structure of PDF file
> Just Tell me the exact way how i identify
> Sir i am waiting for you reply
> [comment on SourceForge]
> Originally sent by benlitchfield.
> Logged In: YES
> user_id=601708
> This is an RFE for table support, not a bug request, so I
> am changing the issue type. In addition, PDF documents do
> not contain 'tables', so that information would need to be
> derived and could only be done with little accuracy. I am
> changing the priority to 1, as I will probably never
> implement this myself. Please feel free to submit a patch
> though.
> Ben

--
This message was sent by Atlassian JIRA
(v6.2#6252)