There is no out-of-the-box solution for this (or for your other posting).
PDF is not a format that has a <TABLE>...</TABLE> or <CHART>...</CHART>
syntax. PDF is just graphics. You can get the lines / shapes with this:
https://stackoverflow.com/questions/38931422/pdfbox-2-0-2-calling-of-pagedrawer-processpage-method-caught-exceptions
However, you'll still have to work out yourself which of those lines / shapes
make up your table / chart.
To get an idea of how tricky this is, open your file with PDFDebugger and
look at the "Contents" part. The operators you see are explained in the
PDF 32000 specification
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
in the annex "Operator Summary" (start with the operators m, l, c, f and s).
Your shape is drawn by this part of the content stream:
0.357 0.608 0.835 rg
125.06 715.44 m
125.06 717.96 127.1 720 129.61 720 c
204 720 l
206.51 720 208.56 717.96 208.56 715.44 c
208.56 697.21 l
208.56 694.69 206.51 692.65 204 692.65 c
129.61 692.65 l
127.1 692.65 125.06 694.69 125.06 697.21 c
h
f*
1 w
0.255 0.443 0.612 RG
125.06 715.44 m
125.06 717.96 127.1 720 129.61 720 c
204 720 l
206.51 720 208.56 717.96 208.56 715.44 c
208.56 697.21 l
208.56 694.69 206.51 692.65 204 692.65 c
129.61 692.65 l
127.1 692.65 125.06 694.69 125.06 697.21 c
h
S
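The operators above describe the same rounded rectangle twice: the first block sets a fill colour (rg) and fills the path (f*), the second sets a line width (w) and a stroke colour (RG) and strokes it (S). As a rough illustration of how bounds could be derived from such operators, here is a minimal sketch that uses only the JDK, not PDFBox. The class name PathBounds and the method boundsOf are made up for this example. It assumes well-formed input, handles only m, l, c, h and the painting operators, and ignores the transformation matrix (cm), the re operator and everything else a real content stream may contain. Coordinates stay in PDF user space, where y points upwards.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of PDFBox.
public class PathBounds {

    // Operators that paint the current path (fill and/or stroke).
    private static final Set<String> PAINT_OPS = new HashSet<>(
            Arrays.asList("f", "F", "f*", "S", "s", "B", "B*", "b", "b*"));

    /** Returns the union of the bounds of all painted paths, or null if none was painted. */
    static Rectangle2D boundsOf(String content) {
        Path2D.Double path = new Path2D.Double();
        List<Double> operands = new ArrayList<>();
        Rectangle2D result = null;
        for (String token : content.trim().split("\\s+")) {
            if (token.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)")) {
                operands.add(Double.parseDouble(token)); // numeric operand
                continue;
            }
            if (token.equals("m")) {
                path.moveTo(operands.get(0), operands.get(1));
            } else if (token.equals("l")) {
                path.lineTo(operands.get(0), operands.get(1));
            } else if (token.equals("c")) {
                path.curveTo(operands.get(0), operands.get(1), operands.get(2),
                        operands.get(3), operands.get(4), operands.get(5));
            } else if (token.equals("h")) {
                path.closePath();
            } else if (PAINT_OPS.contains(token)) {
                // The path is painted: remember its bounds and start a new path.
                Rectangle2D b = path.getBounds2D();
                result = (result == null) ? b : result.createUnion(b);
                path.reset();
            } else if (token.equals("n")) {
                path.reset(); // path ended without painting
            }
            // Any other operator (rg, RG, w, ...) is ignored in this sketch.
            operands.clear();
        }
        return result;
    }

    public static void main(String[] args) {
        String content =
                "0.357 0.608 0.835 rg\n"
              + "125.06 715.44 m\n"
              + "125.06 717.96 127.1 720 129.61 720 c\n"
              + "204 720 l\n"
              + "206.51 720 208.56 717.96 208.56 715.44 c\n"
              + "208.56 697.21 l\n"
              + "208.56 694.69 206.51 692.65 204 692.65 c\n"
              + "129.61 692.65 l\n"
              + "127.1 692.65 125.06 694.69 125.06 697.21 c\n"
              + "h\n"
              + "f*\n";
        System.out.println(boundsOf(content));
    }
}
```

For the shape above this gives a rectangle of roughly 83.5 x 27.35 units with its lower left corner at (125.06, 692.65). A chart consists of many such paths (plus text), so deciding which of them belong to the chart is still up to you.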
The chart in the other file is more difficult to find; I didn't even try.
Tilman
On 23.11.2017 at 05:00, S S Satyanarayana Damarla wrote:
It looks like the PDF attachment didn't get through.
I have uploaded the PDF document at the following location:
https://drive.google.com/file/d/1uYoQweCVbO4cNQiMnJuVjM1WZu7Cr7Ae/view
Please use the link above to access the PDF document that contains this
chart.
I would appreciate any help with this.
Thanks,
-Satya
On 2017-11-22 18:06, S S Satyanarayana Damarla <[email protected]> wrote:
Hi,
I have attached a PDF document which contains a chart.
For our project, we need to retrieve the rectangle bounds of the chart in the
attached PDF document. The chart is not recognized as an image object
(PDImageXObject); it looks like it is drawn directly in the content stream.
I would appreciate sample code for retrieving the rectangle bounds of the
chart in the attached PDF document.
Thanks
-Satya