How do I get back to the PDF domain once I have pixel values? Is it feasible?
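For reference, the mapping asked about here comes down to a scale and a flip of the y axis. Below is a minimal sketch, not a definitive implementation. It assumes the whole page box was rendered at a known resolution with no rotation, that pixel coordinates start at the top-left corner with y pointing down, and that the PDF uses the default user space unit of 1/72 inch with the origin at the lower-left. The names `pixel_box_to_pdf`, `PdfRect` and `page_box` are hypothetical and are not part of iText.

```python
from typing import NamedTuple

POINTS_PER_INCH = 72.0  # default PDF user space unit is 1/72 inch


class PdfRect(NamedTuple):
    llx: float  # lower-left x in PDF user space
    lly: float  # lower-left y
    urx: float  # upper-right x
    ury: float  # upper-right y


def pixel_box_to_pdf(x, y, width, height, dpi, page_box):
    """Map a pixel bounding box to a rectangle in PDF user space.

    x, y, width, height: box in pixels, origin top-left, y pointing down.
    dpi: resolution the page was rendered at (assumed known).
    page_box: (llx, lly, urx, ury) of the page box that was rendered.
    """
    scale = POINTS_PER_INCH / dpi  # PDF units per pixel
    page_llx, _page_lly, _page_urx, page_ury = page_box
    llx = page_llx + x * scale
    urx = page_llx + (x + width) * scale
    # The top pixel row corresponds to the top edge of the page box,
    # so pixel y is subtracted from the upper y of the page.
    ury = page_ury - y * scale
    lly = page_ury - (y + height) * scale
    return PdfRect(llx, lly, urx, ury)


if __name__ == "__main__":
    # A 595 x 842 page rendered at 144 dpi gives 0.5 PDF units per pixel.
    print(pixel_box_to_pdf(100, 200, 300, 400, 144, (0, 0, 595, 842)))
```

With a rectangle in user space, the border itself could then be drawn with iText, for example with `PdfContentByte.rectangle(x, y, w, h)` followed by `stroke()` on the page's over-content. A rotated page or a renderer that crops to a different box would need an extra transformation that this sketch does not cover.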
Thanks

-----Original Message-----
From: Leonard Rosenthol [mailto:lrose...@adobe.com]
Sent: Tuesday, November 1, 2011 18:47
To: Post here
Subject: Re: [iText-questions] R: R: R: R: R: image in Flatedecode stream without metadata in dictionary

Such approaches have been used in the past. There are pros and cons to it...

Leonard

On 11/1/11 1:35 PM, "Giampaolo Capelli" <giam...@gmail.com> wrote:

>Hi 1T3XT,
>thank you for the tips.
>
>I'm thinking of a computer vision approach to solve my problem:
>if I switch to the raster domain, that is rendering the pdf syntax into a
>raster image,
>then I will be able to apply some computer vision techniques to recognize
>the shapes (blobs),
>and to get their convex hulls (or bounding boxes).
>
>It would be possible, for example, to work on a black/white version of the
>rendered image, then to apply a threshold on its pixels and to apply some
>algorithm to label the connected components and so on.
>
>Once I have the shapes (blobs), I will be able to get their bounding boxes,
>widths, heights and positions expressed in pixel values.
>
>The last step would be to convert such values back to the pdf domain.
>
>Do you think this makes sense?
>
>
>-----Original Message-----
>From: 1T3XT BVBA [mailto:i...@1t3xt.info]
>Sent: Tuesday, November 1, 2011 17:34
>To: Post all your questions about iText here
>Subject: Re: [iText-questions] R: R: R: R: image in Flatedecode stream
>without metadata in dictionary
>
>On 1/11/2011 17:09, Giampaolo Capelli wrote:
>> In the attachment
>>
>> I'm providing an example of a pdf file where I see 4 "conceptual
>> images", in the psychological visual sense (I understood that they are
>> not images from the pdf point of view).
>>
>> My aim is to edit the pdf file adding a (rectangular) border to such
>> "conceptual images".
>>
>> Is there an easy way to do it with iText, or should I handcraft some
>> sort of low level parsing myself?
>Please take a look at the screen shot in attachment.
>It's a screen shot of RUPS, which is a tool that takes X-Ray photos of
>PDFs: it shows an "internal view" of your PDF.
>
>Look at the /Resources Dictionary. It doesn't contain an /XObject entry.
>This means that there are NO Image XObjects (this you already knew), but NO
>Form XObjects either (we assumed there were Form XObjects, but that must
>have been a misunderstanding)!
>
>Where are the four images? Well... as far as the PDF is concerned, there are
>no FOUR images. As you rightly point out, they are only there in the
>psychological, visual sense (you've phrased that very well).
>
>As far as the PDF is concerned, there's only ONE sequence of PDF syntax:
>the page content stream (object 5 in the PDF), which is the /Contents entry
>of the page dictionary (object 4). All the paths and shapes on that page are
>constructed using operators such as moveTo (m), lineTo (l) and curveTo (c).
>
>I don't know any software (not iText, not any other software) that is
>intelligent enough to find out which paths and shapes belong to which
>"conceptual" or "visual" image. I don't know any way to automate the
>detection of these images.
>
>_______________________________________________
>iText-questions mailing list
>iText-questions@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/itext-questions
>
>iText(R) is a registered trademark of 1T3XT BVBA.
>Many questions posted to this list can (and will) be answered with a
>reference to the iText book: http://www.itextpdf.com/book/
>Please check the keywords list before you ask for examples:
>http://itextpdf.com/themes/keywords.php
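
The raster-domain steps proposed in this thread (threshold the rendered page, label the connected components, take their bounding boxes) could be sketched roughly as below. This is only an illustration under stated assumptions: the page is assumed to be already rendered to a grayscale grid of 0..255 values by some external renderer, 4-connectivity is used, and the function name `blob_bounding_boxes` and the default threshold of 128 are hypothetical choices, not something taken from iText or from the thread.

```python
from collections import deque


def blob_bounding_boxes(gray, threshold=128):
    """Return (x, y, width, height) in pixels for each 4-connected dark blob.

    gray: list of rows, each a list of 0..255 values (assumed input format).
    Pixels strictly below `threshold` are treated as ink.
    """
    rows = len(gray)
    cols = len(gray[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or gray[r][c] >= threshold:
                continue
            # Breadth-first flood fill starting from this unvisited ink pixel,
            # tracking the extent of the component as it grows.
            min_r = max_r = r
            min_c = max_c = c
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and gray[nr][nc] < threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            boxes.append((min_c, min_r,
                          max_c - min_c + 1, max_r - min_r + 1))
    return boxes


if __name__ == "__main__":
    W, B = 255, 0  # white background, black ink
    page = [
        [W, W, W, W, W, W],
        [W, B, B, W, W, W],
        [W, B, W, W, B, W],
        [W, W, W, W, B, W],
    ]
    print(blob_bounding_boxes(page))
```

One limitation worth keeping in mind: a "conceptual image" drawn from strokes that do not touch each other would come out as several separate blobs, so some merging of nearby boxes (or a dilation step before labeling) would probably be needed. That is one of the cons of the approach.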