----------------------------------------
> From: lrose...@adobe.com
> To: itext-questions@lists.sourceforge.net
> Date: Mon, 10 May 2010 06:44:13 -0700
> Subject: Re: [iText-questions] how to detect remote links in a PDF ?
>
> Prior to PDF 1.5, you could have done a grep (or equivalent) since only 
> stream objects were compressed. However, as of PDF 1.5, we now have "object 
> streams", where groups of objects are placed into a stream and then 
> compressed - which means that grep will no longer work.
>
> Adobe Acrobat 9 will ALWAYS (unless restricted by a specific ISO standard, 
> such as PDF/A) use object stream compression to keep file sizes down. I've 
> been trying to recommend that other products do the same.
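
As a rough check of which case a given file falls into: an object stream is
a stream object whose dictionary carries /Type /ObjStm, and that dictionary
is itself stored uncompressed, so a byte search for the marker still works
even when a search for the contents does not. The sketch below is a
heuristic only. The name uses_object_streams is made up for illustration,
and a damaged file, or one that happens to contain the marker inside other
data, may fool it.

```python
import re

# /Type /ObjStm in a stream dictionary marks an object stream (PDF 1.5+).
# Whitespace between the two names is optional in PDF syntax.
OBJSTM = re.compile(rb"/Type\s*/ObjStm\b")


def uses_object_streams(pdf_bytes: bytes) -> bool:
    """Return True if the raw bytes contain an object stream marker."""
    return OBJSTM.search(pdf_bytes) is not None
```

If this returns True, a plain grep over the file can miss objects that live
inside the compressed streams.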


Is there some utility, like pdftk, that converts a PDF with arbitrary stuff in
it to some "standard" or canonical format, so that it can be used with other
tools and you don't have to write custom code for every trivial variation of
the thing you wish to accomplish? For example,

cat xxx.pdf | pdf_to_standard_form | grep http 


Obviously the applicability would go beyond the immediate question: it would
also give people writing iText code some way to check their results more
easily than "it opened in proprietary Adobe product X, but in black box Y it
greyed out 3 menu options and wouldn't let me save it unless blah blah blah".

There is nothing wrong with a human-readable end product, but given the
complexity of these things it would be nice to use computers to automate
certain checks, such as looking for links or other attributes. Without the
ability to use automated tools, everything comes down to a long menu chain and
terse messages from products that were not designed for debugging.
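
For what it's worth, a poor man's pdf_to_standard_form can be sketched in
Python with only the standard library. This is a sketch under assumptions,
not a real canonicalizer: it assumes the streams use plain FlateDecode (zlib)
with no predictor and no encryption, it finds streams by pattern matching
rather than by parsing the xref table, and the names inflate_streams and
find_urls are made up for illustration.

```python
import re
import zlib

# Everything between the "stream" keyword and "endstream". The trailing
# end-of-line before "endstream" is left in; the decompressor ignores it.
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Crude URL pattern: stop at whitespace and common PDF string delimiters.
URL = re.compile(rb'https?://[^\s()<>"]+')


def inflate_streams(pdf_bytes: bytes) -> bytes:
    """Return the raw bytes followed by every stream body that inflates."""
    chunks = [pdf_bytes]
    for match in STREAM.finditer(pdf_bytes):
        try:
            # decompressobj tolerates bytes after the end of the zlib data.
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not zlib data (image, other filter, encrypted): skip
    return b"\n".join(chunks)


def find_urls(pdf_bytes: bytes) -> list[bytes]:
    """Return the sorted, de-duplicated URLs found after inflating streams."""
    return sorted(set(URL.findall(inflate_streams(pdf_bytes))))
```

Wired to stdin and stdout, inflate_streams would slot into the pipeline above
in place of pdf_to_standard_form. A real tool should parse the file properly;
this only shows that the compressed objects are recoverable with ordinary
code.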





>
> So while there certainly exist lots of PDFs that you could grep, the numbers
> are shrinking daily...
>
> Leonard
>
> -----Original Message-----
> From: Mike Marchywka [mailto:marchy...@hotmail.com]
> Sent: Monday, May 10, 2010 3:51 AM
> To: itext-questions@lists.sourceforge.net
> Subject: Re: [iText-questions] how to detect remote links in a PDF ?
>
>
> ----------------------------------------
>> Date: Sun, 9 May 2010 23:08:51 +0200
>> From: papa...@googlemail.com
>> To: itext-questions@lists.sourceforge.net
>> Subject: [iText-questions] how to detect remote links in a PDF ?
>>
>> Colleagues,
>>
>> For an application, one needs to detect the hyperlinks (i.e. done with
>> Chunk.setRemoteGoto) in a PDF which point to an other PDF, can someone
>> point me to a solution ?
>
> Question for Leonard or others who have read the spec: if you literally ONLY
> want to list the links, not parse the document or determine any context,
> are they likely to be hidden, or can you just use text tools to find
> strings that start with or contain "http"? For example,
>
>
>   cat *.pdf ../Desktop/*.pdf | sed -e 's/[^a-zA-Z0-9/:.?]/\n/g' | grep http
>   cat *.pdf ../Desktop/*.pdf | strings | grep http
>
> These seem to work in that they find things with http, but I am not sure what
> would be missing. Many of the hits seem to be surrounded by XML or prefixed
> with "/A", but I am not sure what other contexts may exist.
>
> Thanks.
>
>>
>> Thank you very much in advance,
>> Pieter Vankeerberghen
>>
>> ------------------------------------------------------------------------------
>>
>> _______________________________________________
>> iText-questions mailing list
>> iText-questions@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/itext-questions
>>
>> Buy the iText book: http://www.itextpdf.com/book/
>> Check the site with examples before you ask questions: 
>> http://www.1t3xt.info/examples/
>> You can also search the keywords list: http://1t3xt.info/tutorials/keywords/
                                          