Hi,
Can anyone help me please?
---------- Forwarded message ----------
From: Nadav Ben Jakov <tbya...@gmail.com>
Date: Tue, Feb 10, 2015 at 6:34 PM
Subject: Re: [Podofo-users] Problem with enumerating objects with PoDoFo
To: Leonard Rosenthol <lrose...@adobe.com>
Hi,
I'm sorry for my rather vague explanations. I'm not trying to render
the streams' content or to parse the streams as data. I just want to
print the section between 'stream' and 'endstream'.
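
For illustration, here is a minimal sketch of that extraction in Python,
using only the standard library (it is not PoDoFo code). The function name
extract_streams and both regular expressions are assumptions made up for
this example. It scans the raw file bytes instead of following the
cross-reference table, decodes only /FlateDecode, and does not handle
encryption, filter chains, or objects stored inside object streams.

```python
import re
import zlib

# "N G obj <dictionary> stream<EOL>"; the dictionary part may not cross "endobj",
# so an object without a stream cannot swallow the next object's stream.
OBJ_STREAM = re.compile(
    rb"(\d+)\s+(\d+)\s+obj\b((?:(?!endobj).)*?)stream\r?\n", re.DOTALL
)
# A direct integer /Length; an indirect reference such as "/Length 5 0 R" is skipped.
LENGTH = re.compile(rb"/Length\s+(\d+)\b(?!\s+\d+\s+R)")


def extract_streams(pdf: bytes) -> dict:
    """Return {(object number, generation): stream bytes} for every stream found."""
    streams = {}
    pos = 0
    while True:
        m = OBJ_STREAM.search(pdf, pos)
        if m is None:
            break
        header, start = m.group(3), m.end()
        length = LENGTH.search(header)
        if length:
            end = start + int(length.group(1))
            raw = pdf[start:end]
        else:
            # No usable /Length: fall back to the next "endstream" keyword
            # and drop the single end-of-line marker that precedes it.
            end = pdf.find(b"endstream", start)
            if end < 0:
                end = len(pdf)
            raw = pdf[start:end]
            if raw.endswith(b"\r\n"):
                raw = raw[:-2]
            elif raw.endswith((b"\n", b"\r")):
                raw = raw[:-1]
        if b"/FlateDecode" in header:
            try:
                raw = zlib.decompress(raw)
            except zlib.error:
                pass  # keep the undecoded bytes if the data is not valid zlib
        streams[(int(m.group(1)), int(m.group(2)))] = raw
        pos = end
    return streams


sample = b"2 0 obj\n<</Length 5>>stream\nPK\x03\x04!\nendstream\nendobj\n"
print(extract_streams(sample))  # {(2, 0): b'PK\x03\x04!'}
```

A stream without a /Filter entry comes back byte for byte, which is the
behaviour asked for with the embedded ZIP.
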
As for printing, I basically write each stream to a file as output, so they
are in fact "printable" in that sense. In the case I gave, I expect
to be able to open the output file with WinZip/7-Zip later and view its
content. If that stream had contained a PE file's content, I would
want to be able to execute the output file just like a normal PE
file.
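
Writing the extracted bytes out could look like the following sketch, in
Python with only the standard library (not PoDoFo). The names dump_streams
and guess_extension, the obj_<num>_<gen> file naming, and the .bin fallback
are all hypothetical. The only facts relied on are that a ZIP archive with
at least one entry begins with the bytes PK\x03\x04 and that a PE file
begins with MZ. The input is assumed to be a dict mapping
(object number, generation) to the stream's bytes.

```python
from pathlib import Path

# Leading "magic" bytes of the two formats mentioned in this thread.
MAGIC = {b"PK\x03\x04": ".zip", b"MZ": ".exe"}


def guess_extension(data: bytes) -> str:
    """Pick a file extension from the first bytes; '.bin' when unrecognised."""
    for magic, ext in MAGIC.items():
        if data.startswith(magic):
            return ext
    return ".bin"


def dump_streams(streams: dict, out_dir: str) -> list:
    """Write each stream verbatim to out_dir/obj_<num>_<gen><ext>; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for (num, gen), data in sorted(streams.items()):
        path = out / f"obj_{num}_{gen}{guess_extension(data)}"
        path.write_bytes(data)  # binary write: no newline or encoding translation
        paths.append(path)
    return paths
```

Because the bytes are written unchanged, a stream that holds a ZIP archive
should open in an archive tool exactly as the original archive would.
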
Thanks in advance,
Nadav
On Tue, Feb 10, 2015 at 6:16 PM, Leonard Rosenthol <lrose...@adobe.com>
wrote:
> > I'm trying to extract each of the streams' data out of all of the
> objects of a given PDF file,
> >and remove any encoding/encryption related to the PDF format
> >
> That is EXACTLY what PoDoFo is doing for you.
>
>
> > Later I'm printing each stream separately
> >
> Printing them in what way? Most (if not all) streams in a PDF aren’t in a
> format that would be printable. You have font data, color data, XML
> streams, etc. None of these make any sense to be printed. Even a page
> content stream isn’t necessarily printable.
>
>
> >In the example file I mentioned, I was hoping to extract that object
> stream (the zip file) and print it as a raw stream
> >
> I don’t understand what it means to “print a ZIP file as a raw stream”. A
> ZIP file itself can contain MANY files and in a rich hierarchical
> organization. So what would you be expecting to get back??
>
> What if it was a .exe file – which is also perfectly fine in a PDF –
> what would you be expecting to get back??
>
>
> >(again, I'm not after unzipping or decompressing the zip, just the
> plain object).
> >
> And I think that’s the problem here – you are expecting something that
> isn’t reasonable.
>
> Have you read the PDF Standard – ISO 32000-1 – or at least a good book
> on the subject of PDF (e.g. <http://shop.oreilly.com/product/0636920025269.do>)
> to get a better understanding of PDF?
>
>
> Leonard
>
> From: Nadav Ben Jakov
> Date: Tuesday, February 10, 2015 at 10:07 AM
>
> To: Leonard Rosenthol
> Subject: Re: [Podofo-users] Problem with enumerating objects with PoDoFo
>
> I'm trying to extract each of the streams' data out of all of the
> objects of a given PDF file, and remove any encoding/encryption related to
> the PDF format. Later I'm printing each stream separately.
> I'm only trying to figure out whether I can rely on PoDoFo to parse and
> extract any stream in any object, even if the stream in question is
> unfiltered/clear. In those cases I'm basically after the object's raw
> stream.
> In the example file I mentioned, I was hoping to extract that object stream
> (the zip file) and print it as a raw stream (again, I'm not after unzipping
> or decompressing the zip, just the plain object).
>
> Nadav
>
> On Tue, Feb 10, 2015 at 4:51 PM, Leonard Rosenthol <lrose...@adobe.com>
> wrote:
>
>> This stream IS clear (to use your term) as far as PDF is concerned. It
>> is neither filtered nor encrypted. Therefore you get the raw bits – which
>> in this case, happen to be a .zip file. But as mentioned, it might be
>> font data, ICC profile data or any other manner of thing that can be put
>> into a PDF stream in various ways.
>>
>> Perhaps if you would explain the reason for your request, since it’s
>> unclear what you are trying to achieve at the end of the process.
>>
>> Leonard
>>
>> From: Nadav Ben Jakov
>> Date: Tuesday, February 10, 2015 at 9:02 AM
>> To: Leonard Rosenthol
>> Subject: Re: [Podofo-users] Problem with enumerating objects with PoDoFo
>>
>> Hi,
>>
>> Thank you for your quick answer, yet it didn't completely answer my
>> question: I need to get all of the streams as output, regardless of
>> whether they have /Filter or /Encrypt. If the object happens to contain
>> a /Filter, the stream should be decoded. If the stream is clear, it
>> should only be parsed and returned as a clear stream.
>> Can PoDoFo (or any other PDF library for that matter) perform such an
>> operation?
>>
>> Thanks in advance,
>> Nadav
>>
>> On Tue, Feb 10, 2015 at 3:38 PM, Leonard Rosenthol <lrose...@adobe.com>
>> wrote:
>>
>>> That stream that you show doesn’t have a filter. If it did, you’d
>>> see it on the stream dictionary.
>>>
>>> Instead, what you have is an unfiltered stream containing arbitrary
>>> data – which in this case, just happens to be a ZIP file. But it could
>>> just as well have been a Word file, another PDF or even a .exe.
>>>
>>> There is nothing for PoDoFo (or any other PDF library) to do here.
>>>
>>> Leonard
>>>
>>> From: Nadav Ben Jakov
>>> Date: Tuesday, February 10, 2015 at 8:32 AM
>>> To: "podofo-users@lists.sourceforge.net"
>>> Subject: [Podofo-users] Problem with enumerating objects with PoDoFo
>>>
>>> Hi.
>>> I'm new to the PoDoFo library, and I've been trying to use it for my
>>> program.
>>> In general, my program needs to open a PDF file, decode all of its
>>> filters, decrypt them with the default password (if necessary), and print
>>> out all decoded and decrypted objects.
>>> I saw that podofouncompress (using the PoDoFo library) pretty much
>>> does that, so I tried to use the same library functions as it does. The
>>> whole project went great from that point until I encountered an error.
>>> Apparently the function PdfMemDocument::GetObjects() doesn't return all
>>> objects, but only those that had been handled (or loaded?). Anyway, it
>>> would be great to find a way to get the PDF's entire object list decrypted
>>> and uncompressed (PdfMemDocument::UncompressObjects()).
>>>
>>> I've uploaded a sample file I've been working on, in which object 2
>>> generation 0 is clearly an "embedded" zip file:
>>>
>>> 2 0 obj
>>> <</Length 144>>stream
>>> PK ÏyJFëEŒ‰% &
>>>
>>> Text File.txtíÆ1
>>> 0 0+ ˜ f “ðpLþŒ´WsºâÖÛ ©ªªªªªª PK ÏyJFëEŒ‰% &
>>>
>>> Text File.txtPK ; P
>>> endstream
>>> endobj
>>>
>>> In the link below:
>>> http://www57.zippyshare.com/v/or6jqo28/file.html
>>>
>>> When scanning with PoDoFo, only object 6 0 is available in the
>>> mentioned function's output, and podofouncompress completely ignores
>>> object 2 0.
>>>
>>> Thanks in advance,
>>> Nadav
>>>
>>
>>
>