Paulo, do you have any sample code for this?  I am
using ps2ascii, which comes with gs, but to get
checksums of the existing text I thought I might save
time if I didn't have to shell out.  I don't know
exactly what you're describing, though.  What
classes/methods should I be looking at?

Thanks,
Matt

--- Paulo Soares <[EMAIL PROTECTED]> wrote:
> I'm surprised that pstotext fails; iText certainly doesn't generate
> anything strange. If you generated the documents with iText using
> WinAnsi font encoding, you can read the document with iText, open
> the stream and parse the text. To parse the text, find the first '('
> and read until the next ')', taking '\' escapes into account.
> 
> Best Regards,
> Paulo Soares
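
A minimal sketch of the parsing step described above, in plain Java. It
assumes the page content stream has already been read and decompressed
into a String (one char per byte), and that the text sits in literal
strings "(...)" rather than hex strings "<...>". The class and method
names are hypothetical and are not part of iText. Beyond what the
message describes, the sketch also handles balanced nested parentheses
and octal escapes, which PDF literal strings allow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an iText class.
class ContentStreamTextSketch {

    /** Returns the literal strings "(...)" found in a decoded content stream, in order. */
    static List<String> literalStrings(String stream) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int n = stream.length();
        while (i < n) {
            if (stream.charAt(i) != '(') { i++; continue; }
            i++; // skip the opening '('
            StringBuilder sb = new StringBuilder();
            int depth = 1; // unescaped parentheses may nest if balanced
            while (i < n && depth > 0) {
                char c = stream.charAt(i++);
                if (c == '\\' && i < n) {
                    char e = stream.charAt(i++);
                    switch (e) {
                        case 'n': sb.append('\n'); break;
                        case 'r': sb.append('\r'); break;
                        case 't': sb.append('\t'); break;
                        case 'b': sb.append('\b'); break;
                        case 'f': sb.append('\f'); break;
                        case '\n': break; // backslash + newline: line continuation
                        case '\r': // backslash + CR or CRLF: line continuation
                            if (i < n && stream.charAt(i) == '\n') i++;
                            break;
                        default:
                            if (e >= '0' && e <= '7') {
                                // octal escape, one to three digits
                                int v = e - '0';
                                for (int k = 0; k < 2 && i < n
                                        && stream.charAt(i) >= '0' && stream.charAt(i) <= '7'; k++) {
                                    v = v * 8 + (stream.charAt(i++) - '0');
                                }
                                sb.append((char) (v & 0xFF));
                            } else {
                                sb.append(e); // \( \) \\ and any unknown escape
                            }
                    }
                } else if (c == '(') {
                    depth++;
                    sb.append(c);
                } else if (c == ')') {
                    depth--;
                    if (depth > 0) sb.append(c);
                } else {
                    sb.append(c);
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up content stream fragment for illustration.
        String sample = "BT /F1 12 Tf (Hello) Tj [(Wor) -20 (ld)] TJ ET";
        System.out.println(literalStrings(sample));
    }
}
```

The sketch collects every literal string it finds, not only the operands
of text-showing operators, and it ignores font encodings other than the
one-byte case assumed above, so treat the result as an approximation of
the text.
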
> 
> > -----Original Message-----
> > From:       Matt Benson [SMTP:[EMAIL PROTECTED]]
> > Sent:       Tuesday, November 26, 2002 17:52
> > To: Paulo Soares; itext-questions
> > Subject:    RE: [iText-questions] PDF metadata
> > 
> > Thanks for the suggestion, Paulo.  Now, what would you use to do the
> > extraction?  I have been playing with pstotext, which uses gs behind
> > the scenes, but gs is choking on the iText-generated PDF.  I have
> > looked at jPedal, but not deeply, as its API does not seem intuitive
> > enough for me to get started quickly without reading the source code
> > of the examples.
> > 
> > -Matt
> > 
> > --- Paulo Soares <[EMAIL PROTECTED]> wrote:
> > > Taking out the metadata won't help you, as there are no guarantees
> > > that the layout engine is the same from version to version; the
> > > text may look the same while the internal representation is
> > > different. The best way is to compute a checksum of the text (words
> > > only, skipping the whitespace) and store it in the PDF metadata as
> > > a new key. For the already generated PDFs, you can extract the
> > > text, calculate the checksum and apply it to the same PDF.
> > > 
> > > Best Regards,
> > > Paulo Soares
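
One possible way to compute such a checksum, sketched in plain Java. The
choice of SHA-256 and the class and method names are assumptions made
for illustration; the thread does not name an algorithm. The sketch
drops all whitespace before hashing, so two extractions that differ only
in spacing or line breaks produce the same value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not an iText class.
class TextChecksumSketch {

    /** Hex-encoded SHA-256 digest of the text with all whitespace removed. */
    static String checksum(String text) {
        // Keep only the non-whitespace characters ("words only").
        StringBuilder words = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isWhitespace(c)) {
                words.append(c);
            }
        }
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        byte[] digest = md.digest(words.toString().getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Both lines print the same digest because only whitespace differs.
        System.out.println(checksum("Hello   World\n"));
        System.out.println(checksum("Hello World"));
    }
}
```

The resulting string could then be stored under a custom key in the
PDF's document information, as suggested above; how to write that key
depends on the iText version in use and is not shown here.
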
> > > 
> > > > -----Original Message-----
> > > > From:   Matt Benson [SMTP:[EMAIL PROTECTED]]
> > > > Sent:   Tuesday, November 26, 2002 15:40
> > > > To:     itext-questions
> > > > Subject:        [iText-questions] PDF metadata
> > > > 
> > > > We are using iText to convert text files to PDF as outlined in
> > > > the FAQ.  This works; however, I want to take a checksum of the
> > > > PDF created and use it in conjunction with some other
> > > > information to verify we have not created this file before.
> > > > What I am finding, however, is that the PDF metadata always
> > > > differs between iText versions, as does the creation date/time,
> > > > so I cannot create the exact same file twice and thus cannot
> > > > rely on a checksum.  I could use the checksum of the input file,
> > > > except that this is a modification to a production application
> > > > and we no longer have the input files for the existing data.  So
> > > > to do this I would have to extract the text to get an
> > > > approximation of the original file.  If I did this, the checksum
> > > > would represent slightly different things for the old and the
> > > > new data.  What I am wondering is whether these variable pieces
> > > > of metadata are vital to the PDF structure, and if not, what
> > > > would it take to remove them?  Alternatively, if anyone has a
> > > > better idea, it is welcome too.
> > > > 
> > > > Thanks,
> > > > Matt


_______________________________________________
iText-questions mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/itext-questions
