On Wed, Aug 06, 2008 at 03:15:19PM +1000, Robert Backhaus wrote:
> On Wed, Aug 6, 2008 at 2:47 PM, Michal Hocko <[EMAIL PROTECTED]> wrote:
> > Hi, sorry for the late reply (I noticed your previous post, but I was
> > on my way to vacation at the time).
> >
> > On Wed, Aug 06, 2008 at 12:11:26PM +1000, Robert Backhaus wrote:
> >> When I attempt to edit any text on some pdfs, the editable text shown
> >> in the dynamic toolbar is scrambled. For instance, the text "Tax
> >> Invoice #000335" is shown as "8E\ -RZSMGI        " (in case the
> >> non-printing characters get lost, they are (in Unicode)
> >> 0004, 0004, 0007, 0014 (three times), 0017 (twice), 0019).
> >> This is a pdf created by the gnome print tool.
> >
> > Does this happen for all generated documents? Have you tried some simple
> > documents with a small amount of text, with and without special characters?
> 
> Yes, the document I attached was generated from the text editor on Ubuntu
> (whatever that is) by the cups pdf creator. It contains the alphabet in upper
> and lower case, numbers and all standard (on the keyboard) characters.
> 
> >
> >> A pdf file that shows this behavior is attached. PDFs created by cups
> >> out of notepad show this behaviour, but when created out of OpenOffice
> >> do not.
> >
> > I will try to look at it, but my email backlog is rather long after
> > vacation, so it may take a while.
> >
> > Maybe Jozo could look at it too...
> >
> >>
> >> I have noticed that the text is in a custom font, with a name such as
> >> "NAAAAA+DejaVuSansMono". If I insert new text using this font, the
> >> text I enter is scrambled in the pdf, and many typed characters show
> >> as unfilled boxes. (In PDFs out of OpenOffice, any characters not in
> >> the original document show up as unfilled boxes. Characters in that
> >> font in the original document display correctly)
> >>
> >> It seems to me that the pdf file only includes glyphs as they are used
> >> in the file, and numbers them as it requires them. Or it may be some
> >> abortive DRM move (although it is strange that gnome-print does it, if
> >> that is the case!)
> >
> > This is an optimization to save space: PDF generators embed only the
> > glyphs a document uses, which makes the resulting files smaller.
> >
> >>
> >> Is there any way to either unscramble the text in the pdf file, so
> >> text can be entered normally,
> >
> > You can insert system font and use it for your additional text.
> >
> >> or to get pdfedit to show the descrambled text in the edit boxes, and
> >> rescramble the text when I am done editing it?
> >
> > I am not sure how much we can do in this area, because a PDF generator
> > can create almost arbitrary character mappings (at least as far as I
> > understand it - Jozo may give better background here).
> >
> 
> The "Extract Text" tool has no issues decoding the text, but I do not know
> the innards of things to know whether that is relevant.

The Extract Text tool is heavily based on xpdf code, including all the
magic it does to make this feature work. Text editing works by modifying
the content stream directly, without any special handling to find out
what the data actually represents. I think we need to add some logic there.
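
To make the mapping problem concrete, here is a minimal Python sketch
(not pdfedit code) that uses only the two strings from the original
report. For this one document the stored codes happen to sit a constant
28 below the ASCII values of the intended characters. That is an
observation about this example only, since a generator is free to use
any mapping. The names descramble and code_to_char are made up for
illustration.

```python
# Character codes as quoted in the report: the visible part "8E\ -RZSMGI"
# with the non-printing values 0x04, 0x07, 0x14, 0x17 and 0x19 filled in.
scrambled = "8E\\\x04-RZSMGI\x04\x07\x14\x14\x14\x17\x17\x19"
expected = "Tax Invoice #000335"

# Difference between each intended character and the code stored for it.
offsets = {ord(plain) - ord(code) for code, plain in zip(scrambled, expected)}
print(offsets)  # {28}

# Hypothetical code -> character table built from this one pair of strings.
# A real document would need the table taken from the font itself.
code_to_char = dict(zip(scrambled, expected))


def descramble(text, table):
    """Map each stored code to its character; unknown codes become U+FFFD."""
    return "".join(table.get(code, "\ufffd") for code in text)


print(descramble("8E\\", code_to_char))  # Tax
```

Inverting the table (character -> code) would give the "rescramble"
direction, but only for characters whose glyphs are already present in
the embedded font.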

> 
> > But we should definitely be able to add a new text without any scrambles.
> 
> We can add new text in a new system font. This is inconvenient, as it
> means guessing the original font and retyping the text, and it probably
> increases the file size by splicing in a new font.


Yes, I understand, but the PDF format was never designed to be edited
conveniently... Sometimes it is black magic. It depends on the document
generator, which is free to build the file almost any way it wants.

> 
> >
> >>
> >> (I am sorry if this has been covered already: I browsed through the
> >> support mailing list and could not find anything, and the sourceforge
> >> list search appears to be broken.)
> >
> > We have some reports about problems with accented characters as well as
> > problems with editing some text exported by OO. But it would be useful
> > to find some patterns so that this problem can be sorted out more easily.
> >
> > I think it would be best to report your problem in our bugtracker
> > with all your test documents and follow the discussion there.
> >
> 
> Will do. I managed to find the bug tracker via google, but you do not
> seem to have a link to it on the pdfedit site, or at sourceforge.

Thanks (I will try to look at it ASAP). Sorry, I should add a link there
too, but the bugtracker link is available on our official web site
http://pdfedit.petricek.net/ (this page is reachable from
sourceforge.net -> Web site).

-- 
Michal Hocko

_______________________________________________
Pdfedit-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pdfedit-support