[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2017-04-01 Thread wang haisheng
➜  example git:(master) ✗ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:        16.04
Codename:       xenial
➜  pdf2xml-viewer git:(master) ✗ pdftohtml  
pdftohtml version 0.41.0
Copyright 2005-2016 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2011 Glyph & Cog, LLC

poppler-data is already the newest version (0.4.7-7).


➜  example git:(master) ✗ pdffonts test.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
OCVNVZ+KaiTi_GB2312                  TrueType          WinAnsi          yes yes yes     19  0
JSRZNG+SimSun                        TrueType          WinAnsi          yes yes yes      8  0


➜  example git:(master) ✗ pdftohtml -c -hidden -enc UTF-8 -xml test.pdf test-utf8.xml
Page-1

I could not get correct Chinese characters in the output.

The test file is here:
link: https://pan.baidu.com/s/1dFiSrDn 
password: ai5u

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1438494

Title:
  ghostscript fails to correctly substitute cidf fonts

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/poppler-data/+bug/1438494/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-12-11 Thread Thorsten
Thanks a lot! I created a new bug report, which is hopefully accurate
enough:
https://bugs.launchpad.net/ubuntu/+source/ghostscript/+bug/1525225



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-12-11 Thread cliddell
Your file contains neither fonts nor CIDFonts, it is simply one big
image. Whilst it is common practice for scanner-produced PDFs to use OCR
to overlay the scanned image with non-marking characters (obviously,
being non-marking, the actual font used does not really matter), this
file does not do so. I'd guess that's because either the OCR function
was disabled, or it simply could not recognise the handwritten
characters.

Anyway, as our 9.16 release fails with the same error as you saw (i.e. the 
error is not caused by the Ubuntu packaging), but the 9.18 release (which I 
assume is what you tested) works without error, I had a hunt, and found that 
the fix is this one:
http://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=668406a5

I would suggest, if you want the maintainer to pull in this patch, you
*may* want to open a new bug report, referencing the above commit.



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-12-10 Thread Thorsten
Does this bug also occur if there are no embedded fonts in a PDF? I ask
because my Canon scanner seems to produce PDFs without embedded fonts; at
least, pdffonts shows no fonts:
$ pdffonts SCN_0002.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------

And I get a very similar error when trying to reduce the file size
with Ghostscript:

$ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
     -dPDFSETTINGS=/printer -sOutputFile=out.pdf SCN_0002.pdf
GPL Ghostscript 9.16 (2015-03-30)
Copyright (C) 2015 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
    Error reading a content stream. The page may be incomplete.
    File did not complete the page properly and may be damaged.

    This file had errors that were repaired or ignored.
    The file was produced by: 
     MP540 series 
    Please notify the author of the software that produced this
    file that it does not conform to Adobe's published PDF
    specification.

With the minimal gs version from
http://www.ghostscript.com/download/gsdnld.html it works flawlessly.

I attached a PDF created by my scanner to this comment.

(I am using Ubuntu 15.10.)


** Attachment added: "SCN_0002.pdf"
   https://bugs.launchpad.net/ubuntu/+source/poppler-data/+bug/1438494/+attachment/4532873/+files/SCN_0002.pdf



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-05-14 Thread cliddell
Two things:

1) I *really* don't understand why Ghostscript configuration files are
being installed by poppler. It would be worth finding out how (and even
if) poppler actually uses them, because I rather feel poppler and
Ghostscript configurations *should* be separate. For example, if I get
time, I'll probably be tweaking the capabilities of cidfmap at some
point, which could, potentially, break poppler's use of these files.

2) the question of whether poppler will fall back to some other
substitute CIDFont is moot since, if poppler *does* use those
configuration files, it won't (normally) find the font files they
reference anyway. So even if poppler does use them, splitting them off
into a separate package and fixing the dependencies will work better for
poppler, too.



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-05-13 Thread Till Kamppeter
update-gsfontmap is indeed run in a post-install script, namely that of
the ghostscript package. It should get run on every change in the
directories /etc/ghostscript/cidfmap.d/ and /etc/ghostscript/fontmap.d/,
as packages other than ghostscript, for example font packages, could
drop files there. Indeed the files in /etc/ghostscript/cidfmap.d/ come
from the poppler-data package. I could not determine which package(s)
hold the actual fonts though.



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-05-13 Thread Till Kamppeter
The font packages in Ubuntu that provide the needed CJK fonts are
(ignoring the redundant/obsolete names in 90gs-cjk-resource-japan2.conf):

fonts-arphic-ukai:
/usr/share/fonts/truetype/arphic/ukai.ttc

fonts-arphic-uming:
/usr/share/fonts/truetype/arphic/uming.ttc

fonts-takao-pgothic:
/usr/share/fonts/truetype/takao-gothic/TakaoPGothic.ttf
/usr/share/fonts/truetype/fonts-japanese-gothic.ttf ->
/etc/alternatives/fonts-japanese-gothic.ttf ->
/usr/share/fonts/truetype/takao-gothic/TakaoPGothic.ttf

fonts-hanazono:
/usr/share/fonts/truetype/hanazono/HanaMinA.ttf
/usr/share/fonts/truetype/fonts-japanese-mincho.ttf ->
/etc/alternatives/fonts-japanese-mincho.ttf ->
/usr/share/fonts/truetype/hanazono/HanaMinA.ttf

fonts-nanum:
/usr/share/fonts/truetype/nanum/NanumGothic.ttf
/usr/share/fonts/truetype/nanum/NanumBarunGothicBold.ttf
/usr/share/fonts/truetype/nanum/NanumMyeongjo.ttf
/usr/share/fonts/truetype/nanum/NanumBarunGothic.ttf

So splitting the fontmap files out of poppler-data and letting the new
binary package depend on the above listed packages should fix this bug.
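
To see which of the packages listed above are present on a given system,
something like the following may help. It is only a sketch: it assumes a
Debian/Ubuntu system with dpkg-query available, and the function names
is_installed and report_font_packages are made up for illustration. The
package names are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if dpkg reports the package as installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'ok installed'
}

# Hypothetical helper: print one status line per CJK font package.
report_font_packages() {
    for pkg in fonts-arphic-ukai fonts-arphic-uming fonts-takao-pgothic \
               fonts-hanazono fonts-nanum; do
        if is_installed "$pkg"; then
            echo "installed: $pkg"
        else
            echo "missing: $pkg"
        fi
    done
}

report_font_packages
```

Any package reported as missing would have to be pulled in by the
dependencies of the proposed new package.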


** Changed in: ghostscript (Ubuntu)
   Status: New => Triaged

** Package changed: ghostscript (Ubuntu) => poppler-data (Ubuntu)



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-05-13 Thread Till Kamppeter
Ghostscript will fall back to DroidSansFallback.ttf, but the question
is whether Poppler will do it, too.



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-05-07 Thread cliddell
Apologies, Till, for the delayed reply - I *thought* replying on a bug
also subscribed me to it, but clearly not! (I have subscribed now).

There is a bit of guesswork here, as I don't fully understand the file
locations.

We are mainly concerned with the cidfmap file. Now, there is a set of
cidfmap files in /etc/ghostscript/cidfmap.d/ and those (it appears)
are used by the /usr/sbin/update-gsfontmap script (poor name, as it
adds to the confusion that Fonts and CIDFonts are the same thing!), to
update the *actual* cidfmap which is in
/var/lib/ghostscript/fonts/cidfmap. It is not at all clear to me how
the update-gsfontmap script gets run - possibly only as a package
post-install step?

The files in /etc/ghostscript/cidfmap.d/ are as follows (file name +
TTF font(s) referenced):

90gs-cjk-resource-cns1.conf - ukai.ttc, uming.ttc
90gs-cjk-resource-gb1.conf - ukai.ttc, uming.ttc
90gs-cjk-resource-japan1.conf - fonts-japanese-mincho.ttf, fonts-japanese-gothic.ttf
90gs-cjk-resource-japan2.conf - ttf-japanese-mincho.ttf, ttf-japanese-gothic.ttf
90gs-cjk-resource-korea1.conf - NanumMyeongjo.ttf, NanumBarunGothic.ttf, NanumBarunGothicBold.ttf, NanumGothic.ttf

NOTE: there is some inconsistency (possibly bitrot) there, with
fonts-japanese-*.ttf used in one file and ttf-japanese-*.ttf used in
another - clearly the same font, but likely different generations of
name.


I see two alternative solutions. One is to augment the Ghostscript package
so that the fonts listed above (with the names and paths updated to reflect
the current directory tree etc) are dependencies, and thus always get
installed with Ghostscript.

*Or* to split off (I *think*) the files in /etc/ghostscript/cidfmap.d/
into something like a ghostscript-cjk-cidfonts package, which has
those fonts listed above as dependencies (again with names and paths
revised for a modern system).
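
A quick way to check whether the font files referenced by these snippets
actually exist is sketched below. It assumes that each mapping names its
font file in a "/Path (...)" entry, as in the Ghostscript cidfmap format,
and that the snippets end in .conf; the function name check_cidfmap_paths
is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list font files referenced by cidfmap snippets
# that do not exist on disk.
# Usage: check_cidfmap_paths [DIR]  (DIR defaults to /etc/ghostscript/cidfmap.d)
check_cidfmap_paths() {
    dir=${1:-/etc/ghostscript/cidfmap.d}
    # Extract the value of every "/Path (...)" entry, one per line,
    # then report each referenced file that is not present.
    cat "$dir"/*.conf 2>/dev/null |
        sed -n 's|.*/Path[[:space:]]*(\([^)]*\)).*|\1|p' |
        sort -u |
        while IFS= read -r font; do
            [ -e "$font" ] || printf 'missing: %s\n' "$font"
        done
}

check_cidfmap_paths "$@"
```

Every line printed as missing points at a mapping whose font file is not
installed, which is the situation described in this bug.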



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-04-01 Thread Till Kamppeter
Chris, thank you for the info.

I would be very grateful for further help. Which font mapping file do I
have to remove/move out into a separate package? Which font mappings (in
which files?) do I have to remove altogether? When packaging Ghostscript
in the many years up to now I did nearly no changes in included font
mappings or fonts as it usually worked, so I do not have much experience
in modifying font mappings.



[Bug 1438494] Re: ghostscript fails to correctly substitute cidf fonts

2015-03-31 Thread cliddell
The root problem here is that the Ubuntu package contains cidfmap
mappings that provide substitutions for various CIDFonts that may not be
embedded in incoming files. But the Ghostscript package does not depend
on the package(s) containing those font files, so those font files are
often not available.

Historically, Ghostscript has assumed that such system level
configuration was correct, and did minimal error checking on them, thus
by the time Ghostscript realises the font files are not available, it is
too late to recover gracefully, and we have to error out.

The most recent Ghostscript releases are more rigorous in that area, and
should cope better.

Nevertheless, it would be preferable if the fonts referenced in the
mappings were made dependencies of the Ghostscript package.

*Or* remove those mappings altogether, as they are much less relevant
since we now have a built-in CIDFont substitution in Ghostscript, using
DroidSansFallback.ttf.

A final suggestion would be to remove the mappings from the default
Ghostscript package, and rely on the DroidSansFallback.ttf substitution,
and move the existing mappings to a separate package which holds the
mapping configuration, and depends on the packages containing the
relevant font files.

I feel the last suggestion would be the most desirable, since the
DroidSansFallback.ttf substitution will work just fine for the vast
majority of people, who don't need to be forced to install a load of
KANJI fonts, but allows the flexibility for those who genuinely need
more accurate CIDFont substitution than simply falling back to
DroidSansFallback.ttf.

Finally, the generic mappings such as Adobe-Identity and
Adobe-Japan1 should be removed altogether as, except in very rare
circumstances, those should all fall through to the
DroidSansFallback.ttf substitution.

I can provide a complete list of those generic mappings if
required.

Chris
