Your message dated Thu, 29 Nov 2012 13:02:29 +0000
with message-id <[email protected]>
and subject line Bug#692741: fixed in herold 6.0.3-1
has caused the Debian Bug report #692741,
regarding Better support for pdftohtml output (specific profile?)
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
692741: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=692741
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: herold
Version: 6.0.2-1
Severity: normal

It would be really nice if there were a profile for pdftohtml output.
Currently, pdftohtml generates something like:

<b>Scope</b><br>
TIFF describes image data that typically comes from scanners, frame grabbers,<br>and paint- and photo-retouching programs.<br>
TIFF is not a printer language or page description language. The purpose of TIFF<br>is to describe and store raster image data.<br>
A primary goal of TIFF is to provide a rich environment within which applica-<br>tions can exchange image data. This richness is required to take advantage of the<br>varying capabilities of scanners and other imaging devices.<br>
Though TIFF is a rich format, it can easily be used for simple scanners and appli-<br>cations as well because the number of required fields is small.<br>
TIFF will be enhanced on a continuing basis as new imaging needs arise. A high<br>priority has been given to structuring TIFF so that future enhancements can be<br>added without causing unnecessary hardship to developers.<br>

which gets converted into (no profile):

  <para><emphasis remap="b:86:2" role="bold">Scope</emphasis></para>
  <para> TIFF describes image data that typically comes from scanners, frame grabbers,</para>
  <para>and paint- and photo-retouching programs.</para>
  <para> TIFF is not a printer language or page description language. The purpose of TIFF</para>
  <para>is to describe and store raster image data.</para>
  <para> A primary goal of TIFF is to provide a rich environment within which applica-</para>
  <para>tions can exchange image data. This richness is required to take advantage of the</para>
  <para>varying capabilities of scanners and other imaging devices.</para>
  <para> Though TIFF is a rich format, it can easily be used for simple scanners and appli-</para>
  <para>cations as well because the number of required fields is small.</para>
  <para> TIFF will be enhanced on a continuing basis as new imaging needs arise. A high</para>
  <para>priority has been given to structuring TIFF so that future enhancements can be</para>
  <para>added without causing unnecessary hardship to developers.</para>

This makes the output difficult to use in DocBook: every <br> becomes its own
<para>, so a single paragraph is split into many.
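
To illustrate the kind of grouping being asked for, here is a minimal Python
sketch. It is not herold's implementation, and the function name
merge_trapped_br is made up for this example. It assumes that, in pdftohtml
output, a <br> that ends a line closes a paragraph, while a <br> followed
directly by text is only a soft wrap (any extra wrapping added by mail
transport is ignored). It also assumes that a hyphen immediately before a soft
wrap marks a word split across lines, which will occasionally be wrong for
genuinely hyphenated words.

```python
import re


def merge_trapped_br(html_fragment: str) -> list[str]:
    """Group pdftohtml-style <br>-separated text into paragraph strings."""
    paragraphs = []
    # Assumption: "<br>" followed by a newline ends a paragraph.
    for chunk in re.split(r"<br\s*/?>[ \t]*\n", html_fragment, flags=re.IGNORECASE):
        # Rejoin words hyphenated across a soft wrap ("applica-<br>tions").
        chunk = re.sub(r"(?<=\w)-<br\s*/?>(?=\w)", "", chunk, flags=re.IGNORECASE)
        # Any remaining <br> is a soft wrap: turn it into a space.
        chunk = re.sub(r"<br\s*/?>", " ", chunk, flags=re.IGNORECASE)
        # Collapse runs of whitespace left over from the joins.
        chunk = " ".join(chunk.split())
        if chunk:
            paragraphs.append(chunk)
    return paragraphs
```

Each returned string would then map to a single <para> instead of one <para>
per <br>.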

Also, pdftohtml extracts the PDF metadata and places it into HTML META
elements, e.g.:

<HEAD>
<TITLE>TIFF6.final.9509</TITLE>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<META name="generator" content="pdftohtml 0.36">
<META name="author" content="Adobe Systems Inc.">
<META name="keywords" content="TIFF,,.TIF,,TIF">
<META name="date" content="1995-09-14T14:32:50+00:00">
<META name="subject" content="TIFF 6.0">
</HEAD>

It would be really nice to have them in docbook/info!
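
As a rough idea of what such a mapping could look like, here is a Python
sketch using only the standard library. It is not how herold does it. The names
HeadMetadataParser and build_docbook_info are invented for this example, and
the choice of target elements (title, author/personname, date, keywordset,
subjectset) is just one plausible mapping onto DocBook 5 info. The DocBook
namespace is left out for brevity.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class HeadMetadataParser(HTMLParser):
    """Collect <TITLE> text and <META name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_docbook_info(html: str) -> ET.Element:
    """Build an <info> element from the HTML head (hypothetical mapping)."""
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()

    info = ET.Element("info")
    if parser.title.strip():
        ET.SubElement(info, "title").text = parser.title.strip()
    if "author" in parser.meta:
        author = ET.SubElement(info, "author")
        ET.SubElement(author, "personname").text = parser.meta["author"]
    if "date" in parser.meta:
        ET.SubElement(info, "date").text = parser.meta["date"]
    # pdftohtml may emit empty entries ("TIFF,,.TIF"), so drop blanks.
    keywords = [k.strip() for k in parser.meta.get("keywords", "").split(",")]
    keywords = [k for k in keywords if k]
    if keywords:
        keywordset = ET.SubElement(info, "keywordset")
        for keyword in keywords:
            ET.SubElement(keywordset, "keyword").text = keyword
    if "subject" in parser.meta:
        subject = ET.SubElement(ET.SubElement(info, "subjectset"), "subject")
        ET.SubElement(subject, "subjectterm").text = parser.meta["subject"]
    return info
```

Calling ET.tostring(build_docbook_info(html), encoding="unicode") on the head
above would give an info element that could be placed at the top of the
generated document.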

Thanks

-- System Information:
Debian Release: 6.0.6
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable'), (200, 'testing'), (100, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-0.bpo.3-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages herold depends on:
ii  antlr3                3.2-5              language tool for constructing rec
ii  libcommons-codec-java 1.4-2              encoder and decoders such as Base6
ii  libcommons-jxpath-jav 1.3-3              manipulate javabean using XPath sy
ii  libcommons-logging-ja 1.1.1-8            commmon wrapper interface for seve
ii  libxml-commons-resolv 1.2-7~bpo60+1      XML entity and URI resolver librar
ii  libxmlgraphics-common 1.4.dfsg-4~bpo60+1 reusable components used by Batik 

herold recommends no packages.

herold suggests no packages.

-- debconf-show failed

--- End Message ---
--- Begin Message ---
Source: herold
Source-Version: 6.0.3-1

We believe that the bug you reported is fixed in the latest version of
herold, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to [email protected],
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Mathieu Malaterre <[email protected]> (supplier of updated herold package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing [email protected])


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.8
Date: Thu, 29 Nov 2012 13:00:52 +0100
Source: herold
Binary: herold
Architecture: source all
Version: 6.0.3-1
Distribution: unstable
Urgency: low
Maintainer: Debian XML/SGML Group <[email protected]>
Changed-By: Mathieu Malaterre <[email protected]>
Description: 
 herold     - HTML to DocBook XML conversion
Closes: 689309 691170 692543 692544 692545 692547 692741 692756 693214
Changes: 
 herold (6.0.3-1) unstable; urgency=low
 .
   * New upstream. Closes: #692544
   * option --logging-level was removed. Closes: #691170
   * preserve lang attribute. Closes: #692543
   * Fix herold help output. Closes: #692545
   * pdftohtml support (detect-trapped-br=false to revert). Closes: #692741
   * Fix invalid XML element. Closes: #693214
   * Remove d/p/antencoding.patch, applied upstream
   * man page is provided. Closes: #689309
   * Fix herold.home location. Closes: #692547
   * Support section element. Closes: #692756
Checksums-Sha1: 
 6c817dfcb2e75c10be77de4ff2ea4cdeb1334953 2129 herold_6.0.3-1.dsc
 e86da044f8d13c76be646f97ea6f85558966282f 202700 herold_6.0.3.orig.tar.gz
 79e75e740d209277f31f50c0760eded51197f20d 16177 herold_6.0.3-1.debian.tar.gz
 ca3f28e61570ce9b37251ba5ebea395415deade2 459316 herold_6.0.3-1_all.deb
Checksums-Sha256: 
 095435019500a1c3bc74f98f0cc35b90960b28461e2035deb38be6149fd1bd53 2129 
herold_6.0.3-1.dsc
 8abfaa4fb560f38d09d6d8bbdc37e1a0a62bd668a68a810c6d76a3d4a26d57af 202700 
herold_6.0.3.orig.tar.gz
 cf47b9198ab3927719844d6b18fa98a9449c0f8b9e036aa381e7fe90fda34fd0 16177 
herold_6.0.3-1.debian.tar.gz
 7e1ee759750baea9e6e8d480e4893c640019624775def603ce73313b719c6922 459316 
herold_6.0.3-1_all.deb
Files: 
 bd0e2e9a0c5be03ce2b199de769a8066 2129 java optional herold_6.0.3-1.dsc
 32835ab1b73ea6286ab5d15f0e0649c0 202700 java optional herold_6.0.3.orig.tar.gz
 80c668b907f5ef3a3ecfbc5ebd00618c 16177 java optional 
herold_6.0.3-1.debian.tar.gz
 321b81267cf0b698340eb8da4f21a693 459316 java optional herold_6.0.3-1_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQIcBAEBAgAGBQJQt1o6AAoJEAFx4YKK4JNFif8P/2djOkbC/tK/h5oKh9fWHPkz
QEzOMHm+SD6jsPdaY6mgwHgKGjzcc6oucc+dJwxtHnFiXTRH8wQH9G05z2t0pI0N
5yZT4zTlARvyoKXcb3WiyKuzWDg63niyKuBXdZ4ULUl+FYx7M1rKCl4An7fJTh17
nlZZsJdvSOohFKyix7ZhsFisU2RZX/AscSNsAVd2p/Z/sX7iHWHsetyVVu1099wj
cB+1jvP7iXgW+cwqZxY8Olo8ZIGj/9FduSEZj0KRZXqP9RRYrsD3+47+jkb53rLx
PARc3kdvEaxAmnKf+bbNJZyphfz8R0zduMBBe09C5kuEBNM7fQQtm/IktI6Nm15c
NTd3Skg17tL3BYlzF16BLLsz/n/a6UY3ufyECwCzPv5uvNx4CApc15kmlN1/IGx/
SbSA105TlDQludA8xA9qKxjeNsFLJM80R8QchpZUs9nwFK50uRr+adWvFjv4WaYz
3XqY9ZPbSoND85Wq4pcAnmC/T52+UxL9eOm027AvSYvm1KG930UAD/xkR/DE099N
+UxoqDhcFzg65JuTC8GtSufKYndIKRfc9SW75QfCB2APEF394Z+yDn3jfyLCNWw5
7CrA53QoBjb2wKLgnXpKr3LVGCXMoHKLpMjf89RtVcxz/hwE2zvYpYC3CsI8xlkr
YH6oS4RoQg5Ne4raU8m+
=VocK
-----END PGP SIGNATURE-----

--- End Message ---
