Re: search through postscript documents?

2005-03-21 Thread Matej Cepl
[EMAIL PROTECTED] wrote:
 I have tried versions of Ghostscript on Slackware and on Knoppix, a
 Debian derivative. I have downloaded and installed Ghostscript 8.50. I
 have installed the latest pstotext. Nothing works.

Try pdftotext from the xpdf package.
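
For illustration, a minimal sketch of driving pdftotext from Python, working on the PDF directly rather than on the pdf2ps output. The function names and the file name are hypothetical; the only pdftotext features assumed are the `-layout` and `-enc` options and the use of `-` as the output file to write to stdout.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, keep_layout=False):
    """Build a pdftotext command line that writes UTF-8 text to stdout."""
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        cmd.append("-layout")  # try to keep the physical page layout
    cmd += [pdf_path, "-"]     # "-" as the text file means stdout
    return cmd


def extract_text(pdf_path, keep_layout=False):
    """Run pdftotext and return the text, or None if the tool is not installed."""
    if shutil.which("pdftotext") is None:
        return None
    result = subprocess.run(
        pdftotext_command(pdf_path, keep_layout),
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout
```

Calling `extract_text("customer.pdf")` (a made-up file name) would return the text of the whole document as one string, if pdftotext can read the file at all.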

-- 
Matej Cepl, http://www.ceplovi.cz/matej
GPG Finger: 89EF 4BC6 288A BF43 1BAB  25C3 E09F EF25 D964 84AC
138 Highland Ave. #10, Somerville, Ma 02143, (617) 623-1488
 
His mother should have thrown him away and kept the stork.
  -- Mae West



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: search through postscript documents?

2005-03-05 Thread john

Antonio Rodriguez wrote:
 On Thu, Mar 03, 2005 at 12:15:40PM +0100, Joerg Reckers wrote:
  Is there a way (a program) to search for expressions in a PostScript document,
  and to copy and paste words out of a Ghostview program as text?
 
  I am using KGhostView now and I am missing these features, so I am asking
  on this list. :-)
 
  thanks, joerg

 Package: pstotext
 Priority: optional
 Section: text
 Installed-Size: 110
 Maintainer: J.H.M. Dassen (Ray) [EMAIL PROTECTED]
 Architecture: i386
 Version: 1.9-1
 Depends: gs | gs-aladdin (>= 3.51), libc6 (>= 2.3.2.ds1-4)
 Filename: pool/main/p/pstotext/pstotext_1.9-1_i386.deb
 Size: 32294
 MD5sum: a159e4b756759beeae003700d31487d1
 Description: Extract text from PostScript and PDF files
  pstotext extracts text (in the ISO 8859-1 character set) from a PostScript
  or PDF (Portable Document Format) file. Thus, pstotext is similar to the
  ps2ascii program that comes with ghostscript. The output of pstotext is
  however better than that of ps2ascii, because pstotext deals better with
  punctuation and ligatures.

I have a PDF file produced by a recent version of InDesign CS. The
utility pdf2ps will produce a PostScript file that is readable using
GV. However, from that point on everything fails. There seems to be no
way to convert the file to ASCII except by cutting pages and pasting
them into, e.g., Gvim.

I have tried creating a subset of the pages and then converting that.
What I get is just the EOP characters.

This is the second such file I have had trouble with. It may have
something to do with PDF 1.5. In any case this is a customer's
file and I can't very well ask him to resave his PDF to an earlier
version.

I have tried versions of Ghostscript on Slackware and on Knoppix, a
Debian derivative. I have downloaded and installed Ghostscript 8.50. I
have installed the latest pstotext. Nothing works.

John Culleton





search through postscript documents?

2005-03-03 Thread Joerg Reckers
Is there a way (a program) to search for expressions in a PostScript document,
and to copy and paste words out of a Ghostview program as text?

I am using KGhostView now and I am missing these features, so I am asking
on this list. :-)

thanks, joerg





Re: search through postscript documents?

2005-03-03 Thread Tomas Pospisek's Mailing Lists
On Thu, 3 Mar 2005, Joerg Reckers wrote:
 Is there a way (a program) to search for expressions in a PostScript document,
 and to copy and paste words out of a Ghostview program as text?

AFAIK:
PostScript doesn't preserve word, sentence, etc. boundaries, so there's no
way to reliably know what a word inside a PS file is. One of the goals of PDF
is to remedy exactly this problem.
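
To make the boundary problem concrete, here is a simplified sketch of the kind of heuristic a text extractor might use: a PostScript file only says where each string fragment is painted, so words have to be guessed from the gaps between fragments. The fragment format, the function name and the threshold are all assumptions for illustration, not the documented behaviour of pstotext or any other tool.

```python
def join_fragments(fragments, gap_threshold=1.0):
    """Rebuild words from (x_start, x_end, text) fragments on one text line.

    Consecutive fragments are glued together unless the horizontal gap
    between them is wider than gap_threshold, which is taken as a space.
    """
    words = []
    current = ""
    prev_end = None
    for x_start, x_end, text in sorted(fragments):
        if prev_end is not None and x_start - prev_end > gap_threshold:
            words.append(current)  # wide gap: the previous word is finished
            current = ""
        current += text
        prev_end = x_end
    if current:
        words.append(current)
    return words
```

With kerned or justified text the gaps vary, so any fixed threshold will sometimes split a word or merge two, which is why the result is never fully reliable.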
*t

--
---
  Tomas Pospisek
  http://sourcepole.com -  Linux & Open Source Solutions
---


Re: search through postscript documents?

2005-03-03 Thread Antonio Rodriguez
On Thu, Mar 03, 2005 at 12:15:40PM +0100, Joerg Reckers wrote:
 Is there a way (a program) to search for expressions in a PostScript document,
 and to copy and paste words out of a Ghostview program as text?
 
 I am using KGhostView now and I am missing these features, so I am asking
 on this list. :-)
 
 thanks, joerg

Package: pstotext
Priority: optional
Section: text
Installed-Size: 110
Maintainer: J.H.M. Dassen (Ray) [EMAIL PROTECTED]
Architecture: i386
Version: 1.9-1
Depends: gs | gs-aladdin (>= 3.51), libc6 (>= 2.3.2.ds1-4)
Filename: pool/main/p/pstotext/pstotext_1.9-1_i386.deb
Size: 32294
MD5sum: a159e4b756759beeae003700d31487d1
Description: Extract text from PostScript and PDF files
 pstotext extracts text (in the ISO 8859-1 character set) from a PostScript
 or PDF (Portable Document Format) file. Thus, pstotext is similar to the
 ps2ascii program that comes with ghostscript. The output of pstotext is
 however better than that of ps2ascii, because pstotext deals better with
 punctuation and ligatures.


So you can pipe the output to a shell script that uses sed or gawk to
process the text.
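
The same idea can be sketched in Python instead of sed or gawk: take the extracted text and report the lines that match an expression. The function name is hypothetical, and the page numbering assumes the extractor separates pages with form feed characters, which may not hold for every tool.

```python
import re


def search_text(text, pattern, ignore_case=True):
    """Return (page, line_number, line) for every line matching the regex."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    hits = []
    # Assumption: pages in the extracted text are separated by form feeds.
    for page_no, page in enumerate(text.split("\f"), start=1):
        for line_no, line in enumerate(page.splitlines(), start=1):
            if regex.search(line):
                hits.append((page_no, line_no, line.strip()))
    return hits
```

To use it at the end of a pipe, a small script could pass `sys.stdin.read()` as `text` and print each hit.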

