RE: [PHP] regexp question - extracting wanted ascii characters only?

2001-04-04 Thread Jorg Krause

> From: Erick Papadakis [mailto:[EMAIL PROTECTED]]
> 
> Hi,
> 
> I need to do a simple thing. I want to read a binary
> file (e.g., microsoft word, excel etc) and then
> extract only the text from it. I am using simple
> fopen() and fread() and when I print out the contents
> of the file, it returns me the text but apart from the
> text, there is some junk which is probably because of
> the file being binary. 
> 
> Is it possible through the regexp to specify that I
> only want some of the ASCII characters from the binary
> stream? Here is the perl equivalent: 
> 
> /([\040-\176\s]{3,})/g
> 
> I want only those words that are minimum 3 characters
> and I want the characters to match the ASCII numbers
> from 40 to 176. 
> 
You can use the same regex in PHP as in Perl. Try it here:

http://www.php.comzept.de/rexpr/index.php4

Instead of the /g option, use the function preg_grep() in PHP.
Read the file into an array with file(), then grep through
the array with the regex to get the matching lines. Want a
string? implode() with an empty delimiter.
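
A minimal sketch of the steps just described, assuming the file is
read line by line with file(). The function name and the path argument
are hypothetical. Note that preg_grep() keeps whole lines that contain
a match; it does not cut the matched runs out of a line, so junk bytes
on a kept line stay in the result.

```php
<?php
// Hypothetical helper: keep only the lines of $path that contain a run of
// at least three printable ASCII or whitespace characters.
function printable_lines(string $path): string
{
    // file() returns the file as an array of lines (line endings kept).
    $lines = file($path);
    if ($lines === false) {
        return '';
    }
    // preg_grep() returns the array elements that match the pattern.
    $kept = preg_grep('/[\040-\176\s]{3,}/', $lines);
    // implode() with an empty delimiter joins them back into one string.
    return implode('', $kept);
}
```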

Joerg Krause
**
E-Mail: [EMAIL PROTECTED] Info:www.joerg.krause.net
German Reference Handbook: www.php.comzept.de/referenz
**

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]




Re: [PHP] regexp question - extracting wanted ascii characters only?

2001-04-04 Thread Christian Reiniger

On Wednesday 04 April 2001 11:23, you wrote:

> Is it possible through the regexp to specify that I
> only want some of the ASCII characters from the binary
> stream? Here is the perl equivalent:
>
> /([\040-\176\s]{3,})/g
>
> I want only those words that are minimum 3 characters
> and I want the characters to match the ASCII numbers
> from 40 to 176.

Why do you ask if you already have the solution?

man preg_match
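
For reference, a small sketch of how the Perl pattern might carry over
to preg_match_all(), which collects every match much as /g does in
Perl. The function name and the sample data are made up for
illustration. The numbers in the pattern are octal, so \040-\176
covers decimal 32 (space) through 126 (~).

```php
<?php
// Hypothetical helper: return every run of at least three printable ASCII
// or whitespace characters found in a binary string.
function extract_text_runs(string $data): array
{
    // preg_match_all() finds all matches; $matches[0] holds the full matches.
    preg_match_all('/[\040-\176\s]{3,}/', $data, $matches);
    return $matches[0];
}

// Made-up sample: text fragments separated by non-printable bytes.
$sample = "\x00\x01Hello, world\xFF\x02ab\x00PHP rocks\x03";
print_r(extract_text_runs($sample));
// "Hello, world" and "PHP rocks" are returned; "ab" is too short.
```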

-- 
Christian Reiniger
LGDC Webmaster (http://sunsite.dk/lgdc/)

The use of COBOL cripples the mind; its teaching should, therefore,
be regarded as a criminal offence.

- Edsger W. Dijkstra
