Erik, thanks. Are you able to point me to some good reference sources on tokenizers? I have never come across them before.
I have been scouring the web, and am coming up a decided blank. :)

Cheers,
Brad

Nel vino la verità, nella birra la forza, nell'acqua i bacilli
--------------------------------------------------------------------------
In wine there is truth, in beer there is strength, in water there are bacteria

> From: Erik Price <[EMAIL PROTECTED]>
> Date: Tue, 18 Mar 2003 16:45:47 -0500
> To: Brad Wright <[EMAIL PROTECTED]>
> Cc: PHP General List <[EMAIL PROTECTED]>
> Subject: Re: [PHP] Using PHP to get a word count of a MSword doc
>
> Brad Wright wrote:
>> Thanks for the reply Rene,
>>
>> Any chance of a code sample of how you did this? I'm not at all experienced in
>> Java.
>
> According to the manual, PHP does have some tokenizer functions:
>
> http://www.php.net/manual/en/ref.tokenizer.php
>
> However, the documentation appears to be lacking as they are still under
> development. Using it might be somewhat straightforward if you are
> accustomed to using a tokenizer in another language (like Java) but if
> not, it's really a little too difficult to explain in an email.
>
> A less elegant but ultimately quicker and probably more reliable
> solution might be to investigate some kind of external word-counting
> program that knows how to parse .DOC files (good luck on that part), and
> call this from your PHP script using system(). Catch-22: the only
> libraries I am familiar with that can parse .DOC files are the Jakarta
> POI libraries, which are written in Java. But I am sure that if you
> scour the web you can find some Perl, Python, or maybe even PHP-based
> solution.
>
>
> Erik
>

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php