After seeing my name in lights on the weekly PHP summary, I decided to go 
back into that Porter extension I wrote about earlier and clean things 
up a bit. 

I stripped out all of the C++ stuff to make it a bit of an easier fit with 
PHP's C code (and it seems to be running slightly faster to boot).

For now, it can only stem English-language words, but after reading some 
interesting work by Dr. Porter at snowball.sourceforge.com, I'm thinking 
I'll start adding more languages soon. (Most notably, since I work on a 
Canadian web application, Francais is forthcoming.)

For now, the prototype of the lone function in the extension is

string porter(string word)

which takes a word, uppercases it, removes any suffixes and returns the 
word's stem, or "-1" on any sort of failure. (I'm thinking of just returning 
the word itself, uppercased and unchanged, if there's some sort of failure 
-- comments?)
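
To make that contract concrete, here is a rough sketch in C. It is not the 
extension's actual source: porter_sketch() and ends_with() are hypothetical 
names, the failure conditions (empty or non-alphabetic input, or an output 
buffer that is too small) are assumptions, and only step 1a of the Porter 
algorithm (SSES -> SS, IES -> I, SS -> SS, S -> nothing) is applied. Words 
of one or two letters are left unstemmed, which is also an assumption.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: does the first `len` bytes of `s` end with `suffix`? */
static int ends_with(const char *s, size_t len, const char *suffix)
{
    size_t slen = strlen(suffix);
    return len >= slen && memcmp(s + len - slen, suffix, slen) == 0;
}

/*
 * Hypothetical sketch of the porter() contract, not the real extension code.
 * Uppercases `word` into `out`, applies Porter step 1a only, and leaves
 * "-1" in `out` on failure. Returns 0 on success, -1 on failure.
 */
static int porter_sketch(const char *word, char *out, size_t outlen)
{
    size_t len, i;

    if (out == NULL || outlen < 3)
        return -1;                      /* no room even for "-1" */
    strcpy(out, "-1");                  /* assume failure until proven otherwise */

    if (word == NULL)
        return -1;
    len = strlen(word);
    if (len == 0 || len >= outlen)
        return -1;                      /* assumed failure: empty or too long */

    for (i = 0; i < len; i++) {
        if (!isalpha((unsigned char)word[i]))
            return -1;                  /* assumed failure: non-letter input */
    }

    for (i = 0; i < len; i++)
        out[i] = (char)toupper((unsigned char)word[i]);

    /* Porter step 1a; very short words are passed through unstemmed. */
    if (len > 2) {
        if (ends_with(out, len, "SSES"))
            len -= 2;                   /* CARESSES -> CARESS */
        else if (ends_with(out, len, "IES"))
            len -= 2;                   /* PONIES -> PONI */
        else if (ends_with(out, len, "SS"))
            ;                           /* CARESS -> CARESS */
        else if (ends_with(out, len, "S"))
            len -= 1;                   /* CATS -> CAT */
    }
    out[len] = '\0';
    return 0;
}
```

With this sketch, "caresses" comes back as "CARESS", "ponies" as "PONI" and 
"cats" as "CAT", while "" or "it's" leaves "-1" in the buffer. Returning the 
uppercased word instead of "-1" would mean moving the validation after the 
uppercasing loop and dropping the early returns.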

If I get around to adding languages, I'm thinking the prototype should be 
changed to something like

string porter(string word [, string lang])

where the optional lang argument will be some language code, such as EN, 
FR or RU, with EN as the default.
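
One way the optional lang argument could be resolved on the C side is 
sketched below. Everything here is an assumption for illustration: the 
porter_lang enum, the lang_table contents and porter_lang_lookup() are 
hypothetical names, and treating an empty string like a missing argument 
(both fall back to EN) and matching codes case-insensitively are guesses, 
not decided behaviour.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical language identifiers; only the codes named in the post. */
typedef enum {
    LANG_UNKNOWN = -1,
    LANG_EN = 0,
    LANG_FR,
    LANG_RU
} porter_lang;

/* Hypothetical lookup table mapping codes to identifiers. */
static const struct {
    const char *code;
    porter_lang lang;
} lang_table[] = {
    { "EN", LANG_EN },
    { "FR", LANG_FR },
    { "RU", LANG_RU },
};

/* Case-insensitive comparison of two language codes. */
static int code_equals(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Resolve an optional language code. A NULL or empty code falls back to
 * EN (assumed default); an unrecognised code yields LANG_UNKNOWN so the
 * caller can decide how to report the failure.
 */
static porter_lang porter_lang_lookup(const char *lang)
{
    size_t i;

    if (lang == NULL || *lang == '\0')
        return LANG_EN;

    for (i = 0; i < sizeof lang_table / sizeof lang_table[0]; i++) {
        if (code_equals(lang, lang_table[i].code))
            return lang_table[i].lang;
    }
    return LANG_UNKNOWN;
}
```

Keeping the codes in a table means a new language only needs one extra row 
plus its stemmer, and the one-argument form of porter() keeps working 
unchanged.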

Comments? Queries?

J

-- 
PHP Development Mailing List <http://www.php.net/>
