Terry Reedy, 22.04.2011 05:48:
> On 4/21/2011 8:25 PM, Paul Rubin wrote:
>> Matt Chaput writes:
>>> I'm looking for some code that will take a Snowball program and
>>> compile it into a Python script. Or, less ideally, a Snowball
>>> interpreter written in Python.
>>> (http://snowball.tartarus.org/)
>>> Anyone heard of such a thing?
>> I never saw snowball before, it looks kind of interesting, and it
>> looks like it already has a way to compile to C. If you're using
>> it for IR on any scale, you're surely much better off using the C
>> routines with a C API wrapper,
> If the C routines are in a shared library, you should be able to write the
> interface in Python with ctypes.
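
A minimal sketch of that ctypes route, for comparison. It assumes the compiled stemmers are installed as a shared library that `ctypes.util.find_library("stemmer")` can locate (the library name is an assumption), and it uses the `sb_stemmer_*` entry points declared in libstemmer.h. The helper names `load_libstemmer` and `stem` are made up for illustration.

```python
import ctypes
import ctypes.util


def load_libstemmer():
    """Return a configured libstemmer handle, or None if it is unavailable."""
    path = ctypes.util.find_library("stemmer")  # assumed library name
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # Prototypes follow the declarations in libstemmer.h.
        lib.sb_stemmer_new.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
        lib.sb_stemmer_new.restype = ctypes.c_void_p
        lib.sb_stemmer_stem.argtypes = [ctypes.c_void_p, ctypes.c_char_p,
                                        ctypes.c_int]
        lib.sb_stemmer_stem.restype = ctypes.c_void_p
        lib.sb_stemmer_length.argtypes = [ctypes.c_void_p]
        lib.sb_stemmer_length.restype = ctypes.c_int
        lib.sb_stemmer_delete.argtypes = [ctypes.c_void_p]
        lib.sb_stemmer_delete.restype = None
    except (OSError, AttributeError):
        # Not loadable, or some other library that happens to share the name.
        return None
    return lib


def stem(lib, word, algorithm="english"):
    """Stem one word, paying for a UTF-8 encode and decode on every call."""
    handle = lib.sb_stemmer_new(algorithm.encode("ascii"), b"UTF_8")
    if not handle:
        raise ValueError("unknown algorithm: %r" % algorithm)
    try:
        data = word.encode("utf-8")
        ptr = lib.sb_stemmer_stem(handle, data, len(data))
        if not ptr:
            raise MemoryError("sb_stemmer_stem failed")
        # The result buffer belongs to the stemmer, so copy it before freeing.
        size = lib.sb_stemmer_length(handle)
        return ctypes.string_at(ptr, size).decode("utf-8")
    finally:
        lib.sb_stemmer_delete(handle)


lib = load_libstemmer()
if lib is not None:
    print(stem(lib, "running"))
```

Creating and deleting a stemmer per call keeps the sketch short; a real wrapper would keep one handle per language around.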
Since it appears that the code has to get compiled anyway, Cython is likely
a better option, as it makes it easier to write a fast and Pythonic wrapper.
From a quick look, Snowball also has a "-widechar" option that could allow
interfacing directly with Python's Unicode strings in narrow (16-bit) Unicode
builds, but not in wide (32-bit) builds. That would make for really fast
wrappers that do not even need an intermediate encoding step. PEP 393 would
eventually make it possible to include both a UTF-8 and a 16-bit version of
the (prefixed) Snowball code and to switch between them depending on the
internal layout of the processed string, falling back to UTF-8 encoding only
for strings that contain characters outside the 16-bit Unicode range.
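
The dispatch could be sketched in pure Python roughly as follows. Python code cannot see the internal string layout, so the sketch approximates it with the highest code point in the word; a C or Cython wrapper would read the layout directly and hand over the existing buffer without copying. The function name `choose_backend` and the labels "utf8" and "widechar" are made up for illustration.

```python
def choose_backend(word):
    """Return (variant, buffer) for a hypothetical two-variant stemmer build.

    "utf8" stands for the UTF-8 build of the Snowball code and "widechar"
    for the 16-bit build; both labels are illustrative only.
    """
    top = max(map(ord, word), default=0)
    if top < 0x80:
        # Pure ASCII text is already valid UTF-8, byte for byte.
        return "utf8", word.encode("ascii")
    if top <= 0xFFFF:
        # Every character fits in one 16-bit unit, so the 16-bit build applies.
        return "widechar", word.encode("utf-16-le")
    # Characters beyond the 16-bit range: fall back to a real UTF-8 encode.
    return "utf8", word.encode("utf-8")


for sample in ("running", "na\u00efve", "\U0001d4b3"):
    variant, buffer = choose_backend(sample)
    print(variant, len(buffer))
```

Only the last branch does work that a native wrapper could not avoid, which is the point of shipping both variants.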
That sounds like a really nice project.
Stefan