Terry Reedy, 22.04.2011 05:48:
> On 4/21/2011 8:25 PM, Paul Rubin wrote:
>> Matt Chaput writes:
>>> I'm looking for some code that will take a Snowball program and
>>> compile it into a Python script. Or, less ideally, a Snowball
>>> interpreter written in Python.
>>>
>>> (http://snowball.tartarus.org/)
>>>
>>> Anyone heard of such a thing?

>> I never saw snowball before, it looks kind of interesting, and it
>> looks like it already has a way to compile to C. If you're using
>> it for IR on any scale, you're surely much better off using the C
>> routines with a C API wrapper,

> If the C routines are in a shared library, you should be able to write the
> interface in Python with ctypes.
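
To illustrate the ctypes route, here is a minimal sketch. It assumes the Snowball-generated code has been built as the libstemmer shared library and that the loader can find it under the name "stemmer"; both are assumptions about the local setup. The sb_stemmer_* functions are the ones declared in libstemmer.h, while make_stemmer is only a hypothetical helper name. The function returns None when the library is not available.

```python
import ctypes
import ctypes.util


def make_stemmer(language, library_name="stemmer"):
    """Return a stem(word) function, or None if libstemmer cannot be loaded.

    library_name is an assumption about how the shared library is installed.
    """
    path = ctypes.util.find_library(library_name)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # Prototypes as declared in libstemmer.h.
        lib.sb_stemmer_new.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
        lib.sb_stemmer_new.restype = ctypes.c_void_p
        lib.sb_stemmer_stem.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int]
        lib.sb_stemmer_stem.restype = ctypes.c_void_p
        lib.sb_stemmer_length.argtypes = [ctypes.c_void_p]
        lib.sb_stemmer_length.restype = ctypes.c_int
    except (OSError, AttributeError):
        return None

    # The handle is never released in this sketch; a real wrapper would
    # call sb_stemmer_delete() when it is done with the stemmer.
    handle = lib.sb_stemmer_new(language.encode("ascii"), b"UTF_8")
    if not handle:
        return None  # unknown language or encoding

    def stem(word):
        data = word.encode("utf-8")
        result = lib.sb_stemmer_stem(handle, data, len(data))
        if not result:
            raise MemoryError("sb_stemmer_stem failed")
        # The result buffer belongs to the stemmer, so copy it out right away.
        length = lib.sb_stemmer_length(handle)
        return ctypes.string_at(result, length).decode("utf-8")

    return stem
```

Usage would look like `stem = make_stemmer("english")`, followed by `stem("running")` if the result is not None. Note the UTF-8 encode and decode on every call, which is the overhead a wrapper that works directly on the string buffer would avoid.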

Since it appears that the code has to get compiled anyway, Cython is likely a better option, as it makes it easier to write a fast and Pythonic wrapper.

From a quick look, Snowball also has a "-widechar" option that could allow interfacing directly with Python's Unicode strings in 16-bit ("narrow") Unicode builds (but not in 32-bit builds!). That would make for really fast wrappers that do not even need an intermediate encoding step. And PEP 393 would eventually make it possible to include both a UTF-8 and a 16-bit version of the (prefixed) Snowball code and to switch between them depending on the internal layout of the string being processed, with the obvious fallback to UTF-8 encoding only for strings that really exceed the lower 16-bit Unicode range.
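
The dispatch described above can be sketched in plain Python. This is only an illustration of the decision logic: a real Cython or C wrapper would look at the string's internal representation instead of scanning and re-encoding it. The function name pick_stemmer_path and the variant labels "utf8" and "widechar" are hypothetical.

```python
import sys


def pick_stemmer_path(word):
    """Return (variant, buffer) for a hypothetical two-variant wrapper."""
    highest = max(map(ord, word), default=0)
    if highest < 0x80:
        # Pure ASCII: the UTF-8 bytes are identical to the code points.
        return "utf8", word.encode("ascii")
    if highest <= 0xFFFF:
        # Everything fits in 16-bit units: pass native-endian UTF-16, no BOM.
        # (Lone surrogates would make this encode fail; not handled here.)
        codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
        return "widechar", word.encode(codec)
    # Characters beyond the 16-bit range: fall back to UTF-8.
    return "utf8", word.encode("utf-8")
```

With this sketch, an ASCII word takes the UTF-8 path unchanged, a word such as "größe" takes the 16-bit path, and only a string containing characters above U+FFFF pays for a real UTF-8 encoding step.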

That sounds like a really nice project.

Stefan

--
http://mail.python.org/mailman/listinfo/python-list