> On Mar 9, 2017, at 14:34, Ruediger Meier <sweet_...@gmx.de> wrote:
> 
> Hi,
> 
> I did some work to port jcc to python3, see
> https://github.com/rudimeier/jcc
> 
> There are two interesting branches, py2 and py3
>  py2  should still work for python2 >=2.7 without any behavior change
>  py3  completes experimental python3 support (but still python2
>       incompatible).
> 
> 
> regarding py2 branch:
>  - fixes many (but not yet all) python3 incompatibilities
>  - should still work for python2 >=2.7 without any behavior change
>  - passes lucene's test suite completely (pylucene-4.10.1/java-1.7
>    and pylucene-6.4.1/java-1.8).
>  - removes compatibility for python < 2.7 (though it would not be hard
>    to keep it)
> 
> 
> regarding py3 branch:
>  - based on py2 it adds a few more patches for python3 support (>3.1)
>    but still in a way which is python2 incompatible
>  - still fails about 25% of pylucene tests.
>  - Note: for testing, I added pylucene's python3 support only trivially,
>    with '2to3  -w  $(find -name "*.py")'
> 
> 
> Please comment on this. I'd like to get python3 support upstream. 
> 
> Some more notes:
> 
> My patches were inspired by the old python-3.1 port
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/python_3/jcc/
> 
> I've refactored/rebased it and split the huge patches into many 
> smaller ones while keeping python2 support. Almost all commits in 
> my py2 branch are trivial to review and independent of each other. So 
> I would be glad if you would merge as many of them as you like.
> 
> Since the py3 branch is still not 100% correct (I guess there are still 
> some Bytes/Unicode problems) I would be glad if somebody would help to 
> get it running.

Thank you for your contribution. I have not looked at it yet but you're now the 
second contributor with python3 support. The one thing I suspect is missing is 
proper python 3.x (x >= 3) <-> java string conversions. Starting with python 
3.3, the internal string representation was changed to choose how many bytes to 
use per unicode char based on the data of the string. One would want to take 
advantage of that to minimize conversions between both languages. Support for 
earlier versions of python 3 is not necessary.
You can take a look at the PyICU sources (github) for the kinds of conversion 
functions I'm referring to.
(function PyUnicode_FromUnicodeString and reverse in common.cpp: 
https://github.com/ovalhub/pyicu/blob/master/common.cpp)
Also, support for python 2 is not necessary in the new branch as python 2 is 
being retired in a few years.
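
To illustrate the string representation point, here is a rough sketch in 
plain python. It is not code from jcc or PyICU, and the function names 
storage_width and java_utf16_units are made up for this example. It assumes 
python 3.3 or later, where a string is stored with 1, 2 or 4 bytes per code 
point depending on the largest code point it contains, and it counts how many 
16-bit code units the same string would need on the java side.

```python
def storage_width(s):
    """Bytes per code point that python 3.3+ would pick for s (assumed rule:
    1 up to U+00FF, 2 up to U+FFFF, 4 above that)."""
    top = max(map(ord, s), default=0)
    if top < 0x100:
        return 1  # Latin-1 range, one byte per code point
    if top < 0x10000:
        return 2  # Basic Multilingual Plane, two bytes per code point
    return 4      # supplementary planes, four bytes per code point


def java_utf16_units(s):
    """Number of 16-bit code units a java String needs to hold s."""
    # Code points above U+FFFF become a surrogate pair (two units) in UTF-16.
    return sum(2 if ord(c) > 0xFFFF else 1 for c in s)


if __name__ == "__main__":
    for text in ["abc", "caf\u00e9", "\u20ac", "a\U0001F600"]:
        print(repr(text), storage_width(text), java_utf16_units(text))
```

When the width is 1 or 2, every code point maps to exactly one java char, so a 
conversion could widen or copy the buffer directly. Only the 4-byte case needs 
surrogate pairs to be produced or merged.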

I have no time right now to spend quality time on jcc/python3 support but it's 
been increasingly on my mind lately and I hope to spend some time on this soon. 
At that point, I'll take a look at your patches and the other contributor's 
(see list archives) as well.

Many thanks for your contribution !

Andi..

> 
> 
> Cheers,
> Rudi
