Re: Changing Python class/module layout, dropping --rename ?

2012-08-18 Thread Andi Vajda


On Wed, 15 Aug 2012, Roman Chyla wrote:


The full names make things easier and I would also think that the top
level 'java' package should be optional.

Many thanks from me too!


Cool, so the PyLucene 4.0 build is now using --use_full_names !
The long test and sample rewrite has begun.

Currently, only test/test_Analyzers.py passes. All other tests and samples 
remain to be ported to the new API, and their imports fixed to fit the new 
generated Python module tree, which follows the Lucene Java package tree.
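
For anyone porting a test, the import fix-up can be partly mechanized. The following is only a rough sketch, not part of PyLucene or JCC: the `JAVA_PACKAGES` table and the `rewrite_flat_import` helper are hypothetical names made up for illustration, and the table would have to be filled in by hand for the classes a given test uses.

```python
import re
from collections import defaultdict

# Hypothetical lookup table: simple class name -> Java package.
# A real port needs an entry for every wrapped class a test imports.
JAVA_PACKAGES = {
    "Document": "org.apache.lucene.document",
    "Field": "org.apache.lucene.document",
    "IndexSearcher": "org.apache.lucene.search",
}

FLAT_IMPORT = re.compile(r"^from lucene import (.+)$")


def rewrite_flat_import(line, packages=JAVA_PACKAGES):
    """Rewrite 'from lucene import A, B' into one import line per Java package.

    Names that are not in the table (module-level helpers, for example)
    are left on a 'from lucene import ...' line. Lines that are not flat
    lucene imports are returned unchanged.
    """
    match = FLAT_IMPORT.match(line.strip())
    if match is None:
        return [line]
    by_package = defaultdict(list)
    leftover = []
    for name in (n.strip() for n in match.group(1).split(",")):
        if name in packages:
            by_package[packages[name]].append(name)
        else:
            leftover.append(name)
    result = []
    if leftover:
        result.append("from lucene import " + ", ".join(leftover))
    for package in sorted(by_package):
        names = ", ".join(by_package[package])
        result.append("from %s import %s" % (package, names))
    return result


for new_line in rewrite_flat_import("from lucene import Document, IndexSearcher, initVM"):
    print(new_line)
```

The sketch only handles single-line imports without aliases or parentheses; anything else would still need to be fixed by hand.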


Volunteers to help with this would be appreciated !

If you're stumped by a Lucene API change, check the detailed CHANGES.txt and 
even more detailed MIGRATE.txt files for examples and instructions on how to 
port to the new Lucene 4.0 API.


http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/lucene/CHANGES.txt
http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/lucene/MIGRATE.txt

If you have questions about how to express some of these things with 
Python and JCC, please don't hesitate to ask here.


To avoid having multiple people working on the same files at the same time, 
please send a brief message to the list that you're taking on a particular 
test or directory of tests/samples.


Andi..


Re: Changing Python class/module layout, dropping --rename ?

2012-08-15 Thread Roman Chyla
The full names make things easier and I would also think that the top
level 'java' package should be optional.

Many thanks from me too!

roman

On Wed, Aug 15, 2012 at 1:23 PM, Petrus Hyvönen wrote:
> No objections, my build parameter list got significantly more readable using 
> the full names and managing without the renames. I think it also makes it 
> easier for entry level users to use the jcc wrapper as one doesn't have to 
> track down the duplicate names.
>
> Regarding the addition of a  top level 'java' package, I would prefer it to 
> be optional, so that it is possible to get close to the java examples for the 
> wrapped library.
>
> Many thanks for the work,
> /Petrus
>
>
> On 15 aug 2012, at 11:52, Andi Vajda  wrote:
>
>>
>> If there are no objections to the new module layout for Python wrappers 
>> around Java classes that follows the Java package structure, I'd like to 
>> switch the PyLucene 4.0 build to --use_full_names by default.
>>
>> It makes for longer import statements but eliminates all --rename and 
>> --exclude uses from the current PyLucene jcc command line.
>>
>> Any objections, comments, suggestions ?
>>
>> Andi..
>>
>> On Thu, 2 Aug 2012, Andi Vajda wrote:
>>
>>>
>>> On Wed, 25 Jul 2012, Roman Chyla wrote:
>>>
>>>> I am now using the --use_full_names and it works without greater
>>>> problems, even against the latest lucene trunk
>>>> The only nuisance is that modules defined in java take over modules
>>>> defined in python (I happened to have one name which was the same for
>>>> both, so I renamed the java package)
>>>
>>> Maybe a top level 'java' package should be added to all the packages created
>>> when --use_full_names is used ?
>>> Thus
>>> >>> from org.apache.lucene.document import Document
>>> would become
>>> >>> from java.org.apache.lucene.document import Document
>>>
>>> It's even more typing but a little less intrusive on existing package
>>> names ?
>>>
>>> Andi..
>>>
>


Re: Changing Python class/module layout, dropping --rename ?

2012-08-15 Thread Petrus Hyvönen
No objections, my build parameter list got significantly more readable using 
the full names and managing without the renames. I think it also makes it 
easier for entry level users to use the jcc wrapper as one doesn't have to 
track down the duplicate names.

Regarding the addition of a  top level 'java' package, I would prefer it to be 
optional, so that it is possible to get close to the java examples for the 
wrapped library.

Many thanks for the work,
/Petrus


On 15 aug 2012, at 11:52, Andi Vajda  wrote:

> 
> If there are no objections to the new module layout for Python wrappers 
> around Java classes that follows the Java package structure, I'd like to 
> switch the PyLucene 4.0 build to --use_full_names by default.
> 
> It makes for longer import statements but eliminates all --rename and 
> --exclude uses from the current PyLucene jcc command line.
> 
> Any objections, comments, suggestions ?
> 
> Andi..
> 
> On Thu, 2 Aug 2012, Andi Vajda wrote:
> 
>> 
>> On Wed, 25 Jul 2012, Roman Chyla wrote:
>> 
>>> I am now using the --use_full_names and it works without greater
>>> problems, even against the latest lucene trunk
>>> The only nuisance is that modules defined in java take over modules
>>> defined in python (I happened to have one name which was the same for
>>> both, so I renamed the java package)
>> 
>> Maybe a top level 'java' package should be added to all the packages created
>> when --use_full_names is used ?
>> Thus
>> >>> from org.apache.lucene.document import Document
>> would become
>> >>> from java.org.apache.lucene.document import Document
>> 
>> It's even more typing but a little less intrusive on existing package
>> names ?
>> 
>> Andi..
>> 



Re: Changing Python class/module layout, dropping --rename ?

2012-08-15 Thread Andi Vajda


If there are no objections to the new module layout for Python wrappers 
around Java classes that follows the Java package structure, I'd like to 
switch the PyLucene 4.0 build to --use_full_names by default.


It makes for longer import statements but eliminates all --rename and 
--exclude uses from the current PyLucene jcc command line.


Any objections, comments, suggestions ?

Andi..

On Thu, 2 Aug 2012, Andi Vajda wrote:



On Wed, 25 Jul 2012, Roman Chyla wrote:


I am now using the --use_full_names and it works without greater
problems, even against the latest lucene trunk

The only nuisance is that modules defined in java take over modules
defined in python (I happened to have one name which was the same for
both, so I renamed the java package)


Maybe a top level 'java' package should be added to all the packages created
when --use_full_names is used ?
Thus
>>> from org.apache.lucene.document import Document
would become
>>> from java.org.apache.lucene.document import Document

It's even more typing but a little less intrusive on existing package
names ?

Andi..



Re: Changing Python class/module layout, dropping --rename ?

2012-08-02 Thread Andi Vajda


On Wed, 25 Jul 2012, Roman Chyla wrote:


I am now using the --use_full_names and it works without greater
problems, even against the latest lucene trunk

The only nuisance is that modules defined in java take over modules
defined in python (I happened to have one name which was the same for
both, so I renamed the java package)


Maybe a top level 'java' package should be added to all the packages created
when --use_full_names is used ?
Thus
 >>> from org.apache.lucene.document import Document
would become
 >>> from java.org.apache.lucene.document import Document

It's even more typing but a little less intrusive on existing package
names ?

Andi..


Re: Changing Python class/module layout, dropping --rename ?

2012-07-19 Thread Andi Vajda


On Thu, 19 Jul 2012, Roman Chyla wrote:


The script must have thought about it somehow :-) Have a great,
undisturbed vacation!


In rev 1363436 of jcc, I implemented support for the simplest version of the 
proposal via a new command line flag, off by default, called 
--use_full_names.
When --use_full_names is used, the wrapped classes get installed into a 
Python module hierarchy that parallels the Java one.


For example:

  >>> import lucene
  >>> lucene.initVM()
  
  >>> from org.apache.lucene.document import Document
  >>> Document()
  >
  >>>

Andi..



roman

On Thu, Jul 19, 2012 at 9:33 AM, Andi Vajda  wrote:


On Fri, 13 Jul 2012, Roman Chyla wrote:


Hi,
I was playing with the idea of creating virtual packages, attached is a
working script that illustrates it. I am getting this output:

Did it work?



No, I haven't forgotten, I'm just on vacation.

Andi..



==
from org.apache.lucene.search import SearcherFactory; print SearcherFactory

from org.apache.lucene.analysis import Analyzer as Banalyzer; print Banalyzer

print sys.modules['org'] 
print sys.modules['org.apache'] 
print sys.modules['org.apache.lucene'] 
print sys.modules['org.apache.lucene.search'] 

Cheers,

 roman


On Fri, Jul 13, 2012 at 1:34 PM, Andi Vajda  wrote:



On Jul 13, 2012, at 18:33, Roman Chyla  wrote:


I think this would be great. Let me add little bit more to your
observations (whole night yesterday was spent fighting with renames -
because I was building a project which imports shared lucene and solr --
there were thousands of same classes, I am not sure it would be possible
without some sort of a flexible rename...)

JCC is a great tool and is used by potentially many projects - so stripping
"org.apache" seems right for pylucene, but looks arbitrary otherwise



Yes, I forgot to say that there would be a way to declare one or more
mappings  so that org.apache.lucene becomes lucene.

Andi..


(unless there is a flexible stripping mechanism). Also, if the full
namespace remains original, then the code written in Python would be also
executable by Jython, which is IMHO an advantage.

But this being Python, the packages cannot be spread in different locations
(ie. there can be only one org.apache.lucene.analysis package) - unless
there exists (again) some flexible mechanism which populates the namespace
with objects that belong there. It may seem an overkill to you, because for
single projects it would work, but seems perfectly justifiable in case of
imported shared libraries

I don't know what is your idea for implementing the python packages, but
your last email got me thinking as well - there might be a very simple way
of getting to the java packages inside Python without too much work.

Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
python as org_apache_lucene_search_IndexSearcher

and users do:

import lucene
lucene.initVM()

initVM() first initiates java VM (and populates the lucene namespace with
all objects), but then it will call jcc.register_module(self)

A new piece of code inside JCC grabs the lucene module and creates (on the
fly) python packages -- using types.ModuleType (or new.module()) -- the new
packages will be inserted into sys.modules

so after lucene.initVM() returns

users can do "from org.apache.lucene.search import IndexSearcher" and get
lucene.org_apache_lucene_search_IndexSearcher object

and also, when shared libraries are present (let's say 'solr') users do:

import solr
solr.initVM()

The JCC will just update the existing packages and create new ones if
needed (and from this perspective, having fully qualified name is safer
than to have lucene.search.IndexSearcher)

I think this change is totally possible and will not change the way how
extensions are built. Does it have some serious flaw?

I would be of course more than happy to contribute and test.

Best,

 roman


On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda  wrote:



On Tue, 10 Jul 2012, Andi Vajda wrote:

I would also like to propose a change, to allow for more flexible
mechanism of generating Python class names. The patch doesn't change
the default pylucene behaviour, but it gives people a way to replace
class names with patterns. I have noticed that there are more
same-name classes from different packages in the new lucene (and it
becomes worse when one has to deal with both lucene and solr).



Another way to fix this is to reproduce the namespace hierarchy used in
Lucene, following along the Java packages, something I've been dreading to
do. Lucene just loves a really long deeply nested class structure.
I'm not convinced yet it is bad enough to go down that route, though.

Your proposal to use patterns may in fact yield a much more convenient
solution. Thanks !



Rethinking this a bit, I'm prepared to change my mind on this. Your
patterned rename patch shows that we're slowly but surely reaching the
limit of the current setup that consists in throwin

Re: Changing Python class/module layout, dropping --rename ?

2012-07-19 Thread Andi Vajda


On Fri, 13 Jul 2012, Roman Chyla wrote:


Hi,
I was playing with the idea of creating virtual packages, attached is a
working script that illustrates it. I am getting this output:

Did it work?


No, I haven't forgotten, I'm just on vacation.

Andi..


==
from org.apache.lucene.search import SearcherFactory; print SearcherFactory

from org.apache.lucene.analysis import Analyzer as Banalyzer; print Banalyzer

print sys.modules['org'] 
print sys.modules['org.apache'] 
print sys.modules['org.apache.lucene'] 
print sys.modules['org.apache.lucene.search'] 

Cheers,

 roman


On Fri, Jul 13, 2012 at 1:34 PM, Andi Vajda  wrote:



On Jul 13, 2012, at 18:33, Roman Chyla  wrote:


I think this would be great. Let me add little bit more to your
observations (whole night yesterday was spent fighting with renames -
because I was building a project which imports shared lucene and solr  --
there were thousands of same classes, I am not sure it would be possible
without some sort of a flexible rename...)

JCC is a great tool and is used by potentially many projects - so stripping
"org.apache" seems right for pylucene, but looks arbitrary otherwise


Yes, I forgot to say that there would be a way to declare one or more
mappings  so that org.apache.lucene becomes lucene.

Andi..


(unless there is a flexible stripping mechanism). Also, if the full
namespace remains original, then the code written in Python would be also
executable by Jython, which is IMHO an advantage.

But this being Python, the packages cannot be spread in different locations
(ie. there can be only one org.apache.lucene.analysis package) - unless
there exists (again) some flexible mechanism which populates the namespace
with objects that belong there. It may seem an overkill to you, because for
single projects it would work, but seems perfectly justifiable in case of
imported shared libraries

I don't know what is your idea for implementing the python packages, but
your last email got me thinking as well - there might be a very simple way
of getting to the java packages inside Python without too much work.

Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
python as org_apache_lucene_search_IndexSearcher

and users do:

import lucene
lucene.initVM()

initVM() first initiates java VM (and populates the lucene namespace with
all objects), but then it will call jcc.register_module(self)

A new piece of code inside JCC grabs the lucene module and creates (on the
fly) python packages -- using types.ModuleType (or new.module()) -- the new
packages will be inserted into sys.modules

so after lucene.initVM() returns

users can do "from org.apache.lucene.search import IndexSearcher" and get
lucene.org_apache_lucene_search_IndexSearcher object

and also, when shared libraries are present (let's say 'solr') users do:

import solr
solr.initVM()

The JCC will just update the existing packages and create new ones if
needed (and from this perspective, having fully qualified name is safer
than to have lucene.search.IndexSearcher)

I think this change is totally possible and will not change the way how
extensions are built. Does it have some serious flaw?

I would be of course more than happy to contribute and test.

Best,

 roman


On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda  wrote:



On Tue, 10 Jul 2012, Andi Vajda wrote:

I would also like to propose a change, to allow for more flexible
mechanism of generating Python class names. The patch doesn't change
the default pylucene behaviour, but it gives people a way to replace
class names with patterns. I have noticed that there are more
same-name classes from different packages in the new lucene (and it
becomes worse when one has to deal with both lucene and solr).



Another way to fix this is to reproduce the namespace hierarchy used in
Lucene, following along the Java packages, something I've been dreading to
do. Lucene just loves a really long deeply nested class structure.
I'm not convinced yet it is bad enough to go down that route, though.

Your proposal to use patterns may in fact yield a much more convenient
solution. Thanks !



Rethinking this a bit, I'm prepared to change my mind on this. Your
patterned rename patch shows that we're slowly but surely reaching the
limit of the current setup that consists in throwing all wrapped classes
under the one global 'lucene' namespace.

Lucene 4.0 has seen a large number of deeply nested classes with similar
names added since 3.x. Renaming these one by one (or excluding some)
doesn't scale. Using the proposed patterned rename scales more but makes it
difficult to know what got renamed and how.
Ultimately, the more classes that are like-named, the more classes would
have unstable names from one release to the next as more duplicated names
are encountered.

What if instead JCC supported the original Java namespaces all the way to
the Python interface (still dropping the original 'org.apache' Java package

Re: Changing Python class/module layout, dropping --rename ?

2012-07-13 Thread Roman Chyla
Hi,
I was playing with the idea of creating virtual packages, attached is a
working script that illustrates it. I am getting this output:

Did it work?
==
from org.apache.lucene.search import SearcherFactory; print SearcherFactory

from org.apache.lucene.analysis import Analyzer as Banalyzer; print Banalyzer

print sys.modules['org'] 
print sys.modules['org.apache'] 
print sys.modules['org.apache.lucene'] 
print sys.modules['org.apache.lucene.search'] 

Cheers,

  roman


On Fri, Jul 13, 2012 at 1:34 PM, Andi Vajda  wrote:

>
> On Jul 13, 2012, at 18:33, Roman Chyla  wrote:
>
> > I think this would be great. Let me add little bit more to your
> > observations (whole night yesterday was spent fighting with renames -
> > because I was building a project which imports shared lucene and solr  --
> > there were thousands of same classes, I am not sure it would be possible
> > without some sort of a flexible rename...)
> >
> > JCC is a great tool and is used by potentially many projects - so stripping
> > "org.apache" seems right for pylucene, but looks arbitrary otherwise
>
> Yes, I forgot to say that there would be a way to declare one or more
> mappings  so that org.apache.lucene becomes lucene.
>
> Andi..
>
> > (unless there is a flexible stripping mechanism). Also, if the full
> > namespace remains original, then the code written in Python would be also
> > executable by Jython, which is IMHO an advantage.
> >
> > But this being Python, the packages cannot be spread in different locations
> > (ie. there can be only one org.apache.lucene.analysis package) - unless
> > there exists (again) some flexible mechanism which populates the namespace
> > with objects that belong there. It may seem an overkill to you, because for
> > single projects it would work, but seems perfectly justifiable in case of
> > imported shared libraries
> >
> > I don't know what is your idea for implementing the python packages, but
> > your last email got me thinking as well - there might be a very simple way
> > of getting to the java packages inside Python without too much work.
> >
> > Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
> > python as org_apache_lucene_search_IndexSearcher
> >
> > and users do:
> >
> > import lucene
> > lucene.initVM()
> >
> > initVM() first initiates java VM (and populates the lucene namespace with
> > all objects), but then it will call jcc.register_module(self)
> >
> > A new piece of code inside JCC grabs the lucene module and creates (on the
> > fly) python packages -- using types.ModuleType (or new.module()) -- the new
> > packages will be inserted into sys.modules
> >
> > so after lucene.initVM() returns
> >
> > users can do "from org.apache.lucene.search import IndexSearcher" and get
> > lucene.org_apache_lucene_search_IndexSearcher object
> >
> > and also, when shared libraries are present (let's say 'solr') users do:
> >
> > import solr
> > solr.initVM()
> >
> > The JCC will just update the existing packages and create new ones if
> > needed (and from this perspective, having fully qualified name is safer
> > than to have lucene.search.IndexSearcher)
> >
> > I think this change is totally possible and will not change the way how
> > extensions are built. Does it have some serious flaw?
> >
> > I would be of course more than happy to contribute and test.
> >
> > Best,
> >
> >  roman
> >
> >
> > On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda  wrote:
> >
> >>
> >> On Tue, 10 Jul 2012, Andi Vajda wrote:
> >>
> >>>> I would also like to propose a change, to allow for more flexible
> >>>> mechanism of generating Python class names. The patch doesn't change
> >>>> the default pylucene behaviour, but it gives people a way to replace
> >>>> class names with patterns. I have noticed that there are more
> >>>> same-name classes from different packages in the new lucene (and it
> >>>> becomes worse when one has to deal with both lucene and solr).
> >>>>
> >>>
> >>> Another way to fix this is to reproduce the namespace hierarchy used in
> >>> Lucene, following along the Java packages, something I've been dreading to
> >>> do. Lucene just loves a really long deeply nested class structure.
> >>> I'm not convinced yet it is bad enough to go down that route, though.
> >>>
> >>> Your proposal to use patterns may in fact yield a much more convenient
> >>> solution. Thanks !
> >>>
> >>
> >> Rethinking this a bit, I'm prepared to change my mind on this. Your
> >> patterned rename patch shows that we're slowly but surely reaching the
> >> limit of the current setup that consists in throwing all wrapped classes
> >> under the one global 'lucene' namespace.
> >>
> >> Lucene 4.0 has seen a large number of deeply nested classes with similar
> >> names added since 3.x. Renaming these one by one (or excluding some)
> >> doesn't scale. Using the proposed patterned rename scales more but makes it
> >> difficult to know what got renamed and how.
> >> Ultimately, the m

Re: Changing Python class/module layout, dropping --rename ?

2012-07-13 Thread Patrick J. McNerthney
Just chiming in that my use of JCC to wrap the Eclipse BIRT Runtime 
Engine could really use this ability.  There are a TON of classes that 
should be wrapped, and many use the same set of names, but in different 
packages.


Pat

On 07/13/2012 07:34 AM, Andi Vajda wrote:

On Jul 13, 2012, at 18:33, Roman Chyla  wrote:


I think this would be great. Let me add little bit more to your
observations (whole night yesterday was spent fighting with renames -
because I was building a project which imports shared lucene and solr  --
there were thousands of same classes, I am not sure it would be possible
without some sort of a flexible rename...)

JCC is a great tool and is used by potentially many projects - so stripping
"org.apache" seems right for pylucene, but looks arbitrary otherwise

Yes, I forgot to say that there would be a way to declare one or more mappings  
so that org.apache.lucene becomes lucene.

Andi..


(unless there is a flexible stripping mechanism). Also, if the full
namespace remains original, then the code written in Python would be also
executable by Jython, which is IMHO an advantage.

But this being Python, the packages cannot be spread in different locations
(ie. there can be only one org.apache.lucene.analysis package) - unless
there exists (again) some flexible mechanism which populates the namespace
with objects that belong there. It may seem an overkill to you, because for
single projects it would work, but seems perfectly justifiable in case of
imported shared libraries

I don't know what is your idea for implementing the python packages, but
your last email got me thinking as well - there might be a very simple way
of getting to the java packages inside Python without too much work.

Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
python as org_apache_lucene_search_IndexSearcher

and users do:

import lucene
lucene.initVM()

initVM() first initiates java VM (and populates the lucene namespace with
all objects), but then it will call jcc.register_module(self)

A new piece of code inside JCC grabs the lucene module and creates (on the
fly) python packages -- using types.ModuleType (or new.module()) -- the new
packages will be inserted into sys.modules

so after lucene.initVM() returns

users can do "from org.apache.lucene.search import IndexSearcher" and get
lucene.org_apache_lucene_search_IndexSearcher object

and also, when shared libraries are present (let's say 'solr') users do:

import solr
solr.initVM()

The JCC will just update the existing packages and create new ones if
needed (and from this perspective, having fully qualified name is safer
than to have lucene.search.IndexSearcher)
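
The register_module idea described above can be sketched in plain Python. This is only an illustration under stated assumptions, not JCC code: `register_module` here is a hypothetical stand-alone function, the `roots` parameter is invented to tell flattened Java names apart from other module attributes, and the `fake_lucene` module stands in for a real JCC-built extension. Splitting on underscores would also break for class names that themselves contain underscores.

```python
import sys
import types


def register_module(flat_module, roots=("org",)):
    """Expose attributes named like org_apache_..._Class as importable packages.

    Packages that already exist in sys.modules are reused, so a second
    extension module adds to the tree instead of replacing it.
    """
    for attr_name, obj in list(vars(flat_module).items()):
        parts = attr_name.split("_")
        if len(parts) < 2 or parts[0] not in roots:
            continue
        package_parts, class_name = parts[:-1], parts[-1]
        parent = None
        for depth in range(1, len(package_parts) + 1):
            dotted = ".".join(package_parts[:depth])
            module = sys.modules.get(dotted)
            if module is None:
                module = types.ModuleType(dotted)
                module.__path__ = []  # an empty __path__ marks it as a package
                sys.modules[dotted] = module
            if parent is not None:
                setattr(parent, package_parts[depth - 1], module)
            parent = module
        setattr(parent, class_name, obj)


# Stand-in for a JCC-built extension module with one flattened class name.
fake_lucene = types.ModuleType("lucene")


class IndexSearcher(object):
    pass


fake_lucene.org_apache_lucene_search_IndexSearcher = IndexSearcher
register_module(fake_lucene)

from org.apache.lucene.search import IndexSearcher as Imported
print(Imported is IndexSearcher)
```

Because the packages live only in sys.modules, anything registered this way would take precedence over a same-named package on disk once it is in place.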

I think this change is totally possible and will not change the way how
extensions are built. Does it have some serious flaw?

I would be of course more than happy to contribute and test.

Best,

  roman


On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda  wrote:


On Tue, 10 Jul 2012, Andi Vajda wrote:

I would also like to propose a change, to allow for more flexible

mechanism of generating Python class names. The patch doesn't change
the default pylucene behaviour, but it gives people a way to replace
class names with patterns. I have noticed that there are more
same-name classes from different packages in the new lucene (and it
becomes worse when one has to deal with both lucene and solr).


Another way to fix this is to reproduce the namespace hierarchy used in
Lucene, following along the Java packages, something I've been dreading to
do. Lucene just loves a really long deeply nested class structure.
I'm not convinced yet it is bad enough to go down that route, though.

Your proposal to use patterns may in fact yield a much more convenient
solution. Thanks !


Rethinking this a bit, I'm prepared to change my mind on this. Your
patterned rename patch shows that we're slowly but surely reaching the
limit of the current setup that consists in throwing all wrapped classes
under the one global 'lucene' namespace.

Lucene 4.0 has seen a large number of deeply nested classes with similar
names added since 3.x. Renaming these one by one (or excluding some)
doesn't scale. Using the proposed patterned rename scales more but makes it
difficult to know what got renamed and how.
Ultimately, the more classes that are like-named, the more classes would
have unstable names from one release to the next as more duplicated names
are encountered.

What if instead JCC supported the original Java namespaces all the way to
the Python interface (still dropping the original 'org.apache' Java package
tree prefix) ?
The world-rooted style of naming Java classes isn't Pythonic but using the
second half of the package structure feels right at home in the Python
world.

JCC already re-creates the complete Java package structure in C++ as
namespaces for all the C++ code it generates, for both the JNI wrapper
classes and the C++/Python types. It's only the installation of the class
names into the Pyt

Re: Changing Python class/module layout, dropping --rename ?

2012-07-13 Thread Andi Vajda

On Jul 13, 2012, at 18:33, Roman Chyla  wrote:

> I think this would be great. Let me add little bit more to your
> observations (whole night yesterday was spent fighting with renames -
> because I was building a project which imports shared lucene and solr  --
> there were thousands of same classes, I am not sure it would be possible
> without some sort of a flexible rename...)
> 
> JCC is a great tool and is used by potentially many projects - so stripping
> "org.apache" seems right for pylucene, but looks arbitrary otherwise

Yes, I forgot to say that there would be a way to declare one or more mappings  
so that org.apache.lucene becomes lucene.
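
One way such mappings could behave is sketched below. This is an assumption about the semantics, not how JCC does or would implement it: `apply_mappings` is a hypothetical helper, and the longest-prefix-wins rule is a choice made for the example.

```python
def apply_mappings(java_package, mappings):
    """Return the Python package name for a Java package.

    `mappings` maps a Java package prefix to its Python replacement.
    The longest matching prefix wins; a prefix only matches on a whole
    package component. Unmatched packages are returned unchanged.
    """
    best = None
    for prefix in mappings:
        if java_package == prefix or java_package.startswith(prefix + "."):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return java_package
    return mappings[best] + java_package[len(best):]


print(apply_mappings("org.apache.lucene.search", {"org.apache.lucene": "lucene"}))
```

Matching on whole components keeps a mapping for org.apache.lucene from accidentally rewriting an unrelated package whose name merely starts with the same characters.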

Andi..

> (unless there is a flexible stripping mechanism). Also, if the full
> namespace remains original, then the code written in Python would be also
> executable by Jython, which is IMHO an advantage.
> 
> But this being Python, the packages cannot be spread in different locations
> (ie. there can be only one org.apache.lucene.analysis package) - unless
> there exists (again) some flexible mechanism which populates the namespace
> with objects that belong there. It may seem an overkill to you, because for
> single projects it would work, but seems perfectly justifiable in case of
> imported shared libraries
> 
> I don't know what is your idea for implementing the python packages, but
> your last email got me thinking as well - there might be a very simple way
> of getting to the java packages inside Python without too much work.
> 
> Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
> python as org_apache_lucene_search_IndexSearcher
> 
> and users do:
> 
> import lucene
> lucene.initVM()
> 
> initVM() first initiates java VM (and populates the lucene namespace with
> all objects), but then it will call jcc.register_module(self)
> 
> A new piece of code inside JCC grabs the lucene module and creates (on the
> fly) python packages -- using types.ModuleType (or new.module()) -- the new
> packages will be inserted into sys.modules
> 
> so after lucene.initVM() returns
> 
> users can do "from org.apache.lucene.search import IndexSearcher" and get
> lucene.org_apache_lucene_search_IndexSearcher object
> 
> and also, when shared libraries are present (let's say 'solr') users do:
> 
> import solr
> solr.initVM()
> 
> The JCC will just update the existing packages and create new ones if
> needed (and from this perspective, having fully qualified name is safer
> than to have lucene.search.IndexSearcher)
> 
> I think this change is totally possible and will not change the way how
> extensions are built. Does it have some serious flaw?
> 
> I would be of course more than happy to contribute and test.
> 
> Best,
> 
>  roman
> 
> 
> On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda  wrote:
> 
>> 
>> On Tue, 10 Jul 2012, Andi Vajda wrote:
>> 
>>>> I would also like to propose a change, to allow for more flexible
>>>> mechanism of generating Python class names. The patch doesn't change
>>>> the default pylucene behaviour, but it gives people a way to replace
>>>> class names with patterns. I have noticed that there are more
>>>> same-name classes from different packages in the new lucene (and it
>>>> becomes worse when one has to deal with both lucene and solr).
>>>> 
>>> 
>>> Another way to fix this is to reproduce the namespace hierarchy used in
>>> Lucene, following along the Java packages, something I've been dreading to
>>> do. Lucene just loves a really long deeply nested class structure.
>>> I'm not convinced yet it is bad enough to go down that route, though.
>>> 
>>> Your proposal to use patterns may in fact yield a much more convenient
>>> solution. Thanks !
>>> 
>> 
>> Rethinking this a bit, I'm prepared to change my mind on this. Your
>> patterned rename patch shows that we're slowly but surely reaching the
>> limit of the current setup that consists in throwing all wrapped classes
>> under the one global 'lucene' namespace.
>> 
>> Lucene 4.0 has seen a large number of deeply nested classes with similar
>> names added since 3.x. Renaming these one by one (or excluding some)
>> doesn't scale. Using the proposed patterned rename scales more but makes it
>> difficult to know what got renamed and how.
>> Ultimately, the more classes that are like-named, the more classes would
>> have unstable names from one release to the next as more duplicated names
>> are encountered.
>> 
>> What if instead JCC supported the original Java namespaces all the way to
>> the Python interface (still dropping the original 'org.apache' Java package
>> tree prefix) ?
>> The world-rooted style of naming Java classes isn't Pythonic but using the
>> second half of the package structure feels right at home in the Python
>> world.
>> 
>> JCC already re-creates the complete Java package structure in C++ as
>> namespaces for all the C++ code it generates, for both the JNI wrapper
>> classes and the C++/Python types. It's only the installation of the class
>> names into the

Re: Changing Python class/module layout, dropping --rename ?

2012-07-13 Thread Roman Chyla
Hi Andi,

I think this would be great. Let me add a little more to your
observations (I spent the whole of last night fighting with renames,
because I was building a project which imports shared lucene and solr;
there were thousands of same-named classes, and I am not sure it would have
been possible without some sort of flexible rename...)

JCC is a great tool and is potentially used by many projects, so stripping
"org.apache" seems right for pylucene, but looks arbitrary otherwise
(unless there is a flexible stripping mechanism). Also, if the full
original namespace is kept, then the code written in Python would also be
executable by Jython, which is IMHO an advantage.

But this being Python, the packages cannot be spread over different locations
(i.e. there can be only one org.apache.lucene.analysis package), unless
there exists (again) some flexible mechanism which populates the namespace
with the objects that belong there. It may seem like overkill to you, because
for single projects it would work, but it seems perfectly justifiable in the
case of imported shared libraries.

I don't know what your idea is for implementing the Python packages, but
your last email got me thinking as well: there might be a very simple way
of getting to the Java packages inside Python without too much work.

Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
python as org_apache_lucene_search_IndexSearcher

and users do:

import lucene
lucene.initVM()

initVM() first starts the Java VM (and populates the lucene namespace with
all objects), and then calls jcc.register_module(self)

A new piece of code inside JCC grabs the lucene module and creates (on the
fly) Python packages -- using types.ModuleType (or new.module()) -- and
inserts the new packages into sys.modules

so after lucene.initVM() returns, users can do
"from org.apache.lucene.search import IndexSearcher" and get the
lucene.org_apache_lucene_search_IndexSearcher object
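
To make the idea concrete, here is a minimal sketch in Python 3. It is only
an illustration under assumptions: register_module is the hypothetical hook
proposed above (it is not an existing JCC function), fake_lucene is a
stand-in for the real extension module, and splitting flat names on
underscores assumes that neither package nor class names contain one.

```python
import sys
import types


def register_module(flat_module):
    """Hypothetical hook: expose flat names such as
    org_apache_lucene_search_IndexSearcher as dotted packages."""
    for flat_name, obj in list(vars(flat_module).items()):
        if flat_name.startswith("_") or "_" not in flat_name:
            continue  # skip private names and names without a package part
        *package_parts, class_name = flat_name.split("_")
        parent = None
        for depth in range(1, len(package_parts) + 1):
            dotted = ".".join(package_parts[:depth])
            package = sys.modules.get(dotted)
            if package is None:
                # Create the package on the fly and register it.
                package = types.ModuleType(dotted)
                package.__path__ = []  # marks the module as a package
                sys.modules[dotted] = package
            if parent is not None:
                setattr(parent, package_parts[depth - 1], package)
            parent = package
        setattr(parent, class_name, obj)


# Stand-in for the wrapped class and for the flat 'lucene' module.
class IndexSearcher:
    pass


fake_lucene = types.ModuleType("fake_lucene")
fake_lucene.initVM = lambda: None  # no underscore, so it is not registered
fake_lucene.org_apache_lucene_search_IndexSearcher = IndexSearcher

register_module(fake_lucene)

from org.apache.lucene.search import IndexSearcher as Imported
```

The import statement works because Python looks up the dotted name in
sys.modules before searching the file system, so no files need to exist for
these packages.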

and also, when shared libraries are present (let's say 'solr') users do:

import solr
solr.initVM()

JCC will then just update the existing packages and create new ones if
needed (and from this perspective, having a fully qualified name is safer
than having lucene.search.IndexSearcher)
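
The "update the existing packages" step could look roughly like the sketch
below (Python 3). The helper name ensure_package is made up for this
example, and the two classes are empty stand-ins for wrapped Java classes;
the point is only that a second library reuses packages already present in
sys.modules instead of replacing them.

```python
import sys
import types


def ensure_package(dotted_name):
    """Return the package registered under dotted_name, creating it
    and any missing parent packages (hypothetical helper)."""
    parts = dotted_name.split(".")
    parent = None
    package = None
    for depth in range(1, len(parts) + 1):
        name = ".".join(parts[:depth])
        package = sys.modules.get(name)
        if package is None:
            package = types.ModuleType(name)
            package.__path__ = []  # marks the module as a package
            sys.modules[name] = package
        if parent is not None:
            setattr(parent, parts[depth - 1], package)
        parent = package
    return package


# Empty stand-ins for classes wrapped by two different extensions.
class IndexSearcher:
    pass


class SolrCore:
    pass


# The first library creates org, org.apache, org.apache.lucene, ...
search = ensure_package("org.apache.lucene.search")
search.IndexSearcher = IndexSearcher

# The second library reuses org and org.apache and adds only what is new.
core = ensure_package("org.apache.solr.core")
core.SolrCore = SolrCore

from org.apache.lucene.search import IndexSearcher as FromLucene
from org.apache.solr.core import SolrCore as FromSolr
```

Because every package is looked up by its full dotted name before being
created, both libraries end up sharing the same org.apache package object.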

I think this change is totally possible and will not change the way
extensions are built. Does it have some serious flaw?

I would of course be more than happy to contribute and test.

Best,

  roman


On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda  wrote:

>
> On Tue, 10 Jul 2012, Andi Vajda wrote:
>
>>> I would also like to propose a change, to allow for more flexible
>>> mechanism of generating Python class names. The patch doesn't change
>>> the default pylucene behaviour, but it gives people a way to replace
>>> class names with patterns. I have noticed that there are more
>>> same-name classes from different packages in the new lucene (and it
>>> becomes worse when one has to deal with both lucene and solr).
>>>
>>
>> Another way to fix this is to reproduce the namespace hierarchy used in
>> Lucene, following along the Java packages, something I've been dreading to
>> do. Lucene just loves a really long deeply nested class structure.
>> I'm not convinced yet it is bad enough to go down that route, though.
>>
>> Your proposal to use patterns may in fact yield a much more convenient
>> solution. Thanks !
>>
>
> Rethinking this a bit, I'm prepared to change my mind on this. Your
> patterned rename patch shows that we're slowly but surely reaching the
> limit of the current setup that consists in throwing all wrapped classes
> under the one global 'lucene' namespace.
>
> Lucene 4.0 has seen a large number of deeply nested classes with similar
> names added since 3.x. Renaming these one by one (or excluding some)
> doesn't scale. Using the proposed patterned rename scales better but makes it
> difficult to know what got renamed and how.
> Ultimately, the more classes that are like-named, the more classes would
> have unstable names from one release to the next as more duplicated names
> are encountered.
>
> What if instead JCC supported the original Java namespaces all the way to
> the Python interface (still dropping the original 'org.apache' Java package
> tree prefix) ?
> The world-rooted style of naming Java classes isn't Pythonic but using the
> second half of the package structure feels right at home in the Python
> world.
>
> JCC already re-creates the complete Java package structure in C++ as
> namespaces for all the C++ code it generates, for both the JNI wrapper
> classes and the C++/Python types. It's only the installation of the class
> names into the Python VM that is done in the flat 'lucene' namespace.
>
> I think it shouldn't be too hard to change the code that installs classes
> to create sub-modules of the lucene module and install classes in these
> submodules instead (down to however many levels are in the original).
>
> In other words:
>   - from lucene import Document
> would become
>