Re: FacetExample.py

2013-02-13 Thread Robert Muir
On Wed, Feb 13, 2013 at 7:54 PM, Andi Vajda  wrote:
>
> On Wed, 13 Feb 2013, Robert Muir wrote:
>
>> On Tue, Feb 12, 2013 at 3:11 AM, Andi Vajda  wrote:
>>>
>>> I then found that the test case from hell, TestSort.java, has majorly
>>> changed again and test_Sort.py needs to be ported again. Sigh.
>>>
>>> Andi..
>>
>>
>> I'm not laughing at your expense Andi... but this made me laugh out
>> loud multiple times today.
>>
>> I've done battle with this thing several times, I feel like I always lose!
>
>
> I found lines 170 - 195 particularly clever :-)
>
> Jokes aside, I did spend a bunch of time yesterday battling with the
> unstated assumptions made in the random number generation code in
> the Lucene test framework. In particular, it seems that it expects different
> threads, sometimes, to get the same random values, no? (I'm using Python's
> random number generator in PyLucene.)
>
> The field cache sanity checker would otherwise complain, sometimes...
>
> Andi..

In all seriousness, I don't like that committers' time is wasted on
this. Just a day or two ago I introduced a bug in this thing while merging,
and Mike spent time tracking it down. I'd like to think I'm pretty careful
about not breaking things when merging (I think I spent at least an
hour merging this file alone very carefully, yet still screwed it up).

So I opened https://issues.apache.org/jira/browse/LUCENE-4779

about the randomness: I think this should not be the case. if
different threads try to share the same random, actually there should
be an exception from the test framework saying that each thread should
get its own random (eg. initialized by a long value). So lucene-java
tests should not have code that does this: otherwise it really makes
test failures difficult to reproduce.
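
For the Python side, a minimal sketch of the "each thread gets its own
random, initialized by a long value" idea, using only the standard library.
The names MASTER_SEED, thread_random, worker and run are hypothetical, and
this is not how the Lucene test framework or PyLucene does it; it only
shows that per-thread generators derived from one seed stay reproducible.

```python
import random
import threading

MASTER_SEED = 20130213  # hypothetical master seed for one test run


def thread_random(master_seed, thread_index):
    """Return a private generator seeded from the master seed and thread index."""
    # Assumption: any deterministic mixing of the two integers would do here.
    return random.Random(master_seed * 1000003 + thread_index)


def worker(thread_index, results):
    # Each thread builds its own generator and never shares it.
    rnd = thread_random(MASTER_SEED, thread_index)
    results[thread_index] = [rnd.randint(0, 99) for _ in range(5)]


def run(thread_count=4):
    """Start the workers and return the values each thread drew."""
    results = {}
    threads = [threading.Thread(target=worker, args=(i, results))
               for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run() twice should give identical per-thread sequences regardless
of scheduling, which is the property that keeps a failure reproducible
from its seed.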

Unfortunately I'm not very familiar with what python does here, but I
cc'ed Dawid just in case he knows off the top of his head.


Re: FacetExample.py

2013-02-13 Thread Andi Vajda


On Wed, 13 Feb 2013, Robert Muir wrote:

> On Tue, Feb 12, 2013 at 3:11 AM, Andi Vajda  wrote:
>>
>> I then found that the test case from hell, TestSort.java, has majorly
>> changed again and test_Sort.py needs to be ported again. Sigh.
>>
>> Andi..
>
> I'm not laughing at your expense Andi... but this made me laugh out
> loud multiple times today.
>
> I've done battle with this thing several times, I feel like I always lose!

I found lines 170 - 195 particularly clever :-)

Jokes aside, I did spend a bunch of time yesterday battling with the
unstated assumptions made in the random number generation code in
the Lucene test framework. In particular, it seems that it expects different
threads, sometimes, to get the same random values, no? (I'm using Python's
random number generator in PyLucene.)

The field cache sanity checker would otherwise complain, sometimes...

Andi..


Re: FacetExample.py

2013-02-13 Thread Robert Muir
On Tue, Feb 12, 2013 at 3:11 AM, Andi Vajda  wrote:
> I then found that the test case from hell, TestSort.java, has majorly
> changed again and test_Sort.py needs to be ported again. Sigh.
>
> Andi..

I'm not laughing at your expense Andi... but this made me laugh out
loud multiple times today.

I've done battle with this thing several times, I feel like I always lose!


Re: FacetExample.py

2013-02-13 Thread Andi Vajda


On Wed, 13 Feb 2013, Thomas Koch wrote:

> Hi Andi, You're right - and API docs are wrong. Actually both must have
> changed after the 4.1 release: I checked the source of java-lucene v4.1
> (lucene-4.1.0-src.tgz / 21-Jan-2013) and it matches the online javadocs.
> So I guess you're preparing for PyLucene v4.2?

Yes, by getting the Lucene sources from
  http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x
we're building PyLucene 4.1+, which should become 4.2 when Lucene
releases version 4.2.

> Note: I think that
> LUCENE_SVN=http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x
> is the "trunk" where 4.x development happens (i.e. "unstable") whereas
> http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_1/ is
> the "stable" Lucene4.1/Solr4.1 branch (matching the 4.1 release and API
> docs). So if that's right (please correct me if I'm wrong) - why did you
> choose the branch_4x?

That is correct, I chose branch_4x because that is where 4.x development is
happening, and catching up to 4.1 only would mean having to catch up again
to 4.2. That being said, catch-up has to be done for every release, it's
just that Lucene 4.1 has happened now and PyLucene hasn't had a 4.x release
yet. By aiming for 4.2 and getting the samples ported, we have a chance of
releasing PyLucene 4.2 as soon as the Lucene 4.2 release is out.

> Anyway, I fixed the FacetExample.py for branch_4x now ;-)

Thanks !

> Some notes on API changes for those interested:
> - the 'new' FacetsCollector has a factory pattern now:
>   public static FacetsCollector create(FacetSearchParams fsp, IndexReader indexReader, TaxonomyReader taxoReader)
> - the order of constructor arguments for FacetSearchParams has changed!
> - FacetResultNode has changed: it used to be an interface but is now a concrete
>   class (and the method getSubResults of FacetResultNode disappeared)
> - DrillDown.query() became DrillDownQuery() - with a new API.
>
> Well, at least API docs state it:
> "WARNING: This API is experimental and might change in incompatible ways in the next
> release."
> So one should be warned...
>
> Here's the new version: https://dl.dropbox.com/u/4384120/FacetExample.py
> Or as patch to svn:
> https://dl.dropbox.com/u/4384120/FacetExample_patch_20130213.txt

I checked your new version into rev 1445740 after fixing a string encoding
issue on line 199.

> Thanks again for your help.

You're welcome !
We now have one functional sample in PyLucene 4.x :-)

Andi..



regards,
Thomas
--
Am 12.02.2013 um 22:36 schrieb Andi Vajda :



Hi Thomas,

On Tue, 12 Feb 2013, Thomas Koch wrote:


Thanks to your hints I was now able to build PyLucene4.1 and got further with 
the FacetExample.py - The imports should be OK now and most of the required 
changes are done I guess. However I now reached another problem: I need to 
instantiate the class 'FacetsCollector' but get an error when doing so:

File "samples/FacetExample.py", line 222, in searchWithRequestAndQuery
  facetsCollector = FacetsCollector(facetSearchParams, indexReader, taxoReader)
NotImplementedError: ('instantiating java class', )

The java example has this line:
  FacetsCollector facetsCollector = new FacetsCollector(facetSearchParams, 
indexReader, taxoReader);
and javadocs state it has a public constructor:
http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/search/FacetsCollector.html#FacetsCollector(org.apache.lucene.facet.search.params.FacetSearchParams,%20org.apache.lucene.index.IndexReader,%20org.apache.lucene.facet.taxonomy.TaxonomyReader)

So what could be the reason for this behavior?


The FacetsCollector class is declared abstract. Thus you can't instantiate it, 
constructor or not. I think the intent is to instantiate one of its concrete 
inner subclasses.
See 
lucene-java-4.1/lucene/facet/src/java/org/apache/lucene/facet/search/FacetsCollector.java


I have another problem with the constructor of FacetSearchParams: it is 
expecting arguments:
(List<FacetRequest> facetRequests, FacetIndexingParams indexingParams)
but neither
FacetSearchParams(Arrays.asList([facetRequest,]), indexingParams)
nor
FacetSearchParams([facetRequest,], indexingParams)
does it here.  I get

lucene.InvalidArgsError: (, '__init__', (, ))


There are four constructors on FacetSearchParams, none of which seems to match 
your call:
 public FacetSearchParams(FacetRequest... facetRequests)
 public FacetSearchParams(List<FacetRequest> facetRequests)
 public FacetSearchParams(FacetIndexingParams indexingParams, FacetRequest... facetRequests)
 public FacetSearchParams(FacetIndexingParams indexingParams, List<FacetRequest> facetRequests)

See 
lucene-java-4.1/lucene/facet/src/java/org/apache/lucene/facet/params/FacetSearchParams.java

You seem to be passing FacetIndexingParams last.

Andi..




I thought that JavaList could help, but I cannot import it:

from lucene.collections import JavaList

Traceback (most recent call last):
File "", line 1, in 
File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/lucene-4.1-py2.7-macosx-10.8-x86_

Re: FacetExample.py

2013-02-13 Thread Thomas Koch
Hi Andi,
You're right - and API docs are wrong. Actually both must have changed after the
4.1 release: I checked the source of java-lucene v4.1 (lucene-4.1.0-src.tgz /
21-Jan-2013) and it matches the online javadocs. So I guess you're preparing
for PyLucene v4.2?

Note: I think that 
LUCENE_SVN=http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x is the 
"trunk" where 4.x development happens (i.e.  "unstable") whereas
http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_1/
is the "stable" Lucene4.1/Solr4.1 branch (matching the 4.1 release and API 
docs). So if that's right (please correct me if I'm wrong) - why did you choose 
the branch_4x? 

Anyway, I fixed the FacetExample.py for branch_4x now ;-)

Some notes on API changes for those interested:
- the 'new' FacetsCollector has a factory pattern now:
  public static FacetsCollector create(FacetSearchParams fsp, IndexReader indexReader, TaxonomyReader taxoReader)
- the order of constructor arguments for FacetSearchParams has changed!
- FacetResultNode has changed: it used to be an interface but is now a concrete 
class (and the method getSubResults of FacetResultNode disappeared)
- DrillDown.query() became DrillDownQuery() - with a new API.
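
To illustrate the first point, a rough pure-Python analogy of an abstract
class that can only be obtained through a static factory. Collector,
_CountingCollector and try_direct_instantiation are hypothetical stand-ins,
not PyLucene classes, and plain Python raises TypeError here where PyLucene
reported NotImplementedError. Going by the create signature listed above,
the PyLucene call would presumably be
FacetsCollector.create(facetSearchParams, indexReader, taxoReader).

```python
from abc import ABC, abstractmethod


class Collector(ABC):
    """Hypothetical stand-in for an abstract class such as FacetsCollector."""

    def __init__(self, params):
        self.params = params

    @abstractmethod
    def collect(self, doc_id):
        """Record one matching document."""

    @staticmethod
    def create(params):
        # The factory picks a concrete subclass so callers never name it.
        return _CountingCollector(params)


class _CountingCollector(Collector):
    """Concrete subclass that is only reachable through Collector.create()."""

    def __init__(self, params):
        super().__init__(params)
        self.count = 0

    def collect(self, doc_id):
        self.count += 1


def try_direct_instantiation():
    """Show that the abstract class itself refuses to be instantiated."""
    try:
        Collector({"facet": "root/a"})
    except TypeError as exc:
        return "refused: %s" % type(exc).__name__
    return "allowed"
```

So the constructor call from the old sample has to be replaced by the
factory call, even though the class still declares a constructor.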

Well, at least API docs state it: 
 "WARNING: This API is experimental and might change in incompatible ways in 
the next release."
So one should be warned...

Here's the new version: https://dl.dropbox.com/u/4384120/FacetExample.py
Or as patch to svn: 
https://dl.dropbox.com/u/4384120/FacetExample_patch_20130213.txt

Thanks again for your help.

regards,
Thomas
--
Am 12.02.2013 um 22:36 schrieb Andi Vajda :

> 
> Hi Thomas,
> 
> On Tue, 12 Feb 2013, Thomas Koch wrote:
> 
>> Thanks to your hints I was now able to build PyLucene4.1 and got further 
>> with the FacetExample.py - The imports should be OK now and most of the 
>> required changes are done I guess. However I now reached another problem: I 
>> need to instantiate the class 'FacetsCollector' but get an error when doing 
>> so:
>> 
>> File "samples/FacetExample.py", line 222, in searchWithRequestAndQuery
>>   facetsCollector = FacetsCollector(facetSearchParams, indexReader, 
>> taxoReader)
>> NotImplementedError: ('instantiating java class', )
>> 
>> The java example has this line:
>>   FacetsCollector facetsCollector = new FacetsCollector(facetSearchParams, 
>> indexReader, taxoReader);
>> and javadocs state it has a public constructor:
>> http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/search/FacetsCollector.html#FacetsCollector(org.apache.lucene.facet.search.params.FacetSearchParams,%20org.apache.lucene.index.IndexReader,%20org.apache.lucene.facet.taxonomy.TaxonomyReader)
>> 
>> So what could be the reason for this behavior?
> 
> The FacetsCollector class is declared abstract. Thus you can't instantiate it, 
> constructor or not. I think the intent is to instantiate one of its concrete 
> inner subclasses.
> See 
> lucene-java-4.1/lucene/facet/src/java/org/apache/lucene/facet/search/FacetsCollector.java
> 
>> I have another problem with the constructor of FacetSearchParams: it is 
>> expecting arguments:
>> (List<FacetRequest> facetRequests, FacetIndexingParams indexingParams)
>> but neither
>> FacetSearchParams(Arrays.asList([facetRequest,]), indexingParams)
>> nor
>> FacetSearchParams([facetRequest,], indexingParams)
>> does it here.  I get
>> 
>> lucene.InvalidArgsError: (, '__init__', (> [root/a nRes=10 nLbl=10]>, > org.apache.lucene.facet.params.FacetIndexingParams@f97ad3c0>))
> 
> There are four constructors on FacetSearchParams, none of which seems to 
> match your call:
>  public FacetSearchParams(FacetRequest... facetRequests)
>  public FacetSearchParams(List<FacetRequest> facetRequests)
>  public FacetSearchParams(FacetIndexingParams indexingParams, FacetRequest... facetRequests)
>  public FacetSearchParams(FacetIndexingParams indexingParams, List<FacetRequest> facetRequests)
> 
> See 
> lucene-java-4.1/lucene/facet/src/java/org/apache/lucene/facet/params/FacetSearchParams.java
> 
> You seem to be passing FacetIndexingParams last.
> 
> Andi..
> 
> 
>> 
>> I thought that JavaList could help, but I cannot import it:
> from lucene.collections import JavaList
>> Traceback (most recent call last):
>> File "", line 1, in 
>> File 
>> "/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/lucene-4.1-py2.7-macosx-10.8-x86_64.egg/lucene/collections.py",
>>  line 17, in 
>>   from org.apache.pylucene.util import \
>> ImportError: No module named pylucene.util
> 
>> 
>> That's probably because I had to disable in Makefile
>> ## JARS+=$(HIGHLIGHTER_JAR)# needs memory contrib
>> ## JARS+=$(EXTENSIONS_JAR) # needs highlighter contrib
>> 
>> Do you think that's a type cast issue and that JavaList would help here?
>> I need to define a 'typed' list , e.g. List
>> 
>> FacetSearchParams API docs:
>> http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/search/params/FacetSearchParams.html
>> 
>> Current version of F