Jeremy - I'm passing this on to the Lucene dev list for comment.
There are two .java files attached that may not make it through to the list. These are simple wrappers that do exactly what you'd expect. The idea is to make dealing with Lucene Hits more "Java like" with an Iterator, which in turn makes this much more amenable to Groovy.
It's syntactic sugar in a sense, but expressive code is about much more than syntax, IMO.
What do folks think of this single additional method to Hits and two additional classes?
Erik
Begin forwarded message:
From: Jeremy Rayner <[EMAIL PROTECTED]>
Date: April 22, 2005 3:18:31 PM EDT
To: Erik Hatcher <[EMAIL PROTECTED]>
Subject: Re: Lucene and Groovy...
Reply-To: Jeremy Rayner <[EMAIL PROTECTED]>
Hi Erik,
hits.each { println(it["filename"]) }
OK, I've implemented the above now :-)
where 'it' is bound as a Document instance obtained from hits.doc(i)
The issue with that is that there are some other methods on Hits that
you'd want to access besides just the doc(i). score(i) for example, or
get the document id with id(i). Also, the "hit" (excuse the pun) to
retrieve a document is made when the doc(i) is called, so you may want
to avoid doing that if you're simply iterating the hits but not
accessing the underlying data (rare, but possible - and there is a
HitCollector facility to allow for this type of thing anyway).
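To make the cost concern above concrete, here is a minimal sketch in plain Java. It does not use Lucene: CountingHits is a hypothetical stand-in whose doc(i), score(i) and id(i) only mirror the shape of the Hits methods mentioned above, and LazyHit is an illustrative name. It assumes a wrapper that defers doc(i) until the document is actually requested, which is one possible variant and not necessarily how the attached Hit class behaves.

```java
// Hypothetical stand-in for Lucene's Hits; only the method shapes are borrowed.
class CountingHits {
    private final String[] docs;
    private final float[] scores;
    int docLoads = 0; // counts calls to doc(i), the expensive operation

    CountingHits(String[] docs, float[] scores) {
        this.docs = docs;
        this.scores = scores;
    }

    int length() { return docs.length; }

    String doc(int i) { docLoads++; return docs[i]; }

    float score(int i) { return scores[i]; }

    int id(int i) { return i; }
}

// Illustrative wrapper: score and id are read directly, the document
// is loaded on the first getDoc() call only and then cached.
class LazyHit {
    private final CountingHits hits;
    private final int hitNumber;
    private String doc;
    private boolean docLoaded = false;

    LazyHit(CountingHits hits, int hitNumber) {
        this.hits = hits;
        this.hitNumber = hitNumber;
    }

    float getScore() { return hits.score(hitNumber); }

    int getId() { return hits.id(hitNumber); }

    String getDoc() {
        if (!docLoaded) {
            doc = hits.doc(hitNumber);
            docLoaded = true;
        }
        return doc;
    }
}

class LazyHitDemo {
    // Walks every hit but only touches the score, never the document.
    static float sumScores(CountingHits hits) {
        float total = 0f;
        for (int i = 0; i < hits.length(); i++) {
            total += new LazyHit(hits, i).getScore();
        }
        return total;
    }

    public static void main(String[] args) {
        CountingHits hits = new CountingHits(
            new String[] {"a.txt", "b.txt", "c.txt"},
            new float[] {1.0f, 0.5f, 0.25f});
        float total = sumScores(hits);
        System.out.println("total score = " + total
            + ", document loads = " + hits.docLoads);

        LazyHit first = new LazyHit(hits, 0);
        first.getDoc();
        first.getDoc(); // second call is served from the cached field
        System.out.println("document loads after two getDoc() calls = "
            + hits.docLoads);
    }
}
```

In this sketch, iterating for scores alone never calls doc(i), and repeated getDoc() calls on one hit load the document once.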
I've created two classes to do this:
Hit.java - provides a lazy shell that stores a reference to hits and the current index
HitIterator.java - provides a simple cursor across the hits object
I have also added a minor convenience method to Hits.java which returns
an Iterator over the hits.
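For comparison, here is a rough sketch of what the caller side might look like in Java with and without such an iterator. The StubHits, StubHit and StubHitIterator classes are hypothetical stand-ins written for this example (they are not the Lucene classes), the "filename" accessor is assumed, and generics are used for brevity, so this assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for Hits: stores one filename value per hit.
class StubHits {
    private final String[] filenames;

    StubHits(String... filenames) {
        this.filenames = filenames;
    }

    int length() { return filenames.length; }

    String filename(int i) { return filenames[i]; }

    // Mirrors the shape of the proposed convenience method.
    Iterator<StubHit> iterator() { return new StubHitIterator(this); }
}

// Hypothetical stand-in for Hit: remembers its position, reads on demand.
class StubHit {
    private final StubHits hits;
    private final int hitNumber;

    StubHit(StubHits hits, int hitNumber) {
        this.hits = hits;
        this.hitNumber = hitNumber;
    }

    String getFilename() { return hits.filename(hitNumber); }
}

// Simple cursor over the stand-in hits.
class StubHitIterator implements Iterator<StubHit> {
    private final StubHits hits;
    private int hitNumber = 0;

    StubHitIterator(StubHits hits) { this.hits = hits; }

    @Override
    public boolean hasNext() { return hitNumber < hits.length(); }

    @Override
    public StubHit next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return new StubHit(hits, hitNumber++);
    }
}

class IterationDemo {
    // Index-based style: the caller manages the counter.
    static List<String> indexedLoop(StubHits hits) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < hits.length(); i++) {
            names.add(hits.filename(i));
        }
        return names;
    }

    // Iterator style: the cursor hands out one hit object at a time.
    static List<String> iteratorLoop(StubHits hits) {
        List<String> names = new ArrayList<>();
        for (Iterator<StubHit> it = hits.iterator(); it.hasNext();) {
            names.add(it.next().getFilename());
        }
        return names;
    }

    public static void main(String[] args) {
        StubHits hits = new StubHits("a.txt", "b.txt");
        System.out.println(indexedLoop(hits));
        System.out.println(iteratorLoop(hits));
    }
}
```

Both loops visit the same hits in the same order; the iterator form is the one that languages like Groovy can drive with a closure.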
You guys just let me know what is needed on the Java Lucene side of things and I'll be happy to facilitate any changes needed.
Have a look at the two sources attached, and the patch to Hits.java in Lucene.
You can have the sources for the Lucene distro; I'm sure they'd be useful in
Java too, and they have no mention of Groovy in them. With them in the core
distro, it'd certainly make Lucene even nicer to use :-)
Thanks
jez.
Index: Hits.java
===================================================================
--- Hits.java (revision 164251)
+++ Hits.java (working copy)
@@ -18,6 +18,7 @@
 import java.io.IOException;
 import java.util.Vector;
+import java.util.Iterator;

 import org.apache.lucene.document.Document;

@@ -160,6 +161,11 @@
     numDocs--;
   }
+
+  public Iterator iterator() {
+    return new HitIterator(this);
+  }
+
 }

 final class HitDoc {
-- http://javanicus.com/blog2
/**
 * Copyright 2004 The Apache Software Foundation
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.lucene.search;
import java.io.IOException;
import org.apache.lucene.document.Document;
/**
 * a lazy future for a hit, useful for iterators over instances of Hits
 *
 * @author Jeremy Rayner
 */
public class Hit implements java.io.Serializable {

  private float score;
  private int id;
  private Document doc = null;

  private boolean resolved = false;

  private Hits hits = null;
  private int hitNumber;

  public Hit(Hits hits, int hitNumber) {
    this.hits = hits;
    this.hitNumber = hitNumber;
  }

  public Document getDoc() throws IOException {
    if (!resolved) fetchTheHit();
    return doc;
  }

  public float getScore() throws IOException {
    if (!resolved) fetchTheHit();
    return score;
  }

  public int getId() throws IOException {
    if (!resolved) fetchTheHit();
    return id;
  }

  private void fetchTheHit() throws IOException {
    doc = hits.doc(hitNumber);
    score = hits.score(hitNumber);
    id = hits.id(hitNumber);
    resolved = true;
  }

  // provide some of the Document style interface (the simple stuff)

  /**
   * Returns the boost factor for hits on any field of the underlying
   * document.
   */
  public float getBoost() throws IOException {
    return getDoc().getBoost();
  }

  /**
   * Returns the string value of the field with the given name if any exist in
   * this document, or null. If multiple fields exist with this name, this
   * method returns the first value added. If only binary fields with this
   * name exist, returns null.
   */
  public String get(String name) throws IOException {
    return getDoc().get(name);
  }

  /**
   * Prints the fields of the underlying document for human consumption.
   *
   * If an IOException occurs whilst getting the document, returns null
   */
  public String toString() {
    try {
      return getDoc().toString();
    } catch (IOException e) {
      return null;
    }
  }
}
/**
 * Copyright 2004 The Apache Software Foundation
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.lucene.search;

import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * An iterator over Lucene hits that provides lazy fetching of each document.
 *
 * @author Jeremy Rayner
 */
public class HitIterator implements Iterator {
  private Hits hits;
  private int hitNumber = 0;

  public HitIterator(Hits hits) {
    this.hits = hits;
  }

  public boolean hasNext() {
    return hitNumber < hits.length();
  }

  public Object next() {
    // Constructing a Hit is lazy and never touches the index, so check the
    // bounds explicitly instead of waiting for an IndexOutOfBoundsException.
    if (hitNumber >= hits.length()) {
      throw new NoSuchElementException();
    }
    return new Hit(hits, hitNumber++);
  }

  public void remove() {
    throw new UnsupportedOperationException();
  }

  public int length() {
    return hits.length();
  }
}
