On Dec 10, 2009, at 1:26 PM, Yonik Seeley wrote:
> I wouldn't necessarily link FieldType.isPolyField() to the idea of a
> poly value source... they are two different things.
Yep. The word Poly is overloaded here to mean multiple ValueSources, but it
isn't necessarily tied to there being a poly field, even though a PolyField
likely would create a PolyValueSource.
> For example, if NumericField had not already been written in Lucene, I
> would have perhaps just indexed both the lat and lon into the same
> lucene field. That part can be more of an implementation detail, and
> does not reflect the semantics of the field (the fact that it contains
> both a lat and lon).
Maybe, there are tradeoffs though.
Let's get concrete and look at the VectorDistanceFunction (dist()). It can
currently take in an even number of ValueSource instances, and the distance
method essentially boils down to (for Euclidean Distance):
for (int i = 0; i < docValues1.length; i++) {
  double v = docValues1[i].doubleVal(doc) - docValues2[i].doubleVal(doc);
  result += v * v;
}
result = Math.sqrt(result);
For example, a call to this might be:
dist(2, x1, y1, x2, y2)
where xi, yi are ValueSources. The leading 2 is the power parameter; I'm
calling it out so that no one wonders why there is an extra number in there.
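To make the role of the power parameter concrete, here is a rough standalone sketch. It assumes power is the exponent of a p-norm, so that power = 2 reduces to the Euclidean loop above; that generalization is an assumption on my part, not something spelled out here. The class and method names are hypothetical, and plain double arrays stand in for the per-document values that docValues1[i].doubleVal(doc) would return.

```java
// Hypothetical standalone sketch, not the actual VectorDistanceFunction.
final class PNormDistanceSketch {

    // Assumption: power is a finite p-norm exponent >= 1.
    // power = 2 gives Euclidean distance, power = 1 gives Manhattan distance.
    static double distance(double power, double[] v1, double[] v2) {
        if (v1.length != v2.length) {
            throw new IllegalArgumentException(
                "Both points need the same number of dimensions");
        }
        if (!(power >= 1) || Double.isInfinite(power)) {
            throw new IllegalArgumentException("power must be a finite value >= 1");
        }
        double result = 0;
        for (int i = 0; i < v1.length; i++) {
            // Same per-dimension difference as in the Euclidean loop,
            // raised to an arbitrary power instead of squared.
            double v = Math.abs(v1[i] - v2[i]);
            result += Math.pow(v, power);
        }
        return Math.pow(result, 1.0 / power);
    }

    public static void main(String[] args) {
        double[] a = {0, 0};
        double[] b = {3, 4};
        System.out.println("power=2: " + distance(2, a, b));
        System.out.println("power=1: " + distance(1, a, b));
    }
}
```

The two arrays play the role of the first and second half of the ValueSource list, which is why the number of sources has to be even.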
Now, assuming PointType fields named point1 and point2 (along with the
others above), one could have:
dist(2, point1, point2) //distance between two PointTypes
dist(2, point1, x1, y2) //distance between a PointType and a user defined point.
While I think this can be coded up in the lat/lon case (i.e. two values) I
think it gets hairy when you consider a point in n-dim. space.
My inclination is to fudge on this and do something in ValueSourceParser for
each of the functions that can deal w/ poly fields (my gut says most can't)
like:
addParser("dist", new ValueSourceParser() {
  public ValueSource parse(FunctionQParser fp) throws ParseException {
    float power = fp.parseFloat();
    List<ValueSource> sources = fp.parseValueSourceList();
    // Expand any PolyValueSource into its component sources. Do this even
    // when the count is already even, e.g. dist(2, point1, point2).
    List<ValueSource> newSources = new ArrayList<ValueSource>();
    for (ValueSource source : sources) {
      if (source instanceof PolyValueSource) {
        List<ValueSource> tmp =
            ((PolyValueSource) source).getValueSources();
        newSources.addAll(tmp);
      } else {
        newSources.add(source); // just like the old one
      }
    }
    sources = newSources;
    // Do the even check after expansion
    if (sources.size() % 2 != 0) {
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
          "Illegal number of sources. There must be an even number of sources");
    }
    int dim = sources.size() / 2;
    List<ValueSource> sources1 = new ArrayList<ValueSource>(dim);
    List<ValueSource> sources2 = new ArrayList<ValueSource>(dim);
    splitSources(dim, sources, sources1, sources2);
    return new VectorDistanceFunction(power, sources1, sources2);
  }
});
Of course, this requires documentation, etc. for others to be able to do the
same for their custom Functions, but that is surmountable.
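For anyone who wants to play with the expansion step outside of Solr, here is a minimal standalone sketch. The ValueSource and PolyValueSource interfaces below are hypothetical stand-ins (the real ValueSource is a class with much more to it), the name() method exists only so the demo can print something, and the body of splitSources is my guess at what such a helper would do: first half of the list is one point, second half is the other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch of the poly expansion idea, not Solr code.
final class PolyExpansionSketch {

    // Stand-in for a single-valued source; name() is only for the demo.
    interface ValueSource {
        String name();
    }

    // Stand-in for a source that wraps several component sources.
    interface PolyValueSource extends ValueSource {
        List<ValueSource> getValueSources();
    }

    static ValueSource single(final String name) {
        return new ValueSource() {
            public String name() { return name; }
        };
    }

    static PolyValueSource poly(final String name, final ValueSource... parts) {
        return new PolyValueSource() {
            public String name() { return name; }
            public List<ValueSource> getValueSources() { return Arrays.asList(parts); }
        };
    }

    // Replace each poly source with its components (one level deep),
    // then require an even number of sources.
    static List<ValueSource> expand(List<ValueSource> sources) {
        List<ValueSource> expanded = new ArrayList<ValueSource>();
        for (ValueSource source : sources) {
            if (source instanceof PolyValueSource) {
                expanded.addAll(((PolyValueSource) source).getValueSources());
            } else {
                expanded.add(source);
            }
        }
        if (expanded.size() % 2 != 0) {
            throw new IllegalArgumentException(
                "Illegal number of sources. There must be an even number of sources");
        }
        return expanded;
    }

    // Assumed behavior: the first dim sources form point 1, the rest form point 2.
    static void splitSources(int dim, List<ValueSource> sources,
                             List<ValueSource> sources1, List<ValueSource> sources2) {
        sources1.addAll(sources.subList(0, dim));
        sources2.addAll(sources.subList(dim, sources.size()));
    }

    static List<String> names(List<ValueSource> sources) {
        List<String> result = new ArrayList<String>();
        for (ValueSource source : sources) {
            result.add(source.name());
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors a call shaped like dist(2, point1, x2, y2).
        ValueSource point1 = poly("point1", single("point1_x"), single("point1_y"));
        List<ValueSource> sources = new ArrayList<ValueSource>();
        sources.add(point1);
        sources.add(single("x2"));
        sources.add(single("y2"));

        List<ValueSource> expanded = expand(sources);
        int dim = expanded.size() / 2;
        List<ValueSource> sources1 = new ArrayList<ValueSource>(dim);
        List<ValueSource> sources2 = new ArrayList<ValueSource>(dim);
        splitSources(dim, expanded, sources1, sources2);
        System.out.println(names(sources1) + " vs " + names(sources2));
    }
}
```

One caveat the sketch makes visible: the even check only counts sources. It cannot tell a 2-d point followed by four scalars from two 3-d points, which is part of why the n-dim. case gets hairy.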
-Grant