On Dec 10, 2009, at 1:44 PM, Grant Ingersoll wrote:
>
> On Dec 10, 2009, at 1:26 PM, Yonik Seeley wrote:
>
>> I wouldn't necessarily link FieldType.isPolyField() to the idea of a
>> poly value source... they are two different things.
>
> Yep. The word Poly is overloaded here to mean multiple ValueSources, but it
> isn't necessarily tied to there being a poly field, even though a PolyField
> would likely create a PolyValueSource.
>
>> For example, if NumericField had not already been written in Lucene, I
>> would have perhaps just indexed both the lat and lon into the same
>> lucene field. That part can be more of an implementation detail, and
>> does not reflect the semantics of the field (the fact that it contains
>> both a lat and lon).
>
> Maybe, but there are tradeoffs.
>
> Let's get concrete and look at the VectorDistanceFunction (dist()). It can
> currently take in an even number of ValueSource instances, and the distance
> method essentially boils down to (for Euclidean Distance):
>   for (int i = 0; i < docValues1.length; i++) {
>     double v = docValues1[i].doubleVal(doc) - docValues2[i].doubleVal(doc);
>     result += v * v;
>   }
>   result = Math.sqrt(result);
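For anyone following along, here is a minimal self-contained sketch of the same computation on plain arrays rather than DocValues. The class and method names are made up for illustration, and treating the first parameter as the exponent of a p-norm is an assumption; the loop above only shows the power = 2 (Euclidean) case.

```java
// Illustrative only: VectorDistanceSketch is a hypothetical name, not a Solr class.
final class VectorDistanceSketch {

    /**
     * Distance between two equal-length coordinate arrays.
     * Assumption: 'power' is the exponent p of a p-norm (2 = Euclidean, 1 = Manhattan).
     */
    static double distance(double power, double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Both points must have the same dimension");
        }
        if (power <= 0) {
            throw new IllegalArgumentException("This sketch only handles power > 0");
        }
        double result = 0;
        for (int i = 0; i < a.length; i++) {
            double v = Math.abs(a[i] - b[i]);
            // power == 2 mirrors the v * v loop from the message
            result += (power == 2) ? v * v : Math.pow(v, power);
        }
        return (power == 2) ? Math.sqrt(result) : Math.pow(result, 1.0 / power);
    }
}
```

Calling `VectorDistanceSketch.distance(2, new double[] {0, 0}, new double[] {3, 4})` gives the familiar 3-4-5 triangle result of 5.0.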
>
> For example, a call to this might be:
>   dist(power = 2, x1, y1, x2, y2)
> where xi and yi are ValueSources. (The "power = 2" is only there to label the
> first parameter, so that no one wonders why there is an extra number in there.)
>
> Now, assuming PointType fields named point1 and point2 (along with the
> others above), one could have:
>
>   dist(2, point1, point2)  // distance between two PointTypes
>   dist(2, point1, x1, y1)  // distance between a PointType and a user-defined point
>
D'oh, I see another way of doing this: have the Distance functions only work
with points. The second case above then becomes:
  dist(2, point1, point(x1, y1));
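To make the points-only idea a bit more tangible, here is a rough standalone sketch. None of these types exist in Solr: PointSource, point() and dist() are hypothetical stand-ins, and a scalar "value source" is modeled as a plain function from doc id to double. The point is only that once everything is wrapped as a point, the distance code has a single case to handle, whatever the dimension.

```java
import java.util.function.IntToDoubleFunction;

// Illustrative only: every name here is hypothetical, not Solr API.
final class PointSketch {

    /** A per-document point: a fixed number of per-document coordinates. */
    interface PointSource {
        int dimension();
        double coordinate(int doc, int axis);
    }

    /** Stand-in for point(x1, y1, ...): wraps scalar sources into one point. */
    static PointSource point(IntToDoubleFunction... axes) {
        return new PointSource() {
            public int dimension() { return axes.length; }
            public double coordinate(int doc, int axis) {
                return axes[axis].applyAsDouble(doc);
            }
        };
    }

    /** Distance between two points for one document; assumes power > 0 is a p-norm exponent. */
    static double dist(double power, PointSource a, PointSource b, int doc) {
        if (a.dimension() != b.dimension()) {
            throw new IllegalArgumentException("Points must have the same dimension");
        }
        double sum = 0;
        for (int axis = 0; axis < a.dimension(); axis++) {
            double v = Math.abs(a.coordinate(doc, axis) - b.coordinate(doc, axis));
            sum += Math.pow(v, power);
        }
        return Math.pow(sum, 1.0 / power);
    }
}
```

With this shape, a PointType field and a user-defined point(x1, y1) look identical to dist(), so the function no longer needs to know where the coordinates came from.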
-Grant
> While I think this can be coded up in the lat/lon case (i.e. two values), I
> think it gets hairy when you consider a point in n-dimensional space.
>
> My inclination is to fudge on this and do something in ValueSourceParser for
> each of the functions that can deal w/ poly fields (my gut says most can't)
> like:
> addParser("dist", new ValueSourceParser() {
>   public ValueSource parse(FunctionQParser fp) throws ParseException {
>     float power = fp.parseFloat();
>     List<ValueSource> sources = fp.parseValueSourceList();
>     if (sources.size() % 2 != 0) {
>       // expand if needed
>       List<ValueSource> newSources = new ...;
>       for (ValueSource source : sources) {
>         if (source instanceof PolyValueSource) {
>           List<ValueSource> tmp = ((PolyValueSource) source).getValueSources();
>           newSources.addAll(tmp);
>         } else {
>           newSources.add(source); // just like the old one
>         }
>       }
>       sources = newSources;
>       // do the even check again here
>       if (sources.size() % 2 != 0) {
>         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
>             "Illegal number of sources. There must be an even number of sources");
>       }
>     }
>     int dim = sources.size() / 2;
>     List<ValueSource> sources1 = new ArrayList<ValueSource>(dim);
>     List<ValueSource> sources2 = new ArrayList<ValueSource>(dim);
>     splitSources(dim, sources, sources1, sources2);
>     return new VectorDistanceFunction(power, sources1, sources2);
>   }
> });
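The expand-then-split step in the parser above can be tried out in isolation with stand-in types. This is only a sketch: Source, Scalar and Poly are hypothetical placeholders for ValueSource and the proposed PolyValueSource, and splitting into first half / second half is a guess at what splitSources would do. One deliberate difference, also an assumption: the sketch flattens unconditionally rather than only when the count is odd, since a call like dist(2, point1, point2) already has an even number of arguments before expansion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: stand-in types, not Solr classes.
final class FlattenSketch {

    interface Source {}

    /** Stand-in for an ordinary single-valued ValueSource. */
    static final class Scalar implements Source {
        final String name;
        Scalar(String name) { this.name = name; }
    }

    /** Stand-in for the proposed PolyValueSource: a bundle of sources. */
    static final class Poly implements Source {
        final List<Source> parts;
        Poly(Source... parts) { this.parts = Arrays.asList(parts); }
    }

    /** Expands every Poly into its parts, then checks the count is even. */
    static List<Source> flatten(List<Source> sources) {
        List<Source> flat = new ArrayList<>();
        for (Source s : sources) {
            if (s instanceof Poly) {
                flat.addAll(((Poly) s).parts);
            } else {
                flat.add(s); // scalar sources pass through unchanged
            }
        }
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException(
                "Illegal number of sources. There must be an even number of sources");
        }
        return flat;
    }

    /** Assumed split: first half is point one, second half is point two. */
    static List<List<Source>> split(List<Source> flat) {
        int dim = flat.size() / 2;
        List<List<Source>> halves = new ArrayList<>();
        halves.add(new ArrayList<>(flat.subList(0, dim)));
        halves.add(new ArrayList<>(flat.subList(dim, flat.size())));
        return halves;
    }
}
```

Under these assumptions, dist(2, point1, x1, y1) flattens to four sources and splits into two 2-dimensional points, while dist(2, point1, x1) is rejected with the same error message as before.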
>
> Of course, this requires documentation, etc. for others to be able to do the
> same for their custom Functions, but that is surmountable.
>
> -Grant
>
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
Solr/Lucene:
http://www.lucidimagination.com/search