On Dec 10, 2009, at 1:26 PM, Yonik Seeley wrote:

> I wouldn't necessarily link FieldType.isPolyField() to the idea of a
> poly value source... they are two different things.

Yep.  The word Poly is overloaded here to mean multiple ValueSources, but it 
isn't necessarily tied to there being a poly field, even though a PolyField 
would likely create a PolyValueSource.

> For example, if NumericField had not already been written in Lucene, I
> would have perhaps just indexed both the lat and lon into the same
> lucene field.  That part can be more of an implementation detail, and
> does not reflect the semantics of the field (the fact that it contains
> both a lat and lon).

Maybe, there are tradeoffs though.  

Let's get concrete and look at the VectorDistanceFunction (dist()).  It can 
currently take in an even number of ValueSource instances, and the distance 
method essentially boils down to (for Euclidean Distance):
double result = 0;
for (int i = 0; i < docValues1.length; i++) {
        double v = docValues1[i].doubleVal(doc) - docValues2[i].doubleVal(doc);
        result += v * v;
}
result = Math.sqrt(result);

For example, a call to this might be:
dist(power = 2, x1, y1, x2, y2)
where the xi, yi are ValueSources.  Note that "power = 2" is just me showing 
what the first parameter is, so that no one wonders why there is an extra 
number in there.
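
To make the role of the power parameter concrete, here is a rough sketch of the general form on plain double arrays.  This is not the actual VectorDistanceFunction code: the class and method names are made up for illustration, and it assumes a finite power >= 1 (power = 2 gives the Euclidean case above, power = 1 gives Manhattan distance).

```java
// Hypothetical sketch: names are illustrative, not Solr API.
final class DistanceSketch {
    /**
     * p-norm distance between two points of equal dimension.
     * Assumes power is finite and >= 1; power = 2 is Euclidean distance.
     */
    static double distance(double power, double[] p1, double[] p2) {
        if (p1.length != p2.length) {
            throw new IllegalArgumentException("Points must have the same dimension");
        }
        double result = 0;
        for (int i = 0; i < p1.length; i++) {
            // Sum |difference|^power over every dimension
            result += Math.pow(Math.abs(p1[i] - p2[i]), power);
        }
        // Take the power-th root of the sum
        return Math.pow(result, 1.0 / power);
    }
}
```

With this shape, dist(2, x1, y1, x2, y2) corresponds to distance(2, {x1, y1}, {x2, y2}) evaluated per document.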

Now, assuming PointType fields named point1 and point2 (along with the 
others above), one could have:

dist(2,  point1, point2)  //distance between two PointTypes
dist(2, point1, x1, y2) //distance between a PointType and a user-defined point.

While I think this can be coded up in the lat/lon case (i.e. two values), I 
think it gets hairy when you consider a point in n-dimensional space.

My inclination is to fudge on this and do something in ValueSourceParser for 
each of the functions that can deal w/ poly fields (my gut says most can't) 
like:
addParser("dist", new ValueSourceParser() {
      public ValueSource parse(FunctionQParser fp) throws ParseException {
        float power = fp.parseFloat();
        List<ValueSource> sources = fp.parseValueSourceList();
        // Expand any PolyValueSource into its component sources.  This is
        // done unconditionally, since dist(2, point1, point2) has an even
        // number of arguments but still needs expanding.
        List<ValueSource> newSources = new ArrayList<ValueSource>();
        for (ValueSource source : sources) {
          if (source instanceof PolyValueSource) {
            newSources.addAll(((PolyValueSource) source).getValueSources());
          } else {
            newSources.add(source);  // just like the old one
          }
        }
        sources = newSources;
        // Do the even check after expansion
        if (sources.size() % 2 != 0) {
          throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
              "Illegal number of sources.  There must be an even number of sources");
        }
        int dim = sources.size() / 2;
        List<ValueSource> sources1 = new ArrayList<ValueSource>(dim);
        List<ValueSource> sources2 = new ArrayList<ValueSource>(dim);
        splitSources(dim, sources, sources1, sources2);
        return new VectorDistanceFunction(power, sources1, sources2);
      }
    });

Of course, this requires documentation, etc. for others to be able to do the 
same for their custom Functions, but that is surmountable.  
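
For anyone who wants to play with the expand-then-split idea outside of Solr, here is a minimal self-contained sketch.  Source and PolySource are hypothetical stand-ins for ValueSource and the proposed PolyValueSource, and expand() and split() are illustrative names, not existing Solr methods.  The sketch flattens every poly source before doing the even check, so both dist(2, point1, point2) and dist(2, point1, x1, y2) end up as four plain sources.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Solr API.
final class PolyExpandSketch {
    /** Stand-in for a single-valued ValueSource. */
    interface Source {}

    /** Stand-in for a PolyValueSource wrapping several component sources. */
    interface PolySource extends Source {
        List<Source> getSources();
    }

    /** Flattens poly sources into their components, then checks for an even count. */
    static List<Source> expand(List<? extends Source> sources) {
        List<Source> flat = new ArrayList<Source>();
        for (Source source : sources) {
            if (source instanceof PolySource) {
                flat.addAll(((PolySource) source).getSources());
            } else {
                flat.add(source);  // plain sources pass through unchanged
            }
        }
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException(
                "Illegal number of sources.  There must be an even number of sources");
        }
        return flat;
    }

    /** Puts the first half of the sources into first and the rest into second. */
    static void split(List<Source> sources, List<Source> first, List<Source> second) {
        int dim = sources.size() / 2;
        first.addAll(sources.subList(0, dim));
        second.addAll(sources.subList(dim, sources.size()));
    }
}
```

Because the flattening does not care how many components a poly source has, the same code should cover the n-dimensional case as long as the two points contribute the same number of components in total.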

-Grant
