Re: [GENERAL] efficient way to do "fuzzy" join

Rémi Cura Fri, 11 Apr 2014 08:59:34 -0700

Hey,
thanks for your answer.

I think you are right, range type with index could at least provide a fast
matching,
thus avoiding the numrow(A) * numrow(B) complexity .


Though I don't see how to use it to interpolate for more than 1st order.

Cheers,
Rémi-C


2014-04-11 17:09 GMT+02:00 Andy Colson <[email protected]>:

> On 4/11/2014 7:50 AM, Rémi Cura wrote:
>
>> Hey dear List,
>>
>> I'm looking for some advice about the best way to perform a "fuzzy"
>> join, that is joining two table based on approximate matching.
>>
>> It is about temporal matching
>> given a table A with rows containing data and a control_time (for
>> instance 1 ; 5; 6;  .. sec, not necessarly rounded of evenly-spaced)
>>
>> given another table B with lines on no precise timing (eg control_time =
>> 2.3 ; 5.8 ; 6.2 for example)
>>
>> How to join every row of B to A based on
>> min(@(A.control_time-B.control_time))
>> (that is, for every row of B, get the row of A that is temporaly the
>> closest),
>> in an efficient way?
>> (to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)
>>
>> Optionnaly, how to get interpolation efficiently (meaning one has to get
>> the previous time and next time for 1 st order interpolation, 2 before
>> and 2 after for 2nd order interpolation, and so on)?
>> (to be explicit 5.8 would match to 5 and 6, the weight being 0.2 and 0.8
>> respectively)
>>
>>
>> Currently my data is spatial so I use Postgis function to interpolate a
>> point on a line, but is is far from efficient or general, and I don't
>> have control on interpolation (only the spatial values are interpolated).
>>
>>
>> Cheers,
>> Rémi-C
>>
>
>
> Have you seen the range type?
>
> http://www.postgresql.org/docs/9.3/static/rangetypes.html
>
> Not fuzzy, but is indexable.
>
> -Andy
>

Re: [GENERAL] efficient way to do "fuzzy" join

Reply via email to