> In the following query, it is possible to limit the amount of entries
>returned by an outer join to a single value? I want to obtain a single
>country from ipv4geotable for each entry in logontable.

Yes, the PTF DENSE_RANK()/ROW_NUMBER() basically gives you that - you can
read the first row out of each logon.IP except, there¹s no way to force
which country wins over the other without an order by country in the
OVER() clause as well.

That said, it will only get slower to produce 1 row per group, because of
the distributed nature of the SQL engine, the reduction of data happens
after a ordering shuffle.

You¹re doing range joins in a SQL engine without theta joins and MapReduce
had no way to implement those at runtime (Tez has, with EdgeManager
plugins).

The easiest/traditional approach out of doing geo-IP lookups is a compact
UDF model without any joins at all.

There¹s some old threads on discussing this as a built-in & some code
(with potential licensing issues) -
http://markmail.org/message/w54j4upwg2wbh3xg

Cheers,
Gopal


Reply via email to