On Friday, 27 January 2012 at 19:36:49 UTC, Andrei Alexandrescu
wrote:
Here's a multikey sorting problem that's bothering me; I think
it's well researched but I can't find the right keywords.
Say we have a collection of people with height and hair length
information. We want to get people who are "tall and
long-haired". More precisely, we want to order people by
rank(p.height) + rank(p.hairlength), where rank(p.k) is the
function that returns the rank of person p if ordered by key k.
The brute force approach would essentially compute the two
ranks and then sort by their sum. That's three sorts and at
least one temporary array. But I think things can be done
considerably better. Any ideas?
I don't think you can get much better than this.
Knowing the rank (for one key) for one element requires looking
at all the other elements. Thus, knowing the rank of all elements
isn't possible in linear time or faster.
Knowing the ranks of only one key doesn't get you much
information regarding the final result. Even the item ranked last
on one key can tie for first place in combination with the other.
These observations seem to impose a strict dependency on the
order the data is calculated. I don't see any shortcuts, but I'd
love to be proven wrong.
You may want to ask on http://cstheory.stackexchange.com/ .