On Friday, 27 January 2012 at 19:36:49 UTC, Andrei Alexandrescu wrote:
Here's a multikey sorting problem that's bothering me; I think it's well researched but I can't find the right keywords.

Say we have a collection of people with height and hair length information. We want to get people who are "tall and long-haired". More precisely, we want to order people by rank(p.height) + rank(p.hairlength), where rank(p.k) is the function that returns the rank of person p if ordered by key k.

The brute force approach would essentially compute the two ranks and then sort by their sum. That's three sorts and at least one temporary array. But I think things can be done considerably better. Any ideas?

I don't think you can get much better than this.

Knowing the rank (for one key) for one element requires looking at all the other elements. Thus, knowing the rank of all elements isn't possible in linear time or faster.

Knowing the ranks of only one key doesn't get you much information regarding the final result. Even the item ranked last on one key can tie for first place in combination with the other.

These observations seem to impose a strict dependency on the order the data is calculated. I don't see any shortcuts, but I'd love to be proven wrong.

You may want to ask on http://cstheory.stackexchange.com/ .

Reply via email to