On Dec 26, 2008, at 7:24 PM, Philip Hunt wrote:
2008/12/27 J. Andrew Rogers <and...@ceruleansystems.com>:

I think many people greatly underestimate how many gaping algorithm holes there are in computer science for even the most important and mundane tasks.
The algorithm coverage of computer science is woefully incomplete,

Is it? In all my time as a programmer, it's never occurred to me to
think "I wish there was an algorithm to do X". mybe that's just me.
And there are vast numbers of useful algorithms that people use every
day.


Computers are general, so there always exists an obvious algorithm for doing any particular task. Whether or not that obvious algorithm is efficient is quite another thing, since the real costs of various algorithms are far from equivalent even if their functionality is.

The Sieve of Eratosthenes will allow you to factor any integer in theory (sieve out the primes, then trial-divide), but for non-trivial integers you will want to use a number field sieve. The limitations of many types of software are fundamentally rooted in the complexity class of the algorithms they use. We frequently conflate "theoretically impossible" with "no tractable algorithm currently exists".
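To make the tractability point concrete, here is a rough Python sketch of the naive approach (the helper names primes_up_to and factor are just mine for illustration). It works fine for small n and is hopeless at the sizes a number field sieve is built for:

    # Sieve of Eratosthenes: all primes <= limit.
    def primes_up_to(limit):
        sieve = [True] * (limit + 1)
        sieve[0] = sieve[1] = False
        for p in range(2, int(limit ** 0.5) + 1):
            if sieve[p]:
                for multiple in range(p * p, limit + 1, p):
                    sieve[multiple] = False
        return [i for i, is_prime in enumerate(sieve) if is_prime]

    # Trial division by sieved primes; the work grows with sqrt(n).
    def factor(n):
        factors = []
        for p in primes_up_to(int(n ** 0.5) + 1):
            while n % p == 0:
                factors.append(p)
                n //= p
        if n > 1:
            factors.append(n)
        return factors

    print(factor(1818))   # [2, 3, 3, 101]

Trial division like this does on the order of sqrt(n) work, which is astronomically worse than the sub-exponential running time of a number field sieve once n runs to hundreds of digits.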


I wonder (thinking out loud here) are there any statistics for this?
For example if you plot the number of such algorithms that've been
found over time, what sort of curve would you get? (Of course, you'd
have to define "general, elegant algorithm for basic problem", which
might be tricky)


I am still surprised often enough that it is obvious there is a considerable amount of innovation still being done. It both amuses and annoys me no end that some common algorithms have design characteristics reflecting long-forgotten assumptions that do not even make sense in the context in which they are used, e.g. the compulsive tree-balancing behavior applied to intrinsically unbalanced data structures.


In short, we have no idea what important and fundamental
algorithms will be discovered from one year to the next that change the
boundaries of what is practically possible with computer science.

Is this true? It doesn't seem right to me. AIUI the current state of
the art in operating systems, compilers, garbage collectors, etc. is
only slightly more efficient than it was 10 or 20 years ago. (In fact,
most practical programs are a good deal less efficient, because faster
processors mean they don't have to be).


It is easy to forget how many basic algorithms we use ubiquitously are relatively recent. The concurrent B-tree algorithm that is pervasively used in databases, file systems, and just about everything else was published in the 1980s. In fact, most of the algorithms that make up a modern SQL database as we understand them were developed in the 1980s, even though the relational model goes back to the 1960s.


I don't think I understand you. To me "indexing" means what the Google
search engine or an SQL database does -- but you're using the word
with a different meaning aren't you?


I mean it exactly as you understand it: indexed access methods and representations.


Sorry, you've lost me again -- I've never heard of the term
"hyper-rectangles" in relation to relational databases.


Most people haven't, because there are no hyper-rectangles in relational database *implementations*, seeing as there are no useful algorithms for representing them. Nonetheless, the underlying model describes operations in terms of hyper-rectangles in high-dimensional spaces.

In an ideal relational implementation there are never external indexes, only data organized in its native high-dimensional logical space, since external indexes are a de-normalization.
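Roughly, and only as a toy Python sketch (the names here are just mine for illustration, not how any real engine represents anything): each row is a point in a space with one dimension per attribute, and a conjunction of range predicates is a hyper-rectangle in that space.

    # (age, salary, dept_id): each row is a point in a 3-dimensional space.
    rows = [
        (34, 72000, 3),
        (29, 55000, 7),
        (47, 91000, 3),
    ]

    # WHERE age BETWEEN 30 AND 50 AND salary BETWEEN 60000 AND 100000
    # is a hyper-rectangle: one [low, high] interval per attribute.
    query_box = [(30, 50), (60000, 100000), (float("-inf"), float("inf"))]

    def in_box(row, box):
        # True if the point lies inside the hyper-rectangle.
        return all(lo <= value <= hi for value, (lo, hi) in zip(row, box))

    print([r for r in rows if in_box(r, query_box)])
    # [(34, 72000, 3), (47, 91000, 3)]

The linear scan above is the obvious algorithm; the missing piece is an index structure that answers such box queries without touching every row, in arbitrary dimension, in a way that distributes.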


It is not because it is theoretically impossible, but
because it is only possible if someone discovers a general algorithm for
indexing hyper-rectangles -- faking it is not distributable.

How do we know that there is such an algorithm?


We don't unless someone publishes one, but there is a good deal of evidence suggesting that one exists, evidence which also shows that much of the research that has been done was misdirected. Aesthetically, the current algorithms for doing this are nasty, ugly hacks, and that lack of elegance is often an indicator that a better way exists.

In the specific case of indexing hyper-rectangles, the first basic algorithm was published in 1971 (IIRC), but it was supplanted by a completely different family of algorithms in 1981. Virtually all research since has been based on derivatives of the 1981 algorithm, since it appeared to have better properties. Unfortunately, we can now prove that this algorithm class can never yield a general solution, and that a solution must look like a variant of the original 1971 algorithm family that has been ignored for a quarter century. Interestingly, the proof comes by way of the recent explosion of research on massively concurrent data structures, driven by the proliferation of multi-core CPUs. Those published results, applied to this particular problem, both show that we were doing it wrong and give a lot of hints as to how to do it right, suggesting a completely novel type of algorithm that has never been tried.

You have to understand that until a couple of years ago, no one was doing useful research on this particular problem, despite the fact that many, many people had done endless quantities of research trying to solve it, all down an apparent dead-end path. Computer science is full of this kind of thing, and the elegant solutions only look obvious in hindsight.

Cheers,

J. Andrew Rogers


