How do you mean "included"? Actual code that is run? One of the things
where AOLserver completely blows PHP out of the water is that PHP
has to
re-interpret everything on every page. So if you include a library and
only use one function from it... AOLserver's library Tcl code is just
there at no extra cost, other than at thread creation.
Exactly: "package require" hugely helps performance, but more
generally, web pages slowly sprout features, such as
"people who also bought...", which add overhead to every page. In my
environment, I have a debug hook that does "package forget" on every
page to make debugging easy, and I turn that off in production.
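The load-once-per-process pattern, plus a per-page "forget" hook for debugging, can be sketched like this. This is a Python illustration of the idea, not AOLserver API; the `require` name and `_loaded` table are made up for the sketch:

```python
import importlib

_loaded = {}  # library name -> module, loaded once per process

def require(name, debug=False):
    # production: like Tcl's "package require" -- load once, reuse forever
    # debug=True: like a per-page "package forget" -- force a fresh load
    if name in _loaded:
        if debug:
            importlib.reload(_loaded[name])
        return _loaded[name]
    _loaded[name] = importlib.import_module(name)
    return _loaded[name]
```

In production every call after the first is a dictionary lookup, which is the "no extra cost other than at thread creation" behavior described above.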
platform, at about 1/2 the speed of serving 1k GIFs, but about 10x
faster than PHP, and 3x faster than lighttpd+FastCGI. So, AOLserver is
How could that be? PHP is better than anything; it's popular for a
reason! ;-)
That's good info, though. I still plan to do my own benchmarks,
with some simulated application so the database and its drivers get
benched too.
Not sure if I will ever get around to it, though.
I did that too, and got around 40 pages/second on my Mac mini against
a single-SQL-query page using MySQL, and 600 pages/second with a
4-query BDB page. The "real" (i.e., in production) full text search
page, against a 100,000 item corpus, runs around 120 pages/second on
my Mac mini (which is an incredibly underpowered machine).
a good platform as far as scripting scalability is concerned, as long
as the developer takes care not to load too much dependent code per
As I said before, I am not clear about what you mean by this; can you
elaborate?
Really, all I mean is per-page feature creep, especially features that
require a DB query on each page load. Most web sites slowly add these.
end of the table, which is a common scenario. In my experience, many
applications that use SQL actually only need key-lookup capability
I guess the answer here is "it depends"; rendering news.bbc.co.uk
could be
done from BDB. But people end up wanting to do queries like "how much
money does the average customer spend per month" that only an RDBMS
can
provide easily. And the "R" in RDBMS is quite important too. I
think the
best use for BerkeleyDB is as a cache; save to the RDBMS, then
export to
Berkeley. Like saving all the fields in a blog entry, then
rendering the
page and stuffing it in a BerkeleyDB for viewing.
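That save-then-export pattern is only a few lines. Here is a Python sketch of the idea, with sqlite3 standing in for the RDBMS and the stdlib dbm module standing in for Berkeley DB; the function names, file names, and schema are all made up for illustration:

```python
import dbm
import sqlite3

RDBMS = sqlite3.connect("blog.db")  # the authoritative store
RDBMS.execute("CREATE TABLE IF NOT EXISTS entries"
              " (id TEXT PRIMARY KEY, title TEXT, body TEXT)")

def save_entry(entry_id, title, body):
    # write the fields to the RDBMS first -- it stays the source of truth
    RDBMS.execute("INSERT OR REPLACE INTO entries VALUES (?, ?, ?)",
                  (entry_id, title, body))
    RDBMS.commit()
    # then render once, at write time, and export the finished page
    html = "<h1>%s</h1>\n<p>%s</p>" % (title, body)
    with dbm.open("pages.db", "c") as cache:
        cache[entry_id] = html

def view_entry(entry_id):
    # page views are pure key lookups; no SQL on the read path
    with dbm.open("pages.db", "r") as cache:
        return cache[entry_id].decode()
```

The write path pays the rendering cost once; every view after that is a single key lookup, which is exactly where BDB shines.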
For the occasional fancy query, in my case (www.bookmooch.com)
things like "what are the most popular book topics", I simply do a
table scan and cache the result (I need to look at AOLserver's time-
expiring cache mechanism; I thought I saw something that did that).
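The time-expiring result cache mentioned here is a small amount of code in any language. A minimal Python sketch of the pattern (illustrative only; AOLserver ships its own ns_cache module, whose exact interface you should check in its docs):

```python
import time

class TTLCache:
    """Minimal time-expiring cache: recompute only after ttl seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.slots = {}  # key -> (timestamp, value)

    def get(self, key, compute):
        hit = self.slots.get(key)
        if hit is not None and time.monotonic() - hit[0] < self.ttl:
            return hit[1]              # still fresh: skip the table scan
        value = compute()              # expired or missing: recompute
        self.slots[key] = (time.monotonic(), value)
        return value
```

A "most popular book topics" page would pass the expensive table scan as `compute` and pick a ttl of, say, an hour.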
I'm not really sure a MySQL table scan + Tcl function for average is
any slower than SQL, because that's exactly what the SQL engine is
doing when you ask for "average of field X for the whole table", and
compiled Tcl is awfully fast; all a SQL query optimizer really is, is
a p-code compiler (just like Tcl's). On my PHP-backed site
www.magnatune.com, the "average stats" pages are run nightly and
cached to HTML, because against MySQL they take about 30s to run.
Now, granted, there is more code to write to do an average via a table
scan with Tcl+bdb than with SQL, but the code is so drop-dead simple
to write, and so far out of the critical path, that I don't worry
about it.
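For what it's worth, the table-scan average really is just a loop. A Python sketch over a dbm-style key/value store, with records stored as JSON (file name, field name, and sample data are all hypothetical):

```python
import dbm
import json

def average_field(path, field):
    # scan every record, summing one numeric field -- the same work
    # "SELECT AVG(field) FROM t" performs under the hood
    total, count = 0.0, 0
    with dbm.open(path, "c") as db:
        for key in db.keys():
            total += json.loads(db[key])[field]
            count += 1
    return total / count if count else None

def demo_average():
    # hypothetical data: three customers' monthly spend
    with dbm.open("spend.db", "c") as db:
        for i, spend in enumerate([10, 20, 30]):
            db[str(i)] = json.dumps({"spend": spend})
    return average_field("spend.db", "spend")
```

Wrapped in a time-expiring cache, a scan like this only runs once per expiry interval rather than on every page load.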
that it's tricky to get right. Google uses Berkeley DB for their
universal login, for just this reason.
It's extremely useful for that. Though many sites don't quite need to
scale to Google-esque proportions.
No, but on the other hand, most interactive web sites have response
times over 3 seconds. Amazon, for instance, is now frequently >10
seconds for a book page load. Yahoo, with its page times
consistently under 1 second, is a pleasure to use.
Not to put too fine a point on it, but if you run a web-based
business and may want to sell your company, one of the first
questions you will hear is "how do you scale?". Engineering for
scalability way beyond your current needs really will help sell your
business (I've sold two web-based businesses, and have been through
this discussion quite a few times).
Thirdly, you'll notice many sites have very poor full-text search
performance. Lucene, a recently popularized full-text search engine,
appears to finally solve this problem. However, in my case I wanted
But Lucene is probably hard to integrate unless you use Java, isn't
it? I
had a play with http://xapian.org/ two years ago. Seems very good,
it'd be
nice to have an AOLserver module for it. Some day, when I have time...
There's a C client-side module for Lucene, which I assume could be
fairly easily integrated into a Tcl module. However, I believe the
server-side is Java, which many are biased against.
I know that going with Berkeley DB is controversial, but in my
opinion it's extremely difficult to scale up a SQL-backed application
Like I said, I would use BDB as a cache and store the "real data" in a
nice relational schema. But maybe that's just my apprehension about a
technology I haven't used beyond some experiments...
db platform and do extensive up-front design work to that effect,
which few people do.
And those are the magic words. Most people don't; they just care about
functionality: "it works, I am done." I am just finishing up a
service (http://www.sativo.co.uk) where search is very important. I
know that even with high concurrency it will do 20 searches/sec on
the current hardware with 100K subscribers. I also designed in
separate pools for reading and writing, so if I do need to scale to
multiple DBs, I can use a simple single-master, multiple-slaves setup
with every web server reading from its own DB and writing to the
master. Due to the low number of writes on this service, that will
scale very well.
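The read/write pool split described above amounts to a trivial router. Here's a Python sketch (pool names and the statement-type test are hypothetical; a real AOLserver setup would express this through its db pool configuration):

```python
import random

# hypothetical pool layout: one master takes all writes,
# reads are spread across the slaves
POOLS = {
    "master": "db-master",
    "slaves": ["db-slave-1", "db-slave-2"],
}

def pick_pool(sql):
    # route by statement type: reads to a slave, everything else
    # (INSERT/UPDATE/DELETE/DDL) to the single master
    verb = sql.lstrip().split(None, 1)[0].upper()
    if verb in ("SELECT", "SHOW", "EXPLAIN"):
        return random.choice(POOLS["slaves"])
    return POOLS["master"]
```

Because only the master accepts writes, replication stays simple, and a read-heavy service scales by adding slaves.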
Yes, generally if you plan on a single-writer architecture, that can
work well. I've frequently seen user session state saved in a backend
SQL database, with a key passed on the URL, so that every page
requires a SQL query, and every state change requires a SQL write.
That sort of architecture is dreadfully hard to scale up, since you
have a many-writers/many-readers scenario. Typically, people use
fancy load-balancing machines that send the same user to the same
machine every time, which is more complicated than I like.
Just my opinion... all I can say is that AOLServer+berkeley-db, if
you can live with a key-lookup database only, is incredibly fast, at
No surprise there really, AOLserver is the fastest server and BDB is
pretty much the fastest thing out there for key/value lookups!
DJ Bernstein (of qmail fame) wrote a read-only database library
that's insanely fast, about 3x faster than bdb in my own tests, which
could be useful in certain scenarios.
-john
--
AOLserver - http://www.aolserver.com/
To Remove yourself from this list, simply send an email to <[EMAIL PROTECTED]>
with the
body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject:
field of your email blank.