--On Tuesday, June 06, 2006 9:33 AM -0400 Alex Karasulu <[EMAIL PROTECTED]> wrote:

Quanah Gibson-Mount wrote:



--On Tuesday, June 06, 2006 12:50 AM -0400 "Noel J. Bergman"
<[EMAIL PROTECTED]> wrote:

Quanah Gibson-Mount wrote:

I think the concept of applying all indexing to attributes is in itself
broken.


So is your suggestion that the option be made available, but that by
indexing selectively, Alex's concerns can be effectively addressed?  Do
you have any suggestions as to how that might be provided without losing
ease-of-use for the most common cases?


Well, in OpenLDAP, the way ease of use is met is by users being able
to define a default index type or types.  That way, they can specify
the default set, and then just use index <attribute>, similar to what
is being done in Apache DS.
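
A minimal sketch of what that looks like in slapd.conf, assuming the bdb
backend. The attribute names are illustrative only and are not taken from
the database discussed in this thread:

```
database bdb

# Default index types, used when an index line names no types
index default pres,eq

# These attributes pick up the default (pres,eq)
index cn
index uid,mail

# Explicit types override the default for this attribute
index sn eq,sub
```

The bare "index <attribute>" lines stay short for the common case, while any
attribute that needs more or fewer index types can still say so explicitly.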

I think it is important to allow specification of what indices to use
for a given attribute for a few reasons.  One is that you can use it to
deliberately make some searches slow enough to hinder abuse (for example,
we have a spam troller who routinely tries to pull data from our sources,
which is fairly obnoxious).  Another is that the more indices you have on
an attribute, the larger the total database is, and the longer it takes
to load.  This of course depends in part on the OS/CPU used as well.
For example, I currently index 90 attributes in my database to varying
degrees (most are eq, which is a fairly minimal index).

Note that ApacheDS indices are very similar to OpenLDAP equality indices
with some minor differences for handling substring matching.  The cost
is about the same as an eq index.  So you get sub, eq, and existence for
the price of eq.


Ah, okay.  That is handy.

On my Solaris sparc systems, it takes 2.5ish hours to load the database.


What kind of sparc is that?  I have a blade 2000 here if you want to try
a current machine benchmark.  I could create an account for you to
test.  Just let me know.


My current systems are SunFire V120's, with a 650 MHz CPU and 4GB of RAM. I've played with more modern systems (8 CPU T2000's with 4-cores), and they are definitely faster, but the same underlying issue of needing to use a memory cache during bulk loads still applies. That may be an OpenLDAP/BDB-specific thing that ApacheDS wouldn't encounter, though.


On my new AMD systems that'll be replacing the Sun Sparc boxes, it
takes all of 14.5 minutes.  However, if all 90 of those attributes
were getting indexed pres,eq,sub, the amount of time to load would
increase significantly.

Currently, my indices take up 1.1GB of disk space in OpenLDAP (I'm not
sure exactly how that maps out in Apache DS).  My database entry file
takes 2.7GB.  So my indices are approximately 1/3 of my database size.

Yeah, the cost of disk space is just about the same, but that's the least
of our worries.  Disk is cheap, as Emmanuel stated.

Nods, memory was more my concern, but it may not apply in the ApacheDS case, given the use of a different database backend and the difference in how indices are done.

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html
