On Jun 30, 2004, at 1:38 PM, George Abraham wrote:

> Dick,
>  The search method employed in an application is a sore topic in the
>  projects I work in. The application is geared towards really
>  comprehensive metadata about media assets stored within the
>  application. We basically follow a lot of the Dublin Core metadata
>  set.
>

Not familiar with that!

>  Our initial search method was to allow searches on individual metadata
>  fields. However our test users were complaining that they wanted
>  google-type simplicity.

I can see the problems with that -- probably a very confusing/tedious UI.

> So fine, we had a basic google-type search for
>  everyone and if they wanted an advanced fine-grain search, they could
>  do that too.

The way Apple appears to handle this (mostly) is the free-form Google
approach -- start typing and hits are displayed/eliminated with each
succeeding keystroke.
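
A minimal sketch of that keystroke-by-keystroke narrowing, in Python.
The item list and function names are invented for illustration; this
is not Apple's implementation, and a real system would query an index
rather than scan a list.

```python
# Hypothetical items to search; stand-ins for files, mail, settings, etc.
ITEMS = ["Desktop Image", "Desktop Pictures", "Dock Preferences", "Display Settings"]


def narrow(items, typed):
    """Return the items that contain every word typed so far (case-insensitive)."""
    words = typed.lower().split()
    return [item for item in items if all(w in item.lower() for w in words)]


def keystroke_trace(items, query):
    """Simulate typing the query one character at a time."""
    return [(query[:n], narrow(items, query[:n])) for n in range(1, len(query) + 1)]


# Show how the hit list shrinks as more of the query is typed.
for typed, hits in keystroke_trace(ITEMS, "desk"):
    print(f"{typed!r:8} -> {hits}")
```

Each extra character can only keep or remove hits, which is what
gives the "eliminated with each keystroke" feel.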

Then the hits are displayed in categories and, based on the categories
found, each category gets its own meaningful filters.

For example, for hits found in email or files, you have filters such
as "from" and "within last (day/week/month)".

For images, there might be additional filters by image type, size,
source, camera f-stop, etc.
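
The category-plus-filter idea could be sketched like this, again in
Python. The hit records, field names, and filter names are all
assumptions made up for the example, not anything Apple showed.

```python
from collections import defaultdict

# Hypothetical hits: each has a "kind" plus metadata specific to that kind.
HITS = [
    {"kind": "email", "title": "Re: search UI", "from": "george", "age_days": 2},
    {"kind": "email", "title": "Search demo notes", "from": "dick", "age_days": 40},
    {"kind": "image", "title": "search-mockup.png", "type": "png", "width": 800},
]

# Filters that make sense for each category (assumed, for illustration only).
FILTERS = {
    "email": {
        "from": lambda hit, value: hit["from"] == value,
        "within_days": lambda hit, value: hit["age_days"] <= value,
    },
    "image": {
        "type": lambda hit, value: hit["type"] == value,
    },
}


def group_by_kind(hits):
    """Bucket hits by category so each bucket can offer its own filters."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit["kind"]].append(hit)
    return dict(groups)


def apply_filter(hits, kind, name, value):
    """Keep only hits of the given kind that pass the named filter."""
    test = FILTERS[kind][name]
    return [h for h in hits if h["kind"] == kind and test(h, value)]


for kind, members in group_by_kind(HITS).items():
    print(kind, "filters:", sorted(FILTERS[kind]), "hits:", len(members))
```

For instance, apply_filter(HITS, "email", "within_days", 7) keeps only
the mail from the last week.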

There appears to be an ability to specify/handle synonyms -- Steve
demoed a Windows user searching for "Wallpaper" and finding the Mac
controls for setting the "Desktop Image".

>  Next they wanted words close to or stems of the word(s) they were
>  searching on. Not too hard using inflectional or etymological stem
>  searching. So that was done.

In a way, the immediate feedback of partial keying handles stemming.

>  Now the problem is to find even distantly related assets to the term
>  that they are searching on. Here is a link to a pdf file that is an
>  article that describes what we are thinking about:
>  http://www.dtsearch.com/images3/PCAI-Fielded_Data_Divide.pdf
>

Interesting -- I skimmed it, but will read more later.

>  I haven't seen the Spotlight preso, but if Spotlight's API can be used
>  to work on databases, that would be cool.

I expect that it will work with databases, and/or maybe it already does.

Steve explained that Apple was trying to develop a technology to
easily/intuitively find what you want from hundreds of thousands of
bits of information.  The emphasis was on "Find" rather than "Search"
... subtle.

Then he stated that they had already solved the problem -- with the
engine that they have in iTunes.

iTunes consists of three file sets:

1) iTunes Music -- a dir containing a subdir per artist, each with a
subdir per album holding the song track files
2) iTunes Library file -- the database used to rapidly search/retrieve
songs
3) iTunes Library.xml -- externally accessible equivalent of 2)

So that's how it works in iTunes -- Apple has equivalent files for
iPhoto, etc.

But Apple is supposed to be working on a db-based system file
structure -- when that comes, "everything" will be in the db.

Don't know if the db-file structure will be in the next OS X (Tiger),
scheduled for the first half of 2005.

If you want to see how the search works (without the filters), you can
download iTunes and load up some songs from CD, whatever.

>  Hopefully this wasn't confusing.

Finally, I worked for IBM in the '60s and '70s.  They had millions of
publications (2nd largest publisher to the US Govt.).  They had a
mainframe-generated index of these docs -- it was called the KWIC
(KeyWord In Context) index.  It looked like hell, was terribly verbose,
but worked like a dream.  Lookup was totally un-automated -- they
published books monthly containing the updated index.

But you could quickly visually scan the index, find the doc (and pub
no) of what you wanted.

The program was rather simple -- mainly eliminate noise words, then
sort.
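
A rough reconstruction of that idea in Python: drop the noise words,
make one entry per remaining keyword with its surrounding title, then
sort. The noise-word list, the titles, and the "PUB-" numbers are made
up for illustration; this is a sketch of the technique, not the
original IBM program.

```python
# Assumed noise-word list; a real one would be much longer.
NOISE_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}

# Hypothetical publication numbers and titles.
TITLES = {
    "PUB-001": "Introduction to the Sorting of Records",
    "PUB-002": "Sorting and Merging Files",
}


def kwic_entries(titles):
    """One (keyword, left context, keyword + right context, doc id) per keyword, sorted."""
    entries = []
    for doc_id, title in titles.items():
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:()").lower()
            if not key or key in NOISE_WORDS:
                continue  # eliminate noise words
            entries.append((key, " ".join(words[:i]), " ".join(words[i:]), doc_id))
    entries.sort()  # then sort, keyword first
    return entries


def kwic_lines(titles, width=30):
    """Format entries so every keyword starts in the same column."""
    return [
        f"{left[-width:]:>{width}}  {right[:width]:<{width}}  {doc_id}"
        for _key, left, right, doc_id in kwic_entries(titles)
    ]


for line in kwic_lines(TITLES):
    print(line)
```

Lining the keywords up in one column is what makes the printed index
quick to scan by eye.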

I have often wondered what this powerful, primitive technique would be
like with a db and a friendly UI.

Dick

>  George
>