Re: google-search 2

2005-10-12 Thread daniel kessler
thanks for your help everyone.  I didn't realize these options are available.

 I've been using Verity after watching Ray's preso on it. Works sweet, 
 just like google. Suggested spellings even. 
 
Will

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220842
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations  Support: http://www.houseoffusion.com/tiny.cfm/54


google-search 2

2005-10-11 Thread Daniel Kessler
As y'all have read, I'm trying to replicate a google-like search on  
our database.  I think that I have the CF and DB (Oracle) logic  
worked out but I'd like someone else to look it over and see if it's  
right.  I want to search the same way that google does since most  
people are used to that.  I want words separated by spaces, quoted  
text (a phrase) treated as a single word, + meaning that it must have  
the word and - that it must not contain the word.

Here's the search page which allows for a search of a few different  
columns:
http://hhp.umd.edu/people/people_search.cfm

and here's the code that I use (and any improvements are welcome):


 !---
 before this, I make the three arrays described here:
 - search_ar: the OR search items
 - search_must: the AND must contain items, designated  
as + before the word
 - search_not: the items that must not contain in the  
returned results, designated as minus before the word
 ---

CFQUERY NAME=people_search DATASOURCE=dpch
 SELECT  
id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li 
nk
 FROM PEOPLE
 WHERE
 cfloop from=1 to=#arrayLen(search_ar)# index=ii
 UPPER(#PreserveSingleQuotes(the_search_column)#)  
LIKE UPPER('%#search_ar[ii]#%')
 cfif ii LT arrayLen(search_ar)OR/cfif
 /cfloop
 cfif arrayLen(search_ar) GT 0 AND arrayLen 
(search_must) GT 0AND/cfif
 cfif arrayLen(search_must) GT 0
 (
 cfloop from=1 to=#arrayLen(search_must)#  
index=jj
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
 cfif jj LT arrayLen(search_must)AND/cfif
 /cfloop
 )
 /cfif
 cfif (arrayLen(search_must) GT 0 OR (arrayLen 
(search_ar) GT 0 AND arrayLen(search_must) EQ 0)) AND arrayLen 
(search_not) GT 0AND/cfif
 cfif arrayLen(search_not) GT 0
 (
 cfloop from=1 to=#arrayLen(search_not)#  
index=kk
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
 cfif kk LT arrayLen(search_not)AND/cfif
 /cfloop
 )
 /cfif

 !--- ---

 AND status = 'active'
 ORDER BY upper(lname) ASC
 /CFQUERY


thanks!
_

Daniel Kessler

Department of Public and Community Health
University of Maryland
Suite 2387 Valley Drive
College Park, MD  20742-2611
Phone: 301-405-2545
http://hhp.umd.edu




~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220648
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations  Support: http://www.houseoffusion.com/tiny.cfm/54


RE: google-search 2

2005-10-11 Thread Bobby Hartsfield
Here are some searches I ran

sport
sport and eeg
sport or eeg
sport -department
sport -department eeg
sport -department +eeg
associated cognitive
associated cognitive

I didn't put the parens in the search criteria, I just used them here to
denote the actual search criteria I used.

(-department) didn't exclude records with the word in it but

(sport -jessica) did exclude the one record with the word jessica in it when
I searched for sport

The search for 
(sport and eeg) returned many records that had neither word in them (I
searched by bio on all of them)

Same thing with (sport or egg)
Same with (sport and eeg)
Same with (sport or eeg)

You also have some unicode characters in the displayed text. Example: search
the word (home) and look at the last sentence in the first result


..:.:.:.:.:.:.:.:.:.:.:.:.:.:.
Bobby Hartsfield
http://acoderslife.com


-Original Message-
From: Daniel Kessler [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 11, 2005 9:01 AM
To: CF-Talk
Subject: google-search 2

As y'all have read, I'm trying to replicate a google-like search on  
our database.  I think that I have the CF and DB (Oracle) logic  
worked out but I'd like someone else to look it over and see if it's  
right.  I want to search the same way that google does since most  
people are used to that.  I want words separated by spaces, quoted  
text (a phrase) treated as a single word, + meaning that it must have  
the word and - that it must not contain the word.

Here's the search page which allows for a search of a few different  
columns:
http://hhp.umd.edu/people/people_search.cfm

and here's the code that I use (and any improvements are welcome):


 !---
 before this, I make the three arrays described here:
 - search_ar: the OR search items
 - search_must: the AND must contain items, designated  
as + before the word
 - search_not: the items that must not contain in the  
returned results, designated as minus before the word
 ---

CFQUERY NAME=people_search DATASOURCE=dpch
 SELECT  
id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li 
nk
 FROM PEOPLE
 WHERE
 cfloop from=1 to=#arrayLen(search_ar)# index=ii
 UPPER(#PreserveSingleQuotes(the_search_column)#)  
LIKE UPPER('%#search_ar[ii]#%')
 cfif ii LT arrayLen(search_ar)OR/cfif
 /cfloop
 cfif arrayLen(search_ar) GT 0 AND arrayLen 
(search_must) GT 0AND/cfif
 cfif arrayLen(search_must) GT 0
 (
 cfloop from=1 to=#arrayLen(search_must)#  
index=jj
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
 cfif jj LT arrayLen(search_must)AND/cfif
 /cfloop
 )
 /cfif
 cfif (arrayLen(search_must) GT 0 OR (arrayLen 
(search_ar) GT 0 AND arrayLen(search_must) EQ 0)) AND arrayLen 
(search_not) GT 0AND/cfif
 cfif arrayLen(search_not) GT 0
 (
 cfloop from=1 to=#arrayLen(search_not)#  
index=kk
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
 cfif kk LT arrayLen(search_not)AND/cfif
 /cfloop
 )
 /cfif

 !--- ---

 AND status = 'active'
 ORDER BY upper(lname) ASC
 /CFQUERY


thanks!
_

Daniel Kessler

Department of Public and Community Health
University of Maryland
Suite 2387 Valley Drive
College Park, MD  20742-2611
Phone: 301-405-2545
http://hhp.umd.edu






~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220652
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations  Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread daniel kessler
Thanks a bunch for your review.  I have some comments, but I need to re-read 
your reply a few times to see how it works out.

I didn't implement the use of the keywords AND or OR adn I'm not sure if your 
use of them below indicates that they're part of the search criteria or meant 
to be used as keywords.  If  there's an AND in the line, it'll use it as a 
search word and anything containing AND (i.e. sand) will be retrieved.

(-department) didn't exclude records with the word in it but

(sport -jessica) did exclude the one record with the word jessica in it when
I searched for sport

I'll look into this some more.

The search for 
(sport and eeg) returned many records that had neither word in them (I
searched by bio on all of them)

Same thing with (sport or egg)
Same with (sport and eeg)
Same with (sport or eeg)

In this case, though I don't know for sure, since the field displays html as 
html, it might be part of the url.
I did sport eeg (again quotes used just to designate the search text) and 
every record came up with one or the other.  Your results might be do to the 
fact that I am not using AND or OR as keywords - not sure.

You also have some unicode characters in the displayed text. Example: search
the word (home) and look at the last sentence in the first result

I noticed those a few times, thanks.  I think that they're curly quotes.

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220660
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations  Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread Dave Carabetta
On 10/11/05, Daniel Kessler [EMAIL PROTECTED] wrote:
 As y'all have read, I'm trying to replicate a google-like search on
 our database.  I think that I have the CF and DB (Oracle) logic
 worked out but I'd like someone else to look it over and see if it's
 right.  I want to search the same way that google does since most
 people are used to that.  I want words separated by spaces, quoted
 text (a phrase) treated as a single word, + meaning that it must have
 the word and - that it must not contain the word.

 Here's the search page which allows for a search of a few different
 columns:
 http://hhp.umd.edu/people/people_search.cfm

 and here's the code that I use (and any improvements are welcome):


  !---
  before this, I make the three arrays described here:
  - search_ar: the OR search items
  - search_must: the AND must contain items, designated
 as + before the word
  - search_not: the items that must not contain in the
 returned results, designated as minus before the word
  ---

 CFQUERY NAME=people_search DATASOURCE=dpch
  SELECT
 id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li
 nk
  FROM PEOPLE
  WHERE
  cfloop from=1 to=#arrayLen(search_ar)# index=ii
  UPPER(#PreserveSingleQuotes(the_search_column)#)
 LIKE UPPER('%#search_ar[ii]#%')
  cfif ii LT arrayLen(search_ar)OR/cfif
  /cfloop
  cfif arrayLen(search_ar) GT 0 AND arrayLen
 (search_must) GT 0AND/cfif
  cfif arrayLen(search_must) GT 0
  (
  cfloop from=1 to=#arrayLen(search_must)#
 index=jj
  UPPER(#PreserveSingleQuotes
 (the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
  cfif jj LT arrayLen(search_must)AND/cfif
  /cfloop
  )
  /cfif
  cfif (arrayLen(search_must) GT 0 OR (arrayLen
 (search_ar) GT 0 AND arrayLen(search_must) EQ 0)) AND arrayLen
 (search_not) GT 0AND/cfif
  cfif arrayLen(search_not) GT 0
  (
  cfloop from=1 to=#arrayLen(search_not)#
 index=kk
  UPPER(#PreserveSingleQuotes
 (the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
  cfif kk LT arrayLen(search_not)AND/cfif
  /cfloop
  )
  /cfif

  !--- ---

  AND status = 'active'
  ORDER BY upper(lname) ASC
  /CFQUERY


It might have a Google feel from a UI standpoint, but your code
above will more than likely not scale at all. First, when you use
Oracle functions around column names such as:

UPPER(column) = '#UCase(myvalue)#'

Oracle will not use any indexes defined on that column, which means
it's going to do a full table scan for that value (translation: not
scalable). Second, LIKE queries, in general, should be avoided,
particularly for heavy search queries. You should really look into
using Oracle's Full Text Search capabilities, as the performance
difference can be dramatic. It's a little more difficult to implement
up front, but that effort will go a long way to saving your database.
Some might also say that you should be using cfqueryparam as well so
that bind parameters are used, but I've read in several places that
user-defined search queries like you have are actually *better off*
without bind parameters because you don't risk Oracle creating an
initial explain plan for a query based on one set of terms terms, only
to find out that another set of search terms should use a completely
different explain plan.

I don't know what the traffic at that site is like, nor do I know the
number of records your searching against (or will search against in
the future), but I think you should do some more research into Full
Text Searching before committing to your current approach, as I think
you're setting yourself up for trouble, if not now, down the road.

Regards,
Dave.

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220663
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: 
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4
Donations  Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread Adrocknaphobia
I agree with Dave, as I popped in on this thread thinking you were
using OracleText and my laziness has prevented me from writing a
Google-to-OracleText query in PL/SQL. I was hoping you might have
already.

I was able to generate an error when I searched on a NOT term without
specifiying the term. (Just a '-'). It also throws an error if there
are double quotes.

-Adam

On 10/11/05, Dave Carabetta [EMAIL PROTECTED] wrote:
 On 10/11/05, Daniel Kessler [EMAIL PROTECTED] wrote:
  As y'all have read, I'm trying to replicate a google-like search on
  our database.  I think that I have the CF and DB (Oracle) logic
  worked out but I'd like someone else to look it over and see if it's
  right.  I want to search the same way that google does since most
  people are used to that.  I want words separated by spaces, quoted
  text (a phrase) treated as a single word, + meaning that it must have
  the word and - that it must not contain the word.
 
  Here's the search page which allows for a search of a few different
  columns:
  http://hhp.umd.edu/people/people_search.cfm
 
  and here's the code that I use (and any improvements are welcome):
 
 
   !---
   before this, I make the three arrays described here:
   - search_ar: the OR search items
   - search_must: the AND must contain items, designated
  as + before the word
   - search_not: the items that must not contain in the
  returned results, designated as minus before the word
   ---
 
  CFQUERY NAME=people_search DATASOURCE=dpch
   SELECT
  id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li
  nk
   FROM PEOPLE
   WHERE
   cfloop from=1 to=#arrayLen(search_ar)# index=ii
   UPPER(#PreserveSingleQuotes(the_search_column)#)
  LIKE UPPER('%#search_ar[ii]#%')
   cfif ii LT arrayLen(search_ar)OR/cfif
   /cfloop
   cfif arrayLen(search_ar) GT 0 AND arrayLen
  (search_must) GT 0AND/cfif
   cfif arrayLen(search_must) GT 0
   (
   cfloop from=1 to=#arrayLen(search_must)#
  index=jj
   UPPER(#PreserveSingleQuotes
  (the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
   cfif jj LT arrayLen(search_must)AND/cfif
   /cfloop
   )
   /cfif
   cfif (arrayLen(search_must) GT 0 OR (arrayLen
  (search_ar) GT 0 AND arrayLen(search_must) EQ 0)) AND arrayLen
  (search_not) GT 0AND/cfif
   cfif arrayLen(search_not) GT 0
   (
   cfloop from=1 to=#arrayLen(search_not)#
  index=kk
   UPPER(#PreserveSingleQuotes
  (the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
   cfif kk LT arrayLen(search_not)AND/cfif
   /cfloop
   )
   /cfif
 
   !--- ---
 
   AND status = 'active'
   ORDER BY upper(lname) ASC
   /CFQUERY
 

 It might have a Google feel from a UI standpoint, but your code
 above will more than likely not scale at all. First, when you use
 Oracle functions around column names such as:

 UPPER(column) = '#UCase(myvalue)#'

 Oracle will not use any indexes defined on that column, which means
 it's going to do a full table scan for that value (translation: not
 scalable). Second, LIKE queries, in general, should be avoided,
 particularly for heavy search queries. You should really look into
 using Oracle's Full Text Search capabilities, as the performance
 difference can be dramatic. It's a little more difficult to implement
 up front, but that effort will go a long way to saving your database.
 Some might also say that you should be using cfqueryparam as well so
 that bind parameters are used, but I've read in several places that
 user-defined search queries like you have are actually *better off*
 without bind parameters because you don't risk Oracle creating an
 initial explain plan for a query based on one set of terms terms, only
 to find out that another set of search terms should use a completely
 different explain plan.

 I don't know what the traffic at that site is like, nor do I know the
 number of records your searching against (or will search against in
 the future), but I think you should do some more research into Full
 Text Searching before committing to your current approach, as I think
 you're setting yourself up for trouble, if not now, down the road.

 Regards,
 Dave.

 

~|
Discover CFTicket - The leading ColdFusion Help Desk and Trouble 
Ticket application

http://www.houseoffusion.com/banners/view.cfm?bannerid=48

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220740
Archives: 

Re: google-search 2

2005-10-11 Thread Chris Velevitch
On 10/11/05, Daniel Kessler [EMAIL PROTECTED] wrote:
 As y'all have read, I'm trying to replicate a google-like search on
 our database.  I think that I have the CF and DB (Oracle) logic

Doesn't Verity provide some if not most of this?


Chris
--
Chris Velevitch
Manager - Sydney Flash Platform Developers Group
www.flashdev.org.au

~|
Discover CFTicket - The leading ColdFusion Help Desk and Trouble 
Ticket application

http://www.houseoffusion.com/banners/view.cfm?bannerid=48

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220780
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations  Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread Will Tomlinson
I've been using Verity after watching Ray's preso on it. Works sweet, just like 
google. Suggested spellings even. 

Will

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220787
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations  Support: http://www.houseoffusion.com/tiny.cfm/54