Re: google-search 2

2005-10-12 Thread daniel kessler
thanks for your help everyone.  I didn't realize these options are available.

> I've been using Verity after watching Ray's preso on it. Works sweet, 
> just like google. Suggested spellings even. 
> 
Will

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220842
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread Will Tomlinson
I've been using Verity after watching Ray's preso on it. Works sweet, just like 
google. Suggested spellings even. 

Will

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220787
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread Chris Velevitch
On 10/11/05, Daniel Kessler <[EMAIL PROTECTED]> wrote:
> As y'all have read, I'm trying to replicate a google-like search on
> our database.  I think that I have the CF and DB (Oracle) logic

Doesn't Verity provide some if not most of this?


Chris
--
Chris Velevitch
Manager - Sydney Flash Platform Developers Group
www.flashdev.org.au

~|
Discover CFTicket - The leading ColdFusion Help Desk and Trouble 
Ticket application

http://www.houseoffusion.com/banners/view.cfm?bannerid=48

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220780
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread Adrocknaphobia
I agree with Dave, as I popped in on this thread thinking you were
using OracleText and my laziness has prevented me from writing a
Google-to-OracleText query in PL/SQL. I was hoping you might have
already.

I was able to generate an error when I searched on a NOT term without
specifiying the term. (Just a '-'). It also throws an error if there
are double quotes.

-Adam

On 10/11/05, Dave Carabetta <[EMAIL PROTECTED]> wrote:
> On 10/11/05, Daniel Kessler <[EMAIL PROTECTED]> wrote:
> > As y'all have read, I'm trying to replicate a google-like search on
> > our database.  I think that I have the CF and DB (Oracle) logic
> > worked out but I'd like someone else to look it over and see if it's
> > right.  I want to search the same way that google does since most
> > people are used to that.  I want words separated by spaces, quoted
> > text (a phrase) treated as a single word, + meaning that it must have
> > the word and - that it must not contain the word.
> >
> > Here's the search page which allows for a search of a few different
> > columns:
> > http://hhp.umd.edu/people/people_search.cfm
> >
> > and here's the code that I use (and any improvements are welcome):
> >
> >
> >  
> >
> > 
> >  SELECT
> > id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li
> > nk
> >  FROM PEOPLE
> >  WHERE
> >  
> >  UPPER(#PreserveSingleQuotes(the_search_column)#)
> > LIKE UPPER('%#search_ar[ii]#%')
> >  OR
> >  
> >   > (search_must) GT 0>AND
> >  
> >  (
> >   > index="jj">
> >  UPPER(#PreserveSingleQuotes
> > (the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
> >  AND
> >  
> >  )
> >  
> >   > (search_ar) GT 0 AND arrayLen(search_must) EQ 0)) AND arrayLen
> > (search_not) GT 0>AND
> >  
> >  (
> >   > index="kk">
> >  UPPER(#PreserveSingleQuotes
> > (the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
> >  AND
> >  
> >  )
> >  
> >
> >  
> >
> >  AND status = 'active'
> >  ORDER BY upper(lname) ASC
> >  
> >
>
> It might have a Google "feel" from a UI standpoint, but your code
> above will more than likely not scale at all. First, when you use
> Oracle functions around column names such as:
>
> UPPER(column) = '#UCase(myvalue)#'
>
> Oracle will not use any indexes defined on that column, which means
> it's going to do a full table scan for that value (translation: not
> scalable). Second, LIKE queries, in general, should be avoided,
> particularly for heavy search queries. You should really look into
> using Oracle's Full Text Search capabilities, as the performance
> difference can be dramatic. It's a little more difficult to implement
> up front, but that effort will go a long way to saving your database.
> Some might also say that you should be using cfqueryparam as well so
> that bind parameters are used, but I've read in several places that
> user-defined search queries like you have are actually *better off*
> without bind parameters because you don't risk Oracle creating an
> initial explain plan for a query based on one set of terms terms, only
> to find out that another set of search terms should use a completely
> different explain plan.
>
> I don't know what the traffic at that site is like, nor do I know the
> number of records your searching against (or will search against in
> the future), but I think you should do some more research into Full
> Text Searching before committing to your current approach, as I think
> you're setting yourself up for trouble, if not now, down the road.
>
> Regards,
> Dave.
>
> 

~|
Discover CFTicket - The leading ColdFusion Help Desk and Trouble 
Ticket application

http://www.houseoffusion.com/banners/view.cfm?bannerid=48

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220740
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread Dave Carabetta
On 10/11/05, Daniel Kessler <[EMAIL PROTECTED]> wrote:
> As y'all have read, I'm trying to replicate a google-like search on
> our database.  I think that I have the CF and DB (Oracle) logic
> worked out but I'd like someone else to look it over and see if it's
> right.  I want to search the same way that google does since most
> people are used to that.  I want words separated by spaces, quoted
> text (a phrase) treated as a single word, + meaning that it must have
> the word and - that it must not contain the word.
>
> Here's the search page which allows for a search of a few different
> columns:
> http://hhp.umd.edu/people/people_search.cfm
>
> and here's the code that I use (and any improvements are welcome):
>
>
>  
>
> 
>  SELECT
> id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li
> nk
>  FROM PEOPLE
>  WHERE
>  
>  UPPER(#PreserveSingleQuotes(the_search_column)#)
> LIKE UPPER('%#search_ar[ii]#%')
>  OR
>  
>   (search_must) GT 0>AND
>  
>  (
>   index="jj">
>  UPPER(#PreserveSingleQuotes
> (the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
>  AND
>  
>  )
>  
>   (search_ar) GT 0 AND arrayLen(search_must) EQ 0)) AND arrayLen
> (search_not) GT 0>AND
>  
>  (
>   index="kk">
>  UPPER(#PreserveSingleQuotes
> (the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
>  AND
>  
>  )
>  
>
>  
>
>  AND status = 'active'
>  ORDER BY upper(lname) ASC
>  
>

It might have a Google "feel" from a UI standpoint, but your code
above will more than likely not scale at all. First, when you use
Oracle functions around column names such as:

UPPER(column) = '#UCase(myvalue)#'

Oracle will not use any indexes defined on that column, which means
it's going to do a full table scan for that value (translation: not
scalable). Second, LIKE queries, in general, should be avoided,
particularly for heavy search queries. You should really look into
using Oracle's Full Text Search capabilities, as the performance
difference can be dramatic. It's a little more difficult to implement
up front, but that effort will go a long way to saving your database.
Some might also say that you should be using cfqueryparam as well so
that bind parameters are used, but I've read in several places that
user-defined search queries like you have are actually *better off*
without bind parameters because you don't risk Oracle creating an
initial explain plan for a query based on one set of terms terms, only
to find out that another set of search terms should use a completely
different explain plan.

I don't know what the traffic at that site is like, nor do I know the
number of records your searching against (or will search against in
the future), but I think you should do some more research into Full
Text Searching before committing to your current approach, as I think
you're setting yourself up for trouble, if not now, down the road.

Regards,
Dave.

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220663
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: 
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


Re: google-search 2

2005-10-11 Thread daniel kessler
Thanks a bunch for your review.  I have some comments, but I need to re-read 
your reply a few times to see how it works out.

I didn't implement the use of the keywords AND or OR adn I'm not sure if your 
use of them below indicates that they're part of the search criteria or meant 
to be used as keywords.  If  there's an AND in the line, it'll use it as a 
search word and anything containing AND (i.e. sand) will be retrieved.

>(-department) didn't exclude records with the word in it but
>
>(sport -jessica) did exclude the one record with the word jessica in it when
>I searched for sport

I'll look into this some more.

>The search for 
>(sport and eeg) returned many records that had neither word in them (I
>searched by bio on all of them)
>
>Same thing with (sport or egg)
>Same with ("sport" and "eeg")
>Same with ("sport" or "eeg")

In this case, though I don't know for sure, since the field displays html as 
html, it might be part of the url.
I did "sport eeg" (again quotes used just to designate the search text) and 
every record came up with one or the other.  Your results might be do to the 
fact that I am not using AND or OR as keywords - not sure.

>You also have some unicode characters in the displayed text. Example: search
>the word (home) and look at the last sentence in the first result

I noticed those a few times, thanks.  I think that they're curly quotes.

~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220660
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


RE: google-search 2

2005-10-11 Thread Bobby Hartsfield
Here are some searches I ran

sport
sport and eeg
sport or eeg
sport -department
sport -department eeg
sport -department +eeg
associated cognitive
"associated cognitive"

I didn't put the parens in the search criteria, I just used them here to
denote the actual search criteria I used.

(-department) didn't exclude records with the word in it but

(sport -jessica) did exclude the one record with the word jessica in it when
I searched for sport

The search for 
(sport and eeg) returned many records that had neither word in them (I
searched by bio on all of them)

Same thing with (sport or egg)
Same with ("sport" and "eeg")
Same with ("sport" or "eeg")

You also have some unicode characters in the displayed text. Example: search
the word (home) and look at the last sentence in the first result


..:.:.:.:.:.:.:.:.:.:.:.:.:.:.
Bobby Hartsfield
http://acoderslife.com


-Original Message-
From: Daniel Kessler [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 11, 2005 9:01 AM
To: CF-Talk
Subject: google-search 2

As y'all have read, I'm trying to replicate a google-like search on  
our database.  I think that I have the CF and DB (Oracle) logic  
worked out but I'd like someone else to look it over and see if it's  
right.  I want to search the same way that google does since most  
people are used to that.  I want words separated by spaces, quoted  
text (a phrase) treated as a single word, + meaning that it must have  
the word and - that it must not contain the word.

Here's the search page which allows for a search of a few different  
columns:
http://hhp.umd.edu/people/people_search.cfm

and here's the code that I use (and any improvements are welcome):


 


 SELECT  
id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li 
nk
 FROM PEOPLE
 WHERE
 
 UPPER(#PreserveSingleQuotes(the_search_column)#)  
LIKE UPPER('%#search_ar[ii]#%')
 OR
 
 AND
 
 (
 
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
 AND
 
 )
 
 AND
 
 (
 
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
 AND
 
 )
 

 

 AND status = 'active'
 ORDER BY upper(lname) ASC
 


thanks!
_

Daniel Kessler

Department of Public and Community Health
University of Maryland
Suite 2387 Valley Drive
College Park, MD  20742-2611
Phone: 301-405-2545
http://hhp.umd.edu






~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220652
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


google-search 2

2005-10-11 Thread Daniel Kessler
As y'all have read, I'm trying to replicate a google-like search on  
our database.  I think that I have the CF and DB (Oracle) logic  
worked out but I'd like someone else to look it over and see if it's  
right.  I want to search the same way that google does since most  
people are used to that.  I want words separated by spaces, quoted  
text (a phrase) treated as a single word, + meaning that it must have  
the word and - that it must not contain the word.

Here's the search page which allows for a search of a few different  
columns:
http://hhp.umd.edu/people/people_search.cfm

and here's the code that I use (and any improvements are welcome):


 


 SELECT  
id,fname,lname,type,title,dept,phone,location,email,summary,imagePath,li 
nk
 FROM PEOPLE
 WHERE
 
 UPPER(#PreserveSingleQuotes(the_search_column)#)  
LIKE UPPER('%#search_ar[ii]#%')
 OR
 
 AND
 
 (
 
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) LIKE UPPER('%#search_must[jj]#%')
 AND
 
 )
 
 AND
 
 (
 
 UPPER(#PreserveSingleQuotes 
(the_search_column)#) NOT LIKE UPPER('%#search_not[kk]#%')
 AND
 
 )
 

 

 AND status = 'active'
 ORDER BY upper(lname) ASC
 


thanks!
_

Daniel Kessler

Department of Public and Community Health
University of Maryland
Suite 2387 Valley Drive
College Park, MD  20742-2611
Phone: 301-405-2545
http://hhp.umd.edu




~|
Logware (www.logware.us): a new and convenient web-based time tracking 
application. Start tracking and documenting hours spent on a project or with a 
client with Logware today. Try it for free with a 15 day trial account.
http://www.houseoffusion.com/banners/view.cfm?bannerid=67

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:220648
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54