Re: verity & cfscript
We did this using Oracle Text for a number of apps a while ago and I've never looked back. Other DBs have good "free text" indexing tools too.

mxAjax / CFAjax docs and other useful articles:
http://www.bifrost.com.au/blog/

2008/10/9 Dave l:
> Anyways... just talked with client and we are pulling it and looking at
> just putting all the info into a db and making it just a db search.

Adobe® ColdFusion® 8 software 8 is the most important and dramatic release to date
Get the Free Trial
http://ad.doubleclick.net/clk;207172674;29440083;f

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313653
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4
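For anyone weighing the "just make it a db search" route, here is a minimal sketch of the idea in Python. Everything in it is an assumption for illustration: the `pages` table, its columns, and the sample rows are made up, SQLite stands in for whatever database the site really uses, and a plain LIKE match stands in for a real free-text index such as Oracle Text.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory table of page content (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO pages (url, title, body) VALUES (?, ?, ?)", rows
    )
    return conn


def search(conn, term):
    """Return URLs whose title or body contains the term.

    The term is passed as a bound parameter, never spliced into the SQL.
    Note that % and _ inside the term still act as LIKE wildcards.
    """
    pattern = f"%{term}%"
    cur = conn.execute(
        "SELECT url FROM pages WHERE title LIKE ? OR body LIKE ? ORDER BY url",
        (pattern, pattern),
    )
    return [row[0] for row in cur]


conn = build_db([
    ("/press.cfm", "Press releases", "Latest press coverage for the site."),
    ("/about.cfm", "About us", "Who we are and what we do."),
])
print(search(conn, "press"))
```

Because only stored content is searched, source code can never leak into the results; the trade-off is that the content has to live in the database in the first place.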
Re: verity & cfscript
> I get that when I tell it to index a cfm page that it indexes the "page", I
> just think it would make more sense to index the resulting output and not
> the code. When google indexes the pages it doesn't index the source code
> per se.

When Google indexes the pages, it fetches them via HTTP. Verity doesn't; by itself, it simply can't. You tell Verity where your files are on your filesystem. What is the URL for c:\myfiles? That's why Verity includes a crawler: vspider.

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/

Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information!

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313640
Re: verity & cfscript
I get that when I tell it to index a cfm page it indexes the "page"; I just think it would make more sense to index the resulting output and not the code. When Google indexes the pages it doesn't index the source code per se. The way it is set up now seems to assume that the majority of users want it to index and display source code, and I have a hard time believing that is the case.

I was wondering something along the lines of what Ray said: it has rules, and I would assume one of its rules might be to do something when it finds the word "script" between tags, which might explain why it is happening with cfscript but not cfset.

Verity works well for document searching, but I am still a little puzzled about it.

Anyways... just talked with client and we are pulling it and looking at just putting all the info into a db and making it just a db search.

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313623
Re: verity & cfscript
> I figured as much when I was doing it but if it does index the source code
> then why wouldn't it index all of the source code and not just what's inside
> a cfscript tag? Or why would it not index the source code when moved inside
> a cfset tag instead of cfscript? Or why would I have the EXACT same code on
> other pages and they don't show in the results?

If the indexer parses an HTML page and indexes the content, it shouldn't index the HTML tags themselves. I would expect it to do much the same with CFML tags.

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313597
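One plausible reading of the cfscript vs. cfset puzzle, sketched in Python. This uses Python's standard html.parser, not Verity's own filter, so treat it as an assumption about how a generic tag-stripping indexer behaves: a cfset keeps its assignment inside the tag itself, while a cfscript block keeps its code as text between an opening and a closing tag. The sample markup below is made up for the illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text that sits between tags, as a tag-stripper would."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(source):
    """Return the text left over once every tag has been discarded."""
    parser = TextOnly()
    parser.feed(source)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page fragments: the same assignment written both ways.
WITH_CFSET = '<cfset strTitle = "site title"><h1>Press</h1>'
WITH_CFSCRIPT = '<cfscript>strTitle = "site title";</cfscript><h1>Press</h1>'

print(visible_text(WITH_CFSET))
print(visible_text(WITH_CFSCRIPT))
```

With the cfset version the assignment vanishes along with its tag; with the cfscript version the tags are dropped but the code between them is kept as ordinary text, which matches the kind of result reported in this thread. It does not explain why identical pages behave differently.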
Re: verity & cfscript
Verity _can_ do the job. You told it to index code pages, and it did. ;) I know it's been said on the thread before, but it is still true. We could say that the CF Admin's use of CFM as one of the defaults is a bit wrong, but you can edit that easily enough.

If you want to index _data_ from your CF site, it may be easier to simply index a query. I find that a heck of a lot simpler than using vspider.

As for why you see Verity possibly indexing only part of the code... Verity has rules for how it indexes content. These rules include things like ignoring unimportant words such as "the". It is very possible that some of the CF code is simply being ignored. Verity doesn't understand CF. I'm sure you would see similar oddness with indexed PHP code as well.

On Wed, Oct 8, 2008 at 12:49 AM, Dave l <[EMAIL PROTECTED]> wrote:
> Yes and no dave..
>
> I figured as much when I was doing it but if it does index the source code
> then why wouldn't it index all of the source code and not just what's inside
> a cfscript tag? Or why would it not index the source code when moved inside
> a cfset tag instead of cfscript? Or why would I have the EXACT same code on
> other pages and they don't show in the results?
>
> the output is:
>
> index.cfm.cfm
> strTitle = "title blah blah blah"; strDesc = "desc blah blah blah"; some
> reg generated page content here.

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313596
RE: verity & cfscript
I'm sure there's something else at play there. I could point you to some sites where searching for cfif will show you source, but I don't think it's fair on them :)

Not really what you want to hear, but I'd use vspider or think about getting the in-page content into some other form: database, HTML files etc.

Here's one problem with what you're doing:

<cfif someCondition>Apples<cfelse>Oranges</cfif>

If you index that the way you're doing, a search for apples will always turn up this page and a search for oranges will always turn up this page. Verity isn't browsing to it, so the condition is never evaluated and the page gets indexed for both someCondition and NOT someCondition. Not too hot an example really, but I hope that makes sense.

One downside with vspider (I can't say the same for the Google appliance) is that I don't think you have much control over what gets indexed on a page. If you have a common footer, a search for a word in that footer returns all the pages on the site with that footer. Is that a bad thing? It could be. It was for me on one project. I got around it by wrapping the content I wanted (or didn't want, I can't remember which) in a custom tag that looked at the user agent. If it was vspider doing the browsing, it didn't render various parts of the page.

Having said that, I think I remember there being a way to define regions with HTML comments, but I might be thinking of the search in Sitecore and not vspider.

Which brings us nicely back to the database :OD

Adrian
Building a DB of errors at http://cferror.org/

-----Original Message-----
From: Dave l
Sent: 08 October 2008 06:50
To: cf-talk
Subject: Re: verity & cfscript

Yes and no dave..

I figured as much when I was doing it, but if it does index the source code then why wouldn't it index all of the source code and not just what's inside a cfscript tag? Or why would it not index the source code when moved inside a cfset tag instead of cfscript? Or why would I have the EXACT same code on other pages and they don't show in the results?

The output is:

index.cfm.cfm
strTitle = "title blah blah blah"; strDesc = "desc blah blah blah"; some reg generated page content here.

I will look at the Google Mini, but I guess I just thought that cfm including Verity could actually do the job. The defaults are set to index ColdFusion pages, and if they expect them not to have any cfm code then they would just be regular html pages..

I guess I will look at Lucene as well.

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313591
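Adrian's Apples/Oranges point can be sketched in a few lines of Python. The cfif markup in SOURCE is an assumption (the archive only preserves the words "Apples Oranges" plus the mention of someCondition), and `render` is a hypothetical stand-in for the ColdFusion server evaluating the page; neither is real Verity or CF behaviour.

```python
import re

# Assumed page source: one branch is shown depending on someCondition.
SOURCE = "<cfif someCondition>Apples<cfelse>Oranges</cfif>"


def index_source(source):
    """What a file-based indexer sees: tags dropped, both branches kept."""
    return re.sub(r"<[^>]+>", " ", source).split()


def render(some_condition):
    """Hypothetical stand-in for the server: only one branch is output."""
    return ["Apples"] if some_condition else ["Oranges"]


print(index_source(SOURCE))  # ['Apples', 'Oranges']
print(render(True))          # ['Apples']
print(render(False))         # ['Oranges']
```

Indexing the file makes the page match both words at once, whereas a crawler fetching over HTTP only ever sees whichever branch the server rendered for that request.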
Re: verity & cfscript
Yes and no dave..

I figured as much when I was doing it, but if it does index the source code then why wouldn't it index all of the source code and not just what's inside a cfscript tag? Or why would it not index the source code when moved inside a cfset tag instead of cfscript? Or why would I have the EXACT same code on other pages and they don't show in the results?

The output is:

index.cfm.cfm
strTitle = "title blah blah blah"; strDesc = "desc blah blah blah"; some reg generated page content here.

I will look at the Google Mini, but I guess I just thought that cfm including Verity could actually do the job. The defaults are set to index ColdFusion pages, and if they expect them not to have any cfm code then they would just be regular html pages..

I guess I will look at Lucene as well.

> As Adrian mentioned, if you index .cfm files directly, you're indexing
> source code, not generated output. CFML tags won't show as readily
> because the browser will ignore them, but I'm betting they're there.
>
> To index generated output, you have to crawl HTTP URLs. The vspider
> utility will let you do this, but the bundled version of vspider has
> some significant limitations.
>
> If crawling HTTP content is important, you might take a look at the
> Google Mini, which is an appliance that does exactly this.
>
> Dave Watts, CTO, Fig Leaf Software
> http://www.figleaf.com/

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313589
Re: verity & cfscript
> I have a verity collection and it is picking up some cfm code. Now of course
> I can take out the .cfm & .cfml in the collection code but then it doesn't
> index actual .cfm pages which is needed (like this-page.cfm)
>
> The only cfm code it seems to be picking up is inside of cfscript blocks and
> so in the results you get something like:
>
> page link
> transfer = application.transferFactory.getTransfer();
> qRecords = transfer.list("Admin.Press", "date", false);
> strTitle = "site title"; strDesc = "site desc";
>
> anyone got any ideas of getting rid of that source code?

As Adrian mentioned, if you index .cfm files directly, you're indexing source code, not generated output. CFML tags won't show as readily because the browser will ignore them, but I'm betting they're there.

To index generated output, you have to crawl HTTP URLs. The vspider utility will let you do this, but the bundled version of vspider has some significant limitations.

If crawling HTTP content is important, you might take a look at the Google Mini, which is an appliance that does exactly this.

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313586
Re: verity & cfscript
Some pages have the exact same script and they don't show, but other pages do.. very weird.

> No, there are no cfsets in there.
> What's even weirder is that on some pages it indexes everything in the
> top cfscript and other times it only indexes a part of it.
>
> Ok, now the weirder part is that if I take out the cfscript and put the
> vars in cfsets it works fine. But of course that really isn't an
> optimal way for me to do things, and I don't really want to go back and
> rewrite all that code in cumbersome cfsets.

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313584
Re: verity & cfscript
No, there are no cfsets in there. What's even weirder is that on some pages it indexes everything in the top cfscript and other times it only indexes a part of it.

Ok, now the weirder part is that if I take out the cfscript and put the vars in cfsets it works fine. But of course that really isn't an optimal way for me to do things, and I don't really want to go back and rewrite all that code in cumbersome cfsets.

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313581
RE: verity & cfscript
I've got it to pull out CF tags before. Maybe it is indexing them, but because you're viewing them on a web page you don't see them because of the < and >. Try viewing the source when you search for cfif or cfset etc.

Adrian

-----Original Message-----
From: Dave l
Sent: 08 October 2008 00:31
To: cf-talk
Subject: Re: verity & cfscript

I have seen vspider, but there are too many ?'s and not enough time to figure out all I need for that.

It just seems odd that none of the books and examples mention this, and that the default file types include .cfml & .cfm. Who would actually want it to search the actual source code? I can see searching the resulting output, since that is what you would think it should be searching.

It is also weird that it is just picking out things inside of cfscript. If it was actually grabbing all the source code then you would see more code, but there isn't any. If this is how it really works then it is pretty worthless IMO.

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313579
Re: verity & cfscript
I have seen vspider, but there are too many ?'s and not enough time to figure out all I need for that.

It just seems odd that none of the books and examples mention this, and that the default file types include .cfml & .cfm. Who would actually want it to search the actual source code? I can see searching the resulting output, since that is what you would think it should be searching.

It is also weird that it is just picking out things inside of cfscript. If it was actually grabbing all the source code then you would see more code, but there isn't any. If this is how it really works then it is pretty worthless IMO.

> If you tell verity to index .cfm files it'll read the source.
>
> As far as I know you can't stop that (I know less and less about Verity
> these days!).
>
> Look into using the vspider. Google will throw up some answers but the short
> of it is, it'll crawl your site via the web so you'll end up indexing it as
> a user sees it.
>
> Let us know if you find out different.
>
> Adrian
> Building a DB of errors at http://cferror.org/

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313578
RE: verity & cfscript
If you tell Verity to index .cfm files it'll read the source.

As far as I know you can't stop that (I know less and less about Verity these days!).

Look into using vspider. Google will throw up some answers, but the short of it is that it'll crawl your site via the web, so you'll end up indexing it as a user sees it.

Let us know if you find out different.

Adrian
Building a DB of errors at http://cferror.org/

-----Original Message-----
From: Dave l
Sent: 07 October 2008 21:09
To: cf-talk
Subject: verity & cfscript

I have a Verity collection and it is picking up some cfm code. Now of course I can take out the .cfm & .cfml in the collection code, but then it doesn't index actual .cfm pages, which is needed (like this-page.cfm).

The only cfm code it seems to be picking up is inside of cfscript blocks, so in the results you get something like:

page link
transfer = application.transferFactory.getTransfer(); qRecords = transfer.list("Admin.Press", "date", false); strTitle = "site title"; strDesc = "site desc";

Anyone got any ideas for getting rid of that source code?

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313570
verity & cfscript
I have a Verity collection and it is picking up some cfm code. Now of course I can take out the .cfm & .cfml in the collection code, but then it doesn't index actual .cfm pages, which is needed (like this-page.cfm).

The only cfm code it seems to be picking up is inside of cfscript blocks, so in the results you get something like:

page link
transfer = application.transferFactory.getTransfer(); qRecords = transfer.list("Admin.Press", "date", false); strTitle = "site title"; strDesc = "site desc";

Anyone got any ideas for getting rid of that source code?

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313558