Re: [fw-general] Zend Search Lucene: Searching for URLs
Thank you for looking at this... okay that makes sense. I suspected it was the Query Parser.
Re: [fw-general] Zend Search Lucene: Searching for URLs
Hi Dave,

I checked the index you sent me and found the problem.

The problem is that the field is not tokenized. So values are stored "as is", but the query parser breaks URLs into several terms. :(

I've created an issue for this:
http://framework.zend.com/issues/browse/ZF-1216

Until it's fixed, you may:

a) use tokenized fields for URLs:

    Zend_Search_Lucene_Field::Text($fieldName, $fieldValue[, $encoding])

b) construct the query through the API:

    $term  = new Zend_Search_Lucene_Index_Term('framework.zend.com', 'url');
    $query = new Zend_Search_Lucene_Search_Query_Term($term);

    $hits = $index->find($query);

With best regards,
Alexander Veremyev.
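The mismatch described in the reply above can be illustrated without Zend Framework at all. The following is only a sketch: `tokenizeForIllustration()` is a hypothetical stand-in that splits on anything other than letters and digits, and it is not the actual Zend analyzer or query parser. It simply shows why a single "as is" term cannot equal any of the pieces a split query produces.

```php
<?php
// Hypothetical stand-in for a tokenizer that keeps only letters and digits.
// This is NOT the Zend implementation, just an illustration of the idea.
function tokenizeForIllustration(string $text): array
{
    // Split on every run of characters that is neither a letter nor a digit.
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return $tokens === false ? [] : $tokens;
}

// An untokenized field contributes exactly one term: the whole value.
$storedTerms = ['http://reviewsby.us/'];

// The query string is broken into several terms before lookup.
$queryTerms = tokenizeForIllustration('reviewsby.us');

// None of the query terms equals the single stored term, so nothing matches.
$matches = array_intersect($queryTerms, $storedTerms);

// Had the value been tokenized at index time, both sides would agree.
$tokenizedTerms   = tokenizeForIllustration('http://reviewsby.us/');
$matchesTokenized = array_intersect($queryTerms, $tokenizedTerms);

var_dump(count($matches));          // int(0)
var_dump(count($matchesTokenized)); // int(2)
```

This is the reasoning behind workaround (a): once the field is tokenized when it is indexed, the index and the query are split in the same way and their terms can line up.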
Re: [fw-general] Zend Search Lucene: Searching for URLs
Hi Dave,

Thanks for the details! I am going to check it, but I will only have a chance to look into it tomorrow or on Friday.

How large is the index? Could you send it to me for tests?

With best regards,
Alexander Veremyev.
Re: [fw-general] Zend Search Lucene: Searching for URLs
You bet.

It's ITS / Indexed Tokenized Stored (like most of my fields).

Here's the query logic:

    public function executeSearch()
    {
        $query = strtolower($this->getRequestParameter('q'));

        $hits = array();

        if ($query)
        {
            $index = Zend_Search_Lucene::open(sfConfig::get('app_search_card_index_file'));

            Zend_Search_Lucene_Analysis_Analyzer::setDefault(
                new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num());

            $hits = $index->find($query);
        }
    }

An example query in my form would be

    hawaiimoves.com

or

    url:hawaiimoves.com

The same queries work fine in Luke. Also, other fields seem to work just fine.

Thanks!
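For input such as `url:hawaiimoves.com`, one way to avoid the query parser is to split off the field prefix by hand and build a term query from the two parts. The helper below is a minimal sketch under assumptions: `splitFieldQuery()`, its `$knownFields` list and the default field `url` are hypothetical names introduced here, not part of Zend Framework. The Zend calls appear only as comments and are the ones from the API-based workaround in this thread.

```php
<?php
/**
 * Split "field:value" input into [field, value] without a query parser.
 * $knownFields and $defaultField are assumptions made for this sketch.
 */
function splitFieldQuery(string $input, array $knownFields = ['url'], string $defaultField = 'url'): array
{
    $input = trim($input);
    $pos   = strpos($input, ':');

    if ($pos !== false) {
        $prefix = substr($input, 0, $pos);
        // Only treat the prefix as a field name if it is a known field;
        // this keeps "http://..." from being read as a field called "http".
        if (in_array($prefix, $knownFields, true)) {
            return [$prefix, substr($input, $pos + 1)];
        }
    }

    return [$defaultField, $input];
}

list($field, $text) = splitFieldQuery('url:hawaiimoves.com');
// $field is 'url', $text is 'hawaiimoves.com'

// With Zend Framework loaded and an open $index, the pair could then be
// used to build the query directly, bypassing the query parser:
//   $term  = new Zend_Search_Lucene_Index_Term($text, $field);
//   $query = new Zend_Search_Lucene_Search_Query_Term($term);
//   $hits  = $index->find($query);
```

Note that a term query matches exact terms only, so for an untokenized field the text has to equal the stored value character for character.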
Re: [fw-general] Zend Search Lucene: Searching for URLs
Hi Dave,

Could you describe the field type used for the URL (in the context of this description -
http://framework.zend.com/manual/en/zend.search.html#zend.search.index-creation.understanding-field-types)
and give an example of a search query?

With best regards,
Alexander Veremyev.

Dave Dash wrote:
> So I've indexed a document that has a field called "url" filled with URLs
> (e.g. http://reviewsby.us/, http://spindrop.us/, http://www.nabble.com, etc.).
>
> I can find these in Luke just fine by searching for the URL (without the
> http://, in fact).
>
> But in my ZSL 0.9.1 app I get nothing. If I search for http or https I do
> get results, but nothing after the ://
>
> Luke was using the default analyzer (Keyword) and ZSL was using UTF8_Num,
> but it should be able to find these. I have a feeling the dots in the URLs
> are choking things up.