Hi Jack, I was initially using text_en_splitting, but it changes my query as well. For example, if I search for the term "ace", it is treated as "ac", so the split term "ac" gets the higher score. See the debug output:
"debug":{ "rawquerystring":"ace", "querystring":"ace", "parsedquery":"(+DisjunctionMaxQuery((title:ac^30.0)))/no_coord", "parsedquery_toString":"+(title:ac^30.0)", "explain":{ "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 469) [DefaultSimilarity], result of:\n 1.8650155 = fieldWeight in 469, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.4375 = fieldNorm(doc=469)\n", "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 470) [DefaultSimilarity], result of:\n 1.8650155 = fieldWeight in 470, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.4375 = fieldNorm(doc=470)\n", "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 471) [DefaultSimilarity], result of:\n 1.8650155 = fieldWeight in 471, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.4375 = fieldNorm(doc=471)\n", "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 472) [DefaultSimilarity], result of:\n 1.8650155 = fieldWeight in 472, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.4375 = fieldNorm(doc=472)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 331) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 331, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=331)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 332) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 332, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=332)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 335) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 335, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = 
idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=335)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 336) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 336, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=336)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 337) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 337, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=337)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 393) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 393, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=393)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 425) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 425, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=425)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 426) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 426, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=426)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 429) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 429, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=429)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 430) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 430, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=430)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 431) 
[DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 431, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=431)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 433) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 433, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=433)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 434) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 434, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=434)\n", "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 502) [DefaultSimilarity], result of:\n 1.5985848 = fieldWeight in 502, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.375 = fieldNorm(doc=502)\n", "":"\n1.332154 = (MATCH) weight(title:ac^30.0 in 411) [DefaultSimilarity], result of:\n 1.332154 = fieldWeight in 411, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.3125 = fieldNorm(doc=411)\n", "":"\n1.332154 = (MATCH) weight(title:ac^30.0 in 424) [DefaultSimilarity], result of:\n 1.332154 = fieldWeight in 424, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 4.2628927 = idf(docFreq=39, maxDocs=1045)\n 0.3125 = fieldNorm(doc=424)\n"}, "QParser":"ExtendedDismaxQParser",

On Tue, Mar 19, 2013 at 7:37 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> Yeah, one ambiguity in typography is whether a hyphen is internal to a
> compound term (e.g., "CD-ROM") or a phrase separator as in your case. Some
> people are careful to put spaces around the hyphen for a phrase delimiter,
> but plenty of people still just drop it in directly adjacent to two words.
>
> In your case, text_en_splitting_tight is SPECIFICALLY trying to keep
> "Laptop-DUAL" together as a single term, so that "wi fi" is kept distinct
> from "Wi-Fi".
>
> Try text_en_splitting, which specifically is NOT trying to keep them
> together.
>
> The key clue here is that the former does not have generateWordParts="1".
> That is the option that is needed so that "Laptop-DUAL" will be indexed as
> "laptop dual".
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Rohan Thakur
> Sent: Tuesday, March 19, 2013 3:35 AM
> To: solr-user@lucene.apache.org
> Subject: Re: had query regarding the indexing and analysers
>
>
> My default field is title only. I have used debug as well; it shows that
> Solr divides the query into "dual" and "core" and then searches for both
> separately. When calculating the scores it favors documents in which both
> terms appear, and in my case, for the document containing this title:
>
> Wipro 7710U Laptop-DUAL CORE 1.4 Ghz-120GB HDD
>
> Solr has found only the term "core", not "dual". I guess that is because
> "dual" is attached to the term "laptop", since even a search for "dual"
> alone does not return this document. That is why this document shows up
> lower in the search results. So I am not able to search for partial terms;
> for that I have to use *dual in the query, and then this document is
> found, but putting * in the query terms affects the scoring of other
> searches. I think I have to remove the "-" characters from the strings
> before indexing them. Point out if I am wrong anywhere.
>
> thanks
> regards
> Rohan
>
>
> On Sat, Mar 16, 2013 at 7:02 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> See admin/analysis, it's invaluable. Probably
>>
>> The terms are being searched against your default text field which I'd
>> guess is not "title".
>>
>> Also, try adding &debug=all to your query and look in the debug info at
>> the parsed form of the query to see what's actually being searched.
>>
>> Best
>> Erick
>>
>>
>> On Fri, Mar 15, 2013 at 2:52 AM, Rohan Thakur <rohan.i...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > I have this string in the field title:
>> >
>> > Wipro 7710U Laptop-DUAL CORE 1.4 Ghz-120GB HDD
>> >
>> > I have indexed it using text_en_splitting_tight,
>> >
>> > and now I am searching for a term like q=dual core
>> >
>> > but in the relevance ranking this title comes lower in the order, as
>> > Solr does not find "dual" in this string; it matches only the "core"
>> > term from the query, thus multiplying the score for this field by 1/2
>> > and decreasing the score.
>> >
>> > How can I correct this? Can anyone help?
>> >
>> > thanks
>> > regards
>> > Rohan
>> >
>>
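
For anyone following the thread, here is a toy sketch of the difference Jack describes between keeping "Laptop-DUAL" together and splitting it into word parts. It is only an approximation under stated assumptions: the `analyze` and `matched_terms` helpers are hypothetical, they are not Solr's WordDelimiterFilter, and they split on hyphens only (the real filter has other options and split rules, and the real field types may apply further filters).

```python
def analyze(text, generate_word_parts):
    """Toy analyzer: whitespace-tokenize, lowercase, then handle hyphens.

    generate_word_parts=True  -> "Laptop-DUAL" becomes "laptop", "dual"
    generate_word_parts=False -> "Laptop-DUAL" stays one term, "laptopdual"
    (Assumption: this only mimics the hyphen handling discussed in the thread.)
    """
    terms = []
    for token in text.split():
        token = token.lower()
        if generate_word_parts:
            # Emit each hyphen-separated part as its own term.
            terms.extend(part for part in token.split("-") if part)
        else:
            # Keep the compound together as a single term.
            terms.append(token.replace("-", ""))
    return terms


def matched_terms(title, query, generate_word_parts):
    """Return the query terms that are present among the indexed title terms."""
    indexed = set(analyze(title, generate_word_parts))
    return [term for term in analyze(query, generate_word_parts) if term in indexed]


TITLE = "Wipro 7710U Laptop-DUAL CORE 1.4 Ghz-120GB HDD"
QUERY = "dual core"

# Compound kept together: only "core" matches, as in the original report.
print(matched_terms(TITLE, QUERY, generate_word_parts=False))
# Word parts generated: both "dual" and "core" match.
print(matched_terms(TITLE, QUERY, generate_word_parts=True))
```

With the compound kept together only "core" is found in the title, which is the behaviour reported in the original question; with word parts generated both query terms are found.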