Hi Mark,

Actually, I am using an object-oriented database. Where can I find information on embedding Lucene with a database?

Thanks

mark harwood wrote:
>
> Hi Sumit,
>>>now i want my database to communicate with lucene api
>
> I would recommend that it's the other way round....see my earlier comment
> on using FilterIndexReader and creating "faked" TermEnum and TermDocs to
> make your database content appear as if it were part of the index when
> calling Lucene. If you do want to make the database call Lucene see the
> recent work on embedding Lucene in Oracle.
> There is no simple ready-made solution here that I can post in a few lines
> of code - you'll need to familiarise yourself with these low-level APIs
> that underpin Lucene searches (they are all documented).
>
>
> Cheers
> Mark
>
> ----- Original Message ----
> From: sumittyagi <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Wednesday, 20 February, 2008 4:43:56 PM
> Subject: Re: Which file in the lucene package is used to manipulate
> results..
>
>
> Hi Mark Harwood
> I know it's being a long time, but till now i was busy in developing the
> database to store the keyword, document and no. of clicks of the document
> for the keyword and their respective mappings.
> now i want my database to communicate with lucene api and i cannot figure
> it out where to start from.
> Please help me out, how can i make my database to work with lucene.
> Thanks
> Sumit
>
> mark harwood wrote:
>>
>> Thanks for the context - much more useful.
>> The challenge here is similar to that posed by offering end-user tagging
>> of content (see here
>> http://www.mail-archive.com/java-user@lucene.apache.org/msg17580.html ).
>> The main difference here being that words are added to docs implicitly by
>> search click-throughs rather than any explicit tagging action.
>>
>> In both cases the challenge is that the user data around documents is
>> likely to be updated very often while the documents remain relatively
>> static.
>> I suspect some additional things to think about are:
>> 1) Cancelling out the "human laziness" bias that favours clicking results
>> on page 1. Are clicks on page 2 worth more?
>> 2) Spam clicks - detecting deliberate gaming of your re-ranking
>> algorithm.
>> 3) Lucene doc IDs are not stable - how will you associate query
>> terms/click data with documents and join them at speed?
>> 4) Are individual words or phrases the unit of boost? "Paris" means
>> different things in "Paris Hilton" and "Paris, France".
>>
>> A simple approach might be to re-index your content with all of the
>> additional search terms from clicks added to the associated document in a
>> "searchClicks" field - the more clicks, the more repetitions of the same
>> search words in the document to help with tf (Term Frequency). This
>> additional content would need to be capped, to avoid huge documents. This
>> has the disadvantage of requiring a re-index though.
>> Another option to avoid reindexing everything is to wrap IndexReader (See
>> FilterIndexReader) and implement TermEnum/TermDocs for a fake field called
>> "searchClicks". The idea is Lucene looks after the usual, static document
>> content while your implementation goes off to your more volatile storage
>> (e.g. database/parallel index, custom file structure) to retrieve lists of
>> doc ids, term frequencies etc. for this "searchClicks" field. All of the
>> Lucene queries you might want to throw at this e.g. PhraseQueries can then
>> test both the static Lucene fields and your new volatile "click" fields
>> without being aware of this low-level trickery.
>>
>> I'm sure there will be other ways of doing this too but this seems like a
>> conceptually clean way of modelling it - just seeing search terms as
>> extensions to the document content.
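
The "simple approach" Mark describes above (repeat each clicked search term in a "searchClicks" field, with a cap) could be prototyped roughly as follows. This is only a sketch in plain Java with no Lucene calls: the class name SearchClicksField, the method build, and the parameter maxRepeatsPerTerm are made up for illustration, and the resulting string is assumed to be what you would index as the extra field when you re-index the document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds the text of a hypothetical "searchClicks" field from click counts. */
final class SearchClicksField {

    /**
     * Repeats each clicked search term once per click, up to maxRepeatsPerTerm,
     * so term frequency grows with clicks while the field size stays bounded.
     * Negative counts are treated as zero.
     */
    static String build(Map<String, Integer> clicksByTerm, int maxRepeatsPerTerm) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, Integer> entry : clicksByTerm.entrySet()) {
            int repeats = Math.min(Math.max(entry.getValue(), 0), maxRepeatsPerTerm);
            for (int i = 0; i < repeats; i++) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(entry.getKey());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Hypothetical click counts for one document.
        Map<String, Integer> clicks = new LinkedHashMap<>();
        clicks.put("paris", 3);
        clicks.put("hotels", 50);
        System.out.println(build(clicks, 5));
        // prints: paris paris paris hotels hotels hotels hotels hotels
    }
}
```

The cap is applied per term here; capping the total field length instead would be an equally reasonable choice, and the thread does not say which Mark had in mind.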
>>
>> Cheers
>> Mark
>>
>>
>> ----- Original Message ----
>> From: sumittyagi <[EMAIL PROTECTED]>
>> To: java-user@lucene.apache.org
>> Sent: Sunday, 23 December, 2007 5:30:55 AM
>> Subject: Re: Which file in the lucene package is used to manipulate
>> results..
>>
>>
>> Actually what i have to do is...
>> 1.) for every query(keyword), among the results obtained, the keyword will
>> be mapped with the page clicked, along with the no. of clicks for that
>> keyword on that page
>> 2.) next time for the same query(keyword), the mapped pages will be ranked
>> higher considering the no. of clicks too..
>> 3.) for every new query these steps will be repeated...
>> this was a very high level view , i have made algorithms for these modules
>> and trying to incorporate with lucene but dont know , on which files i have
>> to do edition to make it work...
>> please help me regarding this, if you need some more explanation, please
>> let me know...
>> thanks
>> Sumit Tyagi
>>
>>
>> Erick Erickson wrote:
>>>
>>> You still haven't explained *why* you want to rerank results. What
>>> is the use-case you're trying to implement? Quite often it's turned
>>> out for me that when I let folks on the list know what the use
>>> case I'm trying to support is, they come up with much more elegant
>>> solutions than I was thinking about.
>>>
>>> For instance, does the CustomScoreQuery class have any relevance
>>> to your problem?
>>>
>>> If you're thinking of modifying the core Lucene code for your
>>> special purpose, I'd advise against it unless and until you'd exhausted
>>> all the other options. It's always a maintenance headache to do this.
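
The three steps Sumit lists above (map each keyword to the clicked page with a click count, then rank clicked pages higher for the same keyword) can be tried out without editing any Lucene source file, which fits Erick's advice. The sketch below is plain Java with hypothetical names (ClickReranker, recordClick, rerank). It assumes pages are identified by a stable application-level key rather than a Lucene doc ID, and it reorders purely by click count, ignoring the original relevance score; how to blend the two is left open in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory click log: keyword -> (page key -> click count). */
final class ClickReranker {
    private final Map<String, Map<String, Integer>> clicks = new HashMap<>();

    /** Step 1: record that pageKey was clicked in the results for keyword. */
    void recordClick(String keyword, String pageKey) {
        clicks.computeIfAbsent(keyword, k -> new HashMap<>())
              .merge(pageKey, 1, Integer::sum);
    }

    /** Number of recorded clicks for this keyword/page pair (0 if none). */
    int clickCount(String keyword, String pageKey) {
        Map<String, Integer> byPage =
                clicks.getOrDefault(keyword, Collections.emptyMap());
        return byPage.getOrDefault(pageKey, 0);
    }

    /**
     * Step 2: reorder an already-ranked result list so that pages with more
     * clicks for this keyword come first. List.sort is stable, so pages with
     * equal click counts keep their original relative order.
     */
    List<String> rerank(String keyword, List<String> rankedPageKeys) {
        List<String> reordered = new ArrayList<>(rankedPageKeys);
        reordered.sort(
                Comparator.comparingInt((String page) -> clickCount(keyword, page))
                          .reversed());
        return reordered;
    }

    public static void main(String[] args) {
        ClickReranker reranker = new ClickReranker();
        reranker.recordClick("lucene", "doc-b");
        reranker.recordClick("lucene", "doc-b");
        reranker.recordClick("lucene", "doc-c");
        // Original search order is doc-a, doc-b, doc-c.
        System.out.println(
                reranker.rerank("lucene", Arrays.asList("doc-a", "doc-b", "doc-c")));
        // prints: [doc-b, doc-c, doc-a]
    }
}
```

A real deployment would persist the counts (for example in the database Sumit mentions) instead of holding them in a HashMap.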
>>>
>>> Best
>>> Erick
>>>
>>> On Dec 21, 2007 10:09 AM, sumittyagi <[EMAIL PROTECTED]> wrote:
>>>
>>>>
>>>> actually i am writing a module to rerank the results, so i want to edit
>>>> the file which arrange the results and give them ranks,
>>>> or is there any other way i can use my module to rerank the results
>>>>
>>>>
>>>> markharw00d wrote:
>>>> >
>>>> > I think you need to describe your "factors" in more detail. Exactly
>>>> > what do you want to achieve for your users?
>>>> > We could be talking about any number of Lucene functions here.
>>>> >
>>>> > ----- Original Message ----
>>>> > From: sumittyagi <[EMAIL PROTECTED]>
>>>> > To: java-user@lucene.apache.org
>>>> > Sent: Friday, 21 December, 2007 4:51:09 AM
>>>> > Subject: Which file in the lucene package is used to manipulate
>>>> > results..
>>>> >
>>>> >
>>>> > hi, i am using lucene for the very first time and want to manipulate
>>>> > the results, by adding some more factors to it, which file should i
>>>> > edit to manipulate the search results....
>>>> >
>>>> > Thanks
>>>> > Sumit Tyagi
>>>> > --
>>>> > View this message in context:
>>>> > http://www.nabble.com/Which-file-in-the-lucene-package-is-used-to-manipulate-results..-tp14450335p14450335.html
>>>> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>> >
>>>> > __________________________________________________________
>>>> > Sent from Yahoo!
Mail - a smarter inbox http://uk.mail.yahoo.com
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> > To unsubscribe, e-mail: [EMAIL PROTECTED]
>>>> > For additional commands, e-mail: [EMAIL PROTECTED]
>>>> >

--
View this message in context: http://www.nabble.com/Which-file-in-the-lucene-package-is-used-to-manipulate-results..-tp14450335p15596170.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.