RE: Return Lucene DocId in Solr Results

Lohrenz, Steven Thu, 02 Dec 2010 05:59:47 -0800

I would be interested in hearing about some ways to improve the algorithm. I 
have done a very straightforward Lucene query within a loop to get the docIds.


Here's what I did to get it working, where favsBeans are the objects returned 
from a query of the second core. There is probably a better way to do it:

private int[] getDocIdsFromPrimaryKey(SolrQueryRequest req, List<Favorites> favsBeans)
        throws ParseException {
    // Locate the index directory of the core handling this request.
    String indexDir = req.getCore().getIndexDir();
    IndexSearcher searcher = null;
    try {
        // Open one read-only searcher and reuse it for every lookup,
        // instead of opening a new one per primary key.
        searcher = new IndexSearcher(FSDirectory.open(new File(indexDir)), true);
        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "resourceId",
                new StandardAnalyzer(Version.LUCENE_CURRENT));

        int[] docIds = new int[favsBeans.size()];
        int i = 0;
        for (Favorites favBean : favsBeans) {
            Query pkQuery = parser.parse("resourceId:" + favBean.getResourceId());
            TopScoreDocCollector collector = TopScoreDocCollector.create(1, true);
            searcher.search(pkQuery, collector);

            ScoreDoc[] hits = collector.topDocs().scoreDocs;
            // scoreDocs is empty (not null) when nothing matched, so check the length.
            if (hits.length > 0) {
                docIds[i++] = hits[0].doc;
            }
        }

        // Drop unused trailing slots so unmatched keys do not show up as doc 0.
        int[] found = Arrays.copyOf(docIds, i);
        Arrays.sort(found);
        return found;
    } catch (IOException e) {
        throw new ParseException("IOException, cannot open or search the index at: "
                + indexDir + " " + e.getMessage());
    } finally {
        if (searcher != null) {
            try { searcher.close(); } catch (IOException ignored) { }
        }
    }
}
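
For comparison, the collect-skip-sort logic can be looked at separately from the Lucene calls. The following is only a minimal sketch: DocIdLookup and DocIdResolver are hypothetical names that are not part of Lucene or Solr, and the map of keys to doc IDs is made-up stand-in data. It shows keys without a match being skipped and the remaining doc IDs returned in sorted order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over "find the doc ID for one primary key".
// In the method above, this role is played by the Lucene search call.
interface DocIdLookup {
    /** Returns the doc ID for the key, or -1 if no document matches. */
    int firstMatch(String key);
}

class DocIdResolver {
    /** Resolves each key, skips keys with no match, and returns the IDs sorted. */
    static int[] resolveSorted(List<String> keys, DocIdLookup lookup) {
        int[] ids = new int[keys.size()];
        int count = 0;
        for (String key : keys) {
            int id = lookup.firstMatch(key);
            if (id >= 0) {
                ids[count++] = id;
            }
        }
        // Keep only the slots that were filled, then sort.
        int[] found = Arrays.copyOf(ids, count);
        Arrays.sort(found);
        return found;
    }

    public static void main(String[] args) {
        // Stand-in index: primary key -> doc ID (made-up values).
        final Map<String, Integer> index = new HashMap<String, Integer>();
        index.put("res-a", 7);
        index.put("res-b", 3);

        DocIdLookup lookup = new DocIdLookup() {
            public int firstMatch(String key) {
                Integer id = index.get(key);
                return id == null ? -1 : id;
            }
        };

        int[] ids = resolveSorted(Arrays.asList("res-a", "missing", "res-b"), lookup);
        System.out.println(Arrays.toString(ids)); // prints [3, 7]
    }
}
```

Keeping the lookup behind an interface like this would make it possible to swap the per-key search for something cheaper without touching the code that builds the sorted array.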

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 02 December 2010 13:46
To: solr-user@lucene.apache.org
Subject: Re: Return Lucene DocId in Solr Results

Sounds good, especially because your old scenario was fragile. The doc IDs in
your first core could change as a result of a single doc deletion and
optimize. So the doc IDs stored in the second core would then be wrong...
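
To make that fragility concrete, here is a toy model. It is not Lucene code: ToyIndex is a hypothetical class in which a doc ID is simply a position in a list, and optimize() is assumed to compact away deleted slots. It only illustrates why a stored doc ID can go stale after a deletion and an optimize.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: a doc ID is the position of a document in a list.
class ToyIndex {
    // A null entry marks a deleted document that has not been compacted yet.
    private final List<String> keys = new ArrayList<String>();

    /** Adds a document and returns the doc ID it was assigned. */
    int add(String key) {
        keys.add(key);
        return keys.size() - 1;
    }

    /** Marks the document as deleted; other doc IDs are unchanged for now. */
    void delete(String key) {
        int id = keys.indexOf(key);
        if (id >= 0) {
            keys.set(id, null);
        }
    }

    /** Compacts the list by dropping deleted slots, so later documents shift down. */
    void optimize() {
        keys.removeAll(Collections.singleton(null));
    }

    /** Returns the current doc ID for the key, or -1 if it is not present. */
    int docIdOf(String key) {
        return keys.indexOf(key);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("res-a");
        index.add("res-b");
        index.add("res-c");

        // Imagine this value being stored in the second core.
        int storedId = index.docIdOf("res-c");

        index.delete("res-a");
        index.optimize();

        System.out.println("stored doc ID:  " + storedId);               // 2
        System.out.println("current doc ID: " + index.docIdOf("res-c")); // 1
    }
}
```

The user-defined key "res-c" still finds the document after the optimize, while the stored position no longer does.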

Your user-defined unique key is definitely a better way to go. There are
some tricks you could try if there are performance issues....

Best
Erick

On Thu, Dec 2, 2010 at 7:47 AM, Lohrenz, Steven
<steven.lohr...@hmhpub.com> wrote:

> I know the doc IDs from one core have nothing to do with the other. I was
> going to take the docId returned from the first core in the Solr results and
> store it in the second core, so that the second core knows about the doc IDs
> from the first core. Then, when the Filter in the first core queries the
> second core, the returned data includes the docId of the first-core document
> that each result relates to.
>
> I have backed off from this approach and now have a user-defined primary key
> in the firstCore, which is stored as the reference in the secondCore. When
> the filter performs the search, it queries the firstCore for each primary
> key and gets the Lucene docId from the returned doc.
>
> Thanks,
> Steve
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: 02 December 2010 02:19
> To: solr-user@lucene.apache.org
> Subject: Re: Return Lucene DocId in Solr Results
>
> On the face of it, this doesn't make sense, so perhaps you can explain a
> bit. The doc IDs from one Solr instance have no relation to the doc IDs
> from another Solr instance. So anything that uses doc IDs from one Solr
> instance to create a filter on another instance doesn't seem to be
> something you'd want to do...
>
> Which may just mean I don't understand what you're trying to do. Can you
> back up a bit and describe the higher-level problem? This seems like it may
> be an XY problem, see:
> http://people.apache.org/~hossman/#xyproblem
>
> Best
> Erick
>
> On Tue, Nov 30, 2010 at 6:57 AM, Lohrenz, Steven
> <steven.lohr...@hmhpub.com> wrote:
>
> > Hi,
> >
> > I was wondering how I would go about getting the lucene docid included in
> > the results from a solr query?
> >
> > I've built a QueryParser to query another Solr instance and join the
> > results of the two instances through the use of a Filter. The Filter
> > needs the Lucene docid to work. This is the only bit I'm missing right now.
> >
> > Thanks,
> > Steve
> >
> >
>
