Re: Algorithm for retrieving documents

Harshvardhan Ojha Thu, 13 Feb 2014 01:48:25 -0800

Hi Mikhail,

Thanks for sharing this nice link. I am pretty comfortable with searching
in Lucene; this is a very beginner-level question about storage, mainly the
hashing part (storage and retrieval).
Which data structure (I don't know currently) is used to store the
documents and to recompute that hash to get a document back?


Let me put it very clearly:
if I know the document I want is id:1, and there is no other query, then
given that much about the doc there should ideally be no searching at all
(although it was indexed); it is only fast retrieval.
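
To make the question concrete, here is a toy Python sketch of the two steps
I have in mind: a lookup from the user-supplied id to an internal document
number, then a direct fetch of the stored fields by that number. Every name
here (ToyIndex, add, get_by_id) is hypothetical, and the dict is only a
stand-in for whatever structure Lucene really uses; this is not Lucene's API
or its storage format.

```python
class ToyIndex:
    """Toy model of id-based retrieval; NOT how Lucene is implemented."""

    def __init__(self):
        # Position in this list plays the role of an internal document number.
        self._stored = []
        # Assumption for illustration: a hash map from user id to doc number.
        self._id_to_doc = {}

    def add(self, doc_id, fields):
        """Store the fields and remember which slot the user id points to."""
        doc_num = len(self._stored)
        self._stored.append(dict(fields))
        self._id_to_doc[doc_id] = doc_num
        return doc_num

    def get_by_id(self, doc_id):
        """Resolve the id to a doc number, then fetch the stored fields."""
        doc_num = self._id_to_doc.get(doc_id)
        if doc_num is None:
            return None
        return self._stored[doc_num]


index = ToyIndex()
index.add("1", {"id": "1", "title": "first"})
index.add("2", {"id": "2", "title": "second"})
doc = index.get_by_id("1")  # no scoring or query parsing, just two lookups
```

What I want to understand is which real structures take the place of the
dict and the list above.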

Let me know if you want me to clarify the question.

Regards
Harshvardhan Ojha


On Thu, Feb 13, 2014 at 2:53 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Hello
>
> I think you can start from
> http://www.lucenerevolution.org/2013/What-is-in-a-lucene-index
>
>
>
> On Thu, Feb 13, 2014 at 12:56 PM, Harshvardhan Ojha <
> ojha.harshvard...@gmail.com> wrote:
>
> > Hi All,
> >
> > I have a question regarding retrieval of documents by Lucene.
> > I know Lucene uses many files on disk to keep documents, each made up
> > of fields, and uses many IR algorithms and an inverted index to match
> > documents.
> >
> > My questions are:
> > 1. How does Lucene store these documents in the file system and
> > retrieve them so fast?
> > 2. Does Lucene use any hashing algorithm to get docs in O(1)? If not,
> > which data structure does Lucene use?
> > 3. Apart from the id we provide at indexing time, is there any other
> > unique identifier that Lucene assigns to its documents?
> >
> > I would appreciate it if someone could point me to the source files to
> > study these algorithms in detail.
> >
> > Regards
> > Harshvardhan Ojha
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  <mkhlud...@griddynamics.com>
>
