Thanks a lot! I don't know how I missed this discussion. Thank you again!
Best regards,
Atanas

On Fri, Aug 30, 2013 at 11:31 AM, Aloke Ghoshal <alghos...@gmail.com> wrote:

> Hi,
>
> Please refer to my response from a few months back:
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201303.mbox/%3ccaht6s2az_w2av04rdmoeeck5e9o0k4ytktf0pjsecsh-lls...@mail.gmail.com%3E
>
> Our modelling is to index N (individual pages) + 1 (original document) in Solr. Once a document has matched for a given set of terms, the corresponding page boundary cases can be handled by relaxing the page search condition to an OR (you could even add these alongside with a lower boost).
>
> Regards,
> Aloke
>
> On Fri, Aug 30, 2013 at 12:11 PM, Атанас Атанасов <atanaso...@gmail.com> wrote:
>
> > Thanks for the response. Your suggestion is to keep the existing way of indexing data where every page of a document is a row in the SOLR database, changing the "content" field to be store-only and add another field (ex. document_content) for "index only" where I should put the whole content of the document. This is a good idea but I am also using HighLighter and I think it won't work since it requires the field to be stored=true. My problem will be solved if there is a way to search in the index-only field where the whole document is indexed but to get the highlights/context of the match from the existing page.
> >
> > Originally my idea was to keep data in existing format (1 page - 1 record) but somehow search in grouped (by document) results or some kind of union between pages of a document. Is this possible?
> >
> > On Thu, Aug 29, 2013 at 4:45 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> >
> > > Assuming you want both pages to match you need the text to be present on both pages. Do you actually return/store text of the page in Solr?
> > > If so, you can have that 'page' field store-only and have another field which is index-only and into which you put all your matching logic. So, that index-only field can contain the page plus another line/paragraph/page on each side.
> > >
> > > Regards,
> > > Alex.
> > >
> > > Personal website: http://www.outerthoughts.com/
> > > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> > > - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
> > >
> > > On Thu, Aug 29, 2013 at 2:49 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> > >
> > > > So, if the match spans pages 4 and 5, what do you want returned? Page 4, page 5, or both?
> > > >
> > > > Regards,
> > > > Alex
> > > >
> > > > On 28 Aug 2013 06:55, "Атанас Атанасов" <atanaso...@gmail.com> wrote:
> > > >
> > > >> Hello,
> > > >>
> > > >> My name is Atanas Atanasov. I have been using SOLR 1.4/3.5/4.3 for a year and a half and I'm really satisfied with what it provides. Searching and indexing are extremely fast, and it is easy to work with.
> > > >> However, I ran into a small problem and I can't figure it out.
> > > >> I'm using SOLR to store the content/text of different types of documents (.pdf, .txt, .doc, etc.).
> > > >> The whole document content represents a SOLR record (all the text from all pages of the document).
> > > >> schema.xml is in SOLR_Document_Level folder of attached .zip file.
> > > >> This worked absolutely fine but I wanted to see the exact page/pages of a document where the search match is/are.
> > > >>
> > > >> I redesigned it so that every page of a document is a row in the SOLR database (schema.xml is in SOLR_Page_Level folder of attached .zip file.)
> > > >> and it works well, but this resulted in the following problem:
> > > >> Example: I search for (lucene AND apache). If both words are on the same page I will get a hit and a result will be returned. However, if the words are on different pages of a document, no results will be found.
> > > >> My goal is to find out the exact page of a document where the match is.
> > > >> Dynamic fields would solve this problem but there are very big documents with many pages so I don't think this is a solution.
> > > >> Can you help me with some ideas on how to make it work?
> > > >>
> > > >> Just for information: I am using SOLR as a REST service hosted in Apache and a .NET application to work with it.
> > > >> If you have questions please feel free to ask.
> > > >>
> > > >> Thanks in advance and Best Regards,
> > > >> Atanas Atanasov
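
For readers following the thread, the "N pages + 1 document" modelling Aloke describes could be queried in two steps, roughly as sketched below. This is only an illustration under stated assumptions, not the setup anyone in the thread confirmed: the endpoint URL and the field names `record_type`, `doc_id`, `page_number` and `content` are hypothetical, and search terms are assumed to be plain words that need no query escaping. The request parameters used (`q`, `fq`, `fl`, `hl`, `hl.fl`, `wt`) are standard Solr parameters. The code only builds the request URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust to the actual Solr host and core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def document_query(terms):
    """Step 1: search the whole-document records; every term must match (AND)."""
    params = {
        "q": " AND ".join(f"content:{t}" for t in terms),
        "fq": "record_type:document",  # hypothetical discriminator field
        "fl": "doc_id",
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


def page_query(terms, doc_id):
    """Step 2: for one matched document, relax the condition to OR on its pages.

    Highlighting runs on the page records, whose content field is assumed
    to be stored.
    """
    params = {
        "q": " OR ".join(f"content:{t}" for t in terms),
        "fq": ["record_type:page", f"doc_id:{doc_id}"],
        "fl": "doc_id,page_number",
        "hl": "true",
        "hl.fl": "content",
        "wt": "json",
    }
    # doseq=True emits one fq parameter per list entry.
    return SOLR_SELECT + "?" + urlencode(params, doseq=True)


if __name__ == "__main__":
    print(document_query(["lucene", "apache"]))
    print(page_query(["lucene", "apache"], "42"))
```

With this split, a document whose terms fall on different pages still matches in step 1, and step 2 returns each page of that document containing at least one of the terms, together with its highlights.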