Re: Pointing to HBase for Documents or Directly Saving Documents at HBase

Otis Gospodnetic Thu, 11 Apr 2013 13:24:47 -0700

Source code is your best bet.  The wiki has info about how to use
highlighting, but not about how it is implemented.  But you don't need
to understand the implementation details to understand that highlights
are dynamic, computed specifically for each query for each matching
document, so you cannot store them anywhere ahead of time.
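
To make the point concrete, here is a toy sketch in Python.  It is not
Solr's highlighter and uses no Solr API; the `highlight` function, the
`window` parameter and the sample document are all made up for
illustration.  It only shows that the same document yields different
snippets for different queries, which is why snippets cannot be
precomputed and stored.

```python
import re

def highlight(text, query, window=30):
    """Toy highlighter (hypothetical, not Solr's implementation).

    For every occurrence of every query term, return a fragment of
    `text` around the match with the matched term wrapped in <em> tags.
    """
    snippets = []
    for term in query.lower().split():
        for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            start = max(0, m.start() - window)
            end = min(len(text), m.end() + window)
            fragment = (
                text[start:m.start()]
                + "<em>" + m.group(0) + "</em>"
                + text[m.end():end]
            )
            snippets.append(fragment)
    return snippets

# One stored document, two different queries.
doc = "Nutch crawls the web and Solr indexes the crawled pages for search."
snippets_for_solr = highlight(doc, "solr")
snippets_for_nutch = highlight(doc, "nutch crawls")
```

The two result lists differ even though `doc` never changes, and a
query with no matching term produces no snippets at all.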


Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, Apr 11, 2013 at 11:22 AM, Furkan KAMACI <furkankam...@gmail.com> wrote:
> Hi Otis;
>
> It seems that I should read more about highlights. Is there anywhere that
> explains in detail how highlights are generated in Solr?
>
> 2013/4/11 Otis Gospodnetic <otis.gospodne...@gmail.com>
>
>> Hi,
>>
>> You can't store highlights ahead of time because they are query
>> dependent.  You could store documents in HBase and use Solr just for
>> indexing.  Is that what you want to do?  If so, a custom
>> SearchComponent executed after QueryComponent could fetch data from
>> an external store like HBase.  I'm not sure if I'd recommend that.
>>
>> Otis
>> --
>> Solr & ElasticSearch Support
>> http://sematext.com/
>>
>>
>>
>>
>>
>> On Thu, Apr 11, 2013 at 10:01 AM, Furkan KAMACI <furkankam...@gmail.com>
>> wrote:
>> > Actually, I don't plan to store documents in Solr. I want to store just
>> > the highlights (snippets) in HBase and retrieve them from HBase when
>> > needed.
>> > What do you think about separating just the highlights from Solr and
>> > storing them in HBase with SolrCloud? Also, I would appreciate it if you
>> > could explain at which stage and how highlights are generated in Solr.
>> >
>> >
>> > 2013/4/9 Otis Gospodnetic <otis.gospodne...@gmail.com>
>> >
>> >> You may also be interested in looking at things like solrbase (on
>> >> GitHub).
>> >>
>> >> Otis
>> >> --
>> >> Solr & ElasticSearch Support
>> >> http://sematext.com/
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Sat, Apr 6, 2013 at 6:01 PM, Furkan KAMACI <furkankam...@gmail.com>
>> >> wrote:
>> >> > Hi;
>> >> >
>> >> > First of all, I should mention that I am new to Solr and am doing
>> >> > research on it. I will crawl some websites with Nutch and then index
>> >> > them with Solr. (Nutch 2.1, Solr-SolrCloud 4.2)
>> >> >
>> >> > I wonder about something. I have a cloud of machines that crawls
>> >> > websites and stores the documents. Then I send those documents to
>> >> > SolrCloud. Solr indexes the documents, generates the indexes, and
>> >> > saves them. I know from Information Retrieval theory that it *may*
>> >> > not be efficient to store indexes in a NoSQL database (they are
>> >> > something like linked lists, and if you store them in that kind of
>> >> > database you *may* end up with a sparse representation. By the way,
>> >> > there may be some solutions for this; if you can explain them, I
>> >> > would appreciate it.)
>> >> >
>> >> > However, Solr stores some documents too (i.e. highlights), so some of
>> >> > my documents will be duplicated. Since I will have many documents,
>> >> > those duplicated documents may cause a problem for me. So is there
>> >> > any way to avoid storing those documents in Solr and instead point to
>> >> > them in HBase (where I save my crawled documents), or, instead of
>> >> > pointing, to store them directly in HBase (is that efficient or not)?
>> >>
>>
