Hi,

I was also facing the issue of highlighting large text files. I applied the 
solution proposed here and it worked, but I am now getting the error shown 
further down in this message.

Basically, 'hitGrouped.vm' is not found. I am using solr-3.4.0. Where can I get 
this file from? It is referenced in browse.vm:

<div class="results">
  #if($response.response.get('grouped'))
    #foreach($grouping in $response.response.get('grouped'))
      #parse("hitGrouped.vm")
    #end
  #else
    #foreach($doc in $response.results)
      #parse("hit.vm")
    #end
  #end
</div>
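
A quick way to see which templates a Velocity file such as browse.vm expects is to scan it for #parse directives and compare the names against the files actually present. The following is only a minimal sketch: the function names and the directory argument are hypothetical, and the directory must be pointed at wherever the templates live in a given install, which this sketch does not try to guess.

```python
import re
from pathlib import Path

# Matches Velocity directives such as #parse("hit.vm") or #parse('hit.vm').
PARSE_DIRECTIVE = re.compile(r"""#parse\(\s*["']([^"']+)["']\s*\)""")


def referenced_templates(template_text):
    """Return the names pulled in via #parse, in order, without duplicates."""
    seen = []
    for name in PARSE_DIRECTIVE.findall(template_text):
        if name not in seen:
            seen.append(name)
    return seen


def missing_templates(template_text, template_dir):
    """Return the referenced names that are not files in template_dir.

    template_dir is an assumption supplied by the caller; it is not
    derived from any Solr configuration.
    """
    directory = Path(template_dir)
    return [
        name
        for name in referenced_templates(template_text)
        if not (directory / name).is_file()
    ]


if __name__ == "__main__":
    # The relevant part of browse.vm, as quoted in this message.
    browse_vm = """
    #if($response.response.get('grouped'))
      #parse("hitGrouped.vm")
    #else
      #parse("hit.vm")
    #end
    """
    print(referenced_templates(browse_vm))
```

Any name reported by missing_templates is a template that a #parse call would fail to load from that directory.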


HTTP Status 500 - Can't find resource 'hitGrouped.vm' in classpath or 
'C:\caprice\workspace\caprice\dist\DEV\solr\.\conf/', 
cwd=C:\glassfish3\glassfish\domains\domain1\config

java.lang.RuntimeException: Can't find resource 'hitGrouped.vm' in classpath or 
'C:\caprice\workspace\caprice\dist\DEV\solr\.\conf/', 
cwd=C:\glassfish3\glassfish\domains\domain1\config
 at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:268)
 at org.apache.solr.response.SolrVelocityResourceLoader.getResourceStream(SolrVelocityResourceLoader.java:42)
 at org.apache.velocity.Template.process(Template.java:98)
 at org.apache.velocity.runtime.resource.ResourceManagerImpl.loadResource(ResourceManagerImpl.java:446)
 at 

Thanks & Regards,
Anand
Anand Nigam
RBS Global Banking & Markets
Office: +91 124 492 5506   


-----Original Message-----
From: karsten-s...@gmx.de [mailto:karsten-s...@gmx.de] 
Sent: 21 October 2011 14:58
To: solr-user@lucene.apache.org
Subject: Re: Can Solr handle large text files?

Hi Peter,

Highlighting in large text files cannot be fast unless the original text is 
divided into small pieces.
So take a look at
http://xtf.cdlib.org/documentation/under-the-hood/#Chunking
and at
http://www.lucidimagination.com/blog/2010/09/16/2446/

This means that you should divide your files and use Result Grouping / Field 
Collapsing to list only one hit per original document.
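
As a rough illustration of this advice, the sketch below splits one file into line-based chunk documents and builds request parameters that group hits by the original file. It rests on several assumptions: the field name `file_id`, the `id` scheme, the chunk size, and the overlap are all hypothetical choices, and `body` is taken from the schema quoted further down. The parameters `group`, `group.field`, `group.limit`, `hl`, and `hl.fl` are standard Solr request parameters, but the field used for grouping has to be defined in the schema as a single-valued indexed field.

```python
from urllib.parse import urlencode


def chunk_lines(file_id, lines, lines_per_chunk=200, overlap=2):
    """Split one file's lines into overlapping chunk documents.

    Each chunk repeats `overlap` lines from the end of the previous one,
    so context lines around a match near a boundary stay in the same chunk.
    """
    if not 0 <= overlap < lines_per_chunk:
        raise ValueError("overlap must be >= 0 and smaller than lines_per_chunk")
    docs = []
    step = lines_per_chunk - overlap
    start = 0
    while start < len(lines):
        chunk = lines[start:start + lines_per_chunk]
        docs.append({
            "id": f"{file_id}#{len(docs)}",  # hypothetical unique key per chunk
            "file_id": file_id,              # hypothetical field shared by all chunks
            "body": "\n".join(chunk),
        })
        if start + lines_per_chunk >= len(lines):
            break  # this chunk already reaches the end of the file
        start += step
    return docs


def grouped_query_params(terms):
    """Build a query string that returns one highlighted hit per original file."""
    return urlencode({
        "q": "body:(" + " ".join(terms) + ")",
        "group": "true",
        "group.field": "file_id",  # collapse chunks back to their file
        "group.limit": 1,          # one chunk shown per file
        "hl": "true",
        "hl.fl": "body",
    })


if __name__ == "__main__":
    sample = [f"line {i}" for i in range(10)]
    for doc in chunk_lines("syslog", sample, lines_per_chunk=4, overlap=1):
        print(doc["id"], repr(doc["body"]))
    print(grouped_query_params(["error", "mail"]))
```

With this layout, a file that contains a term in five different chunks still shows up as a single group, because every chunk carries the same `file_id`.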

(XTF would also solve your problem "out of the box", but XTF does not use Solr.)

Best regards
  Karsten

-------- Original Message --------
> Date: Thu, 20 Oct 2011 17:59:04 -0700
> From: Peter Spam <ps...@mac.com>
> To: solr-user@lucene.apache.org
> Subject: Can Solr handle large text files?

> I have about 20k text files, some very small, but some up to 300MB, 
> and would like to do text searching with highlighting.
> 
> Imagine the text is the contents of your syslog.
> 
> I would like to type in some terms, such as "error" and "mail", and 
> have Solr return the syslog lines with those terms PLUS two lines of context.
> Pretty much just like Google's highlighting.
> 
> 1) Can Solr handle this?  I had extremely long query times when I 
> tried this with Solr 1.4.1 (yes, I was using TermVectors, etc.).  I 
> tried breaking the files into 1MB pieces, but searching became wonky: 
> it returned the wrong number of documents (i.e., if one file had a term 5 
> times, and that was the only file that had the term, I want 1 result, not 5 
> results).
> 
> 2) What sort of tokenizer would be best?  Here's what I'm using:
> 
>     <field name="body" type="text_pl" indexed="true" stored="true"
>            multiValued="false" termVectors="true" termPositions="true"
>            termOffsets="true" />
> 
>     <fieldType name="text_pl" class="solr.TextField">
>       <analyzer>
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.WordDelimiterFilterFactory"
>                 generateWordParts="0" generateNumberParts="0" catenateWords="0"
>                 catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>       </analyzer>
>     </fieldType>
> 
> 
> Thanks!
> Pete

