FastVectorHighlighter ignoring fragmenter parameter . . .

2010-12-04 Thread CRB
Got the FVH to work in Solr 3.1 (or at least I presume I have, given that I 
can see multi-color highlighting in the output).


But I am not able to get it to recognize the "regex" fragmenter. I get 
no change in output if I specify the fragmenter. In fact, I can even 
enter bogus names for the fragmenter and get no change in the output.


Grateful for any suggestions.

Settings and output below.

Christopher


*Query*

   http://localhost:8983/solr/10k-Fragments/select?
   q=content%3Aliquidity
   &rows=100
   &fl=id%2Ccontent
   &qt=standard
   &hl.fl=content
   &hl.useFastVectorHighlighter=true
   &hl=true
   &hl.fragmentsBuilder=colored
   &hl.fragmenter=regex

*Response* (Abbreviated)

   
   0
   47
   id,content
   true
   content:liquidity
   regex1text
   content
   colored
   standard
   true
   100

   . . .

   Liquidity is a measure of a
   bank's ability to fund loans and withdrawals of deposits in a cost-ef

   . . .

*Field listing in schema.xml*

   

*Highlighter listing in solrconfig.xml*

   

   
   
   100
   70
   0.5
   [-\w ,/\n\"']{20,200}


Re: FastVectorHighlighter ignoring fragmenter parameter . . .

2010-12-06 Thread CRB

Koji,

Thank you for the reply.

Being something of a novice with Solr, I would be grateful if you could 
clarify my next steps.


I infer from your reply that no implementation comparable to the regex 
fragmenter has yet been contributed for the FVH.


Thus I need to write my own custom implementations of the FragmentsBuilder 
<http://lucene.apache.org/java/3_0_1/api/contrib-fast-vector-highlighter/org/apache/lucene/search/vectorhighlight/FragmentsBuilder.html> 
and FragListBuilder 
<http://lucene.apache.org/java/3_0_1/api/contrib-fast-vector-highlighter/org/apache/lucene/search/vectorhighlight/FragListBuilder.html> 
interfaces to take in and apply the regex.


I would be happy to contribute back what I create.

Appreciate whatever guidance you can offer,

Christopher

On 2:59 PM, Koji Sekiguchi wrote:

(10/12/05 5:53), CRB wrote:

Got the FVH to work in Solr 3.1 (or at least I presume I have given I 
can see multi-color highlighting in the output.)

But I am not able to get it to recognize the "regex" fragmenter. I get 
no change in output if I specify the fragmenter. In fact, I can even 
enter bogus names for the fragmenter and get no change in the output.

Grateful for any suggestions.

Settings and output below.

Christopher


*Query*

http://localhost:8983/solr/10k-Fragments/select?
q=content%3Aliquidity
&rows=100
&fl=id%2Ccontent
&qt=standard
&hl.fl=content
&hl.useFastVectorHighlighter=true
&hl=true
&hl.fragmentsBuilder=colored
&hl.fragmenter=regex


Christopher,

Because the algorithm of the FVH is totally different from that of the 
(traditional) highlighter, the FVH doesn't look at hl.fragmenter and 
hl.formatter, but at hl.fragListBuilder and hl.fragmentsBuilder instead. 
I think your settings and request/response look good except for 
hl.fragmenter=regex. The FVH simply ignores that parameter.

Koji
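
As a rough illustration of Koji's point, the request above could be rebuilt with only the parameters the FVH reads. This is a sketch, not a tested Solr configuration: the host, core, and field are taken from the original query, while the fragListBuilder name "simple" is an assumption and must match whatever name is registered in your own solrconfig.xml.

```python
from urllib.parse import urlencode

# Host and core name are copied from the original query in this thread.
BASE_URL = "http://localhost:8983/solr/10k-Fragments/select"


def build_fvh_query(frag_list_builder="simple"):
    """Build a select URL that uses only FVH-aware highlighting parameters.

    "simple" is an assumed fragListBuilder name; replace it with the name
    registered in your solrconfig.xml.
    """
    params = [
        ("q", "content:liquidity"),
        ("rows", 100),
        ("fl", "id,content"),
        ("qt", "standard"),
        ("hl", "true"),
        ("hl.fl", "content"),
        ("hl.useFastVectorHighlighter", "true"),
        # The FVH reads these two; hl.fragmenter is left out on purpose
        # because, per Koji's reply, the FVH ignores it.
        ("hl.fragListBuilder", frag_list_builder),
        ("hl.fragmentsBuilder", "colored"),
    ]
    return BASE_URL + "?" + urlencode(params)


url = build_fvh_query()
print(url)
```

Dropping hl.fragmenter from the request changes nothing in the output, which matches what Christopher observed when he passed bogus fragmenter names.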




Using Saxon 9 as a response writer with Solr 3.1 . . ?

2010-12-06 Thread CRB

Has anyone been able to get Saxon 9 working with Solr 3.1?

I was following the wiki page 
(http://wiki.apache.org/solr/XsltResponseWriter), placing all the 
saxon-*.jar files in Jetty's lib/ext folder and starting with


   java -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl -jar start.jar

But I get an ugly dump of errors from Jetty:

   2010-12-06 13:29:16.515::WARN:  failed SolrRequestFilter
   java.lang.NoSuchMethodError: net.sf.saxon.dom.DOMEnvelope.getInstance()Lnet/sf/saxon/dom/DOMEnvelope;
      at net.sf.saxon.java.JavaPlatform.initialize(JavaPlatform.java:43)
      at net.sf.saxon.Configuration.init(Configuration.java:392)
      at net.sf.saxon.Configuration.(Configuration.java:311)
      at net.sf.saxon.xpath.XPathFactoryImpl.makeConfiguration(XPathFactoryImpl.java:41)
      at net.sf.saxon.xpath.XPathFactoryImpl.(XPathFactoryImpl.java:26)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
      at java.lang.reflect.Constructor.newInstance(Unknown Source)
      at java.lang.Class.newInstance0(Unknown Source)
      at java.lang.Class.newInstance(Unknown Source)
      at javax.xml.xpath.XPathFactoryFinder.loadFromService(Unknown Source)
      at javax.xml.xpath.XPathFactoryFinder._newFactory(Unknown Source)
      at javax.xml.xpath.XPathFactoryFinder.newFactory(Unknown Source)
      at javax.xml.xpath.XPathFactory.newInstance(Unknown Source)
      at javax.xml.xpath.XPathFactory.newInstance(Unknown Source)
      at org.apache.solr.core.Config.(Config.java:50)
      at org.apache.solr.servlet.SolrDispatchFilter.(SolrDispatchFilter.java:68)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
      at java.lang.reflect.Constructor.newInstance(Unknown Source)
      at java.lang.Class.newInstance0(Unknown Source)
      at java.lang.Class.newInstance(Unknown Source)
      at org.mortbay.jetty.servlet.Holder.newInstance(Holder.java:153)
      at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:94)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
      at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
      at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
      at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
      at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
      at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
      at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
      at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
      at org.mortbay.jetty.Server.doStart(Server.java:210)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
      at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
      at java.lang.reflect.Method.invoke(Unknown Source)
      at org.mortbay.start.Main.invokeMain(Main.java:183)
      at org.mortbay.start.Main.start(Main.java:497)
      at org.mortbay.start.Main.main(Main.java:115)





Function Query Syntax?

2010-12-29 Thread CRB

We have documents that consist of:

- A short list of terms (about 1 to 5 terms per document)
- An estimate of the probability of the terms' occurrence (stored as 
tint)

For each term in the index, we would like to get the result of the 
following function:


(our estimate of the probability/100) x (a term's Document Frequency)

So if the term "fox" occurred in 7 documents, the desired query result 
would look something like:



term: fox
document frequency: 7
probability estimate: 23
result: 1.61


We can find a number of examples of using function queries to alter 
scoring or sort results, but cannot find any that show how to get 
the actual value of the function result back.
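
The arithmetic being asked for can be sketched outside Solr. This is plain Python, not function query syntax, and it assumes a single probability estimate (on a 0 to 100 scale) per term; the function name weighted_doc_freq is made up for illustration.

```python
def weighted_doc_freq(probability_percent, doc_freq):
    """Return (probability / 100) x (document frequency) for one term."""
    return probability_percent / 100 * doc_freq


# The "fox" example from the message: document frequency 7, estimate 23.
rows = [("fox", 7, 23)]

for term, doc_freq, probability in rows:
    result = round(weighted_doc_freq(probability, doc_freq), 2)
    print(term, doc_freq, probability, result)
```

With the values from the message (7 documents, estimate 23) this gives 1.61, which agrees with the desired result listed above.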




edismax - Handling collocations mapped to a single token . . ?

2011-06-28 Thread CRB
We are trying to get edismax to handle collocations mapped to a single 
token. To do so we need to manipulate the "chunks" (as Hoss referred to 
them in http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/) 
generated by the dismax parser. We have numerous collocations (figures of 
speech whose meaning does not directly relate to the constituent words that 
make up the saying). For example, at index time "real estate" is mapped to 
"real_estate" to avoid it colliding with searches for "estate" or "real 
value". So we need the "chunks" to reflect this mapping of multi-word 
phrases to a single token that is done during indexing (via the synonym 
filter).


In an ideal world, we would just name the queryAnalyzerFieldType that 
should be used to pre-process the query string before it is divided 
into "chunks" (similar to what is done with the SpellCheckComponent).


But our impression thus far is that this is not supported and that we will 
need to hack away at 
org.apache.solr.search.ExtendedDismaxQParser.splitIntoClauses(String, 
boolean).


Is it correct that the only pre-processing by dismax is on stopwords?

Would it be sufficient to limit our customization to 
splitIntoClauses(String, boolean) to handle this?


Regards,

Christopher
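
The pre-processing described in this message can be sketched as follows. This is only an illustration of the idea, not edismax's splitIntoClauses logic: the COLLOCATIONS table and both function names are hypothetical, and a plain whitespace split stands in for the parser's real chunking.

```python
import re

# Hypothetical mapping that mirrors the index-time synonym filter.
COLLOCATIONS = {"real estate": "real_estate"}


def map_collocations(query, collocations=COLLOCATIONS):
    """Rewrite each known multi-word phrase in the query as its single token."""
    # Longest phrases first, so a longer collocation wins over a shorter one.
    for phrase in sorted(collocations, key=len, reverse=True):
        words = [re.escape(word) for word in phrase.split()]
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        replacement = collocations[phrase]
        query = re.sub(pattern, lambda _match: replacement, query,
                       flags=re.IGNORECASE)
    return query


def split_into_chunks(query):
    """Map collocations first, then split on whitespace (a simplification)."""
    return map_collocations(query).split()


print(split_into_chunks("cheap real estate prices"))
print(split_into_chunks("real value"))
```

Because the mapping runs before the split, "real estate" reaches the chunking step as one token, while "real value" is still split into two.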







Solr Size Estimator (JIRA#3435) . . .

2011-09-23 Thread CRB

Hi,

In working through some updates for the Solr Size Estimator, I have 
found a number of gaps in the Solr Wiki. I've Googled each of these to a 
fair degree and found either nothing or an insufficient explanation.


In particular, for each of the following I'm looking for:
A) An explanation of what it is
B) How to use it or estimate its size

Topics:
1) fieldValueCache
2) RamBufferSize
3) Transient Factor
4) Average number of Bytes per Term
5) Cache Key Average Size (Bytes)
6) Average QueryResultKey size (in bytes)

Appreciate any input, so I can update the Solr Wiki as needed.

C


Faceting is not Using Field Value Cache . . ?

2011-11-22 Thread CRB


Seeing something odd going on with faceting . . . we execute facets with 
every query and yet the fieldValueCache is not being used:


name:  fieldValueCache
class:  org.apache.solr.search.FastLRUCache
version:  1.0
description:  Concurrent LRU Cache(maxSize=1, initialSize=10, 
minSize=9000, acceptableSize=9500, cleanupThread=false)

stats: lookups : 0
hits : 0
hitratio : 0.00
inserts : 0
evictions : 0
size : 0
warmupTime : 0
cumulative_lookups : 0
cumulative_hits : 0
cumulative_hitratio : 0.00
cumulative_inserts : 0
cumulative_evictions : 0

I was under the impression that the fieldValueCache was an implicit cache 
(if you don't define it, it will still exist).


We are running Solr v3.3 (and NOT using {!cache=false}).

Thoughts?