Solr on JBOSS 4.0.3

2007-05-31 Thread Thierry Collogne
Hello, We are trying to run Solr on JBOSS 4.0.3, and are having an issue. When we deploy the war and start our server we get an ExceptionInInitializerError. This is part of the stacktrace: Caused by: java.lang.RuntimeException: XPathFactory#newInstance() failed to create an XPathFactory for

Re: SOLR Indexing/Querying

2007-05-31 Thread Frans Flippo
I think if you add a field that has an analyzer that creates tokens on alpha/digit/punctuation boundaries, that should go a long way. Use that both at index and search time. For example:
* 3555LHP becomes 3555 LHP
Searching for D3555 becomes D OR 3555, so it matches on token 3555 from 3555LHP.
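The boundary splitting described in this message can be sketched outside Solr as follows. This is only an illustration in plain Python, not Solr's analyzer code: the function names `split_on_boundaries` and `matches` are hypothetical, and the regex assumes ASCII letters and digits only.

```python
import re

def split_on_boundaries(text):
    """Split text into runs of letters and runs of digits.

    Punctuation and whitespace act as separators and are dropped.
    """
    return re.findall(r"[A-Za-z]+|[0-9]+", text)

def matches(query, indexed):
    """OR semantics: true if any query token equals any indexed token."""
    query_tokens = {t.lower() for t in split_on_boundaries(query)}
    indexed_tokens = {t.lower() for t in split_on_boundaries(indexed)}
    return bool(query_tokens & indexed_tokens)

print(split_on_boundaries("3555LHP"))  # ['3555', 'LHP']
print(split_on_boundaries("D3555"))    # ['D', '3555']
print(matches("D3555", "3555LHP"))     # True
```

Applying the same splitting at index and query time is what lets the shared token 3555 produce the match.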

AW: SOLR Indexing/Querying

2007-05-31 Thread Burkamp, Christian
Hi there, It looks a lot like using Solr's standard WordDelimiterFilter (see the sample schema.xml) does what you need. It splits on alphabetical to numeric boundaries and on the various kinds of intra word delimiters like -, _ or .. You can decide whether the parts are put together again in

Re: facet question

2007-05-31 Thread Yonik Seeley
On 5/31/07, Gal Nitzan [EMAIL PROTECTED] wrote: We have a small index with about 4 million docs. On this index we have a field tags which is a multiple values field. Running a facet query on the index with something like: facet=true&facetField=tags&q=type:video takes about 1 minute. We have

Re: Schema question: overriding fieldType attributes in field element

2007-05-31 Thread Yonik Seeley
On 5/31/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: I am trying to override the tokenized attribute of a single FieldType from the field attribute in schema.xml, but it doesn't seem to work The tokenized attribute is not settable from the schema, and there is no reason I can think of why

Re: Schema question: overriding fieldType attributes in field element

2007-05-31 Thread RWatkins
Thanks for the prompt response. Comments below ... [EMAIL PROTECTED] wrote on 05/31/2007 10:55:57 AM: On 5/31/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: I am trying to override the tokenized attribute of a single FieldType from the field attribute in schema.xml, but it doesn't seem to

Re: Schema question: overriding fieldType attributes in field element

2007-05-31 Thread Yonik Seeley
On 5/31/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: You say the tokenized attribute is not settable from the schema, but the output from IndexSchema.readConfig shows that the properties are indeed read, and the resulting SchemaField object retains these properties: are they then ignored?

Commit failing with EOFException

2007-05-31 Thread Matt Mitchell
Hi, I've had this application running before and not sure what has changed to cause this error. When trying to do a clean update (removed index dir and restarted solr) with just a <commit/>, Solr is returning a status 1 with this error at the top: java.io.EOFException: input contained no

Re: Schema question: overriding fieldType attributes in field element

2007-05-31 Thread RWatkins
Thanks, but I think I'm going to have to work out a different solution. I have written my own analyzer that does everything I need: it's not a different analyzer I need but a way to specify that certain fields should be tokenized and others not -- while still leaving all other options open. As

Re: Commit failing with EOFException

2007-05-31 Thread Matt Mitchell
OK figured this out. The short of it is, make sure your schema is always up to date! :) The schema did not match the xml docs being posted. And because we had a previous solr update with those docs, even trying to post/update a <commit/> was failing because there was already bad data

Re: AW: SOLR Indexing/Querying

2007-05-31 Thread Chris Hostetter
: It looks a lot like using Solr's standard WordDelimiterFilter (see the : sample schema.xml) does what you need. WordDelimiterFilter will only get you so far. it can split the indexed text of 3555LHP into tokens 3555 and LHP; and the user entered D3555 into the tokens D and 3555 -- but because

Re: Schema question: overriding fieldType attributes in field element

2007-05-31 Thread RWatkins
Chris Hostetter [EMAIL PROTECTED] wrote on 05/31/2007 02:28:58 PM: I'm having a little trouble following this discussion, first off as to your immediate issue... : Thanks, but I think I'm going to have to work out a different solution. I : have written my own analyzer that does everything

Re: AW: SOLR Indexing/Querying

2007-05-31 Thread Walter Underwood
I solved something similar to this by creating a stemmer for part numbers. Variations like -BN on the end can be treated as inflections in the part number language, similar to plurals in English. I used a set of regexes to match and transform, in some cases generating multiple root part numbers.
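A minimal sketch of the regex-based part-number stemming idea, assuming a single hypothetical rule that strips a short trailing variant code such as -BN. The original message does not list its actual regexes, so `SUFFIX_RULES` and `stem_part_number` are illustrative names only.

```python
import re

# Hypothetical rules; the real rule set is not given in the message.
SUFFIX_RULES = [
    re.compile(r"-[A-Z]{1,3}$"),  # trailing variant code such as -BN
]

def stem_part_number(part):
    """Return the sorted candidate root part numbers for one part number.

    The normalized input is always kept, so a part number can map to
    more than one root.
    """
    normalized = part.strip().upper()
    roots = {normalized}
    for rule in SUFFIX_RULES:
        stripped = rule.sub("", normalized)
        if stripped:
            roots.add(stripped)
    return sorted(roots)

print(stem_part_number("3555LHP-BN"))  # ['3555LHP', '3555LHP-BN']
print(stem_part_number("3555lhp"))     # ['3555LHP']
```

Indexing every returned root lets a query for the base part number find its suffixed variants, much as a stemmer lets a singular query match a plural.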

RE: facet question

2007-05-31 Thread Gal Nitzan
-Original Message- From: Mike Klaas [mailto:[EMAIL PROTECTED] Sent: Thursday, May 31, 2007 9:07 PM To: solr-user@lucene.apache.org Subject: Re: facet question On 31-May-07, at 1:33 AM, Gal Nitzan wrote: Hi, We have a small index with about 4 million docs. On this index

Re: Schema question: overriding fieldType attributes in field element

2007-05-31 Thread Chris Hostetter
: Unfortunately, unless I've missed something obvious, the tokenized : property is not available to classes that extend FieldType: the setArgs() : method of FieldType strips tokenized and other standard properties away : before calling the init() method. Yes, of course one could override :

Re: facet question

2007-05-31 Thread Mike Klaas
On 31-May-07, at 1:35 PM, Gal Nitzan wrote: However, the cache size brings us to the 2GB limit. If the cardinality of many of the tags is low, you can use HashSet-based filters (the default size at which a HashSet is used is 3000). [Gal Nitzan] I will appreciate a pointer to documentation

RE: facet question

2007-05-31 Thread Gal Nitzan
-Original Message- From: Mike Klaas [mailto:[EMAIL PROTECTED] Sent: Friday, June 01, 2007 12:36 AM To: solr-user@lucene.apache.org Subject: Re: facet question On 31-May-07, at 1:35 PM, Gal Nitzan wrote: However, the cache size brings us to the 2GB limit. If the

RE: RAMDirecotory instead of FSDirectory for SOLR

2007-05-31 Thread Jeryl Cook
That's the thing, Terracotta persists everything it has in memory to the disk when it overflows (you can set how much you want to use in memory), or when the server goes offline. When the server comes back the master terracotta simply loads it back into the memory of the once offline

RE: facet question

2007-05-31 Thread Chris Hostetter
: Also, I'm still suspicious about your application. You have 1.5M : distinct tags for 4M documents? That seems quite dense. it's possible the app is using the filterCache for other things (on other fields) besides just the tag field ... but that still doesn't explain one thing... :

RE: RAMDirecotory instead of FSDirectory for SOLR

2007-05-31 Thread Chris Hostetter
: That's the thing, Terracotta persists everything it has in memory to the : disk when it overflows (you can set how much you want to use in memory), or : when the server goes offline. When the server comes back the master : terracotta simply loads it back into the memory of the once offline Sure ..

RE: RAMDirecotory instead of FSDirectory for SOLR

2007-05-31 Thread Jeryl Cook
I have gotten Terracotta to work with Lucene, and it works fine with the RAMDirectory. I am trying to get it to work with SOLR (hooking the RAMDirectory); when I do, I'll post the findings, problems, etc. Thanks for the feedback from everyone. Jeryl Cook /^\ Pharaoh /^\