Boosting in version 1.2

2007-06-08 Thread Thierry Collogne
Hello, Our documents contain three fields: title, keywords, content. What we want is to give priority to the field keywords, then title, and last content. So in our XML file that is to be indexed we put the following: <doc> <field name="keywords" boost="3.0">letters</field> <field
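For readers following this thread, here is a minimal sketch of how a document with per-field index-time boosts like the one in the snippet above could be generated. It assumes the usual <add><doc>...</doc></add> wrapper of Solr's XML update format. The helper name build_add_xml, the title boost of 2.0, and the sample title and content values are hypothetical and do not come from the original post.

```python
import xml.etree.ElementTree as ET


def build_add_xml(fields):
    """Build a Solr-style <add><doc> message.

    fields is a list of (name, value, boost) tuples; boost may be None,
    in which case no boost attribute is written for that field.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value, boost in fields:
        field = ET.SubElement(doc, "field", name=name)
        if boost is not None:
            field.set("boost", str(boost))
        field.text = value
    return ET.tostring(add, encoding="unicode")


# Only the keywords field (boost 3.0, value "letters") comes from the post;
# the other values are illustrative assumptions.
xml_payload = build_add_xml([
    ("keywords", "letters", 3.0),
    ("title", "An example title", 2.0),
    ("content", "Example body text", None),
])
print(xml_payload)
```

Building the message with an XML library rather than string concatenation also takes care of escaping reserved characters in field values.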

How does HTMLStripWhitespaceTokenizerFactory work?

2007-06-08 Thread Thierry Collogne
Hello, I am trying to use the solr.HTMLStripWhitespaceTokenizerFactory analyzer with no luck. I have a field content that contains the following: <field name="content"><![CDATA[test <a href="test">link</a> post]]></field> When I do a search I get the following result

How can I use dates to boost my results?

2007-06-08 Thread Daniel Alheiros
Hi, For my search use, document freshness is a relevant aspect that should be considered to boost results. I have a field in my index like this: <field name="created" type="date" indexed="true" stored="true" /> How can I make good use of this to boost my results? I'm using the DisMaxRequestHandler

Re: Multi-language indexing and searching

2007-06-08 Thread Henrib
Hi Daniel, If it is functionally 'ok' to search in only one lang at a time, you could try having one index per lang. Each per-lang index would have one schema where you would describe field types (the lang part coming through stemming/snowball analyzers, per-lang stopwords al) and the same field

Re: Multi-language indexing and searching

2007-06-08 Thread Daniel Alheiros
Hi Henri. Thanks for your reply. I've just looked at the patch you referred, but doing this I will lose the out of the box Solr installation... I'll have to create my own Solr application responsible for creating the multiple cores and I'll have to change my indexing process to something able to

problem with schema.xml

2007-06-08 Thread mirko
Hi, I just started playing around with Solr 1.2. It has some nice improvements. I noticed that errors in the schema.xml get reported in a verbose way now, but the following steps cause a problem for me: 1. start with a correct schema.xml - Solr works fine 2. edit it in a way that is no longer

Re: How does HTMLStripWhitespaceTokenizerFactory work?

2007-06-08 Thread Yonik Seeley
On 6/8/07, Thierry Collogne [EMAIL PROTECTED] wrote: I am trying to use the solr.HTMLStripWhitespaceTokenizerFactory analyzer with no luck. [...] Is this normal? Shouldn't the html code and the white spaces be removed from the field? For indexing purposes, yes. The stored field you get back

Cannot index '' this character using post.jar

2007-06-08 Thread Tiong Jeffrey
Hi all, I tried to index a document that has '' using post.jar. But during indexing it causes an error and the indexing won't finish. Can I know why this is and how to prevent it? Thanks! Jeffrey

Re: Boosting in version 1.2

2007-06-08 Thread Mike Klaas
On 8-Jun-07, at 2:07 AM, Thierry Collogne wrote: Hello, Our documents contain three fields: title, keywords, content. What we want is to give priority to the field keywords, then title, and last content. In our schema.xml we have put <defaultSearchField>text</defaultSearchField> <copyField

Re: problem with schema.xml

2007-06-08 Thread Ryan McKinley
I don't use tomcat, so I can't be particularly useful. The behavior you describe does not happen with resin or jetty... My guess is that tomcat is caching the error state. Since fixing the problem is outside the webapp directory, it does not think it has changed so it stays in a broken

Re: problem with schema.xml

2007-06-08 Thread mirko
Hi Ryan, I have my .war file located outside the webapps folder (I am using multiple Solr instances with a config as suggested on the wiki: http://wiki.apache.org/solr/SolrTomcat). Nevertheless, I touched the .war file, the config file, the directory under webapps, but nothing seems to be

Re: To make sure XML is UTF-8

2007-06-08 Thread Funtick
Tiong Jeffrey wrote: Though this is not directly related to Solr, I have an XML output from a mysql database, but during indexing the XML output is not working. The problem is that part of the XML output is not in UTF-8 encoding. How can I convert it to UTF-8, and how do I know what kind

Re: To make sure XML is UTF-8

2007-06-08 Thread funtick
Though this is not directly related to Solr, I have an XML output from a mysql database, but during indexing the XML output is not working. The problem is that part of the XML output is not in UTF-8 encoding. How can I convert it to UTF-8, and how do I know what kind of coding it uses in the
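As a rough illustration of the conversion being asked about, the sketch below first tries to read the bytes as UTF-8 and, if that fails, re-decodes them with a fallback encoding before writing UTF-8 back out. The fallback of latin-1 is purely an assumption, since the thread does not say what encoding the mysql export actually uses, and the function name to_utf8 is hypothetical.

```python
def to_utf8(raw: bytes, fallback_encoding: str = "latin-1") -> bytes:
    """Return raw re-encoded as UTF-8.

    If raw is already valid UTF-8 it is returned with the same content.
    Otherwise it is decoded with fallback_encoding, which is an assumed
    guess at the source encoding, not something detected from the data.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode(fallback_encoding)
    return text.encode("utf-8")


# Example: a latin-1 encoded string is converted, UTF-8 input is left alone.
latin1_bytes = "café".encode("latin-1")
utf8_bytes = "café".encode("utf-8")
converted = to_utf8(latin1_bytes)
unchanged = to_utf8(utf8_bytes)
```

If the file carries an XML declaration, its encoding attribute would also need to agree with the bytes actually written.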

Re: Multi-language indexing and searching

2007-06-08 Thread Chris Hostetter
: Can't I have the same index, using one single core, same field names being : processed by language specific components based on a field/parameter? yes, but you don't really need the complexity you describe below ... you don't need separate request handlers per language, just separate fields

Re: Solr 1.2 released

2007-06-08 Thread Jack L
Hello Yonik, This is great news. Will it be a drop-in replacement for 1.1? I.e., do I need to make any changes other than replacing the jar files? I suppose the index files will still be good. Are 1.2 schema files and config files compatible with those of 1.1? -- Best regards, Jack Thursday,

Re: Solr 1.2 released

2007-06-08 Thread Yonik Seeley
On 6/8/07, Jack L [EMAIL PROTECTED] wrote: This is great news. Will it be a drop-in replacement for 1.1? I.e., do I need to make any changes other than replacing the jar files? I suppose the index files will still be good. Are 1.2 schema files and config files compatible with those of 1.1? It

Re: Wildcards / Binary searches

2007-06-08 Thread Chris Hostetter
: Do you mean something like below ? : <field name="autocomplete">w wo wor word</field> yeah, but there are some Tokenizers that make this trivial (EdgeNGramTokenizer I think is the name) : project, definitively not a good practice for portability of indexes. A : duplicate field with an analyser to
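The prefix expansion quoted above ("w wo wor word") can be sketched in a few lines. This is only an illustration of the edge n-gram idea done on the client side, not Solr's tokenizer itself; the function name edge_ngrams and its min_len and max_len parameters are hypothetical.

```python
def edge_ngrams(term, min_len=1, max_len=None):
    """Return the leading prefixes of term, from min_len up to max_len characters."""
    if max_len is None:
        max_len = len(term)
    upper = min(max_len, len(term))
    return [term[:n] for n in range(min_len, upper + 1)]


# Reproduces the autocomplete field value quoted in the thread.
autocomplete_value = " ".join(edge_ngrams("word"))
print(autocomplete_value)  # w wo wor word
```

Generating the prefixes at analysis time inside the index, as suggested in the reply, avoids having to precompute this value in every indexing client.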

RE: Solr 1.2 released

2007-06-08 Thread Teruhiko Kurosaka
I noticed there is no example/ext directory or jars that was found there in 1.1 (commons-el.jar, commons-logging.jar, jasper-*.jar, mx4j-*.jar) I have a jar that my Solr plugin depends on. This jar contains a class that needs to be loaded only once per container because it is a JNI library.

Re: solr+hadoop = next solr

2007-06-08 Thread Jeff Rodenburg
On 6/7/07, Rafael Rossini [EMAIL PROTECTED] wrote: Hi, Jeff and Mike. Would you mind telling us about the architecture of your solutions a little bit? Mike, you said that you implemented a highly-distributed search engine using Solr as indexing nodes. What does that mean? You guys

RE: Solr 1.2 released

2007-06-08 Thread Chris Hostetter
: I noticed there is no example/ext : directory or jars that was found there : in 1.1 (commons-el.jar, commons-logging.jar, : jasper-*.jar, mx4j-*.jar) the example/ext directory was an entirely Jetty based artifact. when we upgraded the Jetty used in the example setup, Jetty no longer had an ext