Re: field type for text search

2012-07-25 Thread Lance Norskog
String fields can be searched, but you have to give the complete string, or a wildcard. And the upper/lower case has to be right. On Tue, Jul 24, 2012 at 11:13 AM, Xiao Li shinelee.thew...@gmail.com wrote: I have used Solr 3.4 for a long time. Recently, when I upgrade to Solr 4.0 and reindex
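The matching rule described in this reply can be illustrated with a toy model. This is only a sketch, not Solr code: the function name string_field_matches is hypothetical, and Python's fnmatchcase stands in for wildcard handling (it also treats [ ] as a character class, which Solr's query syntax does not).

```python
from fnmatch import fnmatchcase


def string_field_matches(indexed_value: str, query: str) -> bool:
    """Toy model of matching against an untokenized, case-sensitive field.

    The whole stored value is treated as one term, so a query matches only
    if it equals the complete value or is a wildcard pattern covering it.
    """
    return fnmatchcase(indexed_value, query)


if __name__ == "__main__":
    value = "Apache Solr"
    for q in ("Apache Solr", "Solr", "*Solr", "apache solr"):
        print(q, "->", string_field_matches(value, q))
```

In this model a single word from the value does not match on its own, which is the behaviour the reply warns about.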

Significance of Analyzer Class attribute

2012-07-25 Thread Rajani Maski
Hi, What is the significance of the Analyzer class attribute? When I specify an analyzer class in the schema, something like below, and run analysis on this field in the analysis page, I can't see verbose output for the tokenizer and filters: fieldType name=text_chinese class=solr.TextField analyzer

Re: SOLR 4.0-ALPHA : DIH : Indexed and Committed Successfully but Index is empty

2012-07-25 Thread Chantal Ackermann
Hi Hoss, Did you perhaps forget to include RunUpdateProcessorFactory at the end? What is that? ;-) I had copied the config from http://wiki.apache.org/solr/UpdateRequestProcessor but removed the lines I thought I did not need. :-( I've changed my configuration, and this is now WORKING

Re: Invalid or unreadable WAR file : .../solr.war when starting solr 3.6.1 app on Tomcat 7 (upgrade to 7.0.29/upstream)?

2012-07-25 Thread Chantal Ackermann
Hi, I haven't been following from the beginning but am still curious: is the war file on a shared fs? See also: http://www.mail-archive.com/users@tomcat.apache.org/msg79555.html

Re: NumberFormatException while indexing TextField with LengthFilter and then copying to tfloat

2012-07-25 Thread Chantal Ackermann
Here are the working solutions for: 3.6.1 (or lower probably) via ScriptTransformer in data-config.xml: function prepareData(row) { var cols = new java.util.ArrayList(); cols.add(spent_hours);

Re: DIH XML configs for multi environment

2012-07-25 Thread Pranav Prakash
Jerry, Glad it worked for you. I will also do the same thing. This seems easier for me, as I have a solr start shell script, which sets the JVM params for master/slave, Xmx and so on according to the environment. Setting a jdbc connect url in the start script is more convenient than changing the

Re: Significance of Analyzer Class attribute

2012-07-25 Thread Ahmet Arslan
When I specify analyzer class in schema, something like below and do analysis on this field in analysis page : I can't see verbose output on tokenizer and filters fieldType name=text_chinese class=solr.TextField analyzer

Re: Binary content index with multiple cores

2012-07-25 Thread Ahmet Arslan
I am trying to get the content from binary documents using solr-cell, and I am following the wiki for that. After adding some missing classes, I arrive at this exception: Caused by: org.apache.solr.common.SolrException: Error Instantiating Request Handler, solr.extraction.ExtractingRequestHandler

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Yes faceting works as before. Regarding the cache, the suggestion is to disable the cache for realtime NRT, for now. Regards, Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org On 7/24/2012 2:57 PM, Andy wrote: Nagendra, Does RankingAlgorithm work with faceting

javabin binary format specification

2012-07-25 Thread Zsolt Czinkos
Hi all, Sorry, but I could not find any spec on the binary format SolrJ is using. Can you point me to a URL, if any? Thanks in advance. Best, Zsolt

Re: javabin binary format specification

2012-07-25 Thread Ahmet Arslan
Sorry, but I could not find any spec on the binary format SolrJ is using. Can you point me to a URL, if any? Maybe this? https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/response/BinaryResponseWriter.java

Re: Binary content index with multiple cores

2012-07-25 Thread Ahmet Arslan
I'm using the 3.6.0 version. Are you using the following lib directives (defined in solrconfig.xml) <lib dir="../../../dist/" regex="apache-solr-cell-\d.*\.jar" /> <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> or did you manually copy the above jar files into the solrHome/coreName/lib directory?

Re: Binary content index with multiple cores

2012-07-25 Thread davidbougearel
Here is my solrconfig.xml for one of the cores: <lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" /> <lib dir="../../contrib/extraction/lib" regex=".*\.jar" /> ... I've added the maven dependencies like this for the solr war : dependencies combine.self=override

Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Ugo Matrangolo
Hi, I'm working on making our autocomplete engine a bit smarter. The actual impl is a basic facet-based autocompletion as described in the 'SOLR 3 Enterprise Search' book: we use all the typed tokens except the last one to build a facet.prefix query on an autocomplete facet field we built at

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Each request thread may return updated results. Each component may also in certain cases return updated results. The algorithm is designed to handle these. The granularity of the returned results can be controlled through a visible parameter. Regards, Nagendra Nagarajayya

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Mark Miller
You are changing the name, or someone at Apache told you the current name is okay? If someone at Apache told you it was okay, who was that? You are certainly not using the Solr mark in an approved manner and I'd hope if you are going to take advantage of our mailing list for promotion of your

Re: Copy lucene index into Solr

2012-07-25 Thread spredd1208
So I am familiar with SOLR. I have built fairly extensive solr indexes already. What I am trying to do now is build very large indexes and then move them over to a sharded server environment that will allow clients to query them. These indexes will be built once and will not be updated or need

Re: Copy lucene index into Solr

2012-07-25 Thread spredd1208
On Wed, Jul 25, 2012 at 8:58 AM, spredd1208 [via Lucene] ml-node+s472066n3997251...@n3.nabble.com wrote: So I am familiar with SOLR. I have built fairly extensive solr indexes already. What I am trying to do now is build very large indexes and then move them over to a sharded server

Solr zk client stopping sending data

2012-07-25 Thread Trym R. Møller
Hi, When running a Solr cloud cluster, after a while a Solr instance loses its connection to its ZooKeeper cluster, as seen in the ZooKeeper log below. The Solr instance reconnects to another ZooKeeper in the ZK cluster and the only thing seen in the Solr log (running warning level) is a newly programmatic created

Re: Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Chantal Ackermann
Hi Ugo, You can use facet.prefix on a tokenized field instead of a String field. Example: <field name="product" type="string" … /> <field name="product_tokens" type="text_split" … /> <!-- use e.g. WhitespaceTokenizer or WordDelimiter and others, see example schema.xml that comes with SOLR --> facet.prefix

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Andy
But Solr relies on the cache in faceting for performance reasons. If it is required to disable the cache then faceting would be very slow under RankingAlgorithm, no? From: Nagendra Nagarajayya nnagaraja...@transaxtions.com To: solr-user@lucene.apache.org

separation of indexes to optimize facet queries without fulltext

2012-07-25 Thread Daniel Brügge
Hi, I currently have one big sharded Solr setup storing a couple of million documents with some 'small' fields and one fulltext field in each doc. The latter blows up the index. My thought was that I could separate the indexes. So for the facet queries where I don't need fulltext search (so also no

Re: Solr zk client stopping sending data

2012-07-25 Thread Mark Miller
Hard to determine from this info. With only WARN logging, you don't see a lot of what is happening. What is the approximate date of this build? Close can actually be called quite often - it's used to cancel recoveries, but new recovery threads can easily start up. What you should see if Recovery truly

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Mark, Grant Ingersoll from ASF got in touch with me to ensure that I am compliant with the Apache Trade Mark. I made changes to the names, web pages, wiki, papers, etc. and sent back the links to Grant for approval. You may want to check with Grant. Regarding the fork, I am not creating a

numFound inconsistent for different rows-param

2012-07-25 Thread patrick
hi, i'm running two solr v3.6 instances: rdta01:9983/solr/msg-core : 8 documents rdta01:28983/solr/msg-core : 4 documents the following two queries with rows=10 resp rows=0 return different numFound results which confuses me. i hope someone can clarify this behaviour. URL with rows=10:

Re: Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Ugo Matrangolo
Hi, thank you for the suggestions. However, I think that this is not going to work. Suppose I have a product with a title='kMix Espresso maker'. If I tokenize this and put the result in product_tokens I should get '[kMix][Espresso][maker]'. If now I try to search with

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Mark Miller
On Wed, Jul 25, 2012 at 11:03 AM, Nagendra Nagarajayya nnagaraja...@transaxtions.com wrote: Mark, Grant Ingersoll from ASF got in touch with me to ensure that I am compliant with the Apache Trade Mark. I made changes to the names, web pages, wiki, papers, etc. and sent back the links to

Re: Invalid or unreadable WAR file : .../solr.war when starting solr 3.6.1 app on Tomcat 7 (upgrade to 7.0.29/upstream)?

2012-07-25 Thread k9157
Hi On Wed, Jul 25, 2012, at 01:37 AM, Chantal Ackermann wrote: I haven't been following from the beginning but am still curious: is the war file on a shared fs? No, it's not. Atm, all's on one fs. See also: http://www.mail-archive.com/users@tomcat.apache.org/msg79555.html

Skip first word

2012-07-25 Thread Finotti Simone
Hi, is there a tokenizer and/or a combination of filters to remove the first term from a field? For example: The quick brown fox should be tokenized as: quick brown fox. Thank you in advance, S

Re: Skip first word

2012-07-25 Thread Ahmet Arslan
is there a tokenizer and/or a combination of filter to remove the first term from a field? For example: The quick brown fox should be tokenized as: quick brown fox There is no such filter that I know of. However, you can implement one by modifying the source code of LengthFilterFactory
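The idea of a filter that drops the first term can be sketched outside Solr as follows. This is a language-neutral illustration in Python, not a Lucene TokenFilter; the names skip_first_token and analyze are hypothetical, and whitespace splitting is assumed as the tokenizer.

```python
from typing import Iterable, Iterator, List


def skip_first_token(tokens: Iterable[str]) -> Iterator[str]:
    """Pass through every token except the one at position 0."""
    for position, token in enumerate(tokens):
        if position > 0:
            yield token


def analyze(text: str) -> List[str]:
    """Whitespace-tokenize the text, then drop the first token."""
    return list(skip_first_token(text.split()))


if __name__ == "__main__":
    print(analyze("The quick brown fox"))
```

A real implementation would keep a per-stream position counter in the filter and reset it for each new field value, in the same way this sketch restarts enumeration on every call.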

Re: Binary content index with multiple cores

2012-07-25 Thread Ahmet Arslan
<lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" /> <lib dir="../../contrib/extraction/lib" regex=".*\.jar" /> That's okay. Do you see something like the line below in the logs: INFO: Adding 'file:/Users/iorixxx/Desktop/solr-trunk/solr/contrib/extraction/lib/tika-parsers-1.1.jar' to classloader

Download of old solr releases

2012-07-25 Thread Nicolas Dietrich
Hi there, it looks like the old releases have been removed from the download servers, for example http://apache.mirrors.tds.net/lucene/solr/1.4.1/apache-solr-1.4.1.tgz Is this on purpose or a mistake, or have I overlooked something? Thanks for the clarification. Cheers, Nicolas

Re: Download of old solr releases

2012-07-25 Thread Chris Hostetter
: it looks like the old releases have been thrown out of the download : servers, for example This is standard practice for apache projects so that the mirror network doesn't have to store gigs and gigs of ancient files that most people don't care about. All historic apache releases are

solr trademark

2012-07-25 Thread Radim Kolar
Mark, You are certainly not using the Solr mark in an approved manner and I'd hope if you are going to take advantage of our mailing list for promotion of your product, that you would not violate our trademark. The Apache Foundation does not own the SOLR (R) trademark. I looked into the registry (USA and

Re: solr trademark

2012-07-25 Thread Mark Miller
On Jul 25, 2012, at 2:00 PM, Radim Kolar h...@filez.com wrote: Mark, You are certainly not using the Solr mark in an approved manner and I'd hope if you are going to take advantage of our mailing list for promotion of your product, that you would not violate our trademark. Apache

Re: Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Chantal Ackermann
Suppose I have a product with a title='kMix Espresso maker'. If I tokenize this and put the result in product_tokens I should get '[kMix][Espresso][maker]'. If now I try to search with facet.field='product_tokens' and facet.prefix='espresso' I should get only 'espresso' while I want 'kMix
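The behaviour discussed in this exchange can be reproduced with a small simulation. It is a sketch only, not Solr's faceting code: facet_prefix is a hypothetical helper, and the tokenized field is assumed to be whitespace-split and lowercased.

```python
from collections import Counter
from typing import Dict, List


def facet_prefix(titles: List[str], prefix: str, tokenized: bool) -> Dict[str, int]:
    """Count documents per indexed term that starts with the prefix.

    tokenized=True models a text field (one lowercased term per word);
    tokenized=False models a string field (the whole title is one term).
    """
    counts: Counter = Counter()
    for title in titles:
        terms = title.lower().split() if tokenized else [title]
        for term in set(terms):  # a facet count is per document, not per occurrence
            if term.startswith(prefix):
                counts[term] += 1
    return dict(counts)


if __name__ == "__main__":
    titles = ["kMix Espresso maker"]
    print(facet_prefix(titles, "espresso", tokenized=True))
    print(facet_prefix(titles, "espresso", tokenized=False))
    print(facet_prefix(titles, "kMix", tokenized=False))
```

In this model the tokenized field returns only the single matching token rather than the full title, while the string field matches only prefixes taken from the start of the title.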

Re: Download of old solr releases

2012-07-25 Thread Nicolas Dietrich
On 07/25/2012 07:17 PM, Chris Hostetter wrote: : it looks like the old releases have been thrown out of the download : servers, for example This is standard practice for apache projects so that the mirror network doesn't have to store gigs and gigs of ancient files that most people

Solr 4.0 cross-core join limitations or a misunderstanding?

2012-07-25 Thread Jeff Schmidt
Hello: I'm trying to figure out if there is some limitation to a cross core join, or if I'm just misunderstanding something. This has been working fine with a small number of documents in the from index, but I'm not getting the expected results now that a given example here has 41K from

solr spellchecker hogging all of my memory

2012-07-25 Thread dboychuck
Before I optimize (build my spellchecker index), my solr instance running in tomcat uses about 2 gigs of memory. As soon as I optimize, it jumps to about 5 gigs: http://d.pr/i/oUQI It just doesn't seem right. http://pastebin.com/6Cg7F0dK Is there anything wrong with my configuration? when i dump

Re: Significance of Analyzer Class attribute

2012-07-25 Thread Lance Norskog
An Analyzer object is a chain of Tokenizer and TokenFilters. These text type definitions either use an analyzer class or describe the Tokenizer and TokenFilters directly. The Analyzer classes create their own sequence of Tokenizer and maybe TokenFilters, hard-coded in the analyzer class. In
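The two styles described above (a chain declared piece by piece versus a chain hard-coded inside an analyzer class) can be sketched as follows. This is a Python illustration only, not the Lucene API; every name in it (whitespace_tokenizer, lowercase_filter, stop_filter, make_analyzer, hard_coded_analyzer) is hypothetical, and the stopword list is an assumption.

```python
from typing import Callable, List

Tokens = List[str]


def whitespace_tokenizer(text: str) -> Tokens:
    """Split the raw text into tokens on whitespace."""
    return text.split()


def lowercase_filter(tokens: Tokens) -> Tokens:
    """Lowercase every token."""
    return [t.lower() for t in tokens]


def stop_filter(tokens: Tokens) -> Tokens:
    """Drop a few common words (illustrative stopword list)."""
    stopwords = {"the", "a", "an"}
    return [t for t in tokens if t not in stopwords]


def make_analyzer(tokenizer: Callable[[str], Tokens],
                  *filters: Callable[[Tokens], Tokens]) -> Callable[[str], Tokens]:
    """Declared style: build the chain from an explicit tokenizer and filters."""
    def analyze(text: str) -> Tokens:
        tokens = tokenizer(text)
        for token_filter in filters:
            tokens = token_filter(tokens)
        return tokens
    return analyze


def hard_coded_analyzer(text: str) -> Tokens:
    """Class style: the same chain, fixed inside the analyzer itself."""
    return stop_filter(lowercase_filter(whitespace_tokenizer(text)))


if __name__ == "__main__":
    declared = make_analyzer(whitespace_tokenizer, lowercase_filter, stop_filter)
    print(declared("The Quick Brown Fox"))
    print(hard_coded_analyzer("The Quick Brown Fox"))
```

Both produce the same tokens, but only the declared chain exposes its individual stages to the caller, which may be why an analysis tool can show per-stage output for one style and not the other.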