Re: Significance of Analyzer Class attribute

2012-07-25 Thread Lance Norskog
An Analyzer object is a chain of Tokenizer and TokenFilters. These text type definitions either use an analyzer class or describe the Tokenizer and TokenFilters directly. The Analyzer classes create their own sequence of Tokenizer and maybe TokenFilters, hard-coded in the analyzer class. In schema.
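
The distinction described above can be sketched outside of Solr. The following is a minimal Python illustration, not Solr or Lucene code: the names `Analyzer`, `SimpleEnglishAnalyzer`, `whitespace_tokenizer`, `lowercase_filter`, and `stop_filter` are all hypothetical. It shows one chain assembled stage by stage (like listing a tokenizer and filters in a field type) and one chain hard-coded inside a class (like naming an analyzer class).

```python
from typing import Callable, Iterable, List, Sequence


def whitespace_tokenizer(text: str) -> List[str]:
    """Split raw text into tokens on whitespace (the 'Tokenizer' stage)."""
    return text.split()


def lowercase_filter(tokens: Iterable[str]) -> Iterable[str]:
    """Lowercase every token (a 'TokenFilter' stage)."""
    return (t.lower() for t in tokens)


def stop_filter(tokens: Iterable[str]) -> Iterable[str]:
    """Drop a few stop words; the word list is an illustrative assumption."""
    stop_words = {"the", "a", "an"}
    return (t for t in tokens if t not in stop_words)


class Analyzer:
    """A chain: one tokenizer followed by zero or more token filters."""

    def __init__(
        self,
        tokenizer: Callable[[str], List[str]],
        filters: Sequence[Callable[[Iterable[str]], Iterable[str]]] = (),
    ) -> None:
        self.tokenizer = tokenizer
        self.filters = list(filters)

    def analyze(self, text: str) -> List[str]:
        tokens: Iterable[str] = self.tokenizer(text)
        for token_filter in self.filters:
            tokens = token_filter(tokens)
        return list(tokens)


class SimpleEnglishAnalyzer(Analyzer):
    """Hard-codes its own chain, so callers cannot see or change the stages."""

    def __init__(self) -> None:
        super().__init__(whitespace_tokenizer, [lowercase_filter, stop_filter])


# Chain described stage by stage, versus chain hidden inside a class.
explicit = Analyzer(whitespace_tokenizer, [lowercase_filter, stop_filter])
hard_coded = SimpleEnglishAnalyzer()
```

Both objects produce the same tokens for the same input; the difference is only whether the individual stages are declared where a caller can inspect them.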

solr spellchecker hogging all of my memory

2012-07-25 Thread dboychuck
Before I optimize (build my spellchecker index), my Solr instance running in Tomcat uses about 2 GB of memory. As soon as I optimize, it jumps to about 5 GB: http://d.pr/i/oUQI It just doesn't seem right. http://pastebin.com/6Cg7F0dK Is there anything wrong with my configuration? When I dump the

Solr 4.0 cross-core join limitations or a misunderstanding?

2012-07-25 Thread Jeff Schmidt
Hello: I'm trying to figure out if there is some limitation to a cross-core join, or if I'm just misunderstanding something. This has been working fine with a small number of documents in the from index, but I'm not getting the expected results now that a given example here has 41K from in

Re: Download of old solr releases

2012-07-25 Thread Nicolas Dietrich
On 07/25/2012 07:17 PM, Chris Hostetter wrote: > > : it looks like the old releases have been thrown out of the download > : servers, for example > > This is standard practice for apache projects so that the mirror network > doesn't have to store gigs and gigs of ancient files that most people

Re: Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Chantal Ackermann
> Suppose I have a product with a title='kMix Espresso maker'. If I tokenize > this and put the result in product_tokens I should get > '[kMix][Espresso][maker]'. > > If now I try to search with facet.field='product_tokens' and > facet.prefix='espresso' I should get only 'espresso' while I want '

Re: solr trademark

2012-07-25 Thread Mark Miller
On Jul 25, 2012, at 2:00 PM, Radim Kolar wrote: > Mark, >> You are certainly not using the Solr mark in an approved manner and I'd hope >> if you are going to take advantage of our mailing list for promotion of your >> product, that you would not violate our trademark. > Apache Foundation do n

solr trademark

2012-07-25 Thread Radim Kolar
Mark, You are certainly not using the Solr mark in an approved manner and I'd hope if you are going to take advantage of our mailing list for promotion of your product, that you would not violate our trademark. The Apache Foundation does not own the SOLR (R) trademark. I looked into the registry (USA and Wo

Re: Download of old solr releases

2012-07-25 Thread Chris Hostetter
: it looks like the old releases have been thrown out of the download : servers, for example This is standard practice for apache projects so that the mirror network doesn't have to store gigs and gigs of ancient files that most people don't care about. All historic apache releases are availab

Download of old solr releases

2012-07-25 Thread Nicolas Dietrich
Hi there, it looks like the old releases have been thrown out of the download servers, for example http://apache.mirrors.tds.net/lucene/solr/1.4.1/apache-solr-1.4.1.tgz Is this on purpose or a mistake, or have I overlooked something? Thanks for clarification. Cheers, Nicolas

Re: Binary content index with multiple cores

2012-07-25 Thread Ahmet Arslan
>   regex="apache-solr-cell-\d.*\.jar" /> >   regex=".*\.jar" /> Thats okey, do you see something like below in logs: INFO: Adding 'file:/Users/iorixxx/Desktop/solr-trunk/solr/contrib/extraction/lib/tika-parsers-1.1.jar' to classloader > I've added the maven dependencies like this for the

Re: Skip first word

2012-07-25 Thread Ahmet Arslan
> is there a tokenizer and/or a combination of filter to > remove the first term from a field? > > For example: > The quick brown fox > > should be tokenized as: > quick > brown > fox There is no such filter that i know of. Though, you can implement one with modifying source code of LengthFilt

Skip first word

2012-07-25 Thread Finotti Simone
Hi, is there a tokenizer and/or a combination of filters to remove the first term from a field? For example: The quick brown fox should be tokenized as: quick brown fox. Thank you in advance, S
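
The behaviour asked for here can be sketched in a few lines of Python. This is only an illustration of the desired token stream, not a Solr filter: `skip_first_token` and `tokenize_skipping_first` are hypothetical names, and whitespace tokenization is an assumption. Inside Solr the same idea would have to live in a custom token filter, as the reply in this thread suggests.

```python
from typing import Iterable, List


def skip_first_token(tokens: Iterable[str]) -> List[str]:
    """Return all tokens except the first one in the stream."""
    iterator = iter(tokens)
    next(iterator, None)  # discard the first token; no error on empty input
    return list(iterator)


def tokenize_skipping_first(text: str) -> List[str]:
    """Whitespace-tokenize the text, then drop the first token."""
    return skip_first_token(text.split())
```

Calling `tokenize_skipping_first("The quick brown fox")` yields the three tokens from the example in the question; an empty field yields an empty list.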

Re: "Invalid or unreadable WAR file : .../solr.war" when starting solr 3.6.1 app on Tomcat 7 (upgrade to 7.0.29/upstream)?

2012-07-25 Thread k9157
Hi On Wed, Jul 25, 2012, at 01:37 AM, Chantal Ackermann wrote: > I haven't been following from the beginning but am still curious: is the war > file on a shared fs? No, it's not. Atm, all's on one fs. > See also: > http://www.mail-archive.com/users@tomcat.apache.org/msg79555.html > http://sta

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Mark Miller
On Wed, Jul 25, 2012 at 11:03 AM, Nagendra Nagarajayya < nnagaraja...@transaxtions.com> wrote: > Mark, > > Grant Ingersoll from ASF got in touch with me to ensure that I am > compliant with the Apache Trade Mark. I made changes to the names, web > pages, wiki, papers, etc. and sent back the links

Re: Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Ugo Matrangolo
Hi, thank you for the suggestions. However, I think that this is not going to work. Suppose I have a product with a title='kMix Espresso maker'. If I tokenize this and put the result in product_tokens I should get '[kMix][Espresso][maker]'. If now I try to search with facet.field='product_tokens

numFound inconsistent for different rows-param

2012-07-25 Thread patrick
hi, i'm running two solr v3.6 instances: rdta01:9983/solr/msg-core : 8 documents rdta01:28983/solr/msg-core : 4 documents the following two queries with rows=10 resp rows=0 return different numFound results which confuses me. i hope someone can clarify this behaviour. URL with rows=10: ---

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Mark, Grant Ingersoll from ASF got in touch with me to ensure that I am compliant with the Apache Trade Mark. I made changes to the names, web pages, wiki, papers, etc. and sent back the links to Grant for approval. You may want to check with Grant. Regarding the fork, I am not creating a fo

Re: Solr zk client stopping sending data

2012-07-25 Thread Mark Miller
Hard to determine from this info. With only WARN logging, you don't see a lot of what is happening. Roughly what is the date of this build? Close can actually be called quite often - it's used to cancel recoveries, but new recovery threads can easily start up. What you should see if Recovery truly

separation of indexes to optimize facet queries without fulltext

2012-07-25 Thread Daniel Brügge
Hi, I currently have one big sharded Solr setup storing a couple of million documents with some 'small' fields and one fulltext field in each doc. The latter blows up the index. My thought was that I could separate the indexes. So for the facet queries where I don't need fulltext search (so also no ind

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Andy
But Solr relies on the cache in faceting for performance reasons. If disabling the cache is required, then faceting would be very slow under RankingAlgorithm, no? From: Nagendra Nagarajayya To: solr-user@lucene.apache.org Sent: Wednesday, July 25, 2012 9:12 AM Sub

Re: Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Chantal Ackermann
Hi Ugo, You can use facet.prefix on a tokenized field instead of a String field. Example: facet.prefix on "product" will only return hits that match the start of the single token stored in that field. As "product_tokens" contains the value of "product" tokenized in a fashion that suits you,
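
The difference between prefix faceting on a single-token field and on a tokenized field can be simulated with a small Python sketch. It does not call Solr: `facet_prefix` is a hypothetical stand-in for what `facet.field` plus `facet.prefix` returns, the title 'kMix Espresso maker' comes from this thread, the other two titles are invented for illustration, and lowercasing plus whitespace splitting are assumptions about the analysis.

```python
from collections import Counter
from typing import Dict, List


def facet_prefix(docs_terms: List[List[str]], prefix: str) -> Dict[str, int]:
    """Count, per indexed term starting with `prefix`, how many docs contain it."""
    counts: Counter = Counter()
    for terms in docs_terms:
        for term in set(terms):  # each doc counts once per distinct term
            if term.startswith(prefix):
                counts[term] += 1
    return dict(counts)


titles = ["kMix Espresso maker", "Espresso cups", "Coffee maker"]

# String-like field: the whole lowercased title is one single term.
string_field = [[title.lower()] for title in titles]

# Tokenized field: one term per whitespace-separated word.
token_field = [title.lower().split() for title in titles]

string_result = facet_prefix(string_field, "espresso")
token_result = facet_prefix(token_field, "espresso")
```

In this sketch the single-term field only matches titles that begin with the prefix, while the tokenized field matches the word anywhere in the title but returns the bare token rather than the full title, which is the limitation raised elsewhere in this thread.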

Solr zk client stopping sending data

2012-07-25 Thread Trym R. Møller
Hi, Running a Solr cloud cluster, after a while a Solr instance loses its connection to its ZooKeeper cluster, as seen in the ZooKeeper log below. The Solr instance reconnects to another ZooKeeper in the ZK cluster, and the only thing seen in the Solr log (running at warning level) is a newly programmatically created co

Re: Copy lucene index into Solr

2012-07-25 Thread spredd1208
On Wed, Jul 25, 2012 at 8:58 AM, spredd1208 [via Lucene] < ml-node+s472066n3997251...@n3.nabble.com> wrote: > So I am familiar with SOLR. I have built fairly extensive solr indexes > already. What I am trying to do now > is build very large indexes and then move them over to a sharded server > env

Re: Copy lucene index into Solr

2012-07-25 Thread spredd1208
So I am familiar with SOLR. I have built fairly extensive solr indexes already. What I am trying to do now is build very large indexes and then move them over to a sharded server environment that will allow clients to query them. These indexes will be built once and will not be updated or need de

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Mark Miller
You are changing the name, or someone at Apache told you the current name is okay? If someone at Apache told you it was okay, who was that? You are certainly not using the Solr mark in an approved manner and I'd hope if you are going to take advantage of our mailing list for promotion of your

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Each request thread may return updated results. Each component may also in certain cases return updated results. The algorithm is designed to handle these. The granularity of the returned results can be controlled through a visible parameter. Regards, Nagendra Nagarajayya http://solr-ra.tgel

Autocomplete terms from the middle of name/description of a Doc

2012-07-25 Thread Ugo Matrangolo
Hi, I'm working on making our autocomplete engine a bit smarter. The current impl is a basic facet-based autocompletion as described in the 'SOLR 3 Enterprise Search' book: we use all the typed tokens except the last one to build a facet.prefix query on an autocomplete facet field we built at i

Re: Binary content index with multiple cores

2012-07-25 Thread davidbougearel
Here is my solrconfig.xml for one of the cores: ... I've added the maven dependencies like this for the solr war : org.apache.solr

Re: Binary content index with multiple cores

2012-07-25 Thread Ahmet Arslan
> I'm using the 3.6.0 version. Are you using the following lib directives (defined solrconfig.xml) or did you manually copied above jar files into solrHome/coreName/lib directory?

Re: javabin binary format specification

2012-07-25 Thread Ahmet Arslan
> Sorry, but I could not find any spec on the binary format > SolrJ is > using. Can you point me to an URL if any? may be this? https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/response/BinaryResponseWriter.java

javabin binary format specification

2012-07-25 Thread Zsolt Czinkos
Hi all, Sorry, but I could not find any spec on the binary format SolrJ is using. Can you point me to a URL, if any? Thanks in advance. Best, Zsolt

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Yes faceting works as before. Regarding the cache, the suggestion is to disable the cache for realtime NRT, for now. Regards, Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org On 7/24/2012 2:57 PM, Andy wrote: Nagendra, Does RankingAlgorithm work with faceting

Re: Binary content index with multiple cores

2012-07-25 Thread Ahmet Arslan
> I try to get the content from binary documents using > solr-cell and i follow > the wiki for that. > > After putting some missing classes, i arrive on this > exception : > > Caused by: org.apache.solr.common.SolrException: Error > Instantiating Request > Handler, solr.extraction.ExtractingReque

Re: Significance of Analyzer Class attribute

2012-07-25 Thread Ahmet Arslan
> When I specify analyzer class in schema,  something > like below and do > analysis on this field in analysis page : I cant  see > verbose output on > tokenizer and filters > > class="solr.TextField"> >       class="org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer"> >   class="solr.Sm

Re: DIH XML configs for multi environment

2012-07-25 Thread Pranav Prakash
Jerry, Glad it worked for you. I will also do the same thing. This seems easier for me, as I have a solr start shell script, which sets the JVM params for master/slave, Xmx and so on according to the environment. Setting a jdbc connect url in the start script is more convenient than changing the config

Re: NumberFormatException while indexing TextField with LengthFilter and then copying to tfloat

2012-07-25 Thread Chantal Ackermann
Here are the working solutions for: 3.6.1 (or lower probably) via ScriptTransformer in data-config.xml: function prepareData(row) { var cols = new java.util.ArrayList(); cols.add("spent_hours");

Re: "Invalid or unreadable WAR file : .../solr.war" when starting solr 3.6.1 app on Tomcat 7 (upgrade to 7.0.29/upstream)?

2012-07-25 Thread Chantal Ackermann
Hi, I haven't been following from the beginning but am still curious: is the war file on a shared fs? See also: http://www.mail-archive.com/users@tomcat.apache.org/msg79555.html http://stackoverflow.com/questions/5493931/java-lang-illegalargumentexception-invalid-or-unreadable-war-file-error-in-

Re: SOLR 4.0-ALPHA : DIH : Indexed and Committed Successfully but Index is empty

2012-07-25 Thread Chantal Ackermann
Hi Hoss, > Did you perhaps forget to include RunUpdateProcessorFactory at the end? What is that? ;-) I had copied the config from http://wiki.apache.org/solr/UpdateRequestProcessor but removed the lines I thought I did not need. :-( I've changed my configuration, and this is now WORKING (4.0-AL

Significance of Analyzer Class attribute

2012-07-25 Thread Rajani Maski
Hi, What is the significance of the Analyzer class attribute? When I specify an analyzer class in the schema, something like below, and do analysis on this field in the analysis page: I can't see verbose output on tokenizer and filters *But if I don't add analyzer class, I can see t

Re: filed type for text search

2012-07-25 Thread Lance Norskog
String fields can be searched, but you have to give the complete string, or a wildcard. And the upper/lower case has to be right. On Tue, Jul 24, 2012 at 11:13 AM, Xiao Li wrote: > I have used Solr 3.4 for a long time. Recently, when I upgrade to Solr 4.0 > and reindex the whole data, I find that
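
The matching rule stated at the start of this reply (complete string or wildcard, case-sensitive) can be illustrated with a short Python sketch. It only imitates the rule and is not how Solr evaluates queries: `string_field_matches` is a hypothetical helper, and it relies on `fnmatch.fnmatchcase`, whose pattern syntax also gives `[` and `]` a special meaning, unlike a plain `*`/`?` wildcard query.

```python
from fnmatch import fnmatchcase


def string_field_matches(stored_value: str, query: str) -> bool:
    """Case-sensitive match of the whole stored value against the query.

    The query must equal the complete stored string unless it uses
    wildcards ('*' for any run of characters, '?' for one character).
    """
    return fnmatchcase(stored_value, query)
```

For a stored value of `Apache Solr`, the queries `Apache Solr` and `Apache*` match, while `Apache` (incomplete) and `apache solr` (wrong case) do not.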