Re: How to extract a field with a prefixed dimension?

2013-10-21 Thread Upayavira
Not too sure what you're asking. Are you saying that you want to only return a relevant part of a field in search results - i.e. a contextual snippet? If so, then you should look at the highlighting component, which can do this. http://wiki.apache.org/solr/HighlightingParameters Upayavira On Mo
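A minimal sketch of the highlighting suggestion above, assuming the standard highlighting request parameters hl, hl.fl and hl.fragsize and a field named content (as in the original question). It uses only the JDK to build the query string; the class name, method name and the fragment size of 200 are illustrative assumptions, not something from this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class HighlightQueryExample {

    // Builds a Solr query string that asks for highlighted snippets of one field.
    static String buildQueryString(String query, int rows, String field, int fragSize) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        params.put("rows", Integer.toString(rows));
        params.put("hl", "true");                               // switch highlighting on
        params.put("hl.fl", field);                             // field to take snippets from
        params.put("hl.fragsize", Integer.toString(fragSize));  // approximate snippet length in characters

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString("*:*", 3, "content", 200));
    }
}
```

Note that highlighting returns fragments around matching terms, so with a match-all query like *:* there may be nothing to highlight; a real search term is probably needed to get useful snippets.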

Re: Seeking New Moderators for solr-user@lucene

2013-10-21 Thread Andrew Psaltis
Hey Hoss, I would be interested in being a moderator. Thanks, Andrew On Sun, Oct 20, 2013 at 7:09 AM, Jeevanandam M. wrote: > Hello Hoss - > > My pleasure, kindly accept my moderator nomination. > > Regards, > Jeeva > > -- Original Message -- > From: Chris Hostetter [mailto:hos

Re: Exact Match Results

2013-10-21 Thread Developer
For an exact phrase match you can wrap the query in quotes, but this performs only the exact match and won't match other results. The query below will match only: Okkadu telugu movie stills http://localhost:8983/solr/core1/select?q=%22okkadu%20telugu%20movie%20stills%22 Since you are using Edg
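As a rough illustration of the quoted-phrase URL above, here is a JDK-only sketch that wraps a user phrase in double quotes and percent-encodes it. The class and method names are hypothetical; the base URL is the one from the message. Escaping embedded quotes and backslashes is an added assumption about how user input should be handled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class PhraseQueryUrl {

    // Wraps the phrase in double quotes so the query parser treats it as a phrase,
    // then percent-encodes it for use in the q parameter.
    static String build(String selectUrl, String phrase) {
        // Escape characters that would otherwise end or corrupt the quoted phrase.
        String escaped = phrase.replace("\\", "\\\\").replace("\"", "\\\"");
        String quoted = "\"" + escaped + "\"";
        try {
            // URLEncoder emits '+' for spaces; switch to %20 to match the URL style above.
            // A literal '+' in the input is already encoded as %2B, so this is safe.
            String encoded = URLEncoder.encode(quoted, "UTF-8").replace("+", "%20");
            return selectUrl + "?q=" + encoded;
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr/core1/select",
                "okkadu telugu movie stills"));
    }
}
```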

Re: measure result set quality

2013-10-21 Thread Alvaro Cabrerizo
Thanks for your valuable answers. As a first approach I will evaluate (manually :( ) hits that are out of the intersection set for every query in each system. Anyway I will keep searching for literature in the field. Regards. On Sun, Oct 20, 2013 at 10:55 PM, Doug Turnbull < dturnb...@opensourc

Major GC does not reduce the old gen size

2013-10-21 Thread neoman
Hello everyone, We are using Solr 4.4 in production with 4 shards. These are our memory settings. -d64 -server -Xms8192m -Xmx12288m -XX:MaxPermSize=256m \ -XX:NewRatio=1 -XX:SurvivorRatio=6 \ -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0 \ -XX:CMSIncrementalDut

Re: External Zookeeper and JBOSS

2013-10-21 Thread Shawn Heisey
On 10/21/2013 1:19 PM, Branham, Jeremy [HR] wrote: Sorl.xml [simplified by removing additional cores] These cores that you have listed here do not look like SolrCloud-related cores, because they do not reference a collection or a shard. Here's what I've got on a 4.2.1

RE: External Zookeeper and JBOSS

2013-10-21 Thread Branham, Jeremy [HR]
I've made progress... Rather than using the zkCli.sh in the ZooKeeper bin folder, I used the Java libs from Solr and the config now shows up. Jeremy D. Branham Performance Technologist II Sprint University Performance Support Fort Worth, TX | Tel: **DOTNET http://JeremyBranham.Wordpress.com htt

External Zookeeper and JBOSS

2013-10-21 Thread Branham, Jeremy [HR]
When I use the Zookeeper CLI utility, I'm not sure if the configuration is uploading correctly. How can I tell? This is the command I am issuing - ./zkCli.sh -cmd upconfig -server 127.0.0.1:2181 -confdir /data/v8p/solr/root/conf -confname defaultconfig -solrhome /data/v8p/solr Then checking wit

How to extract a field with a prefixed dimension?

2013-10-21 Thread javozzo
Hi, I'm new to Solr. I use the content field to extract the text of Solr documents, but this field is too long. Is there a way to extract only a substring of this field? I make my query in Java as follows: SolrQuery querySolr = new SolrQuery(); querySolr.setQuery("*:*"); querySolr.setRows(3); quer

Re: Questions developing custom functionquery

2013-10-21 Thread JT
I would agree the "right" way to do this is probably just add the information I wish to sort on directly, as a date field or something like that. The issue is we currently have ~300m documents that are already indexed. Not all of the fields have stored=true (for good reason, we maintain the docume

reindexing data

2013-10-21 Thread Christopher Gross
In Solr 4.5, I'm trying to create a new collection on the fly. I have a data dir with the index that should be in there, but the CREATE command makes the directory be: _shard1_replicant# I was hoping that making a collection named something would use a directory with that name to let me use the d

Re: Custom FunctionQuery Guide/Tutorial (4.3.0+) ?

2013-10-21 Thread Jack Krupansky
Hopefully at the end of the week. -- Jack Krupansky -Original Message- From: fudong li Sent: Monday, October 21, 2013 1:45 PM To: solr-user@lucene.apache.org Subject: Re: Custom FunctionQuery Guide/Tutorial (4.3.0+) ? Hi Jack, Do you have a date for the new version of your book: solr

Re: Custom FunctionQuery Guide/Tutorial (4.3.0+) ?

2013-10-21 Thread fudong li
Hi Jack, Do you have a date for the new version of your book: solr_4x_deep_dive_early_access? Thanks, Fudong On Mon, Oct 21, 2013 at 10:39 AM, Jack Krupansky wrote: > Take a look at the unit tests for various "value sources", and find a Jira > that added some value source and look at the patc

Re: SolrCloud performance in VM environment

2013-10-21 Thread Shawn Heisey
On 10/21/2013 9:48 AM, Tom Mortimer wrote: Hi everyone, I've been working on an installation recently which uses SolrCloud to index 45M documents into 8 shards on 2 VMs running 64-bit Ubuntu (with another 2 identical VMs set up for replicas). The reason we're using so many shards for a relativel

Re: Custom FunctionQuery Guide/Tutorial (4.3.0+) ?

2013-10-21 Thread Jack Krupansky
Take a look at the unit tests for various "value sources", and find a Jira that added some value source and look at the patch for what changes had to be made. -- Jack Krupansky -Original Message- From: JT Sent: Monday, October 21, 2013 1:17 PM To: solr-user@lucene.apache.org Subject:

Re: Solr timeout after reboot

2013-10-21 Thread Otis Gospodnetic
Hi Michael, I agree with Shawn, don't listen to Peter ;) but only this once - he's a smart guy, as you can see in list archives. And I disagree with Shawn. again, only just this once and only somewhat. :) Because: In general, Shawn's advice is correct, but we have no way of knowing your

Custom FunctionQuery Guide/Tutorial (4.3.0+) ?

2013-10-21 Thread JT
Does anyone have a good link to a guide / tutorial /etc. for writing a custom function query in Solr 4? The tutorials I've seen vary from showing half the code to being written for older versions of Solr. Any type of pointers would be appreciated, thanks.

Re: Exact Match Results

2013-10-21 Thread kumar
Hi i am using field type configuration in the following way. -- View this message in context: http://luc

Re: Question about docvalues

2013-10-21 Thread Yago Riveiro
Hi Gun, Thanks for the response. Indeed I only want docValues to do facets. IMHO a reference to the fact that docValues take precedence over other methods is needed. It is not always obvious. -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Monday, October

Re: Exact Match Results

2013-10-21 Thread Developer
You need to provide us with the fieldtype information.. If you just want to match the phrase entered by user, you can use KeywordTokenizerFactory.. Reference: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters Creates org.apache.lucene.analysis.core.KeywordTokenizer. Treats the entire

Re: Question about docvalues

2013-10-21 Thread Gun Akkor
Hello Yago, To my knowledge, in facet calculations docValues take precedence over other methods. So, even if your field is also stored and indexed, your facets won't use the inverted index or fieldValueCache, when docValues are present. I think you will still have to store and index to maintain

Re: Question about docvalues

2013-10-21 Thread Yago Riveiro
Sorry if I didn't make myself understood; my English is not too good. My goal is to remove pressure from the heap: my indexes are too big, the heap fills up very quickly, and I get an OOM. I read about docValues stored on disk, but I don't know how to configure it. I read this link: https://cwiki.apache.org

Re: Pivot faceting not working after upgrading to 4.5

2013-10-21 Thread Henrik Ossipoff Hansen
After some digging through the internet, I realise now that distributed pivot faceting is not implemented yet in SolrCloud. Apologies :) On 21/10/2013 at 18.20, Henrik Ossipoff Hansen wrote: > Hello, > > We have a rather weird behavior I don't really understand. As written in a > few othe

Pivot faceting not working after upgrading to 4.5

2013-10-21 Thread Henrik Ossipoff Hansen
Hello, We have a rather weird behavior I don't really understand. As written in a few other threads, we're migrating from a master/slave setup running 4.3 to a SolrCloud setup running 4.5. Both run on the same data set (the 4.5 instances have been re-indexed under 4.5 obviously). The following

Re: Question about docvalues

2013-10-21 Thread Erick Erickson
I really don't understand the question. What behavior are you seeing that leads you to ask? bq: Is it necessary duplicate the field and set index and stored to false and If this means setting _both_ indexed and stored to false, then you effectively throw the field completely away, there's no point

RE: SolrCloud performance in VM environment

2013-10-21 Thread Boogie Shafer
some basic tips. -try to create enough shards that you can get the size of each index portion on the shard closer to the amount of RAM you have on each node (e.g. if you are ~140GB index on 16GB nodes, try doing 12-16 shards) -start with just the initial shards, add replicas later when you have
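The shard-sizing tip above comes down to simple arithmetic. The sketch below only reproduces the numbers from the message (a ~140GB index on 16GB nodes); the class and method names are hypothetical, and the rule "each shard's index portion should be smaller than node RAM" is a simplification of the advice that ignores JVM heap and OS overhead.

```java
final class ShardSizing {

    // Size of each shard's index portion if the index is split evenly.
    static double perShardGb(double indexGb, int shards) {
        return indexGb / shards;
    }

    // Smallest shard count for which an even split fits within one node's RAM.
    static int minShards(double indexGb, double ramPerNodeGb) {
        return (int) Math.ceil(indexGb / ramPerNodeGb);
    }

    public static void main(String[] args) {
        System.out.println("minimum shards: " + minShards(140, 16));
        System.out.println("GB per shard at 12 shards: " + perShardGb(140, 12));
        System.out.println("GB per shard at 16 shards: " + perShardGb(140, 16));
    }
}
```

With these inputs the bare minimum is 9 shards, while the suggested 12 to 16 shards give roughly 8.75 to 11.7 GB per shard, which leaves headroom on a 16GB node.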

SolrCloud performance in VM environment

2013-10-21 Thread Tom Mortimer
Hi everyone, I've been working on an installation recently which uses SolrCloud to index 45M documents into 8 shards on 2 VMs running 64-bit Ubuntu (with another 2 identical VMs set up for replicas). The reason we're using so many shards for a relatively small index is that there are complex filte

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-21 Thread Stavros Delsiavas
I did a full-import again. That solved the issue. I didn't know that the stopwords apply to the indexing itself too. Thanks a lot, Stavros On 21.10.2013 17:13, Jack Krupansky wrote: Did you completely reindex your data after emptying the stop words file? -- Jack Krupansky -Original M

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-21 Thread Jack Krupansky
Did you completely reindex your data after emptying the stop words file? -- Jack Krupansky -Original Message- From: Stavros Delisavas Sent: Monday, October 21, 2013 10:05 AM To: solr-user@lucene.apache.org Subject: Re: Local Solr and Webserver-Solr act differently ("and" treated like

Re: Solr timeout after reboot

2013-10-21 Thread Shawn Heisey
On 10/21/2013 8:03 AM, michael.boom wrote: > I'm using the m3.xlarge server with 15G RAM, but my index size is over 100G, > so I guess putting running the above command would bite all available > memory. With a 100GB index, I would want a minimum server memory size of 64GB, and I would much prefer

RE: Facet performance

2013-10-21 Thread Lemke, Michael SZ/HZA-ZSW
On Mon, October 21, 2013 10:04 AM, Toke Eskildsen wrote: >On Fri, 2013-10-18 at 18:30 +0200, Lemke, Michael SZ/HZA-ZSW wrote: >> Toke Eskildsen wrote: >> > Unfortunately the enum-solution is normally quite slow when there >> > are enough unique values to trigger the "too many > values"-exception. >

Re: Class name of parsing the fq clause

2013-10-21 Thread YouPeng Yang
Hi Jack, Thanks a lot for your explanation. 2013/10/21 Jack Krupansky > Start with org.apache.solr.handler.component.QueryComponent#prepare > which fetches the fq parameters and indirectly invokes the query parser(s): > > String[] fqs = req.getParams().getParams(CommonParams.FQ); > if (

Re: Solr timeout after reboot

2013-10-21 Thread François Schiettecatte
Well no, the OS is smarter than that, it manages file system cache along with other memory requirements. If applications need more memory then file system cache will likely be reduced. The command is a cheap trick to get the OS to fill the file system cache as quickly as possible, not sure how

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-21 Thread Stavros Delisavas
Okay, I emptied the stopword file. I don't know where the wordlist came from. I have never seen this and never touched that file. Anyways... Now my queries do work with one word, like "in" or "to" but the queries still do not work when I use more than one stopword within one query. Instead of too m

Re: how to debug my own analyzer in solr

2013-10-21 Thread Mingzhu Gao
Koji , thank you for reply. I am just using the binary of solr.war , I will try to use the solr source and have a try . -Mingz On 10/21/13 6:21 PM, "Koji Sekiguchi" wrote: >Hi Mingz, > >If you use Eclipse, you can debug Solr with your plugin like this: > ># go to Solr install directory >$ cd

Re: Solr timeout after reboot

2013-10-21 Thread michael.boom
I'm using the m3.xlarge server with 15G RAM, but my index size is over 100G, so I guess running the above command would eat all available memory. - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-timeout-after-reboot-tp4096408p4096827.html S

Re: Solr timeout after reboot

2013-10-21 Thread Peter Keegan
I found this warming to be especially necessary after starting an instance of those m3.xlarge servers, else the response times for the first minutes was terrible. Peter On Mon, Oct 21, 2013 at 8:39 AM, François Schiettecatte < fschietteca...@gmail.com> wrote: > To put the file data into file sy

Re: Class name of parsing the fq clause

2013-10-21 Thread Jack Krupansky
Start with org.apache.solr.handler.component.QueryComponent#prepare which fetches the fq parameters and indirectly invokes the query parser(s): String[] fqs = req.getParams().getParams(CommonParams.FQ); if (fqs!=null && fqs.length!=0) { List filters = rb.getFilters(); // if filters already e

Re: Exact Match Results

2013-10-21 Thread François Schiettecatte
Kumar You might want to look into the 'pf' parameter: https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser François On Oct 21, 2013, at 9:24 AM, kumar wrote: > I am querying solr for exact match results. But it is showing some other > results also. > > E

Exact Match Results

2013-10-21 Thread kumar
I am querying Solr for exact match results, but it is showing some other results as well. Example : User Query String : Okkadu telugu movie Results : 1.Okkadu telugu movie 2.Okkadunnadu telugu movie 3.YuganikiOkkadu telugu movie 4.Okkadu telugu movie stills how can we order these results that 4

Re: Solr timeout after reboot

2013-10-21 Thread François Schiettecatte
To put the file data into file system cache which would make for faster access. François On Oct 21, 2013, at 8:33 AM, michael.boom wrote: > Hmm, no, I haven't... > > What would be the effect of this ? > > > > - > Thanks, > Michael > -- > View this message in context: > http://lucene.4

Re: Solr timeout after reboot

2013-10-21 Thread michael.boom
Hmm, no, I haven't... What would be the effect of this ? - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-timeout-after-reboot-tp4096408p4096809.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr timeout after reboot

2013-10-21 Thread Peter Keegan
Have you tried this old trick to warm the FS cache? cat ...//data/index/* >/dev/null Peter On Mon, Oct 21, 2013 at 5:31 AM, michael.boom wrote: > Thank you, Otis! > > I've integrated the SPM on my Solr instances and now I have access to > monitoring data. > Could you give me some hints on whic
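For anyone who would rather do the same warming from Java than from the shell, here is a minimal JDK-only sketch of what the cat trick does: read every regular file in a directory once and discard the bytes, so the OS can keep them in its file system cache. The class name is hypothetical and the index path is deliberately left to the caller; as with cat, the OS only caches as much as free memory allows.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class IndexCacheWarmer {

    // Reads every regular file directly inside indexDir and returns the number of bytes read.
    static long warm(Path indexDir) throws IOException {
        long total = 0;
        byte[] buffer = new byte[1 << 20]; // 1 MiB read buffer, contents are discarded
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and special files
                }
                try (InputStream in = Files.newInputStream(file)) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        total += n;
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: IndexCacheWarmer <path to data/index directory>");
            return;
        }
        System.out.println("bytes read: " + warm(Paths.get(args[0])));
    }
}
```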

Question about docvalues

2013-10-21 Thread yriveiro
Hi, Suppose I have a field (named dv_field) configured to be indexed, stored and with docValues=true. How do I know that when I do a query like: q=*:*&facet=true&facet.field=dv_field, I'm really using the docValues and not the normal way? Is it necessary to duplicate the field and set indexed and stored to

Re: ExtractRequestHandler, skipping errors

2013-10-21 Thread Jan Høydahl
Guido, can you point us to the Commons-Compress JIRA issue which reports your particular problem? Perhaps uncompress works just fine? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 18 Oct 2013 at 14:48, Guido Medina wrote: > Dont, commons compress 1.5 is broken, e

Re: SolrCloud Performance Issue

2013-10-21 Thread Erick Erickson
Shamik: You're right, the use of NOW shouldn't be making that much of a difference between versions. FYI, though, here's a way to use NOW and re-use fq clauses: http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/ It may well be this setting: 1000 Every second (assuming

Re: caching HTML pages in SOLR

2013-10-21 Thread Furkan KAMACI
You can also try: https://www.varnish-cache.org/ 2013/10/21 Alexandre Rafalovitch > I have not used it myself, but perhaps something like > http://www.crawl-anywhere.com/ is along what you were looking for. > > Regards, >Alex. > > Personal website: http://www.outerthoughts.com/ > LinkedIn:

Re: Ordering Results

2013-10-21 Thread Upayavira
Do two searches. Why do you want to do this though? It seems a bit strange. Presumably your users want the best matches possible whether exact or fuzzy? Wouldn't it be best to return both exact and fuzzy matches, but score the exact ones above the fuzzy ones? Upayavira On Mon, Oct 21, 2013, at 0

Re: how to debug my own analyzer in solr

2013-10-21 Thread Koji Sekiguchi
Hi Mingz, If you use Eclipse, you can debug Solr with your plugin like this: # go to Solr install directory $ cd $SOLR $ ant run-example -Dexample.debug=true Then connect the JVM from Eclipse via remote debug port 5005. Good luck! koji (13/10/21 18:58), Mingzhu Gao wrote: More information

Re: how to debug my own analyzer in solr

2013-10-21 Thread Siegfried Goeschl
Thread Dump and/or Remote Debugging?! Cheers, Siegfried Goeschl On 21.10.13 11:58, Mingzhu Gao wrote: More information about this , the custom analyzer just implement "createComponents" of Analyzer. And my configure in schema.xml is just something like : From the log I cannot see any

Error: Repeated service interruptions - failure processing document: Read timed out

2013-10-21 Thread Ronny Heylen
Hi, Just installed SOLR and when running a job I have the following problem : Error: Repeated service interruptions - failure processing document: Read timed out Like I said, just installed SOLR and so very new to the topic. ( On Windows 2008R2 ) SOLR 4.4 Tomcat 7.0.42 ManifoldCF 1.3 Postg

Re: how to debug my own analyzer in solr

2013-10-21 Thread Mingzhu Gao
More information about this: the custom analyzer just implements "createComponents" of Analyzer. And my configuration in schema.xml is just something like : From the log I cannot see any error information; however, when I want to analyze or add document data, it always hangs there. Any wa

Re: solrconfig.xml carrot2 params

2013-10-21 Thread Stanislaw Osinski
> Thanks, I'm new to the clustering libraries. I finally made this > connection when I started browsing through the carrot2 source. I had > pulled down a smaller MM document collection from our test environment. It > was not ideal as it was mostly structured, but small. I foolishly thought > I

Re: Solr timeout after reboot

2013-10-21 Thread michael.boom
Thank you, Otis! I've integrated the SPM on my Solr instances and now I have access to monitoring data. Could you give me some hints on which metrics should I watch? Below I've added my query configs. Is there anything I could tweak here? 1024

how to avoid recover? how to ensure a recover success?

2013-10-21 Thread sling
Hi, guys: I have an online application with SolrCloud 4.1, but I get syncpeer errors every 2 or 3 weeks... In my opinion, a recovery occurs when a replica cannot sync data to its leader successfully. I see the topic http://lucene.472066.n3.nabble.com/SolrCloud-5x-Errors-while-recovering-td402

Ordering Results

2013-10-21 Thread kumar
Hi, I have a situation where, when a user searches for anything, the suggestions must come first from exact matches and then from fuzzy matches. Suppose we are showing 15 suggestions: the first 10 results are exact matches, and the remaining 5 are fuzzy matches. Can anybody give me

how to debug my own analyzer in solr

2013-10-21 Thread Mingzhu Gao
Dear Solr experts, I would like to write my own analyzer (a Chinese analyzer) and integrate it into Solr as a Solr plugin. From the log information, the custom analyzer can be loaded into Solr successfully. I define my with this custom analyzer. Now the problem is that, when I try thi

RE: Facet performance

2013-10-21 Thread Toke Eskildsen
On Fri, 2013-10-18 at 18:30 +0200, Lemke, Michael SZ/HZA-ZSW wrote: > Toke Eskildsen [mailto:t...@statsbiblioteket.dk] wrote: > > Unfortunately the enum-solution is normally quite slow when there > > are enough unique values to trigger the "too many > values"-exception. > > [...] > > [...] And yes

Re: XLSB files not indexed

2013-10-21 Thread Roland Everaert
Hi Otis, In our case, there is no exception raised by tika or solr, a lucene document is created, but the content field contains only a few white spaces like for ODF files. Roland. On Sat, Oct 19, 2013 at 3:54 AM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Hi Roland, > > It looks