facet.query with facet.date

2010-07-21 Thread ruphus
Hello, I need to create two date facets displaying counts of a particular field's values. With normal facets, this can be done with facet.query, but this parameter is not available to facet.date. Is this possible? I'd really prefer to avoid performing two queries. Thanks William

Re: how to change the default path of Solr Tomcat

2010-07-21 Thread Eben
Hi Wong, I'm using Default server (Jetty with port 8983). Girish's solution already solved my problem. Thanks for your response, Wong :) On 7/22/2010 11:57 AM, K Wong wrote: Check: /var/lib/tomcat5.5/conf/Catalina/localhost/ Are you using Tomcat on a custom port (the default tomcat port is 8080)? Che

Re: set field with value 0 to the end

2010-07-21 Thread Grijesh.singh
Why use default="0"? It's optional; remove it from the field definition.

Re: how to change the default path of Solr Tomcat

2010-07-21 Thread K Wong
Check: /var/lib/tomcat5.5/conf/Catalina/localhost/ Are you using Tomcat on a custom port (the default tomcat port is 8080)? Check your ports ($ sudo netstat -nlp) Maybe try searching the file system for the solr.xml file? $ sudo find / -name solr.xml Hope this helps. K On Wed, Jul 21, 2010 a

Re: how to change the default path of Solr Tomcat

2010-07-21 Thread Eben
wowww...amazing Girish ! the solution that you gave is very simple but 100% answering my problem thanks a lot Girish ! On 7/22/2010 11:43 AM, Girish Pandit wrote: it seems like you are using Default server (Jetty with port 8983), also it looks like you are trying to run it with command "java -ja

Re: how to change the default path of Solr Tomcat

2010-07-21 Thread Girish Pandit
it seems like you are using Default server (Jetty with port 8983), also it looks like you are trying to run it with command "java -jar start.jar" if so then under same directory there is another directory called "webapps" go in there, rename "solr.war" to "search.war" bounce server and you shou

Re: how to change the default path of Solr Tomcat

2010-07-21 Thread Eben
firstly, I really appreciate your response to my question Ken I'm using Tomcat on Linux Debian I can't find the solr.xml in \program files\apache...\Tomcat\conf\catalina\localhost there are only 2 files in localhost folder: host-manager.xml and manager.xml any solutions? On 7/22/2010 10:41 AM

Re: how to change the default path of Solr Tomcat

2010-07-21 Thread kenf_nc
Your environment may be different, but this is how I did it. (Apache Tomcat on Windows 2008) go to \program files\apache...\Tomcat\conf\catalina\localhost rename solr.xml to search.xml recycle Tomcat service -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-change-the-

Re: a bug of solr distributed search

2010-07-21 Thread Li Li
I think what Siva means is that when there are docs with the same url, keep the doc with the highest score. This is the right solution. But it shows a problem of distributed search without a common idf: a doc will get a different score in different shards. 2010/7/22 MitchK : > > It already was sorted by sco
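
The "keep the highest-scoring duplicate" idea can be sketched as follows. This is only an illustration in Python, not Solr's QueryComponent.mergeIds code; the function name `merge_keep_best` and the input shape (shard name mapped to a list of `(id, score)` pairs) are assumptions made for the example. The scores 1.0 and 12.0 mirror the doc_X example given elsewhere in this thread.

```python
def merge_keep_best(shard_results):
    """Merge per-shard hits, keeping the highest-scoring copy of each id.

    shard_results: dict mapping shard name -> list of (doc_id, score).
    Returns a list of (doc_id, score, shard) sorted by score, descending.
    """
    best = {}
    for shard, hits in shard_results.items():
        for doc_id, score in hits:
            # Replace the stored copy only if this shard scored it higher.
            if doc_id not in best or score > best[doc_id][0]:
                best[doc_id] = (score, shard)
    rows = [(doc_id, score, shard) for doc_id, (score, shard) in best.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# doc_X exists on both shards with different scores (no common idf).
results = {
    "shard_A": [("doc_X", 1.0), ("doc_Y", 3.0)],
    "shard_B": [("doc_X", 12.0)],
}
merged = merge_keep_best(results)
print(merged)
```

With this rule the outcome no longer depends on which shard answers first, although the scores themselves are still not comparable across shards without a shared idf.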

how to change the default path of Solr Tomcat

2010-07-21 Thread Eben
Hi everyone, I really need your help this is the default address that I got from the solr: http://172.16.17.126:8983/solr/ the question is how to change that path to be: http://172.16.17.126:8983/search/ Please I really need your help thanks a lot before

Re: Using hl.regex.pattern to print complete lines

2010-07-21 Thread Lance Norskog
Java regex might be different from all other regex, so writing a test program and experimenting is the only way. Once you decide that this expression really is what you want, and that it does not achieve what you expect, you might have found a bug in highlighting. Lucene/Solr highlighting has alwa

Re: Count hits per document?

2010-07-21 Thread Lance Norskog
You have to store the termvectors when you index, and then retrieve them when you do a query. Highlighting does exactly this; the easy way to do this is to ask for highlighting and search for the highlighted words, and count them. On Wed, Jul 21, 2010 at 4:21 PM, Peter Spam wrote: > If I search f
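
A rough sketch of the "ask for highlighting and count the highlighted words" approach, assuming the highlighting section of the response has already been parsed into a dict of document id to field to list of snippets, and that hits are wrapped in the default `<em>` marker. The function name `count_highlights` and the sample data are made up for illustration.

```python
def count_highlights(highlighting, pre="<em>"):
    """Count highlighted terms per document.

    highlighting: dict of doc_id -> {field_name: [snippet, ...]}.
    pre: the opening marker the highlighter puts before each hit.
    """
    counts = {}
    for doc_id, fields in highlighting.items():
        counts[doc_id] = sum(
            snippet.count(pre)
            for snippets in fields.values()
            for snippet in snippets
        )
    return counts


sample = {
    "doc1": {"text": ["a <em>foo</em> b <em>foo</em>", "<em>foo</em> again"]},
    "doc2": {"text": ["only one <em>foo</em> here"]},
}
hit_counts = count_highlights(sample)
print(hit_counts)
```

Because the highlighter returns only a limited number of snippets per field (controlled by `hl.snippets` and `hl.fragsize`), the count is a lower bound unless those limits are raised.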

Re: Dismax query response field number

2010-07-21 Thread Lance Norskog
Fields or documents? It will return all of the fields that are 'stored'. The default number of documents to return is 10. Returning all of the documents is very slow, so you have to request that with the rows= parameter. On Wed, Jul 21, 2010 at 3:32 PM, wrote: > > > >  Hi, > > It seems that not
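
A minimal sketch of requesting more than the default 10 documents with `rows=`. The host, the `/solr/select` path, the query text, and the helper name `build_select_url` are assumptions for illustration; `defType=dismax` is used here to select the dismax parser and may differ from how the handler is wired up in the poster's solrconfig.

```python
from urllib.parse import urlencode


def build_select_url(query, rows=10, base="http://localhost:8983/solr/select"):
    """Build a select URL; rows controls how many documents are returned."""
    params = {"q": query, "defType": "dismax", "rows": rows}
    return base + "?" + urlencode(params)


url = build_select_url("laptop", rows=50)
print(url)
```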

Clustering results limit?

2010-07-21 Thread Darren Govoni
Hi, I am attempting to cluster a query. It kinda works, but where my (regular) query returns 500 results the cluster only shows 1-10 hits for each cluster (5 clusters). Never more than 10 docs and I know its not right. What could be happening here? It should be showing dozens of documents per clus

Re: faceted search with job title

2010-07-21 Thread Dave Searle
You could grab your xpath rules from a db too. This is what I did for a price scraping app I did a while ago. New sites were added with a set of rules using a web ui. You could certainly use regex of course, but IMO that's more complex than writing a simple xpath. Using JavaScript or some dom t

Re: Using hl.regex.pattern to print complete lines

2010-07-21 Thread Peter Spam
Still not working ... any ideas? -Pete On Jul 14, 2010, at 11:56 AM, Peter Spam wrote: > Any other thoughts, Chris? I've been messing with this a bit, and can't seem > to get (?m)^.*$ to do what I want. > > 1) I don't care how many characters it returns, I'd like entire lines all the > time

Count hits per document?

2010-07-21 Thread Peter Spam
If I search for "foo", I get back a list of documents. Any way to get a per-document hit count? Thanks! -Pete

Re: boosting particular field values

2010-07-21 Thread Justin Lolofie
I might have misunderstood, but I think I can't do string literals in function queries, right? myfield:"something"^3.0 I tried it anyway using solr 1.4; it doesn't seem to work. On Wed, Jul 21, 2010 at 1:48 PM, Markus Jelsma wrote: > function queries match all documents > > > http://wiki.apache.org/

Dismax query response field number

2010-07-21 Thread scrapy
Hi, It seems that not all field are returned from query response when i use DISMAX? Only first 10?? Any idea? Here is my solrconfig: dismax explicit * 0.01 text^0.5 content^1.1 title^1.5 text^0.2 content^1.1 title^1.5

Re: faceted search with job title

2010-07-21 Thread Savannah Beckett
And I will have to recompile the dom or sax code each time I add a job board for crawling. A regex pattern is only a string which can be stored in a text file or db, and retrieved based on the job board. What do you think? From: "Nagelberg, Kallin" To: "solr-

Re: faceted search with job title

2010-07-21 Thread Savannah Beckett
I don't see how it can be done without writing sax or dom code for each job board, it is non-maintainable if there are a lot of new job boards being crawled.  Maybe I should use regex match?  Then I just need to substitute the regex pattern for each job board without writing any new sax or dom c

Re: Solr searching performance issues, using large documents

2010-07-21 Thread Peter Spam
From the mailing list archive, Koji wrote: > 1. Provide another field for highlighting and use copyField to copy plainText > to the highlighting field. and Lance wrote: http://www.mail-archive.com/solr-user@lucene.apache.org/msg35548.html > If you want to highlight field X, doing the > termO

RE: boosting particular field values

2010-07-21 Thread Markus Jelsma
function queries match all documents http://wiki.apache.org/solr/FunctionQuery#Using_FunctionQuery   -Original message- From: Justin Lolofie Sent: Wed 21-07-2010 20:24 To: solr-user@lucene.apache.org; Subject: boosting particular field values I'm using dismax request handler, solr 1.4

boosting particular field values

2010-07-21 Thread Justin Lolofie
I'm using dismax request handler, solr 1.4. I would like to boost the weight of certain fields according to their values... this appears to work: bq=category:electronics^5.5 However, I think this boosting only affects sorting the results that have already matched? So if I only get 10 rows back,

Re: help finding illegal chars in XML doc

2010-07-21 Thread robert mena
Hi Chris, Thanks for your reply. I could not find in the log files any mention to that. By the way I only have _MM_DD.request.log files in my directory. Do I have to enable any specific log or level to catch those errors? On Sun, Jul 18, 2010 at 3:45 PM, Chris Hostetter wrote: > > : Simple

RE: faceted search with job title

2010-07-21 Thread Nagelberg, Kallin
Yeah you should definitely just setup a custom parser for each site.. should be easy to extract title using groovy's xml parsing along with tagsoup for sloppy html. If you can't find the pattern for each site leading to the job title how can you expect solr to? Humans have the advantage here :P

Re: nested query and number of matched records

2010-07-21 Thread MitchK
Thank you three for your feedback! Chantal, unfortunately kenf is right. Faceting won't work in this special case. > parallel calls. > Yes, this will be the solution. However, this would lead to a second HTTP-request and I hoped to be able to avoid it. Chantal Ackermann wrote: > > Sure S

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
It already was sorted by score. The problem here is the following: Shard_A and shard_B contain doc_X and doc_X. If you are querying for something, doc_X could have a score of 1.0 at shard_A and a score of 12.0 at shard_B. You can never be sure which doc Solr sees first. In the bad case, Solr see

Re: faceted search with job title

2010-07-21 Thread Savannah Beckett
mmm...there must be better way...each job board has different format.  If there are constantly new job boards being crawled, I don't think I can manually look for specific sequence of tags that leads to job title.  Most of them don't even have class or id.  There is no guarantee that the job tit

Re: a bug of solr distributed search

2010-07-21 Thread Siva Kommuri
How about sorting over the score? Would that be possible? On Jul 21, 2010, at 12:13 AM, Li Li wrote: > in QueryComponent.mergeIds. It will remove document which has > duplicated uniqueKey with others. In current implementation, it use > the first encountered. > String prevShard = uniqueD

RE: Securing Solr 1.4 in a glassfish container AS NEW THREAD

2010-07-21 Thread Sharp, Jonathan
Some further information -- I tried indexing a batch of PDFs with the client and Solr CELL, setting the credentials in the httpclient. For some reason after successfully indexing several hundred files I start getting a "SolrException: Unauthorized" and an info message (for every subsequent file):

RE: faceted search with job title

2010-07-21 Thread Dave Searle
You'd probably need to do some post processing on the pages and set up rules for each website to grab that specific bit of data. You could load the html into an xml parser, then use xpath to grab content from a particular tag with a class or id, based on the particular website -Original M

faceted search with job title

2010-07-21 Thread Savannah Beckett
Hi,   I am currently using nutch to crawl some job pages from job boards.  They are in my solr index now.  I want to do faceted search with the job titles.  How?  The job titles can be in any locations of the page, e.g. title, header, content...   If I use indexfilter in Nutch to search the cont

Re: nested query and number of matched records

2010-07-21 Thread kenf_nc
That just gives a count of documents by type. The use-case, I believe, is to return from a search, 10 documents of type 'short' and 1 document of type 'extensive'. -- View this message in context: http://lucene.472066.n3.nabble.com/nested-query-and-number-of-matched-records-tp983756p984539.html

Re: nested query and number of matched records

2010-07-21 Thread Chantal Ackermann
Sure SOLR supports this: use facets on the field "type": add to your regular query: facet=true&facet.field=type see http://wiki.apache.org/solr/SimpleFacetParameters On Wed, 2010-07-21 at 15:48 +0200, kenf_nc wrote: > parallel calls. simultaneously query for type:short rows=10 and > typ
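
A small sketch of adding field facets on "type" to a query. It assumes `facet=true` is the switch that turns faceting on (in Solr, `facet.query` takes a query string rather than a boolean); the host, path, and helper name `build_facet_url` are illustrative assumptions.

```python
from urllib.parse import urlencode


def build_facet_url(query, facet_field, base="http://localhost:8983/solr/select"):
    """Append field-facet parameters to a regular query."""
    params = [
        ("q", query),
        ("facet", "true"),            # enable faceting
        ("facet.field", facet_field),  # count documents per value of this field
    ]
    return base + "?" + urlencode(params)


facet_url = build_facet_url("foo", "type")
print(facet_url)
```

As noted elsewhere in the thread, this returns counts per type only; it does not change how many documents of each type appear in the result list.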

Re: nested query and number of matched records

2010-07-21 Thread kenf_nc
parallel calls. simultaneously query for type:short rows=10 and type:extensive rows=1 and merge your results. This would also let you separate your short docs from your extensive docs into different solr instances if you wished...depending on your document architecture this could speed up one o
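
The parallel-calls suggestion could look roughly like this. It is a sketch under assumptions: `fetch` is a caller-supplied function (hypothetical) that sends one parameter dict to Solr and returns a list of documents, and restricting by type with `fq` is one possible choice, not something stated in the thread. A stand-in `fake_fetch` is used so the example runs without a server.

```python
from concurrent.futures import ThreadPoolExecutor


def search_both(fetch, query):
    """Run the two type-restricted searches concurrently and merge them."""
    requests = [
        {"q": query, "fq": "type:short", "rows": 10},
        {"q": query, "fq": "type:extensive", "rows": 1},
    ]
    with ThreadPoolExecutor(max_workers=2) as pool:
        # map() yields results in the same order as the requests.
        short_docs, extensive_docs = pool.map(fetch, requests)
    # Put the single extensive document first; any merge order would do.
    return extensive_docs + short_docs


def fake_fetch(params):
    # Stand-in for a real HTTP call: returns `rows` dummy documents.
    doc_type = params["fq"].split(":", 1)[1]
    return [{"id": f"{doc_type}-{i}", "type": doc_type} for i in range(params["rows"])]


merged_docs = search_both(fake_fetch, "foo")
print(len(merged_docs))
```

The cost is a second HTTP request, but the two requests overlap in time, so total latency is close to that of the slower one.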

Re: Load cores without restarting/reloading Solr

2010-07-21 Thread Andrew McCombe
Hi Peter We are using the packaged Ubuntu Server (10.04 LTS) versions of Tomcat6 and Solr1.4 and running a single instance of Solr with multiple cores. Regards Andrew On 20 July 2010 19:47, Peter Karich wrote: > Hi Andrew, > > the whole tomcat shouldn't fail on restart if only one core fails.

solrconfig.xml and xinclude

2010-07-21 Thread fiedzia
I am trying to export some config options common to all cores into single file, which would be included using xinclude. The only problem is how to include childrens of given node. common_solrconfig.xml looks like that: now all of the following attemps have failed: http://www.w3.org/20

Re: nested query and number of matched records

2010-07-21 Thread Grijesh.singh
I think Solr does not provide anything like what you want.

Re: LocalSolr distance in km?

2010-07-21 Thread Saïd Radhouani
Hi, What resource are you using for LocalSolr? Using the SpatialTierQParser, you can choose between km or mile: http://blog.jteam.nl/2009/08/03/geo-location-search-with-solr-and-lucene/ Or, if you are using the LocalSolrQueryComponent (http://www.gissearch.com/localsolr), and you can't choose

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
I don't know much about the code. Maybe you can tell me to what file you are referring? However, from the comments one can see, that the problem is known but one decided to let it happen, because of System requirements in the Java version. - Mitch -- View this message in context: http://luce

Re: a bug of solr distributed search

2010-07-21 Thread Li Li
yes. This will make user think our search engine has some bug. from the comments of the codes, it needs more things to do if (prevShard != null) { // For now, just always use the first encountered since we can't currently // remove the previous one added to the pri

Re: nested query and number of matched records

2010-07-21 Thread MitchK
Oh,... I just see, there is no direct question ;-). How can I specify the number of returned documents in the desired way *within* one request? - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/nested-query-and-number-of-matched-records-tp983756p983773.html Sent from

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
Ah, okay. I understand your problem. Why should doc x be at position 1 when searching for the first time, and when I search for the 2nd time it occurs at position 8 - right? I am not sure, but I think you can't prevent this without custom coding or making a document's occurrence unique. Kind rega

nested query and number of matched records

2010-07-21 Thread MitchK
Hello community, I got a situation, where I know that some types of documents contain very extensive information and other types are giving more general information. Since I don't know whether a user searches for general or extensive information (and I don't want to ask him when he uses the defau

Re: set field with value 0 to the end

2010-07-21 Thread Grijesh.singh
An integer field can also be empty. I think you have set required=true; if so, remove required=true, and you can leave the field without data at indexing time.

Re: a bug of solr distributed search

2010-07-21 Thread Li Li
But users will think there is something wrong when they run the same query but get different results. 2010/7/21 MitchK : > > Li Li, > > this is the intended behaviour, not a bug. > Otherwise you could get back the same record in a response for several > times, which may not be intended

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
Li Li, this is the intended behaviour, not a bug. Otherwise you could get back the same record in a response for several times, which may not be intended by the user. Kind regards, - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/a-bug-of-solr-distributed-search-tp98

Re: Any there any known issues may cause the index sync between the master/slave abnormal?

2010-07-21 Thread Peter Karich
Hi James, triggering an optimize (on the slave) helped us to shrink the disk usage of the slaves. But I think the slaves will clean them up automatically on the next replication (if you don't mind the double-size-index) Regards, Peter. > > Hi Peter, > Thanks your reponse. I will check the > ht

Re:Re: Any there any known issues may cause the index sync between the master/slave abnormal?

2010-07-21 Thread Chengyang
Hi Peter, Thanks for your response. I will check the http://wiki.apache.org/solr/SolrReplication first. I mean the slave node did not delete the old index, which finally caused the disk usage to grow too large on the slave node. I am thinking of manually forcing the slave node to refresh the index. Regards, Jam

Re: Any there any known issues may cause the index sync between the master/slave abnormal?

2010-07-21 Thread Peter Karich
Hi! > Any there any known issues may cause the index sync between the > master/slave abnormal? What do you mean here? Corrupt indices? Please, describe your problems in more detail. > And is there any API to call to force sync the index between the > master and slave, or force to delete the old

a bug of solr distributed search

2010-07-21 Thread Li Li
in QueryComponent.mergeIds. It will remove document which has duplicated uniqueKey with others. In current implementation, it use the first encountered. String prevShard = uniqueDoc.put(id, srsp.getShard()); if (prevShard != null) { // duplicate detected