Re: Solr Reporting

2010-09-23 Thread Myron Chelyada
Hi Adeel, I would use the first approach since it is more flexible and easier to use. Please consider XsltResponseWriter, which lets you transform the result set from Solr's default XML structure into a custom one using a provided XSLT template. Myron 2010/9/23 Adeel Qureshi > This probably isnt directly

Re: Autocomplete: match words anywhere in the token

2010-09-23 Thread Chantal Ackermann
What works very well for me: 1.) Keep the tokenized field (KeywordTokenizerFilter, WordDelimiterFilter) (like you described you had) 2.) Create an additional field that uses the String type with the same content (use copy field to fill either) 3.) Use facet.prefix instead of terms.prefix fo
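A minimal sketch of the request parameters the facet.prefix approach above would need, assuming a string-typed copy of the field. The field name `name_str` and the helper name are hypothetical; `facet`, `facet.field`, `facet.prefix`, `facet.limit` and `rows` are standard Solr request parameters.

```python
from urllib.parse import urlencode

def autocomplete_params(prefix, field="name_str", limit=10):
    """Build a query string for prefix suggestions via faceting.

    `field` is a hypothetical string-typed copy of the tokenized field.
    """
    return urlencode({
        "q": "*:*",             # match everything; only the facet counts matter
        "rows": 0,              # no documents needed, just facet values
        "facet": "true",
        "facet.field": field,
        "facet.prefix": prefix, # only facet values starting with this prefix
        "facet.limit": limit,
    })

print(autocomplete_params("Canon"))
```

Because the field is an untokenized string, the prefix match is case-sensitive and starts at the beginning of the whole stored value.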

Re: Autocomplete: match words anywhere in the token

2010-09-23 Thread Chantal Ackermann
On Wed, 2010-09-22 at 20:14 +0200, Arunkumar Ayyavu wrote: > Thanks for the responses. Now, I included the EdgeNGramFilter. But, I get > the following results when I search for "canon pixma". > Canon PIXMA MP500 All-In-One Photo Printer > Canon PowerShot SD500 > > As you can guess, I'm not expecti

Re: is indexing single-threaded?

2010-09-23 Thread Jan Høydahl / Cominvent
SolrJ threads speed up feeding throughput. Building the index is still single-threaded (per core), isn't it? Don't know about analysis. But you cannot have two threads write to the same file... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 23. sep. 2010, at

Re: Autocomplete: match words anywhere in the token

2010-09-23 Thread Jan Høydahl / Cominvent
Make sure you're using AND as default operator... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 22. sep. 2010, at 20.14, Arunkumar Ayyavu wrote: > Thanks for the responses. Now, I included the EdgeNGramFilter. But, I get > the following results when I search for "

matches in result grouping

2010-09-23 Thread Koji Sekiguchi
I'm using the recently committed field collapsing / result grouping feature in trunk. I'm confused by the matches parameter in the result in the second sample output on the Wiki: http://wiki.apache.org/solr/FieldCollapsing#Quick_Start I cannot understand why there are two "matches":5 entries in the result. Can

Custom Sorting with function queries

2010-09-23 Thread dl
I need to 'rank' the documents in a solr index based on some field values and the query. Is this possible using function queries? Two examples to illustrate what I am trying to achieve: The index contains two fields min_rooms and max_rooms, both integers, both optional. If I query the index for

bi-grams for common terms - any analyzers do that?

2010-09-23 Thread Andy
Hi, I was going through this LucidImagination presentation on analysis: http://www.slideshare.net/LucidImagination/analyze-this-tips-and-tricks-on-getting-the-lucene-solr-analyzer-to-index-and-search-your-content-right 1) on p.31-33, it talks about forming bi-grams for the 32 most common terms durin

Re: Solr Reporting

2010-09-23 Thread kenf_nc
keep in mind that the paradigm isn't completely useless, the str is a data type (string), it can be int, float, double, date, and others. So to not lose any information you may want to do something like: 123 xyz Which I agree makes more sense to me. The name of the field is more important than

RE: bi-grams for common terms - any analyzers do that?

2010-09-23 Thread Steven A Rowe
> -Original Message- > From: Andy [mailto:angelf...@yahoo.com] > Sent: Thursday, September 23, 2010 6:05 AM > To: solr-user@lucene.apache.org > Subject: bi-grams for common terms - any analyzers do

Re: How can I delete the entire contents of the index?

2010-09-23 Thread kenf_nc
Quick tangent... I went to the link you provided, and the delete part makes sense. But the next tip, how to re-index after a schema change: what is the point of step 5, "Send an <optimize/> command"? Why do you need to optimize an empty index? Or is my understanding of Optimize incorrect?

Re: Searches with a period (.) in the query

2010-09-23 Thread kenf_nc
Do you have any other Analyzers or Formatters involved? I use delimiters in certain string fields all the time. Usually a colon ":" or slash "/" but should be the same for a period. I've never seen this behavior. But if you have any kind of tokenizer or formatter involved beyond then you ma

RE: How can I delete the entire contents of the index?

2010-09-23 Thread Jonathan Rochkind
Because even after you've deleted every document from the index, there are still actually index _files_ on disk taking up space. Lucene organizes its files for quick access, and a consequence of this is that deleting a document does not necessarily reclaim the disk space. Optimize will recla

RE: bi-grams for common terms - any analyzers do that?

2010-09-23 Thread Jonathan Rochkind
I've been thinking about the CommonGramsFilter for a while, and am confused about how it works. Can anyone provide examples? Are you meant to include the analyzer at both index and query time? The description on the wiki says among other things: "The CommonGramsQueryFilter converts the phrase

Re: Xpath extract element name

2010-09-23 Thread yklxmas
Great. XSL worked like a charm! Thx lots. -- View this message in context: http://lucene.472066.n3.nabble.com/Xpath-extract-element-name-tp1534390p1567809.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How can I delete the entire contents of the index?

2010-09-23 Thread Michael McCandless
Lucene has an API for very fast deletion of the index (ie, it removes the files): IndexWriter.deleteAll(). It's part of the transaction, ie, you still must call .commit() to make the change visible to external readers. But I don't know whether this is exposed in Solr... Mike On Thu, Sep 23, 201

Issue with Solr Boosting

2010-09-23 Thread Jayant Patil
Hi, We are using Solr for our searches. We are facing issues while applying boost on particular fields. E.g. We have a field Category, which contains values like Electronics, Computers, Home Appliances, Mobile Phones etc. We want to boost the category Electronics and Mobile Phones, we are using

Re: Wildcard search in phrase query using spanquery

2010-09-23 Thread Joachim M
Bra1n, Did you ever get this sorted out? I'm having the same problem, Unknown query type "org.apache.lucene.search.WildcardQuery" found in phrase query Thanks

Re: Solr Reporting

2010-09-23 Thread Adeel Qureshi
Thank you for your suggestions .. makes sense and I didn't know about the XsltResponseWriter .. that opens the door to all kinds of possibilities .. so it's great to know about that but before I go that route .. what about performance .. In Solr Wiki it mentions that XSLT transformation isn't so bad in

Re: How can I delete the entire contents of the index?

2010-09-23 Thread Chris Hostetter
: Lucene has an API for very fast deletion of the index (ie, it removes : the files): IndexWriter.deleteAll(). It's part of the transaction, ... : But I don't know whether this is exposed in Solr... Solr definitely has optimized the delete *:* case (but i don't know if it's using the spe
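For reference, a minimal sketch of the update messages behind the delete *:* case discussed above. It only builds the XML bodies; where they get posted (typically a core's /update handler) depends on the deployment and is an assumption here, and the helper name is hypothetical.

```python
from xml.sax.saxutils import escape

def delete_by_query_xml(query="*:*"):
    """Return the XML update message that deletes all documents matching `query`."""
    return "<delete><query>" + escape(query) + "</query></delete>"

# The deletion only becomes visible to searchers after a commit.
COMMIT_XML = "<commit/>"

print(delete_by_query_xml())
print(COMMIT_XML)
```

As with IndexWriter.deleteAll() on the Lucene side, the delete is part of a transaction, so the commit message has to follow it.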

Re: Searches with a period (.) in the query

2010-09-23 Thread Siddharth Powar
Hey Ken, The fieldType definition that I am using is: Thanks, Sid On Thu, Sep 23, 2010 at 5:29 AM, kenf_nc wrote: > > Do you have any other Analyzers or Formatters involved? I use delimiters in > certain string fields all the time. Usually a colon ":" or slash "/" but > should be the same for

RE: bi-grams for common terms - any analyzers do that?

2010-09-23 Thread Burton-West, Tom
Hi all, The CommonGrams filter is designed to only work on phrase queries. It is designed to solve the problem of slow phrase queries with phrases containing common words, when you don't want to use stop words. It would not make sense for Boolean queries. Boolean queries just get passed throu
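To make the idea concrete, here is a simplified sketch of what index-time common-grams output looks like: every original token is kept, and a bigram is added for each adjacent pair in which at least one word is common. This is an illustration of the concept, not the actual CommonGramsFilter code; the function name, the "_" separator and the word list are assumptions.

```python
def common_grams(tokens, common_words):
    """Emit each token, plus a bigram for every adjacent pair
    where at least one of the two words is a common word."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common_words or nxt in common_words:
                out.append(tok + "_" + nxt)
    return out

print(common_grams(["the", "rain", "in", "spain"], {"the", "in"}))
```

A phrase query can then match on the much rarer bigrams (such as "the_rain") instead of intersecting long position lists for the common words themselves.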

Re: is indexing single-threaded?

2010-09-23 Thread Dennis Gearon
I was kind of wondering what magic had been done to achieve multiple writing to the index file :-) BTW, wouldn't it be possible to have separate segments per thread? Set up the index with a minimum (desired?) segment count, and write each individually? Is there any organization in the segments?

Re: matches in result grouping

2010-09-23 Thread Koji Sekiguchi
(10/09/23 18:14), Koji Sekiguchi wrote: > I'm using recent committed field collapsing / result grouping > feature in trunk. > > I'm confusing matches parameter in the result at the second > sample output of Wiki: > > http://wiki.apache.org/solr/FieldCollapsing#Quick_Start > > I cannot understand

Re: matches in result grouping

2010-09-23 Thread Yonik Seeley
2010/9/23 Koji Sekiguchi : >  (10/09/23 18:14), Koji Sekiguchi wrote: >>  I'm using recent committed field collapsing / result grouping >> feature in trunk. >> >> I'm confusing matches parameter in the result at the second >> sample output of Wiki: >> >> http://wiki.apache.org/solr/FieldCollapsing#

Re: Solr Reporting

2010-09-23 Thread Peter Sturge
Hi, Are you going to generate a report with 30k records in it? That will be a very large report - will anyone really want to read through that? If you want/need 'summary' reports - i.e. stats on the 30k records, it is much more efficient to set up faceting and/or server-side analysis to do thi

Re: bi-grams for common terms - any analyzers do that?

2010-09-23 Thread Robert Muir
On Thu, Sep 23, 2010 at 12:02 PM, Burton-West, Tom wrote: > > The problem with "l'art" is actually due to a bug or feature in the > QueryParser. Currently the QueryParser interacts with the token chain and > decides whether the tokens coming back from a tokenfilter should be treated > as a phrase

Re: Issue with Solr Boosting

2010-09-23 Thread Jak Akdemir
I think if you don't need to add more categories, just increasing the boost factor of Electronics would work. As you said, because of the DocFreq of "Mobile Phones", the scoring algorithm is working as expected. On Thu, Sep 23, 2010 at 3:42 PM, Jayant Patil wrote: > Hi, > > We are using Solr for our sear

Re: Searches with a period (.) in the query

2010-09-23 Thread Jak Akdemir
Siddharth, did you check the tokenizer and filter behaviour on the ../admin/analysis.jsp page? That would be quite informative. On Thu, Sep 23, 2010 at 6:42 PM, Siddharth Powar wrote: > Hey Ken, > > The filedType definition that I am using is: > sortMissingLast="true" omitNorms="true" /> > > T

Re: Solr Reporting

2010-09-23 Thread Adeel Qureshi
Hi Peter I understand what you are saying but I think you are thinking more of report as graph and analysis and summary kind of data .. for my reports I do need to include all records that qualify certain criteria .. e.g. a listing of all orders placed in last 6 months .. now that could be 1 o

Calgary Solr Consultant?

2010-09-23 Thread Ryan Courtnage
Hi, I'm looking for a Solr expert local to Calgary, Alberta to help us jumpstart a search project. Ryan Courtnage PS: apologies if this is the wrong list for this type of request.

Re: Searches with a period (.) in the query

2010-09-23 Thread Yonik Seeley
On Wed, Sep 22, 2010 at 8:13 PM, Siddharth Powar wrote: > I am getting some weird output upon searching in solr. For certain searches > that have a period in the search term (e.g: q=ab.xyz) solr returns the > results perfectly, but for some other searches (e.g: q=ab.pqr) solr would > return 0 resu

RE: Calgary Solr Consultant?

2010-09-23 Thread Markus Jelsma
Companies and people that offer support should be listed on the wiki, although just a few take the effort to edit the wiki: http://wiki.apache.org/solr/Support   -Original message- From: Ryan Courtnage Sent: Thu 23-09-2010 20:22 To: solr-user@lucene.apache.org; Subject: Calgary Solr C

Renaming Solr mbean

2010-09-23 Thread Dan Trainor
Hi - I inquired about this some time ago, and learned a lot on the subject of JMX and Tomcat (and solr, too!), but never did find a definite answer. It doesn't look like anything JMX has been discussed on the list for some time, either. I've been toying around with JMX lately in Tomcat, and th

Grouping in solr ?

2010-09-23 Thread Papp Richard
Hi all, is it possible somehow to group documents? I have services as documents, and I would like to show the filtered services grouped by company. So I filter services by given criteria, but I show the results grouped by company. If I got 1000 services, maybe I need to show just 100 com

RE: Grouping in solr ?

2010-09-23 Thread Markus Jelsma
http://wiki.apache.org/solr/FieldCollapsing https://issues.apache.org/jira/browse/SOLR-236   -Original message- From: Papp Richard Sent: Thu 23-09-2010 21:29 To: solr-user@lucene.apache.org; Subject: Grouping in solr ? Hi all,  is it possible somehow to group documents?  I have servic

Re: Solr Reporting

2010-09-23 Thread Peter Sturge
Yes, that makes sense. So, more of a bulk data export requirement. If the Excel data doesn't have to go out on the web, you could export to a local file (using a local solrj streamer), then publish it, which might save some external http bandwidth if that's a concern. We do this all the time using a

Re: Autocomplete: match words anywhere in the token

2010-09-23 Thread Jonathan Rochkind
This works with _one_ entry per document, right? If you've actually found a clever trick to use this technique when you have more than one entry for auto-suggest per document, do let me know. Cause I haven't been able to come up with one. Jonathan Chantal Ackermann wrote: What works very goo

Range query not working

2010-09-23 Thread PeterKerk
I have this in my query: &q=*:*&facet.query=location_rating_total:[3 TO 100] And this document: − 1.0 1 2 But still my total results equals 6 (total population) and not 0 as I would expect Why?

Search a URL

2010-09-23 Thread Max Lynch
Is there a tokenizer that will allow me to search for parts of a URL? For example, the search "google" would match on the data "http://mail.google.com/dlkjadf" This tokenizer factory doesn't seem to be sufficient:

Re: Range query not working

2010-09-23 Thread Yonik Seeley
On Thu, Sep 23, 2010 at 4:30 PM, PeterKerk wrote: > I have this in my query: > &q=*:*&facet.query=location_rating_total:[3 TO 100] > > And this document: > > - > > 1.0 > 1 > 2 > > > But still my total results equals 6 (total population) and not 0 as I would > expect > > Why? facet.query will

Re: Range query not working

2010-09-23 Thread PeterKerk
Forgot to mention..I tried that too already. So when I have: location_rating_total:[0 TO 100] It shows only the location for which the location_rating_total is EXACTLY 0...locations that have location_rating_total value of 2 are NOT included. Any other suggestions?

Re: Search a URL

2010-09-23 Thread dl
LetterTokenizerFactory will use each contiguous sequence of letters and discard the rest. http, https, com, etc. would need to be a stopword. Alternatively you can try PatternTokenizerFactory with a regular expression if you are looking for a specific part of the URL. On Sep 23, 2010, at 10:59
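A rough sketch of the behaviour described above, using a regular expression to approximate letter tokenization followed by a stopword filter. It is ASCII-only and the stopword list is an assumption; the real tokenizer works on Unicode letters and is configured in schema.xml rather than in application code.

```python
import re

# Hypothetical stopword list for URL parts that should not be searchable.
URL_STOPWORDS = {"http", "https", "www", "com"}

def letter_tokens(text, stopwords=URL_STOPWORDS):
    """Split on anything that is not an ASCII letter, lowercase, drop stopwords."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", text)]
    return [t for t in tokens if t not in stopwords]

print(letter_tokens("http://mail.google.com/dlkjadf"))
```

With this analysis a search for "google" would match the example URL, since "google" survives as its own token.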

RE: Search a URL

2010-09-23 Thread Markus Jelsma
Try setting generateWordParts=1 in your WDF. Also, a WhitespaceTokenizer makes little sense for URLs: there should be no whitespace in a URL, and the StandardTokenizer can tokenize a URL. Anyway, the problem is your WDF.   -Original message- From: Max Lynch Sent: Thu 23-09-2010 23:00

Re: Range query not working

2010-09-23 Thread Jonathan Rochkind
You need to use a field type that will sort integers properly. You are, I'm pretty sure, using a field type that ends up doing string byte order comparison. And as a string, "2" is not in between "0" and "100". (In fact, pretty much only strings beginning with "0", like say "0234", are.) In
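The string-ordering effect described above is easy to reproduce outside Solr. This small sketch compares the same range check on strings and on integers; the helper name is hypothetical and nothing here touches Solr itself.

```python
def in_range(value, low, high):
    """Inclusive range check; the result depends on the type being compared."""
    return low <= value <= high

# Compared as strings: character by character, so "2" sorts after "100".
print(in_range("2", "0", "100"))
print(in_range("0234", "0", "100"))

# Compared as integers: numeric order, which is what the range query intends.
print(in_range(2, 0, 100))
```

The first call reports that "2" is outside the range, which matches the symptom of only the value 0 being returned for [0 TO 100].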

Re: Range query not working

2010-09-23 Thread PeterKerk
This is the field in my schema.xml: Also in the response it clearly shows: 0 What else can I do?

Re: Range query not working

2010-09-23 Thread Jonathan Rochkind
Are you using Solr 1.4.1? Are you using the example default schema from Solr 1.4.1? "int", which I recommended, is not the same as "integer", which you report, in Solr 1.4.1. Different field types have a somewhat confusing history in Solr. With Solr 1.4, there are new types based on the T

Re: Generating a sitemap

2010-09-23 Thread Doki
Hi all, Hate to bring forward a zombified thread (Mar 2010 though, not too bad), but I also am tasked to generate a sitemap for items indexed in a Solr index. Been at this job for only a few weeks, so Solr and Lucene are all new to me, but I think my path forward on this is to create a requ

RE: Grouping in solr ?

2010-09-23 Thread Papp Richard
thank you! this is really helpful. just tried it and it's amazing. do you know how trustworthy a nightly build (solr4) is? Rich -Original Message- From: Markus Jelsma [mailto:markus.jel...@buyways.nl] Sent: Thursday, September 23, 2010 22:38 To: solr-user@lucene.apache.org Subjec

Re: Range query not working

2010-09-23 Thread Yonik Seeley
On Thu, Sep 23, 2010 at 5:44 PM, Jonathan Rochkind wrote: > The field type in a standard schema.xml that's defined as "integer" is NOT > sortable. Right - before 1.4. There is no "integer" field type in 1.4 and beyond in the example schema. > You can not sort on this and get what you want. (Wha

RE: Search a URL

2010-09-23 Thread Dennis Gearon
WDF is not WTF(what I think when I see WDF), right ;-) What is WDF? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Thu, 9/23/10, Markus Jelsma wrote: > From: Marku

Re: Can Solr do approximate matching?

2010-09-23 Thread Igor Chudov
Eric, it appears that the "/solr/mlt" handler is missing, at least based on the URL that I typed. How can I verify existence of MoreLikeThis handler and install it? Thanks a lot! Igor On Wed, Sep 22, 2010 at 11:18 AM, Erik Hatcher wrote: >

TokenFilter that removes payload ?

2010-09-23 Thread Teruhiko Kurosaka
Is there an existing TokenFilter that simply removes payloads from the token stream? Teruhiko "Kuro" Kurosaka RLP + Lucene & Solr = powerful search for global contents

Re: Issue with Solr Boosting

2010-09-23 Thread Jak Akdemir
Hi Jayant, Did you check Function Queries in Solr? I am not sure, but they might solve your problem in this case. Function Queries are a way to affect scoring according to different factors such as page views. http://wiki.apache.org/solr/FunctionQuery Jak On Fri, Sep 24, 2010 at 8:38 AM, Jayant Pat