Re: How many fields can SOLR handle?

2011-07-08 Thread William Bell
RoySolr, Not sure what language your client is written in, but this is a simple if statement. if (category == "TV") { qStr = "q=*:*&facet=true&facet.field=tv_size&facet.field=resolution"; } else if (category == "Computer") { qStr = "q=*:*&facet=true&facet.field=cpu&facet.field=gpu"; } curl
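For illustration only, here is a minimal Python sketch of the per-category facet query described in this message. The client language in the thread is unknown, so Python is an assumption, as are the FACET_FIELDS mapping and the build_facet_query name; the field names are the ones quoted above.

```python
from urllib.parse import urlencode

# Hypothetical mapping from category to facet fields; the field names
# tv_size, resolution, cpu and gpu are taken from the message.
FACET_FIELDS = {
    "TV": ["tv_size", "resolution"],
    "Computer": ["cpu", "gpu"],
}

def build_facet_query(category):
    """Return a Solr query string that facets on the fields for `category`."""
    params = [("q", "*:*"), ("facet", "true")]
    # One facet.field parameter per field; unknown categories get no facets.
    params += [("facet.field", field) for field in FACET_FIELDS.get(category, [])]
    # safe="*:" keeps the match-all query readable instead of percent-encoding it.
    return urlencode(params, safe="*:")

print(build_facet_query("TV"))
```

Keeping the mapping in one table means a new category only needs a new entry rather than another branch in the if statement.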

Re: How many fields can SOLR handle?

2011-07-08 Thread roySolr
Yes, I use something like that. I make a DB connection to get the facets for the chosen category. With this data I add facet.fields dynamically, for example: foreach (results as result) { qStr .= "&facet.field=" . result; } I was searching for a solution where I don't need to get the facets from the DB. Now I

Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-08 Thread Sowmya V.B.
Hi Koji, Thanks for the mail. Thanks for all the clarifications. I am now using version 3.3. But another query that I have about this is: How can I add an annotator that I wrote myself into Solr-UIMA? Here is what I did before I moved to Solr: I wrote an annotator (which worked when I

Problem in searching uppercase string with wildcard

2011-07-08 Thread Romi
Hello, I am using Solr search. My search field contains both diamond and Diamond. When I search for Diamond/diamond it gives me correct results. But when I search for Diamond*/diamond*, I get results for diamond* but no results for Diamond*, although I have applied filter

RE: Problem with first letter accented

2011-07-08 Thread Villacorta Peral, Eva
I'm using collectiveaccess, and its DB structure. Perhaps this is useful... My type definition is: schema name=ca_objects version=1.1 types fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer

Re: Problem in searching uppercase string with wildcard

2011-07-08 Thread William Bell
Set up the filter at both query and index time to make it case insensitive... Then reindex. On Fri, Jul 8, 2011 at 1:26 AM, Romi romijain3...@gmail.com wrote: Hello, I am using Solr search. My search field contains both diamond and Diamond. When I search for Diamond/diamond it gives me correct
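If wildcard terms turn out to bypass the query-time analyzer in the Solr version in use (an assumption, not something this thread confirms), one possible client-side workaround is to lowercase them before sending the query. A rough Python sketch, with a hypothetical helper name:

```python
import re

def lowercase_wildcard_terms(query):
    """Lowercase the value part of every term that contains * or ?."""
    def fix(match):
        term = match.group(0)
        if "*" not in term and "?" not in term:
            return term  # plain terms and operators such as AND stay untouched
        # Keep an optional field prefix (e.g. Title:) as it is.
        field, sep, value = term.rpartition(":")
        return field + sep + value.lower()
    return re.sub(r"\S+", fix, query)

print(lowercase_wildcard_terms("Diamond*"))
```

This only helps if the indexed terms are lowercased as well, which is what the filter suggested above would do after a reindex.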

RE: Problem with first letter accented

2011-07-08 Thread Ahmet Arslan
Hello, As I see from analysis.jsp, your á letter is not converted to 'a' by the ASCII folding filter. It is recognized as two characters 'á' (before it comes to ASCII folding) for some reason. First of all I would check the URI encoding of my servlet container. It should be UTF-8. See Tomcat's

Re: Problem in searching uppercase string with wildcard

2011-07-08 Thread Ahmet Arslan
Hello, I am using Solr search. My search field contains both diamond and Diamond. When I search for Diamond/diamond it gives me correct results. But when I search for Diamond*/diamond*, I get results for diamond* but no results for Diamond*, although I have applied filter

performance variation with respect to the index size

2011-07-08 Thread jame vaalet
Hi, is there any performance degradation (response time etc.) if the index has document content text stored in it (stored=true)? -JAME

Re: Exception when using result grouping and sorting by geodist() with Solr 3.3

2011-07-08 Thread Thomas Heigl
How should I proceed with this problem? Should I create a JIRA issue or should I cross-post on the dev mailing list? Any suggestions? Cheers, Thomas On Wed, Jul 6, 2011 at 9:49 AM, Thomas Heigl tho...@umschalt.com wrote: My query in the unit test looks like this: q=*:*fq=_query_:{!geofilt

Re: Virtual Memory usage increases beyond Xmx with Solr 3.3

2011-07-08 Thread Toke Eskildsen
On Fri, 2011-07-08 at 07:12 +0200, Nikhil Chhaochharia wrote: However, if I upgrade to Solr 3.3, then the Virtual Memory of the Tomcat process increases to roughly the index size (70GB). Any ideas why this is happening? Maybe you switched to MMapDirectory?

RE: Problem with first letter accented

2011-07-08 Thread Villacorta Peral, Eva
I'm sorry if this mail is repeated, but my mail server gave me an error. Hi! I've changed the server.xml to add the URI Encoding. I've changed the schema version to 1.4. And I've reindexed my DB. But nothing has changed. In analysis.jsp I've searched for más, in order to find what happens

Solr sentiment analysis

2011-07-08 Thread Zheng Qin
Hi, We are starting a project on Twitter data sentiment analysis. We have installed LucidWorks, which also has a Solr admin page. By reading the posts, it seems that sentiment analysis can be done by using OpenNLP or machine learning (Mahout or Weka). Can you share with us which tool is good at

RE: Problem with first letter accented

2011-07-08 Thread Ahmet Arslan
I've changed the server.xml to add the URI Encoding. I've changed the schema version to 1.4. And I've reindexed my DB. But nothing has changed. Okay, just to make sure, the correct connector should be this: Connector port=8080 protocol=HTTP/1.1 connectionTimeout=2

RE: Problem with first letter accented

2011-07-08 Thread Villacorta Peral, Eva
Okay, just to make sure, the correct connector should be this: Connector port=8080 protocol=HTTP/1.1 connectionTimeout=2 redirectPort=8443 URIEncoding=UTF-8 / Can you confirm this? Did you restart Tomcat? This is my connector:

Re: performance variation with respect to the index size

2011-07-08 Thread François Schiettecatte
Hi, I don't think that anyone has run such benchmarks. In fact this topic came up two weeks ago and I volunteered to do that because I have some spare time this week, so I am going to run some benchmarks this weekend and report back. The machine I have to do this is a Core i7 960, 24GB,

Re: performance variation with respect to the index size

2011-07-08 Thread jame vaalet
I would prefer every setting to be in its default state and to compare the results with stored=true and stored=false. 2011/7/8 François Schiettecatte fschietteca...@gmail.com Hi I don't think that anyone has run such benchmarks, in fact this topic came up two weeks ago and I volunteered some time to

Re: Need help with troublesome wildcard query

2011-07-08 Thread Christopher Cato
Hi Briggs. Thanks for taking the time. I have the query nearly working now, currently this is how it looks when it matches on the title Super Technocrane 30 and others with similar names: INFO: [] webapp=/solr path=/select/

omitNorms and omitTermFreqAndPosition

2011-07-08 Thread Gastone Penzo
Hi, I have a problem with omitTermFreqAndPosition and omitNorms. In my schema I have some fields with these properties set to true, for example the field category. Then I make a query like: select?q=category:(x OR y or Z) It returns all docs that have x, y or z as category. I make a debugQuery=on

Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-08 Thread Koji Sekiguchi
(11/07/08 16:19), Sowmya V.B. wrote: Hi Koji, Thanks for the mail. Thanks for all the clarifications. I am now using version 3.3. But another query that I have about this is: How can I add an annotator that I wrote myself into Solr-UIMA? Here is what I did before I moved to Solr: I

Re: can't get moreLikeThis to work

2011-07-08 Thread Elaine Li
Guan and Koji, thank you both! After I changed to termVectors = true, it returns the results as expected. I flipped the stored=true|false for two fields: text and category_text and compared the results and don't see any difference. The documentation seems to suggest to have stored=true for the

Re: Exception when using result grouping and sorting by geodist() with Solr 3.3

2011-07-08 Thread Yonik Seeley
On Fri, Jul 8, 2011 at 4:11 AM, Thomas Heigl tho...@umschalt.com wrote: How should I proceed with this problem? Should I create a JIRA issue or should I cross-post on the dev mailing list? Any suggestions? Yes, this definitely sounds like a bug in the 3.3 grouping (looks like it forgets to

Re: Need help with troublesome wildcard query

2011-07-08 Thread Briggs Thompson
Hey Chris, Removing the ORs in each query might help narrow down the problem, but I suggest you run this through the query analyzer in order to see where it is dropping out. It is a great tool for troubleshooting issues like these. I see a few things here. - for leading wildcard queries, you

Re: performance variation with respect to the index size

2011-07-08 Thread Erick Erickson
Well, it depends (tm). Raw search time should be unaffected (or very close to that). The stored data is in a completely separate file in the index directory and is not referenced during searches. That said, assembling the response may take longer since you're potentially reading more data from

Re: can't get moreLikeThis to work

2011-07-08 Thread Erick Erickson
What browser are you using? Chrome and Firefox (and I think IE) have plugins that'll format XML and JSON right in the browser, which helps with this a lot. Best Erick On Fri, Jul 8, 2011 at 10:08 AM, Elaine Li elaine.bing...@gmail.com wrote: Guan and Koji, thank you both! After I changed to

Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-08 Thread Sujit Pal
Hi Sowmya, I basically wrote an annotator and built a buffering tokenizer around it so I could include it in a Lucene analyzer pipeline. I've blogged about it; not sure if it's good form to include links to blog posts in public forums, but here they are, apologies in advance if this is wrong (let

Re: can't get moreLikeThis to work

2011-07-08 Thread Shawn Heisey
On 7/8/2011 8:08 AM, Elaine Li wrote: Guan and Koji, thank you both! After I changed to termVectors = true, it returns the results as expected. I flipped the stored=true|false for two fields: text and category_text and compared the results and don't see any difference. The documentation seems

Re: can't get moreLikeThis to work

2011-07-08 Thread Juan Grande
Hi Elaine, You can add the indent=true parameter to the request to get a tidier output. Firefox usually ignores tabs when showing XML, so I'd suggest choosing View page source in that case. The documentation seems to suggest to have stored=true for the fields though, not sure why. Maybe
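As a small illustration of the indent=true suggestion, this Python sketch builds a MoreLikeThis request URL with indentation switched on. The base URL, the example query and the mlt_url helper are assumptions; the field names text and category_text are the ones mentioned earlier in the thread.

```python
from urllib.parse import urlencode

def mlt_url(base_url, query, fields):
    """Build a select URL with MoreLikeThis enabled and indented output."""
    params = {
        "q": query,
        "mlt": "true",
        "mlt.fl": ",".join(fields),  # fields used to find similar documents
        "indent": "true",            # ask Solr for a tidier, indented response
    }
    return base_url + "?" + urlencode(params)

# Hypothetical host and document id, for illustration only.
print(mlt_url("http://localhost:8983/solr/select", "id:123", ["text", "category_text"]))
```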

Re: Need help with troublesome wildcard query

2011-07-08 Thread Christopher Cato
Hi Briggs, thanks for being patient with me! Yeah, I saw I had a typo there in the OR clause. Fixed it but still no perfect results. I'm looking at the analysis.jsp page and can't really figure it out. Feeling a bit overwhelmed by all the output. I also don't know how to check if stemming is

Re: Need help with troublesome wildcard query

2011-07-08 Thread Erick Erickson
Yeah, the analysis page takes a bit of getting used to, but it's well worth the time. Be sure to check the verbose box. Taking some time to understand what it's telling you is one of the best investments you'll make. Your parts of words is the issue. One approach is to use ngrams or edgengrams.

Re: Solr sentiment analysis

2011-07-08 Thread Bruno Adam Osiek
Try Lingpipe. They use Language Models as their engine for sentiment analysis. At (http://alias-i.com/lingpipe/) you will find a step-by-step tutorial on how to implement it. On 07/08/2011 07:14 AM, Zheng Qin wrote: Hi, We are starting a project on Twitter data sentiment analysis. We have

Re: Virtual Memory usage increases beyond Xmx with Solr 3.3

2011-07-08 Thread Nikhil Chhaochharia
Thanks, MMapDirectory was the reason - it was made the default in Lucene 3.3 http://lucene.apache.org/java/3_3_0/changes/Changes.html#3.3.0.changes_in_runtime_behavior https://issues.apache.org/jira/browse/LUCENE-3198 From: Toke Eskildsen To:

Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-08 Thread Sowmya V.B.
Hi Koji, Thanks. I have checked out the code and begun looking at it. The code examples gave me an idea of what to do, though I am not fully clear, since there are no comments there to verify my understanding. Hence, mailing again for clarification. In NamedEntity.java, you add two fields name,

Re: Need help with troublesome wildcard query

2011-07-08 Thread Christopher Cato
Thanks for that pointer, that's really more what I want to do. And actually, EdgeNGrams is stuck somewhere in the back of my head :) Yes, simple at first thought but not as easy to implement as I have discovered. Well, so how do I implement something like this? I took the fieldtype declaration

Re: Getting the indexed value rather than the stored value

2011-07-08 Thread Christian
Hi Gora, The problem I am finding is that the copyField directive sends the original value to the new field type. The field type then munges the indexed value until it's completely different (original - some sentence like this, index - true), but the stored value is still the original sentence. When I

Re: Need help with troublesome wildcard query

2011-07-08 Thread Erick Erickson
Nope, that should do it (although I haven't tried that exact set of steps). But you do have to reindex from scratch Best Erick On Fri, Jul 8, 2011 at 1:36 PM, Christopher Cato christopher.c...@minimedia.se wrote: Thanks for that pointer, that's really more what I want to do. And actually,

Re: sorting on date field in facet query

2011-07-08 Thread Dmitry Kan
Wanted to say thanks to everyone who contributed: Erick, Stefan, kenf_nc. Erick, a solution based on your proposition has been implemented and pushed to users. Thanks! Best On 5/19/11, Erick Erickson erickerick...@gmail.com wrote: Oooh, that's clever The glitch is that field collapsing is

Re: can't get moreLikeThis to work

2011-07-08 Thread Elaine Li
You can add the indent=true parameter to the request to get a tidier output. Firefox usually ignores tabs when showing XML, so I'd suggest to choose View page source in that case. The page source looks so much better. :) thanks! The documentation seems to suggest to have stored=true for the

Re: Solr sentiment analysis

2011-07-08 Thread Matthew Painter
Note that you can't use LingPipe commercially without a license, I believe. Sent from my iPhone On 8 Jul 2011, at 18:20, Bruno Adam Osiek baos...@gmail.com wrote: Try Lingpipe. They use Language Models as their engine for sentiment analysis. At (http://alias-i.com/lingpipe/) you will find a

Re: Need help with troublesome wildcard query

2011-07-08 Thread Christopher Cato
And don't you know, that EdgeNGram analyzer did the trick. Added the fieldtype, added a new field based on it, copyfielded the old title to it, reindexed and hey - it works brilliantly :) And you were right, the analysis output does make sense once it actually matches something :D Thanks a
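To show why edge n-grams help with matching parts of words, here is a rough Python sketch of what such a filter produces at index time. This is not the Solr implementation; the edge_ngrams function and the min_gram/max_gram values are assumptions chosen for illustration, and the title is the one mentioned earlier in the thread.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the leading substrings of `token` from min_gram to max_gram characters."""
    token = token.lower()
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]

# Index every word of the title as its set of leading substrings.
index_terms = set()
for word in "Super Technocrane 30".split():
    index_terms.update(edge_ngrams(word))

# A typed prefix now matches as an ordinary term, with no wildcard needed.
print("techno" in index_terms)
# Only prefixes are indexed, so an inner fragment does not match.
print("crane" in index_terms)
```

The trade-off is a larger index, since each word is stored once per prefix length.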

Appropriate Tokenizer/Filter to Handle Punctuation Variation

2011-07-08 Thread Damon Zwolinski
This may be a simple question; well, I hope so anyway. We have songs that contain punctuation and quoting, and the trick is to get all variations of a query to return the correct result. Please see the following example. From the database we index a song with title Damon's Radical Song?. We want
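One way to think about the problem is to reduce the indexed title and the user's query to the same punctuation-free form before comparing them. The Python sketch below only illustrates that idea; it is not a Solr tokenizer or filter configuration, and the normalize helper and its exact rules are assumptions.

```python
import re

def normalize(text):
    """Lowercase, drop apostrophes, and turn other punctuation into spaces."""
    text = text.lower().replace("'", "").replace("\u2019", "")
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace left behind by removed punctuation.
    return " ".join(text.split())

indexed = normalize("Damon's Radical Song?")
print(indexed)
print(normalize('damons "radical" song') == indexed)
```

In Solr the same effect would have to come from the analyzer chain applied at both index and query time.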

Losing out some facets when changing anything in feed.

2011-07-08 Thread Bais, Shailendra Singh
We are trying to implement browsing based on the search functionality (Which will have facet for that particular category along with item list). For this purpose we have added one field in the feed (.csv) file which we use to create indexes so that we can do the search based on the subcategory id

Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-08 Thread Koji Sekiguchi
Now I've pasted sample solrconfig.xml to the project top page. Can you visit and look at it again? koji -- http://www.rondhuit.com/en/ (11/07/09 2:29), Sowmya V.B. wrote: Hi Koji Thanks. I have checked out the code and began looking at it. The code examples gave me an idea of what to

SolrJ Spatial Search

2011-07-08 Thread SR
Hi there, Through SolrJ 3.2, I'm trying to set some Spatial Search queries (e.g., filter by distance, sort by distance, etc.). I don’t know whether there's a specific SolrJ syntax to do this. I tried using Strings, but it’s not working. Here are two examples that work fine on Solr, but don’t

Local Search – Rank results by businesses density

2011-07-08 Thread SR
Hi there, For local business searches in big cities (e.g., Restaurant in NYC), I’d like to sort the results by the density of the businesses in the underlying neighborhoods (e.g., return x restaurants from the neighborhood that has the highest density of restaurants). A solution would be to

Query Rewrite

2011-07-08 Thread Jamie Johnson
My organization is considering a few different approaches for indexing vs. query rewrite, and I'm trying to figure out what would be required in order to implement some form of query rewrite. Let's say my index has 2 fields, first name and last name. When the user does a query name:bob I'd like to

Re: Solr sentiment analysis

2011-07-08 Thread Zheng Qin
Thanks, Bruno and Matthew. I saw that tutorial before and Lingpipe requires a license while we are looking at open source solutions. We are not clear yet on how to use Solr to do sentiment analysis. Does a NLP or learning tool have to be used to accomplish this task? If a tool is needed, how it