AW: logic required for newbie

2010-07-29 Thread Bastian Spitzer
You can't really. By searching you will always find _documents_, and Solr will return all their stored fields unless you specify exactly which stored fields you want returned by passing the "&fl=" parameter with your query. The only approach I can think of is (mis)using highlighting, search for h

Re: SolrJ Response + JSON

2010-07-29 Thread Mats Bolstad
If you don't mind your JSON format complying with the one Solr uses, you could use GSON. SolrQuery solrQuery = new SolrQuery("your query"); QueryResponse response = server.query(solrQuery); List<YourObject> beans = response.getBeans(YourObject.class); // some computing ... Gson gson = new Gson(); String json

spell checking problem

2010-07-29 Thread satya swaroop
Hi all, I need some help with spellchecking. I configured my solrconfig and schema by looking at the user mailing list, and here is the configuration I made. My schema.xml: My solrconfig.xml:

Speed up Solr Index merging

2010-07-29 Thread Karthik K
I need to merge multiple solr indexes into one big index. The process is very slow. Please share any tips to speed it up. Will optimizing the indexes before merging help? Thanks, Karthik

Re: Speed up Solr Index merging

2010-07-29 Thread Li Li
I faced this problem but couldn't find any good solution. However, if you have a large stored field, such as the full text of a document, and you don't store it in Lucene, merging will be quicker, because merging two indexes forces a copy of all .fdt files into a new .fdt file. If you store it externally, the problem you have to face is h

Re: Solr using 1500 threads - is that normal?

2010-07-29 Thread Christos Constantinou
Eric, Thank you very much for the indicators! I had a closer look at the commit intervals and it seems that the application is gradually increasing the commits to almost once per second after some time - something that was hidden in the massive amount of queries in the log file. I have changed

Reference shards by alias

2010-07-29 Thread Mark Allan
Hi all, We're building a service which will have data from a number of different providers, but the data from each will be slightly different. We've decided to keep the data separate by using multiple cores in Solr; one for each provider. There will be enough overlap in the data to allow

Implementing lookups while importing data

2010-07-29 Thread Gora Mohanty
Hi, We have a database that has numeric values for some columns, which correspond to text values in drop-downs on a website. We need to index both the numeric and text equivalents into Solr, and can do that via a lookup on a different table from the one holding the main data. We are currently doin

Re: Know which terms are in a document

2010-07-29 Thread Michael McCandless
This is a fairly frequently requested and missing feature in Lucene/Solr... Lucene actually "knows" this information while it's scoring each document; it's just that it in no way tries to record that. If you will only do this on a few documents (eg the one page of results) then piggybacking on th

Re: Implementing lookups while importing data

2010-07-29 Thread Chantal Ackermann
Hi Gora, your suggestion is good. Two thoughts: 1. if both of the tables you are joining are in the same database under the same user you might want to check why the join is so slow. Maybe you just need to add an index on a column that is used in your WHERE clauses. Joins should not be slow. 2.

Re: slave index is bigger than master index

2010-07-29 Thread Muneeb Ali
Well, I do have disk limitations too, and that's why I think the slave nodes died when replicating data from the master node (as it was just adding on top of existing index files). What do you mean here? Optimizing is too CPU expensive? What I meant by avoid playing around with slave nodes is that doing

Re: slave index is bigger than master index

2010-07-29 Thread Muneeb Ali
Where do these lines go in solr config? 5000 1 Thanks, -Mueeb -- View this message in context: http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p1003903.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Querying throws java.util.ArrayList.RangeCheck

2010-07-29 Thread Michael McCandless
Can you run CheckIndex on the index and post the output? Mike On Tue, Jul 27, 2010 at 5:56 PM, Manepalli, Kalyan wrote: > Yonik, >        One more update on this. I used the filter query that was throwing > error and used it to delete a subset of results. > After that the queries started workin

Re: SolrJ Response + JSON

2010-07-29 Thread Mitch Köhler
Hi Mat, sounds very interesting, because it seems to be so easy. You say that this could comply with Solr's JSON format. What are your experiences regarding the differences? I mean, JSON is a standard, so what can be different? Thank you! - Mitch On 29.07.2010 09:42, Mats Bolstad wrote: I

A German Question

2010-07-29 Thread Eric Grobler
Hi Solr world, I have a city field with german city names like: Mölln München Roßdorf and I want to do filters like: fq=city:München or fq=city:munchen I use this type definition: But faceting then looks like: molln munchen rossdorf How can

Solr Indexing slows down

2010-07-29 Thread Peter Karich
Hi, I am indexing a solr 1.4.0 core and committing gets slower and slower. Starting from 3-5 seconds for ~200 documents and ending with over 60 seconds after 800 commits. Then, if I reload the index, it is as fast as before! And today I have read a similar thread [1] and indeed: if I set autowarm

Re: A German Question

2010-07-29 Thread Christian Vogler
On Thursday 29 of July 2010 14:00:21 Eric Grobler wrote: > But faceting then looks like: > molln > munchen > rossdorf > > How can I enable case-insensitive and german agnostic character filters and > output proper formatted names in the facet result? Just create another field without any filte

Re: slave index is bigger than master index

2010-07-29 Thread Peter Karich
Hi Muneeb, I fear you'll have no chance: replicating an index will use more disc space on the slave nodes. Of course, you could minimize disc usage AFTER the replication via the 'optimize-hack'. But are you sure the slave nodes died because of disc limitations? Try to observe the sla

Re: Reference shards by alias

2010-07-29 Thread Gora Mohanty
On Thu, 29 Jul 2010 10:49:18 +0100 Mark Allan wrote: [...] > Is there a way to reference each core/shard by name rather than > explicitly stating the host, port and path in the URL? For > example, I'd like to swap: > http://localhost:8983/solr/core0/select/? > q=foo<..snip..>&shards=loca

Re: hi all the ranging searchproblem

2010-07-29 Thread yun chen
good questions! is there any help/ -- View this message in context: http://lucene.472066.n3.nabble.com/hi-all-the-ranging-searchproblem-tp1003643p1003973.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Reference shards by alias

2010-07-29 Thread Mark Allan
On 29 Jul 2010, at 12:27 pm, Gora Mohanty wrote: On Thu, 29 Jul 2010 10:49:18 +0100 Mark Allan wrote: [...] Is there a way to reference each core/shard by name rather than explicitly stating the host, port and path in the URL? For example, I'd like to swap: http://localhost:8983/solr

Re: SolrJ Response + JSON

2010-07-29 Thread Mats Bolstad
What I meant is that Gson does not wrap the response as follows: { "responseHeader":{ "status":0, "QTime":x}, "response":{"numFound":x,"start":0,"docs":[ { /* docs */ }] }, "facet_counts":{ "facet_queries":{}, "facet_fields":{}, "facet_dates":{}}} If you car
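
As a rough illustration of the envelope quoted above, here is a minimal Python sketch (not SolrJ or Gson code) that wraps a plain list of documents in the same outer structure. The function name wrap_solr_style and its parameters are hypothetical, and the facet_counts section is left out for brevity.

```python
import json


def wrap_solr_style(docs, num_found=None, start=0, qtime=0):
    """Wrap document dicts in an envelope shaped like the quoted Solr response.

    Hypothetical helper: num_found defaults to the number of docs passed in,
    which is only an assumption; Solr reports the total number of hits.
    """
    if num_found is None:
        num_found = len(docs)
    return {
        "responseHeader": {"status": 0, "QTime": qtime},
        "response": {"numFound": num_found, "start": start, "docs": docs},
    }


if __name__ == "__main__":
    print(json.dumps(wrap_solr_style([{"id": "1"}]), indent=1))
```

A client that expects Solr's layout could then read the documents from response.docs regardless of which serializer produced them.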

Re: Querying throws java.util.ArrayList.RangeCheck

2010-07-29 Thread Yonik Seeley
On Thu, Jul 29, 2010 at 6:37 AM, Michael McCandless wrote: > Can you run CheckIndex on the index and post the output? One of these days we need to get around to adding support for this in Solr's admin interface. http://issues.apache.org/jira/browse/SOLR-566 -Yonik http://www.lucidimagination.com

RE: simple question from a newbie

2010-07-29 Thread Nguyen, Vincent (CDC/OSELS/NCPHI) (CTR)
Yup, dc3.title worked like a charm. I'm also sorting by that field as well. Thanks Vincent Vu Nguyen Division of Science Quality and Translation Office of the Associate Director for Science Centers for Disease Control and Prevention (CDC) 404-498-6154 Century Bldg 2400 Atlanta, GA 30329 -O

Re: Implementing lookups while importing data

2010-07-29 Thread Gora Mohanty
On Thu, 29 Jul 2010 12:30:50 +0200 Chantal Ackermann wrote: > Hi Gora, > > your suggestion is good. > > Two thoughts: > 1. if both of the tables you are joining are in the same database > under the same user you might want to check why the join is so > slow. Maybe you just need to add an index

Re: Tree Faceting in Solr 1.4

2010-07-29 Thread Erik Hatcher
I use patch -p0, not -p1. But otherwise that looks the same as what I do. Can you try again with -p0 and see if it's still an issue? (or have you gotten past this and I've just not caught up with mails yet?) Erik On Jul 23, 2010, at 10:26 AM, Eric Grobler wrote: Hi Erik, I mus

Excluding large tokens from indexing

2010-07-29 Thread Paul Dlug
Is there a filter available that will remove large tokens from the token stream? Ideally something configurable to a character limit? I have a noisy data set that has some large tokens (in this case more than 50 characters) that I'd like to just strip. They're unlikely to ever match a user query an

Re: A German Question

2010-07-29 Thread Eric Grobler
Hi Christian, Thank you - sounds good - I will try that. regards Ericz On Thu, Jul 29, 2010 at 12:04 PM, Christian Vogler < christian.vog...@gmail.com> wrote: > On Thursday 29 of July 2010 14:00:21 Eric Grobler wrote: > > But faceting then looks like: > > molln > > munchen > > rossdorf > > >

search with special chars like € @ % §

2010-07-29 Thread Markus.Rietzler
hi, what is the best way to deal with searches with special chars like § (paragraph), € (euro), @ (at in emails), % and so forth. i think that the WordDelimiterFilters is working on such chars (on index-time and on query-time). the greatest problem i see is, that there can be an optional space

Re: Know which terms are in a document

2010-07-29 Thread Max Lynch
Yea, I've had mild success with the highlighting approach with lucene, but wasn't sure if there was another method available from solr. Thanks Mike. On Thu, Jul 29, 2010 at 5:17 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > This is a fairly frequently requested and missing feature

Facets on multiple values

2010-07-29 Thread Shishir Jain
Hi, I am using Solr facets for my data and have a field that holds multiple values; I am using ";" to delimit those values. So after a Solr search it returns a facet array, but the facet values contain ";". I want the facet to return each as a separate value. E.g. I am using th

AW: Facets on multiple values

2010-07-29 Thread Bastian Spitzer
Just define the keyword field as multivalued and add the keywords separately, not as a single-valued string. Cheers. -Original Message- From: Shishir Jain [mailto:shishir.j...@gmail.com] Sent: Thursday, 29 July 2010 17:10 To: solr-user@lucene.apache.org Subject: Facets on mul
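
The advice above amounts to splitting the delimited string on the client side before the values are sent to a multivalued field. A minimal Python sketch of that step follows; it does not talk to Solr, and the function names and sample keywords are made up for illustration.

```python
from collections import Counter


def split_values(raw, delimiter=";"):
    """Split one delimited string into clean, non-empty values."""
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


def facet_counts(raw_values, delimiter=";"):
    """Count in how many documents each individual value occurs.

    Each element of raw_values stands for one document's delimited field.
    """
    counts = Counter()
    for raw in raw_values:
        # set() so a value repeated within one document is counted once
        counts.update(set(split_values(raw, delimiter)))
    return counts


if __name__ == "__main__":
    print(split_values("solr; lucene;;search "))
    print(facet_counts(["solr;lucene", "solr"]))
```

Once each value is added separately, faceting on the field returns one entry per keyword instead of one entry per delimited string.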

Re: Tree Faceting in Solr 1.4

2010-07-29 Thread Eric Grobler
Hi Erik, Thanks, -p1 vs -p0 must then be the issue. On Thu, Jul 29, 2010 at 2:32 PM, Erik Hatcher wrote: > I use patch -p0, not -p1. But otherwise that looks the same as what I do. > > Can you try again with -p0 and see if it's still an issue? (or have you > gotten past this and I've just not

Re: Excluding large tokens from indexing

2010-07-29 Thread Chantal Ackermann
This is probably what you want? http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LengthFilterFactory On Thu, 2010-07-29 at 15:44 +0200, Paul Dlug wrote: > Is there a filter available that will remove large tokens from the > token stream? Ideally something configurable to a chara
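
To make the effect of such a length filter concrete, here is a small Python sketch of the idea: tokens outside a minimum and maximum length are dropped from the stream. This is a toy model and not the Solr implementation; the default of 50 characters is taken from the question, and the function name is hypothetical.

```python
def length_filter(tokens, min_len=1, max_len=50):
    """Keep only tokens whose length lies within [min_len, max_len]."""
    return [token for token in tokens if min_len <= len(token) <= max_len]


if __name__ == "__main__":
    noisy = ["solr", "x" * 60, "index"]
    print(length_filter(noisy))
```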

Re: AW: Facets on multiple values

2010-07-29 Thread Shawn Heisey
I'm developing a new schema that includes something similar. The DIH database select statement uses a left join to gather a set of values for each main record into a new field, separated by semicolons. I put the result into a fieldType with the following analyzer chain, which breaks it up in

Re: logic required for newbie

2010-07-29 Thread rajini maski
Yes, the above solution would help. :) You can specify something like http://localhost:8090/solr/select?indent=on&start=0&rows=10&q=landmark:landmark4&fl=landmark,user_id This will give you only the landmark and user_id fields for each result. And in the Solr console, under the Full Interface option, you can try
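
A request like the one above can also be assembled programmatically. The following Python sketch builds the same URL from its parts; the helper name build_select_url is hypothetical, and the host, port and field names are simply those from the example in this message.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields, start=0, rows=10):
    """Build a select URL that limits the returned stored fields via fl."""
    params = {
        "indent": "on",
        "start": start,
        "rows": rows,
        "q": query,
        "fl": ",".join(fields),  # comma-separated list of fields to return
    }
    # Keep ':' and ',' unescaped so the URL stays readable
    return base_url + "?" + urlencode(params, safe=":,")


if __name__ == "__main__":
    print(build_select_url(
        "http://localhost:8090/solr/select",
        "landmark:landmark4",
        ["landmark", "user_id"],
    ))
```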

wildcard and proximity searches

2010-07-29 Thread Frederico Azeiteiro
Hi, What approach should I use to perform wildcard and proximity searches? Like: "solr mail*"~10 For getting docs where solr is within 10 words of "mailing" for instance? Thanks, Frederico

Spatial filtering with sfilt - how can it be done?

2010-07-29 Thread Marian Steinbach
Hi! I am trying to get spatial filtering to work, but have some trouble while trying to understand what the status is. The version of solr I'm using is a nightly build of 21-Jun-2010. I have this field definition in my schema: And I also have the type "location": When I query my docs,

Re: Solr searching performance issues, using large documents

2010-07-29 Thread Peter Spam
Any ideas? I've got 5000 documents with an average size of 850k each, and it sometimes takes 2 minutes for a query to come back when highlighting is turned on! Help! -Pete On Jul 21, 2010, at 2:41 PM, Peter Spam wrote: > From the mailing list archive, Koji wrote: > >> 1. Provide another fi

Re: Facets on multiple values

2010-07-29 Thread Gora Mohanty
On Thu, 29 Jul 2010 20:39:57 +0530 Shishir Jain wrote: > Hi, > > Am using Solr facets for my data and have a field which has > multiple values in its field am using ";" to delimit those > values. So after doing a solr search it returns me a facet array > but that contains ";" in the facet value.

solr with tomcat basic authentication

2010-07-29 Thread KC Braunschweig
Using the tomcat-users.xml and web.xml changes at the link below, I was able to setup a default ubuntu server tomcat/solr install to require a simple login to access solr (the admin console, updates, etc). http://blog.comtaste.com/2009/02/securing_your_solr_server_on_t.html What I want to do is r

Re: AW: Facets on multiple values

2010-07-29 Thread Chris Hostetter
: ... : Whether to use this idea or Bastian's depends on how the original data source : is organized. it also depends on what you want to get *out* if this is a stored field ... using an analyzer like this will deal with letting you facet on the individual terms, but the stored value re

Re: Solr 1.4.1 field collapse

2010-07-29 Thread Chris Hostetter
: I read somewhere that Solr 1.4.1 has field collapse support by default : (without patching it) but I haven't been able to confirm it. Is this : true? No. If field collapsing (or any feature) was included in an official release, it would be listed in the CHANGES.txt for that release. -Hoss

Re: logic required for newbie

2010-07-29 Thread Chris Hostetter
: Actually I am getting result. But I am getting all column of the rows. I : want to remove unnecessary column. : In case of q=piza hut.. then I want to get only piza : hut. you should denormalize your data more -- if each "id" has more than one landmark associated with it, but for any given sea

Re: Excluding large tokens from indexing

2010-07-29 Thread Paul Dlug
Thanks, that's exactly what I was looking for, not sure how I missed it. On Thu, Jul 29, 2010 at 11:28 AM, Chantal Ackermann wrote: > This is probably what you want? > > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LengthFilterFactory > > > > On Thu, 2010-07-29 at 15:44 +0200,

Re: WordDelimiterFilter and phrase queries?

2010-07-29 Thread Drew Farris
On Wed, Jul 28, 2010 at 10:25 PM, Chris Hostetter wrote: > > : Say someone enters the query string 3-diphenylpropanoic > : > : The query parser I'm using transforms this into a phrase query and the > : indexed form is missed because based the positions of the terms '3' > : and 'diphenylpropanoic'

Re: Solr searching performance issues, using large documents

2010-07-29 Thread dc tech
Are you storing the entire log file text in SOLR? That's almost 3gb of text that you are storing in the SOLR. Try to 1) Is this first time performance or on repeat queries with the same fields? 2) Optimize the index and test performance again 3) index without storing the text and see what the perfor

Re: AW: Facets on multiple values

2010-07-29 Thread Shawn Heisey
On 7/29/2010 12:18 PM, Chris Hostetter wrote: it also depends on what you want to get *out* if this is a stored field ... using an analyzer like this will deal with letting you facet on the individual terms, but the stored value returned with each document will still be a single semi-colon sepera

Re: AW: Facets on multiple values

2010-07-29 Thread Chris Hostetter
: My initial approach was to grab the values (which are in another table) with a : DIH subentity and store them in a multivalued field, but that reduced index : speed to a crawl. That's because instead of one query for the entire import, : it was making an individual subquery for every document r

Re: WordDelimiterFilter and phrase queries?

2010-07-29 Thread Chris Hostetter
: > typically for fields where you are using WDF with the "concat" options : > you would usually use a bit of slop on the generated phrase queries to : > allow for the loosenes of the position information. : : Ahh, ok I see dismax supports this with the ps= parameter. Thanks, for dismax, the par

myField:value does not seem to work

2010-07-29 Thread Khai Doan
Hello, My name is Khai. I am new to Solr, and I am having a strange issue. I use the admin interface and search for "Khai" and it works fine. However, if I type membername:Khai it does not work. Please provide me with hints on what the issue may be. Thank you, Khai

Re: question about relevance

2010-07-29 Thread Chris Hostetter
: 1. There are user records of type A, B, C etc. (userId field in index is : common to all records) : 2. A user can have any number of A, B, C etc (e.g. think of A being a : language then user can know many languages like french, english, german etc) : 3. Records are currently stored as a document

Re: WordDelimiterFilter and phrase queries?

2010-07-29 Thread Drew Farris
On Thu, Jul 29, 2010 at 3:15 PM, Chris Hostetter wrote: > > for dismax, the param you probably want to focus on is "qs" which is the > slop associated with phrase queries which are generated by the main query > from the "qf" query fields ... "ps" is the slop associated with the single > score boos
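
For reference, a dismax request that sets the "qs" slop described above could be put together as in this Python sketch. The field name "name" and the slop value are placeholders, and the helper is hypothetical; only the parameter names defType, q, qf and qs are meant to be Solr's.

```python
from urllib.parse import urlencode


def dismax_query_string(user_query, query_fields, query_slop):
    """Build a dismax query string with phrase slop (qs) for the qf fields."""
    params = {
        "defType": "dismax",
        "q": user_query,
        "qf": " ".join(query_fields),  # fields searched by the main query
        "qs": query_slop,              # slop for phrase queries built from q
    }
    return urlencode(params)


if __name__ == "__main__":
    print(dismax_query_string("3-diphenylpropanoic", ["name"], 1))
```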

Re: AW: Facets on multiple values

2010-07-29 Thread Shawn Heisey
On 7/29/2010 1:13 PM, Chris Hostetter wrote: : My initial approach was to grab the values (which are in another table) with a : DIH subentity and store them in a multivalued field, but that reduced index : speed to a crawl. That's because instead of one query for the entire import, : it was mak

Re: Integration Problem

2010-07-29 Thread Chris Hostetter
: I tried to do that with a custom query handler and a custom response : writer and i'm able to write in the response msg of solr but only in the : response node of the xml msg and not in the results node. i would strongly advise against trying to modify the block in any way -- that will only ca

advice on creating a solr index when data source is from many unrelated db tables

2010-07-29 Thread S Ahmed
I understand (and it's straightforward) when you want to create an index for something simple like Products. But how do you go about creating a Solr index when you have data coming from 10-15 database tables, and the tables have unrelated data? The issue is then you would have many 'columns' in you

Re: myField:value does not seem to work

2010-07-29 Thread Yonik Seeley
Is membername an indexed field in the schema, and was it populated with something that would match "Khai"? If so, what is the fieldType in the schema for the membername field? -Yonik http://www.lucidimagination.com On Thu, Jul 29, 2010 at 3:17 PM, Khai Doan wrote: > Hello, > > My name is Khai.

Re: myField:value does not seem to work

2010-07-29 Thread Khai Doan
Hi Yonik, Here is the field definition in schema.xml: and it is populated with "Khai Bright T" I am using solr 1.4.1 Khai On Thu, Jul 29, 2010 at 12:49 PM, Yonik Seeley wrote: > Is membername an indexed field in the schema, and was it populated > with something that would match "Khai"? > If

Re: advice on creating a solr index when data source is from many unrelated db tables

2010-07-29 Thread Geert-Jan Brits
I can interpret your question in 2 different ways: 1. Do you want to index several heterogeneous documents all coming from different tables? So documents of type "tableA" are created and indexed alongside documents of type "tableB", "tableC", etc. 2. Do you want to combine unrelated data from 15 ta

Re: myField:value does not seem to work

2010-07-29 Thread Mats Bolstad
Type string is not tokenized, meaning that it would match only the exact phrase "Khai Bright T". Use text (or another) type that tokenizes (on whitespace in this case) instead. Mats Bolstad On Thu, Jul 29, 2010 at 9:55 PM, Khai Doan wrote: > Hi Yonik, > > Here is the field definition in schema
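
The difference described above can be modelled in a few lines of Python. This is a deliberately simplified sketch: the "text" side only splits on whitespace and lowercases, which is an assumption made for illustration and not Solr's actual analysis chain.

```python
def string_field_matches(stored_value, query_term):
    """Untokenized field: only the exact, complete value matches."""
    return stored_value == query_term


def text_field_matches(stored_value, query_term):
    """Toy tokenized field: whitespace-split, lowercased terms are matched."""
    indexed_terms = {token.lower() for token in stored_value.split()}
    return query_term.lower() in indexed_terms


if __name__ == "__main__":
    value = "Khai Bright T"
    print(string_field_matches(value, "Khai"))
    print(string_field_matches(value, "Khai Bright T"))
    print(text_field_matches(value, "Khai"))
```

In this model membername:Khai fails against the untokenized value but succeeds once the value is split into terms, which mirrors the behaviour reported in this thread.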

Nabble problems?

2010-07-29 Thread kenf_nc
The Nabble.com page for Solr - User seems to be broken. I haven't seen an update on it since early this morning. However I'm still getting email notifications so people are seeing and responding to posts. I'm just curious, are you just using email and responding to solr-u...@lucene.apache.org? Or

Re: myField:value does not seem to work

2010-07-29 Thread Khai Doan
What are the differences between "string" and "text"? What other types (that are available by default) can I use? Thanks, Khai On Thu, Jul 29, 2010 at 1:30 PM, Mats Bolstad wrote: > Type string is not tokenized, meaning that it would match only the > exact phrase "Khai Bright T". Use text (or

Re: Nabble problems?

2010-07-29 Thread MitchK
I got some problems with Nabble, too. Nabble sends some warnings that my posts are still pending to the mailing-list, while people were already answering to my initial questions. Did you send a message to the nabble-support? Kind regards, - Mitch kenf_nc wrote: > > The Nabble.com page for Sol

Re: myField:value does not seem to work

2010-07-29 Thread Mats Bolstad
Put simply, strings do not go through filters, and will need exact matching. A string field can typically be an ID field. Texts go through filters so that "bar" could match "Foo Bars", for example. Types are well documented in the example schema.xml shipped with solr. You would also find more info

Re: myField:value does not seem to work

2010-07-29 Thread Khai Doan
Thank you all. Khai On Thu, Jul 29, 2010 at 3:38 PM, Mats Bolstad wrote: > Put simply, strings do not go through filters, and will need exact > matching. A string field can typically be an ID field. > Texts go through filters so that "bar" could match "Foo Bars", for example. > > Types are well

Re: Is there a cache for a query?

2010-07-29 Thread Chris Hostetter
: I want a cache to cache all result of a query(all steps including : collapse, highlight and facet). I read : http://wiki.apache.org/solr/SolrCaching, but can't find a global : cache. Maybe I can use external cache to store key-value. Is there any : one in solr? One of Solr's design principles

Re: Solr searching performance issues, using large documents

2010-07-29 Thread Peter Spam
If I don't do highlighting, it's really fast. Optimize has no effect. -Peter On Jul 29, 2010, at 11:54 AM, dc tech wrote: > Are you storing the entire log file text in SOLR? That's almost 3gb of > text that you are storing in the SOLR. Try to > 1) Is this first time performance or on repaat que