Re: Sharded Index Creation Magic?

2009-07-14 Thread Shalin Shekhar Mangar
On Tue, Jul 14, 2009 at 2:00 AM, Nick Dimiduk wrote: > However, when I search across all > deployed shards using the &shards= query parameter ( > > http://host00:8080/solr/select?shards=host00:8080/solr,host01:8080/solr&q=body > \%3A%3Aterm), > I get a NullPointerException: > > java.lang.NullPoin

Re: Availability during merge

2009-07-14 Thread Shalin Shekhar Mangar
On Tue, Jul 14, 2009 at 2:30 AM, Charlie Jackson wrote: > The wiki page for merging solr cores > (http://wiki.apache.org/solr/MergingSolrIndexes) mentions that the cores > being merged cannot be indexed to during the merge. What about the core > being merged *to*? In terms of the example on the w

Re: Can't limit return fields in custom request handler

2009-07-14 Thread Osman İZBAT
Thank you very much Chris. Regards. On Mon, Jul 13, 2009 at 4:30 AM, Chris Hostetter wrote: > > : Query filter = new TermQuery(new Term("inStores", "true")); > > that will work if "inStores" is a TextField or a StrField and it's got the > term "true" indexed in it ... but if it's a B

Custom functionality in SolrIndexSearcher

2009-07-14 Thread Marc Sturlese
Hey there. I needed functionality similar to adjacent-field-collapsing, but instead of making the docs disappear I just wanted to put them at the end of the list (ids array). At the moment, I am just experimenting with ways to obtain the shortest response time. Probably will not be able to use my solu

Using Multiple fields in UniqueKey

2009-07-14 Thread Anand Kumar Prabhakar
Is there any possibility of adding multiple fields to the uniqueKey in schema.xml (an implementation similar to a compound primary key)?

Re: Implementing Solr for the first time

2009-07-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Tue, Jul 14, 2009 at 1:33 AM, Kevin Miller wrote: > I am new to Solr and trying to get it set up to index files from a > directory structure on a server.  I have a few questions. > > 1.) Is there an application that will return the search results in a > user friendly format? isn't the xml respon

Re: Faceting

2009-07-14 Thread gwk
Well, I had a bit of a facepalm moment when thinking about it a little more, I'll just show a "more countries [Y selected]" where Y is the number of countries selected which are not in the top X. If you want a nice concise interface you'll just have to enable javascript. With my earlier adventu

Re: Distributed Search in Solr

2009-07-14 Thread Sumit Aggarwal
Hi Grant, What I have gathered from your comments is: 1. We will have to add support for BoostingTermQuery, which extends SpanTermQuery, as in Lucene's payload support. In the current world we anyway have another class which extends SpanTermQuery. Where should I put this class or newly built BoostingTe

Data Import ID Problem

2009-07-14 Thread Chris Masters
Hi All, I have a problem when importing data using the data import handler. I import documents from multiple tables so table.id is not unique - to get round this I concatenate the type like this:             When searching it seems the CONCATted string is turned i

RE: Implementing Solr for the first time

2009-07-14 Thread Kevin Miller
I am needing to index primarily .doc files but also need it to look at .pdf and .xls files. I am currently looking at the Tika project for this functionality. Kevin Miller Web Services -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Tuesday, July 1

support for Payload Feature of lucene in solr

2009-07-14 Thread Sumit Aggarwal
Hi, I am new to Solr and trying to explore payloads in Solr, but I haven't had any success so far. In one of the threads Grant mentioned that Solr has DelimitedPayloadTokenFilter, which can store payloads at index time. But to search on it we will require an implementation of BoostingTermQuery exte

TooManyOpenFiles: indexing in one core, doing many searches at the same time in another

2009-07-14 Thread Bruno Aranda
Hi, We are having a TooManyOpenFiles exception in our indexing process. We are reading data from a database and indexing this data into one of the two cores of our solr instance. Each of the cores has a different schema as they are used for a different purpose. While we index in the first core, we

Re: Data Import ID Problem

2009-07-14 Thread Chris Masters
Sorry - The SolrJ snippet should read: SolrInputDocument doc = new SolrInputDocument(); doc.addField("id", myThing.getId() + TCSearch.SEARCH_TYPE_THING); doc.addField("dbid", myThing.getId()); - Original Message From: Chris Masters To: solr-user@lucene.apache.org Sent: Tuesday, Jul

Re: Spell checking: Is there a way to exclude words known to be wrong?

2009-07-14 Thread Erik Hatcher
Use the stopwords feature with a custom mispeled_words.txt and a StopFilterFactory on the spell check field ;) Erik On Jul 13, 2009, at 8:27 PM, Jay Hill wrote: We're building a spell index from a field in our main index with the following configuration: textSpell default spel

Re: TooManyOpenFiles: indexing in one core, doing many searches at the same time in another

2009-07-14 Thread Marc Sturlese
Setting: <mergeFactor>2</mergeFactor> may help. Have you tried it? Indexing will be a bit slower, but optimizing will be faster. You can check with lsof to see how many files Jetty/Tomcat (or the server you are using) is holding. Bruno Aranda wrote: > > Hi, > > We are having a TooManyOpenFiles exception in our indexing p

Re: TooManyOpenFiles: indexing in one core, doing many searches at the same time in another

2009-07-14 Thread Mark Miller
What merge factor are you using now? The merge factor will influence the number of files that are created as the index grows. Lower = fewer file descriptors needed, but also slower bulk indexing. You could up the Max Open Files settings on your OS. You could also use true Which writes mu

Re: Implementing Solr for the first time

2009-07-14 Thread Erik Hatcher
On Jul 14, 2009, at 8:00 AM, Kevin Miller wrote: I am needing to index primarily .doc files but also need it to look at .pdf and .xls files. I am currently looking at the Tika project for this functionality. This is now built into trunk (aka Solr 1.4): http://wiki.apache.org/solr/Extracting

Anyone working on adapting AnalyzingQueryParser to solr?

2009-07-14 Thread Bill Dueber
The lucene class AnalyzingQueryParser does exactly what I need it to do, but I need to do it in Solr. I took a look at trying to subclass QParser, and it's clear I'm not smart enough. :-) Is anyone else looking at this? -Bill- -- Bill Dueber Library Systems Programmer University of Michigan L

wt=json Not setting application/json response headers but text/plain. How to fix?

2009-07-14 Thread Julian Davchev
Hi folks, I see that when calling wt=json I get a JSON response, but the headers are text/plain, which totally bugs me. I would rather expect application/json response headers. Any pointers on how I can fix this are more than welcome.
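
Until the server-side header changes, a client can simply parse the body as JSON whatever media type is reported. The sketch below is only a client-side workaround under assumptions: the `parse_solr_json` helper and the sample body are hypothetical, and it does nothing to change what Solr sends.

```python
import json

def parse_solr_json(body, content_type="text/plain; charset=UTF-8"):
    """Decode a wt=json body, honoring the header's charset but ignoring its media type."""
    charset = "utf-8"
    # Look for a charset parameter after the media type, e.g. "text/plain; charset=UTF-8".
    for part in content_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset" and value:
            charset = value.strip('"')
    return json.loads(body.decode(charset))

# Hypothetical response body, shaped like a wt=json result.
sample = b'{"response": {"numFound": 1, "docs": [{"id": "42"}]}}'
print(parse_solr_json(sample)["response"]["numFound"])  # 1
```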

Re: Spell checking: Is there a way to exclude words known to be wrong?

2009-07-14 Thread Shalin Shekhar Mangar
On Tue, Jul 14, 2009 at 6:37 PM, Erik Hatcher wrote: > Use the stopwords feature with a custom mispeled_words.txt and a > StopFilterFactory on the spell check field ;) > > Very cool! :) -- Regards, Shalin Shekhar Mangar.

Re: Implementing Solr for the first time

2009-07-14 Thread Erik Hatcher
On Jul 14, 2009, at 5:35 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: On Tue, Jul 14, 2009 at 1:33 AM, Kevin Miller wrote: I am new to Solr and trying to get it set up to index files from a directory structure on a server. I have a few questions. 1.) Is there an application that will return the s

Re: TooManyOpenFiles: indexing in one core, doing many searches at the same time in another

2009-07-14 Thread Bruno Aranda
Hi, my process is: I index 60 docs in the secondary core (each doc has 5 fields). No problem with that. After this core is indexed (and optimized) it will be used only for searches, during the main core indexing. Currently, I am using mergeFactor 10 for the main core. I will try with 2 to se

Guide to using SolrQuery object

2009-07-14 Thread Reuben Firmin
Hi, It seems that SolrQuery is a better API than the basic ModifiableSolrParams, but I can't make it work. Constructing params with: final ModifiableSolrParams params = new ModifiableSolrParams(); params.set("q", queryString); ...results in a successful search. Constructing SolrQuery with:

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Toby Cole
I am new to Solr and trying to explore payloads in Solr, but I haven't had any success so far. In one of the threads Grant mentioned that Solr has DelimitedPayloadTokenFilter, which can store payloads at index time. But to search on it we will require an implementation of BoostingTermQuery exten

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
Right now Solr does not support indexing/retrieving payloads. Probably this can be taken up as an issue. On Tue, Jul 14, 2009 at 5:41 PM, Sumit Aggarwal wrote: > Hi, > As i am new to solr and trying to explore payloads in solr but i haven't got > any success on that. In one of the thread Grant ment

Re: Data Import ID Problem

2009-07-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
DIH is getting the field as a byte[]? Which DB and which driver are you using? On Tue, Jul 14, 2009 at 4:46 PM, Chris Masters wrote: > > Hi All, > > I have a problem when importing data using the data import handler. I import > documents from multiple tables so table.id is not unique - to

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Walter Underwood
That doesn't require payloads. I was doing that with Solr 1.1. Define two fields, stemmed and exact, with different analyzer chains. Use copyField to load the same info into both. With the dismax handler, search both fields with a higher boost on the exact field. wunder On 7/14/09 7:39 AM, "Toby
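
The query side of this suggestion could be sketched roughly as follows. This assumes the two fields are literally named `exact` and `stemmed`, a boost of 2 on the exact field, a hypothetical local base URL, and a Solr version that accepts `defType=dismax` (older setups select the handler with `qt=dismax` instead). `qf` is the dismax parameter that lists the fields to search and their boosts.

```python
from urllib.parse import urlencode

def dismax_url(base_url, user_query, exact_boost=2.0):
    """Build a dismax select URL that searches both fields, favoring exact matches."""
    params = {
        "defType": "dismax",
        "q": user_query,
        # qf lists the fields to search; ^N raises the weight of the exact field.
        "qf": f"exact^{exact_boost:g} stemmed",
    }
    return f"{base_url}/select?{urlencode(params)}"

# Hypothetical host and query, for illustration only.
print(dismax_url("http://localhost:8983/solr", "running shoes"))
```

The boost value is something to tune against real queries; 2 is only a starting point.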

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Sumit Aggarwal
Hi Walter, I do have a search server where I have implemented things using the payload feature itself. These days I am evaluating Solr to get rid of my own search server. For that I need the payloads feature in Solr itself. I raised a related question and got a message from Grant: "I added a new De

Re: Data Import ID Problem

2009-07-14 Thread Chris Masters
MySQL -> com.mysql.jdbc.Driver (mysql-connector-java-5.1.7.jar). mysql concat -> http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_concat Fix is to use CAST like: SELECT CONCAT(CAST(THING.ID AS CHAR),TYPE) AS INDEX_ID... Thanks for the nudge 'Noble Paul'! - Original

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Sumit Aggarwal
Hey Noble, any comments on Grant's suggestion? Thanks, -Sumit On Tue, Jul 14, 2009 at 8:40 PM, Sumit Aggarwal wrote: > Hi Walter, > I do have a search server where i have implemented things using payload > feature itself. These days i am evaluating solr to get rid of my own search > server. For th

Re: Sharded Index Creation Magic?

2009-07-14 Thread Nick Dimiduk
I do, but you raise an interesting point. I had named the field incorrectly. I'm a little puzzled as to why individual search worked with the broken field name, but now all is well! On Tue, Jul 14, 2009 at 12:03 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Tue, Jul 14, 2009 at

Re: Sharded Index Creation Magic?

2009-07-14 Thread Shalin Shekhar Mangar
On Tue, Jul 14, 2009 at 10:30 PM, Nick Dimiduk wrote: > I do, but you raise an interesting point. I had named the field > incorrectly. > I'm a little puzzled as to why individual search worked with the broken > field name, but now all is well! > > An individual Solr uses uniqueKey only for replac

Re: wt=json Not setting application/json response headers but text/plain. How to fix?

2009-07-14 Thread Avlesh Singh
Take a look at https://issues.apache.org/jira/browse/SOLR-1123 Don't stop yourself from voting for the issue :) Cheers Avlesh On Tue, Jul 14, 2009 at 7:01 PM, Julian Davchev wrote: > Hi folks > I see that when calling wt=json I get json response but headers are > text/plain which totally bugs m

Re: Availability during merge

2009-07-14 Thread Jason Rutherglen
Kind of regrettable, I think we can look at changing this in Lucene. On Tue, Jul 14, 2009 at 12:08 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Tue, Jul 14, 2009 at 2:30 AM, Charlie Jackson < > charlie.jack...@cision.com > > wrote: > > > The wiki page for merging solr cores > >

Wikipedia or reuters like index for testing facets?

2009-07-14 Thread Jason Rutherglen
Is there a standard index like what Lucene uses for contrib/benchmark for executing faceted queries over? Or maybe we can randomly generate one that works in conjunction with wikipedia? That way we can execute real world queries against faceted data. Or we could use the Lucene/Solr mailing lists an

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Shalin Shekhar Mangar
It would be nice if you could tell us why you need payloads. There may be other ways of solving your problem than adding payload support to Solr. Anyway, I don't see payload support before 1.5. On Tue, Jul 14, 2009 at 10:07 PM, Sumit Aggarwal wrote: > Hey Nobel, > Any comments on Grant suggestion. > > Thanks,

Re: Wikipedia or reuters like index for testing facets?

2009-07-14 Thread Mark Miller
On Tue, Jul 14, 2009 at 3:36 PM, Jason Rutherglen < jason.rutherg...@gmail.com> wrote: > Is there a standard index like what Lucene uses for contrib/benchmark for > executing faceted queries over? Or maybe we can randomly generate one that > works in conjunction with wikipedia? That way we can exe

Multicore Solr (trunk) creates extra dirs

2009-07-14 Thread Otis Gospodnetic
Hello, I just built solr.war from trunk and deployed it to a multicore solr server whose solr.xml looks like this: Each core has conf and data/index dirs under its instanceDir. e.g. $ tree /mnt/solrhome/cores/core0 cores/core0 |-- conf | |-- schema.xml -> ../../../conf/sch

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Grant Ingersoll
The TokenFilterFactory side is trivial for the DelimitedPayloadTokenFilter. That could be in for 1.4. In fact, there is an automated way to generate the stubs that should be run in preparing for a release. I'll see if I can find a minute or two to make that happen. For query support, I

Re: Wikipedia or reuters like index for testing facets?

2009-07-14 Thread Grant Ingersoll
At a min, it is trivial to use the EnWikiDocMaker and then send the doc over SolrJ... On Jul 14, 2009, at 4:07 PM, Mark Miller wrote: On Tue, Jul 14, 2009 at 3:36 PM, Jason Rutherglen < jason.rutherg...@gmail.com> wrote: Is there a standard index like what Lucene uses for contrib/ benchmark

Re: Wikipedia or reuters like index for testing facets?

2009-07-14 Thread Jason Rutherglen
You think enwiki has enough data for faceting? On Tue, Jul 14, 2009 at 2:56 PM, Grant Ingersoll wrote: > At a min, it is trivial to use the EnWikiDocMaker and then send the doc over > SolrJ... > > On Jul 14, 2009, at 4:07 PM, Mark Miller wrote: > >> On Tue, Jul 14, 2009 at 3:36 PM, Jason Ruthergle

JMX monitoring for multiple SOLR instances

2009-07-14 Thread J G
Hi, If I want to run multiple SOLR war files in Tomcat, is it possible to monitor each of the SOLR instances individually through JMX? Has anyone attempted this before? Also, what are the implications (e.g. performance) of running multiple SOLR instances in the same Tomcat server? Thanks.

Re: Multicore Solr (trunk) creates extra dirs

2009-07-14 Thread Otis Gospodnetic
Hi, Paul and Shalin will know about this. What I'm seeing looks a lot like what Walter reported in March: * http://markmail.org/thread/dfsj7hqi5buzhd6n And this commit from Paul seems possibly related: * http://markmail.org/message/cjvjffrfszlku3ri ...because of things like: -cores =

Re: Using Multiple fields in UniqueKey

2009-07-14 Thread Otis Gospodnetic
Some ideas: - Use copyField to copy fields to the field designated as the uniqueKey (not sure if this will work) - Create the field from existing data before sending docs to Solr - Create a custom UpdateRequestProcessor that adds a field for each document it processes and stuffs it with other f
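
The second idea (creating the key from existing data before sending docs to Solr) might look like the sketch below. The field names, the `|` separator, and the `with_composite_key` helper are all hypothetical; the only point is that the client joins the component values into one string and sends that as the uniqueKey field.

```python
def with_composite_key(doc, key_fields, key_name="id", sep="|"):
    """Return a copy of doc with key_name set to the joined values of key_fields."""
    missing = [f for f in key_fields if f not in doc]
    if missing:
        raise KeyError(f"missing key fields: {missing}")
    out = dict(doc)  # leave the caller's document untouched
    out[key_name] = sep.join(str(doc[f]) for f in key_fields)
    return out

# Hypothetical document: the key is built from the source table and the row id.
doc = with_composite_key({"table": "thing", "dbid": 42, "title": "x"}, ["table", "dbid"])
print(doc["id"])  # thing|42
```

Pick a separator that cannot occur in the component values, otherwise two different combinations could collapse into the same key.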

Re: Solr 1.4 Release Date

2009-07-14 Thread Otis Gospodnetic
I just looked at SOLR JIRA today and saw some 40 open issues marked for 1.4, so Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: pof > To: solr-user@lucene.apache.org > Sent: Tuesday, July 14, 2009 12:37:33 AM > Subject: Re: Solr 1.4

Re: Wikipedia or reuters like index for testing facets?

2009-07-14 Thread Grant Ingersoll
Probably not as generated by the EnwikiDocMaker, but the WikipediaTokenizer in Lucene can pull out richer syntax which could then be Teed/Sinked to other fields. Things like categories, related links, etc. Mostly, though, I was just commenting on the fact that it isn't hard to at least us

Re: Wikipedia or reuters like index for testing facets?

2009-07-14 Thread Mark Miller
Why don't you just randomly generate the facet data? That's probably the best way, right? You can control the uniques and ranges. On Wed, Jul 15, 2009 at 1:21 AM, Grant Ingersoll wrote: > Probably not as generated by the EnwikiDocMaker, but the WikipediaTokenizer > in Lucene can pull out richer syntax
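
A rough sketch of what randomly generated facet data could look like, with the number of unique values and the numeric range under the caller's control. The field names (`category`, `price`), the `random_facet_docs` helper, and the default range are assumptions for illustration; the resulting dicts would still need to be sent to Solr by whatever client is in use.

```python
import random

def random_facet_docs(num_docs, cardinality, price_range=(1, 1000), seed=0):
    """Generate docs whose 'category' has at most `cardinality` unique values."""
    rng = random.Random(seed)  # fixed seed so a benchmark run can be repeated
    categories = [f"cat_{i}" for i in range(cardinality)]
    lo, hi = price_range
    return [
        {"id": str(i), "category": rng.choice(categories), "price": rng.randint(lo, hi)}
        for i in range(num_docs)
    ]

docs = random_facet_docs(1000, cardinality=5)
print(len({d["category"] for d in docs}))  # number of distinct facet values actually drawn
```

Uniform sampling is the simplest choice; real facet fields are often heavily skewed, so a weighted draw may be closer to production data.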

Re: support for Payload Feature of lucene in solr

2009-07-14 Thread Sumit Aggarwal
Hi Shalin, Our requirement is to have rolling window support for the popularity of catalog items for, say, 3 months. What we used to do is add term,value pairs as tokens, where term is some unique string for each day and value is the popularity count for that day. Once indexing this data as token stream an

Re: grouping and sorting by facet?

2009-07-14 Thread Chris Hostetter
: Is there a way to group and sort by facet count? I have a large set of : images, each of which is part of a different "collection." I am performing : a faceted search: : : /solr/select/?q=my+term&max=30&version=2.2&rows=30&start=0&facet=true&facet.field=collection&facet.sort=true : : I woul

Segments_2 and segments.gen under Index folder and spellchecker1, spellchecker2, spellcheckerFile folder

2009-07-14 Thread Francis Yakin
I just upgraded our Solr to 1.3.0. After I deployed the Solr apps, I noticed there are: Segments_2 and segments.gen, and there are 3 folders: spellchecker1, spellchecker2 and spellcheckerFile. What are these for? When I deleted them, I need to bounce the apps again and it will generate the new ones aga

Re: Segments_2 and segments.gen under Index folder and spellchecker1, spellchecker2, spellcheckerFile folder

2009-07-14 Thread Shalin Shekhar Mangar
On Wed, Jul 15, 2009 at 8:46 AM, Francis Yakin wrote: > > I just upgraded our solr to 1.3.0 > > After I deployed the solr apps, I noticed there are: > > Segments_2 and segments.gen and there are 3 folder spellchecker1, > spellchecker2 and spellcheckerFile > > What's these for? When I deleted them

DefaultSearchField ? "important"

2009-07-14 Thread Jörg Agatz
Hello users... and good morning (in Germany it is morning :-) ). I have a really important problem... My fields are really bad, like "CUPS_EBENE1_EBENE2_TASKS_CATEGORIE". I have no content field or something like this... So when I want to search for something, I need to search in ALL fields, but when I sear

spellcheck with misspelled words in index

2009-07-14 Thread Chris Williams
Hi, I'm having some trouble getting the correct results from the spellcheck component. I'd like to use it to suggest correct product titles on our site; however, some of our products have misspellings in them outside of our control. For example, there are 2 products with the misspelled word "cusine"