Re: Should I extend DIH to handle POST too?

2009-01-26 Thread Gunaranjan Chandraraju
Thanks, I guess I got the wrong impression initially. These classes extend the RequestHandlerBase. Since you mentioned we need to extend the ContentStreamBase, I was thinking that we need to create a new plugin for the ContentStream that would be picked up by the XmlUpdateRequestHandler (

Re: QParserPlugin

2009-01-26 Thread Erik Hatcher
Karl - where did you put your a.b.QParserPlugin? You should put it in /lib within a JAR file. I'm surprised you aren't seeing an error though. Erik On Jan 27, 2009, at 1:07 AM, Karl Wettin wrote: Hi forum, I'm trying to get QParserPlugin to work, I've got but still get Unknow

Re: Connection mismanagement in Solrj?

2009-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
are you making requests in parallel ? which ConnectionManager are you using for HttpClient? On Tue, Jan 27, 2009 at 11:58 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: > you can set any connection parameters for the HttpClient and pass on > the instance to CommonsHttpSolrServer and that will be used for

Re: Connection mismanagement in Solrj?

2009-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
correction wrong: make sure that you are not reusing instance of CommonsHttpSolrServer correct : make sure that you are reusing instance of CommonsHttpSolrServer On Tue, Jan 27, 2009 at 11:58 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: > you can set any connection parameters for the HttpClient and pa

Re: Connection mismanagement in Solrj?

2009-01-26 Thread Walter Underwood
I did that. Unless CommonsHttpSolrServer changes some defaults, it should be the same as my earlier code. I set max connections, max per host, retries, and timeouts for connect and read. We are seeing 20-40 open connections instead of a few. wunder On 1/26/09 10:28 PM, "Noble Paul നോബിള്‍ नोब्

Re: Connection mismanagement in Solrj?

2009-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
you can set any connection parameters for the HttpClient and pass on the instance to CommonsHttpSolrServer and that will be used for making requests make sure that you are not reusing instance of CommonsHttpSolrServer On Tue, Jan 27, 2009 at 10:59 AM, Walter Underwood wrote: > We just switched t

QParserPlugin

2009-01-26 Thread Karl Wettin
Hi forum, I'm trying to get QParserPlugin to work, I've got but still get Unknown query type 'myqueryparser' when I /solr/select/?defType=myqueryparser&q=foo There is no warning about myqueryparser from Solr at startup. I do however manage to get this working: So it shouldn't

Re: solr-duplicate post management

2009-01-26 Thread S.Selvam Siva
On Tue, Jan 27, 2009 at 5:03 AM, Chris Hostetter wrote: > > : Hi, i added some code to *DirectUpdateHandler2.java's doDeletions()* > (solr > : 1.2.0) ,and got the solution i wanted.(logging duplicate post entry-i.e > old > : field and new field of duplicate post) > : > : > :Document d1=sea

Connection mismanagement in Solrj?

2009-01-26 Thread Walter Underwood
We just switched to Solrj from a home-grown client and we have a huge jump in the number of connections to the server, enough that our load balancer was rejecting connections in production tonight. Does that sound familiar? We're running 1.3. I set the timeouts and connection pools to the same va

Re: Performance issue

2009-01-26 Thread Shalin Shekhar Mangar
2009/1/27 mahendra mahendra > > Is there any way I can get the results without restarting the server or > reloading cores from search machine. > A commit on the slave (searcher) will be needed to re-open the IndexReader and show the latest updates. Of course, that should be done after the commit o

Re: Should I extend DIH to handle POST too?

2009-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
Do not replace any update handler. There are already a handful of update handlers; just add another. Take a look at XmlUpdateRequestHandler, CSVRequestHandler, BinaryUpdateRequestHandler, etc. You can define them as a request handler, e.g.: On Tue, Jan 27, 2009 at 4:04 AM, Gunaranjan Chandraraju w

Re: URL-import field type?

2009-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Tue, Jan 27, 2009 at 4:47 AM, Chris Hostetter wrote: > > : I know I was able to imitate that in plain-lucene by crafting a particular > : analyzer-filter who was only given the URL as content and who gave further > the > : tokens of the stream. > > FWIW: while taking advantage of DIH and some

Re: Setting dataDir in multicore environment

2009-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
The behavior is expected: properties set in solr.xml are not implicitly used anywhere. You will have to use those variables explicitly in solrconfig.xml/schema.xml. Instead of hardcoding dataDir in solrconfig.xml, you can use it as a variable $$dataDir. BTW there is an issue (https://issues.apache.org

Re: Query Performance while updating teh index

2009-01-26 Thread oleg_gnatovskiy
Just to clarify - we do not optimize on the slaves at all. We only optimize on the master. hossman wrote: > > > : We do optimize the index before updates but we get tehse performance > issues > : even when we pull an empty snapshot. Thus even when our update is tiny, > the > : performance issue

Re: Query Performance while updating teh index

2009-01-26 Thread oleg_gnatovskiy
Just to calrify - we do not optimize on teh slaves at all. We only optimize on the master. hossman wrote: > > > : We do optimize the index before updates but we get tehse performance > issues > : even when we pull an empty snapshot. Thus even when our update is tiny, > the > : performance issue

Re: Query Performance while updating teh index

2009-01-26 Thread oleg_gnatovskiy
Just to calrify - we do not optimize on the slaves at all. We only optimize on the master. hossman wrote: > > > : We do optimize the index before updates but we get tehse performance > issues > : even when we pull an empty snapshot. Thus even when our update is tiny, > the > : performance issue

Performance issue

2009-01-26 Thread mahendra mahendra
Hi, I have configured Solr on two servers on different machines. I am using one server for indexing and the other for searching. Both servers point to a common data directory (shared drive). When the indexing server posts data to the shared drive (data directory),

Setting dataDir in multicore environment

2009-01-26 Thread Mark Ferguson
Hi, In my solr.xml file, I am trying to set the dataDir property the way it is described in the CoreAdmin page on the wiki: However, the property is being completely ignored. It is using whatever I have set in the solrconfig.xml file (or ./data, the default value, if I set nothing in that fi

Re: solr-duplicate post management

2009-01-26 Thread Chris Hostetter
: Hi, i added some code to *DirectUpdateHandler2.java's doDeletions()* (solr : 1.2.0) ,and got the solution i wanted.(logging duplicate post entry-i.e old : field and new field of duplicate post) : : :Document d1=searcher.doc(prev);//existing doc to be deleted :Document d

Datemath Now is UST or IST?

2009-01-26 Thread Kalidoss MM
Hi, We use Solr 1.3 and have indexed some of our date fields in the format '1995-12-31T23:59:59Z', which as we know is a UTC date. But we want to index the date in IST, which is +05:30 hours, so that an extra conversion from UTC to IST across all our applications is avoided. How can we do that? And we have
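For what it's worth, one option for the UTC/IST question above is to leave the indexed values in UTC and convert only at the edges of the application. A minimal sketch, assuming the timestamps always have the exact form quoted in the post; the helper names are hypothetical and this is client-side code, not a Solr feature:

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed +05:30 offset from UTC (no daylight saving time).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")
SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # the format quoted in the post


def solr_utc_to_ist(value: str) -> datetime:
    """Parse a UTC timestamp in the quoted format and return it in IST."""
    utc_dt = datetime.strptime(value, SOLR_FORMAT).replace(tzinfo=timezone.utc)
    return utc_dt.astimezone(IST)


def ist_to_solr_utc(local_dt: datetime) -> str:
    """Turn an IST datetime into the UTC string form used for indexing.

    A naive datetime is assumed (hypothetically) to already be in IST.
    """
    if local_dt.tzinfo is None:
        local_dt = local_dt.replace(tzinfo=IST)
    return local_dt.astimezone(timezone.utc).strftime(SOLR_FORMAT)
```

With this, the index stays in one time zone and only the display and input layers know about IST.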

Re: Query Performance while updating teh index

2009-01-26 Thread Chris Hostetter
: We do optimize the index before updates but we get tehse performance issues : even when we pull an empty snapshot. Thus even when our update is tiny, the : performance issues still happen. FWIW: this behavior doesn't make a lot of sense -- optimizing just before you are about to make updates/a

Re: URL-import field type?

2009-01-26 Thread Chris Hostetter
: I know I was able to imitate that in plain-lucene by crafting a particular : analyzer-filter who was only given the URL as content and who gave further the : tokens of the stream. FWIW: while taking advantage of DIH and some of its plugin APIs to deal with this is probably a better way to -- a

Re: stats.jsp - maxDoc and numDoc-help

2009-01-26 Thread Chris Hostetter
: 1)then i can think of that "maxDocs- numDocs " should be the maximum(upper : bound) duplicate post count so far,if i assume no other deletion happened : other than duplication deletion. not necessarily -- when Lucene merges segments (which can happen on any add) deletes get flushed from the se

RE: QTime in microsecond

2009-01-26 Thread Chris Hostetter
: The easiest way is to run maybe 100,000 or more queries and take an : average. A single microsecond value for a query would be incredibly : inaccurate. that can be useful for doing the timing externally, if you're interested in averaging over all X queries in a sequential batch, but it doesn't

Re: [ANN] Lucid Imagination

2009-01-26 Thread Glen Newton
Congrats & good-luck on this new endeavour! -Glen :-) 2009/1/26 Grant Ingersoll : > Hi Lucene and Solr users, > > As some of you may know, Yonik, Erik, Sami, Mark and I teamed up with > Marc Krellenstein to create a company to provide commercial > support (with SLAs), training, value-add compone

Re: Should I extend DIH to handle POST too?

2009-01-26 Thread Gunaranjan Chandraraju
Hi Noble, I started digging into the code for this and came away a bit confused - apologies for being 'quite a newbie' here. The solrconfig contains both an "Update Handler" and an "Update Request Handler". Should I plug into the former or the latter? I am not sure I want to replace the 'high

Re: exact field match

2009-01-26 Thread Svein Parnas
Another solution is to put a special token at the start and end of every occurrence of the field, e.g. aastartaa at the start and zzendzz at the end (a solution resembling FAST's boundary match feature under the hood). You could then search for an exact match ("aastartaa your phrase zzendzz"), and you w
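The boundary-token idea above can be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, the field name myField is taken from the original question in this thread, and the whitespace-based phrase check is a stand-in for whatever analyzer the field actually uses.

```python
START, END = "aastartaa", "zzendzz"  # boundary tokens suggested in the post


def wrap_for_index(value: str) -> str:
    """Surround a field value with the boundary tokens before indexing."""
    return f"{START} {value} {END}"


def exact_phrase_query(field: str, phrase: str) -> str:
    """Build a phrase query that only matches when the whole field equals the phrase."""
    return f'{field}:"{START} {phrase} {END}"'


def phrase_matches(indexed_value: str, query_phrase: str) -> bool:
    """Toy phrase match: the query tokens must appear contiguously in the document."""
    doc = indexed_value.lower().split()
    query = query_phrase.lower().split()
    return any(doc[i:i + len(query)] == query
               for i in range(len(doc) - len(query) + 1))


# The two documents from the original question:
docs = ["my name is james bond", "james bond"]
hits = [d for d in docs
        if phrase_matches(wrap_for_index(d), wrap_for_index("james bond"))]
```

Without the boundary tokens the plain phrase "james bond" matches both documents; with them only the second one is a hit.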

Re: Help with Solr 1.3 lockups?

2009-01-26 Thread Mark Miller
Just a point or I missed: with such a large index (not doc size large, but content wise), I imagine a lot of your 16GB of RAM is being used by the system disk cache - which is good. Another reason you don't want to give too much RAM to the JVM. But you still want to give it enough to avoid the

Re: exact field match

2009-01-26 Thread Antonio Zippo
it works... thanks for your help bye From: Erick Erickson To: solr-user@lucene.apache.org Sent: Monday, 26 January 2009, 20:29:17 Subject: Re: exact field match You need to index and search using something like KeywordAnalyzer. That analyzer does no token

Re: Help with Solr 1.3 lockups?

2009-01-26 Thread Mark Miller
Okay, it sounds like your index is fairly small then (1 million docs). Since you are only faceting on a couple of fields and sorting on one, that's going to take a bit of RAM, but not really that much. So what's likely happening is that because you are committing so often, you are getting multipl

Re: Help with Solr 1.3 lockups?

2009-01-26 Thread Jerome L Quinn
"Lance Norskog" wrote on 01/20/2009 02:16:47 AM: > "Lance Norskog" > 01/20/2009 02:16 AM > Java 1.5 has thread-locking bugs. Switching to Java 1.6 may cure this > problem. Thanks for taking time to look at the problem. Unfortunately, this is happening on Java 1.6, so I can't put the blame t

Re: [ANN] Lucid Imagination

2009-01-26 Thread Yonik Seeley
> To try it out, browse to http://www.lucidimagination.com/search/. Since this is built on Solr, it does feel on-topic to talk a bit about it here too... Some tips: There's a very small "options" menu above the "PROJECT" facet. You can enable "Show facets with 0 results" there, which provides a

Re: Help with Solr 1.3 lockups?

2009-01-26 Thread Jerome L Quinn
Hi and thanks for looking at the problem ... Mark Miller wrote on 01/15/2009 02:58:24 PM: > Mark Miller > 01/15/2009 02:58 PM > > Re: Help with Solr 1.3 lockups? > > How much RAM are you giving the JVM? Thats running out of memory loading > a FieldCache, which can be a more memory intensive da

[ANN] Lucid Imagination

2009-01-26 Thread Grant Ingersoll
Hi Lucene and Solr users, As some of you may know, Yonik, Erik, Sami, Mark and I teamed up with Marc Krellenstein to create a company to provide commercial support (with SLAs), training, value-add components and services to users of Lucene and Solr. We have been relatively quiet up until now a

Re: exact field match

2009-01-26 Thread Erick Erickson
You need to index and search using something like KeywordAnalyzer. That analyzer does no tokenizing/ data transformation or such. For instance, it doesn't fold case. You will be unable to search for "bond" and get a hit in this case, so one solution is to use two fields, and search one or the othe

Re: solr.core.name property not available on core creation

2009-01-26 Thread Mark Ferguson
Thanks Shalin. Any ideas on a workaround in the mean time? I suppose I could set the instanceDir property to the data directory rather than the common directory, then set the config and schema explicitly. Mark On Mon, Jan 26, 2009 at 12:16 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote

Re: solr.core.name property not available on core creation

2009-01-26 Thread Shalin Shekhar Mangar
This is a known issue. I'll try to give a patch soon. https://issues.apache.org/jira/browse/SOLR-883 On Mon, Jan 26, 2009 at 11:59 PM, Mark Ferguson wrote: > Hi, > > I am trying to set up a multi-core environment in which I share a single > conf folder. I am following the instructions described

exact field match

2009-01-26 Thread Antonio Zippo
Hi all, i'm using a string field named "myField" and 2 documents containing: 1. myField="my name is james bond" 2. myField="james bond" if i use a query like this: myField:"james bond" it returns 2 documents how can i get only the second document using a string or text field? I need to sea

Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-26 Thread Alexander Ramos Jardim
Hey Gunaranjan, I have the same scenario as you. A lucene index is denormalized. It should not contain entity relationship. When I need to do something like you are doing, I group the related values in one field. Let's say we have 2 credit cards. the first has id 30459673 and taxes at 1.5%/month

solr.core.name property not available on core creation

2009-01-26 Thread Mark Ferguson
Hi, I am trying to set up a multi-core environment in which I share a single conf folder. I am following the instructions described in this thread: http://www.mail-archive.com/solr-user@lucene.apache.org/msg16954.html In solrconfig.xml, I am setting dataDir to /srv/solr/cores/data/${ solr.core.na

Re: Text classification with Solr

2009-01-26 Thread Neal Richter
Thanks for the link Shalin... played with that a while back.. It's possibly got some indirect possibilities. On Mon, Jan 26, 2009 at 10:46 AM, Hannes Carl Meyer wrote: > I didn't understand, is the corpus of documents you want to use to classify > fix? Assume the 'documents' are not stored in th

Re: Text classification with Solr

2009-01-26 Thread Hannes Carl Meyer
Hi Neal, this sounds pretty familiar to me. Did a lot of those projects some years ago (with Lucene low-level API)! I didn't understand, is the corpus of documents you want to use to classify fixed? >>previously suggested procedure of 1) store document 2) execute >>more-like-this and 3) delete docu

Re: Random queries extremely slow

2009-01-26 Thread oleg_gnatovskiy
Can you expand on this? Mirroring delay on what? zayhen wrote: > > Use multiple boxes, with a mirroring delay from one to another, like a > pipeline. > > 2009/1/22 oleg_gnatovskiy > >> >> Well this probably isn't the cause of our random slow queries, but might >> be >> the cause of the slo

Re: Text classification with Solr

2009-01-26 Thread Shalin Shekhar Mangar
On Mon, Jan 26, 2009 at 10:59 PM, Neal Richter wrote: > Hey all, > > I'm in the process of implementing a system to do 'text > classification' with Solr. The basic idea is to take an > ontology/taxonomy like dmoz of {label: "X", tags: "a,b,c,d,e"}, index > it and then classify documents into

Text classification with Solr

2009-01-26 Thread Neal Richter
Hey all, I'm in the process of implementing a system to do 'text classification' with Solr. The basic idea is to take an ontology/taxonomy like dmoz of {label: "X", tags: "a,b,c,d,e"}, index it and then classify documents into the taxonomy by pushing parsed document into the Solr search API.

Re: fastest way to index/reindex

2009-01-26 Thread Ian Connor
*:* took it up to 45/sec from 28/sec so a nice 60% bump in performance - thanks! On Sun, Jan 25, 2009 at 5:46 PM, Ryan McKinley wrote: > I don't know of any standard export/import tool -- i think luke has > something, but it will be faster if you write your own. > > Rather than id:[* TO *], just
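As a rough illustration of the export loop discussed in this thread, the paging parameters for a match-all query could be generated like this. It is a sketch only: q, start and rows are the standard Solr query parameters, the batch size of 1000 and the 2.5 million figure come from the thread, and the function name is hypothetical.

```python
def page_requests(total_docs: int, rows: int = 1000, query: str = "*:*"):
    """Yield one parameter dict per page, walking the result set in fixed-size batches."""
    if rows < 1:
        raise ValueError("rows must be at least 1")
    for start in range(0, total_docs, rows):
        yield {"q": query, "start": start, "rows": rows}


# Paging through 2.5 million documents 1000 at a time needs 2500 requests.
pages = list(page_requests(2_500_000))

# Relative speed-up reported in the thread: 28 docs/sec -> 45 docs/sec.
speedup_percent = (45 - 28) / 28 * 100
```

Each dict would then be sent to the select handler by whatever HTTP client is in use; that part is left out here on purpose.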

Re: fastest way to index/reindex

2009-01-26 Thread Ian Connor
I have about 2.5 million per shard and seem to be getting through 28/sec using 1000 at a time. It ran all yesterday and part of the night. It is over the 1.6 million mark now so hope it can keep up a similar rate as it gets deeper into the index. I need to reindex it all because I changed how so

Re: Concurrency problem with delta-import

2009-01-26 Thread Ryuuichi KUMAI
Hello Shalin, Thank you for your reply. I opened the issue and attached the patch. https://issues.apache.org/jira/browse/SOLR-985 > The lists are OK since they are modified only in the constructor. The map > needs to be changed to a ConcurrentHashMap as you did in the patch. I understood. Many

Re: Concurrency problem with delta-import

2009-01-26 Thread Shalin Shekhar Mangar
The lists are OK since they are modified only in the constructor. The map needs to be changed to a ConcurrentHashMap as you did in the patch. On Mon, Jan 26, 2009 at 7:23 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > Wow, well spotted. TemplateString is not thread-safe but it is be

Re: Concurrency problem with delta-import

2009-01-26 Thread Shalin Shekhar Mangar
Wow, well spotted. TemplateString is not thread-safe but it is being used concurrently by many cores due to the static instance. Apart from the cache map, the lists will also need to be taken care of. Can you please open an issue and attach this patch? https://issues.apache.org/jira/browse/SOLR

Concurrency problem with delta-import

2009-01-26 Thread Ryuuichi KUMAI
Hello, I'm using Solr 1.3 and I have a problem with DataImportHandler. My environment: - Solr 1.3 - MySQL 5.1.30, Connector/J 5.1.6 - Linux 2.6.9 x86_64 (RHEL4) - Sun JDK 1.6.0_11 - Apache Tomcat 6.0.18 Our Solr server has multi core, and the schema in each core is the same. When delta-impo

SOLR - indexing Date/Time in local format?

2009-01-26 Thread pskumar82
We use Solr 1.3 and have indexed some of our date fields in the format '1995-12-31T23:59:59Z', which as we know is a UTC date. But we want to index the date in IST, which is +05:30 hours, so that an extra conversion from UTC to IST across all our applications is avoided. 1) How can we do that? 2) And we have

Re: fastest way to index/reindex

2009-01-26 Thread Julian Davchev
I kinda don't get why you would reindex all data at once. Each document has a unique id, so you can reindex only what's needed. Also, if there is too much to process, I'd suggest using some batch processor that will add N tasks with range queries 1:10, 10:20, etc., and a cronjob executing those. Thousands seems ok but w
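The batch idea above might look something like this. It is a minimal sketch, assuming the unique id is numeric and range-searchable (the thread does not say so) and using hypothetical helper names. The ranges are made non-overlapping, since an inclusive range query over 1:10 and then 10:20 would process id 10 twice.

```python
def id_ranges(first_id: int, last_id: int, batch_size: int):
    """Split [first_id, last_id] into inclusive, non-overlapping batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ranges = []
    low = first_id
    while low <= last_id:
        high = min(low + batch_size - 1, last_id)
        ranges.append((low, high))
        low = high + 1
    return ranges


def range_query(field: str, low: int, high: int) -> str:
    """Build an inclusive range query string for one batch."""
    return f"{field}:[{low} TO {high}]"


# One task per batch; a cron job (or any scheduler) can then work through the list.
tasks = [range_query("id", low, high) for low, high in id_ranges(1, 25, 10)]
```

Keeping the task list separate from the code that runs each task makes it easy to resume after a failure by skipping the batches already done.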