RE: fmap.content - copying to two fields possible?
I've tried it and it is working fine. But I would still like to know whether I can specify two fields against fmap.content. regards, Naga
fmap.content - copying to two fields possible?
Hi, I want to copy the contents of a file (extracted using "ExtractingRequestHandler") to two fields, A and B. Currently I have it configured with fmap.content mapped to A. If I want to copy the contents of the file to field B as well, what is the option? Can I specify two fields against fmap.content? regards, Naga
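One approach worth noting: fmap.content maps the extracted content onto a single field, but a schema-level copyField can then duplicate that field at index time. A sketch in schema.xml, using the field names A and B from the question (not confirmed in this thread):

```xml
<!-- After fmap.content=A puts the extracted text into field A,
     copy A's input into B as well at index time -->
<copyField source="A" dest="B"/>
```

This assumes B is declared in the schema with a type that accepts the copied text.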
RE: Facets vs TermV's
... I meant: the terms component is faster than using facets. Both of course provide the autocomplete.
RE: Facets vs TermV's
I found a thread once (sorry, can't remember where) which stated that the issue is performance; the terms component is faster than the autocomplete. I'm no expert, but I guess it's a question of when the autocomplete index gets built. Whereas the terms component likely builds it at storage time, the facet component builds it at retrieval time. Methinks... G.
Re: solrDynamicMbeans access
: i need to access the solr mbeans displayed in jconsole to access the : attributes of solr using codes( java) ... : MBeanServerConnection mbs = conn.getMBeanServerConnection(); ... : now how do i create a solrMbean object to check on its attributes. I'm not overly familiar with using programmatic JMX to monitor remote applications, but I don't believe you want to "create" any MBeans ... I believe you want to "query" that MBeanServerConnection for MBeans or "getMBeanInfo" for a given object name. -Hoss
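A minimal sketch of the query-don't-create approach Hoss describes. The `java.lang:type=Memory` pattern below is just a runnable stand-in; substitute the Solr object-name pattern you see in jconsole, which is an assumption here since the exact domain depends on your deployment:

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class JmxQuery {
    // Query the connection for MBeans matching an object-name pattern and
    // print each matching bean's attribute names/types; returns the match count.
    static int listAttributes(MBeanServerConnection mbs, String pattern) throws Exception {
        Set<ObjectName> names = mbs.queryNames(new ObjectName(pattern), null);
        for (ObjectName name : names) {
            MBeanInfo info = mbs.getMBeanInfo(name);
            System.out.println(name);
            for (MBeanAttributeInfo attr : info.getAttributes()) {
                System.out.println("  " + attr.getName() + " : " + attr.getType());
            }
        }
        return names.size();
    }

    public static void main(String[] args) throws Exception {
        // Demo against the in-process platform MBeanServer, which implements
        // MBeanServerConnection. For a remote Solr, use the connection from
        // JMXConnector.getMBeanServerConnection() and the Solr domain pattern
        // shown in jconsole instead of the stand-in pattern below.
        MBeanServerConnection mbs = ManagementFactory.getPlatformMBeanServer();
        listAttributes(mbs, "java.lang:type=Memory");
    }
}
```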
Re: Solr Dismax query - prefix matching
: example: If I have a field called 'booktitle' with the actual values as : 'Code Complete', 'Coding standard 101', then I'd like to search for the : query string 'cod' and have the dismax match against both the book : titles since 'cod' is a prefix match for 'code' and 'coding'. It doesn't sound like you really want prefix queries ... it sounds like you want stemming. It's hard to tell because you only gave one example, so consider whether you want the book "codependents of agony" to match a search for "code" ... if the answer is "yes" then what you are looking for is prefix matching; if the answer is "no" then you should probably read up on stemming (which can work with the dismax parsing, by configuring it in the analyzer for your fields) -Hoss
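For reference, a stemmed field type along the lines Hoss suggests -- a sketch, with the analyzer chain as an assumption to adapt in schema.xml:

```xml
<!-- Both index-time and query-time analysis stem terms, so a query
     for "code" matches documents containing "coding" -->
<fieldType name="text_stemmed" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Because this works through per-field analysis, it applies to dismax queries without any change to the query parser.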
Re: run on reboot on windows
There are a few programs that wrap any java app as a service. http://en.wikipedia.org/wiki/Service_wrapper On Mon, May 3, 2010 at 6:58 AM, Dave Searle wrote: > I don't think jetty can be installed as a service. You'd need to > create a bat file and put that in the win startup registry. > > Sent from my iPhone > > On 3 May 2010, at 11:26, "Frederico Azeiteiro" > wrote: > >> Hi Ahmed, >> >> I need to achieve that also. Did you manage to install it as a service >> and start solr with Jetty? >> After installing and starting jetty as a service, how do you start solr? >> >> Thanks, >> Frederico >> >> -Original Message- >> From: S Ahmed [mailto:sahmed1...@gmail.com] >> Sent: segunda-feira, 3 de Maio de 2010 01:05 >> To: solr-user@lucene.apache.org >> Subject: Re: run on reboot on windows >> >> Thanks, for some reason I was looking for a solution outside of >> jetty/tomcat, when that was the obvious way to get things restarted :) >> >> On Sun, May 2, 2010 at 7:53 PM, Dave Searle >> wrote: >> >>> Tomcat is installed as a service on windows. Just go into the service >>> control panel and set startup type to automatic >>> >>> Sent from my iPhone >>> >>> On 3 May 2010, at 00:43, "S Ahmed" wrote: >>> its not tomcat/jetty that's the issue, it's how to get things to restart on a windows server (tomcat and jetty don't run as native windows services), so I am a little confused.. thanks. On Sun, May 2, 2010 at 7:37 PM, caman wrote: > Ahmed, > > Best is if you take a look at the documentation of jetty or tomcat.
> SOLR can run on any web container; it's up to you how you configure your web container to run. > > Thanks > Aboxy > > From: S Ahmed [via Lucene] > Sent: Sunday, May 02, 2010 4:33 PM > To: caman > Subject: Re: run on reboot on windows > > By default it uses Jetty, so you're saying Tomcat on windows server 2008/IIS7 runs as a native windows service? > > On Sun, May 2, 2010 at 12:46 AM, Dave Searle <[hidden email]> wrote: >> Set tomcat6 service to auto start on boot (if running tomcat) >> >> Sent from my iPhone >> >> On 2 May 2010, at 02:31, "S Ahmed" <[hidden email]> wrote: >>> Hi, >>> I'm trying to get Solr to run on windows, such that if it reboots the Solr service will be running. >>> How can I do this? > View this message in context: http://lucene.472066.n3.nabble.com/run-on-reboot-on-windows-tp770892p772178.html > Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
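For the service-wrapper route Lance points to, a minimal configuration for the Tanuki Java Service Wrapper might look like the following; the paths, jar names, and the Jetty main class are all assumptions to adapt to your install:

```
wrapper.java.command=java
wrapper.java.mainclass=org.tanukisoftware.wrapper.WrapperSimpleApp
wrapper.java.classpath.1=wrapper.jar
wrapper.java.classpath.2=start.jar
wrapper.app.parameter.1=org.mortbay.start.Main
wrapper.working.dir=C:\solr\example
```

Once the wrapper is installed as a Windows service and set to automatic startup, Jetty (and Solr inside it) comes back after a reboot.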
Facets vs TermV's
Hi, I spent a lot of time on the Wiki and am working with facets and tv's, but I'm still confused about something. Basically, what is the difference between issuing a facet field query that returns facets with counts, and a query with term vectors that also returns document frequency counts for terms in a field? They seem almost similar, but I'm missing something, I think. i.e., when is it best to use one over the other? I know this is Solr 101, but I just want to understand it fully. Thanks for any quick tips.
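To make the contrast concrete: faceting counts documents within the current result set per field value, while the terms component reads raw index-wide term statistics. Two illustrative requests (this assumes a /terms handler is registered in solrconfig.xml, and the field name is hypothetical):

```
# Facet counts, scoped to the documents matching q:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=category

# Raw term/document frequencies for the whole index, regardless of any query:
http://localhost:8983/solr/terms?terms.fl=category&terms.limit=10
```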
Re: Solr commit issue
This could be caused by HTTP caching. Solr's example solrconfig.xml comes with HTTP caching turned on, and this causes lots of beginners to have problems. The code to turn it off is commented out in solrconfig.xml. Notice that the default is to have caching on, so to turn it off you have to include the XML that turns it off. On Sat, May 1, 2010 at 8:01 PM, Indika Tantrigoda wrote: > Thanks for the reply. > Here is another thread I found similar to this > http://www.mail-archive.com/solr-user@lucene.apache.org/msg28236.html > > From what I understand the IndexReaders get reopened after a commit. > > Regards, > Indika > > On 2 May 2010 00:29, Erick Erickson wrote: > >> The underlying IndexReader must be reopened. If you're >> searching for a document with a searcher that was opened >> before the document was indexed, it won't show up on the >> search results. >> >> I'm guessing that your statement that when you search >> for it with some test is coincidence, but that's just a guess. >> >> HTH >> Erick >> >> On Sat, May 1, 2010 at 1:07 PM, Indika Tantrigoda > >wrote: >> >> > Hi all, >> > >> > I've been working with Solr for a few weeks and have gotten SolrJ >> > to connect to it, index, search documents. >> > >> > However I am having an issue when a document is committed. >> > When a document is committed it does not show in the search results if I >> do >> > a *:* search, >> > but if I search for it with some text then it is shown in the results. >> > Only when another document is committed, the previous document is found >> > when >> > I do a *:* search >> > >> > Is this because of the SolrJ client or do I have to pass additional >> > parameters to commit() ? >> > >> > Thanks in advance. >> > >> > Regards, >> > Indika >> > >> > -- Lance Norskog goks...@gmail.com
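For reference, the usual way to disable HTTP caching in a Solr 1.4 solrconfig.xml is a sketch like this:

```xml
<requestDispatcher handleSelect="true">
  <!-- never send 304 Not Modified; clients always get fresh results -->
  <httpCaching never304="true" />
</requestDispatcher>
```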
Re: Auto-commit does not work
I think his point was, _what_ determines if it's a misconfiguration? It can't be Solr because, like he said, a plugin may require it. If there is no such plugin, then what shall handle it properly? Nothing; ergo it's ignored. On Mon, 2010-05-03 at 19:34 +0200, Andreas Jung wrote: > Chris Hostetter wrote: > > : right - and Solr should not swallow errors in the configuration :-) > > > > If you have an error in a *known* config declaration, solr will complain > > about it -- but solr can't complain just because you declare extra stuff > > in your config files that it doesn't know anything about -- some other > > plugin might care about it (or it might be there because you wanted > > special syntax for your own documentation purposes) > > I don't care whether this config is a core configuration or a > configuration of some plugin. Such kind of misconfiguration should be > handled properly - this is the minimum one can expect. > > -aj
Re: Commit takes 1 to 2 minutes, CPU usage affects other apps
On 5/3/10 9:06 AM, Markus Fischer wrote: Hi, we recently began having trouble with our Solr 1.4 instance. We've about 850k documents in the index, which is about 1.2GB in size; the JVM which runs tomcat/solr (no other apps are deployed) has been given 2GB. We've a forum and run a process every minute which indexes the new messages. The number of messages updated ranges from 0 to 20 on average. The commit takes about one or two minutes, but usually when it finishes, a few seconds later the next batch of documents is processed and the story starts again. So actually it's like Solr is running commits all day long, and CPU usage ranges from 80% to 120%. This continuous CPU usage caused ill effects on other services running on the same machine. Our environment is provided by a company purely using VMWare infrastructure; the Solr index itself is on an NFS share for which we get some 33MB/s throughput. So, an easy solution would be to just put more resources into it, e.g. a separate machine. But before I make the decision I'd like to find out whether the app behaves properly under these circumstances, or if it's possible to shorten the commit time down to a few seconds so the CPU is not drained that long. thanks for any pointers, - Markus That is certainly not a normal commit time for an index of that size. Note that Solr 1.4 can have issues when working on NFS, but I don't know that it would have anything to do with this. Are you using the simple lock factory rather than the default native lock factory? (as you should do when running on NFS) -- - Mark http://www.lucidimagination.com
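For reference, the lock factory Mark mentions is set in solrconfig.xml; a sketch (in the 1.4 example config the element appears under both <indexDefaults> and <mainIndex>):

```xml
<!-- use the simple (lock-file based) lock factory when the index is on NFS -->
<lockType>simple</lockType>
```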
Re: Problem with pdf, upgrading Cell
Little more info... Seems to be a classloading issue. The tests pass, but they aren't loading the Tika libraries via the Solr ResourceLoader, whereas the example is. Marc, one thing to try is to unjar the Solr WAR file and put the Tika libs in there, as I bet it will then work. Note, however, I haven't tried this. On May 3, 2010, at 6:24 PM, Grant Ingersoll wrote: > I've opened https://issues.apache.org/jira/browse/SOLR-1902 to track this. > It is indeed a bug somewhere (still investigating). It seems that Tika is > now picking an EmptyParser implementation when trying to determine which > parser to use, despite the fact that it properly identifies the MIME Type. > > -Grant > > On May 3, 2010, at 5:36 PM, Grant Ingersoll wrote: > >> I'm investigating. >> >> On May 3, 2010, at 5:17 AM, Marc Ghorayeb wrote: >> >>> >>> Hi, >>> Grant, i confirm what Praveen has said, any PDF i try does not work with >>> the new Tika and SVN versions. :( >>> Marc >>> From: sagar...@opentext.com To: solr-user@lucene.apache.org Date: Mon, 3 May 2010 13:05:24 +0530 Subject: RE: Problem with pdf, upgrading Cell Hello, Please let me know if anybody figured out a way out of this issue. Thanks, Sandhya -Original Message- From: Praveen Agrawal [mailto:pkal...@gmail.com] Sent: Friday, April 30, 2010 11:14 PM To: solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell Grant, You can try any of the sample pdfs that come in /docs folder of Solr 1.4 dist'n. I had tried 'Installing Solr in Tomcat.pdf', 'index.pdf' etc. Only metadata i.e. stream_size, content_type apart from my own literals are indexed, and content is missing.. On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll wrote: > Praveen and Marc, > > Can you share the PDF (feel free to email my private email) that fails in > Solr? > > Thanks, > Grant > > > On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: > >> >> Hi >> Nope i didn't get it to work... 
Just like you, the command line version of tika extracts the content correctly, but once included in Solr, no content is extracted. What I tried until now is: - Updating the tika libraries inside the Solr 1.4 public version, no luck there. - Downloading the latest SVN version, compiled it, and started from a simple schema, still no luck. - Getting other versions compiled on hudson (nightly builds), and testing them also, still no extraction. I sent a mail on the developers mailing list but they told me I should just mail here; hope some developer reads this because it's quite an important feature of Solr and somehow it got broken between the 1.4 release and the last version on the svn. Marc -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
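Grant's unjar-the-WAR suggestion might be carried out roughly like this; the WAR name and Tika jar versions are assumptions, and as Grant notes above, this is untried:

```
mkdir war && cd war
jar xf ../apache-solr-1.4.0.war
cp ../tika-core-0.7.jar ../tika-parsers-0.7.jar WEB-INF/lib/
jar cf ../apache-solr-1.4.0.war .
```

The idea is that the Tika jars then load from the webapp classloader itself rather than through the Solr ResourceLoader.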
Re: Problem with pdf, upgrading Cell
I've opened https://issues.apache.org/jira/browse/SOLR-1902 to track this. It is indeed a bug somewhere (still investigating). It seems that Tika is now picking an EmptyParser implementation when trying to determine which parser to use, despite the fact that it properly identifies the MIME Type. -Grant On May 3, 2010, at 5:36 PM, Grant Ingersoll wrote: > I'm investigating. > > On May 3, 2010, at 5:17 AM, Marc Ghorayeb wrote: > >> >> Hi, >> Grant, i confirm what Praveen has said, any PDF i try does not work with the >> new Tika and SVN versions. :( >> Marc >> >>> From: sagar...@opentext.com >>> To: solr-user@lucene.apache.org >>> Date: Mon, 3 May 2010 13:05:24 +0530 >>> Subject: RE: Problem with pdf, upgrading Cell >>> >>> Hello, >>> >>> Please let me know if anybody figured out a way out of this issue. >>> >>> Thanks, >>> Sandhya >>> >>> -Original Message- >>> From: Praveen Agrawal [mailto:pkal...@gmail.com] >>> Sent: Friday, April 30, 2010 11:14 PM >>> To: solr-user@lucene.apache.org >>> Subject: Re: Problem with pdf, upgrading Cell >>> >>> Grant, >>> You can try any of the sample pdfs that come in /docs folder of Solr 1.4 >>> dist'n. I had tried 'Installing Solr in Tomcat.pdf', 'index.pdf' etc. Only >>> metadata i.e. stream_size, content_type apart from my own literals are >>> indexed, and content is missing.. >>> >>> >>> On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll wrote: >>> Praveen and Marc, Can you share the PDF (feel free to email my private email) that fails in Solr? Thanks, Grant On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: > > Hi > Nope i didn't get it to work... Just like you, command line version of tika extracts correctly the content, but once included in Solr, no content is extracted. 
> What i tried until now is: - Updating the tika libraries inside Solr 1.4 public version, no luck there. - Downloading the latest SVN version, compiled it, and started from a simple schema, still no luck. - Getting other versions compiled on hudson (nightly builds), and testing them also, still no extraction. > I sent a mail on the developpers mailing list but they told me i should just mail here, hope some developper reads this because it's quite an important feature of Solr and somehow it got broke between the 1.4 release, and the last version on the svn. > Marc -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
Re: Problem with pdf, upgrading Cell
I'm investigating. On May 3, 2010, at 5:17 AM, Marc Ghorayeb wrote: > > Hi, > Grant, i confirm what Praveen has said, any PDF i try does not work with the > new Tika and SVN versions. :( > Marc > >> From: sagar...@opentext.com >> To: solr-user@lucene.apache.org >> Date: Mon, 3 May 2010 13:05:24 +0530 >> Subject: RE: Problem with pdf, upgrading Cell >> >> Hello, >> >> Please let me know if anybody figured out a way out of this issue. >> >> Thanks, >> Sandhya >> >> -Original Message- >> From: Praveen Agrawal [mailto:pkal...@gmail.com] >> Sent: Friday, April 30, 2010 11:14 PM >> To: solr-user@lucene.apache.org >> Subject: Re: Problem with pdf, upgrading Cell >> >> Grant, >> You can try any of the sample pdfs that come in /docs folder of Solr 1.4 >> dist'n. I had tried 'Installing Solr in Tomcat.pdf', 'index.pdf' etc. Only >> metadata i.e. stream_size, content_type apart from my own literals are >> indexed, and content is missing.. >> >> >> On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll wrote: >> >>> Praveen and Marc, >>> >>> Can you share the PDF (feel free to email my private email) that fails in >>> Solr? >>> >>> Thanks, >>> Grant >>> >>> >>> On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: >>> Hi Nope i didn't get it to work... Just like you, command line version of >>> tika extracts correctly the content, but once included in Solr, no content >>> is extracted. What i tried until now is:- Updating the tika libraries inside Solr 1.4 >>> public version, no luck there.- Downloading the latest SVN version, compiled >>> it, and started from a simple schema, still no luck.- Getting other versions >>> compiled on hudson (nightly builds), and testing them also, still no >>> extraction. I sent a mail on the developpers mailing list but they told me i should >>> just mail here, hope some developper reads this because it's quite an >>> important feature of Solr and somehow it got broke between the 1.4 release, >>> and the last version on the svn. 
Marc -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
Re: Score cutoff
Hi, Can someone give clues on how to implement this feature? This is a very important requirement for us, so any help is greatly appreciated. thanks! On Tue, Apr 27, 2010 at 5:54 PM, Satish Kumar < satish.kumar.just.d...@gmail.com> wrote: > Hi, > > For some of our queries, the top xx (five or so) results are of very high > quality and results after xx are very poor. The difference in score for the > high quality and poor quality results is high. For example, 3.5 for high > quality and 0.8 for poor quality. We want to exclude results with score > value that is less than 60% or so of the first result. Is there a filter > that does this? If not, can someone please give some hints on how to > implement this (we want to do this as part of solr relevance ranking so that > the facet counts, etc will be correct). > > > Thanks, > Satish >
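There is no built-in relative score cutoff in Solr 1.4, and as the poster notes, a client-side filter would not fix facet counts. Still, as an approximation, a SolrJ consumer could drop results below a fraction of the top score. A sketch, with plain doubles standing in for SolrDocument scores (a simplification for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class ScoreCutoff {
    // Keep only scores that are at least `fraction` of the top (first) score.
    // Assumes results arrive sorted by score descending, as Solr returns them.
    static List<Double> cutoff(List<Double> scores, double fraction) {
        List<Double> kept = new ArrayList<>();
        if (scores.isEmpty()) return kept;
        double threshold = scores.get(0) * fraction;
        for (double s : scores) {
            if (s >= threshold) kept.add(s);
            else break; // sorted descending, so we can stop at the first miss
        }
        return kept;
    }

    public static void main(String[] args) {
        // Example from the thread: top score 3.5, poor results around 0.8,
        // 60% cutoff -> threshold 2.1, so 0.8 is dropped
        System.out.println(cutoff(List.of(3.5, 3.0, 0.8), 0.6)); // [3.5, 3.0]
    }
}
```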
Re: cores and SWAP
I have 2 cores: core1 and core2. Load the same data set into each and commit. Verify that searches return the same for each core. Delete a document (call it docA) from core2 but not from core1. Commit and verify search results (docA disappears from core2's search results. core1 continues to return the docA) Swap cores. Core2 should now return docA, but it doesn't until I reload core2. thanks, Tim On Mon, May 3, 2010 at 1:41 PM, Shalin Shekhar Mangar wrote: > On Mon, May 3, 2010 at 10:27 PM, Tim Heckman wrote: > >> Hi, I'm trying to figure out whether I need to reload a core (or both >> cores?) after performing a swap. >> >> When I perform a swap in my sandbox (non-production) environment, I am >> seeing that one of the cores needs to be reloaded following a swap and >> the other does not, but I haven't been able to find a pattern to which >> one it will be. >> >> > No, you should not need to reload any core after a swap. What is the > behavior that you see? > > -- > Regards, > Shalin Shekhar Mangar. >
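For anyone reproducing this, the CoreAdmin requests involved look like the following (host as in the default example install; core names from the thread):

```
# Swap the two cores:
http://localhost:8983/solr/admin/cores?action=SWAP&core=core1&other=core2

# The workaround Tim describes -- reload the core returning stale results:
http://localhost:8983/solr/admin/cores?action=RELOAD&core=core2
```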
Re: Overlapping onDeckSearchers=2
: When i run 2 -3 commits parallely to diff instances or same instance I get : this error : : PERFORMANCE WARNING: Overlapping onDeckSearchers=2 : : What is the Best approach to solve this http://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F -Hoss
Re: Overlapping onDeckSearchers=2
On Mon, May 3, 2010 at 11:24 AM, revas wrote: > Hello, > > We have a server with many solr instances running (around 40-50) . > > We are committing documents ,sometimes one or sometimes around 200 > documents at a time .to only one instance at a time > > When i run 2 -3 commits parallely to diff instances or same instance I get > this error > > PERFORMANCE WARNING: Overlapping onDeckSearchers=2 > > What is the Best approach to solve this > > You should see that warning only when you run multiple commits within a short period of time to the same Solr instance. You will never see this warning when performing commit on different instances. So, do you really need to commit on the same instance in such short time? Can you batch commits? -- Regards, Shalin Shekhar Mangar.
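Besides batching commits, the threshold behind the warning is itself configurable in solrconfig.xml; a sketch, noting that raising the cap masks the symptom rather than fixing overlapping commits:

```xml
<!-- cap on searchers that may be warming concurrently; commit-triggered
     warms beyond this fail instead of piling up -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```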
Re: cores and SWAP
On Mon, May 3, 2010 at 10:27 PM, Tim Heckman wrote: > Hi, I'm trying to figure out whether I need to reload a core (or both > cores?) after performing a swap. > > When I perform a swap in my sandbox (non-production) environment, I am > seeing that one of the cores needs to be reloaded following a swap and > the other does not, but I haven't been able to find a pattern to which > one it will be. > > No, you should not need to reload any core after a swap. What is the behavior that you see? -- Regards, Shalin Shekhar Mangar.
Re: Auto-commit does not work
Chris Hostetter wrote: > : right - and Solr should not swallow errors in the configuration :-) > > If you have an error in a *known* config declaration, solr will complain > about it -- but solr can't complain just because you declare extra stuff > in your config files that it doesn't know anything about -- some other > plugin might care about it (or it might be there because you wanted > special syntax for your own documentation purposes) I don't care whether this config is a core configuration or a configuration of some plugin. Such kind of misconfiguration should be handled properly - this is the minimum one can expect. -aj
Re: Auto-commit does not work
: right - and Solr should not swallow errors in the configuration :-) If you have an error in a *known* config declaration, solr will complain about it -- but solr can't complain just because you declare extra stuff in your config files that it doesn't know anything about -- some other plugin might care about it (or it might be there because you wanted special syntax for your own documentation purposes) -Hoss
Re: Auto-commit does not work
Ahmet Arslan wrote: > I just realized that there is a typo in your autoCommit definition. The > letter C should be capital. > > 1 > 1000 > right - and Solr should not swallow errors in the configuration :-) Andreas
Re: Auto-commit does not work
> commits : 135 > autocommits : 0 > optimizes : 0 > rollbacks : 0 > expungeDeletes : 0 > docsPending : 8842 > adds : 8842 > deletesById : 0 > deletesByQuery : 0 > errors : 0 > cumulative_adds : 8842 > cumulative_deletesById : 20390 > cumulative_deletesByQuery : 0 > cumulative_errors : 0 I just realized that there is a typo in your autoCommit definition. The letter C should be capital. 1 1000
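The XML stripped out of the message above presumably looked something like this; a reconstruction, with the values (maxDocs 1, maxTime 1000 ms) taken from the surviving numbers:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>               <!-- note the capital C -->
    <maxDocs>1</maxDocs>
    <maxTime>1000</maxTime>  <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```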
cores and SWAP
Hi, I'm trying to figure out whether I need to reload a core (or both cores?) after performing a swap. When I perform a swap in my sandbox (non-production) environment, I am seeing that one of the cores needs to be reloaded following a swap and the other does not, but I haven't been able to find a pattern to which one it will be. http://wiki.apache.org/solr/CoreAdmin doesn't seem to cover this. Or, maybe I'm doing something wrong. Thanks for any help, Tim
Re: Retrieving indexed field data
Maybe backup/restore of the data directory is a workaround for you! On Mon, May 3, 2010 at 7:47 PM, Erick Erickson wrote: > Ahhh. Nope, I'm clueless. This strikes me as a pretty hairy thing to > do, but there's no built-in support that I know of for anything > similar. > > Sorry I can't be more help > Erick > > 2010/5/3 Licinio Fernández Maurelo > > > Thanks for your response, Erik. > > > > Just want to "copy" indexing-related info for fields indexed but not stored; don't want to reconstruct the original field(s) value. > > > > Any help? > > > > 2010/5/3 Erick Erickson > > > If you're asking if indexed but NOT stored data can be retrieved, > > > i.e. if you can reconstruct the original field(s) from the indexed > > > data alone, the answer is no. Or, rather, you can, kinda, but it's > > > a lossy process. > > > > > > Consider stemming. If you indexed "running" using stemming, > > > the term "run" is indexed. Lucene/SOLR has no record > > > of the original term. Similarly with stopwords. > > > > > > But if you *store* the data, then the original can be retrieved. > > > > > > HTH > > > Erick > > > > > > 2010/5/3 Licinio Fernández Maurelo > > > > > > > Hi folks, > > > > > > > > i'm wondering if there is a way to retrieve the indexed data. The reason is > > > > that i'm working on a solrj-based tool that copies one index's data into > > > > another (allowing you to perform changes in docs). I know i can't perform any > > > > change in an indexed field, just want to "copy" the chunk of bytes .. > > > > > > > > Am i missing something? Indexing-generated data can't be retrieved anyway? > > > > > > > > Thanks in advance. > > > > > > > > -- > > > > Lici > > > > ~Java Developer~ > > -- > > Lici > > ~Java Developer~
Re: Auto-commit does not work
Ahmet Arslan wrote: >> >> I inserted 10k documents through a Python script (w/ solrpy >> bindings) >> without explicit commit. However I do not see that the >> "numDocs" >> increased meanwhile... is there any way to hunt this down? > > What does the solr/admin/stats.jsp#update page say about autocommits and > docsPending? commits : 135 autocommits : 0 optimizes : 0 rollbacks : 0 expungeDeletes : 0 docsPending : 8842 adds : 8842 deletesById : 0 deletesByQuery : 0 errors : 0 cumulative_adds : 8842 cumulative_deletesById : 20390 cumulative_deletesByQuery : 0 cumulative_errors : 0 Andreas
Re: Auto-commit does not work
> Running Solr 1.4 with > > > ? > > ? > > 1000 > 6 > > > > I inserted 10k documents through a Python script (w/ solrpy > bindings) > without explicit commit. However I do not see that the > "numDocs" > increased meanwhile... is there any way to hunt this down? What does the solr/admin/stats.jsp#update page say about autocommits and docsPending?
Auto-commit does not work
Running Solr 1.4 with ? ? 1000 6 I inserted 10k documents through a Python script (w/ solrpy bindings) without explicit commit. However I do not see that the "numDocs" increased meanwhile... is there any way to hunt this down? Andreas
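The `<autoCommit>` markup in the message above was stripped by the archive; only the values 1000 and 6 survive. For reference, a typical Solr 1.4 auto-commit block in solrconfig.xml has the shape below — the values are illustrative assumptions, not the poster's actual settings:

```xml
<!-- solrconfig.xml: commit pending documents automatically once either
     threshold is reached (values here are examples only). -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>1000</maxDocs>   <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime>  <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```

If stats.jsp shows `autocommits : 0` while `docsPending` keeps growing, a block like this is typically missing, commented out, or sitting in a solrconfig.xml the running core never loaded.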
Re: SpellChecking
Am 03.05.2010 16:43, schrieb Jan Kammer: > Hi, > > It worked fine with a normal field. There must be something wrong with > copyField, or why does the DataImportHandler no longer add/update documents? Did you define your destination field as multiValued? -Michael
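Since several copyField sources feed a single destination, that destination field has to be declared multiValued. A minimal schema.xml sketch — the field and type names here are assumptions for illustration, not the poster's schema:

```xml
<!-- schema.xml: a destination fed by multiple copyField sources must be
     multiValued, or documents with more than one source value are rejected. -->
<field name="spell" type="textSpell" indexed="true" stored="false"
       multiValued="true"/>

<copyField source="title"  dest="spell"/>
<copyField source="body"   dest="spell"/>
<copyField source="author" dest="spell"/>
```

Without `multiValued="true"`, every document populating more than one source field fails at index time, which would match the growing failed-document count reported from the DataImportHandler.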
Re: SpellChecking
Hi, I built the index with ...&spellcheck.build=true. It worked fine with a normal field. There must be something wrong with copyField, or why does the DataImportHandler no longer add/update documents? Can somebody paste the code for copyField with many fields? Greetz, Jan Am 03.05.2010 16:36, schrieb Villemos, Gert: We are using copy fields for 40+ fields to do spelling, and it works fine. Are you sure that you actually build the spell index before you try to do spelling? You need to either configure Solr to build the spell index on commit, or manually issue a spell index build request. Regards, Gert. -Original Message- From: Jan Kammer [mailto:jan.kam...@mni.fh-giessen.de] Sent: Montag, 3. Mai 2010 16:26 To: solr-user@lucene.apache.org Subject: Re: SpellChecking Hi, if I define one of my normal fields from schema.xml in solrconfig.xml for spellchecking, all works fine: ... That didn't work, because nothing was in "spell" after that. Next try was to copy each field in a line to "spell": ... This does work for up to 3 documents; if I define more, the count of failed documents in the DataImportHandler gets higher the more I copy into "spell". 16444 So my question is whether this is the right way to use the spellchecker with many fields, or is there another, better way... thanks. greetz, Jan Am 03.05.2010 16:08, schrieb Erick Erickson: It would help a lot to see your actual config file, and if you provided a bit more detail about what failure looks like Best Erick On Mon, May 3, 2010 at 9:43 AM, Jan Kammer wrote: Hi there, I want to enable spellchecking, but I have many fields. I tried using copyField to copy all with "*" into one field, but that didn't work. Next try was to copy some fields, each specified by name, into one field named "spell", but that worked only for 2 or 3 fields, not for 10 or more... My question is what the best practice is to enable spellchecking on many fields. thanks. 
greetz, Jan Please help Logica to respect the environment by not printing this email / Pour contribuer comme Logica au respect de l'environnement, merci de ne pas imprimer ce mail / Bitte drucken Sie diese Nachricht nicht aus und helfen Sie so Logica dabei, die Umwelt zu schützen. / Por favor ajude a Logica a respeitar o ambiente nao imprimindo este correio electronico. This e-mail and any attachment is for authorised use by the intended recipient(s) only. It may contain proprietary material, confidential information and/or be subject to legal privilege. It should not be copied, disclosed to, retained or used by, any other party. If you are not an intended recipient then please promptly delete this e-mail and any attachment and all copies and inform the sender. Thank you.
RE: SpellChecking
We are using copy fields for 40+ fields to do spelling, and it works fine. Are you sure that you actually build the spell index before you try to do spelling? You need to either configure Solr to build the spell index on commit, or manually issue a spell index build request. Regards, Gert. -Original Message- From: Jan Kammer [mailto:jan.kam...@mni.fh-giessen.de] Sent: Montag, 3. Mai 2010 16:26 To: solr-user@lucene.apache.org Subject: Re: SpellChecking Hi, if I define one of my normal fields from schema.xml in solrconfig.xml for spellchecking, all works fine: ... That didn't work, because nothing was in "spell" after that. Next try was to copy each field in a line to "spell": ... This does work for up to 3 documents; if I define more, the count of failed documents in the DataImportHandler gets higher the more I copy into "spell". 16444 So my question is whether this is the right way to use the spellchecker with many fields, or is there another, better way... thanks. greetz, Jan Am 03.05.2010 16:08, schrieb Erick Erickson: > It would help a lot to see your actual config file, and if you provided a > bit more > detail about what failure looks like > > Best > Erick > > On Mon, May 3, 2010 at 9:43 AM, Jan Kammer wrote: > >> Hi there, >> >> I want to enable spellchecking, but I have many fields. >> >> I tried using copyField to copy all with "*" into one field, but that >> didn't work. >> Next try was to copy some fields, each specified by name, into one field named >> "spell", but that worked only for 2 or 3 fields, not for 10 or more... >> >> My question is what the best practice is to enable spellchecking on many >> fields. >> >> thanks. >> >> greetz, Jan
Re: SpellChecking
Hi, if I define one of my normal fields from schema.xml in solrconfig.xml for spellchecking, all works fine: ... That didn't work, because nothing was in "spell" after that. Next try was to copy each field in a line to "spell": ... This does work for up to 3 documents; if I define more, the count of failed documents in the DataImportHandler gets higher the more I copy into "spell". 16444 So my question is whether this is the right way to use the spellchecker with many fields, or is there another, better way... thanks. greetz, Jan Am 03.05.2010 16:08, schrieb Erick Erickson: It would help a lot to see your actual config file, and if you provided a bit more detail about what failure looks like Best Erick On Mon, May 3, 2010 at 9:43 AM, Jan Kammer wrote: Hi there, I want to enable spellchecking, but I have many fields. I tried using copyField to copy all with "*" into one field, but that didn't work. Next try was to copy some fields, each specified by name, into one field named "spell", but that worked only for 2 or 3 fields, not for 10 or more... My question is what the best practice is to enable spellchecking on many fields. thanks. greetz, Jan
Re: Retrieving indexed field data
Ahhh, Nope, I'm clueless. This strikes me as a pretty hairy thing to do, but there's no built-in support that I know of for anything similar. Sorry I can't be more help Erick 2010/5/3 Licinio Fernández Maurelo > Thanks for your response, Erick. > > Just want to "copy" indexing-related info for fields indexed but not stored; > don't want to reconstruct the original field(s) value. > > Any help? > > 2010/5/3 Erick Erickson > > > If you're asking if indexed but NOT stored data can be retrieved, > > i.e. if you can reconstruct the original field(s) from the indexed > > data alone, the answer is no. Or, rather, you can, kinda, but it's > > a lossy process. > > > > Consider stemming. If you indexed "running" using stemming, > > the term "run" is indexed. Lucene/Solr has no record > > of the original term. Similarly with stopwords. > > > > But if you *store* the data, then the original can be retrieved. > > > > HTH > > Erick > > > > 2010/5/3 Licinio Fernández Maurelo > > > > > Hi folks, > > > > > > I'm wondering if there is a way to retrieve the indexed data. The > > reason is > > > that I'm working on a solrj-based tool that copies one index's data into > > > another > > > (allowing you to perform changes in docs). I know I can't perform any > > > change in an indexed field, just want to "copy" the chunk of bytes... > > > > > > Am I missing something? Can index-generated data not be retrieved > > at all? > > > > > > Thanks in advance. > > > > > > -- > > > Lici > > > ~Java Developer~ > > > > > > -- > Lici > ~Java Developer~ >
Re: Retrieving indexed field data
Thanks for your response, Erick. Just want to "copy" indexing-related info for fields indexed but not stored; don't want to reconstruct the original field(s) value. Any help? 2010/5/3 Erick Erickson > If you're asking if indexed but NOT stored data can be retrieved, > i.e. if you can reconstruct the original field(s) from the indexed > data alone, the answer is no. Or, rather, you can, kinda, but it's > a lossy process. > > Consider stemming. If you indexed "running" using stemming, > the term "run" is indexed. Lucene/Solr has no record > of the original term. Similarly with stopwords. > > But if you *store* the data, then the original can be retrieved. > > HTH > Erick > > 2010/5/3 Licinio Fernández Maurelo > > > Hi folks, > > > > I'm wondering if there is a way to retrieve the indexed data. The reason > is > > that I'm working on a solrj-based tool that copies one index's data into > > another > > (allowing you to perform changes in docs). I know I can't perform any > > change in an indexed field, just want to "copy" the chunk of bytes... > > > > Am I missing something? Can index-generated data not be retrieved > anyway? > > > > Thanks in advance. > > > > -- > > Lici > > ~Java Developer~ > > > -- Lici ~Java Developer~
Re: SpellChecking
It would help a lot to see your actual config file, and if you provided a bit more detail about what failure looks like. Best Erick On Mon, May 3, 2010 at 9:43 AM, Jan Kammer wrote: > Hi there, > > I want to enable spellchecking, but I have many fields. > > I tried using copyField to copy all with "*" into one field, but that > didn't work. > Next try was to copy some fields, each specified by name, into one field named > "spell", but that worked only for 2 or 3 fields, not for 10 or more... > > My question is what the best practice is to enable spellchecking on many > fields. > > thanks. > > greetz, Jan >
Re: run on reboot on windows
I don't think Jetty can be installed as a service. You'd need to create a bat file and put that in the Windows startup registry. Sent from my iPhone On 3 May 2010, at 11:26, "Frederico Azeiteiro" wrote: > Hi Ahmed, > > I need to achieve that also. Did you manage to install it as a service > and > start Solr with Jetty? > After installing and starting Jetty as a service, how do you start Solr? > > Thanks, > Frederico > > -Original Message- > From: S Ahmed [mailto:sahmed1...@gmail.com] > Sent: segunda-feira, 3 de Maio de 2010 01:05 > To: solr-user@lucene.apache.org > Subject: Re: run on reboot on windows > > Thanks, for some reason I was looking for a solution outside of > jetty/tomcat, when that was the obvious way to get things restarted :) > > On Sun, May 2, 2010 at 7:53 PM, Dave Searle > wrote: > >> Tomcat is installed as a service on windows. Just go into the services >> control panel and set the startup type to automatic >> >> Sent from my iPhone >> >> On 3 May 2010, at 00:43, "S Ahmed" wrote: >> >>> its not tomcat/jetty that's the issue, it's how to get things to re- >>> start on >>> a windows server (tomcat and jetty don't run as native windows >>> services) so >>> I am a little confused.. thanks. >>> >>> On Sun, May 2, 2010 at 7:37 PM, caman >>> wrote: >>> Ahmed, Best is if you take a look at the documentation of Jetty or Tomcat. Solr can run on any web container; it's up to you how you configure your web container to run Thanks Aboxy From: S Ahmed [via Lucene] Sent: Sunday, May 02, 2010 4:33 PM To: caman Subject: Re: run on reboot on windows By default it uses Jetty, so you're saying Tomcat on windows server 2008/ IIS7 runs as a native windows service? 
On Sun, May 2, 2010 at 12:46 AM, Dave Searle <[hidden email]>wrote: > Set tomcat6 service to auto start on boot (if running Tomcat) > > Sent from my iPhone > > On 2 May 2010, at 02:31, "S Ahmed" <[hidden email]> wrote: > >> Hi, >> >> I'm trying to get Solr to run on windows, such that if it reboots >> the Solr >> service will be running. >> >> How can I do this? -- View this message in context: http://lucene.472066.n3.nabble.com/run-on-reboot-on-windows-tp770892p772178.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Retrieving indexed field data
If you're asking if indexed but NOT stored data can be retrieved, i.e. if you can reconstruct the original field(s) from the indexed data alone, the answer is no. Or, rather, you can, kinda, but it's a lossy process. Consider stemming. If you indexed "running" using stemming, the term "run" is indexed. Lucene/SOLR has no record of the original term. Similarly with stopwords. But if you *store* the data, then the original can be retrieved. HTH Erick 2010/5/3 Licinio Fernández Maurelo > Hi folks, > > i'm wondering if there is a way to retrieve the indexed data. The reason is > that i'm working on a solrj-based tool that copies one index data into > other > (allowing you to perform changes in docs ). I know i can't perform any > change in an indexed field, just want to "copy" the chunk of bytes .. > > Am i missing something? Indexing generated data can't be retrieved anyway? > > Thanks in advance . > > -- > Lici > ~Java Developer~ >
Re: Skipping duplicates in DataImportHandler based on uniqueKey
Marc Sturlese wrote: > > You can use deduplication to do that. Create the signature based on the > unique field or any field you want. > Cool, thanks, I hadn't thought of that.
SpellChecking
Hi there, I want to enable spellchecking, but I have many fields. I tried using copyField to copy all with "*" into one field, but that didn't work. Next try was to copy some fields, each specified by name, into one field named "spell", but that worked only for 2 or 3 fields, not for 10 or more... My question is what the best practice is to enable spellchecking on many fields. thanks. greetz, Jan
Re: OutOfMemoryError when using query with sort
How many unique terms are in your sort field? On Sun, May 2, 2010 at 11:48 PM, Hamid Vahedi wrote: > I installed 64-bit Windows and my problem was solved. I am also using shard mode > (100M docs per machine with one Solr instance). > Is there a better solution? Because I insert at least 5M docs per day > > From: Koji Sekiguchi > To: solr-user@lucene.apache.org > Sent: Sun, May 2, 2010 9:08:42 PM > Subject: Re: OutOfMemoryError when using query with sort > > Hamid Vahedi wrote: > > Hi, I am using Solr running on Windows Server 2008 32-bit. > > I add about 100 million articles into Solr without setting the store attribute > (only storing the document id); the index file size is about 164 GB. > > When I try a query without sort, it returns doc ids in some ms, but > when I add a sort clause, I get the error below: > > > > HTTP Status 500 - Java heap space java.lang.OutOfMemoryError: Java heap > space at > Since sort uses the FieldCache and it consumes memory, you got an OOM. > I think a 100M-doc/164GB index is considerably large for a 32-bit machine. > Why don't you use distributed search? > > Koji > > -- http://www.rondhuit.com/en/
Commit takes 1 to 2 minutes, CPU usage affects other apps
Hi, we recently began having trouble with our Solr 1.4 instance. We have about 850k documents in the index, which is about 1.2 GB in size; the JVM which runs Tomcat/Solr (no other apps are deployed) has been given 2 GB. We have a forum and run a process every minute which indexes the new messages; the number of messages updated ranges from 0 to about 20 on average. The commit takes about one or two minutes, but usually when it finishes, a few seconds later the next batch of documents is processed and the story starts again. So effectively Solr is running commits all day long, and CPU usage ranges from 80% to 120%. This continuous CPU usage has ill effects on other services running on the same machine. Our environment is provided by a company purely using VMware infrastructure; the Solr index itself is on an NFS mount for which we get some 33 MB/s throughput. So, an easy solution would be to just put more resources into it, e.g. a separate machine. But before I make that decision I'd like to find out whether the app behaves properly under these circumstances, or whether it's possible to shorten the commit time down to a few seconds so the CPU is not drained for that long. thanks for any pointers, - Markus
Retrieving indexed field data
Hi folks, I'm wondering if there is a way to retrieve the indexed data. The reason is that I'm working on a solrj-based tool that copies one index's data into another (allowing you to perform changes in docs). I know I can't perform any change in an indexed field, just want to "copy" the chunk of bytes... Am I missing something? Can index-generated data not be retrieved at all? Thanks in advance. -- Lici ~Java Developer~
Re: synonym filter problem for string or phrase
Just for clear terminology: you mean field, not fieldType. A fieldType is the definition of tokenizers, filters, etc. You apply a fieldType to a field. And you query against a field, not against a whole fieldType. :-) Kind regards - Mitch Marco Martinez-2 wrote: > > Hi Ranveer, > > If you don't specify a field type in the q parameter, the search will be > done against your default search field defined in the solrconfig.xml. > Is your default field a text_sync field? > > Regards, > > Marco Martínez Bautista > http://www.paradigmatecnologico.com > Avenida de Europa, 26. Ática 5. 3ª Planta > 28224 Pozuelo de Alarcón > Tel.: 91 352 59 42 >
RE: run on reboot on windows
Hi Ahmed, I need to achieve that also. Did you manage to install it as a service and start Solr with Jetty? After installing and starting Jetty as a service, how do you start Solr? Thanks, Frederico -Original Message- From: S Ahmed [mailto:sahmed1...@gmail.com] Sent: segunda-feira, 3 de Maio de 2010 01:05 To: solr-user@lucene.apache.org Subject: Re: run on reboot on windows Thanks, for some reason I was looking for a solution outside of jetty/tomcat, when that was the obvious way to get things restarted :) On Sun, May 2, 2010 at 7:53 PM, Dave Searle wrote: > Tomcat is installed as a service on windows. Just go into the services > control panel and set the startup type to automatic > > Sent from my iPhone > > On 3 May 2010, at 00:43, "S Ahmed" wrote: > > > its not tomcat/jetty that's the issue, it's how to get things to re- > > start on > > a windows server (tomcat and jetty don't run as native windows > > services) so > > I am a little confused.. thanks. > > > > On Sun, May 2, 2010 at 7:37 PM, caman > > wrote: > > > >> > >> Ahmed, > >> > >> Best is if you take a look at the documentation of Jetty or Tomcat. > >> Solr can run on any web container; it's up to you how you configure your web > >> container to run > >> > >> Thanks > >> > >> Aboxy > >> > >> From: S Ahmed [via Lucene] > >> Sent: Sunday, May 02, 2010 4:33 PM > >> To: caman > >> Subject: Re: run on reboot on windows > >> > >> By default it uses Jetty, so you're saying Tomcat on windows server 2008/ IIS7 > >> runs as a native windows service? > >> > >> On Sun, May 2, 2010 at 12:46 AM, Dave Searle <[hidden email]>wrote: > >> > >>> Set tomcat6 service to auto start on boot (if running Tomcat) > >>> > >>> Sent from my iPhone > >>> > >>> On 2 May 2010, at 02:31, "S Ahmed" <[hidden email]> wrote: > >>> > Hi, > > I'm trying to get Solr to run on windows, such that if it reboots > the Solr > service will be running. > > How can I do this?
RE: Problem with pdf, upgrading Cell
Hi Grant, I confirm what Praveen has said: any PDF I try does not work with the new Tika and SVN versions. :( Marc > From: sagar...@opentext.com > To: solr-user@lucene.apache.org > Date: Mon, 3 May 2010 13:05:24 +0530 > Subject: RE: Problem with pdf, upgrading Cell > > Hello, > > Please let me know if anybody figured out a way out of this issue. > > Thanks, > Sandhya > > -Original Message- > From: Praveen Agrawal [mailto:pkal...@gmail.com] > Sent: Friday, April 30, 2010 11:14 PM > To: solr-user@lucene.apache.org > Subject: Re: Problem with pdf, upgrading Cell > > Grant, > You can try any of the sample pdfs that come in the /docs folder of the Solr 1.4 > distribution. I had tried 'Installing Solr in Tomcat.pdf', 'index.pdf', etc. Only > metadata, i.e. stream_size and content_type, apart from my own literals, is > indexed, and the content is missing. > > > On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll wrote: > > > Praveen and Marc, > > > > Can you share the PDF (feel free to email my private email) that fails in > > Solr? > > > > Thanks, > > Grant > > > > > > On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: > > > > > Hi > > > Nope, I didn't get it to work... Just like you, the command-line version of > > Tika extracts the content correctly, but once included in Solr, no content > > is extracted. > > > What I tried until now is: - Updating the Tika libraries inside the Solr 1.4 > > public version, no luck there. - Downloading the latest SVN version, compiling > > it, and starting from a simple schema, still no luck. - Getting other versions > > compiled on hudson (nightly builds), and testing them also, still no > > extraction. > > > I sent a mail to the developers mailing list but they told me I should > > just mail here; I hope some developer reads this because it's quite an > > important feature of Solr and somehow it got broken between the 1.4 release > > and the last version on the svn. > > > Marc > > -- > > Grant Ingersoll > > http://www.lucidimagination.com/ > > > > Search the Lucene ecosystem using Solr/Lucene: > > http://www.lucidimagination.com/search
Re: Skipping duplicates in DataImportHandler based on uniqueKey
You can use deduplication to do that. Create the signature based on the unique field or any field you want.
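For reference, deduplication in Solr is wired up through an update processor chain in solrconfig.xml. The sketch below follows the shape documented on the Solr wiki; the field names (`id`, `signature`) are examples, not the poster's schema:

```xml
<!-- solrconfig.xml: compute a signature per document and overwrite
     documents that produce the same signature. -->
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <bool name="overwriteDupes">true</bool>
    <!-- base the signature on the unique field, per the advice above -->
    <str name="fields">id</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The signature field must also be declared in schema.xml, and the chain has to be referenced from the request handler doing the indexing (e.g. the DataImportHandler) via its update-chain parameter.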
Re: synonym filter problem for string or phrase
Hi Ranveer, I don't see any stemming analyzer in your configuration of the field 'text_sync'; also, you apply the synonym filter at query time and not at index time. Maybe that is your problem. Regards, Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2010/4/30 Jonty Rhods > On 4/29/10 8:50 PM, Marco Martinez wrote: > > Hi Ranveer, > > If you don't specify a field type in the q parameter, the search will be > done against your default search field defined in the solrconfig.xml. > Is your default field a text_sync field? > > Regards, > > Marco Martínez Bautista > http://www.paradigmatecnologico.com > Avenida de Europa, 26. Ática 5. 3ª Planta > 28224 Pozuelo de Alarcón > Tel.: 91 352 59 42 > > > 2010/4/29 Ranveer > > > Hi, > > I am trying to configure the synonym filter. > My requirement is: > when a user searches a phrase like "what is solr user?", it should be > replaced with "solr user". > Something like: what is solr user? => solr user > > My schema for the particular field is: > > positionIncrementGap="100"> > ... > ignoreCase="true" expand="true" > tokenizerFactory="KeywordTokenizerFactory"/> > ... > > It seems to work fine in analysis.jsp but not via the URL > http://localhost:8080/solr/core0/select?q="what is solr user?" > or > http://localhost:8080/solr/core0/select?q=what is solr user? > > Please guide me to achieve the desired result. > > Hi Marco, > thanks. > Yes, my default search field is text_sync. > I am getting results now, but not as I expect. > The following is my synonyms.txt: > > what is bone cancer=>bone cancer > what is bone cancer?=>bone cancer > what is of bone cancer=>bone cancer > what is symptom of bone cancer=>bone cancer > what is symptoms of bone cancer=>bone cancer > > Of the above, I am getting results for all synonyms except the last one, "what is > symptoms of bone cancer=>bone cancer". > I think I am not getting the expected result due to stemming. However, when I > check the result in analysis.jsp, it gives the expected result. I am confused... > Also, I want to know the best approach to configure synonyms for my requirement. > > thanks > with regards > > Hi, > > I am also facing the same type of problem. > I am a newbie, please help. > > thanks > Jonty >
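The fieldType markup in the quoted message was stripped by the archive; only attributes like `positionIncrementGap="100"` and `tokenizerFactory="KeywordTokenizerFactory"` survive. A typical query-time synonym setup of that shape looks like the sketch below — a reconstruction under assumptions, not the poster's exact config:

```xml
<!-- schema.xml: the keyword tokenizer keeps the whole query as one token,
     so multi-word rules like "what is solr user? => solr user" can match. -->
<fieldType name="text_sync" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Because the filter sits only in the query-time analyzer, index-side tokens never see the mapping — one common source of mismatches between analysis.jsp output and live query results.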
Re: phrase search - problem
> I wanted to do phrase search. What are the analyzers > that best suited for phrase search. I tried with > "textgen", but it did not yield the expected results. > > I wanted to index: > > my dear friend > > If I search for "dear friend", I should get the result and > if I search for "friend dear" I should not get any records. > Default PhraseQuery is unordered. "dear friend" returns documents containing "friend dear". It is not about Analyzer but QueryParser. So you want ordered phrase queries. With SOLR-1604 you can accomplish what you want. It constructs ordered SpanNearQuery instead of PhraseQuery. For example features:"stick memory" returns this snippet: SmartMedia, Memory Stick, Memory Stick Pro, SD Card http://localhost:8983/solr/select/?q=features:%22stick%20memory%22&version=2.2&start=0&rows=10&indent=on&defType=complexphrase&debugQuery=on&hl=true&hl.fl=features https://issues.apache.org/jira/browse/SOLR-1604
RE: Problem with pdf, upgrading Cell
Hello, Please let me know if anybody figured out a way out of this issue. Thanks, Sandhya -Original Message- From: Praveen Agrawal [mailto:pkal...@gmail.com] Sent: Friday, April 30, 2010 11:14 PM To: solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell Grant, You can try any of the sample pdfs that come in the /docs folder of the Solr 1.4 distribution. I had tried 'Installing Solr in Tomcat.pdf', 'index.pdf', etc. Only metadata, i.e. stream_size and content_type, apart from my own literals, is indexed, and the content is missing. On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll wrote: > Praveen and Marc, > > Can you share the PDF (feel free to email my private email) that fails in > Solr? > > Thanks, > Grant > > > On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: > > > Hi > > Nope, I didn't get it to work... Just like you, the command-line version of > Tika extracts the content correctly, but once included in Solr, no content > is extracted. > > What I tried until now is: - Updating the Tika libraries inside the Solr 1.4 > public version, no luck there. - Downloading the latest SVN version, compiling > it, and starting from a simple schema, still no luck. - Getting other versions > compiled on hudson (nightly builds), and testing them also, still no > extraction. > > I sent a mail to the developers mailing list but they told me I should > just mail here; I hope some developer reads this because it's quite an > important feature of Solr and somehow it got broken between the 1.4 release > and the last version on the svn. > > Marc > -- > Grant Ingersoll > http://www.lucidimagination.com/ > > Search the Lucene ecosystem using Solr/Lucene: > http://www.lucidimagination.com/search