Re: how often do you boys restart your tomcat?
Till now I used Jetty and got 2 weeks as the longest uptime until OOM. I just switched to Tomcat 6 and will see how that one behaves, but I think it's not a problem of the servlet container. Solr is pretty unstable when it has a huge index. Actually this can't be blamed directly on Solr; it is a problem of Lucene and its fieldCache. Somehow during 2 weeks of runtime with searching and replication the fieldCache keeps doubling until OOM. Currently there is no other solution to this than restarting your Tomcat or Jetty regularly :-( On 27.07.2011 03:42, Bing Yu wrote: I find that, if I do not restart the master's tomcat for some days, the load average will keep rising to a high level, solr becomes slow and unstable, so I added a crontab to restart the tomcat every day. Do you boys restart your tomcat? And is there any way to avoid restarting tomcat?
Re: how often do you boys restart your tomcat?
On curriki.org, our Solr's Tomcat saturates memory after 2-4 weeks. I am still investigating whether I am accumulating something or something else is. To check it, I am running a query-all (return num results) every minute to measure the time it takes. It's generally when it meets a big GC that gives a timeout that I start to worry. Memory then starts to be hogged, but things get back to normal as soon as the GC is done. I had other Tomcat servers with very long uptimes (more than 6 months), so I do not think Tomcat is guilty. Currently I can only show the free memory of the system and what's in solr-stats, but I do not know what to look at really... paul On 27 July 2011 at 03:42, Bing Yu wrote: I find that, if I do not restart the master's tomcat for some days, the load average will keep rising to a high level, solr becomes slow and unstable, so I added a crontab to restart the tomcat every day. Do you boys restart your tomcat? And is there any way to avoid restarting tomcat?
Re: how often do you boys restart your tomcat?
It is definitely Lucene's fieldCache making the trouble. Restart your Solr and monitor it with jvisualvm, especially the OldGen heap. When it gets to 100 percent filled, use jmap to dump the heap of your system. Then use the Eclipse Memory Analyzer http://www.eclipse.org/mat/ and open the heap dump. You will see a pie chart and can easily identify the largest consumer of your heap space. On 27.07.2011 09:02, Paul Libbrecht wrote: On curriki.org, our Solr's Tomcat saturates memory after 2-4 weeks. I am still investigating whether I am accumulating something or something else is. To check it, I am running a query-all (return num results) every minute to measure the time it takes. It's generally when it meets a big GC that gives a timeout that I start to worry. Memory then starts to be hogged, but things get back to normal as soon as the GC is done. I had other Tomcat servers with very long uptimes (more than 6 months), so I do not think Tomcat is guilty. Currently I can only show the free memory of the system and what's in solr-stats, but I do not know what to look at really... paul On 27 July 2011 at 03:42, Bing Yu wrote: I find that, if I do not restart the master's tomcat for some days, the load average will keep rising to a high level, solr becomes slow and unstable, so I added a crontab to restart the tomcat every day. Do you boys restart your tomcat? And is there any way to avoid restarting tomcat?
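Before reaching for jmap, it can help to put numbers on the OldGen growth over time. The following is a sketch, not part of any Solr tooling: it assumes `jstat -gcold <pid>` output with OC (old capacity, KB) and OU (old used, KB) columns; the column layout varies by JVM version, so adjust the header names if yours differ.

```python
# Sketch: compute OldGen occupancy from jstat -gcold output, to watch
# the fieldCache-driven growth between restarts. Assumes OC/OU columns.
def oldgen_percent(jstat_output):
    header, values = jstat_output.strip().splitlines()
    cols = dict(zip(header.split(), values.split()))
    return 100.0 * float(cols["OU"]) / float(cols["OC"])

# hypothetical captured output
sample = """   MC       MU      OC          OU       YGC    FGC    FGCT     GCT
  21248.0 20994.5    699072.0    524288.0    120    14    1.278    2.730"""

print(round(oldgen_percent(sample), 1))  # prints 75.0
```

Run this periodically (e.g. from cron) and log the value; when it plateaus near 100 percent, that is the moment to take the heap dump for the Memory Analyzer.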
Re: Conditional field values in DataImport
On Wed, Jul 27, 2011 at 7:20 AM, solruser@9913 gunaranj...@yahoo.com wrote: This may be a trivial question - I am a noob :). In the dataimport of a CSV file, I am trying to assign a field based on a conditional check on another field. E.g. <field name="rawLine" regex="CSV-splitting-regex" groupNames="X,Y,Z" /> works well. However I need to create another field A that is assigned a value based on X. [...] A ScriptTransformer should do the job. Please see http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer Regards, Gora
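For reference, a hedged sketch of what that could look like in data-config.xml. The entity attributes, the field names X and A, and the rule itself are hypothetical placeholders; only the ScriptTransformer wiring follows the documented pattern:

```xml
<dataConfig>
  <script><![CDATA[
    // hypothetical rule: derive field A from the value of field X
    function deriveA(row) {
      var x = row.get('X');
      row.put('A', (x != null && x.equals('foo')) ? 'bar' : 'baz');
      return row;
    }
  ]]></script>
  <document>
    <entity name="line" processor="LineEntityProcessor" url="data.csv"
            transformer="RegexTransformer,script:deriveA">
      <field column="rawLine" regex="CSV-splitting-regex" groupNames="X,Y,Z" />
    </entity>
  </document>
</dataConfig>
```

The transformers run in order, so the RegexTransformer populates X before the script sees the row.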
Re: Different options for autocomplete/autosuggestion
Hi Bell, I used autocomplete in Solr 3.1, something like this:

<searchComponent name="autocomplete" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">autocomplete</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.jaspell.JaspellLookup</str>
    <str name="field">autocomplete</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

and I followed http://solr.pl/en/2010/11/15/solr-and-autocomplete-part-2/ to index my data, and had a problem. With one word it works very well, but when I type two or more words the results returned are not right. I don't know why. Does anyone know this problem? Thanks for your help. -- View this message in context: http://lucene.472066.n3.nabble.com/Different-options-for-autocomplete-autosuggestion-tp2678899p3203032.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr vs ElasticSearch
On 06/01/2011 08:22 AM, Jason Rutherglen wrote: Thanks Shashi, this is oddly coincidental with another issue being put into Solr (SOLR-2193) to help solve some of the NRT issues, the timing is impeccable. Hmm, does anyone have an idea on when this will be finished? I'm considering if I should wait for the patch to solidify or if I should switch to ES. At a base however Solr uses Lucene, as does ES. I think the main advantage of ES is the auto-sharding etc. I think it uses a gossip protocol to capitalize on this however... Hmm... Yes it looks nice. T On Tue, May 31, 2011 at 10:01 PM, Shashi Kant sk...@sloan.mit.edu wrote: Here is a very interesting comparison http://engineering.socialcast.com/2011/05/realtime-search-solr-vs-elasticsearch/ -Original Message- From: Mark Sent: May-31-11 10:33 PM To: solr-user@lucene.apache.org Subject: Solr vs ElasticSearch I've been hearing more and more about ElasticSearch. Can anyone give me a rough overview on how these two technologies differ. What are the strengths/weaknesses of each. Why would one choose one of the other? Thanks -- Regards / Med vennlig hilsen Tarjei Huse Mobil: 920 63 413
Problem starting solr on jetty
Hi, I am new to solr. I have downloaded the solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting the following error on running this command: C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example>java -jar start.jar java.lang.NullPointerException at java.io.File.<init>(File.java:222) at org.mortbay.start.Main.init(Main.java:465) at org.mortbay.start.Main.start(Main.java:439) at org.mortbay.start.Main.main(Main.java:119) Could someone help me in resolving this issue? Thanks Regards Anand Nigam *** The Royal Bank of Scotland plc. Registered in Scotland No 90312. Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB. Authorised and regulated by the Financial Services Authority. The Royal Bank of Scotland N.V. is authorised and regulated by the De Nederlandsche Bank and has its seat at Amsterdam, the Netherlands, and is registered in the Commercial Register under number 33002587. Registered Office: Gustav Mahlerlaan 350, Amsterdam, The Netherlands. The Royal Bank of Scotland N.V. and The Royal Bank of Scotland plc are authorised to act as agent for each other in certain jurisdictions. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer. Internet e-mails are not necessarily secure. The Royal Bank of Scotland plc and The Royal Bank of Scotland N.V. including its affiliates (RBS group) does not accept responsibility for changes made to this message after it was sent. 
For the protection of RBS group and its clients and customers, and in compliance with regulatory requirements, the contents of both incoming and outgoing e-mail communications, which could include proprietary information and Non-Public Personal Information, may be read by authorised persons within RBS group other than the intended recipient(s). Whilst all reasonable care has been taken to avoid the transmission of viruses, it is the responsibility of the recipient to ensure that the onward transmission, opening or use of this message and any attachments will not adversely affect its systems or data. No responsibility is accepted by the RBS group in this regard and the recipient should carry out such virus and other checks as it considers appropriate. Visit our website at www.rbs.com ***
Re: Autocomplete with Solr 3.1
I know the solution, just not how to actually implement it, but maybe somebody can help with that :) From the Wiki: If you want to use a dictionary file that contains phrases (actually, strings that can be split into multiple tokens by the default QueryConverter) then define a different QueryConverter like this: <queryConverter name="queryConverter" class="org.apache.solr.spelling.MySpellingQueryConverter"/> -- View this message in context: http://lucene.472066.n3.nabble.com/Autocomplete-with-Solr-3-1-tp3202214p3203191.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to make a valid date facet query?
Hi Floyd, yes, those queries are supported. Make sure you use the right encoding for the plus sign: facet.query=onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR%2B5YEARS] The result of this facet query will be the number of documents in the result set that match that range. You'll have to use different facet queries for the different ranges to achieve what you want. Regards, Tomás On Wed, Jul 27, 2011 at 12:43 AM, Floyd Wu floyd...@gmail.com wrote: Hi Tomás Are facet queries supposed to support the following? facet.query=onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR+5YEARS] I tried this but the returned result was not correct. Am I missing something? Floyd 2011/7/26 Tomás Fernández Löbbe tomasflo...@gmail.com Hi Floyd, I don't think the feature that allows multiple gaps for a range facet is committed. See https://issues.apache.org/jira/browse/SOLR-2366 You can achieve similar functionality by using facet.query. See: http://wiki.apache.org/solr/SimpleFacetParameters#Facet_Fields_and_Facet_Queries Regards, Tomás On Tue, Jul 26, 2011 at 1:23 AM, Floyd Wu floyd...@gmail.com wrote: Hi all, I need to make a date faceted query and I tried to use facet.range but can't get the result I need. I want to make 4 facets like the following: 1 Month, 3 Months, 6 Months, more than 1 Year. The onlinedate field in schema.xml looks like this: <field name="onlinedate" type="tdate" indexed="true" stored="true"/> I hit solr with this url http://localhost:8983/solr/select/?q=*%3A*&start=0&rows=10&indent=on&facet=true&facet.range=onlinedate&f.onlinedate.facet.range.start=NOW-1YEARS&f.onlinedate.facet.range.end=NOW%2B1YEARS&f.onlinedate.facet.range.gap=NOW-1MONTHS,NOW-3MONTHS,NOW-6MONTHS,NOW-1YEAR But solr complained: Exception during facet.range of onlinedate org.apache.solr.common.SolrException: Can't add gap NOW-1MONTHS, NOW-3MONTHS, NOW-6MONTHS,NOW-1YEAR to value Mon Jul 26 11:56:40 CST 2010 What is the correct way to realize this requirement? Please help on this. Floyd
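The encoding point above can be checked from the client side without Solr at all. A minimal Python sketch (standard library only) showing why the literal `+` must become `%2B` in the URL, since an unencoded `+` in a query string decodes to a space:

```python
from urllib.parse import quote

# Build the facet.query parameter with Solr date math, percent-encoding
# everything so the '+' in NOW/YEAR+5YEARS survives the round trip.
raw = "onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR+5YEARS]"
param = "facet.query=" + quote(raw, safe="")
print(param)
```

Any HTTP client library that encodes parameters for you (rather than string-concatenating the URL) sidesteps this class of bug.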
what data type for geo fields?
Looking at the example schema: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml the solr.PointType field type uses double (is this just an example field, or used for geo search?), while the solr.LatLonType field uses tdouble, and it's unclear how the geohash is translated into lat/lon values, or if the geohash itself might typically be used as a copyField and used just for matching a query on a geohash? Is there an advantage in terms of speed to using Trie fields for solr.LatLonType? I would assume so, e.g. for bbox operations. Thanks, Peter -- Peter M. Wolanin, Ph.D. : Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com : 978-296-5247 Get a free, hosted Drupal 7 site: http://www.drupalgardens.com
Delete by range query
Hi, I want to delete a bunch of docs from my Solr using a range query. I have one field called 'time' which is tint. I am deleting using the query: <delete><query>time:[1296777600+TO+1296778000]</query></delete> but solr is returning an error, saying bad request. However I am able to delete one by one using the delete query below: <delete><query>time:1296777600</query></delete> Please suggest any solution to this problem. -- Thanks and Regards Mohammad Shariq
Re: Solr vs ElasticSearch
You might also check out Solandra: https://github.com/tjake/Solandra With Solr's configuration and indexes in Cassandra, you can benefit from replication, distribution etc., and still have Cassandra available for non-Solr specific purposes. Cheers, Jeff On Jul 27, 2011, at 5:17 AM, Tarjei Huse wrote: On 06/01/2011 08:22 AM, Jason Rutherglen wrote: Thanks Shashi, this is oddly coincidental with another issue being put into Solr (SOLR-2193) to help solve some of the NRT issues, the timing is impeccable. Hmm, does anyone have an idea on when this will be finished? I'm considering if I should wait for the patch to solidify or if I should switch to ES. At a base however Solr uses Lucene, as does ES. I think the main advantage of ES is the auto-sharding etc. I think it uses a gossip protocol to capitalize on this however... Hmm... Yes it looks nice. T On Tue, May 31, 2011 at 10:01 PM, Shashi Kant sk...@sloan.mit.edu wrote: Here is a very interesting comparison http://engineering.socialcast.com/2011/05/realtime-search-solr-vs-elasticsearch/ -Original Message- From: Mark Sent: May-31-11 10:33 PM To: solr-user@lucene.apache.org Subject: Solr vs ElasticSearch I've been hearing more and more about ElasticSearch. Can anyone give me a rough overview on how these two technologies differ. What are the strengths/weaknesses of each. Why would one choose one of the other? Thanks -- Regards / Med vennlig hilsen Tarjei Huse Mobil: 920 63 413 -- Jeff Schmidt 535 Consulting j...@535consulting.com http://www.535consulting.com (650) 423-1068
Re: what data type for geo fields?
On Wed, Jul 27, 2011 at 9:01 AM, Peter Wolanin peter.wola...@acquia.com wrote: Looking at the example schema: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml the solr.PointType field type uses double (is this just an example field, or used for geo search?) While you could possibly use PointType for geo search, it doesn't have good support for it (it's more of a general n-dimensional point). The LatLonType has all the geo support currently. ..., while the solr.LatLonType field uses tdouble and it's unclear how the geohash is translated into lat/lon values or if the geohash itself might typically be used as a copyField and used just for matching a query on a geohash? There's no geohash used in LatLonType. It is indexed as a lat and a lon under the covers (using the suffix _d). Is there an advantage in terms of speed to using Trie fields for solr.LatLonType? Currently only for explicit range queries... like point:[10,10 TO 20,20] I would assume so, e.g. for bbox operations. It's a bit of an implementation detail, but bbox doesn't currently use range queries. -Yonik http://www.lucidimagination.com
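For concreteness, this is roughly how the 3.x example schema wires LatLonType up (quoted from memory, so verify against your own schema.xml); the sub-field suffix is configurable via subFieldSuffix, and the Trie-based tdouble sub-fields are what make the explicit range queries fast:

```xml
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
<field name="store" type="location" indexed="true" stored="true"/>
```

The lat and lon components are indexed separately into the hidden `*_coordinate` dynamic fields, which is why you never see a geohash anywhere in this type.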
Re: Delete by range query
<delete><query>time:[1296777600+TO+1296778000]</query></delete> Should be <delete><query>time:[1296777600 TO 1296778000]</query></delete> ? koji -- http://www.rondhuit.com/en/
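The distinction Koji is pointing at: `+` is URL escaping for a space, which only applies in a query string, not in a POSTed XML body. A small Python sketch building the correct body (hypothetical values, no Solr connection):

```python
# Build a delete-by-query body for a range on the 'time' field.
# In a POSTed XML body the range separator is a literal " TO ",
# not the URL-escaped "+TO+".
lo, hi = 1296777600, 1296778000
body = "<delete><query>time:[%d TO %d]</query></delete>" % (lo, hi)
print(body)
```

Posting `body` with Content-Type text/xml to the update handler should then succeed where the `+TO+` form returned the bad-request error.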
Re: Solr vs ElasticSearch
On Wed, Jul 27, 2011 at 7:17 AM, Tarjei Huse tar...@scanmine.com wrote: On 06/01/2011 08:22 AM, Jason Rutherglen wrote: Thanks Shashi, this is oddly coincidental with another issue being put into Solr (SOLR-2193) to help solve some of the NRT issues, the timing is impeccable. Hmm, does anyone have an idea on when this will be finished? It's in trunk now... try it out! -Yonik http://www.lucidimagination.com
Re: Delete by range query
Thanks Koji, it's working now. On 27 July 2011 19:30, Koji Sekiguchi k...@r.email.ne.jp wrote: <delete><query>time:[1296777600+TO+1296778000]</query></delete> Should be <delete><query>time:[1296777600 TO 1296778000]</query></delete> ? koji -- http://www.rondhuit.com/en/ -- Thanks and Regards Mohammad Shariq
Re: using distributed search with the suggest component
Thanks, but this does not work. Looking at the log files, I see only one request when executing a search. Executing a request to the default servlet (/select) with multiple shards, each core gets asked for the current query. Any other suggestions? Tobias On Tue, Jul 26, 2011 at 2:11 PM, mdz-munich sebastian.lu...@bsb-muenchen.de wrote: Hi Tobias, try this, it works for us (Solr 3.3): solrconfig.xml:

<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">word</str>
  <lst name="spellchecker">
    <str name="name">suggestion</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
    <str name="field">wordCorpus</str>
    <str name="comparatorClass">score</str>
    <str name="storeDir">./suggester</str>
    <str name="buildOnCommit">false</str>
    <str name="buildOnOptimize">true</str>
    <float name="threshold">0.005</float>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.dictionary">suggestion</str>
    <str name="spellcheck.count">50</str>
    <str name="spellcheck.maxCollations">50</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Query like that: http://localhost:8080/solr/core.01/suggest?q=wordPrefix&shards=localhost:8080/solr/core.01,localhost:8080/solr/core.02&shards.qt=/suggest Greetz, Sebastian Tobias Rübner wrote: Hi, I try to use the suggest component (solr 3.3) with multiple cores. I added a search component and a request handler as described in the docs ( http://wiki.apache.org/solr/Suggester ) to my solrconfig. That works fine for 1 core, but querying my solr instance with the shards parameter does not query multiple cores. It just ignores the shards parameter. 
http://localhost:/solr/core1/suggest?q=sa&shards=localhost:/solr/core1,localhost:/solr/core2 The documentation of the SpellCheckComponent ( http://wiki.apache.org/solr/SpellCheckComponent#Distributed_Search_Support ) is a bit vague on that point, because I don't know if this feature really works with solr 3.3. It is targeted for solr 1.5, which will never come, but it says it is now available. I also tried the shards.qt parameter, but it does not change my results. Thanks for any help, Tobias -- View this message in context: http://lucene.472066.n3.nabble.com/using-distributed-search-with-the-suggest-component-tp3197651p3200143.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Problem starting solr on jetty
Hi Anand, Someone else reported this exact same error with Solr v1.4.0: http://www.lucidimagination.com/search/document/fd5b83f3595a1c6c/can_t_start_solr_by_java_jar_start_jar I downloaded the apache-solr-3.3.0.zip, unpacked it, then ran 'java -jar start.jar' from the cmdline. It worked. (Windows 7; Oracle Java 1.6.0_23). I tried to reproduce the error you're seeing, by making the example\ directory and all its contents read-only (different exception: FileNotFound), and by removing the entire contents of the example\ directory except for start.jar (nothing happens - it just quits without printing anything out). Can you give more details about your environment? Steve -Original Message- From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com] Sent: Wednesday, July 27, 2011 7:25 AM To: solr-user@lucene.apache.org Subject: Problem starting solr on jetty Hi, I am new to solr. I have downloaded the solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting the following error on running this command: C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example>java -jar start.jar java.lang.NullPointerException at java.io.File.<init>(File.java:222) at org.mortbay.start.Main.init(Main.java:465) at org.mortbay.start.Main.start(Main.java:439) at org.mortbay.start.Main.main(Main.java:119) Could someone help me in resolving this issue? Thanks Regards Anand Nigam
Why Slop doesn't match anything?
Hello pals, Using solr 1.4.0. Trying to understand something. When I run the query fieldA:"nokia c3", I get 5 results, all with "nokia c3", as expected. But when I run fieldA:"nokia c3"~100, I don't get any results! As far as I understand, the ~100 should make my query bring even more results, as not only documents with "nokia c3" in their fieldA will be found; something like "nokia blue c3" should match too. Right? So why don't I get any results? Any known bug? -- Alexander Ramos Jardim
Re: Dealing with keyword stuffing
On Wed, Jul 27, 2011 at 7:15 PM, Pranav Prakash pra...@gmail.com wrote: I guess most of you have already handled and many of you might still be handling keyword stuffing. Here is my scenario. We have a huge index containing about 6m docs. (Not sure if that is huge :-) And every document contains title, description, tags, content (textual data). People have been doing keyword stuffing on the documents, so when searched for a query term, the first results are always the ones who are optimized. So, instead of people getting relevant results, they get spam content (highly optimized, keyword stuffed content) as first few results. I have tried a couple of things like providing different boosts to different fields, but almost everything seems to fail. [...] Presumably, they are doing this by increasing tf (term frequency), i.e., by repeating keywords multiple times. If so, you can use a custom similarity class that caps term frequency, and/or ensures that the scoring increases less than linearly with tf. Please see http://wiki.apache.org/solr/SchemaXml#Similarity , and/or do a web search for more details. Regards, Gora
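Gora's suggestion can be illustrated numerically. This is not Solr's actual code (a real implementation would override tf() in a Similarity subclass, per the SchemaXml wiki link above); it just shows the math of capping term frequency, with the cap of 5 and the term counts being arbitrary assumptions:

```python
import math

# DefaultSimilarity-style tf is sqrt(freq); a capped variant limits how
# much keyword stuffing can inflate the score.
def tf_default(freq):
    return math.sqrt(freq)

def tf_capped(freq, cap=5):  # cap value is a hypothetical tuning knob
    return math.sqrt(min(freq, cap))

stuffed, normal = 400, 3  # hypothetical term counts in two documents

print(tf_default(stuffed) / tf_default(normal))  # stuffed doc ~11.5x ahead
print(tf_capped(stuffed) / tf_capped(normal))    # advantage shrinks to ~1.3x
```

Even without a hard cap, any sublinear tf (e.g. log-based) compresses the gap; the cap just makes the ceiling explicit.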
Re: Exact match not the first result returned
Thanks Emmanuel for that explanation. I implemented your solution but I'm not quite there yet. Suppose I also have a record: RECORD 3 <arr name="myname"> <str>Fred G. Anderson</str> <str>Fred Anderson</str> </arr> With your solution, RECORD 1 does appear at the top, but I think that's just blind luck more than anything else, because RECORD 3 shows as having the same score. So what more can I do to push RECORD 1 up to the top? Ideally, I'd like all three records returned with RECORD 1 being the first listing. Thanks, Brian Lamb On Tue, Jul 26, 2011 at 6:03 PM, Emmanuel Espina espinaemman...@gmail.com wrote: That is caused by the size of the documents. The principle is pretty intuitive: if one of your documents is the entire three volumes of The Lord of the Rings, and you search for "tree", I know that The Lord of the Rings will be in the results, and I haven't memorized the entire text of that book :p It is a matter of probability that in a big (big!) text any word will have a greater chance to be found than in a smaller text. So one can infer that the smaller text is more relevant than the big text. That is the principle applied here, and Lucene does that when building the ranking. The first document is bigger (remember that all the values of a multivalued field are merged into one field in the index, so you cannot tell one value from another apart) than the second one. In the first one you have [Fred, coolest, guy, town] and in the second [Fred, Anderson], so the second document is more relevant than the first one. 
To avoid all this procedure you can set omitNorms to true, and that should make the first document more relevant, because Fred appears twice (not because Fred appears alone in a value). Regards Emmanuel 2011/7/26 Brian Lamb brian.l...@journalexperts.com Hi all, I am a little confused as to why the scoring is working the way it is. I have a field defined as: <field name="myname" type="text" indexed="true" stored="true" required="false" multiValued="true" /> And I have several documents where that value is: RECORD 1 <arr name="myname"> <str>Fred</str> <str>Fred (the coolest guy in town)</str> </arr> OR RECORD 2 <arr name="myname"> <str>Fred Anderson</str> </arr> When I do a search for http://localhost:8983/solr/search/?q=myname:Fred I get RECORD 2 returned before RECORD 1. RECORD 2 5.282213 = (MATCH) fieldWeight(myname:Fred in 256575), product of: 1.0 = tf(termFreq(myname:Fred)=1) 8.451541 = idf(docFreq=7306, maxDocs=12586425) 0.625 = fieldNorm(field=myname, doc=256575) RECORD 1 4.482106 = (MATCH) fieldWeight(myname:Fred in 215), product of: 1.4142135 = tf(termFreq(myname:Fred)=2) 8.451541 = idf(docFreq=7306, maxDocs=12586425) 0.375 = fieldNorm(field=myname, doc=215) So the difference is obviously fieldNorm, but I think that's only part of the story. Why is RECORD 2 returned with a higher score than RECORD 1 even though RECORD 1 matches Fred exactly? And how should I do this differently so that I get the results I am expecting? Thanks, Brian Lamb
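Re-doing the arithmetic from Brian's debug output makes Emmanuel's point concrete: fieldWeight is just tf * idf * fieldNorm, and the shorter document's larger fieldNorm (0.625 vs 0.375) outweighs the sqrt(2) tf boost from the second "Fred". A quick check (numbers taken from the debug output above):

```python
import math

idf = 8.451541  # from the debug output; same for both docs

# RECORD 1: "Fred" occurs twice, longer field -> smaller norm
record1 = math.sqrt(2) * idf * 0.375
# RECORD 2: "Fred" occurs once, shorter field -> larger norm
record2 = 1.0 * idf * 0.625

print(round(record1, 6), round(record2, 6))  # matches 4.482106 and 5.282213
```

This is also why omitNorms changes the outcome: with the norm factor gone, only tf and idf remain, and RECORD 1's tf of sqrt(2) wins.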
Re: Why Slop doesn't match anything?
On Wed, Jul 27, 2011 at 8:38 PM, Alexander Ramos Jardim alexander.ramos.jar...@gmail.com wrote: Hello pals, Using solr 1.4.0. Trying to understand something. When I run the query *fieldA:nokia c3*, I get 5 results. All with nokia c3, as expected. But when I run fieldA:nokia c3~100, I don get any result! As far as I understand the ~100 should make my query bring even more results as not only documents with nokia c3 in their fieldA will be found. Something like nokia blue c3 should match too. Right? [...] That does seem odd. You are not using the dismax query handler by any chance, are you? If so, then the query slop needs to be specified by adding qs=100 to the query. Regards, Gora
Solr Master-slave master failover without data loss
Suppose master goes down immediately after the index updates, while the updates haven't been replicated to the slaves, data loss seems to happen. Does Solr have any mechanism to deal with that? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Master-slave-master-failover-without-data-loss-tp3203644p3203644.html Sent from the Solr - User mailing list archive at Nabble.com.
Filter content upon indexing
Hi all, I am trying to index html documents using Solr and I am having difficulties extracting certain parts of the main content of the document and storing them separately into other fields. I saw in the docs that it is possible to achieve this using xpath, but in my case I need to do a regex match. To be more specific, I want to copy certain pattern content to the title field. My first attempt was to define a custom field type with a PatternFilter and copy the content field to the title field, but this did not work. My next attempt was to give the copyField tag pattern and group attributes, but this did not work either. Is it possible to do what I am trying? I am unwilling to resort to grep outside Solr as I am pretty sure Solr is capable of doing what I want... best regards, Rafael Ribeiro -- View this message in context: http://lucene.472066.n3.nabble.com/Filter-content-upon-indexing-tp3203946p3203946.html Sent from the Solr - User mailing list archive at Nabble.com.
Jetty Logs - Max line size?
Hello I enabled Jetty Logs but my GET requests seem so long that they get truncated and without a line break, so in the end it looks like this: notice the logged ping and where it begins. How can i change this? thank you very much 000.000.000.000 - - [27/Jul/2011:17:38:04 +0100] GET /solr/select?fl=id%2Cdoc_feature%2Cmandant%2Cd_id%2Cdoc_title%2Cdoc_id%2Cscore%2Ccategory%2Canrede%2Ckeyword_a%2Ckeyword_a_name%2Cmenu_id%2Cprice%2Caprice%2Cstandort_id%2Cdate_start%2Cdate_end%2Cdlpath%2Cdownloads%2Cdoc_type%2Cdoc_id%2Cmenu_path_text%2Ctitle%2Csummary%2Csubtitle%2Cdate_online%2Ctables%2Ckdnrzen%2Cplz%2Cort%2Cadresse%2Cbundesland%2Ctelefonnummer%2Cmobil%2Cfax%2Cemail%2Cnachname%2Cvorname%2Chauptfunktion%2Cbereichname%2Cfilialenamesort=date_online+descrows=0version=1.2wt=jsonjson.nl=mapq=text_copy%3A%28schuhe%29+%28category%3A%22Lagerhaus%22+OR+%28category%3ASortiment+AND+mandant%3A%28%22Lagerhaus%22+OR+Portal%29%29+OR+kdnrzen%3A%28%2A%29%29+AND+-%28doc_feature%3Abroschuere+AND+doc_type%3Acontent%29+AND+-%28menu_path_text%3A%2ATables%2A%29+AND+-%28id%3Acld_%2A%29+date_online%3A%5B2006-06-21T00%3A00%3A00Z+TO+2011-7-27T23%3A59%3A59Z%5D+date_offline%3A%5B2011-7-27T00%3A00%3A00Z+TO+%2A%5D+%28doc_type%3Acontent+AND+-doc_feature%3A%28ange000.000.000.000 - - [27/Jul/2011:17:38:04 +0100] HEAD /solr/admin/ping HTTP/1.0 200 0
Re: Autocomplete with Solr 3.1
Hi Klein, Thanks for your reply. I tried a few Suggester setups with Solr and the results returned are good, but I want to use it as a search component with Solr 3.1, and now I have some problems with the Suggester. I think the cause may be in my schema file. This is the schema file:

  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

And I defined these fields:

  <field name="s_SongId" type="string" indexed="true" stored="true"/>
  <field name="s_SongName" type="text" indexed="true" stored="true"/>
  <field name="search_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="true"/>

where the text_auto fieldType is:

  <fieldType class="solr.TextField" name="text_auto" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

In solrconfig.xml I defined:

  <searchComponent name="spellcheck-autocomplete" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">suggest</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
      <str name="field">search_autocomplete</str>
      <str name="buildOnCommit">true</str>
    </lst>
  </searchComponent>

  <requestHandler name="/autocomplete" class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
      <str name="spellcheck">true</str>
      <str name="spellcheck.dictionary">suggest</str>
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.collate">true</str>
    </lst>
    <arr name="components">
      <str>spellcheck-autocomplete</str>
    </arr>
  </requestHandler>

Can anyone help? -- View this message in context: http://lucene.472066.n3.nabble.com/Autocomplete-with-Solr-3-1-tp3202214p3204176.html Sent from the Solr - User mailing list archive at Nabble.com.
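For illustration, the TSTLookup selected in the config above is at heart a prefix lookup over the lowercased field values. A minimal sketch of that behavior (illustrative only, NOT Solr's actual TSTLookup implementation) using a sorted list and Python's bisect module:

```python
import bisect

# Toy illustration of what a prefix-based suggester does: keep lowercased
# terms sorted, binary-search to the prefix, and return every term that
# starts with the typed prefix, up to the requested count.
def suggest(terms, prefix, count=10):
    terms = sorted(t.lower() for t in terms)
    prefix = prefix.lower()
    start = bisect.bisect_left(terms, prefix)
    out = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match the prefix
        out.append(term)
        if len(out) == count:
            break
    return out

songs = ["Yellow Submarine", "Yesterday", "Help!", "Hey Jude"]
print(suggest(songs, "ye"))
```

The real Suggester builds its lookup structure from the indexed field (hence buildOnCommit), but the prefix-matching behavior is the same idea.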
schema.xml changes, need re-indexing ?
Hi, We currently have a big index in production. We would like to add 2 non-required fields to our schema.xml:

  <field name="myfield" type="boolean" indexed="true" stored="true" required="false"/>
  <field name="myotherfield" type="string" indexed="true" stored="true" required="false" multiValued="true"/>

I made some tests:
- I stopped tomcat
- I changed the schema.xml
- I started tomcat

The data was still there and I was able to add new documents with these 2 fields. So far, it looks like I won't need to re-index all my data. Am I right? Do I need to re-index all my data, or am I fine in this case? Thank you! Charles-André Martin
Re: Filter content upon indexing
If you can express what you want with a regular expression, then the pattern filter should work! I'm thinking that maybe you tokenized the field and that invalidated the structure of the HTML. I would use a contents field analyzed with http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory and use copyField to another field Title that has a KeywordTokenizer in combination with PatternFilter (with the pattern of the title of your pages). Thanks Emmanuel 2011/7/27 Rafael Ribeiro rafae...@gmail.com Hi all, I am trying to index HTML documents using Solr and I am having difficulties extracting certain parts of the main content of the document and storing them separately in other fields. I saw in the docs that it is possible to achieve this using XPath, but in my case I need to do a regex match. To be more specific, I want to copy content matching a certain pattern to the title field. My first attempt was to define a custom field type with a PatternFilter and copy the content field to the title field, but this did not work. My next attempt was to give the copyField tag pattern and group attributes, but this did not work either. Is it possible to do what I am trying? I am unwilling to resort to grep outside Solr as I am pretty sure Solr is capable of doing what I want... best regards, Rafael Ribeiro -- View this message in context: http://lucene.472066.n3.nabble.com/Filter-content-upon-indexing-tp3203946p3203946.html Sent from the Solr - User mailing list archive at Nabble.com.
Data Import Handler Architecture Diagram
Maybe I am looking at the wrong version - the diagram (and the screenshot in the interactive dev mode section) don't show up in the wiki page. http://wiki.apache.org/solr/DataImportHandler#Architecture Is this a wrong link? I did an inspect element and this is what I see ... /solr/DataImportHandler?action=AttachFile&do=get&target=DataImportHandlerOverview.png -g -- View this message in context: http://lucene.472066.n3.nabble.com/Data-Import-Handler-Architecture-Diagram-tp3204459p3204459.html Sent from the Solr - User mailing list archive at Nabble.com.
Data Import Handler Diagram
Maybe I am looking at the wrong version - the diagram (and the screenshot in the interactive dev mode section) don't show up in the wiki page. http://wiki.apache.org/solr/DataImportHandler#Architecture Is this a wrong link? I did an inspect element and this is what I see ... <img alt="DataImportHandlerOverview.png" class="attachment" src="/solr/DataImportHandler?action=AttachFile&do=get&target=DataImportHandlerOverview.png" title="DataImportHandlerOverview.png"> -g -- View this message in context: http://lucene.472066.n3.nabble.com/Data-Import-Handler-Diagram-tp3204470p3204470.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: schema.xml changes, need re-indexing ?
You should be fine - no need to re-index your data. Adding and removing fields is generally safe to do without a re-index. Changing a field (its type, analyzers, etc) requires more caution and generally does require a re-index. -Michael
Re: schema.xml changes, need re-indexing ?
I believe you're fine with that. You don't need to reindex the whole Solr index. 2011/7/27 Charles-Andre Martin charles-andre.mar...@sunmedia.ca Hi, We currently have a big index in production. We would like to add 2 non-required fields to our schema.xml:

  <field name="myfield" type="boolean" indexed="true" stored="true" required="false"/>
  <field name="myotherfield" type="string" indexed="true" stored="true" required="false" multiValued="true"/>

I made some tests: - I stopped tomcat - I changed the schema.xml - I started tomcat The data was still there and I was able to add new documents with these 2 fields. So far, it looks like I won't need to re-index all my data. Am I right? Do I need to re-index all my data, or am I fine in this case? Thank you! Charles-André Martin -- *Alexei Martchenko* | *CEO* | Superdownloads ale...@superdownloads.com.br | ale...@martchenko.com.br | (11) 5083.1018/5080.3535/5080.3533
Re: Filter content upon indexing
I want to add that, since the stored text (not the indexed text) is not analyzed, if you retrieve the title you will get all the HTML. If you want to extract the title for storage in a separate field, that will have to be done with a different tool, not just with the analysis chain. My previous answer was focused only on extraction of text for searching purposes. Thanks Emmanuel 2011/7/27 Emmanuel Espina espinaemman...@gmail.com If you can express what you want with a regular expression then the pattern filter should work! I'm thinking that maybe you tokenized the field and that invalidated the structure of the HTML. I would use a contents field analyzed with a http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory and use copyField to another field Title that has a KeywordTokenizer in combination with PatternFilter (with the pattern of the title of your pages) Thanks Emmanuel 2011/7/27 Rafael Ribeiro rafae...@gmail.com Hi all, I am trying to index HTML documents using Solr and I am having difficulties extracting certain parts of the main content of the document and storing them separately in other fields. I saw in the docs that it is possible to achieve this using XPath but in my case I need to do a regex match. To be more specific I want to copy content matching a certain pattern to the title field. My first attempt was to define a custom field type with a PatternFilter and copy the content field to the title field but this did not work. My next attempt was to give the copyField tag pattern and group attributes but this did not work either. Is it possible to do what I am trying? I am unwilling to resort to grep outside Solr as I am pretty sure Solr is capable of doing what I want... best regards, Rafael Ribeiro
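The "different tool" for pulling the title out before indexing can be very small. A hypothetical pre-processing sketch (plain Python outside Solr, not a Solr analysis filter) that extracts a <title> pattern from raw HTML so it can be sent as its own field:

```python
import re

# Hypothetical pre-processing step run BEFORE documents are posted to
# Solr: pull the <title> element out of raw HTML with a regex so it can
# be indexed and stored as a separate field.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

def extract_title(html):
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None

doc = "<html><head><title>My Page</title></head><body>text</body></html>"
print(extract_title(doc))  # My Page
```

The same idea works for any other pattern Rafael wants to lift into its own field; the stored value then stays clean even though the main contents field keeps the full HTML.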
Indexing SharePoint from SolrJ
Does anyone have examples of indexing SharePoint content using the Google Connectors API and SolrJ? I know Lucid Imagination has a SharePoint connector and I have used it successfully. However, I would like to create a thumbnail image of PDF and PPT docs and add that to my index, and I assume I need to use SolrJ and some third-party libraries to do that. Hence I want to crawl SharePoint using SolrJ so I can then call third-party libraries at index time. Thanks so much David
Solr Performance Tuning: -XX:+AggressiveOpts
Anyone tried this? I cannot start Solr-Tomcat with the following options on Ubuntu:

JAVA_OPTS="$JAVA_OPTS -Xms2048m -Xmx2048m -Xmn256m -XX:MaxPermSize=256m"
JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/data/solr -Dfile.encoding=UTF8 -Duser.timezone=GMT -Djava.util.logging.config.file=/data/solr/logging.properties -Djava.net.preferIPv4Stack=true"
JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+AggressiveOpts -XX:NewSize=64m -XX:MaxNewSize=64m -XX:CMSInitiatingOccupancyFraction=77 -XX:+CMSParallelRemarkEnabled"
JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/data/solr/solr-gc.log"

Tomcat log (something about PorterStemFilter; Solr 3.3.0):

INFO: Server startup in 2683 ms
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x7f5c6f36716e, pid=7713, tid=140034519381760
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J org.apache.lucene.analysis.PorterStemFilter.incrementToken()Z
# [thread 140034523637504 also had an error]
[thread 140034520434432 also had an error]
# An error report file with more information is saved as:
# [thread 140034520434432 also had an error]
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
#

However, I can start it and run without any problems by removing -XX:+AggressiveOpts (which is slated to become a default setting in upcoming Java 6 releases). Do we need to disable escape analysis with -XX:-DoEscapeAnalysis as IBM suggests? http://www-01.ibm.com/support/docview.wss?uid=swg21422605 Thanks, Fuad Efendi http://www.tokenizer.ca
Re: Solr Performance Tuning: -XX:+AggressiveOpts
Don't use this option, these optimizations are buggy: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134 On Wed, Jul 27, 2011 at 3:56 PM, Fuad Efendi f...@efendi.ca wrote: Anyone tried this? I can not start Solr-Tomcat with following options on Ubuntu: JAVA_OPTS=$JAVA_OPTS -Xms2048m -Xmx2048m -Xmn256m -XX:MaxPermSize=256m JAVA_OPTS=$JAVA_OPTS -Dsolr.solr.home=/data/solr -Dfile.encoding=UTF8 -Duser.timezone=GMT -Djava.util.logging.config.file=/data/solr/logging.properties -Djava.net.preferIPv4Stack=true JAVA_OPTS=$JAVA_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+AggressiveOpts -XX:NewSize=64m -XX:MaxNewSize=64m -XX:CMSInitiatingOccupancyFraction=77 -XX:+CMSParallelRemarkEnabled JAVA_OPTS=$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/data/solr/solr-gc.log Tomcat log (something about PorterStemFilter; Solr 3.3.0): INFO: Server startup in 2683 ms # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f5c6f36716e, pid=7713, tid=140034519381760 # # JRE version: 6.0_26-b03 # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops) # Problematic frame: # J org.apache.lucene.analysis.PorterStemFilter.incrementToken()Z # [thread 140034523637504 also had an error] [thread 140034520434432 also had an error] # An error report file with more information is saved as: # [thread 140034520434432 also had an error] # # If you would like to submit a bug report, please visit: # http://java.sun.com/webapps/bugreport/crash.jsp # However, I can start it and run without any problems by removing -XX:+AggressiveOpts (which has to be default setting in upcoming releases Java 6) Do we need to disable -XX:-DoEscapeAnalysis as IBM suggests? http://www-01.ibm.com/support/docview.wss?uid=swg21422605 Thanks, Fuad Efendi http://www.tokenizer.ca -- lucidimagination.com
An idea for an intersection type of filter query
I've been looking at the slow queries our Solr installation is receiving. They are dominated by queries with a simple q parameter (often *:* for all docs) and a VERY complicated fq parameter. The filter query is built by going through a set of rules for the user and putting together each rule's query clause separated by OR -- we can't easily break it into multiple filters. In addition to causing queries themselves to run slowly, this causes large autowarm times for our filterCache -- my filterCache autowarmCount is tiny (4), but it sometimes takes 30 seconds to warm. I've seen a number of requests here for the ability to have multiple fq parameters ORed together. This is probably possible, but in the interests of compatibility between versions, very impractical. What if a new parameter was introduced? It could be named fqi, for filter query intersection. To figure out the final bitset for multiple fq and fqi parameters, it would use this kind of logic: fq AND fq AND fq AND (fqi OR fqi OR fqi) This would let us break our filters into manageable pieces that can efficiently populate the filterCache, and they would autowarm quickly. Is the filter design in Solr separated cleanly enough to make this at all reasonable? I'm not a Java developer, so I'd have a tough time implementing it myself. When I have a free moment I will take a look at the code anyway. I'm trying to teach myself Java. Thanks, Shawn
Re: schema.xml changes, need re-indexing ?
I have not seen this mentioned anywhere, but I found a useful 'trick' to restart solr without having to restart tomcat. All you need to do is 'touch' the solr.xml in the solr.home directory. It can take a few seconds but solr will restart and reload any config. Cheers François On Jul 27, 2011, at 2:56 PM, Alexei Martchenko wrote: I believe you're fine with that. Don't need to reindex all solr database. 2011/7/27 Charles-Andre Martin charles-andre.mar...@sunmedia.ca Hi, We currently have a big index in production. We would like to add 2 non-required fields to our schema.xml : field name=myfield type=boolean indexed=true stored=true required=false/ field name=myotherfield type=string indexed=true stored=true required=false multiValued=true/ I made some tests: - I stopped tomcat - I changed the schema.xml - I started tomcat The data was still there and I was able to add new document with theses 2 fields. So far, it looks I won't need to re-index all my data. Am I right ? Do I need to re-index all my data or in that case I'm fine ? Thank you ! Charles-André Martin -- *Alexei Martchenko* | *CEO* | Superdownloads ale...@superdownloads.com.br | ale...@martchenko.com.br | (11) 5083.1018/5080.3535/5080.3533
Re: An idea for an intersection type of filter query
On 7/27/2011 2:00 PM, Shawn Heisey wrote: I've seen a number of requests here for the ability to have multiple fq parameters ORed together. This is probably possible, but in the interests of compatibility between versions, very impractical. What if a new parameter was introduced? It could be named fqi, for filter query intersection. To figure out the final bitset for multiple fq and fqi parameters, it would use this kind of logic: fq AND fq AND fq AND (fqi OR fqi OR fqi) Thinking about this after I sent it, I realized that I don't mean intersection, that's what filter queries already do. :) I meant union, so fqu would be a better parameter name. Shawn
Re: Solr Performance Tuning: -XX:+AggressiveOpts
Thanks Robert!!! Submitted On 26-JUL-2011 - yesterday. This option was popular in Hbase On 11-07-27 3:58 PM, Robert Muir rcm...@gmail.com wrote: Don't use this option, these optimizations are buggy: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134 On Wed, Jul 27, 2011 at 3:56 PM, Fuad Efendi f...@efendi.ca wrote: Anyone tried this? I can not start Solr-Tomcat with following options on Ubuntu: JAVA_OPTS=$JAVA_OPTS -Xms2048m -Xmx2048m -Xmn256m -XX:MaxPermSize=256m JAVA_OPTS=$JAVA_OPTS -Dsolr.solr.home=/data/solr -Dfile.encoding=UTF8 -Duser.timezone=GMT -Djava.util.logging.config.file=/data/solr/logging.properties -Djava.net.preferIPv4Stack=true JAVA_OPTS=$JAVA_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+AggressiveOpts -XX:NewSize=64m -XX:MaxNewSize=64m -XX:CMSInitiatingOccupancyFraction=77 -XX:+CMSParallelRemarkEnabled JAVA_OPTS=$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/data/solr/solr-gc.log Tomcat log (something about PorterStemFilter; Solr 3.3.0): INFO: Server startup in 2683 ms # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f5c6f36716e, pid=7713, tid=140034519381760 # # JRE version: 6.0_26-b03 # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops) # Problematic frame: # J org.apache.lucene.analysis.PorterStemFilter.incrementToken()Z # [thread 140034523637504 also had an error] [thread 140034520434432 also had an error] # An error report file with more information is saved as: # [thread 140034520434432 also had an error] # # If you would like to submit a bug report, please visit: # http://java.sun.com/webapps/bugreport/crash.jsp # However, I can start it and run without any problems by removing -XX:+AggressiveOpts (which has to be default setting in upcoming releases Java 6) Do we need to disable -XX:-DoEscapeAnalysis as IBM suggests? 
http://www-01.ibm.com/support/docview.wss?uid=swg21422605 Thanks, Fuad Efendi http://www.tokenizer.ca -- lucidimagination.com
RE: Spellcheck compounded words
I could not reproduce the problem, even with the two parameters you show below added to the default handler. I tried using this default handler with different queries containing correct and incorrect terms. I made sure it would sometimes successfully create collations and other times try to create collations but not find any good ones. In all cases everything worked as expected. I also checked the code to see if it could possibly create an infinite loop, whereby the queries that run to check a collation's validity were themselves getting spell corrections back. But this doesn't look like a possibility. If you are able to figure anything more out on this yourself, then please post. If this is a real bug, then we ought to get it fixed. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: O. Klein [mailto:kl...@octoweb.nl] Sent: Wednesday, July 27, 2011 9:15 AM To: solr-user@lucene.apache.org Subject: Re: Spellcheck compounded words All the talk about logging derailed the thread. So can someone test if adding

  <str name="spellcheck.maxCollations">2</str>
  <str name="spellcheck.maxCollationTries">2</str>

to the default requestHandler in solrconfig.xml using collations causes the system to hang? O. Klein wrote: Anyways. I was testing on 3.3 and found that when I added spellcheck.maxCollations=2&spellcheck.maxCollationTries=2 as parameters to the URL there was no problem at all. Adding

  <str name="spellcheck.maxCollations">2</str>
  <str name="spellcheck.maxCollationTries">2</str>

to the default requestHandler in solrconfig.xml caused the request to hang. Can someone verify if this is a bug? -- View this message in context: http://lucene.472066.n3.nabble.com/Spellcheck-compounded-words-tp3192748p3203569.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Performance Tuning: -XX:+AggressiveOpts
On Wed, Jul 27, 2011 at 4:12 PM, Fuad Efendi f...@efendi.ca wrote: Thanks Robert!!! Submitted On 26-JUL-2011 - yesterday. This option was popular in Hbase… Then you should tell them also, not to use it, if they want their loops to work. -- lucidimagination.com
Re: Indexing SharePoint from SolrJ
+1 On 7/27/11, Twomey, David david.two...@novartis.com wrote: Does anyone have examples of indexing SP content using the Google Connectors API and using SolrJ. I know Lucid Imagination has a Sharepoint connector and I have used that successfully. However, I would like to create a thumbnail image of PDF's and PPT docs and add that to my index and I assume I need to use solrJ and some third party libraries to do that. Hence I want to crawl Sharepoint using SolrJ so I can then call third party libraries at index time. Thanks so much David -- Sent from my mobile device -
Re: An idea for an intersection type of filter query
I don't know the answer to feasibility either, but I'll just point out that boolean OR corresponds to set union, not set intersection. So I think you probably mean a 'union' type of filter query; 'intersection' does not seem to describe what you are proposing; ordinary 'fq' values are 'intersected' already to restrict the result set, no? So, anyhow, the basic goal, if I understand it right, is not to provide any additional semantics, but to allow individual clauses in an 'fq' OR to be cached and looked up in the filter cache individually. Perhaps someone (not me) who understands the Solr architecture better might also have another suggestion for how to get to that goal, other than the specific thing you suggested. I do not know, sorry. Hmm, but I start thinking, what about a general purpose mechanism to identify a sub-clause that should be fetched/retrieved from the filter cache. I don't _think_ current nested queries will do that: fq=_query_:foo:bar OR _query_:foo:baz That's legal now (and doesn't accomplish much) -- but what if the individual subquery components could consult the filter cache separately? I don't know if nested query is the right way to do that or not, but I'm thinking of some mechanism where you could arbitrarily identify clauses that should be filter cached independently? Jonathan On 7/27/2011 4:00 PM, Shawn Heisey wrote: I've been looking at the slow queries our Solr installation is receiving. They are dominated by queries with a simple q parameter (often *:* for all docs) and a VERY complicated fq parameter. The filter query is built by going through a set of rules for the user and putting together each rule's query clause separated by OR -- we can't easily break it into multiple filters. In addition to causing queries themselves to run slowly, this causes large autowarm times for our filterCache -- my filterCache autowarmCount is tiny (4), but it sometimes takes 30 seconds to warm. 
I've seen a number of requests here for the ability to have multiple fq parameters ORed together. This is probably possible, but in the interests of compatibility between versions, very impractical. What if a new parameter was introduced? It could be named fqi, for filter query intersection. To figure out the final bitset for multiple fq and fqi parameters, it would use this kind of logic: fq AND fq AND fq AND (fqi OR fqi OR fqi) This would let us break our filters into manageable pieces that can efficiently populate the filterCache, and they would autowarm quickly. Is the filter design in Solr separated cleanly enough to make this at all reasonable? I'm not a Java developer, so I'd have a tough time implementing it myself. When I have a free moment I will take a look at the code anyway. I'm trying to teach myself Java. Thanks, Shawn
Re: Speeding up search by combining common sub-filters
I'm pretty sure Solr/lucene have no such optimization already, but it's not clear to me that it would result in much of a performance benefit, just because of the way lucene works, it's not obvious to me that the second version of your query will be noticeably faster than the first version. Maybe in cases with many many clauses, rather than the few clauses in your example. You'd definitely want to performance test it to verify there are any gains, before embarking on writing the 'optimization' -- you can test it just by sending the different versions of your real world queries to Solr and seeing what the response times are, calculating the hypothetically 'optimized' version yourself by hand if need be, right? On 7/27/2011 5:05 PM, Scott Smith wrote: We have a solr application which ends up creating queries with very complicated filters (literally hundreds and sometimes thousands of terms-typically a large number of terms OR'ed together where each of these terms might have a half a dozen keywords ANDed/ORed together). In looking at the filters, I realized that there are often a lot of common sub-filters. A simple example of what I mean is: (cat AND dog) OR (cat AND horse) This could clearly be simplified by saying: cat AND (dog OR horse) It turns out that finding and combining common sub-filters isn't trivial for our application. So, before I start a project to attempt some kind of optimization, my question is whether it's likely that I will see significant decreases in query times to justify the development effort it takes to optimize the filters. Certainly, if I thought I might get a 20%+ decrease in time, I would say it's probably a good project. If it's just a few percentage points of improvement, then I'm less excited about doing it. Does Solr already go through some kind of optimization which effectively combines common sub-filters and possibly duplicated terms? Does anyone have any thoughts on this subject? Thanks Scott
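For what it's worth, the logical equivalence Scott describes is easy to sanity-check on toy data with sets. A quick Python check (illustrative only; it says nothing about Lucene's actual execution cost, which is the real question):

```python
# Sanity-check that (cat AND dog) OR (cat AND horse) selects exactly the
# same documents as the factored form cat AND (dog OR horse), modeling
# each term's matches as a set of doc ids.
docs = {
    1: {"cat", "dog"},
    2: {"cat", "horse"},
    3: {"cat"},
    4: {"dog", "horse"},
}

def matching(term):
    # All doc ids whose term set contains the given term.
    return {doc_id for doc_id, terms in docs.items() if term in terms}

cat, dog, horse = matching("cat"), matching("dog"), matching("horse")

original = (cat & dog) | (cat & horse)   # (cat AND dog) OR (cat AND horse)
factored = cat & (dog | horse)           # cat AND (dog OR horse)
print(original == factored)  # True: the two forms are set-equivalent
```

So factoring never changes the result set; whether it changes the query *time* is exactly what Jonathan suggests measuring before building the optimizer.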
Re: An idea for an intersection type of filter query
On 7/27/2011 3:49 PM, Jonathan Rochkind wrote: I don't know the answer to feasibilty either, but I'll just point out that boolean OR corresponds to set union, not set intersection. So I think you probably mean a 'union' type of filter query; 'intersection' does not seem to describe what you are describing; ordinary 'fq' values are 'intersected' already to restrict the result set, no? You're right, I noticed that later and corrected myself. Substitute fqu (and try not to pronounce it) for fqi in my previous message. This is the only name suggestion I could come up with on short notice, and it's probably a good idea to change it. So, anyhow, the basic goal, if I understand it right, is not to provide any additional semantics, but to allow individual clauses in an 'fq' OR to be cached and looked up in the filter cache individually. I would like to have both intersection and union at the same time, not be restricted to one or the other, and have it be possible without altering existing functionality. The idea is to just add a new parameter that just changes how the resulting bitset is applied to the query results. The filterCache entry would look the same whether you used fq or fqu. Restating my suggested bitset logic with the changed parameter name: fq AND fq AND fq AND (fqu OR fqu OR fqu) It would be awesome to have a syntax that creates arbitrarily complex and nested AND/OR combinations, but that would be a MAJOR undertaking. The logic I've mentioned above seems to be the most useful you could get with just having the one additional parameter. You can get pure union by just using fqu. The existing model of pure intersection would be maintained when only fq is present. Thanks, Shawn
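The proposed combination rule is easy to state in set terms. A small Python sketch of the bitset logic (these fqu semantics are Shawn's proposal, not existing Solr behavior):

```python
from functools import reduce

# Sketch of the proposed combination rule:
#   fq AND fq AND ... AND (fqu OR fqu OR ...)
# i.e. intersect all fq bitsets, then intersect the result with the
# union of all fqu bitsets (if any fqu parameters were given).
def apply_filters(all_docs, fq_sets, fqu_sets):
    result = set(all_docs)
    for fq in fq_sets:
        result &= fq                       # ordinary filters: ANDed
    if fqu_sets:
        result &= reduce(set.union, fqu_sets)  # proposed filters: ORed
    return result

all_docs = {1, 2, 3, 4, 5}
fq_sets = [{1, 2, 3, 4}]
fqu_sets = [{1, 5}, {3}]
print(sorted(apply_filters(all_docs, fq_sets, fqu_sets)))  # [1, 3]
```

Note each fq and fqu entry would still be cached individually in the filterCache exactly as today; only the final combination step differs.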
Re: schema.xml changes, need re-indexing ?
I always run http://localhost:8983/solr/admin/cores?action=RELOAD&core=corename in the browser when I wanna reload solr and see any changes in config xmls. 2011/7/27 François Schiettecatte fschietteca...@gmail.com I have not seen this mentioned anywhere, but I found a useful 'trick' to restart solr without having to restart tomcat. All you need to do is 'touch' the solr.xml in the solr.home directory. It can take a few seconds but solr will restart and reload any config. Cheers François On Jul 27, 2011, at 2:56 PM, Alexei Martchenko wrote: I believe you're fine with that. Don't need to reindex all solr database. 2011/7/27 Charles-Andre Martin charles-andre.mar...@sunmedia.ca Hi, We currently have a big index in production. We would like to add 2 non-required fields to our schema.xml : <field name="myfield" type="boolean" indexed="true" stored="true" required="false"/> <field name="myotherfield" type="string" indexed="true" stored="true" required="false" multiValued="true"/> I made some tests: - I stopped tomcat - I changed the schema.xml - I started tomcat The data was still there and I was able to add new document with these 2 fields. So far, it looks I won't need to re-index all my data. Am I right ? Do I need to re-index all my data or in that case I'm fine ? Thank you ! Charles-André Martin -- *Alexei Martchenko* | *CEO* | Superdownloads ale...@superdownloads.com.br | ale...@martchenko.com.br | (11) 5083.1018/5080.3535/5080.3533
colocated term stats
Given a query term, is it possible to get the top 10 collocated terms from the index, i.e. return the top 10 terms that appear with this term, ranked by doc count? A plus would be the ability to add constraints on how near the terms are in the docs.
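To make the request concrete: a toy Python sketch of the counting involved (illustrative only, not a Solr feature; a real index would do this over term vectors or facets rather than raw doc scans):

```python
from collections import Counter

# Given a query term, count how many documents each OTHER term
# co-occurs in, and return the top N by that doc count.
def collocated_terms(docs, query_term, top_n=10):
    counts = Counter()
    for terms in docs:
        term_set = set(terms)
        if query_term in term_set:
            # Each co-occurring term counts once per document.
            counts.update(term_set - {query_term})
    return counts.most_common(top_n)

docs = [
    ["solr", "lucene", "search"],
    ["solr", "lucene", "index"],
    ["solr", "search"],
    ["lucene", "index"],
]
print(collocated_terms(docs, "solr", top_n=2))
```

The nearness constraint mentioned at the end would need positional information as well, which this doc-level sketch deliberately ignores.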
Re: Data Import Handler Architecture Diagram
: Maybe I am looking at the wrong version - the diagram (and the screenshot in : the interactive dev mode section) don't show up in the WIKI page. : : http://wiki.apache.org/solr/DataImportHandler#Architecture : : Is this a wrong link? Ugh. a while back the Infra team disabled attachments in the wiki because of spam. Attachments are all still in the system on disk somewhere, but you can't view, edit, or replace them... https://issues.apache.org/jira/browse/INFRA-3634 ...i thought the only major use of attachments on the Solr wiki was the eclipse project zip files (which are now available in dev-tools) but i didn't realize there were any diagrams as well. Nothing anyone except Infra can really do about it. -Hoss
Re: Exact match not the first result returned
: With your solution, RECORD 1 does appear at the top but I think that's just : blind luck more than anything else because RECORD 3 shows as having the same : score. So what more can I do to push RECORD 1 up to the top. Ideally, I'd : like all three records returned with RECORD 1 being the first listing. with omitNorms, RECORD1 and RECORD3 have the same score because only the tf() matters, and both docs contain the term frank exactly twice. the reason RECORD1 isn't scoring higher, even though (as you put it) it matches 'Fred' exactly, is that from a term perspective RECORD1 doesn't actually match myname:Fred exactly, because there are in fact other terms in that field, since it's multivalued. one way to indicate that you *only* want documents where the entire field value matches your input (ie: RECORD1 but no other records) would be to use a StrField instead of a TextField, or an analyzer that doesn't split up tokens (ie: something using KeywordTokenizer). that way a query on myname:Frank would not match a document where you had indexed the value Frank Stalone, but a query for myname:Frank Stalone would. in your case, you don't want *only* the exact field value matches, you want them boosted, so you could do something like copyField myname into myname_str and then do... q=+myname:Frank myname_str:Frank^100 ...in which case a match on myname is required, but a match on myname_str will greatly increase the score. dismax (and edismax) are really designed for situations like this... defType=dismax qf=myname pf=myname_str^100 q=Frank -Hoss
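The effect of the copyField trick can be mimicked outside Solr. A toy Python scoring sketch (illustrative only, nothing like Lucene's actual scoring) showing how an extra score for a whole-field-value match pushes the exact record first:

```python
# Toy illustration of the copyField boost: the tokenized field gives a
# crude tf-style base score, and a whole-string match on the
# un-tokenized copy (the "myname_str" idea) adds a large boost.
def score(field_value, query, boost=100):
    tokens = field_value.lower().split()
    base = tokens.count(query.lower())  # crude "tf" on the tokenized field
    exact = boost if field_value.lower() == query.lower() else 0
    return base + exact

records = ["Frank", "Frank Stalone", "Frank N. Frank"]
ranked = sorted(records, key=lambda r: score(r, "Frank"), reverse=True)
print(ranked[0])  # the exact-value record ranks first
```

Without the boost, "Frank N. Frank" would outrank the exact match here (two token occurrences vs one), which is precisely the original complaint.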
Re: Dealing with keyword stuffing
: Presumably, they are doing this by increasing tf (term frequency), : i.e., by repeating keywords multiple times. If so, you can use a custom : similarity class that caps term frequency, and/or ensures that the scoring : increases less than linearly with tf. In particular, using something like SweetSpotSimilarity tuned to know what values make sense for good content in your domain can be useful, because it can actually penalize documents that are too short/long or have term freqs that are outside of a reasonable expected range. FWIW though: that's really just a generic answer to a generic question. the better you understand your data, the better you can configure solr for it -- and that goes equally for the advice people can give you about how to configure solr. you haven't given any information about the nature of your data: the types of documents, the authoritative source, the fields involved, where/how/when people edit this data, who is keyword spamming, etc.; or how you want to use it: what types of queries you need to support, what your users' objectives are, etc. That makes it impossible for anyone to suggest anything but the most general answer: customize your Similarity. -Hoss
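The tf-capping idea is simple to sketch. An illustrative Python function (my own toy weighting, not the actual SweetSpotSimilarity code) showing a sublinear, capped term-frequency weight:

```python
import math

# Illustrative tf weighting: grow sublinearly (sqrt) with term frequency
# and stop growing entirely past a cap, so a keyword-stuffed document
# gains nothing from repeating a term beyond the cap.
def capped_tf_weight(tf, cap=10):
    return math.sqrt(min(tf, cap))

print(capped_tf_weight(4))    # 2.0
print(capped_tf_weight(100))  # identical to tf=10: stuffing past the cap is wasted
```

A real Similarity would also fold in the length penalties Hoss mentions; the cap value itself is the kind of domain-specific tuning he's describing.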
RE: Problem starting solr on jetty
Thanks for your reply Steve. My environment details:

Java version: 1.6.0_24
System: Microsoft Windows XP Professional Version 2002 Service Pack 3

Interestingly my colleagues who have the same environment are not facing this problem.

Thanks
Regards
Anand Nigam

-----Original Message-----
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: 27 July 2011 20:21
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Hi Anand,

Someone else reported this exact same error with Solr v1.4.0:

http://www.lucidimagination.com/search/document/fd5b83f3595a1c6c/can_t_start_solr_by_java_jar_start_jar

I downloaded the apache-solr-3.3.0.zip, unpacked it, then ran 'java -jar start.jar' from the cmdline. It worked. (Windows 7; Oracle Java 1.6.0_23).

I tried to reproduce the error you're seeing, by making the example\ directory and all its contents read-only (different exception: FileNotFound), and by removing the entire contents of the example\ directory except for start.jar (nothing happens - it just quits without printing anything out).

Can you give more details about your environment?

Steve

-----Original Message-----
From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com]
Sent: Wednesday, July 27, 2011 7:25 AM
To: solr-user@lucene.apache.org
Subject: Problem starting solr on jetty

Hi,

I am new to solr. I have downloaded the solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting the following error on running this command:

C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example>java -jar start.jar
java.lang.NullPointerException
        at java.io.File.<init>(File.java:222)
        at org.mortbay.start.Main.init(Main.java:465)
        at org.mortbay.start.Main.start(Main.java:439)
        at org.mortbay.start.Main.main(Main.java:119)

Could someone help me in resolving this issue.

Thanks
Regards
Anand Nigam

*** The Royal Bank of Scotland plc. Registered in Scotland No 90312.
Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB. Authorised and regulated by the Financial Services Authority. The Royal Bank of Scotland N.V. is authorised and regulated by the De Nederlandsche Bank and has its seat at Amsterdam, the Netherlands, and is registered in the Commercial Register under number 33002587. Registered Office: Gustav Mahlerlaan 350, Amsterdam, The Netherlands. The Royal Bank of Scotland N.V. and The Royal Bank of Scotland plc are authorised to act as agent for each other in certain jurisdictions. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer. Internet e-mails are not necessarily secure. The Royal Bank of Scotland plc and The Royal Bank of Scotland N.V. including its affiliates (RBS group) does not accept responsibility for changes made to this message after it was sent. For the protection of RBS group and its clients and customers, and in compliance with regulatory requirements, the contents of both incoming and outgoing e-mail communications, which could include proprietary information and Non-Public Personal Information, may be read by authorised persons within RBS group other than the intended recipient(s). Whilst all reasonable care has been taken to avoid the transmission of viruses, it is the responsibility of the recipient to ensure that the onward transmission, opening or use of this message and any attachments will not adversely affect its systems or data. No responsibility is accepted by the RBS group in this regard and the recipient should carry out such virus and other checks as it considers appropriate. Visit our website at www.rbs.com ***
RE: Problem starting solr on jetty
Thanks for your reply Steve. My environment details:

Java version: 1.6.0_24
System: Microsoft Windows XP Professional Version 2002 Service Pack 3

Interestingly my colleagues who have the same environment are not facing this problem.

Thanks
Regards
Anand Nigam

-----Original Message-----
From: Nigam, Anand, GBM
Sent: 28 July 2011 08:37
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Thanks for your reply Steve. My environment details: Java version: 1.6.0_24 System: Microsoft Windows XP Professional Version 2002 Service Pack 3 Interestingly my colleagues who have the same environment are not facing this problem. Thanks Regards Anand Nigam

-----Original Message-----
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: 27 July 2011 20:21
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Hi Anand,

Someone else reported this exact same error with Solr v1.4.0:

http://www.lucidimagination.com/search/document/fd5b83f3595a1c6c/can_t_start_solr_by_java_jar_start_jar

I downloaded the apache-solr-3.3.0.zip, unpacked it, then ran 'java -jar start.jar' from the cmdline. It worked. (Windows 7; Oracle Java 1.6.0_23). I tried to reproduce the error you're seeing, by making the example\ directory and all its contents read-only (different exception: FileNotFound), and by removing the entire contents of the example\ directory except for start.jar (nothing happens - it just quits without printing anything out).

Can you give more details about your environment?

Steve

-----Original Message-----
From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com]
Sent: Wednesday, July 27, 2011 7:25 AM
To: solr-user@lucene.apache.org
Subject: Problem starting solr on jetty

Hi,

I am new to solr. I have downloaded the solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here).
But I am getting the following error on running this command:

C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example>java -jar start.jar
java.lang.NullPointerException
        at java.io.File.<init>(File.java:222)
        at org.mortbay.start.Main.init(Main.java:465)
        at org.mortbay.start.Main.start(Main.java:439)
        at org.mortbay.start.Main.main(Main.java:119)

Could someone help me in resolving this issue.

Thanks
Regards
Anand Nigam
Store complete XML record (DIH XPathEntityProcessor)
I am trying to use DIH to import an XML based file with multiple XML records in it. Each record corresponds to one document in Lucene. I am using the DIH FileListEntityProcessor (to get the file list) followed by the XPathEntityProcessor to create the entities. It works perfectly and I am able to map XML elements to fields. However, I also need to store the entire XML record as a separate 'full text' field. Is there any way the XPathEntityProcessor provides a variable like 'rawLine' or 'plainText' that I can map to a field? I tried to use the plain text processor after this - but that does not recognize the XML boundaries and just gives the whole XML file.

<entity name="x" rootEntity="true" dataSource="logfilereader"
        processor="XPathEntityProcessor" url="${logfile.fileAbsolutePath}"
        stream="false" forEach="/xml/myrecord" transformer="">
  <field column="mycol1" xpath="/xml/myrecord/@something" />
  ... and so on ...
</entity>

This works perfectly. However I also need something like...

  <field column="fullxmlrecord" name="plainText" />

Any help is much appreciated. I am a newbie and may be missing something obvious here.

-g
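One possible workaround, if DIH turns out not to expose a raw-record variable: pre-extract each record's XML text in a separate pass and feed it in as its own field. The sketch below is an assumption, not a DIH feature -- it uses the standard Java StAX API to copy each /xml/myrecord subtree verbatim into a string (the element name "myrecord" matches the config above; the class and method names are illustrative):

```java
import java.io.Reader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;

public class RawRecordExtractor {

    /** Returns the raw XML text of every myrecord element in the input. */
    public static List<String> extractRecords(Reader in) {
        List<String> records = new ArrayList<String>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
            XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
            while (reader.hasNext()) {
                XMLEvent e = reader.nextEvent();
                if (e.isStartElement()
                        && "myrecord".equals(e.asStartElement().getName().getLocalPart())) {
                    StringWriter sw = new StringWriter();
                    XMLEventWriter writer = outFactory.createXMLEventWriter(sw);
                    writer.add(e);
                    int depth = 1;
                    // copy events until the matching end tag closes this record
                    while (depth > 0) {
                        XMLEvent ev = reader.nextEvent();
                        if (ev.isStartElement()) depth++;
                        else if (ev.isEndElement()) depth--;
                        writer.add(ev);
                    }
                    writer.close();
                    records.add(sw.toString());
                }
            }
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<xml><myrecord id=\"1\"><v>a</v></myrecord></xml>";
        System.out.println(extractRecords(new java.io.StringReader(xml)).get(0));
    }
}
```

The extracted strings could then be indexed alongside the XPath-mapped fields, e.g. from a custom DIH transformer or a separate indexing step.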
RE: Problem starting solr on jetty
Hi All,

I tried to debug the issue by running start.jar in the Eclipse debugger and found that the root of the issue was that the jetty.home system property was not set. If I set the jetty.home property then the server starts properly.

Thanks,
Anand

-----Original Message-----
From: Nigam, Anand, GBM
Sent: 28 July 2011 08:39
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Thanks for your reply Steve. My environment details: Java version: 1.6.0_24 System: Microsoft Windows XP Professional Version 2002 Service Pack 3 Interestingly my colleagues who have the same environment are not facing this problem. Thanks Regards Anand Nigam

-----Original Message-----
From: Nigam, Anand, GBM
Sent: 28 July 2011 08:37
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Thanks for your reply Steve. My environment details: Java version: 1.6.0_24 System: Microsoft Windows XP Professional Version 2002 Service Pack 3 Interestingly my colleagues who have the same environment are not facing this problem. Thanks Regards Anand Nigam

-----Original Message-----
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: 27 July 2011 20:21
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Hi Anand,

Someone else reported this exact same error with Solr v1.4.0:

http://www.lucidimagination.com/search/document/fd5b83f3595a1c6c/can_t_start_solr_by_java_jar_start_jar

I downloaded the apache-solr-3.3.0.zip, unpacked it, then ran 'java -jar start.jar' from the cmdline. It worked. (Windows 7; Oracle Java 1.6.0_23). I tried to reproduce the error you're seeing, by making the example\ directory and all its contents read-only (different exception: FileNotFound), and by removing the entire contents of the example\ directory except for start.jar (nothing happens - it just quits without printing anything out).

Can you give more details about your environment?
Steve

-----Original Message-----
From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com]
Sent: Wednesday, July 27, 2011 7:25 AM
To: solr-user@lucene.apache.org
Subject: Problem starting solr on jetty

Hi,

I am new to solr. I have downloaded the solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting the following error on running this command:

C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example>java -jar start.jar
java.lang.NullPointerException
        at java.io.File.<init>(File.java:222)
        at org.mortbay.start.Main.init(Main.java:465)
        at org.mortbay.start.Main.start(Main.java:439)
        at org.mortbay.start.Main.main(Main.java:119)

Could someone help me in resolving this issue.

Thanks
Regards
Anand Nigam
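If setting the property in code is inconvenient, it can presumably also be passed on the command line at launch time -- a sketch only, using the path from this thread ("." assumes the current directory is the one containing start.jar):

```
cd C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example
java -Djetty.home=. -jar start.jar
```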
RE: Problem starting solr on jetty
Hi Anand,

Congrats! And thanks for letting us know.

Steve

-----Original Message-----
From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com]
Sent: Thursday, July 28, 2011 12:00 AM
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Hi All,

I tried to debug the issue by running start.jar in the Eclipse debugger and found that the root of the issue was that the jetty.home system property was not set. If I set the jetty.home property then the server starts properly.

Thanks,
Anand

-----Original Message-----
From: Nigam, Anand, GBM
Sent: 28 July 2011 08:39
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Thanks for your reply Steve. My environment details: Java version: 1.6.0_24 System: Microsoft Windows XP Professional Version 2002 Service Pack 3 Interestingly my colleagues who have the same environment are not facing this problem. Thanks Regards Anand Nigam

-----Original Message-----
From: Nigam, Anand, GBM
Sent: 28 July 2011 08:37
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Thanks for your reply Steve. My environment details: Java version: 1.6.0_24 System: Microsoft Windows XP Professional Version 2002 Service Pack 3 Interestingly my colleagues who have the same environment are not facing this problem. Thanks Regards Anand Nigam

-----Original Message-----
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: 27 July 2011 20:21
To: solr-user@lucene.apache.org
Subject: RE: Problem starting solr on jetty

Hi Anand,

Someone else reported this exact same error with Solr v1.4.0:

http://www.lucidimagination.com/search/document/fd5b83f3595a1c6c/can_t_start_solr_by_java_jar_start_jar

I downloaded the apache-solr-3.3.0.zip, unpacked it, then ran 'java -jar start.jar' from the cmdline. It worked. (Windows 7; Oracle Java 1.6.0_23).
I tried to reproduce the error you're seeing, by making the example\ directory and all its contents read-only (different exception: FileNotFound), and by removing the entire contents of the example\ directory except for start.jar (nothing happens - it just quits without printing anything out).

Can you give more details about your environment?

Steve

-----Original Message-----
From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com]
Sent: Wednesday, July 27, 2011 7:25 AM
To: solr-user@lucene.apache.org
Subject: Problem starting solr on jetty

Hi,

I am new to solr. I have downloaded the solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting the following error on running this command:

C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example>java -jar start.jar
java.lang.NullPointerException
        at java.io.File.<init>(File.java:222)
        at org.mortbay.start.Main.init(Main.java:465)
        at org.mortbay.start.Main.start(Main.java:439)
        at org.mortbay.start.Main.main(Main.java:119)

Could someone help me in resolving this issue.

Thanks
Regards
Anand Nigam
RE: Problem starting solr on jetty
: I tried to debug the issue by runing start.jar in eclipse debuger and
: found that the root of the issue was that the jetty.home system property
: was not set. If I set the jetty.home property then the server starts
: properly.

Hmmm, weird ... that still doesn't really make much sense. The jetty.home property isn't required by Jetty (or solr). if it's unset, it defaults to the current working directory.

: I am new to solr. I have downloaded the solr 3.3.0 distribution and tryign to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting following error on running this command:
:
: C:\downloads\apache-solr-3.3.0\apache-solr-3.3.0\example>java -jar start.jar
: java.lang.NullPointerException
:         at java.io.File.<init>(File.java:222)
:         at org.mortbay.start.Main.init(Main.java:465)
:         at org.mortbay.start.Main.start(Main.java:439)
:         at org.mortbay.start.Main.main(Main.java:119)
:
: Could someone help me in resolving this issue.

-Hoss