Solr Sorting is not working properly on long Fields
Hi, I am having a column named 'Kilometers' and when I try to sort with it, the sort is not working properly. The values in the 'Kilometers' column are: 17, 111, 97, 923, 65, 611. The values in 'Kilometers' after sorting are: 97, 923, 65, 611, 17, 111. The problem here is that when 97 is compared with 923, 97 is taken as the bigger number, because the values are compared as strings and "97" sorts after "923" lexicographically. Initially the Kilometers column had string as its datatype and I thought the problem could be because of that, so I changed the datatype of that column to 'long'. Even then I couldn't see any change in the results. But when I insert values which have the same number of digits, say 1, 2, 3, 4, 5 (Kilometers: 2, 1, 4, 5, 2), the sort now works perfectly (Kilometers: 1, 2, 3, 4, 5). Datatypes that I have tried are:
<field name="adi_f10001" type="wc_keywordText" indexed="true" stored="true" multiValued="false"/>
<field name="adi_f10001" type="long" indexed="true" stored="true" multiValued="false"/>
<field name="adi_f10001" type="double" indexed="true" stored="true" multiValued="false"/>
Can anyone help me get rid of this problem? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Sorting-is-not-working-properly-on-long-Fields-tp4050833.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr sorting is not working properly on long Fields
Hi, I am having a column named 'Kilometers' and when I try to sort with it, the sort is not working properly. The values in the 'Kilometers' column are: 17, 111, 97, 923, 65, 611. The values in 'Kilometers' after sorting are: 97, 923, 65, 611, 17, 111. The problem here is that when 97 is compared with 923, 97 is taken as the bigger number, because the values are compared as strings and "97" sorts after "923" lexicographically. Initially the Kilometers column had string as its datatype and I thought the problem could be because of that, so I changed the datatype of that column to 'long'. Even then I couldn't see any change in the results. But when I insert values which have the same number of digits, say 1, 2, 3, 4, 5 (Kilometers: 2, 1, 4, 5, 2), the sort now works perfectly (Kilometers: 1, 2, 3, 4, 5). Datatypes that I have tried are:
<field name="adi_f10001" type="wc_keywordText" indexed="true" stored="true" multiValued="false"/>
<field name="adi_f10001" type="long" indexed="true" stored="true" multiValued="false"/>
<field name="adi_f10001" type="double" indexed="true" stored="true" multiValued="false"/>
Can anyone help me get rid of this problem? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-sorting-is-not-working-properly-on-long-Fields-tp4050834.html Sent from the Solr - User mailing list archive at Nabble.com.
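For context, in a stock Solr 4.x example schema a numerically sortable "long" field is backed by solr.TrieLongField. A sketch of the usual declarations (the field name is taken from the message above; and, as the replies below note, existing documents must be reindexed after such a type change):

    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
    <field name="adi_f10001" type="long" indexed="true" stored="true" multiValued="false"/>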
Re: Solr sorting is not working properly on long Fields
On 24 March 2013 11:56, ballusethuraman ballusethura...@gmail.com wrote: Hi, I am having a column named 'Kilometers' and when I try to sort with it, the sort is not working properly. [...] Initially the Kilometers column had string as its datatype and I thought the problem could be because of that, so I changed the datatype of that column to 'long'. Even then I couldn't see any change in the results. [...] Did you reindex after changing the data type of the column to long? Regards, Gora
Solr using a ridiculous amount of memory
Hello all, We are running a Solr cluster which is now running solr-4.2. The index is about 35GB on disk with each register between 15k and 30k. (This is simply the size of a full XML reply of one register. I'm not sure how to measure it otherwise.) Our memory requirements are running amok. We have less than a quarter of our customers running now, and even though we have allocated 25GB to the JVM already, we are still seeing daily OOM crashes. We used to just allocate more memory to the JVM, but with the way Solr is scaling, we would need well over 100GB of memory on each node to finish the project, and that's just not going to happen. I need to lower the memory requirements somehow. I can see from the memory dumps we've done that the field cache is by far the biggest offender. Of special interest to me is the recent introduction of DocValues, which supposedly mitigates this issue by using memory outside the JVM. I just can't, for lack of documentation, seem to make it work. We do a lot of faceting. One client facets on about 50,000 docs of approx 30k each on 5 fields. I understand that this is VERY memory intensive. Schema with DocValues attempt at solving the problem: http://pastebin.com/Ne23NnW4 Config: http://pastebin.com/x1qykyXW The cache is pretty well tuned. Any lower and I get evictions. Come hell or high water, my JVM memory requirements must come down. Simply moving some memory load outside of the JVM would be awesome! Making it not use the field cache for anything would also (probably) work for me. I thought about killing off my other caches, but from the dumps they just don't seem to use that much memory. I am at my wits' end. Any help would be sorely appreciated. -- Med venlig hilsen / Best regards *John Nielsen* Programmer *MCB A/S* Enghaven 15 DK-7500 Holstebro Kundeservice: +45 9610 2824 p...@mcb.dk www.mcb.dk
Re: Solr sorting is not working properly on long Fields
Yes, I did, but there is no change in the result. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-sorting-is-not-working-properly-on-long-Fields-tp4050834p4050844.html Sent from the Solr - User mailing list archive at Nabble.com.
SOLR 4.2 SolrQuery exception
I am using the below code and getting the exception while using SolrQuery:

Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@795e0c2b main{StandardDirectoryReader(segments_49:524 _4v(4.2):C299313 _4x(4.2):C2953/1396 _4y(4.2):C2866/1470 _4z(4.2):C4263/2793 _50(4.2):C3554/761 _51(4.2):C1126/365 _52(4.2):C650/285 _53(4.2):C500/215 _54(4.2):C1808/1593 _55(4.2):C1593)}
Mar 24, 2013 3:08:07 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
    at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1586)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=null path=null params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false} status=500 QTime=4
Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Mar 24, 2013 3:08:07 PM org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener newSearcher
INFO: Loading spell index for spellchecker: default
Mar 24, 2013 3:08:07 PM org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener newSearcher
INFO: Loading spell index for spellchecker: wordbreak
Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [collection1] Registered new searcher Searcher@795e0c2b main{StandardDirectoryReader(segments_49:524 _4v(4.2):C299313 _4x(4.2):C2953/1396 _4y(4.2):C2866/1470 _4z(4.2):C4263/2793 _50(4.2):C3554/761 _51(4.2):C1126/365 _52(4.2):C650/285 _53(4.2):C500/215 _54(4.2):C1808/1593 _55(4.2):C1593)}
Mar 24, 2013 3:08:07 PM org.apache.solr.core.CoreContainer registerCore
INFO: registering core: collection1
server value - org.apache.solr.client.solrj.embedded.EmbeddedSolrServer@3a32ea4
query value - q=smstext%3AEMIRATES&rows=50
Mar 24, 2013 3:08:07 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
    at SolrQueryResult.solrQuery(SolrQueryResult.java:31)
    at SolrQueryResult.main(SolrQueryResult.java:65)
Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=null path=/select params={q=smstext%3AEMIRATES&rows=50} status=500 QTime=0
org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:223)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
    at SolrQueryResult.solrQuery(SolrQueryResult.java:31)
    at SolrQueryResult.main(SolrQueryResult.java:65)
Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:155)
    ... 4 more
Caused by: java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
    ... 4 more

try {
    String SOLR_HOME = "/data/solr1/example/solr/";
    CoreContainer coreContainer = new CoreContainer(SOLR_HOME);
    CoreDescriptor discriptor = new CoreDescriptor(coreContainer,
Tlog File not removed after hard commit
Hi all, We import about 1.5 million documents on a nightly basis using DIH. During this time, we need to ensure that all documents make it into the index, otherwise roll back on any errors, which DIH takes care of for us. We also disable autoCommit in DIH but instruct it to commit at the very end of the import. This is all done through configuration of the DIH config XML file and the command issued to the request handler. We have noticed that the tlog file appears to linger around even after DIH has issued the hard commit. My expectation would be that after the hard commit has occurred, the tlog file will be removed. I'm obviously misunderstanding how this all works. Can someone please help me understand how this is meant to function? Thanks! -Niran
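For reference, the kind of DIH invocation described above (no autocommit during the run, a single commit at the end) is typically issued against the request handler like this; the host, port and core name here are placeholders:

    http://localhost:8983/solr/collection1/dataimport?command=full-import&clean=true&commit=true&optimize=false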
Re: Solr Sorting is not working properly on long Fields
Hi ballusethuraman, I am sure you have done this already, but just to be sure, did you reindex your existing kilometer data after you changed the data type from string to long? If not, then you should. -sujit On Mar 23, 2013, at 11:21 PM, ballusethuraman wrote: Hi, I am having a column named 'Kilometers' and when I try to sort with it, the sort is not working properly. [...]
Re: Practicality of enormous fields
Yeah, it is kind of weird, but certainly do-able. But the big gotcha is if you want to _retrieve_ that field; that could take some time. If you just want to search it, no problems that I know of. If you do want to retrieve it, make sure lazy field loading is enabled and that you do NOT ask for this field in results except when you really need it... Best Erick On Tue, Mar 19, 2013 at 6:33 PM, jimtronic jimtro...@gmail.com wrote: What are the likely ramifications of having a stored field with millions of words? For example, if I had an article and wanted to store the user id of every user who has read it, stuck into a simple whitespace-delimited field. What would go wrong, and when? My tests lead me to believe this is not a problem, but it feels weird. Jim -- View this message in context: http://lucene.472066.n3.nabble.com/Practicality-of-enormous-fields-tp4049131.html Sent from the Solr - User mailing list archive at Nabble.com.
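A minimal illustration of the two safeguards Erick mentions: enableLazyFieldLoading is a real solrconfig.xml switch (in the <query> section), while the field name "readers" and the query are made up for the example:

    <enableLazyFieldLoading>true</enableLazyFieldLoading>

    q=readers:user123&fl=id,title    (the huge "readers" field is deliberately left out of fl)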
Re: Too many fields to Sort in Solr
Seems like a reasonable thing to do. Examine the debug output to ensure that there's no short-circuiting being done as far as ConstantScoreQuery... Best Erick On Tue, Mar 19, 2013 at 7:05 PM, adityab aditya_ba...@yahoo.com wrote: Hi All, I want to validate my approach with the experts, just to make sure I am not doing anything wrong. #Docs in Solr: 25M Solr Version: 4.2 Our requirement is to list top downloaded documents based on user country. So we have a dynamic field numdownloads.* which is evaluated as numdownloads.countryId. Now, as sorting is expensive and also uses a large amount of Java heap, I planned to use this field for boosting results instead. Old query: q=*:*&fq=countryId:1&sort=numdownloads.1 desc which I changed to: q={!boost b=numdownloads.1}*:*&fq=countryId:1 Is my approach correct? Any better alternative? thanks Aditya -- View this message in context: http://lucene.472066.n3.nabble.com/Too-many-fields-to-Sort-in-Solr-tp4049139.html Sent from the Solr - User mailing list archive at Nabble.com.
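Following Erick's suggestion, the debug output can be inspected by appending the standard debugQuery parameter to the boosted request, e.g. (same query as above):

    q={!boost b=numdownloads.1}*:*&fq=countryId:1&debugQuery=true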
Re: Solr using a ridiculous amount of memory
Just to get started, do you hit OOM quickly with a few expensive queries, or is it after a number of hours and lots of queries? Does Java heap usage seem to be growing linearly as queries come in, or are there big spikes? How complex/rich are your queries (e.g., how many terms, wildcards, faceted fields, sorting, etc.)? As a baseline experiment, start a Solr server and see how much Java heap is used/available. Then do a couple of typical queries, and check the heap size again. Then do a couple more, similar but different (to avoid query cache matches), and check the heap again. Maybe do that a few times to get a handle on the baseline memory required and whether there might be a leak of some sort. Do enough queries to hit all of the fields, facets, sorting, etc. that are likely to be encountered in one of your typical days that hits OOM - just not the volume of queries. The goal is to determine if there is something inherently memory intensive in your index/queries, or something relating to a leak based on total query volume. -- Jack Krupansky -----Original Message----- From: John Nielsen Sent: Sunday, March 24, 2013 4:19 AM To: solr-user@lucene.apache.org Subject: Solr using a ridiculous amount of memory Hello all, We are running a Solr cluster which is now running solr-4.2. [...]
Re: Solr using a ridiculous amount of memory
On Sun, Mar 24, 2013 at 4:19 AM, John Nielsen j...@mcb.dk wrote: Schema with DocValues attempt at solving the problem: http://pastebin.com/Ne23NnW4 Config: http://pastebin.com/x1qykyXW This schema isn't using docvalues, due to a typo in your config: it should not be DocValues=true but docValues=true. Are you not getting an error? Solr needs to throw an exception if you provide invalid attributes to a field. Nothing is more frustrating than having a typo or something in your configuration and Solr just ignores it, reports no error, and doesn't work the way you want. I'll look into this (I already intend to add these checks to analysis factories for the same reason). Separately, if you really want the terms data and so on to remain on disk, it is not enough to just enable docvalues for the field. The default implementation uses the heap. So if you want that, you need to set docValuesFormat=Disk on the fieldtype. This will keep the majority of the data on disk, with only some key data structures in heap memory. This might have a significant performance impact depending upon what you are doing, so you need to test it.
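A sketch of what the corrected schema entries might look like; the type name string_dv_disk is made up for the example, and author_s is one of the facet fields named later in the thread:

    <fieldType name="string_dv_disk" class="solr.StrField" docValuesFormat="Disk"/>
    <field name="author_s" type="string_dv_disk" indexed="true" stored="true" docValues="true"/>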
SOLR4/lucene and JVM memory management
Hi, Does anyone know how solr4/lucene and the JVM manage memory? We have the following case. We have a 15GB server running only SOLR4/Lucene and the JVM (no custom code). We had allocated 2GB of memory and the JVM was using 1.9GB. At some point something happened and we ran out of memory. Then we increased the JVM memory to 4GB and we see that, gradually, the JVM starts to use as much as it can. It is now using 3GB out of the 4GB allocated. Is that normal JVM memory usage? I.e., does the JVM always use as much as it can from the allocated space? Thanks for your help -- Spyros Lambrinidis Head of Engineering Commando of PeoplePerHour.com http://www.peopleperhour.com Evmolpidon 23 118 54, Gkazi Athens, Greece Tel: +30 210 3455480 Follow us on Facebook http://www.facebook.com/peopleperhour Follow us on Twitter http://twitter.com/#%21/peopleperhour
RE: SOLR4/lucene and JVM memory management
Spyros Lambrinidis [spy...@peopleperhour.com]: Then we increased the JVM memory to 4GB and we see that gradually, JVM starts to use as much as it can. It is now using 3GB out of the 4GB allocated. That is to be expected. When the number of garbage collections increases, the JVM might decide that it would be better overall to increase the size of the heap. Whether it will allocate up to your 4GB limit depends on how active it is. If you stress it, it will probably take the last GB. i.e. Does the JVM always use as much as it can from the allocated space? No, but the Oracle JVM does tend to be somewhat greedy (very subjective, I know). Since larger heaps mean longer (though hopefully infrequent) pauses for full garbage collection with a standard setup, the consensus seems to be that it is best to allocate conservatively and thereby avoid over-allocation. If 2GB worked well for you until you hit OOM, changing to 3GB seems like a better choice than 4GB to me. Especially since you describe the allocation up to 3GB as gradual, which tells me that your installation is not starved with 3GB. - Toke Eskildsen
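In practice, Toke's advice amounts to pinning the heap at the conservative size with the standard JVM sizing flags when launching Solr. A sketch using the Solr 4.x example launcher (the 3GB figure is the one discussed above, to be tuned per installation):

    java -Xms3g -Xmx3g -jar start.jar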
RE: Solr using a ridiculous amount of memory
From: John Nielsen [j...@mcb.dk]: The index is about 35GB on disk with each register between 15k and 30k. (This is simply the size of a full XML reply of one register. I'm not sure how to measure it otherwise.) Our memory requirements are running amok. We have less than a quarter of our customers running now, and even though we have allocated 25GB to the JVM already, we are still seeing daily OOM crashes. That does sound a bit peculiar. I do not understand what you mean by register, though. How many documents does your index hold? I can see from the memory dumps we've done that the field cache is by far the biggest offender. Do you sort on a lot of different fields? We do a lot of faceting. One client facets on about 50,000 docs of approx 30k each on 5 fields. To get a rough approximation of memory usage, we need the total number of documents, the average number of values for each of the 5 fields per document, and the number of unique values in each of the 5 fields. The rule of thumb I use for the lower ceiling is

    #documents*log2(#references) + #references*log2(#unique_values) bits

If your whole index has 10M documents, each of which has 100 values for each field, with each field having 50M unique values, then the memory requirement would be more than

    10M*log2(100*10M) + 100*10M*log2(50M) bits ~= 340MB/field ~= 1.6GB

for faceting on all fields. Even when we multiply that by 4 to get a more real-world memory requirement, it is far from the 25GB that you are allocating. Either you have an interestingly high number somewhere in the equation or something's off. Regards, Toke Eskildsen
Re: Recommendation for integration test framework
Unrelated to your question: you said that "We are utilizing Apache Maven as build management tool". I think currently Ant + Ivy are the build and dependency management tools, and the Maven POM is generated via a plugin (if I am wrong, please correct me). Is there any plan to move the project to being Maven-based? 2013/3/25 Jan Morlock jan.morl...@googlemail.com Hi, our Solr implementation consists of several cores, sometimes interacting with each other. Using SolrTestCaseJ4 didn't work out for us. Instead we would like to test the resulting war from the outside using integration tests. We are utilizing Apache Maven as our build management tool. Therefore we are currently thinking about using the Maven failsafe plugin. Does anybody have experience with using it in combination with Solr? Or does somebody have a better recommendation for us? Thank you very much in advance Jan -- View this message in context: http://lucene.472066.n3.nabble.com/Recommendation-for-integration-test-framework-tp4050936.html Sent from the Solr - User mailing list archive at Nabble.com.
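For what it's worth, the standard way to wire the failsafe plugin into a POM is the pattern below; the version shown is simply one contemporary with this thread, and by default the plugin picks up *IT.java test classes during the integration-test phase:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <version>2.14</version>
      <executions>
        <execution>
          <goals>
            <goal>integration-test</goal>
            <goal>verify</goal>
          </goals>
        </execution>
      </executions>
    </plugin>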
RE: Solr using a ridiculous amount of memory
Toke Eskildsen [t...@statsbiblioteket.dk]: If your whole index has 10M documents, which each has 100 values for each field, with each field having 50M unique values, then the memory requirement would be more than 10M*log2(100*10M) + 100*10M*log2(50M) bit ~= 340MB/field ~= 1.6GB for faceting on all fields. Whoops. Missed a 0 when calculating. The case above would actually take more than 15GB, probably also more than the 25GB you have allocated. Anyway, I see now in your solrconfig that your main facet fields are cat, manu_exact, content_type and author_s, with the 5th being maybe price, popularity or manufacturedate_dt? cat seems like category (relatively few references, few uniques), content_type probably has a single value/item and again few uniques. No memory problem there, unless you have a lot of documents (100M-range). That leaves manu_exact and author_s. If those are freetext fields with item descriptions or similar, that might explain the OOM. Could you describe the facet fields in more detail and provide us with the total document count? Quick sanity check: If you are using a Linux server, could you please verify that your virtual memory is set to unlimited with 'ulimit -v'? Regards, Toke Eskildsen
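To make the corrected arithmetic concrete, here is a quick back-of-the-envelope sketch (plain Java, nothing Solr-specific) that plugs the example numbers from the earlier mail into the rule of thumb; it lands at roughly 3.2GB per field, i.e. over 15GB for five such fields, matching the correction above:

    public class FacetMemEstimate {
        public static void main(String[] args) {
            // Rule of thumb: #documents*log2(#references) + #references*log2(#unique_values) bits
            double docs = 10_000_000;        // 10M documents (example from the thread)
            double valsPerDoc = 100;         // 100 values per field per document
            double uniques = 50_000_000;     // 50M unique values per field
            double refs = docs * valsPerDoc; // 1e9 references per field

            double bits = docs * log2(refs) + refs * log2(uniques);
            System.out.printf("~%.1f GB per field, ~%.1f GB for 5 fields%n",
                    bits / 8 / 1e9, 5 * bits / 8 / 1e9); // ~3.2 and ~16.2
        }

        private static double log2(double x) {
            return Math.log(x) / Math.log(2);
        }
    }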
Re: Solr using a ridiculous amount of memory
A step I meant to include: after you warm Solr with a representative collection of queries that references all of the fields, facets, sorting, etc. that your daily load will reference, check the Java heap size at that point, and then set your Java heap limit to a moderate level higher, like 256M more, restart, and then see what happens. The theory is that if you have too much available heap, Java will gradually fill it all with garbage (no leaks implied, but maybe some leaks as well), and then a Java GC will be an expensive hit, and sometimes a rapid flow of incoming requests at that point can cause Java to freak out and even hit OOM, even though a more graceful garbage collection would eventually free up tons of garbage. So, by only allowing for a moderate amount of garbage, more frequent GCs will be less intensive and less likely to cause weird situations. The other part of the theory is that it is usually better to leave tons of memory to the OS for efficiently caching files, rather than force Java to manage large amounts of memory, which it typically does not do so well. -- Jack Krupansky -----Original Message----- From: Jack Krupansky Sent: Sunday, March 24, 2013 2:00 PM To: solr-user@lucene.apache.org Subject: Re: Solr using a ridiculous amount of memory Just to get started, do you hit OOM quickly with a few expensive queries, or is it after a number of hours and lots of queries? [...] -----Original Message----- From: John Nielsen Sent: Sunday, March 24, 2013 4:19 AM To: solr-user@lucene.apache.org Subject: Solr using a ridiculous amount of memory Hello all, We are running a Solr cluster which is now running solr-4.2. [...]
Re: Too many fields to Sort in Solr
Thanks Erick. In this query (q=*:*) the Lucene score is always 1. -- View this message in context: http://lucene.472066.n3.nabble.com/Too-many-fields-to-Sort-in-Solr-tp4049139p4050944.html Sent from the Solr - User mailing list archive at Nabble.com.
[ANNOUNCE] Solr wiki editing change
The wiki at http://wiki.apache.org/solr/ has come under attack by spammers more frequently of late, so the PMC has decided to lock it down in an attempt to reduce the work involved in tracking and removing spam. From now on, only people who appear on http://wiki.apache.org/solr/ContributorsGroup will be able to create/modify/delete wiki pages. Please request either on the solr-user@lucene.apache.org or on d...@lucene.apache.org to have your wiki username added to the ContributorsGroup page - this is a one-time step. Steve
RE: SOLR 4.2 SolrQuery exception
Hi, I managed to resolve this issue and I am getting the results also. But this time I am getting a different exception while loading the Solr container. Here is the code:

String SOLR_HOME = "/data/solr1/example/solr/collection1";
CoreContainer coreContainer = new CoreContainer(SOLR_HOME);
CoreDescriptor discriptor = new CoreDescriptor(coreContainer, "collection1", new File(SOLR_HOME).getAbsolutePath());
SolrCore solrCore = coreContainer.create(discriptor);
coreContainer.register(solrCore, false);
File home = new File(SOLR_HOME);
File f = new File(home, "solr.xml");
coreContainer.load(SOLR_HOME, f);
server = new EmbeddedSolrServer(coreContainer, "collection1");
SolrQuery q = new SolrQuery();

Parameters inside solrconfig.xml:

<!-- <writeLockTimeout>1000</writeLockTimeout> -->
<lockType>simple</lockType>
<unlockOnStartup>true</unlockOnStartup>

WARNING: Unable to get IndexCommit on startup
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/data/solr1/example/solr/collection1/./data/index/write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:84)
    at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:636)
    at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:77)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:192)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:106)
    at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:904)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:592)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:801)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:619)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1021)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
    at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:634)
    at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)

From: Sandeep Kumar Anumalla Sent: 24 March, 2013 03:44 PM To: solr-user@lucene.apache.org Subject: SOLR 4.2 SolrQuery exception I am using the below code and getting the exception while using SolrQuery [...]
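For what it's worth, the lock conflict above is consistent with the same index being opened twice: once via create/register and once more via load. A minimal sketch that initializes the container only once, written against the Solr 4.2 API (the paths and core name are the poster's):

    import java.io.File;
    import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
    import org.apache.solr.core.CoreContainer;

    // Load solr.xml once and let the container create and register the cores.
    // Creating the core by hand (create/register) AND calling load() opens the
    // same index directory twice, which is what trips SimpleFSLock on write.lock.
    String solrHome = "/data/solr1/example/solr/";
    CoreContainer coreContainer = new CoreContainer(solrHome);
    coreContainer.load(solrHome, new File(solrHome, "solr.xml"));
    EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "collection1");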
Using Solrj to Get termVectors
Hi all, I've enabled the term vector component, and the term vectors are stored; the results show up fine via an HTTP request in the browser. Since I'm planning to build a web service using Java, I need to get those values using SolrJ. I've been googling and found this solution (http://stackoverflow.com/questions/8977852/how-to-parse-the-termvectorcomponent-response-to-which-java-object) but it seems like some of the functions have been deprecated. Does anybody know how to get termVectors using SolrJ? -- Regards, Rendy Bambang Junior Informatics Engineering '09 Bandung Institute of Technology
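A sketch of one way to do this with SolrJ 4.x, reading the raw NamedList response from a term vector request handler. The handler path /tvrh, the document id, and the already-initialized SolrServer are all assumptions for the example:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.util.NamedList;

    public class TermVectorFetch {
        // 'server' is assumed to be an already-initialized SolrServer
        // (e.g. HttpSolrServer or EmbeddedSolrServer).
        @SuppressWarnings("unchecked")
        public static NamedList<Object> fetchTermVectors(SolrServer server) throws SolrServerException {
            SolrQuery query = new SolrQuery("id:123");  // hypothetical document id
            query.setRequestHandler("/tvrh");           // handler with TermVectorComponent in its components list
            query.set("tv.all", "true");                // ask for all term vector information
            QueryResponse response = server.query(query);
            // Term vectors come back as a generic NamedList keyed "termVectors";
            // iterate it per document id, then per field, then per term.
            return (NamedList<Object>) response.getResponse().get("termVectors");
        }
    }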
Re: how to get term vector information of sepcific word/position in field
Thanks Chris. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-get-term-vector-information-of-sepcific-word-position-in-field-tp4047637p4050997.html Sent from the Solr - User mailing list archive at Nabble.com.
Multi-core and replicated Solr cloud testing. Data-directory mis-configures
I have three indexes which I have set up as three separate cores, using this solr.xml config:

<cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:}">
  <core name="jira-issue" instanceDir="jira-issue">
    <property name="dataDir" value="jira-issue/data/" />
  </core>
  <core name="jira-comment" instanceDir="jira-comment">
    <property name="dataDir" value="jira-comment/data/" />
  </core>
  <core name="jira-change-history" instanceDir="jira-change-history">
    <property name="dataDir" value="jira-change-history/data/" />
  </core>
</cores>

This works just fine as a standalone Solr. I duplicated this setup on the same machine under a completely separate Solr installation (solr-nodeb) and modified all the data directories to point to the directories in nodeb. This all worked fine. I then connected the 2 instances together with ZooKeeper, using the settings -Dbootstrap_conf=true -Dcollection.configName=jiraCluster -DzkRun -DnumShards=1 for the first instance and -DzkHost=localhost:9080 for the second. (I'm using Tomcat and ports 8080 and 8081 for the 2 Solr instances.) Now the data directories of the second node point to the data directories in the first node. I have tried many settings in the solrconfig.xml for each core but am now using absolute paths, e.g.

<dataDir>/home//solr-4.2.0-nodeb/example/multicore/jira-comment/data/</dataDir>

Previously I used

${solr.jira-comment.data.dir:/home/tcampbell/solr-4.2.0-nodeb/example/multicore/jira-comment/data}

but that had the same result. It seems ZooKeeper is forcing the data directory config from the uploaded configuration onto all the nodes in the cluster? How can I do testing on a single machine? Do I really need identical directory layouts on all machines?