Can we have [core name] in each log entry?
Can we have [core name] in each log entry? It's hard for us to know which core has an issue, and in what sequence, when there are many cores in a Solr node in a SolrCloud environment. I posted the request to this JIRA ticket: https://issues.apache.org/jira/browse/SOLR-7434 -- View this message in context: http://lucene.472066.n3.nabble.com/Can-we-have-core-name-in-each-log-entry-tp4201186.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr 5.0.0: do a commit alone?
Dear Solr Users, With Solr 3.6, when I wanted to force a commit without sending data, I did: java -jar post.jar. Now with Solr 5.0.0 I use bin/post, but it does not accept doing a commit if I don't give it data, i.e. bin/post -c mydb -commit yes. I want to do that because I have a file of delete actions, where each line contains one reference to delete: bin/post -c mydb -commit no -d "<delete>...</delete>". So I would like to do the commit only after running my file, with a command like bin/post -c mydb -commit yes (without data), which is not accepted by post. Thanks, Sincerely, Bruno
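(A workaround not mentioned in the original mail: a standalone commit can be sent directly to the core's update handler over HTTP. The host, port, and core name below are assumptions matching the example setup.)

```shell
# Issue a commit with no documents attached
# (assumes Solr running on localhost:8983 and a core named "mydb").
curl "http://localhost:8983/solr/mydb/update?commit=true"
```

This requires a running Solr instance, so it is shown as a command sketch rather than a tested script.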
Re: Confusing SOLR 5 memory usage
Tom Evans tevans...@googlemail.com wrote: I do apologise for wasting anyone's time on this, the PEBKAC (my keyboard and chair unfortunately). When adding the new server to haproxy, I updated the label for the balancer entry to the new server, but left the host name the same, so the server that wasn't using any RAM... wasn't getting any requests. No problem at all. On the contrary, thank you for closing the issue. - Toke Eskildsen
'and' stopword in user query is being changed to q.op=AND
Hi All, The 'and' stopword in a user query is being changed to q.op=AND. I am going to look more into this, but I thought of sharing it with the Solr community in case someone has come across this issue. I will also be validating my config and schema in case I am doing something wrong.

solr: 4.9
query parser: edismax

When I search for q=derek and romance, the final parsed query is:

  (+(+DisjunctionMaxQuery((textSpell:derek)) +DisjunctionMaxQuery((textSpell:romance))))/no_coord

  response: {"numFound":0,"start":0,"maxScore":0.0,"docs":[]}

When I search for q=derek romance, the final parsed query is:

  (+(DisjunctionMaxQuery((textSpell:derek)) DisjunctionMaxQuery((textSpell:romance))))/no_coord

  response: {"numFound":1405,"start":0,"maxScore":0.2780709,"docs":[...]}

textSpell field definition:

  <field name="textSpell" type="text_general" indexed="true" stored="false" omitNorms="true" multiValued="true"/>

  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.ClassicTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.ClassicTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    </analyzer>
  </fieldType>

Let me know if any of you need more info. Thanks, Rajesh.
Re. Apache Lucene Article
Hi, My name is Octavia, and I'm a Developer Evangelist at Toptal. We are an online marketplace for software developers, with one important difference: our screening process means that we only accept the top 3% of applicants. We have just released a new Apache Lucene article that I'd love your feedback on if you find a few spare moments: http://www.toptal.com/database/full-text-search-of-dialogues-with-apache-lucene Kind Regards, Octavia Campbell | Developer Evangelist, Toptal | Exclusive access to Top Developers
Re: Can we have [core name] in each log entry?
+1 :) That would be very helpful! Thanks, Stefan

Am 21.04.2015 um 09:07 schrieb forest_soup: Can we have [core name] in each log entry? It's hard for us to know which core has an issue, and in what sequence, when there are many cores in a Solr node in a SolrCloud environment. I posted the request to this JIRA ticket: https://issues.apache.org/jira/browse/SOLR-7434

-- Mit den besten Grüßen aus Nürnberg, Stefan Moises *** Stefan Moises Senior Softwareentwickler Leiter Modulentwicklung shoptimax GmbH Ulmenstrasse 52 H 90443 Nürnberg Amtsgericht Nürnberg HRB 21703 GF Friedrich Schreieck Fax: 0911/25566-29 moi...@shoptimax.de http://www.shoptimax.de ***
SolrCloud 4.8 - Stopping recovery for zkNodeName
Hi All, My Solr log is full of this warning: Stopping recovery for zkNodeName=core_node2 core=utenti_20141121_shard1_replica3. This warning appears very frequently, about every 10 minutes. Is there any way to recover the replica? Or should I remove and recreate the replica? Best regards, Vincenzo -- Vincenzo D'Amore email: v.dam...@gmail.com skype: free.dev mobile: +39 349 8513251
RE: Unsubscribe from Mailing list
Please reply. -----Original Message----- From: Isha Garg [mailto:isha.g...@creditpointe.com] Sent: Monday, April 20, 2015 2:54 PM To: solr-user@lucene.apache.org Subject: Unsubscribe from Mailing list Hi, Can anyone tell me how to unsubscribe from the Solr mailing lists? I tried sending email to 'solr-user-unsubscr...@lucene.apache.org' and 'general-unsubscr...@lucene.apache.org', but it is not working for me. Thanks & Regards, Isha Garg RAGE Frameworks/CreditPointe Services Pvt. Ltd. India Off: +91 (20) 4141 3000 Ext: 3043 www.rageframeworks.com www.creditpointe.com
Re: Add Entry to Support Page
Hi Christoph, You mean https://wiki.apache.org/solr/Support ? If yes, you need to request wiki edit rights by giving your username. Then you can add your company yourself. Ahmet

On Tuesday, April 21, 2015 3:15 PM, Christoph Schmidt christoph.schm...@moresophy.de wrote: Solr Community, I'm Christoph Schmidt (http://www.moresophy.com/de/management), CEO of the German company moresophy GmbH. My Solr wiki name is ChristophSchmidt. We have been working with Lucene since 2003 and with Solr since 2012, and are building linguistic token filters and plugins for Solr. We would like to add the following entry to the Solr Support page: moresophy GmbH: consulting in Lucene, Solr, elasticsearch; specialization in linguistic and semantic enrichment and highly scalable content clouds (DE/AT/CH), contact: i...@moresophy.com Best regards Christoph Schmidt
Re: Can we have [core name] in each log entry?
Yes, with Solr 5.1 we use MDC to log collection, shard, replica and core via SOLR-6673. See http://issues.apache.org/jira/browse/SOLR-6673 In trunk and the unreleased 5.2 there is more logging and context, see http://issues.apache.org/jira/browse/SOLR-7381 It is possible that some places may have been missed. Let's try to find and fix them.

On Tue, Apr 21, 2015 at 12:54 PM, Stefan Moises moi...@shoptimax.de wrote: +1 :) That would be very helpful! Thanks, Stefan

-- Regards, Shalin Shekhar Mangar.
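(For context, not from the original reply: once those MDC values are populated, a log4j conversion pattern can surface them in every log line. The MDC key names below match later Solr default configs and should be checked against your version.)

```properties
# log4j.properties sketch: print collection/shard/replica/core MDC values
# in square brackets on every log line (key names assumed from Solr defaults).
log4j.appender.CONSOLE.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; [%X{collection} %X{shard} %X{replica} %X{core}] %C; %m%n
```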
Re: Solr Index data lost
Shawn, Yes, I had used java -jar start.jar. I haven't tried moving it to a local hard disk, as I wanted to work on two machines (work and home), so I was using a pen drive as the index storage. Yesterday I did the complete indexing, then unplugged the drive from the work machine and connected it to my personal laptop. The data folder didn't exist.

Erick, As per your earlier suggestion, I am using Tika and SolrJ to index the data (both binary files and database content), and it had been committed using the SolrJ UpdateRequest. I was able to see the data in the admin UI screen and even performed some searches on the index, and it worked fine. Thanks & Regards Vijay

On 21 April 2015 at 00:42, Erick Erickson erickerick...@gmail.com wrote: Did you commit before you unplugged the drive? Were you able to see data in the admin UI _before_ you unplugged the drive? Best, Erick

On Mon, Apr 20, 2015 at 3:58 PM, Vijay Bhoomireddy vijaya.bhoomire...@whishworks.com wrote: Shawn, I haven't changed any DirectoryFactory setting in the solrconfig.xml, as I am using a local setup with the default configuration. The device was unmounted successfully (confirmed through the Windows message in the lower right corner). I am using Solr 4.10.2. I simply press Ctrl-C in the Windows command prompt to stop Solr, in the same window where it was started. Please correct me if something has not been done in the correct fashion. Thanks & Regards Vijay

-----Original Message----- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: 20 April 2015 22:34 To: solr-user@lucene.apache.org Subject: Re: Solr Index data lost On 4/20/2015 2:55 PM, Vijay Bhoomireddy wrote: I have configured the Solr example server on a pen drive. I have indexed some content. The data directory was under example/solr/collection1/data, which is the default one. After indexing, I stopped the Solr server, unplugged the pen drive and reconnected it.
Now, when I navigate to the Solr Admin UI, I cannot see any data in the index. Any pointers please? In this case, though the installation was on a pen drive, I think it shouldn't matter to Solr where the data directory is. So I believe this data-folder wiping happened due to the server shutdown. Will the data folder be wiped if the server is restarted or stopped? How do I save the index data across machine failures or planned maintenance?

If you are using the default Directory implementation in your solrconfig.xml (NRTCachingDirectoryFactory for 4.x and later, MMapDirectoryFactory for newer 3.x versions), then everything should be persisted correctly. Did you properly unmount/eject the removable volume before you unplugged it? On a non-Windows OS, you might also want to run the 'sync' command. If you didn't do the unmount/eject, you can't be sure that the filesystem was properly closed and fully up-to-date on the device. What version of Solr did you use, and how exactly did you start Solr and the example? How did you stop Solr? Thanks, Shawn
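(For reference, not from the thread: the default directoryFactory stanza in a 4.x-era solrconfig.xml looks like the following; if it has not been changed, index files are written to ordinary on-disk files and should survive a clean shutdown.)

```xml
<!-- Default in Solr 4.x example configs: NRTCachingDirectoryFactory,
     overridable via the solr.directoryFactory system property. -->
<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
```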
Re: CDATA response is coming with lt; instead of <
It seems this is done in XML(Response)Writer: XML.escapeAttributeValue(stylesheet, writer); I suppose this is valid according to XML escaping rules, but that's just a thought of mine because I don't know those rules strictly. I see the character is being escaped, so what you get is coherent (I mean, I think it's not a mistake). Did you try with another response writer (e.g. JSON)?

On 04/21/2015 03:46 PM, mesenthil1 wrote: We are using DIH for indexing XML files. As part of the XML we have markup enclosed in CDATA. It is getting indexed, but in the response the CDATA content comes back as escaped entities instead of the raw characters.
Suggester
Hello together, I have some problems with the Solr 5.1.0 suggester. I followed the instructions in https://cwiki.apache.org/confluence/display/solr/Suggester and also tried the techproducts example delivered with the binary package, which is working well.

I added a "suggestions" field to the schema:

  <field name="suggestions" type="text_suggest" indexed="true" stored="true" multiValued="true"/>

And added some copies to the field:

  <copyField source="content" dest="suggestions"/>
  <copyField source="title" dest="suggestions"/>
  <copyField source="author" dest="suggestions"/>
  <copyField source="description" dest="suggestions"/>
  <copyField source="keywords" dest="suggestions"/>

The field type definition for "text_suggest" is pretty simple:

  <fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

I also changed the solrconfig.xml to use the suggestions field:

  <searchComponent class="solr.SuggestComponent" name="suggest">
    <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">FuzzyLookupFactory</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">suggestions</str>
      <str name="suggestAnalyzerFieldType">text_general</str>
      <str name="buildOnStartup">false</str>
    </lst>
  </searchComponent>

For tokens originally coming from "title" or "author" I get suggestions, but not any from the content field. So, what do I have to do? Any help is appreciated. Martin
Document Created Date
I am a newbie and just started using Solr 4.10.3. We have successfully indexed a network drive and are running searches. We now have a request to show the Created Date for all documents (PDF/WORD/TXT/XLS) that come back in our search results. I have successfully filtered on the last_modified date, but I cannot figure out how to add a document's Created Date to the schema.xml. We do not want to search on the created date, since the last_modified date handles this; we just want to display it. To my understanding I need to add indexed="false" and stored="true" to the field, but I don't know or understand how the field name will map to the document's created-date property. This is my guess:

  <field name="CreatedDate" type="date" indexed="false" stored="true"/>

Can someone please supply the correct syntax for the field and maybe a brief comment on how Solr maps it to the actual document's property? Also, will I need to re-index the drive to make this change apply? Thanks, Eric
Correct usage for Synonyms.txt
Is my understanding of synonyms.txt configuration correct?

1. When the user can search with any of a list of synonyms and the searchable document can contain any of the synonyms, the configuration should be like below:

  Fuji, Gala, Braeburn, Crisp => Fuji, Gala, Braeburn, Crisp

2. When the user can search with any of a list of synonyms but the searchable document can only contain a preferred term (e.g. Apple):

  Apple, Fuji, Gala, Braeburn, Crisp

  OR

  Fuji, Gala, Braeburn, Crisp => Apple

Is there any other format that I am missing? Thank you, Kaushik
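(For context, not from the original mail: the file is consumed by a SynonymFilterFactory in the field type's analyzer chain. With the plain comma-separated form, expand=true maps every term to all of its synonyms, while expand=false maps them all to the first term listed, which gives the "preferred term" behavior of case 2.)

```xml
<!-- Sketch of the filter that reads synonyms.txt; expand governs
     whether comma-separated groups expand to all terms or collapse
     to the first one. -->
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
```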
Re: Confusing SOLR 5 memory usage
I do apologise for wasting anyone's time on this; the problem was PEBKAC (my keyboard and chair, unfortunately). When adding the new server to haproxy, I updated the label for the balancer entry to the new server but left the host name the same, so the server that wasn't using any RAM... wasn't getting any requests. Again, sorry! Tom

On Tue, Apr 21, 2015 at 11:54 AM, Tom Evans tevans...@googlemail.com wrote: We monitor them with munin, so I have charts if attachments are acceptable?
CDATA response is coming with lt; instead of <
We are using DIH for indexing XML files. As part of the XML we have markup enclosed in CDATA. It is getting indexed, but in the response the CDATA content comes back as escaped entities instead of the raw characters.

Example feed file:

  <add>
    <doc>
      <field name="id">123</field>
      <field name="description_t">abc pqr xyz</field>
      <field name="images_t"><![CDATA[<Images><image><uri>/images/series/chiunks/flipbooksflipbook30_640x480.jpg</uri></image></Images>]]></field>
    </doc>
  </add>

XML response (curl and browser view-source):

  <?xml version="1.0" encoding="UTF-8"?>
  <response>
    <result name="response" numFound="1" start="0">
      <doc>
        <str name="id">123</str>
        <str name="description_t">abc pqr xyz</str>
        <str name="images_t">&lt;Images&gt;&lt;image&gt;&lt;uri&gt;/images/series/chiunks/flipbooksflipbook30_640x480.jpg&lt;/uri&gt;&lt;/image&gt;&lt;/Images&gt;</str>
      </doc>
    </result>
  </response>

Instead, we are looking to get the response within CDATA as well, like below:

  <?xml version="1.0" encoding="UTF-8"?>
  <response>
    <result name="response" numFound="1" start="0">
      <doc>
        <str name="id">123</str>
        <str name="description_t">abc pqr xyz</str>
        <str name="images_t"><![CDATA[<Images><image><uri>/images/series/chiunks/flipbooksflipbook30_640x480.jpg</uri></image></Images>]]></str>
      </doc>
    </result>
  </response>

Can anyone please tell me if this is possible? Thanks, Senthil -- View this message in context: http://lucene.472066.n3.nabble.com/CDATA-response-is-coming-with-lt-instead-of-tp4201271.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: CDATA response is coming with lt; instead of <
Thanks. For wt=json, it brings the results properly. I understand the reason for getting this as &lt;. As our Solr client expects this to be within CDATA, I am looking for a way to achieve that. -- View this message in context: http://lucene.472066.n3.nabble.com/CDATA-response-is-coming-with-lt-instead-of-tp4201271p4201281.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Confusing SOLR 5 memory usage
We monitor them with munin, so I have charts if attachments are acceptable? Having said that, they have only been running for a day with this memory allocation. Describing them: the master consistently has 8GB used for apps and 8GB used in cache, whilst the slave consistently only uses ~1.5GB for apps and 14GB in cache.

We are trying to use our SOLR servers to do a lot more facet queries; previously we were mainly doing searches, and the SolrPerformanceProblems wiki page mentions that faceting (amongst others) requires a lot of JVM heap, so I'm confused why it is not using the heap we've allocated on one server whilst it is on the other. Perhaps our master server needs even more heap? Also, my infra guy is wondering why I asked him to add more memory to the slave server if it is just in cache, although I did try to explain that ideally I'd have even more in cache - we have about 35GB of index data. Cheers Tom

On Tue, Apr 21, 2015 at 11:25 AM, Markus Jelsma markus.jel...@openindex.io wrote: Hi - what do you see if you monitor memory over time? You should see a typical saw tooth. Markus

-----Original message----- From: Tom Evans tevans...@googlemail.com Sent: Tuesday 21st April 2015 12:22 To: solr-user@lucene.apache.org Subject: Confusing SOLR 5 memory usage Hi all, I have two SOLR 5 servers; one is the master and one is the slave. They both have 12 cores, fully replicated and giving identical results when querying them. The only difference in configuration between the two servers is that one is set to slave from the other - identical core configs and solr.in.sh. They both run on identical VMs with 16GB of RAM. In solr.in.sh we are setting the heap size identically: SOLR_JAVA_MEM="-Xms512m -Xmx7168m" The two servers are balanced behind haproxy, and identical numbers and types of queries flow to both servers. Indexing only happens once a day. When viewing the memory usage of the servers, the master server's JVM has 8.8GB RSS, but the slave only has 1.2GB RSS. Can someone hit me with the cluebat please? :) Cheers Tom
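(One way to compare resident memory against actual heap occupancy, not from the original thread: the commands below assume a Linux host, the JDK's jstat on the PATH, and a Solr started via start.jar.)

```shell
# Resident set size (KB) of the Solr JVM
ps -o rss= -p "$(pgrep -f start.jar)"

# Heap-generation occupancy and GC activity, sampled every 5 seconds
jstat -gcutil "$(pgrep -f start.jar)" 5000
```

RSS grows as the JVM actually touches heap pages, so a lightly loaded replica can show a small RSS even with a large -Xmx.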
Add Entry to Support Page
Solr Community, I'm Christoph Schmidt (http://www.moresophy.com/de/management), CEO of the German company moresophy GmbH. My Solr wiki name is: ChristophSchmidt. We have been working with Lucene since 2003 and with Solr since 2012, and are building linguistic token filters and plugins for Solr. We would like to add the following entry to the Solr Support page: moresophy GmbH (http://www.moresophy.com/): consulting in Lucene, Solr, elasticsearch; specialization in linguistic and semantic enrichment and highly scalable content clouds (DE/AT/CH), contact: i...@moresophy.com Best regards Christoph Schmidt ___ Dr. Christoph Schmidt | Geschäftsführer P +49-89-523041-72 M +49-171-1419367 Skype: cs_moresophy christoph.schm...@moresophy.de www.moresophy.com moresophy GmbH | Fraunhoferstrasse 15 | 82152 München-Martinsried Handelsregister beim Amtsgericht München, NR. HRB 136075 Umsatzsteueridentifikationsnummer: DE813188826 Vertreten durch die Geschäftsführer: Prof. Dr. Heiko Beier | Dr. Christoph Schmidt
Re: Suggester
Did you build your suggest dictionary after indexing? Kind of a shot in the dark, but worth a try. Note that the suggest field of your suggester isn't using your text_suggest field type to make suggestions, it's using text_general (your suggestAnalyzerFieldType). IOW, the text may not be analyzed as you expect. Best, Erick

On Tue, Apr 21, 2015 at 7:16 AM, Martin Keller martin.kel...@unitedplanet.com wrote: Hello together, I have some problems with the Solr 5.1.0 suggester. I followed the instructions in https://cwiki.apache.org/confluence/display/solr/Suggester and also tried the techproducts example delivered with the binary package, which is working well.
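(Not from the original reply: the dictionary build suggested above can be triggered per request with suggest.build=true. The host, port, and collection name below are assumptions; the dictionary name matches the mySuggester config in the question.)

```shell
# Rebuild the suggester dictionary, then ask for suggestions
# (assumes Solr on localhost:8983 and a collection named "mycollection").
curl "http://localhost:8983/solr/mycollection/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.build=true&suggest.q=der"
```

This needs a running Solr with the suggest handler configured, so it is a command sketch rather than a tested script.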
Re: CDATA response is coming with lt; instead of <
Escaped entities and CDATA sections are two syntaxes for the same thing. After the two are parsed, they are exactly the same XML information. If your client can only handle one of the two syntaxes, it is not actually using XML. This is not a bug; your client appears misguided. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog)

On Apr 21, 2015, at 7:10 AM, mesenthil1 senthilkumar.arumu...@viacomcontractor.com wrote: Thanks. For wt=json, it brings the results properly. I understand the reason for getting this as &lt;. As our Solr client expects this to be within CDATA, I am looking for a way to achieve that.
Re: Unsubscribe from Mailing list
Did you follow Ere's link and try that? * There's a wiki page about possible issues and solutions for unsubscribing, see https://wiki.apache.org/solr/Unsubscribing%20from%20mailing%20lists Or look here: http://lucene.apache.org/solr/resources.html (see the unsubscribe link). NOTE: you must use the _exact_ same e-mail address you subscribed with. Or have you Googled "solr unsubscribe"? In short, what have you tried that's not working? Best, Erick

On Tue, Apr 21, 2015 at 6:16 AM, Isha Garg isha.g...@creditpointe.com wrote: Please reply. -----Original Message----- From: Isha Garg Sent: Monday, April 20, 2015 2:54 PM To: solr-user@lucene.apache.org Subject: Unsubscribe from Mailing list Hi, Can anyone tell me how to unsubscribe from the Solr mailing lists? I tried sending email to 'solr-user-unsubscr...@lucene.apache.org' and 'general-unsubscr...@lucene.apache.org', but it is not working for me. Thanks & Regards, Isha Garg
Re: Document Created Date
Not really sure what you're asking here, I must be missing something. The mapping is through the field name supplied, so as long as your input XML has something like

<add>
  <doc>
    <field name="CreatedDate">your date here</field>
  </doc>
</add>

it should be fine. You can use date math here as well, as:

<field name="CreatedDate">NOW</field>

Best,
Erick

On Tue, Apr 21, 2015 at 7:57 AM, Eric Meisler eric.meis...@veritablelp.com wrote:

I am a newbie and just started using Solr 4.10.3. We have successfully indexed a network drive and are running searches. We now have a request to show the Created Date for all documents (PDF/WORD/TXT/XLS) that come back in our search results. I have successfully filtered on the last_modified date, but I cannot figure out how to add a document's Created Date to the schema.xml. We do not want to search on the created date, since the last_modified date handles this; we just want to display it. To my understanding I need to add indexed="false" and stored="true" to the field, but I don't know how the XML name will map to the document's created-date property. This is my guess:

<field name="CreatedDate" type="date" indexed="false" stored="true"/>

Can someone please supply the correct syntax for the XML and maybe a brief comment on how Solr maps to the actual document's property? Also, will I need to re-index the drive to make this change apply?

Thanks,
Eric
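As Erick notes, Solr maps nothing automatically here: you supply the field value yourself when building the update message. A sketch of one way to do that, reading a file's timestamp and formatting it in the ISO-8601 UTC form Solr date fields expect; the field name CreatedDate follows Eric's guess, and note that on Unix `getctime` is metadata-change time, not true creation time:

```python
import os
import tempfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

def solr_date(path):
    """Format a file's ctime in the ISO-8601 UTC form Solr date fields expect."""
    ts = os.path.getctime(path)  # on Unix: metadata-change time, not creation time
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def add_doc_xml(path):
    """Build a minimal <add> update message carrying a CreatedDate field."""
    return ("<add><doc>"
            f'<field name="id">{escape(path)}</field>'
            f'<field name="CreatedDate">{solr_date(path)}</field>'
            "</doc></add>")

with tempfile.NamedTemporaryFile(suffix=".txt") as f:
    print(add_doc_xml(f.name))
```

Since the field is stored="true" and indexed="false", the value comes back in search results but costs nothing at query time.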
Re: CDATA response is coming with &lt; instead of
On Tue, Apr 21, 2015 at 9:46 AM, mesenthil1 senthilkumar.arumu...@viacomcontractor.com wrote:

We are using DIH for indexing XML files. As part of the XML we have XML enclosed within CDATA. It is getting indexed, but in the response the CDATA content is coming back as decoded terms instead of symbols.

Your problem is ambiguous, since we can't tell what is data and what is markup (transfer syntax). If you were to index this same data using JSON, what would you pass? Is it this:

<Images><imageuri>...

Or is it this?

<![CDATA[<Images><imageuri>...

If it's the former, you're already set - it's working that way now. If it's the latter, then if you index that in XML you will need to escape it like any other XML value. Otherwise the XML parser will remove the CDATA wrapper before the data gets to the indexing part of Solr.

-Yonik
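Yonik's second case can be made concrete: if the literal CDATA-wrapped string really is the data you want indexed, it must be escaped before it goes into an XML update message, or the parser strips the wrapper. A sketch under that assumption (the field name and URL below are made up):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical field value: the literal markup, CDATA wrapper and all.
literal = "<![CDATA[<Images><imageuri>http://example.com/a.jpg</imageuri></Images>]]>"

# To index the literal string through an XML update message, escape it
# like any other XML text value...
field = f'<field name="images">{escape(literal)}</field>'

# ...so the XML parser hands Solr back the original string intact:
assert ET.fromstring(field).text == literal
print("round-trips intact")
```

Without the `escape()` step, the parser would consume the `<![CDATA[ ... ]]>` wrapper as transfer syntax and only the inner character data would reach the indexer, which is exactly the ambiguity Yonik describes.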
Re: Checking of Solr Memory and Disk usage
On 4/21/2015 7:48 PM, Zheng Lin Edwin Yeo wrote:

Does anyone know a way to check the accurate memory and disk usage for each individual collection that's running in Solr? I'm using Solr 5.0.0 with 3 instances of external ZooKeeper 3.4.6, running on 2 shards.

Solr's admin UI will tell you the amount of disk space used by each *core* on the Overview tab for the core ... but you have to add them all up if you want to know that for each *collection*. The admin UI does have a Heap memory reading on the Overview tab for each core, but I do not know how accurate it is, or what exactly is being measured there. I am pretty sure that there is far more memory being used by each of my cores than I can see listed there, so it must not be counting everything.

Thanks,
Shawn
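The adding-up Shawn describes can be scripted against the CoreAdmin STATUS response, which reports an index size per core. A sketch, assuming the common SolrCloud core-naming convention `<collection>_shardN_replicaN` (the sample response below is fabricated for illustration, not a real server's output):

```python
def disk_usage_per_collection(status_json):
    """Sum per-core index sizeInBytes, grouped by collection name.

    Assumes core names follow the usual <collection>_shardN_replicaN
    pattern; anything without "_shard" is grouped under its full name."""
    totals = {}
    for core_name, core in status_json["status"].items():
        collection = core_name.split("_shard")[0]
        totals[collection] = totals.get(collection, 0) + core["index"]["sizeInBytes"]
    return totals

# Fabricated sample shaped like /solr/admin/cores?action=STATUS&wt=json output:
sample = {"status": {
    "coll1_shard1_replica1": {"index": {"sizeInBytes": 100}},
    "coll1_shard2_replica1": {"index": {"sizeInBytes": 150}},
    "coll2_shard1_replica1": {"index": {"sizeInBytes": 70}},
}}
print(disk_usage_per_collection(sample))  # → {'coll1': 250, 'coll2': 70}
```

This gives per-collection disk usage; per-collection heap usage is harder, since cores share one JVM heap and the caches are not cleanly attributable.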
Re: Solr went on recovery multiple time.
After changing the ZooKeeper timeout from 10 sec to 45/50 sec and monitoring for a long time, I can observe the servers went into recovery multiple times, but the exceptions are somewhat different:

INFO - 2015-04-22 09:02:47.943; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@6ad2b64e name:ZooKeeperConnection Watcher:bot1:2181,bot2:2181,bot3:2181,bot4:2181,bot5:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
INFO - 2015-04-22 09:02:47.944; org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper
INFO - 2015-04-22 09:02:47.944; org.apache.solr.common.cloud.ConnectionManager$1; Connection with ZooKeeper reestablished.
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_en_shard1_replica4 core=dict_en_shard1_replica4
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_cn_shard1_replica2 core=dict_cn_shard1_replica2
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_hk_shard1_replica4 core=dict_hk_shard1_replica4
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_jp_shard1_replica3 core=dict_jp_shard1_replica3
WARN - 2015-04-22 09:02:47.945; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_vn_shard1_replica3 core=dict_vn_shard1_replica3
WARN - 2015-04-22 09:02:47.945; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_th_shard1_replica3 core=dict_th_shard1_replica3
WARN - 2015-04-22 09:02:47.945; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_nl_shard1_replica2 core=dict_nl_shard1_replica2
INFO - 2015-04-22 09:02:47.945; org.apache.solr.cloud.ZkController; publishing core=rn0 state=down
INFO - 2015-04-22 09:02:47.945; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-04-22 09:02:47.951; org.apache.solr.client.solrj.impl.HttpClientUtil; Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
ERROR - 2015-04-22 09:02:48.010; org.apache.solr.common.SolrException; :org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /overseer/queue/qn-
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
	at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:218)
	at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:215)
	at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
	at org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:215)
	at org.apache.solr.cloud.DistributedQueue.createData(DistributedQueue.java:284)
	at org.apache.solr.cloud.DistributedQueue.offer(DistributedQueue.java:271)
	at org.apache.solr.cloud.ZkController.publish(ZkController.java:1011)
	at org.apache.solr.cloud.ZkController.publish(ZkController.java:976)
	at org.apache.solr.handler.admin.CoreAdminHandler$2.run(CoreAdminHandler.java:811)
INFO - 2015-04-22 09:02:48.010; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
INFO - 2015-04-22 09:02:48.012; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=rn0 recoveringAfterStartup=false
INFO - 2015-04-22 09:02:48.016; org.apache.solr.cloud.ZkController; publishing core=rn0 state=recovering
INFO - 2015-04-22 09:02:48.017; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-04-22 09:02:48.020; org.apache.solr.client.solrj.impl.HttpClientUtil; Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-went-on-recovery-multiple-time-tp4196249p4201508.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Checking of Solr Memory and Disk usage
On 4/21/2015 11:33 PM, Zheng Lin Edwin Yeo wrote: I've got the amount of disk space used, but for the Heap Memory Usage reading, it is showing the value -1. Do we need to change any settings for it? When I check from the Windows Task Manager, it is showing about 300MB for shard1 and 150MB for shard2. But I suppose that is the usage for the entire Solr and not for individual collection. That -1 sounds like a bug, but I'd like others to have a chance to chime in before you open an issue in Jira. My Solr instances are older -- 4.7.2 and 4.9.1. One of the larger cores on a 4.7.2 server shows a heap memory value of 86656138 -- about 82MB. I have no way to verify, but this seems very low to me. Thanks, Shawn
Solr 4.10.x regression in map-reduce contrib
Hello list,

I'm using mapreduce from contrib and I get this stack trace: https://gist.github.com/ralph-tice/b1e84bdeb64532c7ecab whenever I specify <luceneMatchVersion>4.10</luceneMatchVersion> in my solrconfig.xml. 4.9 works fine. I'm using 4.10.4 artifacts for both map-reduce runs. I tried raising maxWarmingSearchers to 20 and setting openSearcher to false in my configs, with no difference. I have started studying the code, but why would BatchWriter invoke warming (autowarming?) on a close, let alone open a new searcher? Should I be looking in Lucene or Solr code to investigate this regression?

I also notice there are interesting defaults for FaultTolerance in SolrReducer that don't appear to be documented: https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/SolrReducer.java#L70-L73 but reading https://issues.apache.org/jira/browse/SOLR-5758 it sounds like they are either unimportant or overlooked?

Also, we will probably be testing the map-reduce contrib with 5.x; has anyone been successful with this yet, or are there any known issues? I don't see a lot of changes in contrib/map-reduce...

Regards,
--Ralph Tice
ralph.t...@gmail.com
Re: Add Entry to Support Page
Hi Christoph,

I've added your wiki name to the ContributorsGroup page, so you should now be able to edit pages on the wiki.

Steve

On Apr 21, 2015, at 8:15 AM, Christoph Schmidt christoph.schm...@moresophy.de wrote:

Solr Community,

I'm Christoph Schmidt (http://www.moresophy.com/de/management), CEO of the German company moresophy GmbH. My Solr wiki name is:

- ChristophSchmidt

We have been working with Lucene since 2003 and Solr since 2012, and are building linguistic token filters and plugins for Solr. We would like to add the following entry to the Solr Support page:

moresophy GmbH: consulting in Lucene, Solr, elasticsearch, specialization in linguistic and semantic enrichment and high scalability content clouds (DE/AT/CH) <a class="mailto" href="mailto:i...@moresophy.com">i...@moresophy.com</a>

Best regards,
Christoph Schmidt

___
Dr. Christoph Schmidt | Geschäftsführer
P +49-89-523041-72
M +49-171-1419367
Skype: cs_moresophy
christoph.schm...@moresophy.de
www.moresophy.com
moresophy GmbH | Fraunhoferstrasse 15 | 82152 München-Martinsried
Handelsregister beim Amtsgericht München, NR. HRB 136075
Umsatzsteueridentifikationsnummer: DE813188826
Vertreten durch die Geschäftsführer: Prof. Dr. Heiko Beier | Dr. Christoph Schmidt
Confusing SOLR 5 memory usage
Hi all I have two SOLR 5 servers, one is the master and one is the slave. They both have 12 cores, fully replicated and giving identical results when querying them. The only difference between configuration on the two servers is that one is set to slave from the other - identical core configs and solr.in.sh. They both run on identical VMs with 16GB of RAM. In solr.in.sh, we are setting the heap size identically: SOLR_JAVA_MEM=-Xms512m -Xmx7168m The two servers are balanced behind haproxy, and identical numbers and types of queries flow to both servers. Indexing only happens once a day. When viewing the memory usage of the servers, the master server's JVM has 8.8GB RSS, but the slave only has 1.2GB RSS. Can someone hit me with the cluebat please? :) Cheers Tom
RE: Confusing SOLR 5 memory usage
Hi - what do you see if you monitor memory over time? You should see a typical saw tooth.

Markus

-Original message-
From: Tom Evans tevans...@googlemail.com
Sent: Tuesday 21st April 2015 12:22
To: solr-user@lucene.apache.org
Subject: Confusing SOLR 5 memory usage
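The saw tooth Markus mentions is the JVM filling the heap and then collecting: utilization climbs steadily, then drops sharply at each GC cycle (observable with a tool like `jstat -gcutil <pid> 1000`). As an illustration of reading such samples, here is a small sketch that counts the teeth in a series of utilization percentages; the numbers below are made up, not real jstat output:

```python
def count_collections(old_gen_samples):
    """Count heap-utilization drops in a series of old-gen % samples.

    Each drop between consecutive samples is one tooth of the saw:
    the heap filled up and a GC cycle reclaimed space."""
    return sum(1 for prev, cur in zip(old_gen_samples, old_gen_samples[1:])
               if cur < prev)

# Made-up utilization percentages, e.g. sampled once per second:
samples = [10.0, 35.5, 60.2, 82.1, 12.3, 40.0, 75.8, 15.1]
print(count_collections(samples))  # → 2
```

A healthy server shows this pattern on both master and slave; a flat line near zero, as Tom saw on the slave, usually means the node is simply not doing any work.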
Solr search with various combinations of space, hyphen, casing and punctuations
I posted this question on Stack Overflow, but wanted to bring it to the attention of the Solr mailing list as well: http://stackoverflow.com/questions/29783237/solr-search-with-various-combinations-of-space-hyphen-casing-and-punctuations

Thanks in advance,
Venkat Sudheer Reddy Aedama

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-search-with-various-combinations-of-space-hyphen-casing-and-punctuations-tp4201405.html
Sent from the Solr - User mailing list archive at Nabble.com.