Re: Solr - complete setup (update)
On 2019-01-30 07:33, Stephan Bosch wrote:
> (forgot to CC mailing list)
>
> On 26/01/2019 at 20:07, Joan Moreau via dovecot wrote:
> > > > *- Bugs so far*
> > > >
> > > > -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)
> >
> > *You can check that regularly in the Dovecot log file. My guess is that the mix of Unicode is not properly handled here.*
>
> Does this happen with specific messages? Do you have a sample message for me? I don't see how Unicode could cause this.

My only guess is that it relies on some 'strlen', which is of course wrong for Unicode emails. This is just a guess. But do a grep for "huge" in the Dovecot log of a busy server to find examples. (Sorry, I switched to Xapian, as Solr was causing too much trouble on my server, so I have no more concrete examples.)

> > > > -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)
> >
> > *This is just highly visible in Solr's schema.xml. Switching it to "long" in schema.xml returns plenty of errors.*
>
> I cannot reproduce this so far (see modified schema below). In a simple test I just get the desired results and no errors logged.

I got this with large mailboxes (where the UID seems not to be acceptable to Solr). The fault is not on the Dovecot side but on Solr's: the UID(s) returned for a search are garbage instead of proper values -> storing it as a string solves this.

> > > > -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs
> > >
> > > Can you elaborate on the errors you have seen so far? When do these happen? How can I reproduce them?
> >
> > *Honestly, I have no clue what the problems are. I just increased the memory of the JVM and the system stopped crashing. Log files are huge anyway.*
>
> What errors do you see? I see only INFO entries in my /var/solr/logs/solr.log. Looks like Solr is pretty verbose by default (lots of INFO output), but there must be a way to reduce that.

I deleted Solr. No more logs. Maybe someone else can tell.

> [modified schema.xml not preserved in the archive; only "id" survives]
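
A small sketch of the 'strlen' guess made above. It is not code from the fts_solr plugin, and nothing in this thread confirms that the plugin measures headers this way; it only shows that a byte-oriented length (what C's strlen() reports) can exceed the character count of a UTF-8 header. The sample headers are made up for illustration.

```python
from typing import Tuple


def utf8_lengths(text: str) -> Tuple[int, int]:
    """Return (character_count, utf8_byte_count) for text."""
    return len(text), len(text.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical headers, chosen only to mix ASCII and non-ASCII text.
    samples = (
        "Subject: Hello",
        "Subject: Réunion à 15h",
        "Subject: 日本語のメール",
    )
    for header in samples:
        chars, octets = utf8_lengths(header)
        print(f"{header!r}: {chars} characters, {octets} bytes")
```

For pure ASCII the two numbers agree; they diverge as soon as accented or CJK characters appear, which is the situation the guess is about.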
Re: Solr - complete setup (update)
(forgot to CC mailing list)

On 26/01/2019 at 20:07, Joan Moreau via dovecot wrote:
> > > *- Bugs so far*
> > >
> > > -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)
>
> *You can check that regularly in the Dovecot log file. My guess is that the mix of Unicode is not properly handled here.*

Does this happen with specific messages? Do you have a sample message for me? I don't see how Unicode could cause this.

> > > -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)
>
> *This is just highly visible in Solr's schema.xml. Switching it to "long" in schema.xml returns plenty of errors.*

I cannot reproduce this so far (see modified schema below). In a simple test I just get the desired results and no errors logged.

> > > -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs
> >
> > Can you elaborate on the errors you have seen so far? When do these happen? How can I reproduce them?
>
> *Honestly, I have no clue what the problems are. I just increased the memory of the JVM and the system stopped crashing. Log files are huge anyway.*

What errors do you see? I see only INFO entries in my /var/solr/logs/solr.log. Looks like Solr is pretty verbose by default (lots of INFO output), but there must be a way to reduce that.

Regards,

Stephan.

[modified schema.xml not preserved in the archive; surviving fragments:]
id positionIncrementGap="0"/> autoGeneratePhraseQueries="true" positionIncrementGap="100"> generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" catenateAll="1" catenateWords="1" preserveOriginal="1"/> autoGeneratePhraseQueries="true"> stored="true"/> stored="true"/> stored="true"/>
Re: Solr - complete setup (update)
> > *- Installation:*
> >
> > -> Create a clean install using the defaults (at least in the Archlinux package), and run "sudo -u solr solr create -c dovecot". The config files are then in /opt/solr/server/solr/dovecot/conf and the data files in /opt/solr/server/solr/dovecot/data
>
> On my system (Debian) these directories are wildly different (e.g. data is under /var), but other than that, this information is OK.
>
> Used this as a side-reference for Debian installation: https://tecadmin.net/install-apache-solr-on-debian/
>
> Accessed http://solr-host.tld:8983/solr/ to check whether all is OK.

Make sure you have a dovecot instance (not the default instance), created with the command below:

    solr create -c dovecot (or whatever name)

> Weirdly, rescan returns immediately here. When I perform `doveadm index INBOX` for my test user, I do see a lot of fts and HTTP activity.

The Solr plugin is not entirely coded; the refresh and rescan functions are missing:

https://github.com/dovecot/core/blob/master/src/plugins/fts-solr/fts-backend-solr.c

    static int fts_backend_solr_refresh(struct fts_backend *backend ATTR_UNUSED)
    {
        return 0;
    }

    static int fts_backend_solr_rescan(struct fts_backend *backend)
    {
        /* FIXME: proper rescan needed. for now we'll just reset the
           last-uids */
        return fts_backend_reset_last_uids(backend);
    }

> > *- Bugs so far*
> >
> > -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)

You can check that regularly in the Dovecot log file. My guess is that the mix of Unicode is not properly handled here.

> > -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)

This is just highly visible in Solr's schema.xml. Switching it to "long" in schema.xml returns plenty of errors.

> > -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs
>
> Can you elaborate on the errors you have seen so far? When do these happen? How can I reproduce them?

Honestly, I have no clue what the problems are. I just increased the memory of the JVM and the system stopped crashing. Log files are huge anyway.
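
The log check suggested above can be sketched as a small script. The thread only quotes the warning as "huge header", so the match is a loose, case-insensitive search for "huge"; the default log path is an assumption and depends on your `log_path` or syslog setup.

```python
import os
import sys
from typing import Iterable, List


def find_huge_lines(lines: Iterable[str], needle: str = "huge") -> List[str]:
    """Return the lines that contain needle, ignoring case."""
    wanted = needle.lower()
    return [line.rstrip("\n") for line in lines if wanted in line.lower()]


if __name__ == "__main__":
    # Assumed default location; pass the real log file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/dovecot.log"
    if os.path.isfile(path):
        with open(path, encoding="utf-8", errors="replace") as log:
            for hit in find_huge_lines(log):
                print(hit)
    else:
        print(f"log file not found: {path}")
```

Any hit gives a timestamp and usually a user or mailbox, which is a starting point for finding a sample message that triggers the warning.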
Re: Solr - complete setup (update)
On 26/01/2019 at 15:24, Hendrik Boom wrote:
> On Sat, Jan 26, 2019 at 01:44:16PM +0100, Stephan Bosch wrote:
> > Hi Joan,
> >
> > On 14/01/2019 at 07:44, Joan Moreau via dovecot wrote:
> > > Hi Stephan,
> > >
> > > What's up with that?
> > >
> > > Thank you so much
> > >
> > > On 2019-01-05 02:04, Stephan Bosch wrote:
> >
> > Debian does something weird here. It doesn't use an explicit systemd unit. It is generated from the SysV init file. I ended up setting the ulimits in /etc/security/limits.conf for user solr.
>
> Please make sure the changes you make don't make your Debian package *require* systemd. There are Debian-derived distros that avoid systemd.

Don't worry, I am not working on packaging this. I just want to know what the problems are and how these can be solved, so that we can update the wiki.

Regards,

Stephan.
Re: Solr - complete setup (update)
On Sat, Jan 26, 2019 at 01:44:16PM +0100, Stephan Bosch wrote:
> Hi Joan,
>
> On 14/01/2019 at 07:44, Joan Moreau via dovecot wrote:
> >
> > Hi Stephan,
> >
> > What's up with that?
> >
> > Thank you so much
> >
> > On 2019-01-05 02:04, Stephan Bosch wrote:
> > >
> > > Hi,
> > >
> > > On 04/01/2019 at 05:36, Joan Moreau via dovecot wrote:
> > > > ... ...
> > > >
> > > > -> The systemd unit shall specify high ulimit for files and proc
> > > > (see below)
>
> Debian does something weird here. It doesn't use an explicit systemd unit.
> It is generated from the SysV init file. I ended up setting the ulimits in
> /etc/security/limits.conf for user solr.

Please make sure the changes you make don't make your Debian package *require* systemd. There are Debian-derived distros that avoid systemd.

-- hendrik
Re: Solr - complete setup (update)
Hi Joan,

On 14/01/2019 at 07:44, Joan Moreau via dovecot wrote:
> Hi Stephan,
>
> What's up with that?
>
> Thank you so much
>
> On 2019-01-05 02:04, Stephan Bosch wrote:
> > Hi,
> >
> > On 04/01/2019 at 05:36, Joan Moreau via dovecot wrote:
> > > Hi
> > >
> > > This is the summary of my work with SOLR-Dovecot, in my *quest to reproduce the previously excellent work of fts_squat*
> > >
> > > @Aki: Based on the time I have spent on this, I would love to see you update the Wiki with these improvements and add my name somewhere
> > >
> > > @All: Hope it helps
> > >
> > > *- Installation:*
> > >
> > > -> Create a clean install using the defaults (at least in the Archlinux package), and run "sudo -u solr solr create -c dovecot". The config files are then in /opt/solr/server/solr/dovecot/conf and the data files in /opt/solr/server/solr/dovecot/data

On my system (Debian) these directories are wildly different (e.g. data is under /var), but other than that, this information is OK.

Used this as a side-reference for Debian installation: https://tecadmin.net/install-apache-solr-on-debian/

Accessed http://solr-host.tld:8983/solr/ to check whether all is OK.

> > > -> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:
> > >    * around line 313, change false to true
> > >    * around line 147, set 2000 (or above)
> > >    * around line 696: uncomment hdr
> > >    * around line 1127, before class="solr.UUIDUpdateProcessorFactory" name="uuid"/>, add
> > >    * around line 1161, delete the whole class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
> > >    * around line 1192, remove the whole ... />

Applied these changes. We should probably provide an example config file on the Wiki that incorporates all this, or maybe a diff.

We also need to evaluate what the merit of all of this is. I did something similar in my previous effort, but it was all based on getting an error from Solr and then removing that section of the config file on the assumption that it wasn't needed. So far, I have little clue what these things are and why they are enabled by default.

As I said in an earlier mail, there is an option to leave some of this cruft out at backend initialization, but I haven't tried that yet.

> > > -> Remove /opt/solr/server/solr/dovecot/conf/managed-schema
> > >
> > > -> Replace "schema.xml" with the one below to reproduce the fts_squat behavior (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf) (note: a lot of trouble to replace a single-line setup, anyway...)

Did that too.

> > > -> Move /opt/solr/server/solr (or the data subfolder) to a partition with *space*, ideally on ext4 or a faster file system (it looks like Solr is not considering using a simple MySQL database, which would make sense to avoid all the fuss and let it move to a non-Java state, but that is another story)

Skipped that.

> > > -> The dovecot.conf configuration is as below

I also enabled debug for fts_solr.

> > > -> The systemd unit shall specify a high ulimit for files and processes (see below)

Debian does something weird here. It doesn't use an explicit systemd unit. It is generated from the SysV init file. I ended up setting the ulimits in /etc/security/limits.conf for user solr.

> > > -> Increase the memory available to the Java VM (I put 12 GB as I have quite a lot of space on my server, but you may adapt it to your specs): in /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"

Skipped that.

> > > -> As Solr complains a lot, you may consider filtering it in your syslog-ng or journald, as it greatly pollutes your audit files

What does it complain about and when does it happen? I haven't seen much logging from Solr so far.

> > > -> (Re)start Solr (first) and Dovecot with systemctl
> > >
> > > -> Launch the reindex (doveadm fts rescan -u )
> > >
> > > -> Wait a long while to let the system re-index all your mailboxes

Weirdly, rescan returns immediately here. When I perform `doveadm index INBOX` for my test user, I do see a lot of fts and HTTP activity.

> > > *- Bugs so far*
> > >
> > > -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)
> > >
> > > -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)
> > >
> > > -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs

Can you elaborate on the errors you have seen so far? When do these happen? How can I reproduce them?

Regards,

Stephan.

> > > *---SCHEMA.XML in /opt/solr/server/solr/dovecot/conf*
> > >
> > > [schema.xml content not preserved in the archive; surviving fragments:]
> > > id autoGeneratePhraseQueries="true" positionIncrementGap="100"> catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" catenateAll="1" catenateWords="1" preserveOriginal="1"/>
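
Whichever way the limits are raised (systemd unit or /etc/security/limits.conf), it helps to check what a process actually gets. A minimal sketch, assuming a Unix system where Python's `resource` module is available; the threshold of 65000 is the value from the unit file quoted elsewhere in this thread, and the function names are made up for illustration.

```python
import resource

WANTED = 65000  # LimitNOFILE / LimitNPROC value quoted in this thread


def limit_ok(soft: int, wanted: int = WANTED) -> bool:
    """True if a soft limit is unlimited or at least `wanted`."""
    return soft == resource.RLIM_INFINITY or soft >= wanted


def report() -> dict:
    """Map limit names to (soft, hard, ok) for the current process."""
    result = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        which = getattr(resource, name, None)  # RLIMIT_NPROC is not on every platform
        if which is None:
            continue
        soft, hard = resource.getrlimit(which)
        result[name] = (soft, hard, limit_ok(soft))
    return result


if __name__ == "__main__":
    for name, (soft, hard, ok) in report().items():
        print(f"{name}: soft={soft} hard={hard} {'OK' if ok else 'too low'}")
```

The script reports the limits of the process that runs it, so the numbers only reflect Solr's limits if it is started the same way as Solr (same user, same session or service mechanism).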
Re: Solr - complete setup (update)
Yes, the "-property update.autoCreateFields -value false" option seems interesting.

However, we overwrite the created schema right afterwards.

On 2019-01-14 23:25, Stephan Bosch wrote:
> On 14/01/2019 at 07:44, Joan Moreau via dovecot wrote:
> > Hi Stephan,
> >
> > What's up with that?
> >
> > Thank you so much
>
> Working on it, somewhat anyway.
>
> BTW, did you see this?:
>
> """
> $ sudo -u solr /opt/solr/bin/solr create -c dovecot
> WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use.
> To turn off: bin/solr config -c dovecot -p 8983 -action set-user-property -property update.autoCreateFields -value false
> INFO - 2019-01-14 23:19:56.831; org.apache.solr.util.configuration.SSLCredentialProviderFactory; Processing SSL Credential Provider chain: env;sysprop
> Created new core 'dovecot'
> """
>
> I'll be trying your steps first, but the mentioned command might at least get rid of some of the cruft in the default config file.
>
> Regards,
>
> Stephan.
>
> > On 2019-01-05 02:04, Stephan Bosch wrote:
> > > Hi,
> > >
> > > On 04/01/2019 at 05:36, Joan Moreau via dovecot wrote:
> > > > Hi
> > > >
> > > > This is the summary of my work with SOLR-Dovecot, in my *quest to reproduce the previously excellent work of fts_squat*
> > > >
> > > > @Aki: Based on the time I have spent on this, I would love to see you update the Wiki with these improvements and add my name somewhere
> > > >
> > > > @All: Hope it helps
> > >
> > > I'll be going through the description below soon. I've recently independently installed fts-solr from scratch. Although this wasn't a flawless effort, I managed to get some basic indexing going. From this mail thread I understand that there are quite a few more problems than I've seen myself so far. Then again, I didn't perform extensive tests with actual searches.
> > >
> > > Maybe we can turn all this into a test suite that we can run internally here at Dovecot. At the very least, the described Dovecot bugs need to be addressed and the wiki needs to be updated.
> > >
> > > I'll get back to you.
> > >
> > > Regards,
> > >
> > > Stephan.
> > >
> > > > *- Installation:*
> > > >
> > > > -> Create a clean install using the defaults (at least in the Archlinux package), and run "sudo -u solr solr create -c dovecot". The config files are then in /opt/solr/server/solr/dovecot/conf and the data files in /opt/solr/server/solr/dovecot/data
> > > >
> > > > -> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:
> > > >    * around line 313, change false to true
> > > >    * around line 147, set 2000 (or above)
> > > >    * around line 696: uncomment hdr
> > > >    * around line 1127, before , add
> > > >    * around line 1161, delete the whole
> > > >    * around line 1192, remove the whole
> > > >
> > > > -> Remove /opt/solr/server/solr/dovecot/conf/managed-schema
> > > >
> > > > -> Replace "schema.xml" with the one below to reproduce the fts_squat behavior (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf) (note: a lot of trouble to replace a single-line setup, anyway...)
> > > >
> > > > -> Move /opt/solr/server/solr (or the data subfolder) to a partition with *space*, ideally on ext4 or a faster file system (it looks like Solr is not considering using a simple MySQL database, which would make sense to avoid all the fuss and let it move to a non-Java state, but that is another story)
> > > >
> > > > -> The dovecot.conf configuration is as below
> > > >
> > > > -> The systemd unit shall specify a high ulimit for files and processes (see below)
> > > >
> > > > -> Increase the memory available to the Java VM (I put 12 GB as I have quite a lot of space on my server, but you may adapt it to your specs): in /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"
> > > >
> > > > -> As Solr complains a lot, you may consider filtering it in your syslog-ng or journald, as it greatly pollutes your audit files
> > > >
> > > > -> (Re)start Solr (first) and Dovecot with systemctl
> > > >
> > > > -> Launch the reindex (doveadm fts rescan -u )
> > > >
> > > > -> Wait a long while to let the system re-index all your mailboxes
> > > >
> > > > *- Bugs so far*
> > > >
> > > > -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)
> > > >
> > > > -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)
> > > >
> > > > -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs
> > > >
> > > > *---SCHEMA.XML in /opt/solr/server/solr/dovecot/conf*
> > > >
> > > > [schema.xml content not preserved in the archive; only "id" survives]
> > > >
> > > > *-- DOVECOT.CONF*
> > > >
> > > > mail_plugins = fts fts_solr
> > > > plugin {
> > > >   plugin = fts fts_solr managesieve sieve
> > > >   fts = solr
> > > >   fts_autoindex = yes
> > > >   fts_enforced = yes
> > > >   fts_solr = url=http://127.0.0.1:8983/solr/dovecot/
> > > >   (replace 127.0.0.1 by your solr server if you want to use an external server)
> > > >   (...)
> > > > }
> > > >
> > > > *-- /etc/systemd/system/multi-user.target.wants/solr.service*
> > > >
> > > > [Unit]
> > > > Description=Solr full text search engine
> > > > After=network.target
> > > >
> > > > [Service]
> > > > Type=simple
> > > > User=solr
> > > > Group=solr
> > > > PrivateTmp=yes
> > > > WorkingDirectory=/opt/solr
> > > > *LimitNOFILE=65000*
> > > > *LimitNPROC=65000*
Re: Solr - complete setup (update)
On 14/01/2019 at 07:44, Joan Moreau via dovecot wrote:
> Hi Stephan,
>
> What's up with that?
>
> Thank you so much

Working on it, somewhat anyway.

BTW, did you see this?:

"""
$ sudo -u solr /opt/solr/bin/solr create -c dovecot
WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use.
To turn off: bin/solr config -c dovecot -p 8983 -action set-user-property -property update.autoCreateFields -value false
INFO - 2019-01-14 23:19:56.831; org.apache.solr.util.configuration.SSLCredentialProviderFactory; Processing SSL Credential Provider chain: env;sysprop
Created new core 'dovecot'
"""

I'll be trying your steps first, but the mentioned command might at least get rid of some of the cruft in the default config file.

Regards,

Stephan.

> On 2019-01-05 02:04, Stephan Bosch wrote:
> > Hi,
> >
> > On 04/01/2019 at 05:36, Joan Moreau via dovecot wrote:
> > > Hi
> > >
> > > This is the summary of my work with SOLR-Dovecot, in my *quest to reproduce the previously excellent work of fts_squat*
> > >
> > > @Aki: Based on the time I have spent on this, I would love to see you update the Wiki with these improvements and add my name somewhere
> > >
> > > @All: Hope it helps
> >
> > I'll be going through the description below soon. I've recently independently installed fts-solr from scratch. Although this wasn't a flawless effort, I managed to get some basic indexing going. From this mail thread I understand that there are quite a few more problems than I've seen myself so far. Then again, I didn't perform extensive tests with actual searches.
> >
> > Maybe we can turn all this into a test suite that we can run internally here at Dovecot. At the very least, the described Dovecot bugs need to be addressed and the wiki needs to be updated.
> >
> > I'll get back to you.
> >
> > Regards,
> >
> > Stephan.
> >
> > > *- Installation:*
> > >
> > > -> Create a clean install using the defaults (at least in the Archlinux package), and run "sudo -u solr solr create -c dovecot". The config files are then in /opt/solr/server/solr/dovecot/conf and the data files in /opt/solr/server/solr/dovecot/data
> > >
> > > -> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:
> > >    * around line 313, change false to true
> > >    * around line 147, set 2000 (or above)
> > >    * around line 696: uncomment hdr
> > >    * around line 1127, before class="solr.UUIDUpdateProcessorFactory" name="uuid"/>, add
> > >    * around line 1161, delete the whole class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
> > >    * around line 1192, remove the whole ... />
> > >
> > > -> Remove /opt/solr/server/solr/dovecot/conf/managed-schema
> > >
> > > -> Replace "schema.xml" with the one below to reproduce the fts_squat behavior (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf) (note: a lot of trouble to replace a single-line setup, anyway...)
> > >
> > > -> Move /opt/solr/server/solr (or the data subfolder) to a partition with *space*, ideally on ext4 or a faster file system (it looks like Solr is not considering using a simple MySQL database, which would make sense to avoid all the fuss and let it move to a non-Java state, but that is another story)
> > >
> > > -> The dovecot.conf configuration is as below
> > >
> > > -> The systemd unit shall specify a high ulimit for files and processes (see below)
> > >
> > > -> Increase the memory available to the Java VM (I put 12 GB as I have quite a lot of space on my server, but you may adapt it to your specs): in /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"
> > >
> > > -> As Solr complains a lot, you may consider filtering it in your syslog-ng or journald, as it greatly pollutes your audit files
> > >
> > > -> (Re)start Solr (first) and Dovecot with systemctl
> > >
> > > -> Launch the reindex (doveadm fts rescan -u )
> > >
> > > -> Wait a long while to let the system re-index all your mailboxes
> > >
> > > *- Bugs so far*
> > >
> > > -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)
> > >
> > > -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)
> > >
> > > -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs
> > >
> > > *---SCHEMA.XML in /opt/solr/server/solr/dovecot/conf*
> > >
> > > [schema.xml content not preserved in the archive; surviving fragments:]
> > > id autoGeneratePhraseQueries="true" positionIncrementGap="100"> catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" catenateAll="1" catenateWords="1" preserveOriginal="1"/> autoGeneratePhraseQueries="true"> stored="true"/> stored="true"/> stored="true"/> stored="true"/>
> > >
> > > *-- DOVECOT.CONF*
> > >
> > > mail_plugins = fts fts_solr
> > > plugin {
> > >   plugin = fts fts_solr managesieve sieve
> > >   fts = solr
> > >   fts_autoindex = yes
> > >   fts_enforced = yes
> > >   fts_solr =
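
The `bin/solr config ... -action set-user-property` command shown in the warning above is, to my understanding, a wrapper around Solr's Config API. A minimal sketch of the equivalent HTTP request, assuming the core is named `dovecot` and Solr listens on 127.0.0.1:8983 as elsewhere in this thread; the function name is made up, and the request is only built here, not sent.

```python
import json
import urllib.request


def autocreate_off_request(base_url: str, core: str) -> urllib.request.Request:
    """Build (but do not send) a Config API request that sets the
    user property update.autoCreateFields to "false" for one core."""
    url = f"{base_url.rstrip('/')}/{core}/config"
    body = json.dumps({"set-user-property": {"update.autoCreateFields": "false"}})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = autocreate_off_request("http://127.0.0.1:8983/solr", "dovecot")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually apply it: urllib.request.urlopen(req)
```

Sending the request changes the core's configuration, so it is left commented out; run it only against a core you intend to modify.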
Re: Solr - complete setup (update)
Hi Stephan,

What's up with that?

Thank you so much

On 2019-01-05 02:04, Stephan Bosch wrote:
> Hi,
>
> On 04/01/2019 at 05:36, Joan Moreau via dovecot wrote:
> > Hi
> >
> > This is the summary of my work with SOLR-Dovecot, in my *quest to reproduce the previously excellent work of fts_squat*
> >
> > @Aki: Based on the time I have spent on this, I would love to see you update the Wiki with these improvements and add my name somewhere
> >
> > @All: Hope it helps
>
> I'll be going through the description below soon. I've recently independently installed fts-solr from scratch. Although this wasn't a flawless effort, I managed to get some basic indexing going. From this mail thread I understand that there are quite a few more problems than I've seen myself so far. Then again, I didn't perform extensive tests with actual searches.
>
> Maybe we can turn all this into a test suite that we can run internally here at Dovecot. At the very least, the described Dovecot bugs need to be addressed and the wiki needs to be updated.
>
> I'll get back to you.
>
> Regards,
>
> Stephan.
>
> > *- Installation:*
> >
> > -> Create a clean install using the defaults (at least in the Archlinux package), and run "sudo -u solr solr create -c dovecot". The config files are then in /opt/solr/server/solr/dovecot/conf and the data files in /opt/solr/server/solr/dovecot/data
> >
> > -> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:
> >    * around line 313, change false to true
> >    * around line 147, set 2000 (or above)
> >    * around line 696: uncomment hdr
> >    * around line 1127, before , add
> >    * around line 1161, delete the whole
> >    * around line 1192, remove the whole
> >
> > -> Remove /opt/solr/server/solr/dovecot/conf/managed-schema
> >
> > -> Replace "schema.xml" with the one below to reproduce the fts_squat behavior (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf) (note: a lot of trouble to replace a single-line setup, anyway...)
> >
> > -> Move /opt/solr/server/solr (or the data subfolder) to a partition with *space*, ideally on ext4 or a faster file system (it looks like Solr is not considering using a simple MySQL database, which would make sense to avoid all the fuss and let it move to a non-Java state, but that is another story)
> >
> > -> The dovecot.conf configuration is as below
> >
> > -> The systemd unit shall specify a high ulimit for files and processes (see below)
> >
> > -> Increase the memory available to the Java VM (I put 12 GB as I have quite a lot of space on my server, but you may adapt it to your specs): in /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"
> >
> > -> As Solr complains a lot, you may consider filtering it in your syslog-ng or journald, as it greatly pollutes your audit files
> >
> > -> (Re)start Solr (first) and Dovecot with systemctl
> >
> > -> Launch the reindex (doveadm fts rescan -u )
> >
> > -> Wait a long while to let the system re-index all your mailboxes
> >
> > *- Bugs so far*
> >
> > -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)
> >
> > -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)
> >
> > -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs
> >
> > *---SCHEMA.XML in /opt/solr/server/solr/dovecot/conf*
> >
> > [schema.xml content not preserved in the archive; only "id" survives]
> >
> > *-- DOVECOT.CONF*
> >
> > mail_plugins = fts fts_solr
> > plugin {
> >   plugin = fts fts_solr managesieve sieve
> >   fts = solr
> >   fts_autoindex = yes
> >   fts_enforced = yes
> >   fts_solr = url=http://127.0.0.1:8983/solr/dovecot/
> >   (replace 127.0.0.1 by your solr server if you want to use an external server)
> >   (...)
> > }
> >
> > *-- /etc/systemd/system/multi-user.target.wants/solr.service*
> >
> > [Unit]
> > Description=Solr full text search engine
> > After=network.target
> >
> > [Service]
> > Type=simple
> > User=solr
> > Group=solr
> > PrivateTmp=yes
> > WorkingDirectory=/opt/solr
> > *LimitNOFILE=65000*
> > *LimitNPROC=65000*
> > ExecStart=/opt/solr/bin/solr start -f
> >
> > [Install]
> > WantedBy=multi-user.target
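
The SOLR_HEAP value in the quoted steps is simply the heap size expressed in megabytes (12 x 1024 = 12288). A tiny sketch for adapting it to another machine; the helper name is made up for illustration.

```python
def solr_heap_setting(gib: int) -> str:
    """Format a heap size given in GiB as the line to put in solr.in.sh."""
    if gib <= 0:
        raise ValueError("heap size must be positive")
    return f'SOLR_HEAP="{gib * 1024}m"'


if __name__ == "__main__":
    for size in (2, 4, 12):
        print(solr_heap_setting(size))
```

How much heap a given mail store needs is not established in this thread; 12 GB is just what the original poster used.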
Re: Solr - complete setup (update)
Hi,

On 04/01/2019 at 05:36, Joan Moreau via dovecot wrote:
> Hi
>
> This is the summary of my work with SOLR-Dovecot, in my *quest to reproduce the previously excellent work of fts_squat*
>
> @Aki: Based on the time I have spent on this, I would love to see you update the Wiki with these improvements and add my name somewhere
>
> @All: Hope it helps

I'll be going through the description below soon. I've recently independently installed fts-solr from scratch. Although this wasn't a flawless effort, I managed to get some basic indexing going. From this mail thread I understand that there are quite a few more problems than I've seen myself so far. Then again, I didn't perform extensive tests with actual searches.

Maybe we can turn all this into a test suite that we can run internally here at Dovecot. At the very least, the described Dovecot bugs need to be addressed and the wiki needs to be updated.

I'll get back to you.

Regards,

Stephan.

> *- Installation:*
>
> -> Create a clean install using the defaults (at least in the Archlinux package), and run "sudo -u solr solr create -c dovecot". The config files are then in /opt/solr/server/solr/dovecot/conf and the data files in /opt/solr/server/solr/dovecot/data
>
> -> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:
>    * around line 313, change false to true
>    * around line 147, set 2000 (or above)
>    * around line 696: uncomment hdr
>    * around line 1127, before class="solr.UUIDUpdateProcessorFactory" name="uuid"/>, add
>    * around line 1161, delete the whole class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
>    * around line 1192, remove the whole name="add-unknown-fields-to-the-schema" ... />
>
> -> Remove /opt/solr/server/solr/dovecot/conf/managed-schema
>
> -> Replace "schema.xml" with the one below to reproduce the fts_squat behavior (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf) (note: a lot of trouble to replace a single-line setup, anyway...)
>
> -> Move /opt/solr/server/solr (or the data subfolder) to a partition with *space*, ideally on ext4 or a faster file system (it looks like Solr is not considering using a simple MySQL database, which would make sense to avoid all the fuss and let it move to a non-Java state, but that is another story)
>
> -> The dovecot.conf configuration is as below
>
> -> The systemd unit shall specify a high ulimit for files and processes (see below)
>
> -> Increase the memory available to the Java VM (I put 12 GB as I have quite a lot of space on my server, but you may adapt it to your specs): in /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"
>
> -> As Solr complains a lot, you may consider filtering it in your syslog-ng or journald, as it greatly pollutes your audit files
>
> -> (Re)start Solr (first) and Dovecot with systemctl
>
> -> Launch the reindex (doveadm fts rescan -u )
>
> -> Wait a long while to let the system re-index all your mailboxes
>
> *- Bugs so far*
>
> -> Line 620 of the fts_solr Dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)
>
> -> The UID returned by Solr is to be considered as a STRING (and that is maybe the source of the "out of bound" errors in fts_solr Dovecot, as "long" is not enough)
>
> -> Java errors: a lot of nonsense to me, I am no expert in Java. But with increased memory it seems not to crash, even if it complains quite a lot in the logs
>
> *---SCHEMA.XML in /opt/solr/server/solr/dovecot/conf*
>
> [schema.xml content not preserved in the archive; surviving fragments:]
> id autoGeneratePhraseQueries="true" positionIncrementGap="100"> catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" catenateAll="1" catenateWords="1" preserveOriginal="1"/> autoGeneratePhraseQueries="true"> stored="true"/> stored="true"/> stored="true"/> stored="true"/>
>
> *-- DOVECOT.CONF*
>
> mail_plugins = fts fts_solr
> plugin {
>   plugin = fts fts_solr managesieve sieve
>   fts = solr
>   fts_autoindex = yes
>   fts_enforced = yes
>   fts_solr = url=http://127.0.0.1:8983/solr/dovecot/
>   (replace 127.0.0.1 by your solr server if you want to use an external server)
>   (...)
> }
>
> *-- /etc/systemd/system/multi-user.target.wants/solr.service*
>
> [Unit]
> Description=Solr full text search engine
> After=network.target
>
> [Service]
> Type=simple
> User=solr
> Group=solr
> PrivateTmp=yes
> WorkingDirectory=/opt/solr
> *LimitNOFILE=65000*
> *LimitNPROC=65000*
> ExecStart=/opt/solr/bin/solr start -f
>
> [Install]
> WantedBy=multi-user.target
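
Before reindexing, it can be worth checking that the URL given in `fts_solr` really points at the core. A minimal sketch, assuming that `fts_solr` holds space-separated `key=value` words and that the core answers on Solr's per-core `admin/ping` path; both function names are made up, and nothing is sent over the network here.

```python
from urllib.parse import urljoin


def parse_fts_solr(value: str) -> dict:
    """Split an fts_solr value into a dict; bare words become True."""
    settings = {}
    for word in value.split():
        key, sep, val = word.partition("=")
        settings[key] = val if sep else True
    return settings


def ping_url(settings: dict) -> str:
    """Derive the core's ping URL from the configured base URL."""
    base = settings["url"]
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, "admin/ping")


if __name__ == "__main__":
    cfg = parse_fts_solr("url=http://127.0.0.1:8983/solr/dovecot/")
    print(ping_url(cfg))
    # Fetch that URL with curl or urllib.request.urlopen() to see whether
    # Solr answers for this core.
```

If the ping fails while http://127.0.0.1:8983/solr/ itself loads, the core name in the URL is the first thing to check.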