Re: [General] Buffer overflow
Hi Philippe,

On 03/20/2013 01:10 PM, Philippe DE ROCHAMBEAU wrote:

Hi Alexander,

The problem is that version 3.3.12 is the only one available on the Redhat Repository.

The info below makes me think that you're using the RPM you previously downloaded from our site. This is a similar RPM we built for 3.3.13:

http://www.mnogosearch.org/Download/RPMS/mnogosearch-3.3.13-01.static.glibc-2.12.x86_64.rpm

I suggest you download it and upgrade.

---
Yum info mnogosearch
Loaded plugins: product-id, rhnplugin, security, subscription-manager
Updating certificate-based repositories.
Unable to read consumer identity
Installed Packages
Name        : mnogosearch
Arch        : x86_64
Version     : 3.3.12
Release     : 01.static
Size        : 15 M
Repo        : installed
Summary     : Full-featured MySQL based web search engine.
URL         : http://www.mnogosearch.org/
License     : GNU GPL Version 2
Description : mnoGoSearch is a full-featured MySQL based web search engine. mnoGoSearch consists of
            : two parts. The first part is an indexing mechanism (indexer). The indexer walks over
            : html hypertext references and stores found words and new references into a database.
            : The second part is a web CGI front-end to provide search using data collected by the
            : indexer.
            :
            : A PHP and a Perl front-ends are also available from our site http://www.mnogosearch.org/.
            :
            : mnoGoSearch first release took place in November 1998. The search engine was named
            : UDMSearch until the project was acquired by Lavtech.Com Corp. in October 2000 and
            : its name changed to mnoGoSearch.
--
Best regards,
Philippe

-----Original Message-----
From: Alexander Barkov [mailto:b...@mnogosearch.org]
Sent: 20 March 2013 09:50
To: Philippe DE ROCHAMBEAU
Cc: general@mnogosearch.org
Subject: Re: [General] Buffer overflow

Hi Philippe,

So you're actually running mnogosearch-3.3.12 (not 3.3.13 as you reported in the first letter). This problem should be fixed in 3.3.13. This is from the 3.3.13 ChangeLog:

Bug#4803 buffer overflow detected with search.cgi was fixed.

Please download 3.3.13 from our site and reinstall.

Greetings.

On 03/20/2013 12:32 PM, Philippe DE ROCHAMBEAU wrote:

Hi,

uname --all
Linux xxx 2.6.32-279.22.1.el6.x86_64 #1 SMP Sun Jan 13 09:21:40 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
---
[root@xxx cgi-bin]# ./search.cgi a
*** buffer overflow detected ***: ./search.cgi terminated
======= Backtrace: =========
[0x52dae5]
[0x52da7e]
[0x52d523]
[0x52d408]
[0x440c98]
[0x44d247]
[0x4171dd]
[0x404566]
[0x4b6056]
[0x405201]
======= Memory map: ========
0040-00685000 r-xp fd:00 334904 /var/www/cgi-bin/search.cgi
00885000-008e rw-p 00285000 fd:00 334904 /var/www/cgi-bin/search.cgi
008e-008ec000 rw-p 00:00 0
02484000-0251d000 rw-p 00:00 0 [heap]
399c40-399c42 r-xp fd:00 318247 /lib64/ld-2.12.so
399c42-399c61f000 ---p 0002 fd:00 318247 /lib64/ld-2.12.so
399c61f000-399c62 r--p 0001f000 fd:00 318247 /lib64/ld-2.12.so
399c62-399c621000 rw-p 0002 fd:00 318247 /lib64/ld-2.12.so
399c621000-399c622000 rw-p 00:00 0
399cc0-399cd89000 r-xp fd:00 318254 /lib64/libc-2.12.so
399cd89000-399cf89000 ---p 00189000 fd:00 318254 /lib64/libc-2.12.so
399cf89000-399cf8d000 r--p 00189000 fd:00 318254 /lib64/libc-2.12.so
399cf8d000-399cf8e000 rw-p 0018d000 fd:00 318254 /lib64/libc-2.12.so
399cf8e000-399cf93000 rw-p 00:00 0
7fc85941b000-7fc859541000 rw-p 00:00 0
7fc85994d000-7fc859a95000 rw-p 00:00 0
7fc859a95000-7fc859aa1000 r-xp fd:00 318269 /lib64/libnss_files-2.12.so
7fc859aa1000-7fc859ca1000 ---p c000 fd:00 318269 /lib64/libnss_files-2.12.so
7fc859ca1000-7fc859ca2000 r--p c000 fd:00 318269 /lib64/libnss_files-2.12.so
7fc859ca2000-7fc859ca3000 rw-p d000 fd:00 318269 /lib64/libnss_files-2.12.so
7fff73931000-7fff73946000 rw-p 00:00 0 [stack]
7fff739ff000-7fff73a0 r-xp 00:00 0 [vdso]
ff60-ff601000 r-xp 00:00 0 [vsyscall]
Aborted (core dumped)
--
[root@xxx cgi-bin]# gdb search.cgi
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free
Re: [General] Buffer overflow
399cf89000-399cf8d000 r--p 00189000 fd:00 318254 /lib64/libc-2.12.so
399cf8d000-399cf8e000 rw-p 0018d000 fd:00 318254 /lib64/libc-2.12.so
399cf8e000-399cf93000 rw-p 00:00 0
77ce6000-77de6000 rw-p 00:00 0
77de6000-77df2000 r-xp fd:00 318269 /lib64/libnss_files-2.12.so
77df2000-77ff2000 ---p c000 fd:00 318269 /lib64/libnss_files-2.12.so
77ff2000-77ff3000 r--p c000 fd:00 318269 /lib64/libnss_files-2.12.so
77ff3000-77ff4000 rw-p d000 fd:00 318269 /lib64/libnss_files-2.12.so
77ffd000-77ffe000 rw-p 00:00 0
77ffe000-77fff000 r-xp 00:00 0 [vdso]
7ffea000-7000 rw-p 00:00 0 [stack]
ff60-ff601000 r-xp 00:00 0 [vsyscall]

Program received signal SIGABRT, Aborted.
0x0047199b in ?? ()
(gdb) backtrace
#0 0x0047199b in ?? ()
#1 0x004be10b in ?? ()
#2 0x004ca57e in ?? ()
#3 0x0052dae5 in ?? ()
#4 0x0052da7e in ?? ()
#5 0x0052d523 in ?? ()
#6 0x0052d408 in ?? ()
#7 0x00440c98 in ?? ()
#8 0x0044d247 in ?? ()
#9 0x004171dd in ?? ()
#10 0x00404566 in ?? ()
#11 0x004b6056 in ?? ()
#12 0x00405201 in ?? ()
#13 0x7fffe5d8 in ?? ()
#14 0x in ?? ()
(gdb)

- Philippe

-----Original Message-----
From: Alexander Barkov [mailto:b...@mnogosearch.org]
Sent: 20 March 2013 11:28
To: Philippe DE ROCHAMBEAU; general@mnogosearch.org
Subject: Re: [General] Buffer overflow

Hi Philippe,

On 03/20/2013 01:10 PM, Philippe DE ROCHAMBEAU wrote:

Hi Alexander,

The problem is that version 3.3.12 is the only one available on the Redhat Repository.

The info below makes me think that you're using the RPM you previously downloaded from our site. This RPM is a similar RPM we built for 3.3.13:

http://www.mnogosearch.org/Download/RPMS/mnogosearch-3.3.13-01.static.glibc-2.12.x86_64.rpm

I suggest to download it and upgrade.
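The upgrade advice in this thread hinges on whether the installed package predates 3.3.13, the release whose ChangeLog lists the fix for Bug#4803. The following is only a sketch of such a check: the helper names `version_lt` and `needs_upgrade` are made up for illustration, and it assumes a `sort` that supports `-V` (GNU coreutils). The `rpm` lines are left as comments because they only make sense on an RPM-based host.

```shell
#!/bin/sh
# Hypothetical helpers, not part of mnoGoSearch.

# version_lt A B: succeeds when version A sorts strictly before version B.
# Assumes a sort(1) with -V (version sort), as in GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# needs_upgrade VERSION: prints "upgrade" if VERSION is older than 3.3.13,
# otherwise prints "ok".
needs_upgrade() {
    if version_lt "$1" "3.3.13"; then
        echo "upgrade"
    else
        echo "ok"
    fi
}

# On an RPM-based host (assumption: the package is named mnogosearch):
# installed=$(rpm -q --queryformat '%{VERSION}' mnogosearch)
# needs_upgrade "$installed"
# rpm -Uvh http://www.mnogosearch.org/Download/RPMS/mnogosearch-3.3.13-01.static.glibc-2.12.x86_64.rpm
```

Querying the package database rather than trusting memory avoids the mix-up seen above, where 3.3.13 was reported but 3.3.12 was actually installed.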
[General] ANNOUNCE: mnoGoSearch-3.3.14
Hello,

mnoGoSearch-3.3.14 is now available from http://www.mnogosearch.org/

- DOCX and RTF built-in parsers were added.

- It's now possible to use the $(ConfDir), $(ShareDir), $(VarDir), $(TmpDir) template variables in search.htm, e.g.:

  Include $(ConfDir)/common.inc
  DBAddr sqlite3:///$(VarDir)/mnogosearch.sqlite3/

  Previously these variables were understood only in indexer.conf.

- A minor fix in installation layout was made: the --docdir parameter to configure is now respected, and the HTML documentation is now installed to PREFIX/share/doc/mnogosearch/ by default. Previously --docdir was ignored, and the documentation was installed to PREFIX/doc/.

- A number of minor bugs were fixed.

The full ChangeLog can be found at:
http://www.mnogosearch.org/doc33/msearch-changelog.html#changelog-3-3-14

Greetings.

___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general
Re: [General] Install mnogosearch
Hi,

On 06/01/2013 06:36 PM, Mapluz Dev wrote:

Hi, I am trying to install mnogosearch on Debian release 6. When I run the command:

./configure --with-mysql

I get this message:

checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking target system type... i686-pc-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking whether make sets $(MAKE)... (cached) yes
checking whether build environment is sane... yes
checking for gcc... gcc
checking whether the C compiler works... no
configure: error: in `/home/francis/Downloads/mnogosearch-3.3.14':
configure: error: C compiler cannot create executables
See `config.log' for more details

Can you send config.log please?

Can you help me? Thanks

--
VBLC Signature Développement Mapluz - MAPLUZ http://www.mapluz.fr
Re: [General] Install mnogosearch
On 06/01/2013 07:06 PM, Mapluz Dev wrote:

skip
checking whether the C compiler works... no
configure: error: in `/home/francis/Downloads/mnogosearch-3.3.14':
configure: error: C compiler cannot create executables
See `config.log' for more details

Can you send config.log please?

Yes, see attached file. Thanks.

I think this line is the most important:

/usr/bin/ld: crt1.o: No such file: No such file or directory

A quick search returns many pages explaining how to fix this. For example, have a look at this one:

http://www.businesscorner.co.uk/usrbinld-crt1-o-no-such-file-no-such-file-or-directory/
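The linker error above means `ld` could not find the C runtime startup object `crt1.o`. As a rough sketch, the helper below (the name `has_crt1` is made up here) checks whether that file exists under a set of directories. The directories listed and the suggested package are assumptions: on Debian, `crt1.o` is normally shipped by `libc6-dev`, which `build-essential` pulls in.

```shell
#!/bin/sh
# Hypothetical helper: has_crt1 DIR...
# Succeeds if a file named crt1.o exists somewhere under any given directory.
has_crt1() {
    for dir in "$@"; do
        if [ -n "$(find "$dir" -name crt1.o 2>/dev/null | head -n 1)" ]; then
            return 0
        fi
    done
    return 1
}

# Possible use on Debian (assumed library paths; run as root):
# has_crt1 /usr/lib /usr/lib32 /usr/lib64 || apt-get install build-essential
```

Once the file is present, re-running `./configure --with-mysql` should get past the "C compiler cannot create executables" check, provided nothing else is missing.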
Re: [General] CheckOnly for all unkown file types
Hi,

On 06/13/2013 01:22 PM, W. de Hoog wrote:

Hi, I would like to configure mnogosearch so that it stores the path of every file it does not parse. CheckOnly, however, does not use the MIME type but a regex. It would be nice to have, for example:

AddType application/unknown *.*
CheckOnly Match Mime application/unknown

How could this be done?

Unfortunately, there is no feature like this.

best regards,
Willem-Jan de Hoog
Re: [General] blob, single or multi parameters
Hi,

On 10/03/2013 02:53 PM, d...@mapluz.fr wrote:

Hi, I have installed mnogosearch release 3.3.14 with a MySQL database. I created the tables with this command:

./indexer -Ecreate -d /usr/local/mnogosearch/etc/indexer.conf

In my indexer.conf I have this:

DBAddr mysql://root:mypassword@localhost/mnogosearchactu/?DBMode=multi

When I try to run a search, I get this message:

Inverted word index not found. Probably you forgot to run 'indexer -Eblob'.

So I ran this command:

./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf

I guess you forgot to fix DBAddr in search.htm to match the one in indexer.conf.

and everything runs, but I have questions:

1 - Why must I run this command: ./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf ? I do not understand the -Eblob parameter.

2 - In my crontab, is this line correct for indexing every day at 23h?

00 23 * * * /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf

With DBMode=multi, crawling and indexing are done at the same time. The advantage is that the search index is always up to date with what the crawler has already downloaded.

With DBMode=blob, crawling and indexing are separated in time. The advantage of DBMode=blob is that it is much faster at search time than DBMode=multi. But it needs an extra step, indexer --index (or indexer -Eblob; these commands are synonyms), to bring the index up to date after the crawler has downloaded a number of documents with new content (i.e. both new documents and old documents that have changed since the last crawl).

The choice between DBMode=multi and DBMode=blob depends on the database size and the search performance you need.

- If your document collection is rather small and you're happy with the search performance provided by DBMode=multi, then use this command in both indexer.conf and search.htm:

  DBAddr mysql://root:mypassword@localhost/mnogosearchactu/?DBMode=multi

  The command in crontab is OK in this case.

- If your document collection is rather big, and/or you prefer faster search results, then use this DBAddr in both indexer.conf and search.htm:

  DBAddr mysql://root:mypassword@localhost/mnogosearchactu/?DBMode=blob

  In this case, the crontab task should do two things consecutively:

  # Crawling
  /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf
  # Indexing
  /usr/local/mnogosearch/sbin/indexer --index -d /usr/local/mnogosearch/etc/indexer.conf

  It's a good idea to put these two commands into a shell script, then use it from crontab.

Now you can try switching search.htm between DBMode=blob and DBMode=multi and compare performance.

If you decide to stay with DBMode=multi, then just copy DBAddr from indexer.conf to search.htm.

If you decide to switch to DBMode=blob, then it's a good idea to start from scratch:

1. Drop the tables in the current database that were created for DBMode=multi:
   indexer --drop
2. Edit indexer.conf and search.htm, and change DBMode to blob.
3. Create tables for DBMode=blob:
   indexer --create
4. Crawl your document collection:
   indexer
5. Create the index:
   indexer --index
6. Search.

Thanks a lot for your responses.
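The two-step cron job recommended for DBMode=blob could be wrapped roughly as follows. This is a sketch under assumptions: the function name `crawl_and_index`, the `INDEXER` and `CONF` variables, and the script path in the crontab comment are illustrative, while the `indexer` paths and flags are the ones quoted in the message.

```shell
#!/bin/sh
# Sketch of a wrapper for cron: crawl first, then rebuild the blob index.
# INDEXER and CONF default to the paths used in this thread and can be
# overridden from the environment.
INDEXER=${INDEXER:-/usr/local/mnogosearch/sbin/indexer}
CONF=${CONF:-/usr/local/mnogosearch/etc/indexer.conf}

crawl_and_index() {
    # Step 1: crawling. Stop here if the crawler fails.
    "$INDEXER" -d "$CONF" || return 1
    # Step 2: indexing (per the message, --index is a synonym of -Eblob).
    "$INDEXER" --index -d "$CONF"
}

# To use it as a script, call the function on the last line:
# crawl_and_index
#
# Hypothetical crontab entry, every day at 23:00:
# 00 23 * * * /usr/local/bin/crawl-and-index.sh
```

Running both steps from one script guarantees that indexing starts only after crawling has finished, which two independent crontab entries at fixed times cannot promise.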
Re: [General] scans subdirectories websites
Hi,

On 10/29/2013 06:38 PM, d...@mapluz.fr wrote:

Hi, does the search engine scan the subdirectories of websites? If not, is there a setting for this in indexer.conf?

Can you please clarify what you mean by "subdirectories websites"?

If you have a command like:

Server http://sitename.com/

then it does go into this site's subdirectories, e.g. http://sitename.com/dir/, if a link to them is found.

Thanks
Re: [General] Data in database mnogosearch
Hi,

On 10/30/2013 01:08 PM, d...@mapluz.fr wrote:

Hi, I'm using mnogosearch 3.2 on a Linux Ubuntu server.

Hmm, 3.2 sounds very old. Why not 3.3?

I have created a MySQL database, and my indexer.conf file is here:
http://www.mapluz.fr/public/indexer.conf

I initialize the indexer with this command:

/usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf

I run the indexer with this command:

/usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexerer.conf

They are to be run in the opposite order:

1. Crawl documents:
   /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexerer.conf
2. Index the documents collected by the crawler:
   /usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf

My search with the PHP API sample (the search.php sample provided by mnogosearch) returns no results. So perhaps my database has a problem. I have a question about the bdict table; here is an example:
http://www.mapluz.fr/public/capture.jpg

Why is the bdict table so small?

Try to run this command again:

/usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf

Does the size of the table bdict change?

Thanks
Re: [General] Indexer RSS flux
Hi,

On 11/04/2013 04:57 PM, d...@mapluz.fr wrote:

Hi, I want to index RSS feeds on my web site. Is there a parameter in indexer.conf to do this?

Can you please give a link to an example of an RSS file you'd like to index, or copy and paste an RSS fragment?

Thanks.

Thanks
d...@mapluz.fr
Re: [General] Indexer RSS flux
Hi,

On 11/06/2013 01:00 PM, d...@mapluz.fr wrote:

Hi, here is an example of an RSS feed:
http://www.sudouest.fr/pyrenees-atlantiques/bayonne/rss.xml

Let's have a look at the <item> tags, for example:

<item>
<title>Pays basque : un troupeau de brebis pris au piège par la montée des eaux</title>
<description>Un troupeau de brebis est tombé dans une rivière, 61 ont été sauvées, 47 sont mortes. Des brebis à l’âme aventureuse. Un troupeau d’Irissarry, au nord de Saint-Jean-Pied-de-Port, a voulu explorer les alentours de son pâturage, mardi, en fin de matinée. Bien mal lui en a pris. Les animaux, du fait de la montée des eaux,...</description>
<pubDate>Wed, 06 Nov 2013 10:14:38 +0100</pubDate>
<link>http://www.sudouest.fr/2013/11/06/des-brebis-se-tuent-dans-un-ravin-1221272-4181.php#xtor=RSS-10521769</link>
<guid>http://www.sudouest.fr/2013/11/06/des-brebis-se-tuent-dans-un-ravin-1221272-4181.php#xtor=RSS-10521769</guid>
</item>

What would you like to do with every <item> instance?

Have the crawler follow the URL given in the tag <link>...</link> and index the content of the document referenced by this URL?

Or just create a new URL entry in the database and index the information inside the <title>..</title> and <content>..</content>, without crawling to the given URL?

Thanks for your help
[General] ANNOUNCE: mnoGoSearch 3.3.15
Hello,

mnoGoSearch-3.3.15 is now available. Sources and binaries for a number of platforms are available from http://www.mnogosearch.org/.

This is mostly a bug-fix release. Starting from this release, binaries for the console (non-GUI, Unix-alike) version are available for the Windows family of operating systems.

From ChangeLog:

* The default search template improvements were made. Query words are now highlighted differently when displaying a list of found documents (using bold font) and when displaying a cached copy of a document (using yellow background).
* A section about installation of the mnoGoSearch PHP module was added into the docbook manual.
* The EREGCUT template operator was added, to remove sub-strings matching a regular expression pattern from a string.
* Bug#4820 mirror files exceed platform limit for file name length was fixed.
* A few potential vulnerabilities found by the Veracode static analyzer were fixed (Bug#4826).
* A few warnings reported by the clang compiler were fixed.
* Fixed that the words having non-ASCII letters were not highlighted when displaying cached copy in cases when the document character set differs from LocalCharset (a bug since 3.3.13).
* Fixed that the Microsoft SQL Server driver always used quotes in a USE dbname; query when connecting to the server, assuming that QUOTED_IDENTIFIERS is set to ON, which is not necessarily always the case (a bug since 3.3.12). Now quotes are used only for database names starting with a digit.
* Fixed that popularity rank calculation did not work with Microsoft SQL Server.
* Fixed a bug in the Microsoft SQL Server driver which reported one extra byte (for the trailing 0x00) when fetching character data from the server. This bug made indexer and search.cgi behave unexpectedly in rare cases.
* Bug#4825 Redirect: Bad URL: redirected locations not indexed was fixed.
* Fixed that make bin-dist did not work in some cases.
Re: [General] wordstat
Hi,

On 12/31/2013 01:35 PM, Developpement Team Hodei wrote:

Hi, I'm using mnogosearch 3.2 on an Ubuntu server. I index with DBMode=blob.

I want to use autocompletion in my input textbox when a user searches on the web. Where does mnogosearch store the words it has indexed? My wrdstat table is empty, but there seem to be words in bdict.

Run indexer --wordstat to populate the table.

Btw, which tools are you going to use to implement autocompletion? It would be nice to make autocompletion available out of the box in the next development branch (3.4.x).

Thanks
Re: [General] separate indexing/crawling and web recherche
Hi,

On 01/06/2014 02:56 PM, Developpement Team Hodei wrote:

Hi, I have installed mnogosearch 3.2 on an Ubuntu server with wamp and a MySQL database. In my indexer.conf I'm using DBMode=blob.

In my crontab I have this:

00 23 1 * * /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf /usr/local/mnogosearch/sbin/logmng 2/usr/local/mnogosearch/sbin/logmng2
30 23 1 * * /usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf /usr/local/mnogosearch/sbin/logmng 2/usr/local/mnogosearch/sbin/logmng2

My question: to improve performance, is it necessary to separate the crawling/indexing part of the mnogosearch engine from the part users run their web searches against? In other words, is it necessary to create two servers:

* 1 Ubuntu server with wamp / mnogosearch, the MySQL database and the crontab commands
* 1 Ubuntu server with wamp / mnogosearch / wordpress (my website is on wordpress), where the same MySQL database would be replicated (at the same frequency as the crontab) and without crontab commands

(I'm using the PHP frontend to access the mnogosearch engine.)

How big is your search database? What does indexer -S return?

- Crawling is not CPU consuming; it can co-exist with indexing and search without any problems.
- Indexing and search are CPU consuming.

For small databases (e.g. fewer than 100k documents), having a single box for all three tasks is fine.

For bigger databases with a few hundred thousand documents, a box with multiple CPU cores can also handle all three tasks.

For huge databases with millions of documents, separation makes some sense. However, instead of separating the crawling/indexing/search tasks, I'd propose considering the cluster solution described here:

http://www.mnogosearch.org/doc33/msearch-cluster.html

It significantly improves search performance.

Thanks for your help
Re: [General] rss and multiledia file
Hi,

On 12/31/2013 10:47 PM, Developpement Team Hodei wrote:

Hi, how can I configure my indexer.conf file so that web search returns no RSS, audio or video files?

Use Allow/Disallow commands for this.

Thanks
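As an illustration of the Allow/Disallow advice, the sketch below prints one `Disallow` line per file extension, using the wildcard form of the command. The helper name `disallow_rules` and the extension list are assumptions picked for this example; the right patterns depend on how the RSS, audio and video URLs of the crawled sites actually look, and the position of these lines relative to other Allow/Disallow commands in indexer.conf may matter.

```shell
#!/bin/sh
# Hypothetical helper: disallow_rules EXT...
# Prints one indexer.conf "Disallow" line per file extension.
disallow_rules() {
    for ext in "$@"; do
        printf 'Disallow *.%s\n' "$ext"
    done
}

# Example (assumed extensions); review the output, then paste it into
# indexer.conf by hand:
# disallow_rules rss mp3 wav avi mp4
```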
Re: [General] separate indexing/crawling and web recherche
On 01/06/2014 02:56 PM, Developpement Team Hodei wrote:

hi, i have installed mnogosearch 3.2 on an unbutu server with wamp and mysql database.

By the way, why not mnogosearch-3.3?
Re: [General] indexer make segmentation fail
Hi,

Please try to get gdb backtrace.

On 03/17/2014 10:23 PM, d...@hodei.net wrote:

Hi

I try this command to index my list of web sites:

/usr/local/mnogosearch/sbin/indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf

The result is:

Segmentation fault

Here is my SQL database information:

+------------+-----------+
| Tables     | Size (MB) |
+------------+-----------+
| bdict      |    191.90 |
| bdicti     |    605.58 |
| categories |      0.00 |
| crossdict  |      0.00 |
| dict       |      0.00 |
| links      |      0.00 |
| qcache     |      0.00 |
| qinfo      |      0.00 |
| qtrack     |      0.00 |
| server     |      0.18 |
| srvinfo    |      0.00 |
| url        |   3028.87 |
| urlinfo    |   2110.99 |
| wrdstat    |      0.00 |
+------------+-----------+

Do you have an idea?

__
my config:

* Debian 3.2.51-1 x86_64 GNU/Linux
* mnogosearch 3.3.15
* indexer.conf:
  ..
  DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
  ..
* ./configure command used for installation:
  ../configure --prefix=/usr/local/mnogosearch \
    --bindir=/usr/local/mnogosearch/bin \
    --sbindir=/usr/local/mnogosearch/sbin \
    --sysconfdir=/usr/local/mnogosearch/etc \
    --localstatedir=/usr/local/mnogosearch/var \
    --libdir=/usr/local/mnogosearch/lib \
    --includedir=/usr/local/mnogosearch/include \
    --mandir=/usr/local/mnogosearch/man \
    --disable-shared --enable-static --enable-syslog --without-docs \
    --enable-pthreads --disable-dmalloc --enable-parser \
    --disable-mp3 --disable-xml --disable-rss --disable-css --disable-js \
    --with-extra-charsets=all \
    --enable-file --enable-http --enable-ftp --enable-htdb --enable-news \
    --with-mysql --with-zlib
__
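One possible way to collect the requested backtrace non-interactively is sketched below. The wrapper name `backtrace_of` and the `GDB` variable are made up for this example; `-batch`, `-ex` and `--args` are standard gdb options. A binary built without debugging symbols will show only `??` frames.

```shell
#!/bin/sh
# Sketch: run a program under gdb in batch mode and print a backtrace
# when it stops (for example on a segmentation fault).
# GDB can be overridden from the environment; it defaults to "gdb".
GDB=${GDB:-gdb}

backtrace_of() {
    # "run" starts the program; "bt" prints the stack after it stops.
    "$GDB" -batch -ex run -ex bt --args "$@"
}

# Example with the command from this thread:
# backtrace_of /usr/local/mnogosearch/sbin/indexer -Eblob \
#     -d /usr/local/mnogosearch/etc/indexer.conf
```

The output can be redirected to a file and attached to the reply on the list.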
Re: [General] UDM_URL's parent url inside external parser
Hi,

On 04/16/2014 06:36 PM, Yasser Zamani wrote:
> Hi there,
>
> I need to know the parent URL of the URL which has been passed to my
> parser via UDM_URL. For example, if a page at
> `http://example.com/example_movie.html` has a link like
> `http://example.com/example_movie.mp4` inside it, then when mnogosearch
> passes `http://example.com/example_movie.mp4` to my parser via UDM_URL,
> I need to know its parent page, `http://example.com/example_movie.html`.
>
> Is it possible at all? If so, how? Or is there any workaround?

You can try to find the parent page using an SQL query like this:

  SELECT url1.url
  FROM url url1, url url2
  WHERE url2.referrer=url1.rec_id
    AND url2.url='http://example.com/example_movie.mp4';

> Thanks in advance!
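As a rough illustration of the self-join in the reply, here is a minimal Python sketch that runs the same query against an in-memory SQLite database. The three-column `url` table is a simplified stand-in (an assumption) for the real mnoGoSearch schema, which has many more columns, and the two rows and the helper name `parent_urls` are made up for the example.

```python
import sqlite3

# Simplified stand-in for the mnoGoSearch `url` table (assumption: only the
# three columns used by the query are modelled here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT, referrer INTEGER)"
)
conn.executemany(
    "INSERT INTO url (rec_id, url, referrer) VALUES (?, ?, ?)",
    [
        (1, "http://example.com/example_movie.html", 0),
        (2, "http://example.com/example_movie.mp4", 1),
    ],
)


def parent_urls(conn, child_url):
    """Return the URLs of the pages recorded as referrer of child_url."""
    rows = conn.execute(
        "SELECT url1.url FROM url url1, url url2 "
        "WHERE url2.referrer = url1.rec_id AND url2.url = ?",
        (child_url,),
    ).fetchall()
    return [row[0] for row in rows]


print(parent_urls(conn, "http://example.com/example_movie.mp4"))
```

An external parser could run a query of this shape against the real database with the value it receives in UDM_URL. Note that `referrer` records only one page per document, so other pages linking to the same file would not show up.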
Re: [General] parameter to server command in indexer.conf
Hi,

On 05/05/2014 05:02 PM, d...@hodei.net wrote:
> Hi,
>
> When I add this URL to my list in indexer.conf:
>
> server http://fr.wikipedia.org/wiki/Zanpantzar
>
> the crawler searches the whole fr.wikipedia.org site and not only the
> Zanpantzar directory.
>
> Do you have an idea?

Hmm. http://fr.wikipedia.org/wiki/Zanpantzar is not a directory. It's a file. With this Server command, indexer should crawl everything in this directory:

  http://fr.wikipedia.org/wiki/

It should not go outside of the /wiki/ directory. If it goes outside of /wiki/, perhaps you have more Server commands.

> Thanks
>
> My config:
> * Debian 3.2.51-1 x86_64 GNU/Linux
> * mnogosearch 3.3.15
> * indexer.conf:
>   ..
>   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
>   ..
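To make the "file versus directory" rule from the reply concrete, here is a small Python sketch. It is not mnoGoSearch code: `crawl_scope` and `in_scope` are hypothetical helpers that only mimic the behaviour described above, assuming the crawl is limited to the directory that contains the Server URL.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def crawl_scope(server_url):
    """Directory prefix implied by a Server URL, per the rule described above."""
    parts = urlsplit(server_url)
    path = parts.path
    if not path.endswith("/"):
        # The last path component is treated as a file: keep its directory.
        path = posixpath.dirname(path)
        if not path.endswith("/"):
            path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def in_scope(server_url, candidate_url):
    """True if candidate_url lies under the directory of server_url."""
    return candidate_url.startswith(crawl_scope(server_url))


server = "http://fr.wikipedia.org/wiki/Zanpantzar"
print(crawl_scope(server))
print(in_scope(server, "http://fr.wikipedia.org/wiki/Euskara"))
print(in_scope(server, "http://fr.wikipedia.org/w/index.php"))
```

Under this reading, every article under /wiki/ is in scope, which matches the observation that far more than the single Zanpantzar page gets crawled.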
Re: [General] accented characters
On 05/05/2014 02:32 PM, d...@hodei.net wrote:
> Hi,
>
> I have a problem with accented characters in my web search. To solve
> it, I modified the database with these queries:
>
> ALTER TABLE bdict CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
> ALTER TABLE bdicti CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
>
> and I set the variables $localcharset and $browsercharset to utf-8 in
> my indexer.conf.
>
> But I still have the problem! Do you have an idea?
>
> Thanks
>
> My config:
> * Debian 3.2.51-1 x86_64 GNU/Linux
> * mnogosearch 3.3.15
> * indexer.conf:
>   ..
>   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob

Try adding the SetNames=utf8 part, like this:

  DBAddr mysql://root:password@localhost/mnogosearch/?SetNames=utf8&dbmode=blob

>   ..
Re: [General] Delete a line in Server method in indexer.conf
Hi,

On 05/05/2014 02:26 PM, d...@hodei.net wrote:
> Hi,
>
> In the indexer.conf file, under 'Server [Method]', I want to delete
> an entry like this:
>
> server http://www.eke.org
>
> Will all pages of that site be removed from the database during the
> next crawl?

Every document has its own expiration time, which is stored in url.next_index_time. When crawling, indexer deletes all expired documents that do not have a matching Server/Realm command.

Note, you can delete all documents at once, without waiting for expiration:

  indexer -Cw -u http://www.eke.org/%

> Thanks
Re: [General] Indexing Failed with large database
Hi,

On 06/26/2014 06:49 PM, d...@hodei.net wrote:
> Hi,
>
> I have a problem when indexing my database:
>
> root@botujo:/home/jean# /usr/local/mnogosearch/sbin/indexer -Eblob
> indexer[4787]: Indexing
> indexer[4787]: Loading URL list
> {sql.c:1513} Query: SELECT rec_id, site_id, pop_rank, last_mod_time FROM url
> indexer[4787]: MySQL driver: #144: Table './mnogosearch/url' is marked as crashed and last (automatic?) repair failed
>
> Here is my database information in phpMyAdmin:
>
> name         rows          size
> ---------------------------------
> bdict        864 575       1.1 GB
> bdicti       in use
> bdict_tmp                  2.0 KB
> categories                 1.0 KB
> crossdict                  1.0 KB
> dict                       1.0 KB
> links                      1.0 KB
> qcache                     1.0 KB
> qinfo                      2.0 KB
> qtrack                     1.0 KB
> server       889           156.7 KB
> srvinfo                    1.0 KB
> url          in use
> urlinfo      11 009 854    27.4 GB
>
> Do you have an idea?

Does a manual REPAIR TABLE url help?

> Thanks
>
> My config:
> * Debian 3.2.51-1 x86_64 GNU/Linux
> * mnogosearch 3.3.15
> * indexer.conf:
>   ..
>   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
Re: [General] Crawling order
Hi,

On 08/05/2014 12:12 PM, d...@hodei.net wrote:
> Hi,
>
> I have 1000 websites in my indexer.conf, in the 'Server method'
> section. In what order does the crawler go over the list of websites:
> random, alphabetical, or other?

The crawler selects targets in a random order. There are some related command line options:

  -e   Visit 'most expired' (oldest) documents first
  -o   Visit documents with less depth (hops value) first
  -r   Do not try to reduce remote servers load by randomising
       crawler queue order (faster, but less polite)

> Thanks for your help.
>
> My config:
> * Debian 3.2.51-1 x86_64 GNU/Linux
> * mnogosearch 3.3.15
> * contents of indexer.conf:
>   ..
>   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
>   ..
Re: [General] Duplicates Commandes in indexer
Hi,

Most likely you have two DBAddr commands in your indexer.conf. If this does not help, please send me your indexer.conf.

On 10/06/2014 01:23 PM, Hodei-Dev wrote:
> Hi,
>
> When I execute the indexer command, it runs each command twice. For
> example:
>
> /usr/local/mnogosearch/sbin/indexer -Ecreate -d /usr/local/mnogosearch/etc/indexer.conf
>
> This command creates the tables twice, and in the second run I get a
> warning 'table already exist'.
>
> /usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf
>
> This command returns this:
>
> indexer[16663]: Indexing
> indexer[16663]: Loading URL list
> indexer[16663]: Converting intag00
> indexer[16663]: Converting intag01
> indexer[16663]: Converting intag02
> indexer[16663]: Converting intag03
> indexer[16663]: Converting intag04
> indexer[16663]: Converting intag05
> indexer[16663]: Converting intag06
> indexer[16663]: Converting intag07
> indexer[16663]: Converting intag08
> indexer[16663]: Converting intag09
> indexer[16663]: Converting intag0A
> indexer[16663]: Converting intag0B
> indexer[16663]: Converting intag0C
> indexer[16663]: Converting intag0D
> indexer[16663]: Converting intag0E
> indexer[16663]: Converting intag0F
> indexer[16663]: Converting intag10
> indexer[16663]: Converting intag11
> indexer[16663]: Converting intag12
> indexer[16663]: Converting intag13
> indexer[16663]: Converting intag14
> indexer[16663]: Converting intag15
> indexer[16663]: Converting intag16
> indexer[16663]: Converting intag17
> indexer[16663]: Converting intag18
> indexer[16663]: Converting intag19
> indexer[16663]: Converting intag1A
> indexer[16663]: Converting intag1B
> indexer[16663]: Converting intag1C
> indexer[16663]: Converting intag1D
> indexer[16663]: Converting intag1E
> indexer[16663]: Converting intag1F
> indexer[16663]: Total converted: 2604877 records, 13711786 bytes
> indexer[16663]: Converting url data
> indexer[16663]: Switching to new blob table.
> indexer[16663]: Loading URL list
> indexer[16663]: Converting intag00
> indexer[16663]: Converting intag01
> indexer[16663]: Converting intag02
> indexer[16663]: Converting intag03
> indexer[16663]: Converting intag04
> indexer[16663]: Converting intag05
> indexer[16663]: Converting intag06
> indexer[16663]: Converting intag07
> indexer[16663]: Converting intag08
> indexer[16663]: Converting intag09
> indexer[16663]: Converting intag0A
> indexer[16663]: Converting intag0B
> indexer[16663]: Converting intag0C
> indexer[16663]: Converting intag0D
> indexer[16663]: Converting intag0E
> indexer[16663]: Converting intag0F
> indexer[16663]: Converting intag10
> indexer[16663]: Converting intag11
> indexer[16663]: Converting intag12
> indexer[16663]: Converting intag13
> indexer[16663]: Converting intag14
> indexer[16663]: Converting intag15
> indexer[16663]: Converting intag16
> indexer[16663]: Converting intag17
> indexer[16663]: Converting intag18
> indexer[16663]: Converting intag19
> indexer[16663]: Converting intag1A
> indexer[16663]: Converting intag1B
> indexer[16663]: Converting intag1C
> indexer[16663]: Converting intag1D
> indexer[16663]: Converting intag1E
> indexer[16663]: Converting intag1F
> indexer[16663]: Total converted: 2605019 records, 13712168 bytes
> indexer[16663]: Converting url data
> indexer[16663]: Switching to new blob table.
>
> For my install, I configured indexer like this:
>
> usr/local/mnogosearch/lib --includedir=/usr/local/mnogosearch/include --mandir=/usr/local/mnogosearch/man --disable-shared --enable-static --enable-syslog --without-docs --enable-pthreads --disable-dmalloc --enable-parser --disable-mp3 --disable-xml --disable-rss --disable-css --disable-js --with-extra-charsets=all --enable-file --enable-http --enable-ftp --enable-htdb --enable-news --with-mysql --with-zlib
>
> My config:
> * Debian 3.2.51-1 x86_64 GNU/Linux
> * mnogosearch 3.3.15
> * contents of indexer.conf:
>   ..
>   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
>   ..
>
> Do you have an idea?
>
> Thanks
Re: [General] Geo Search
Hello,

On 03/10/2015 03:24 PM, Tiptoe Danceware Sales wrote:
> Hello,
>
> I was hoping someone could point me in the right direction. Our website
> has geo settings per page (e.g. this page can only be seen in Europe).
> The problem is that when mnogosearch indexes and hits such a page, it is
> given the message "This page can not be viewed in your country", because
> the mnogosearch indexer is in the USA and the website doesn't allow
> mnogosearch to view the content.
>
> We can hack our CMS to allow mnogosearch to index all content. The
> problem lies in how to tell mnogosearch to only give results for content
> that is visible in the user's country. We could place a meta tag in each
> page telling mnogosearch where this page is visible, but how do we get
> mnogosearch to store this in its index and then evaluate it during user
> searches?
>
> Any ideas?

Limiting by a specific meta tag value can be done with the help of a Limit command in indexer.conf and a corresponding parameter fl=xxx passed to search.cgi.

Please have a look here for details:
http://www.mnogosearch.org/doc33/msearch-cmdref-limit.html

> Thanks,
> Mike
[General] ANNOUNCE: mnoGoSearch-3.4.1
Hi, mnoGoSearch-3.4.1 is available from our site http://www.mnogosearch.org/ This is a new development branch with lots of changes and improvements. Please see here for a detailed change list: http://www.mnogosearch.org/doc34/msearch-changelog.html Greetings.
Re: [General] HoldBadHrefs has no effect
Hello Jeff,

Sorry for the late reply.

On 02/09/2017 02:25 AM, Jeff Taylor wrote:
> I've been running Mnogo 3.3.13 under debian wheezy for a number of
> years, but it seems that old cached content is never removed. I
> normally run with HoldBadHrefs=7d, but I've even tried setting it to 1s,
> and the old content is still never removed. As an example, on a recent
> search I noticed pages that were last cached in Aug 2011 (which was
> probably when the site went offline), but it still comes up in searches.
>
> Help? I would even be happy with running a mysql query to remove all
> cached content that is more than 7 days old, but I didn't want to go
> blindly deleting things without knowing how the info in the tables might
> be cross-referenced. If there's a way to fix indexer.conf I would also
> be happy. Note that the old pages which should be removed are no longer
> referenced in server.list, and when I run indexer I get a long list of
> URLs that can't be reached. So what can I do to get these old entries
> removed from the database?

Which HTTP status do these old documents have? Can you please check the statistics for a few old documents:

  ./indexer -S -u http://old1/
  ./indexer -S -u http://old2/
  ./indexer -S -u http://old3/

Or using this SQL query:

  SELECT status, url FROM url
  WHERE url IN ('http://old1/','http://old2/','http://old3/');

Also, the output from this command would be helpful:

  ./indexer -am -v6 -u http://old1/

Please also send your indexer.conf to b...@mnogosearch.org.
Re: [General] Indexing problem with sqlite3
Hello Teijo,

SQLite changed the error message in one of the recent releases, from "unique" in lower case to "UNIQUE" in upper case.

Please apply this patch to src/sql-sqlite.c:

-if (!strstr(db->errstr,"unique"))
+if (!strstr(db->errstr,"unique") && !strstr(db->errstr,"UNIQUE"))

On 03/22/2017 06:39 PM, Alexander Barkov wrote:
> Hello Teijo,
>
> On 03/22/2017 03:44 PM, Teijo wrote:
>> Hello,
>>
>> I have installed Mnogosearch 3.4.1 from source both to Ubuntu 16.04 and
>> Debian Jessie.
>>
>> In Ubuntu I cannot use Mysql as database because there seem to be some
>> compatibility issues with Mysql 5.7. In Jessie where Mysql version is
>> 5.5x there are no such problems.
>>
>> I thought to use Sqlite3 in Ubuntu. Database setup goes without errors
>> with indexer --create. But when I try to make index with simply typing
>> indexer, I get similar to the following:
>>
>> [33572]{--} indexer from mnogosearch-3.4.1-sqlite3 started with
>> '/usr/local/mnogosearch/etc/indexer.conf'
>> [33572]{01} Error: 'DB: sqlite3 driver: (19) UNIQUE constraint failed:
>> url.url'
>>
>> There seem to be similar problems with Sqlite3 in Jessie as well.
>>
>> I am not familiar with Mnogosearch and Sqlite3 so is there something I
>> have missed when setting up the environment? Only changes I have made in
>> indexer.conf are Dbaddress and server definitions. Dbaddress is just
>> that it's in the example of Sqlite3 definition in indexer.conf-dist.
>
> Which exact version of SQLite are you using?
>
> Can you please send your indexer.conf and the output for:
>
> ./indexer --sqlmon --exec="SELECT rec_id, url FROM url"
>
> to b...@mnogosearch.org
>
> Thanks.
>
>> Best regards,
>>
>> Teijo
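To see why matching on the error text is sensitive to letter case, here is a small Python sketch using the standard `sqlite3` module. It is only an analogue of the C patch, not mnoGoSearch code: the two-column `url` table and the helper `is_unique_violation` are assumptions made for the illustration, and the exact wording of the message depends on the SQLite version installed.

```python
import sqlite3


def is_unique_violation(errstr):
    """Case-insensitive analogue of the strstr() checks in the patch."""
    return "unique" in errstr.lower()


# Minimal stand-in for the real table: only the UNIQUE url column matters.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT UNIQUE)")
conn.execute("INSERT INTO url (url) VALUES ('http://example.com/')")

try:
    # Inserting the same URL again violates the UNIQUE constraint.
    conn.execute("INSERT INTO url (url) VALUES ('http://example.com/')")
    message = ""
except sqlite3.IntegrityError as exc:
    message = str(exc)

print(message)
print(is_unique_violation(message))
```

Lower-casing the message before the comparison covers both spellings in a single test, which is the same effect the patch gets from its two `strstr()` calls.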
Re: [General] Indexing problem with sqlite3
Hello Teijo,

On 03/22/2017 03:44 PM, Teijo wrote:
> Hello,
>
> I have installed Mnogosearch 3.4.1 from source both to Ubuntu 16.04 and
> Debian Jessie.
>
> In Ubuntu I cannot use Mysql as database because there seem to be some
> compatibility issues with Mysql 5.7. In Jessie where Mysql version is
> 5.5x there are no such problems.
>
> I thought to use Sqlite3 in Ubuntu. Database setup goes without errors
> with indexer --create. But when I try to make index with simply typing
> indexer, I get similar to the following:
>
> [33572]{--} indexer from mnogosearch-3.4.1-sqlite3 started with
> '/usr/local/mnogosearch/etc/indexer.conf'
> [33572]{01} Error: 'DB: sqlite3 driver: (19) UNIQUE constraint failed:
> url.url'
>
> There seem to be similar problems with Sqlite3 in Jessie as well.
>
> I am not familiar with Mnogosearch and Sqlite3 so is there something I
> have missed when setting up the environment? Only changes I have made in
> indexer.conf are Dbaddress and server definitions. Dbaddress is just
> that it's in the example of Sqlite3 definition in indexer.conf-dist.

Which exact version of SQLite are you using?

Can you please send your indexer.conf and the output for:

  ./indexer --sqlmon --exec="SELECT rec_id, url FROM url"

to b...@mnogosearch.org

Thanks.

> Best regards,
>
> Teijo
Re: [General] Extra hit with SQL query and word position in the original file
Hello Teijo,

On 03/24/2017 01:24 AM, Teijo wrote:
> Hello,
>
> If I search given word with search.cgi, I get correct number of occurences.
>
> But if I do it with SQL (no matter in mysql or sqlite3), they show extra
> occurence. For example, if a given word is in a given original file
> twice, they tell that there are three occurences. SQL query is almost
> the same one found in Mnogosearch's manual, except that I am using only
> one word:
>
> SELECT url.url, count(*) AS RANK FROM dict, url WHERE
> url.rec_id=dict.url_id AND dict.word IN ('word') GROUP BY url.url ORDER
> BY rank DESC;
>
> I'd like to know (by SQL query) position of word in the original file
> (to use filepos function). There is at least coord column in dict table.
> Coord contains section id and word's position in relationship to
> section, if I have understood correctly. How to extract the relative
> position from coord, or is the position information elsewhere in
> database? If I disabled all sections, would coord actually contain the
> absolute position?
>
> I'm using "single mode" as to database.

Coord is a 32-bit number.

- The highest 8 bits are the section ID (e.g. title, body, etc.,
  according to the Section commands in indexer.conf).
- The lowest 24 bits are the position inside this section.
- The last hit inside each combination (url_id,word,secno) is the
  section length (i.e. the total number of words in this section)
  in this document.

This last row is not a real occurrence of the word, which is why count(*) reports one extra hit for each section the word appears in.

This MySQL query returns the information in a readable form:

  SELECT url_id,word,coord>>24 AS secno,coord&0xFFFFFF AS pos
  FROM dict WHERE word='mnogosearch' ORDER BY secno,pos;

+--------+-------------+-------+-----+
| url_id | word        | secno | pos |
+--------+-------------+-------+-----+
|      1 | mnogosearch |     1 |   1 |
|      1 | mnogosearch |     1 |  14 |
|      1 | mnogosearch |     1 |  28 |
|      1 | mnogosearch |     1 |  42 |
|      1 | mnogosearch |     1 |  76 |
|      1 | mnogosearch |     1 |  77 |
|      1 | mnogosearch |     1 |  85 |
|      1 | mnogosearch |     1 | 105 | <- section 1 length
|      1 | mnogosearch |     2 |   1 |
|      1 | mnogosearch |     2 |   6 | <- section 2 length
|      1 | mnogosearch |     3 |  54 |
|      1 | mnogosearch |     3 |  69 | <- section 3 length
|      1 | mnogosearch |     4 |   1 |
|      1 | mnogosearch |     4 |  11 | <- section 4 length
|      1 | mnogosearch |     8 |   2 |
|      1 | mnogosearch |     8 |   4 | <- section 8 length
+--------+-------------+-------+-----+

Lines that are not marked as "section X length" are actual word hits.

> Best regards,
>
> Teijo
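The bit layout described in the reply can be sketched in a few lines of Python. This is an illustration only: `decode_coord` and `real_hits` are hypothetical helper names, the sample coord values are made up to resemble the first two sections of the table, and the sketch assumes that the section-length row is always the largest position within its section.

```python
from collections import defaultdict


def decode_coord(coord):
    """Split a 32-bit coord into (section id, position in section)."""
    secno = coord >> 24        # highest 8 bits
    pos = coord & 0xFFFFFF     # lowest 24 bits
    return secno, pos


def real_hits(coords):
    """Group the coords of one (url_id, word) pair by section and drop the
    last entry of each section, which holds the section length."""
    by_section = defaultdict(list)
    for coord in coords:
        secno, pos = decode_coord(coord)
        by_section[secno].append(pos)
    return {secno: sorted(positions)[:-1]
            for secno, positions in by_section.items()}


# Made-up sample: two hits plus the length row in section 1,
# one hit plus the length row in section 2.
coords = [(1 << 24) | p for p in (1, 14, 105)] + [(2 << 24) | p for p in (1, 6)]
hits = real_hits(coords)
print(hits)
print(sum(len(positions) for positions in hits.values()))
```

Counting only the remaining positions should give the occurrence count without the extra per-section row.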
Re: [General] Indexing problem with sqlite3
Hello,

On 03/22/2017 08:51 PM, Teijo wrote:
> Hello,
>
> Unfortunately patch did not solve the problem.
>
> As to SQLite3 versions, Ubuntu 16.04 it is
> SQLite version 3.11.0 2016-02-15 17:29:24
> and in Jessie
> SQLite version 3.8.7.1 2014-10-29 13:59:56

There are two similar places in sql-sqlite.c. Please make sure to fix the SQLite3 (rather than SQLite2) code branch:

  case SQLITE_ERROR:
    sqlite3_finalize(pStmt);
    udm_snprintf(db->errstr, sizeof(db->errstr),
                 "sqlite3 driver: (%d) %s",
                 sqlite3_errcode(UdmSQLite3Conn(db)),
                 sqlite3_errmsg(UdmSQLite3Conn(db)));
    if (!strstr(db->errstr,"unique") && !strstr(db->errstr,"UNIQUE"))
    {
      UdmSetErrorCode(db, 1);
      return UDM_ERROR;
    }
    return UDM_OK;
    break;

> Best regards,
>
> Teijo
Re: [General] URL matches list as query string
Hello,

On 04/07/2017 02:04 AM, Teijo wrote:
> Hello,
>
> I have URL (server) I have indexed; for example: www.example.com/files
> containing several documents. I would like to restrict search results
> only to documents which names are document1 and document2.
>
> If ul parameter in query string contains only document1 or document2,
> but not both, search results are restricted to the corresponding
> document. But I have not found a way to get ul parameter in query string
> to be such one that restriction would contain both documents. I get no
> matches when trying to put both documents to query string although both
> documents contain word I'm searching for.
>
> This is an example query string which does not work:
>
> ?q=test&m=all&wm=beg&ul=document1+document2

Try multiple ul= parameters:

  ?q=test&m=all&wm=beg&ul=document1&ul=document2

> I have tried also to pass this (and other variants with different ul
> parameter) directly to search.cgi.
>
> Best regards,
>
> Teijo
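For anyone generating such links from a script, here is a minimal Python sketch of building a query string with a repeated ul parameter. It uses only the standard library; `build_query` is a hypothetical helper, and only the q and ul parameters mentioned in this thread are included.

```python
from urllib.parse import parse_qs, urlencode


def build_query(words, url_limits):
    """Build a query string with one ul= parameter per URL limit."""
    params = {"q": words, "ul": list(url_limits)}
    # doseq=True emits one key=value pair per list element instead of
    # encoding the whole list as a single value.
    return "?" + urlencode(params, doseq=True)


query = build_query("test", ["document1", "document2"])
print(query)

# Parsing it back shows two separate ul values.
print(parse_qs(query[1:])["ul"])
```

In an HTML form, a multi-selection list box named ul is normally submitted the same way, with one ul=... pair per selected item.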
Re: [General] URL matches list as query string
Hi Teijo,

On 03/30/2017 05:03 PM, Teijo wrote:
> Hello,
>
> I have tried with multi selection list box and text edit field. In both
> cases only one item is accepted. If I try with more than one, the rest
> are omitted (multi selection) or "your search did not match any
> documents" message is shown (when entered in the text edit field).
>
> I do not know what to try next.

Can you please clarify what exactly you're doing? What does the relevant HTML code look like, and what does the URL look like after you submit?

Thanks.

> Best regards,
>
> Teijo