Re: [General] Buffer overflow

2013-03-20 Thread Alexander Barkov


Hi Philippe,


On 03/20/2013 01:10 PM, Philippe DE ROCHAMBEAU wrote:

Hi Alexander,

The problem is that version 3.3.12 is the only one available on the Redhat 
Repository.


The info below makes me think that you're using the RPM you
previously downloaded from our site.

Here is a similar RPM we built for 3.3.13:

http://www.mnogosearch.org/Download/RPMS/mnogosearch-3.3.13-01.static.glibc-2.12.x86_64.rpm

I suggest you download it and upgrade.
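
As a rough sketch of how to check whether an installed version still predates the fix: the helper names version_lt and needs_upgrade are hypothetical, and the comparison assumes GNU coreutils (sort -V). Only the version number 3.3.13 comes from this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when version $1 is strictly older than $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# 3.3.13 is the release whose ChangeLog lists the fix for Bug#4803.
FIXED_VERSION="3.3.13"

# Report whether the given installed version still needs the upgrade.
needs_upgrade() {
    if version_lt "$1" "$FIXED_VERSION"; then
        echo "mnogosearch $1 predates $FIXED_VERSION: upgrade needed"
    else
        echo "mnogosearch $1 already includes the fix"
    fi
}
```

On an RPM-based system the installed version can be read with rpm -q --queryformat '%{VERSION}' mnogosearch and passed to needs_upgrade. The downloaded package can then be applied with rpm -Uvh followed by the file name.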




---

Yum info mnogosearch

Loaded plugins: product-id, rhnplugin, security, subscription-manager
Updating certificate-based repositories.
Unable to read consumer identity
Installed Packages
Name: mnogosearch
Arch: x86_64
Version : 3.3.12
Release : 01.static
Size: 15 M
Repo: installed
Summary : Full-featured MySQL based web search engine.
URL : http://www.mnogosearch.org/
License : GNU GPL Version 2
Description : mnoGoSearch is a full-featured MySQL based web search engine. 
mnoGoSearch consists of
 : two parts. The first part is an indexing mechanism (indexer). 
The indexer walks over
 : html hypertext references and stores found words and new 
references into a database.
 : The second part is a web CGI front-end to provide search using 
data collected by the
 : indexer.
 :
 : A PHP and a Perl front-ends are also available from our site 
http://www.mnogosearch.org/.
 :
 : mnoGoSearch first release took place in November 1998. The 
search engine was named
 : UDMSearch until the project was acquired by Lavtech.Com Corp. in 
October 2000 and
 : its name changed to mnoGoSearch.

--

Best regards,

Philippe


-Original Message-
From: Alexander Barkov [mailto:b...@mnogosearch.org]
Sent: 20 March 2013 09:50
To: Philippe DE ROCHAMBEAU
Cc: general@mnogosearch.org
Subject: Re: [General] Buffer overflow

Hi Philippe,

So you're actually running mnogosearch-3.3.12
(not 3.3.13 as you reported in your first message).


This problem should be fixed in 3.3.13.

This is from the 3.3.13 ChangeLog:
   Bug#4803 buffer overflow detected with search.cgi was fixed.


Please download 3.3.13 from our site and reinstall.

Greetings.



On 03/20/2013 12:32 PM, Philippe DE ROCHAMBEAU wrote:

Hi,

uname --all
Linux xxx 2.6.32-279.22.1.el6.x86_64 #1 SMP Sun Jan 13 09:21:40 EST 2013 x86_64 
x86_64 x86_64 GNU/Linux

---

[root@xxx cgi-bin]# ./search.cgi a
*** buffer overflow detected ***: ./search.cgi terminated
=== Backtrace: =
[0x52dae5]
[0x52da7e]
[0x52d523]
[0x52d408]
[0x440c98]
[0x44d247]
[0x4171dd]
[0x404566]
[0x4b6056]
[0x405201]
=== Memory map: 
0040-00685000 r-xp  fd:00 334904 
/var/www/cgi-bin/search.cgi
00885000-008e rw-p 00285000 fd:00 334904 
/var/www/cgi-bin/search.cgi
008e-008ec000 rw-p  00:00 0
02484000-0251d000 rw-p  00:00 0  [heap]
399c40-399c42 r-xp  fd:00 318247 
/lib64/ld-2.12.so
399c42-399c61f000 ---p 0002 fd:00 318247 
/lib64/ld-2.12.so
399c61f000-399c62 r--p 0001f000 fd:00 318247 
/lib64/ld-2.12.so
399c62-399c621000 rw-p 0002 fd:00 318247 
/lib64/ld-2.12.so
399c621000-399c622000 rw-p  00:00 0
399cc0-399cd89000 r-xp  fd:00 318254 
/lib64/libc-2.12.so
399cd89000-399cf89000 ---p 00189000 fd:00 318254 
/lib64/libc-2.12.so
399cf89000-399cf8d000 r--p 00189000 fd:00 318254 
/lib64/libc-2.12.so
399cf8d000-399cf8e000 rw-p 0018d000 fd:00 318254 
/lib64/libc-2.12.so
399cf8e000-399cf93000 rw-p  00:00 0
7fc85941b000-7fc859541000 rw-p  00:00 0
7fc85994d000-7fc859a95000 rw-p  00:00 0
7fc859a95000-7fc859aa1000 r-xp  fd:00 318269 
/lib64/libnss_files-2.12.so
7fc859aa1000-7fc859ca1000 ---p c000 fd:00 318269 
/lib64/libnss_files-2.12.so
7fc859ca1000-7fc859ca2000 r--p c000 fd:00 318269 
/lib64/libnss_files-2.12.so
7fc859ca2000-7fc859ca3000 rw-p d000 fd:00 318269 
/lib64/libnss_files-2.12.so
7fff73931000-7fff73946000 rw-p  00:00 0  [stack]
7fff739ff000-7fff73a0 r-xp  00:00 0  [vdso]
ff60-ff601000 r-xp  00:00 0  
[vsyscall]
Aborted (core dumped)


--

[root@xxx cgi-bin]# gdb search.cgi
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free

Re: [General] Buffer overflow

2013-03-20 Thread Alexander Barkov

399cf89000-399cf8d000 r--p 00189000 fd:00 318254 
/lib64/libc-2.12.so
399cf8d000-399cf8e000 rw-p 0018d000 fd:00 318254 
/lib64/libc-2.12.so
399cf8e000-399cf93000 rw-p  00:00 0
77ce6000-77de6000 rw-p  00:00 0
77de6000-77df2000 r-xp  fd:00 318269 
/lib64/libnss_files-2.12.so
77df2000-77ff2000 ---p c000 fd:00 318269 
/lib64/libnss_files-2.12.so
77ff2000-77ff3000 r--p c000 fd:00 318269 
/lib64/libnss_files-2.12.so
77ff3000-77ff4000 rw-p d000 fd:00 318269 
/lib64/libnss_files-2.12.so
77ffd000-77ffe000 rw-p  00:00 0
77ffe000-77fff000 r-xp  00:00 0  [vdso]
7ffea000-7000 rw-p  00:00 0  [stack]
ff60-ff601000 r-xp  00:00 0  
[vsyscall]

Program received signal SIGABRT, Aborted.
0x0047199b in ?? ()
(gdb) backtrace
#0  0x0047199b in ?? ()
#1  0x004be10b in ?? ()
#2  0x004ca57e in ?? ()
#3  0x0052dae5 in ?? ()
#4  0x0052da7e in ?? ()
#5  0x0052d523 in ?? ()
#6  0x0052d408 in ?? ()
#7  0x00440c98 in ?? ()
#8  0x0044d247 in ?? ()
#9  0x004171dd in ?? ()
#10 0x00404566 in ?? ()
#11 0x004b6056 in ?? ()
#12 0x00405201 in ?? ()
#13 0x7fffe5d8 in ?? ()
#14 0x in ?? ()
(gdb)

-


Philippe







[General] ANNOUNCE: mnoGoSearch-3.3.14

2013-04-02 Thread Alexander Barkov


Hello,

mnoGoSearch-3.3.14 is now available from
http://www.mnogosearch.org/

- DOCX and RTF built-in parsers were added.

- It's now possible to use the $(ConfDir), $(ShareDir), $(VarDir), 
$(TmpDir) template variables in search.htm, e.g.:


Include $(ConfDir)/common.inc
DBAddr sqlite3:///$(VarDir)/mnogosearch.sqlite3/

Previously these variables were understood only in indexer.conf.

- A minor fix in installation layout was made: the --docdir parameter
to configure is now respected, and the HTML documentation is now
installed to PREFIX/share/doc/mnogosearch/ by default. Previously
--docdir was ignored, and the documentation was installed to
PREFIX/doc/.

- A number of minor bugs were fixed.

The full ChangeLog can be found at:
http://www.mnogosearch.org/doc33/msearch-changelog.html#changelog-3-3-14


Greetings.


___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Install mnogosearch

2013-06-01 Thread Alexander Barkov


Hi,

On 06/01/2013 06:36 PM, Mapluz Dev wrote:

Hi,
I am trying to install mnogosearch on Debian release 6.
When I run the command ./configure --with-mysql
I get this message:
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking target system type... i686-pc-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking whether make sets $(MAKE)... (cached) yes
checking whether build environment is sane... yes
checking for gcc... gcc
checking whether the C compiler works... no
configure: error: in `/home/francis/Downloads/mnogosearch-3.3.14':
configure: error: C compiler cannot create executables
See `config.log' for more details


Can you send config.log please?



Can you help me?

Thanks
--
Développement Mapluz - MAPLUZ http://www.mapluz.fr





Re: [General] Install mnogosearch

2013-06-01 Thread Alexander Barkov



On 06/01/2013 07:06 PM, Mapluz Dev wrote:
skip

checking whether the C compiler works... no
configure: error: in `/home/francis/Downloads/mnogosearch-3.3.14':
configure: error: C compiler cannot create executables
See `config.log' for more details


Can you send config.log please?




yes, see attached file
thanks



I think this line is the most important:

 /usr/bin/ld: crt1.o: No such file: No such file or directory

A quick web search returns many pages explaining how to fix this.
For example, have a look at this one:

http://www.businesscorner.co.uk/usrbinld-crt1-o-no-such-file-no-such-file-or-directory/
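
A small sketch for checking whether crt1.o is present at all. The function name find_crt1 and the default directory list are assumptions for illustration. On Debian, crt1.o is shipped by the libc6-dev package, which build-essential pulls in.

```shell
#!/bin/sh
# Print the path of the first crt1.o found under the given directories
# (defaults: the usual library directories). Returns non-zero if none exists.
find_crt1() {
    [ "$#" -gt 0 ] || set -- /usr/lib /usr/lib32 /usr/lib64
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        found=$(find "$dir" -name crt1.o 2>/dev/null | head -n 1)
        if [ -n "$found" ]; then
            echo "$found"
            return 0
        fi
    done
    return 1
}
```

If find_crt1 prints nothing and returns non-zero, installing libc6-dev (for example with apt-get install build-essential) and re-running ./configure is the usual remedy.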

Re: [General] CheckOnly for all unkown file types

2013-06-18 Thread Alexander Barkov


Hi,


On 06/13/2013 01:22 PM, W. de Hoog wrote:

Hi,

I would like to configure mnogosearch so that for all non parsed files
it stores the path. CheckOnly however does not use the mime type but a
regex. It would be nice to have for example
   AddType application/unknown *.*
   CheckOnly Match Mime application/unknown

How could this be done?


Unfortunately, there is no feature like this.



best regards,

Willem-Jan de Hoog

Re: [General] blob, single or multi parameters

2013-10-03 Thread Alexander Barkov


  Hi,


On 10/03/2013 02:53 PM, d...@mapluz.fr wrote:

Hi,
I have installed mnogosearch release 3.3.14 with a MySQL database.
I created the tables with this command:

./indexer -Ecreate -d /usr/local/mnogosearch/etc/indexer.conf

In my indexer.conf I have this:

DBAddr mysql://root:mypassword@localhost/mnogosearchactu/?DBMode=multi

When I try to run a search, I get this message:
Inverted word index not found. Probably you forgot to run 'indexer -Eblob'.
So I ran this command:

./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf


I guess you forgot to fix DBAddr in search.htm to match the one in 
indexer.conf.
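
One way to check this, sketched under the assumption that both files spell the command exactly as DBAddr and keep it on a single line. dbaddr_of and same_dbaddr are hypothetical helper names, not part of mnoGoSearch.

```shell
#!/bin/sh
# Print the DBAddr value(s) declared in a config file.
# Commented-out lines are skipped because their first field is not "DBAddr".
dbaddr_of() {
    awk '$1 == "DBAddr" { print $2 }' "$1"
}

# Succeed only when the first file declares a DBAddr and the second file
# declares exactly the same value(s).
same_dbaddr() {
    [ -n "$(dbaddr_of "$1")" ] &&
        [ "$(dbaddr_of "$1")" = "$(dbaddr_of "$2")" ]
}
```

For example, same_dbaddr /usr/local/mnogosearch/etc/indexer.conf /usr/local/mnogosearch/etc/search.htm returns non-zero when the two addresses (including the DBMode parameter) differ. The search.htm path here is an assumed location.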




and everything runs, but I have questions:

1 - Why must I run this command: ./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf
    I do not understand the -Eblob parameter.
2 - In my crontab, is this line correct for indexing every day at 23:00?
    00 23 * * * /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf



With DBMode=multi, crawling and indexing are done at the same time.
The advantage is that the search index is always up to date with
what the crawler has already downloaded.

With DBMode=blob, crawling and indexing are separated in time.
The advantage of DBMode=blob is that it is much faster at search
time than DBMode=multi.
But it needs an extra step, indexer --index
(or indexer -Eblob; these commands are synonyms),
to bring the index up to date after the crawler has downloaded
a number of documents with new content
(i.e. both new documents and old documents that have changed
since the last crawl).


The choice between DBMode=multi and DBMode=blob can be made
depending on the database size and search performance.


- If your document collection is rather small and you're
happy with the search performance provided by DBMode=multi,
then use this command in both indexer.conf and search.htm:

DBAddr mysql://root:mypassword@localhost/mnogosearchactu/?DBMode=multi

The command in crontab is OK in this case.


- If your document collection is rather big, and/or you prefer faster
search results, then use this DBAddr in both indexer.conf and
search.htm:

DBAddr mysql://root:mypassword@localhost/mnogosearchactu/?DBMode=blob

In this case, the crontab task should do two things consecutively:

# Crawling
/usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf

# Indexing
/usr/local/mnogosearch/sbin/indexer --index -d /usr/local/mnogosearch/etc/indexer.conf


It's a good idea to put these two commands into a shell script,
then use it from crontab.
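
A minimal sketch of such a script, using the paths from the commands above. The function name crawl_then_index and the INDEXER/CONF override variables are illustrative choices, not mnoGoSearch conventions.

```shell
#!/bin/sh
# Paths copied from the commands above; override them through the
# environment if your installation differs.
INDEXER="${INDEXER:-/usr/local/mnogosearch/sbin/indexer}"
CONF="${CONF:-/usr/local/mnogosearch/etc/indexer.conf}"

# Crawl first, then rebuild the blob index only if crawling succeeded.
crawl_then_index() {
    "$INDEXER" -d "$CONF" || return 1          # crawling
    "$INDEXER" --index -d "$CONF" || return 1  # indexing
}
```

End the script with a call to crawl_then_index, save it under a name of your choice (say /usr/local/mnogosearch/sbin/crawl-and-index.sh, a hypothetical path), make it executable, and put that single path into the crontab line in place of the indexer command. Running the two steps from one script ensures that indexing never starts before crawling has finished.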


Now you can try switching search.htm between DBMode=blob
and DBMode=multi and compare performance.



If you decide to stay with DBMode=multi, then just copy DBAddr
from indexer.conf to search.htm.


If you decide to switch to DBMode=blob, then it's a good idea
to start from scratch:

1. Drop the tables in the current database that were created
for DBMode=multi

indexer --drop

2. Edit indexer.conf and search.htm, change DBMode to blob.

3. Create tables for DBMode=blob

indexer --create

4. Crawl your document collection

indexer

5. Create index

indexer --index

6. Search
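
Steps 1 to 5 above can be sketched as one function. This is an illustration under assumptions: the names switch_to_blob and rebuild_as_blob are hypothetical, the configuration file is passed explicitly with -d as in the earlier commands, and the in-place edit relies on GNU sed. It is meant to be run by hand rather than from cron, since it drops tables.

```shell
#!/bin/sh
INDEXER="${INDEXER:-indexer}"

# Rewrite DBMode=multi to DBMode=blob in the given files (GNU sed -i).
switch_to_blob() {
    sed -i 's/DBMode=multi/DBMode=blob/' "$@"
}

# Steps 1-5 from the list above; stops at the first failing step.
# Usage: rebuild_as_blob /path/to/indexer.conf /path/to/search.htm
rebuild_as_blob() {
    conf=$1
    template=$2
    "$INDEXER" --drop -d "$conf" || return 1         # 1. drop the multi tables
    switch_to_blob "$conf" "$template" || return 1   # 2. edit both files
    "$INDEXER" --create -d "$conf" || return 1       # 3. create the blob tables
    "$INDEXER" -d "$conf" || return 1                # 4. crawl
    "$INDEXER" --index -d "$conf" || return 1        # 5. build the index
}
```

The drop happens before the files are edited so that indexer still sees DBMode=multi while removing the old tables, matching the order of the list.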






Thanks a lot for your responses.






Re: [General] scans subdirectories websites

2013-10-29 Thread Alexander Barkov


Hi,

On 10/29/2013 06:38 PM, d...@mapluz.fr wrote:

Hi,

Does the search engine scan the subdirectories of websites?
If not, is there a setting for this in indexer.conf?


Can you please clarify what you mean by "subdirectories websites"?

If you have a command like:

Server http://sitename.com/

then the crawler does go into this site's subdirectories, e.g.
http://sitename.com/dir/, if a link to them is found.




Thanks





Re: [General] Data in database mnogosearch

2013-10-30 Thread Alexander Barkov


Hi,


On 10/30/2013 01:08 PM, d...@mapluz.fr wrote:

hi,
I'm using mnogosearch 3.2 on linux ubuntu server.


Hmm, 3.2 sounds very old. Why not 3.3?


I have created a MySQL database, and my indexer.conf file is here:
http://www.mapluz.fr/public/indexer.conf
I initialize the indexer with this command:
/usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf
I run the indexer with this command:
/usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexerer.conf


They should be run in the opposite order:

1. Crawl documents:

/usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexerer.conf


2. Index the documents collected by the crawler:

/usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf





My search with the PHP API sample (the search.php sample provided by
mnogosearch) returns no results.

So perhaps my database has a problem. I have a question about
the bdict table; here is an example:
http://www.mapluz.fr/public/capture.jpg

Why is the bdict table so small?


Try to run this command again:

/usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf


Does the size of the table bdict change?




thanks



Re: [General] Indexer RSS flux

2013-11-04 Thread Alexander Barkov


Hi,

On 11/04/2013 04:57 PM, d...@mapluz.fr wrote:

hi,

I want to index the RSS feeds on my web site.
Is there a parameter in indexer.conf to do this?



Can you please give a link to an example of an RSS file you'd like to 
index, or copy and paste an RSS fragment?


Thanks.



Thanks

d...@mapluz.fr



Re: [General] Indexer RSS flux

2013-11-06 Thread Alexander Barkov


Hi,


On 11/06/2013 01:00 PM, d...@mapluz.fr wrote:

hi,

Here is an example of an RSS feed:
http://www.sudouest.fr/pyrenees-atlantiques/bayonne/rss.xml


Let's have a look into the item tags, for example:

<item>
   <title>Pays basque : un troupeau de brebis pris au piège par 
la montée des eaux</title>
   <description>Un troupeau de brebis est tombé dans une 
rivière, 61 ont été sauvées, 47 sont mortes. Des brebis à l’âme 
aventureuse. Un troupeau d’Irissarry, au nord de 
Saint-Jean-Pied-de-Port, a voulu explorer les alentours de son pâturage, 
mardi, en fin de matinée.  Bien mal lui en a pris. Les animaux, du fait 
de la montée des eaux,...</description>

   <pubDate>Wed, 06 Nov 2013 10:14:38 +0100</pubDate>

   <link>http://www.sudouest.fr/2013/11/06/des-brebis-se-tuent-dans-un-ravin-1221272-4181.php#xtor=RSS-10521769</link>

   <guid>http://www.sudouest.fr/2013/11/06/des-brebis-se-tuent-dans-un-ravin-1221272-4181.php#xtor=RSS-10521769</guid>
</item>


What would you like to do with every item instance?

Have the crawler follow the URL given in the tag <link>...</link>
and index the content of the document referenced by this URL?


Or just create a new URL entry in the database and index the
information inside the <title>..</title> and <content>..</content>,
without crawling to the given URL?





Thanks for your help



[General] ANNOUNCE: mnoGoSearch 3.3.15

2013-12-01 Thread Alexander Barkov


  Hello,

mnoGoSearch-3.3.15 is now available.
Sources and binaries for a number of platforms
are available from http://www.mnogosearch.org/.

This is mostly a bug-fix release.

Starting from this release, binaries for the console
(non-GUI, Unix-like) version are available for the
Windows family of operating systems.


From ChangeLog:

* Improvements were made to the default search template. Query words are
now highlighted differently when displaying a list of found documents
(using bold font) and when displaying a cached copy of a document
(using yellow background).

* A section about installation of the mnoGoSearch PHP module was added
into the docbook manual.

* The EREGCUT template operator was added, to remove sub-strings
matching a regular expression pattern from a string.

* Bug#4820 mirror files exceed platform limit for file name length
was fixed.

* A few potential vulnerabilities found by the Veracode static analyzer
were fixed (Bug#4826).

* A few warnings reported by the clang compiler were fixed.

* Fixed that the words having non-ASCII letters were not highlighted
when displaying cached copy in cases when the document character set
differs from LocalCharset (a bug since 3.3.13).

* Fixed that the Microsoft SQL Server driver always used quotes in a
USE dbname; query when connecting to the server, assuming that
QUOTED_IDENTIFIERS is set to ON, which is not necessarily always the
case (a bug since 3.3.12). Now quotes are used only for database names
starting with a digit.

* Fixed that popularity rank calculation did not work with Microsoft
SQL Server.

* Fixed a bug in the Microsoft SQL Server driver which reported one
extra byte (for the trailing 0x00) when fetching character data from
the server. This bug made indexer and search.cgi behave unexpectedly in
rare cases.

* Bug#4825 Redirect: Bad URL: redirected locations not indexed was
fixed.

* Fixed that make bin-dist did not work in some cases.

Re: [General] wordstat

2013-12-31 Thread Alexander Barkov


Hi,

On 12/31/2013 01:35 PM, Developpement Team Hodei wrote:

hi,

I'm using mnogosearch 3.2 on an Ubuntu server.
I index with DBMode=blob.
I want to use autocompletion in my input text box when a user searches
the web:
so where are the words that mnogosearch has indexed?
My wrdstat table is empty, but there seem to be words in bdict.


Run indexer --wordstat to populate the table.

Btw, which tools are you going to use to implement autocompletion?
It would be nice to make autocompletion available out of the box
in the next development branch (3.4.x).





thanks



Re: [General] separate indexing/crawling and web recherche

2014-01-08 Thread Alexander Barkov


Hi,

On 01/06/2014 02:56 PM, Developpement Team Hodei wrote:

hi,

I have installed mnogosearch 3.2 on an Ubuntu server with wamp and a MySQL
database.

In my indexer.conf I'm using DBMode=blob.

In my crontab I have this:

00 23 1 * * /usr/local/mnogosearch/sbin/indexer -d
/usr/local/mnogosearch/etc/indexer.conf
 /usr/local/mnogosearch/sbin/logmng 2/usr/local/mnogosearch/sbin/logmng2
30 23 1 * * /usr/local/mnogosearch/sbin/indexer -Eblob
/usr/local/mnogosearch/etc/indexer.conf
 /usr/local/mnogosearch/sbin/logmng 2/usr/local/mnogosearch/sbin/logmng2

My question:

To improve performance, is it necessary to separate the crawling/indexing
done by the mnogosearch engine from the server on which users run web searches?
In other words, is it necessary to create two servers:
* 1 Ubuntu server with wamp / mnogosearch, the MySQL database and the crontab
commands
* 1 Ubuntu server with wamp / mnogosearch / WordPress (my website runs on
WordPress), to which the same MySQL database would be replicated (at the same
frequency as the crontab) and without crontab commands (I'm using the PHP
frontend to access the mnogosearch engine)


How big is your search database?
What does indexer -S return?

- crawling is not CPU consuming,
  it can co-exist with indexing and search without any problems.

- indexing and search are CPU consuming


For small databases (e.g. fewer than 100k documents), having a single
box for all three tasks is fine.

For bigger databases with a few hundred thousand documents, a box with
multiple CPU cores can also handle all three tasks.

For huge databases with millions of documents, separation makes some sense.
However, instead of separating the crawling/indexing/search tasks, I'd
propose considering the cluster solution described here:
http://www.mnogosearch.org/doc33/msearch-cluster.html
It significantly improves search performance.



thanks for your help


Re: [General] rss and multiledia file

2014-01-08 Thread Alexander Barkov


Hi,

On 12/31/2013 10:47 PM, Developpement Team Hodei wrote:

hi,

How can I configure my indexer.conf file so that web search returns no RSS,
audio or video files?


Use Allow/Disallow commands for this.
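
As an illustration only, a fragment of indexer.conf along these lines would exclude such files by extension. The patterns are assumptions about how the feeds and media files are named on the crawled sites; in particular, *.xml also excludes any other XML documents.

```
# Assumed patterns: adjust them to the URLs actually used on your sites.
Disallow *.rss *.xml
Disallow *.mp3 *.ogg *.wav
Disallow *.avi *.mov *.mpg *.mpeg *.mp4
```

To my knowledge these filter commands are checked in the order in which they appear, so the Disallow lines should come before any broader Allow command that would otherwise match the same URLs.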



thanks




Re: [General] separate indexing/crawling and web recherche

2014-01-08 Thread Alexander Barkov




On 01/06/2014 02:56 PM, Developpement Team Hodei wrote:

hi,

I have installed mnogosearch 3.2 on an Ubuntu server with wamp and a MySQL
database.


By the way, why not mnogosearch-3.3?





Re: [General] indexer make segmentation fail

2014-03-17 Thread Alexander Barkov


Hi,

Please try to get a gdb backtrace.
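
One possible way to capture it non-interactively is sketched below. The wrapper name backtrace_of and the GDB override variable are hypothetical. The gdb options used (-batch, -ex, --args) are standard.

```shell
#!/bin/sh
GDB="${GDB:-gdb}"

# Run a command under gdb in batch mode and print a backtrace when it stops.
# Usage: backtrace_of PROGRAM [ARGS...]
backtrace_of() {
    "$GDB" -batch -ex run -ex bt --args "$@"
}
```

For the command from the report this would be backtrace_of /usr/local/mnogosearch/sbin/indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf. If the frames show up only as ?? (), the binary has no debug symbols. Rebuilding with debugging enabled, for instance with CFLAGS="-g -O0" passed to ./configure (an assumption about the build), makes the function names visible.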

On 03/17/2014 10:23 PM, d...@hodei.net wrote:

Hi

I tried this command to index my list of web sites:
/usr/local/mnogosearch/sbin/indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf

The result is: Segmentation fault
Here is my SQL database information:
+------------+-----------+
| Tables     | Size (MB) |
+------------+-----------+
| bdict      |    191.90 |
| bdicti     |    605.58 |
| categories |      0.00 |
| crossdict  |      0.00 |
| dict       |      0.00 |
| links      |      0.00 |
| qcache     |      0.00 |
| qinfo      |      0.00 |
| qtrack     |      0.00 |
| server     |      0.18 |
| srvinfo    |      0.00 |
| url        |   3028.87 |
| urlinfo    |   2110.99 |
| wrdstat    |      0.00 |
+------------+-----------+

Do you have any idea?

__
my config :

* Debian 3.2.51-1 x86_64 GNU/Linux
* mnogosearch 3.3.15
* indexer.conf :
..
   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
 ..
* ./configure command used for installation

../configure --prefix=/usr/local/mnogosearch
--bindir=/usr/local/mnogosearch/bin
--sbindir=/usr/local/mnogosearch/sbin
--sysconfdir=/usr/local/mnogosearch/etc
--localstatedir=/usr/local/mnogosearch/var
--libdir=/usr/local/mnogosearch/lib
--includedir=/usr/local/mnogosearch/include
--mandir=/usr/local/mnogosearch/man
--disable-shared
--enable-static
--enable-syslog
--without-docs
--enable-pthreads
--disable-dmalloc
--enable-parser
--disable-mp3
--disable-xml
--disable-rss
--disable-css
--disable-js
--with-extra-charsets=all
--enable-file
--enable-http
--enable-ftp
--enable-htdb
--enable-news
--with-mysql
--with-zlib
__


---
This email contains no viruses or malware because avast! Antivirus
protection is active.
http://www.avast.com


___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] UDM_URL's parent url inside external parser

2014-04-21 Thread Alexander Barkov


Hi,

On 04/16/2014 06:36 PM, Yasser Zamani wrote:

Hi there,

I need to know the parent url of the url which has been passed to my
parser via UDM_URL. For example if a page at
`http://example.com/example_movie.html` has a link like
`http://example.com/example_movie.mp4` inside it, when mnogosearch
passes `http://example.com/example_movie.mp4` to my parser via UDM_URL,
I need to know it's parent page, `http://example.com/example_movie.html`.

Is it possible at all? if so, how? or any workaround?


You can try to know the parent page using an SQL query like this:


select url1.url  from url url1, url url2
where url2.referrer=url1.rec_id and
url2.url='http://example.com/example_movie.mp4';
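
If the lookup has to be done from a script, the query above could be wrapped in a small helper. This is only a sketch: it runs the same self-join against an in-memory SQLite table that mimics just the three columns the query uses (rec_id, url, referrer). The helper name parent_urls and the sample rows are made up for illustration; a real setup would connect to its own mnoGoSearch database instead.

```python
import sqlite3

# Same self-join as the query above, with the child URL as a bound parameter.
PARENT_SQL = """
SELECT url1.url
FROM url url1, url url2
WHERE url2.referrer = url1.rec_id
  AND url2.url = ?
"""

def parent_urls(conn, child_url):
    """Return the URLs of pages recorded as the referrer of child_url."""
    return [row[0] for row in conn.execute(PARENT_SQL, (child_url,))]

# Hypothetical stand-in for the mnoGoSearch "url" table (only three columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT, referrer INTEGER)")
conn.executemany(
    "INSERT INTO url (rec_id, url, referrer) VALUES (?, ?, ?)",
    [
        (1, "http://example.com/example_movie.html", 0),
        (2, "http://example.com/example_movie.mp4", 1),
    ],
)

print(parent_urls(conn, "http://example.com/example_movie.mp4"))
```

Inside an external parser the child URL would come from UDM_URL rather than from a literal string.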




Thanks in advance!

___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] parameter to server command in indexer.conf

2014-05-29 Thread Alexander Barkov


Hi,

On 05/05/2014 05:02 PM, d...@hodei.net wrote:

Hi

When I try to add this URL to my list in indexer.conf
--
server http://fr.wikipedia.org/wiki/Zanpantzar
--
the crawler searches the whole fr.wikipedia.org site and not only the
Zanpantzar directory

Do you have any idea?


Hmm. http://fr.wikipedia.org/wiki/Zanpantzar
is not a directory. It's a file.


With this server command it should crawl everything
in this directory: http://fr.wikipedia.org/wiki/
It should not go outside of the /wiki/ directory.

If it goes outside of /wiki/, perhaps you have
more server commands.
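
To make the "file versus directory" point concrete, here is a rough model of the behaviour described above. It is a simplified sketch, not mnoGoSearch's actual matching code: the function names server_scope and in_scope are invented for this example, and the rule it applies (cut the Server URL back to its last "/" and accept everything under that prefix) is only my reading of the explanation above.

```python
from urllib.parse import urlsplit

def server_scope(server_url):
    """Reduce a Server URL to the directory prefix it is assumed to cover."""
    parts = urlsplit(server_url)
    # Keep everything up to and including the last "/" of the path.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"

def in_scope(server_url, candidate_url):
    """True if candidate_url lies under the directory of server_url."""
    return candidate_url.startswith(server_scope(server_url))

server = "http://fr.wikipedia.org/wiki/Zanpantzar"
print(server_scope(server))
print(in_scope(server, "http://fr.wikipedia.org/wiki/Euskara"))
print(in_scope(server, "http://fr.wikipedia.org/w/index.php"))
```

Under this model every article under /wiki/ is in scope, which matches what you observe: almost all of the site lives in that one directory.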



Thanks



__
my config :

* Debian 3.2.51-1 x86_64 GNU/Linux
* mnogosearch 3.3.15
* indexer.conf :
..
   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
 ..


___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] accented characters

2014-05-29 Thread Alexander Barkov




On 05/05/2014 02:32 PM, d...@hodei.net wrote:

Hi

I have a problem with accented characters in my web search.

To solve this problem, I modified the database with these queries:

ALTER TABLE bdict CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE bdicti CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

and I have set the variables $localcharset and $browsercharset to
utf-8 in my indexer.conf

But I still have the problem!

Do you have any idea?

Thanks
__
my config :

* Debian 3.2.51-1 x86_64 GNU/Linux
* mnogosearch 3.3.15
* indexer.conf :
..
   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob



Try adding the SetNames=utf8 part, like this:


DBAddr mysql://root:password@localhost/mnogosearch/?SetNames=utf8&dbmode=blob
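
Note that the two parameters must be separated by "&". A quick way to see what goes wrong when the separator is lost is to split the address with a generic URL parser. The sketch below uses Python's urllib.parse, which only approximates how mnoGoSearch itself reads DBAddr; the helper name dbaddr_params is made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

def dbaddr_params(dbaddr):
    """Return the parameters after "?" in a DBAddr value as a dict."""
    query = urlsplit(dbaddr).query
    return {name: values[0] for name, values in parse_qs(query).items()}

good = "mysql://root:password@localhost/mnogosearch/?SetNames=utf8&dbmode=blob"
bad = "mysql://root:password@localhost/mnogosearch/?SetNames=utf8dbmode=blob"

print(dbaddr_params(good))  # two separate parameters
print(dbaddr_params(bad))   # dbmode is swallowed into the SetNames value
```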


 ..





___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Delete a line in Server method in indexer.conf

2014-05-29 Thread Alexander Barkov


Hi,

On 05/05/2014 02:26 PM, d...@hodei.net wrote:

Hi,

In the indexer.conf file, under 'Server [Method]', I want to delete an
entry like this: 'server http://www.eke.org'

Will all pages of the site be removed from the database during
the next crawl?


Every document has its own expiration time, which is stored in 
url.next_index_time.


When crawling, indexer deletes all expired documents that do not
have a matching Server/Real command.

Note, you can delete all documents at once,
without waiting for expiration:

indexer -Cw -u http://www.eke.org/%
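
Before deleting, it can be useful to see which documents a pattern like the one above would match and which of them have already expired. The following is a sketch only: it uses an in-memory SQLite table with just the url and next_index_time columns, assumes next_index_time is a Unix timestamp, and assumes the "%" in the pattern behaves like an SQL LIKE wildcard. The function name preview and the sample rows are invented for illustration.

```python
import sqlite3
import time

def preview(conn, like_pattern, now=None):
    """List (url, expired) for documents whose URL matches a LIKE pattern."""
    now = int(time.time()) if now is None else now
    rows = conn.execute(
        "SELECT url, next_index_time FROM url WHERE url LIKE ? ORDER BY url",
        (like_pattern,),
    )
    return [(url, next_time <= now) for url, next_time in rows]

# Hypothetical stand-in for the mnoGoSearch "url" table (two columns only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE url (url TEXT, next_index_time INTEGER)")
conn.executemany(
    "INSERT INTO url VALUES (?, ?)",
    [
        ("http://www.eke.org/a", 500),
        ("http://www.eke.org/b", 2000),
        ("http://other.example/", 500),
    ],
)

for url, expired in preview(conn, "http://www.eke.org/%", now=1000):
    print(url, "expired" if expired else "not yet expired")
```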





Thanks


___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Indexing Failed with large database

2014-07-02 Thread Alexander Barkov


Hi,


On 06/26/2014 06:49 PM, d...@hodei.net wrote:

Hi

I have a problem when indexing my database:
___

root@botujo:/home/jean# /usr/local/mnogosearch/sbin/indexer -Eblob
indexer[4787]: Indexing
indexer[4787]: Loading URL list
{sql.c:1513} Query: SELECT rec_id, site_id, pop_rank, last_mod_time FROM
url

indexer[4787]: MySQL driver: #144: Table './mnogosearch/url' is marked
as crashed and last (automatic?) repair failed


Here is my database information in phpMyAdmin:

 name         lines         size
 ----------------------------------
 bdict        864 575       1,1 GB
 bdicti       in use
 bdict_tmp                  2,0 KB
 categories                 1,0 KB
 crossdict                  1,0 KB
 dict                       1,0 KB
 links                      1,0 KB
 qcache                     1,0 KB
 qinfo                      2,0 KB
 qtrack                     1,0 KB
 server       889156,7 KB
 srvinfo                    1,0 KB
 url          in use
 urlinfo      11 009 854    27,4 GB


Have you an idea ?


Does manual REPAIR TABLE url help?




Thanks

__
my config :

* Debian 3.2.51-1 x86_64 GNU/Linux
* mnogosearch 3.3.15
* indexer.conf :
..
   DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob


___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Crawling order

2014-08-05 Thread Alexander Barkov


Hi,


On 08/05/2014 12:12 PM, d...@hodei.net wrote:

Hi

I have 1000 websites in my indexer.conf, in the 'Server' section.

In what order does the crawler go through the list of websites: random,
alphabetical or other?


Crawler selects targets in a random order.

There are some related command line options:


  -e  Visit 'most expired' (oldest) documents first
  -o  Visit documents with less depth (hops value) first
  -r  Do not try to reduce remote servers load by randomising
  crawler queue order (faster, but less polite)
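
As a toy illustration of what these orderings mean, the sketch below sorts a hypothetical list of queued documents three ways. It is not how indexer is implemented: the function crawl_order, its keyword arguments and the sample records are all invented here, and it does not model how the real options interact when combined.

```python
import random

def crawl_order(docs, oldest_first=False, shallow_first=False, rng=random):
    """Return docs in the order a crawler following these rules would visit them.

    docs is a list of dicts with 'url', 'next_index_time' and 'hops' keys.
    """
    if oldest_first:                       # like -e: most expired first
        return sorted(docs, key=lambda d: d["next_index_time"])
    if shallow_first:                      # like -o: fewest hops first
        return sorted(docs, key=lambda d: d["hops"])
    shuffled = list(docs)                  # default: random order
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical queue entries, for illustration only.
docs = [
    {"url": "http://a.example/", "next_index_time": 300, "hops": 2},
    {"url": "http://b.example/", "next_index_time": 100, "hops": 1},
    {"url": "http://c.example/", "next_index_time": 200, "hops": 0},
]

print([d["url"] for d in crawl_order(docs, oldest_first=True)])
print([d["url"] for d in crawl_order(docs, shallow_first=True)])
```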






thanks for your help


_
my config :
* Debian 3.2.51-1 x86_64 GNU/Linux
* mnogosearch 3.3.15
* contents of indexer.conf :
 ..
 DBAddr  mysql://root:password@localhost/mnogosearch/?dbmode=blob
 ..
_



___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Duplicates Commandes in indexer

2014-10-06 Thread Alexander Barkov


Hi,

Most likely you have two DBAddr commands in your indexer.conf.

If this does not help, please send me your indexer.conf.


On 10/06/2014 01:23 PM, Hodei-Dev wrote:

Hi
When I execute the indexer command, it runs the command twice;
for example:

/usr/local/mnogosearch/sbin/indexer -Ecreate -d 
/usr/local/mnogosearch/etc/indexer.conf :
this command creates the tables twice, and on the second run I get the warning
'table already exist'

/usr/local/mnogosearch/sbin/indexer -Eblob 
/usr/local/mnogosearch/etc/indexer.conf :
this command returns this:
-
indexer[16663]: Indexing
indexer[16663]: Loading URL list
indexer[16663]: Converting intag00
indexer[16663]: Converting intag01
indexer[16663]: Converting intag02
indexer[16663]: Converting intag03
indexer[16663]: Converting intag04
indexer[16663]: Converting intag05
indexer[16663]: Converting intag06
indexer[16663]: Converting intag07
indexer[16663]: Converting intag08
indexer[16663]: Converting intag09
indexer[16663]: Converting intag0A
indexer[16663]: Converting intag0B
indexer[16663]: Converting intag0C
indexer[16663]: Converting intag0D
indexer[16663]: Converting intag0E
indexer[16663]: Converting intag0F
indexer[16663]: Converting intag10
indexer[16663]: Converting intag11
indexer[16663]: Converting intag12
indexer[16663]: Converting intag13
indexer[16663]: Converting intag14
indexer[16663]: Converting intag15
indexer[16663]: Converting intag16
indexer[16663]: Converting intag17
indexer[16663]: Converting intag18
indexer[16663]: Converting intag19
indexer[16663]: Converting intag1A
indexer[16663]: Converting intag1B
indexer[16663]: Converting intag1C
indexer[16663]: Converting intag1D
indexer[16663]: Converting intag1E
indexer[16663]: Converting intag1F
indexer[16663]: Total converted: 2604877 records, 13711786 bytes
indexer[16663]: Converting url data
indexer[16663]: Switching to new blob table.
indexer[16663]: Loading URL list
indexer[16663]: Converting intag00
indexer[16663]: Converting intag01
indexer[16663]: Converting intag02
indexer[16663]: Converting intag03
indexer[16663]: Converting intag04
indexer[16663]: Converting intag05
indexer[16663]: Converting intag06
indexer[16663]: Converting intag07
indexer[16663]: Converting intag08
indexer[16663]: Converting intag09
indexer[16663]: Converting intag0A
indexer[16663]: Converting intag0B
indexer[16663]: Converting intag0C
indexer[16663]: Converting intag0D
indexer[16663]: Converting intag0E
indexer[16663]: Converting intag0F
indexer[16663]: Converting intag10
indexer[16663]: Converting intag11
indexer[16663]: Converting intag12
indexer[16663]: Converting intag13
indexer[16663]: Converting intag14
indexer[16663]: Converting intag15
indexer[16663]: Converting intag16
indexer[16663]: Converting intag17
indexer[16663]: Converting intag18
indexer[16663]: Converting intag19
indexer[16663]: Converting intag1A
indexer[16663]: Converting intag1B
indexer[16663]: Converting intag1C
indexer[16663]: Converting intag1D
indexer[16663]: Converting intag1E
indexer[16663]: Converting intag1F
indexer[16663]: Total converted: 2605019 records, 13712168 bytes
indexer[16663]: Converting url data
indexer[16663]: Switching to new blob table.
-
In my install, I configured indexer like this:
usr/local/mnogosearch/lib  --includedir=/usr/local/mnogosearch/include  
--mandir=/usr/local/mnogosearch/man  --disable-shared  --enable-static  
--enable-syslog  --without-docs  --enable-pthreads  --disable-dmalloc  
--enable-parser  --disable-mp3 --disable-xml --disable-rss --disable-css 
--disable-js --with-extra-charsets=all   --enable-file  --enable-http  
--enable-ftp  --enable-htdb  --enable-news --with-mysql --with-zlib

_
my config :
* Debian 3.2.51-1 x86_64 GNU/Linux
* mnogosearch 3.3.15
* contents of indexer.conf :
 ..
 DBAddr  mysql://root:password@localhost/mnogosearch/?dbmode=blob
 ..
_

Have you an idea ?

Thanks

___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Geo Search

2015-03-11 Thread Alexander Barkov


Hello,

On 03/10/2015 03:24 PM, Tiptoe Danceware Sales wrote:

Hello,

I was hoping someone could point me in the right direction.  Our website
has geo settings per page (e.g. this page can only be seen in Europe).
The problem is when mnogosearch indexes and hits this page it is given a
message This page can not be viewed in your country because the
mnogosearch indexer is in USA and the website doesn't allow mnogo to
view the content.

We can hack our cms to allow mnogosearch to index all content.  The
problem lies in how to tell mnogoseach to only give results for content
that is visible in their country.  We could place a meta tag in each
page telling mnogosearch where this page is visible, but how to get
mnogosearch to store this in its index and then evaluate this during
user searches?

Any ideas?


Limiting by a specific meta tag value can be done with help of a Limit
command in indexer.conf and a corresponding parameter fl=xxx passed to
search.cgi

Please have a look here for details:

http://www.mnogosearch.org/doc33/msearch-cmdref-limit.html




Thanks,

Mike


___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

[General] ANNOUNCE: mnoGoSearch-3.4.1

2015-12-14 Thread Alexander Barkov


Hi,

mnoGoSearch-3.4.1 is available from our site
http://www.mnogosearch.org/

This is a new development branch with lots of changes and improvements.

Please see here for a detailed change list:
http://www.mnogosearch.org/doc34/msearch-changelog.html

Greetings.
___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] HoldBadHrefs has no effect

2017-02-17 Thread Alexander Barkov

Hello Jeff,

Sorry for a late reply.

On 02/09/2017 02:25 AM, Jeff Taylor wrote:
> I've been running Mnogo 3.3.13 under debian wheezy for a number of
> years, but it seems that old cached content is never removed.  I
> normally run with HoldBadHrefs=7d, but I've even tried setting it to 1s,
> and the old content is still never removed.  As an example, on a recent
> search I noticed pages that were last cached in Aug 2011 (which was
> probably when the site went offline), but it still comes up in searches.
> 
> Help?  I would even be happy with running a mysql query to remove all
> cached content that is more than 7 days old, but I didn't want to go
> blindly deleting things without knowing how the info in the tables might
> be cross-referenced.  If there's a way to fix indexer.conf I would also
> be happy.  Note that the old pages which should be removed are no longer
> referenced in server.list, and when I run indexer I get a long list of
> URLs that can't be reached.  So what can I do to get these old entries
> removed from the database?

Which http status do these old documents have?

Can you please check statistics for a few old documents:

./indexer -S -u http://old1/
./indexer -S -u http://old2/
./indexer -S -u http://old3/


Or using this SQL query:

SELECT status, url FROM url WHERE url IN
('http://old1/','http://old2/','http://old3/');



Also, the output from this command would be helpful:

./indexer -am -v6 -u http://old1/


Please also send your indexer.conf to b...@mnogosearch.org.



___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Indexing problem with sqlite3

2017-03-22 Thread Alexander Barkov

Hello Teijo,


SQLite changed the error message in one of the recent releases,
from "unique" in lower case to "UNIQUE" in upper case.


Please apply this patch to src/sql-sqlite.c:



-if (!strstr(db->errstr,"unique"))
+if (!strstr(db->errstr,"unique") && !strstr(db->errstr,"UNIQUE"))
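
The idea behind the patch can be tried outside mnoGoSearch. The sketch below provokes a duplicate-key error in an in-memory SQLite database and checks the message text without depending on its letter case. It is an illustration in Python, not the C code from sql-sqlite.c; the function names are invented, and the exact wording of the message depends on the SQLite version your Python is linked against.

```python
import sqlite3

def is_unique_violation(message):
    """Case-insensitive variant of the strstr() check in the patch."""
    return "unique" in message.lower()

def duplicate_insert_message():
    """Provoke a duplicate-key error and return SQLite's message text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT UNIQUE)")
    conn.execute("INSERT INTO url (url) VALUES ('http://example.com/')")
    try:
        conn.execute("INSERT INTO url (url) VALUES ('http://example.com/')")
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return ""

message = duplicate_insert_message()
print(message)
# A lower-case-only search can miss the newer wording; the relaxed check does not.
print("unique" in message, is_unique_violation(message))
```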






On 03/22/2017 06:39 PM, Alexander Barkov wrote:
> Hello Teijo,
> 
> 
> On 03/22/2017 03:44 PM, Teijo wrote:
>> Hello,
>>
>> I have installed Mnogosearch 3.4.1 from source both to Ubuntu 16.04 and
>> Debian Jessie.
>>
>> In Ubuntu I cannot use Mysql as database because there seem to be some
>> compatibility issues with Mysql 5.7. In Jessie where Mysql version is
>> 5.5x there are no such problems.
>>
>> I thought to use Sqlite3 in Ubuntu. Database setup goes without errors
>> with indexer --create. But when I try to make index with simply typing
>> indexer, I get similar to the following:
>>
>> [33572]{--} indexer from mnogosearch-3.4.1-sqlite3 started with
>> '/usr/local/mnogosearch/etc/indexer.conf'
>> [33572]{01} Error: 'DB: sqlite3 driver: (19) UNIQUE constraint failed:
>> url.url'
>>
>> There seem to be similar problems with Sqlite3 in Jessie as well.
>>
>> I am not familiar with Mnogosearch and Sqlite3 so is there something I
>> have missed when setting up the environment? Only changes I have made in
>> indexer.conf are Dbaddress and server definitions. Dbaddress is just
>> that it's in the example of Sqlite3 definition in indexer.conf-dist.
> 
> Which exact version  of SQLite are you using?
> 
> 
> Can you please send your indexer.conf and the output for:
> 
> ./indexer --sqlmon --exec="SELECT rec_id, url FROM url"
> 
> to b...@mnogosearch.org
> 
> Thanks.
> 
> 
> 
>>
>> Best regards,
>>
>> Teijo
___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Indexing problem with sqlite3

2017-03-22 Thread Alexander Barkov

Hello Teijo,


On 03/22/2017 03:44 PM, Teijo wrote:
> Hello,
> 
> I have installed Mnogosearch 3.4.1 from source both to Ubuntu 16.04 and
> Debian Jessie.
> 
> In Ubuntu I cannot use Mysql as database because there seem to be some
> compatibility issues with Mysql 5.7. In Jessie where Mysql version is
> 5.5x there are no such problems.
> 
> I thought to use Sqlite3 in Ubuntu. Database setup goes without errors
> with indexer --create. But when I try to make index with simply typing
> indexer, I get similar to the following:
> 
> [33572]{--} indexer from mnogosearch-3.4.1-sqlite3 started with
> '/usr/local/mnogosearch/etc/indexer.conf'
> [33572]{01} Error: 'DB: sqlite3 driver: (19) UNIQUE constraint failed:
> url.url'
> 
> There seem to be similar problems with Sqlite3 in Jessie as well.
> 
> I am not familiar with Mnogosearch and Sqlite3 so is there something I
> have missed when setting up the environment? Only changes I have made in
> indexer.conf are Dbaddress and server definitions. Dbaddress is just
> that it's in the example of Sqlite3 definition in indexer.conf-dist.

Which exact version  of SQLite are you using?


Can you please send your indexer.conf and the output for:

./indexer --sqlmon --exec="SELECT rec_id, url FROM url"

to b...@mnogosearch.org

Thanks.



> 
> Best regards,
> 
> Teijo
___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Extra hit with SQL query and word position in the original file

2017-03-23 Thread Alexander Barkov

Hello Teijo,


On 03/24/2017 01:24 AM, Teijo wrote:
> Hello,
> 
> If I search given word with search.cgi, I get correct number of occurences.
> 
> But if I do it with SQL (no matter in mysql or sqlite3), they show extra
> occurence. For example, if a given word is in a given original file
> twice, they tell that there are three occurences. SQL query is almost
> the same one found in Mnogosearch's manual, except that I am using only
> one word:
> 
> SELECT url.url, count(*) AS RANK FROM dict, url WHERE
> url.rec_id=dict.url_id AND dict.word IN ('word') GROUP BY url.url ORDER
> BY rank DESC;
> 
> I'd like to know (by SQL query) position of word in the original file
> (to use filepos function). There is at least coord column in dict table.
> Coord contains section id and word's position in relationship to
> section, if I have understood correctly. How to extract the relative
> position from coord, or is the position information elsewhere in
> database? If I disabled all sections, would coord actually contain the
> absolute position?
> 
> I'm using "single mode" as to database.

Coord is a 32-bit number.

- The highest 8 bits are the section ID (e.g. title, body, etc.,
   according to the Section commands in indexer.conf)

- The lowest 24 bits are the position inside this section.

- The last hit inside each combination (url_id,word,secno) is not a real
hit: it stores the section length (i.e. the total number of words in this
section in this document). This likely explains the extra occurrence
reported by count(*).


This MySQL query returns the information in a readable form:

SELECT url_id,word,coord>>24 AS secno,coord&0xFFFFFF AS pos FROM dict
WHERE word='mnogosearch' ORDER BY secno,pos;

+--------+-------------+-------+-----+
| url_id | word        | secno | pos |
+--------+-------------+-------+-----+
|      1 | mnogosearch |     1 |   1 |
|      1 | mnogosearch |     1 |  14 |
|      1 | mnogosearch |     1 |  28 |
|      1 | mnogosearch |     1 |  42 |
|      1 | mnogosearch |     1 |  76 |
|      1 | mnogosearch |     1 |  77 |
|      1 | mnogosearch |     1 |  85 |
|      1 | mnogosearch |     1 | 105 | <- section 1 length
|      1 | mnogosearch |     2 |   1 |
|      1 | mnogosearch |     2 |   6 | <- section 2 length
|      1 | mnogosearch |     3 |  54 |
|      1 | mnogosearch |     3 |  69 | <- section 3 length
|      1 | mnogosearch |     4 |   1 |
|      1 | mnogosearch |     4 |  11 | <- section 4 length
|      1 | mnogosearch |     8 |   2 |
|      1 | mnogosearch |     8 |   4 | <- section 8 length
+--------+-------------+-------+-----+


Lines that are not marked as "section X length" are actual word hits.
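
If you prefer to do the decoding on the client side, the same bit arithmetic can be sketched like this. The helper names decode_coord and split_hits are made up for this example, and the code assumes that "last hit" means the entry with the highest position within each (url_id, word, secno) group, as in the ordered output above.

```python
from collections import defaultdict

def decode_coord(coord):
    """Split a 32-bit coord into (section id, position inside the section)."""
    return coord >> 24, coord & 0xFFFFFF

def split_hits(rows):
    """Separate real word hits from the trailing section-length entries.

    rows is an iterable of (url_id, word, coord) tuples, e.g. from the dict table.
    """
    groups = defaultdict(list)
    for url_id, word, coord in rows:
        secno, pos = decode_coord(coord)
        groups[(url_id, word, secno)].append(pos)
    hits, lengths = [], {}
    for key, positions in groups.items():
        positions.sort()
        lengths[key] = positions[-1]  # last entry is the section length
        hits.extend(key + (pos,) for pos in positions[:-1])
    return sorted(hits), lengths

# Sections 2 and 3 from the table above, re-encoded as raw coord values.
rows = [
    (1, "mnogosearch", (2 << 24) | 1),
    (1, "mnogosearch", (2 << 24) | 6),
    (1, "mnogosearch", (3 << 24) | 54),
    (1, "mnogosearch", (3 << 24) | 69),
]
hits, lengths = split_hits(rows)
print(hits)
print(lengths)
```

Counting only the entries in hits, rather than every row, gives the number of real occurrences.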


> 
> Best regards,
> 
> Teijo
___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] Indexing problem with sqlite3

2017-03-23 Thread Alexander Barkov

Hello,

On 03/22/2017 08:51 PM, Teijo wrote:
> Hello,
> 
> Unfortunately patch did not solve the problem.
> 
> As to SQLite3 versions, Ubuntu 16.04 it is
> SQLite version 3.11.0 2016-02-15 17:29:24
> and in Jessie
> SQLite version 3.8.7.1 2014-10-29 13:59:56

There are two similar places in sql-sqlite.c

Please make sure to fix the SQLite3 (rather than SQLite2) code branch:

  case SQLITE_ERROR:
    sqlite3_finalize(pStmt);
    udm_snprintf(db->errstr, sizeof(db->errstr),
                 "sqlite3 driver: (%d) %s",
                 sqlite3_errcode(UdmSQLite3Conn(db)),
                 sqlite3_errmsg(UdmSQLite3Conn(db)));
    if (!strstr(db->errstr,"unique") && !strstr(db->errstr,"UNIQUE"))
    {
      UdmSetErrorCode(db, 1);
      return UDM_ERROR;
    }
    return UDM_OK;
    break;



> 
> Best regards,
> 
> Teijo
> 
> 22.3.2017, 16:52, Alexander Barkov kirjoitti:
> 
>> Hello Teijo,
>>
>>
>> SQLite changed the error message in one of the recent releases,
>> from "unique" in lower case to "UNIQUE" in upper case.
>>
>>
>> Please apply this patch to src/sql-sqlite.c:
>>
>>
>>
>> -if (!strstr(db->errstr,"unique"))
>> +if (!strstr(db->errstr,"unique") &&
>> !strstr(db->errstr,"UNIQUE"))
>>
>>
>>
>>
>>
>>
>> On 03/22/2017 06:39 PM, Alexander Barkov wrote:
>>> Hello Teijo,
>>>
>>>
>>> On 03/22/2017 03:44 PM, Teijo wrote:
>>>> Hello,
>>>>
>>>> I have installed Mnogosearch 3.4.1 from source both to Ubuntu 16.04 and
>>>> Debian Jessie.
>>>>
>>>> In Ubuntu I cannot use Mysql as database because there seem to be some
>>>> compatibility issues with Mysql 5.7. In Jessie where Mysql version is
>>>> 5.5x there are no such problems.
>>>>
>>>> I thought to use Sqlite3 in Ubuntu. Database setup goes without errors
>>>> with indexer --create. But when I try to make index with simply typing
>>>> indexer, I get similar to the following:
>>>>
>>>> [33572]{--} indexer from mnogosearch-3.4.1-sqlite3 started with
>>>> '/usr/local/mnogosearch/etc/indexer.conf'
>>>> [33572]{01} Error: 'DB: sqlite3 driver: (19) UNIQUE constraint failed:
>>>> url.url'
>>>>
>>>> There seem to be similar problems with Sqlite3 in Jessie as well.
>>>>
>>>> I am not familiar with Mnogosearch and Sqlite3 so is there something I
>>>> have missed when setting up the environment? Only changes I have
>>>> made in
>>>> indexer.conf are Dbaddress and server definitions. Dbaddress is just
>>>> that it's in the example of Sqlite3 definition in indexer.conf-dist.
>>>
>>> Which exact version  of SQLite are you using?
>>>
>>>
>>> Can you please send your indexer.conf and the output for:
>>>
>>> ./indexer --sqlmon --exec="SELECT rec_id, url FROM url"
>>>
>>> to b...@mnogosearch.org
>>>
>>> Thanks.
>>>
>>>
>>>
>>>>
>>>> Best regards,
>>>>
>>>> Teijo
___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] URL matches list as query string

2017-04-07 Thread Alexander Barkov

Hello,

On 04/07/2017 02:04 AM, Teijo wrote:
> Hello,
> 
> I have URL (server) I have indexed; for example: www.example.com/files
> containing several documents. I would like to restrict search results
> only to documents which names are document1 and document2.
> 
> If ul parameter in query string contains only document1 or document2,
> but not both, search results are restricted to the corresponding
> document. But I have not found a way to get ul parameter in query string
> to be such one that restriction would contain both documents. I get no
> matches when trying to put both documents to query string although both
> documents contain word I'm searching for.
> 
> This is an example query string which does not work:
> 
> ?q=test&m=all&wm=beg&ul=document1+document2

Try multiple ul= parameters:

?q=test&m=all&wm=beg&ul=document1&ul=document2
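
If the query string is generated by a script rather than by an HTML form, repeating the parameter is straightforward. A minimal sketch in Python, where the function name search_query is made up and only the q and ul parameters are taken from this thread:

```python
from urllib.parse import urlencode

def search_query(words, url_limits):
    """Build a search.cgi query string with one ul= parameter per limit."""
    params = {"q": words, "ul": list(url_limits)}
    # doseq=True emits one name=value pair per list element.
    return "?" + urlencode(params, doseq=True)

print(search_query("test", ["document1", "document2"]))
# prints ?q=test&ul=document1&ul=document2
```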

> 
> I have tried also to pass this (and other variants with different ul
> parameter) directly to search.cgi.
> 
> Best regards,
> 
> Teijo
> 
> 5.4.2017, 13:58, Alexander Barkov kirjoitti:
> 
>> Hi Teijo,
>>
>> On 03/30/2017 05:03 PM, Teijo wrote:
>>> Hello,
>>>
>>> I have tried with multi selection list box and text edit field. In both
>>> cases only one item is accepted. If I try with more than one, the rest
>>> are omitted (multi selection) or your search did not match any documents
>>> message is shown (when entered in the text edit field.
>>>
>>> I do not know what to try next.
>>
>> Can you please clarify what exactly you're doing?
>>
>> What does the relevant HTML code look like,
>> and what does the URL look like after you submit?
>>
>> Thanks.
>>
>>>
>>> Best regards,
>>>
>>> Teijo
___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general

Re: [General] URL matches list as query string

2017-04-05 Thread Alexander Barkov

Hi Teijo,

On 03/30/2017 05:03 PM, Teijo wrote:
> Hello,
> 
> I have tried with multi selection list box and text edit field. In both
> cases only one item is accepted. If I try with more than one, the rest
> are omitted (multi selection) or your search did not match any documents
> message is shown (when entered in the text edit field.
> 
> I do not know what to try next.

Can you please clarify what exactly you're doing?

What does the relevant HTML code look like,
and what does the URL look like after you submit?

Thanks.

> 
> Best regards,
> 
> Teijo
___
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general
