UdmSearch: New message on the WebBoard #1: HTDB index

2000-10-25 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Find it in doc/samples/htdb.conf



__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: New message on the WebBoard #1: Dynamically Generated Pages

2000-10-25 Thread Padilha

Author: Padilha 
Email: [EMAIL PROTECTED]
Message:
My web pages are dynamically generated by Java Servlets, much like pages generated by PHP.
How can I index these pages?





UdmSearch: New message on the WebBoard #1: Dynamically Generated Pages

2000-10-25 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
You don't need any special configuration.
Just check Allow/Disallow commands.






UdmSearch: New message on the WebBoard #1: How can users add their homepage to the index?

2000-10-25 Thread Jan Paul ten Wolde

Author: Jan Paul ten Wolde
Email: [EMAIL PROTECTED]
Message:
I want my visitors to be able to add their URL to the index themselves. How can I 
configure udmsearch (3.0.9-mysql-cgi-frontend) to have this add URL feature?






UdmSearch: New message on the WebBoard #1: How can users add their homepage to the index?

2000-10-25 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Since 3.1.7 we have a new "ServerTable" feature.
Currently there is no front-end for adding URLs to
server tables, but we plan to make one soon.






UdmSearch: New message on the WebBoard #1: Indexing whole TLD

2000-10-25 Thread Martin Perst

Author: Martin Perst
Email: [EMAIL PROTECTED]
Message:
What should be in my indexer.conf in order to index a whole TLD?

For example:
I have this URL - http://corp.sk/inet/web_all.html - and I want the indexer to index
all .sk servers, including subdirectories and all pages (recursively).

Sorry if my question is too lame, but I've been playing around with "FollowOutside"
for several hours, and it always indexes only the main page.





UdmSearch: New message on the WebBoard #1: Welcome to go.to/blmts!

2000-10-25 Thread Vandal

Author: Vandal
Email: 
Message:
`





UdmSearch: New message on the WebBoard #1: Unknown column errors

2000-10-25 Thread Jim

Author: Jim
Email: 
Message:
After a week of trying to get UDM running, I am almost there. The major sticking point
now is the tables. What does this error mean?

Displaying documents 1-20 of total 123 found. 
An error occured! 

Query error: SELECT SQL_SMALL_RESULT url.url, url.title, url.txt, url.content_type, 
url.docsize, url.last_modified, url.keywords, url.description, url.crc, url.rec_id 
FROM url WHERE url.rec_id = 4 
Unknown column 'url.crc' in 'field list' 

If I create url.crc, it then complains about the last-modified date, and so on. I went
to single mode since crc-multi never worked.

I have updated to the latest 3.1.7 and PHP.





UdmSearch: possible bug in search.cgi

2000-10-25 Thread Gene King
Title: UdmSearch: atlantic

Thanks

Gene



UdmSearch: possible bug in indexer

2000-10-25 Thread Gene King

I am using indexer from UdmSearch v.3.1.7/MySQL
http://search.mnogo.ru (C) 1998-2000, UdmSearch Developers Team.

I have loaded the URLs to index into a MySQL server table. I wanted to
index only valid words, but I got no words at all when I used
IspellIncorrectFactor 0.

I am using a MySQL table for the spell checking, but I do not see where in
indexer.conf I need to say that. There is a reference to it in search.html.

What am I doing wrong or is there a bug?

Thanks.

Gene


Eugene K. King
Programmer

EuroDebit Systems, Inc.
Phone: 505.454.3806, ext. 520
Facsimile: 505.454.3817
http://www.eurodebit.com  
[EMAIL PROTECTED]  


UdmSearch: searching private lists

2000-10-25 Thread Nick Marouf

Hi all,
I am new to UdmSearch and was hoping to get some help.

The setup I have is as follows.
In the UdmSearch config I set it to index
www.test.edu/archive
and that handles the public archives for my lists just fine.

For private mailing lists I set it to index
www.test.edu/mailman/private
and that won't work because it asks for a list name. If I put in a list
name (time consuming for many lists), I am stuck with the username and
password. There is a way in UdmSearch to authenticate using AuthBasic,
but that is not working with Mailman.

I thought that I could set up an Apache ScriptAlias so that I can get to
the private index list with something like
www.test.edu/privatearchives/ and this would work to index pages in
UdmSearch. But I don't want those archives public, so I put an
htaccess password on it, and I can use the UdmSearch capabilities to
authenticate with htaccess.

So now UdmSearch lists the URLs it found as

http://www.test.edu/privatearchive/adms/October2000/msg01116.html

When I click on it, it asks for the htaccess password, and that won't work
for people.

What are people doing to search private lists? I searched the archives
and wasn't able to come up with much.


Thanks a lot for any help

Nick



-- 
Nicholas Marouf || Earlham College || Assistant System Administrator. 
http://www.ramallahonline.com
"it is better to die standing than to live on one's knees"




Re: UdmSearch: Possible bug in udmsearch

2000-10-25 Thread Alexander Barkov

Hi!

I can't reproduce this on my box.

Gene King wrote:
> 
> I am using: indexer from UdmSearch v.3.1.7/MySQL.
> 
> I have loaded my list of sites to search in a server table along with a
> category. After the indexer is complete the category column in the URL table
> all contain the category of the first site it indexed.




UdmSearch: New message on the WebBoard #1: Unknown column errors

2000-10-25 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
You are using the OLD database structure from 3.0.x
with the NEW 3.1.7 indexer. You have to recreate
all tables.





Re: UdmSearch: possible bug in search.cgi

2000-10-25 Thread Alexander Barkov

Hi!

This is fixed in the 3.1.8 sources, which will be available soon.

Thanks for reporting anyway.


>Displaying documents 1-3 of total 2 found.




UdmSearch: New message on the WebBoard #1: Indexing whole TLD

2000-10-25 Thread gluke

Author: gluke
Email: [EMAIL PROTECTED]
Message:
You should use the Allow/Disallow commands to do that. Please read the documentation
and examples included in the distribution.
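
Before putting a pattern into an Allow or Disallow line, it can help to try it
against a few sample URLs. The sketch below is only an illustration under some
assumptions: the helper name is_sk_url and the example.com URLs are made up, and
it uses grep's extended regular expressions, which may differ in detail from the
regexp library indexer is built with. The exact Allow/Disallow syntax for your
version is in the distribution documentation.

```shell
# Hypothetical helper (not part of UdmSearch): succeeds when the host
# part of an http URL ends in ".sk", whatever the path looks like.
is_sk_url() {
    # [^/]* keeps the match inside the host part; -i ignores case;
    # an optional :port followed by "/" or end of string pins ".sk" to the end of the host.
    printf '%s\n' "$1" | LC_ALL=C grep -Eiq '^http://[^/]*\.sk(:[0-9]+)?(/|$)'
}

for url in \
    "http://corp.sk/inet/web_all.html" \
    "http://www.sk.example.com/" \
    "http://example.com/page.sk/"
do
    if is_sk_url "$url"; then
        echo "allow     $url"
    else
        echo "disallow  $url"
    fi
done
```

Only the first URL should be reported as allowed; the other two contain ".sk"
but not as the top-level domain of the host.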





UdmSearch: New message on the WebBoard #1: Unknown column errors

2000-10-25 Thread gluke

Author: gluke
Email: [EMAIL PROTECTED]
Message:
Or you can switch search.php to the old database format by setting the $db_format
variable in config.inc.





UdmSearch: Announce mnoGoSearch-cat_ed-php-1.6

2000-10-25 Thread Sergey Kartashoff

Hi!

  A new version, mnoGoSearch-cat_ed-php-1.6
  (formerly known as UdmSearch-cat_ed-php), is available at our site
  http://search.mnogo.ru

  All users are strongly recommended to upgrade.

  From ChangeLog:

25 October 2000: 1.6
* Fixed major bugs while editing, renaming and deleting symlinks.

-- 
Regards, Sergey aka gluke.






UdmSearch: mnoGoSearch-cat_ed-perl-1.2

2000-10-25 Thread Sergey Kartashoff

Hi!

  A new version, mnoGoSearch-cat_ed-perl-1.2
  (formerly known as UdmSearch-cat_ed-perl), is available at our site
  http://search.mnogo.ru

  From ChangeLog:

25  October 2000: 1.2
* Incorporated cat_ed.php 1.6 patches
* Added MySQL and Oracle connect strings
* Configuration parameters moved into config.pl
* Fixed cgi parameter bug which prevented setting link path 
* Fixed bug in tree optimization routine
* Fixed Undefined subroutine &main::query bug
* Fixed some minor bugs while comparing strings as numbers
* $dbtype 'Pg' changed to psql to comply with cat_ed.php 
  

-- 
Regards, Sergey aka gluke.






UdmSearch: New message on the WebBoard #1: Indexing whole TLD

2000-10-25 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
It seems that something is wrong with your indexer.conf. You may send it to the
discussion list and we'll check it.





Re: UdmSearch: MySQL RAID...

2000-10-25 Thread Alexander Barkov

I found that this message had not been answered.

Using MySQL's RAID ability does not seem to require any changes in the
search engine code. You just need to install MySQL properly and tell it to
use RAID.

If you mean the MERGE tables introduced in MySQL 3.23.25, they would require
changes in our code, because the parts of a MERGE table have separate
indexes. What is the idea behind using them in a search engine?


Nefer wrote:
> 
> Hmm... Do you plan to implement this ability?
>




Re: UdmSearch: are regexps in Allow/Disallow case sensitive?

2000-10-25 Thread Alexander Barkov

Hi!

We've added case-sensitive URL control commands to the 3.1.8 sources.
They will be available soon.


Peter Hanecak wrote:
> 
> Hello,
> 
> I'm trying to block URLs which have some upper case character(s) in host
> name part because then there are two identical documents with different
> URLs in my database. I tried this:
> 
> Disallow http://.*[A-Z]*.*\.[sS][kK]/
> 
> but:
> 
> [hany@m1 ~]$ indexer -v 6 -n 1 -m -i -a -u http://WWW.MEGALOMAN.SK/
> Indexer[1691]: indexer from UdmSearch v.3.0.23/PgSQL started with '/etc/indexer.conf'
> Indexer[1693]: [1] http://WWW.MEGALOMAN.SK/
> Indexer[1693]: [1] 'Disallowhttp://.*[A-Z]*.*\.[sS][kK]/'
> Indexer[1693]: [1] Done
> [hany@m1 ~]$ indexer -v 6 -n 1 -m -i -a -u http://www.megaloman.sk/
> Indexer[1695]: indexer from UdmSearch v.3.0.23/PgSQL started with '/etc/indexer.conf'
> Indexer[1697]: [1] http://www.megaloman.sk/
> Indexer[1697]: [1] 'Disallowhttp://.*[A-Z]*.*\.[sS][kK]/'
> Indexer[1697]: [1] Done
> 
> Is there a way (except hacking sources) to avoid indexing URLs like:
> 
> WWW.MEGALOMAN.SK
> www.MEGALOMAN.sk
> 
> but still index URL www.megaloman.sk ?
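
One thing worth checking in the quoted pattern, independent of how indexer
treats case: "[A-Z]*" also matches zero characters, so the expression does not
require an upper-case letter at all. A minimal sketch with grep, which is an
assumption here (indexer may use a different regexp library and may compare
case-insensitively in 3.0.23); the helper name and the stricter variant are
hypothetical:

```shell
# The pattern from the message: "[A-Z]*" may match the empty string.
original='http://.*[A-Z]*.*\.[sS][kK]/'
# Hypothetical stricter variant: require at least one upper-case letter
# in the host part (between "http://" and the next "/").
strict='^http://[^/]*[A-Z][^/]*/'

matches() {
    # $1 = extended regular expression, $2 = URL.
    # LC_ALL=C keeps [A-Z] limited to ASCII upper-case letters.
    printf '%s\n' "$2" | LC_ALL=C grep -Eq "$1"
}

for url in "http://WWW.MEGALOMAN.SK/" "http://www.megaloman.sk/"
do
    if matches "$original" "$url"; then echo "original matches $url"; fi
    if matches "$strict" "$url"; then echo "strict matches   $url"; fi
done
```

With grep, the original pattern matches both URLs while the stricter one
matches only the upper-case host. Whether the stricter form helps inside
indexer.conf depends on indexer comparing case-sensitively, which the reply
above says comes with 3.1.8.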




Re: UdmSearch: Cor blimey - performance!

2000-10-25 Thread Alexander Barkov

Hi!

I uploaded the script to our site, in the "Tools and Front-ends" part
of the Download page.

Joe, thanks!

Joe Frost wrote:
> 
> I have UDMSearch installed on an old P166/64MB and after experiencing
> problems with DBMode cache (I will get you some debug info from my test
> server Alexander), I decided to configure the server to use DBMode
> crc-multi.
> Once I had indexed my 100,000 documents I was getting search times of 3
> minutes for a three word query :o( so I had a look at the
> performance.txt file included with 3.1.7, originally written by Randy
> Winch <[EMAIL PROTECTED]>.
> I decided to try it out and had a look through some of the docs on the
> MySQL site and eventually came up with a script to optimise the tables
> and the same search now takes about 4 seconds :o)
> 
> The script has only been used on RedHat 6.2 / UDMSearch 3.1.7 / MySQL
> 3.22.23 but if you want to use it then feel free
> 
> Here it is:
> 
> # getyn function to provide a yes/no prompt before running the script
> getyn () {
>     while echo -n "$* (y/n) ? " >&2
>     do
>         read yn rest
>         case $yn in
>             [yY]) return 0 ;;
>             [nN]) return 1 ;;
>             *) echo -n "Please answer y or n, " >&2 ;;
>         esac
>     done
> }
> ###
>
> # Warn the user about database corruption before proceeding
> echo -n "Warning, "
> getyn "executing this script could destroy your database, do you wish to continue?" || exit
>
> set -x
>
> # Define the system parameters
> MYSQLPATH=/home/mysql
> BINPATH=$MYSQLPATH/bin
> DATAPATH=$MYSQLPATH/var
> DATANAME=udmsearch
>
> # Loop through the table names for crc-multi mode UDMSearch indexes
> for TABLE in 2 3 4 5 6 7 8 9 10 11 12 16 32 ;
> do
>     # Export the data from the current table to a text file
>     echo "select * from ndict$TABLE into outfile '$DATAPATH/ndict$TABLE.txt';" | $BINPATH/mysql $DATANAME
>
>     # Sort the data into word order (column two sort) and output it to a .srt file
>     sort -k 2 $DATAPATH/ndict$TABLE.txt -o $DATAPATH/ndict$TABLE.srt
>
>     # Remove the original text file
>     rm -f $DATAPATH/ndict$TABLE.txt
>
>     # Drop the current table - in MySQL this actually deletes the table from disk
>     echo "DROP TABLE ndict$TABLE;" | $BINPATH/mysql $DATANAME
>
>     # Re-create the table based on the crc-multi create file
>     echo "CREATE TABLE ndict$TABLE ( url_id int(11) DEFAULT '0' NOT NULL, word_id int(11) DEFAULT '0' NOT NULL, intag tinyint(4) DEFAULT '0' NOT NULL, KEY url_id$TABLE (url_id), KEY word_id$TABLE (word_id) );" | $BINPATH/mysql $DATANAME
>
>     # Flush the tables
>     $BINPATH/mysqladmin flush-tables
>
>     # Disable indexes
>     $BINPATH/isamchk --keys-used=0 -rq $DATAPATH/$DATANAME/ndict$TABLE
>
>     # Restart the database server
>     /etc/rc.d/init.d/mysql stop
>     sleep 5
>     /etc/rc.d/init.d/mysql start
>     sleep 5
>
>     # Load the sorted text file into the table
>     echo "LOAD DATA INFILE '$DATAPATH/ndict$TABLE.srt' INTO TABLE ndict$TABLE;" | $BINPATH/mysql $DATANAME
>
>     # Restart the database server
>     /etc/rc.d/init.d/mysql stop
>     sleep 5
>     /etc/rc.d/init.d/mysql start
>     sleep 5
>
>     # Rebuild indexes
>     $BINPATH/isamchk -r -q $DATAPATH/$DATANAME/ndict$TABLE
>
>     # Flush the tables
>     $BINPATH/mysqladmin flush-tables
>
>     # Delete the sorted text file
>     rm -f $DATAPATH/ndict$TABLE.srt
> done




Re: UdmSearch: Performance Tweaks

2000-10-25 Thread Alexander Barkov

Hello!

Paul Stewart wrote:
> I'm going to upgrade to the new cache version shortly... recently I upgraded
> my server to a PIII-500 with 256 meg of RAM.  Unfortunately the server has
> IDE hard drives however they are fast for IDE and MySQL resides on a
> dedicated drive which helps.
> 
> My questions are as follows:
> 
> If I was to load a portion of MySQL tables into a RAM Drive for better
> performance, which files (using crc-multi) would offer me the greatest speed
> improvement.  My database directory sits at almost a gig right now and I was
> hoping to load 100-150 meg of data into a RAM drive IF it helps:)

I think the tables that contain the most frequently requested words:
ndict4, ndict5 and ndict6.
You may also turn on query tracking and check the statistics later.
Also consider optimizing the tables as described in
doc/performance.txt


> How does the new cache mode work?  Does it only cache up queries that were
> previously done or does it build a "master" cache of possible combinations
> which then gives a big speed hit.  I can't find any docs on it at this time
> but understand I think what it's going to do.

It is described in doc/cachemode.txt


> Is it realistic (given a dual processor machine with fast scsi drives and
> lots of RAM) that a query of four words against UdmSearch with 10 million
> entries could be completed in under 5 seconds?  Just a benchmark to give me
> a feel:)

Take a look at our test site running in "cache mode". It holds
4.3 million documents.
http://udm.aspseek.com/cgi-bin/search.cgi


> Does anyone have a list of configuration options that are good to pass into
> MySQL 3.23 for best performance with large databases and UDMSearch?  I'm not
> very literate at MySQL and realize that my slowness problems are because of
> MySQL yet I'm told it can perform excellent with tweaking..

It is fully described in the MySQL documentation at
http://www.mysql.com/


> Currently I only have 100,000 entries and if I search for more than a couple
> of words the performance drags into 45-60 seconds for a response which won't
> work for my application.
> 
> Try it if you like www.canadian-links.com  and you can see what I mean..:)
> The server is moving tomorrow morning (EST) so will be down for about an
> hour as it travels to a co-lo nearby..


Looks very nice. I tested a couple of queries and they completed
within 1 or 2 seconds.

   Any chance of adding "Powered by"?   :-)




Re: UdmSearch: possible bug in indexer

2000-10-25 Thread Alexander Barkov

Gene King wrote:
> 
> I am using indexer from UdmSearch v.3.1.7/MySQL
> http://search.mnogo.ru (C) 1998-2000, UdmSearch Developers Team.
> 
> Where I have loaded the URL's to index in mySQL server table. I wanted to
> index only valid words but I got no words if I used IspellIncorrectFactor
> 0.
> 
> I am using mySQL table for the spell checking but I do not see in
> indexer.conf where I need to say that. There is a reference in search.html
> for it.
> 
> What am I doing wrong or is there a bug?


indexer can't take ispell data from the database. Put the ispell files on
disk and use the Affix and Spell commands.




Re: UdmSearch: searching private lists

2000-10-25 Thread Alexander Barkov

Hi!

Nick Marouf wrote:
> the setup I have is as follows,
> in udm search config I set it to index
> www.test.edu/archive
> and that does the public archives for my lists just fine.
> 
> so for private mailing lists I set it to index
> www.test.edu/mailman/private
> and that won't work because it asks for a listname. If I put a list
> name, (time comsuming for many lists) I am stuck with username and
> passwd. there is a way in udmsearch to authenticate using AuthBasic
> butthat is not working with mailman.
> 
> I though that I could do an apache ScriptAlias so that I can get to
> the private index list  with something like
> www.test.edu/privatearchives/  this would work to index pages in
> udmsearch. But I don't want those archives public, so I put a
> htaccess passwd on it. and I can use the udmsearch capabilites to
> authenticate
> with htaccess.
> 
> so now udm search lists the urls it found as
> 
> http://www.test.edu/privatearchive/adms/October2000/msg01116.html
> 
> I click on it, is asks for the htaccess passwd. And that won't work
> for people.
> 
> What are people doing to search private lists? I searched the archives
> and wasn't able to come up with much.
> 
> Thanks a lot for any help


You may try putting this in indexer.conf:

Server http://user:[EMAIL PROTECTED]/privatearchive/
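
For reference, credentials given in a URL this way are normally sent as an
HTTP Basic Authorization header, which is what an htaccess-protected directory
typically expects. A minimal sketch of that encoding, assuming standard Basic
authentication; the helper name and the example credentials "user" and
"secret" are made up, and the base64 utility is assumed to be installed:

```shell
# Hypothetical helper: build the Authorization header that HTTP Basic
# authentication uses for a given user name and password.
basic_auth_header() {
    # Basic auth sends base64("user:password"); the inner printf avoids
    # adding a trailing newline to the string being encoded.
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header user secret
# prints: Authorization: Basic dXNlcjpzZWNyZXQ=
```

This only covers how indexer fetches the pages. People who click a search
result will still be asked for the htaccess password by the web server.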