2 Questions: 1. Full-text Search 2. Loading 1Gb Data

2006-08-18 Thread AmirBehzad Eslami
Dear list,

I'm programming a PHP-driven Search Engine for a newspaper.

Full-text Search with MyISAM Tables, MySQL 4.1.11, PHP 4.3.0
1GB of Text encoded by UTF-8
An average of 1Mbyte Data is inserted to database every day

A common SQL-Query:
SELECT COUNT(*) FROM news_archive
WHERE
 MATCH(news_title, news_text) AGAINST('%s' IN BOOLEAN MODE)


1st Question:
Do you recommend any setting to improve the speed?
Is there anything which I should keep in mind?

2nd Question:
Before this project, I was using phpMyAdmin to backup/restore my database. But 
today I need to load 1Gb of data into my tables. There are two problems with 
the phpMyAdmin method: Time-out and Max Post-size.
Is there a better way to perfrom this task?
 
Thank you in advance,
Behzad

P.S.
Database Scheme:

-- Table structure for table `news_archive`
-- 

CREATE TABLE `news_archive` (
  `news_id` int(10) unsigned NOT NULL auto_increment,
  `news_type_id` int(10) unsigned NOT NULL,
  `news_date` mediumint(8) unsigned NOT NULL default '0',
  `news_title` text collate utf8_persian_ci NOT NULL,
  `news_text` text collate utf8_persian_ci NOT NULL,
  `news_file` varchar(100) character set ascii NOT NULL,
  PRIMARY KEY  (`news_id`),
  KEY `news_date` (`news_date`),
  KEY `news_type_id` (`news_type_id`),
  FULLTEXT KEY `news_title` (`news_title`,`news_text`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci COMMENT='News 
Archive';

-- 

-- 
-- Table structure for table `news_subtypes`
-- 

CREATE TABLE `news_subtypes` (
  `news_subtype_id` int(10) unsigned NOT NULL,
  `news_type_id` int(10) unsigned NOT NULL,
  PRIMARY KEY  (`news_subtype_id`),
  KEY `news_type_id` (`news_type_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci;

-- 

-- 
-- Table structure for table `news_types`
-- 

CREATE TABLE `news_types` (
  `type_id` int(10) unsigned NOT NULL auto_increment,
  `type_name` varchar(100) collate utf8_persian_ci NOT NULL,
  PRIMARY KEY  (`type_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci COMMENT='News 
Types';



-
Want to be your own boss? Learn how on  Yahoo! Small Business. 

Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-27 Thread AmirBehzad Eslami
Mohsen wrote:
  
   But himself solved his problem.
   with : mysql_query(SET NAMES utf8);
   Even 4.0.x
  
  Wrong. 
  
  I decided to prepare two different versions for my software:
  - A MySQL 4.0-friendly version using Romanizing method (Hats off to you, 
Ehsan)
  - A MySQL 4.1-compatible version.
  
  The code you mentioned belongs to the 2nd version.
  
   SET NAMES indicates what is in the SQL statements that the client  sends. 
Thus, SET NAMES 'cp1251' tells the server “future incoming  messages from this 
client are in character set cp1251.” It also  specifies the character set for 
results that the server sends back to  the client. (For example, it indicates 
what character set column values  are if you use a SELECT statement.) 
  
  MySQL Manual 4.1 - 10.3.6. Connection Character Sets and Collations.
  
  Kind Regards,
  Behzad
  



-
 Yahoo! Music Unlimited - Access over 1 million songs. Try it free.

MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-24 Thread AmirBehzad Eslami
Dear list,
  
  I'm considering programming a simple Search Engine for a website,
  to find Arabic/Persian data within a MySQL database.
  This database contains a huge amount of data, encoded with Unicode (UTF-8). 
  
  
  The big deal is to ** reduce the response time ** to end-users.
  
  My first solution is to create an Index and use the FULL-TEXT Searching 
method.
  
  Luckily, MySQL's provides FULL-TEXT Indexing support in MyISAM tables.
  But unfortunately, it doesn't support multi-byte charsets (e.g. Unicode). [1]
  Technically, MySQL creates Indexes over words.
  A word'' is any sequence of characters consisting of letters and numbers [2].
  
  Assuming this, I tried to save the records as Unicode Character References 
(#;), but the search failed again :-(
  
  Any suggestion?
  I appreciate any solution to solve this problem.
  
  Thanks in Advance,
  Behzad
  
  
  [1] MySQL Manual - 6.8.3 Full-text Search TODO
  [2] MySQL Manual - 6.8 MySQL Full-text Search
  
  
  P.S.
  
  I use MySQL 4.0
  
  1) Table Strucutre
  
  CREATE TABLE `articles` (
`article_id` int(10) unsigned NOT NULL auto_increment,
`article_title` NATIONAL varchar(255) NOT NULL default '',
`article_text` text NOT NULL,
PRIMARY KEY  (`article_id`),
FULLTEXT (`article_title`,`article_text`)
  ) TYPE=MyISAM ;
  
  ALTER TABLE `articles` CHARACTER SET ut8;
  
  2) SQL-Query to Perform a Full-text search
  
  SELECT * FROM articles WHERE MATCH(article_title, article_text) 
AGAINST('#1587;#1608;#1575;#1604;')
  
  


-
 Yahoo! Music Unlimited - Access over 1 million songs. Try it free.

Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-24 Thread AmirBehzad Eslami
On 24/11/2005, Alec worte:
  
   I think this is your problem: MySQL does not properly support Unicode 
   until version 4.1. I am successfully using FullText with MySQL 4.1 to sort 
   UTF-8 encoded Japanese text. I see no reason why it should not work for 
   Arabic - if you upgrade.
  
  Dear Alec,
  Thank you for your prompt reply.
  
  You're right. That's my problem. I admit it.
  
  But I'm really unable to solve this by upgrading.
  Many of the Hosting Companies, which I use their services [even the 
HostRocket.com] still use MySQL 4.0 !!!
  
  1) Would you recommand any hosting company with PHP 5 and MySQL 4.1 support?
  
  2) What about if my client only use MySQL 4.0 for his reasons. In this  case, 
I really can't use FULL-TEXT search? There is no any solution?
  
  Once again, thank you for your reply.
  Behzad



-
 Yahoo! Music Unlimited - Access over 1 million songs. Try it free.