Problem in Fulltext Search

2005-08-05 Thread Suryya Ghosh
Hi,

We have a table containing more than 15 million rows of data, can anybody 
please help in this problem of fulltext search described below.

The following query is giving a good result in terms of query time.

select field1, field2
from tblMerge
where
 MATCH(field1)
 AGAINST('food' IN BOOLEAN MODE)
Order By field2 DESC LIMIT 500

when I want to search for more that one word or phrase like food, locomotive, 
water tank then the query is not working up to the desired level. The following 
query is taking not less than 90 to 100 seconds that is 1.5 minutes

select field1, field2
from tblMerge
where
 MATCH(field1)
 AGAINST('food' IN BOOLEAN MODE)
 Or
 MATCH(field1)
 AGAINST('locomotive' IN BOOLEAN MODE)
 Or
 MATCH(field1)
 AGAINST('water' IN BOOLEAN MODE)
Order By field2 DESC LIMIT 500

I have also tried with another version of the same previous query which is also 
not giving a desired performance.  The following query is taking not less than 
90 to 100 seconds that is 1.5 minutes

select field1, field2
from tblMerge
where
 MATCH(field1)
 AGAINST('food locomotive water tank' IN BOOLEAN MODE)
Order By field2 DESC LIMIT 500

In these cases, I think, I am not getting the advantage of full text search, 
because still it is taking a lot of time to execute the query. Can I optimise 
the database to few step forward such that the query time can be reduced for 
multiple words of phrases, please let us know can the query be optimized to 
provide a better performance?


Regards,

Suryya


Consultancy for MySQL database optimization against substantial financial remuneration

2005-07-14 Thread Suryya Ghosh
Hi ,

We are a company specialized in developing tools for Internet Marketing domain.
We are developing a product for which we need consultancy services from 
professionals speacilized in Mysql server optimization and tuning. 
Adequate financial remuneration will not be problem for effective solution 
provider.

Below the nature of our application and problems  faced are discussed in detail.

We are developing an application that needs to use a massive back-end
database. The database will contain around 75 million rows with around
80 columns per row. We would prefer to use MySQL as the database
platform as it is free. The MySQL database is hosted on a
dedicated server that has been purchased from a web hosting company.

This database would be used both by our customers and by our own
employees.

The first column will contain some text which will be unique in each
row. 90% of the remaining columns will containing numbers and the other
columns will contain text.

The second column will contain numbers and it needs to be updated on a
monthly basis. But, we also need to store historical data regarding the
value of the second column for each row for the last 24 months, on a
rolling basis. This can either be done by adding more columns to the
same table, or by putting this historical data in a separate table,
depending on your recommendations.

Users will make 2 types of queries on this database:

i) The first type of query is what can be called a mission-critical
query - these queries will be made by our customers and the results of
these queries must be returned within 30 seconds at the most; otherwise,
customers are not going to want to use the application. This query would
basically involve asking the customer for a search string, searching the
FIRST column (and ONLY the first column) of the entire database to find
out each row that contains that search string (either in whole or in
part) and then returning all such rows to the user sorted in descending
order of the SECOND column. Only the information in the first 2 columns
will be returned to the customers - the information in the other 78
columns will not be returned to the customers. Customers will also have
the option of specifying negative matches - i.e. if the first column of
a particular row contains any one of a list of banned words or phrases,
then that row will not be returned even if it contained the primary
search string.

ii) The second type of queries are non-mission-critical; these would be
run by our employees and it is ok if these queries take as much as 10
minutes to return results. However, the queries that our employees will
run are also much more complex - they will specify multiple search
criteria - for instance, return all rows for which the 60th column has
a value  2000 and the minimum value for the columns 40, 41, ... 50 for
that row is 20 and the 35th column of that row is  5 etc.

It is quite possible that as many as 20 - 30 users will be querying the
database at the same time. Furthermore, there will be 5 - 6 different
PHP scripts that are going to constantly update the different columns
and rows of the database with the values.

Now we have hired a server with the following configuration:
Server:  Dual Xeon 2.8 GHz  
Secondary Processor:  Second Xeon Processor
Primary HDD:  73 GB SCSI
Secondary HDD:  None
Third HDD:  None
RAM:  ECC Registered 1024MB RAM
Number of ips:  10 IP Addresses
Bandwidth:  2000 GB Bandwidth
Uplink Port Speed:  100 Mbps Uplink
Database:  MySQL 4.1.11-standard
Backup Service:  Network Backup
Operating System:  Red Hat Enterprise Linux, Version 3
Drive Controller:  SCSI
Chassis Control:  DRAC Card

We are executing certain queries but they are taking too long a time. 
Could you help us in tuning the SQL queries and the MySQL database such
that the query time reduces and we get a fatser and more efficient
database?

Do let us know what is the timeframe required for this and also please
indicate your charges for the same. Please send us your profile and
details of some assignments that you have carried out.

Awaiting your response,

Regards
Suryya Ghosh


Mysql installation information

2005-04-01 Thread Suryya Ghosh
Hi,

I have gone through the sites which you have recomended where it is suggested 
that the exact version depends on the requirement and the hardware.

According to our hardware  software specification which is given below, I 
think we should use (Linux (x86, glibc-2.2, static, gcc),Standard 4.1.10a )  
version of  mysql server.

I have no previous experience regarding installing mysql so it will be a great 
help if  let me know your openion about it.
Can you please review the URL http://dev.mysql.com/downloads/mysql/4.1.html and 
recomend the exact version which will go best with the server.

Server Configuration
-
Server:  Dual Xeon 2.8 GHz ( X86,32 bit,533MHz FSB,512KB L2 cache)
Primary HDD:  73 GB SCSI
RAM:  ECC Registered 1024MB RAM
Bandwidth:  2000 GB Bandwidth
Uplink Port Speed:  100 Mbps Uplink
Operating System:  Red Hat Enterprise Linux, Version 3 (recommended)
Drive Controller:  SCSI
Chassis Control:  DRAC Card


With Regards,

Suryya


Mysql installation

2005-03-31 Thread Suryya Ghosh
Hi,

We have server with the following hardware specification:

 Server:  Dual Xeon 2.8 GHz
 Primary HDD:  73 GB SCSI
 RAM:  ECC Registered 1024MB RAM
 Bandwidth:  2000 GB Bandwidth
 Uplink Port Speed:  100 Mbps Uplink
 Operating System:  Red Hat Enterprise Linux, Version 3 (recommended)
 Drive Controller:  SCSI
 Chassis Control:  DRAC Card

Can any body suggest which exact version of mysql will give optimum 
performance in this hard ware, will an RPM installation will be the best one we 
should  compile it from source using an particulat gcc compiler.

With Regards,

Suryya

 

Consultancy for MySQL database design against substantial financial remuneration

2005-03-17 Thread Suryya Ghosh
Hi,

We are developing an application that needs to use a massive back-end
database. The database will contain around 75 million rows with around
80 columns per row. We would prefer to use MySQL as the database
platform as it is free. The MySQL database would be hosted on a
dedicated server that we will purchase from a web hosting company.

This database would be used both by our customers and by our own employees.

The first column will contain some text which will be unique in each
row. 90% of the remaining columns will containing numbers and the
other columns will contain text.

The second column will contain numbers and it needs to be updated on a
monthly basis. But, we also need to store historical data regarding
the value of the second column for each row for the last 24 months, on
a rolling basis. This can either be done by adding more columns to the
same table, or by putting this historical data in a separate table,
depending on your recommendations.

Users will make 2 types of queries on this database:

i) The first type of query is what can be called a mission-critical
query - these queries will be made by our customers and the results of
these queries must be returned within 30 seconds at the most;
otherwise, customers are not going to want to use the application.
This query would basically involve asking the customer for a search
string, searching the FIRST column (and ONLY the first column) of the
entire database to find out each row that contains that search string
(either in whole or in part) and then returning all such rows to the
user sorted in descending order of the SECOND column. Only the
information in the first 2 columns will be returned to the customers -
the information in the other 78 columns will not be returned to the
customers. Customers will also have the option of specifying negative
matches - i.e. if the first column of a particular row contains any
one of a list of banned words or phrases, then that row will not be
returned even if it contained the primary search string.

ii) The second type of queries are non-mission-critical; these would
be run by our employees and it is ok if these queries take as much as
10 minutes to return results. However, the queries that our employees
will run are also much more complex - they will specify multiple
search criteria - for instance, return all rows for which the 60th
column has a value  2000 and the minimum value for the columns 40,
41, ... 50 for that row is 20 and the 35th column of that row is  5
etc.

It is quite possible that as many as 20 - 30 users will be querying
the database at the same time. Furthermore, there will be 5 - 6
different PHP scripts that are going to constantly update the
different columns and rows of the database with the values.

Here are my questions:

i) Is MySQL a realistic option for this kind of database?

ii) What should be the hardware configuration (processor type, number
of processors, RAM etc.) for the dedicated server that will host this
database. We are interested in the most cost-effective option; i.e.
the least powerful hardware configuration that is suitable for running
this system.

iii) What is the best way of designing the database architecture in
order to ensure that we are able to meet our targets regarding the
query times?

I might need to ask some follow-up questions based on your initial response.

Please be extremely detailed and specific in answering the questions,
especially the third question.

For the third question, please suggest the table structure including
whether we should keep all the data in a single table or whether we
should use multiple tables.

If we use multiple tables, how many such tables should we use, and
which columns should be present in each table? Also, we need to know
what should be the primary key, the unique key etc. for each table and
how the indexes should be defined.

Alternatively, if you recommend that we break up the database into
multiple tables not in terms of columns but in terms of rows (i.e.
keeping a fixed number of rows per table), then what is the ideal
number of rows that should be present in each table?

And, depending on your recommendation regarding the database
structure, please advise us regarding how we should build the queries
for the 2 types of queries that we need to support.

I would prefer that someone who has actual experience designing a
MySQL database with similar or even more demanding characteristics and
requirements answer my query.

We are looking for the right persion who can come up with a consultancy. 
Remunarations will not be any constraint for some body who can provide the 
right solution.

If you need any clarifications from us before you can answer the
question, feel free to ask.

Suryya

Hardware and Architectural Configuration

2004-11-29 Thread Suryya Ghosh
HI,

We are working on a software tool for keyword analysis. 

Our proposed software will have Mysql as the backend database server.
The proposed database will contain atleast 80 million rows and there will a lot 
of concurrent search (string search) and update operation by the software on 
the database.

Can anybody help us by recomending a suitable hardware and architectural 
configuration for the the database.

With Regards,

Suryya