Problem in Fulltext Search
Hi, We have a table containing more than 15 million rows of data, can anybody please help in this problem of fulltext search described below. The following query is giving a good result in terms of query time. select field1, field2 from tblMerge where MATCH(field1) AGAINST('food' IN BOOLEAN MODE) Order By field2 DESC LIMIT 500 when I want to search for more that one word or phrase like food, locomotive, water tank then the query is not working up to the desired level. The following query is taking not less than 90 to 100 seconds that is 1.5 minutes select field1, field2 from tblMerge where MATCH(field1) AGAINST('food' IN BOOLEAN MODE) Or MATCH(field1) AGAINST('locomotive' IN BOOLEAN MODE) Or MATCH(field1) AGAINST('water' IN BOOLEAN MODE) Order By field2 DESC LIMIT 500 I have also tried with another version of the same previous query which is also not giving a desired performance. The following query is taking not less than 90 to 100 seconds that is 1.5 minutes select field1, field2 from tblMerge where MATCH(field1) AGAINST('food locomotive water tank' IN BOOLEAN MODE) Order By field2 DESC LIMIT 500 In these cases, I think, I am not getting the advantage of full text search, because still it is taking a lot of time to execute the query. Can I optimise the database to few step forward such that the query time can be reduced for multiple words of phrases, please let us know can the query be optimized to provide a better performance? Regards, Suryya
Consultancy for MySQL database optimization against substantial financial remuneration
Hi , We are a company specialized in developing tools for Internet Marketing domain. We are developing a product for which we need consultancy services from professionals speacilized in Mysql server optimization and tuning. Adequate financial remuneration will not be problem for effective solution provider. Below the nature of our application and problems faced are discussed in detail. We are developing an application that needs to use a massive back-end database. The database will contain around 75 million rows with around 80 columns per row. We would prefer to use MySQL as the database platform as it is free. The MySQL database is hosted on a dedicated server that has been purchased from a web hosting company. This database would be used both by our customers and by our own employees. The first column will contain some text which will be unique in each row. 90% of the remaining columns will containing numbers and the other columns will contain text. The second column will contain numbers and it needs to be updated on a monthly basis. But, we also need to store historical data regarding the value of the second column for each row for the last 24 months, on a rolling basis. This can either be done by adding more columns to the same table, or by putting this historical data in a separate table, depending on your recommendations. Users will make 2 types of queries on this database: i) The first type of query is what can be called a mission-critical query - these queries will be made by our customers and the results of these queries must be returned within 30 seconds at the most; otherwise, customers are not going to want to use the application. This query would basically involve asking the customer for a search string, searching the FIRST column (and ONLY the first column) of the entire database to find out each row that contains that search string (either in whole or in part) and then returning all such rows to the user sorted in descending order of the SECOND column. Only the information in the first 2 columns will be returned to the customers - the information in the other 78 columns will not be returned to the customers. Customers will also have the option of specifying negative matches - i.e. if the first column of a particular row contains any one of a list of banned words or phrases, then that row will not be returned even if it contained the primary search string. ii) The second type of queries are non-mission-critical; these would be run by our employees and it is ok if these queries take as much as 10 minutes to return results. However, the queries that our employees will run are also much more complex - they will specify multiple search criteria - for instance, return all rows for which the 60th column has a value 2000 and the minimum value for the columns 40, 41, ... 50 for that row is 20 and the 35th column of that row is 5 etc. It is quite possible that as many as 20 - 30 users will be querying the database at the same time. Furthermore, there will be 5 - 6 different PHP scripts that are going to constantly update the different columns and rows of the database with the values. Now we have hired a server with the following configuration: Server: Dual Xeon 2.8 GHz Secondary Processor: Second Xeon Processor Primary HDD: 73 GB SCSI Secondary HDD: None Third HDD: None RAM: ECC Registered 1024MB RAM Number of ips: 10 IP Addresses Bandwidth: 2000 GB Bandwidth Uplink Port Speed: 100 Mbps Uplink Database: MySQL 4.1.11-standard Backup Service: Network Backup Operating System: Red Hat Enterprise Linux, Version 3 Drive Controller: SCSI Chassis Control: DRAC Card We are executing certain queries but they are taking too long a time. Could you help us in tuning the SQL queries and the MySQL database such that the query time reduces and we get a fatser and more efficient database? Do let us know what is the timeframe required for this and also please indicate your charges for the same. Please send us your profile and details of some assignments that you have carried out. Awaiting your response, Regards Suryya Ghosh
Mysql installation information
Hi, I have gone through the sites which you have recomended where it is suggested that the exact version depends on the requirement and the hardware. According to our hardware software specification which is given below, I think we should use (Linux (x86, glibc-2.2, static, gcc),Standard 4.1.10a ) version of mysql server. I have no previous experience regarding installing mysql so it will be a great help if let me know your openion about it. Can you please review the URL http://dev.mysql.com/downloads/mysql/4.1.html and recomend the exact version which will go best with the server. Server Configuration - Server: Dual Xeon 2.8 GHz ( X86,32 bit,533MHz FSB,512KB L2 cache) Primary HDD: 73 GB SCSI RAM: ECC Registered 1024MB RAM Bandwidth: 2000 GB Bandwidth Uplink Port Speed: 100 Mbps Uplink Operating System: Red Hat Enterprise Linux, Version 3 (recommended) Drive Controller: SCSI Chassis Control: DRAC Card With Regards, Suryya
Mysql installation
Hi, We have server with the following hardware specification: Server: Dual Xeon 2.8 GHz Primary HDD: 73 GB SCSI RAM: ECC Registered 1024MB RAM Bandwidth: 2000 GB Bandwidth Uplink Port Speed: 100 Mbps Uplink Operating System: Red Hat Enterprise Linux, Version 3 (recommended) Drive Controller: SCSI Chassis Control: DRAC Card Can any body suggest which exact version of mysql will give optimum performance in this hard ware, will an RPM installation will be the best one we should compile it from source using an particulat gcc compiler. With Regards, Suryya
Consultancy for MySQL database design against substantial financial remuneration
Hi, We are developing an application that needs to use a massive back-end database. The database will contain around 75 million rows with around 80 columns per row. We would prefer to use MySQL as the database platform as it is free. The MySQL database would be hosted on a dedicated server that we will purchase from a web hosting company. This database would be used both by our customers and by our own employees. The first column will contain some text which will be unique in each row. 90% of the remaining columns will containing numbers and the other columns will contain text. The second column will contain numbers and it needs to be updated on a monthly basis. But, we also need to store historical data regarding the value of the second column for each row for the last 24 months, on a rolling basis. This can either be done by adding more columns to the same table, or by putting this historical data in a separate table, depending on your recommendations. Users will make 2 types of queries on this database: i) The first type of query is what can be called a mission-critical query - these queries will be made by our customers and the results of these queries must be returned within 30 seconds at the most; otherwise, customers are not going to want to use the application. This query would basically involve asking the customer for a search string, searching the FIRST column (and ONLY the first column) of the entire database to find out each row that contains that search string (either in whole or in part) and then returning all such rows to the user sorted in descending order of the SECOND column. Only the information in the first 2 columns will be returned to the customers - the information in the other 78 columns will not be returned to the customers. Customers will also have the option of specifying negative matches - i.e. if the first column of a particular row contains any one of a list of banned words or phrases, then that row will not be returned even if it contained the primary search string. ii) The second type of queries are non-mission-critical; these would be run by our employees and it is ok if these queries take as much as 10 minutes to return results. However, the queries that our employees will run are also much more complex - they will specify multiple search criteria - for instance, return all rows for which the 60th column has a value 2000 and the minimum value for the columns 40, 41, ... 50 for that row is 20 and the 35th column of that row is 5 etc. It is quite possible that as many as 20 - 30 users will be querying the database at the same time. Furthermore, there will be 5 - 6 different PHP scripts that are going to constantly update the different columns and rows of the database with the values. Here are my questions: i) Is MySQL a realistic option for this kind of database? ii) What should be the hardware configuration (processor type, number of processors, RAM etc.) for the dedicated server that will host this database. We are interested in the most cost-effective option; i.e. the least powerful hardware configuration that is suitable for running this system. iii) What is the best way of designing the database architecture in order to ensure that we are able to meet our targets regarding the query times? I might need to ask some follow-up questions based on your initial response. Please be extremely detailed and specific in answering the questions, especially the third question. For the third question, please suggest the table structure including whether we should keep all the data in a single table or whether we should use multiple tables. If we use multiple tables, how many such tables should we use, and which columns should be present in each table? Also, we need to know what should be the primary key, the unique key etc. for each table and how the indexes should be defined. Alternatively, if you recommend that we break up the database into multiple tables not in terms of columns but in terms of rows (i.e. keeping a fixed number of rows per table), then what is the ideal number of rows that should be present in each table? And, depending on your recommendation regarding the database structure, please advise us regarding how we should build the queries for the 2 types of queries that we need to support. I would prefer that someone who has actual experience designing a MySQL database with similar or even more demanding characteristics and requirements answer my query. We are looking for the right persion who can come up with a consultancy. Remunarations will not be any constraint for some body who can provide the right solution. If you need any clarifications from us before you can answer the question, feel free to ask. Suryya
Hardware and Architectural Configuration
HI, We are working on a software tool for keyword analysis. Our proposed software will have Mysql as the backend database server. The proposed database will contain atleast 80 million rows and there will a lot of concurrent search (string search) and update operation by the software on the database. Can anybody help us by recomending a suitable hardware and architectural configuration for the the database. With Regards, Suryya