Re: Relational Databasing on busy webserver
On Mon, Sep 22, 2008 at 5:12 PM, Ben A.H. [EMAIL PROTECTED] wrote:

There are various other fields that I believe could be handled like this for a cumulative performance boost. For example: country, state/province, gender, industry, occupation, ethnicity, and language are all options that aren't going to change very often. Then again, when we do have to display the user's choice — for example, if user.countryID=3, we'd need a way to turn that number 3 into the word Canada when we display the user's profile... I'd probably do this via XML lookup. Has anyone experimented with the benefits/tradeoffs of such a scheme? As I've said, we are dealing with at least 200,000 user records, probably 300,000-400,000 in the next year. The User table contains at least 50 attributes, 15 of which are foreign keys that link to tables that will likely never change (the user's choices for countryID, stateID, OperatingSystemID, internet_connectionTypeID, internetUserFrequencyID, industryID, occupationID)...

Hi Ben,

This is a very common issue in web database development. The question is really whether these lookup tables should exist (logically) in the database or in the web scripts. For example, for the 50 states, a lot of programmers would put this logic in the web script and just store the two-letter postal code in the database table, with no separate table for states. The mapping from MI to Michigan would occur in the web scripts, as would the logic to know that XQ is not a state. This is done all the time. As long as the database is indexed properly for the queries you want to run, there is not usually a severe performance hit. Either approach is fine, but the web-script approach is usually easier.

I have just one piece of advice: store something in the database that won't change when you update the fixed tables in the web scripts. For example, a two-letter postal state code is fine, because when they add the 51st state they won't change any of the existing codes.
Similarly, use M, F, etc. for sex. Don't store anything in the database that is tied to, for example, the integer index of a table entry in the web script, or that is represented in some other way such that modifying the script would require altering the database as well; otherwise you'll end up writing a separate script to remap column values in the database before you can use the new scripts. Dave.
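Dave's in-script lookup approach can be sketched as follows. This is a minimal illustration, not the poster's actual code: the database stores only the stable two-letter postal code, and the script maps it to a display name (the dict here is deliberately incomplete).

```python
# In-script lookup table: the database column stores only the stable
# two-letter postal code; the script maps code -> display name.
# Adding a 51st state means adding one dict entry; no existing stored
# codes ever change, so no database remapping is needed.
STATE_NAMES = {
    "MI": "Michigan",
    "OH": "Ohio",
    "CA": "California",
}

def state_name(code):
    """Return the display name for a stored state code, or None when
    the code is not a known state (e.g. 'XQ')."""
    return STATE_NAMES.get(code)
```

The same pattern works for the sex column (`{"M": "Male", "F": "Female"}`) or any other small, nearly static code set.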
Re: Relational Databasing on busy webserver
On Tue, Sep 23, 2008 at 11:13 AM, Olexandr Melnyk [EMAIL PROTECTED] wrote:

On 9/23/08, David Ashley [EMAIL PROTECTED] wrote: For example, for the 50 states, a lot of programmers would put this logic in the web script and just store the two-letter postal code in the database table (but with no separate table for states). The mapping from MI to Michigan would occur in the web scripts; as would the logic to know that XQ is not a state.

Keep in mind that, as you add more countries, maintaining in-script code/name mappings of state codes will become more and more cumbersome. Another problem will arise if you want to view user information, along with full country and state names, from a different programming language. Plus, if the same query is run very often and the table is almost static, chances are high that the result will be in the query cache.

The different-programming-language issue is one I hadn't thought about. Good catch. I program in PHP only, and I run queries manually only when I need to debug something hard. But I could see the issue you cited arising in a more complex environment. I could even see a mixed web application where the web presence is a mixture of PHP, Python, and Perl.
Re: Patch for mysql
On Mon, Sep 22, 2008 at 7:54 AM, Colin Charles [EMAIL PROTECTED] wrote:

Nagaraj S wrote: Hi Gurus, Good Day! How do I apply MySQL bug-fix patches? Is it just a matter of updating or replacing some code in a C file and recompiling the source?

Depending on what your patch looks like, you will use the Unix patch(1) utility or not.

Is anyone able to supply some simple patching instructions? I have a MySQL server running on a Solaris 10 system.

Why not just use the method that Solaris provides to update the MySQL package? Applying patches requires you to recompile MySQL, and if you're still unsure how to apply a patch (and not clear which patch it might be), I'd avoid this method of getting things done.

One possible ambiguity is that patch means different things to different people. To hard-core *nix users, patch strictly means the behavior of the patch program (man patch). However, to certain others, patch may mean any bugfix mechanism (updates). I suspect that the OP used patch when the lingo of the list would have been update.
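For the patch(1) sense of the word, the workflow looks like the sketch below. This is a self-contained toy demo with made-up file names, not a real MySQL patch; against a real source tree you would typically run `patch -p1 < fix.patch` from the top of the tree and then recompile.

```shell
# Toy demo of the Unix patch(1) utility: make a file, generate a
# unified diff describing a fix, then apply that diff to the original.
cd "$(mktemp -d)"
printf 'line one\nline two\n' > orig.c
printf 'line one\nline 2\n'   > fixed.c
diff -u orig.c fixed.c > fix.patch || true   # diff exits 1 when files differ
patch orig.c < fix.patch                     # apply the fix in place
cat orig.c
```

A real bug-fix patch from the list or bug tracker is applied the same way; the only difference is the `-pN` option, which strips leading path components so the file names in the patch match your source tree.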
Re: Appropriate Design
On Wed, Sep 17, 2008 at 5:46 AM, Jeffrey Santos [EMAIL PROTECTED] wrote:

I'm not sure if this is the right list to ask such a question; if not, any direction as to the appropriate forum would be much appreciated! This may be a simple question, but I'm still in the early stages of my learning about databases. I'm developing an application that will require information from various sources. Since what might be considered required information about those sources will vary (occasionally dramatically), I'm unsure as to the correct design for the database tables. In other words, I'm left with (I believe) two choices:

1. Create a source table that contains, among other things, every possible field for a source type, and sort out what is considered required information at the application level.

2. Create a few source tables detailing the required information about each source type. The only way I can think of doing this is something along the lines of: sourceFields: fieldId, fieldName; sourceRequireds: reqId, typeId, fieldId; sourceInfo: infoId, fieldId, sourceId; and then pull only the appropriate data from the database at query time using JOINs.

Which of these options (or a third I have no idea about!) would be appropriate here? If possible, a general conventional-wisdom statement would greatly help my education on these matters! Thank you, Jeffrey Santos

The second solution (multiple tables) is generally the only acceptable one. The general rule (for one table or many) is whether the fields in the single table are fixed in quantity, or are numerous or repetitive enough to be tedious to manipulate. For example, assume you want a database of the people in your state and the cars they own. Clearly, since people vary widely in the number of cars they own, and since each car has similar data, putting fields like car1vin, car2vin, car3vin in the people table is the wrong solution.
Your problem description is rather vague, but the way I'm reading it is that your sources vary widely in the fields that need to be recorded. In that case, you'd generally want a table of sources (each instance of a source), a table of fields (each field that MAY be associated with a source), and a table of sourcefields (the actual data for a given field of a given source). But you really need to solidify the description of your problem, or provide a simple example, so people can really help you. I don't fully understand your description as written. Dave.
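The people-and-cars example above can be sketched concretely. This uses SQLite purely so the snippet is self-contained and runnable; the table and column names are illustrative, and the same DDL works in MySQL. The point is that cars get their own table with a foreign key back to people, instead of car1vin, car2vin, ... columns.

```python
# One-to-many done right: a separate cars table keyed back to people,
# rather than a fixed number of carNvin columns in the people table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cars (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES people(id),
        vin TEXT
    );
""")
conn.execute("INSERT INTO people VALUES (1, 'Alice')")
# Alice can own any number of cars without schema changes:
conn.executemany("INSERT INTO cars (person_id, vin) VALUES (?, ?)",
                 [(1, 'VIN-A'), (1, 'VIN-B'), (1, 'VIN-C')])
rows = conn.execute(
    "SELECT p.name, c.vin FROM people p JOIN cars c ON c.person_id = p.id"
).fetchall()
print(rows)
```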
Re: Appropriate Design
On Wed, Sep 17, 2008 at 11:38 AM, Jeffrey Santos [EMAIL PROTECTED] wrote:

Hi David, My project involves storing information that could have been gathered from wildly different sources. You can think of this part of the database as a sort of bibliography for the sources of that information. If I gathered information from a book, for instance, there are some generally required fields such as Author, Title, etc. A website, on the other hand, would also require fields such as URL and Date Visited, and so on and so forth. I hope that's more clear!

Ah, OK, the essence of your problem is what is called in some programming languages a variant record: one table is related to another table in a one-to-many relationship, but the items in the second table may vary wildly in what needs to be stored. http://en.wikibooks.org/wiki/Programming:Ada:Types:record#Variant_record

The simplest real-life example I can think of would be a record of each citizen of the world and all of their property. There is a one-to-many relationship between people and property. The problem is that each item of property has different attributes: a computer will have different things you want to record than a goat.

I'm not qualified to even guess at this question in the context of MySQL; I'm sure others can be more helpful. The two approaches that come to mind are:

a) For simple problems, give the second table an enumerated discriminator field plus a set of additional fields that is the union of everything required, with some fields left unused depending on the discriminator.

b) A more sophisticated approach involving named attributes.

I've gotta bow out of this one. For all I know, MySQL might support variant records directly. I'm not worthy, I'm not worthy ... Dave.
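Approach (b), named attributes, is often called an entity-attribute-value layout. A rough sketch of it, using SQLite so the example is self-contained (the table and attribute names are made up for illustration): each property item carries a discriminating kind, and its kind-specific attributes live as rows in a second table rather than as fixed columns.

```python
# Named-attribute (entity-attribute-value) sketch for variant records:
# a computer and a goat store entirely different attributes without
# either forcing columns on the other.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property (id INTEGER PRIMARY KEY, kind TEXT);
    CREATE TABLE property_attr (
        property_id INTEGER REFERENCES property(id),
        name  TEXT,
        value TEXT
    );
""")
conn.execute("INSERT INTO property VALUES (1, 'computer')")
conn.execute("INSERT INTO property VALUES (2, 'goat')")
conn.executemany(
    "INSERT INTO property_attr VALUES (?, ?, ?)",
    [(1, 'cpu', 'Pentium'), (1, 'ram_mb', '512'), (2, 'breed', 'Nubian')])
# Reassemble one item's attributes as a dict:
attrs = dict(conn.execute(
    "SELECT name, value FROM property_attr WHERE property_id = 1").fetchall())
print(attrs)
```

The tradeoff is that everything is stored as text and the database can no longer type-check or constrain individual attributes, which is why approach (a) is preferable when the variation is small.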
Re: MySQL not running on fresh LAMP install
On Tue, Sep 16, 2008 at 11:02 AM, Dave M G [EMAIL PROTECTED] wrote:

MySQL List, I'm going through a post-hard-drive-crash process of reinstalling my whole operating system, and MySQL is acting very strange. I installed it as I always have done, by using Ubuntu's Synaptic Package Manager to install all the default packages for a LAMP server. Every time before, this has worked like a dream. It seemed to work when I first installed it yesterday, but today, when trying to access phpMyAdmin, I got this error:

#2002 - The server is not responding (or the local MySQL server's socket is not correctly configured)

I figured I'd give MySQL a bit of a kick just to see if that would help:

[EMAIL PROTECTED]:~$ sudo /etc/init.d/mysqld restart

But it gave me this new and unusual response:

sudo: /etc/init.d/mysqld: command not found

This is only a narrow part of your problem, but did you forget rc.d as part of the path? i.e. /etc/rc.d/init.d/mysqld ... ??
Re: How many key buffer size should be allocate?
On Mon, Sep 8, 2008 at 12:47 AM, Dan Nelson [EMAIL PROTECTED] wrote:

In the last episode (Sep 08), Yi Wang said: According to the MySQL manual, I changed key_buffer_size from 8M to 512M. But in MySQL Administrator, the key buffer hitrate seems to stay unchanged, and key buffer usage is always below 4M. The total db size is about 200M, and the system's load is light. The key buffer size was 8M before the change, and the hitrate is about 100% intermittently.

If your database is only 200M, then your indexes are probably less than half that (if you total up all of your .MYI files, that's the total size of all your indexes). Setting key_buffer_size larger than your total index size doesn't do any good.

The other thing that struck me about this particular application is that with typical RAM sizes for a modern computer, and with a lightly loaded server, the ENTIRE database will effectively be cached: http://en.wikipedia.org/wiki/Page_cache

Because of the small database size, I'm not confident that using the key cache features of MySQL will lead to a significant performance increase. There will be some increase, naturally, because it is cheaper to find index values locally in MySQL than to make the fseek() calls or whatever through the OS. But I also have to consider that the OS calls usually involve an open file handle and primarily integer operations, and are highly optimized. I'm just not sure what kind of performance increase would result from using the key cache features of MySQL for such a small database.

The OP should definitely explore these features, for (a) academic value, (b) future expansion of the database, (c) other databases, (d) in case he is running the database on an older server, and (e) in case the server is not lightly loaded. But in general, for a 200M database ... I'm not sure of the benefit. I've never actually tried any of this, so I could be totally wrong about the performance implications with a small database ... Dave.
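Dan's "total up all of your .MYI files" check can be done with a few lines of script. A small sketch (the datadir path is an assumption; it varies by install and may need adjusting or root access on a real server):

```python
# Estimate total MyISAM index size by summing all .MYI files under the
# MySQL data directory; key_buffer_size larger than this is wasted.
import os

def total_myi_bytes(datadir):
    """Sum the sizes of all .MYI (MyISAM index) files under datadir."""
    total = 0
    for root, _dirs, files in os.walk(datadir):
        for f in files:
            if f.endswith(".MYI"):
                total += os.path.getsize(os.path.join(root, f))
    return total

# "/var/lib/mysql" is a common default datadir, but check your my.cnf.
print(total_myi_bytes("/var/lib/mysql"))
```

The same one-liner idea works from the shell with `du` if you prefer, e.g. summing `*.MYI` under the datadir.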
Re: Large Query Question.
On Thu, Sep 4, 2008 at 10:38 AM, mos [EMAIL PROTECTED] wrote:

Jim, The problem is likely that your index is not defined properly. Use EXPLAIN in front of the query to see if it can use just one index from each table. I would try building a compound index on Products: (RecordReference, FeedId) and on ProductContributors: (RecordReference, RowType). This should get it to execute the join and WHERE clause using just one index from each table. Give that a try and see if it speeds things up. :) Mike

I concur. The SELECT time is going to resemble something like:

K_1 * F_1(number_of_records_in_database) + K_2 * F_2(number_of_records_selected)

If the indices are effective, F_1 = log(N), but if the indices are not effective, F_1 = N. One thing you may want to try to narrow down the problem is retrieving just 100 records (the COUNT clause of a query) and seeing how that affects the speed, then trying the full set and seeing how it differs. If they aren't very different, then it is an F_1 problem. But if they are different, then it is a K_2 / F_2 problem.

As far as K_2 or F_2 problems go ... another possibility is that you are using ORDER BY on a large result set that isn't indexed for an effective sort. Try dropping the ORDER BY and see what happens. My view of how MySQL might work internally is perhaps naive, but sorting can be worst case O(N**2). Dave.

On 3-Sep-08, at 3:02 PM, mos wrote: Jim, Retrieving 100,000 rows will always take some time. Do you really need to return that many rows? Are you selecting just the columns you need? What are the slow queries? Mike

At 12:05 PM 9/3/2008, Jim Leavitt wrote: Greetings List, We have a medium-large size database application which we are trying to optimize, and I have a few questions.
Server Specs: 1 Dual Core 2.6 GHz, 2GB RAM

Database Specs: 51 tables, min 10 rows, max 100 rows, total size approx 2GB

My.cnf:

[mysqld]
set-variable=local-infile=0
log-slow-queries=slow-queries.log
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
old_passwords=1
key_buffer = 512M
max_allowed_packet=4M
sort_buffer_size = 512M
read_buffer_size = 512M
read_rnd_buffer_size = 256M
record_buffer = 256M
myisam_sort_buffer_size = 512M
thread_cache = 128
query_cache_limit = 1M
query_cache_type = 1
query_cache_size = 32M
join_buffer = 512M
table_cache = 512

We are having trouble with certain queries which are returning anywhere from 10 - 30 rows. Total query time is taking approx 1 - 2 mins depending on load. Is there anything in our conf file which could improve our performance? Are there any hardware recommendations that could help us improve the speed? Would more memory help us? Any comments or recommendations are greatly appreciated. Thanks much.

Jim Leavitt
Developer, Treefrog Interactive Inc. (http://www.treefrog.ca)
Bringing the Internet to Life
ph: 905-836-4442 ext 104
fx: 905-895-6561

-- MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
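Mike's compound-index suggestion can be sketched in a self-contained demo. SQLite is used here only so the snippet runs anywhere, and its EXPLAIN QUERY PLAN output differs from MySQL's EXPLAIN, but the check is the same idea: confirm the join and WHERE clause are satisfied by an index rather than a full table scan. Table and column names follow the thread; the data is made up.

```python
# Build the two compound indexes suggested in the thread, then inspect
# the query plan to confirm an index (not a full scan) drives the join.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (RecordReference TEXT, FeedId INTEGER, title TEXT);
    CREATE TABLE ProductContributors (RecordReference TEXT, RowType INTEGER, name TEXT);
    CREATE INDEX idx_products ON Products (RecordReference, FeedId);
    CREATE INDEX idx_contrib  ON ProductContributors (RecordReference, RowType);
""")
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT p.title, c.name
    FROM Products p
    JOIN ProductContributors c ON c.RecordReference = p.RecordReference
    WHERE p.FeedId = 3 AND c.RowType = 1
""").fetchall()
for row in plan:
    print(row)   # look for "USING INDEX idx_..." rather than a bare SCAN
```

In MySQL you would instead run `EXPLAIN SELECT ...` and check the `key` column of each row to see which index, if any, was chosen per table.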
Fwd: Large Query Question.
I concur. The SELECT time is going to resemble something like:

K_1 * F_1(number_of_records_in_database) + K_2 * F_2(number_of_records_selected)

If the indices are effective, F_1 = log(N), but if the indices are not effective, F_1 = N. One thing you may want to try to narrow down the problem is retrieving just 100 records (the COUNT clause of a query) and seeing how that affects the speed, then trying the full set and seeing how it differs. If they aren't very different, then it is an F_1 problem. But if they are different, then it is a K_2 / F_2 problem.

As far as K_2 or F_2 problems go ... another possibility is that you are using ORDER BY on a large result set that isn't indexed for an effective sort. Try dropping the ORDER BY and see what happens. My view of how MySQL might work internally is perhaps naive, but sorting can be worst case O(N**2). Dave.

Addendum: I misremembered the SQL keywords. It isn't COUNT; it is (I think) LIMIT. Also, ORDER BY might be GROUP BY. Oopsie.
Re: Large Query Question.
On Wed, Sep 3, 2008 at 1:05 PM, Jim Leavitt [EMAIL PROTECTED] wrote:

We are having trouble with certain queries which are returning anywhere from 10 - 30 rows. Total query time is taking approx 1 - 2 mins depending on load. Is there anything in our conf file which could improve our performance? Are there any hardware recommendations that could help us improve the speed? Would more memory help us? Any comments or recommendations are greatly appreciated.

Returning 100,000 to 300,000 rows will take some time no matter how you slice it. A more common approach is to make sure the database is organized for O(log N) retrieval, then retrieve only the records you need (the ones you need to display, for example), then execute a second query to get more, then a third, and so on. O(log N) retrieval means indices on the relevant columns, with the database arranged so that equality and ordering tests are implemented using native data types.

What is your application? Do you really need all those rows at one time?
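The fetch-only-what-you-display approach above is usually done with LIMIT/OFFSET paging. A minimal sketch (SQLite here so it is self-contained; the SQL is the same in MySQL, and the table is made up). Note that very large OFFSETs still make the server skip rows, so keyset paging on an indexed column scales better, but OFFSET shows the idea:

```python
# Page through a large result set 100 rows at a time instead of
# returning every row in one query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (name) VALUES (?)",
                 [("item%d" % i,) for i in range(250)])

def fetch_page(conn, page, page_size=100):
    """Fetch one page of results; call again with page+1 for more."""
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, page * page_size)).fetchall()

print(len(fetch_page(conn, 0)))   # first page: a full page of rows
print(len(fetch_page(conn, 2)))   # last page: whatever remains
```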