How to remove stagnant stock returns?

2009-09-08 Thread Jia Chen

Here there,

One of my tables, called mr, looks like this,

codendate   mr
130042199402 0.96
130042199403 0.97
130042199404 0.95
130042199405 1
130042199406 1.1
130042199407 1
130042199408 1
130042199409 1
130043199205  0.98
130043199206  1.01
130043199207  1.03
130043199208  1
130043199209  1.02
130043199210  1
130043199211  1
130043199212  1
130044199801  0.89
130044199802  1.02
...

code is an identifier of firms, ndate is a column of year and month, and 
mr is gross monthly stock return. This table is a time series of monthly 
returns for many firms. The problem of this data is that some firms have 
stagnant returns (mr equal to 1) for some months in the ending part of 
the firm's return series. For example, months 199407, 199408, and 199409 
for firm 130042, and months 199210, 199211, 199212 for firm 130043.  The 
reason for this error is that people who construct the return series 
keep adding 1's to the return series after the firm dies or gets delisted.


I need to remove these stagnant returns from the end of monthly return 
time series up to the  last non-stagnant return (mr not equal to 1).  
Therefore, I want the table mr to look like this


codendate   mr
130042199402 0.96
130042199403 0.97
130042199404 0.95
130042199405 1
130042199406 1.1
130043199205  0.98
130043199206  1.01
130043199207  1.03
130043199208  1
130043199209  1.02
130044199801  0.89
130044199802  1.02
...

Notice that for '130042-199405' and '130043-199208', the returns are 
also 1's, but I don't want to remove them because they are meaningful 
observations before the firm dies or gets delisted.  Only the stagnant 
returns (mr equal to 1) from the end of return series up to the last 
non-stagnant return need to be removed.


What is a good way of doing this in MySQL? Thanks.

I have not figured out a way yet, but I am looking into two potential 
solutions:


1, user-defined variables.

2. select non-1 returns to a new table, find out the date of the last 
non-1 return for each firm, say last_day, and add to the new table the 
returns equal to one's and before last_day for each firm.


Best,
Jia

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: How to optimize a slow query?

2009-09-07 Thread Jia Chen

Hi Mike,

Thanks for you help!

Best,
Jia

mos wrote:

Jia,
 The code you sent seems to be able to get the job done.

You could try something simpler by executing 2 sql statements instead 
of using one. Something like:


create table rmpdata1
select ri.*, mv.* from  RItime as ri left join MVtime as mv on 
(ri.code=mv.code and ri.ndate=mv.ndate));


insert into rmpdata1
select ri.*, mv.* from  MVtime as mv left join RItime as ri  on 
(ri.code=mv.code and ri.ndate=mv.ndate)) where ri.code is null and 
ri.date is null;


You will of course have to play with the column list to avoid 
duplicate columns. The Insert statement will insert rows from mv that 
are missing in ri.


Mike

At 01:51 PM 9/6/2009, Jia Chen wrote:

Hi Mike,

Thanks for your detailed answer. Now, I understand what you mean. 
And, yes, I agree with you that keeping all data in one table works 
better for a bunch of 1:1 relationship tables.  Actually, this is 
what I was trying to do with that query.


Since you mention They all had a 1:1 relationship and occasionally 
some of the tables did not have a corresponding row. and From then 
on I've merged all 8 tables into one and if any of the subordinate 
table data isn't available for a row, its columns are set to NULL, I 
do want to ask you about how you set the columns to NULL for rows in 
subordinate table data unavailable because I have similar situation.


If I want to combine two tables into one, I think that a full outer 
join can achieve what you did. However, MySQL does not have full 
outer join. So, I am using


create table rmpdata1
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri left outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate))
union
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri right outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate));

This query takes more than twice as much time as the query in my 
first e-mail. Do you have a better way? Thanks.


Best,
Jia

mos wrote:

Jia,

Yes, it is a 1:1 relationship between table RItime and MVtime. 
However, I don't get your suggestion, I'd recommend joining the 
two tables into 1 table so you don't have to join them in the first 
place.  Could you elaborate that?


Sure but first I have to relate it to my own experience. I had 8 
tables of around 25 million rows each. They all had a 1:1 
relationship and occasionally some of the tables did not have a 
corresponding row. I felt it was better from a design point of view 
to have 8 different tables and do the joins on the tables that I 
needed for each of my particular queries. I'd be joining anywhere 
from 2 to 5 or 6 or even all 8 tables at a time, using a where 
clause to select 15k rows at a time. This is the way to do it from a 
normalized point of view. All of the information is in its 
respective table and only assemble the tables for each particular 
query.


Well, this was slwww! A heck of a lot of work was done to join 
the tables together on a 2 column key (like yours). I also had to 
run maintenance on the tables to see which tables where corrupted or 
were missing rows that should have been there. The tables also 
repeated columns from the other tables like date and product_id that 
is used to help identify each row. Well to make a long story short, 
it was far too much effort to juggle the relationships between all 
of these tables.


Then a colleague made the monumental announcement by saying I've 
never found the need to use more than 1 table when there was a 1:1 
relationship. There is a tremendous speed cost involved in piecing 
the data back together. I put all of the data into 1 table.  So the 
light went on for me. From then on I've merged all 8 tables into one 
and if any of the subordinate table data isn't available for a row, 
its columns are set to NULL, which is the values they would have had 
anyway after a left join.


I am perfectly happy know with one wide table with over 100 columns. 
Everything is in its place and maintenance is a dream. Queries are 
also quite fast because all of the information is under one table 
and not 8. I don't have to worry about optimizing the indexes for 
the table joins because there aren't any joins between these tables 
because it is all in 1 row.


So you really have to ask yourself, why spend  10 minutes each time 
your query is run? Instead you eliminate the query altogether by 
keeping the data of the 2 tables into 1 table in the first place.


Mike



At 09:45 AM 9/6/2009, Jia Chen wrote:

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem 
to be too long. I run the same join by using SQL procedure in a 
statistical software called SAS on a similar machine. It only takes 
1 minute and 3 seconds.


Yes, it is a 1:1 relationship between table RItime

Re: How to optimize a slow query?

2009-09-06 Thread Jia Chen

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem to 
be too long. I run the same join by using SQL procedure in a statistical 
software called SAS on a similar machine. It only takes 1 minute and 3 
seconds.


Yes, it is a 1:1 relationship between table RItime and MVtime. However, 
I don't get your suggestion, I'd recommend joining the two tables into 
1 table so you don't have to join them in the first place.  Could you 
elaborate that?


I was using B-tree index. Switching to HASH does help. Now, it takes 4 
min 50.17 sec to run the query. 


I also turn on profiling by using
mysql set profiling = 1;
Query OK, 0 rows affected (0.01 sec)

After the query finishes, I get
mysql show profile;
+--++
| Status   | Duration   |
+--++
| starting |   0.000123 |
| checking permissions |   0.10 |
| Opening tables   |   0.44 |
| System lock  |   0.07 |
| Table lock   |   0.11 |
| init |   0.83 |
| creating table   |   0.003428 |
| After create |   0.000124 |
| System lock  |   0.04 |
| Table lock   |   0.51 |
| optimizing   |   0.07 |
| statistics   |   0.33 |
| preparing|   0.20 |
| executing|   0.04 |
| Sending data | 290.153530 |
| end  |   0.08 |
| end  |   0.04 |
| query end|   0.03 |
| freeing items|   0.10 |
| closing tables   |   0.25 |
| logging slow query   |   0.01 |
| logging slow query   |   0.013429 |
| cleaning up  |   0.04 |
+--++
23 rows in set (0.02 sec)

MySQL spends most of its time sending data. According to 
http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html, 
sending data means that the thread is processing rows for a |SELECT| 
http://dev.mysql.com/doc/refman/5.0/en/select.html statement and also 
is sending data to the client.  Is there more room to optimize this 
query?  Thanks again.


Best,
Jia

mos wrote:
How many rows were added to rmpdata1 table? If it is 13.4 million rows 
then it is going to take several minutes to join this many rows from 
the 2 tables.
Is there a 1:1 relationship between the two tables or a 1:Many? If 
there is a 1:1 then I'd recommend joining the two tables into 1 table 
so you don't have to join them in the first place.
The only other thing I can suggest is to change the type of index on 
the tables being joined to see if that makes a speed difference. For 
example, if you are using BTREE then switch to HASH or vice versa. See 
http://dev.mysql.com/doc/refman/5.0/en/create-index.html for more info.


Mike

At 10:05 AM 9/5/2009, Jia Chen wrote:

Hi there,

One simple query took more than 10 minutes.  Here is how relevant 
rows in the slow query log looks like:


# Time: 090905 10:49:57
# u...@host: root[root] @ localhost []
# Query_time: 649  Lock_time: 0  Rows_sent: 0  Rows_examined: 26758561
use world;
create table rmpdata1 select ri.*,
mv.MV, coalesce(ri.code,mv.code) as ccode,
coalesce(ri.ndate,mv.ndate) as cndate
from RItime as ri left outer join
MVtime as mv
on (ri.code=mv.code and
ri.ndate=mv.ndate);

When I explain only the select clause, I get
+--+---+
| id | select_type | table | type   | possible_keys | key | 
key_len | ref  | rows | Extra |
++-+---++---+-+-+--+--+---+ 


|  1 | SIMPLE  | ri| ALL| NULL  | NULL| NULL
| NULL | 13419851 |   |
|  1 | SIMPLE  | mv| eq_ref | PRIMARY   | PRIMARY | 11
| world.ri.code,world.ri.ndate |1 |   |
++-+---++---+-+-+--+--+---+ 


2 rows in set (0.00 sec)

I use show table status from world; to get information about two 
tables, RItime and MVtime, in the join clause:

  Name: RItime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13419851
Avg_row_length: 31
   Data_length: 427721848
Max_data_length: 281474976710655
  Index_length: 347497472
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 10:17:57
   Update_time: 2009-09-03 12:04:02
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:
*** 2. row ***
  Name: MVtime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13562373
Avg_row_length: 31
   Data_length: 430220056
Max_data_length: 281474976710655
  Index_length: 350996480
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 13:31:33
   Update_time: 2009-09-03 13:43

Re: How to optimize a slow query?

2009-09-06 Thread Jia Chen

Hi Mike,

Thanks for your detailed answer. Now, I understand what you mean. And, 
yes, I agree with you that keeping all data in one table works better 
for a bunch of 1:1 relationship tables.  Actually, this is what I was 
trying to do with that query.


Since you mention They all had a 1:1 relationship and occasionally some 
of the tables did not have a corresponding row. and From then on I've 
merged all 8 tables into one and if any of the subordinate table data 
isn't available for a row, its columns are set to NULL, I do want to 
ask you about how you set the columns to NULL for rows in subordinate 
table data unavailable because I have similar situation.


If I want to combine two tables into one, I think that a full outer join 
can achieve what you did. However, MySQL does not have full outer join. 
So, I am using


create table rmpdata1
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri left outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate))
union
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri right outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate));

This query takes more than twice as much time as the query in my first 
e-mail. Do you have a better way? Thanks.


Best,
Jia

mos wrote:

Jia,

Yes, it is a 1:1 relationship between table RItime and MVtime. 
However, I don't get your suggestion, I'd recommend joining the two 
tables into 1 table so you don't have to join them in the first 
place.  Could you elaborate that?


Sure but first I have to relate it to my own experience. I had 8 
tables of around 25 million rows each. They all had a 1:1 relationship 
and occasionally some of the tables did not have a corresponding row. 
I felt it was better from a design point of view to have 8 different 
tables and do the joins on the tables that I needed for each of my 
particular queries. I'd be joining anywhere from 2 to 5 or 6 or even 
all 8 tables at a time, using a where clause to select 15k rows at a 
time. This is the way to do it from a normalized point of view. All of 
the information is in its respective table and only assemble the 
tables for each particular query.


Well, this was slwww! A heck of a lot of work was done to join the 
tables together on a 2 column key (like yours). I also had to run 
maintenance on the tables to see which tables where corrupted or were 
missing rows that should have been there. The tables also repeated 
columns from the other tables like date and product_id that is used to 
help identify each row. Well to make a long story short, it was far 
too much effort to juggle the relationships between all of these tables.


Then a colleague made the monumental announcement by saying I've 
never found the need to use more than 1 table when there was a 1:1 
relationship. There is a tremendous speed cost involved in piecing the 
data back together. I put all of the data into 1 table.  So the light 
went on for me. From then on I've merged all 8 tables into one and if 
any of the subordinate table data isn't available for a row, its 
columns are set to NULL, which is the values they would have had 
anyway after a left join.


I am perfectly happy know with one wide table with over 100 columns. 
Everything is in its place and maintenance is a dream. Queries are 
also quite fast because all of the information is under one table and 
not 8. I don't have to worry about optimizing the indexes for the 
table joins because there aren't any joins between these tables 
because it is all in 1 row.


So you really have to ask yourself, why spend  10 minutes each time 
your query is run? Instead you eliminate the query altogether by 
keeping the data of the 2 tables into 1 table in the first place.


Mike



At 09:45 AM 9/6/2009, Jia Chen wrote:

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem 
to be too long. I run the same join by using SQL procedure in a 
statistical software called SAS on a similar machine. It only takes 1 
minute and 3 seconds.


Yes, it is a 1:1 relationship between table RItime and MVtime. 
However, I don't get your suggestion, I'd recommend joining the two 
tables into 1 table so you don't have to join them in the first 
place.  Could you elaborate that?


I was using B-tree index. Switching to HASH does help. Now, it takes 
4 min 50.17 sec to run the query.

I also turn on profiling by using
mysql set profiling = 1;
Query OK, 0 rows affected (0.01 sec)

After the query finishes, I get
mysql show profile;
+--++
| Status   | Duration   |
+--++
| starting |   0.000123 |
| checking permissions |   0.10 |
| Opening tables   |   0.44 |
| System lock  |   0.07 |
| Table lock   |   0.11

How to optimize a slow query?

2009-09-05 Thread Jia Chen

Hi there,

One simple query took more than 10 minutes.  Here is how relevant rows 
in the slow query log looks like:


# Time: 090905 10:49:57
# u...@host: root[root] @ localhost []
# Query_time: 649  Lock_time: 0  Rows_sent: 0  Rows_examined: 26758561
use world;
create table rmpdata1 select ri.*,
mv.MV, coalesce(ri.code,mv.code) as ccode,
coalesce(ri.ndate,mv.ndate) as cndate
from RItime as ri left outer join
MVtime as mv
on (ri.code=mv.code and
ri.ndate=mv.ndate);

When I explain only the select clause, I get
+--+---+
| id | select_type | table | type   | possible_keys | key | key_len 
| ref  | rows | Extra |

++-+---++---+-+-+--+--+---+
|  1 | SIMPLE  | ri| ALL| NULL  | NULL| NULL
| NULL | 13419851 |   |
|  1 | SIMPLE  | mv| eq_ref | PRIMARY   | PRIMARY | 11  
| world.ri.code,world.ri.ndate |1 |   |

++-+---++---+-+-+--+--+---+
2 rows in set (0.00 sec)

I use show table status from world; to get information about two 
tables, RItime and MVtime, in the join clause:

  Name: RItime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13419851
Avg_row_length: 31
   Data_length: 427721848
Max_data_length: 281474976710655
  Index_length: 347497472
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 10:17:57
   Update_time: 2009-09-03 12:04:02
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:
*** 2. row ***
  Name: MVtime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13562373
Avg_row_length: 31
   Data_length: 430220056
Max_data_length: 281474976710655
  Index_length: 350996480
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 13:31:33
   Update_time: 2009-09-03 13:43:51
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:

I also describe these two tables:
mysql desc RItime;
+---++--+-++---+
| Field | Type   | Null | Key | Default| Extra |
+---++--+-++---+
| code  | varchar(6) | NO   | PRI ||   |
| ndate | date   | NO   | PRI | -00-00 |   |
| ri| double | YES  | | NULL   |   |
| time  | date   | YES  | | NULL   |   |
| bdate | date   | YES  | | NULL   |   |
+---++--+-++---+
5 rows in set (0.00 sec)

mysql desc MVtime;
+---++--+-++---+
| Field | Type   | Null | Key | Default| Extra |
+---++--+-++---+
| code  | varchar(6) | NO   | PRI ||   |
| ndate | date   | NO   | PRI | -00-00 |   |
| MV| double | YES  | | NULL   |   |
| time  | date   | YES  | | NULL   |   |
| bdate | date   | YES  | | NULL   |   |
+---++--+-++---+
5 rows in set (0.00 sec)

Could you give me some hint on how to improve the speed of this query?  
Thanks.


Best,
Jia




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Very Slow Query

2009-08-28 Thread Jia Chen

Hi all,

One seemingly simple query that joins two tables takes a long time for me.

This is my library.

mysql show table status from nber1999;
+---++-++--++-+--+--+---++-+-++---+--++-+
| Name  | Engine | Version | Row_format | Rows | Avg_row_length 
| Data_length | Max_data_length  | Index_length | Data_free | 
Auto_increment | Create_time | Update_time | Check_time 
| Collation | Checksum | Create_options | Comment |

+---++-++--++-+--+--+---++-+-++---+--++-+
| compusta1 | MyISAM |  10 | Dynamic| 4906 | 77 
|  379464 |  281474976710655 | 1024 | 0 |   
NULL | 2009-08-27 23:56:47 | 2009-08-27 23:56:47 | NULL   | 
latin1_swedish_ci | NULL || |
| pat1  | MyISAM |  10 | Dynamic|  2089903 | 96 
|   201936072 |  281474976710655 | 1024 | 0 |   
NULL | 2009-08-27 23:55:48 | 2009-08-27 23:55:56 | NULL   | 
latin1_swedish_ci | NULL || |

+---++-++--++-+--+--+---++-+-++---+--++-+
5 rows in set (0.00 sec)

And the relevant rows in my slow query log file is:

/usr/sbin/mysqld, Version: 5.0.75-0ubuntu10.2-log ((Ubuntu)). started with:
Tcp port: 3306  Unix socket: /var/run/mysqld/mysqld.sock
Time Id CommandArgument
# Time: 090828 10:36:17
# u...@host: root[root] @ localhost []
# Query_time: 478  Lock_time: 0  Rows_sent: 0  Rows_examined: 1251
use nber1999;
create table nber1999.pat select a.*, b.assname, b.cname, b.cusip, 
b.own, b.pname, b.sname

from nber1999.pat1 as a inner join nber1999.compusta1 as b
   on a.assignee=b.assignee;

My operating system is ubuntu 9.04.  I set configuration variables as 
follows:

[mysqld]
key_buffer = 1024M
table_cache = 256
query_cache_type= 1
query_cache_limit   = 2M
query_cache_size= 20M

[isamchk]
key_buffer= 16M

Can anyone give me some hint on how to speed this query up?  Thanks.

I tried to tune mysql by using a script from 
http://mediakey.dk/~cc/optimize-mysql-performance-with-mysqltuner/

and got

  MySQLTuner 1.0.0 - Major Hayden ma...@mhtx.net
  Bug reports, feature requests, and downloads at http://mysqltuner.com/
  Run with '--help' for additional options and output filtering
Please enter your MySQL administrative login: root
Please enter your MySQL administrative password:


 General Statistics 
--

[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.0.75-0ubuntu10.2-log
[OK] Operating on 64-bit architecture

 Storage Engine Statistics 
---

[--] Status: +Archive -BDB -Federated -InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 584M (Tables: 6)
[OK] Total fragmented tables: 0

 Performance Metrics 
-

[--] Up for: 30m 47s (131 q [0.071 qps], 42 conn, TX: 35K, RX: 7K)
[--] Reads / Writes: 100% / 0%
[--] Total buffers: 1.0G global + 2.6M per thread (100 max threads)
[OK] Maximum possible memory usage: 1.3G (34% of installed RAM)
[OK] Slow queries: 0% (1/131)
[OK] Highest usage of available connections: 2% (2/100)
[OK] Key buffer size / total MyISAM indexes: 1.0G/74.0K
[!!] Query cache efficiency: 0.0% (0 cached / 67 selects)
[OK] Query cache prunes per day: 0
[OK] Temporary tables created on disk: 19% (21 on disk / 107 total)
[OK] Thread cache hit rate: 95% (2 created / 42 connections)
[OK] Table cache hit rate: 75% (24 open / 32 opened)
[OK] Open file limit used: 4% (49/1K)
[OK] Table locks acquired immediately: 100% (41 immediate / 41 locks)

 Recommendations 
-

General recommendations:
   MySQL started within last 24 hours - recommendations may be inaccurate
Variables to adjust:
   query_cache_limit ( 2M, or use smaller result sets)

Best,
Jia

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Very Slow Query

2009-08-28 Thread Jia Chen

Thanks for reply!

Yes, it is very slow too  if I just execute the select ... part.

When I run
mysql explain extended select a.*, b.assname, b.cname, b.cusip, b.own, 
b.pname, b.sname from nber1999.pat1 as a inner join nber1999.compusta1 
asb  on a.assignee=b.assignee;

I got
++-+---+--+---+--+-+--+-+-+
| id | select_type | table | type | possible_keys | key  | key_len | 
ref  | rows| Extra   |

++-+---+--+---+--+-+--+-+-+
|  1 | SIMPLE  | b | ALL  | NULL  | NULL | NULL| 
NULL |4906 | |
|  1 | SIMPLE  | a | ALL  | NULL  | NULL | NULL| 
NULL | 2089903 | Using where |

++-+---+--+---+--+-+--+-+-+
2 rows in set, 1 warning (0.00 sec)

Best,
Jia


Dan Nelson wrote:

In the last episode (Aug 28), Jia Chen said:
  

One seemingly simple query that joins two tables takes a long time for me.

This is my library.

mysql show table status from nber1999;
+---++-++--++-+--+--+---++-+-++---+--++-+
| Name  | Engine | Version | Row_format | Rows | Avg_row_length  | 
Data_length | Max_data_length  | Index_length | Data_free |  Auto_increment | 
Create_time | Update_time | Check_time  | Collation | 
Checksum | Create_options | Comment |
+---++-++--++-+--+--+---++-+-++---+--++-+
| compusta1 | MyISAM |  10 | Dynamic| 4906 | 77  |  
379464 |  281474976710655 | 1024 | 0 |NULL | 
2009-08-27 23:56:47 | 2009-08-27 23:56:47 | NULL   |  latin1_swedish_ci |   
  NULL || |
| pat1  | MyISAM |  10 | Dynamic|  2089903 | 96  |   
201936072 |  281474976710655 | 1024 | 0 |NULL | 
2009-08-27 23:55:48 | 2009-08-27 23:55:56 | NULL   |  latin1_swedish_ci |   
  NULL || |
+---++-++--++-+--+--+---++-+-++---+--++-+
5 rows in set (0.00 sec)

And the relevant rows in my slow query log file is:

# Time: 090828 10:36:17
# u...@host: root[root] @ localhost []
# Query_time: 478  Lock_time: 0  Rows_sent: 0  Rows_examined: 1251
use nber1999;
create table nber1999.pat select a.*, b.assname, b.cname, b.cusip, 
b.own, b.pname, b.sname

 from nber1999.pat1 as a inner join nber1999.compusta1 as b
on a.assignee=b.assignee;



If you run just the select ... part, is it slow also?  Do you have an
index on pat1.assignee?  What does an EXPLAIN on the select print?
 
  



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Very Slow Query

2009-08-28 Thread Jia Chen

Hi Dan and John,

Thanks for your time!

You guys are right. I did not index any columns when I created these 
tables. After I indexed assignee columns in both tables, the select 
clause runs in seconds.


Best,
Jia

Dan Nelson wrote:

In the last episode (Aug 28), Jia Chen said:
  

Thanks for reply!

Yes, it is very slow too  if I just execute the select ... part.

When I run
mysql explain extended select a.*, b.assname, b.cname, b.cusip, b.own, 
b.pname, b.sname from nber1999.pat1 as a inner join nber1999.compusta1 
asb  on a.assignee=b.assignee;

I got
++-+---+--+---+--+-+--+-+-+
| id | select_type | table | type | possible_keys | key  | key_len |  ref  | 
rows| Extra   |
++-+---+--+---+--+-+--+-+-+
|  1 | SIMPLE  | b | ALL  | NULL  | NULL | NULL|  NULL |
4906 | |
|  1 | SIMPLE  | a | ALL  | NULL  | NULL | NULL|  NULL | 
2089903 | Using where |
++-+---+--+---+--+-+--+-+-+
2 rows in set, 1 warning (0.00 sec)



Ouch.  Add an index on pat1.assignee.  Mysql currently has to scan your
entire pat1 table for every row in compusta1 to find matching rows.  In
general, you want an index on any fields used in a WHERE clause.

  



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Fail to change MySQL data directory on ubuntu

2009-08-27 Thread Jia Chen

 Chen, Did you really delete ibdata1 ?

Yes, I did.

Best,
Jia


Claudio Nanni wrote:



2009/8/26 chen jia chen.1...@gmail.com mailto:chen.1...@gmail.com

Hi there,

I am using MySQL on ubuntu 8.04.

I followed this link

http://www.ubuntu-howto.info/howto/how-to-move-mysql-databases-to-another-location-partition-or-hard-drive
to change the data directory of MySQL.

After stopping MySQL: sudo /etc/init.d/mysql stop

I make a new directory: sudo mkdir /media/disk/MySQL_data

then change the ownership of new directory, sudo chown mysql:mysql
/media/disk/MySQL_data

and copy all data to the new directory, cp -r -p /var/lib/mysql/*
/media/disk/MySQL_data/ and deleted all files like ibdata1,


Chen, Did you really delete ibdata1 ?



 




ib_logfile0, and ib_logfile1.

I then edit /etc/mysql/my.conf and update the datadir to my new
directory. I also update /etc/apparmor.d/usr.sbin.mysql so that news
lines with /var/lib/mysql replaced by /media/disk/MySQL_data are
added.

However, after sudo /etc/init.d/apparmor reload

I try sudo /etc/init.d/mysql start

I got
* Starting MySQL database server mysqld
   [fail]

If I change the datadir line in /etc/mysql/my.conf back to the
original one, I can start MySQL successfully.

I think I have done everything needed to change MySQL data directory.

Why am I still getting this error?  Where can I start to look for
the causes?

Thanks.

Jia

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:  
 http://lists.mysql.com/mysql?unsub=claudio.na...@gmail.com






--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Fail to change MySQL data directory on ubuntu

2009-08-27 Thread Jia Chen

Thanks for the tips. Where can I find more details about the ibdata file?

After I changed the data dir, hese file did get recreated. So far, I 
have not noticed any data loss yet.


Best,
Jia

Eric Bergen wrote:

That procedure is horribly incorrect. You should simply move the
ib_log and ibdata files with the rest of the datadir. The ibdata1 file
contains innodb's system tables and depending on your setting of
innodb_file_per_table it also contains your data!

On Thu, Aug 27, 2009 at 7:56 AM, Jia Chenchen.1...@gmail.com wrote:
  

Chen, Did you really delete ibdata1 ?
  

Yes, I did.

Best,
Jia


Claudio Nanni wrote:


2009/8/26 chen jia chen.1...@gmail.com mailto:chen.1...@gmail.com

   Hi there,

   I am using MySQL on ubuntu 8.04.

   I followed this link

 
http://www.ubuntu-howto.info/howto/how-to-move-mysql-databases-to-another-location-partition-or-hard-drive
   to change the data directory of MySQL.

   After stopping MySQL: sudo /etc/init.d/mysql stop

   I make a new directory: sudo mkdir /media/disk/MySQL_data

   then change the ownership of new directory, sudo chown mysql:mysql
   /media/disk/MySQL_data

   and copy all data to the new directory, cp -r -p /var/lib/mysql/*
   /media/disk/MySQL_data/ and deleted all files like ibdata1,


Chen, Did you really delete ibdata1 ?






   ib_logfile0, and ib_logfile1.

   I then edit /etc/mysql/my.conf and update the datadir to my new
   directory. I also update /etc/apparmor.d/usr.sbin.mysql so that news
   lines with /var/lib/mysql replaced by /media/disk/MySQL_data are
   added.

   However, after sudo /etc/init.d/apparmor reload

   I try sudo /etc/init.d/mysql start

   I got
   * Starting MySQL database server mysqld
  [fail]

   If I change the datadir line in /etc/mysql/my.conf back to the
   original one, I can start MySQL successfully.

   I think I have done everything needed to change MySQL data directory.

   Why am I still getting this error?  Where can I start to look for
   the causes?

   Thanks.

   Jia

   --
   MySQL General Mailing List
   For list archives: http://lists.mysql.com/mysql
   To unsubscribe:
http://lists.mysql.com/mysql?unsub=claudio.na...@gmail.com


  

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=eric.ber...@gmail.com







  



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Fail to change MySQL data directory on ubuntu

2009-08-26 Thread Jia Chen

Hi Johnny,

Thanks you so much!

Your command fixed the problem beautifully. Now, MySQL can start 
successfully. I can create and drop databases without problem. Thanks again.


Best,
Jia

Johnny Withers wrote:

I'm at work now, this is the cmd I used:
 
chcon -R -u system_u -r object_r -t mysql_db_t /data

(my data lives in /data/mysqlXX -- were XX is the server version)
 
On Wed, Aug 26, 2009 at 8:02 AM, Johnny Withers joh...@pixelated.net 
mailto:joh...@pixelated.net wrote:


Yes, error 13, permission denied. Check selinux setup. I had this same
problem last week on a CentOS machine. I had to change the object type
of the new data dir to mysqld-something. I'm on a mobile phone and
can't remember the exact cmd.


On Tuesday, August 25, 2009, Jia Chen chen.1...@gmail.com
mailto:chen.1...@gmail.com wrote:
 I run sudo /etc/init.d/mysql start and check the syslog by
running sudo tail -f /var/log/syslog

 This is what I get
 Aug 25 22:18:06 chenj-desktop mysqld_safe[10934]: started
 Aug 25 22:18:06 chenj-desktop kernel: [11083.933531] type=1503
audit(1251253086.020:43): operation=inode_create
requested_mask=a:: denied_mask=a:: fsuid=0
name=/home/chenj/MySQL_data/chenj-desktop.lower-test pid=10936
profile=/usr/sbin/mysqld
 Aug 25 22:18:06 chenj-desktop kernel: [11083.933581] type=1503
audit(1251253086.020:44): operation=inode_create
requested_mask=a:: denied_mask=a:: fsuid=0
name=/home/chenj/MySQL_data/chenj-desktop.lower-test pid=10936
profile=/usr/sbin/mysqld
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06
[Warning] option 'thread_stack': unsigned value 128 adjusted to 131072
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06
[Warning] Can't create test file
/home/chenj/MySQL_data/chenj-desktop.lower-test
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06
[Warning] Can't create test file
/home/chenj/MySQL_data/chenj-desktop.lower-test
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06
 InnoDB: Operating system error number 13 in a file operation.
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: The error
means mysqld does not have the access rights to
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: the directory.
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: File name
./ibdata1
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: File
operation call: 'create'.
 Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: Cannot
continue operation.
 Aug 25 22:18:06 chenj-desktop kernel: [11083.962674] type=1503
audit(1251253086.048:45): operation=inode_create
requested_mask=a:: denied_mask=a:: fsuid=110
name=/home/chenj/MySQL_data/ibdata1 pid=10936
profile=/usr/sbin/mysqld
 Aug 25 22:18:06 chenj-desktop mysqld_safe[10944]: ended
 Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]: 0
processes alive and '/usr/bin/mysqladmin
--defaults-file=/etc/mysql/debian.cnf ping' resulted in
 Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]:
^G/usr/bin/mysqladmin: connect to server at 'localhost' failed
 Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]: error:
'Can't connect to local MySQL server through socket
'/var/run/mysqld/mysqld.sock' (2)'
 Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]: Check
that mysqld is running and that the socket:
'/var/run/mysqld/mysqld.sock' exists!
 Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]:

 Does this message tell me any thing about the problem?  Thanks.

 Best,
 Jia

 Johnny Withers wrote:

 First, check the error log, if you can't find it, start mysql
from the
 cmd line by running safe_mysqld it should print errors to console.

 If it is a permission issue, it might be caused be selinux, you'll
 need to change the object type od that new directory to mysqld-
 something. I can't recall the command. A google search on
selinux and
 mysql should produce good results.

 On Tuesday, August 25, 2009, chen jia chen.1...@gmail.com
mailto:chen.1...@gmail.com wrote:


 Hi there,

 I am using MySQL on ubuntu 8.04.

 I followed this link


http://www.ubuntu-howto.info/howto/how-to-move-mysql-databases-to-another-location-partition-or-hard-drive
 to change the data directory of MySQL.

 After stopping MySQL: sudo /etc/init.d/mysql stop

 I make a new directory: sudo mkdir /media/disk/MySQL_data

 then change the ownership of new directory, sudo chown mysql:mysql
 /media/disk/MySQL_data

 and copy all data to the new directory, cp -r -p /var/lib/mysql/*
 /media/disk/MySQL_data/ and deleted all files like ibdata1,
 ib_logfile0, and ib_logfile1.

 I then edit /etc/mysql/my.conf and update the datadir to my new

Re: Fail to change MySQL data directory on ubuntu

2009-08-25 Thread Jia Chen
I run sudo /etc/init.d/mysql start and check the syslog by running sudo 
tail -f /var/log/syslog


This is what I get
Aug 25 22:18:06 chenj-desktop mysqld_safe[10934]: started
Aug 25 22:18:06 chenj-desktop kernel: [11083.933531] type=1503 
audit(1251253086.020:43): operation=inode_create requested_mask=a:: 
denied_mask=a:: fsuid=0 
name=/home/chenj/MySQL_data/chenj-desktop.lower-test pid=10936 
profile=/usr/sbin/mysqld
Aug 25 22:18:06 chenj-desktop kernel: [11083.933581] type=1503 
audit(1251253086.020:44): operation=inode_create requested_mask=a:: 
denied_mask=a:: fsuid=0 
name=/home/chenj/MySQL_data/chenj-desktop.lower-test pid=10936 
profile=/usr/sbin/mysqld
Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06 [Warning] 
option 'thread_stack': unsigned value 128 adjusted to 131072
Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06 [Warning] 
Can't create test file /home/chenj/MySQL_data/chenj-desktop.lower-test
Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06 [Warning] 
Can't create test file /home/chenj/MySQL_data/chenj-desktop.lower-test
Aug 25 22:18:06 chenj-desktop mysqld[10937]: 090825 22:18:06  InnoDB: 
Operating system error number 13 in a file operation.
Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: The error means 
mysqld does not have the access rights to

Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: the directory.
Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: File name ./ibdata1
Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: File operation 
call: 'create'.
Aug 25 22:18:06 chenj-desktop mysqld[10937]: InnoDB: Cannot continue 
operation.
Aug 25 22:18:06 chenj-desktop kernel: [11083.962674] type=1503 
audit(1251253086.048:45): operation=inode_create requested_mask=a:: 
denied_mask=a:: fsuid=110 name=/home/chenj/MySQL_data/ibdata1 
pid=10936 profile=/usr/sbin/mysqld

Aug 25 22:18:06 chenj-desktop mysqld_safe[10944]: ended
Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]: 0 processes 
alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf 
ping' resulted in
Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]: 
^G/usr/bin/mysqladmin: connect to server at 'localhost' failed
Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]: error: 'Can't 
connect to local MySQL server through socket 
'/var/run/mysqld/mysqld.sock' (2)'
Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]: Check that 
mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!

Aug 25 22:18:20 chenj-desktop /etc/init.d/mysql[11094]:

Does this message tell me any thing about the problem?  Thanks.

Best,
Jia

Johnny Withers wrote:

First, check the error log, if you can't find it, start mysql from the
cmd line by running safe_mysqld it should print errors to console.

If it is a permission issue, it might be caused be selinux, you'll
need to change the object type od that new directory to mysqld-
something. I can't recall the command. A google search on selinux and
mysql should produce good results.

On Tuesday, August 25, 2009, chen jia chen.1...@gmail.com wrote:
  

Hi there,

I am using MySQL on ubuntu 8.04.

I followed this link
http://www.ubuntu-howto.info/howto/how-to-move-mysql-databases-to-another-location-partition-or-hard-drive
to change the data directory of MySQL.

After stopping MySQL: sudo /etc/init.d/mysql stop

I make a new directory: sudo mkdir /media/disk/MySQL_data

then change the ownership of new directory, sudo chown mysql:mysql
/media/disk/MySQL_data

and copy all data to the new directory, cp -r -p /var/lib/mysql/*
/media/disk/MySQL_data/ and deleted all files like ibdata1,
ib_logfile0, and ib_logfile1.

I then edit /etc/mysql/my.conf and update the datadir to my new
directory. I also update /etc/apparmor.d/usr.sbin.mysql so that news
lines with /var/lib/mysql replaced by /media/disk/MySQL_data are
added.

However, after sudo /etc/init.d/apparmor reload

I try sudo /etc/init.d/mysql start

I got
* Starting MySQL database server mysqld
[fail]

If I change the datadir line in /etc/mysql/my.conf back to the
original one, I can start MySQL successfully.

I think I have done everything needed to change MySQL data directory.

Why am I still getting this error?  Where can I start to look for the causes?

Thanks.

Jia

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=joh...@pixelated.net





  



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org