Re: help with query to count rows while excluding certain rows
On Fri, Jan 1, 2016 at 9:31 PM, Peter Brawley wrote:
> On 1/1/2016 19:24, Larry Martell wrote:
>> On Fri, Jan 1, 2016 at 2:12 PM, Peter Brawley wrote:
>>> On 12/31/2015 0:51, Larry Martell wrote:
>>> ... original problem statement and suggested query snipped ...
>>
>> That query gives:
>>
>> ERROR 1137 (HY000): Can't reopen table: 'a'
>
> So, it's a temporary table, and you'll need to make that not so.

Yes, cst_rollup is a temp table. The underlying table is millions of rows
(with 300 columns), so for efficiency a subset of the rows and columns is
selected into the temp table based on some user input. Only the rows in
the temp table are of interest for the current report.

I was able to get this working with a second temp table:

CREATE TEMPORARY TABLE rollup_exclude
SELECT target_name_id, ep, wafer_id, count(*) n
FROM cst_rollup
GROUP BY target_name_id, ep, wafer_id
HAVING n >= 50

And then:

SELECT count(*)
FROM cst_rollup
LEFT JOIN (SELECT target_name_id, ep, wafer_id
           FROM rollup_exclude) b
USING (target_name_id, ep, wafer_id)
WHERE b.target_name_id IS NULL
GROUP BY target_name_id, ep, roiname, recipe_process,
         recipe_product, recipe_layer, f_tag_bottom,
         measname, recipe_id

And the row count from that query gave me what I needed. Thanks very much
for the help Peter, you gave me a push toward the right path.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
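The two-step exclusion count above can be sanity-checked outside MySQL. Below is a minimal sketch in Python using sqlite3 with a hypothetical three-column version of cst_rollup and the threshold lowered from 50 to 3 so the toy data stays small; only the shape of the logic (exclude over-counted (target_name_id, ep, wafer_id) groups, then count the coarser groups) is meant to match the thread:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cst_rollup (target_name_id INT, ep TEXT, wafer_id INT);
-- group (1,'a') on wafer 10 has 3 rows: excluded at threshold 3
INSERT INTO cst_rollup VALUES
  (1,'a',10),(1,'a',10),(1,'a',10),
  (1,'a',11),
  (2,'b',20),(2,'b',21);
""")

# Step 1: build the exclusion table (Larry's rollup_exclude).
con.execute("""
CREATE TEMPORARY TABLE rollup_exclude AS
SELECT target_name_id, ep, wafer_id, COUNT(*) n
FROM cst_rollup
GROUP BY target_name_id, ep, wafer_id
HAVING n >= 3
""")

# Step 2: count with the coarser grouping, anti-joining the exclusions.
rows = con.execute("""
SELECT COUNT(*)
FROM cst_rollup
LEFT JOIN rollup_exclude b USING (target_name_id, ep, wafer_id)
WHERE b.n IS NULL
GROUP BY target_name_id, ep
""").fetchall()
print(len(rows))  # number of surviving groups
```

As in the thread, the answer Larry wanted is the number of rows the final grouped query returns, not the counts inside it.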
Re: help with query to count rows while excluding certain rows
On Fri, Jan 1, 2016 at 2:12 PM, Peter Brawley wrote:
> On 12/31/2015 0:51, Larry Martell wrote:
>> ... original problem statement snipped ...
>
> If I've understood this correctly, the resultset you wish to aggregate
> on is ...
>
> select target_name_id, ep, wafer_id
> from cst_rollup a
> left join (   -- exclude rows for which wafer_id count >= 50
>   select target_name_id, ep, wafer_id, count(*) n
>   from cst_rollup
>   group by target_name_id, ep, wafer_id
>   having n >= 50
> ) b using ( target_name_id, ep, wafer_id )
> where b.target_name_id is null ;
>
> If that's so, you could assemble that resultset in a temp table then
> run the desired aggregate query on it, or you could aggregate on it
> directly as a subquery.

That query gives:

ERROR 1137 (HY000): Can't reopen table: 'a'
Re: help with query to count rows while excluding certain rows
On 12/31/2015 0:51, Larry Martell wrote:
> I need to count the number of rows in a table that are grouped by a
> list of columns, but I also need to exclude rows that have more than
> some count when grouped by a different set of columns. Conceptually
> this is not hard, but I am having trouble doing it efficiently.
>
> My first counting query would be this:
>
> SELECT count(*)
> FROM cst_rollup
> GROUP BY target_name_id, ep, roiname, recipe_process,
>          recipe_product, recipe_layer, f_tag_bottom,
>          measname, recipe_id
>
> But from this count I need to subtract the count of rows that have
> more than 50 rows with a different grouping:
>
> SELECT count(*)
> FROM cst_rollup
> GROUP BY target_name_id, ep, wafer_id
> HAVING count(*) >= 50
>
> As you can see, the second query has wafer_id, but the first query
> does not.
>
> Currently I am doing this in python, and it's slow. In my current
> implementation I have one query, and it selects the columns (i.e. it
> doesn't just count), and I have added wafer_id:
>
> SELECT target_name_id, ep, roiname, recipe_process,
>        recipe_product, recipe_layer, f_tag_bottom,
>        measname, recipe_id, wafer_id
> FROM cst_rollup
>
> Then I go through the result set (which can be over 200k rows) and
> count the number of rows with matching (target_name_id, ep, wafer_id).
> Then I go through the rows again and regroup them without wafer_id,
> skipping the rows that have more than 50 rows for that row's
> (target_name_id, ep, wafer_id).
>
> Is this clear to everyone what I am trying to do?

If I've understood this correctly, the resultset you wish to aggregate on
is ...

select target_name_id, ep, wafer_id
from cst_rollup a
left join (   -- exclude rows for which wafer_id count >= 50
  select target_name_id, ep, wafer_id, count(*) n
  from cst_rollup
  group by target_name_id, ep, wafer_id
  having n >= 50
) b using ( target_name_id, ep, wafer_id )
where b.target_name_id is null ;

If that's so, you could assemble that resultset in a temp table then run
the desired aggregate query on it, or you could aggregate on it directly
as a subquery.
PB - I'd like to do this all in SQL with count, because then I do not
have to actually return and parse the data in python. Can anyone think of
a way to do this in SQL that will be more efficient than my current
implementation?

Thanks!
-Larry
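Peter's "aggregate on it directly as a subquery" suggestion can be written as one statement. Here is a hedged sketch in Python/sqlite3 on the same hypothetical three-column toy schema (threshold 3 instead of 50); note that in MySQL this single-statement form is exactly what trips ERROR 1137 when cst_rollup is a TEMPORARY table, since the temp table is referenced twice:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cst_rollup (target_name_id INT, ep TEXT, wafer_id INT)")
con.executemany("INSERT INTO cst_rollup VALUES (?,?,?)",
                [(1, 'a', 10)] * 3 + [(1, 'a', 11), (2, 'b', 20), (2, 'b', 21)])

# One statement: anti-join away the over-counted fine-grained groups,
# group by the coarse key, and count the surviving groups.
(n_groups,) = con.execute("""
SELECT COUNT(*) FROM (
  SELECT target_name_id, ep
  FROM cst_rollup
  LEFT JOIN (SELECT target_name_id, ep, wafer_id, COUNT(*) n
             FROM cst_rollup
             GROUP BY target_name_id, ep, wafer_id
             HAVING n >= 3) x
    USING (target_name_id, ep, wafer_id)
  WHERE x.n IS NULL
  GROUP BY target_name_id, ep
)
""").fetchone()
print(n_groups)
```

sqlite3 (and MySQL on a non-temporary table) happily reopens the table for the subquery, which is why Larry's second-temp-table workaround was needed in the thread.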
Re: Help with REGEXP
From: Olivier Nicole olivier.nic...@cs.ait.ac.th
> You could look for a tool called The Regex Coach. While it is mainly
> for Windows, it runs very well in Wine. I find it highly useful for
> debugging regexps.

On the Mac, look for RegExRx. It lets you paste in text to work on, build
a regex, and see the result in real time. I also use one simply called
Patterns, another real-time regex engine. It does some things RegExRx
doesn't do, and vice versa.

Jan Steinman, EcoReality Co-op
Re: Help with REGEXP
Paul,

You could look for a tool called The Regex Coach. While it is mainly for
Windows, it runs very well in Wine. I find it highly useful for debugging
regexps.

Best regards,

Olivier
Re: Help with REGEXP
Trying to pattern-match IP addresses is a famous anti-pattern; it's one
of those things where you feel like it should work, but it won't. Your
case, however, is pretty specific. Taking advantage of the limited range
(I will assume you only wanted 4 sections of IPv4), this should come
close:

10[.]\d{1,3}[.](224|225|226|227|228|229|23\d)[.]\d{1,3}

On Thu, Mar 19, 2015 at 9:39 AM, Paul Halliday paul.halli...@gmail.com wrote:
> I am trying to pick out a range of IP addresses using REGEXP but
> failing miserably :)
>
> The pattern I want to match is: 10.%.224-239.%.%
>
> The regex I have looks like this:
>
> AND INET_NTOA(src_ip) REGEXP '\d{1,3}\\.\d{1,3}\.(22[4-9]|23[0-9])\\.\d{1,3}'
>
> but, go fish. Thoughts?
>
> Thanks!
>
> --
> Paul Halliday
> http://www.pintumbler.org/

--
- michael dykman
- mdyk...@gmail.com

May the Source be with you.
Re: Help with REGEXP
I don't think it accepts \d, or much of anything else I am used to
putting in expressions :)

This is what I ended up with, and it appears to be working:

REGEXP '10.[[:alnum:]]{1,3}.(22[4-9]|23[0-9]).[[:alnum:]]{1,3}'

On Thu, Mar 19, 2015 at 11:10 AM, Michael Dykman mdyk...@gmail.com wrote:
> ... previous message snipped ...

--
Paul Halliday
http://www.pintumbler.org/
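A quick way to convince yourself that the alternation 22[4-9]|23[0-9] covers exactly the 224-239 range is a few lines of Python with the standard re module. This is a PCRE-style sketch of the same idea, not MySQL REGEXP syntax: it is fully anchored via fullmatch, whereas a MySQL REGEXP matches anywhere in the string, and like the patterns in the thread it loosely accepts any 1-3 digits in the other octets:

```python
import re

# Third octet must be 224-239; the other octets are 1-3 digits,
# matching the thread's patterns (so e.g. 999 is not rejected).
pat = re.compile(r"10\.\d{1,3}\.(22[4-9]|23[0-9])\.\d{1,3}")

def in_range(ip: str) -> bool:
    return pat.fullmatch(ip) is not None

print(in_range("10.1.224.5"))    # True  (low end of the range)
print(in_range("10.1.239.255"))  # True  (high end of the range)
print(in_range("10.1.240.5"))    # False (just above)
print(in_range("10.1.223.5"))    # False (just below)
print(in_range("11.1.230.5"))    # False (wrong first octet)
```

This also illustrates why \d failed in MySQL: the pre-8.0 REGEXP engine speaks POSIX classes like [[:digit:]], not Perl shorthand.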
Re: Help improving query performance
Hi Larry,

On 2/1/2015 4:49 PM, Larry Martell wrote:
> I have 2 queries. One takes 4 hours to run and returns 21 rows, and
> the other, which has 1 additional where clause, takes 3 minutes and
> returns 20 rows. The main table being selected from is largish
> (37,247,884 rows with 282 columns). Caching is off for my testing, so
> it's not related to that. To short-circuit anyone asking: these
> queries are generated by python code, which is why there's an IN
> clause with 1 value, as opposed to an =.
>
> Here are the queries and their explains. The significant difference is
> that the faster query has
> Using intersect(data_cst_bbccbce0,data_cst_fba12377) in the query plan
> - those 2 indexes are on the 2 columns in the where clause, so that's
> why the second one is faster. But I am wondering what I can do to make
> the first one faster.
>
> 4 hour query:
>
> SELECT MIN(data_tool.name) as tool,
>        MIN(data_cst.date_time) start,
>        MAX(data_cst.date_time) end,
>        MIN(data_target.name) as target,
>        MIN(data_lot.name) as lot,
>        MIN(data_wafer.name) as wafer,
>        MIN(measname) as measname,
>        MIN(data_recipe.name) as recipe
> FROM data_cst
> INNER JOIN data_tool ON data_tool.id = data_cst.tool_id
> INNER JOIN data_target ON data_target.id = data_cst.target_name_id
> INNER JOIN data_lot ON data_lot.id = data_cst.lot_id
> INNER JOIN data_wafer ON data_wafer.id = data_cst.wafer_id
> INNER JOIN data_measparams
>     ON data_measparams.id = data_cst.meas_params_name_id
> INNER JOIN data_recipe ON data_recipe.id = data_cst.recipe_id
> WHERE data_target.id IN (172)
>   AND data_cst.date_time BETWEEN '2015-01-26 00:00:00'
>                              AND '2015-01-26 23:59:59'
> GROUP BY wafer_id, data_cst.lot_id, target_name_id
>
> ... snipped ...
> Faster query:
>
> SELECT MIN(data_tool.name) as tool,
>        MIN(data_cst.date_time) start,
>        MAX(data_cst.date_time) end,
>        MIN(data_target.name) as target,
>        MIN(data_lot.name) as lot,
>        MIN(data_wafer.name) as wafer,
>        MIN(measname) as measname,
>        MIN(data_recipe.name) as recipe
> FROM data_cst
> INNER JOIN data_tool ON data_tool.id = data_cst.tool_id
> INNER JOIN data_target ON data_target.id = data_cst.target_name_id
> INNER JOIN data_lot ON data_lot.id = data_cst.lot_id
> INNER JOIN data_wafer ON data_wafer.id = data_cst.wafer_id
> INNER JOIN data_measparams
>     ON data_measparams.id = data_cst.meas_params_name_id
> INNER JOIN data_recipe ON data_recipe.id = data_cst.recipe_id
> WHERE data_target.id IN (172)
>   AND data_recipe.id IN (148)
>   AND data_cst.date_time BETWEEN '2015-01-26 00:00:00'
>                              AND '2015-01-26 23:59:59'
> GROUP BY wafer_id, data_cst.lot_id, target_name_id
>
> ... snip ...
>
> Thanks for taking the time to read this, and for any help or pointers
> you can give me.

The biggest difference is the added selectivity generated by the WHERE
term against the data_recipe table. Compare the two EXPLAINs: in the
faster query you see that data_recipe is listed second. This allows the
additional term a chance to reduce the number of row combinations for the
entire query.

To really get at the logic behind how the Optimizer chooses its execution
plan, get an optimizer trace. Look at the cost estimates for each phase
being considered.

http://dev.mysql.com/doc/refman/5.6/en/optimizer-trace-table.html
http://dev.mysql.com/doc/internals/en/optimizer-tracing.html

Yours,
--
Shawn Green
MySQL Senior Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
Re: Help improving query performance
Hi Larry,

On 2/4/2015 3:18 PM, Larry Martell wrote:
> On Wed, Feb 4, 2015 at 2:56 PM, shawn l.green shawn.l.gr...@oracle.com wrote:
>> ... original question and reply snipped ...
>
> Thanks very much Shawn for the reply and the links. I will check those
> out and I'm sure I will find them very useful.
>
> Meanwhile I changed the query to select from data_cst using the where
> clause into a temp table, and then I join the temp table with the
> other tables. That has improved the slow query from 4 hours to 10
> seconds (!)

Did you also add an index to the temporary table for the JOIN condition?
It might make it even faster.

Yours,
--
Shawn Green
MySQL Senior Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
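Shawn's suggestion (index the temp table on the columns used in the JOIN) can be sketched concretely. The snippet below uses Python's sqlite3 as a stand-in for MySQL, with hypothetical names cst_subset and idx_cst_subset_tool for the pre-filtered temp table and its index; the point is only the pattern of filtering into a temp table, indexing it, and then joining:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data_tool (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO data_tool VALUES (1, 'toolA'), (2, 'toolB');

-- stand-in for the pre-filtered subset of data_cst
CREATE TEMPORARY TABLE cst_subset (tool_id INT, date_time TEXT);
INSERT INTO cst_subset VALUES (1, '2015-01-26 01:00:00'),
                              (2, '2015-01-26 02:00:00');

-- index the temp table on the JOIN column, as suggested
CREATE INDEX idx_cst_subset_tool ON cst_subset (tool_id);
""")

rows = con.execute("""
SELECT t.name, s.date_time
FROM cst_subset s JOIN data_tool t ON t.id = s.tool_id
ORDER BY t.name
""").fetchall()
print(rows)
```

In MySQL the index could equally be declared inline in the CREATE TEMPORARY TABLE ... SELECT, which avoids a second pass over the temp table.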
Re: Help improving query performance
On Wed, Feb 4, 2015 at 3:25 PM, shawn l.green shawn.l.gr...@oracle.com wrote:
> Hi Larry,
>
> On 2/4/2015 3:18 PM, Larry Martell wrote:
>> ... earlier exchange snipped ...
>>
>> Meanwhile I changed the query to select from data_cst using the where
>> clause into a temp table, and then I join the temp table with the
>> other tables. That has improved the slow query from 4 hours to 10
>> seconds (!)
>
> Did you also add an index to the temporary table for the JOIN
> condition? It might make it even faster.

No, I didn't. I (and the users) were so shocked and happy with the
massive improvement that I moved on to make similar changes to other
queries. This is a django app, and it's a one-shot deal - i.e. there's
just the one query run, the response is sent back to the browser, and
that's the end of the session and the temp table. So I'm thinking it's
probably not worth it.

As an aside, this change has messed up all my unit tests - they send
multiple requests, but they're all in the same session. So only the first
succeeds, and the next one fails because the temp table already exists. I
haven't figured out how to run each request in its own session. I guess
I'm going to have to drop the temp table after I join with it, before I
send the response back.
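The cleanup Larry describes can be made idempotent by dropping the temp table defensively before creating it, so repeated requests on the same connection no longer collide. A minimal sketch, with sqlite3 standing in for MySQL and a hypothetical per-request helper run_report (in MySQL the first statement would be spelled DROP TEMPORARY TABLE IF EXISTS):

```python
import sqlite3

con = sqlite3.connect(":memory:")

def run_report(con):
    """Hypothetical per-request helper: (re)build the temp table,
    query it, and drop it so the next request in the same
    connection/session does not collide."""
    con.execute("DROP TABLE IF EXISTS cst_rollup")  # idempotent cleanup
    con.execute("CREATE TEMPORARY TABLE cst_rollup (target_name_id INT)")
    con.executemany("INSERT INTO cst_rollup VALUES (?)", [(1,), (2,)])
    (n,) = con.execute("SELECT COUNT(*) FROM cst_rollup").fetchone()
    con.execute("DROP TABLE cst_rollup")  # release before responding
    return n

# Two requests in the same session no longer conflict:
print(run_report(con), run_report(con))
```

Dropping at both ends (IF EXISTS on entry, unconditional on exit) means a request that crashed mid-way cannot poison the next one.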
Re: Help improving query performance
On Wed, Feb 4, 2015 at 2:56 PM, shawn l.green shawn.l.gr...@oracle.com wrote:
> Hi Larry,
>
> On 2/1/2015 4:49 PM, Larry Martell wrote:
>> ... original question snipped ...
>
> The biggest difference is the added selectivity generated by the WHERE
> term against the data_recipe table. Compare the two EXPLAINs: in the
> faster query you see that data_recipe is listed second. This allows
> the additional term a chance to reduce the number of row combinations
> for the entire query.
>
> To really get at the logic behind how the Optimizer chooses its
> execution plan, get an optimizer trace. Look at the cost estimates for
> each phase being considered.
>
> http://dev.mysql.com/doc/refman/5.6/en/optimizer-trace-table.html
> http://dev.mysql.com/doc/internals/en/optimizer-tracing.html

Thanks very much Shawn for the reply and the links. I will check those
out and I'm sure I will find them very useful.

Meanwhile I changed the query to select from data_cst using the where
clause into a temp table, and then I join the temp table with the other
tables. That has improved the slow query from 4 hours to 10 seconds (!)
Re: Help improving query performance
Hello Larry,

On 2/4/2015 3:37 PM, Larry Martell wrote:
> On Wed, Feb 4, 2015 at 3:25 PM, shawn l.green shawn.l.gr...@oracle.com wrote:
>> ... earlier exchange snipped ...
>>
>> Did you also add an index to the temporary table for the JOIN
>> condition? It might make it even faster.
>
> No, I didn't. I (and the users) were so shocked and happy with the
> massive improvement that I moved on to make similar changes to other
> queries. This is a django app, and it's a one-shot deal - i.e. there's
> just the one query run, the response is sent back to the browser, and
> that's the end of the session and the temp table. So I'm thinking it's
> probably not worth it.
>
> As an aside, this change has messed up all my unit tests - they send
> multiple requests, but they're all in the same session. So only the
> first succeeds, and the next one fails because the temp table already
> exists. I haven't figured out how to run each request in its own
> session. I guess I'm going to have to drop the temp table after I join
> with it, before I send the response back.

If...
* it's a MEMORY temp table
* it's always the same table design

Then, you can use DELETE to clear the content (it's faster than DROP or
Re: Help optimize query.
Hello Mimiko,

Sorry for the late reply. I had a bunch of work to take care of before vacation, then there was the vacation itself. :)

On 11/13/2014 2:34 PM, Mimiko wrote:

Hello. I have this table: show create table cc_agents_tier_status_log:

CREATE TABLE cc_agents_tier_status_log (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  date_log timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  cc_agent varchar(45) NOT NULL,
  cc_agent_tier_status_id tinyint(3) unsigned NOT NULL,
  cc_queue_id tinyint(3) unsigned NOT NULL,
  cc_agent_id int(10) unsigned NOT NULL,
  cc_agent_phone smallint(5) unsigned NOT NULL,
  cc_agent_domain varchar(45) NOT NULL DEFAULT 'pbx01.apa-canal.md',
  PRIMARY KEY (id),
  KEY IDX_cc_agents_tier_status_log_2 (cc_agent) USING HASH,
  KEY IDX_cc_agents_tier_status_log_3 (date_log),
  KEY FK_cc_agents_tier_status_log_2 (cc_agent_id),
  KEY FK_cc_agents_tier_status_log_3 (cc_queue_id),
  KEY FK_cc_agents_tier_status_log_1 (cc_agent_tier_status_id) USING BTREE,
  KEY IDX_cc_agents_tier_status_log_7 (id,date_log),
  CONSTRAINT FK_cc_agents_tier_status_log_1 FOREIGN KEY (cc_agent_tier_status_id) REFERENCES cc_agent_tier_status_chart (id) ON UPDATE CASCADE,
  CONSTRAINT FK_cc_agents_tier_status_log_2 FOREIGN KEY (cc_agent_id) REFERENCES apacanal.employee (id) ON UPDATE CASCADE,
  CONSTRAINT FK_cc_agents_tier_status_log_3 FOREIGN KEY (cc_queue_id) REFERENCES cc_queues (id) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=23799 DEFAULT CHARSET=ascii

show index from cc_agents_tier_status_log:

Table  Non_unique  Key_name  Seq_in_index  Column_name  Collation  Cardinality  Sub_part  Packed  Null  Index_type  Comment  Index_comment
cc_agents_tier_status_log  0  PRIMARY  1  id  A  23999  (null)  BTREE  (null)  (null)
cc_agents_tier_status_log  1  IDX_cc_agents_tier_status_log_2  1  cc_agent  A  260  (null)  BTREE  (null)  (null)
cc_agents_tier_status_log  1  IDX_cc_agents_tier_status_log_3  1  date_log  A  23999  (null)  BTREE  (null)  (null)
cc_agents_tier_status_log  1  FK_cc_agents_tier_status_log_2  1  cc_agent_id  A  2  (null)  BTREE  (null)  (null)
cc_agents_tier_status_log  1  FK_cc_agents_tier_status_log_3  1  cc_queue_id  A  14  (null)  BTREE  (null)  (null)
cc_agents_tier_status_log  1  FK_cc_agents_tier_status_log_1  1  cc_agent_tier_status_id  A  2  (null)  BTREE  (null)  (null)
cc_agents_tier_status_log  1  IDX_cc_agents_tier_status_log_7  1  id  A  23999  (null)  BTREE  (null)  (null)
cc_agents_tier_status_log  1  IDX_cc_agents_tier_status_log_7  2  date_log  A  23999  (null)  BTREE  (null)  (null)

And the query is:

set @enddate:=now();
set @startdate:='2014-11-01';
set @que_id:=-1;

select s.theHour as theHour, avg(s.nrAgents) as nrAgents
from (select date(a.theDateHour) as theDate,
             extract(hour from a.theDateHour) as theHour,
             count(c.cc_agent_tier_status_id) as nrAgents
      from (select dh.theDateHour as theDateHour, max(c.date_log) as maxdatelog, c.*
            from (select concat(d.thedate,' ',h.theHour,':0:0') as theDateHour
                  from (select DATE(DATE_ADD(date(@startdate), INTERVAL @i:=@i+1 DAY)) as theDate
                        from (select @i:=-1) as t1
                        inner join cc_member_queue_end_log b
                          on 1=1 and b.id <= datediff(@enddate,@startdate)+1 ) as d
                  left outer join
                  (SELECT 0 AS theHour UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL SELECT 15 UNION ALL SELECT 16 UNION ALL SELECT 17 UNION ALL SELECT 18 UNION ALL SELECT 19 UNION ALL SELECT 20 UNION ALL SELECT 21 UNION ALL SELECT 22 UNION ALL SELECT 23) as h
                    on 1=1 ) AS dh
            left outer join cc_agents_tier_status_log as c
              on c.date_log=dh.theDateHour
            where (if(@queue_id < 0,1,0) or if(@queue_id=c.cc_queue_id,1,0))
            group by dh.theDateHour, c.cc_queue_id, c.cc_agent_id, c.cc_agent_phone ) as a
      left outer join cc_agents_tier_status_log as c
        on c.date_log=a.maxdatelog and c.cc_queue_id=a.cc_queue_id
           and c.cc_agent_id=a.cc_agent_id and c.cc_agent_phone=a.cc_agent_phone
           and c.cc_agent_tier_status_id=2
      group by a.theDateHour
      order by date(a.theDateHour), extract(hour from a.theDateHour) ) as s
group by s.theHour
order by s.theHour;

This query takes 20 seconds to populate. Table cc_agents_tier_status_log contains log entries of agent_id login/logout per queue per phone. status_id can have value 1 (logged out) or 2 (login) at date_log datetime. The resulting table must contain the average number of agents logged in at every hour from startdate to enddate. Hope for some hints. Thank you.

The first problem is that you are generating a lot of extra rows before you actually need them. The only place where you should be faking the
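The advice above (truncated in the archive) concerns the date-hour scaffold: it should be generated once and only as large as needed, instead of cross-joining UNION ALL selects inside the aggregation. The intended end result - an average count of distinct agents per hour across days - is easy to sketch client-side. This is an illustrative Python sketch with made-up log rows, not the thread's actual solution:

```python
from datetime import datetime, timedelta
from collections import defaultdict

def hour_grid(start, end):
    """Yield every whole hour from start up to (but excluding) end."""
    t = start.replace(minute=0, second=0, microsecond=0)
    while t < end:
        yield t
        t += timedelta(hours=1)

# The scaffold for two days is exactly 48 buckets - no more, no less.
grid = list(hour_grid(datetime(2014, 11, 1), datetime(2014, 11, 3)))
print(len(grid))  # 48

# Hypothetical log rows: (login timestamp, agent id); logout handling omitted.
logs = [(datetime(2014, 11, 1, 9, 15), 'a1'),
        (datetime(2014, 11, 1, 9, 45), 'a2'),
        (datetime(2014, 11, 2, 9, 30), 'a1')]

# Bucket distinct agents by (day, hour) ...
per_day_hour = defaultdict(set)
for ts, agent in logs:
    per_day_hour[(ts.date(), ts.hour)].add(agent)

# ... then average the per-day counts for one hour-of-day across all days.
days = {d for d, _ in per_day_hour}
avg_at_9 = sum(len(per_day_hour[(d, 9)]) for d in days) / len(days)
print(avg_at_9)  # (2 + 1) / 2 = 1.5
```

The same two-step structure (build the minimal hour grid, then aggregate the log against it) is what the SQL version is trying to achieve with its derived tables.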
Re: Help optimize query.
On 15.11.2014 01:06, Peter Brawley wrote:

Let's see the results of Explain Extended this query, result of Show Create Table cc_member_queue_end_log.

cc_member_queue_end_log is not of interest; it is used just as a series of numbers. It may be any table with ids. I've changed the query a bit, which seemed to reduce the select time, but not by a lot.

set @enddate:=now();
set @startdate:='2014-11-01';
set @que_id:=-1;

explain extended
select s.theHour as theHour, avg(s.nrAgents) as nrAgents
from (select date(FROM_UNIXTIME(a.theDateHour)) as theDate,
             extract(hour from FROM_UNIXTIME(a.theDateHour)) as theHour,
             count(c.cc_agent_tier_status_id) as nrAgents
      from (select dh.theDateHour as theDateHour, max(c.date_log) as maxdatelog, c.*
            from (select UNIX_TIMESTAMP(concat(d.thedate,' ',h.theHour,':0:0')) as theDateHour
                  from (select DATE(DATE_ADD(date('2014-11-01'), INTERVAL @i:=@i+1 DAY)) as theDate
                        from (select @i:=-1) as t1
                        inner join cc_agents_tier_status_log b
                          on 1=1 and b.id <= datediff(now(),'2014-11-01')+1 ) as d
                  straight_join
                  (SELECT 0 AS theHour UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL SELECT 15 UNION ALL SELECT 16 UNION ALL SELECT 17 UNION ALL SELECT 18 UNION ALL SELECT 19 UNION ALL SELECT 20 UNION ALL SELECT 21 UNION ALL SELECT 22 UNION ALL SELECT 23) as h
                    on 1=1 ) AS dh
            straight_join cc_agents_tier_status_log as c
              on UNIX_TIMESTAMP(c.date_log)=dh.theDateHour
            where (if(-1 < 0,1,0) or if(-1=c.cc_queue_id,1,0))
            group by dh.theDateHour, c.cc_queue_id, c.cc_agent_id, c.cc_agent_phone ) as a
      straight_join cc_agents_tier_status_log as c
        on c.date_log=a.maxdatelog and c.cc_queue_id=a.cc_queue_id
           and c.cc_agent_id=a.cc_agent_id and c.cc_agent_phone=a.cc_agent_phone
           and c.cc_agent_tier_status_id=2
      group by a.theDateHour
      order by date(FROM_UNIXTIME(a.theDateHour)), extract(hour from FROM_UNIXTIME(a.theDateHour)) ) as s
group by s.theHour
order by s.theHour\G

*** 1. row ***  id: 1  select_type: PRIMARY  table: <derived2>  type: ALL  possible_keys: NULL  key: NULL  key_len: NULL  ref: NULL  rows: 360  filtered: 100.00  Extra: Using temporary; Using filesort
*** 2. row ***  id: 2  select_type: DERIVED  table: <derived3>  type: ALL  possible_keys: NULL  key: NULL  key_len: NULL  ref: NULL  rows: 43560  filtered: 100.00  Extra: Using temporary; Using filesort
*** 3. row ***  id: 2  select_type: DERIVED  table: c  type: ref  possible_keys: IDX_cc_agents_tier_status_log_3,FK_cc_agents_tier_status_log_2,FK_cc_agents_tier_status_log_3,FK_cc_agents_tier_status_log_1  key: IDX_cc_agents_tier_status_log_3  key_len: 4  ref: a.maxdatelog  rows: 1  filtered: 100.00  Extra: Using where
*** 4. row ***  id: 3  select_type: DERIVED  table: <derived4>  type: ALL  possible_keys: NULL  key: NULL  key_len: NULL  ref: NULL  rows: 360  filtered: 100.00  Extra: Using temporary; Using filesort
*** 5. row ***  id: 3  select_type: DERIVED  table: c  type: ALL  possible_keys: NULL  key: NULL  key_len: NULL  ref: NULL  rows: 24207  filtered: 100.00  Extra: Using where; Using join buffer
*** 6. row ***  id: 4  select_type: DERIVED  table: <derived5>  type: ALL  possible_keys: NULL  key: NULL  key_len: NULL  ref: NULL  rows: 15  filtered: 100.00  Extra:
*** 7. row ***  id: 4  select_type: DERIVED  table: <derived7>  type: ALL  possible_keys: NULL  key: NULL  key_len: NULL  ref: NULL  rows: 24  filtered: 100.00  Extra: Using join buffer
*** 8. row ***  id: 7  select_type: DERIVED  table: NULL  type: NULL  possible_keys: NULL  key: NULL  key_len: NULL  ref: NULL  rows: NULL  filtered: NULL  Extra: No tables used
*** 9. row ***  id: 8  select_type: UNION  table: NULL
Re: Help optimize query.
Let's see the results of Explain Extended this query, result of Show Create Table cc_member_queue_end_log.

PB

On 2014-11-13 1:34 PM, Mimiko wrote:

Hello. I have this table: show create table cc_agents_tier_status_log: ... snip ... This query takes 20 seconds to populate. Table cc_agents_tier_status_log contains log entries of agent_id login/logout per queue per phone. status_id can have value 1 (logged out) or 2 (login) at date_log datetime. The resulting table must contain the average number of agents logged in at every hour from startdate to enddate. Hope for some hints. Thank you.

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Help with cleaning up data
delete b
from icd9x10 a
join icd9x10 b on a.icd9 = b.icd9 and a.id < b.id

...

CREATE TABLE `ICD9X10` ( ...

id  icd9   icd10
25  29182  F10182
26  29182  F10282

...

Good luck,
Bob
Re: Help with cleaning up data
On 3/29/2014 2:26 PM, william drescher wrote:

I am given a table, ICD9X10, which is a mapping of ICD9 codes to ICD10 codes. Unfortunately the table contains duplicate entries that I need to remove.

CREATE TABLE `ICD9X10` (
  `id` smallint(6) NOT NULL AUTO_INCREMENT,
  `icd9` char(8) NOT NULL,
  `icd10` char(6) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `icd9` (`icd9`,`id`),
  UNIQUE KEY `icd10` (`icd10`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=671 DEFAULT CHARSET=ascii

id  icd9   icd10
25  29182  F10182
26  29182  F10282
27  29182  F10982

I just can't think of a way to write a query to delete the duplicates. Does anyone have a suggestion?

bill

Thanks for all the suggestions. I learned a lot, which is the most important part of the exercise.

bill
Re: Help with cleaning up data
Hi Bill,

How big is your table? It seems to me that you might want to change your unique keys to something like (icd9, icd10), thus guaranteeing that every mapping will exist only once in your table. You could create a new table with that constraint and copy all your data to it:

CREATE TABLE `ICD9X10_2` (
  `id` smallint(6) NOT NULL AUTO_INCREMENT,
  `icd9` char(8) NOT NULL,
  `icd10` char(6) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `icd9_icd10` (`icd9`,`icd10`)
) ENGINE=InnoDB DEFAULT CHARSET=ascii

INSERT IGNORE INTO ICD9X10_2 SELECT * FROM ICD9X10; -- This will skip the duplicates --

Once you've checked the new table and it looks fine to you, you can swap them:

RENAME TABLE ICD9X10 TO ICD9X10_old, ICD9X10_2 TO ICD9X10;

Or, alternatively, you can directly alter your table by adding that unique index like this:

ALTER IGNORE TABLE ICD9X10 ADD UNIQUE KEY (ICD9, ICD10);

Hope that helps.

2014-03-29 18:26 GMT+00:00 william drescher will...@techservsys.com:

I am given a table, ICD9X10, which is a mapping of ICD9 codes to ICD10 codes. Unfortunately the table contains duplicate entries that I need to remove. ... snip ... I just can't think of a way to write a query to delete the duplicates. Does anyone have a suggestion?

bill
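The copy-with-a-unique-key route is easy to try out with SQLite from Python, whose INSERT OR IGNORE plays the same role as MySQL's INSERT IGNORE here. The sample rows are made up; note also that later MySQL versions dropped ALTER IGNORE TABLE, so the copy-and-rename route is the more portable of the two suggestions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE ICD9X10 (id INTEGER PRIMARY KEY, icd9 TEXT, icd10 TEXT);
INSERT INTO ICD9X10 (icd9, icd10) VALUES
  ('29182', 'F10182'),
  ('29182', 'F10182'),   -- exact duplicate of the mapping above
  ('29182', 'F10282');
-- New table with the mapping itself declared unique.
CREATE TABLE ICD9X10_2 (
  id INTEGER PRIMARY KEY,
  icd9 TEXT NOT NULL, icd10 TEXT NOT NULL,
  UNIQUE (icd9, icd10)
);
-- OR IGNORE (MySQL: INSERT IGNORE) silently skips rows that would
-- violate the unique key, which drops the duplicate mappings.
INSERT OR IGNORE INTO ICD9X10_2 SELECT * FROM ICD9X10;
""")
rows = cur.execute("SELECT icd9, icd10 FROM ICD9X10_2 ORDER BY id").fetchall()
print(rows)  # [('29182', 'F10182'), ('29182', 'F10282')]
```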
Re: Help with cleaning up data
On 29-03-2014 19:26, william drescher wrote:

I am given a table, ICD9X10, which is a mapping of ICD9 codes to ICD10 codes. Unfortunately the table contains duplicate entries that I need to remove. ... I just can't think of a way to write a query to delete the duplicates. Does anyone have a suggestion?

http://bit.ly/1hKCVHi
RE: Help with cleaning up data
Bill, here is one approach.

The following query will return the id's that should NOT be deleted:

Select min(id) from icd9x10 group by icd9, icd10

Once you run it and are happy with the results, then you subquery it in a DELETE statement. Something like:

Delete from icd9x10 A where A.id not in (Select min(B.id) from icd9x10 B group by B.icd9, B.icd10)

I have not tested it (sorry, it is a weekend here...), but I hope it will lead you in the right direction.

David.

David Lerer | Director, Database Administration | Interactive | 605 Third Avenue, 12th Floor, New York, NY 10158 Direct: (646) 487-6522 | Fax: (646) 487-1569 | dle...@univision.net | www.univision.net

-----Original Message-----
From: william drescher [mailto:will...@techservsys.com]
Sent: Saturday, March 29, 2014 2:26 PM
To: mysql@lists.mysql.com
Subject: Help with cleaning up data

I am given a table, ICD9X10, which is a mapping of ICD9 codes to ICD10 codes. Unfortunately the table contains duplicate entries that I need to remove. ... snip ... I just can't think of a way to write a query to delete the duplicates. Does anyone have a suggestion?

bill

The information contained in this e-mail and any attached documents may be privileged, confidential and protected from disclosure. If you are not the intended recipient you may not read, copy, distribute or use this information. If you have received this communication in error, please notify the sender immediately by replying to this message and then delete it from your system.
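One caveat with this form: MySQL rejects a DELETE whose subquery reads from the table being deleted (error 1093, "You can't specify target table ... for update in FROM clause"), so in MySQL the subquery usually has to be wrapped in an extra derived table. SQLite has no such restriction, which makes it convenient for checking the keep-the-minimum-id logic itself (sample rows are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE icd9x10 (id INTEGER PRIMARY KEY, icd9 TEXT, icd10 TEXT);
INSERT INTO icd9x10 VALUES
  (25, '29182', 'F10182'),
  (26, '29182', 'F10182'),   -- exact duplicate mapping of id 25
  (27, '29182', 'F10982');
-- Keep only the lowest id of each (icd9, icd10) group.
DELETE FROM icd9x10
 WHERE id NOT IN (SELECT MIN(id) FROM icd9x10 GROUP BY icd9, icd10);
""")
ids = [r[0] for r in cur.execute("SELECT id FROM icd9x10 ORDER BY id")]
print(ids)  # [25, 27]
```

For MySQL, the equivalent workaround is to wrap the aggregate in a derived table, along the lines of `... NOT IN (SELECT id FROM (SELECT MIN(id) id FROM icd9x10 GROUP BY icd9, icd10) t)`.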
Re: help: innodb database cannot recover
boah, you *must not* remove ibdata1 - it contains the global tablespace even with file_per_table. ib_logfile0 and ib_logfile1 may be removed, but make sure you have an as consistent as possible backup of the whole datadir.

I removed ib_logfile0 and ib_logfile1 and restarted mysql with innodb_force_recovery=1; mysql keeps crashing and restarting:

thd: 0x0
Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong...
stack_bottom = (nil) thread_stack 0x3
/usr/libexec/mysqld(my_print_stacktrace+0x2e) [0x84bbbae]
/usr/libexec/mysqld(handle_segfault+0x4bc) [0x81eca1c]
[0xf57fe400]
[0xf57fe416]
/lib/libc.so.6(gsignal+0x51) [0x45a7bb71]
/lib/libc.so.6(abort+0x17a) [0x45a7d44a]
/usr/libexec/mysqld(fil_io+0x377) [0x83ba177]
/usr/libexec/mysqld() [0x83a257b]
/usr/libexec/mysqld(buf_read_page+0x282) [0x83a3132]
/usr/libexec/mysqld(buf_page_get_gen+0x351) [0x839c111]
/usr/libexec/mysqld(btr_cur_search_to_nth_level+0x3c1) [0x838ca31]
/usr/libexec/mysqld(row_search_index_entry+0x79) [0x840d3c9]
/usr/libexec/mysqld() [0x840bf97]
/usr/libexec/mysqld(row_purge_step+0x574) [0x840d1e4]
/usr/libexec/mysqld(que_run_threads+0x535) [0x83fa815]
/usr/libexec/mysqld(trx_purge+0x365) [0x8427e25]
/usr/libexec/mysqld(srv_master_thread+0x75b) [0x842009b]
/lib/libpthread.so.0() [0x45bf09e9]
/lib/libc.so.6(clone+0x5e) [0x45b2dc2e]

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash.

130620 00:47:21 mysqld_safe Number of processes running now: 0
130620 00:47:21 mysqld_safe mysqld restarted
InnoDB: Error: tablespace size stored in header is 456832 pages, but
InnoDB: the sum of data file sizes is only 262080 pages
InnoDB: Cannot start InnoDB. The tail of the system tablespace is
InnoDB: missing. Have you edited innodb_data_file_path in my.cnf in an
InnoDB: inappropriate way, removing ibdata files from there?
InnoDB: You can set innodb_force_recovery=1 in my.cnf to force
InnoDB: a startup if you are trying to recover a badly corrupt database.
130620 0:47:22 [ERROR] Plugin 'InnoDB' init function returned error.
130620 0:47:22 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

If I set innodb_force_recovery=4 to restart mysql and then run mysqldump, I get the following error:

mysqldump: Got error: 2013: Lost connection to MySQL server during query when using LOCK TABLES

It looks like all data from innodb is messed up and gone forever even though the *.frm files are still there.

Peter
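For reference, the force-recovery setting discussed here goes under [mysqld] in my.cnf. This is a minimal illustrative sketch, not a config from the thread; the general advice is to use the lowest level that lets mysqld stay up long enough to take a mysqldump, since higher levels are progressively more dangerous to the data:

```ini
[mysqld]
# 1..6; start at 1 and raise only as far as needed to get the server up.
# Newer MySQL versions put InnoDB in read-only mode at 4 and above,
# which is one reason a level-4 mysqldump can behave differently.
innodb_force_recovery = 1
```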
Re: help: innodb database cannot recover
As a matter of dumb questions: what versions are the old and new mysqld, and are they running on the same platform (OS, 32/64 bit, ...)?

- Original Message -
From: Peter one2001...@yahoo.com
To: Reindl Harald h.rei...@thelounge.net, mysql@lists.mysql.com
Sent: Friday, 21 June, 2013 10:04:27 AM
Subject: Re: help: innodb database cannot recover

I removed ib_logfile0 and ib_logfile1 and restarted mysql with innodb_force_recovery=1, mysql keeps crashing and restart: ... snip ... it looks that all data from innodb is messed up and gone forever even though *.frm is still there.

Peter

--
Unhappiness is discouraged and will be corrected with kitten pictures.
Re: help: innodb database cannot recover
On 20.06.2013 10:11, Peter wrote:

130620 00:47:08 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
InnoDB: Error: tablespace size stored in header is 456832 pages, but
InnoDB: the sum of data file sizes is only 262080 pages
InnoDB: Cannot start InnoDB. The tail of the system tablespace is
InnoDB: missing

is there a way to start the database again? Thanks for your help in advance

restore your backups - that is one reason why replication exists: to have a slave which a) does not die with the master in a complete crash and b) is a machine where offline backups can be pulled without any downtime
Re: help: innodb database cannot recover
2013/6/20 Peter one2001...@yahoo.com Hello, I copied innodb database (ib_logfile0 ib_logfile1 ibdata1 and the whole database directory) from one crashed machine to another. I find that I cannot start database to get the database data any more. How did you copy the database? Manuel
Re: help: innodb database cannot recover
On 20.06.2013 15:18, Peter wrote:

2013/6/20 Peter one2001...@yahoo.com

Hello, I copied the innodb database (ib_logfile0, ib_logfile1, ibdata1 and the whole database directory) from one crashed machine to another. I find that I cannot start the database to get the database data any more.

How did you copy the database?

Manuel

I copy the files ib_logfile0, ib_logfile1 and ibdata1 into /var/lib/mysql in linux, and the whole database directory my_database_name into /var/lib/mysql/my_database_name, the same location as on the previous machine /var/lib/mysql/my_database_name

and your my.cnf? you hardly can move innodb around without the same settings
Re: help: innodb database cannot recover
2013/6/20 Peter one2001...@yahoo.com Hello, I copied innodb database (ib_logfile0 ib_logfile1 ibdata1 and the whole database directory) from one crashed machine to another. I find that I cannot start database to get the database data any more. How did you copy the database? Manuel I copy the files ib_logfile0 ib_logfile1 ibdata1 into /var/lib/mysql in linux and the whole database directory my_database_name into /var/lib/mysql/my_database_name, the same location as previous machine /var/lib/mysql/my_database_name Thanks. Peter
Re: help: innodb database cannot recover
2013/6/20 Peter one2001...@yahoo.com 2013/6/20 Peter one2001...@yahoo.com Hello, I copied innodb database (ib_logfile0 ib_logfile1 ibdata1 and the whole database directory) from one crashed machine to another. I find that I cannot start database to get the database data any more. How did you copy the database? Manuel I copy the files ib_logfile0 ib_logfile1 ibdata1 into /var/lib/mysql in linux and the whole database directory my_database_name into /var/lib/mysql/my_database_name, the same location as previous machine /var/lib/mysql/my_database_name Was MySQL daemon stopped in both machines? It must be stopped in both of them. If you cannot afford stopping your mysql, you might want to take a look at xtrabackup: http://www.percona.com/doc/percona-xtrabackup/2.1/ Manuel.
Re: help: innodb database cannot recover
On 20.06.2013 15:18, Peter wrote:

I copied the innodb database (ib_logfile0, ib_logfile1, ibdata1 and the whole database directory) from one crashed machine to another. I find that I cannot start the database to get the database data any more.

How did you copy the database?

Manuel

I copy the files ib_logfile0, ib_logfile1 and ibdata1 into /var/lib/mysql in linux, and the whole database directory my_database_name into /var/lib/mysql/my_database_name, the same location as on the previous machine.

and your my.cnf? you hardly can move innodb around without the same settings

the my.cnf file is the same, as follows:
Re: help: innodb database cannot recover
On 20.06.2013 15:18, Peter wrote:

I copied the innodb database (ib_logfile0, ib_logfile1, ibdata1 and the whole database directory) from one crashed machine to another. I find that I cannot start the database to get the database data any more.

How did you copy the database?

Manuel

I copy the files ib_logfile0, ib_logfile1 and ibdata1 into /var/lib/mysql in linux, and the whole database directory my_database_name into /var/lib/mysql/my_database_name, the same location as on the previous machine.

and your my.cnf? you hardly can move innodb around without the same settings

sorry for the previous post missing the file. the my.cnf file is the same in both machines, as follows:

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

Peter
Re: help: innodb database cannot recover
2013/6/20 Peter one2001...@yahoo.com ... snip ...

Was the MySQL daemon stopped on both machines? It must be stopped on both of them. If you cannot afford stopping your mysql, you might want to take a look at xtrabackup: http://www.percona.com/doc/percona-xtrabackup/2.1/

yes, the mysql daemon was stopped during the copy. actually, the old machine cannot run anymore; I copied the data over by moving its disk to another machine.

Peter
Re: help: innodb database cannot recover
Hello, I copied the innodb database (ib_logfile0, ib_logfile1, ibdata1 and the whole database directory) from one crashed machine to another. I find that I cannot start the database to get the database data any more.

How did you copy the database?

Manuel

... snip ...

peter

If I remove ib_logfile0, ib_logfile1 and ibdata1 and then restart mysql, I am still unable to open the .frm file though it exists. Here is the error I got:

130620 14:28:18 [ERROR] Cannot find or open table my_database_name/my_database_table from the internal data dictionary of InnoDB though the .frm file for the table exists. Maybe you have deleted and recreated InnoDB data files but have forgotten to delete the corresponding .frm files of InnoDB tables, or you have moved .frm files to another database? Or, the table contains indexes that this version of the engine doesn't support. See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html how you can resolve the problem.

The .frm files are all there; I just don't know how to open them.

Peter
Re: help: innodb database cannot recover
On 20.06.2013 23:47, Peter wrote:

... snip ...

If I remove ib_logfile0, ib_logfile1 and ibdata1 and then restart mysql, I am still unable to open the .frm file though it exists. Here is the error I got

boah, you *must not* remove ibdata1 - it contains the global tablespace even with file_per_table. ib_logfile0 and ib_logfile1 may be removed, but make sure you have an as consistent as possible backup of the whole datadir.
Re: help with mysql db names
On 19.04.2013 06:49, Kapil Karekar wrote:

snip

Though I would recommend not using such names. Some poor guy working on your application six months down the line is going to wonder why his queries are failing, spend a day trying to figure out why, and will post the same question again to this list :-)

...not to mention the many query-generator tools that might come into use, which will fail because they do not consistently use back-ticks.

NEVER use identifiers - database or otherwise - that start with a digit.

/ Carsten
Re: help with mysql db names
On 19-Apr-2013, at 9:14 AM, Doug d...@hacks.perl.sh wrote: why these db names created fail but the last one gets success?

mysql> create database 3208e1c6aa32;
mysql> create database 208e1c6aa32;
mysql> create database 08e1c6aa32;
mysql> create database 8e1c6aa32;

These are not working because MySQL is interpreting the database names as expressions. Look closely at the db names: 3208e+1, 208e+1, 08e+1, 8e+1.

mysql> create database e1c6aa32;
Query OK, 1 row affected (0.01 sec)

This works fine. Suggestion: don't use database names starting with integers unless you use backticks. This will work for you:

mysql> create database `3208e1c6aa32`;

Though I would recommend not using such names. Some poor guy working on your application six months down the line is going to wonder why his queries are failing, spend a day trying to figure out and will post the same question again to this list :-) Cheers! Kapil Karekar Managing Director @ Ask DB Experts http://www.askdbexperts.com
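A small illustration of the parsing problem described above (the database names are just examples): unquoted, the leading digits plus 'e' read as a number in scientific notation, while backticks make the whole token an identifier.

```sql
SELECT 8e1;                      -- parses as the float 80, which is why an
                                 -- unquoted name like 8e1c6aa32 is rejected
CREATE DATABASE `8e1c6aa32`;     -- backticked, the token is an identifier
CREATE DATABASE `3208e1c6aa32`;  -- works for the other failing names too
```

Any client or tool that later touches these databases must quote the names just as consistently, which is the maintenance trap Carsten warns about.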
Re: help with mysql db names
On 2013-04-19, Doug d...@hacks.perl.sh wrote: why these db names created fail but the last one gets success? [snips]

mysql> create database 3208e1c6aa32;
mysql> create database 208e1c6aa32;
mysql> create database 08e1c6aa32;
mysql> create database 8e1c6aa32;
mysql> create database e1c6aa32;

https://dev.mysql.com/doc/refman/5.5/en/identifiers.html You could try quoting your db names and see if you have any more success. --keith -- kkel...@wombat.san-francisco.ca.us
Re: Help restoring database: MacOS Server (Snow Leopard)
Okay, panic over. I recursively stripped the ACLs and things are working. Next time I drop a table from phpMyAdmin, I'll carefully read the little thing that pops up saying I'm about to drop an entire database... :-( (One gets so "yea, whatever" about warning notifiers...) Thanks to all who sent helpful suggestions!

On 2013-01-09, at 07:33, Jan Steinman wrote: I accidentally dropped a crucial database. My only backup is via Apple's Time Machine. First, I stopped mysqld and copied (via tar) the database in question from the backup. Restarted, but drat -- most of the tables were apparently using InnoDB's ibdata1 file, as only the MyISAM tables showed up in phpMyAdmin. I copied the ibdata1, but then mysqld wouldn't start, complaining about no mysql.sock, which was odd, because it was there. I then copied (via tar) the entire set of all databases, but am still having problems, I think related to Time Machine's ACL lists and extended attributes. Anyone have experience and wise words on restoring a database from Time Machine? Thanks!

Books are associated with communities of writers, printers, proofreaders and a host of other people with whom the writer interacts... Granted, there are some websites that provide well written and researched articles and information. Unfortunately, many are not. The Internet may be more of a time using machine than TV. And its usage is not always positive. -- Pat Murphy Jan Steinman, EcoReality Co-op

The day Microsoft makes something that doesn't suck is probably the day they start making vacuum cleaners. -- Ernst Jan Plugge Jan Steinman, EcoReality Co-op
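The "recursively stripped the ACLs" fix above can be sketched as a command outline (treat as pseudocode; the datadir path and service commands are assumptions for Snow Leopard Server, untested here):

```
# Untested outline; adjust /var/mysql to your actual MySQL datadir.
sudo serveradmin stop mysql        # stop MySQL before touching the files
sudo chmod -R -N /var/mysql        # -N removes all ACL entries (macOS chmod)
sudo xattr -rc /var/mysql          # recursively clear extended attributes
sudo chown -R _mysql:_mysql /var/mysql   # _mysql is the MySQL user on OS X Server
sudo serveradmin start mysql
```

Time Machine restores carry ACLs and extended attributes that can leave mysqld unable to read its own files even when the Unix permission bits look correct.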
Re: Help restoring database: MacOS Server (Snow Leopard)
On 09.01.2013 16:33, Jan Steinman wrote: snip I then copied (via tar) the entire set of all databases, but am still having problems, I think related to Time Machine's ACL lists and extended attributes.

What does "I think" mean? Jesus. Show us the mysqld log.
Re: Help restoring database: MacOS Server (Snow Leopard)
Hi, It is not very surprising that the database cannot recover from a Time Machine backup. This generally applies to any software that is running at the moment the backup is taken. InnoDB is especially sensitive to taking what is called a 'dirty' backup because it has a cache. You may get some luck with innodb_force_recovery. Either way, it is going to be an unpleasant experience, best avoided by a MySQL-specific backup strategy. Peace, Karen.

On 09.01.2013, at 7:33, Jan Steinman wrote: snip
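The innodb_force_recovery option mentioned above goes in my.cnf; a minimal sketch (this is the documented escalation approach, not something from the thread):

```ini
[mysqld]
# Start at 1 and raise one step at a time, up to a maximum of 6, until
# mysqld manages to start. At 4 and above the server blocks writes, so
# the goal is only to mysqldump what is readable and then rebuild the
# instance from the dump with the option removed.
innodb_force_recovery = 1
```

Higher values make InnoDB skip progressively more of its startup consistency work, so they can also discard more data; the documentation advises never leaving this set in normal operation.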
Re: Help with left outer join
On 12/11/2012 7:22 PM, h...@tbbs.net wrote: ... (Are all the distinct ids really needed? When one joins on a field with the same name in both tables, one may use 'USING', and only the common field, with neither NULL, shows up in the output.)

This is a perfectly acceptable naming convention to use. For example if you have a field on the `art` table that references the ID column of the `person` table to indicate the owner and another field to indicate the person who created the art, you might want to use the names `owner_person_id` and `artist_person_id` to keep them separate from the `id` column used to uniquely identify the work of art itself. In this design pattern, each table has a numeric ID column (string-based primary keys are perfectly legal but have their drawbacks and should be used with care) and to reference it from another table you can use the pattern parenttable_id. It keeps your naming conventions clean and predictable.

If I were to try to use a USING operator in my opening example, I would be trying to match the PK fields of two separate types of data.

(the USING example)
SELECT ... FROM art INNER JOIN person USING(id) ...

Let's say that I renamed the id fields to art_id and person_id to make them table-specific. This still fails because a person's identifier as an owner is not the same as a work of art's creator. It also does not allow me to use the `person` table more than once in a single query.

(FAIL: a renamed USING example)
SELECT ... FROM art INNER JOIN person USING(person_id) --- does this refer to the owner or the creator of the art?

(the name template example)
SELECT ... FROM art INNER JOIN person owner ON art.owner_person_id = owner.id INNER JOIN person artist ON art.artist_person_id = artist.id ...

I admit I am a control freak when it comes to the accuracy of my queries, the integrity of my data, and the ease of maintenance for my SQL statements.
Because of this, I much prefer the regular predictability of the ANSI JOIN syntax (demonstrated above) for any JOIN except for an intentional Cartesian product. For that case alone, I prefer a comma join:

(Cartesian product example)
SELECT ... FROM table1, table2 ...

Not only is the ANSI syntax the only way to specify an OUTER join (such as LEFT JOIN or RIGHT JOIN), it also forces the author of the statement to recognize that they do or do not have an ON clause for this table relationship. Separating the same conditions into a WHERE clause makes it very easy to overlook a table relationship and accidentally create a Cartesian product, which can often devastate query performance. It is perfectly legal to use the comma-join syntax with MySQL but I strongly recommend against it, just because it can only be used for INNER joins or CROSS joins and because it forces you to put your relationship conditions in the WHERE clause.

Another strike (to me, anyway) against the comma join is that in order to process joins more like the specifications in the SQL standards, we demoted the precedence of the comma operator with 5.0.12. These changes also affected the behavior of the USING and NATURAL JOIN operators. http://dev.mysql.com/doc/refman/5.5/en/join.html

Regards, -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN
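Shawn's parenttable_id pattern can be pulled together into one self-contained sketch (table and column definitions here are illustrative, inferred from his prose, not a schema from the thread):

```sql
-- Sketch of the parenttable_id naming convention.
CREATE TABLE person (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
);

CREATE TABLE art (
  id               INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title            VARCHAR(200) NOT NULL,
  owner_person_id  INT UNSIGNED NOT NULL,   -- who owns the piece
  artist_person_id INT UNSIGNED NOT NULL,   -- who created it
  FOREIGN KEY (owner_person_id)  REFERENCES person (id),
  FOREIGN KEY (artist_person_id) REFERENCES person (id)
);

-- The same table joined twice under different aliases, which USING(id)
-- could never express:
SELECT a.title, owner.name AS owner_name, artist.name AS artist_name
FROM art a
INNER JOIN person owner  ON owner.id  = a.owner_person_id
INNER JOIN person artist ON artist.id = a.artist_person_id;
```

Because the two foreign keys name their role as well as their target table, neither the schema reader nor the query optimizer has to guess which person each join refers to.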
Re: Help with left outer join
On Wed, Dec 12, 2012 at 8:25 AM, Shawn Green shawn.l.gr...@oracle.com wrote: snip

Thanks very much Shawn for this very informative post. I learned SQL in the early 1980s and I was not taught the JOIN syntax, only the comma join and WHERE. It's really just force of habit that I write queries that way. I will try very hard in the future to break that habit and use the JOIN syntax.
Re: Help with left outer join
On Tue, Dec 11, 2012 at 8:48 PM, Peter Brawley peter.braw...@earthlink.net wrote:

ERROR 1054 (42S22): Unknown column 'data_tool.category_id' in 'on clause' But category_id is a column in data_tool.

Then a bit of reordering is required ...

SELECT data_target.name, ep, wafer_id, lot_id, date_time, data_file_id, data_cstimage.name, bottom, wf_file_path_id, data_measparams.name, vacc, data_category.name
FROM data_cst
RIGHT JOIN data_cstimage ON data_cstimage.id = data_cst.image_measurer_id
INNER JOIN data_tool ON data_tool.id = data_cst.tool_id
INNER JOIN data_target ON data_target.id = data_cst.target_name_id
INNER JOIN data_measparams ON data_measparams.id = data_cst.meas_params_name_id
INNER JOIN data_category ON data_category.id = data_tool.category_id
WHERE data_cst.target_name_id IN (38018, 29947, 28330)
AND data_cst.date_time BETWEEN '2012-09-01 00:00:00' AND '2012-09-07 00:00:00'
ORDER BY target_name_id, ep, wafer_id, lot_id, date_time;

Your query worked exactly like my original one - i.e. when data_cst.image_measurer_id is NULL I don't get that data_cst row. But I changed the RIGHT JOIN to a LEFT JOIN and then it was doing exactly what I wanted. Thanks for all the help!

On 2012-12-11 5:11 PM, Larry Martell wrote: snip
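Spelling out the final form Larry describes (Peter's reordered query with the RIGHT JOIN swapped for a LEFT JOIN, so data_cst rows survive even when image_measurer_id is NULL):

```sql
SELECT data_target.name, ep, wafer_id, lot_id, date_time, data_file_id,
       data_cstimage.name, bottom, wf_file_path_id, data_measparams.name,
       vacc, data_category.name
FROM data_cst
LEFT JOIN data_cstimage    ON data_cstimage.id    = data_cst.image_measurer_id
INNER JOIN data_tool       ON data_tool.id        = data_cst.tool_id
INNER JOIN data_target     ON data_target.id      = data_cst.target_name_id
INNER JOIN data_measparams ON data_measparams.id  = data_cst.meas_params_name_id
INNER JOIN data_category   ON data_category.id    = data_tool.category_id
WHERE data_cst.target_name_id IN (38018, 29947, 28330)
  AND data_cst.date_time BETWEEN '2012-09-01 00:00:00' AND '2012-09-07 00:00:00'
ORDER BY target_name_id, ep, wafer_id, lot_id, date_time;
```

With LEFT JOIN, an unmatched data_cstimage row yields NULL in data_cstimage.name instead of dropping the data_cst row, which is exactly the behavior asked for in the original question.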
Re: Help with left outer join
On Tue, Dec 11, 2012 at 7:22 PM, h...@tbbs.net wrote: snip (Are all the distinct ids really needed? When one joins on a field with the same name in both tables, one may use 'USING', and only the common field, with neither NULL, shows up in the output.)

I'm not familiar with the USING clause. I'll have to look into that. Thanks!
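For reference, USING(col) is shorthand for an equality ON clause over a column both tables share by name (the table and column names below are illustrative, not from the thread's schema):

```sql
-- These two joins are equivalent when both tables have customer_id:
SELECT o.order_id, c.name
FROM orders o INNER JOIN customers c USING (customer_id);

SELECT o.order_id, c.name
FROM orders o INNER JOIN customers c ON o.customer_id = c.customer_id;
-- With USING, SELECT * emits the shared column only once (coalesced).
```

As Shawn's reply elsewhere in this thread points out, USING only works when the join column has the same name and meaning on both sides, which is why it cannot express role-specific keys like owner_person_id.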
Re: Help with left outer join
On 2012-12-11 3:19 PM, Larry Martell wrote: I have this query:

SELECT data_target.name, ep, wafer_id, lot_id, date_time, data_file_id, data_cstimage.name, bottom, wf_file_path_id, data_measparams.name, vacc, data_category.name
FROM data_cst, data_target, data_cstimage, data_measparams, data_category, data_tool
WHERE data_cst.target_name_id IN (38018, 29947, 28330)
AND data_cst.date_time BETWEEN '2012-09-01 00:00:00' AND '2012-09-07 00:00:00'
AND data_target.id = data_cst.target_name_id
AND data_cstimage.id = data_cst.image_measurer_id
AND data_measparams.id = data_cst.meas_params_name_id
AND data_category.id = data_tool.category_id
AND data_tool.id = data_cst.tool_id
ORDER BY target_name_id, ep, wafer_id, lot_id, date_time

My problem is that when data_cst.image_measurer_id is NULL I don't get that data_cst row even though all the other parts of the where clause are TRUE. I understand why that is, but in that case I want the row, but with NULL in the data_cstimage.name column.

Explicit join syntax makes such queries easier to read, write, develop, and debug. Is this what you mean?

SELECT data_target.name, ep, wafer_id, lot_id, date_time, data_file_id, data_cstimage.name, bottom, wf_file_path_id, data_measparams.name, vacc, data_category.name
FROM data_cst
RIGHT JOIN data_cstimage ON data_cstimage.id = data_cst.image_measurer_id
INNER JOIN data_target ON data_target.id = data_cst.target_name_id
INNER JOIN data_measparams ON data_measparams.id = data_cst.meas_params_name_id
INNER JOIN data_category ON data_category.id = data_tool.category_id
INNER JOIN data_tool ON data_tool.id = data_cst.tool_id
WHERE data_cst.target_name_id IN (38018, 29947, 28330)
AND data_cst.date_time BETWEEN '2012-09-01 00:00:00' AND '2012-09-07 00:00:00'
ORDER BY target_name_id, ep, wafer_id, lot_id, date_time;

PB http://www.artfulsoftware.com

I think I need a left outer join, but I've been messing with this for hours, and I can't get the syntax right.
I've googled it, but all the examples are simple with just 2 tables. Can someone help me with this? TIA! -larry
Re: Help with left outer join
On Tue, Dec 11, 2012 at 5:12 PM, Peter Brawley peter.braw...@earthlink.net wrote: snip

Thanks very much for the reply. This is giving me:

ERROR 1054 (42S22): Unknown column 'data_tool.category_id' in 'on clause'

But category_id is a column in data_tool.
Re: Help with left outer join
2012/12/11 16:19 -0500, Larry Martell I have this query: SELECT data_target.name, ep, wafer_id, lot_id, date_time, data_file_id, data_cstimage.name, bottom, wf_file_path_id, data_measparams.name, vacc, data_category.name FROM data_cst, data_target, data_cstimage, data_measparams, data_category, data_tool WHERE data_cst.target_name_id IN (38018, 29947, 28330) AND data_cst.date_time BETWEEN '2012-09-01 00:00:00' AND '2012-09-07 00:00:00' AND data_target.id = data_cst.target_name_id AND data_cstimage.id = data_cst.image_measurer_id AND data_measparams.id = data_cst.meas_params_name_id AND data_category.id = data_tool.category_id AND data_tool.id = data_cst.tool_id ORDER BY target_name_id, ep, wafer_id, lot_id, date_time My problem is that when data_cst.image_measurer_id is NULL I don't get that data_cst row even though all the other part of the where clause are TRUE. I understand why that is, but in that case I want the row, but with NULL in the data_cstimage.name column. I think I need a left outer join, but I've been messing with this for hours, and I can't get the syntax right. I've googled it, but all the examples are simple with just 2 tables. Can someone help me with this? Modern forms do not give a left join if one uses WHERE-clause to reduce a full cross-join to an inner join. It is better to start with something like this, FROM data_cst JOIN data_target ON data_target.id = data_cst.target_name_id JOIN data_cstimage ON data_cstimage.id = data_cst.image_measurer_id JOIN data_measparams ON data_measparams.id = data_cst.meas_params_name_id JOIN (data_category JOIN data_tool ON data_category.id = data_tool.category_id) ON data_tool.id = data_cst.tool_id but I am not too sure where to bracket data_tool. When you have put it into a 'FROM'-clause with 'JOIN', not comma, separating the tables, with the same outcome as now, then you have to decide before which 'JOIN' to put the 'LEFT'. Maybe you want it between data_cstimage and data_measparams. 
(Are all the distinct ids really needed? When one joins on a field with the same name in both tables, one may use 'USING', and only the common field, with neither NULL, shows up in the output.)
Re: Help with left outer join
ERROR 1054 (42S22): Unknown column 'data_tool.category_id' in 'on clause' But category_id is a column in data_tool. Then a bit of reordering is required ... SELECT data_target.name, ep, wafer_id, lot_id, date_time, data_file_id, data_cstimage.name, bottom, wf_file_path_id, data_measparams.name, vacc, data_category.name FROM data_cst RIGHT JOIN data_cstimage ON data_cstimage.id = data_cst.image_measurer_id INNER JOIN data_tool ON data_tool.id = data_cst.tool_id INNER JOIN data_target ON data_target.id = data_cst.target_name_id INNER JOIN data_measparams ON data_measparams.id = data_cst.meas_params_name_id INNER JOIN data_category ON data_category.id = data_tool.category_id WHERE data_cst.target_name_id IN (38018, 29947, 28330) AND data_cst.date_time BETWEEN '2012-09-01 00:00:00' AND '2012-09-07 00:00:00' ORDER BY target_name_id, ep, wafer_id, lot_id, date_time; PB On 2012-12-11 5:11 PM, Larry Martell wrote: SELECT data_target.name, ep, wafer_id, lot_id, date_time, data_file_id, data_cstimage.name, bottom, wf_file_path_id, data_measparams.name, vacc, data_category.name FROM data_cst RIGHT JOIN data_cstimage ON data_cstimage.id = data_cst.image_measurer_id INNER JOIN data_target ON data_target.id = data_cst.target_name_id INNER JOIN data_measparams ON data_measparams.id = data_cst.meas_params_name_id INNER JOIN data_category ON data_category.id = data_tool.category_id INNER JOIN data_tool ON data_tool.id = data_cst.tool_id WHERE data_cst.target_name_id IN (38018, 29947, 28330) AND data_cst.date_time BETWEEN '2012-09-01 00:00:00' AND '2012-09-07 00:00:00' ORDER BY target_name_id, ep, wafer_id, lot_id, date_time; Thanks very much for the reply. This is giving me: ERROR 1054 (42S22): Unknown column 'data_tool.category_id' in 'on clause' But category_id is a column in data_tool.
RE: Help with purging old logs for each customer ID
Off hand, I would iterate over the PRIMARY KEY, looking at a thousand rows at a time, DELETEing any that need to be purged. I would use a Perl or PHP loop, or write a stored procedure. More discussion of huge deletes (which this _could_ be): http://mysql.rjweb.org/doc.php/deletebig (PARTITIONing does not apply in your case, as I understand it.) I like the loop:

SELECT @z := customer_log_id FROM customers_log LIMIT 1000, 1;
DELETE FROM customers_log
WHERE customer_log_id >= @left_off
AND customer_log_id < @z
AND created_on < NOW() - INTERVAL 90 DAY;
sleep a few seconds (to be a nice guy)

Plus code to take care of iterating and terminating. That loop could be done continually. It seems that customer_id is irrelevant??

-Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 11:46 AM To: mysql@lists.mysql.com Subject: Help with purging old logs for each customer ID

I have a customer log table that is starting to rapidly fill up (we have hundreds of thousands of users, but many are transient, and use the service for a few months, or use the free trial and quit, etc.)

CREATE TABLE `customers_log` (
  `customer_log_id` bigint(20) unsigned NOT NULL auto_increment,
  `customer_id` int(10) unsigned default '0',
  `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `type` enum('View','Action','Admin','Search','Login','Logout','Access','General','API'),
  `source` enum('web','mobile','system'),
  `body` text,
  PRIMARY KEY (`customer_log_id`),
  KEY `created_on` (`created_on`),
  KEY `customers_id` (`customer_id`)
) ENGINE=InnoDB

What I'd like to do now is make a 'rolling log' in that I want to DELETE any entries older than 90 days for EACH `customer_id`. I'm not sure how to do that in a query? I'd rather not iterate over each customer_id if I can help it.
RE: Help with purging old logs for each customer ID
Well, the customer_id is relevant in that I want the last 90 days relative to each customer. customer_id = 123 might have logs from Jan - Mar; customer_id = 444 might have logs from Feb - May. So it's a rolling log of THEIR last 90 days, from their last (most recent) log back 90 days from there. Does that make more sense? I guess I was trying to avoid looping over every customer ID and computing if I could help it. I thought by using a GROUP BY or something it could group all the logs for a given customer and then trim them that way. But maybe brute force is the way to go?

-Original Message- From: Rick James [mailto:rja...@yahoo-inc.com] Sent: Thursday, October 25, 2012 1:09 PM To: Daevid Vincent; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID snip
RE: Help with purging old logs for each customer ID
If the 90 days is back from MAX(created_on) for a given customer... INDEX(customer_id, created_on) will probably be needed. And that should replace KEY `customers_id` (`customer_id`). Maybe...

DELETE FROM customers_log AS a
WHERE a.customer_log_id >= @left_off
AND a.customer_log_id < @z
AND a.created_on < ( SELECT MAX(created_on) FROM customers_log
                     WHERE customer_id = a.customer_id ) - INTERVAL 90 DAY;

(Since this has the subquery, I would do only 100 at a time, not 1000) Or...

CREATE TEMPORARY TABLE tmp
SELECT customer_id, MAX(created_on) - INTERVAL 90 DAY AS cutoff
FROM customers_log
GROUP BY customer_id;

DELETE a FROM customers_log AS a
JOIN tmp ON a.customer_id = tmp.customer_id
WHERE a.customer_log_id >= @left_off
AND a.customer_log_id < @z
AND a.created_on < tmp.cutoff;

If you have millions of rows, a delete without some kind of loop is asking for trouble. Or... Turning things around to base it on customers... Loop through customer_ids (yeah, you did not want to do this):

SELECT @id := 0;
StartLoop:
SELECT @id := customer_id FROM customers_log WHERE customer_id > @id ORDER BY customer_id LIMIT 1;
if @id is NULL, exit
DELETE FROM customers_log AS a
WHERE a.customer_id = @id
AND a.created_on < ( SELECT MAX(created_on) FROM customers_log
                     WHERE customer_id = @id ) - INTERVAL 90 DAY;
EndLoop.

Since there is no rush for the purging, there is little need to optimize it other than to keep it from interfering with other queries. To that end, the compound index I propose is important.

-Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 1:33 PM To: Rick James; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID snip
Does that make more sense? I guess I was trying to avoid looping over every customer ID and computing if I could help it. I thought by using a GROUP BY or something it could group all the logs for a given customer and then trim them that way. But maybe brute force is the way to go?

-Original Message-
From: Rick James [mailto:rja...@yahoo-inc.com]
Sent: Thursday, October 25, 2012 1:09 PM
To: Daevid Vincent; mysql@lists.mysql.com
Subject: RE: Help with purging old logs for each customer ID

Off hand, I would iterate over the PRIMARY KEY, looking at a thousand rows at a time, DELETEing any that need to be purged. I would use a Perl or PHP loop, or write a stored procedure. More discussion of huge deletes (which this _could_ be): http://mysql.rjweb.org/doc.php/deletebig (PARTITIONing does not apply in your case, as I understand it.)

I like the loop:

SELECT @z := customer_log_id FROM customers_log LIMIT 1000, 1;
DELETE FROM customers_log
  WHERE customer_log_id >= @left_off AND customer_log_id < @z
    AND created_on < NOW() - INTERVAL 90 DAY;
sleep a few seconds (to be a nice guy)

Plus code to take care of iterating and terminating. That loop could be done continually.

It seems that customer_id is irrelevant??

-Original Message-
From: Daevid Vincent [mailto:dae...@daevid.com]
Sent: Thursday, October 25, 2012 11:46 AM
To: mysql@lists.mysql.com
Subject: Help with purging old logs for each customer ID

I have a customer log table that is starting to rapidly fill up (we have hundreds of thousands of users, but many are transient, and use the service for a few months, or use the free trial and quit, etc.)
CREATE TABLE `customers_log` (
  `customer_log_id` bigint(20) unsigned NOT NULL auto_increment,
  `customer_id` int(10) unsigned default '0',
  `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `type` enum('View','Action','Admin','Search','Login','Logout','Access','General','API'),
  `source` enum('web','mobile','system'),
  `body` text,
  PRIMARY KEY (`customer_log_id`),
  KEY `created_on` (`created_on`),
  KEY `customers_id` (`customer_id`)
) ENGINE=InnoDB

What I'd like to do now is make a 'rolling log' in that I want to DELETE any entries older than 90 days for EACH `customer_id`. I'm not sure how to do that in a query? I'd rather not iterate over each customer_id if I can help it.

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
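Rick's temp-table answer later in the thread (compute each customer's cutoff as MAX(created_on) - 90 days, then delete older rows) can be sketched end to end. This is a toy illustration in Python with SQLite standing in for MySQL; integer "day numbers" replace real timestamps, and the row counts are made up, so treat it as a shape of the approach rather than production code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers_log (
    customer_log_id INTEGER PRIMARY KEY,
    customer_id     INTEGER,
    created_on      INTEGER   -- day number standing in for a timestamp
);
""")
# customer 1: logs on days 0..200; customer 2: logs on days 50..120
rows = [(1, d) for d in range(0, 201, 10)] + [(2, d) for d in range(50, 121, 10)]
conn.executemany(
    "INSERT INTO customers_log (customer_id, created_on) VALUES (?, ?)", rows)

# Step 1: one cutoff per customer = that customer's MAX(created_on) - 90.
conn.executescript("""
CREATE TEMP TABLE tmp AS
SELECT customer_id, MAX(created_on) - 90 AS cutoff
FROM customers_log
GROUP BY customer_id;
""")

# Step 2: delete rows older than their own customer's cutoff.  SQLite has
# no DELETE ... JOIN, so a correlated subquery stands in for it here.
conn.execute("""
DELETE FROM customers_log
WHERE created_on < (SELECT cutoff FROM tmp
                    WHERE tmp.customer_id = customers_log.customer_id)
""")

remaining = conn.execute(
    "SELECT customer_id, MIN(created_on), MAX(created_on) "
    "FROM customers_log GROUP BY customer_id ORDER BY customer_id").fetchall()
print(remaining)  # customer 1 keeps days 110..200; customer 2 keeps all of 50..120
```

Note how customer 2 is untouched: their newest log is day 120, so their rolling window reaches back to day 30, which is "the last 90 days relative to each customer" as requested.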
RE: Help with purging old logs for each customer ID
Thank you Rick! You're a super freak! ;-p This gets me in the vicinity of where I'm trying to go and I learned a few new tricks with the StartLoop: stuff too! Neat! d.
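Rick's other suggestion earlier in the thread, iterating over the PRIMARY KEY a thousand rows at a time and deleting only the expired rows in each window, can also be sketched. Again a hypothetical Python/SQLite illustration (table contents invented; the sleep between chunks is left as a comment), not the exact MySQL user-variable loop:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE customers_log (
    customer_log_id INTEGER PRIMARY KEY,
    created_on INTEGER)""")
# 5000 rows whose "age in days" cycles 0..199
conn.executemany("INSERT INTO customers_log VALUES (?, ?)",
                 [(i, i % 200) for i in range(1, 5001)])

CHUNK = 1000
left_off = 0
while True:
    # find the id at the far edge of the next chunk of surviving rows
    row = conn.execute(
        "SELECT MAX(customer_log_id) FROM "
        "(SELECT customer_log_id FROM customers_log "
        " WHERE customer_log_id > ? ORDER BY customer_log_id LIMIT ?)",
        (left_off, CHUNK)).fetchone()
    if row[0] is None:
        break            # no rows left beyond left_off: terminate
    z = row[0]
    # purge only rows inside this id window that are older than 90 "days"
    conn.execute("DELETE FROM customers_log WHERE customer_log_id > ? "
                 "AND customer_log_id <= ? AND created_on < 90", (left_off, z))
    left_off = z
    # time.sleep(2)  # "be a nice guy" between chunks, as Rick suggests

kept = conn.execute("SELECT COUNT(*) FROM customers_log").fetchone()[0]
print(kept)
```

The point of the windowing is that each DELETE touches a bounded number of rows, so locks stay short and the purge never interferes much with other queries.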
Re: Help with mysql connect_timeout
But connect_timeout has nothing to do with the termination of a query. It is the number of seconds the mysqld server waits for a connect packet before responding with "Bad handshake"; the default value is 10 seconds. You should probably adjust net_read_timeout / net_write_timeout instead.
Re: Help with mysql connect_timeout
- Original Message - From: indrani gorti indrani.go...@gmail.com Later I changed it to SET GLOBAL connect_timeout=60; However, I still find that the query loses the connection to mysql server after about 10 mins (600.495 secs). I see that the connect_timeout is 60 though. If I recall correctly, all timeouts are set in seconds, not minutes. However, as long as a query is running, MySQL should not time out your connection. It seems more likely that your application gives up. Assuming this is a website, check your max script runtime settings. -- Linux Bier Wanderung 2012, now also available in Belgium! August, 12 to 19, Diksmuide, Belgium - http://lbw2012.tuxera.be -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Help with mysql connect_timeout
Also check for any firewall or NAT On Sep 5, 2012 5:17 PM, indrani gorti indrani.go...@gmail.com wrote: Hi all, I am a newbie to work with the settings in mysql although I have used in very small applications before I am currently experimenting with very large tables and a few complicated queries. I am using mysql 5.5. My tables are of size of abt 1 million tuples. The queries upon these tables take more than 10 minutes and hence the connection timeout. I learnt that there are various timeouts . I tried using SHOW VARIABLES LIKE '%TIMEOUT%'; and found that the connect-timeout is 10 ( about 600.495 secs) Later I changed it to SET GLOBAL connect_timeout=60; However, I still find that the query loses the connection to mysql server after about 10 mins( 600.495 secs) I see that the connect_time is 60 though. Can anyone please let me know whats wrong? Thanks in advance for the help. Regards, Indrani Gorti
Re: help with correlated subquery
On Tue, Aug 21, 2012 at 8:26 PM, Larry Martell larry.mart...@gmail.com wrote: On Tue, Aug 21, 2012 at 8:07 PM, h...@tbbs.net wrote: 2012/08/21 16:35 -0600, Larry Martell

I am trying to write a query that selects from both a correlated subquery and a table in the main query, and I'm having a lot of trouble getting the proper row count. I'm sure this is very simple, and I'm just missing it. I'll try and present a simple example. For this example, there are 27 rows, organized like this:

mysql> select count(*), target_name_id, ep, wafer_id from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id;
+----------+----------------+------+----------+
| count(*) | target_name_id | ep   | wafer_id |
+----------+----------------+------+----------+
|        6 |             44 | 1,1  |       16 |
|        3 |             44 | 1,1  |       17 |
|        6 |             44 | 1,2  |       16 |
|        3 |             44 | 1,2  |       17 |
|        6 |             44 | 1,3  |       16 |
|        3 |             44 | 1,3  |       17 |
+----------+----------------+------+----------+
6 rows in set (0.00 sec)

I need to get an average of a column grouped by target_name_id, ep as well as the average of the averages grouped by target_name_id, ep, wafer_id, and I also need the count of the rows in the target_name_id, ep group. My query is getting the correct averages, but incorrect row counts:

mysql> select count(*), target_name_id, ep, avg(bottom), avg(averages) from (select avg(bottom) as averages, target_name_id as t, ep as e from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id) x, data_cst where target_name_id = t and ep = e group by target_name_id, ep;
+----------+----------------+------+-------------+-----------------+
| count(*) | target_name_id | ep   | avg(bottom) | avg(averages)   |
+----------+----------------+------+-------------+-----------------+
|       18 |             44 | 1,1  | 21.8056667  | 21.8545833      |
|       18 |             44 | 1,2  | 121.798     | 121.83983335000 |
|       18 |             44 | 1,3  | 349.763     | 349.75016665000 |
+----------+----------------+------+-------------+-----------------+
3 rows in set (0.01 sec)

The count for each row should be 9. What do I need in my count() to be counting the right thing?

Your trouble lies in the joining; in effect, you are joining a row with wafer_id 16 with a row with wafer_id 17, and also a row with wafer_id 17 with a row with wafer_id 16.
A further advantage to using the now standard form of joining, as Rick James bids you do, is that one can add further conditions to it:

select count(*), target_name_id, ep, avg(bottom), avg(averages)
from (select avg(bottom) as averages, target_name_id, ep
      from data_cst
      where target_name_id = 44
      group by target_name_id, ep, wafer_id) x
JOIN data_cst ON target_name_id = x.target_name_id and ep = x.ep and wafer_id <> x.wafer_id
group by target_name_id, ep

The inequality, maybe, will give you what you want.

Not wafer_id <> x.wafer_id, but wafer_id = x.wafer_id - adding that makes it work the way I want. Thanks!!

So now that I have this working, they have added another requirement. They also want a count of rows aggregated by a different set of columns. So I need to add another subquery, but I can't figure out how to get the row count. In this example query:

mysql> select count(*) from data_cst where target_name_id=208082 and wafer_id=425845 group by target_name_id,wafer_id,lot_id,data_file_id;
+----------+
| count(*) |
+----------+
|       12 |
|       12 |
|       12 |
|       12 |
+----------+
4 rows in set (0.00 sec)

The result I need is 4 (i.e. there were 4 distinct groups of target_name_id,wafer_id,lot_id,data_file_id). How can I get that? This will be a subquery, so I can't use mysql_num_rows() or FOUND_ROWS() after the fact. I need the result returned from the query.

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
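One general-purpose answer to "how many groups did my GROUP BY produce?" is to wrap the grouped query in a derived table and COUNT(*) its rows. A toy demonstration in Python/SQLite (the table shape and the 4-groups-of-12 data mirror the example above, but the contents are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE data_cst (
    target_name_id INTEGER, wafer_id INTEGER,
    lot_id INTEGER, data_file_id INTEGER)""")
# 4 distinct (lot_id, data_file_id) groups x 12 rows each,
# mirroring the "four rows of count 12" result above
rows = [(208082, 425845, lot, df)
        for lot in (1, 2) for df in (10, 20) for _ in range(12)]
conn.executemany("INSERT INTO data_cst VALUES (?, ?, ?, ?)", rows)

# Counting the rows OF the grouped result counts the groups themselves.
(n_groups,) = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT 1 FROM data_cst
        WHERE target_name_id = 208082 AND wafer_id = 425845
        GROUP BY target_name_id, wafer_id, lot_id, data_file_id
    )
""").fetchone()
print(n_groups)  # 4 groups, even though each group holds 12 rows
```

Because the outer COUNT(*) runs over the derived table, the whole thing remains a single SELECT and can itself be used as a subquery, which is what the poster needs.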
Re: help with correlated subquery
On Thu, Aug 23, 2012 at 8:08 AM, Larry Martell larry.mart...@gmail.com wrote:
I got this working:

select count(distinct lot_id,data_file_id) from data_cst where target_name_id=208082 and wafer_id=425845;

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
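The COUNT(DISTINCT ...) solution above can be verified on toy data. One caveat when porting: MySQL's COUNT(DISTINCT lot_id, data_file_id) accepts several columns directly, while SQLite's accepts only one expression, so this sketch concatenates the pair into a single key (a workaround, not MySQL syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE data_cst (
    target_name_id INTEGER, wafer_id INTEGER,
    lot_id INTEGER, data_file_id INTEGER)""")
# invented data: 4 distinct (lot_id, data_file_id) pairs, 12 rows apiece
conn.executemany("INSERT INTO data_cst VALUES (?, ?, ?, ?)",
                 [(208082, 425845, lot, df)
                  for lot in (1, 2) for df in (10, 20) for _ in range(12)])

# SQLite stand-in for MySQL's COUNT(DISTINCT lot_id, data_file_id)
(n,) = conn.execute("""
    SELECT COUNT(DISTINCT lot_id || '/' || data_file_id)
    FROM data_cst
    WHERE target_name_id = 208082 AND wafer_id = 425845
""").fetchone()
print(n)  # 4 distinct groups
```

The '/' separator guards against pairs like (1, 23) and (12, 3) colliding when concatenated, a classic multi-column-distinct pitfall.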
RE: help with correlated subquery
Assign realistic alias names: OuterJoin should be called OuterJoin, InnerJoin should be called InnerJoin. If you want only the most restrictive criteria that match resultsets from both select statements, use INNER JOIN. If you want all results from both resultsets (cartesian JOIN), use OUTER JOIN.

Finally: Be aware FUNCTIONS such as AVG cast off indexing and should be avoided unless the FUNCTION(columnName) itself is indexed. GROUP BY re-arranges your query so it's best to introduce GROUP BY in stages. Use realistic alias names like Dept and EmployeeNumber and avoid aliases that cause confusion like 'a' or 'foo'. Develop in stages and write down what YOU EXPECT vs WHAT each query produces. If the individual SELECT doesn't produce expected results, STOP and correct the SELECT statement. Diagram out what results you expect any of the INNER JOIN, OUTER JOIN, LEFT JOIN or RIGHT JOIN operations will produce. If the executed JOIN statement does not produce expected results, STOP and correct the JOIN clause BEFORE incorporating more functionality. Obfuscation and confusion can hopelessly sidetrack any intelligent analysis.

Martin
RE: help with correlated subquery
The inner query has multiple rows because of wafer_id. The outer query then gives you multiple copies, hence screwing up the COUNT. Also, the AVG(AVG()) is mathematically incorrect unless the counts are identical.
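Rick's AVG(AVG()) caveat is easy to demonstrate numerically. With two groups of unequal size (6 rows vs 3 rows, like the two wafers in the example), the average of the per-group averages and the overall average disagree. A hypothetical sketch in Python/SQLite with invented values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_cst (wafer_id INTEGER, bottom REAL)")
# wafer 16: six rows of 10.0; wafer 17: three rows of 40.0
conn.executemany("INSERT INTO data_cst VALUES (?, ?)",
                 [(16, 10.0)] * 6 + [(17, 40.0)] * 3)

# true average over all 9 rows: (6*10 + 3*40) / 9
(overall,) = conn.execute("SELECT AVG(bottom) FROM data_cst").fetchone()

# average of the two per-wafer averages: (10 + 40) / 2
(avg_of_avgs,) = conn.execute("""
    SELECT AVG(a)
    FROM (SELECT AVG(bottom) AS a FROM data_cst GROUP BY wafer_id)
""").fetchone()

print(overall, avg_of_avgs)  # 20.0 vs 25.0 -- they disagree
```

The two only coincide when every group contributes the same number of rows, which is exactly the condition Rick states.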
Re: help with correlated subquery
Hello Martin, On 8/22/2012 8:30 AM, Martin Gainty wrote: assign realistic alias names OuterJoin should be called OuterJoin InnerJoin should be called InnerJoin Almost! MySQL does not have a simple OUTER JOIN command (some RDBMSes call this a FULL OUTER JOIN). What we do have is the option to include the OUTER keyword in our LEFT or RIGHT joins. For example, both of these are acceptable: LEFT OUTER JOIN LEFT JOIN Also, you need a space between inner and join as in INNER JOIN. If you want only the most restrictive criteria that match resultsets from both select statements use INNER JOIN if you want all results from both resultsets (cartesian JOIN) use OUTER JOIN Again, MySQL does not have a plain OUTER JOIN. If you want a full Cartesian product of two tables, use the 'comma join' syntax with no criteria for matching the tables in the WHERE clause: SELECT ... FROM tableA, tableB WHERE ... Syntax details are located here: http://dev.mysql.com/doc/refman/5.5/en/join.html Finally: Be aware FUNCTIONS such as AVG cast off indexing and should be avoided unless the FUNCTION(columnName) itself is indexed GROUP BY re-arranges your query so it's best to introduce GROUP BY in stages Not exactly. If you wrap a column in a function and attempt to use the results of that function in the WHERE clause, then you are correct. However, based on the way you define your indexes, the data you process in a function may actually come from the index and save you a trip to the underlying table. In this case, the index could make your function faster by skipping an additional retrieval step. use realistic alias names like Dept and EmployeeNumber and avoid aliases that cause confusion like 'a' or 'foo' Excellent advice. Develop in stages and write down what YOU EXPECT vs WHAT each query produces .. If the individual SELECT doesn't produce expected results..STOP and correct the SELECT statement Also excellent advice. 
Diagram out what you expect results from any of the INNER JOIN, OUTER JOIN, LEFT JOIN or RIGHT JOIN operations will produce If the executed JOIN Statement does not produce expected results STOP and correct the JOIN clause BEFORE incorporating more functionality Obfuscation and confusion can hopelessly sidetrack any intelligent analysis Well put! -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
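Shawn's comma-join point above is quick to check: with no matching criteria, the comma syntax yields the full Cartesian product of the two tables. A minimal sketch in Python/SQLite (table names borrowed from his example, contents invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (a INTEGER)")
conn.execute("CREATE TABLE tableB (b INTEGER)")
conn.executemany("INSERT INTO tableA VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO tableB VALUES (?)", [(10,), (20,)])

# comma join with no WHERE criteria: every row of A paired with every row of B
(n,) = conn.execute("SELECT COUNT(*) FROM tableA, tableB").fetchone()
print(n)  # 3 rows x 2 rows = 6 combinations
```

This is also why an under-constrained comma join silently inflates COUNT(*): each missing join condition multiplies the row count rather than raising an error.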
RE: help with correlated subquery
select count(*), target_name_id, ep, avg(bottom), avg(averages) from ( SELECT avg(bottom) as averages, target_name_id as t, ep as e from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id) x, data_cst where target_name_id = t and ep = e group by target_name_id, ep; Please qualify all fields with table names (or aliases). I can't tell what ep and e are. Etc. Please turn the commajoin into an explicit JOIN...ON. -Original Message- From: Larry Martell [mailto:larry.mart...@gmail.com] Sent: Tuesday, August 21, 2012 3:35 PM To: mysql mailing list Subject: help with correlated subquery I am trying to write a query that selects from both a correlated subquery and a table in the main query, and I'm having a lot of trouble getting the proper row count. I'm sure this is very simple, and I'm just missing it. I'll try and present a simple example. For this example, there are 27 rows, organized like this: mysql select count(*), target_name_id, ep, wafer_id from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id; +--++--+--+ | count(*) | target_name_id | ep | wafer_id | +--++--+--+ |6 | 44 | 1,1 | 16 | |3 | 44 | 1,1 | 17 | |6 | 44 | 1,2 | 16 | |3 | 44 | 1,2 | 17 | |6 | 44 | 1,3 | 16 | |3 | 44 | 1,3 | 17 | +--++--+--+ 6 rows in set (0.00 sec) I need to get an average of a column grouped by target_name_id, ep as well as the average of the averages grouped by target_name_id, ep, wafer_id, and I also need the count of the rows in the target_name_id, ep group. 
Re: help with correlated subquery
On Tue, Aug 21, 2012 at 5:30 PM, Rick James rja...@yahoo-inc.com wrote: select count(*), target_name_id, ep, avg(bottom), avg(averages) from ( SELECT avg(bottom) as averages, target_name_id as t, ep as e from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id) x, data_cst where target_name_id = t and ep = e group by target_name_id, ep; Please qualify all fields with table names (or aliases). I can't tell what ep and e are. Etc. Please turn the commajoin into an explicit JOIN...ON. select count(*), target_name_id as target, ep as ep, avg(bottom), avg(averages) from (select avg(bottom) as averages, target_name_id as t, ep as e from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id) x join (data_cst) on data_cst.target_name_id = x.t and ep = x.e group by target_name_id, ep; Returns the same result set. -Original Message- From: Larry Martell [mailto:larry.mart...@gmail.com] Sent: Tuesday, August 21, 2012 3:35 PM To: mysql mailing list Subject: help with correlated subquery I am trying to write a query that selects from both a correlated subquery and a table in the main query, and I'm having a lot of trouble getting the proper row count. I'm sure this is very simple, and I'm just missing it. I'll try and present a simple example. For this example, there are 27 rows, organized like this: mysql select count(*), target_name_id, ep, wafer_id from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id; +--++--+--+ | count(*) | target_name_id | ep | wafer_id | +--++--+--+ |6 | 44 | 1,1 | 16 | |3 | 44 | 1,1 | 17 | |6 | 44 | 1,2 | 16 | |3 | 44 | 1,2 | 17 | |6 | 44 | 1,3 | 16 | |3 | 44 | 1,3 | 17 | +--++--+--+ 6 rows in set (0.00 sec) I need to get an average of a column grouped by target_name_id, ep as well as the average of the averages grouped by target_name_id, ep, wafer_id, and I also need the count of the rows in the target_name_id, ep group. 
Re: help with correlated subquery
On Tue, Aug 21, 2012 at 5:39 PM, Martin Gainty mgai...@hotmail.com wrote:

A look at the first query:

select count(*), target_name_id, ep, wafer_id from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id;
+----------+----------------+------+----------+
| count(*) | target_name_id | ep   | wafer_id |
+----------+----------------+------+----------+
|        6 |             44 | 1,1  |       16 |
|        3 |             44 | 1,1  |       17 |
|        6 |             44 | 1,2  |       16 |
|        3 |             44 | 1,2  |       17 |
|        6 |             44 | 1,3  |       16 |
|        3 |             44 | 1,3  |       17 |
+----------+----------------+------+----------+

A look at the second query which references the 1st query:

select count(*), target_name_id, ep, avg(bottom), avg(averages) from (select avg(bottom) as averages, target_name_id as t, ep as e from data_cst where target_name_id = 44 group by target_name_id, ep, wafer_id) x, data_cst where target_name_id = t and ep = e group by target_name_id, ep;
+----------+----------------+------+-------------+-----------------+
| count(*) | target_name_id | ep   | avg(bottom) | avg(averages)   |
+----------+----------------+------+-------------+-----------------+
|       18 |             44 | 1,1  | 21.8056667  | 21.8545833      |
|       18 |             44 | 1,2  | 121.798     | 121.83983335000 |
|       18 |             44 | 1,3  | 349.763     | 349.75016665000 |
+----------+----------------+------+-------------+-----------------+

You have 3 rows returned based on wafer_id: wafer_id = 16 returns count of 6, count of 6, count of 6, so count(*) = 18. But you don't want the rows aggregated by wafer_id (or any other criteria); you want your rows aggregated by only column ep: (1,1 with 1,1) ROW 1 and ROW 2, (1,2 with 1,2) ROW 3 and ROW 4, (1,3 with 1,3) ROW 5 and ROW 6. So I would group ONLY on ep.

No, I need to aggregate on target_name_id and ep. In the real app there are multiple target_name_ids being selected. I was trying to present a simple example.

It's a tough call without seeing each query executed individually. What I do is build out the query a column at a time, then I add in GROUP BY, then I add in aggregate functions avg(bottom) avg(averages). That way I can see each variable being used and which one delivers the correct result (and which variable goes fubar).

I've done that.
The inner query gives the expected results:

mysql> select target_name_id as t, ep as e, avg(bottom) as averages
    -> from data_cst
    -> where target_name_id = 44
    -> group by target_name_id, ep, wafer_id;
+----+-----+-------------+
| t  | e   | averages    |
+----+-----+-------------+
| 44 | 1,1 |  21.7078333 |
| 44 | 1,1 |      22.001 |
| 44 | 1,2 | 121.7156667 |
| 44 | 1,2 |     121.964 |
| 44 | 1,3 |     349.790 |
| 44 | 1,3 |     349.710 |
+----+-----+-------------+
6 rows in set (0.00 sec)

As does the outer one:

mysql> select count(*), target_name_id as target, ep as ep, avg(bottom)
    -> from data_cst
    -> where target_name_id = 44
    -> group by target_name_id, ep;
+----------+--------+-----+-------------+
| count(*) | target | ep  | avg(bottom) |
+----------+--------+-----+-------------+
|        9 |     44 | 1,1 |  21.8056667 |
|        9 |     44 | 1,2 |     121.798 |
|        9 |     44 | 1,3 |     349.763 |
+----------+--------+-----+-------------+
3 rows in set (0.00 sec)

It's only when I combine them that I am double counting the target, ep rows. I think I see why (there are 2 wafers for each target, ep), but I don't know how to avoid that. I need to group by target_name_id, ep, wafer_id in the inner query, and then I need to group by target_name_id, ep in the outer one. I only want to count the number of target_name_id, ep groups.

> Date: Tue, 21 Aug 2012 16:35:23 -0600
> Subject: help with correlated subquery
> From: larry.mart...@gmail.com
> To: mysql@lists.mysql.com
>
> I am trying to write a query that selects from both a correlated subquery
> and a table in the main query, and I'm having a lot of trouble getting the
> proper row count. I'm sure this is very simple, and I'm just missing it.
> I'll try and present a simple example.
For this example, there are 27 rows, organized like this:

mysql> select count(*), target_name_id, ep, wafer_id
    -> from data_cst
    -> where target_name_id = 44
    -> group by target_name_id, ep, wafer_id;
+----------+----------------+-----+----------+
| count(*) | target_name_id | ep  | wafer_id |
+----------+----------------+-----+----------+
|        6 |             44 | 1,1 |       16 |
|        3 |             44 | 1,1 |       17 |
|        6 |             44 | 1,2 |       16 |
|        3 |             44 | 1,2 |       17 |
|        6 |             44 | 1,3 |       16 |
|        3 |             44 | 1,3 |       17 |
+----------+----------------+-----+----------+
6 rows in set (0.00 sec)

I need to get an average of a column grouped by target_name_id, ep, as well as the average of the averages grouped by target_name_id, ep, wafer_id, and I also need the count of the rows in the target_name_id, ep group. My query is getting the correct averages, but incorrect row counts:

mysql> select count(*), target_name_id, ep, avg(bottom), avg(averages)
    -> from (select avg(bottom) as averages, target_name_id as t, ep as e
    ->       from data_cst
    ->       where target_name_id = 44
    ->       group by target_name_id, ep, wafer_id) x, data_cst
    -> where target_name_id = t and ep = e
    -> group by
Re: help with correlated subquery
2012/08/21 16:35 -0600, Larry Martell:

> I am trying to write a query that selects from both a correlated subquery
> and a table in the main query, and I'm having a lot of trouble getting the
> proper row count. I'm sure this is very simple, and I'm just missing it.
> I'll try and present a simple example.
>
> For this example, there are 27 rows, organized like this:
>
> mysql> select count(*), target_name_id, ep, wafer_id
>     -> from data_cst
>     -> where target_name_id = 44
>     -> group by target_name_id, ep, wafer_id;
> +----------+----------------+-----+----------+
> | count(*) | target_name_id | ep  | wafer_id |
> +----------+----------------+-----+----------+
> |        6 |             44 | 1,1 |       16 |
> |        3 |             44 | 1,1 |       17 |
> |        6 |             44 | 1,2 |       16 |
> |        3 |             44 | 1,2 |       17 |
> |        6 |             44 | 1,3 |       16 |
> |        3 |             44 | 1,3 |       17 |
> +----------+----------------+-----+----------+
> 6 rows in set (0.00 sec)
>
> I need to get an average of a column grouped by target_name_id, ep, as
> well as the average of the averages grouped by target_name_id, ep,
> wafer_id, and I also need the count of the rows in the target_name_id, ep
> group. My query is getting the correct averages, but incorrect row counts:
>
> mysql> select count(*), target_name_id, ep, avg(bottom), avg(averages)
>     -> from (select avg(bottom) as averages, target_name_id as t, ep as e
>     ->       from data_cst
>     ->       where target_name_id = 44
>     ->       group by target_name_id, ep, wafer_id) x, data_cst
>     -> where target_name_id = t and ep = e
>     -> group by target_name_id, ep;
> +----------+----------------+-----+-------------+-----------------+
> | count(*) | target_name_id | ep  | avg(bottom) | avg(averages)   |
> +----------+----------------+-----+-------------+-----------------+
> |       18 |             44 | 1,1 |  21.8056667 |      21.8545833 |
> |       18 |             44 | 1,2 |     121.798 | 121.83983335000 |
> |       18 |             44 | 1,3 |     349.763 | 349.75016665000 |
> +----------+----------------+-----+-------------+-----------------+
> 3 rows in set (0.01 sec)
>
> The count for each row should be 9. What do I need in my count() to be
> counting the right thing?

Your trouble lies in the joining; in effect, you are joining a row with wafer_id 16 with a row with wafer_id 17, and also a row with wafer_id 17 with a row with wafer_id 16.
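The cross-wafer matching described above can be reproduced in miniature. The sketch below uses Python's sqlite3 as a stand-in for MySQL; the table and column names follow the thread, but the bottom values are invented - only the 6/3 row split per wafer matters for the count:

```python
import sqlite3

# Stand-in for data_cst: per ep, 6 rows on wafer 16 and 3 rows on
# wafer 17 (27 rows total, as in the thread); bottom values are made up.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_cst "
            "(target_name_id INT, ep TEXT, wafer_id INT, bottom REAL)")
rows = []
for ep in ("1,1", "1,2", "1,3"):
    rows += [(44, ep, 16, 20.0)] * 6 + [(44, ep, 17, 22.0)] * 3
con.executemany("INSERT INTO data_cst VALUES (?, ?, ?, ?)", rows)

# The problematic query: x has one row per (ep, wafer), so joining on
# (target_name_id, ep) alone matches every base row against BOTH
# per-wafer rows, doubling the count.
bad = con.execute("""
    SELECT count(*), ep
    FROM (SELECT avg(bottom) AS averages, target_name_id AS t, ep AS e
          FROM data_cst WHERE target_name_id = 44
          GROUP BY target_name_id, ep, wafer_id) x, data_cst
    WHERE target_name_id = t AND ep = e
    GROUP BY target_name_id, ep
""").fetchall()
print(bad)  # count(*) is 18 per ep group, not the expected 9
```

With 2 wafers per (target_name_id, ep), each of the 9 base rows joins both per-wafer average rows, which is exactly the 18-vs-9 symptom in the thread.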
A further advantage of using the now-standard form of joining, as Rick James bids you do, is that one can add further conditions to it:

select count(*), target_name_id, ep, avg(bottom), avg(averages)
from (select avg(bottom) as averages, target_name_id, ep
      from data_cst
      where target_name_id = 44
      group by target_name_id, ep, wafer_id) x
JOIN data_cst ON target_name_id = x.target_name_id
             and ep = x.ep
             and wafer_id <> x.wafer_id
group by target_name_id, ep

The inequality, maybe, will give you what you want.
Re: help with correlated subquery
On Tue, Aug 21, 2012 at 8:07 PM, h...@tbbs.net wrote:

> 2012/08/21 16:35 -0600, Larry Martell:
>> I am trying to write a query that selects from both a correlated subquery
>> and a table in the main query, and I'm having a lot of trouble getting
>> the proper row count. I'm sure this is very simple, and I'm just missing
>> it. I'll try and present a simple example.
>>
>> For this example, there are 27 rows, organized like this:
>>
>> mysql> select count(*), target_name_id, ep, wafer_id
>>     -> from data_cst
>>     -> where target_name_id = 44
>>     -> group by target_name_id, ep, wafer_id;
>> +----------+----------------+-----+----------+
>> | count(*) | target_name_id | ep  | wafer_id |
>> +----------+----------------+-----+----------+
>> |        6 |             44 | 1,1 |       16 |
>> |        3 |             44 | 1,1 |       17 |
>> |        6 |             44 | 1,2 |       16 |
>> |        3 |             44 | 1,2 |       17 |
>> |        6 |             44 | 1,3 |       16 |
>> |        3 |             44 | 1,3 |       17 |
>> +----------+----------------+-----+----------+
>> 6 rows in set (0.00 sec)
>>
>> I need to get an average of a column grouped by target_name_id, ep, as
>> well as the average of the averages grouped by target_name_id, ep,
>> wafer_id, and I also need the count of the rows in the target_name_id,
>> ep group. My query is getting the correct averages, but incorrect row
>> counts:
>>
>> mysql> select count(*), target_name_id, ep, avg(bottom), avg(averages)
>>     -> from (select avg(bottom) as averages, target_name_id as t, ep as e
>>     ->       from data_cst
>>     ->       where target_name_id = 44
>>     ->       group by target_name_id, ep, wafer_id) x, data_cst
>>     -> where target_name_id = t and ep = e
>>     -> group by target_name_id, ep;
>> +----------+----------------+-----+-------------+-----------------+
>> | count(*) | target_name_id | ep  | avg(bottom) | avg(averages)   |
>> +----------+----------------+-----+-------------+-----------------+
>> |       18 |             44 | 1,1 |  21.8056667 |      21.8545833 |
>> |       18 |             44 | 1,2 |     121.798 | 121.83983335000 |
>> |       18 |             44 | 1,3 |     349.763 | 349.75016665000 |
>> +----------+----------------+-----+-------------+-----------------+
>> 3 rows in set (0.01 sec)
>>
>> The count for each row should be 9. What do I need in my count() to be
>> counting the right thing?
>
> Your trouble lies in the joining; in effect, you are joining a row with
> wafer_id 16 with a row with wafer_id 17, and also a row with wafer_id 17
> with a row with wafer_id 16.
> A further advantage of using the now-standard form of joining, as Rick
> James bids you do, is that one can add further conditions to it:
>
> select count(*), target_name_id, ep, avg(bottom), avg(averages)
> from (select avg(bottom) as averages, target_name_id, ep
>       from data_cst
>       where target_name_id = 44
>       group by target_name_id, ep, wafer_id) x
> JOIN data_cst ON target_name_id = x.target_name_id
>              and ep = x.ep
>              and wafer_id <> x.wafer_id
> group by target_name_id, ep
>
> The inequality, maybe, will give you what you want.

Not wafer_id <> x.wafer_id, but wafer_id = x.wafer_id - adding that makes it work the way I want. Thanks!!
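The fix - joining back on wafer_id with equality in the explicit JOIN - can be checked the same way. Again a sketch using Python's sqlite3 as a stand-in for MySQL, with invented data that reproduces the thread's 6/3 wafer split:

```python
import sqlite3

# Same miniature data_cst as the thread describes: per ep, 6 rows on
# wafer 16 and 3 rows on wafer 17, all for target_name_id 44.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_cst "
            "(target_name_id INT, ep TEXT, wafer_id INT, bottom REAL)")
rows = []
for ep in ("1,1", "1,2", "1,3"):
    rows += [(44, ep, 16, 20.0)] * 6 + [(44, ep, 17, 22.0)] * 3
con.executemany("INSERT INTO data_cst VALUES (?, ?, ?, ?)", rows)

# Joining back on wafer_id as well pairs each base row with exactly one
# per-wafer average row, so each (target_name_id, ep) group counts 9.
fixed = con.execute("""
    SELECT count(*), d.ep, avg(d.bottom), avg(x.averages)
    FROM (SELECT avg(bottom) AS averages, target_name_id, ep, wafer_id
          FROM data_cst WHERE target_name_id = 44
          GROUP BY target_name_id, ep, wafer_id) x
    JOIN data_cst d ON d.target_name_id = x.target_name_id
                   AND d.ep = x.ep
                   AND d.wafer_id = x.wafer_id
    GROUP BY d.target_name_id, d.ep
""").fetchall()
print(fixed)  # count(*) is 9 per ep group
```

The explicit JOIN ... ON form makes it easy to see (and extend) the join conditions, which is exactly the advantage claimed in the reply above.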
Re: Help! The dreaded Incorrect key file for table 'XXXXXXX'; try to repair it error
Hi Victor,

To answer your question about saving the table: this URL http://dev.mysql.com/doc/refman/5.6/en/myisam-repair.html - *Stage 3: Difficult repair* - directly addresses your concerns. You may also want to look into the different options of the REPAIR TABLE command, http://dev.mysql.com/doc/refman/5.6/en/repair-table.html, to see whether the EXTENDED or USE_FRM option can be used to restore your table.

Regards,

Mikhail Berman

On Sat, Mar 10, 2012 at 12:38 PM, Victor Danilchenko vic...@askonline.net wrote:

> Hi all,
>
> I was upgrading some web software on my server, and its upgrade involved
> upgrading its database. After the upgrade, the following error started
> appearing:
>
> mysqldump: Got error: 1034: Incorrect key file for table 'notes'; try to repair it when using LOCK TABLES
>
> So I tried doing lock and repair:
>
> mysql> LOCK TABLES notes WRITE;
> ERROR 1034 (HY000): Incorrect key file for table 'notes'; try to repair it
>
> mysql> REPAIR TABLE notes\G
> *************************** 1. row ***************************
>    Table: sugar.notes
>       Op: repair
> Msg_type: Error
> Msg_text: Incorrect key file for table 'notes'; try to repair it
> *************************** 2. row ***************************
>    Table: sugar.notes
>       Op: repair
> Msg_type: error
> Msg_text: Corrupt
> 2 rows in set (0.00 sec)
>
> So I tried stopping the mysql server and running myisamchk:
>
> # myisamchk --verbose --force --update-state --key_buffer_size=64M --sort_buffer_size=64M --read_buffer_size=1M --write_buffer_size=1M ~mysql/sugar/notes.MYI
> # myisamchk --verbose notes
> Checking MyISAM file: notes
> Data records: 9519   Deleted blocks: 0
> - check file-size
> - check record delete-chain
> No recordlinks
> - check key delete-chain
> block_size 1024:
> block_size 2048:
> block_size 3072:
> block_size 4096:
> - check index reference
> - check data record references index: 1
> - check data record references index: 2
> - check data record references index: 3
> - check data record references index: 4
> - check record links
>
> but the error is still there, even though myisamchk apparently sees
> nothing wrong! Is there any way to save this table?
> It was the result of an upgrade that took about 24 hours to run, so I
> really don't want to re-do it from scratch.
>
> Many thanks in advance for any sage advice.

--
Mikhail Berman
Re: help! mysqld cannot start!
It doesn't particularly say, but this:

120116 19:15:29
120116 19:15:29 InnoDB: 1.1.8 started; log sequence number 1595675

suggests to me that there's still junk from a previous install around. You might want to clean that up.

- Original Message -
From: mantianyu mantia...@gmail.com
To: mysql@lists.mysql.com
Sent: Monday, 16 January, 2012 12:37:07 PM
Subject: help! mysqld cannot start!

I have downloaded the Linux - Generic 2.6 (x86, 32-bit), Compressed TAR Archive binary edition, and I installed it following the INSTALL_BINARY file. But at the last step I start the service by running

sudo bin/mysqld_safe --user=mysql

and I got the following error message:

cifer@Pig:/usr/local/mysql$ 120116 19:15:28 mysqld_safe Logging to '/var/log/mysql/error.log'.
120116 19:15:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120116 19:15:30 mysqld_safe mysqld from pid file /var/lib/mysql/Pig.pid ended

and I checked the error.log file; it shows:

120116 19:15:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120116 19:15:28 [ERROR] Can't find messagefile '/usr/share/errmsg.sys'
120116 19:15:28 [Note] Plugin 'FEDERATED' is disabled.
120116 19:15:28 InnoDB: The InnoDB memory heap is disabled
120116 19:15:28 InnoDB: Mutexes and rw_locks use InnoDB's own implementation
120116 19:15:28 InnoDB: Compressed tables use zlib 1.2.3
120116 19:15:28 InnoDB: Using Linux native AIO
120116 19:15:28 InnoDB: Initializing buffer pool, size = 128.0M
120116 19:15:28 InnoDB: Completed initialization of buffer pool
120116 19:15:28 InnoDB: highest supported file format is Barracuda.
120116 19:15:28 InnoDB: Waiting for the background threads to start
120116 19:15:29 InnoDB: 1.1.8 started; log sequence number 1595675
120116 19:15:29 [ERROR] Aborting

120116 19:15:29 InnoDB: Starting shutdown...
120116 19:15:30 InnoDB: Shutdown completed; log sequence number 1595675
120116 19:15:30 [Note]
120116 19:15:30 mysqld_safe mysqld from pid file /var/lib/mysql/Pig.pid ended

I followed every step in the INSTALL_BINARY file - why does it still not start? Who can help me? Thanks!

--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
Re: help! mysqld cannot start!
2012/01/16 19:37 +0800, mantianyu:

> but at last step I start the service by run
> sudo bin/mysqld_safe --user=mysql
> I got following error message
> cifer@Pig:/usr/local/mysql$ 120116 19:15:28 mysqld_safe Logging to '/var/log/mysql/error.log'.

Your means of starting does not show where the MySQL own files are. The flag -b (--basedir) is used for that. It works also to start mysqld in the directory where the default my.cnf was left. If that has everything that MySQL needs, it works also to start with --defaults-file=..../my.cnf - the four dots stand for the rest of the full pathname.
Re: help! mysqld cannot start!
On Mon, Jan 16, 2012 at 4:37 AM, mantianyu mantia...@gmail.com wrote:

> I have download the Linux - Generic 2.6 (x86, 32-bit), Compressed TAR
> Archive binary edition and I installed it all followed the INSTALL_BINARY
> but at last step I start the service by run
>
> sudo bin/mysqld_safe --user=mysql
>
> I got following error message
>
> cifer@Pig:/usr/local/mysql$ 120116 19:15:28 mysqld_safe Logging to '/var/log/mysql/error.log'.
> 120116 19:15:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
> 120116 19:15:30 mysqld_safe mysqld from pid file /var/lib/mysql/Pig.pid ended
>
> and I check the error.log file it shows that
>
> 120116 19:15:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
> 120116 19:15:28 [ERROR] Can't find messagefile '/usr/share/errmsg.sys'
> 120116 19:15:28 [Note] Plugin 'FEDERATED' is disabled.
> 120116 19:15:28 InnoDB: The InnoDB memory heap is disabled
> 120116 19:15:28 InnoDB: Mutexes and rw_locks use InnoDB's own implementation
> 120116 19:15:28 InnoDB: Compressed tables use zlib 1.2.3
> 120116 19:15:28 InnoDB: Using Linux native AIO
> 120116 19:15:28 InnoDB: Initializing buffer pool, size = 128.0M
> 120116 19:15:28 InnoDB: Completed initialization of buffer pool
> 120116 19:15:28 InnoDB: highest supported file format is Barracuda.
> 120116 19:15:28 InnoDB: Waiting for the background threads to start
> 120116 19:15:29 InnoDB: 1.1.8 started; log sequence number 1595675
> 120116 19:15:29 [ERROR] Aborting
>
> 120116 19:15:29 InnoDB: Starting shutdown...
> 120116 19:15:30 InnoDB: Shutdown completed; log sequence number 1595675
> 120116 19:15:30 [Note]
> 120116 19:15:30 mysqld_safe mysqld from pid file /var/lib/mysql/Pig.pid ended
>
> every step I followed the INSTALL_BINARY file why it still can't start?
> who can help me? thanks!
I had the same issue, and I solved it by starting the server with this command-line option:

--lc-messages-dir=/usr/local/mysql/share/

HTH,
-larry
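The same fix can be made permanent in the option file instead of on the command line. A minimal my.cnf fragment, assuming the tarball was unpacked in /usr/local/mysql as in the thread:

```ini
# my.cnf - tell mysqld where the binary distribution keeps errmsg.sys,
# so it no longer looks in the default /usr/share/ location
[mysqld]
lc-messages-dir = /usr/local/mysql/share/
```

Setting basedir to the install directory accomplishes much the same thing, since the message-file path is resolved relative to it.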
Re: help needed restoring crashed mysql
Am 30.11.2011 03:13, schrieb Karen Abgarian:

> The concept is not difficult to explain. Most people do not expect a gas
> tank to shrink once the gas is consumed... right?

Yes, but the hard disk is the gas tank and the data are the gas - and yes, normally everybody would expect that after deleting data the space is available for other applications.
Re: help needed restoring crashed mysql
2011/11/29 23:19 +0100, Reindl Harald:

> MY only luck is that i recognized this years ago after PLAYING with innodb
> and so i started with innodb_file_per_table=1 from the begin with the
> first production database

And are then the table-files in the directories with frm, or in the directory where ibdata1 is? If the latter, one problem is exchanged for another.
Re: help needed restoring crashed mysql
Am 30.11.2011 07:02, schrieb Halász Sándor:

> 2011/11/29 23:19 +0100, Reindl Harald:
>> MY only luck is that i recognized this years ago after PLAYING with
>> innodb and so i started with innodb_file_per_table=1 from the begin
>> with the first production database
>
> And are then the table-files in the directories with frm, or in the
> directory where ibdata1 is? If the latter, one problem is exchanged for
> another.

They are in the db-folder - but even if they were not, it is a huge difference whether "optimize table tablename" frees space on disk or not.

[root@mail:/mysql_data]$ ls
insgesamt 3,0G
drwx------ 2 mysql mysql 4,0K 2011-11-25 10:27 dbmail
drwx------ 2 mysql mysql 4,0K 2011-11-20 17:46 mysql
drwx------ 2 mysql mysql 4,0K 2011-11-20 17:46 performance_schema
drwx------ 2 mysql mysql 4,0K 2011-11-30 04:00 syslog
-rw-rw---- 1 mysql mysql 354M 2011-11-30 14:01 ibdata1
-rw-rw---- 1 mysql mysql 512M 2011-11-30 14:01 ib_logfile0
-rw-rw---- 1 mysql mysql 512M 2011-11-30 02:21 ib_logfile1
-rw-rw---- 1 mysql mysql    6 2011-11-20 17:46 mysql_upgrade_info
-rw-rw---- 1 mysql mysql 1,1G 2011-11-29 15:21 bin.000137
-rw-rw---- 1 mysql mysql 617M 2011-11-30 14:01 bin.000138
-rw-rw---- 1 mysql mysql   72 2011-11-29 15:21 bin.index

[root@mail:/mysql_data]$ ls dbmail/
insgesamt 9,5G
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 cms1_config.frm
-rw-rw---- 1 mysql mysql 8,7K 2011-09-18 08:18 cms1_global_cache.frm
-rw-rw---- 1 mysql mysql 9,4K 2011-09-18 08:18 cms1_haupt.frm
-rw-rw---- 1 mysql mysql 8,6K 2011-09-18 08:18 cms1_locks.frm
-rw-rw---- 1 mysql mysql 9,8K 2011-09-18 08:18 cms1_meta.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 cms1_snippets.frm
-rw-rw---- 1 mysql mysql  11K 2011-09-18 08:18 cms1_sub2.frm
-rw-rw---- 1 mysql mysql  11K 2011-09-18 08:18 cms1_sub.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 cms1_user_group_permissions.frm
-rw-rw---- 1 mysql mysql 8,6K 2011-09-18 08:18 cms1_user_login.frm
-rw-rw---- 1 mysql mysql 9,6K 2011-09-18 08:18 cms1_user_modules.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-14 09:32 cms1_user_online.frm
-rw-rw---- 1 mysql mysql 9,3K 2011-09-18 08:18 cms1_users.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-09-18 08:18 dbma_aliases.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-09-18 08:18 dbma_aliases_global.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-09-18 08:18 dbma_allowed_hosts.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 dbma_client_admins.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 dbma_clients.frm
-rw-rw---- 1 mysql mysql 8,8K 2011-10-22 20:18 dbmail_acl.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:18 dbmail_aliases.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:18 dbmail_auto_notifications.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:18 dbmail_auto_replies.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:18 dbmail_ccfield.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:18 dbmail_datefield.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:19 dbmail_envelope.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:19 dbmail_fromfield.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-10-22 20:19 dbmail_headername.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:19 dbmail_headervalue.frm
-rw-rw---- 1 mysql mysql 8,9K 2011-10-22 20:20 dbmail_mailboxes.frm
-rw-rw---- 1 mysql mysql 8,6K 2011-07-24 12:17 dbmail_messageblks.frm
-rw-rw---- 1 mysql mysql 8,9K 2011-10-22 20:20 dbmail_messages.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_pbsp.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_physmessage.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_referencesfield.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_replycache.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_replytofield.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_sievescripts.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_subjectfield.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_subscription.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_tofield.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-10-22 20:21 dbmail_usermap.frm
-rw-rw---- 1 mysql mysql 8,9K 2011-10-22 20:21 dbmail_users.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 dbma_mta.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 dbma_recipient_relay.frm
-rw-rw---- 1 mysql mysql 1,8K 2011-11-25 10:27 dbma_recipients.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 dbma_reply_groups.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-09-18 08:18 dbma_rewrite_domains.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-09-18 08:18 dbma_rewrite_senders.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-09-18 08:18 dbma_sender_relay.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-09-18 08:18 dbma_spamfilter.frm
-rw-rw---- 1 mysql mysql 1,7K 2011-07-24 11:49 dbma_stats.frm
-rw-rw---- 1 mysql mysql 8,5K 2011-08-27 22:39 dbma_transports_error.frm
-rw-rw---- 1 mysql mysql 1,5K 2011-07-24 11:49 dbma_transports.frm
-rw-rw---- 1 mysql mysql 8,4K 2011-07-24 11:49 #sql2-704-271.frm
-rw-rw---- 1 mysql mysql  64K 2011-09-18 08:18 cms1_config.ibd
-rw-rw
Re: help needed restoring crashed mysql
Am 29.11.2011 14:08, schrieb Luis Pugoy:

> Hello. I have the following problem. I was importing a large database to
> mysql using mysqldump. Unfortunately this filled up the whole disk, and
> mysqldump exited with an error that the table it is currently writing to
> is full. Checking df -h it shows that the disk usage is at 100%. I decided
> to drop the database I was importing, but rechecking df -h shows that it
> is still at 100%. I then looked for the cause of this and found a very
> large file under /var/lib/mysql. I think it is ibdata1.

ibdata1 NEVER gets smaller; this is normal, and a huge problem in your case. Only if you are using innodb_file_per_table, which is NOT the default, would the space be released after dropping tables.

Why is this dumb innodb_file_per_table=0 the default, since MOST PEOPLE have only troubles with it because they can not free space with optimize table, and it has no real benefits?
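For reference, the setting being argued about is a one-line option-file change; a minimal sketch (section name as in a stock MySQL 5.x install):

```ini
# my.cnf - store each new InnoDB table in its own .ibd file so that
# DROP TABLE and OPTIMIZE TABLE return the space to the filesystem
[mysqld]
innodb_file_per_table = 1
```

Note that the setting only affects tables created after the change; tables already inside ibdata1 stay there until rebuilt (e.g. with ALTER TABLE ... ENGINE=InnoDB or a dump/reload), and ibdata1 itself never shrinks.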
Re: help needed restoring crashed mysql
On 29.11.2011, at 5:21, Reindl Harald wrote:

> ibdata1 NEVER gets smaller; this is normal, and a huge problem in your
> case. Only if you are using innodb_file_per_table, which is NOT the
> default, would the space be released after dropping tables.
>
> Why is this dumb innodb_file_per_table=0 the default, since MOST PEOPLE
> have only troubles with it because they can not free space with optimize
> table, and it has no real benefits?

Hi...

The logic behind this is probably that without innodb_file_per_table=1 and with several large ibdata files, the space IS freed up when one does optimize table or drop table. The space is freed up inside the database files and can be reused. If we think about it, the database product should only resolve problems of the database space management, not of the OS space management. Then, the user essentially asked InnoDB to keep allocating an arbitrary amount of space as needed, ignoring that the OS disk is actually of limited size. To be correct about it, the user should have stated that the ibdata file should have a firm limit and not autoextend beyond that.

This is not to say that MySQL could not have more of the file management features. For example, the ability to add or remove datafiles on the fly and the ability to detach tablespaces as collections of tables.

Making innodb tablespaces default... well, it still would not liberate the users from thinking whether they want to run with them enabled or not. For example, if I have 1 tables of 100 bytes each, I probably do not want tablespaces. If I have 1% of tables consuming 99% of the space, I would also not want the tablespaces.

As for the OP's problem, unless he changed his mind about the need to import, the same amount of space would anyway be consumed. The solution would probably be to find some bigger OS disk and copy that ibdata file there. Right?

Cheers
Karen
Re: help needed restoring crashed mysql
> This is not to say that MySQL could not have more of the file management
> features. For example, the ability to add or remove datafiles on the fly
> and the ability to detach tablespaces as collections of tables.

That's where MySQL (read: InnoDB) actually got stuck; it never introduced a powerful datafile management system, and that is where Oracle excels (as far as being almost an O.S.), with multiple levels of abstraction - just think of ASM. It is actually the part of Oracle I like most, as well as the really oraclish way to get stats out of it! The 'problem' with MySQL is that it is so easy to start with that people do not realize it is also a real RDBMS.

--
Claudio

> Making innodb tablespaces default... well, it still would not liberate
> the users from thinking whether they want to run with them enabled or
> not. For example, if I have 1 tables of 100 bytes each, I probably do not
> want tablespaces. If I have 1% of tables consuming 99% of the space, I
> would also not want the tablespaces.
>
> As for the OP's problem, unless he changed his mind about the need to
> import, the same amount of space would anyway be consumed. The solution
> would probably be to find some bigger OS disk and copy that ibdata file
> there. Right?
>
> Cheers
> Karen
Re: help needed restoring crashed mysql
Am 29.11.2011 20:25, schrieb Karen Abgarian:

> On 29.11.2011, at 5:21, Reindl Harald wrote:
>> why is this dumb innodb_file_per_table=0 default since MOST PEOPLE have
>> only troubles with it because they can not free space with optimize
>> table with no real benefits?
>
> The logic behind this is probably that without innodb_file_per_table=1
> and with several large ibdata files, the space IS freed up when one does
> optimize table or drop table. The space is freed up inside the database
> files and can be reused.

Well, and if you have 2 TB of mysql-data today and a year later get rid of 1 TB of it, the allocated space can be REUSED for innodb but never for any other application.

> If we think about it, the database product should only resolve problems
> of the database space management, not of the OS space management.

The database product with default settings is the part that starts the troubles of OS space management, and this is idiotic - no other words for it! MY only luck is that I recognized this years ago after PLAYING with innodb, and so I started with innodb_file_per_table=1 from the beginning, with the first production database.

> the user essentially asked InnoDB to keep allocating arbitrary amount of
> space as needed, ignoring that the OS disk is actually of the limited
> size. To be correct about it, the user should have stated that the ibdata
> file should have a firm limit and not autoextend beyond that.

Yes, this case is a user problem, but the cases where ibdata1 grows because ONCE bigger data was stored and the allocated space is never released are a design problem.
Re: help needed restoring crashed mysql
Hi... there is stuff inline there. The logic behind this is probably that without innodb_file_per_table=1 and with several large ibdata files, the space IS freed up when one does optimize table or drop table. The space is freed up inside the database files and can be reused. well, and if you have this day 2 TB mysql-data and a year later get rid of 1 TB of it they allocated space can be REUSED for innodb but never for any other application I did not say it is the right thing to not have an option to shrink the database or do file management. I tried to explain the logic that is probably put into this product. Another piece of logic is that it is not really typical for the databases to lose 50% of its volume. The databases usually either grow, or can grow, or are destroyed. In that regard the product with this feature lacking probably still covers the needs of most. By comparison, Oracle did not provide ability to drop the datafiles until, eh, version 8, I believe, and it was not made easy until version 10. If we think about it, the database product should only resolve problems of the database space management, not of the OS space management. the database producht with default settings is the part starting the troubles of os-space-managment and this is idiotic, no other words for this! I would say inconvenient. As I explained above, the OS space allocation problems that way could be considered a corner case and thus be considered unimportant by MySQL development. Considering the problem of reclaiming 1 terabyte out of 2-terabyte database, one could resolve it with creating a brand new instance followed by export/import of data. It is not that there is no solution, it is inconvenient to use. MY only luck is that i recognized this years ago after PLAYING with innodb and so i started with innodb_file_per_table=1 from the begin with the first production database Well, I would not base my database design on luck and playing. 
There should be good awareness of what the features do and what would be the plan to deal with file allocations should the database grow, shrink or somerset. but the cases where ibdata1 is growing becasue ONCE bigger data was stored and never release the allocated space is a design-problem Not exactly. A design problem is to build a server in such a way as that adding a feature to remove datafiles would be impossible (without major rebuild). I think this one can be added. I didn't bother to check, but I would be surprised if there isn't already an enhancement request for this
Re: help needed restoring crashed mysql
On Nov 29, 2011, at 11:50 AM, Claudio Nanni wrote: This is not to say that MySQL could not have more of the file management features. For example, the ability to add or remove datafiles on the fly and the ability to detach tablespaces as collections of tables. That's where MySQL(read InnoDB) got stuck actually, it never introduced a powerful datafiles management system, and that is where Oracle excels (as far as being almost a O.S.) with multiple level of abstractions, just think of ASM. It is actually the part of Oracle I like most as well as the really oraclish way to get stats out of it! The 'problem' with MySQL is that it is so easy to start with it that people do not realize that is also a real RDBMS. -- Yes, Oracle has features of the known kind. But, if we consider the wishlist for InnoDB, then on top there would be an ability to add files without stopping database (that is without innodb tablespaces), followed by the ability to control which files are part of the database (information schema of that). It looks like at some point they have decided that the file per table thing will cover all needs. Well, it doesn't. I personally do not like dealing with 50G files should individual table grow to this size. It will be much better managed with a larger collection of smaller files, without innodb tablespaces. It is after I have those I would want the features like removing files or compacting the database.
Re: help needed restoring crashed mysql
Am 30.11.2011 01:11, schrieb Karen Abgarian:

>> MY only luck is that i recognized this years ago after PLAYING with
>> innodb and so i started with innodb_file_per_table=1 from the begin with
>> the first production database
>
> Well, I would not base my database design on luck and playing. There
> should be good awareness of what the features do and what would be the
> plan to deal with file allocations should the database grow, shrink or
> somerset

If you have been working for many years with mysql and myisam you normally do not expect this - and no, my work never really depends on luck, and that is why I look in ALL datadirs of all software I am using, and missed the table files known from myisam. Most people DO NOT care about this and do not expect that allocated space will not be freed; nor does it make any sense to dump/import a whole database server because you got rid of big databases.

> Another piece of logic is that it is not really typical for the databases
> to lose 50% of its volume.

Well, install http://www.dbmail.org/ with replication and offsite backups of the slave, get rid of your biggest mail user, and think about how useful it is to waste all this space as before, multiple times over, in the backup storages.

It never happened to me - but I have heard so many people start whining because of the mysql defaults, and these are mostly people who do not have our knowledge to handle this beforehand, and who in this case are also not able to handle dump/import in a production environment.
Re: help needed restoring crashed mysql
Hi... and some more stuff inline.

> if you are working many years with mysql and myisam you normally do not expect this ... snip ...

I am not saying there is anything wrong with your experience. I understand that people who have worked with MyISAM might have a different mindset. But at some point they did make a decision to use InnoDB. I think that is the time when I would ask myself the question: where is it that my data will be stored? The concept is not difficult to explain. Most people do not expect a gas tank to shrink once the gas is consumed... right?

>> Another piece of logic is that it is not really typical for databases to lose 50% of their volume.
> well, so install http://www.dbmail.org/ with replication and offsite backups of the slave, get rid of your biggest mail user, and think about how useful it is to waste all this space as before, multiplied across the backup storages

I haven't had the pleasure of using dbmail. I presume it does something with mail users. Thinking logically, if I got rid of my biggest mail user, I might eventually get another, even bigger one, which would consume the same space vacated by the deceased user. So why would I want to give up the space?
> it never happened to me, but I have heard so many people start whining because of the MySQL defaults; and these are mostly people who do not have our knowledge to handle this beforehand, and who in this case are also not able to handle a dump/import in a production environment

If people do not have the knowledge to do exports/imports, the brand new and cool file management feature will not help them either. Essentially, the instructions on how to use that feature are written in the same book as the instructions on how to do exports/imports. If we take it as given that people will never read, it is a dead end.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
Re: Help with a query
Hi Aveek,

You need to use something like UNION ALL and HAVING to get the desired result. Follow the example below:

select file, digest
from (
  SELECT file, digest, Count(*) as Cnt FROM A GROUP BY file, digest
  union all
  SELECT file, digest, Count(*) as Cnt FROM B GROUP BY file, digest
) tmp
group by file, digest
HAVING Sum(Cnt) <> (Select sum(refcount) from C
                    WHERE tmp.file = C.file and tmp.digest = C.digest);

--Anupam

--- On Tue, 17/5/11, Aveek Misra ave...@yahoo-inc.com wrote:

From: Aveek Misra ave...@yahoo-inc.com
Subject: Help with a query
To: mysql@lists.mysql.com
Date: Tuesday, 17 May, 2011, 1:23 PM

I have a table A and a table B of the same type:

CREATE TABLE A (
  `file` varbinary(255) not null,
  `digest` binary(40) not null
)

CREATE TABLE B (
  `file` varbinary(255) not null,
  `digest` binary(40) not null
)

I have another table C of the following type:

CREATE TABLE C (
  `file` varbinary(255) not null,
  `digest` binary(40) not null,
  `refcount` bigint(20) not null
)

I need to write a query where, for the same file and digest in table A and table B, the refcount in table C is not the same. So:

SELECT COUNT(*) as count1 FROM A WHERE file='file1' AND digest='digest1';
SELECT COUNT(*) as count2 FROM B WHERE file='file1' AND digest='digest1';

and then adding up the two counts from these queries and comparing the sum with the result of the following query:

SELECT refcount FROM C WHERE file='file1' AND digest='digest1';

Basically (refcount == (count1 + count2)) should be true, and I am interested in finding all records in table C where this is not the case.

Thanks
Aveek
Re: Help with a query
I eventually came up with a solution myself, although the query is a bit different:

SELECT C.file, C.digest, (a.cnt_A + b.cnt_B) AS total_count, C.refcount
FROM C,
     (SELECT file, digest, COUNT(file) AS cnt_A FROM A GROUP BY file, digest) as a,
     (SELECT file, digest, COUNT(file) AS cnt_B FROM B GROUP BY file, digest) as b
WHERE C.file = a.file and C.digest = a.digest
  and C.file = b.file and C.digest = b.digest
  and C.refcount <> (a.cnt_A + b.cnt_B);

Thanks
Aveek

On May 20, 2011, at 1:52 PM, Anupam Karmarkar wrote:
... snip ...
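For readers who want to sanity-check the logic of Aveek's final query, here is a minimal sketch against toy data, using SQLite through Python's sqlite3 module (table contents are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE A (file TEXT, digest TEXT);
CREATE TABLE B (file TEXT, digest TEXT);
CREATE TABLE C (file TEXT, digest TEXT, refcount INTEGER);
-- f1/d1 occurs twice in A and once in B (true total 3);
-- f2/d2 occurs once in each (true total 2).
INSERT INTO A VALUES ('f1','d1'),('f1','d1'),('f2','d2');
INSERT INTO B VALUES ('f1','d1'),('f2','d2');
INSERT INTO C VALUES ('f1','d1',3),('f2','d2',5);  -- 5 is stale
""")
rows = cur.execute("""
SELECT C.file, C.digest, (a.cnt_A + b.cnt_B) AS total_count, C.refcount
FROM C,
     (SELECT file, digest, COUNT(file) AS cnt_A FROM A GROUP BY file, digest) AS a,
     (SELECT file, digest, COUNT(file) AS cnt_B FROM B GROUP BY file, digest) AS b
WHERE C.file = a.file AND C.digest = a.digest
  AND C.file = b.file AND C.digest = b.digest
  AND C.refcount <> (a.cnt_A + b.cnt_B)
""").fetchall()
print(rows)  # [('f2', 'd2', 2, 5)] -- only the out-of-sync pair
```

One caveat of this comma-join form: a (file, digest) pair missing entirely from A or from B drops out of the derived tables and is never checked, whereas the UNION ALL variant posted earlier still sees it through the other table.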
Re: Help with slow query
Shawn,

Thanks for the great help! It still is not working. I did an EXPLAIN on this query with your amended split-out join statements and got this:

+----+-------------+-------+-------+---------------+------------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------+------------+---------+------+--------+-------------+
|  1 | SIMPLE      | a     | range | apptdate      | apptdate   | 4       | NULL | 296148 | Using where |
|  1 | SIMPLE      | p     | ALL   | NULL          | NULL       | NULL    | NULL | 262462 |             |
|  1 | SIMPLE      | t     | ALL   | NULL          | NULL       | NULL    | NULL | 311152 |             |
|  1 | SIMPLE      | c     | ref   | IdApptType    | IdApptType | 51      | func |      1 |             |
|  1 | SIMPLE      | af    | ALL   | NULL          | NULL       | NULL    | NULL |   5680 |             |
+----+-------------+-------+-------+---------------+------------+---------+------+--------+-------------+

What I'm not catching is why it says there is no key it can use for the patient table; here is a portion of the show create:

PRIMARY KEY (`zzk`),
KEY `IdPatient` (`IdPatient`),
KEY `SSN` (`SSN`),
KEY `IdLastword` (`IdLastword`),
KEY `DOB` (`DateOfBirth`),
KEY `NameFirst` (`NameFirst`),
KEY `NameLast` (`NameLast`)

So, IdPatient is at least a POSSIBLE key, right?

On Mar 9, 2011, at 8:34 PM, Shawn Green (MySQL) wrote:

Hi Jim,

On 3/9/2011 17:57, Jim McNeely wrote:

I am trying to set up an export query which is executing very slowly, and I was hoping I could get some help. Here is the query:

SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI,
  a.IdProcedure, a.ProcName,
  CAST(CONCAT(a.ApptDate, ' ', a.ApptTimeOut) AS CHAR) ApptDateTime,
  a.ApptLenMin Duration, a.Room Primary_Resource_ID,
  'R' as Primary_Resource_Type, t.Note, c.IdPractAffil DoctorID,
  af.PractName DoctorName, a.ICD9code ICD9, '??' Diagnosis_free_test
from Appt_ a
LEFT JOIN (patient_ p, today_ t, Copy_ c, Affil_ af)
  ON (a.IdPatient = p.IdPatient
  AND a.IdPatientDate = t.IdPatientDate
  AND CONCAT(a.IdAppt, '0') = c.IdApptType
  AND a.IdPriCarePhy = af.IdAffil)
WHERE a.ApptDate >= '2009-03-01';

p.IdPatient, t.IdPatientDate, c.IdApptType, and af.IdAffil are all indexed.
Also I selectively took out join parameters until there was nothing but a join on the patient table, and it was still slow; but when I took that out, the query was extremely fast. What might I be doing wrong?

Thanks,
Jim McNeely

The performance problem is with your Cartesian product. I think you meant to write:

from Appt_ a
LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient
LEFT JOIN today_ t ON a.IdPatientDate = t.IdPatientDate
LEFT JOIN Copy_ c ON CONCAT(a.IdAppt, '0') = c.IdApptType
LEFT JOIN Affil_ af ON a.IdPriCarePhy = af.IdAffil

As of 5.0.12, the comma operator for table joins was demoted in the 'order of precedence' for query execution. That means that MySQL became more compliant with the SQL standard, but it also means that using a comma-join instead of an explicit ANSI join can result in a Cartesian product more frequently.

Try my style and compare how it works. If both styles are similarly slow, collect the EXPLAIN plan for this query and share it with the list.

Yours,
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
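Shawn's rewrite, with each LEFT JOIN carrying its own ON clause, can be exercised on a toy schema. A minimal sketch using SQLite through Python's sqlite3 (table and column names invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE appt    (id INTEGER, patient_id INTEGER, phy_id INTEGER);
CREATE TABLE patient (patient_id INTEGER, name TEXT);
CREATE TABLE affil   (phy_id INTEGER, pract TEXT);
INSERT INTO appt VALUES (1, 10, 100), (2, 11, 999);  -- phy 999 unknown
INSERT INTO patient VALUES (10, 'Smith'), (11, 'Jones');
INSERT INTO affil VALUES (100, 'Dr. A');
""")
# Each LEFT JOIN has its own ON condition, so an unmatched lookup
# (phy_id 999) surfaces as NULL instead of multiplying row counts
# the way an unconstrained comma-join would.
rows = cur.execute("""
SELECT a.id, p.name, af.pract
FROM appt a
LEFT JOIN patient p ON a.patient_id = p.patient_id
LEFT JOIN affil af  ON a.phy_id     = af.phy_id
ORDER BY a.id
""").fetchall()
print(rows)  # [(1, 'Smith', 'Dr. A'), (2, 'Jones', None)]
```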
Re: Help with slow query
Rhino,

Thanks for the help and time! Actually, I thought the same thing, but what's weird is that this is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one, the query runs virtually instantaneously; for some reason it will use the index in that case and it works. If I take out everything like this:

SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI
from Appt_ a
LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient
WHERE a.ApptDate >= '2009-03-01';

it is still utterly slow. EXPLAIN looks like this:

+----+-------------+-------+-------+---------------+----------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------+----------+---------+------+--------+-------------+
|  1 | SIMPLE      | a     | range | apptdate      | apptdate | 4       | NULL | 296166 | Using where |
|  1 | SIMPLE      | p     | ALL   | NULL          | NULL     | NULL    | NULL | 262465 |             |
+----+-------------+-------+-------+---------------+----------+---------+------+--------+-------------+

But, very good try. I thought this might be it as well.

Thanks,
Jim McNeely

On Mar 10, 2011, at 9:05 AM, Rhino wrote:

What I'm about to say may be completely out to lunch, so don't be afraid to dismiss it. I'm more a DB2 guy than a MySQL guy and I'm getting rusty on both, but I've always been struck by how similar the two databases are. Therefore, I want to offer an insight on why this query would not perform terribly well in DB2. I simply don't know if it is applicable to MySQL.

In DB2, using functions on predicates (conditions in a WHERE clause) prevents DB2 from using an index to satisfy that predicate. (Or at least it used to; I'm not certain whether that has been remedied in recent versions of the DB2 optimizer.) Therefore, the CONCAT() function in the line

AND CONCAT(a.IdAppt, '0') = c.IdApptType

would ensure that no index on the IdAppt column would be used to find the rows of the table that satisfied that condition. My suggestion is that you try rewriting that condition to avoid using CONCAT() - or any other function - and see if that helps the performance of your query.
That would require modifying your data to append a zero to the end of the existing value in the IdApptType column, which may or may not be a reasonable thing to do. You'll have to decide about that.

Again, I could be all wet here, so don't have me killed if I'm wrong about this :-) I'm just trying to help ;-)

--
Rhino

On 2011-03-10 11:38, Jim McNeely wrote:
... snip ...
Re: Help with slow query
If the optimizer chooses the wrong index, you can tell it what index to use:

SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI
from Appt_ a force index(id_patient)
LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient
WHERE a.ApptDate >= '2009-03-01';

See http://dev.mysql.com/doc/refman/5.1/en/index-hints.html

Mike

At 11:32 AM 3/10/2011, Jim McNeely wrote:
... snip ...
Re: Help with slow query
On 3/10/2011 12:32, Jim McNeely wrote:

Rhino,

Thanks for the help and time! Actually, I thought the same thing, but what's weird is that this is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one, the query runs virtually instantaneously; for some reason it will use the index in that case and it works. If I take out everything like this:

SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI
from Appt_ a
LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient
WHERE a.ApptDate >= '2009-03-01';

it is still utterly slow. EXPLAIN looks like this:

+----+-------------+-------+-------+---------------+----------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------+----------+---------+------+--------+-------------+
|  1 | SIMPLE      | a     | range | apptdate      | apptdate | 4       | NULL | 296166 | Using where |
|  1 | SIMPLE      | p     | ALL   | NULL          | NULL     | NULL    | NULL | 262465 |             |
+----+-------------+-------+-------+---------------+----------+---------+------+--------+-------------+

But, very good try. I thought this might be it as well.

... snip ...

According to this report, there are no indexes on the `patient_` table that include the column `IdPatient` as the first column. Fix that and this query should be much faster.

--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
Re: Help with slow query
Shawn,

This is the first thing that I thought as well, but here is a portion from the show create table for patient_:

PRIMARY KEY (`zzk`),
KEY `IdPatient` (`IdPatient`),
KEY `SSN` (`SSN`),
KEY `IdLastword` (`IdLastword`),
KEY `DOB` (`DateOfBirth`),
KEY `NameFirst` (`NameFirst`),
KEY `NameLast` (`NameLast`)

This extremely simple join is still massively slow.

Jim

On Mar 10, 2011, at 10:00 AM, Shawn Green (MySQL) wrote:
... snip ...
Re: Help with slow query
On 3/10/2011 13:12, Jim McNeely wrote:
... snip ...

1) Verify that the indexes on `patient_` haven't been disabled:

SHOW INDEXES FROM `patient_`;

http://dev.mysql.com/doc/refman/5.5/en/show-index.html

2) Verify that the data types of `Appt_`.`IdPatient` and `patient_`.`IdPatient` are not incompatible (for example: one is varchar, the other int).

Thanks,
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
Re: Help with slow query
On 3/10/11 10:46 AM, Shawn Green (MySQL) wrote:

> 1) Verify that the indexes on `patient_` haven't been disabled: SHOW INDEXES FROM `patient_`;
> 2) Verify that the data types of `Appt_`.`IdPatient` and `patient_`.`IdPatient` are not incompatible (for example: one is varchar, the other int).

This last one can be HUGE. I tracked a big performance issue to this exact problem: the columns used in the join had the same name, but different data types. Correcting them to the same type (both ints) made a terrific performance increase.

--
Andy Wallace
iHOUSEweb, Inc.
awall...@ihouseweb.com
(866) 645-7700 ext 219
--
There are two ways to build software: Make it so simple that there are obviously no bugs, or make it so complex that there are no obvious bugs.
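The mismatched-type hazard Shawn and Andy describe can be reproduced in miniature. In the SQLite sketch below (schema invented for illustration), the join columns are declared with no type at all, so SQLite applies no coercion and the text '42' simply never matches the integer 42; MySQL would instead coerce the values row by row, which is exactly what defeats the index on the joined column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE appt (id_patient);     -- no declared type: values keep their own type
CREATE TABLE patient (id_patient);
INSERT INTO appt VALUES ('42');     -- stored as text
INSERT INTO patient VALUES (42);    -- stored as integer
""")
rows = cur.execute(
    "SELECT * FROM appt a JOIN patient p ON a.id_patient = p.id_patient"
).fetchall()
print(rows)   # [] -- same-looking values, different types, no match
mixed = cur.execute("SELECT 42 = '42'").fetchone()[0]
print(mixed)  # 0 -- an integer and its string spelling compare unequal
```

The fix on the MySQL side is the one Andy applied: ALTER the columns so both sides of the join have the same declared type.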
Re: Help with slow query
Hi Jim,

On 3/9/2011 17:57, Jim McNeely wrote:

I am trying to set up an export query which is executing very slowly, and I was hoping I could get some help. Here is the query:

SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI,
  a.IdProcedure, a.ProcName,
  CAST(CONCAT(a.ApptDate, ' ', a.ApptTimeOut) AS CHAR) ApptDateTime,
  a.ApptLenMin Duration, a.Room Primary_Resource_ID,
  'R' as Primary_Resource_Type, t.Note, c.IdPractAffil DoctorID,
  af.PractName DoctorName, a.ICD9code ICD9, '??' Diagnosis_free_test
from Appt_ a
LEFT JOIN (patient_ p, today_ t, Copy_ c, Affil_ af)
  ON (a.IdPatient = p.IdPatient
  AND a.IdPatientDate = t.IdPatientDate
  AND CONCAT(a.IdAppt, '0') = c.IdApptType
  AND a.IdPriCarePhy = af.IdAffil)
WHERE a.ApptDate >= '2009-03-01';

p.IdPatient, t.IdPatientDate, c.IdApptType, and af.IdAffil are all indexed. Also I selectively took out join parameters until there was nothing but a join on the patient table, and it was still slow; but when I took that out, the query was extremely fast. What might I be doing wrong?

Thanks,
Jim McNeely

The performance problem is with your Cartesian product. I think you meant to write:

from Appt_ a
LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient
LEFT JOIN today_ t ON a.IdPatientDate = t.IdPatientDate
LEFT JOIN Copy_ c ON CONCAT(a.IdAppt, '0') = c.IdApptType
LEFT JOIN Affil_ af ON a.IdPriCarePhy = af.IdAffil

As of 5.0.12, the comma operator for table joins was demoted in the 'order of precedence' for query execution. That means that MySQL became more compliant with the SQL standard, but it also means that using a comma-join instead of an explicit ANSI join can result in a Cartesian product more frequently.

Try my style and compare how it works. If both styles are similarly slow, collect the EXPLAIN plan for this query and share it with the list.

Yours,
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
RE: Help with ORDER BY
SELECT name, city, state, phone, prods_done, cancels, miles
FROM (SELECT name, city, state, phone, prods_done, cancels, miles,
             ((prods_done - cancels) * 100 / prods_done) reliability
      FROM volunteer_search
      WHERE project_id = 5653) A
ORDER BY reliability DESC, miles ASC;

Give it a try !!!

Rolando A. Edwards
MySQL DBA (SCMDBA)
155 Avenue of the Americas, Fifth Floor
New York, NY 10013
212-625-5307 (Work)
201-660-3221 (Cell)
AIM/Skype: RolandoLogicWorx
redwa...@logicworks.net
http://www.linkedin.com/in/rolandoedwards

-----Original Message-----
From: Richard Reina [mailto:gatorre...@gmail.com]
Sent: Monday, February 07, 2011 1:08 PM
To: mysql@lists.mysql.com
Subject: Help with ORDER BY

I currently have a query that organizes search results for volunteers that should be called for projects, based on how close they live to the project and their past attendance. Currently I am doing:

SELECT name, city, state, phone, prods_done, cancels, miles
FROM volunteer_search
WHERE project_id = 5653
ORDER BY miles ASC, cancels/(prods_done/cancels) ASC, prods_done DESC;

The results look something like this:

Jim Barnes    Chicago, IL     773-555-       2   1  11.5
Kelley Smith  Cicero, IL      708-444-2121   3   0  21.6
Kim Ayers     Plainfield, IL  630-888-9898  22   1  25.1

I am trying to find a way to give more weight to reliability (projects done and cancels). For example, the fact that Kim Ayers has done 22 projects makes her more desirable than either Jim Barnes or Kelley Smith, even though she has 1 cancel and is farther from the project than both the other candidates. The desired order would be:

Kim Ayers     Plainfield, IL  630-888-9898  22   1  25.1
Kelley Smith  Cicero, IL      708-444-2121   3   0  21.6
Jim Barnes    Chicago, IL     773-555-       2   1  11.5

Can the query be modified to take such considerations into account? I realize that I have not really quantified how much reliability and the number of projects done trump distance from the project, but in the beginning I am willing to set that somewhat arbitrarily and adjust it later.
I realize that this may be beyond the scope of a MySQL query. If so, I am grateful for any ideas on where to go for info/help.

Thanks,
Richard
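Since the weighting is admittedly arbitrary at first, it can be prototyped application-side before baking it into an ORDER BY expression. A small Python sketch using Richard's three sample rows; the weights (100-point reliability scale, 2 points per project done, minus one point per mile) are invented for illustration and should be tuned:

```python
# Rows as (name, prods_done, cancels, miles), from the sample result set.
volunteers = [
    ("Jim Barnes",    2, 1, 11.5),
    ("Kelley Smith",  3, 0, 21.6),
    ("Kim Ayers",    22, 1, 25.1),
]

def score(v):
    name, done, cancels, miles = v
    # Percentage of taken projects actually completed (guard against 0).
    reliability = (done - cancels) * 100.0 / done if done else 0.0
    # Reliability and volume push a volunteer up; distance pushes down.
    return reliability + 2.0 * done - miles

ranked = sorted(volunteers, key=score, reverse=True)
print([v[0] for v in ranked])  # ['Kim Ayers', 'Kelley Smith', 'Jim Barnes']
```

Once the weights settle, the same arithmetic can move straight into the query, e.g. ORDER BY (prods_done - cancels) * 100 / prods_done + 2 * prods_done - miles DESC.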
Re: Help with query.
Hi Paul!

Paul Halliday wrote:

I have a query (thanks to this list) that uses a join to add country information to an IP. It looks like this:

SELECT COUNT(signature) AS count, INET_NTOA(src_ip), map1.cc as src_cc,
       INET_NTOA(dst_ip), map2.cc as dst_cc
FROM event
LEFT JOIN mappings AS map1 ON event.src_ip = map1.ip
LEFT JOIN mappings AS map2 ON event.dst_ip = map2.ip
WHERE timestamp BETWEEN '2011-01-29 00:00:00' AND '2011-01-30 00:00:00'
GROUP BY src_ip, src_cc, dst_ip, dst_cc
ORDER BY src_cc, dst_cc ASC;

This would return something like this:

+-------+-------------------+--------+-------------------+--------+
| count | INET_NTOA(src_ip) | src_cc | INET_NTOA(dst_ip) | dst_cc |
+-------+-------------------+--------+-------------------+--------+
|     8 | 10.0.0.8          | NULL   | 61.55.142.129     | CN     |
|     1 | 210.52.216.92     | CN     | 10.0.0.2          | NULL   |
|     1 | 121.33.205.235    | CN     | 172.16.0.6        | NULL   |
|   239 | 210.52.216.92     | CN     | 10.0.0.2          | NULL   |
|     2 | 121.33.205.235    | CN     | 172.16.0.15       | NULL   |
|     4 | 121.33.205.235    | CN     | 10.0.0.1          | NULL   |
|    39 | 210.52.216.92     | CN     | 172.16.0.15       | NULL   |
|     1 | 121.33.205.235    | CN     | 172.16.0.14       | NULL   |
+-------+-------------------+--------+-------------------+--------+

All I am interested in is the event count for each country, in this case:

295 CN
... other countries ...

As a first step, remove the columns src_ip and dst_ip from your query, both from the select list and from the group by (but not from the join condition):

SELECT COUNT(signature) AS count, map1.cc as src_cc, map2.cc as dst_cc
FROM event
LEFT JOIN mappings AS map1 ON event.src_ip = map1.ip
LEFT JOIN mappings AS map2 ON event.dst_ip = map2.ip
WHERE timestamp BETWEEN '2011-01-29 00:00:00' AND '2011-01-30 00:00:00'
GROUP BY src_cc, dst_cc
ORDER BY src_cc, dst_cc ASC;

The result should be:

+-------+--------+--------+
| count | src_cc | dst_cc |
+-------+--------+--------+
|     8 | NULL   | CN     |
|   287 | CN     | NULL   |
+-------+--------+--------+

Now, you are left with two problems:

1) Your query still groups by the country codes of both source and destination (which doesn't become obvious with your sample data, as one of these is always shown as NULL).
For example: traffic just between three countries (each combination occurring) would give nine rows, with each country occurring three times as source and three times as destination. If you want the total for the source country, you must stop grouping by destination (and vice versa).

2) If you really want the total of source and destination (the 295 in your example, not the 287 and 8 I expect from my version), it really might be easiest to do this in the application; I have no idea how to do it nicely in SQL.

HTH,
Jörg

--
Joerg Bruehe, MySQL Build Team, joerg.bru...@oracle.com
ORACLE Deutschland B.V. & Co. KG, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz, Marcel v.d. Molen, Alexander v.d. Ven
Amtsgericht Muenchen: HRA 95603
Re: Help with query.
On Tue, 1 Feb 2011 14:46:39 -0400 Paul Halliday <paul.halli...@gmail.com> wrote:
> I have a query (thanks to this list) that uses a join to add country
> information to an IP. It looks like this:
>
> SELECT COUNT(signature) AS count,
>        INET_NTOA(src_ip), map1.cc AS src_cc,
>        INET_NTOA(dst_ip), map2.cc AS dst_cc
> FROM event
> LEFT JOIN mappings AS map1 ON event.src_ip = map1.ip
> LEFT JOIN mappings AS map2 ON event.dst_ip = map2.ip
> WHERE timestamp BETWEEN '2011-01-29 00:00:00' AND '2011-01-30 00:00:00'
> GROUP BY src_ip, src_cc, dst_ip, dst_cc
> ORDER BY src_cc, dst_cc ASC;
> ...
> All I am interested in is the event count for each country, in this case:
>
> 295 CN
> ... Other countries..

try:

SELECT COUNT(*) AS count, IF(map1.cc, map1.cc, map2.cc) AS country
FROM event
LEFT JOIN mappings AS map1 ON event.src_ip = map1.ip
LEFT JOIN mappings AS map2 ON event.dst_ip = map2.ip
WHERE timestamp BETWEEN '2011-01-29 00:00:00' AND '2011-01-30 00:00:00'
GROUP BY country
ORDER BY country;

Note, I am assuming in this query that you have either a source country or a destination country, but not both. If both values might be set, and you need to count each, this will not work.

> Thanks!
> --
> Paul Halliday
> http://www.pintumbler.org

--
Simcha Younger
sim...@syounger.com
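Simcha's IF(map1.cc, map1.cc, map2.cc) is a pick-the-first-non-NULL choice (COALESCE would behave the same for NULLs). A small Python sketch of the same grouping logic over made-up event tuples shows the caveat he mentions: when both countries are set, only the source side is counted:

```python
# Each event carries (src_cc, dst_cc); None plays the role of SQL NULL.
# Sample data is invented for illustration.
events = [("CN", None), (None, "CN"), ("CN", "US"), (None, "DE")]

counts = {}
for src_cc, dst_cc in events:
    country = src_cc if src_cc is not None else dst_cc  # mirrors IF(map1.cc, map1.cc, map2.cc)
    counts[country] = counts.get(country, 0) + 1

print(counts)  # the ("CN", "US") event is counted only under CN, never US
```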
Re: Help with Date in Where Clause
On Monday 31 January 2011 21:12, Phillip Baker wrote:
> Greetings All,
>
> I am looking for a little help in setting a where clause. I have a
> dateAdded field that is a DATETIME field. I am looking to pull records
> from midnight to midnight the previous day. I thought just passing the
> date (without time) would get it but I keep getting an empty record set.
> So looking for something that works a bit better.

select * from your_table where convert(dateAdded, date) = '2011-01-31';

--
Jørn Dahl-Stamnes
homepage: http://www.dahl-stamnes.net/dahls/
Re: Help with Date in Where Clause
Thank you very much Jørn

Blessed Be

Phillip
Never ascribe to malice what can be explained by incompetence -- Hanlon's Razor

On Mon, Jan 31, 2011 at 1:18 PM, Jørn Dahl-Stamnes <sq...@dahl-stamnes.net> wrote:
> Jørn
Re: Help with Date in Where Clause
On 1/31/2011 15:12, Phillip Baker wrote:
> Greetings All,
>
> I am looking for a little help in setting a where clause. I have a
> dateAdded field that is a DATETIME field. I am looking to pull records
> from midnight to midnight the previous day. I thought just passing the
> date (without time) would get it but I keep getting an empty record set.
> So looking for something that works a bit better.
>
> Any suggestions?
>
> Blessed Be
> Phillip
> Never ascribe to malice what can be explained by incompetence -- Hanlon's Razor

All of the datetime values for yesterday actually exist as a range of datetime values between midnight that morning (inclusive) and midnight the next morning (not part of the search). So your WHERE clause needs to resemble:

... WHERE dtcolumn >= '2011-01-21 00:00:00' AND dtcolumn < '2011-01-22 00:00:00'

This pattern has the added advantage of not eliminating the possibility of using an INDEX on the dtcolumn column, since the column is not wrapped inside a function.

Yours,
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc.
Office: Blountville, TN
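Shawn's half-open [start, end) range is easy to compute in application code and bind into the query as parameters. A minimal sketch (the %s placeholder style is an assumption; adapt it to your driver):

```python
from datetime import date, datetime, time, timedelta

def yesterday_range(today: date):
    """Half-open [start, end) datetime range covering all of the day before `today`."""
    end = datetime.combine(today, time.min)   # today at 00:00:00 (excluded)
    start = end - timedelta(days=1)           # yesterday at 00:00:00 (included)
    return start, end

start, end = yesterday_range(date(2011, 1, 22))
# Bind into: WHERE dtcolumn >= %s AND dtcolumn < %s
print(start, end)  # 2011-01-21 00:00:00 2011-01-22 00:00:00
```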
Re: Help with Date in Where Clause
On 01/31/2011 12:18 PM, Jørn Dahl-Stamnes wrote:
> On Monday 31 January 2011 21:12, Phillip Baker wrote:
>> I am looking for a little help in setting a where clause. I have a
>> dateAdded field that is a DATETIME field. I am looking to pull records
>> from midnight to midnight the previous day. I thought just passing the
>> date (without time) would get it but I keep getting an empty record set.
>> So looking for something that works a bit better.
>
> select * from your_table where convert(dateAdded, date)='2011-01-31';

not so good, but it works:

select * from your_table where dateAdded like '2011-01-31%';

OR

select * from your_table where dateAdded between '2011-01-31 00:00:00' and '2011-01-31 23:59:59';

better:

select * from your_table where convert(dateAdded, date) = date_sub(curdate(), interval 1 day);
Re: help with query
Thank you, that did the trick.

Simon

On 11 January 2011 12:09, Steve Meyers <steve-mysql-l...@spamwiz.com> wrote:
> On 1/11/11 9:31 AM, Simon Wilkinson wrote:
>> select users.id from users where users.id in (select newsletters.user_id
>> from newsletters left join articles on newsletters.id =
>> articles.newsletter_id where articles.newsletter_id is null);
>
> I think this would do what you require:
>
> SELECT u.id AS user_id,
>        COUNT(DISTINCT n.id) AS num_newsletters,
>        COUNT(DISTINCT a.id) AS num_articles
> FROM users u
> JOIN newsletters n ON n.user_id = u.id
> LEFT JOIN articles a ON a.newsletter_id = n.id
> GROUP BY u.id
> HAVING num_newsletters > 0 AND num_articles = 0
Re: help with query
On 1/11/11 9:31 AM, Simon Wilkinson wrote:
> select users.id from users where users.id in (select newsletters.user_id
> from newsletters left join articles on newsletters.id =
> articles.newsletter_id where articles.newsletter_id is null);

I think this would do what you require:

SELECT u.id AS user_id,
       COUNT(DISTINCT n.id) AS num_newsletters,
       COUNT(DISTINCT a.id) AS num_articles
FROM users u
JOIN newsletters n ON n.user_id = u.id
LEFT JOIN articles a ON a.newsletter_id = n.id
GROUP BY u.id
HAVING num_newsletters > 0 AND num_articles = 0
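Steve's JOIN / LEFT JOIN / HAVING pattern can be checked end to end against a toy schema. A sketch using Python's built-in sqlite3 (the table contents are invented; SQLite, like MySQL, lets HAVING refer to select-list aliases):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users       (id INTEGER PRIMARY KEY);
    CREATE TABLE newsletters (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE articles    (id INTEGER PRIMARY KEY, newsletter_id INTEGER);

    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO newsletters VALUES (10, 1), (11, 2);  -- user 3 has no newsletter
    INSERT INTO articles VALUES (100, 10);            -- only user 1's newsletter has an article
""")

# Users who have at least one newsletter but no articles in any of them.
rows = conn.execute("""
    SELECT u.id AS user_id,
           COUNT(DISTINCT n.id) AS num_newsletters,
           COUNT(DISTINCT a.id) AS num_articles
    FROM users u
    JOIN newsletters n ON n.user_id = u.id
    LEFT JOIN articles a ON a.newsletter_id = n.id
    GROUP BY u.id
    HAVING num_newsletters > 0 AND num_articles = 0
""").fetchall()

print(rows)  # [(2, 1, 0)] -- user 2: one newsletter, zero articles
```

User 1 is excluded because their newsletter has an article, and user 3 never survives the inner JOIN.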
Re: Help needed with what appears to be a corrupted innodb db
Pito, can you show us the InnoDB parameters in the my.cnf file?

regards
anandkl

On Sat, Jan 8, 2011 at 10:31 PM, Pito Salas <r...@salas.com> wrote:
> I am very new to trying to solve a problem like this and have searched and
> searched the web for a useful troubleshooting guide but I am honestly
> stuck. I wonder if anyone out there could take a look at this bit of
> mysqld log. Any help at all would be greatly appreciated!!!
>
> Pito
>
> 110107 15:07:15  mysqld started
> 110107 15:07:15  InnoDB: Database was not shut down normally!
> InnoDB: Starting crash recovery.
> InnoDB: Reading tablespace information from the .ibd files...
> InnoDB: Restoring possible half-written data pages from the doublewrite
> InnoDB: buffer...
> 110107 15:07:15  InnoDB: Starting log scan based on checkpoint at
> InnoDB: log sequence number 35 515914826.
> InnoDB: Doing recovery: scanned up to log sequence number 35 515915839
> InnoDB: 1 transaction(s) which must be rolled back or cleaned up
> InnoDB: in total 1 row operations to undo
> InnoDB: Trx id counter is 0 1697553664
> 110107 15:07:15  InnoDB: Starting an apply batch of log records to the database...
> InnoDB: Progress in percents: 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
> 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
> 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
> 91 92 93 94 95 96 97 98 99
> InnoDB: Apply batch completed
> InnoDB: Starting rollback of uncommitted transactions
> InnoDB: Rolling back trx with id 0 1697553198, 1 rows to undo
> InnoDB: Error: trying to access page number 3522914176 in space 0,
> InnoDB: space name ./ibdata1,
> InnoDB: which is outside the tablespace bounds.
> InnoDB: Byte offset 0, len 16384, i/o type 10
> 110107 15:07:15  InnoDB: Assertion failure in thread 3086403264 in file fil0fil.c line 3922
> InnoDB: We intentionally generate a memory trap.
> InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
> InnoDB: If you get repeated assertion failures or crashes, even
> InnoDB: immediately after the mysqld startup, there may be
> InnoDB: corruption in the InnoDB tablespace. Please refer to
> InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
> InnoDB: about forcing recovery.
> mysqld got signal 11;
> This could be because you hit a bug. It is also possible that this binary
> or one of the libraries it was linked against is corrupt, improperly built,
> or misconfigured. This error can also be caused by malfunctioning hardware.
> We will try our best to scrape up some info that will hopefully help
> diagnose the problem, but since we have already crashed, something is
> definitely wrong and this may fail.
>
> key_buffer_size=0
> read_buffer_size=131072
> max_used_connections=0
> max_connections=100
> threads_connected=0
> It is possible that mysqld could use up to
> key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 217599 K
> bytes of memory
> Hope that's ok; if not, decrease some variables in the equation.
>
> thd=(nil)
> Attempting backtrace. You can use the following information to find out
> where mysqld died. If you see no messages after this, something went
> terribly wrong...
> Cannot determine thread, fp=0xbffc55ac, backtrace may not be correct.
> Stack range sanity check OK, backtrace follows:
> 0x8139eec 0x83721d5 0x833d897 0x833db71 0x832aa38 0x835f025 0x835f7a3
> 0x830a77e 0x8326b57 0x831c825 0x8317b8d 0x82a9e66 0x8315732 0x834fc9a
> 0x828d7c3 0x81c29dd 0x81b5620 0x813d9fe 0x40fdf3 0x80d5ff1
> New value of fp=(nil) failed sanity check, terminating stack trace!
> Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html
> and follow instructions on how to resolve the stack trace. Resolved stack
> trace is much more helpful in diagnosing the problem, so please do resolve it.
> The manual page at http://www.mysql.com/doc/en/Crashing.html contains
> information that should help you find out what is causing the crash.
> 110107 15:07:15  mysqld ended
>
> --
> Check out http://www.salas.com and http://www.blogbridge.com/look
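The manual page the log points to (Forcing_recovery) describes the innodb_force_recovery option: set it in my.cnf, start mysqld, and dump what you can before rebuilding the tablespace. A minimal sketch of the my.cnf fragment (the level shown is illustrative; start at 1 and raise it only as far as needed to keep mysqld up):

```ini
[mysqld]
# Forced recovery mode: 1 (least intrusive) .. 6.
# The server runs in a restricted mode at these levels; use it to
# mysqldump the data, then drop and rebuild the tablespace.
innodb_force_recovery = 1
```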
Re: help with replication
thanks

On Wed, Aug 18, 2010 at 10:42 PM, <a.sm...@ukgrid.net> wrote:
> Quoting Norman Khine <nor...@khine.net>:
>>> What is shown from "show master status" and "show slave status" after
>>> you have made a change on the master DB?
>>
>> this is the output: http://pastie.org/1100610
>> it does not seem to have any changes and "show slave status" is just
>> empty. have i missed to add something to the master's /etc/mysql/my.cnf
>> options?
>
> So you can see the binlog position on the master has not changed, hence
> you won't get any changes replicated to the slave. The "show slave status"
> has to be run on the slave, not the master.
>
> Anyway, I believe your problem is your binlog-do-db section on the master,
> and also the ignore sections. I think these need to be broken into
> separate lines, with only one value per line, ie:
>
> binlog-do-db = upgrade
> binlog-do-db = tracker

this was the issue

> You'll need to completely re-set up the synchronisation after this, as you
> currently have out-of-sync DBs and no data stored in your binlogs...

--
˙uʍop ǝpısdn p,uɹnʇ pןɹoʍ ǝɥʇ ǝǝs noʎ 'ʇuǝɯɐן sǝɯıʇ ǝɥʇ puɐ 'ʇuǝʇuoɔ ǝq s,ʇǝן ʇǝʎ
% .join( [ {'*':'@','^':'.'}.get(c,None) or chr(97+(ord(c)-83)%26) for c in ,adym,*)uzq^zqf ] )
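a.smith's fix — one binlog-do-db value per line — as a my.cnf fragment. The server-id and log-bin lines are assumptions added for context; the database names are the ones from the thread:

```ini
[mysqld]
server-id = 1
log-bin   = mysql-bin
# One database per binlog-do-db line; a comma-separated list on a single
# line is treated as one (nonexistent) database name, so nothing is logged.
binlog-do-db = upgrade
binlog-do-db = tracker
```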
Re: help with replication
Quoting Norman Khine <nor...@khine.net>:
> hello, i have an issue with the replication
> here is my procedure: http://pastie.org/1100368
> in the log it shows replication works, but when i update a record this is
> not updated on the server.

What is shown from "show master status" and "show slave status" after you have made a change on the master DB?