subject:"Query with group by"

RE: query results group/summed by interval

2010-08-01 Thread nuno . tavares

Hi all,

Aveeks solution should work if you have at least one call for each intervall.
It's the classical GROUP BY solution that only works on the available dataset.
Although it should work pretty well in the cited scenario, you will miss
intervals (from a all intervals report point of view) if indeed there are
intervals (of more than 5 minutes, in this example) when there were no calls at
all.

I had a somewhat similar problem (running the second scenario, though) and this
is the solution I setup (this was a Data Warehouse and that's why you'll read
about partition pruning, dataset was dozens of Gigs):

http://gpshumano.blogs.dri.pt/2009/09/28/finding-for-each-time-interval-how-many-records-are-ocurring-during-that-interval/

This might become handy if Ghulam understands the differences between my
scenario and his.

Hope that helps,
-NT




Quoting Martin Gainty mgai...@hotmail.com:

 
 no that would give you the count for each second interval instead of using
 the interval variable 5 
 
 Aveeks floor:
 FLOOR(X) Returns the largest integer value not greater than X. 
  
 1st (seconds/5) interval  example
 5/5=1
 floor(5/5) = 1
 supplied value would truncate and give you the int not greater than X
 then multiply by 5
 1*5=5
 is correct
  
 Aveeks sum function:
 SUM([DISTINCT] expr) 
 Returns the sum of expr. If the return set has no rows, SUM() returns NULL.
 The DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct values
 of expr. 
 SUM() returns NULL if there were no matching rows. 
 sum(calls) from calls group by 5 * floor(seconds/5)
 sum(calls) from calls group by 5 * floor(5/5)
 sum(calls) from class group by 5 * 1
 sum(calls) from class group by 5
 is correct 
  
 2nd(seconds/5) interval example
 10/5=2
 floor(10/5)=2
 supplied value would truncate and give you the int not greater than X
 then multiply by 5
 2*5=10
 is correct
  
 Aveeks sum function
 sum(calls) from calls group by 5 * floor(seconds/5)
 sum(calls) from calls group by 5 * floor(10/5)
 sum(calls) from class group by 5 * 2
 sum(calls) from class group by 10
 would be applicable only if the interval was 10
  
 Aveek if your interval is 5 change:
 sum(calls) from calls group by 5 * floor(seconds/5)
 to
 sum(calls) from calls group by floor(seconds/5)
  
 Martin Gainty 
 __ 
 Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité
 
 Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger
 sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung
 oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich
 dem Austausch von Informationen und entfaltet keine rechtliche
 Bindungswirkung. Aufgrund der leichten Manipulierbarkeit von E-Mails koennen
 wir keine Haftung fuer den Inhalt uebernehmen.
 Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le
 destinataire prévu, nous te demandons avec bonté que pour satisfaire informez
 l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci
 est interdite. Ce message sert à l'information seulement et n'aura pas
 n'importe quel effet légalement obligatoire. Étant donné que les email
 peuvent facilement être sujets à la manipulation, nous ne pouvons accepter
 aucune responsabilité pour le contenu fourni.
 
 
 
   
 
  Date: Sat, 31 Jul 2010 10:31:43 +0700
  From: cuong.m...@vienthongso.com
  To: ave...@yahoo-inc.com
  CC: mustafa...@gmail.com; mysql@lists.mysql.com
  Subject: Re: query results group/summed by interval
  
  Hi Aveek,
  
  I think Ghulam just want to count calls for each intervals
  so the query should looks like this:
  
  select count(*) as total_calls, queue_seconds
  from calls group by queue_seconds order by total_calls;
  
  
  - Original Message -
  From: Aveek Misra ave...@yahoo-inc.com
  To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
  Sent: Tuesday, July 27, 2010 5:54:13 PM
  Subject: RE: query results group/summed by interval
  
  try this ...
  
  select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end,
 sum(calls) from calls group by 5 * floor(seconds/5);
  
  This should give you an output of the type
  
  +---+--++
  | start | end | sum(calls) |
  +---+--++
  | 0 | 5 | 387 |
  | 5 | 10 | 225 |
  | 10 | 15 | 74 |
  +---+--++
  
  
  Thanks
  Aveek
  
  From: Ghulam Mustafa [mustafa...@gmail.com]
  Sent: Tuesday, July 27, 2010 3:53 PM
  To: mysql@lists.mysql.com
  Subject: query results group/summed by interval
  
  Hi everyone,
  
  i have two columns (seconds, number of calls), i need to produce a
  report which will show total number of calls in intervals (let'say 10
  seconds interval), i know i can do this programmability in my script but
  i was wondering if it's possible to accomplish this behavior within
  mysql. for example i have following data

RE: query results group/summed by interval

2010-08-01 Thread Travis Ard


You could also pre-define your intervals in a subquery using UNION and join 
that to your original table like so:

select ifnull(sum(calls), 0) as calls, n as queue_seconds
from 
(select 0 as n union 
select 5 union 
select 10 union 
select 15) as step
left join calls on calls.queue_seconds  (step.n - 5) and calls.queue_seconds 
= step.n
group by n;

+---+---+
| calls | queue_seconds |
+---+---+
|   250 | 0 |
|   168 | 5 |
|   268 |    10 |
| 0 |    15 |
+---+---+
4 rows in set (0.00 sec)

-Travis


 Date: Sun, 1 Aug 2010 13:16:36 +0100
 From: nuno.tava...@dri.pt
 To: mgai...@hotmail.com
 CC: cuong.m...@vienthongso.com; ave...@yahoo-inc.com; mustafa...@gmail.com; 
 mysql@lists.mysql.com
 Subject: RE: query results group/summed by interval

 Hi all,

 Aveeks solution should work if you have at least one call for each intervall.
 It's the classical GROUP BY solution that only works on the available dataset.
 Although it should work pretty well in the cited scenario, you will miss
 intervals (from a all intervals report point of view) if indeed there are
 intervals (of more than 5 minutes, in this example) when there were no calls 
 at
 all.

 I had a somewhat similar problem (running the second scenario, though) and 
 this
 is the solution I setup (this was a Data Warehouse and that's why you'll read
 about partition pruning, dataset was dozens of Gigs):

 http://gpshumano.blogs.dri.pt/2009/09/28/finding-for-each-time-interval-how-many-records-are-ocurring-during-that-interval/

 This might become handy if Ghulam understands the differences between my
 scenario and his.

 Hope that helps,
 -NT




 Quoting Martin Gainty :

 
  no that would give you the count for each second interval instead of using
  the interval variable 5
 
  Aveeks floor:
  FLOOR(X) Returns the largest integer value not greater than X.
 
  1st (seconds/5) interval example
  5/5=1
  floor(5/5) = 1
  supplied value would truncate and give you the int not greater than X
  then multiply by 5
  1*5=5
  is correct
 
  Aveeks sum function:
  SUM([DISTINCT] expr)
  Returns the sum of expr. If the return set has no rows, SUM() returns NULL.
  The DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct 
  values
  of expr.
  SUM() returns NULL if there were no matching rows.
  sum(calls) from calls group by 5 * floor(seconds/5)
  sum(calls) from calls group by 5 * floor(5/5)
  sum(calls) from class group by 5 * 1
  sum(calls) from class group by 5
  is correct
 
  2nd(seconds/5) interval example
  10/5=2
  floor(10/5)=2
  supplied value would truncate and give you the int not greater than X
  then multiply by 5
  2*5=10
  is correct
 
  Aveeks sum function
  sum(calls) from calls group by 5 * floor(seconds/5)
  sum(calls) from calls group by 5 * floor(10/5)
  sum(calls) from class group by 5 * 2
  sum(calls) from class group by 10
  would be applicable only if the interval was 10
 
  Aveek if your interval is 5 change:
  sum(calls) from calls group by 5 * floor(seconds/5)
  to
  sum(calls) from calls group by floor(seconds/5)
 
  Martin Gainty
  __
  Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité
 
  Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene 
  Empfaenger
  sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte 
  Weiterleitung
  oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich
  dem Austausch von Informationen und entfaltet keine rechtliche
  Bindungswirkung. Aufgrund der leichten Manipulierbarkeit von E-Mails koennen
  wir keine Haftung fuer den Inhalt uebernehmen.
  Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le
  destinataire prévu, nous te demandons avec bonté que pour satisfaire 
  informez
  l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci
  est interdite. Ce message sert à l'information seulement et n'aura pas
  n'importe quel effet légalement obligatoire. Étant donné que les email
  peuvent facilement être sujets à la manipulation, nous ne pouvons accepter
  aucune responsabilité pour le contenu fourni.
 
 
 
 
 
   Date: Sat, 31 Jul 2010 10:31:43 +0700
   From: cuong.m...@vienthongso.com
   To: ave...@yahoo-inc.com
   CC: mustafa...@gmail.com; mysql@lists.mysql.com
   Subject: Re: query results group/summed by interval
  
   Hi Aveek,
  
   I think Ghulam just want to count calls for each intervals
   so the query should looks like this:
  
   select count(*) as total_calls, queue_seconds
   from calls group by queue_seconds order by total_calls;
  
  
   - Original Message -
   From: Aveek Misra 
   To: Ghulam Mustafa , mysql@lists.mysql.com
   Sent: Tuesday, July 27, 2010 5:54:13 PM
   Subject: RE: query results group/summed by interval
  
   try this ...
  
   select 5 * floor(seconds/5) as start, 5

RE: query results group/summed by interval

2010-07-31 Thread Martin Gainty


no that would give you the count for each second interval instead of using the 
interval variable 5 

Aveeks floor:
FLOOR(X) Returns the largest integer value not greater than X. 
 
1st (seconds/5) interval  example
5/5=1
floor(5/5) = 1
supplied value would truncate and give you the int not greater than X
then multiply by 5
1*5=5
is correct
 
Aveeks sum function:
SUM([DISTINCT] expr) 
Returns the sum of expr. If the return set has no rows, SUM() returns NULL. The 
DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct values of 
expr. 
SUM() returns NULL if there were no matching rows. 
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(5/5)
sum(calls) from class group by 5 * 1
sum(calls) from class group by 5
is correct 
 
2nd(seconds/5) interval example
10/5=2
floor(10/5)=2
supplied value would truncate and give you the int not greater than X
then multiply by 5
2*5=10
is correct
 
Aveeks sum function
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(10/5)
sum(calls) from class group by 5 * 2
sum(calls) from class group by 10
would be applicable only if the interval was 10
 
Aveek if your interval is 5 change:
sum(calls) from calls group by 5 * floor(seconds/5)
to
sum(calls) from calls group by floor(seconds/5)
 
Martin Gainty 
__ 
Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité

Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger 
sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung 
oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich dem 
Austausch von Informationen und entfaltet keine rechtliche Bindungswirkung. 
Aufgrund der leichten Manipulierbarkeit von E-Mails koennen wir keine Haftung 
fuer den Inhalt uebernehmen.
Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le 
destinataire prévu, nous te demandons avec bonté que pour satisfaire informez 
l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci est 
interdite. Ce message sert à l'information seulement et n'aura pas n'importe 
quel effet légalement obligatoire. Étant donné que les email peuvent facilement 
être sujets à la manipulation, nous ne pouvons accepter aucune responsabilité 
pour le contenu fourni.



 

 Date: Sat, 31 Jul 2010 10:31:43 +0700
 From: cuong.m...@vienthongso.com
 To: ave...@yahoo-inc.com
 CC: mustafa...@gmail.com; mysql@lists.mysql.com
 Subject: Re: query results group/summed by interval
 
 Hi Aveek,
 
 I think Ghulam just want to count calls for each intervals
 so the query should looks like this:
 
 select count(*) as total_calls, queue_seconds
 from calls group by queue_seconds order by total_calls;
 
 
 - Original Message -
 From: Aveek Misra ave...@yahoo-inc.com
 To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
 Sent: Tuesday, July 27, 2010 5:54:13 PM
 Subject: RE: query results group/summed by interval
 
 try this ...
 
 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
 sum(calls) from calls group by 5 * floor(seconds/5);
 
 This should give you an output of the type
 
 +---+--++
 | start | end | sum(calls) |
 +---+--++
 | 0 | 5 | 387 |
 | 5 | 10 | 225 |
 | 10 | 15 | 74 |
 +---+--++
 
 
 Thanks
 Aveek
 
 From: Ghulam Mustafa [mustafa...@gmail.com]
 Sent: Tuesday, July 27, 2010 3:53 PM
 To: mysql@lists.mysql.com
 Subject: query results group/summed by interval
 
 Hi everyone,
 
 i have two columns (seconds, number of calls), i need to produce a
 report which will show total number of calls in intervals (let'say 10
 seconds interval), i know i can do this programmability in my script but
 i was wondering if it's possible to accomplish this behavior within
 mysql. for example i have following data.
 
 +--+---+
 | calls | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 28 | 1.00 |
 | 30 | 2.00 |
 | 56 | 3.00 |
 | 23 | 4.00 |
 | 31 | 5.00 |
 | 33 | 6.00 |
 | 50 | 7.00 |
 | 49 | 8.00 |
 | 62 | 9.00 |
 | 74 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 now result should look like this with a 5 seconds interval.
 
 +--+---+
 | count(*) | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 168 | 5.00 |
 | 268 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 i would really appreciate your help.
 
 Best Regards.
 
 --
 Ghulam Mustafa
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com
 
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=cuong.m...@vienthongso.com
 
 
 -- 
 Best Regards,
 Cuongmc

RE: query results group/summed by interval

2010-07-31 Thread Martin Gainty


no that would give you the count for each second interval instead of using the 
interval variable 5 

Aveeks floor:
FLOOR(X) Returns the largest integer value not greater than X. 
 
1st (seconds/5) interval  example
5/5=1
floor(5/5) = 1
supplied value would truncate and give you the int not greater than X
then multiply by 5
1*5=5
is correct
 
Aveeks sum function:
SUM([DISTINCT] expr) 
Returns the sum of expr. If the return set has no rows, SUM() returns NULL. The 
DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct values of 
expr. 
SUM() returns NULL if there were no matching rows. 
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(5/5)
sum(calls) from class group by 5 * 1
sum(calls) from class group by 5
is correct 
 
2nd(seconds/5) interval example
10/5=2
floor(10/5)=2
supplied value would truncate and give you the int not greater than X
then multiply by 5
2*5=10
is correct
 
Aveeks sum function
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(10/5)
sum(calls) from class group by 5 * 2
sum(calls) from class group by 10
would be applicable only if the interval was 10
 
Aveek if your interval is 5 change:
sum(calls) from calls group by 5 * floor(seconds/5)
to
sum(calls) from calls group by floor(seconds/5)
 
Martin Gainty 
__ 
Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité

Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger 
sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung 
oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich dem 
Austausch von Informationen und entfaltet keine rechtliche Bindungswirkung. 
Aufgrund der leichten Manipulierbarkeit von E-Mails koennen wir keine Haftung 
fuer den Inhalt uebernehmen.
Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le 
destinataire prévu, nous te demandons avec bonté que pour satisfaire informez 
l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci est 
interdite. Ce message sert à l'information seulement et n'aura pas n'importe 
quel effet légalement obligatoire. Étant donné que les email peuvent facilement 
être sujets à la manipulation, nous ne pouvons accepter aucune responsabilité 
pour le contenu fourni.



  

 Date: Sat, 31 Jul 2010 10:31:43 +0700
 From: cuong.m...@vienthongso.com
 To: ave...@yahoo-inc.com
 CC: mustafa...@gmail.com; mysql@lists.mysql.com
 Subject: Re: query results group/summed by interval
 
 Hi Aveek,
 
 I think Ghulam just want to count calls for each intervals
 so the query should looks like this:
 
 select count(*) as total_calls, queue_seconds
 from calls group by queue_seconds order by total_calls;
 
 
 - Original Message -
 From: Aveek Misra ave...@yahoo-inc.com
 To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
 Sent: Tuesday, July 27, 2010 5:54:13 PM
 Subject: RE: query results group/summed by interval
 
 try this ...
 
 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
 sum(calls) from calls group by 5 * floor(seconds/5);
 
 This should give you an output of the type
 
 +---+--++
 | start | end | sum(calls) |
 +---+--++
 | 0 | 5 | 387 |
 | 5 | 10 | 225 |
 | 10 | 15 | 74 |
 +---+--++
 
 
 Thanks
 Aveek
 
 From: Ghulam Mustafa [mustafa...@gmail.com]
 Sent: Tuesday, July 27, 2010 3:53 PM
 To: mysql@lists.mysql.com
 Subject: query results group/summed by interval
 
 Hi everyone,
 
 i have two columns (seconds, number of calls), i need to produce a
 report which will show total number of calls in intervals (let'say 10
 seconds interval), i know i can do this programmability in my script but
 i was wondering if it's possible to accomplish this behavior within
 mysql. for example i have following data.
 
 +--+---+
 | calls | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 28 | 1.00 |
 | 30 | 2.00 |
 | 56 | 3.00 |
 | 23 | 4.00 |
 | 31 | 5.00 |
 | 33 | 6.00 |
 | 50 | 7.00 |
 | 49 | 8.00 |
 | 62 | 9.00 |
 | 74 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 now result should look like this with a 5 seconds interval.
 
 +--+---+
 | count(*) | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 168 | 5.00 |
 | 268 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 i would really appreciate your help.
 
 Best Regards.
 
 --
 Ghulam Mustafa
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com
 
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=cuong.m...@vienthongso.com
 
 
 -- 
 Best Regards,
 Cuongmc

Re: query results group/summed by interval

2010-07-30 Thread Nguyen Manh Cuong

Hi Aveek,

I think Ghulam just want to count calls for each intervals
so the query should looks like this:

select count(*) as total_calls, queue_seconds
from calls group by queue_seconds order by total_calls;


- Original Message -
From: Aveek Misra ave...@yahoo-inc.com
To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
Sent: Tuesday, July 27, 2010 5:54:13 PM
Subject: RE: query results group/summed by interval

try this ...

 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
sum(calls) from calls group by 5 * floor(seconds/5);

This should give you an output of the type

+---+--++
| start | end  | sum(calls) |
+---+--++
| 0 |5 |387 |
| 5 |   10 |225 |
|10 |   15 | 74 |
+---+--++


Thanks
Aveek

From: Ghulam Mustafa [mustafa...@gmail.com]
Sent: Tuesday, July 27, 2010 3:53 PM
To: mysql@lists.mysql.com
Subject: query results group/summed by interval

Hi everyone,

i have two columns (seconds, number of calls), i need to produce a
report which will show total number of calls in intervals (let'say 10
seconds interval), i know i can do this programmability in my script but
i was wondering if it's possible to accomplish this behavior within
mysql. for example i have following data.

+--+---+
|  calls   | queue_seconds |
+--+---+
|  250 |  0.00 |
|   28 |  1.00 |
|   30 |  2.00 |
|   56 |  3.00 |
|   23 |  4.00 |
|   31 |  5.00 |
|   33 |  6.00 |
|   50 |  7.00 |
|   49 |  8.00 |
|   62 |  9.00 |
|   74 | 10.00 |
...
... and so on...
...
+--+---+

now result should look like this with a 5 seconds interval.

+--+---+
| count(*) | queue_seconds |
+--+---+
|  250 |  0.00 |
|  168 |  5.00 |
|  268 | 10.00 |
...
... and so on...
...
+--+---+

i would really appreciate your help.

Best Regards.

--
Ghulam Mustafa

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=cuong.m...@vienthongso.com


-- 
Best Regards,
Cuongmc.

-- 
Nguyen Manh Cuong
Phong Ky Thuat - Cong ty Vien Thong So - VTC
Dien thoai: 0912051542
Gmail : philipscu...@gmail.com
YahooMail : philipscu...@yahoo.com

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org

RE: concatenate sql query with group by and having

2010-07-29 Thread Travis Ard


With some databases such as MySQL, subqueries have to be explicitly named.  For 
example

select * from (select * from (select * from table) sub1) sub2;

If not, you will see an error like: ERROR 1248 (42000): Every derived table 
must have its own alias

If
 I understand your problem correctly, you are looking to limit your 
result set to only those records that have symbols with a single unique 
combination of chrom, and strand.  If that's correct, something like the
 query below might work:

select geneName as symbol, name as refSeq, chrom, strand, txStart 
from refFlat 
where geneName in 
-- returns all geneNames (symbols) with one unique combination of chrom and 
strand
(select geneName from
-- returns all unique combinations of symbol, chrom, and strand
(select distinct geneName, chrom, strand
from refFlat) sub1
group by geneName
having count(*) = 1) 
group by refSeq
having count(*) = 1;


 Date: Wed, 28 Jul 2010 11:10:32 -0500
 Subject: concatenate sql query with group by and having
 From: pengyu...@gmail.com
 To: mysql@lists.mysql.com

 mysql -ugenome -hgenome-mysql.cse.ucsc.edu mm9 -A

 I start mysql with the above command. Then I want to select the rows
 from the result of the following query, provided that for any rows
 that have the same symbol, chrom and strand should be the same
 (basically, discard the rows that have the same symbols but different
 chrom and strand). Could anybody show me how to do it?

 select geneName as symbol, name as refSeq, chrom, strand, txStart from
 refFlat group by refSeq having count(*)=1;


 I think that something like

 SELECT name FROM (SELECT name, type_id FROM (SELECT * FROM foods));

 works for sqlite3 (in terms of syntax). But the following do not work
 for mysql. Is this a difference between mysql and sqlite3? (I'm always
 confused by the difference between different variants of SQL)

 select * from (select geneName as symbol, name as refSeq, chrom,
 strand, txStart from refFlat group by refSeq having count(*)=1);

 --
 Regards,
 Peng

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=travis_...@hotmail.com

  
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org

concatenate sql query with group by and having

2010-07-28 Thread Peng Yu

mysql -ugenome -hgenome-mysql.cse.ucsc.edu mm9 -A

I start mysql with the above command. Then I want to select the rows
from the result of the following query, provided that for any rows
that have the same symbol, chrom and strand should be the same
(basically, discard the rows that have the same symbols but different
chrom and strand). Could anybody show me how to do it?

select geneName as symbol, name as refSeq, chrom, strand, txStart from
refFlat group by refSeq having count(*)=1;


I think that something like

SELECT name FROM (SELECT name, type_id FROM (SELECT * FROM foods));

works for sqlite3 (in terms of syntax). But the following do not work
for mysql. Is this a difference between mysql and sqlite3? (I'm always
confused by the difference between different variants of SQL)

select * from (select geneName as symbol, name as refSeq, chrom,
strand, txStart from refFlat group by refSeq having count(*)=1);

-- 
Regards,
Peng

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org

query results group/summed by interval

2010-07-27 Thread Ghulam Mustafa

Hi everyone,

i have two columns (seconds, number of calls), i need to produce a
report which will show total number of calls in intervals (let'say 10
seconds interval), i know i can do this programmability in my script but
i was wondering if it's possible to accomplish this behavior within
mysql. for example i have following data.

+--+---+
|  calls   | queue_seconds |
+--+---+
|  250 |  0.00 |
|   28 |  1.00 |
|   30 |  2.00 |
|   56 |  3.00 |
|   23 |  4.00 |
|   31 |  5.00 |
|   33 |  6.00 |
|   50 |  7.00 |
|   49 |  8.00 |
|   62 |  9.00 |
|   74 | 10.00 |
...
... and so on...
...
+--+---+

now result should look like this with a 5 seconds interval.

+--+---+
| count(*) | queue_seconds |
+--+---+
|  250 |  0.00 |
|  168 |  5.00 |
|  268 | 10.00 |
...
... and so on...
...
+--+---+

i would really appreciate your help.

Best Regards.

-- 
Ghulam Mustafa

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org

RE: query results group/summed by interval

2010-07-27 Thread Aveek Misra

try this ...

 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
sum(calls) from calls group by 5 * floor(seconds/5);

This should give you an output of the type

+---+--++
| start | end  | sum(calls) |
+---+--++
| 0 |5 |387 |
| 5 |   10 |225 |
|10 |   15 | 74 |
+---+--++


Thanks
Aveek

From: Ghulam Mustafa [mustafa...@gmail.com]
Sent: Tuesday, July 27, 2010 3:53 PM
To: mysql@lists.mysql.com
Subject: query results group/summed by interval

Hi everyone,

i have two columns (seconds, number of calls), i need to produce a
report which will show total number of calls in intervals (let'say 10
seconds interval), i know i can do this programmability in my script but
i was wondering if it's possible to accomplish this behavior within
mysql. for example i have following data.

+--+---+
|  calls   | queue_seconds |
+--+---+
|  250 |  0.00 |
|   28 |  1.00 |
|   30 |  2.00 |
|   56 |  3.00 |
|   23 |  4.00 |
|   31 |  5.00 |
|   33 |  6.00 |
|   50 |  7.00 |
|   49 |  8.00 |
|   62 |  9.00 |
|   74 | 10.00 |
...
... and so on...
...
+--+---+

now result should look like this with a 5 seconds interval.

+--+---+
| count(*) | queue_seconds |
+--+---+
|  250 |  0.00 |
|  168 |  5.00 |
|  268 | 10.00 |
...
... and so on...
...
+--+---+

i would really appreciate your help.

Best Regards.

--
Ghulam Mustafa

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org

RE: Query with group by

2004-09-21 Thread Jose Miguel Pérez

Hi Michael! 

Talking about the query with group by issue...

 I'll explain my reasoning below.
[...]
  From the manual, section 7.2.8 How MySQL Optimizes LEFT 
 JOIN and RIGHT 
 JOIN http://dev.mysql.com/doc/mysql/en/LEFT_JOIN_optimization.html:
 
A LEFT JOIN B join_condition is implemented in MySQL as follows:
...
* The LEFT JOIN condition is used to decide how to 
 retrieve rows from table B. (In other words, any condition in the
 WHERE clause is not used.)
...
 
 So, the WHERE c2.id IS NULL cannot be applied until after 
 the rows which match the ON clause (and the NULL rows) have been
 fetched.

Thanks for your explanations here, it's clear now I was confused. I
missed this LEFT JOIN behaviour. I was in the idea the WHERE xx IS NULL was
taken into account.

[...]
 You are certainly right that temporary and filesort are 
 to be avoided. And they will be, if the table is properly indexed. Single 
 column indexing won't help much here, because the WHERE condition,
 the GROUP BY column, and the MAX column are all different.  A multi-column
 index on (content, location, date), however, will allow mysql to use the
 index to find the matching rows, find the groups, and calculate the MAX
date.

I still refuse to use the temporary table solution, call me fussy
here. Indeed, I think giving more indexes than necessary is a bit redundant,
unless completely necessary. (You will end up having more space on indexes
than data itself). It's a handle with care issue for me.

 Anyway, I don't know if one can program an agregate UDF 
  called something like EXTERNAL_MAX(...) or something, so that we
  could do like:
 
 SELECT EXTERNAL_MAX(date, version)  --- i.e: 
  Returns the version value for the row with MAX(date).

 This, for sure, will be the best solution. ;-)
 
 That would have to do the same thing behind the scenes.

I have to beg you pardon here. ;-) Think again this solution doesn't
require a JOIN, nor a temporary table. I think the UDF solution doesn't have
to be less efficient than a MAX or AVG aggregate function by itself. I'm in
the process of creating such a monster :-) Let me know if you are curious
about and have time to test it, I will test MAX() and EXTERNAL_MAX() against
a very large table.

Cheers,
Jose Miguel.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-17 Thread Rhino

Here is the output I got when I ran my query - and yours, Michael - against
DB2 V7.2. Please note that I replaced the 'temp' table in Michael's query
with the real table in both the outer query and the subquery; no other
changes were made.

--
create table versions
(id smallint not null,
date date not null,
content char(5) not null,
location char(10) not null,
version smallint not null,
primary key(id))
DB2I  The SQL command completed successfully.

insert into versions values
(1, '2004-09-14', 'ALPHA', 'PARIS', 10),
(2, '2004-09-15', 'ALPHA', 'PARIS', 11),
(3, '2004-09-16', 'ALPHA', 'PARIS', 10),
(4, '2004-09-14', 'ALPHA', 'NEW-YORK', 11),
(5, '2004-09-15', 'ALPHA', 'NEW-YORK', 11),
(6, '2004-09-16', 'ALPHA', 'NEW-YORK', 10),
(7, '2004-09-14', 'ALPHA', 'TOKYO', 10),
(8, '2004-09-15', 'ALPHA', 'TOKYO', 11),
(9, '2004-09-16', 'BETA', 'TOKYO', 10)
DB2I  The SQL command completed successfully.

select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location)

CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   NEW-YORK10 09/16/2004
ALPHA   PARIS   10 09/16/2004
ALPHA   TOKYO   11 09/15/2004

  3 record(s) selected.


SELECT content, location, version, date
FROM versions t1
WHERE date=(SELECT MAX(t2.date)
FROM versions t2
WHERE t1.location = t2.location AND t1.content = t2.content)
AND content = 'ALPHA'

CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   PARIS   10 09/16/2004
ALPHA   NEW-YORK10 09/16/2004
ALPHA   TOKYO   11 09/15/2004

  3 record(s) selected.
--

As you can see, both queries worked and produced the same result in DB2,
aside from the row sequence, which is easily fixable via an Order By.

I'm at a loss to explain why my query didn't work in MySQL V4.1.4.

Rhino


- Original Message - 
From: Michael Stassen [EMAIL PROTECTED]
To: Rhino [EMAIL PROTECTED]
Cc: Jose Miguel Pérez [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Friday, September 17, 2004 12:42 AM
Subject: Re: Query with group by



 Rhino wrote:
  I agree that Michael's solution with the temporary tables is the best I
have
  seen so far.

 I can't take much credit.  It's just an adaptation of the solution in the
 manual.

  I am running MySQL 4.0.15 so I don't have any subquery capability. As a
  result, I went to DB2 to come up with one solution that does give the
right
  answer via a subquery:
 
  select content, location, version, date
  from versions
  where concat(location, char(date)) in
  (select concat(location, char(max(date)))
  from versions
  where content = 'ALPHA'
  group by location);

 Really?  That didn't work for me in mysql 4.1.4a-gamma.  I got

 +-+--+-++
 | content | location | version | date   |
 +-+--+-++
 | ALPHA   | PARIS|  10 | 2004-09-14 |
 | ALPHA   | PARIS|  11 | 2004-09-15 |
 | ALPHA   | PARIS|  10 | 2004-09-16 |
 | ALPHA   | NEW-YORK |  11 | 2004-09-14 |
 | ALPHA   | NEW-YORK |  11 | 2004-09-15 |
 | ALPHA   | NEW-YORK |  10 | 2004-09-16 |
 | ALPHA   | TOKYO|  10 | 2004-09-14 |
 | ALPHA   | TOKYO|  11 | 2004-09-15 |
 | BETA| TOKYO|  10 | 2004-09-16 |
 +-+--+-++
 9 rows in set (0.31 sec)


  I don't know if this will work in MySQL 4.1.x though. You may want to
give
  it a try if you have 4.1.x. By the way, I'm not convinced that this is
the
  *best* solution using a subquery; it's just the first one I could think
of.
  I don't have all day to spend on this ;-)

 The following query works for me.

SELECT content, location, version, date
FROM temp t1
WHERE date=(SELECT MAX(t2.date)
FROM temp t2
WHERE t1.location = t2.location

AND t1.content = t2.content)
AND content = 'ALPHA';


 +-+--+-++
 | content | location | version | date   |
 +-+--+-++
 | ALPHA   | PARIS|  10 | 2004-09-16 |
 | ALPHA   | NEW-YORK |  10 | 2004-09-16 |
 | ALPHA   | TOKYO|  11 | 2004-09-15 |
 +-+--+-++
 3 rows in set (0.01 sec)

 Again, that's an adaptation of the subquery version of the solution in the
 manual.

 Michael



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-17 Thread Michael Stassen

I took a closer look and found the problem -- char(date).  Is char() 
necessary in DB2 to cast the date as a string?  In MySQL, char() expects a 
list of integers to interpret as a list of character codes 
http://dev.mysql.com/doc/mysql/en/String_functions.html.  Removing char() 
makes the query appear to work.

I say appear because you forgot the same thing I did.  We need to guard 
against the possibility that a row with different content will match the 
location and date.  So, your query becomes

  SELECT content, location, version, date
  FROM temp
  WHERE CONCAT(location, date) IN
  (SELECT CONCAT(location, MAX(date))
  FROM temp
  WHERE content = 'ALPHA'
  GROUP BY location)
  AND content = 'ALPHA';
which works for me in 4.1.4a.
That said, it seems to me that this is a version of the MAX-CONCAT trick 
written as a subquery.  With the columns to be compared inside CONCAT() 
functions, there is no way to use an index to match up the rows.  My 
expectation is that this will be relatively inefficient compared to the 
other subquery solution.

Michael
Rhino wrote:
Here is the output I got when I ran my query - and yours, Michael - against
DB2 V7.2. Please note that I replaced the 'temp' table in Michael's query
with the real table in both the outer query and the subquery; no other
changes were made.
--
create table versions
(id smallint not null,
date date not null,
content char(5) not null,
location char(10) not null,
version smallint not null,
primary key(id))
DB2I  The SQL command completed successfully.
insert into versions values
(1, '2004-09-14', 'ALPHA', 'PARIS', 10),
(2, '2004-09-15', 'ALPHA', 'PARIS', 11),
(3, '2004-09-16', 'ALPHA', 'PARIS', 10),
(4, '2004-09-14', 'ALPHA', 'NEW-YORK', 11),
(5, '2004-09-15', 'ALPHA', 'NEW-YORK', 11),
(6, '2004-09-16', 'ALPHA', 'NEW-YORK', 10),
(7, '2004-09-14', 'ALPHA', 'TOKYO', 10),
(8, '2004-09-15', 'ALPHA', 'TOKYO', 11),
(9, '2004-09-16', 'BETA', 'TOKYO', 10)
DB2I  The SQL command completed successfully.
select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location)
CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   NEW-YORK10 09/16/2004
ALPHA   PARIS   10 09/16/2004
ALPHA   TOKYO   11 09/15/2004
  3 record(s) selected.
SELECT content, location, version, date
FROM versions t1
WHERE date=(SELECT MAX(t2.date)
FROM versions t2
WHERE t1.location = t2.location AND t1.content = t2.content)
AND content = 'ALPHA'
CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   PARIS   10 09/16/2004
ALPHA   NEW-YORK10 09/16/2004
ALPHA   TOKYO   11 09/15/2004
  3 record(s) selected.
--
As you can see, both queries worked and produced the same result in DB2,
aside from the row sequence, which is easily fixable via an Order By.
I'm at a loss to explain why my query didn't work in MySQL V4.1.4.
Rhino

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-17 Thread Rhino

From the SQL Reference for DB2:

---
-CHAR--(--datetime-expression+---+--)
   '-,--+-ISO---+--'
+-USA---+
+-EUR---+
+-JIS---+
'-LOCAL-'


Datetime to Character

  datetime-expression
  An expression that is one of the following three data types

date
The result is the character string representation of the date in the
format specified by the second argument. The length of the result is 10. An
error occurs if the second argument is specified and is not a valid value
(SQLSTATE 42703).

time
The result is the character string representation of the time in the
format specified by the second argument. The length of the result is 8. An
error occurs if the second argument is specified and is not a valid value
(SQLSTATE 42703).

timestamp
The second argument is not applicable and must not be specified.
SQLSTATE 42815 The result is the character string representation of the
timestamp. The length of the result is 26.
  The code page of the string is the code page of the database at the
application server.

---

So, yes, char() is converting the date to a string as you suspected.

I'm not suprised that the query wouldn't perform terribly well; that's what
I would have expected given the construction of the query. Like I said, it
was just the first solution I came up with. I would likely have come up with
something more like your solution if I had had more time to try other
solutions :-)

In any case, we have now thought of several solutions to the problem. I hope
Vincent can use at least one of them ;-)

Rhino

- Original Message - 
From: Michael Stassen [EMAIL PROTECTED]
To: Rhino [EMAIL PROTECTED]
Cc: Jose Miguel Pérez [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Friday, September 17, 2004 8:13 PM
Subject: Re: Query with group by


 I took a closer look and found the problem -- char(date).  Is char()
 necessary in DB2 to cast the date as a string?  In MySQL, char() expects a
 list of integers to interpret as a list of character codes
 http://dev.mysql.com/doc/mysql/en/String_functions.html.  Removing
char()
 makes the query appear to work.

 I say appear because you forgot the same thing I did.  We need to guard
 against the possibility that a row with different content will match the
 location and date.  So, your query becomes

SELECT content, location, version, date
FROM temp
WHERE CONCAT(location, date) IN
(SELECT CONCAT(location, MAX(date))
FROM temp
WHERE content = 'ALPHA'
GROUP BY location)
AND content = 'ALPHA';

 which works for me in 4.1.4a.

 That said, it seems to me that this is a version of the MAX-CONCAT trick
 written as a subquery.  With the columns to be compared inside CONCAT()
 functions, there is no way to use an index to match up the rows.  My
 expectation is that this will be relatively inefficient compared to the
 other subquery solution.

 Michael

 Rhino wrote:

  Here is the output I got when I ran my query - and yours, Michael -
against
  DB2 V7.2. Please note that I replaced the 'temp' table in Michael's
query
  with the real table in both the outer query and the subquery; no other
  changes were made.
 
  --
  create table versions
  (id smallint not null,
  date date not null,
  content char(5) not null,
  location char(10) not null,
  version smallint not null,
  primary key(id))
  DB2I  The SQL command completed successfully.
 
  insert into versions values
  (1, '2004-09-14', 'ALPHA', 'PARIS', 10),
  (2, '2004-09-15', 'ALPHA', 'PARIS', 11),
  (3, '2004-09-16', 'ALPHA', 'PARIS', 10),
  (4, '2004-09-14', 'ALPHA', 'NEW-YORK', 11),
  (5, '2004-09-15', 'ALPHA', 'NEW-YORK', 11),
  (6, '2004-09-16', 'ALPHA', 'NEW-YORK', 10),
  (7, '2004-09-14', 'ALPHA', 'TOKYO', 10),
  (8, '2004-09-15', 'ALPHA', 'TOKYO', 11),
  (9, '2004-09-16', 'BETA', 'TOKYO', 10)
  DB2I  The SQL command completed successfully.
 
  select content, location, version, date
  from versions
  where concat(location, char(date)) in
  (select concat(location, char(max(date)))
  from versions
  where content = 'ALPHA'
  group by location)
 
  CONTENT LOCATION   VERSION DATE
  --- -- --- --
  ALPHA   NEW-YORK10 09/16/2004
  ALPHA   PARIS   10 09/16/2004
  ALPHA   TOKYO   11 09/15/2004
 
3 record(s) selected.
 
 
  SELECT content, location, version, date
  FROM versions t1
  WHERE date=(SELECT MAX(t2.date)
  FROM versions t2
  WHERE t1.location = t2.location AND t1.content = t2.content)
  AND content = 'ALPHA'
 
  CONTENT LOCATION   VERSION DATE
  --- -- --- --
  ALPHA   PARIS   10 09/16/2004
  ALPHA   NEW-YORK10 09/16/2004
  ALPHA   TOKYO

Query with group by

2004-09-16 Thread Vincent . Badier

Hello all,

i'm trying to built a query that report me the latest tuple for a given
field.
I explain myself a bit. Let's say i have a table such this :

+++-+--+-+
| id | date   | content | location | version |
+++-+--+-+
|  1 | 2004-09-14 | ALPHA   | PARIS|  10 |
|  2 | 2004-09-15 | ALPHA   | PARIS|  11 |
|  3 | 2004-09-16 | ALPHA   | PARIS|  10 |
|  4 | 2004-09-14 | ALPHA   | NEW-YORK |  11 |
|  5 | 2004-09-15 | ALPHA   | NEW-YORK |  11 |
|  6 | 2004-09-16 | ALPHA   | NEW-YORK |  10 |
|  7 | 2004-09-14 | ALPHA   | TOKYO|  10 |
|  8 | 2004-09-15 | ALPHA   | TOKYO|  11 |
|  9 | 2004-09-16 | BETA| TOKYO|  10 |
+++-+--+-+

Then, i'm trying to get, for ALPHA content, the last (most recent) tuple
for each location, with their associated version.
What i should have in the result set :

++-+--+-+
| 2004-09-16 | ALPHA   | PARIS|  10 |
| 2004-09-16 | ALPHA   | NEW-YORK |  10 |
| 2004-09-15 | ALPHA   | TOKYO|  11 |
++-+--+-+

I tried with max(date) but i get this :

mysql select max(date), location, version from temp where content=ALPHA
group by location;
++--+-+
| max(date)  | location | version |
++--+-+
| 2004-09-16 | NEW-YORK |  11 |
| 2004-09-16 | PARIS|  10 |
| 2004-09-15 | TOKYO|  10 |
++--+-+

Because the GROUP BY statement get the first tuple by default?
distinct(max(date)) do the same result.

How them can i get the correct result set?
Note that we cannot guess if the version is increasing or deacreasing.


Thanks for your help

--
Vincent




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-16 Thread Rhino

- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, September 16, 2004 3:05 AM
Subject: Query with group by

 Hello all,

 i'm trying to built a query that report me the latest tuple for a given
 field.
 I explain myself a bit. Let's say i have a table such this :

 +++-+--+-+
 | id | date   | content | location | version |
 +++-+--+-+
 |  1 | 2004-09-14 | ALPHA   | PARIS|  10 |
 |  2 | 2004-09-15 | ALPHA   | PARIS|  11 |
 |  3 | 2004-09-16 | ALPHA   | PARIS|  10 |
 |  4 | 2004-09-14 | ALPHA   | NEW-YORK |  11 |
 |  5 | 2004-09-15 | ALPHA   | NEW-YORK |  11 |
 |  6 | 2004-09-16 | ALPHA   | NEW-YORK |  10 |
 |  7 | 2004-09-14 | ALPHA   | TOKYO|  10 |
 |  8 | 2004-09-15 | ALPHA   | TOKYO|  11 |
 |  9 | 2004-09-16 | BETA| TOKYO|  10 |
 +++-+--+-+

 Then, i'm trying to get, for ALPHA content, the last (most recent) tuple
 for each location, with their associated version.
 What i should have in the result set :

 ++-+--+-+
 | 2004-09-16 | ALPHA   | PARIS|  10 |
 | 2004-09-16 | ALPHA   | NEW-YORK |  10 |
 | 2004-09-15 | ALPHA   | TOKYO|  11 |
 ++-+--+-+

 I tried with max(date) but i get this :

 mysql select max(date), location, version from temp where content=ALPHA
 group by location;
 ++--+-+
 | max(date)  | location | version |
 ++--+-+
 | 2004-09-16 | NEW-YORK |  11 |
 | 2004-09-16 | PARIS|  10 |
 | 2004-09-15 | TOKYO|  10 |
 ++--+-+

 Because the GROUP BY statement get the first tuple by default?
 distinct(max(date)) do the same result.

 How them can i get the correct result set?
 Note that we cannot guess if the version is increasing or deacreasing.

Which version of MySQL are you running?

I'm having trouble thinking of a solution that doesn't involve a subquery
but subqueries aren't supported until version 4.1.x; I don't want to give
you a subquery if you can't run it.

Rhino

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-16 Thread Jose Miguel Pérez

Hi Vincent!

 i'm trying to built a query that report me the latest tuple for a given
 field.
 I explain myself a bit. Let's say i have a table such this :

There is no simple solution to your problem. Strictly speaking, it's
forbidden to name a column in the SELECT clause which is NOT on the GROUP BY
clause. (Read chapter: 13.9.3 GROUP BY with Hidden Fields from the MySQL
Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way it's
permitted to include a column which is not in the GROUP BY clause (hidden
columns). The values returned from such a query are unpredictable.

I started saying there is no _simple_ solution. However, you can think a
little bit and your problem goes away. I can think of two solutions to your
problem which neither involves a subquery.

You are lucky in that you know for sure what your last value is (you do
have a date field). You already know the value of using MAX(date) in the
SELECT. You can concatenate the date column with the version, and return a
SUBSTRING from that operation, something like this:

SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1) AS
correct_version,
location, version AS wrong_version
FROM cities
WHERE content='ALPHA'
GROUP BY location

You will get the following result set: (Notice the correct_version
column).

+-+--+---+
| correct_version | location | wrong_version |
+-+--+---+
| 10  | NEW-YORK |11 |
| 10  | PARIS|10 |
| 11  | TOKYO|10 |
+-+--+---+


There is another option, however, and this one is what I like most. In
two words, you can join the table with itself, like this:

SELECT c1.date, c1.location, c1.version
FROM cities c1
LEFT JOIN cities c2
ON c1.location=c2.location AND c1.content=c2.content
AND c2.datec1.date
WHERE c2.id IS NULL AND c1.content = 'ALPHA'

What we have done here is first LEFT JOIN the table (cities in my
example) with itself so that we have in the left part those rows which have
the maximum date and with NULL on the right as the value of c2. We then
select those saying WHERE c2.id IS NULL. The last operation is selecting
the rows for an 'ALPHA' content.

Notice that with this version we also eliminate the GROUP BY, since all
rows returned from the join are unique. The result set returned with the
last example is this:

++--+-+
| date   | location | version |
++--+-+
| 2004-09-16 | PARIS|  10 |
| 2004-09-16 | NEW-YORK |  10 |
| 2004-09-15 | TOKYO|  11 |
++--+-+


Cheers,
Jose Miguel.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-16 Thread Michael Stassen

Jose Miguel Pérez wrote:
Hi Vincent!
i'm trying to built a query that report me the latest tuple for a given
field.
I explain myself a bit. Let's say i have a table such this :
There is no simple solution to your problem. Strictly speaking, it's
forbidden to name a column in the SELECT clause which is NOT on the GROUP BY
clause. (Read chapter: 13.9.3 GROUP BY with Hidden Fields from the MySQL
Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way it's
permitted to include a column which is not in the GROUP BY clause (hidden
columns). The values returned from such a query are unpredictable.
I started saying there is no _simple_ solution. However, you can think a
little bit and your problem goes away. I can think of two solutions to your
problem which neither involves a subquery.
You are lucky in that you know for sure what your last value is (you do
have a date field). You already know the value of using MAX(date) in the
SELECT. You can concatenate the date column with the version, and return a
SUBSTRING from that operation, something like this:
SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1) AS
correct_version,
location, version AS wrong_version
FROM cities
WHERE content='ALPHA'
GROUP BY location
This is the MAX-CONCAT trick.  It works, but it's inefficient, as it has to 
do a full table scan, with calculations done on each row.  An index on date 
cannot be used in this case, because we are searching for the maximum 
CONCAT(date, ' ', version), rather than the maximum date.

You will get the following result set: (Notice the correct_version
column).
+-+--+---+
| correct_version | location | wrong_version |
+-+--+---+
| 10  | NEW-YORK |11 |
| 10  | PARIS|10 |
| 11  | TOKYO|10 |
+-+--+---+
There is another option, however, and this one is what I like most. In
two words, you can join the table with itself, like this:
SELECT c1.date, c1.location, c1.version
FROM cities c1
LEFT JOIN cities c2
ON c1.location=c2.location AND c1.content=c2.content
AND c2.datec1.date
WHERE c2.id IS NULL AND c1.content = 'ALPHA'
What we have done here is first LEFT JOIN the table (cities in my
example) with itself so that we have in the left part those rows which have
the maximum date and with NULL on the right as the value of c2. We then
select those saying WHERE c2.id IS NULL. The last operation is selecting
the rows for an 'ALPHA' content.
This will work, but it is also somewhat inefficient.  The LEFT JOIN is 
creating numerous extra, unwanted rows, only to throw them away with the 
WHERE c2.id IS NULL.  Assuming n rows for a particular location value, you 
are creating 1 + (n * (n - 1)/2) rows {optimized down to n rows, if id is 
defined NOT NULL} to find the one row you want for that group.  That's no 
big deal for this small sample table, but it may not scale well.

Notice that with this version we also eliminate the GROUP BY, since all
rows returned from the join are unique. The result set returned with the
last example is this:
++--+-+
| date   | location | version |
++--+-+
| 2004-09-16 | PARIS|  10 |
| 2004-09-16 | NEW-YORK |  10 |
| 2004-09-15 | TOKYO|  11 |
++--+-+
The most efficient way is probably to use a temporary table.
  CREATE TEMPORARY TABLE max_dates
  SELECT location, MAX(date) AS max_date
  FROM temp
  WHERE content = 'ALPHA'
  GROUP BY location;
  SELECT t.*
  FROM temp t, max_dates m
  WHERE t.location = m.location
  AND t.date = m.max_date;
  DROP TABLE max_dates;
The manual describes the MAX-CONCAT trick, the temporary table solution, and 
a subquery solution for 4.1+ 
http://dev.mysql.com/doc/mysql/en/example-Maximum-column-group-row.html.

Cheers,
Jose Miguel.
Michael
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-16 Thread Rhino

I agree that Michael's solution with the temporary tables is the best I have
seen so far.

I am running MySQL 4.0.15 so I don't have any subquery capability. As a
result, I went to DB2 to come up with one solution that does give the right
answer via a subquery:

select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location);

I don't know if this will work in MySQL 4.1.x though. You may want to give
it a try if you have 4.1.x. By the way, I'm not convinced that this is the
*best* solution using a subquery; it's just the first one I could think of.
I don't have all day to spend on this ;-)

One other thought: is it really necessary to keep track of all of the
different versions you have had at each location? I can't think of a lot of
cases where I'd really care that I had version 8 of the ALPHA in Tokyo last
week or last year. Your query would be simpler - and the volume of data
would be less - if you only kept track of the current version for each
content at each location.

Rhino

- Original Message - 
From: Michael Stassen [EMAIL PROTECTED]
To: Jose Miguel Pérez [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Thursday, September 16, 2004 10:47 AM
Subject: Re: Query with group by



 Jose Miguel Pérez wrote:

  Hi Vincent!
 
 i'm trying to built a query that report me the latest tuple for a given
 field.
 I explain myself a bit. Let's say i have a table such this :
 
  There is no simple solution to your problem. Strictly speaking, it's
  forbidden to name a column in the SELECT clause which is NOT on the
GROUP BY
  clause. (Read chapter: 13.9.3 GROUP BY with Hidden Fields from the
MySQL
  Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way
it's
  permitted to include a column which is not in the GROUP BY clause
(hidden
  columns). The values returned from such a query are unpredictable.
 
  I started saying there is no _simple_ solution. However, you can
think a
  little bit and your problem goes away. I can think of two solutions to
your
  problem which neither involves a subquery.
 
  You are lucky in that you know for sure what your last value is (you
do
  have a date field). You already know the value of using MAX(date) in
the
  SELECT. You can concatenate the date column with the version, and return
a
  SUBSTRING from that operation, something like this:
 
  SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1)
AS
  correct_version,
  location, version AS wrong_version
  FROM cities
  WHERE content='ALPHA'
  GROUP BY location

 This is the MAX-CONCAT trick.  It works, but it's inefficient, as it has
to
 do a full table scan, with calculations done on each row.  An index on
date
 cannot be used in this case, because we are searching for the maximum
 CONCAT(date, ' ', version), rather than the maximum date.

  You will get the following result set: (Notice the correct_version
  column).
 
  +-+--+---+
  | correct_version | location | wrong_version |
  +-+--+---+
  | 10  | NEW-YORK |11 |
  | 10  | PARIS|10 |
  | 11  | TOKYO|10 |
  +-+--+---+
 
  There is another option, however, and this one is what I like most.
In
  two words, you can join the table with itself, like this:
 
  SELECT c1.date, c1.location, c1.version
  FROM cities c1
  LEFT JOIN cities c2
  ON c1.location=c2.location AND c1.content=c2.content
  AND c2.datec1.date
  WHERE c2.id IS NULL AND c1.content = 'ALPHA'
 
  What we have done here is first LEFT JOIN the table (cities in my
  example) with itself so that we have in the left part those rows which
have
  the maximum date and with NULL on the right as the value of c2. We then
  select those saying WHERE c2.id IS NULL. The last operation is
selecting
  the rows for an 'ALPHA' content.

 This will work, but it is also somewhat inefficient.  The LEFT JOIN is
 creating numerous extra, unwanted rows, only to throw them away with the
 WHERE c2.id IS NULL.  Assuming n rows for a particular location value, you
 are creating 1 + (n * (n - 1)/2) rows {optimized down to n rows, if id is
 defined NOT NULL} to find the one row you want for that group.  That's no
 big deal for this small sample table, but it may not scale well.

  Notice that with this version we also eliminate the GROUP BY, since
all
  rows returned from the join are unique. The result set returned with the
  last example is this:
 
  ++--+-+
  | date   | location | version |
  ++--+-+
  | 2004-09-16 | PARIS|  10 |
  | 2004-09-16 | NEW-YORK |  10 |
  | 2004-09-15

Re: Query with group by

2004-09-16 Thread Vincent . Badier


Which version of MySQL are you running?

I'm having trouble thinking of a solution that doesn't involve a subquery
but subqueries aren't supported until version 4.1.x; I don't want to give
you a subquery if you can't run it.

Rhino

I'm running  3.23.43.

--
Vincent



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-16 Thread Jose Miguel Pérez

Hi Michael!

Yes, you're right, thanks for extending and clarifying my message.
However, I'm not confident about your comments on the inefficiency for the
following query:

  SELECT c1.date, c1.location, c1.version
  FROM cities c1
  LEFT JOIN cities c2
  ON c1.location=c2.location AND c1.content=c2.content
  AND c2.datec1.date
  WHERE c2.id IS NULL AND c1.content = 'ALPHA'
 

 This will work, but it is also somewhat inefficient.  The LEFT JOIN is
 creating numerous extra, unwanted rows, only to throw them away with the
 WHERE c2.id IS NULL.  Assuming n rows for a particular location value, you

[...] and then [...]

 The most efficient way is probably to use a temporary table.

CREATE TEMPORARY TABLE max_dates
SELECT location, MAX(date) AS max_date
FROM temp
WHERE content = 'ALPHA'
GROUP BY location;

SELECT t.*
FROM temp t, max_dates m
WHERE t.location = m.location
AND t.date = m.max_date;

DROP TABLE max_dates;

I don't think the temporary table is such an efficient way of doing
this. Pardon me, I'm provably wrong, but let me explain to see if I think
correctly. First, I assume as true this table have an index on location,
content and date, apart from the PK on ID. Given that, on my query we
are using the keys at full, I mean, although you say the left join is
creating numerous extra, unwanted rows, this is not true. We could apply
the standard algebra here, but the real world query optimizers are smart
enough to not retrieve unwanted data. (What about joining four or more
tables! Multiply then).

Your query is creating a temporary table, doing a full scan of it
(thanks to the MAX(date) function), etc. If you do a EXPLAIN SELECT for your
query, you'll notice there is an Extra of: Using where; Using temporary;
Using FILESORT. Reading the MySQL documentation, one can see If you want
to make your queries as fast as possible, you should look out for Extra
values of Using filesort and Using temporary.. (Chapter 7.2.1 EXPLAIN
Syntax).

If I'm not wrong, maybe the first LEFT JOIN is worse from a mathematical
point of view, but the temporary one may be is the worst from a practical
perspective.

And you'll see I'm very cautious because I'm not such a SQL guru, but
I'd like to know other opinions.

Anyway, I don't know if one can program an agregate UDF called something
like EXTERNAL_MAX(...) or something, so that we could do like:

SELECT EXTERNAL_MAX(date, version)  --- i.e: Returns the version
value for the row with MAX(date).

This, for sure, will be the best solution. ;-)

Cheers,
Jose Miguel.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query with group by

2004-09-16 Thread Michael Stassen

Rhino wrote:
I agree that Michael's solution with the temporary tables is the best I have
seen so far.
I can't take much credit.  It's just an adaptation of the solution in the 
manual.

I am running MySQL 4.0.15 so I don't have any subquery capability. As a
result, I went to DB2 to come up with one solution that does give the right
answer via a subquery:
select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location);
Really?  That didn't work for me in mysql 4.1.4a-gamma.  I got
+-+--+-++
| content | location | version | date   |
+-+--+-++
| ALPHA   | PARIS|  10 | 2004-09-14 |
| ALPHA   | PARIS|  11 | 2004-09-15 |
| ALPHA   | PARIS|  10 | 2004-09-16 |
| ALPHA   | NEW-YORK |  11 | 2004-09-14 |
| ALPHA   | NEW-YORK |  11 | 2004-09-15 |
| ALPHA   | NEW-YORK |  10 | 2004-09-16 |
| ALPHA   | TOKYO|  10 | 2004-09-14 |
| ALPHA   | TOKYO|  11 | 2004-09-15 |
| BETA| TOKYO|  10 | 2004-09-16 |
+-+--+-++
9 rows in set (0.31 sec)

I don't know if this will work in MySQL 4.1.x though. You may want to give
it a try if you have 4.1.x. By the way, I'm not convinced that this is the
*best* solution using a subquery; it's just the first one I could think of.
I don't have all day to spend on this ;-)
The following query works for me.
  SELECT content, location, version, date
  FROM temp t1
  WHERE date=(SELECT MAX(t2.date)
  FROM temp t2
  WHERE t1.location = t2.location 

  AND t1.content = t2.content)
  AND content = 'ALPHA'; 

+-+--+-++
| content | location | version | date   |
+-+--+-++
| ALPHA   | PARIS|  10 | 2004-09-16 |
| ALPHA   | NEW-YORK |  10 | 2004-09-16 |
| ALPHA   | TOKYO|  11 | 2004-09-15 |
+-+--+-++
3 rows in set (0.01 sec)
Again, that's an adaptation of the subquery version of the solution in the 
manual.

Michael
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Nested query with GROUP BY ... HAVING COUNT(*) ... hangs 4.1.3 beta

2004-07-20 Thread Leo Siefert

Thanks for your response.

 [EMAIL PROTECTED] 7/19/2004 11:47:39 AM 
It looks like your IN statement is forcing your inner SELECT to
execute 
once PER ROW of your main table. It is asking the engine to make sure
that 
_each and every_  id value in main meets the condition in the inner 
select. So, for each and every value in the table main, it has to 
re-computing the inner query and scan the results for matches.

Not sure why this would happen.  The nested query is not correlated to
the outer query, so I would expect it to be executed only once.  I have
tried the same query with even larger file sizes on other data managers
and not had this problem. (In fact, I copied the query from an existing
FoxPro program.)

Also, in my production app, the actual queries being run ar much more
complex, including multiple nested queries, and only with the having
clause is there ever a problem.

I would change it to a JOIN against an anonymous view and test again
-

SELECT m.*
FROM main m
INNER JOIN (SELECT main_ID
FROM receipt
GROUP BY main_ID
HAVING COUNT(1)  5) as r
ON m.id = r.main_ID

This query actually does run quickly.  Thanks - I will try to work the
syntax into my query generator.

- or to break it into two tables for some real speed -

CREATE TEMPORARY TABLE tmpR
SELECT main_ID
FROM receipt
GROUP BY main_ID
HAVING COUNT(1)  5;

alter table tmpR add key(main_Id);

SELECT m.*
FROM main m
INNER JOIN tmpR r
on m.ID = r.main_ID;

DROP TABLE tmpR;

This is actually the second scenario I had tried, as noted in my
original post, and it does yeild better results than the nested query,
but still takes an incredibly long time to run.

Either method should avoid the re-execution of the subselect for every
row 
in your primary table. Adding the index to the temporary table will
make 
the last select really fly.

Yours,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine


Thanks for the info, and for the query syntax to work around the
problem.  I still think this is a bug in processing the nested query,
and if it is runing the subquery for each line in the master table, i
think that is incorrect.

 - Leo Siefert







Leo Siefert [EMAIL PROTECTED] wrote on 07/19/2004
11:22:39 
AM:

 OS: Win2k Server
 MySQL: 4.1.1 alpha / 4.1.3 beta
Table type: INNO DB
 
 In my production environment, running the query:
 
 select * from main where id in
 (select main_id from receipt group by main_id having COUNT(*)  5)
 
 will hang the server - sometimes for over a day, thugh it seems it
will
 eventually complete working on it if given enough time. Currently
main
 contains ~200,000 records and receipt contains ~16,000. Main records
 with any receipts have an average of ~10 receipts, but most have
none. 
 
 Created a smaller test database:
 
 master
id   int(6)  primary  autoincrement
name  varchar(25)  (filled with random 10 char strings)
 
 detail
id   int(6)  primary  autoincrement
master_id   int(6)  index  (filled with random ints =
 max(master.id))
detail  varchar(25)  (filled with random 10 char strings)
 
 temp
id int(6) index
 
 Fill master with 1,000 records, detail with 10,000.
 Clone and fill master with 10,000 records, detail with 100,000.
 
 Query:
 
 select * from master where master.id in
 (select master_id from detail group by master_id having COUNT(*) 
2)
 
 (small) returns 76 rows in 13 seconds.
 (large) returns 496 rows in 566 seconds. (COUNT(*)  15)
 
 Tried a two part query, sending the intermediate results to a
temporary
 table:
 
 create temporary table t select master_id from detail group by
 master_id having COUNT(*)  2;
 select * from master where master.id in (select master_id from t);
 drop table t;
 
 (small) returns 76 rows in 2.8 seconds.
 (large) returns 496 rows in 17 seconds.
 
 Running the intermediate results into a permanent table:
 
 truncate table temp;
 insert into temp select master_id from detail group by master_id
having
 COUNT(*)  2;
 select * from master where master.id in (select id from temp);
 
 (small) returns 76 rows in 0.16 seconds.
 (large) returns 496 rows in 0.17 seconds.
 
 Have tried playing around with some of the system variables:
 query_cache_size, innodb_buffer_pool_size with no real affect.
 
 In our production environment (record size is much larger, similar
 number of records to the large test set), both the nested query and
the
 two-part query using a temporary query hang for indeterminate ( 6
hrs)
 amounts of time, leaving the use of a permanent table as the only
 option. Of course, the only real way to manage this is to create a
 dedicated scratch table for each user of the system, a somewhat
onerous
 workaround.
 
 Anyone have an idea on a solution to this? Is there something in
 setting up for INNO DB that I am missing, or should I file this as a
 bug?
 
 Thanks.
 
  - Leo
 
 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql 
 To unsubscribe:

Nested query with GROUP BY ... HAVING COUNT(*) ... hangs 4.1.3 beta

2004-07-19 Thread Leo Siefert

OS: Win2k Server
MySQL: 4.1.1 alpha / 4.1.3 beta
   Table type: INNO DB

In my production environment, running the query:

select * from main where id in
(select main_id from receipt group by main_id having COUNT(*)  5)

will hang the server - sometimes for over a day, thugh it seems it will
eventually complete working on it if given enough time. Currently main
contains ~200,000 records and receipt contains ~16,000. Main records
with any receipts have an average of ~10 receipts, but most have none. 

Created a smaller test database:

master
   id   int(6)  primary  autoincrement
   name  varchar(25)  (filled with random 10 char strings)

detail
   id   int(6)  primary  autoincrement
   master_id   int(6)  index  (filled with random ints =
max(master.id))
   detail  varchar(25)  (filled with random 10 char strings)

temp
   id int(6) index

Fill master with 1,000 records, detail with 10,000.
Clone and fill master with 10,000 records, detail with 100,000.

Query:

select * from master where master.id in
(select master_id from detail group by master_id having COUNT(*)  2)

(small) returns 76 rows in 13 seconds.
(large) returns 496 rows in 566 seconds. (COUNT(*)  15)

Tried a two part query, sending the intermediate results to a temporary
table:

create temporary table t select master_id from detail group by
master_id having COUNT(*)  2;
select * from master where master.id in (select master_id from t);
drop table t;

(small) returns 76 rows in 2.8 seconds.
(large) returns 496 rows in 17 seconds.

Running the intermediate results into a permanent table:

truncate table temp;
insert into temp select master_id from detail group by master_id having
COUNT(*)  2;
select * from master where master.id in (select id from temp);

(small) returns 76 rows in 0.16 seconds.
(large) returns 496 rows in 0.17 seconds.

Have tried playing around with some of the system variables:
query_cache_size, innodb_buffer_pool_size with no real affect.

In our production environment (record size is much larger, similar
number of records to the large test set), both the nested query and the
two-part query using a temporary query hang for indeterminate ( 6 hrs)
amounts of time, leaving the use of a permanent table as the only
option. Of course, the only real way to manage this is to create a
dedicated scratch table for each user of the system, a somewhat onerous
workaround.

Anyone have an idea on a solution to this? Is there something in
setting up for INNO DB that I am missing, or should I file this as a
bug?

Thanks.

 - Leo

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Nested query with GROUP BY ... HAVING COUNT(*) ... hangs 4.1.3 beta

2004-07-19 Thread SGreen

It looks like your IN statement is forcing your inner SELECT to execute 
once PER ROW of your main table. It is asking the engine to make sure that 
_each and every_  id value in main meets the condition in the inner 
select. So, for each and every value in the table main, it has to 
re-computing the inner query and scan the results for matches.

I would change it to a JOIN against an anonymous view and test again -

SELECT m.*
FROM main m
INNER JOIN (SELECT main_ID
FROM receipt
GROUP BY main_ID
HAVING COUNT(1)  5) as r
ON m.id = r.main_ID

- or to break it into two tables for some real speed -

CREATE TEMPORARY TABLE tmpR
SELECT main_ID
FROM receipt
GROUP BY main_ID
HAVING COUNT(1)  5;

alter table tmpR add key(main_Id);

SELECT m.*
FROM main m
INNER JOIN tmpR r
on m.ID = r.main_ID;

DROP TABLE tmpR;

Either method should avoid the re-execution of the subselect for every row 
in your primary table. Adding the index to the temporary table will make 
the last select really fly.

Yours,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Leo Siefert [EMAIL PROTECTED] wrote on 07/19/2004 11:22:39 
AM:

 OS: Win2k Server
 MySQL: 4.1.1 alpha / 4.1.3 beta
Table type: INNO DB
 
 In my production environment, running the query:
 
 select * from main where id in
 (select main_id from receipt group by main_id having COUNT(*)  5)
 
 will hang the server - sometimes for over a day, thugh it seems it will
 eventually complete working on it if given enough time. Currently main
 contains ~200,000 records and receipt contains ~16,000. Main records
 with any receipts have an average of ~10 receipts, but most have none. 
 
 Created a smaller test database:
 
 master
id   int(6)  primary  autoincrement
name  varchar(25)  (filled with random 10 char strings)
 
 detail
id   int(6)  primary  autoincrement
master_id   int(6)  index  (filled with random ints =
 max(master.id))
detail  varchar(25)  (filled with random 10 char strings)
 
 temp
id int(6) index
 
 Fill master with 1,000 records, detail with 10,000.
 Clone and fill master with 10,000 records, detail with 100,000.
 
 Query:
 
 select * from master where master.id in
 (select master_id from detail group by master_id having COUNT(*)  2)
 
 (small) returns 76 rows in 13 seconds.
 (large) returns 496 rows in 566 seconds. (COUNT(*)  15)
 
 Tried a two part query, sending the intermediate results to a temporary
 table:
 
 create temporary table t select master_id from detail group by
 master_id having COUNT(*)  2;
 select * from master where master.id in (select master_id from t);
 drop table t;
 
 (small) returns 76 rows in 2.8 seconds.
 (large) returns 496 rows in 17 seconds.
 
 Running the intermediate results into a permanent table:
 
 truncate table temp;
 insert into temp select master_id from detail group by master_id having
 COUNT(*)  2;
 select * from master where master.id in (select id from temp);
 
 (small) returns 76 rows in 0.16 seconds.
 (large) returns 496 rows in 0.17 seconds.
 
 Have tried playing around with some of the system variables:
 query_cache_size, innodb_buffer_pool_size with no real affect.
 
 In our production environment (record size is much larger, similar
 number of records to the large test set), both the nested query and the
 two-part query using a temporary query hang for indeterminate ( 6 hrs)
 amounts of time, leaving the use of a permanent table as the only
 option. Of course, the only real way to manage this is to create a
 dedicated scratch table for each user of the system, a somewhat onerous
 workaround.
 
 Anyone have an idea on a solution to this? Is there something in
 setting up for INNO DB that I am missing, or should I file this as a
 bug?
 
 Thanks.
 
  - Leo
 
 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Why this query doesn't group the email addresses?

2004-07-14 Thread Harald Fuchs

In article [EMAIL PROTECTED],
Justin Swanhart [EMAIL PROTECTED] writes:

 Where is the implicit group?  The 'order by'
 shouldn't effect how things
 are grouped.  On MySQL 4.0.17:

 Wow.  I wasn't aware.  Is that expected behavior?
 Other databases (Oracle) generate an error when you
 include columns in the select list that aren't in a
 group by,

Yes, because the SQL standard prohibits this.

 ... or they do an implicit group by (postgres)
 on the columns.

Huh?  PostgreSQL throws an error - as it should do.

 I [wrongly] assumed an implicit group
 by was going on because the query hadn't generated an
 error.

Nope; that's a MySQL extension to the SQL standard which confused
everyone and has few real use cases: MySQL groups by email and then
fetches a random first and last name out of the group.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Why this query doesn't group the email addresses?

2004-07-13 Thread Aaron Wolski

Hey all,

Got this query:

SELECT first,last,email FROM CustomerTable AS t1,
OrderTable AS t2, CartTable AS t3 WHERE t2.cart_id=t3.cart_id
AND t1.id=t2.customer_index AND t3.submitted='1' AND
t3.product_index='1' AND t3.quantity0
GROUP BY t1.email ORDER BY t1.first,t1.last

For some strange reason it doesn't seem to group the email addresses.
I'd be hard pressed to find every occurrence out of 1000 records, but I
DID quickly spot two exact same records which means the email address
was not grouped.

What can I do or where did I go wrong?

Thanks!

Aaron



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: Why this query doesn't group the email addresses?

2004-07-13 Thread Victor Pendleton

Were the records exactly the same? 

-Original Message-
From: Aaron Wolski
To: [EMAIL PROTECTED]
Sent: 7/13/04 10:04 AM
Subject: Why this query doesn't group the email addresses?

Hey all,

Got this query:

SELECT first,last,email FROM CustomerTable AS t1,
OrderTable AS t2, CartTable AS t3 WHERE t2.cart_id=t3.cart_id
AND t1.id=t2.customer_index AND t3.submitted='1' AND
t3.product_index='1' AND t3.quantity0
GROUP BY t1.email ORDER BY t1.first,t1.last

For some strange reason it doesn't seem to group the email addresses.
I'd be hard pressed to find every occurrence out of 1000 records, but I
DID quickly spot two exact same records which means the email address
was not grouped.

What can I do or where did I go wrong?

Thanks!

Aaron

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: Why this query doesn't group the email addresses?

2004-07-13 Thread Aaron Wolski

Yes sir. Exactly!

A

 -Original Message-
 From: Victor Pendleton [mailto:[EMAIL PROTECTED]
 Sent: July 13, 2004 11:15 AM
 To: 'Aaron Wolski '; '[EMAIL PROTECTED] '
 Subject: RE: Why this query doesn't group the email addresses?
 
 Were the records exactly the same?
 
 -Original Message-
 From: Aaron Wolski
 To: [EMAIL PROTECTED]
 Sent: 7/13/04 10:04 AM
 Subject: Why this query doesn't group the email addresses?
 
 Hey all,
 
 Got this query:
 
 SELECT first,last,email FROM CustomerTable AS t1,
 OrderTable AS t2, CartTable AS t3 WHERE t2.cart_id=t3.cart_id
 AND t1.id=t2.customer_index AND t3.submitted='1' AND
 t3.product_index='1' AND t3.quantity0
 GROUP BY t1.email ORDER BY t1.first,t1.last
 
 For some strange reason it doesn't seem to group the email addresses.
 I'd be hard pressed to find every occurrence out of 1000 records, but
I
 DID quickly spot two exact same records which means the email address
 was not grouped.
 
 What can I do or where did I go wrong?
 
 Thanks!
 
 Aaron
 
 
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
 http://lists.mysql.com/[EMAIL PROTECTED]



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Why this query doesn't group the email addresses?

2004-07-13 Thread Wesley Furgiuele

What type of field is the email field?
Wes
On Jul 13, 2004, at 11:04 AM, Aaron Wolski wrote:
Hey all,
Got this query:
SELECT first,last,email FROM CustomerTable AS t1,
OrderTable AS t2, CartTable AS t3 WHERE t2.cart_id=t3.cart_id
AND t1.id=t2.customer_index AND t3.submitted='1' AND
t3.product_index='1' AND t3.quantity0
GROUP BY t1.email ORDER BY t1.first,t1.last
For some strange reason it doesn't seem to group the email addresses.
I'd be hard pressed to find every occurrence out of 1000 records, but I
DID quickly spot two exact same records which means the email address
was not grouped.
What can I do or where did I go wrong?
Thanks!
Aaron

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: Why this query doesn't group the email addresses?

2004-07-13 Thread Aaron Wolski

varchar, sir.



 -Original Message-
 From: Wesley Furgiuele [mailto:[EMAIL PROTECTED]
 Sent: July 13, 2004 1:14 PM
 To: Aaron Wolski
 Cc: [EMAIL PROTECTED]
 Subject: Re: Why this query doesn't group the email addresses?
 
 What type of field is the email field?
 
 Wes
 
 On Jul 13, 2004, at 11:04 AM, Aaron Wolski wrote:
 
  Hey all,
 
  Got this query:
 
  SELECT first,last,email FROM CustomerTable AS t1,
  OrderTable AS t2, CartTable AS t3 WHERE t2.cart_id=t3.cart_id
  AND t1.id=t2.customer_index AND t3.submitted='1' AND
  t3.product_index='1' AND t3.quantity0
  GROUP BY t1.email ORDER BY t1.first,t1.last
 
  For some strange reason it doesn't seem to group the email
addresses.
  I'd be hard pressed to find every occurrence out of 1000 records,
but I
  DID quickly spot two exact same records which means the email
address
  was not grouped.
 
  What can I do or where did I go wrong?
 
  Thanks!
 
  Aaron
 
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
 http://lists.mysql.com/[EMAIL PROTECTED]
 



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Why this query doesn't group the email addresses?

2004-07-13 Thread Justin Swanhart

You are doing an implicit group by of first, last as
well as your explicit group by of email.  

So you could have two records with the same e-mail
address generate two records with your group by

   Justin Time [EMAIL PROTECTED]
   Justin Credible [EMAIL PROTECTED]  --DUPE--

Case differences between the records could also cause
dupes.  If case differences are causing it then do

select lower(first), lower(last), lower(email)
...
group by lower(first), lower(last), lower(email)

--- Wesley Furgiuele [EMAIL PROTECTED] wrote:
 What type of field is the email field?
 
 Wes
 
 On Jul 13, 2004, at 11:04 AM, Aaron Wolski wrote:
 
  Hey all,
 
  Got this query:
 
  SELECT first,last,email FROM CustomerTable AS t1,
  OrderTable AS t2, CartTable AS t3 WHERE
 t2.cart_id=t3.cart_id
  AND t1.id=t2.customer_index AND t3.submitted='1'
 AND
  t3.product_index='1' AND t3.quantity0
  GROUP BY t1.email ORDER BY t1.first,t1.last
 
  For some strange reason it doesn't seem to group
 the email addresses.
  I'd be hard pressed to find every occurrence out
 of 1000 records, but I
  DID quickly spot two exact same records which
 means the email address
  was not grouped.
 
  What can I do or where did I go wrong?
 
  Thanks!
 
  Aaron
 
 
 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:   

http://lists.mysql.com/[EMAIL PROTECTED]
 
 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Why this query doesn't group the email addresses?

2004-07-13 Thread Wesley Furgiuele

Justin:
Interesting. So, if GROUP BY t1.email doesn't group simply by 
t1.email, then how would the query be written to accomplish that? And 
how is it occurring that Aaron is implicitly grouping by email, first, 
last instead of just grouping by email? Is it because of the ORDER BY 
clause?

Wes
On Jul 13, 2004, at 2:13 PM, Justin Swanhart wrote:
You are doing an implicit group by of first, last as
well as your explicit group by of email.
So you could have two records with the same e-mail
address generate two records with your group by
   Justin Time [EMAIL PROTECTED]
   Justin Credible [EMAIL PROTECTED]  --DUPE--
Case differences between the records could also cause
dupes.  If case differences are causing it then do
select lower(first), lower(last), lower(email)
...
group by lower(first), lower(last), lower(email)
--- Wesley Furgiuele [EMAIL PROTECTED] wrote:
What type of field is the email field?
Wes
On Jul 13, 2004, at 11:04 AM, Aaron Wolski wrote:
Hey all,
Got this query:
SELECT first,last,email FROM CustomerTable AS t1,
OrderTable AS t2, CartTable AS t3 WHERE
t2.cart_id=t3.cart_id
AND t1.id=t2.customer_index AND t3.submitted='1'
AND
t3.product_index='1' AND t3.quantity0
GROUP BY t1.email ORDER BY t1.first,t1.last
For some strange reason it doesn't seem to group
the email addresses.
I'd be hard pressed to find every occurrence out
of 1000 records, but I
DID quickly spot two exact same records which
means the email address
was not grouped.
What can I do or where did I go wrong?
Thanks!
Aaron

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Why this query doesn't group the email addresses?

2004-07-13 Thread Garth Webb

On Tue, 2004-07-13 at 11:13, Justin Swanhart wrote:
 You are doing an implicit group by of first, last as
 well as your explicit group by of email.  
 
 So you could have two records with the same e-mail
 address generate two records with your group by
 
Justin Time [EMAIL PROTECTED]
Justin Credible [EMAIL PROTECTED]  --DUPE--

Where is the implicit group?  The 'order by' shouldn't effect how things
are grouped.  On MySQL 4.0.17:

create table bar ( email varchar(64), first varchar(64), last
varchar(64) );

insert into bar values ('[EMAIL PROTECTED]', 'a', 'smith'), ('[EMAIL PROTECTED]', 'a',
'williams'), ('[EMAIL PROTECTED]', 'b', 'webb'), ('[EMAIL PROTECTED]', 'c', 'oconner');

mysql select * from bar;
+-+---+--+
| email   | first | last |
+-+---+--+
| [EMAIL PROTECTED] | a | smith|
| [EMAIL PROTECTED] | a | williams |
| [EMAIL PROTECTED] | b | webb |
| [EMAIL PROTECTED] | c | oconner  |
+-+---+--+

mysql select first,last,email from bar b group by b.email order by
b.first, b.last;
+---+---+-+
| first | last  | email   |
+---+---+-+
| a | smith | [EMAIL PROTECTED] |
| b | webb  | [EMAIL PROTECTED] |
+---+---+-+


 Case differences between the records could also cause
 dupes.  If case differences are causing it then do
 
 select lower(first), lower(last), lower(email)
 ...
 group by lower(first), lower(last), lower(email)

Case or extra whitespace is a definite possibility.  Aaron, try to find
at least one occurrence of duplicate email addresses and then post a
small (5 row) dataset that exhibits the problem you are having.

Garth

 --- Wesley Furgiuele [EMAIL PROTECTED] wrote:
  What type of field is the email field?
  
  Wes
  
  On Jul 13, 2004, at 11:04 AM, Aaron Wolski wrote:
  
   Hey all,
  
   Got this query:
  
   SELECT first,last,email FROM CustomerTable AS t1,
   OrderTable AS t2, CartTable AS t3 WHERE
  t2.cart_id=t3.cart_id
   AND t1.id=t2.customer_index AND t3.submitted='1'
  AND
   t3.product_index='1' AND t3.quantity0
   GROUP BY t1.email ORDER BY t1.first,t1.last
  
   For some strange reason it doesn't seem to group
  the email addresses.
   I'd be hard pressed to find every occurrence out
  of 1000 records, but I
   DID quickly spot two exact same records which
  means the email address
   was not grouped.
  
   What can I do or where did I go wrong?
  
   Thanks!
  
   Aaron
  
  
  -- 
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:   
 
 http://lists.mysql.com/[EMAIL PROTECTED]
  
  
-- 
. Garth Webb
. [EMAIL PROTECTED]
.
. shoes *  * schoenen *  * chaussures * zapatos
. Schuhe *  * pattini *  * sapatas * 

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Why this query doesn't group the email addresses?

2004-07-13 Thread Justin Swanhart

 Where is the implicit group?  The 'order by'
 shouldn't effect how things
 are grouped.  On MySQL 4.0.17:

Wow.  I wasn't aware.  Is that expected behavior? 
Other databases (Oracle) generate an error when you
include columns in the select list that aren't in a
group by, or they do an implicit group by (postgres)
on the columns.  I [wrongly] assumed an implicit group
by was going on because the query hadn't generated an
error.

I guess mysql just takes the values from the first
record that matches the group expression.

That creates confusion when you do:
select first,last,email, count(*)
from foobar
group by email

Because the query will report a count of two (given
your data) when there really is only one row with that
first,last,email combination.

Oracle would require you to do:
select first,last,email, count(*)
from foobar
group by first,last,email
otherwise you would get an error that first is not a
GROUP BY expression.

That query would return four rows on your data, each
with a count of 1.

My apologies,

Justin

 
 create table bar ( email varchar(64), first
 varchar(64), last
 varchar(64) );
 
 insert into bar values ('[EMAIL PROTECTED]', 'a', 'smith'),
 ('[EMAIL PROTECTED]', 'a',
 'williams'), ('[EMAIL PROTECTED]', 'b', 'webb'), ('[EMAIL PROTECTED]',
 'c', 'oconner');
 
 mysql select * from bar;
 +-+---+--+
 | email   | first | last |
 +-+---+--+
 | [EMAIL PROTECTED] | a | smith|
 | [EMAIL PROTECTED] | a | williams |
 | [EMAIL PROTECTED] | b | webb |
 | [EMAIL PROTECTED] | c | oconner  |
 +-+---+--+
 
 mysql select first,last,email from bar b group by
 b.email order by
 b.first, b.last;
 +---+---+-+
 | first | last  | email   |
 +---+---+-+
 | a | smith | [EMAIL PROTECTED] |
 | b | webb  | [EMAIL PROTECTED] |
 +---+---+-+
 
 
  Case differences between the records could also
 cause
  dupes.  If case differences are causing it then do
  
  select lower(first), lower(last), lower(email)
  ...
  group by lower(first), lower(last), lower(email)
 
 Case or extra whitespace is a definite possibility. 
 Aaron, try to find
 at least one occurrence of duplicate email addresses
 and then post a
 small (5 row) dataset that exhibits the problem you
 are having.
 
 Garth
 
  --- Wesley Furgiuele [EMAIL PROTECTED] wrote:
   What type of field is the email field?
   
   Wes
   
   On Jul 13, 2004, at 11:04 AM, Aaron Wolski
 wrote:
   
Hey all,
   
Got this query:
   
SELECT first,last,email FROM CustomerTable AS
 t1,
OrderTable AS t2, CartTable AS t3 WHERE
   t2.cart_id=t3.cart_id
AND t1.id=t2.customer_index AND
 t3.submitted='1'
   AND
t3.product_index='1' AND t3.quantity0
GROUP BY t1.email ORDER BY t1.first,t1.last
   
For some strange reason it doesn't seem to
 group
   the email addresses.
I'd be hard pressed to find every occurrence
 out
   of 1000 records, but I
DID quickly spot two exact same records which
   means the email address
was not grouped.
   
What can I do or where did I go wrong?
   
Thanks!
   
Aaron
   
   
   -- 
   MySQL General Mailing List
   For list archives: http://lists.mysql.com/mysql
   To unsubscribe:   
  
 

http://lists.mysql.com/[EMAIL PROTECTED]
   
   
 -- 
 . Garth Webb
 . [EMAIL PROTECTED]
 .
 . shoes * 鞋子 * schoenen * 단화 * chaussures *
 zapatos
 . Schuhe * παπούτσια * pattini * 靴 *
 sapatas * ботинки
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:   

http://lists.mysql.com/[EMAIL PROTECTED]
 
 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Why this query doesn't group the email addresses?

2004-07-13 Thread Garth Webb

On Tue, 2004-07-13 at 13:51, Justin Swanhart wrote:
  Where is the implicit group?  The 'order by'
  shouldn't effect how things
  are grouped.  On MySQL 4.0.17:
 
 Wow.  I wasn't aware.  Is that expected behavior? 

Yup, check out:

http://dev.mysql.com/doc/mysql/en/GROUP-BY-hidden-fields.html

 Other databases (Oracle) generate an error when you
 include columns in the select list that aren't in a
 group by, or they do an implicit group by (postgres)
 on the columns.  I [wrongly] assumed an implicit group
 by was going on because the query hadn't generated an
 error.
 
 I guess mysql just takes the values from the first
 record that matches the group expression.

Sorta, but its probably better to assume that its random.

 That creates confusion when you do:
 select first,last,email, count(*)
 from foobar
 group by email
 
 Because the query will report a count of two (given
 your data) when there really is only one row with that
 first,last,email combination.

But there are 2 rows with that email which, considering 'email' is the
only thing group'ed on, makes sense.  From the link above, one would
only select 'first' and 'last' if they knew they were unique as well and
wanted to save on typing.  Probably the only way to trust that is to
have a unique constraint on those three columns.

 Oracle would require you to do:
 select first,last,email, count(*)
 from foobar
 group by first,last,email
 otherwise you would get an error that first is not a
 GROUP BY expression.

Yah, and Oracle actually enforces foreign keys and won't insert guessed
default values for 'NOT NULL' fields when you try to insert a null value
into them.  MySQL keeps you on your toes!

Garth

 That query would return four rows on your data, each
 with a count of 1.
 
 My apologies,
 
 Justin
 
  
  create table bar ( email varchar(64), first
  varchar(64), last
  varchar(64) );
  
  insert into bar values ('[EMAIL PROTECTED]', 'a', 'smith'),
  ('[EMAIL PROTECTED]', 'a',
  'williams'), ('[EMAIL PROTECTED]', 'b', 'webb'), ('[EMAIL PROTECTED]',
  'c', 'oconner');
  
  mysql select * from bar;
  +-+---+--+
  | email   | first | last |
  +-+---+--+
  | [EMAIL PROTECTED] | a | smith|
  | [EMAIL PROTECTED] | a | williams |
  | [EMAIL PROTECTED] | b | webb |
  | [EMAIL PROTECTED] | c | oconner  |
  +-+---+--+
  
  mysql select first,last,email from bar b group by
  b.email order by
  b.first, b.last;
  +---+---+-+
  | first | last  | email   |
  +---+---+-+
  | a | smith | [EMAIL PROTECTED] |
  | b | webb  | [EMAIL PROTECTED] |
  +---+---+-+
  
  
   Case differences between the records could also
  cause
   dupes.  If case differences are causing it then do
   
   select lower(first), lower(last), lower(email)
   ...
   group by lower(first), lower(last), lower(email)
  
  Case or extra whitespace is a definite possibility. 
  Aaron, try to find
  at least one occurrence of duplicate email addresses
  and then post a
  small (5 row) dataset that exhibits the problem you
  are having.
  
  Garth
  
   --- Wesley Furgiuele [EMAIL PROTECTED] wrote:
What type of field is the email field?

Wes

On Jul 13, 2004, at 11:04 AM, Aaron Wolski
  wrote:

 Hey all,

 Got this query:

 SELECT first,last,email FROM CustomerTable AS
  t1,
 OrderTable AS t2, CartTable AS t3 WHERE
t2.cart_id=t3.cart_id
 AND t1.id=t2.customer_index AND
  t3.submitted='1'
AND
 t3.product_index='1' AND t3.quantity0
 GROUP BY t1.email ORDER BY t1.first,t1.last

 For some strange reason it doesn't seem to
  group
the email addresses.
 I'd be hard pressed to find every occurrence
  out
of 1000 records, but I
 DID quickly spot two exact same records which
means the email address
 was not grouped.

 What can I do or where did I go wrong?

 Thanks!

 Aaron


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:   
   
  
 
 http://lists.mysql.com/[EMAIL PROTECTED]


  -- 
  . Garth Webb
  . [EMAIL PROTECTED]
  .
  . shoes *  * schoenen *  * chaussures *
  zapatos
  . Schuhe *  * pattini *  *
  sapatas * 
  
  --
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:   
 
 http://lists.mysql.com/[EMAIL PROTECTED]
  
  
-- 
. Garth Webb
. [EMAIL PROTECTED]
.
. shoes *  * schoenen *  * chaussures * zapatos
. Schuhe *  * pattini *  * sapatas * 

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: QUERY question (group by) - please help

2003-01-11 Thread Paul DuBois

At 14:41 +0100 1/10/03, Damir Dezeljin wrote:

Hi.

I want to calculate how many rows my query which uses 'GROUP BY' returns.

The query:

SELECT something
FROM test
WHERE (kid=1) OR (kid=2) OR (kid=4)
GROUP BY cid,aid;


Is it posible to get number of rows with such a query from MySQL v3.23.49?


You want both a list of rows, and a count of the number of rows -- with
a single query.  That doesn't make sense, because those two things are
incompatible.


If it isn't posible ... is it posible in MySQL 4.x?


MySQL 4.x won't change the fundamental nature of the incompatibility.
The strategy you use below is reasonable.



I think a lot about this problem and I realize only the following
solution:
CREATE TEMPORARY TABLE t (i INT);
INSERT INTO t (i) SELECT aid FROM the_query_above
SELECT COUNT(*) FROM t;
DROP TABLE t;

But this isn't so elegant.

I want to do so on data generated by:

CREATE TABLE test (
kid INT,
aid INT,
cid INT
);

INSERT INTO test
(kid, aid, cid) VALUES
(  1,   0,   1),
(  2,   2,   2),
(  1,   3,   2),
(  2,   3,   2),
(  4,   4,   2),
(  4,   0,   3),
(  3,   3,   4),
(  4,   3,   4);


Regards,
Dezo



-
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: QUERY question (group by) - please help

2003-01-11 Thread Damir Dezeljin

 
 SELECT something
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)
  GROUP BY cid,aid;
 
 Is it posible to get number of rows with such a query from MySQL v3.23.49?

 You want both a list of rows, and a count of the number of rows -- with
 a single query.  That doesn't make sense, because those two things are
 incompatible.
I make a mistake by not specifing my problem in more detail. I will try to
be more specific:

I'm writing my own search engine for my web site. I want to output for
each result on each page (20 results per page) the no_of_current_result /
no_of_total_results . So with a first query I want to get a count of the
number of rows. Then I will execute a query with LIMIT blahblah to get
results for the current page.

Is it posible to get number of all results without geting all results in a
temporary table and then count them? If yes, how?

Thanks and regards,
Dezo


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: QUERY question (group by) - please help

2003-01-11 Thread Joseph Bueno

Damir Dezeljin wrote:

SELECT something
FROM test
WHERE (kid=1) OR (kid=2) OR (kid=4)
GROUP BY cid,aid;

Is it posible to get number of rows with such a query from MySQL v3.23.49?

You want both a list of rows, and a count of the number of rows -- with
a single query.  That doesn't make sense, because those two things are
incompatible.
 
 I make a mistake by not specifing my problem in more detail. I will try to
 be more specific:
 
 I'm writing my own search engine for my web site. I want to output for
 each result on each page (20 results per page) the no_of_current_result /
 no_of_total_results . So with a first query I want to get a count of the
 number of rows. Then I will execute a query with LIMIT blahblah to get
 results for the current page.
 
 Is it posible to get number of all results without geting all results in a
 temporary table and then count them? If yes, how?

SELECT COUNT(*)
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)
  GROUP BY cid,aid;

Should do it, or am I missing something ?

 
 Thanks and regards,
 Dezo
 
 


Hope this Helps
Joseph Bueno


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: QUERY question (group by) - please help

2003-01-11 Thread Damir Dezeljin

Hi again ;)

 SELECT COUNT(*)
   FROM test
   WHERE (kid=1) OR (kid=2) OR (kid=4)
   GROUP BY cid,aid;
This query returns:
+--+
| COUNT(*) |
+--+
|1 |
|1 |
|2 |
|1 |
|1 |
|1 |
+--+
6 rows in set (0.03 sec)

Inest of this I want to get back '6'  so numbers of rows in resulting
query;

The same query WITHOUT 'GROUP BY' returns:
mysql SELECT COUNT(*) FROM test WHERE (kid=1) OR (kid=2) OR (kid=4);
+--+
| COUNT(*) |
+--+
|7 |
+--+
1 row in set (0.00 sec)

So two rows are grouped because of same cid,did

Any sugestion?

Regards,
Dezo


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: QUERY question (group by) - please help

2003-01-11 Thread Joseph Bueno

Damir Dezeljin wrote:
 Hi again ;)
 
 
SELECT COUNT(*)
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)
  GROUP BY cid,aid;
 
 This query returns:
 +--+
 | COUNT(*) |
 +--+
 |1 |
 |1 |
 |2 |
 |1 |
 |1 |
 |1 |
 +--+
 6 rows in set (0.03 sec)
 
 Inest of this I want to get back '6'  so numbers of rows in resulting
 query;
 
 The same query WITHOUT 'GROUP BY' returns:
 mysql SELECT COUNT(*) FROM test WHERE (kid=1) OR (kid=2) OR (kid=4);
 +--+
 | COUNT(*) |
 +--+
 |7 |
 +--+
 1 row in set (0.00 sec)
 
 So two rows are grouped because of same cid,did
 
 Any sugestion?
 
 Regards,
 Dezo
 
 

SELECT COUNT(DISTINCT cid,aid)
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)

Should give you the number of groups.

Hope this helps
Joseph Bueno


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

QUERY question (group by) - please help

2003-01-10 Thread Damir Dezeljin

Hi.

I want to calculate how many rows my query which uses 'GROUP BY' returns.

The query:

SELECT something
FROM test
WHERE (kid=1) OR (kid=2) OR (kid=4)
GROUP BY cid,aid;


Is it posible to get number of rows with such a query from MySQL v3.23.49?
If it isn't posible ... is it posible in MySQL 4.x?

I think a lot about this problem and I realize only the following
solution:
CREATE TEMPORARY TABLE t (i INT);
INSERT INTO t (i) SELECT aid FROM the_query_above
SELECT COUNT(*) FROM t;
DROP TABLE t;

But this isn't so elegant.

I want to do so on data generated by:

CREATE TABLE test (
kid INT,
aid INT,
cid INT
);

INSERT INTO test
(kid, aid, cid) VALUES
(  1,   0,   1),
(  2,   2,   2),
(  1,   3,   2),
(  2,   3,   2),
(  4,   4,   2),
(  4,   0,   3),
(  3,   3,   4),
(  4,   3,   4);


Regards,
Dezo


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

QUERY question (group by) - please help

2003-01-10 Thread Damir Dezeljin

Sorry if this mail already arived to the list (I'm not sure if my posting
was sucesfull, because I had problems with my mail server). The past
posting:


Hi.

I want to calculate how many rows my query which uses 'GROUP BY' returns.

The query:

SELECT something
FROM test
WHERE (kid=1) OR (kid=2) OR (kid=4)
GROUP BY cid,aid;


Is it posible to get number of rows with such a query from MySQL v3.23.49?
If it isn't posible ... is it posible in MySQL 4.x?

I think a lot about this problem and I realize only the following
solution:
CREATE TEMPORARY TABLE t (i INT);
INSERT INTO t (i) SELECT aid FROM the_query_above
SELECT COUNT(*) FROM t;
DROP TABLE t;

But this isn't so elegant.

I want to do so on data generated by:

CREATE TABLE test (
kid INT,
aid INT,
cid INT
);

INSERT INTO test
(kid, aid, cid) VALUES
(  1,   0,   1),
(  2,   2,   2),
(  1,   3,   2),
(  2,   3,   2),
(  4,   4,   2),
(  4,   0,   3),
(  3,   3,   4),
(  4,   3,   4);


Regards,
Dezo



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

substitute for sub query with group function

2002-05-12 Thread Sabine Richter


Hello,

I just want to get the value of one column in the row where max of
another col of this table is.
With the possibility of sub queries I would write:

select col1
from table t1
where col2 = (select max(col2)
  from tab1e t2);


But how to do it without a sub query?
Just with 1 statement?
I find no solution.

Not wanting to believe there is none 
Sabine

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: substitute for sub query with group function

2002-05-12 Thread Philip Spradling


I could be missing something, but what about:

select col1 from t1 where col2 = max(col2);

On Sun, 12 May 2002 23:04:14 +0200
Sabine Richter [EMAIL PROTECTED] wrote:

 Hello,
 
 I just want to get the value of one column in the row where max of
 another col of this table is.
 With the possibility of sub queries I would write:
 
 select col1
 from table t1
 where col2 = (select max(col2)
   from tab1e t2);
 
 
 But how to do it without a sub query?
 Just with 1 statement?
 I find no solution.
 
 Not wanting to believe there is none 
 Sabine
 
 -
 Before posting, please check:
http://www.mysql.com/manual.php   (the manual)
http://lists.mysql.com/   (the list archive)
 
 To request this thread, e-mail [EMAIL PROTECTED]
 To unsubscribe, e-mail [EMAIL PROTECTED]
 Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
 
 

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: substitute for sub query with group function

2002-05-12 Thread Sabine Richter


Hello Philip,

that was my first choice, too. 
But no, error : Invalid use of group function

and select col1, max(col2) from table;
is error 1140: mixing of group cols with non groupü cols is illegal
(That would other sql backends say too)

and other variants of the above like select col1 from table t1, table
t2 where =max(col2) result in the same error reports

Perhaps someone else has an idea?

Best regards
Sabine 


Philip Spradling wrote:
 
 I could be missing something, but what about:
 
 select col1 from t1 where col2 = max(col2);
 
 On Sun, 12 May 2002 23:04:14 +0200
 Sabine Richter [EMAIL PROTECTED] wrote:
 
  Hello,
 
  I just want to get the value of one column in the row where max of
  another col of this table is.
  With the possibility of sub queries I would write:
 
  select col1
  from table t1
  where col2 = (select max(col2)
from tab1e t2);
 
 
  But how to do it without a sub query?
  Just with 1 statement?
  I find no solution.
 
  Not wanting to believe there is none
  Sabine
 
  -
  Before posting, please check:
 http://www.mysql.com/manual.php   (the manual)
 http://lists.mysql.com/   (the list archive)
 
  To request this thread, e-mail [EMAIL PROTECTED]
  To unsubscribe, e-mail 
[EMAIL PROTECTED]
  Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
 
 
 
 -
 Before posting, please check:
http://www.mysql.com/manual.php   (the manual)
http://lists.mysql.com/   (the list archive)
 
 To request this thread, e-mail [EMAIL PROTECTED]
 To unsubscribe, e-mail [EMAIL PROTECTED]
 Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: substitute for sub query with group function

2002-05-12 Thread Philip Spradling


Oh, someone should fix that... I guess it won't let you put a group statement in the 
where clause.

Here is another way.  On the down side, its a bit more roundabout.  On the up side, 
however, I actually tried it and it worked.

select col1 from t1 order by col2 desc limit 1;

-Philip


On Sun, 12 May 2002 23:41:05 +0200
Sabine Richter [EMAIL PROTECTED] wrote:

 Hello Philip,
 
 that was my first choice, too. 
 But no, error : Invalid use of group function
 
 and select col1, max(col2) from table;
 is error 1140: mixing of group cols with non groupü cols is illegal
 (That would other sql backends say too)
 
 and other variants of the above like select col1 from table t1, table
 t2 where =max(col2) result in the same error reports
 
 Perhaps someone else has an idea?
 
 Best regards
 Sabine 
 

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: substitute for sub query with group function

2002-05-12 Thread Paul DuBois


At 23:41 +0200 5/12/02, Sabine Richter wrote:
Hello Philip,

that was my first choice, too.
But no, error : Invalid use of group function

and select col1, max(col2) from table;
is error 1140: mixing of group cols with non groupü cols is illegal
(That would other sql backends say too)

And they should.


and other variants of the above like select col1 from table t1, table
t2 where =max(col2) result in the same error reports

Perhaps someone else has an idea?

If you don't mind running two queries, do this:

SELECT @max := MAX(col2) FROM t2;
SELECT col1 FROM t1 WHERE col2 = @max;


Best regards
Sabine


Philip Spradling wrote:

  I could be missing something, but what about:

  select col1 from t1 where col2 = max(col2);

  On Sun, 12 May 2002 23:04:14 +0200
  Sabine Richter [EMAIL PROTECTED] wrote:

   Hello,
  
   I just want to get the value of one column in the row where max of
   another col of this table is.
   With the possibility of sub queries I would write:
  
   select col1
   from table t1
   where col2 = (select max(col2)
 from tab1e t2);
  
  
   But how to do it without a sub query?
   Just with 1 statement?
   I find no solution.
  
   Not wanting to believe there is none
Sabine


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

47 matches

Mail list logo