RE: query results group/summed by interval

2010-08-01 Thread nuno . tavares
Hi all,

Aveeks solution should work if you have at least one call for each intervall.
It's the classical GROUP BY solution that only works on the available dataset.
Although it should work pretty well in the cited scenario, you will miss
intervals (from a all intervals report point of view) if indeed there are
intervals (of more than 5 minutes, in this example) when there were no calls at
all.

I had a somewhat similar problem (running the second scenario, though) and this
is the solution I setup (this was a Data Warehouse and that's why you'll read
about partition pruning, dataset was dozens of Gigs):

http://gpshumano.blogs.dri.pt/2009/09/28/finding-for-each-time-interval-how-many-records-are-ocurring-during-that-interval/

This might become handy if Ghulam understands the differences between my
scenario and his.

Hope that helps,
-NT




Quoting Martin Gainty mgai...@hotmail.com:

 
 no that would give you the count for each second interval instead of using
 the interval variable 5 
 
 Aveeks floor:
 FLOOR(X) Returns the largest integer value not greater than X. 
  
 1st (seconds/5) interval  example
 5/5=1
 floor(5/5) = 1
 supplied value would truncate and give you the int not greater than X
 then multiply by 5
 1*5=5
 is correct
  
 Aveeks sum function:
 SUM([DISTINCT] expr) 
 Returns the sum of expr. If the return set has no rows, SUM() returns NULL.
 The DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct values
 of expr. 
 SUM() returns NULL if there were no matching rows. 
 sum(calls) from calls group by 5 * floor(seconds/5)
 sum(calls) from calls group by 5 * floor(5/5)
 sum(calls) from class group by 5 * 1
 sum(calls) from class group by 5
 is correct 
  
 2nd(seconds/5) interval example
 10/5=2
 floor(10/5)=2
 supplied value would truncate and give you the int not greater than X
 then multiply by 5
 2*5=10
 is correct
  
 Aveeks sum function
 sum(calls) from calls group by 5 * floor(seconds/5)
 sum(calls) from calls group by 5 * floor(10/5)
 sum(calls) from class group by 5 * 2
 sum(calls) from class group by 10
 would be applicable only if the interval was 10
  
 Aveek if your interval is 5 change:
 sum(calls) from calls group by 5 * floor(seconds/5)
 to
 sum(calls) from calls group by floor(seconds/5)
  
 Martin Gainty 
 __ 
 Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité
 
 Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger
 sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung
 oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich
 dem Austausch von Informationen und entfaltet keine rechtliche
 Bindungswirkung. Aufgrund der leichten Manipulierbarkeit von E-Mails koennen
 wir keine Haftung fuer den Inhalt uebernehmen.
 Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le
 destinataire prévu, nous te demandons avec bonté que pour satisfaire informez
 l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci
 est interdite. Ce message sert à l'information seulement et n'aura pas
 n'importe quel effet légalement obligatoire. Étant donné que les email
 peuvent facilement être sujets à la manipulation, nous ne pouvons accepter
 aucune responsabilité pour le contenu fourni.
 
 
 
   
 
  Date: Sat, 31 Jul 2010 10:31:43 +0700
  From: cuong.m...@vienthongso.com
  To: ave...@yahoo-inc.com
  CC: mustafa...@gmail.com; mysql@lists.mysql.com
  Subject: Re: query results group/summed by interval
  
  Hi Aveek,
  
  I think Ghulam just want to count calls for each intervals
  so the query should looks like this:
  
  select count(*) as total_calls, queue_seconds
  from calls group by queue_seconds order by total_calls;
  
  
  - Original Message -
  From: Aveek Misra ave...@yahoo-inc.com
  To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
  Sent: Tuesday, July 27, 2010 5:54:13 PM
  Subject: RE: query results group/summed by interval
  
  try this ...
  
  select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end,
 sum(calls) from calls group by 5 * floor(seconds/5);
  
  This should give you an output of the type
  
  +---+--++
  | start | end | sum(calls) |
  +---+--++
  | 0 | 5 | 387 |
  | 5 | 10 | 225 |
  | 10 | 15 | 74 |
  +---+--++
  
  
  Thanks
  Aveek
  
  From: Ghulam Mustafa [mustafa...@gmail.com]
  Sent: Tuesday, July 27, 2010 3:53 PM
  To: mysql@lists.mysql.com
  Subject: query results group/summed by interval
  
  Hi everyone,
  
  i have two columns (seconds, number of calls), i need to produce a
  report which will show total number of calls in intervals (let'say 10
  seconds interval), i know i can do this programmability in my script but
  i was wondering if it's possible to accomplish this behavior within
  mysql. for example i have following data

RE: query results group/summed by interval

2010-08-01 Thread Travis Ard

You could also pre-define your intervals in a subquery using UNION and join 
that to your original table like so:

select ifnull(sum(calls), 0) as calls, n as queue_seconds
from 
(select 0 as n union 
select 5 union 
select 10 union 
select 15) as step
left join calls on calls.queue_seconds  (step.n - 5) and calls.queue_seconds 
= step.n
group by n;

+---+---+
| calls | queue_seconds |
+---+---+
|   250 | 0 |
|   168 | 5 |
|   268 |    10 |
| 0 |    15 |
+---+---+
4 rows in set (0.00 sec)

-Travis


 Date: Sun, 1 Aug 2010 13:16:36 +0100
 From: nuno.tava...@dri.pt
 To: mgai...@hotmail.com
 CC: cuong.m...@vienthongso.com; ave...@yahoo-inc.com; mustafa...@gmail.com; 
 mysql@lists.mysql.com
 Subject: RE: query results group/summed by interval

 Hi all,

 Aveeks solution should work if you have at least one call for each intervall.
 It's the classical GROUP BY solution that only works on the available dataset.
 Although it should work pretty well in the cited scenario, you will miss
 intervals (from a all intervals report point of view) if indeed there are
 intervals (of more than 5 minutes, in this example) when there were no calls 
 at
 all.

 I had a somewhat similar problem (running the second scenario, though) and 
 this
 is the solution I setup (this was a Data Warehouse and that's why you'll read
 about partition pruning, dataset was dozens of Gigs):

 http://gpshumano.blogs.dri.pt/2009/09/28/finding-for-each-time-interval-how-many-records-are-ocurring-during-that-interval/

 This might become handy if Ghulam understands the differences between my
 scenario and his.

 Hope that helps,
 -NT




 Quoting Martin Gainty :

 
  no that would give you the count for each second interval instead of using
  the interval variable 5
 
  Aveeks floor:
  FLOOR(X) Returns the largest integer value not greater than X.
 
  1st (seconds/5) interval example
  5/5=1
  floor(5/5) = 1
  supplied value would truncate and give you the int not greater than X
  then multiply by 5
  1*5=5
  is correct
 
  Aveeks sum function:
  SUM([DISTINCT] expr)
  Returns the sum of expr. If the return set has no rows, SUM() returns NULL.
  The DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct 
  values
  of expr.
  SUM() returns NULL if there were no matching rows.
  sum(calls) from calls group by 5 * floor(seconds/5)
  sum(calls) from calls group by 5 * floor(5/5)
  sum(calls) from class group by 5 * 1
  sum(calls) from class group by 5
  is correct
 
  2nd(seconds/5) interval example
  10/5=2
  floor(10/5)=2
  supplied value would truncate and give you the int not greater than X
  then multiply by 5
  2*5=10
  is correct
 
  Aveeks sum function
  sum(calls) from calls group by 5 * floor(seconds/5)
  sum(calls) from calls group by 5 * floor(10/5)
  sum(calls) from class group by 5 * 2
  sum(calls) from class group by 10
  would be applicable only if the interval was 10
 
  Aveek if your interval is 5 change:
  sum(calls) from calls group by 5 * floor(seconds/5)
  to
  sum(calls) from calls group by floor(seconds/5)
 
  Martin Gainty
  __
  Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité
 
  Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene 
  Empfaenger
  sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte 
  Weiterleitung
  oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich
  dem Austausch von Informationen und entfaltet keine rechtliche
  Bindungswirkung. Aufgrund der leichten Manipulierbarkeit von E-Mails koennen
  wir keine Haftung fuer den Inhalt uebernehmen.
  Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le
  destinataire prévu, nous te demandons avec bonté que pour satisfaire 
  informez
  l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci
  est interdite. Ce message sert à l'information seulement et n'aura pas
  n'importe quel effet légalement obligatoire. Étant donné que les email
  peuvent facilement être sujets à la manipulation, nous ne pouvons accepter
  aucune responsabilité pour le contenu fourni.
 
 
 
 
 
   Date: Sat, 31 Jul 2010 10:31:43 +0700
   From: cuong.m...@vienthongso.com
   To: ave...@yahoo-inc.com
   CC: mustafa...@gmail.com; mysql@lists.mysql.com
   Subject: Re: query results group/summed by interval
  
   Hi Aveek,
  
   I think Ghulam just want to count calls for each intervals
   so the query should looks like this:
  
   select count(*) as total_calls, queue_seconds
   from calls group by queue_seconds order by total_calls;
  
  
   - Original Message -
   From: Aveek Misra 
   To: Ghulam Mustafa , mysql@lists.mysql.com
   Sent: Tuesday, July 27, 2010 5:54:13 PM
   Subject: RE: query results group/summed by interval
  
   try this ...
  
   select 5 * floor(seconds/5) as start, 5

RE: query results group/summed by interval

2010-07-31 Thread Martin Gainty

no that would give you the count for each second interval instead of using the 
interval variable 5 

Aveeks floor:
FLOOR(X) Returns the largest integer value not greater than X. 
 
1st (seconds/5) interval  example
5/5=1
floor(5/5) = 1
supplied value would truncate and give you the int not greater than X
then multiply by 5
1*5=5
is correct
 
Aveeks sum function:
SUM([DISTINCT] expr) 
Returns the sum of expr. If the return set has no rows, SUM() returns NULL. The 
DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct values of 
expr. 
SUM() returns NULL if there were no matching rows. 
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(5/5)
sum(calls) from class group by 5 * 1
sum(calls) from class group by 5
is correct 
 
2nd(seconds/5) interval example
10/5=2
floor(10/5)=2
supplied value would truncate and give you the int not greater than X
then multiply by 5
2*5=10
is correct
 
Aveeks sum function
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(10/5)
sum(calls) from class group by 5 * 2
sum(calls) from class group by 10
would be applicable only if the interval was 10
 
Aveek if your interval is 5 change:
sum(calls) from calls group by 5 * floor(seconds/5)
to
sum(calls) from calls group by floor(seconds/5)
 
Martin Gainty 
__ 
Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité

Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger 
sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung 
oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich dem 
Austausch von Informationen und entfaltet keine rechtliche Bindungswirkung. 
Aufgrund der leichten Manipulierbarkeit von E-Mails koennen wir keine Haftung 
fuer den Inhalt uebernehmen.
Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le 
destinataire prévu, nous te demandons avec bonté que pour satisfaire informez 
l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci est 
interdite. Ce message sert à l'information seulement et n'aura pas n'importe 
quel effet légalement obligatoire. Étant donné que les email peuvent facilement 
être sujets à la manipulation, nous ne pouvons accepter aucune responsabilité 
pour le contenu fourni.



 

 Date: Sat, 31 Jul 2010 10:31:43 +0700
 From: cuong.m...@vienthongso.com
 To: ave...@yahoo-inc.com
 CC: mustafa...@gmail.com; mysql@lists.mysql.com
 Subject: Re: query results group/summed by interval
 
 Hi Aveek,
 
 I think Ghulam just want to count calls for each intervals
 so the query should looks like this:
 
 select count(*) as total_calls, queue_seconds
 from calls group by queue_seconds order by total_calls;
 
 
 - Original Message -
 From: Aveek Misra ave...@yahoo-inc.com
 To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
 Sent: Tuesday, July 27, 2010 5:54:13 PM
 Subject: RE: query results group/summed by interval
 
 try this ...
 
 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
 sum(calls) from calls group by 5 * floor(seconds/5);
 
 This should give you an output of the type
 
 +---+--++
 | start | end | sum(calls) |
 +---+--++
 | 0 | 5 | 387 |
 | 5 | 10 | 225 |
 | 10 | 15 | 74 |
 +---+--++
 
 
 Thanks
 Aveek
 
 From: Ghulam Mustafa [mustafa...@gmail.com]
 Sent: Tuesday, July 27, 2010 3:53 PM
 To: mysql@lists.mysql.com
 Subject: query results group/summed by interval
 
 Hi everyone,
 
 i have two columns (seconds, number of calls), i need to produce a
 report which will show total number of calls in intervals (let'say 10
 seconds interval), i know i can do this programmability in my script but
 i was wondering if it's possible to accomplish this behavior within
 mysql. for example i have following data.
 
 +--+---+
 | calls | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 28 | 1.00 |
 | 30 | 2.00 |
 | 56 | 3.00 |
 | 23 | 4.00 |
 | 31 | 5.00 |
 | 33 | 6.00 |
 | 50 | 7.00 |
 | 49 | 8.00 |
 | 62 | 9.00 |
 | 74 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 now result should look like this with a 5 seconds interval.
 
 +--+---+
 | count(*) | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 168 | 5.00 |
 | 268 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 i would really appreciate your help.
 
 Best Regards.
 
 --
 Ghulam Mustafa
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com
 
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=cuong.m...@vienthongso.com
 
 
 -- 
 Best Regards,
 Cuongmc

RE: query results group/summed by interval

2010-07-31 Thread Martin Gainty

no that would give you the count for each second interval instead of using the 
interval variable 5 

Aveeks floor:
FLOOR(X) Returns the largest integer value not greater than X. 
 
1st (seconds/5) interval  example
5/5=1
floor(5/5) = 1
supplied value would truncate and give you the int not greater than X
then multiply by 5
1*5=5
is correct
 
Aveeks sum function:
SUM([DISTINCT] expr) 
Returns the sum of expr. If the return set has no rows, SUM() returns NULL. The 
DISTINCT keyword can be used in MySQL 5.0 to sum only the distinct values of 
expr. 
SUM() returns NULL if there were no matching rows. 
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(5/5)
sum(calls) from class group by 5 * 1
sum(calls) from class group by 5
is correct 
 
2nd(seconds/5) interval example
10/5=2
floor(10/5)=2
supplied value would truncate and give you the int not greater than X
then multiply by 5
2*5=10
is correct
 
Aveeks sum function
sum(calls) from calls group by 5 * floor(seconds/5)
sum(calls) from calls group by 5 * floor(10/5)
sum(calls) from class group by 5 * 2
sum(calls) from class group by 10
would be applicable only if the interval was 10
 
Aveek if your interval is 5 change:
sum(calls) from calls group by 5 * floor(seconds/5)
to
sum(calls) from calls group by floor(seconds/5)
 
Martin Gainty 
__ 
Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité

Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger 
sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung 
oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich dem 
Austausch von Informationen und entfaltet keine rechtliche Bindungswirkung. 
Aufgrund der leichten Manipulierbarkeit von E-Mails koennen wir keine Haftung 
fuer den Inhalt uebernehmen.
Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le 
destinataire prévu, nous te demandons avec bonté que pour satisfaire informez 
l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci est 
interdite. Ce message sert à l'information seulement et n'aura pas n'importe 
quel effet légalement obligatoire. Étant donné que les email peuvent facilement 
être sujets à la manipulation, nous ne pouvons accepter aucune responsabilité 
pour le contenu fourni.



  

 Date: Sat, 31 Jul 2010 10:31:43 +0700
 From: cuong.m...@vienthongso.com
 To: ave...@yahoo-inc.com
 CC: mustafa...@gmail.com; mysql@lists.mysql.com
 Subject: Re: query results group/summed by interval
 
 Hi Aveek,
 
 I think Ghulam just want to count calls for each intervals
 so the query should looks like this:
 
 select count(*) as total_calls, queue_seconds
 from calls group by queue_seconds order by total_calls;
 
 
 - Original Message -
 From: Aveek Misra ave...@yahoo-inc.com
 To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
 Sent: Tuesday, July 27, 2010 5:54:13 PM
 Subject: RE: query results group/summed by interval
 
 try this ...
 
 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
 sum(calls) from calls group by 5 * floor(seconds/5);
 
 This should give you an output of the type
 
 +---+--++
 | start | end | sum(calls) |
 +---+--++
 | 0 | 5 | 387 |
 | 5 | 10 | 225 |
 | 10 | 15 | 74 |
 +---+--++
 
 
 Thanks
 Aveek
 
 From: Ghulam Mustafa [mustafa...@gmail.com]
 Sent: Tuesday, July 27, 2010 3:53 PM
 To: mysql@lists.mysql.com
 Subject: query results group/summed by interval
 
 Hi everyone,
 
 i have two columns (seconds, number of calls), i need to produce a
 report which will show total number of calls in intervals (let'say 10
 seconds interval), i know i can do this programmability in my script but
 i was wondering if it's possible to accomplish this behavior within
 mysql. for example i have following data.
 
 +--+---+
 | calls | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 28 | 1.00 |
 | 30 | 2.00 |
 | 56 | 3.00 |
 | 23 | 4.00 |
 | 31 | 5.00 |
 | 33 | 6.00 |
 | 50 | 7.00 |
 | 49 | 8.00 |
 | 62 | 9.00 |
 | 74 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 now result should look like this with a 5 seconds interval.
 
 +--+---+
 | count(*) | queue_seconds |
 +--+---+
 | 250 | 0.00 |
 | 168 | 5.00 |
 | 268 | 10.00 |
 ...
 ... and so on...
 ...
 +--+---+
 
 i would really appreciate your help.
 
 Best Regards.
 
 --
 Ghulam Mustafa
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com
 
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=cuong.m...@vienthongso.com
 
 
 -- 
 Best Regards,
 Cuongmc

Re: query results group/summed by interval

2010-07-30 Thread Nguyen Manh Cuong
Hi Aveek,

I think Ghulam just want to count calls for each intervals
so the query should looks like this:

select count(*) as total_calls, queue_seconds
from calls group by queue_seconds order by total_calls;


- Original Message -
From: Aveek Misra ave...@yahoo-inc.com
To: Ghulam Mustafa mustafa...@gmail.com, mysql@lists.mysql.com
Sent: Tuesday, July 27, 2010 5:54:13 PM
Subject: RE: query results group/summed by interval

try this ...

 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
sum(calls) from calls group by 5 * floor(seconds/5);

This should give you an output of the type

+---+--++
| start | end  | sum(calls) |
+---+--++
| 0 |5 |387 |
| 5 |   10 |225 |
|10 |   15 | 74 |
+---+--++


Thanks
Aveek

From: Ghulam Mustafa [mustafa...@gmail.com]
Sent: Tuesday, July 27, 2010 3:53 PM
To: mysql@lists.mysql.com
Subject: query results group/summed by interval

Hi everyone,

i have two columns (seconds, number of calls), i need to produce a
report which will show total number of calls in intervals (let'say 10
seconds interval), i know i can do this programmability in my script but
i was wondering if it's possible to accomplish this behavior within
mysql. for example i have following data.

+--+---+
|  calls   | queue_seconds |
+--+---+
|  250 |  0.00 |
|   28 |  1.00 |
|   30 |  2.00 |
|   56 |  3.00 |
|   23 |  4.00 |
|   31 |  5.00 |
|   33 |  6.00 |
|   50 |  7.00 |
|   49 |  8.00 |
|   62 |  9.00 |
|   74 | 10.00 |
...
... and so on...
...
+--+---+

now result should look like this with a 5 seconds interval.

+--+---+
| count(*) | queue_seconds |
+--+---+
|  250 |  0.00 |
|  168 |  5.00 |
|  268 | 10.00 |
...
... and so on...
...
+--+---+

i would really appreciate your help.

Best Regards.

--
Ghulam Mustafa

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=cuong.m...@vienthongso.com


-- 
Best Regards,
Cuongmc.

-- 
Nguyen Manh Cuong
Phong Ky Thuat - Cong ty Vien Thong So - VTC
Dien thoai: 0912051542
Gmail : philipscu...@gmail.com
YahooMail : philipscu...@yahoo.com

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



RE: query results group/summed by interval

2010-07-27 Thread Aveek Misra
try this ...

 select 5 * floor(seconds/5) as start, 5 * floor(seconds/5) + 5 as end, 
sum(calls) from calls group by 5 * floor(seconds/5);

This should give you an output of the type

+---+--++
| start | end  | sum(calls) |
+---+--++
| 0 |5 |387 |
| 5 |   10 |225 |
|10 |   15 | 74 |
+---+--++


Thanks
Aveek

From: Ghulam Mustafa [mustafa...@gmail.com]
Sent: Tuesday, July 27, 2010 3:53 PM
To: mysql@lists.mysql.com
Subject: query results group/summed by interval

Hi everyone,

i have two columns (seconds, number of calls), i need to produce a
report which will show total number of calls in intervals (let'say 10
seconds interval), i know i can do this programmability in my script but
i was wondering if it's possible to accomplish this behavior within
mysql. for example i have following data.

+--+---+
|  calls   | queue_seconds |
+--+---+
|  250 |  0.00 |
|   28 |  1.00 |
|   30 |  2.00 |
|   56 |  3.00 |
|   23 |  4.00 |
|   31 |  5.00 |
|   33 |  6.00 |
|   50 |  7.00 |
|   49 |  8.00 |
|   62 |  9.00 |
|   74 | 10.00 |
...
... and so on...
...
+--+---+

now result should look like this with a 5 seconds interval.

+--+---+
| count(*) | queue_seconds |
+--+---+
|  250 |  0.00 |
|  168 |  5.00 |
|  268 | 10.00 |
...
... and so on...
...
+--+---+

i would really appreciate your help.

Best Regards.

--
Ghulam Mustafa

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=ave...@yahoo-inc.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



RE: Query with group by

2004-09-21 Thread Jose Miguel Pérez
Hi Michael! 

Talking about the query with group by issue...

 I'll explain my reasoning below.
[...]
  From the manual, section 7.2.8 How MySQL Optimizes LEFT 
 JOIN and RIGHT 
 JOIN http://dev.mysql.com/doc/mysql/en/LEFT_JOIN_optimization.html:
 
A LEFT JOIN B join_condition is implemented in MySQL as follows:
...
* The LEFT JOIN condition is used to decide how to 
 retrieve rows from table B. (In other words, any condition in the
 WHERE clause is not used.)
...
 
 So, the WHERE c2.id IS NULL cannot be applied until after 
 the rows which match the ON clause (and the NULL rows) have been
 fetched.

Thanks for your explanations here, it's clear now I was confused. I
missed this LEFT JOIN behaviour. I was in the idea the WHERE xx IS NULL was
taken into account.

[...]
 You are certainly right that temporary and filesort are 
 to be avoided. And they will be, if the table is properly indexed. Single 
 column indexing won't help much here, because the WHERE condition,
 the GROUP BY column, and the MAX column are all different.  A multi-column
 index on (content, location, date), however, will allow mysql to use the
 index to find the matching rows, find the groups, and calculate the MAX
date.

I still refuse to use the temporary table solution, call me fussy
here. Indeed, I think giving more indexes than necessary is a bit redundant,
unless completely necessary. (You will end up having more space on indexes
than data itself). It's a handle with care issue for me.

 Anyway, I don't know if one can program an agregate UDF 
  called something like EXTERNAL_MAX(...) or something, so that we
  could do like:
 
 SELECT EXTERNAL_MAX(date, version)  --- i.e: 
  Returns the version value for the row with MAX(date).

 This, for sure, will be the best solution. ;-)
 
 That would have to do the same thing behind the scenes.

I have to beg you pardon here. ;-) Think again this solution doesn't
require a JOIN, nor a temporary table. I think the UDF solution doesn't have
to be less efficient than a MAX or AVG aggregate function by itself. I'm in
the process of creating such a monster :-) Let me know if you are curious
about and have time to test it, I will test MAX() and EXTERNAL_MAX() against
a very large table.

Cheers,
Jose Miguel.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Query with group by

2004-09-17 Thread Rhino
Here is the output I got when I ran my query - and yours, Michael - against
DB2 V7.2. Please note that I replaced the 'temp' table in Michael's query
with the real table in both the outer query and the subquery; no other
changes were made.

--
create table versions
(id smallint not null,
date date not null,
content char(5) not null,
location char(10) not null,
version smallint not null,
primary key(id))
DB2I  The SQL command completed successfully.

insert into versions values
(1, '2004-09-14', 'ALPHA', 'PARIS', 10),
(2, '2004-09-15', 'ALPHA', 'PARIS', 11),
(3, '2004-09-16', 'ALPHA', 'PARIS', 10),
(4, '2004-09-14', 'ALPHA', 'NEW-YORK', 11),
(5, '2004-09-15', 'ALPHA', 'NEW-YORK', 11),
(6, '2004-09-16', 'ALPHA', 'NEW-YORK', 10),
(7, '2004-09-14', 'ALPHA', 'TOKYO', 10),
(8, '2004-09-15', 'ALPHA', 'TOKYO', 11),
(9, '2004-09-16', 'BETA', 'TOKYO', 10)
DB2I  The SQL command completed successfully.

select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location)

CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   NEW-YORK10 09/16/2004
ALPHA   PARIS   10 09/16/2004
ALPHA   TOKYO   11 09/15/2004

  3 record(s) selected.


SELECT content, location, version, date
FROM versions t1
WHERE date=(SELECT MAX(t2.date)
FROM versions t2
WHERE t1.location = t2.location AND t1.content = t2.content)
AND content = 'ALPHA'

CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   PARIS   10 09/16/2004
ALPHA   NEW-YORK10 09/16/2004
ALPHA   TOKYO   11 09/15/2004

  3 record(s) selected.
--

As you can see, both queries worked and produced the same result in DB2,
aside from the row sequence, which is easily fixable via an Order By.

I'm at a loss to explain why my query didn't work in MySQL V4.1.4.

Rhino


- Original Message - 
From: Michael Stassen [EMAIL PROTECTED]
To: Rhino [EMAIL PROTECTED]
Cc: Jose Miguel Pérez [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Friday, September 17, 2004 12:42 AM
Subject: Re: Query with group by



 Rhino wrote:
  I agree that Michael's solution with the temporary tables is the best I
have
  seen so far.

 I can't take much credit.  It's just an adaptation of the solution in the
 manual.

  I am running MySQL 4.0.15 so I don't have any subquery capability. As a
  result, I went to DB2 to come up with one solution that does give the
right
  answer via a subquery:
 
  select content, location, version, date
  from versions
  where concat(location, char(date)) in
  (select concat(location, char(max(date)))
  from versions
  where content = 'ALPHA'
  group by location);

 Really?  That didn't work for me in mysql 4.1.4a-gamma.  I got

 +-+--+-++
 | content | location | version | date   |
 +-+--+-++
 | ALPHA   | PARIS|  10 | 2004-09-14 |
 | ALPHA   | PARIS|  11 | 2004-09-15 |
 | ALPHA   | PARIS|  10 | 2004-09-16 |
 | ALPHA   | NEW-YORK |  11 | 2004-09-14 |
 | ALPHA   | NEW-YORK |  11 | 2004-09-15 |
 | ALPHA   | NEW-YORK |  10 | 2004-09-16 |
 | ALPHA   | TOKYO|  10 | 2004-09-14 |
 | ALPHA   | TOKYO|  11 | 2004-09-15 |
 | BETA| TOKYO|  10 | 2004-09-16 |
 +-+--+-++
 9 rows in set (0.31 sec)


  I don't know if this will work in MySQL 4.1.x though. You may want to
give
  it a try if you have 4.1.x. By the way, I'm not convinced that this is
the
  *best* solution using a subquery; it's just the first one I could think
of.
  I don't have all day to spend on this ;-)

 The following query works for me.

SELECT content, location, version, date
FROM temp t1
WHERE date=(SELECT MAX(t2.date)
FROM temp t2
WHERE t1.location = t2.location

AND t1.content = t2.content)
AND content = 'ALPHA';


 +-+--+-++
 | content | location | version | date   |
 +-+--+-++
 | ALPHA   | PARIS|  10 | 2004-09-16 |
 | ALPHA   | NEW-YORK |  10 | 2004-09-16 |
 | ALPHA   | TOKYO|  11 | 2004-09-15 |
 +-+--+-++
 3 rows in set (0.01 sec)

 Again, that's an adaptation of the subquery version of the solution in the
 manual.

 Michael



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Query with group by

2004-09-17 Thread Michael Stassen
I took a closer look and found the problem -- char(date).  Is char() 
necessary in DB2 to cast the date as a string?  In MySQL, char() expects a 
list of integers to interpret as a list of character codes 
http://dev.mysql.com/doc/mysql/en/String_functions.html.  Removing char() 
makes the query appear to work.

I say appear because you forgot the same thing I did.  We need to guard 
against the possibility that a row with different content will match the 
location and date.  So, your query becomes

  SELECT content, location, version, date
  FROM temp
  WHERE CONCAT(location, date) IN
  (SELECT CONCAT(location, MAX(date))
  FROM temp
  WHERE content = 'ALPHA'
  GROUP BY location)
  AND content = 'ALPHA';
which works for me in 4.1.4a.
That said, it seems to me that this is a version of the MAX-CONCAT trick 
written as a subquery.  With the columns to be compared inside CONCAT() 
functions, there is no way to use an index to match up the rows.  My 
expectation is that this will be relatively inefficient compared to the 
other subquery solution.

Michael
Rhino wrote:
Here is the output I got when I ran my query - and yours, Michael - against
DB2 V7.2. Please note that I replaced the 'temp' table in Michael's query
with the real table in both the outer query and the subquery; no other
changes were made.
--
create table versions
(id smallint not null,
date date not null,
content char(5) not null,
location char(10) not null,
version smallint not null,
primary key(id))
DB2I  The SQL command completed successfully.
insert into versions values
(1, '2004-09-14', 'ALPHA', 'PARIS', 10),
(2, '2004-09-15', 'ALPHA', 'PARIS', 11),
(3, '2004-09-16', 'ALPHA', 'PARIS', 10),
(4, '2004-09-14', 'ALPHA', 'NEW-YORK', 11),
(5, '2004-09-15', 'ALPHA', 'NEW-YORK', 11),
(6, '2004-09-16', 'ALPHA', 'NEW-YORK', 10),
(7, '2004-09-14', 'ALPHA', 'TOKYO', 10),
(8, '2004-09-15', 'ALPHA', 'TOKYO', 11),
(9, '2004-09-16', 'BETA', 'TOKYO', 10)
DB2I  The SQL command completed successfully.
select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location)
CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   NEW-YORK10 09/16/2004
ALPHA   PARIS   10 09/16/2004
ALPHA   TOKYO   11 09/15/2004
  3 record(s) selected.
SELECT content, location, version, date
FROM versions t1
WHERE date=(SELECT MAX(t2.date)
FROM versions t2
WHERE t1.location = t2.location AND t1.content = t2.content)
AND content = 'ALPHA'
CONTENT LOCATION   VERSION DATE
--- -- --- --
ALPHA   PARIS   10 09/16/2004
ALPHA   NEW-YORK10 09/16/2004
ALPHA   TOKYO   11 09/15/2004
  3 record(s) selected.
--
As you can see, both queries worked and produced the same result in DB2,
aside from the row sequence, which is easily fixable via an Order By.
I'm at a loss to explain why my query didn't work in MySQL V4.1.4.
Rhino

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Query with group by

2004-09-17 Thread Rhino
From the SQL Reference for DB2:

---
-CHAR--(--datetime-expression+---+--)
   '-,--+-ISO---+--'
+-USA---+
+-EUR---+
+-JIS---+
'-LOCAL-'


Datetime to Character

  datetime-expression
  An expression that is one of the following three data types

date
The result is the character string representation of the date in the
format specified by the second argument. The length of the result is 10. An
error occurs if the second argument is specified and is not a valid value
(SQLSTATE 42703).

time
The result is the character string representation of the time in the
format specified by the second argument. The length of the result is 8. An
error occurs if the second argument is specified and is not a valid value
(SQLSTATE 42703).

timestamp
The second argument is not applicable and must not be specified.
SQLSTATE 42815 The result is the character string representation of the
timestamp. The length of the result is 26.
  The code page of the string is the code page of the database at the
application server.

---

So, yes, char() is converting the date to a string as you suspected.

I'm not suprised that the query wouldn't perform terribly well; that's what
I would have expected given the construction of the query. Like I said, it
was just the first solution I came up with. I would likely have come up with
something more like your solution if I had had more time to try other
solutions :-)

In any case, we have now thought of several solutions to the problem. I hope
Vincent can use at least one of them ;-)

Rhino

- Original Message - 
From: Michael Stassen [EMAIL PROTECTED]
To: Rhino [EMAIL PROTECTED]
Cc: Jose Miguel Pérez [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Friday, September 17, 2004 8:13 PM
Subject: Re: Query with group by


 I took a closer look and found the problem -- char(date).  Is char()
 necessary in DB2 to cast the date as a string?  In MySQL, char() expects a
 list of integers to interpret as a list of character codes
 http://dev.mysql.com/doc/mysql/en/String_functions.html.  Removing
char()
 makes the query appear to work.

 I say appear because you forgot the same thing I did.  We need to guard
 against the possibility that a row with different content will match the
 location and date.  So, your query becomes

SELECT content, location, version, date
FROM temp
WHERE CONCAT(location, date) IN
(SELECT CONCAT(location, MAX(date))
FROM temp
WHERE content = 'ALPHA'
GROUP BY location)
AND content = 'ALPHA';

 which works for me in 4.1.4a.

 That said, it seems to me that this is a version of the MAX-CONCAT trick
 written as a subquery.  With the columns to be compared inside CONCAT()
 functions, there is no way to use an index to match up the rows.  My
 expectation is that this will be relatively inefficient compared to the
 other subquery solution.

 Michael

 Rhino wrote:

  Here is the output I got when I ran my query - and yours, Michael -
against
  DB2 V7.2. Please note that I replaced the 'temp' table in Michael's
query
  with the real table in both the outer query and the subquery; no other
  changes were made.
 
  --
  create table versions
  (id smallint not null,
  date date not null,
  content char(5) not null,
  location char(10) not null,
  version smallint not null,
  primary key(id))
  DB2I  The SQL command completed successfully.
 
  insert into versions values
  (1, '2004-09-14', 'ALPHA', 'PARIS', 10),
  (2, '2004-09-15', 'ALPHA', 'PARIS', 11),
  (3, '2004-09-16', 'ALPHA', 'PARIS', 10),
  (4, '2004-09-14', 'ALPHA', 'NEW-YORK', 11),
  (5, '2004-09-15', 'ALPHA', 'NEW-YORK', 11),
  (6, '2004-09-16', 'ALPHA', 'NEW-YORK', 10),
  (7, '2004-09-14', 'ALPHA', 'TOKYO', 10),
  (8, '2004-09-15', 'ALPHA', 'TOKYO', 11),
  (9, '2004-09-16', 'BETA', 'TOKYO', 10)
  DB2I  The SQL command completed successfully.
 
  select content, location, version, date
  from versions
  where concat(location, char(date)) in
  (select concat(location, char(max(date)))
  from versions
  where content = 'ALPHA'
  group by location)
 
  CONTENT LOCATION   VERSION DATE
  --- -- --- --
  ALPHA   NEW-YORK10 09/16/2004
  ALPHA   PARIS   10 09/16/2004
  ALPHA   TOKYO   11 09/15/2004
 
3 record(s) selected.
 
 
  SELECT content, location, version, date
  FROM versions t1
  WHERE date=(SELECT MAX(t2.date)
  FROM versions t2
  WHERE t1.location = t2.location AND t1.content = t2.content)
  AND content = 'ALPHA'
 
  CONTENT LOCATION   VERSION DATE
  --- -- --- --
  ALPHA   PARIS   10 09/16/2004
  ALPHA   NEW-YORK10 09/16/2004
  ALPHA   TOKYO

Re: Query with group by

2004-09-16 Thread Rhino

- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, September 16, 2004 3:05 AM
Subject: Query with group by


 Hello all,

 i'm trying to built a query that report me the latest tuple for a given
 field.
 I explain myself a bit. Let's say i have a table such this :

 +++-+--+-+
 | id | date   | content | location | version |
 +++-+--+-+
 |  1 | 2004-09-14 | ALPHA   | PARIS|  10 |
 |  2 | 2004-09-15 | ALPHA   | PARIS|  11 |
 |  3 | 2004-09-16 | ALPHA   | PARIS|  10 |
 |  4 | 2004-09-14 | ALPHA   | NEW-YORK |  11 |
 |  5 | 2004-09-15 | ALPHA   | NEW-YORK |  11 |
 |  6 | 2004-09-16 | ALPHA   | NEW-YORK |  10 |
 |  7 | 2004-09-14 | ALPHA   | TOKYO|  10 |
 |  8 | 2004-09-15 | ALPHA   | TOKYO|  11 |
 |  9 | 2004-09-16 | BETA| TOKYO|  10 |
 +++-+--+-+

 Then, i'm trying to get, for ALPHA content, the last (most recent) tuple
 for each location, with their associated version.
 What i should have in the result set :

 ++-+--+-+
 | 2004-09-16 | ALPHA   | PARIS|  10 |
 | 2004-09-16 | ALPHA   | NEW-YORK |  10 |
 | 2004-09-15 | ALPHA   | TOKYO|  11 |
 ++-+--+-+

 I tried with max(date) but i get this :

 mysql select max(date), location, version from temp where content=ALPHA
 group by location;
 ++--+-+
 | max(date)  | location | version |
 ++--+-+
 | 2004-09-16 | NEW-YORK |  11 |
 | 2004-09-16 | PARIS|  10 |
 | 2004-09-15 | TOKYO|  10 |
 ++--+-+

 Because the GROUP BY statement get the first tuple by default?
 distinct(max(date)) do the same result.

 How them can i get the correct result set?
 Note that we cannot guess if the version is increasing or deacreasing.

Which version of MySQL are you running?

I'm having trouble thinking of a solution that doesn't involve a subquery
but subqueries aren't supported until version 4.1.x; I don't want to give
you a subquery if you can't run it.

Rhino



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Query with group by

2004-09-16 Thread Jose Miguel Pérez
Hi Vincent!

 i'm trying to built a query that report me the latest tuple for a given
 field.
 I explain myself a bit. Let's say i have a table such this :

There is no simple solution to your problem. Strictly speaking, it's
forbidden to name a column in the SELECT clause which is NOT on the GROUP BY
clause. (Read chapter: 13.9.3 GROUP BY with Hidden Fields from the MySQL
Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way it's
permitted to include a column which is not in the GROUP BY clause (hidden
columns). The values returned from such a query are unpredictable.

I started saying there is no _simple_ solution. However, you can think a
little bit and your problem goes away. I can think of two solutions to your
problem which neither involves a subquery.

You are lucky in that you know for sure what your last value is (you do
have a date field). You already know the value of using MAX(date) in the
SELECT. You can concatenate the date column with the version, and return a
SUBSTRING from that operation, something like this:

SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1) AS
correct_version,
location, version AS wrong_version
FROM cities
WHERE content='ALPHA'
GROUP BY location

You will get the following result set: (Notice the correct_version
column).

+-+--+---+
| correct_version | location | wrong_version |
+-+--+---+
| 10  | NEW-YORK |11 |
| 10  | PARIS|10 |
| 11  | TOKYO|10 |
+-+--+---+


There is another option, however, and this one is what I like most. In
two words, you can join the table with itself, like this:

SELECT c1.date, c1.location, c1.version
FROM cities c1
LEFT JOIN cities c2
ON c1.location=c2.location AND c1.content=c2.content
AND c2.datec1.date
WHERE c2.id IS NULL AND c1.content = 'ALPHA'

What we have done here is first LEFT JOIN the table (cities in my
example) with itself so that we have in the left part those rows which have
the maximum date and with NULL on the right as the value of c2. We then
select those saying WHERE c2.id IS NULL. The last operation is selecting
the rows for an 'ALPHA' content.

Notice that with this version we also eliminate the GROUP BY, since all
rows returned from the join are unique. The result set returned with the
last example is this:

++--+-+
| date   | location | version |
++--+-+
| 2004-09-16 | PARIS|  10 |
| 2004-09-16 | NEW-YORK |  10 |
| 2004-09-15 | TOKYO|  11 |
++--+-+


Cheers,
Jose Miguel.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Query with group by

2004-09-16 Thread Michael Stassen
Jose Miguel Pérez wrote:
Hi Vincent!
i'm trying to built a query that report me the latest tuple for a given
field.
I explain myself a bit. Let's say i have a table such this :
There is no simple solution to your problem. Strictly speaking, it's
forbidden to name a column in the SELECT clause which is NOT on the GROUP BY
clause. (Read chapter: 13.9.3 GROUP BY with Hidden Fields from the MySQL
Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way it's
permitted to include a column which is not in the GROUP BY clause (hidden
columns). The values returned from such a query are unpredictable.
I started saying there is no _simple_ solution. However, you can think a
little bit and your problem goes away. I can think of two solutions to your
problem which neither involves a subquery.
You are lucky in that you know for sure what your last value is (you do
have a date field). You already know the value of using MAX(date) in the
SELECT. You can concatenate the date column with the version, and return a
SUBSTRING from that operation, something like this:
SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1) AS
correct_version,
location, version AS wrong_version
FROM cities
WHERE content='ALPHA'
GROUP BY location
This is the MAX-CONCAT trick.  It works, but it's inefficient, as it has to 
do a full table scan, with calculations done on each row.  An index on date 
cannot be used in this case, because we are searching for the maximum 
CONCAT(date, ' ', version), rather than the maximum date.

You will get the following result set: (Notice the correct_version
column).
+-+--+---+
| correct_version | location | wrong_version |
+-+--+---+
| 10  | NEW-YORK |11 |
| 10  | PARIS|10 |
| 11  | TOKYO|10 |
+-+--+---+
There is another option, however, and this one is what I like most. In
two words, you can join the table with itself, like this:
SELECT c1.date, c1.location, c1.version
FROM cities c1
LEFT JOIN cities c2
ON c1.location=c2.location AND c1.content=c2.content
AND c2.datec1.date
WHERE c2.id IS NULL AND c1.content = 'ALPHA'
What we have done here is first LEFT JOIN the table (cities in my
example) with itself so that we have in the left part those rows which have
the maximum date and with NULL on the right as the value of c2. We then
select those saying WHERE c2.id IS NULL. The last operation is selecting
the rows for an 'ALPHA' content.
This will work, but it is also somewhat inefficient.  The LEFT JOIN is 
creating numerous extra, unwanted rows, only to throw them away with the 
WHERE c2.id IS NULL.  Assuming n rows for a particular location value, you 
are creating 1 + (n * (n - 1)/2) rows {optimized down to n rows, if id is 
defined NOT NULL} to find the one row you want for that group.  That's no 
big deal for this small sample table, but it may not scale well.

Notice that with this version we also eliminate the GROUP BY, since all
rows returned from the join are unique. The result set returned with the
last example is this:
++--+-+
| date   | location | version |
++--+-+
| 2004-09-16 | PARIS|  10 |
| 2004-09-16 | NEW-YORK |  10 |
| 2004-09-15 | TOKYO|  11 |
++--+-+
The most efficient way is probably to use a temporary table.
  CREATE TEMPORARY TABLE max_dates
  SELECT location, MAX(date) AS max_date
  FROM temp
  WHERE content = 'ALPHA'
  GROUP BY location;
  SELECT t.*
  FROM temp t, max_dates m
  WHERE t.location = m.location
  AND t.date = m.max_date;
  DROP TABLE max_dates;
The manual describes the MAX-CONCAT trick, the temporary table solution, and 
a subquery solution for 4.1+ 
http://dev.mysql.com/doc/mysql/en/example-Maximum-column-group-row.html.

Cheers,
Jose Miguel.
Michael
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: Query with group by

2004-09-16 Thread Rhino
I agree that Michael's solution with the temporary tables is the best I have
seen so far.

I am running MySQL 4.0.15 so I don't have any subquery capability. As a
result, I went to DB2 to come up with one solution that does give the right
answer via a subquery:

select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location);

I don't know if this will work in MySQL 4.1.x though. You may want to give
it a try if you have 4.1.x. By the way, I'm not convinced that this is the
*best* solution using a subquery; it's just the first one I could think of.
I don't have all day to spend on this ;-)

One other thought: is it really necessary to keep track of all of the
different versions you have had at each location? I can't think of a lot of
cases where I'd really care that I had version 8 of the ALPHA in Tokyo last
week or last year. Your query would be simpler - and the volume of data
would be less - if you only kept track of the current version for each
content at each location.

Rhino

- Original Message - 
From: Michael Stassen [EMAIL PROTECTED]
To: Jose Miguel Pérez [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Thursday, September 16, 2004 10:47 AM
Subject: Re: Query with group by



 Jose Miguel Pérez wrote:

  Hi Vincent!
 
 i'm trying to built a query that report me the latest tuple for a given
 field.
 I explain myself a bit. Let's say i have a table such this :
 
  There is no simple solution to your problem. Strictly speaking, it's
  forbidden to name a column in the SELECT clause which is NOT on the
GROUP BY
  clause. (Read chapter: 13.9.3 GROUP BY with Hidden Fields from the
MySQL
  Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way
it's
  permitted to include a column which is not in the GROUP BY clause
(hidden
  columns). The values returned from such a query are unpredictable.
 
  I started saying there is no _simple_ solution. However, you can
think a
  little bit and your problem goes away. I can think of two solutions to
your
  problem which neither involves a subquery.
 
  You are lucky in that you know for sure what your last value is (you
do
  have a date field). You already know the value of using MAX(date) in
the
  SELECT. You can concatenate the date column with the version, and return
a
  SUBSTRING from that operation, something like this:
 
  SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1)
AS
  correct_version,
  location, version AS wrong_version
  FROM cities
  WHERE content='ALPHA'
  GROUP BY location

 This is the MAX-CONCAT trick.  It works, but it's inefficient, as it has
to
 do a full table scan, with calculations done on each row.  An index on
date
 cannot be used in this case, because we are searching for the maximum
 CONCAT(date, ' ', version), rather than the maximum date.

  You will get the following result set: (Notice the correct_version
  column).
 
  +-+--+---+
  | correct_version | location | wrong_version |
  +-+--+---+
  | 10  | NEW-YORK |11 |
  | 10  | PARIS|10 |
  | 11  | TOKYO|10 |
  +-+--+---+
 
  There is another option, however, and this one is what I like most.
In
  two words, you can join the table with itself, like this:
 
  SELECT c1.date, c1.location, c1.version
  FROM cities c1
  LEFT JOIN cities c2
  ON c1.location=c2.location AND c1.content=c2.content
  AND c2.datec1.date
  WHERE c2.id IS NULL AND c1.content = 'ALPHA'
 
  What we have done here is first LEFT JOIN the table (cities in my
  example) with itself so that we have in the left part those rows which
have
  the maximum date and with NULL on the right as the value of c2. We then
  select those saying WHERE c2.id IS NULL. The last operation is
selecting
  the rows for an 'ALPHA' content.

 This will work, but it is also somewhat inefficient.  The LEFT JOIN is
 creating numerous extra, unwanted rows, only to throw them away with the
 WHERE c2.id IS NULL.  Assuming n rows for a particular location value, you
 are creating 1 + (n * (n - 1)/2) rows {optimized down to n rows, if id is
 defined NOT NULL} to find the one row you want for that group.  That's no
 big deal for this small sample table, but it may not scale well.

  Notice that with this version we also eliminate the GROUP BY, since
all
  rows returned from the join are unique. The result set returned with the
  last example is this:
 
  ++--+-+
  | date   | location | version |
  ++--+-+
  | 2004-09-16 | PARIS|  10 |
  | 2004-09-16 | NEW-YORK |  10 |
  | 2004-09-15

Re: Query with group by

2004-09-16 Thread Vincent . Badier

Which version of MySQL are you running?

I'm having trouble thinking of a solution that doesn't involve a subquery
but subqueries aren't supported until version 4.1.x; I don't want to give
you a subquery if you can't run it.

Rhino

I'm running  3.23.43.

--
Vincent



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Query with group by

2004-09-16 Thread Jose Miguel Pérez
Hi Michael!

Yes, you're right, thanks for extending and clarifying my message.
However, I'm not confident about your comments on the inefficiency for the
following query:

  SELECT c1.date, c1.location, c1.version
  FROM cities c1
  LEFT JOIN cities c2
  ON c1.location=c2.location AND c1.content=c2.content
  AND c2.datec1.date
  WHERE c2.id IS NULL AND c1.content = 'ALPHA'
 

 This will work, but it is also somewhat inefficient.  The LEFT JOIN is
 creating numerous extra, unwanted rows, only to throw them away with the
 WHERE c2.id IS NULL.  Assuming n rows for a particular location value, you

[...] and then [...]

 The most efficient way is probably to use a temporary table.

CREATE TEMPORARY TABLE max_dates
SELECT location, MAX(date) AS max_date
FROM temp
WHERE content = 'ALPHA'
GROUP BY location;

SELECT t.*
FROM temp t, max_dates m
WHERE t.location = m.location
AND t.date = m.max_date;

DROP TABLE max_dates;

I don't think the temporary table is such an efficient way of doing
this. Pardon me, I'm provably wrong, but let me explain to see if I think
correctly. First, I assume as true this table have an index on location,
content and date, apart from the PK on ID. Given that, on my query we
are using the keys at full, I mean, although you say the left join is
creating numerous extra, unwanted rows, this is not true. We could apply
the standard algebra here, but the real world query optimizers are smart
enough to not retrieve unwanted data. (What about joining four or more
tables! Multiply then).

Your query is creating a temporary table, doing a full scan of it
(thanks to the MAX(date) function), etc. If you do a EXPLAIN SELECT for your
query, you'll notice there is an Extra of: Using where; Using temporary;
Using FILESORT. Reading the MySQL documentation, one can see If you want
to make your queries as fast as possible, you should look out for Extra
values of Using filesort and Using temporary.. (Chapter 7.2.1 EXPLAIN
Syntax).

If I'm not wrong, maybe the first LEFT JOIN is worse from a mathematical
point of view, but the temporary one may be is the worst from a practical
perspective.

And you'll see I'm very cautious because I'm not such a SQL guru, but
I'd like to know other opinions.

Anyway, I don't know if one can program an agregate UDF called something
like EXTERNAL_MAX(...) or something, so that we could do like:

SELECT EXTERNAL_MAX(date, version)  --- i.e: Returns the version
value for the row with MAX(date).

This, for sure, will be the best solution. ;-)

Cheers,
Jose Miguel.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Query with group by

2004-09-16 Thread Michael Stassen
Rhino wrote:
I agree that Michael's solution with the temporary tables is the best I have
seen so far.
I can't take much credit.  It's just an adaptation of the solution in the 
manual.

I am running MySQL 4.0.15 so I don't have any subquery capability. As a
result, I went to DB2 to come up with one solution that does give the right
answer via a subquery:
select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location);
Really?  That didn't work for me in mysql 4.1.4a-gamma.  I got
+-+--+-++
| content | location | version | date   |
+-+--+-++
| ALPHA   | PARIS|  10 | 2004-09-14 |
| ALPHA   | PARIS|  11 | 2004-09-15 |
| ALPHA   | PARIS|  10 | 2004-09-16 |
| ALPHA   | NEW-YORK |  11 | 2004-09-14 |
| ALPHA   | NEW-YORK |  11 | 2004-09-15 |
| ALPHA   | NEW-YORK |  10 | 2004-09-16 |
| ALPHA   | TOKYO|  10 | 2004-09-14 |
| ALPHA   | TOKYO|  11 | 2004-09-15 |
| BETA| TOKYO|  10 | 2004-09-16 |
+-+--+-++
9 rows in set (0.31 sec)

I don't know if this will work in MySQL 4.1.x though. You may want to give
it a try if you have 4.1.x. By the way, I'm not convinced that this is the
*best* solution using a subquery; it's just the first one I could think of.
I don't have all day to spend on this ;-)
The following query works for me.
  SELECT content, location, version, date
  FROM temp t1
  WHERE date=(SELECT MAX(t2.date)
  FROM temp t2
  WHERE t1.location = t2.location 

  AND t1.content = t2.content)
  AND content = 'ALPHA'; 

+-+--+-++
| content | location | version | date   |
+-+--+-++
| ALPHA   | PARIS|  10 | 2004-09-16 |
| ALPHA   | NEW-YORK |  10 | 2004-09-16 |
| ALPHA   | TOKYO|  11 | 2004-09-15 |
+-+--+-++
3 rows in set (0.01 sec)
Again, that's an adaptation of the subquery version of the solution in the 
manual.

Michael
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: QUERY question (group by) - please help

2003-01-11 Thread Paul DuBois
At 14:41 +0100 1/10/03, Damir Dezeljin wrote:

Hi.

I want to calculate how many rows my query which uses 'GROUP BY' returns.

The query:

SELECT something
FROM test
WHERE (kid=1) OR (kid=2) OR (kid=4)
GROUP BY cid,aid;


Is it posible to get number of rows with such a query from MySQL v3.23.49?


You want both a list of rows, and a count of the number of rows -- with
a single query.  That doesn't make sense, because those two things are
incompatible.


If it isn't posible ... is it posible in MySQL 4.x?


MySQL 4.x won't change the fundamental nature of the incompatibility.
The strategy you use below is reasonable.



I think a lot about this problem and I realize only the following
solution:
CREATE TEMPORARY TABLE t (i INT);
INSERT INTO t (i) SELECT aid FROM the_query_above
SELECT COUNT(*) FROM t;
DROP TABLE t;

But this isn't so elegant.

I want to do so on data generated by:

CREATE TABLE test (
kid INT,
aid INT,
cid INT
);

INSERT INTO test
(kid, aid, cid) VALUES
(  1,   0,   1),
(  2,   2,   2),
(  1,   3,   2),
(  2,   3,   2),
(  4,   4,   2),
(  4,   0,   3),
(  3,   3,   4),
(  4,   3,   4);


Regards,
Dezo



-
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: QUERY question (group by) - please help

2003-01-11 Thread Damir Dezeljin
 
 SELECT something
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)
  GROUP BY cid,aid;
 
 Is it posible to get number of rows with such a query from MySQL v3.23.49?

 You want both a list of rows, and a count of the number of rows -- with
 a single query.  That doesn't make sense, because those two things are
 incompatible.
I make a mistake by not specifing my problem in more detail. I will try to
be more specific:

I'm writing my own search engine for my web site. I want to output for
each result on each page (20 results per page) the no_of_current_result /
no_of_total_results . So with a first query I want to get a count of the
number of rows. Then I will execute a query with LIMIT blahblah to get
results for the current page.

Is it posible to get number of all results without geting all results in a
temporary table and then count them? If yes, how?

Thanks and regards,
Dezo


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: QUERY question (group by) - please help

2003-01-11 Thread Joseph Bueno
Damir Dezeljin wrote:

SELECT something
FROM test
WHERE (kid=1) OR (kid=2) OR (kid=4)
GROUP BY cid,aid;

Is it posible to get number of rows with such a query from MySQL v3.23.49?

You want both a list of rows, and a count of the number of rows -- with
a single query.  That doesn't make sense, because those two things are
incompatible.
 
 I make a mistake by not specifing my problem in more detail. I will try to
 be more specific:
 
 I'm writing my own search engine for my web site. I want to output for
 each result on each page (20 results per page) the no_of_current_result /
 no_of_total_results . So with a first query I want to get a count of the
 number of rows. Then I will execute a query with LIMIT blahblah to get
 results for the current page.
 
 Is it posible to get number of all results without geting all results in a
 temporary table and then count them? If yes, how?

SELECT COUNT(*)
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)
  GROUP BY cid,aid;

Should do it, or am I missing something ?

 
 Thanks and regards,
 Dezo
 
 


Hope this Helps
Joseph Bueno


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: QUERY question (group by) - please help

2003-01-11 Thread Damir Dezeljin
Hi again ;)

 SELECT COUNT(*)
   FROM test
   WHERE (kid=1) OR (kid=2) OR (kid=4)
   GROUP BY cid,aid;
This query returns:
+--+
| COUNT(*) |
+--+
|1 |
|1 |
|2 |
|1 |
|1 |
|1 |
+--+
6 rows in set (0.03 sec)

Inest of this I want to get back '6'  so numbers of rows in resulting
query;

The same query WITHOUT 'GROUP BY' returns:
mysql SELECT COUNT(*) FROM test WHERE (kid=1) OR (kid=2) OR (kid=4);
+--+
| COUNT(*) |
+--+
|7 |
+--+
1 row in set (0.00 sec)

So two rows are grouped because of same cid,did

Any sugestion?

Regards,
Dezo


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: QUERY question (group by) - please help

2003-01-11 Thread Joseph Bueno
Damir Dezeljin wrote:
 Hi again ;)
 
 
SELECT COUNT(*)
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)
  GROUP BY cid,aid;
 
 This query returns:
 +--+
 | COUNT(*) |
 +--+
 |1 |
 |1 |
 |2 |
 |1 |
 |1 |
 |1 |
 +--+
 6 rows in set (0.03 sec)
 
 Inest of this I want to get back '6'  so numbers of rows in resulting
 query;
 
 The same query WITHOUT 'GROUP BY' returns:
 mysql SELECT COUNT(*) FROM test WHERE (kid=1) OR (kid=2) OR (kid=4);
 +--+
 | COUNT(*) |
 +--+
 |7 |
 +--+
 1 row in set (0.00 sec)
 
 So two rows are grouped because of same cid,did
 
 Any sugestion?
 
 Regards,
 Dezo
 
 

SELECT COUNT(DISTINCT cid,aid)
  FROM test
  WHERE (kid=1) OR (kid=2) OR (kid=4)

Should give you the number of groups.

Hope this helps
Joseph Bueno


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php