Re: [GENERAL] histogram
hi, group by 1 means group by first output column order by 2 means order by second output column ascending Am 30.04.2011 19:00, schrieb Joel Reymont: What is the meaning of group by 1 order by 2 e.g. what to the numbers 1 and 2 stand for? What would change if I do the following? group by 1 order by 1 On Apr 30, 2011, at 5:48 PM, Thomas Markus wrote: Hi, try something like this: select trunc(random() * 10.)/10. , count(*) from generate_series(1,200) group by 1 order by 2 -- - for hire: mac osx device driver ninja, kernel extensions and usb drivers -++--- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont -++--- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] histogram
David Johnston wrote: Given that you are actively implementing the code that uses the 1 and 2 I don't see how it is that egregious. When generating calculated fields it is cleaner than the alternative: Select trunc(distance * 10.)/10., count(*) From doc_ads Group by (trunc(distance * 10.)) Order by (trunc(distance * 10.)) It would be nice if you could do: Select trunc(distance * 10.)/10. AS bin, count(*) AS frequency From doc_ads Group by bin Order by bin But I do not believe that is allowed (though I may have my syntax wrong...) David J. re: 1 and 2. They're horrible (imho) reference to the attributes of the returned tuple. Or at best an exposure of the implementation. :) Joel Reymont wrote: I think this should do what I want select trunc(distance * 10.)/10., count(*) from doc_ads group by 1 order by 1 Thanks, Joel I think we're supposed to bottom-post here. I agree in the case of generated columns and old servers but you see the practice more commonly than really necessary. But in 8.4 at least select trunc(distance * 10.0 )/10.0 as histo, count(*) as tally from d group by histo order by tally; works just fine for me -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] histogram
Given that you are actively implementing the code that uses the 1 and 2 I don't see how it is that egregious. When generating calculated fields it is cleaner than the alternative: Select trunc(distance * 10.)/10., count(*) >From doc_ads Group by (trunc(distance * 10.)) Order by (trunc(distance * 10.)) It would be nice if you could do: Select trunc(distance * 10.)/10. AS bin, count(*) AS frequency >From doc_ads Group by bin Order by bin But I do not believe that is allowed (though I may have my syntax wrong...) David J. >> re: 1 and 2. They're horrible (imho) reference to the attributes of the returned tuple. Or at best an exposure of the implementation. :) >>Joel Reymont wrote: >>> I think this should do what I want >>> >>> select trunc(distance * 10.)/10., count(*) >>> from doc_ads >>> group by 1 order by 1 >>> >>> Thanks, Joel -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] histogram
re: 1 and 2. They're horrible (imho) reference to the attributes of the returned tuple. Or at best an exposure of the implementation. :) Order by "2" if you want the most frequent (highest counts) of your distances at the bottom of the output (or ordery by 2 desc) if you want them at the top of your output. Joel Reymont wrote: I think this should do what I want select trunc(distance * 10.)/10., count(*) from doc_ads group by 1 order by 1 Thanks, Joel -- - for hire: mac osx device driver ninja, kernel extensions and usb drivers -++--- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont -++--- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] histogram
I think this should do what I want select trunc(distance * 10.)/10., count(*) from doc_ads group by 1 order by 1 Thanks, Joel -- - for hire: mac osx device driver ninja, kernel extensions and usb drivers -++--- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont -++--- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] histogram
What is the meaning of group by 1 order by 2 e.g. what to the numbers 1 and 2 stand for? What would change if I do the following? group by 1 order by 1 On Apr 30, 2011, at 5:48 PM, Thomas Markus wrote: > Hi, > > try something like this: > > select >trunc(random() * 10.)/10. >, count(*) > from >generate_series(1,200) > group by 1 order by 2 -- - for hire: mac osx device driver ninja, kernel extensions and usb drivers -++--- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont -++--- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] histogram
Thank you Thomas! Is there a way for the code below to determine the number of rows in the table and use it? Thanks, Joel On Apr 30, 2011, at 5:48 PM, Thomas Markus wrote: > Hi, > > try something like this: > > select >trunc(random() * 10.)/10. >, count(*) > from >generate_series(1,200) > group by 1 order by 2 -- - for hire: mac osx device driver ninja, kernel extensions and usb drivers -++--- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont -++--- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] histogram
Hi, try something like this: select trunc(random() * 10.)/10. , count(*) from generate_series(1,200) group by 1 order by 2 regards Thomas Am 30.04.2011 18:37, schrieb Joel Reymont: I have a column of 2 million float values from 0 to 1. I would like to figure out how many values fit into buckets spaced by 0.10, e.g. from 0 to 0.10, from 0.10 to 0.20, etc. What is the best way to do this? Thanks, Joel -- - for hire: mac osx device driver ninja, kernel extensions and usb drivers -++--- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont -++--- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] histogram
I have a column of 2 million float values from 0 to 1. I would like to figure out how many values fit into buckets spaced by 0.10, e.g. from 0 to 0.10, from 0.10 to 0.20, etc. What is the best way to do this? Thanks, Joel -- - for hire: mac osx device driver ninja, kernel extensions and usb drivers -++--- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont -++--- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Histogram generator
On Tue, Jul 27, 2010 at 09:25:05PM -0400, Patrick May wrote: > On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote: > > select date_trunc('hour', foo) + interval '30 minutes' * > > floor(extract(minute from foo) / 30) as start, event, count(*) from bar > > group by 1, 2 order by 1 asc; > > Thanks! It looks like interval is what I need to play with. Another useful tool to use is the classic unix "seconds since epoch". You could turn the key expression from above into: timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60) I'd probably go with Steve's version here, it's a bit more obvious what's going on. Also note, that if you don't really care about what the specific groups are, just that you have a set of evenly divided 30minute periods you don't need to convert back to a date, so could just use: floor(date_part('epoch',foo) / (30*60)) One final note, if you're dealing with lots of data and the above expression is slow, you could delay converting back to a date until "after" the grouping, i.e: SELECT timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60) AS t, COUNT(*) FROM data GROUP BY floor(date_part('epoch',foo) / (30*60)); This will save PG from converting back to a date for every row when it's going to chuck most of them away anyway. Hope that gives you some more ideas! -- Sam http://samason.me.uk/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Histogram generator
On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote: > On Jul 27, 2010, at 6:07 PM, Patrick May wrote: >> On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote: I have a table containing events with timestamps. I would like to generate a histogram of the number of each type of event for each half-hour period from 8:00 am to 6:00 pm. Are there any tools that will help me do this? >>> >>> GROUP BY is essentially a histogram generator. >>> The age() function will give you an interval. >>> >>> I am not sure if you want to combine half hour periods from different days >>> with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed >>> to be grouped together or not?) >>> >>> I guess that if you are more specific about exactly what you want you can >>> get a better answer. Your requirement is a little bit vague or ambiguous. >> >> I don't mind using Excel to generate the actual graphic. Ideally I'd >> get output something like this: >> >> datestart endevent count >> -- - >> 2010-07-27 08:00:00 08:29:59 EVENT_120 >> 2010-07-27 08:00:00 08:29:59 EVENT_215 >> 2010-07-27 08:30:00 08:59:59 EVENT_110 >> 2010-07-27 08:30:00 08:59:59 EVENT_3 5 >> >> I'm quite familiar with SQL, but I'm not sure how to generate the half hour >> intervals without hard coding them. > > There's probably a better way, but something like this might work: > > select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute > from foo) / 30) as start, event, count(*) from bar group by 1, 2 order by 1 > asc; Thanks! It looks like interval is what I need to play with. Regards, Patrick smime.p7s Description: S/MIME cryptographic signature
Re: [GENERAL] Histogram generator
On Jul 27, 2010, at 6:07 PM, Patrick May wrote: > On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote: >>> I have a table containing events with timestamps. I would like >>> to generate a histogram of the number of each type of event for each >>> half-hour period from 8:00 am to 6:00 pm. Are there any tools that >>> will help me do this? >> >> GROUP BY is essentially a histogram generator. >> The age() function will give you an interval. >> >> I am not sure if you want to combine half hour periods from different days >> with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to >> be grouped together or not?) >> >> I guess that if you are more specific about exactly what you want you can >> get a better answer. Your requirement is a little bit vague or ambiguous. > > I don't mind using Excel to generate the actual graphic. Ideally I'd > get output something like this: > > datestart endevent count > -- - > 2010-07-27 08:00:00 08:29:59 EVENT_120 > 2010-07-27 08:00:00 08:29:59 EVENT_215 > 2010-07-27 08:30:00 08:59:59 EVENT_110 > 2010-07-27 08:30:00 08:59:59 EVENT_3 5 > > I'm quite familiar with SQL, but I'm not sure how to generate the half hour > intervals without hard coding them. There's probably a better way, but something like this might work: select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*) from bar group by 1, 2 order by 1 asc; Cheers, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Histogram generator
On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote: >> I have a table containing events with timestamps. I would like >> to generate a histogram of the number of each type of event for each >> half-hour period from 8:00 am to 6:00 pm. Are there any tools that >> will help me do this? > > GROUP BY is essentially a histogram generator. > The age() function will give you an interval. > > I am not sure if you want to combine half hour periods from different days > with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to > be grouped together or not?) > > I guess that if you are more specific about exactly what you want you can get > a better answer. Your requirement is a little bit vague or ambiguous. I don't mind using Excel to generate the actual graphic. Ideally I'd get output something like this: datestart endevent count -- - 2010-07-27 08:00:00 08:29:59 EVENT_120 2010-07-27 08:00:00 08:29:59 EVENT_215 2010-07-27 08:30:00 08:59:59 EVENT_110 2010-07-27 08:30:00 08:59:59 EVENT_3 5 I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them. Thanks, Patrick smime.p7s Description: S/MIME cryptographic signature
Re: [GENERAL] Histogram generator
> -Original Message- > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- > ow...@postgresql.org] On Behalf Of Patrick May > Sent: Tuesday, July 27, 2010 3:59 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] Histogram generator > > Hi, > > I have a table containing events with timestamps. I would like > to generate a histogram of the number of each type of event for each > half-hour period from 8:00 am to 6:00 pm. Are there any tools that > will help me do this? GROUP BY is essentially a histogram generator. The age() function will give you an interval. I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to be grouped together or not?) I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is a little bit vague or ambiguous. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Histogram generator
Hi, I have a table containing events with timestamps. I would like to generate a histogram of the number of each type of event for each half-hour period from 8:00 am to 6:00 pm. Are there any tools that will help me do this? Thanks, Patrick smime.p7s Description: S/MIME cryptographic signature