Seena Blace <[EMAIL PROTECTED]> wrote on 05/16/2005 11:13:48 AM:

> shawn
> I think 2nd query will return only 10 rows.
> I want out like followings
>  date                 domain                  spam
>  05/05/05         hotmail.com            120
>  05/05/05         yahoo.com              110
>  05/05/05         abc.com              99
>  05/05/05         def.com              80
>  05/05/05         mnpo.net              79
>  ......................................... like that upto 10
>  --------------------------------------------------
>  05/06/05         yahoo.com              300
>  05/06/05         def.com              250
>  05/06/05         zer.com              200
>  ..................................like that upto 10
> 
> Each day there are multiple entry from diffrent domains or same domain.
> I want each day whatever top 10  spam sender domain.
> thanks
> 
> 
> 
> 
> [EMAIL PROTECTED] wrote:
> Seena Blace wrote on 05/16/2005 10:08:15 AM:
> 
> > Any suggestion pl?
> > 
> > Seena Blace wrote:hi,
> > here is table description
> > report1
> > 
> > +-------------------------+------------------+------+-----
> > +---------------------+----------------+
> > | Field | Type | Null | Key | Default | Extra |
> > +-------------------------+------------------+------+-----
> > +---------------------+----------------+
> > | id | int(10) unsigned | | PRI | NULL | auto_increment |
> > | host_id | int(10) unsigned | | MUL | 0 | |
> > | report_rcpt_domain_id | int(10) unsigned | YES | MUL | NULL | |
> > | report_sender_domain_id | int(10) unsigned | YES | MUL | NULL | |
> > | report_ipaddress_id | int(10) unsigned | YES | MUL | NULL | |
> > | time | datetime | | MUL | 0000-00-00 00:00:00 | |
> > | detected_spam | int(10) unsigned | | | 0 | |
> > | detected_virus | int(10) unsigned | | | 0 | |
> > | processed | int(10) unsigned | | | 0 | |
> > | allowed | int(10) unsigned | | | 0 | |
> > | suspected | int(10) unsigned | | | 0 | |
> > | blocked | int(10) unsigned | | | 0 | |
> > | spam | int(10) unsigned | | | 0 | |
> > | virus | int(10) unsigned | | | 0 | |
> > 
> > 
> > I WANT REPORT LIKE FOLLOWINGS
> > 
> > date sender processed spam suspected
> > 
> > 
> > I want top 10 spam sender each day.
> > 
> > QUery i'm using 
> > select date_format(time,'%Y-%d-%m'),report_sender_domain_id,
> > processed ,spam from report1
> > order by spam desc ,report_sender_domain_id,date_format(time,'%Y-%
> > d-%m') limit 10;
> > 
> > 
> > 
> > Please suggest.
> > thanks
> > 
> It was the weekend. Not everyone lurks on their days off.
> 
> Try this
> select date_format(time,'%Y-%d-%m')
> ,report_sender_domain_id
> ,processed 
> ,spam 
> ,suspected
> from report1
> order by spam desc
> ,report_sender_domain_id
> ,date_format(time,'%Y-%d-%m') 
> limit 10;
> 
> or if there is more than one entry per spammer per day
> 
> SELECT date_format(time,'%Y-%d-%m')
> ,report_sender_domain_id
> ,sum(processed) as processed
> ,sum(spam) as spam 
> ,sum(suspected) as suspected
> FROM report1
> GROUP BY date_format(time,'%Y-%d-%m')
> ,report_sender_domain_id
> ORDER BY spam desc
> ,report_sender_domain_id
> ,date_format(time,'%Y-%d-%m') 
> LIMIT 10;
> 
> That will give you their total stats for each day.
> 
> Shawn Green
> Database Administrator
> Unimin Corporation - Spruce Pine
> 
> ---------------------------------

OK, then what you want to do will take two steps and another table (unless 
you want to write application code) to do with MySQL;

First step is to create a table where the intermediate results can be 
stored. What this table does is number each row for each day. Since it 
appears that you may want this information frequently (but only need to 
update it at the end of every day) I will make this a permanent table (not 
a temporary table).

CREATE TABLE spam_stats (
        report_date date not null,
        report_sender_domain_id int unsigned not null,
        rank int unsigned auto_increment,
        processed int unsigned,
        spam int unsigned,
        suspected int unsigned,
        PRIMARY KEY (report_date, rank),
        UNIQUE KEY (report_sender_domain_id, report_date),
        KEY (rank)
);

The UNIQUE key ensures that the same spammer cannot have more than one 
entry per day while the PRIMARY KEY allows for a groupwize autonumber 
(each entry per day gets it's own number, rank, starting at 1). The last 
key is optional but will seriously speed up the returns for the report you 
wanted. Next we need to populate our new table with the "spam" report data 
(filled in from most spam to least spam per domain per day).

INSERT spam_stats (report_date, report_sender_domain_id, processed, spam, 
suspected)
SELECT date(`time`)
        ,report_sender_domain_id
        ,sum(processed) as processed
        ,sum(spam) as spam 
        ,sum(suspected) as suspected
FROM report1
GROUP BY date(`time`)
        ,report_sender_domain_id 
ORDER BY spam desc
        ,report_sender_domain_id
        ,date(`time`);

What will happen is that the rank column will be automatically filled in 
for each day's spam statistics. Then to get your "top ten" list you just 
run:

SELECT * 
FROM spam_stats
WHERE rank <= 10;

Update spam_stats at the end of each new day and will only need to 
calculate those values once. If you create more tables like spam_stats for 
your other statistics (processed, suspected, blocked, etc.), you will be 
able to do rank-based reporting for those statistics, too ("top 20 sites 
blocked", for example).

If you play around with the general idea of the groupwize numbering 
feature of auto_increment + primary key, I am sure you can solve a lot of 
your reporting issues.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Reply via email to