Re: [vox-tech] MySQL question: making stats out of orders
What if you just use RRDtool?  http://oss.oetiker.ch/rrdtool/

On Tue, Dec 01, 2009 at 05:32:20PM -0800, Bill Kendrick wrote:
> Ok, so I've got a few queries our website does that are used to
> determine which products are 'top sellers' in the past 90 days' worth
> of sales.
>
> [...]
>
> Anyone feel like flexing their MySQL muscles and provide some ideas on
> how to populate a similar table (tracking sales) based on the
> orders/orderitem table combo explained above?
>
> Thanks. Sorry for asking you to do my (home)work for me.

--
Brian Lavender
http://www.brie.com/brian/
___
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech
Re: [vox-tech] MySQL question: making stats out of orders
Hrm, trying to frame this as a non-proprietary question in a way that
would still be useful. Ah, how about: what is the ratio of (unique
orderitem.productid values today) to (orders today)?

Assuming it's high, you could make this much easier to deal with by
building a table of:

  (productid, date, count)

Then finding the most popular item in the last 90 days, or today, or
the "hot" item that rose the most in the rankings today, all become
easy queries.

I used this approach for handling a few 100k web hits/day, and it
allowed for much more complex queries. We logged directly to MySQL and
wanted to run lots of complex queries over years of logs; we typically
got our answers in a few seconds or less.
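Once such a per-day stats table exists, the ranking queries mentioned
above really do collapse to a single small-table aggregate. A
hypothetical sketch, using SQLite as a stand-in for MySQL (the
orderstats column names are borrowed from later in this thread; the
data is made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE orderstats (productid INTEGER, date TEXT,
               orders INTEGER, PRIMARY KEY (productid, date))""")

# Made-up per-day sales counts for two products.
con.executemany("INSERT INTO orderstats VALUES (?, ?, ?)",
                [(1, "2009-12-01", 5), (1, "2009-12-02", 1),
                 (2, "2009-12-01", 2), (2, "2009-12-02", 9)])

# Top sellers over a window: one aggregate on a small table, no JOINs
# back to the ever-growing orders/orderitem tables.
top = con.execute("""
SELECT productid, SUM(orders) AS sold
FROM orderstats
WHERE date >= '2009-12-01'
GROUP BY productid
ORDER BY sold DESC
LIMIT 10
""").fetchall()
print(top)
# -> [(2, 11), (1, 6)]
```

The same shape answers "most popular today" (tighten the date filter)
or "biggest mover" (compare today's row against the trailing average).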
Re: [vox-tech] MySQL question: making stats out of orders
On Wed, Dec 02, 2009 at 12:37:19AM -0600, Chanoch (Ken) Bloom wrote:
> Then you have to throw in a conversion function that truncates it to
> the date in order for the `group by` to work correctly. I'm not sure
> what function that is, but you can play around with the functions
> listed at
> http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html

In my original follow-up, I did a "FROM_DAYS(TO_DAYS(...))" trick,
which would probably work.

Thanks!

--
-bill!
Sent from my computer
Re: [vox-tech] MySQL question: making stats out of orders
On Tue, 2009-12-01 at 22:33 -0800, Bill Kendrick wrote:
> On Wed, Dec 02, 2009 at 12:20:25AM -0600, Chanoch (Ken) Bloom wrote:
> > INSERT INTO orderstats (productid, date, orders)
> > SELECT orderitem.productid, orders.timestamp, SUM(orderitem.qty)
> > FROM orderitem
> > JOIN orders ON orders.id = orderitem.orderid
> > WHERE orders.timestamp >= { 90 days ago }
> > GROUP BY orderitem.productid, orders.timestamp
> >
> > assuming orders.timestamp has one-day granularity
>
> Timestamp is actually a timestamp (down to the second).

Then you have to throw in a conversion function that truncates it to
the date in order for the `group by` to work correctly. I'm not sure
what function that is, but you can play around with the functions
listed at
http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html
Re: [vox-tech] MySQL question: making stats out of orders
On Wed, Dec 02, 2009 at 12:20:25AM -0600, Chanoch (Ken) Bloom wrote:
> INSERT INTO orderstats (productid, date, orders)
> SELECT orderitem.productid, orders.timestamp, SUM(orderitem.qty)
> FROM orderitem
> JOIN orders ON orders.id = orderitem.orderid
> WHERE orders.timestamp >= { 90 days ago }
> GROUP BY orderitem.productid, orders.timestamp
>
> assuming orders.timestamp has one-day granularity

Timestamp is actually a timestamp (down to the second).

-bill!
Re: [vox-tech] MySQL question: making stats out of orders
On Tue, 2009-12-01 at 22:14 -0800, Bill Kendrick wrote:
> > It would seem pretty simple to create the analogous table for
> > orders:
> >
> >   orderstats, consisting of productid, date, orders
> >
> > and every time you complete an order, you run the query:
> >
> >   INSERT INTO orderstats (productid, date, orders) VALUES (###, NOW(), QTY)
> >   ON DUPLICATE KEY UPDATE orders = orders + QTY
>
> Yes, that's great for today-on, but the moment I switch to using this
> new way of recording purchases, there won't be any data.
>
> That is, unless I start collecting data for the next 90 days BEFORE
> changing how the "top sellers" are determined.
>
> What I'm looking for is a way to populate a new 'order stats' table
> based on the [last 90 days', or whatever, of] existing orders.

The one-time command to fill the orderstats table for the first time
would be:

  INSERT INTO orderstats (productid, date, orders)
  SELECT orderitem.productid, orders.timestamp, SUM(orderitem.qty)
  FROM orderitem
  JOIN orders ON orders.id = orderitem.orderid
  WHERE orders.timestamp >= { 90 days ago }
  GROUP BY orderitem.productid, orders.timestamp

assuming orders.timestamp has one-day granularity.
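The backfill above, with the date truncation discussed elsewhere in the
thread folded in, can be rehearsed end-to-end. A self-contained sketch
in SQLite (its date() function stands in for MySQL's DATE() or
FROM_DAYS(TO_DAYS(...)); the schema and sample rows are guesses, not
the site's real tables):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders     (id INTEGER PRIMARY KEY, ts TEXT);
CREATE TABLE orderitem  (orderid INTEGER, productid INTEGER, qty INTEGER);
CREATE TABLE orderstats (productid INTEGER, date TEXT, orders INTEGER,
                         PRIMARY KEY (productid, date));
""")

# Two orders on one day plus one the next day, all for product 7.
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, "2009-12-01 09:15:00"),
                 (2, "2009-12-01 17:40:00"),
                 (3, "2009-12-02 08:00:00")])
con.executemany("INSERT INTO orderitem VALUES (?, ?, ?)",
                [(1, 7, 2), (2, 7, 1), (3, 7, 5)])

# One-time backfill: truncate each timestamp to its day, then sum
# quantities per (product, day) -- same-day orders merge into one row.
con.execute("""
INSERT INTO orderstats (productid, date, orders)
SELECT orderitem.productid, date(orders.ts), SUM(orderitem.qty)
FROM orderitem
JOIN orders ON orders.id = orderitem.orderid
GROUP BY orderitem.productid, date(orders.ts)
""")

print(con.execute("SELECT * FROM orderstats ORDER BY date").fetchall())
# -> [(7, '2009-12-01', 3), (7, '2009-12-02', 5)]
```

A real run would also add the `WHERE orders.timestamp >= { 90 days ago }`
filter from Ken's query; it is omitted here only to keep the toy data
small.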
Re: [vox-tech] MySQL question: making stats out of orders
On Tue, Dec 01, 2009 at 08:04:31PM -0600, Chanoch (Ken) Bloom wrote:
> Is this task getting more intensive?

I think so, but it's hard to tell, with other things going on at the
same time.

> If it is (and the number of transactions per day hasn't grown
> significantly),

Site traffic has been increasing steadily.

> then you probably don't have indices set up correctly.

Highly likely. Most of these tables were set up prior to my taking over
maintenance of all this, and any similar ones were no doubt based on
the ones the previous developer designed.

> I think that having an index on orders.timestamp, and an index on
> orderitem.orderid (in addition to the primary keys products.id and
> orders.id), should optimize this query,

Ok, I'll look into that. Thanks!

> but it always pays to determine what's going on with EXPLAIN SELECT.

Good idea.

> If that still doesn't speed things up enough, maybe consider using a
> table with the MERGE storage engine to move old entries from the
> orders table and the orderitem table into archives. Then only query
> the table full of recent order data for most queries, and only query
> the other tables in the merge when you need to.

Products are virtual, so a buyer's access to them is determined by
their orders, so I'm not going to change how all that logic works. :)

> > Now, we've got some statistics tables set up for other purposes --
> > how many times a product was viewed, as well as when any particular
> > logged-in user last visited particular items, which allows us to do
> > the "people who recently looked at X also recently looked at Y".
> >
> > I'm thinking we could do something similar for sales,
>
> I'm not really clear on what you're looking for here. We need a
> better idea of what stats you're running to know how to structure the
> table.
>
> It would seem pretty simple to create the analogous table for orders:
>
>   orderstats, consisting of productid, date, orders
>
> and every time you complete an order, you run the query:
>
>   INSERT INTO orderstats (productid, date, orders) VALUES (###, NOW(), QTY)
>   ON DUPLICATE KEY UPDATE orders = orders + QTY

Yes, that's great for today-on, but the moment I switch to using this
new way of recording purchases, there won't be any data.

That is, unless I start collecting data for the next 90 days BEFORE
changing how the "top sellers" are determined.

What I'm looking for is a way to populate a new 'order stats' table
based on the [last 90 days', or whatever, of] existing orders.

Thanks!

-bill!
Re: [vox-tech] MySQL question: making stats out of orders
On Tue, Dec 01, 2009 at 05:32:20PM -0800, Bill Kendrick wrote:
> Ok, so I've got a few queries our website does that are used to
> determine which products are 'top sellers' in the past 90 days' worth
> of sales.
>
> Right now, this involves joining the product table with the orders
> table, by way of an items-in-the-order table.
>
> In other words, to figure out what sold the best in the last 90 days,
> it's something like:
>
>   SELECT * FROM products
>   JOIN orderitem ON products.id = orderitem.productid
>   JOIN orders ON orders.id = orderitem.orderid
>   WHERE orders.timestamp >= { 90 days ago }
>
> As time goes on, our records of orders, order items, and actual
> products all keep growing. That means this task gets more and more
> intensive.

Is this task getting more intensive? If it is (and the number of
transactions per day hasn't grown significantly), then you probably
don't have indices set up correctly. I think that having an index on
orders.timestamp, and an index on orderitem.orderid (in addition to the
primary keys products.id and orders.id), should optimize this query,
but it always pays to determine what's going on with EXPLAIN SELECT.

If that still doesn't speed things up enough, maybe consider using a
table with the MERGE storage engine to move old entries from the orders
table and the orderitem table into archives. Then only query the table
full of recent order data for most queries, and only query the other
tables in the merge when you need to.

> Now, we've got some statistics tables set up for other purposes --
> how many times a product was viewed, as well as when any particular
> logged-in user last visited particular items, which allows us to do
> the "people who recently looked at X also recently looked at Y".
>
> I'm thinking we could do something similar for sales, to simplify the
> "top sellers" queries. They'd hit a single stats table (which could,
> on a daily basis, be cleared of any >90-days-ago entries), rather
> than having to JOIN two large tables that go back to day 1.
>
> My current problem -- no doubt due to lack of brain power at the end
> of the day -- is how to take the CURRENT data from the orders and
> orderitem tables so that I can do an initial population of a new
> stats table.
>
> Our "views" stats table, for example, is something along the lines
> of:
>
>   productid, date, views
>
> where those first two columns act as a key. (Date is at the 'day'
> level of granularity.)
>
> And we count a view like so:
>
>   INSERT INTO views (productid, date, views) VALUES (###, NOW(), 1)
>   ON DUPLICATE KEY UPDATE views=views+1
>
> In other words, the first time a product is viewed on any given day,
> we insert a new row, otherwise we update its existing row.
>
> Anyone feel like flexing their MySQL muscles and provide some ideas
> on how to populate a similar table (tracking sales) based on the
> orders/orderitem table combo explained above?

I'm not really clear on what you're looking for here. We need a better
idea of what stats you're running to know how to structure the table.

It would seem pretty simple to create the analogous table for orders:

  orderstats, consisting of productid, date, orders

and every time you complete an order, you run the query:

  INSERT INTO orderstats (productid, date, orders) VALUES (###, NOW(), QTY)
  ON DUPLICATE KEY UPDATE orders = orders + QTY

--
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/
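The "check the plan with EXPLAIN SELECT" advice can be rehearsed
outside MySQL as well. A sketch in SQLite, whose EXPLAIN QUERY PLAN
plays the role of MySQL's EXPLAIN here; the schema and index names are
hypothetical, and the planner details will differ from MySQL's:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders    (id INTEGER PRIMARY KEY, ts TEXT);
CREATE TABLE orderitem (orderid INTEGER, productid INTEGER, qty INTEGER);
-- the two indexes suggested above
CREATE INDEX idx_orders_ts        ON orders(ts);
CREATE INDEX idx_orderitem_order  ON orderitem(orderid);
""")

# Ask the planner how it would run the top-sellers join.
plan = con.execute("""
EXPLAIN QUERY PLAN
SELECT orderitem.productid, SUM(orderitem.qty)
FROM orders
JOIN orderitem ON orders.id = orderitem.orderid
WHERE orders.ts >= '2009-09-02'
GROUP BY orderitem.productid
""").fetchall()

# Each row's last column describes one step, e.g. a
# "SEARCH ... USING INDEX ..." lookup rather than a full table SCAN.
for row in plan:
    print(row[3])
```

With MySQL the equivalent is simply `EXPLAIN SELECT ...`; the thing to
look for is that neither large table shows a full scan once the indexes
are in place.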
Re: [vox-tech] MySQL question: making stats out of orders
On Tue, Dec 01, 2009 at 05:32:20PM -0800, Bill Kendrick wrote:
> Anyone feel like flexing their MySQL muscles and provide some ideas
> on how to populate a similar table (tracking sales) based on the
> orders/orderitem table combo explained above?
>
> Thanks. Sorry for asking you to do my (home)work for me.

Maybe as simple as something like:

  SELECT orderitem.productid, SUM(orderitem.qty),
         FROM_DAYS(TO_DAYS(orders.ordertime)) AS day
  FROM orders
  JOIN orderitem ON orders.id = orderitem.orderid
  GROUP BY productid, day

???

-bill!
[vox-tech] MySQL question: making stats out of orders
Ok, so I've got a few queries our website does that are used to
determine which products are 'top sellers' in the past 90 days' worth
of sales.

Right now, this involves joining the product table with the orders
table, by way of an items-in-the-order table.

In other words, to figure out what sold the best in the last 90 days,
it's something like:

  SELECT * FROM products
  JOIN orderitem ON products.id = orderitem.productid
  JOIN orders ON orders.id = orderitem.orderid
  WHERE orders.timestamp >= { 90 days ago }

As time goes on, our records of orders, order items, and actual
products all keep growing. That means this task gets more and more
intensive.

Now, we've got some statistics tables set up for other purposes -- how
many times a product was viewed, as well as when any particular
logged-in user last visited particular items, which allows us to do the
"people who recently looked at X also recently looked at Y".

I'm thinking we could do something similar for sales, to simplify the
"top sellers" queries. They'd hit a single stats table (which could, on
a daily basis, be cleared of any >90-days-ago entries), rather than
having to JOIN two large tables that go back to day 1.

My current problem -- no doubt due to lack of brain power at the end of
the day -- is how to take the CURRENT data from the orders and
orderitem tables so that I can do an initial population of a new stats
table.

Our "views" stats table, for example, is something along the lines of:

  productid, date, views

where those first two columns act as a key. (Date is at the 'day' level
of granularity.)

And we count a view like so:

  INSERT INTO views (productid, date, views) VALUES (###, NOW(), 1)
  ON DUPLICATE KEY UPDATE views=views+1

In other words, the first time a product is viewed on any given day, we
insert a new row, otherwise we update its existing row.
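That insert-or-bump pattern generalizes directly from views to sales.
A sketch of the same idea in SQLite, whose ON CONFLICT ... DO UPDATE
upsert stands in for MySQL's ON DUPLICATE KEY UPDATE (the table and
column names come from the message above; everything else is assumed):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# (productid, date) is the key, exactly as described for the views table.
con.execute("""
CREATE TABLE views (productid INTEGER, date TEXT, views INTEGER,
                    PRIMARY KEY (productid, date))
""")

def count_view(productid):
    # First view of a product on a given day inserts a fresh row;
    # every later view the same day bumps that row's counter.
    con.execute("""
    INSERT INTO views (productid, date, views) VALUES (?, date('now'), 1)
    ON CONFLICT (productid, date) DO UPDATE SET views = views + 1
    """, (productid,))

for _ in range(3):
    count_view(42)

print(con.execute("SELECT views FROM views WHERE productid = 42")
         .fetchone()[0])
# -> 3
```

A per-sale version would differ only in adding the quantity instead of
1, as in the orderstats suggestion later in the thread.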
Anyone feel like flexing their MySQL muscles and provide some ideas on
how to populate a similar table (tracking sales) based on the
orders/orderitem table combo explained above?

Thanks. Sorry for asking you to do my (home)work for me.

--
-bill!
Sent from my computer