Revised optimization question

Dave G Thu, 28 Jun 2007 11:39:13 -0700

I asked a question yesterday that was quite onerous, so I have been
working out some of the details today, and would like to re-post my
question under a different light.


I have a table called data__ProcessedDataFrames that can grow to be
gigantic, and I need to speed up the query for pulling the data out.

What I'm trying to avoid is doing multiple queries on that table because
it is so big.  In my query I have a WHERE statement that looks like:
 WHERE P.test_id=tid
 AND payload_time <= ptime

where P is data__ProcessedDataFrames and tid is the test_id I'm looking at
and ptime is the payload_time that I'm looking at.  The problem with it is
I don't always know the exact payload_time, just that I want the
MAX(payload_time) < ptime.  I can't seem to get it right.  If I put the
MAX aggregate on payload_time, it returns the max payload_time but not the
data associated with that max payload_time.  So essentially I'm trying to
force my query to return the row that is associated with this max
payload_time without doing another query on the table to get the exact
payload_time.  Here is the table structure for data__ProcessedDataFrames:

mysql> desc data__ProcessedDataFrames;
+------------------------+------------------+------+-----+---------+----------------+
| Field                  | Type             | Null | Key | Default | Extra
         |
+------------------------+------------------+------+-----+---------+----------------+
| processed_id           | int(10) unsigned | NO   | PRI | NULL    |
auto_increment |
| top_level_product_name | varchar(255)     | YES  | MUL | NULL    |      
         |
| test_id                | int(10) unsigned | YES  | MUL | NULL    |      
         |
| payload_time           | double           | YES  | MUL | NULL    |      
         |
| universal_time         | double           | YES  |     | NULL    |      
         |
| processed_data         | mediumblob       | YES  |     | NULL    |      
         |
+------------------------+------------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)

mysql>


Here is the SP:
CREATE PROCEDURE `getElement`(IN id INT UNSIGNED,
                              IN ptime DOUBLE,
                              IN tid VARCHAR(255),
                              IN exact_time TINYINT)
BEGIN
   IF(ptime < 1) THEN
      SELECT MAX(ROUND(payload_time,6)) FROM data__ProcessedDataFrames
WHERE test_id=tid INTO ptime;
   END IF;
   SELECT
      J.product_id,
      P.processed_id,
      MAX(ROUND(P.payload_time,6)) as payload_time,
      P.top_level_product_name,
      SUBSTR(
         BINARY(processed_data),
         FLOOR(J.product_offset/8)+1,
         CEIL(J.product_length/8)) as substring,
         (SELECT HEX(substring)) as raw_data,
         (SELECT toString(
            substring,
            round(char_length(raw_data)/2,0),
            data_type,
            (SELECT attribute_value FROM
               config__DataProductAttributes
                  WHERE attribute_name='FormatString'
                  AND config__DataProductAttributes.product_id=
J.product_id),
            product_offset % 8,
            (product_length+(product_offset % 8)) % 8,
            product_length,
            byte_order,
            (SELECT attribute_value FROM
               config__DataProductAttributes
                  WHERE attribute_name = 'ConvParams'
                  AND config__DataProductAttributes.product_id =
J.product_id))) as converted_data,
         (SELECT enum_name FROM
            config__DataProductEnumConversions
               WHERE product_id=J.product_id
               AND enum_value =converted_data) as enumerated_data,
         (SELECT metricTest(converted_data,
            (SELECT xmlTestMetric FROM
               test__TestMetrics
                  WHERE product_id = J.product_id))) as test_metric
      FROM data__ProcessedDataFrames
               P INNER JOIN
                  (SELECT E.product_id,
                          top_level_product_name,
                          product_length,
                          product_offset,
                          data_type,
                          byte_order
                    FROM display__DataProducts_in_Element
                          E INNER JOIN config__DataProducts D ON
E.product_id=D.product_id
                          WHERE E.element_id=id) J
               ON P.top_level_product_name=J.top_level_product_name
                   WHERE P.test_id=tid
                   AND payload_time <= ptime
                   GROUP BY J.product_id ;
END;;


You can see here at the end I did a GROUP BY on J.product_id (there can be
multible product_ids) and one of the return values is:
MAX(ROUND(P.payload_time,6)) as payload_time,

Well this max does not force the processed_data from that row to be
returned, I get processed_data from another row because of the GROUP BY

Any suggestions?

David Godsey


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Revised optimization question

Reply via email to