RE: NoSQL help

Manivannan S. Thu, 14 Jun 2012 03:39:07 -0700

id

select_type

table

type

possible_keys

key

key_len

ref

rows

Extra

1

SIMPLE

ibf

ALL

ibf_MsgId

\N

\N

\N

160944

1

SIMPLE

pl

ref

idx_unique_key_ib_msg\,index_message_id\,index_message_processing_status

idx_unique_key_ib_msg

180

reports.ibf.Message_Id\,const

1

Using where; Using index

1

SIMPLE

tl

ref

idx_unique_key_ib_text\,index_message_id

idx_unique_key_ib_text

153

reports.pl.Message_Id

1

1

SIMPLE

xl

ref

idx_unique_key_ib_xml\,index_message_id

idx_unique_key_ib_xml

153

reports.pl.Message_Id

1

Using where

                This is the execution plan for 1.5 milion records......

From: Ananda Kumar [mailto:anan...@gmail.com]<mailto:[mailto:anan...@gmail.com]>
Sent: Thursday, June 14, 2012 3:33 PM
To: Manivannan S.
Cc: mysql@lists.mysql.com<mailto:mysql@lists.mysql.com>
Subject: Re: NoSQL help

can u share the sql, explain plan, indexes etc,
show full processlist out put when the sql's are running
On Thu, Jun 14, 2012 at 3:03 PM, Manivannan S. 
<manivanna...@spanservices.com<mailto:manivanna...@spanservices.com>> wrote:
I tried with myisam engine also. But it also taking more time to generate the 
report. In my database I am having 8 innodb tables and at the same time I am 
joining 4 tables to get the report.

I am maintaining 60days records because the user will try to generate the 
report out of 60 days in terms of second, minute, hourly, weekly and Monthly 
report also.

From: Ananda Kumar [mailto:anan...@gmail.com<mailto:anan...@gmail.com>]
Sent: Thursday, June 14, 2012 12:32 AM
To: Rick James
Cc: Johan De Meersman; Manivannan S.; 
mysql@lists.mysql.com<mailto:mysql@lists.mysql.com>
Subject: Re: NoSQL help

Did you try with myisam tables.
They are supposed to be good for reporting requirement
On Wed, Jun 13, 2012 at 11:52 PM, Rick James 
<rja...@yahoo-inc.com<mailto:rja...@yahoo-inc.com><mailto:rja...@yahoo-inc.com<mailto:rja...@yahoo-inc.com>>>
 wrote:
I'll second Johan's comments.

"Count the disk hits!"

One minor change:  Don't store averages in the summary table; instead store the 
SUM().  That lets you get the mathematically correct AVERAGE over any time 
range via
SUM(sum_foo) / SUM(count_foo)

Switching between MySQL and Mongo requires rewriting _all_ of the relevant code.

<opinion> NoSQL will be no better than MySQL for 150GB. </opinion>  "Count the 
disk hits!"

I recently built a system that topped out at 350GB (90 days' data).  It 
involved hourly ingestion of a few GB of data and a variety of "reports".  The 
prototype showed that most reports would take about an hour to run.  Not good.  
The final product, with summary tables, lets the reports be run on-demand and 
online and each takes only a few seconds.  By careful use of MEMORY tables, 
LOAD DATA, etc, the ingestion takes 5 minutes (each hour) for the raw data and 
2 minutes (total) for the 7 summary tables.  PARTITIONing was vital for the 
design.  Once an hour a new partition is populated; once a day, 24 hourly 
partitions are rolled into a new daily partition and the 90-day old partition 
is DROPped.

> -----Original Message-----
> From: Johan De Meersman 
> [mailto:vegiv...@tuxera.be<mailto:vegiv...@tuxera.be><mailto:vegiv...@tuxera.be<mailto:vegiv...@tuxera.be>>]
> Sent: Wednesday, June 13, 2012 6:20 AM
> To: Manivannan S.
> Cc: 
> mysql@lists.mysql.com<mailto:mysql@lists.mysql.com><mailto:mysql@lists.mysql.com<mailto:mysql@lists.mysql.com>>
> Subject: Re: NoSQL help
>
>
> ----- Original Message -----
> > From: "Manivannan S." 
> > <manivanna...@spanservices.com<mailto:manivanna...@spanservices.com><mailto:manivanna...@spanservices.com<mailto:manivanna...@spanservices.com>>>
> >
> > Hi all,
> >
> > [lots of data]
> > [slow reports]
> > [wooo NoSQL magic]
>
> Not that I want to discourage you, but my standard first question is
> "why do you think NoSQL (let alone any specific product) is the right
> solution?" :-)
>
> Don't get me wrong, it might be; but from what little I now know about
> your environment, it sounds like applying some data warehousing
> techniques might suffice - and being the cynical dinosaur that I am, I
> have a healthy reluctance about welding new technology onto a stable
> environment.
>
> To speed up reporting (and note that these techniques are often applied
> even when implementing NoSQL solutions, too) it is usually a good first
> step to set up a process of data summarization.
>
> Basically, you pre-calculate averages, medians, groupings, whatever you
> need for your reports; and your job also saves the last record IDs it's
> processed; then on the next run, you only read the new records and
> update your summary tables to incorporate the new data.
>
> Suppose I have a table like this:
>
> ID | Val
> --------
>  1     1
>  2     7
>  3     5
>  4    13
>
> I want to report the average on a daily basis, and calculating that
> over those rows is unbearably slow because I'm running the process on a
> wristwatch from 1860 :-)
>
> So I get a summary table, calculate (1+7+5+13)/4 = 6.5 and that then
> gets a record saying this:
>
> Avg | elementCount | lastSeen
> -----------------------------
> 6.5              4          4
>
> Now, over the course of the day, the elements 4, 17 and 2 get added
> with sequential row numbers. Instead of calculating
> (1+7+5+13+4+17+2)/7, which would be slow; I can substitute the already
> summarized data by Avg*elementCount. Thus, I calculate (6.5*4 +
> 4+17+2)/7 = 7, which is a lot faster, and my summary table now looks
> like this:
>
> Avg | elementCount | lastSeen
> -----------------------------
>   7              7          7
>
> This is of course a stupid example, but it saves you a lot of time if
> you already have the summary of several thousand elements and only need
> to update it for a handful. Similar tricks are possible for a lot of
> typical reporting stuff - you don't need to re-calculate data for past
> months over and over again, for instance - and that's what makes your
> reports run fast.
>
>
> Just my 2 cents :-)
> /johan
>
> --
> Bier met grenadyn
> Is als mosterd by den wyn
> Sy die't drinkt, is eene kwezel
> Hy die't drinkt, is ras een ezel
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql
DISCLAIMER: This email message and all attachments are confidential and may 
contain information that is privileged, confidential or exempt from disclosure 
under applicable law.  If you are not the intended recipient, you are notified 
that any dissemination, distribution or copying of this email is strictly 
prohibited. If you have received this email in error, please notify us 
immediately by return email or to 
mailad...@spanservices.com<mailto:mailad...@spanservices.com> and destroy the 
original message.  Opinions, conclusions and other information in this message 
that do not relate to the official business of SPAN, shall be understood to be 
neither given nor endorsed by SPAN.

DISCLAIMER: This email message and all attachments are confidential and may 
contain information that is privileged, confidential or exempt from disclosure 
under applicable law.  If you are not the intended recipient, you are notified 
that any dissemination, distribution or copying of this email is strictly 
prohibited. If you have received this email in error, please notify us 
immediately by return email or to mailad...@spanservices.com and destroy the 
original message.  Opinions, conclusions and other information in this message 
that do not relate to the official business of SPAN, shall be understood to be 
neither given nor endorsed by SPAN.

RE: NoSQL help

Reply via email to