Well, first, you can design six MR jobs:
1- for the 5-minute interval
2- for 1 hour
3- for 1 day
4- for 1 month
5- for 1 year
6- and a last one for any interval
If, as you say, each interval needs a different calculation, then this
could be a solution (at least I think so).
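As a rough illustration of the step that would sit in front of those jobs,
here is a minimal sketch in plain Python (not Hadoop API code) that splits
minute-level records into complete one-hour windows for the hour job and a
remainder for the minute job. The function name split_for_jobs, the
(timestamp, payload) record shape, and the rule that a window includes its
end timestamp are all assumptions read off the example in the original
question; adjust them to your real billing rules.

```python
from datetime import datetime, timedelta

def split_for_jobs(records, window=timedelta(hours=1)):
    """Split (timestamp, payload) records into complete windows and a remainder.

    Assumption: a window starts at the first unassigned record and includes
    every record up to and including start + window. A window counts as
    complete only if the data reaches its end timestamp.
    """
    ordered = sorted(records, key=lambda r: r[0])
    hour_batches = []
    i = 0
    while i < len(ordered):
        end = ordered[i][0] + window
        if ordered[-1][0] < end:
            break  # not enough data left to fill another window
        j = i
        while j < len(ordered) and ordered[j][0] <= end:
            j += 1
        hour_batches.append(ordered[i:j])
        i = j
    return hour_batches, ordered[i:]

# Example shaped like the question: one record every 5 minutes, 6:40 to 7:45.
start = datetime(2012, 2, 10, 6, 40)
records = [(start + timedelta(minutes=5 * k), "data for billing") for k in range(14)]
hour_batches, minute_records = split_for_jobs(records)
print(len(hour_batches), "hour batch(es);", len(minute_records), "record(s) left for the minute job")
```

Each batch in hour_batches would then be the input of the hour job, and
minute_records the input of the minute job. The same function could be
reused with a larger window for the day job, under the same assumptions.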
You can also read the MapReduce "design patterns" proposed by
Jimmy Lin and Chris Dyer in their book "Data-Intensive Text Processing
with MapReduce".
Regards
On 02/27/2012 05:39 AM, Stuti Awasthi wrote:
No. The data will be at either a 5-minute interval, a 1-hour interval, a 1-day
interval, and so on.
So suppose the utilization is for 40 days: then I will charge 30 days under
monthly billing and the remaining 10 days under the daily billing job.
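The 40-day split described above can be sketched as a greedy decomposition
from the largest billing unit down. This is only an illustration in plain
Python: the unit table, the 30-day month, the 365-day year and the name
decompose are assumptions, not rules confirmed anywhere in this thread.

```python
# Billing units in minutes, largest first.
# Assumptions: a month is 30 days and a year is 365 days.
UNITS = [
    ("year", 365 * 24 * 60),
    ("month", 30 * 24 * 60),
    ("day", 24 * 60),
    ("hour", 60),
    ("5min", 5),
]

def decompose(total_minutes):
    """Return ({unit: count}, leftover_minutes) using the largest units first."""
    plan = {}
    remaining = total_minutes
    for name, size in UNITS:
        count, remaining = divmod(remaining, size)
        if count:
            plan[name] = count
    return plan, remaining

# 40 days of utilization, expressed in minutes.
plan, leftover = decompose(40 * 24 * 60)
print(plan, leftover)
```

Each entry in the plan would say which job (month, day, and so on) gets how
many units of the usage; any leftover shorter than 5 minutes is returned
separately so the caller can decide how to bill it.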
-----Original Message-----
From: Rohit Kelkar [mailto:rohitkel...@gmail.com]
Sent: Monday, February 27, 2012 4:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Query Regarding design MR job for Billing
Just trying to understand your use case.
You need an hour job to run on data between 6:40 AM and 7:40 AM. Would it be
like a moving window? For example, run hour jobs on
6:41 AM to 7:41 AM
6:42 AM to 7:42 AM
and so on...
On Mon, Feb 27, 2012 at 1:01 PM, Stuti Awasthi<stutiawas...@hcl.com> wrote:
Hi all,
I have to implement a BillingEngine using MR jobs. My use case is like this:
I will have data files in the format <TimeStamp> <Information for Billing>.
These data files will contain timestamps at either a minute interval,
hour interval, day interval, month interval or year interval. Every type of
interval has a different billing calculation, so basically there will be a
different job for every type of interval.
Suppose I have a data file which contains minute-interval timestamps. My
scenario is that if the data covers a full hour, that part should be processed
by the hourly job, and the remainder should be processed by the minute job.
Example :
2/10/12 6:40 AM <data for billing>
2/10/12 6:40 AM <data for billing>
.
2/10/12 6:45 AM <data for billing>
2/10/12 6:45 AM <data for billing>
.
.
2/10/12 7:40 AM <data for billing>
2/10/12 7:40 AM <data for billing>
.
.
2/10/12 7:45 AM <data for billing>
2/10/12 7:45 AM <data for billing>
.
Now I want the data between 2/10/12 6:40 AM and 2/10/12 7:40 AM to be
processed by the HourJob, and the 2/10/12 7:45 AM data to be processed by the
MinuteJob.
Please suggest how to design my MR jobs to achieve this.
Thanks
Stuti
--
Marcos Luis Ortíz Valmaseda
Senior Software Engineer (UCI)
http://marcosluis2186.posterous.com
http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186
End the injustice, FREEDOM NOW FOR OUR FIVE COMPATRIOTS WHO ARE
UNJUSTLY HELD IN US PRISONS!
http://www.antiterroristas.cu
http://justiciaparaloscinco.wordpress.com