Re: Questions about known upcoming events and multiple 'locations'

Alan Haverty Tue, 27 Oct 2015 15:15:27 -0700

Thank you Matthew, will do.

Thanks for the suggestions and example Pascal,
I'll get back to the list with any discoveries!


Best regards,
Alan

On Tue, 27 Oct 2015 at 21:41 Pascal Weinberger <[email protected]>
wrote:

> Hey,
>
> Just a quick thought:
> Maybe HTM isn't the best choice to encode this, especially given that you
> want to learn a sequence that spans a year (like the Xmas example) from
> 3 months of data.
> Overall, I'm not even sure that you would get this with 3 years of data, as
> this is a really rare event and the decay of synapses would make it vanish.
> No decay, on the other hand, will make the learning more unstable and you
> lose a chunk of the online learning performance...
>
> So why not assist the HTM with an additional layer:
> You can of course get fancier than this by stacking a classifier or a
> regression on top, but the simplest way to deal with it is to just apply
> known rules of thumb, e.g. (pseudo code):
>
> if time == Xmas:
>     deliveries = predicted_deliveries * 0.5   # people cook at Xmas
> elif time == big_football_game:
>     deliveries = predicted_deliveries * 10    # ;)
> else:
>     deliveries = predicted_deliveries         # use the raw HTM prediction
>
> I guess you get the idea.
> You can now learn the scaling factors with multiple methods, or just
> estimate them.
>
> In any case, I think this is the easiest way to tweak the prediction.
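
A minimal Python sketch of the scaling layer described above, assuming the HTM prediction arrives as a plain number and events are keyed by calendar date. The names `EVENT_FACTORS`, `adjust_prediction`, and `estimate_factor` are hypothetical, and the 0.5 factor is only the illustrative value from the pseudo code, not a measured one. The ratio estimator is one simple way to "learn" a factor from past occurrences of an event; it is not part of NuPIC.

```python
from datetime import date

# Hypothetical per-event multipliers. The value is the illustrative one
# from the pseudo code above, not something measured from data.
EVENT_FACTORS = {
    date(2015, 12, 25): 0.5,  # Xmas: people cook at home
}


def adjust_prediction(predicted, day, factors=EVENT_FACTORS):
    """Scale a raw model prediction by the factor of a known event, if any."""
    return predicted * factors.get(day, 1.0)


def estimate_factor(actuals, predictions):
    """Estimate a scaling factor as total actual / total predicted deliveries
    over past occurrences of an event (assumed to be aligned lists)."""
    total_predicted = sum(predictions)
    if total_predicted == 0:
        return 1.0  # no information, leave the prediction unscaled
    return float(sum(actuals)) / total_predicted
```

For example, `adjust_prediction(100, date(2015, 12, 25))` halves the prediction, while any date not in the table passes through unchanged.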
>
> If you try it, please let me know what you had to take care of in scaling
> ;D this is a cool social study :P
>
>
> Best
>
> Pascal
> ____________________________
>
> BE THE CHANGE YOU WANT TO SEE IN THE WORLD ...
>
>
> On 27 Oct 2015, at 21:51, Alan Haverty <[email protected]> wrote:
>
> Thank you Matthew,
> I'll experiment with the events.
>
> No, this will actually be a component of my final year project (4th year
> college, Ireland)
>
> I missed the boat for this year's challenge, but I'll be sure to join in
> next year!
>
> Thanks again,
> Alan Haverty
>
> On Tue 27 Oct 2015 at 04:23 Matthew Taylor <[email protected]> wrote:
>
>> Hi Alan,
>>
>> Here are my comments about your questions.
>>
>> 1.a. This was an ad-hoc idea, but I haven't tried it.
>>
>> 1.a.i.-ii. Ideally, you would not want to include this field at all; you
>> would just have years' worth of data and a learned model that has seen the
>> patterns each holiday produced in the past. But since you don't have that
>> kind of history, you'll need to experiment a little. Perhaps a simple
>> countup isn't going to give you what you want... if a holiday like XMas is
>> a big deal, maybe its value is higher and there is a longer countup to that
>> date, rather than say St. Patrick's Day. Like I said, this was just an
>> ad-hoc idea and I can't say for certain how it will work. You'll want to
>> experiment with it.
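
One way to read the weighted countup idea above, sketched in Python. Everything here is an assumption for illustration: the function name `event_signal`, the event tuple layout, and the linear ramp are not from NuPIC or from the video. Each event gets a weight (its importance) and a lead time in hours. The field ramps from 0 up to the weight during the lead time and holds at the weight while the event is on.

```python
from datetime import datetime, timedelta


def event_signal(ts, events):
    """Return the event field value for timestamp ts.

    events: list of (start, end, weight, lead_hours) tuples (hypothetical
    layout). Bigger events get a larger weight and a longer lead time.
    """
    value = 0.0
    for start, end, weight, lead_hours in events:
        if start <= ts < end:
            # Event in progress: hold at full weight.
            level = weight
        elif start - timedelta(hours=lead_hours) <= ts < start:
            # Countup: ramp linearly from 0 to weight as the event nears.
            hours_left = (start - ts).total_seconds() / 3600.0
            level = weight * (1.0 - hours_left / lead_hours)
        else:
            level = 0.0
        # If events overlap, the strongest one wins.
        value = max(value, level)
    return value
```

The resulting value could be fed to the model as an extra scalar field per hourly row. Whether a linear ramp is the right shape is exactly the kind of thing to experiment with.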
>>
>> 2. If you have data for 15 locations, I would say that each location
>> should have its own model. One model only makes predictions for one field,
>> anyway.
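
A rough Python sketch of the one-model-per-location layout, assuming rows of `(area, timestamp, count)` sorted by time. `LastValueModel` is a trivial stand-in so the example runs on its own; in practice each area would get its own HTM model instead. All names here are hypothetical.

```python
class LastValueModel(object):
    """Trivial stand-in for a per-area model: predicts the last value seen."""

    def __init__(self):
        self.last = None

    def run(self, count):
        prediction = self.last  # prediction made before seeing this row
        self.last = count
        return prediction


def run_per_area(rows, make_model=LastValueModel):
    """Feed each row only to the model that belongs to its area.

    rows: iterable of (area, timestamp, count), assumed sorted by time.
    Returns (models, predictions), both keyed by area.
    """
    models = {}
    predictions = {}
    for area, _timestamp, count in rows:
        if area not in models:
            models[area] = make_model()
            predictions[area] = []
        predictions[area].append(models[area].run(count))
    return models, predictions
```

Because every area has its own model instance, nothing learned from one area can leak into another.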
>>
>> 2.a. You would only lose value if there are correlations between the
>> locations, but I imagine this is not the case. The frequency of deliveries
>> at one restaurant is probably not directly affected by the frequency of
>> deliveries at another.
>>
>> 2.b. No.
>>
>> By the way, is this an HTM Challenge project?
>>
>> Regards,
>>
>>
>> ---------
>> Matt Taylor
>> OS Community Flag-Bearer
>> Numenta
>>
>> On Mon, Oct 26, 2015 at 1:27 PM, Alan Haverty <[email protected]> wrote:
>>
>>> Hello Nupic,
>>>
>>>
>>> I have some questions about feeding in known events and also, how I
>>> should handle multiple 'locations' that have similar properties but that
>>> may not be directly related in reality.
>>>
>>>
>>> Please let me know if I'm asking in the wrong mail list.
>>>
>>> I'm also providing a brief description and example of the project.
>>>
>>> Outline of Problem
>>>
>>> Restaurants that offer food delivery are forced to hire drivers, pay for
>>> insurance, pay for wages + predict how many drivers are needed in advance
>>> and schedule their hours.
>>>
>>>
>>> I propose to abstract this as a service where restaurants can simply use
>>> an app to request a driver and let this service-business worry about
>>> drivers, insurance, wages, roster scheduling etc.
>>>
>>>
>>> To achieve this, the central ‘delivery system’ needs to predict how many
>>> jobs are going to come from each area within a city to allow scheduling of
>>> drivers days/weeks in advance.
>>>
>>>
>>> I believe NuPIC is ideal to solve this problem, but I have a few
>>> questions that I hope the mailing list can help with.
>>>
>>>
>>> *Assuming for this example:*
>>>
>>>    - That a city is divided into 15 geographical areas.
>>>    - That I have 3 months of known data with the total number of
>>>    deliveries that came from each area per hour.
>>>    - That I need to predict the number_of_deliveries per hour
>>>    (days/week in advance, not too concerned with how far in advance yet.)
>>>
>>> Example Data
>>>
>>> *Example of 3hrs of data for one of those 15 areas:*
>>>
>>> dttm                     number_of_deliveries
>>> datetime                 int
>>> T
>>> 2015/08/01 00:00:00.0    178
>>> 2015/08/01 01:00:00.0    96
>>> 2015/08/01 02:00:00.0    52
>>>
>>>
>>> Questions
>>>
>>> 1. If I want to incorporate event data for known upcoming
>>> events such as a national holiday/football game/TV series finale airing,
>>> how should this hourly event data be arranged?
>>>
>>>    a. Matthew Taylor suggested
>>>    <https://www.youtube.com/watch?v=gYOwBlVuJDw> to use a countdown until
>>>    the hours of the event.
>>>
>>>       i. How would this work if I wanted to weight certain events
>>>       differently? (e.g. A national bank holiday would be weighted higher
>>>       than a television series episode airing)
>>>
>>>       ii. While the event is occurring, how should the countdown be
>>>       represented? Should it be ..,5,4,3,2,1,1,1,1,1,…,1,20,19,.. *(Red
>>>       being the event currently on for that hour(s) or in some cases the
>>>       whole day(s))*
>>>
>>> 2. I need to do this for multiple locations. Would a field to
>>> specify each location be correct (meaning there would be x15 {Saturday @
>>> 12:00}, *one for each of the 15 locations*), or should they be totally
>>> separated?
>>>
>>>    a. If I separate locations completely, would you expect I lose
>>>    value in any way?
>>>
>>>    b. If I keep them together, could locations contaminate/affect
>>>    each other in ways that may not happen in reality?
>>>
>>>    *c. Apologies for this broad question; if anyone could even
>>>    point me to suggested reading, I would appreciate it.*
>>>
>>> Thank you for reading!
>>>
>>>
>>>
>>> Best regards,
>>> Alan Haverty <[email protected]>
>>>
>>>
>>
