Hey Matthew,

Thank you in advance for being willing to help us by pointing us in the right direction. I'll begin by explaining our situation regarding the traffic data.

We contacted the NDW, the company responsible for storing all traffic data in the Netherlands. They agreed to provide us with traffic speeds/flows and events such as incidents and road works. We want to combine the traffic data with weather data to find anomalies and, hopefully, predict incidents. We'll be attending The World Port Hackathon (September 4-5) in Rotterdam. At this event we would like to build a prototype that proves our hypothesis that traffic accidents can be predicted.

Our data covers the past three years in an area of Rotterdam. We have the following data from about a hundred measurement sites:
● traffic speeds, per minute, per lane, per vehicle type
● traffic intensity, per minute, per lane, per vehicle type
● traffic events, e.g. incidents, road works and more

The weather data is available per hour from each Dutch weather station. We would like to combine the data from the weather station in Rotterdam with all the traffic data.

This is how we think we should approach it using HTM Engine:
● Define a model with the following fields:
  ○ average traffic speed (int)
  ○ average traffic intensity (int)
  ○ incident (close to this point) (boolean)
  ○ horizontal visibility (in meters)
  ○ rain (boolean)
  ○ icing (boolean)
  ○ snow (boolean)
● Create an API to communicate with HTM Engine
● Create a model for each data point

In our opinion there are two main problems with this approach. The first issue is that we don't take into account the traffic flow across multiple points/highways. We've been told that there is a strong relation between the flows on some specific highways. These patterns are known, and we think we need to find a way to use these connections to improve the context of the accidents.

The second issue concerns the weather factors, which differ per season.
We can make specific models for winter and summer, so that the summer model includes temperature and the winter model includes icing/snow. But what we're most interested in is how we can “connect” the data from multiple measurement sites.

Another approach for the model could be:
● average traffic speed (int)
● average traffic intensity (int)
● incident (close to this point) (boolean)
● horizontal visibility (in meters)
● rain (boolean)
● icing (boolean) (in winter)
● snow (boolean) (in winter)
● temperature (in summer)
● related point A average traffic speed (int)
● related point A average traffic intensity (int)
● related point B average traffic speed (int)
● related point B average traffic intensity (int)

In our opinion this isn't the ideal approach either. NuPIC probably won't give accurate anomalies/predictions with so many fields.

We thought of a third option, but we're not sure how to approach it. What if we make separate, smaller models per measurement point? That way we can find out which performs best. Something like:
● model A:
  ○ average traffic speed (int)
● model B:
  ○ average traffic intensity (int)
● model C:
  ○ average traffic speed (int)
  ○ average traffic intensity (int)
● model X:
  ○ model A, B, C with certain weather data

We would then swarm on the anomaly scores (and maybe the raw input data) of different sites to find relations between measurement sites, and then use these models together with the incident data to predict incidents.

The last option is probably the most difficult, but it could be the most promising. What are your thoughts? Any input is greatly appreciated.

Besides the approaches, we have some smaller questions. We would like to start with the skeleton-htmengine-app, expand the API so it accepts multiple, different kinds of models, and build a web app that interfaces with the API. We couldn't find much documentation on HTM Engine.
● Can we swarm subsets to create model params using HTM Engine?
  ○ This way we could try multiple model params for models with different values. Or do we have to create them beforehand?
● I have little knowledge of HTM Engine.
  ○ Can you give some information about the services (anomaly_service, metric_listener, metric_storer, model_scheduler), whether we need them, and how to interface with them?
● Do you have any thoughts about “combining/connecting/merging” different traffic points for the best accident prediction?

We understand that these are a lot of questions. We would therefore be very grateful if you could find the time to answer them.

Thank you again.

Daniël, Pionect
—Daniël Ducro
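
P.S. To make the first approach a bit more concrete, here is a rough sketch of how we picture joining the hourly weather data to the per-minute traffic data before feeding records to a model. It uses only the Python standard library. The sample values and all field names (avg_speed, avg_intensity, visibility_m, and so on) are made up for illustration and are not an official NDW format, and we simply assume that an hourly weather observation applies to every minute within that clock hour. It does not touch HTM Engine at all.

```python
from datetime import datetime

# Hypothetical per-minute traffic rows for one measurement site.
traffic_rows = [
    {"timestamp": datetime(2015, 1, 12, 8, 1), "avg_speed": 92, "avg_intensity": 31, "incident": False},
    {"timestamp": datetime(2015, 1, 12, 8, 2), "avg_speed": 88, "avg_intensity": 35, "incident": False},
    {"timestamp": datetime(2015, 1, 12, 9, 0), "avg_speed": 41, "avg_intensity": 52, "incident": True},
]

# Hypothetical hourly weather rows for the Rotterdam station.
weather_rows = [
    {"hour": datetime(2015, 1, 12, 8), "visibility_m": 4000, "rain": True, "icing": False, "snow": False},
    {"hour": datetime(2015, 1, 12, 9), "visibility_m": 900, "rain": True, "icing": True, "snow": False},
]


def floor_to_hour(ts):
    """Round a timestamp down to the start of its clock hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def join_weather(traffic, weather):
    """Attach the weather of the matching clock hour to each traffic row."""
    weather_by_hour = {row["hour"]: row for row in weather}
    joined = []
    for row in traffic:
        observation = weather_by_hour.get(floor_to_hour(row["timestamp"]))
        if observation is None:
            continue  # no weather for this hour: skip the row rather than guess
        record = dict(row)
        # Copy every weather field except the join key itself.
        record.update({k: v for k, v in observation.items() if k != "hour"})
        joined.append(record)
    return joined


records = join_weather(traffic_rows, weather_rows)
for record in records:
    print(record)
```

Each resulting record holds the fields of the first model definition above, so one such stream per measurement site could be sent to one model. Adding the "related point" fields of the second approach would be a similar join on the timestamp.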
