@Jeff Thanks! That looks really interesting, although they do not provide any info on how they do it :$
@Jared Yes, to some extent that is true, although I would also like to provide researchers with a good overview of what happens. Therefore a UI like Taurus/Grok or something along those lines is a goal. I like your saying 'Assuming you have easy access to the data' :D That is the major problem right now: a lot of mailing to agencies and organisations, which unfortunately keeps me from coding :'(

Thank you for the heads up! And I also *hope that someone out there (You, yes right you reading this!)* would like to help make this a reality! In case the link got lost, here is how you can help:
https://github.com/nupic-community/nostradamIQ/blob/master/CONTRIBUTING.md

Thanks :)

On Aug 5, 2015 12:34 AM, "Jeff Fohl" <[email protected]> wrote:

> Pascal -
>
> Your idea reminds me a bit of Banjo: http://ban.jo/
>
> This is a private corporation, but doing something somewhat similar - at least in that they have divided the globe up into a giant grid, and within each cell of that grid, they do anomaly detection. Except, instead of geophysical data, they are monitoring social activity by observing geotagged photos, tweets, posts, etc.
>
> - Jeff
>
> On Tue, Aug 4, 2015 at 3:27 PM Jared Casner <[email protected]> wrote:
>
>> Hi Pascal,
>>
>> So, let me see if I understand correctly. For now, you don't require any geo-encoding of data (but it sounds like that might be a useful feature in the future?). Instead, you will create a list of regions / polygons that represent a geofenced area. Within each region, you will have some set of sensors - air pressure, humidity, wind speed, seismic activity, temperature, etc. Your goal is to generate anomaly scores for each of those sensors - which produce scalar data. You then plan to do some additional logistic regression on top of the anomaly scores to predict the likelihood of a natural disaster (earthquake, meteorological, etc.) in that region or nearby regions.
>> It would be up to the statistician to correlate regions in the short term, correct? Also, if I've understood you correctly, the biggest issue that researchers face currently with respect to this problem is that their predictions for each sensor aren't always accurate because of daily variations in the data that are unexpected?
>>
>> I hope I've now understood the problem, but please clarify if I've mis-stated anything.
>>
>> Assuming I have a basic understanding of the problem, I think you may be able to simplify the engineering task a little bit. It seems to me that your primary objective isn't to have an easy-to-read user interface that displays data to an end user. Instead, you want data available to researchers in a format that they can do the logistic regression on. So, perhaps you can simplify your project by starting with HTMEngine directly. I'm sure by now you've seen Matt's demo [1] of HTMEngine - that may be a good place to start. In his NYC Traffic demo [2], each road segment represents a geolocation and has a scalar metric (average speed) associated with it. Assuming you have easy access to the data, you can probably use this as a good basis for getting started. The output is available in both JSON and CSV formats, so it should be easily accessible to a researcher.
>>
>> To answer one of your original questions about Numenta engineers helping out on this project, they're all free to help in their off time! One of our big objectives of opening access to NuPIC and the Numenta Apps was to provide a means for you - and those like you - to get in and do things that we just don't have the bandwidth to do internally. I'm thrilled to see your excitement and hope that others in the community will want to get involved to help you out!
>> Cheers,
>>
>> Jared
>>
>> [1] https://www.youtube.com/watch?v=lzJd_a6y6-E
>> [2] https://github.com/nupic-community/htmengine-traffic-tutorial
>>
>>> ---------- Forwarded message ----------
>>> From: Pascal Weinberger <[email protected]>
>>> To: "NuPIC general mailing list." <[email protected]>
>>> Date: Tue, 4 Aug 2015 12:13:04 +0200
>>> Subject: Re: nostradamIQ Project help needed!
>>>
>>> Matt,
>>>
>>> That's true, but you do not need it at all: take the world, split it into polygons (according to the density of data available and the resolution needed), label your polygons, and get your data for each polygon under a label of the form Where:What - "Where" being the label of the specific geo-area according to the above system, and "What" the label for the kind of data you push (like seismic etc.). And there you have your data format: label to scalar!
>>>
>>> Now htmengine outputs anomaly scores for each Where:What label, and you take these to hierarchically (in a geo-hierarchy) build logistic regression models, trained on the anomaly output and a binary value for whether a certain disaster happened there at a later time X or not. (This needs some past data, which is why the highest priority is getting the data polled and htmengine trained.) You go for logistic regression because that is what the literature finds to perform best. Once that works, you have your 'live' data stream and get predictions in the form of probabilities for the disaster occurring X time in the future...
>>>
>>> This was the basic idea... of course you will need to test it and refine the architecture etc. But you've got your work-around :)
>>>
>>> So htmengine is not supposed to do the entire job; it's more for feature detection :) The problem researchers find when building log-reg models on real data (raw scalars from the sensors) is that they periodically make wrong predictions due to daily etc. patterns. This is what HTM should filter out ;)
>>>
>>> The point of using Taurus as a starter, therefore, is that you already have your basic infrastructure of companies (your geo-polygons) and different metrics (the different sensor data in that region)...
>>>
>>> Does it make more sense now? :) Of course a geo-encoder and so on would be nice in addition, to capture more of the patterns, but this is what I would hope to achieve with the geo-hierarchy of log-reg models, so they capture the spatial relationships in their input weights (of course only based on historical data)... I do not think the geoEncoder would get this as well. When running the demo_app, you find that geoencoding with radius=magnitude, or any exponential function thereof, makes HTM immune to regions where at least one strong quake happened... and you don't want that.
>>>
>>> But David, you may think about building an engine for Java as well :) Just because it's faster ;D
>>>
>>> _______________________________________________
>>> nupic mailing list
>>> [email protected]
>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
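The polygon scheme Pascal describes above ("splice the world into labeled polygons, then stream each reading under a Where:What label") could be sketched roughly as follows. Everything here is illustrative: the region names, coordinates, and helper functions are invented for the example and are not part of nostradamIQ or HTMEngine.

```python
# Hypothetical sketch: map a sensor reading at (lon, lat) to its
# "Where:What" metric label. Regions and helpers are invented.

# Each region is a labeled polygon, given as a list of (lon, lat) vertices.
REGIONS = {
    "EU-W1": [(-10.0, 35.0), (5.0, 35.0), (5.0, 50.0), (-10.0, 50.0)],
    "EU-E1": [(5.0, 35.0), (20.0, 35.0), (20.0, 50.0), (5.0, 50.0)],
}

def point_in_polygon(lon, lat, polygon):
    """Ray-casting point-in-polygon test."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > lat) != (yj > lat) and \
           lon < (xj - xi) * (lat - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def metric_label(lon, lat, what):
    """Return the Where:What label for a reading, or None if outside all regions."""
    for where, poly in REGIONS.items():
        if point_in_polygon(lon, lat, poly):
            return "%s:%s" % (where, what)
    return None
```

A seismic reading at (0.0, 40.0) would then stream under the label `EU-W1:seismic`, which is exactly the label-to-scalar format the thread settles on.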
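Jared notes that the output should be available to researchers as both JSON and CSV. A minimal stdlib-only sketch of exporting per-label anomaly scores in both formats might look like this; the field names are an assumption for illustration, not HTMEngine's actual schema.

```python
# Sketch: expose per-label anomaly scores as JSON and CSV.
# Field names ("label", "timestamp", "anomaly") are assumed, not HTMEngine's schema.
import csv
import io
import json

rows = [
    {"label": "EU-W1:seismic", "timestamp": "2015-08-04T12:00:00", "anomaly": 0.87},
    {"label": "EU-W1:pressure", "timestamp": "2015-08-04T12:00:00", "anomaly": 0.05},
]

# JSON: dump the records as-is.
json_out = json.dumps(rows, indent=2)

# CSV: same records, one row per (label, timestamp) sample.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["label", "timestamp", "anomaly"])
writer.writeheader()
writer.writerows(rows)
csv_out = buf.getvalue()
```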
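The second stage Pascal sketches - a logistic regression trained on HTM anomaly scores against a binary "did a disaster follow within time X" label - might look roughly like this. The data is synthetic and the plain-SGD trainer is only a stand-in for whatever library a researcher would actually use.

```python
# Sketch: logistic regression over anomaly scores. Synthetic data;
# plain SGD stands in for a real library implementation.
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Stochastic gradient descent on the logistic loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Probability that a disaster follows, given current anomaly scores."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Synthetic training set: each row is (seismic, pressure) anomaly scores
# for one region at one time; label 1 = a disaster occurred X hours later.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [1 if x[0] > 0.7 else 0 for x in X]  # disasters follow high seismic anomaly

w, b = train_logreg(X, y)
```

On this toy data the model learns that the seismic-anomaly feature drives the prediction, which mirrors the intent in the thread: HTM filters out the daily patterns, and the regression turns the remaining anomaly signal into a disaster probability.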
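Pascal's geo-hierarchy idea - parent regions whose log-reg inputs include their children's anomaly scores, so spatial relationships end up encoded in the learned weights - could be wired up as simply as the following; the tree, labels, and scores are all invented for illustration.

```python
# Illustrative only: tree, labels, and scores are invented.
GEO_TREE = {"EU": ["EU-W1", "EU-E1"]}  # parent region -> child region labels

def hierarchy_features(parent, anomaly_scores, tree=GEO_TREE):
    """Feature vector for a parent region's model: its children's
    latest anomaly scores, in a fixed order."""
    return [anomaly_scores[child] for child in tree[parent]]

scores = {"EU-W1": 0.91, "EU-E1": 0.12}
features = hierarchy_features("EU", scores)  # fed to the parent's log-reg
```

Because the parent model sees all its children at once, a high anomaly in a neighbouring polygon can raise the predicted probability for the whole area, which is the spatial coupling the geoEncoder alone would not provide.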
