On Wed, Oct 23, 2013 at 2:32 PM, Karsten Loesing <kars...@torproject.org>wrote:
> On 10/11/13 4:05 PM, Kostas Jakeliunas wrote:
>
> Oops! Sorry for the delay in responding! Responding now.
>
> > On Fri, Oct 11, 2013 at 12:00 PM, Karsten Loesing <kars...@torproject.org>wrote:
> >
> >> Hi Kostas,
> >>
> >> should we move this thread to tor-dev@?
> >>
> >
> > Hi Karsten!
> >
> > sure.
> >
> >> From our earlier conversation about your GSoC project:
> >>> In particular, we should discuss how to integrate your project into
> >>> Onionoo. I could imagine that we:
> >>>
> >>> - create a database on the Onionoo machine;
> >>> - run your database importer cronjob right after the current Onionoo
> >>>   cronjob;
> >>> - make your code produce statuses documents and store them on disk,
> >>>   similar to details/weights/bandwidth documents;
> >>> - let the ResourceServlet use your database to return the
> >>>   fingerprints to return documents for; and
> >>> - extend the ResourceServlet to support the new statuses documents.
> >>>
> >>> Maybe I'm overlooking something and you have a better plan? In any
> >>> case, we should take the path that implies writing as little code as
> >>> possible to integrate your code in Onionoo.
> >>
> >> Let me know what you think!
> >>
> >
> > Sounds good. Responding to particular points:
> >
> >> - create a database on the Onionoo machine;
> >> - run your database importer cronjob right after the current Onionoo
> >>   cronjob;
> >
> > These should be no problem and make perfect sense. It's always best to use
> > raw SQL table creation routines to make sure the database looks exactly
> > like the one on the dev machine I guess (cf. using SQLAlchemy abstractions
> > to do that (I did that before)).
> >
> > Current SQL script to do that is at [1]. I'll look over it. For example,
> > I'd (still) like to generate some plots showing the chances of two
> > fingerprints having the same substring (this is for the intermediate
> > fingerprint table.) (One axis would be substring length, another would be
> > the possibility in (portions of) %.) As of now, we still use
> > substr(fingerprint, 0, 12), and it is reflected in the schema.
> >
> > Overall though, no particular snags here.
>
> I don't follow. But before we get into details here, I must admit that
> I was too optimistic about running your code on the current Onionoo
> machine. I ran a few benchmark tests on it last week to compare it to
> new hardware, and those tests almost made it fall over. We should not
> even think about adding new load to the current machine.
>
> New plan: can you run an Onionoo instance with your changes on a
> different machine? (If you need anything from me, like a tarball of the
> status/ and out/ directories, I'm happy to provide them to you.) I
> think we should run this instance for a while to see how reliable it is.
> And once we're confident enough, we'll likely have new hardware for the
> new Onionoo, so that we can move it there.
>

This sounds like a very good idea. Ok, I can try and do this. Sorry for
delaying my response as well, I'll try and follow up with what I need (if
anything).

> >> - make your code produce statuses documents and store them on disk,
> >>   similar to details/weights/bandwidth documents;
> >
> > Right, so if we are planning to support all V3 network statuses for all
> > fingerprints, how are we to store all the status documents? The idea is to
> > preprocess and serve static JSON documents, correct (as in the current
> > Onionoo)? (cf. the idea of simply caching documents: if we serve a
> > particular status document, it gets cached, and depending on the query
> > parameters (date range restriction, e.g.) it may be set not to expire at
> > all.)
> >
> > Or should we try and actually store all the statuses (the condensed status
> > document version [2], of course)?
>
> Let's do it as the current Onionoo does it. This code does not exist,
> right?
>

I've done some small testing on a local system, it seems the Onionoo way is
plausible, since the generation of all the old(er) status etc. documents
needs to happen only once (obviously, but now I understand this means the
number of resulting status documents and their size is not such a big deal
after all.) I don't have good code for it as of yet.

> >> - let the ResourceServlet use your database to return the
> >>   fingerprints to return documents for; and
> >> - extend the ResourceServlet to support the new statuses documents.
> >
> > Sounds good. I assume you are very busy with other things as well, so
> > ideally maybe you had in mind that I could try and do the Java part? :)
> > Though, since you are much more familiar with (your own) code, you could
> > probably do it faster than me. Not sure.
> > Any particular technical issues/nuances here (re: ResourceServlet)?
>
> Can you give it a try? Happy to help with specific questions about
> ResourceServlet, and I'll try hard to reply faster this time. Again,
> sorry for the delay!
>

Okay! I've been tinkering a bit, actually. Will see if I can produce
something decent and reliable.

Best wishes
Kostas.

> > [1]: https://github.com/wfn/torsearch/blob/master/db/db_create.sql
> > [2]: https://github.com/wfn/torsearch/blob/master/docs/onionoo_api.md#network-status-entry-documents
> >      (e.g. http://ts.mkj.lt:5555/statuses?lookup=9695DFC35FFEB861329B9F1AB04C46397020CE31&condensed=true )
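
For the plot of substring length versus collision chance mentioned earlier in this message, a rough sketch in Python could look like the following. It is not code from torsearch: it uses the standard birthday approximation and assumes fingerprints behave like uniformly random hex strings. The function name, the relay count of 100000, and the range of lengths are made up for illustration. One caveat: in PostgreSQL, where string positions start at 1, substr(fingerprint, 0, 12) returns 11 characters rather than 12, so the effective prefix length depends on the database in use.

```python
import math


def collision_probability(n_fingerprints: int, prefix_hex_chars: int) -> float:
    """Approximate chance that at least two of n fingerprints share a prefix.

    Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)), where N is the
    number of possible prefixes (16 ** length for hex strings).
    Assumes prefixes are uniformly distributed (hypothetical model).
    """
    space = 16 ** prefix_hex_chars
    pairs = n_fingerprints * (n_fingerprints - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / space)


if __name__ == "__main__":
    n = 100_000  # illustrative number of distinct fingerprints, not measured
    for length in range(6, 15):
        p = collision_probability(n, length)
        print(f"prefix length {length:2d}: {p * 100:.6f} %")
```

The printed table gives one axis as prefix length and the other as probability in percent, which could be fed into any plotting tool.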
_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev