[tor-talk] GSOC Ideas.

2011-03-28 Thread Ian Foster
Hello Everyone,

I was on IRC last week discussing my ideas for projects for the Google
Summer of code, and I would like to get input here too.

My first idea is an update to TorStatus.
The main motivation is that the project looks like it could use
an update, and I know PHP very well. Since it looks like TorStatus is
moving to Metrics I propose that TorStatus could become a more
lightweight script that can be run on one's own Tor server(s) to
provide information on just them rather than all servers.
Functionality could also be added to configure the server or to
start/stop the tor service.
Is there interest for such a project and would it be useful?

Idea 2, HTTPS Everywhere for Google Chrome
I think that the EFF's HTTPS Everywhere for Firefox is great, and I
would love to have such an extension for Google Chrome. Unfortunately
as Sebastian pointed out in IRC, due to Google Chrome limitations, an
insecure http request would most likely be made before the https
connection started. Until this limitation in the Chrome api is
overcome I suggest that the extension be simplified to just a simple
button/icon that notifies the user if the current page is using http
where it could be using https, then if the user clicks on said icon it
would take them to the https version of the site. This extension could
also be incorporated into a future Tor extension for Chrome.
Upon further research it looks like a similar extension already
exists: 
https://chrome.google.com/extensions/detail/flcpelgcagfhfoegekianiofphddckof?hl=en
But I would like to offer more than that extension, such as using the
same list of HTTPS sites that the Firefox extension uses, and not send
any unencrypted data when the user only expects HTTPS.
If during the course of GSOC (or after) Google allows the chrome api
to prevent/pause a page from loading by a plugin then I would
incorporate that, mimicking the functionality present in Firefox.
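
As a rough illustration of the "notify, then upgrade on click" check
described above, here is a minimal sketch in Python rather than as a
Chrome extension. The host list and the function name are hypothetical;
the real HTTPS Everywhere rulesets are richer than a plain set of
hostnames, so treat this only as the shape of the lookup.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of hosts known to support HTTPS (an assumption for
# this sketch; it is not the HTTPS Everywhere ruleset format).
HTTPS_CAPABLE_HOSTS = {"www.eff.org", "lists.torproject.org"}


def https_alternative(url, known_hosts=HTTPS_CAPABLE_HOSTS):
    """Return the https:// form of url, or None if no upgrade applies."""
    parts = urlsplit(url)
    # Only plain-http pages are candidates for an upgrade.
    if parts.scheme != "http":
        return None
    # Skip URLs with an explicit port; mapping ports is out of scope here.
    if parts.port is not None:
        return None
    if parts.hostname not in known_hosts:
        return None
    # Keep netloc, path, query and fragment; only the scheme changes.
    return urlunsplit(("https",) + tuple(parts[1:]))
```

An extension would show its icon whenever this kind of lookup returns a
URL, and navigate to that URL when the icon is clicked.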

Idea 3: Python!
I see that quite a few of your projects use Python. I do not know
Python as well as PHP but am still learning. I would like to work
on any python project for Tor as long as it is something that I can
work on while still learning more python. I was looking at the Tor
Updater script, but would like suggestions.

Ideally I would like to work with something in python so that at the
same time I can get more practice with it but I am open to anything.

Looking forward to feedback!

Thanks!

-- 
Ian Foster
www.vorsk.com
___
tor-talk mailing list
tor-talk@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk


Re: [tor-talk] GSOC Ideas.

2011-03-29 Thread grarpamp
> Since it looks like TorStatus is moving to Metrics

Sure, torproject should run one as a central looking glass.
As well as anyone else who wants to.

> become a more lightweight script that can be run on
> one's own Tor server(s) to provide information on just
> them rather than all servers.

That's fine, but the ability to process them all still needs to exist.
Also, I could really use a full CSV version with at minimum,
fp, country, ip, hostname, bandwidth provisioned, uptime, flags.
Selecting those columns does not make it through to the CSV export,
and parsing html is a pain. Nor does it allow not showing the (useless)
router name.
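
For illustration, an export that honours the selected columns could look
roughly like the sketch below. It is Python, not TorStatus's PHP, and the
record keys and sample values are made up for the example; only the
column names come from the list above.

```python
import csv
import io


def export_csv(relays, columns, out):
    """Write only the selected columns of each relay record as CSV."""
    # extrasaction="ignore" silently drops keys that were not selected,
    # such as the router name.
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for relay in relays:
        writer.writerow(relay)


if __name__ == "__main__":
    # Hypothetical sample record, for demonstration only.
    sample = [{"fp": "ABCD", "country": "DE", "name": "unused",
               "flags": "Exit Fast"}]
    buf = io.StringIO()
    export_csv(sample, ["fp", "country", "flags"], buf)
    print(buf.getvalue(), end="")
```

Any key missing from a record is written as an empty field, so partially
known relays still produce a well-formed row.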


Re: [tor-talk] GSOC Ideas.

2011-03-29 Thread Ian Foster
>> become a more lightweight script that can be run on
>> one's own Tor server(s) to provide information on just
>> them rather than all servers.
>
> That's fine, but the ability to process them all still needs to exist.
> Also, I could really use a full CSV version with at minimum,
> fp, country, ip, hostname, bandwidth provisioned, uptime, flags.
> Selecting those columns does not make it through to the CSV export,
> and parsing html is a pain. Nor does it allow not showing the (useless)
> router name.

Exporting to CSV based on the filters is an easy task. Is there
anyone else who would find this useful? If so I'll look into making a
PHP script that can do that right now. :)

-- 
Ian Foster
www.vorsk.com


Re: [tor-talk] GSOC Ideas.

2011-03-29 Thread Karsten Loesing
On Tue, Mar 29, 2011 at 09:25:02PM -0700, Ian Foster wrote:
> >> become a more lightweight script that can be run on
> >> one's own Tor server(s) to provide information on just
> >> them rather than all servers.
> >
> > That's fine, but the ability to process them all still needs to exist.
> > Also, I could really use a full CSV version with at minimum,
> > fp, country, ip, hostname, bandwidth provisioned, uptime, flags.
> > Selecting those columns does not make it through to the CSV export,
> > and parsing html is a pain. Nor does it allow not showing the (useless)
> > router name.
> 
> Exporting to CSV based on the filters is an easy task. Is there
> anyone else who would find this useful? If so I'll look into making a
> PHP script that can do that right now. :)

The odds of Tor picking a GSoC student to improve TorStatus are non-zero,
but low.  (To be precise, I wouldn't mentor that project, but I don't know
if somebody else would.)

The better approach for providing Tor network status information is to
extend the metrics website, mostly because the metrics website is
maintained whereas the TorStatus website isn't.  Kevin Berry, one of our
last year's GSoC students who I mentored, started working on a basic
network status page here:

  https://metrics.torproject.org/networkstatus.html

The code for the metrics website is here, and yes, it's JSP/servlets:

  http://gitweb.torproject.org/metrics-web.git

Please let me know if you have further questions.

Best,
Karsten



Re: [tor-talk] GSOC Ideas.

2011-03-30 Thread Moritz Bartl
On 30.03.2011 06:25, Ian Foster wrote:
> Exporting to CSV based on the filters is an easy task. Is there
> anyone else who would find this useful? If so I'll look into making a
> PHP script that can do that right now. :)

I have a very basic Python parser at
https://github.com/moba/tormap/blob/master/tormap.py

Currently it generates a KML for Google Earth. Feel free to reuse.

I am working on an XML representation of snapshots from the complete
history available on archive.torproject.org for a dynamic world map.

-- 
Moritz Bartl
https://www.torservers.net/


Re: [tor-talk] GSOC Ideas.

2011-04-01 Thread Ian Foster
I've created a simple Python parser for Tor that will generate a CSV
file from Tor's cached-descriptors and cached-consensus files.
It does not yet collect all the data it should, but it is only a first revision.
The purpose of this was to familiarize myself more with Tor for the
Google Summer of Code. Hope it is useful!
Get it here: https://github.com/mrlanrat/TorExport
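
For readers who just want the idea, the descriptor-parsing step can be
sketched as below. This is a simplified illustration and not the
TorExport code itself: the function name and sample text are made up,
and it assumes each descriptor ends with a "router-signature" line.

```python
# Hypothetical sample in the style of a cached-descriptors entry.
SAMPLE = """\
router example 192.0.2.1 9001 0 9030
opt fingerprint ABCD 1234 ABCD 1234 ABCD 1234 ABCD 1234 ABCD 1234
uptime 3600
bandwidth 1000 2000 1500
router-signature
"""


def parse_descriptors(lines):
    """Collect name, ip, uptime and observed bandwidth per fingerprint."""
    relays = {}
    current = {}
    for raw in lines:
        line = raw.strip()
        if line.startswith("router "):
            # router <name> <ip> <orport> <socksport> <dirport>
            _, name, ip, orport, socksport, dirport = line.split()
            current = {"name": name, "ip": ip}
        elif line.startswith("opt fingerprint "):
            fp = line[len("opt fingerprint "):]
            current["fingerprint"] = fp.replace(" ", "").lower()
        elif line.startswith("uptime "):
            current["uptime"] = int(line.split()[1])
        elif line.startswith("bandwidth "):
            fields = line.split()
            # The fourth field is the observed bandwidth, when present.
            if len(fields) > 3:
                current["bw-observed"] = int(fields[3])
        elif line == "router-signature":
            # Assumed end-of-descriptor marker: store the finished entry.
            if "fingerprint" in current:
                relays[current["fingerprint"]] = current
            current = {}
    return relays
```

Calling parse_descriptors(SAMPLE.splitlines()) returns a dict keyed by
the lower-case fingerprint, which can then be written out with the csv
module.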

grarpamp, I hope this helps you a little. If I can, I will expand the
data that this script collects.

Moritz Bartl, your script was very useful. I used it as a base, but
modified it a bit so that it would run on Python 3.

Feedback is welcome!

-- 
Ian Foster
www.vorsk.com


Re: [tor-talk] GSOC Ideas.

2011-04-01 Thread Robert Ransom
On Fri, 1 Apr 2011 17:12:20 -0700
Ian Foster wrote:

> I've created a simple python parser for Tor that will generate a csv
> file from Tor's cached-descriptors and cached-consensus files.
> It does not get all the data it should but it is only a first revision.
> The purpose of this was to more familiarize myself with Tor for the
> Google Summer of Code, Hope it is useful!
> Get it here: https://github.com/mrlanrat/TorExport
> 
> grarpamp, I hope this helps you a little, If I can I will expand the
> data that this script will collect.
> 
> Moritz Bartl, your script was very useful, I used it as a base, but
> modified it a bit so that it would run on python 3.

No shit.  Your substantive contribution to TorExport consists of less
than 10 new lines near the end -- diff attached.  (I normalized the
leading whitespace in both files with ‘expand -t 4’ first.)
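
A comparison of this kind can be approximated in Python as sketched
below. This is a stand-in for running expand -t 4 followed by a unified
diff, not the exact commands used here; the function name and default
file names are hypothetical.

```python
import difflib


def normalized_diff(old_text, new_text,
                    old_name="old.py", new_name="new.py", tabsize=4):
    """Unified diff of two texts after expanding tabs to spaces."""
    # str.expandtabs restarts its column count at every newline, so
    # calling it on the whole text expands each line independently.
    old_lines = old_text.expandtabs(tabsize).splitlines(keepends=True)
    new_lines = new_text.expandtabs(tabsize).splitlines(keepends=True)
    diff = difflib.unified_diff(old_lines, new_lines,
                                fromfile=old_name, tofile=new_name)
    return "".join(diff)
```

With the tabs expanded first, lines that differ only in tab versus space
indentation no longer show up as changes, which leaves only the
substantive edits in the output.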


Robert Ransom
--- moba-tormap.py	2011-04-01 20:00:56.0 -0700
+++ mrlanrat-torexport.py	2011-04-01 20:01:15.0 -0700
@@ -1,41 +1,46 @@
 #!/usr/bin/env python
 # encoding: utf-8
-
-'''  
- quick and dirty hack Moritz Bartl mor...@torservers.net
- 13.12.2010
-
- let me know and send me your changes if you improve anything
-
- requires: 
- - pygeoip, http://code.google.com/p/pygeoip/
- - geoIP city database, eg. http://www.maxmind.com/app/geolitecity
-
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU Lesser General Public License (LGPL) 
- as published by the Free Software Foundation, either version 3 of the 
- License, or any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- GNU Lesser General Public License for more details.
- 
- http://www.gnu.org/licenses/
 '''
+Script to parse torfiles for nodes and export csv
+4/1/2010
+Ian Foster
 
-FAST = 100
+requires python 3
+
+Built using code from:
+https://github.com/moba/tormap/blob/master/tormap.py
+
+TorExport is free software: you can redistribute it and/or modify it under the terms
+of the GNU General Public License as published by the Free Software Foundation, 
+either version 3 of the License, or (at your option) any later version.
+
+TorExport is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A 
+PARTICULAR PURPOSE. See the GNU General Public License for more details.
+
+http://www.gnu.org/licenses/.
+'''
 
-import base64, shelve, pygeoip, cgi, re
-from operator import attrgetter, itemgetter
-from string import Template
+import base64
+import cgi
+import csv
+import sys
+
+try:
+    sys.argv[1]
+except IndexError:
+    print('Creates a CSV of all online nodes')
+    print('Please Pass folder containing cached-descriptors and cached-consensus files')
+    print('usage: torexport.py /path/to/tor/data')
+    exit()
 
+FAST = 100
 cachedRelays = dict()
 currentRouter = dict()
 
 # parse cached-descriptors to extract uptime and announced bandwidth
-with open('cached-descriptors') as f:
-    for line in f:  
+with open(sys.argv[1]+ '/cached-descriptors') as f:
+    for line in f:
         line = line.strip()
         if line.startswith('router '):
             [nil,name,ip,orport,socksport,dirport] = line.split()
@@ -48,13 +53,13 @@
             currentRouter['version']=line[9:]
         if line.startswith('opt fingerprint'):
             fingerprint=line[16:]
-            currentRouter['fingerprint'] = fingerprint.replace(' ','').lower()
+            currentRouter['fingerprint'] = str(fingerprint.replace(' ','').lower())
         if line.startswith('uptime '):
             currentRouter['uptime']=line[7:]
         if line.startswith('bandwidth '):
             currentRouter['bandwidth'] = line[10:]
             try:
-                currentRouter['bw-observed'] = int(line.split()[3]) 
+                currentRouter['bw-observed'] = int(line.split()[3])
             except:
                 pass
             bandwidth = line[10:]
@@ -65,8 +70,6 @@
             cachedRelays[fingerprint] = currentRouter
             currentRouter = dict()
 
-# parse cached-consensus for flags and correlate to descriptors
-
 badRelays = dict() # Bad in flags, eg. BadExit, BadDirectory
 exitFastRelays = dict() # Exit flag, >= FAST
 exitRelays = dict() # Exit flag, slower than FAST
@@ -74,57 +77,56 @@
 stableRelays = dict() # Stable flag, but not Exit
 otherRelays = dict() # non Stable, non Exit
 
-count = 0
-with open('cached-consensus') as f:
-    for line in f:  
+# parse cached-consensus for flags and correlate to descriptors
+with open(sys.argv[1]+'/cached-consensus') as f:
+    for line in f:
         line = line.strip()
         if line.startswith('r '):
             [nil,name,identity,digest,date,time,ip,orport,dirport] = line.split()
 

Re: [tor-talk] GSOC Ideas.

2011-04-02 Thread Roger Dingledine
On Wed, Mar 30, 2011 at 07:50:06AM +0200, Karsten Loesing wrote:
> > Exporting to CSV based on the filters is an easy task. Is there
> > anyone else who would find this useful? If so I'll look into making a
> > PHP script that can do that right now. :)
> 
> The odds of Tor picking a GSoC student to improve TorStatus are non-zero,
> but low.  (To be precise, I wouldn't mentor that project, but I don't know
> if somebody else would.)
> 
> The better approach for providing Tor network status information is to
> extend the metrics website, mostly because the metrics website is
> maintained whereas the TorStatus website isn't.  Kevin Berry, one of our
> last year's GSoC students who I mentored, started working on a basic
> network status page here:
> 
>   https://metrics.torproject.org/networkstatus.html
> 
> The code for the metrics website is here, and yes, it's JSP/servlets:
> 
>   http://gitweb.torproject.org/metrics-web.git
> 
> Please let me know if you have further questions.

That said, I think changing the Torstatus PHP script so it uses the
metrics database as its back-end, and cleaning up the PHP part of it,
would still be a very valuable task.

Right now Torstatus has two components: the PHP interface front-end,
and the database back-end that remembers stuff about the network so it
can (for example) make historical bandwidth graphs.

The database kept by the metrics project is probably better than the
database kept by Torstatus. So dropping the db side of torstatus, and
teaching it to use the db from metrics, would be valuable in that it
would make things more maintainable.

The front-end from Torstatus is currently more usable, and thus more
useful, than the front-end on the metrics project. Karsten would like
somebody to fix the metrics side so it's better, and then we can dump
Torstatus. That would be great, if it happens, but until it happens,
making Torstatus better would still be useful.

The problem is that there are basically no Torstatus developers in
the world, so working on that as a GSoC project would be hard since we
wouldn't have anybody to mentor you.

But if somebody wants to pick it up as a side hobby, that'd be great. :)

--Roger



Re: [tor-talk] GSOC Ideas.

2011-04-02 Thread Moritz Bartl
On 02.04.2011 05:27, Robert Ransom wrote:
>> Moritz Bartl, your script was very useful, I used it as a base, but
>> modified it a bit so that it would run on python 3.
> No shit.  Your substantive contribution to TorExport consists of less
> than 10 new lines near the end -- diff attached.

So? Someone looking for a .csv output with no coding experience is
grateful for any change that helps.

Think user, not coder. That is general advice for #tor ;-)

-- 
Moritz Bartl
https://www.torservers.net/


Re: [tor-talk] GSOC Ideas.

2011-04-03 Thread Ian Foster
Karsten, would working on TorStatus to make it work with the Metrics
database be a useful project?

Do you think either you, or someone else could mentor it? I know it's
PHP and it won't directly be a part of Metrics but it could be another
method of displaying and exporting the data.
Unfortunately I don't trust my little Java knowledge to work directly
on metrics.

If this is not needed then I'll focus on something with Python.

Thanks.

-- 
Ian Foster
www.vorsk.com


Re: [tor-talk] GSOC Ideas.

2011-04-04 Thread Karsten Loesing
On Sun, Apr 03, 2011 at 07:25:27PM -0700, Ian Foster wrote:
> Karsten, would working on TorStatus to make it work with the Metrics
> database be a useful project?

It would be useful, but improving the metrics website to be a better
TorStatus website would be much more useful. :)

Maybe I should accept the fact that the world has roughly as many
JSP/servlet developers as it has TorStatus maintainers and just do it
myself...

> Do you think either you, or someone else could mentor it? I know it's
> PHP and it won't directly be a part of Metrics but it could be another
> method of displaying and exporting the data.

I'm afraid I wouldn't be a good mentor for a PHP GSoC project.  I'm both
incapable and unwilling to read PHP code.

But this shouldn't stop you from working on TorStatus unrelated to GSoC.
I'd be glad to help you understand the metrics data if you're interested!

> Unfortunately I don't trust my little Java knowledge to work directly
> on metrics.

Fair enough.

Best,
Karsten



Re: [tor-talk] GSOC Ideas.

2011-04-06 Thread Ian Foster
Hello Everyone,

I've submitted my first GSOC proposal to the Google Melange site. It is
proposing an update of TorStatus to work with Metrics. I am aware that
this may not be the highest priority for the community so I plan to
submit an alternate proposal too.

I would greatly appreciate any feedback on it that anyone can provide.
Let me know if anyone would like me to post it here too.

Thanks!

-- 
Ian Foster
www.vorsk.com