Re: [tor-talk] How to write program that uses Tor network

2015-10-02 Thread Tyler Hardin
This is a lot of really good advice. Thanks. For some reason, I was thinking C++ would give a measurable performance increase for the spider, but now that I've questioned that, it seems really dumb. Obviously the network will be the bottleneck by far. I think I'll still use C++ for the back end

Re: [tor-talk] How to write program that uses Tor network

2015-10-01 Thread Akademika Aka
Google how to use SOCKS5 with Boost and set 127.0.0.1:9050 as the proxy.

On Sep 30, 2015 5:14 AM, "Tyler Hardin" wrote:
> Hi, I'm writing a spider in C++ and thinking about running it on the Tor
> hidden network. I'm using boost::asio for the network API. What would be
> the
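To make the SOCKS5 suggestion concrete, here is a minimal sketch of the two messages a client sends to the proxy, following the wire format in RFC 1928. The function names `socks5_greeting` and `socks5_connect_request` are made up for illustration and are not part of Boost or Tor. The request uses the domain-name address type so that the hostname (including a .onion name) is resolved by Tor rather than locally.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: SOCKS5 greeting (RFC 1928).
// Version 5, one method offered, method 0x00 = "no authentication required".
std::vector<std::uint8_t> socks5_greeting() {
    return {0x05, 0x01, 0x00};
}

// Hypothetical helper: SOCKS5 CONNECT request with a domain-name address
// (ATYP 0x03), so the proxy does the name resolution.
std::vector<std::uint8_t> socks5_connect_request(const std::string& host,
                                                 std::uint16_t port) {
    if (host.empty() || host.size() > 255) {
        throw std::invalid_argument("host must be 1..255 bytes");
    }
    // VER = 5, CMD = 1 (CONNECT), RSV = 0, ATYP = 3 (domain name)
    std::vector<std::uint8_t> req{0x05, 0x01, 0x00, 0x03};
    req.push_back(static_cast<std::uint8_t>(host.size()));  // length prefix
    req.insert(req.end(), host.begin(), host.end());        // hostname bytes
    req.push_back(static_cast<std::uint8_t>(port >> 8));    // port, network byte order
    req.push_back(static_cast<std::uint8_t>(port & 0xFF));
    assert(req.size() == 7 + host.size());
    return req;
}
```

With boost::asio, one would connect a TCP socket to 127.0.0.1:9050 (assuming Tor's SOCKS port is at that address, as suggested above), send the greeting with `boost::asio::write`, read the 2-byte method reply, then send the CONNECT request and read the proxy's reply, whose second byte is 0x00 on success. After that the socket carries the application stream to the target.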

Re: [tor-talk] How to write program that uses Tor network

2015-10-01 Thread Apple Apple
Asio is only a socket library, which means you would need to build all the HTTP logic on top of it, which is not very fun, but everything you need to know is documented in RFCs if you really want to go down that route. The "best/easiest" way would be to use an HTTP library specifically for the
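As a rough idea of what "building the HTTP logic" on a raw socket starts with, here is a minimal sketch that formats an HTTP/1.1 GET request by hand. The function name `build_get_request` is hypothetical, and the sketch covers only the request line and two headers. Response parsing, chunked transfer encoding, redirects, and compression are all left out, which is where most of the work is.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: format a minimal HTTP/1.1 GET request.
// "Connection: close" asks the server to close the stream after the
// response, so the client can simply read until end of stream.
std::string build_get_request(const std::string& host, const std::string& path) {
    assert(!host.empty());  // HTTP/1.1 requires a Host header
    const std::string target = path.empty() ? "/" : path;
    return "GET " + target + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n"
           "\r\n";
}
```

The resulting string would be written to the connected socket as-is. For example, `build_get_request("example.onion", "/robots.txt")` yields a request for the site's robots.txt.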

[tor-talk] How to write program that uses Tor network

2015-09-29 Thread Tyler Hardin
Hi, I'm writing a spider in C++ and thinking about running it on the Tor hidden network. I'm using boost::asio for the network API. What would be the best/easiest way for me to retrieve pages from the Tor hidden network? P.S. Yes, I'm respecting robots.txt and rate limiting. I'm not going to DOS

Re: [tor-talk] How to write program that uses Tor network

2015-09-29 Thread Tyler Hardin
Also, about rate limiting, what sort of rate limit do y'all think would be mindful of the health of the network and the average site? I'm thinking a maximum of 1 req per second per site and 10 reqs per second overall (in other words, 10 sites per second). But based on my experience browsing with
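The limits proposed above (at most 1 request per second per site, 10 per second overall) could be sketched as a small scheduler like the one below. The class name `CrawlLimiter` and its `reserve` method are invented for illustration. The overall limit is approximated here as a minimum gap of 100 ms between any two requests, which is a simplifying assumption and stricter than a true 10-per-second budget. The current time is passed in by the caller so the logic can be checked without sleeping.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <string>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

// Hypothetical sketch: enforces a minimum gap between requests to the
// same site and a minimum gap between any two requests.
class CrawlLimiter {
public:
    CrawlLimiter(Clock::duration per_site_gap, Clock::duration global_gap)
        : per_site_gap_(per_site_gap), global_gap_(global_gap) {
        assert(per_site_gap >= Clock::duration::zero());
        assert(global_gap >= Clock::duration::zero());
    }

    // Returns the earliest time at which a request to `site` may be sent
    // and records that slot as taken. The caller waits until then.
    Clock::time_point reserve(const std::string& site, Clock::time_point now) {
        Clock::time_point when = now;
        if (has_global_) {
            when = std::max(when, last_global_ + global_gap_);
        }
        auto it = last_per_site_.find(site);
        if (it != last_per_site_.end()) {
            when = std::max(when, it->second + per_site_gap_);
        }
        last_global_ = when;
        has_global_ = true;
        last_per_site_[site] = when;
        return when;
    }

private:
    Clock::duration per_site_gap_;
    Clock::duration global_gap_;
    Clock::time_point last_global_{};
    bool has_global_ = false;
    std::unordered_map<std::string, Clock::time_point> last_per_site_;
};
```

A crawler would construct it as `CrawlLimiter(std::chrono::seconds(1), std::chrono::milliseconds(100))`, call `reserve(site, Clock::now())` before each fetch, and wait until the returned time. Because slots are handed out in call order, reserving a site that is still cooling down also delays whatever is reserved next, so a real crawler would probably pick the next site from those that are ready first.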

Re: [tor-talk] How to write program that uses Tor network

2015-09-29 Thread Andreas Krey
On Tue, 29 Sep 2015 23:22:37 +, Tyler Hardin wrote:
> Also, about rate limiting, what sort of rate limit do y'all think would be
> mindful of the health of the network and the average site? I'm thinking a
> maximum of 1 req per second per site and 10 reqs per second overall

Perhaps you should