Re: Concepts of Unique Tracking
On Fri, May 25, 2001 at 10:03:04AM -0700, Jonathan Hilgeman wrote:
> Now, I'm assuming that Apache has full access to these incoming packets.
> Therefore, they must also have access to this invisible identifier. Is it
> possible to extract that identifier somehow by tinkering with Apache?

Most NAT implementations keep a hash of destination ports -> internal IP. To wit:

> 1) Person behind the firewall sends out a request to a web server.

Person _really_ establishes an outgoing TCP session with his NAT box. The NAT box notes his internal_IP:dest_port, sets up an outgoing TCP session to the web server, and notes its own source port for that leg.

> 4) The firewall receives the packets of data first, but now must send those
> data packets to someone inside the firewall.

Returning packets from the web server come to that source port; the NAT box looks up its hash of external_IP:source_port -> internal_IP:dest_port, and hands the packet in.

> 5) The packets of data MUST have some unique identifier to let the firewall

That would be the source port of the NAT box's outgoing connection. But:

- each outgoing TCP connection from the internal host will use a different source port.
- the request your web server is receiving may actually (likely) be coming from a web cache somewhere.

>
> Jonathan
>

-- 
Brian 'you Bastard' Reichert <[EMAIL PROTECTED]>
37 Crystal Ave. #303        Daytime number: (603) 434-6842
Derry NH 03038-1713 USA     Intel architecture: the left-hand path
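The lookup Brian describes can be sketched as a toy translation table. This is only an illustration under assumptions: the class name `NatBox`, the starting port 61000, and the addresses are made up, and a real NAT keys on more than the port (protocol, remote address, timeouts).

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class NatBox:
    """Toy port-translating NAT: one public IP shared by many internal hosts."""
    public_ip: str
    # public source port -> (internal IP, internal source port)
    table: dict = field(default_factory=dict)
    # Hypothetical port allocator; real boxes pick ports differently.
    _next_port: count = field(default_factory=lambda: count(61000))

    def outbound(self, internal_ip, internal_port):
        """Record a new outgoing connection and return what the web server sees."""
        public_port = next(self._next_port)
        self.table[public_port] = (internal_ip, internal_port)
        return self.public_ip, public_port

    def inbound(self, public_port):
        """Find the internal host that a returning packet belongs to."""
        return self.table[public_port]


nat = NatBox("203.0.113.7")
seen_a = nat.outbound("192.168.1.42", 49372)
seen_b = nat.outbound("192.168.1.42", 49373)  # same host, second connection
seen_c = nat.outbound("192.168.1.99", 49372)  # different host

print(seen_a, seen_b, seen_c)   # same public IP each time, different ports
print(nat.inbound(seen_a[1]))   # only the NAT box can map this back
```

The point of the sketch is that the "identifier" is just a short-lived port number: it differs between two connections from the same machine, and the mapping back to an internal address never leaves the NAT box.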
RE: Concepts of Unique Tracking
ASCEND SOAPBOX

I agree with Alex (and it's not just because we work together). Companies have been doing the kind of data collecting Alex is talking about for years. As a matter of fact, some Cultural Anthropologists specialize in "Corporate Anthropology" (for a recent related news item see http://www.cnn.com/2001/CAREER/dayonthejob/05/23/corp.anthropologist.idg/index.html ).

Collecting anonymous information about users is something almost all websites do - I'm hesitant to say all, because I'm sure one website out there doesn't keep a usage log (e.g. /usr/local/apache/logs/access_log or /usr/local/apache/logs/error_log). It would be almost impossible to run a good website that changes based on user trends and preferences and not do some form of user tracking.

Of course the real problem is when the website tries to link the collected data in some way to real people. Knowing that the HTTP_REFERER for 15% of your users is www.porn.com is one thing; knowing that Persons X, Y, and Z came from www.porn.com, and acting on that knowledge to send them information about the latest sale on leather underwear and to sell their names to the porn_users mailing list, is completely wrong.

In my opinion, a good website has to track generalizations about user preferences so it can react to add to the user experience in positive ways. One way to do this is to collect anonymous data about the things a user does on the site. This can be done and still protect a user's privacy.

DESCEND SOAPBOX

Joe Breeden
-- 
Sent from my Outlook 2000 Wired Deskheld (www.microsoft.com)

-----Original Message-----
From: Alex Porras
Sent: Friday, May 25, 2001 2:38 PM
To: '[EMAIL PROTECTED]'
Subject: RE: Concepts of Unique Tracking
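The kind of aggregate figure Joe mentions (what share of requests arrive from a given referring site) can be computed from a log without touching any per-person data. The sketch below assumes the log is written in Apache's "combined" format, which includes the Referer field; the plain "common" format does not. The function name `referrer_shares` and the sample lines are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Tail of a combined-format line: "request" status bytes "referer" "user-agent"
LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def referrer_shares(lines):
    """Return {referring host: fraction of requests}, ignoring who made them."""
    hosts = Counter()
    for line in lines:
        match = LINE.search(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        host = urlparse(match.group("referer")).netloc or "(none)"
        hosts[host] += 1
    total = sum(hosts.values())
    return {host: n / total for host, n in hosts.items()} if total else {}


sample = [
    '10.0.0.1 - - [25/May/2001:10:03:04 -0700] "GET / HTTP/1.0" 200 1043 "http://www.example.com/a" "Mozilla/4.7"',
    '10.0.0.2 - - [25/May/2001:10:03:09 -0700] "GET /x HTTP/1.0" 200 512 "http://www.example.com/b" "Mozilla/4.7"',
    '10.0.0.3 - - [25/May/2001:10:04:00 -0700] "GET / HTTP/1.0" 200 1043 "http://search.example.org/?q=perl" "Mozilla/4.0"',
    '10.0.0.4 - - [25/May/2001:10:05:30 -0700] "GET / HTTP/1.0" 200 1043 "-" "Mozilla/4.0"',
]
shares = referrer_shares(sample)
print(shares)
```

Only host names and counts survive the aggregation, which is the distinction the message draws between tracking trends and tracking people.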
RE: Concepts of Unique Tracking
Although I agree about privacy issues, I will keep it short by stating that there is a difference between identifying you as "unique user 1309850825" (assuming no personally identifiable information is also collected) versus identifying you as "Stephen Adkins".

You can use the first method to collect aggregate information about what percentage of your users are accessing what parts of your website the most/least, so you could customize your website appropriately. That does not require me to know who everyone is, personally speaking.

--Alex

> -----Original Message-----
> From: Stephen Adkins [mailto:[EMAIL PROTECTED]]
> Sent: Friday, May 25, 2001 1:14 PM
> To: Jonathan Hilgeman; '[EMAIL PROTECTED]'
> Subject: RE: Concepts of Unique Tracking
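One way to get an identifier like "unique user 1309850825" without knowing who the person is would be a random value stored in a cookie. The following is a minimal sketch under assumptions: the cookie name `visitor_id` and the helper `visitor_id_from` are hypothetical, and the visitor can still opt out by refusing or clearing cookies.

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def visitor_id_from(cookie_header):
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None when the browser already presented an ID,
    otherwise it is the value to send in a Set-Cookie response header.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar:
        return jar[COOKIE_NAME].value, None

    # Random and meaningless: carries no name, address, or hardware ID.
    new_id = secrets.token_hex(8)
    out = SimpleCookie()
    out[COOKIE_NAME] = new_id
    out[COOKIE_NAME]["path"] = "/"
    return new_id, out[COOKIE_NAME].OutputString()


first_id, set_cookie = visitor_id_from(None)              # first visit
again_id, nothing = visitor_id_from(f"visitor_id={first_id}")  # return visit
print(first_id, set_cookie)
print(again_id == first_id, nothing)
```

The ID only says "same browser as before"; linking it to a real person would require collecting something else, which is exactly the step the message argues against.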
RE: Concepts of Unique Tracking
How quickly we forget ...

Don't we remember the huge outcry over Intel putting a unique ID in every CPU which could be transmitted via web browser and destroy all of our privacy?

The frustration we feel as programmers who are trying to identify anonymous visitors is exactly what privacy is all about. And I am thankful for it.

Get used to it. People need to opt in in order to be identified. The closest thing we can get to this is people leaving their cookies enabled on their browser.

Stephen

At 10:43 AM 5/25/2001 -0700, Jonathan Hilgeman wrote:
>Let's take over the world and recompile all browsers to have them send out
>the MAC address of the network card.
>
>Jonathan
>
RE: Concepts of Unique Tracking
Dialup users will be given high-speed connections using network cards, and modems will be burned. It'll be like book-burning sessions all over again.

Jonathan

-----Original Message-----
From: Ilya Martynov [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 25, 2001 10:53 AM
To: Jonathan Hilgeman
Cc: '[EMAIL PROTECTED]'
Subject: Re: Concepts of Unique Tracking
Re: Concepts of Unique Tracking
JH> Let's take over the world and recompile all browsers to have them send out
JH> the MAC address of the network card.

.. and if I'm a dialup user :)

-- 
 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
| Ilya Martynov (http://martynov.org/)                                    |
| GnuPG 1024D/323BDEE6 D7F7 561E 4C1D 8A15 8E80 E4AE BE1A 53EB 323B DEE6  |
| AGAVA Software Company (http://www.agava.com/)                          |
 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
RE: Concepts of Unique Tracking
Let's take over the world and recompile all browsers to have them send out the MAC address of the network card.

Jonathan

-----Original Message-----
From: Wim Kerkhoff [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 25, 2001 10:42 AM
To: Jonathan Hilgeman
Cc: '[EMAIL PROTECTED]'
Subject: Re: Concepts of Unique Tracking
Re: Concepts of Unique Tracking
Jonathan Hilgeman wrote:
>
> What about client-specific information available in Javascript, like screen
> resolution, size, etc...? Can that be accessed by tinkering with Apache a
> bit, or is it something only available because of the browser, since
> Javascript is dependent on the browser?

I briefly thought about suggesting something like that, or using it in combination with the other headers that get sent in the HTTP request for language, encoding, etc. However, think of situations such as computer labs, internet cafes, etc., where all computers are identical in every aspect, with the exact same version of the browser, hard-coded screen resolutions (e.g. 800x600), etc., that the user cannot change.

-- 
Regards,
Wim Kerkhoff, Software Engineer
Merilus, Inc. -|- http://www.merilus.com
Email: [EMAIL PROTECTED]
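Wim's objection can be made concrete with a small sketch that hashes request headers plus a screen resolution into a "fingerprint". Everything here is an assumption for illustration: the function name `fingerprint`, the choice of headers, and the sample values. Note also that the server never sees the screen resolution on its own; a script running in the browser would have to send it back.

```python
import hashlib


def fingerprint(headers, screen):
    """Hash a few request headers and a reported screen size into a short ID."""
    parts = [
        headers.get("User-Agent", ""),
        headers.get("Accept-Language", ""),
        headers.get("Accept-Encoding", ""),
        screen,  # would have to be reported by script in the page
    ]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()[:16]


# Two seats in the same computer lab: identical install, identical settings.
lab_machine = {
    "User-Agent": "Mozilla/4.7 [en] (WinNT; I)",
    "Accept-Language": "en",
    "Accept-Encoding": "gzip",
}
seat_1 = fingerprint(lab_machine, "800x600")
seat_2 = fingerprint(dict(lab_machine), "800x600")
home = fingerprint({**lab_machine, "Accept-Language": "nl"}, "1024x768")

print(seat_1 == seat_2)  # True: two different people, one fingerprint
print(seat_1 == home)    # False
```

Identical machines collapse into one fingerprint, so the value cannot tell lab users apart, which is the weakness the message points out.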
RE: Concepts of Unique Tracking
Actually, someone suggested HTTP authorization - does that require a cookie to work? Or after they are authorized, does it simply keep the session open in the browser...?

Jonathan

-----Original Message-----
From: Brian Reichert [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 25, 2001 10:20 AM
To: Jonathan Hilgeman
Cc: '[EMAIL PROTECTED]'
Subject: Re: Concepts of Unique Tracking
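On the HTTP authorization question: with the Basic scheme no cookie is involved. Once the user has typed a name and password, the browser resends an Authorization header with each request to the protected area, and the server reads the user name from it (in a CGI environment it typically shows up as REMOTE_USER). The sketch below only shows how that header is built and read; the helper names and the credentials are made up, and Digest authentication works differently.

```python
import base64


def basic_auth_header(user, password):
    """Build the header line a browser sends for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def user_from_header(header_line):
    """Return the user name from a Basic Authorization header, else None."""
    name, _, value = header_line.partition(": ")
    scheme, _, token = value.partition(" ")
    if name.lower() != "authorization" or scheme != "Basic":
        return None
    user, _, _password = base64.b64decode(token).decode("utf-8").partition(":")
    return user


line = basic_auth_header("jonathan", "secret")  # made-up credentials
print(line)
print(user_from_header(line))
```

Because the credentials are only base64-encoded, not encrypted, Basic authentication identifies a user who has opted in by logging in, but it needs a secure transport to be safe.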
RE: Concepts of Unique Tracking
Actually, I had come up with a similar idea after I sent that one off. My idea was that packets had packet identifiers in their header or footer, and the packet identifiers were stored in the firewall and referenced to the computer inside the firewall, so whenever packets with that identifier came back, the firewall knew which computer to send them to. Oh well.

What about client-specific information available in Javascript, like screen resolution, size, etc...? Can that be accessed by tinkering with Apache a bit, or is it something only available because of the browser, since Javascript is dependent on the browser?

Jonathan

-----Original Message-----
From: Wim Kerkhoff [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 25, 2001 10:15 AM
To: Jonathan Hilgeman
Cc: '[EMAIL PROTECTED]'
Subject: Re: Concepts of Unique Tracking
Re: Concepts of Unique Tracking
Jonathan Hilgeman wrote:
> Now, I'm assuming that Apache has full access to these incoming packets.
> Therefore, they must also have access to this invisible identifier. Is it
> possible to extract that identifier somehow by tinkering with Apache?

The only thing that you can access from the webserver side is the REMOTE_ADDR and REMOTE_PORT.

IP masquerading is handled only by the firewall that is doing the masquerading: the web server and browser have no idea that this is happening. The firewall has a table that keeps track of open TCP connections, so that when it receives data on the outside port (e.g. 61172) it knows to rewrite the packet and send it off back to the inside client (e.g. 192.168.1.42:49372) that created the initial TCP connection.

This is one of the primary reasons that cookies exist.

-- 
Regards,
Wim Kerkhoff, Software Engineer
Merilus, Inc. -|- http://www.merilus.com
Email: [EMAIL PROTECTED]
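For reference, this is roughly all a CGI-style script can learn about the network peer. The sketch assumes the server exports REMOTE_ADDR and REMOTE_PORT as environment variables, as Apache does for CGI; the function name `peer_from_environ` is made up.

```python
import os


def peer_from_environ(environ=os.environ):
    """Return the (address, port) the web server saw for this request."""
    addr = environ.get("REMOTE_ADDR", "unknown")
    port = environ.get("REMOTE_PORT", "unknown")
    return addr, port


if __name__ == "__main__":
    addr, port = peer_from_environ()
    # Minimal CGI response: headers, blank line, body.
    print("Content-Type: text/plain")
    print()
    print(f"You appear to come from {addr}:{port}")
```

For a visitor behind a masquerading firewall, both values describe the firewall's outside address and port (61172 in Wim's example), never 192.168.1.42:49372.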
Re: Concepts of Unique Tracking
Jonathan Hilgeman <[EMAIL PROTECTED]> wrote:
>Okay, after I think about it, there must be a way to identify a unique user,
>even if they are behind a firewall. Let's run through this process:
>
>1) Person behind the firewall sends out a request to a web server.
>2) The firewall intercepts that request, masks the person's IP address and
>lets the request keep going out.
>3) The web server receives the request and sends back packets of data to the
>IP of the user, which is really the IP of the firewall now.
>4) The firewall receives the packets of data first, but now must send those
>data packets to someone inside the firewall.
>5) The packets of data MUST have some unique identifier to let the firewall
>know who requested the data in the first place.
>
>Now, I'm assuming that Apache has full access to these incoming packets.
>Therefore, they must also have access to this invisible identifier. Is it
>possible to extract that identifier somehow by tinkering with Apache?

No. What happens is more like this:

(1) Browser opens a socket for connecting to the remote server. This assigns a unique identifier to the TCP connection - IP + port on the client side.

(2) Browser connects to the remote server, which actually ends up connecting to the firewall. The firewall has a unique number on its side - its IP + port (80 or 443 most likely).

(3) Firewall opens a socket for connecting to the remote server. This assigns a unique identifier to the TCP connection - the firewall's public IP + port. The firewall remembers this and will transfer any data coming from the client to this connection, and any data from this connection to the client.

This is part of what is meant by a firewall which saves state information. All the information needed to connect the client and server via the firewall is kept within the firewall. Neither the client nor the server need be aware of any of it, nor, afaik, can they be aware of it without putting an HTTP proxy on the firewall.

The server is seeing the firewall's IP and port, not the actual client's. This will change with each new connection, which will happen whenever the keepalive timeout expires.

-- 
James Smith <[EMAIL PROTECTED]>, 979-862-3725
Texas A&M CIS Operating Systems Group, Unix
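The last point, that the address and port the server sees change with each connection, can be observed on a single machine. The sketch below opens several TCP connections over the loopback interface and records what the listening side sees; the function name `observed_peers` is made up, and there is no firewall involved, only the per-connection source port.

```python
import socket


def observed_peers(n_connections=2):
    """Open n connections to a local listener; return the peers it observed."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(n_connections)
    peers, clients = [], []
    try:
        for _ in range(n_connections):
            # Each connect gets its own source port from the OS.
            clients.append(socket.create_connection(server.getsockname()))
            conn, peer = server.accept()
            peers.append(peer)  # (address, port) as the server sees it
            conn.close()
    finally:
        for client in clients:
            client.close()
        server.close()
    return peers


peers = observed_peers(3)
print(peers)  # same address each time, a different port per connection
```

The same "client" shows up under a new port every time it reconnects, so the address and port pair identifies a connection, not a visitor.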