>If counting events and tracking activity on websites without the explicit consent of the people involved is unethical, then
>what are web log analyzer companies and software like Analog and Surfstat doing?
They only report on what's in the server's log file. How it got there is the ethical question...
If the user is browsing a web site it is ethical to track them. Hiding a transparent image, with or without a unique identifier, is still questionable. The purpose of the logging is not the question.
You do have a dilemma. You want to track this event anonymously so as not to affect the results. The problem is that many before you were not benign in their motives, so awareness and animosity have risen to the point where people trust no one...
Your data is going to be suspect from any angle. I cannot conceive of a method to record it that would ensure accuracy, given the state of the internet at this time.
John
"LAWRENCE BOYD" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/17/2004 12:33 PM
To: <[EMAIL PROTECTED]>
cc:
Subject: Re: [analog-help] Email Opening Count and Forwards for Research on Pyramiding in Public Advocacy Campaigns
John,
Good of you to comment. I agree with what you say about being up front about collecting data on email traffic, particularly now that I understand the web log technologies and the issues better. At first, I simply assumed that there would be no privacy problem as long as the survey was anonymous, no addresses or personal data were recorded, and all statistics were descriptive and at the summary level. I thought of it more as a non-invasive measurement, like measuring the wear of floor tiles in a museum to get an indication of traffic. But that still leaves the question of measuring something without explicit consent.
This is an interesting issue. I am going to consult the human subject protection people at the University of California, Berkeley for a history and an assessment of these issues. I am also curious regarding commercial applications. If counting events and tracking activity on websites without the explicit consent of the people involved is unethical, then what are web log analyzer companies and software like Analog and Surfstat doing? Does the distinction between ethical and unethical boil down to embedding things like transparent images to enable measurements without informing recipients? I wonder if visitors in public places are aware that an infrared beam is counting them? In research, weighing the benefits is part of the ethical question. The more I think about this, the more I am of the opinion that a scientific study that requires transparent tracking of email should go through a human subjects protocol process.
In the meantime, if we proceed we are thinking about two possibilities: The first is to ask recipients to let us know if they pass along the invitation to participate and to how many. The second is to explain to recipients what we are doing (counting) and for what purpose and ask for their consent. Both pose significant problems of response rates and the reliability and validity of results, but that's the name of the game.
As I said to Stil, I am delighted to see people in the Internet analysis business who are sensitive to privacy issues.
Thanks for your input.
Larry Boyd
----- Original Message -----
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, May 17, 2004 6:18 AM
Subject: Re: [analog-help] Email Opening Count and Forwards for Research on Pyramiding in Public Advocacy Campaigns
Lawrence, why hide the fact that you are collecting data? Since you are trying to be up-front about things, be honest with the users and you might get what you want.
John
Stilgherrian <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/14/2004 08:16 PM
To: "LAWRENCE BOYD" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
cc:
Subject: Re: [analog-help] Email Opening Count and Forwards for Research on Pyramiding in Public Advocacy Campaigns
At 15:17 -0700 14/5/04, LAWRENCE BOYD wrote:
>I am a social researcher doing a study of email pyramiding
>strategies for citizen-initiated polls. We have a url for a poll in
>an email invitation to participate. [snip]
>Can we get this [following] information using Analog? (Somewhere I
>read that you can embed a transparent image in the email that would
>produce a request that could be counted showing that the email was
>opened?)
Lawrence, the basic issue is that Analog can count and measure what
is in web server log files -- nothing more, nothing less. So whatever
you do must somehow generate a request for a file from a website.
That request would then be logged, and Analog can slice and dice that
data in all sorts of ways.
Any web log analysis program can do this; the ability is not unique
to Analog. It's all about how you set up your email, not about how
you analyze the logs.
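
To make the point concrete, here is a minimal sketch (in Python, not Analog itself) of what "counting what is in the log" amounts to. It assumes the server writes Common Log Format lines; the sample log lines, the file names, and the function name count_requests are all made up for illustration.

```python
import re

# Matches the request portion of a Common Log Format line, e.g.
# 192.0.2.1 - - [17/May/2004:12:33:00 -0700] "GET /someimage.gif HTTP/1.0" 200 43
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) [^"]*" (?P<status>\d{3})')

def count_requests(log_lines, path):
    """Count logged requests whose path (ignoring any query string) equals `path`."""
    count = 0
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        target_path = match.group("target").split("?", 1)[0]
        if target_path == path:
            count += 1
    return count

# Hypothetical log lines, invented for this example.
sample_log = [
    '192.0.2.1 - - [17/May/2004:12:33:00 -0700] "GET /someimage.gif HTTP/1.0" 200 43',
    '192.0.2.2 - - [17/May/2004:12:34:10 -0700] "GET /poll.html HTTP/1.0" 200 5120',
    '192.0.2.3 - - [17/May/2004:12:35:45 -0700] "GET /someimage.gif?55449582754 HTTP/1.0" 200 43',
]
print(count_requests(sample_log, "/someimage.gif"))
```

Nothing appears in the count unless a request actually reached the server, which is why everything depends on how the email is set up.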
Yes, you can embed an image in an HTML (that is, a formatted) email,
and have that image come from a web server rather than being attached
as part of the email. If that image is loaded as the email is read,
then a request to the web server is generated and logged. Indeed, if
you generate a unique ID number for each email you send out -- the
link to the image could be of the form
http://your.server.com/someimage.gif?55449582754, with a different
number in every email sent -- that number is also logged. You can
then see which email addresses generated a request, and you will know
that duplicate numbers can be traced back to the initial address.
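
As a rough sketch of the bookkeeping this implies, the Python below hands out one ID per address, builds the image link in the form shown above, and later maps logged requests back to the original recipient. The function names, the example addresses, and the choice of random 11-digit IDs are assumptions for illustration only; it also presumes the request targets have already been pulled out of the server log.

```python
import secrets
from collections import Counter
from urllib.parse import urlsplit

BASE_URL = "http://your.server.com/someimage.gif"  # URL form taken from the message above

def assign_ids(addresses):
    """Give each address its own random numeric ID; return {id: address}."""
    ids = {}
    for address in addresses:
        new_id = str(secrets.randbelow(10**11))
        while new_id in ids:  # re-draw on the (unlikely) collision
            new_id = str(secrets.randbelow(10**11))
        ids[new_id] = address
    return ids

def image_url(tracking_id):
    """Build the per-email image link carrying the ID as the query string."""
    return f"{BASE_URL}?{tracking_id}"

def loads_per_address(requested_targets, ids):
    """Count image loads per original recipient from logged request targets."""
    counts = Counter()
    for target in requested_targets:
        tracking_id = urlsplit(target).query
        if tracking_id in ids:  # ignore requests with no ID or an unknown one
            counts[ids[tracking_id]] += 1
    return counts

# Hypothetical addresses and log entries, invented for this example.
ids = assign_ids(["a@example.org", "b@example.org"])
a_id = next(i for i, addr in ids.items() if addr == "a@example.org")
logged = [f"/someimage.gif?{a_id}", f"/someimage.gif?{a_id}", "/someimage.gif"]
print(loads_per_address(logged, ids))
```

A count above one for an address only says that the same ID was requested more than once. The log alone cannot tell a re-opened email from a forwarded one.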
There are also tricky things you might do with JavaScript or other
scripting languages to generate a *new* unique ID when the email is
opened, thus enabling you to trace the email as it's passed on to
others.
However, there are two major problems with this technique:
* It will not deliver accurate results.
* Its ethics are extremely questionable.
It won't deliver accurate results because not all email client
programs will load the image.
* Many people (though probably in a minority these days) use email
clients which do not render HTML email. They will only see messy
HTML code, and are unlikely to follow the image link manually to
load the image. These people won't appear in your count.
* Many people whose email client *does* display HTML email will
have the display of external images turned off -- because it's
considered to be an invasion of privacy for someone to track
whether they open an email or not, and when. Indeed, this is
exactly the technique used by spammers to validate whether an
email address works or not.
The number of people blocking such external images is already high
and is increasing as email programs improve their security and
privacy.
* Similarly, any scripted tricks are increasingly likely to be
blocked, because running unknown program code as you open an
email is a security risk. This is exactly how viruses propagate.
The method is ethically questionable because:
* The external image-load technique does not allow the user to
provide informed consent to having their behaviour logged before
that logging takes place.
* Potentially, if you do the track-the-hand-on stuff, you're
compiling a list of who's a friend of whom (at least via their
email address, and that's fairly easy to match back to people)
and, given the context of your research question, matching that
to their political beliefs. How is that data going to be handled
and people's privacy protected?
So, to answer your specific questions...
>We need to learn:
>1. How many people opened the email,
No, because many people will not generate a request to the web server
even if they do open the email.
>2. How many clicked on the url, and
If you mean how many manually clicked on the URL to your poll, you
already know this from your web server logs. Or at least you know how
many requested the web page with the poll on it (because you're
logging it), and then how many completed the poll (because you're
also logging that).
However, there's no completely reliable way to know how they came to
find that web address unless you add that unique serial number to
every email. But you won't be able to analyze the pyramid down past
the first level.
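
A small Python sketch of that tally follows, assuming the poll page and the form submission are logged as two separate paths and that the serial number travels as the query string of the poll link. The paths /poll.html and /poll-submit, the serial numbers, and the function name summarize are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

POLL_PATH = "/poll.html"      # hypothetical: page showing the poll
SUBMIT_PATH = "/poll-submit"  # hypothetical: target of the completed form

def summarize(requested_targets):
    """Return (poll page views, completions, views per serial number)."""
    views = 0
    completions = 0
    views_by_serial = Counter()
    for target in requested_targets:
        parts = urlsplit(target)
        if parts.path == POLL_PATH:
            views += 1
            # An empty query means the visitor arrived without a serial number.
            views_by_serial[parts.query or "(no serial)"] += 1
        elif parts.path == SUBMIT_PATH:
            completions += 1
    return views, completions, views_by_serial

# Hypothetical request targets, invented for this example.
logged = [
    "/poll.html?1001",  # original recipient of email 1001
    "/poll.html?1001",  # same person again, or a friend who got the forward
    "/poll.html?1002",
    "/poll.html",
    "/poll-submit",
]
views, completions, by_serial = summarize(logged)
print(views, completions, dict(by_serial))
```

The two requests carrying 1001 are indistinguishable in the log, because a forwarded copy carries the same serial number as the original. That is the first-level limit described above.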
>3. How many forwarded the email to their friends - and the number
>of lines in a chain, if possible.
No, for the reasons outlined above. At least not reliably. And at
least not ethically.
Of course, if you threw ethical issues out of the window, this sort
of thing is possible. And indeed it's done all the time by people
like Emode/Tickle, and the less-than-reputable folks who embed
"spyware" programs into people's email clients under the guise of
providing them with "cute smiley faces for their email". Such folks
will quite happily track every email a person sends, who they send it
to and when, compile comprehensive behavioural profiles of
individuals and sell them.
But since you're researching "public advocacy campaigns", and this
sort of hidden user-tracking represents an attitude which is the
exact opposite of advocating the rights of the public, I'm hoping
this mis-match prevents you choosing this path for your research. :)
I hope that helps,
Stil
--
Stilgherrian <[EMAIL PROTECTED]>
Internet, IT and Media Consulting, Sydney, Australia. ABN 25 231 641 421
mobile 0407 623 600 (international +61 407 623 600)
fax 02 9516 5630 (international +61 2 9516 5630)