John Ioannidis wrote:
Hmmm... how about a market-data feed for warez?
That would be useful for research. My colleague Karl Chen pointed out
that it would probably be more useful for the underground market.
For the case of drug street prices, the U.S. Drug Enforcement Administration
does keep a database of prices, called STRIDE, obtained from informant
and undercover agent buys of drugs. These are records from actual buys,
so they partially address the concern Richard Clayton raises about going
by advertised list price -- but there are concerns (to which Richard
alludes) about whether agents systematically overpay or informants
systematically lie about the price they paid for drugs in order to
pocket the difference between money given to them for drug buys and the
actual price.
STRIDE also includes data on purity of drugs assayed in DEA labs. This
includes drugs seized by the feds, but not usually drugs seized by local
agencies. There's actually a trio of papers here in particular that
might be of interest to people who want to look at possible parallels
between data gathering on drug street prices and illegal digital goods.
The first is an overview paper that discusses the conceptual and
practical problems in doing price and purity analyses over time for
illegal drugs. The paper also points out some interesting features of
the drug market. For example, the author points out that drugs are
"experience goods." That is, the purchaser does not know the actual
quality of the good until after making the purchase. For drugs, quality
means purity of the drug. What this boils down to is that when looking
at time series of drug street prices, it turns out you need to model
what the buyer believes the purity of the drug will be to make sense of
the data.
"Price and purity analysis for illicit drugs: Data and conceptual issues"
J.P. Caulkins
Drug and Alcohol Dependence, Volume 90, Pages S61-S68
http://linkinghub.elsevier.com/retrieve/pii/S0376871606003061
(Unfortunately the article is behind a paywall.)
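To make the "experience goods" point concrete, here is a minimal sketch of the difference between pricing by assayed purity and pricing by the purity the buyer expected. The numbers are invented, and the assumption that buyers expect the market-average purity is mine for illustration; it is not the model used in the paper.

```python
def price_per_expected_pure_gram(price, grams, expected_purity):
    """Price per gram of pure drug the buyer *expected* to receive."""
    if not 0 < expected_purity <= 1:
        raise ValueError("expected_purity must be in (0, 1]")
    return price / (grams * expected_purity)


# Hypothetical buys: (price in dollars, grams, purity found by assay).
buys = [(100.0, 1.0, 0.50), (100.0, 1.0, 0.25), (100.0, 1.0, 0.75)]

# Assumption: the buyer cannot observe purity before paying, so the
# expectation is taken to be the average purity across all buys.
expected = sum(purity for _, _, purity in buys) / len(buys)

for price, grams, actual in buys:
    # What the buyer believed they were paying per pure gram.
    ex_ante = price_per_expected_pure_gram(price, grams, expected)
    # What they turned out to pay once the purity was known.
    ex_post = price / (grams * actual)
    print(f"expected-purity price: {ex_ante:.2f}  assayed-purity price: {ex_post:.2f}")
```

In this toy data every buyer pays the same price against expected purity, while the price computed from assayed purity swings widely. A time series built on the latter would show volatility that the buyers themselves never saw at the point of sale.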
The second looks at the STRIDE data and argues it is not suitable for
use in economic analyses of the drug market. The primary criticism is
that the data are mainly gathered from buys intended to produce evidence
for busts, except for a smaller program aimed solely at heroin. They are
therefore not a uniform sample of any kind. More interesting to me,
however, is the author's contention that the data are not internally
consistent: he is able to separate out prices reported by the DEA from
prices reported by the DC metro police, then does an analysis showing
that the two agencies report a statistically significant difference in
prices. He concludes that the difference is greater than can be
accounted for by normal price differences within a single city and that
therefore something is wrong with the data.
"Should the DEA's STRIDE Data Be Used for Economic Analyses of Markets
for Illegal Drugs?"
Horowitz, Joel L
http://www.biz.uiowa.edu/econ/papers/uia/STRIDE_rev1a.pdf
The third and final paper is a rebuttal of the second. The authors claim
that the second paper improperly lumps together retail and wholesale
purchases of illegal drugs. They also claim that the second paper does
not properly account for the relationship between price and purity of a
drug. Once they toss the appropriate magic indicator variables into
their regressions and stratify by purchase type, the supposed conflict
between DEA and DC police reported prices disappears.
"Why the DEA STRIDE Data are Still Useful for Understanding Drug Markets"
Jeremy Arkes, Rosalie Liccardo Pacula, Susan M. Paddock, Jonathan P.
Caulkins, Peter Reuter
NBER Working Paper No. 14224
Issued in August 2008
http://www.nber.org/papers/w14224
(Also paywalled, unfortunately)
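The disagreement between the second and third papers can be illustrated with a toy example of how pooling purchase types can manufacture a gap between agencies. This is not the regression from either paper; the data are invented, and the labels "DEA" and "DC_police" are just tags for the two reporting sources.

```python
from collections import defaultdict
from statistics import mean

# Invented records: (agency, purchase_type, price_per_gram).
# Within each purchase type both agencies pay the same price, but
# the mix of wholesale and retail buys differs between them.
buys = [
    ("DEA", "wholesale", 50), ("DEA", "wholesale", 50),
    ("DEA", "wholesale", 50), ("DEA", "retail", 100),
    ("DC_police", "wholesale", 50), ("DC_police", "retail", 100),
    ("DC_police", "retail", 100), ("DC_police", "retail", 100),
]


def mean_price(rows, key):
    """Average price_per_gram for each group defined by key(row)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {group: mean(prices) for group, prices in groups.items()}


# Pooled comparison: ignores purchase type entirely.
pooled = mean_price(buys, key=lambda r: r[0])

# Stratified comparison: compares agencies within each purchase type.
stratified = mean_price(buys, key=lambda r: (r[0], r[1]))

print("pooled:", pooled)
print("stratified:", stratified)
```

Pooled, the two sources appear to report different prices. Stratified by purchase type, the gap vanishes, because it came entirely from the different mix of wholesale and retail buys.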
What is the relevance to us? Well, I see a couple of points:
1) Like drugs, compromised PayPal accounts appear to be experience
goods. In the case of drugs, quality is purity. In the case of
compromised PayPal accounts, quality is something like the amount of
money that can be successfully moved out of the account. Therefore, I
would expect that the same kind of modelling of the buyer's "expected
quality" of the good would be useful for us. In particular, failing to take it
into account when analyzing price series could lead to the same kind of
internal inconsistencies noted by Horowitz.
It is not clear to me where other illegal digital goods stand. Botnets,
for example, seem easy enough to test for whether they are real. Also, as Peter
Gutmann points out, escrow services are possible and exist with illegal
digital goods to aid fair exchange -- this is not reported for drugs.
2) Unlike STRIDE, the data sets we have reported so far were gathered
specifically with research in mind, and not as part of some other
mission. Unfortunately, they still are almost certainly not uniform
samples of illegal prices, and unlike STRIDE, as pointed out, they are
not actual t