Hello, my name is Juan Florez and I am a PhD student at the University of Texas at Dallas.

I'm writing to report a problem that surfaced after we tried to download a relatively large number of bug reports from this project's Bugzilla. In short, every domain under apache.org started rejecting connections from our networks, and we suspect we were blacklisted at the top level. This is strange because we had already downloaded similar numbers of bug reports from other Apache bug trackers and never ran into any issues.

We already tried contacting the webmaster but got no answer. The original email is below, and it explains the problem in more detail.

I would appreciate your help in sorting out this issue, since our research routinely depends on data of this nature.

Sincerely,



-------- Forwarded Message --------
Subject:        Issues accessing apache services from campus network
Date:   Tue, 23 Jan 2018 15:01:24 -0600
From:   Juan Florez <jxf160...@utdallas.edu>
To:     webmas...@apache.org
CC:     cs-t...@utdallas.edu, Oscar Chaparro <ojchapar...@utdallas.edu>



My name is Juan Florez and I write on behalf of the SEERS group at the
University of Texas at Dallas. I'm writing this email to report
difficulties accessing this domain from some of our networks after
attempting to collect data for research purposes.

The problems started on Sunday, January 21, 2018, after we tried to
programmatically download around 31k bug reports from the website
https://bz.apache.org/ooo/ , accessing each one as XML through Bugzilla
(for example https://bz.apache.org/ooo/show_bug.cgi?ctype=xml&id=84969
). After a portion of the bug reports had been downloaded, further
connections to the website started timing out, and we realized that any
connection to an apache.org subdomain would also time out. The problem
surfaced again after retrying from two other networks with different IP
addresses.
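
For reference, the kind of download loop described above could be
throttled roughly as follows. This is only a minimal sketch: the
function names, the injectable fetch/sleep helpers, and the one-second
delay are illustrative assumptions, and the actual rate limit enforced
by the server is not known.

```python
import time
import urllib.request

# URL pattern taken from the example bug report URL in this message.
BASE_URL = "https://bz.apache.org/ooo/show_bug.cgi"


def bug_xml_url(bug_id):
    """Build the XML export URL for one bug report."""
    return f"{BASE_URL}?ctype=xml&id={bug_id}"


def fetch(url, timeout=30):
    """Download one URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_bugs(bug_ids, delay_seconds=1.0, fetch_fn=fetch, sleep_fn=time.sleep):
    """Yield (bug_id, body) pairs, pausing between consecutive requests.

    delay_seconds is a placeholder value (assumption): the real limit
    accepted by the server is unknown.
    """
    for index, bug_id in enumerate(bug_ids):
        if index > 0:
            # Throttle: wait before every request except the first one.
            sleep_fn(delay_seconds)
        yield bug_id, fetch_fn(bug_xml_url(bug_id))
```

Iterating over download_bugs(range(first_id, last_id + 1)) would then
issue at most one request per delay interval; the ID range is likewise
hypothetical.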

This came as a surprise since we have performed this procedure before,
even on other Apache issue trackers (for example
https://issues.apache.org/jira/browse/CASSANDRA-7657 ), and never
encountered any problems while downloading thousands of bug reports.

We would appreciate your help in solving this issue, since we routinely
require access to many Apache services for our research. The IPs
affected by this problem are:
 - 129.110.93.16 (on-campus)
 - 129.110.241.5 (on-campus)
 - 66.253.176.84 (off-campus, used after the two on-campus options
stopped working)

We suspect this is an issue related to rate limits, and we are CCing
our department's tech support office so that they can set a rate limit
on the campus networks to prevent this situation from happening again.
However, we could not determine this rate limit ourselves, so we would
appreciate it if you could include it in your reply to this email.

We apologize for any inconvenience caused. Please don't hesitate to
write back if you require more details.


Sincerely,

--
Juan Manuel Florez
Software Engineering PhD Student
