Great idea! Anyone on the list know if there's a way to make the debug log facilities do the YYYYMMDD timestamp instead of the longer one?
If not, I suppose we could work to update the core MediaWiki code. [1]

-Adam

1. For those with PHP skills or equivalent, I'm referring to
https://git.wikimedia.org/blob/mediawiki%2Fcore.git/a26687e81532def3faba64612ce79b701a13949e/includes%2FGlobalFunctions.php#L1042.
Scroll to the bottom of the function definition to see the datetimestamp
approach.

On Wed, Apr 16, 2014 at 12:47 PM, Andrew Gray <andrew.g...@dunelm.org.uk> wrote:

> Hi Adam,
>
> One thought: you don't really need the date/time data at any detailed
> resolution, do you? If what you're wanting it for is to track major
> changes ("last month it all switched to this IP") and to purge old
> data ("delete anything older than 10 March"), you could simply log day
> rather than datetime.
>
> enwiki / 127.0.0.1 / 123.45 / 2014-04-16:1245.45
>
> enwiki / 127.0.0.1 / 123.45 / 2014-04-16
>
> - the latter gives you the data you need while making it a lot harder
> to do any kind of close user-identification.
>
> Andrew.
> On 16 Apr 2014 19:17, "Adam Baso" <ab...@wikimedia.org> wrote:
>
> > Inline.
> >
> > Thanks for starting this thread.
> >
> > > Sorry if I've overlooked this, but who/what will have access to this
> > > data?
> > > Only members of the mobile team? Local project CheckUsers? Wikimedia
> > > Foundation-approved researchers? Wikimedia shell users? AbuseFilter
> > > filters?
> >
> > It's a good question. The thought is to put it in the customary
> > wfDebugLog location (with, for example, filename "mccmnc.log") on
> > fluorine.
> >
> > It just occurred to me that the wiki name (e.g., "enwiki"), but not the
> > full URL, gets logged additionally as part of the wfDebugLog call; to
> > make the implicit explicit, wfDebugLog adds a datetime stamp as well,
> > and that's useful for purging old records. I'll forward this email to
> > mobile-l and wikitech-l to underscore this.
> >
> > > And this may be a silly question, but is there a reasonable means of
> > > approximating how identifying these two data points alone are? That is,
> > > Using a mobile country code and exit IP address, is it possible to
> > > identify a particular editor or reader? Or perhaps rephrased, is this
> > > data considered anonymized?
> >
> > Not a silly question. My approximation is these tuples (datetime, now
> > that it hit me - XYwiki, exit IP, and MCC-MNC) alone, although not
> > perfectly anonymized, are low identifying (that is, indirect inferences
> > on the data in isolation are unlikely, but technically possible, through
> > examination of short tail outliers in a cluster analysis where such
> > readers/editors exist in the short tail outliers sets), in contrast to
> > regular web access logs (where direct inferences are easy).
> >
> > Thanks. I'll forward this along now.
> >
> > -Adam
> > _______________________________________________
> > Wikimedia-l mailing list
> > Wikimedia-l@lists.wikimedia.org
> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> > <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>