On Sun, Nov 18, 2018 at 11:20 PM Richard Eisenberg <[email protected]> wrote:
>
> I have not analyzed the data myself, but I wonder how we jumped to the
> conclusion that the troll was trying to promote Stack. Is there statistical
> data that supports that conclusion? For example, just reading this thread, it
> sounds like the bogus responses also really don't like the new release
> schedule. Maybe the troll wants the old release schedule back and was just
> lazy about programming the tool to vary the stack/cabal question answers
> adequately.

Roughly 90% of the bogus responses disliked the new GHC schedule and 10% left the answer blank. As far as I know, 100% of the bogus responses said they used Stack exclusively. The answers to almost every other question (except, I think, for targeted platform?) varied significantly (although, for the most part, according to uniform, linear, or normal distributions). So as guesses go, this seems pretty strong.

I will also say, though there's speculation about "false flags" and other silliness floating around, that I personally have a very good guess as to who did this. There's one well-known troll who has these preoccupations, is known for creating serial sockpuppet accounts, and is just the right amount of obsessed to do something like this. A few of the bogus responses actually had comments, and the comments were all written in a voice that was unmistakably this troll's as well. Occam's razor seems to apply.

Finally, let me add why I don't think this was a "false flag" -- while there were enough telltale markers that the fake answers could be detected, I don't think this was on purpose. There was _too much_ effort put into distributions of other choices, etc. If they had wanted the fakes to be detected, they would have left much stronger evidence. Rather, from a forensic standpoint, it seems pretty clear to me that the pattern of data is that of someone _trying_ to cover their tracks, but making four or five errors which I could assemble into a pattern. If they hadn't made those errors -- likely based on bad priors about what the organic data would be that theirs would need to "mesh" into -- then I think the deception would have been much harder to detect.

--Gershom

> Given the contention around cabal vs stack, I agree that sociological
> concerns suggest that the troll meant to tilt those scales. But I wouldn't
> want a public accusation without at least some statistical analysis that
> independently supports that conclusion.
>
> In any case, thanks to all for putting this together!
>
> Richard
>
> On Nov 18, 2018, at 4:31 PM, Taylor Fausak <[email protected]> wrote:
>
> Oops, the ordering of the answer choices is manual because some questions
> have a natural order while others should just be most to least popular. I've
> made another run through to make sure everything is sorted properly. I'll
> probably hit publish in the next half hour or so unless there are any
> objections.
>
> https://github.com/tfausak/tfausak.github.io/blob/fce97d07c369856d4c05b756c492eb6229a1b5c7/_posts/2018-11-18-2018-state-of-haskell-survey-results.markdown
>
> On Sun, Nov 18, 2018, at 3:07 PM, Gershom B wrote:
>
> The language extensions section doesn’t appear to be sorted properly. Outside
> of that, I think that these results are looking much better and any effort to
> find any additional outliers is probably not worth it for the moment. Thanks
> for your work on this, and I appreciate you being responsive and attentive
> when problems with the data were pointed out. There’s certainly some
> interesting and helpful information to be gleaned from this data.
>
> Cheers,
> Gershom
>
> On November 18, 2018 at 2:55:10 PM, Taylor Fausak ([email protected]) wrote:
>
> Ok, I updated the function that checks for bad responses, re-ran the script,
> and updated the announcement along with all the assets (charts, tables, and
> CSV). Hopefully it's the last time, as I can't justify spending much more
> time on this.
>
> https://github.com/tfausak/tfausak.github.io/blob/6f9991758ffeed085c45dd97e4ce6a82a8b1a73f/_posts/2018-11-18-2018-state-of-haskell-survey-results.markdown
>
> On Sun, Nov 18, 2018, at 2:32 PM, Michael Snoyman wrote:
>
> Just wanted to add in: good catch Gershom on identifying the problem, and
> thank you Taylor for working to remove them from the report.
>
> On 18 Nov 2018, at 21:17, Taylor Fausak <[email protected]> wrote:
>
> Great catch, Gershom!
> There are indeed about 300 responses that tick all the
> boxes except for disliking the new GHC release schedule. The main thing the
> attacker seemed to be interested in was over-representing Stack and Stackage.
> Also, bizarrely, Java.
>
> That brings the number of bogus responses up to 3,735, which puts the number
> of legitimate responses at 1,361. For context, last year's survey asked far
> fewer questions and had 1,335 responses.
>
> On Sun, Nov 18, 2018, at 1:26 PM, Imants Cekusins wrote:
>
> What if the announcement mentioned a large number of potentially bogus
> responses, explained the grounds for this conclusion, with a new survey
> conducted early next year?
>
> The next survey would then need to be done differently from this one somehow.
> To improve the reliability, some authentication may be necessary.
>
> Maybe Stack, Cabal questions could be grouped as separate distinct surveys,
> conducted by their maintainers through own channels?
>
> Not sure how much value is in exact numbers of users of Stack or Cabal. Both
> groups are large enough. The maintainers of both groups are aware about usage
> stats.
>
> Is either library likely to be influenced by this survey?
_______________________________________________
Haskell-community mailing list
[email protected]
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-community
