Good question.
It might be valuable to add to the collection of tools.
I do have some concerns about what we are offering here, though.
(1) if we offer to look at large datasets and/or large log files, then
work is moving from the user to the list.
(2) the obfuscated data is public. We don't want to imply any
commitment/liability here that the code is, say, suitable for personal
data, because sometimes obfuscation is not enough.
On the first point:
Part of an MCVE [1] is the user doing some work. If we make it
acceptable to bypass that, the work still exists but it has been
transferred.
I simply can't spend 1+ hour setting up a test environment. Performance
can involve load as well and I don't have the infrastructure to look at
that.
I'm more willing to spend time if the user is in a university/non-profit
or for people, commercial or otherwise, who engage in useful discussion.
A good report is a contribution.
But I'm not willing (or even able) to subsidise commercial organisations
per se. They can go find and pay for a commercial support contract or
contract with someone (a contributor/committer maybe) and have a
confidentiality agreement.
It is not always one question in isolation. Solve one issue and then
another arrives.
Sorry if this is grumpy but I can see ways things might turn out not so
well without us also having common agreement about how we operate on users@.
Andy
[1] and we can point users to
https://stackoverflow.com/help/mcve
PS
There is also a theme of "ask first" before trying anything, or doing even
a few minutes of investigation. Such emails are vague.
On 12/10/17 10:03, Rob Vesse wrote:
Folks
An occasional recurring theme I see on the users list is we get a vague
question about performance details where users can’t/won’t share data and
queries because of confidentiality or other concerns. This is something we’ve
encountered in the past with customers for our commercial products and so
internally we developed some obfuscation code using Jena APIs so that we can
obfuscate queries and data in our logs allowing customers to share these
without confidentiality being breached.
Would it be valuable to the project if we cleaned this up and made it a part
of core Jena libraries?
It would probably take a bit of time to unpick this from our code and to
generalise it but I think it could be a very useful feature going forward. Let
me know what you think.
Rob