Sharing trained models while protecting confidentiality

Alexander Measure Sat, 18 May 2013 12:52:35 -0700

In my day job I train text classifiers that are useful for a wide variety
of health surveillance tasks. However, the data used to train these
classifiers cannot be shared because of confidentiality protections. I would
like to make the trained models available to others, just as cTAKES does,
but I'm not sure how. Can you tell me how cTAKES does it, or point me to
resources that might be useful?


My models tend to be regularized logistic regression models trained on
bag-of-words features. I suspect that I can get some protection by
hashing all features into a fixed-size space first, but if there's a
different, well-established approach out there I'd rather use that.
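
To make the hashing idea concrete, here is a rough sketch of what I have in
mind, using only the Python standard library. The names (N_BUCKETS, bucket,
hashed_bow, predict_proba), the bucket count, and the choice of SHA-256 are
all just assumptions for illustration, not something taken from cTAKES or
any other tool. The shared model would then be only a weight vector indexed
by bucket, with no vocabulary attached. I have not established how much
protection this actually gives.

```python
import hashlib
import math

N_BUCKETS = 2 ** 10  # hypothetical size of the fixed feature space


def bucket(token, n_buckets=N_BUCKETS):
    """Map a token to a bucket index using a stable hash (SHA-256)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def hashed_bow(text, n_buckets=N_BUCKETS):
    """Return a sparse bag-of-words vector as {bucket index: count}."""
    counts = {}
    for token in text.lower().split():
        idx = bucket(token, n_buckets)
        counts[idx] = counts.get(idx, 0) + 1
    return counts


def predict_proba(weights, bias, text):
    """Score text with logistic regression weights indexed by bucket."""
    features = hashed_bow(text, len(weights))
    z = bias + sum(weights[i] * c for i, c in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Toy usage: a hand-set weight stands in for a trained coefficient.
weights = [0.0] * N_BUCKETS
weights[bucket("fracture")] = 2.0
print(predict_proba(weights, -1.0, "patient has a fracture"))
```

Someone receiving the weights and bias would only need the same bucket
function to score new text. Unrelated tokens can collide in the same bucket,
which is part of why the original vocabulary is harder to read off the model,
and also a possible cost in accuracy.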

Alex Measure
