[ https://issues.apache.org/jira/browse/DATAFU-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994299#comment-13994299 ]
Matthew Hayes commented on DATAFU-37:
-------------------------------------
Ah, I get it. That makes sense to me, and I like this approach. So the user has
the option of setting a seed in the constructor. If one isn't provided, we pick
a random seed and share it through UDFContext so that every task uses the same
hash functions.
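
To make the seed handling concrete, here is a rough sketch of the fallback
logic, not the code from the patch. It is plain Java with no Pig dependency:
a java.util.Properties object stands in for the per-UDF properties that
UDFContext would hand out, and the class name, the "lsh.seed" key and both
method names are hypothetical.

```java
import java.util.Properties;
import java.util.Random;

// Hypothetical sketch: names below are illustrative, not taken from the patch.
class SeedSketch {
    // Assumed property key under which the shared seed is stored.
    static final String SEED_KEY = "lsh.seed";

    // Use the constructor-supplied seed when present. Otherwise reuse the
    // seed already stored in props, or pick a random one and store it so
    // that later callers sharing these properties see the same value.
    static long resolveSeed(Long explicitSeed, Properties props) {
        if (explicitSeed != null) {
            return explicitSeed;
        }
        String stored = props.getProperty(SEED_KEY);
        if (stored == null) {
            stored = Long.toString(new Random().nextLong());
            props.setProperty(SEED_KEY, stored);
        }
        return Long.parseLong(stored);
    }

    // Derive the random parameters of one hash function from the seed.
    // The same seed always yields the same parameters.
    static double[] hashParameters(long seed, int dim) {
        Random rng = new Random(seed);
        double[] params = new double[dim];
        for (int i = 0; i < dim; i++) {
            params[i] = rng.nextGaussian();
        }
        return params;
    }
}
```

The point of the sketch is that the hash parameters depend only on the seed,
so any two tasks that resolve the same seed end up with identical hash
functions. In a real UDF the properties would have to be populated before the
job is launched so that they reach every task.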
> Add Locality Sensitive Hashing UDFs
> -----------------------------------
>
> Key: DATAFU-37
> URL: https://issues.apache.org/jira/browse/DATAFU-37
> Project: DataFu
> Issue Type: New Feature
> Reporter: Casey Stella
> Assignee: Casey Stella
> Attachments: DATAFU-37-1.patch, DATAFU-37-2.patch, DATAFU-37.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Create a set of UDFs to implement [Locality Sensitive
> Hashing|http://en.wikipedia.org/wiki/Locality-sensitive_hashing] in support
> of finding k-nearest neighbors. Initially, hashes associated with L1, L2 and
> Cosine similarity should be supported.