[
https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893508#comment-15893508
]
Yun Ni commented on SPARK-19771:
[~merlin] What you are suggesting is to hash each AND hash vector into a
[
https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893097#comment-15893097
]
Yun Ni edited comment on SPARK-19771 at 3/2/17 9:55 PM:
[~merlin]
(1) The
[
https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893097#comment-15893097
]
Yun Ni commented on SPARK-19771:
[~merlin]
(1) The computation cost is NumHashFunctions because we go
Yun Ni created SPARK-19771:
Summary: Support OR-AND amplification in Locality Sensitive
Hashing (LSH)
Key: SPARK-19771
URL: https://issues.apache.org/jira/browse/SPARK-19771
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874956#comment-15874956
]
Yun Ni edited comment on SPARK-18454 at 2/21/17 9:39 PM:
[~josephkb] [~sethah]
[
https://issues.apache.org/jira/browse/SPARK-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874956#comment-15874956
]
Yun Ni commented on SPARK-18454:
[~josephkb] [~sethah] [~mlnick] I opened a gDoc for discussions. Feel
[
https://issues.apache.org/jira/browse/SPARK-18286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yun Ni resolved SPARK-18286.
Resolution: Fixed
Fix Version/s: 2.2.0
> Add Scala/Java/Python examples for MinHash and
[
https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867366#comment-15867366
]
Yun Ni commented on SPARK-18392:
I agree with Seth. We need to first finish SPARK-18080 and SPARK-18450
[
https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832487#comment-15832487
]
Yun Ni edited comment on SPARK-18392 at 1/20/17 10:29 PM:
Yes, comparing if the
[
https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832487#comment-15832487
]
Yun Ni commented on SPARK-18392:
Yes, comparing if the hash signature equals is faster than computing the
[
https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832404#comment-15832404
]
Yun Ni edited comment on SPARK-18392 at 1/20/17 9:15 PM:
Hi David,
Thanks for
[
https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832404#comment-15832404
]
Yun Ni commented on SPARK-18392:
Hi David,
Thanks for the question. I did not group the records by their
[
https://issues.apache.org/jira/browse/SPARK-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yun Ni updated SPARK-18454:
Description:
We all agree to do the following improvement to Multi-Probe NN Search:
(1) Use approxQuantile
[
https://issues.apache.org/jira/browse/SPARK-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yun Ni updated SPARK-18454:
Description:
We all agree to do the following improvement to Multi-Probe NN Search:
(1) Use approxQuantile
[
https://issues.apache.org/jira/browse/SPARK-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yun Ni updated SPARK-18454:
Summary: Changes to fix Nearest Neighbor Search for LSH (was: Changes to
fix Multi-Probe Nearest Neighbor
[
https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668372#comment-15668372
]
Yun Ni commented on SPARK-18392:
Re-org the ticket dependencies for a little bit. PTAL.
> LSH API,
Yun Ni created SPARK-18454:
Summary: Changes to fix Multi-Probe Nearest Neighbor Search for LSH
Key: SPARK-18454
URL: https://issues.apache.org/jira/browse/SPARK-18454
Project: Spark
Issue Type:
Yun Ni created SPARK-18450:
Summary: Add AND-amplification to Locality Sensitive Hashing
Key: SPARK-18450
URL: https://issues.apache.org/jira/browse/SPARK-18450
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-18408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yun Ni updated SPARK-18408:
Description:
As the first improvements to current LSH Implementations, we are planning to do
the
[
https://issues.apache.org/jira/browse/SPARK-18408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yun Ni updated SPARK-18408:
Description:
As the first improvements to current LSH Implementations, we are planning to do
the
[
https://issues.apache.org/jira/browse/SPARK-18408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yun Ni updated SPARK-18408:
Description:
As the first improvements to current LSH Implementations, we are planning to do
the
Yun Ni created SPARK-18408:
Summary: API Improvements for LSH
Key: SPARK-18408
URL: https://issues.apache.org/jira/browse/SPARK-18408
Project: Spark
Issue Type: Improvement
Reporter:
[
https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654822#comment-15654822
]
Yun Ni commented on SPARK-18392:
I think your summary is very good in general. For class names, I think
[
https://issues.apache.org/jira/browse/SPARK-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645506#comment-15645506
]
Yun Ni edited comment on SPARK-18334 at 11/7/16 9:25 PM:
[~sethah] [~josephkb]
[
https://issues.apache.org/jira/browse/SPARK-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645506#comment-15645506
]
Yun Ni edited comment on SPARK-18334 at 11/7/16 9:25 PM:
[~sethah] [~josephkb] I
[
https://issues.apache.org/jira/browse/SPARK-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645506#comment-15645506
]
Yun Ni commented on SPARK-18334:
[~sethah][~josephkb]
> MinHash should use binary hash distance
>
Yun Ni created SPARK-18334:
Summary: MinHash should use binary hash distance
Key: SPARK-18334
URL: https://issues.apache.org/jira/browse/SPARK-18334
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-18081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637493#comment-15637493
]
Yun Ni commented on SPARK-18081:
This is super helpful. Thanks!
> Locality Sensitive Hashing (LSH) User
[
https://issues.apache.org/jira/browse/SPARK-18081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637102#comment-15637102
]
Yun Ni commented on SPARK-18081:
Sorry, I was really overloaded this week. I will try my best to send a
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581416#comment-15581416
]
Yun Ni commented on SPARK-5992:
Yes, I have implemented cosine, jaccard, euclidean and hamming distance.
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581412#comment-15581412
]
Yun Ni commented on SPARK-5992:
Yes, I do have comparison with full scan. Usually kNN with 1 hash function
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581345#comment-15581345
]
Yun Ni commented on SPARK-5992:
Some tests on the WEX Open Dataset, with steps and results:
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15502493#comment-15502493
]
Yun Ni commented on SPARK-5992:
Hi Joseph,
I have made an initial PR based on the design doc:
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15478135#comment-15478135
]
Yun Ni commented on SPARK-5992:
Thank you very much for reviewing it, Joseph!
I will work on the first
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15453980#comment-15453980
]
Yun Ni commented on SPARK-5992:
Thanks very much for reviewing, Joseph!
Based on your comments, I have
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432186#comment-15432186
]
Yun Ni edited comment on SPARK-5992 at 8/23/16 5:50 AM:
Hi,
We are engineers from
[
https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432186#comment-15432186
]
Yun Ni commented on SPARK-5992:
Hi,
We are engineers from Uber. Here is our design doc for LSH:
37 matches