[jira] Commented: (MAHOUT-244) Add root log-likelihood method to LogLikehood class.

2010-01-13 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800130#action_12800130 ] Shashikant Kore commented on MAHOUT-244: +1. Add root log-likelihood method to

[jira] Commented: (MAHOUT-163) Get (better) cluster labels using Log Likelihood Ratio

2009-12-31 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795570#action_12795570 ] Shashikant Kore commented on MAHOUT-163: Grant, Yes, it should have been

[jira] Commented: (MAHOUT-208) Vector.getLengthSquared() is dangerously optimized

2009-12-14 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790118#action_12790118 ] Shashikant Kore commented on MAHOUT-208: Apologies. I *assumed* underlying library

[jira] Commented: (MAHOUT-208) Vector.getLengthSquared() is dangerously optimized

2009-12-11 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789171#action_12789171 ] Shashikant Kore commented on MAHOUT-208: It is important to have this caching to

[jira] Commented: (MAHOUT-191) NPE while creating term vectors with an index on a field that does not exist in all the documents

2009-12-06 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786782#action_12786782 ] Shashikant Kore commented on MAHOUT-191: Yeah, this can be committed. NPE while

[jira] Commented: (MAHOUT-204) Better integration of Mahout matrix capabilities with Colt Matrix additions

2009-11-30 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783577#action_12783577 ] Shashikant Kore commented on MAHOUT-204: I ran into compile troubles after

[jira] Commented: (MAHOUT-206) Separate and clearly label different SparseVector implementations

2009-11-23 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781793#action_12781793 ] Shashikant Kore commented on MAHOUT-206: From my observation, the input vectors

[jira] Commented: (MAHOUT-207) AbstractVector.hashCode() should not care about the order of iteration over elements

2009-11-23 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781794#action_12781794 ] Shashikant Kore commented on MAHOUT-207: When this patch gets in, we can remove the

[jira] Updated: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-11-18 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-165: --- Attachment: mahout-165-18nov.patch Here is the updated patch. The dependency on

[jira] Commented: (MAHOUT-201) OrderedIntDoubleMapping / SparseVector is unnecessarily slow

2009-11-18 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779403#action_12779403 ] Shashikant Kore commented on MAHOUT-201: Jake, Colt also provides fast iteration

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-11-18 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779406#action_12779406 ] Shashikant Kore commented on MAHOUT-165: My patch is only for the changes to

[jira] Updated: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-11-18 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-165: --- Attachment: mahout-165-18nov-updated.patch I am updating the patch to ensure hashCode() is

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-11-17 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778822#action_12778822 ] Shashikant Kore commented on MAHOUT-165: Not sure if voting for my own patch

[jira] Updated: (MAHOUT-191) NPE while creating term vectors with an index on a field that does not exist in all the documents

2009-10-29 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-191: --- Attachment: MAHOUT-191.patch NPE while creating term vectors with an index on a field that

[jira] Commented: (MAHOUT-191) NPE while creating term vectors with an index on a field that does not exist in all the documents

2009-10-29 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771408#action_12771408 ] Shashikant Kore commented on MAHOUT-191: I noticed a different problem for empty

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-10-05 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762167#action_12762167 ] Shashikant Kore commented on MAHOUT-165: I am trying out this patch. Somehow, I

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-09-18 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757017#action_12757017 ] Shashikant Kore commented on MAHOUT-165: Colt handles the removal by explicitly

[jira] Updated: (MAHOUT-163) Get (better) cluster labels using Log Likelihood Ratio

2009-09-17 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-163: --- Attachment: MAHOUT-163-17sep.patch Updated patch. Using bitset to find in-cluster DF instead

[jira] Commented: (MAHOUT-160) ClusterDumper utility to output all the clusters in all sequence files and points

2009-09-17 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756488#action_12756488 ] Shashikant Kore commented on MAHOUT-160: Yeah, this can be closed. ClusterDumper

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-09-17 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756526#action_12756526 ] Shashikant Kore commented on MAHOUT-165: Since I couldn't apply Ted's patch to

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-09-10 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753583#action_12753583 ] Shashikant Kore commented on MAHOUT-165: The attached patch uses integer to double

[jira] Updated: (MAHOUT-163) Get (better) cluster labels using Log Likelihood Ratio

2009-09-09 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-163: --- Attachment: MAHOUT-163.patch Revised patch updated to trunk. Get (better) cluster labels

[jira] Updated: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-09-07 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-165: --- Attachment: colt.jar Jar for Colt after removing the LGPL code of hep.aida and the

[jira] Updated: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-09-04 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-165: --- Attachment: mahout-165-trove.patch For SparseVector, I have copied relevant source from

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-09-04 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751398#action_12751398 ] Shashikant Kore commented on MAHOUT-165: My interpretation was Trove (and Colt)

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-09-04 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751520#action_12751520 ] Shashikant Kore commented on MAHOUT-165: OK. Should I copy relevant classes source

[jira] Updated: (MAHOUT-163) Get (better) cluster labels using Log Likelihood Ratio

2009-08-20 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-163: --- Attachment: mahout-163.patch Revised patch. Get (better) cluster labels using Log

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-08-20 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745405#action_12745405 ] Shashikant Kore commented on MAHOUT-165: I couldn't locate the primitive hashmap in

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-08-19 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744973#action_12744973 ] Shashikant Kore commented on MAHOUT-121: OrderedIntDoubleMapping, the primitive

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-08-19 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744991#action_12744991 ] Shashikant Kore commented on MAHOUT-121: We need mix of iteration and map get/set.

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-08-19 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744995#action_12744995 ] Shashikant Kore commented on MAHOUT-121: Trying it out... Speed up distance

[jira] Updated: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-08-19 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-165: --- Attachment: mahout-165.patch Patch for using Colt in SparseVector Using better primitives

[jira] Created: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-08-19 Thread Shashikant Kore (JIRA)
Using better primitives hash for sparse vector for performance gains Key: MAHOUT-165 URL: https://issues.apache.org/jira/browse/MAHOUT-165 Project: Mahout Issue Type:

[jira] Commented: (MAHOUT-163) Get (better) cluster labels using Log Likelihood Ratio

2009-08-14 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743093#action_12743093 ] Shashikant Kore commented on MAHOUT-163: I'm not sure if this implementation is

[jira] Created: (MAHOUT-163) Get (better) cluster labels using Log Likelihood Ratio

2009-08-13 Thread Shashikant Kore (JIRA)
Get (better) cluster labels using Log Likelihood Ratio -- Key: MAHOUT-163 URL: https://issues.apache.org/jira/browse/MAHOUT-163 Project: Mahout Issue Type: Improvement

[jira] Updated: (MAHOUT-163) Get (better) cluster labels using Log Likelihood Ratio

2009-08-13 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-163: --- Attachment: mahout-cluster-labels-llr.patch Patch for getting top labels with LLR. Get

[jira] Updated: (MAHOUT-160) ClusterDumper utility to output all the clusters in all sequence files and points

2009-08-06 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-160: --- Attachment: mahout-160.patch ClusterDumper utility has been modified to take the clusters

[jira] Updated: (MAHOUT-160) ClusterDumper utility to output all the clusters in all sequence files and points

2009-08-06 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Kore updated MAHOUT-160: --- Attachment: mahout-160-dict.patch This patch accepts the term dictionary (created while

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-08-06 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740420#action_12740420 ] Shashikant Kore commented on MAHOUT-121: Declaring variables out of the loop looks

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-07-28 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736460#action_12736460 ] Shashikant Kore commented on MAHOUT-121: Grant, The latest patch brings back the

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-06-25 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723980#action_12723980 ] Shashikant Kore commented on MAHOUT-121: Grant, I was trying to verify the patch,

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-06-16 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720123#action_12720123 ] Shashikant Kore commented on MAHOUT-121: Apologies for posting incorrect results in

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-06-16 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720246#action_12720246 ] Shashikant Kore commented on MAHOUT-121: I have document vectors created from some

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-06-16 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720498#action_12720498 ] Shashikant Kore commented on MAHOUT-121: Grant, I am trying out the wikipedia

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-06-15 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719462#action_12719462 ] Shashikant Kore commented on MAHOUT-121: +1 Grant's suggestion that we split two

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-06-15 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719546#action_12719546 ] Shashikant Kore commented on MAHOUT-121: Sean, Your patch has definitely improved

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-06-10 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718026#action_12718026 ] Shashikant Kore commented on MAHOUT-121: That's right, Grant. Some simple tests

[jira] Created: (MAHOUT-126) Prepare document vectors from the text

2009-05-29 Thread Shashikant Kore (JIRA)
Prepare document vectors from the text -- Key: MAHOUT-126 URL: https://issues.apache.org/jira/browse/MAHOUT-126 Project: Mahout Issue Type: New Feature Reporter: Shashikant Kore Clustering

[jira] Commented: (MAHOUT-126) Prepare document vectors from the text

2009-05-29 Thread Shashikant Kore (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714356#action_12714356 ] Shashikant Kore commented on MAHOUT-126: David, Sorry, I don't have any background

[jira] Created: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-05-22 Thread Shashikant Kore (JIRA)
Speed up distance calculations for sparse vectors - Key: MAHOUT-121 URL: https://issues.apache.org/jira/browse/MAHOUT-121 Project: Mahout Issue Type: Improvement Components: Matrix