If it's as obvious a win as it sounds, I'd say 0.3. We aren't in lockdown yet,
are we?

-Grant

On Feb 18, 2010, at 3:37 PM, Jake Mannix wrote:

> I dunno, we can file it for whenever, 0.4 and if it turns out it's a really
> easy change we can always commit it for 0.3.
> 
>  -jake
> 
> On Thu, Feb 18, 2010 at 12:29 PM, Robin Anil <robin.a...@gmail.com> wrote:
> 
>> File it for 0.3?
>> 
>> 
>> Robin
>> 
>> On Fri, Feb 19, 2010 at 1:56 AM, Jake Mannix <jake.man...@gmail.com>
>> wrote:
>> 
>>> On Thu, Feb 18, 2010 at 11:55 AM, Robin Anil <robin.a...@gmail.com> wrote:
>>> 
>>>> I was trying out SeqAccessSparseVector on Canopy Clustering using
>>>> Manhattan distance. I found performance to be really bad, so I profiled
>>>> it with YourKit (thanks a lot for providing us a free license).
>>>> 
>>>> Since I was trying out Manhattan distance, there were a lot of A-B
>>>> operations, which created a lot of clone operations: 5% of the total time.
>>>> There were also many A+B operations for adding a point to the canopy to
>>>> average. These also created a lot of clone operations: 90% of the total
>>>> time.
>>>> 
>>> 
>>> SequentialAccessSparseVector should only be used in a read-only fashion.
>>> If you are creating an average centroid which is sparse, but it is
>>> mutating, then it should be RandomAccessSparseVector. The points which are
>>> being used to create it can be SequentialAccessSparseVector (if they
>>> themselves never change), but then the method called should be
>>> SequentialAccessSparseVector.addTo(RandomAccessSparseVector). This
>>> exploits the fast sequential iteration of SeqAcc, and the fast
>>> random-access mutability of RandAcc.
>>> 
>>> 
>>>> 
>>>> So we definitely need to improve that.
>>>> 
>>>> As a small hack, I made the cluster centers RandomAccess vectors. Things
>>>> are fast again. I don't know whether to commit it or not. But something
>>>> to look into in 0.4?
>>>> 
>>> 
>>> Yeah, cluster *centers* should indeed be RandomAccess.  JIRA / patch so
>>> we can see exactly what the change is?
>>> 
>>> -jake
>>> 
>> 
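
For anyone reading this thread later, here is a minimal sketch of the pattern
Jake describes: read-only points stored for sequential iteration, and a mutable
centroid stored for random-access updates, with the points added into the
centroid in place so no clone is made per addition. The classes SeqSparse and
RandSparse and the centroid() helper are hypothetical toy stand-ins written for
this note. They are not Mahout's classes and do not reflect how Mahout
implements its vectors.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy classes for illustration only; NOT Mahout's implementation.
class SparseCentroidSketch {

    /** Read-only sparse vector: parallel arrays sorted by index. */
    static final class SeqSparse {
        final int[] indices;   // assumed sorted ascending
        final double[] values;

        SeqSparse(int[] indices, double[] values) {
            this.indices = indices.clone();
            this.values = values.clone();
        }

        /** Adds this vector into target in place: one linear pass, no clone. */
        void addTo(RandSparse target) {
            for (int i = 0; i < indices.length; i++) {
                target.add(indices[i], values[i]);
            }
        }
    }

    /** Mutable sparse vector: a hash map gives cheap random-access updates. */
    static final class RandSparse {
        private final Map<Integer, Double> entries = new HashMap<>();

        void add(int index, double delta) {
            entries.merge(index, delta, Double::sum);
        }

        double get(int index) {
            return entries.getOrDefault(index, 0.0);
        }

        void scale(double factor) {
            entries.replaceAll((k, v) -> v * factor);
        }
    }

    /** Averages the points by accumulating into a single mutable centroid. */
    static RandSparse centroid(SeqSparse[] points) {
        RandSparse sum = new RandSparse();
        for (SeqSparse p : points) {
            p.addTo(sum);
        }
        if (points.length > 0) {
            sum.scale(1.0 / points.length);
        }
        return sum;
    }

    public static void main(String[] args) {
        SeqSparse a = new SeqSparse(new int[] {0, 5}, new double[] {2.0, 4.0});
        SeqSparse b = new SeqSparse(new int[] {5, 9}, new double[] {6.0, 8.0});
        RandSparse c = centroid(new SeqSparse[] {a, b});
        // prints "1.0 5.0 4.0"
        System.out.println(c.get(0) + " " + c.get(5) + " " + c.get(9));
    }
}
```

The design point is that only the accumulator is ever mutated. An A+B that
returns a fresh vector has to copy one operand on every call, which is
presumably where the clone time in the profile comes from.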

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search
