Re: [julia-users] Re: Speeding up indexing into a Dict{Vector,Vector}

Stefan Karpinski Mon, 13 Apr 2015 15:47:05 -0700

If you want to have the original mutable vector around, you may want to
keep it as a value instead of a key. You can do something like map the
Vector{Int} to a single Int value, use that as a key and then keep the
Vector{Int} as part of the Dict value.


On Mon, Apr 13, 2015 at 6:06 PM, Seth <catch...@bromberger.com> wrote:

> Two ideas that *might* work:
>
> 1) is hashing the key - that is, if you can represent the Vector{Int} as
> an immutable type (say, by somehow quickly changing it into an Int64 or
> Int128, perhaps using hash()) then you store the computed result as the
> key. The thing to keep in mind here is that if you mutate the vector, you
> will mutate the hash as well. That is likely what you want, but it's good
> to be explicit. You will also not be able to "go back" unless you create a
> separate dict with the hash values as a key and the Vector{Int} as the
> value.
>
> 2) is creating two data structures: one is a Vector{Vector{Int}}, the
> other a Dict{Int, Vector{Float64}}. The index of the first data structure
> is the key of the second, and it should be* faster to check the former
> against a given vector then going through a dict. Unlike #1, the index will
> NOT change if you mutate the inner vector.
>
>
>
>
> On Monday, April 13, 2015 at 1:56:55 PM UTC-7, Will Kearney wrote:
>>
>> I'm hoping I can get some advice on optimizing some code.
>>
>> I have a data structure which is very conveniently represented as a
>> Dict{Vector{Int},Vector{Float64}}. That is, I need to look up
>> floating-point vectors stored at certain multidimensional (usually 2-10)
>> positions represented as a Vector{Int}. Before I construct this Dict, I
>> don't have a priori knowledge of what position vectors will be represented.
>>
>> I'm aware of the problems of using mutable objects as Dict keys, but
>> let's assume that's not really an issue.
>>
>> Profiling reveals that the real time sink is testing arrays for equality
>> when indexing into this data structure, and indexing happens very
>> frequently. What would be a good way to speed this process up?
>>
>> Is there a way to represent the keys which will be cheaper to compare? I
>> could convert the keys to tuples, but it's helpful to have them as mutable
>> arrays to do some calculations with them at certain points.
>>
>> Is there a different data structure which might work better? Should I
>> roll my own hash table implementation which is particularly optimized for
>> vector keys?
>>
>

Re: [julia-users] Re: Speeding up indexing into a Dict{Vector,Vector}

Reply via email to