My point was to illustrate that any limit would be arbitrarily chosen. It is 
impossible to set a limit on how large the numbers stored in an IntSet are 
allowed to be. It depends on the application, and can only be determined by 
the programmer. Sorry for not stating that clearly. You seem to understand 
the issue better than my reasoning conveyed :P
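
To put rough numbers on why no single cutoff fits every application: a bit 
string needs about one bit per integer up to the largest element, so the 
footprint depends on the largest value stored, not on the number of elements. 
The sketch below is only an estimate; the helper name bitset_bytes is made up 
for this example, and it assumes one bit per integer rounded up to whole 
32-bit words.

```julia
# Hypothetical helper (not part of Base): bytes needed by a bit string that
# can represent every integer in 0:maxval, rounded up to whole 32-bit words.
bitset_bytes(maxval::Integer) = 4 * cld(Int64(maxval) + 1, 32)

# Largest element of interest -> approximate size of the backing bit string.
for maxval in (Int64(10)^5, Int64(10)^8, Int64(10)^10)
    println(maxval, " -> ", bitset_bytes(maxval), " bytes")
end
```

Under these assumptions a largest element of 10^10 corresponds to roughly 
1.25 GB, which is a lot on one machine and nothing on another.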

You should probably use a different storage mechanism that explicitly 
handles reading and writing the unused parts of the array to disk if your 
IntSet has trouble fitting in memory. Maybe you will get good results if 
you just change the backing array in IntSet to a mmapped file? 
Unfortunately you have to copy all of the method definitions for 
IntSet<https://github.com/JuliaLang/julia/blob/master/base/intset.jl>, 
because we do not have inheritance from concrete types.
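
A minimal sketch of the mmapped-file idea, in case it helps. It is not the 
Base implementation: the type name MmapIntSet is hypothetical, only push! and 
in are defined (everything else would still have to be copied over), the 
capacity is fixed at construction, and it assumes a recent Julia where the 
Mmap standard library provides Mmap.mmap.

```julia
using Mmap

# Hypothetical type: a fixed-capacity bit set whose words live in a
# memory-mapped file, so the OS decides which pages stay in RAM.
struct MmapIntSet
    bits::Vector{UInt64}
end

# Map enough 64-bit words from the file at `path` to hold 0:maxval.
# A freshly created file region reads as zeros, i.e. an empty set.
function MmapIntSet(path::AbstractString, maxval::Integer)
    nwords = cld(Int64(maxval) + 1, 64)
    io = open(path, "w+")
    bits = Mmap.mmap(io, Vector{UInt64}, nwords)
    close(io)
    return MmapIntSet(bits)
end

# Set the bit for n; reject values outside the mapped range instead of
# growing or wrapping around.
function Base.push!(s::MmapIntSet, n::Integer)
    0 <= n < 64 * length(s.bits) || throw(ArgumentError("$n is out of range"))
    s.bits[(n >> 6) + 1] |= UInt64(1) << (n & 63)
    return s
end

# Test the bit for n; out-of-range values are simply not members.
function Base.in(n::Integer, s::MmapIntSet)
    0 <= n < 64 * length(s.bits) || return false
    mask = UInt64(1) << (n & 63)
    return s.bits[(n >> 6) + 1] & mask != 0
end

path = tempname()
s = MmapIntSet(path, 10^8)
push!(s, 10^8)
println(10^8 in s)
println(5 in s)
Mmap.sync!(s.bits)   # flush modified pages to the file
```

Throwing on out-of-range values is a choice made for this sketch; a version 
that grows the file on demand would need to remap the array.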

PS: To get the size of the backing array, you have to use sizeof(s.bits); 
sizeof(s) is the constant size of the IntSet struct.
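
The difference can be seen with a small stand-in type. ToySet below is 
hypothetical and only mimics the layout described above (a struct holding a 
reference to a separate array); it is not the real IntSet definition.

```julia
# Hypothetical stand-in for a set type that keeps its data in a separate array.
struct ToySet
    bits::Vector{UInt32}   # backing array, stored outside the struct itself
    limit::Int
end

s = ToySet(zeros(UInt32, 1000), 32_000)

println(sizeof(s))        # size of the struct's fields only, independent of the data
println(sizeof(s.bits))   # 1000 words * 4 bytes = 4000
```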

Regards Ivar

On Friday, 28 February 2014 at 16:22:01 UTC+1, David P. Sanders wrote:
>
>
>
> On Friday, 28 February 2014 at 08:41:37 UTC-6, Ivar Nesje wrote:
>>
>> The documentation states very clearly that 
>> IntSet<http://docs.julialang.org/en/latest/stdlib/base/#Base.IntSet> should 
>> only be used for dense collections, and that 
>> Set<http://docs.julialang.org/en/latest/stdlib/base/#Base.Set>, 
>> should be used for sparse collections.
>>
>
> Agreed. Of course, this was just a toy example to test the limits of 
> IntSet.
> In the real application that I am working towards, I want to think about 
> systems of size at least 10^5 x 10^5. Mapping pairs (x,y) in this system to 
> a single number gives up to 10^10,
> which is what I was testing.
>
>  
>
>>
>> Construct a sorted set of the integers generated by the given iterable 
>>> object, or an empty set. Implemented as a bit string, and therefore 
>>> designed for dense integer sets. If the set will be sparse (for example 
>>> holding a single very large integer), use Set instead.
>>
>>
>> Do you happen to know a nice limit to how much memory IntSet should be 
>> allowed to use?
>>
>
> In this case, it seems to be using more memory than I have available on my 
> machine (4GB on my laptop).
> I guess my point is that normally I would expect that to give me an 
> out-of-memory error, rather than enter an infinite loop producing garbage.
>
>
>
>> On my laptop 100 MB would be more than I can afford
>>
>
> Not sure what you mean by that -- doesn't it rather depend on the 
> application? If I am doing a heavy computation on my laptop over night, I 
> am happy for it to use all available memory.
>
>  
>
>> , but that would make IntSet unusable for bigger calculations on bigger 
>> systems, so it should be no smaller than 10 GB.
>>
>
> I have another machine with a lot of memory (128 GB), so I certainly do 
> not want to impose an arbitrary restriction.
>
>
> David.
>  
>
>>
>> Ivar
>>
>> On Friday, 28 February 2014 at 15:16:33 UTC+1, David P. Sanders 
>> wrote:
>>>
>>>
>>> I am investigating possible data structures for an application.
>>> Here is an "interesting" behaviour in IntSet, which is no doubt to do 
>>> with the implementation.
>>> Maybe it should just throw an exception if someone tries to add a really 
>>> large integer like this!
>>>
>>>
>>> julia> s = IntSet()
>>> IntSet()
>>>
>>> julia> push!(s, 100000)
>>> IntSet(100000)
>>>
>>> julia> sizeof(s)
>>> 24
>>>
>>> julia> push!(s, 1000000)
>>> IntSet(100000, 1000000)
>>>
>>> julia> sizeof(s)
>>> 24
>>>
>>> julia> push!(s, 10000000)
>>> IntSet(100000, 1000000, 10000000)
>>>
>>> julia> push!(s, 100000000)
>>> IntSet(100000, 1000000, 10000000, 100000000)
>>>
>>> julia> push!(s, 1000000000)
>>> IntSet(100000, 1000000, 10000000, 100000000, 1000000000)
>>>
>>> julia> sizeof(s)
>>> 24
>>>
>>> julia> push!(s, 10000000000)
>>> IntSet(100000, 1000000, 10000000, 100000000, 1000000000, 1410065408, 
>>> 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 
>>> 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 
>>> 1410065408, 1410065408, 1410065408, 1410065408^CEvaluation succeeded, but 
>>> an error occurred while showing value of type IntSet:
>>> ERROR: interrupt
>>>  in show at intset.jl:172
>>>  in anonymous at show.jl:973
>>>  in showlimited at show.jl:972
>>>  in writemime at repl.jl:2
>>>  in display at multimedia.jl:117
>>>  in display at multimedia.jl:119
>>>  in display at multimedia.jl:151
>>>
>>
