Announce: LimDB, a fast, persistent table with LMDB under the hood

2023-02-17 Thread Gtriangle
Great work, thank you.

Wanted libraries wishlist?

2023-02-13 Thread Gtriangle
A roaring bitmap implementation (compressed bitmaps).

high memory usage with large number of HashSets. 3X more memory than Python

2021-12-25 Thread Gtriangle
@Araq's warning was definitely justified: when I compiled with the nimIntHash1 flag the code was **many** times faster, but I ran into KeyErrors when looking up document ids in the reverse index tables. I'll try changing all my ints to distinct ints and see what happens! > FWIW, there was an

high memory usage with large number of HashSets. 3X more memory than Python

2021-12-24 Thread Gtriangle
Wow, amazing. Thank you @Araq, @ElegantBeef and @cblake. With Nim it feels like I got the keys to a Lamborghini :) @cblake: You're right, this part of the code is used to generate an inverted index. I then calculate the Jaccard index between all pairs of 'docs' to find similar documents for a
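
For readers unfamiliar with the approach mentioned in this post, here is a minimal sketch of an inverted index combined with Jaccard similarity. It is written in Python, the language the original code was ported from, and is not the poster's actual code. The function names, the `docs` layout and the `threshold` parameter are all hypothetical. The sketch also uses the index to limit scoring to pairs that share at least one term, which is a shortcut rather than the all-pairs loop the post describes; pairs that share nothing have a Jaccard index of 0 anyway.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def similar_pairs(docs, threshold):
    """Score every pair of documents that shares at least one term.

    Returns a dict mapping (smaller_id, larger_id) to the Jaccard index,
    keeping only pairs whose score is at least `threshold`.
    """
    index = build_inverted_index(docs)
    candidates = set()
    for doc_ids in index.values():
        # Sorting gives each pair a single canonical orientation.
        candidates.update(combinations(sorted(doc_ids), 2))
    result = {}
    for a, b in candidates:
        score = jaccard(docs[a], docs[b])
        if score >= threshold:
            result[(a, b)] = score
    return result
```

With `docs = {1: {10, 20, 30}, 2: {20, 30, 40}, 3: {50}}`, documents 1 and 2 share two of four distinct terms, so their Jaccard index is 0.5, while document 3 never becomes a candidate.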

high memory usage with large number of HashSets. 3X more memory than Python

2021-12-23 Thread Gtriangle
Hi, I'm really loving Nim but ran into a strange issue today when porting some Python code. I have some simple code that builds a table, mapping an int to a HashSet[int]. The table has about 2.5 million entries. Each HashSet contains 10 random integers (max integer value being 115000). I was h