> > > > skunks get personal armor with government money [joan of arc asked for accurate spirit information on AI influence/lawyer
> > > >
> > > > ---
> > > >
> > > > i'm still working on rep/dict.py !
> > > >
> > > > for a little smidge i didn't have wifi so i made a very simple
> > > > file-based backend and just now got it working
> > > >
> > > > here's my current crash
> > > >
> > > >   File "/home/karl3/projects/rep/rep/dict.py", line 392, in <module>
> > > >     doc.update([[val,val]])
> > > >   File "/home/karl3/projects/rep/rep/dict.py", line 347, in update
> > > >     super().update(keyhashitems())
> > > >   File "/home/karl3/projects/rep/rep/dict.py", line 225, in update
> > > >     self.array[:] = IterableWithLength(content_generator(), capacity)
> > > >     ~~~~~~~~~~^^^
> > > >   File "/home/karl3/projects/rep/rep/array.py", line 56, in __setitem__
> > > >     self.doc[start * sz : stop * sz] = data
> > > >     ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
> > > >   File "/home/karl3/projects/rep/rep/rep.py", line 167, in __setitem__
> > > >     piece = prefix + data[:] + suffix[:suffixoff]
> > > >                      ~~~~^^^
> > > >   File "/home/karl3/projects/rep/rep/rep.py", line 220, in __getitem__
> > > >     buf += next(self.iteration)
> > > >            ^^^^^^^^^^^^^^^^^^^^
> > > >   File "/home/karl3/projects/rep/rep/dict.py", line 218, in content_generator
> > > >     assert superidx * expansion + subidx == int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift
> > > >            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > AssertionError
> > > >
> > > > [stares blankly at crash] i wonder what all this funny text means!
> > >
> > > (Pdb) p [superidx, expansion, subidx], [hashbytes, dbg_keyhash[:hashbytes], hashshift, int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift]
> > > ([0, 8, 1], [1, b'\x0c', 5, 0])
> > >
> > > so the superidx is 0 that means it's engaging the first item of the
> > > original array.
> > > the expansion is 8, the subidx is 1, the hashshift is 5, and the
> > > expected final index is 0.
> > >
> > > it's another problem with my bitwise arithmetic
> > >
> > > (Pdb) list 202
> > > 197         assert superidx == newidx >> (hashbits - self._hashbits)
> > > 198         #subidx = (newidx >> hashshift) & expansionmask
> > > 199         subidx = newidx & expansionmask
> > > 200         assert superidx * expansion + subidx == newidx# >> hashshift
> > > 201         return subidx
> > > 202     for superidx, item in enumerate(tqdm.tqdm(self.array if self._capacity else [self._sentinel], desc='growing sentinel hashtable', leave=False)):
> > > 203         update_chunk = [self._sentinel] * expansion
> > > 204         if item != self._sentinel:
> > > 205             keyhash = self._key(item)
> > > 206             newidx = int.from_bytes(keyhash[:hashbytes], 'big') >> hashshift
> > > 207             update_chunk[newidx2subidx(newidx)] = item
> > > (Pdb) list 213
> > > 208         dbg_additions = []
> > > 209         while next_superidx == superidx:
> > > 210             item = next_item
> > > 211             newidx = next_newidx
> > > 212             dbg_additions.append([next_newidx, next_keyhash, next_item])
> > > 213             update_chunk[newidx2subidx(newidx)] = item
> > > 214             next_newidx, next_keyhash, next_item = updates.pop() if len(updates) else [1<<hashbits,None,None]
> > > 215             next_superidx = next_newidx >> (hashbits - self._hashbits)
> > > 216         for subidx, item in enumerate(update_chunk):
> > > 217             dbg_keyhash = self._key(item)
> > > 218  ->         assert superidx * expansion + subidx == int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift
> >
> > not sure if i posted that i separated wholeidx into byteidx and newidx
> > because i was sometimes assuming it was downshifted and other times
> > not.
> > [i could have referenced the old code to be consistent, that
> > might have been clearer, but it is a logical computer-checked system
> > anyway
>
> (Pdb) p item
> b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
> (Pdb) p update_chunk
> [b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
>  b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
>  bytearray(b'\xd6\x00\x00\x00\x00\x00\x00\x00\xd8\x00\x00\x00\x00\x00\x00\x00'),
>  b'\xd6\x00\x00\x00\x00\x00\x00\x00\xd8\x00\x00\x00\x00\x00\x00\x00',
>  b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
>  b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
>  b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
>  b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00']
>
> i'm guessing all those 16-long sequences of zeros are bugs in the file
> backend i tried to add :D
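
for reference, a minimal sketch replaying the two sides of the failing assertion, using only the values printed in the quoted Pdb session (superidx 0, expansion 8, subidx 1, hashbytes 1, key hash prefix b'\x0c', hashshift 5). the variable names mirror the listing; `slot` and `expected` are names made up here, and nothing else from rep/dict.py is assumed.

```python
# values copied from the quoted Pdb output
superidx, expansion, subidx = 0, 8, 1
hashbytes, hashshift = 1, 5
keyhash_prefix = b'\x0c'

# left side of the assertion: position of this slot in the grown table
slot = superidx * expansion + subidx

# right side: index derived from the top bits of the key hash
expected = int.from_bytes(keyhash_prefix[:hashbytes], 'big') >> hashshift

print(slot, expected)  # 1 0
```

so the check fails at slot 1 because the hash of the item there maps to index 0. the item printed in the same session is all zeros; whether the debug check is meant to run on entries like that is not something the session shows.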
oh no, that's the new algorithm, where it preallocates a run of sentinels, so the zeros are sentinels rather than backend bugs. hmm, so expansion is 8 and there's identical data in 0-based slots 2 and 3, which is strange; and one of them is a bytearray, meaning they came from different places.
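
a small sketch for spotting that kind of thing: which slots hold equal data, and which container type each one is. the chunk is copied from the Pdb output; `SENTINEL` being sixteen zero bytes is an assumption based on that output, and `duplicate_slots` is a hypothetical helper, not something from rep/dict.py.

```python
from collections import defaultdict

SENTINEL = bytes(16)  # assumption: the sentinel is 16 zero bytes, as the dump suggests
ENTRY = b'\xd6' + bytes(7) + b'\xd8' + bytes(7)

# update_chunk as printed by Pdb: slot 2 is a bytearray, slot 3 is bytes
update_chunk = [SENTINEL, SENTINEL, bytearray(ENTRY), ENTRY,
                SENTINEL, SENTINEL, SENTINEL, SENTINEL]

def duplicate_slots(chunk, sentinel):
    """Map each repeated non-sentinel value to its (slot, type name) pairs."""
    seen = defaultdict(list)
    for slot, item in enumerate(chunk):
        if bytes(item) != sentinel:
            seen[bytes(item)].append((slot, type(item).__name__))
    # keep only values that occupy more than one slot
    return {value: slots for value, slots in seen.items() if len(slots) > 1}

dups = duplicate_slots(update_chunk, SENTINEL)
for value, slots in dups.items():
    print(value.hex(), slots)
```

the type names are kept alongside the slot numbers because the bytes/bytearray difference is the only hint about where each copy came from.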
