Can you try the newest (2024) pre-processed enwik5.txt that I posted a few days ago? Its filename starts with 2024.
But I already know how to learn related words and use them (and have done so) without word2vec, GloVe, etc., using the ideas behind PPM and similar methods instead. I also posted on my project page that gap matches are working in my 52 lines of code. The list at the top of the code that runs these searches can take more entries, including time-delay matches (for example, "the big big big cat" matches "the cat"), or a gap and a delay combined in a single search. The same 52 lines also do priming and evaluation.

I posted that the delay matches don't work at all, and that they should. They will probably only work once I add the building up of the sentence (parsing, with probabilities guiding it) like Transformers do. That will let me run only the searches that I "should" be making, for delays, for gaps, and for order-n matches too.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tf0bedfcd44454678-M9c41b417396b4a70e06a2900
Delivery options: https://agi.topicbox.com/groups/agi/subscription
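To make the two kinds of match mentioned in the post concrete, here is a minimal Python sketch. It is not the 52-line program, which is not shown here. The function names `gap_match` and `delay_match`, the `_` wildcard token, and the `max_delay` limit are all assumptions made for this illustration, and the exact matching rules in the real code may differ.

```python
# Illustrative sketch only: names, the "_" wildcard, and max_delay are
# assumptions, not taken from the original 52-line program.

def gap_match(pattern, window, gap="_"):
    """Return True if window matches pattern position by position,
    where a gap token in the pattern accepts any single word."""
    if len(pattern) != len(window):
        return False
    return all(p == gap or p == w for p, w in zip(pattern, window))


def delay_match(pattern, window, max_delay=3):
    """Return True if the pattern words occur in order in window,
    anchored at the first and last word, with at most max_delay
    extra words between two consecutive pattern words."""
    def walk(pi, wi):
        # pattern[pi] has already been matched at window[wi]
        if pi == len(pattern) - 1:
            return wi == len(window) - 1
        stop = min(len(window), wi + 2 + max_delay)
        return any(
            window[j] == pattern[pi + 1] and walk(pi + 1, j)
            for j in range(wi + 1, stop)
        )

    if not pattern or not window or pattern[0] != window[0]:
        return False
    return walk(0, 0)


if __name__ == "__main__":
    print(gap_match(["the", "_", "cat"], "the big cat".split()))
    print(delay_match(["the", "cat"], "the big big big cat".split()))
```

In this sketch a gap ignores exactly one position, while a delay tolerates a variable number of inserted words, which is why "the big big big cat" can match "the cat". A combined search would apply both rules to the same pattern.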
