Dear all,

Good morning. I have some large texts in my tables. On average, each row contains about 4.2 KB of data, and there are 9.5 million rows. I want to run conceptual searches on technical terms and technical phrases and retrieve all texts with the nearest meanings, so I have to vectorize the data. What is the best approach?

I was trying to split the data into small fragments of 4.2 KB and then generate embeddings with a small vector size, with the help of pgvector. Once I have embedding vectors for the fragments, I can combine them, either with some model of how closely the fragments are related or by averaging. That way, we get an embedding for the full text. Or would you recommend another approach for generating an embedding for the full text?

I also have another question. I have title, abstract and description, where the description is about 3 KB, and I would like to search across title, abstract and description. Should I merge all the data and generate one embedding, or keep the embeddings separate?

Have a wonderful day.

Thank you,
Apurba K. Saha
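
P.S. To make the fragment-and-average idea concrete, here is a rough sketch of what I have in mind. It is only an illustration under assumptions: `embed_fragment` is a hypothetical stand-in (a toy hashing function) for whatever real embedding model would be used, the 8-dimensional vector size and the 1000-character fragment limit are arbitrary, and weighting each fragment by its length is just one possible way of averaging. The last helper formats the result in the `[x,y,z]` text form that pgvector accepts as input for a `vector` column.

```python
import hashlib
import math

DIM = 8  # toy dimension (assumption); a real model would use far more


def split_into_fragments(text: str, max_chars: int) -> list[str]:
    """Split on whitespace into fragments of at most max_chars characters.

    A single word longer than max_chars is kept whole as its own fragment.
    """
    fragments, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            fragments.append(current)
            current = word
        else:
            current = candidate
    if current:
        fragments.append(current)
    return fragments


def embed_fragment(fragment: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model: hashed token counts."""
    vec = [0.0] * DIM
    for token in fragment.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    return vec


def combine_embeddings(vectors: list[list[float]], weights: list[float]) -> list[float]:
    """Weighted mean of the fragment vectors, scaled to unit length."""
    total = sum(weights)
    mean = [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(len(vectors[0]))
    ]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm > 0 else mean


def embed_document(text: str, max_chars: int = 1000) -> list[float]:
    """Embed each fragment, then average, weighting by fragment length."""
    fragments = split_into_fragments(text, max_chars)
    if not fragments:
        return [0.0] * DIM
    vectors = [embed_fragment(f) for f in fragments]
    weights = [float(len(f)) for f in fragments]
    return combine_embeddings(vectors, weights)


def to_pgvector_literal(vec: list[float]) -> str:
    """Format a vector in the '[x,y,z]' text form used for pgvector input."""
    return "[" + ",".join(f"{x:.6f}" for x in vec) + "]"
```

The averaged vector gives one row per document, which keeps the index small, but averaging may blur distinct topics within a long text. Storing the fragment vectors themselves and searching those directly would be the alternative I am unsure about.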
