Re: Any idea about stroing and searching on formula

Dmitri Maziuk Sat, 10 Aug 2024 08:05:30 -0700

On 8/9/24 21:57, Zara Parst wrote:

Actually, I have a small app called MassiveMark, where people insert
different text, Markdown (MathML, LaTeX, chemistry formulas etc.), code blocks,
images and normal text. Later on we figured out this is mainly used by
students and professors to create lecture notes, exam papers etc. I guess
they are mainly converting text from ChatGPT and downloading it as docx.
However, a few users asked if we can allow them to store it and fetch it
later. We were also planning to let them search in the document. I
have no clue how we are going to search in organic chemistry. Compounds are
mainly handled as SMILES code, which is manipulated during visualization or
export to docx.

The same math formula can be written in different ways, e.g. 'y = ax + b' is the same as 'z = i + j * k'. Same goes for SMILES notation (in many cases) that's used for small molecules. Long nucleic acid and protein chains are a whole 'nother story: they are written as strings of letters (not SMILES) and have specialized sequence matching algorithms that are the subject of Bioinformatics 101.
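
To make the math case concrete, here is a rough Python sketch of what "the same formula" means once variable names and operand order are ignored. Everything in it is my own illustration, not something Solr or any library provides: canonical_form, shape and render are hypothetical names, it assumes multiplication is written explicitly ('a*x', not 'ax'), and it only handles names, constants and binary operators. Real formula matching needs much more than this (repeated variables, algebraic rewriting and so on).

```python
import ast

# Operators whose operands may be reordered without changing the meaning.
COMMUTATIVE = (ast.Add, ast.Mult)


def shape(node):
    """Name-independent structural key; commutative operands are sorted."""
    if isinstance(node, ast.Name):
        return "v"
    if isinstance(node, ast.Constant):
        return repr(node.value)
    if isinstance(node, ast.BinOp):
        parts = [shape(node.left), shape(node.right)]
        if isinstance(node.op, COMMUTATIVE):
            parts.sort()
        return f"({type(node.op).__name__} {parts[0]} {parts[1]})"
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def render(node, names):
    """Render the tree with variables renamed v0, v1, ... in visit order."""
    if isinstance(node, ast.Name):
        names.setdefault(node.id, f"v{len(names)}")
        return names[node.id]
    if isinstance(node, ast.Constant):
        return repr(node.value)
    if isinstance(node, ast.BinOp):
        operands = [node.left, node.right]
        if isinstance(node.op, COMMUTATIVE):
            # Put operands in a fixed order based on structure alone.
            operands.sort(key=shape)
        left = render(operands[0], names)
        right = render(operands[1], names)
        return f"({type(node.op).__name__} {left} {right})"
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def canonical_form(formula):
    """Canonical string for 'lhs = rhs' (hypothetical helper, sketch only)."""
    lhs, rhs = formula.split("=", 1)
    names = {}
    left = render(ast.parse(lhs.strip(), mode="eval").body, names)
    right = render(ast.parse(rhs.strip(), mode="eval").body, names)
    return f"{left} = {right}"


print(canonical_form("y = a*x + b"))
print(canonical_form("z = i + j * k"))
```

Both calls print the same string, so the two spellings would land on the same key, while 'y = a*x - b' would not. A plain text index sees none of this: it only sees different tokens.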


I.e. Solr is the wrong tool for those jobs.

It'll work for text search, it's just that they'll likely get garbage out if they try searching for a math or chemical formula.


Dima
