Ah, it was what I thought you were talking about -- I wasn't aware they were considered word boundaries :)
Thanks for the links!

On Mar 13, 2017 4:54 PM, "Richard Wordingham" <richard.wording...@ntlworld.com> wrote:

On Mon, 13 Mar 2017 15:26:00 -0700
Manish Goregaokar <man...@mozilla.com> wrote:

> Do you have examples of AA being split that way (and further reading)?
> I think I'm aware of what you're talking about, but would love to read
> more about it.

Just googling for the three words 'Sanskrit', 'sandhi' and 'resolution' brings up plenty of papers and discussion, e.g. Hellwig's at http://ltc.amu.edu.pl/book/papers/LRL-1.pdf and a multi-author paper at https://www.aclweb.org/anthology/C/C16/C16-1048.pdf.

There are even technical terms for before and after. Unsplit text is 'samhita text', and text split into words is 'pada text'.

Richard.