That sounds really cool. I recently hacked together a swish-e index of my org files (there might have been 3000+!):
http://kitchingroup.cheme.cmu.edu/blog/2015/06/25/Integrating-swish-e-and-Emacs/

I just updated it to index the HTML version of each org file so that I can take advantage of the structure in the search:
http://kitchingroup.cheme.cmu.edu/blog/2015/07/03/Using-swish-e-to-index-org-files-as-html/

It would be cool to have more granular searching, though. Is your Info project visible anywhere? I can imagine a close-file hook function that updates the database automatically.

Oleg Sivokon writes:

> Hello list!
>
> Suppose I wanted to extract the structure from an Org document, where
> what's important for me would be to have it categorically divided into
> headers, paragraphs of text, technical information and inclusion of
> other documents (code snippets). How would I do it?
>
> The reason I'm asking is that I've a small project I work on, where I'm
> trying to enhance the search in documents by using indexing combined
> with queries based on things like distance between words, frequency of a
> word appearing in a document and so on. (I'm using Sphinx for it.)
> I've tried to do this with Info pages, and I liked the results; however,
> in order to do this more intelligently, I'd like to index the documents
> with better granularity (i.e. so that later on I could search assigning
> different weights to words appearing in headers and words appearing in
> comments).
>
> Best.
>
> Oleg

--
Professor John Kitchin
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu
