On Wed, Feb 23, 2011 at 9:14 AM, Andrew S. Townley <[email protected]> wrote:
> Well, actually, I want it for more than that. For my particular needs, I
> need to get the field name where the match occurred in the document, and then
> I'd ideally like to have the start offset into that field and the length of
> the match.

Apart from the lack of Ruby bindings, this won't be a problem with Lucy.
It's a data-forward approach, so if the information is in the indexes,
you'll have access to it. You might need to write a custom Hit class (or
the like), but it will certainly be possible.

> One of the things that struck me was the "implementing as much functionality
> in high-level languages as possible" comment. What does this mean, exactly?
> Why was this approach chosen rather than put all the muscle in the C code and
> provide thin wrappers--even via SWIG or something more hand-tailored where
> necessary/appropriate?

I think you're missing an implied "And not only that, if you order by
midnight tonight you'll also receive..." Lucy has (or will have) a complete
C core that can be used directly, but it will also be possible to override
the functionality class-by-class in Perl, Ruby, Python, etc. It's the added
potential for accessing this functionality from a scripting language that is
being highlighted, not a requirement to do so.

> Part of the reason I ask has to do with the future of my own project. Much
> of what I have now will eventually be rewritten piecemeal in C++ and then
> wrapped via SWIG so I can have Ruby and Java bindings as well as use it in
> other environments natively supporting C/C++. Whatever route I end up going
> for fulltext, this is something that would need to support the same kind of
> thing as I'd actually be leveraging it more from the C++ code than the Ruby
> code.

Sounds like an excellent fit for Lucy. In the same way that we hope to allow
the C core to be overridden with scripting languages for fast prototyping,
it should also be easy to then selectively optimize that with C++. It's an
ambitious multilingual goal, so it's possible it will not be fully achieved,
but your sort of application is exactly the reason this approach was chosen.

Nathan Kurz
[email protected]
