Hello, I'm quite new to Solr, but I'm wondering if it can help me index training content on my LMS (Moodle). The catch is that many of our training modules are created in Adobe Captivate, which means they are basically zip files with HTML inside (some use Flash, but we can limit them to HTML5 if that makes things easier).
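As a rough illustration of the "zip files with HTML inside" layout, the text could in principle be pulled out client-side before anything is sent to Solr. The sketch below uses only the Python standard library and is not something Solr or Captivate provides. The function name `html_docs_from_zip`, the `module_id` argument, and the `id`/`content` field names are all hypothetical. It also assumes the HTML files are UTF-8.

```python
import io
import zipfile
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_docs_from_zip(zip_source, module_id):
    """Return one dict per HTML file found in the zip.

    zip_source may be a filesystem path or a binary file-like object.
    The "id" and "content" keys are assumed field names, not a Solr default.
    """
    docs = []
    with zipfile.ZipFile(zip_source) as zf:
        for name in zf.namelist():
            # Only HTML entries are read; Flash and other assets are ignored.
            if not name.lower().endswith((".html", ".htm")):
                continue
            parser = _TextExtractor()
            parser.feed(zf.read(name).decode("utf-8", errors="replace"))
            parser.close()
            docs.append({
                "id": f"{module_id}/{name}",
                "content": " ".join(parser.chunks),
            })
    return docs
```

The resulting dicts would still need to be posted to Solr separately, and the field names would have to match whatever the schema defines.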
I've been reading up on whether and how this sort of content could be indexed in Solr, and I'm still a bit confused. It looks like Solr Cell <https://wiki.apache.org/solr/ExtractingRequestHandler> *could* do this, since it can handle compressed formats like Word docs. But there's a Jira issue from several years ago <https://issues.apache.org/jira/browse/SOLR-2416>, still in state Open, which purports to contain a patch for letting it index zip files more generally. So I guess Word is a special case?

Basically, does anyone have experience with this sort of thing? If it's possible, are there any examples or more specific instructions I should look at?

Thanks!
--Brad
