To test these ideas, we need a working proof of concept. How about creating a Wiki page, "Keywords", holding a simple list of links with keywords, in alphabetical order or following some other convention? E.g.

== Wiki ==
 * [:Essays/Text Formatting] transitive closure, fold text, line wrap, word wrap
 * [:Studio/Sockets and the Internet] sockets, internet
 * [:Studio/Regular Expressions]
 * [:System/Windows Front End/Requests] gui, front-end, forms

== J Documentation ==
 * [wiki:JDic:d331 Cut] substring

This will be a community-maintained index, and if a content maintainer wishes to internalize those keywords, they will be removed from this list. This approach will also solve the problem of immutable pages and non-community content such as the dictionary.

--- Skip Cave <[EMAIL PROTECTED]> wrote:

> Oleg Kobchenko wrote:
> > OK, everybody got it that attaching keywords to
> > any and all J documentation is suggested. Now, how to
> > go about it practically?
>
> Skip Says:
>
> Glad you asked.
>
> It turns out that there is a fairly straightforward way to approach the
> problem of adding keywords to all kinds of documents. I stumbled across
> the solution a few years ago when working on a project to index lots of
> internal documents in all kinds of formats. The secret is... in the
> search engine!
>
> I mean literally in the search engine. When you think about it, when a
> new document is added to the system, the search engine has to make an
> indexing pass over that document, to index all of the words that are in
> the document. If someone adds more keywords to the document at a later
> date, the search engine must make another indexing pass over the
> document, to add the new words to the index for that document. The
> problem with putting keywords in documents is that one may not have the
> appropriate program to modify a particular obscure document format to
> add a keyword. Worse, not all documents can be modified at all
> (immutable wikis, PDFs). We clearly need some out-of-the-box thinking to
> approach this problem.
>
> What if one could just add the keywords directly to the index that is
> kept by the search engine, instead of adding the keywords to the
> document and then re-indexing?
>
> That's the secret of adding keywords to all kinds of documents. Don't
> put the keywords in the documents, put them in the index that points to
> the documents. That's where they all end up, anyway. Then you don't have
> to access or modify the actual documents at all.
>
> Of course it is easier to describe this solution than to implement it.
> Generally, this approach means an open-source search engine of some
> kind, such as Lucene or others
> (http://www.searchtools.com/tools/tools-opensource.html), as well as
> some serious index-engine surgery to allow direct modifications to the
> search engine's index structure.
>
> So how would this work, from the user's point of view? When a user types
> a search phrase in a web page search box, the resulting found-document
> list should be shown on the page, like any normal search engine. The new
> functionality appears when the user clicks on one of the presented
> document links.
>
> After clicking on the link, the selected document should be displayed,
> just like any search engine. However, the browser should also display a
> new frame which contains a text entry box for new keywords, as well as a
> display/hide keyword button which will show or hide all of the keywords
> currently added to the selected document. So, while the user is perusing
> the document, any keyword that pops into their head can be typed into
> the keyword frame in the browser. An "enter" button inserts the keyword
> into the existing index for that document. The keyword frame doesn't
> display all of the words indexed from the document, just the keywords
> that have been added to the document (actually they were added to the
> document's index).
>
> If a user types a new keyword or phrase in the entry box, that keyword
> will be directly added to the search engine's index for that selected
> document. Of course, the search engine will need to keep the
> manually-added keywords separate from the originally indexed words in
> the document, but all of the document's words and phrases, added or
> original, will be used for searching.
>
> I wonder how hard it would be to build a search engine in J? The biggest
> problem for search engines is usually the text-extraction routines for
> the multitude of obscure document formats. J doesn't provide much of an
> advantage in that area.
>
> So now you know. Keyword tagging - a very powerful concept, with a
> simple solution (at least simple in theory).
>
> Skip
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
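
To make the "keywords go in the index, not in the document" idea concrete, here is a minimal sketch in Python. It is not Lucene or any real search engine, just a toy inverted index held in memory. The class name KeywordIndex, its method names, and the sample document id and text are all made up for illustration. It keeps the words extracted from a document apart from the manually added keywords, but uses both when searching.

```python
import re
from collections import defaultdict


class KeywordIndex:
    """Toy inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        # term -> set of document ids, filled by the indexing pass
        self.extracted = defaultdict(set)
        # term -> set of document ids, filled by manual tagging
        self.added = defaultdict(set)

    @staticmethod
    def _terms(text):
        # Lowercase and split into alphanumeric words.
        return re.findall(r"[a-z0-9]+", text.lower())

    def index_document(self, doc_id, text):
        """Indexing pass over the document's own words."""
        for term in self._terms(text):
            self.extracted[term].add(doc_id)

    def add_keyword(self, doc_id, keyword):
        """Attach a keyword or phrase to the index; the document is untouched."""
        for term in self._terms(keyword):
            self.added[term].add(doc_id)

    def keywords_for(self, doc_id):
        """Only the manually added terms, as a keyword frame would show them."""
        return sorted(t for t, docs in self.added.items() if doc_id in docs)

    def search(self, query):
        """Documents matching every query term, from either source."""
        terms = self._terms(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self.extracted.get(term, set()) | self.added.get(term, set())
            result = docs if result is None else result & docs
        return result


index = KeywordIndex()
# Placeholder body text; the real page content is not reproduced here.
index.index_document("d331", "placeholder body text for the Cut page")
index.add_keyword("d331", "substring")
print(index.search("substring"))      # matched through the added keyword
print(index.keywords_for("d331"))     # added keywords only
```

Keeping two maps is the simplest way to keep added keywords distinguishable from extracted words; a real engine would more likely store them as a separate field on the same index entry, and would need to persist them across re-indexing.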
