[ https://issues.apache.org/jira/browse/LUCENE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598204#comment-13598204 ]
Shai Erera commented on LUCENE-3550:
------------------------------------

A few comments:

* Please remove @author tags. We don't use them, and the build fails if it finds any.
* In general, I think that the code needs to be better documented, since this is example code. So for instance I would add:
** to index() a comment saying "IndexWriterConfig lets you configure how IndexWriter works as well as how documents are indexed".
** to search() a comment saying "QueryParser is able to parse a query string into a meaningful Query object which is used to match and score documents".
** etc.
* If there's nothing special to say about an exception that is thrown, can you please remove @throws from javadocs?
* addDocs:
** I would rename it to addDoc
** Modify the comment "create index" to "add document to the index"
* Currently the code prints messages, which we try to avoid (e.g. during tests). So either we add to DemoConstants a VERBOSE property that is initialized to System.getProperty("tests.verbose"), or you just move all the prints to main()?
** In that regard, search() can return a {{ScoreDoc[]}} which main() can use to print results and which tests can use to assert on.
** I.e. rather than asserting that search() returned 1 or 2 hits, we can assert their order etc. (not saying we have to for this example).
* In order to better test the example, I would make it take a Directory (e.g. index(Directory), search(Directory) or SimpleCoreExample(Directory)) and pass newDirectory() from tests (note: there's no space intentionally).
** This will detect incomplete code, e.g. you don't close the reader in search().
* Also, I think that the example should better clarify that we don't e.g. care about casing, so for instance if you index "Apache", search for "apache".
** main() could also run two searches, to print diverse results
** and tests (and main()) should test multi-word queries too

As a start, it looks great.
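The VERBOSE option mentioned in the list above could look roughly like the sketch below. Only the names DemoConstants, VERBOSE and the "tests.verbose" property come from the comment; the boolean parsing, the default of false and the log() helper are assumptions made for illustration and are not taken from the patch.

```java
// Hypothetical sketch of the VERBOSE idea; not the code from LUCENE-3550.patch.
final class DemoConstants {

  /**
   * True only when the JVM was started with -Dtests.verbose=true.
   * Treating the property as a boolean that defaults to false is an assumption.
   */
  static final boolean VERBOSE =
      Boolean.parseBoolean(System.getProperty("tests.verbose", "false"));

  private DemoConstants() {} // constants holder, never instantiated

  /** Hypothetical helper: prints the message only when VERBOSE is set. */
  static void log(String message) {
    if (VERBOSE) {
      System.out.println(message);
    }
  }
}
```

With something like this in place, example code would call DemoConstants.log(...) instead of System.out.println(...), so test runs stay quiet unless verbose output is requested. Moving all prints into main(), the other option in the comment, avoids the extra constant entirely.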
I think though that it would be better if our simple example contained:

** Documents with more than one field, to show different Field types (TextField, StringField, DocValuesField)
** Instead of a single search(), have different searchXYZ methods, e.g.
*** searchKeyword (using default field), searchFields (execute fielded search)
*** searchBooleanQuery, searchRangeQuery to show QueryParser's syntax
*** searchSort to sort results

I consider these simple/basic examples, since that's really the essence of Lucene -- indexing documents with a few fields and querying for them in different ways.

> Create example code for core
> ----------------------------
>
>                 Key: LUCENE-3550
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3550
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/other
>            Reporter: Shai Erera
>              Labels: newdev
>         Attachments: LUCENE-3550.patch
>
>
> Trunk has undergone lots of API changes, some of which are not trivial, and
> the migration path from 3.x to 4.0 seems hard. I'd like to propose some way
> to tackle this, by means of live example code.
> The facet module implements this approach. There is live Java code under
> src/examples that demonstrates some well documented scenarios. The code
> itself is documented, in addition to javadoc. Also, the code itself is being
> unit tested regularly.
> We found it very difficult to keep documentation up-to-date -- javadocs
> always lag behind, Wiki pages get old etc. However, when you have live Java
> code, you're *forced* to keep it up-to-date. It doesn't compile if you break
> the API, it fails to run if you change internal impl behavior. If you keep it
> simple enough, its documentation stays simple too.
> And if we are successful at maintaining it (which we must be, otherwise the
> build should fail), then people should have an easy experience migrating
> between releases. So say you take the simple scenario "I'd like to index
> documents which have the fields ID, date and body".
> Then you create an example class/method that accomplishes that. And between
> releases, this code gets updated, and people can follow the changes required
> to implement that scenario.
> I'm not saying the examples code should always stay optimized. We can aim at
> that, but I don't try to fool myself thinking that we'll succeed. But at
> least we can get it compiled and regularly unit tested.
> I think that it would be good if we introduce the concept of examples such
> that if a module (core, contrib, modules) has an src/examples, we package it
> in a .jar and include it with the binary distribution. That's for a first
> step. We can also have meta examples, under their own module/contrib, that
> show how to combine several modules together (this might even uncover API
> problems), but that's definitely a second phase.
> At first, let's do the "unit examples" (à la unit tests) and better start with
> core. Whatever we succeed at writing for 4.0 will only help users. So let's
> use this issue to:
> # List example scenarios that we want to demonstrate for core
> # Build the infrastructure in our build system to package and distribute a
> module's examples.
> Please feel free to list here example scenarios that come to mind. We can
> then track what's been done and what's not. The more we do the better.