[
https://issues.apache.org/jira/browse/LUCENE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598204#comment-13598204
]
Shai Erera commented on LUCENE-3550:
------------------------------------
A few comments:
* Please remove @author tags. We don't use them, and the build fails if
it finds any.
* In general, I think the code needs more documentation, since this is
example code. So for instance I would add:
** to index() a comment saying "IndexWriterConfig lets you configure how
IndexWriter works as well as how documents are indexed".
** to search() a comment saying "QueryParser is able to parse a query string
into a meaningful Query object which is used to match and score documents".
** etc...
* If there's nothing special to say about an exception that is thrown, can you
please remove @throws from javadocs?
* addDocs:
** I would rename to addDoc
** Modify the comment "create index" to "add document to the index"
* Currently the code prints messages, which we try to avoid (e.g. during
tests). So either we add to DemoConstants a VERBOSE property that is
initialized to System.getProperty("tests.verbose"), or you just move all the
prints to main()?
** In that regard, search() can return a {{ScoreDoc[]}}, which main() can use to
print results and tests can use to assert on.
** I.e. rather than asserting that search() returned 1 or 2 hits, we can assert
their order etc. (not saying we have to for this example).
* In order to better test the example, I would make it take a Directory (e.g.
index(Directory), search(Directory) or SimpleCoreExample(Directory)) and pass
newDirectory() from tests (note: there's no space intentionally).
** This will detect incomplete code, e.g. not closing the reader in
search().
* Also, I think the example should make it clearer that we don't care about
casing, e.g. if you index "Apache", search for "apache".
** main() could also run two searches, to print diverse results
** and tests (and main()) should test multi-word queries too
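A minimal sketch of the points above about printing, return values and casing, in plain Java with no Lucene dependency. The class name SearchSketch, its VERBOSE field and the substring matching are hypothetical stand-ins for illustration only; the real example would return a {{ScoreDoc[]}} from an IndexSearcher and rely on the analyzer for casing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SearchSketch {
    // Hypothetical stand-in for the proposed DemoConstants.VERBOSE: true only
    // when the JVM is started with -Dtests.verbose=true.
    static final boolean VERBOSE =
            Boolean.parseBoolean(System.getProperty("tests.verbose"));

    // Returns the matching documents in corpus order instead of printing them,
    // so that both main() and a test can consume the result.
    static List<String> search(List<String> docs, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            // Lowercase both sides so that "apache" matches "Apache".
            if (doc.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc);
            }
        }
        if (VERBOSE) {
            System.out.println("search(" + term + ") -> " + hits.size() + " hits");
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs =
                Arrays.asList("Apache Lucene", "Apache Solr", "Something else");
        // Printing lives in main(), not in search().
        for (String hit : search(docs, "apache")) {
            System.out.println(hit);
        }
    }
}
```

Because search() returns its hits, a test can assert on their content and order rather than only on a count, and it stays silent unless tests.verbose is set.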
As a start, it looks great. I think though that it would be better if our
simple example contained:
** Documents with more than one field, to show different Field types
(TextField, StringField, DocValuesField)
** Instead of a single search(), have different searchXYZ methods, e.g.
*** searchKeyword (using default field), searchFields (execute fielded search)
*** searchBooleanQuery, searchRangeQuery to show QueryParser's syntax
*** searchSort to sort results
I consider these simple/basic examples, since that's really the essence of
Lucene -- indexing documents with a few fields and querying them in different
ways.
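To make the proposed searchXYZ methods more concrete, here is a rough sketch of the kind of query string each one might hand to QueryParser, written in the classic query syntax. It only holds strings and does not call Lucene. The field names id and date are borrowed from the scenario in the issue description; the class name and all values are made up for illustration. searchSort is left out because sort order is normally passed to the search call rather than expressed in the query string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QuerySyntaxSketch {
    // Maps each proposed method name to a sample query string (hypothetical values).
    static Map<String, String> sampleQueries() {
        Map<String, String> queries = new LinkedHashMap<>();
        // Bare term: matched against the parser's default field.
        queries.put("searchKeyword", "apache");
        // field:term restricts the match to a named field.
        queries.put("searchFields", "id:42");
        // + marks a required term, - a prohibited one.
        queries.put("searchBooleanQuery", "+apache -solr");
        // Square brackets denote an inclusive range.
        queries.put("searchRangeQuery", "date:[20130101 TO 20131231]");
        return queries;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : sampleQueries().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```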
> Create example code for core
> ----------------------------
>
> Key: LUCENE-3550
> URL: https://issues.apache.org/jira/browse/LUCENE-3550
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/other
> Reporter: Shai Erera
> Labels: newdev
> Attachments: LUCENE-3550.patch
>
>
> Trunk has undergone lots of API changes, some of which are not trivial, and
> the migration path from 3.x to 4.0 seems hard. I'd like to propose some way
> to tackle this, by means of live example code.
> The facet module implements this approach. There is live Java code under
> src/examples that demonstrates some well-documented scenarios. The code itself
> is documented, in addition to javadoc. Also, the code itself is unit
> tested regularly.
> We found it very difficult to keep documentation up-to-date -- javadocs
> always lag behind, Wiki pages get old etc. However, when you have live Java
> code, you're *forced* to keep it up-to-date. It doesn't compile if you break
> the API, it fails to run if you change internal impl behavior. If you keep it
> simple enough, its documentation stays simple too.
> And if we are successful at maintaining it (which we must be, otherwise the
> build should fail), then people should have an easy experience migrating
> between releases. So say you take the simple scenario "I'd like to index
> documents which have the fields ID, date and body". Then you create an
> example class/method that accomplishes that. And between releases, this code
> gets updated, and people can follow the changes required to implement that
> scenario.
> I'm not saying the examples code should always stay optimized. We can aim at
> that, but I don't try to fool myself thinking that we'll succeed. But at
> least we can get it compiled and regularly unit tested.
> I think that it would be good if we introduce the concept of examples such
> that if a module (core, contrib, modules) has a src/examples, we package it
> in a .jar and include it with the binary distribution. That's for a first
> step. We can also have meta examples, under their own module/contrib, that
> show how to combine several modules together (this might even uncover API
> problems), but that's definitely a second phase.
> At first, let's do the "unit examples" (ala unit tests) and better start with
> core. Whatever we succeed at writing for 4.0 will only help users. So let's
> use this issue to:
> # List example scenarios that we want to demonstrate for core
> # Build the infrastructure in our build system to package and distribute a
> module's examples.
> Please feel free to list here example scenarios that come to mind. We can
> then track what's been done and what's not. The more we do the better.