[jira] [Commented] (LUCENE-3550) Create example code for core

Shai Erera (JIRA) Sun, 10 Mar 2013 03:55:16 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598204#comment-13598204
 ]


Shai Erera commented on LUCENE-3550:
------------------------------------

A few comments:

* Please remove @author tags. We don't use them, and the build fails if it 
finds any.

* In general, I think the code needs more documentation, since this is 
example code. For instance, I would add:
** to index() a comment saying "IndexWriterConfig lets you configure how 
IndexWriter works as well as how documents are indexed".
** to search() a comment saying "QueryParser is able to parse a query string 
into a meaningful Query object which is used to match and score documents".
** etc...

* If there's nothing special to say about an exception that is thrown, can you 
please remove @throws from javadocs?

* addDocs:
** I would rename to addDoc
** Modify the comment "create index" to "add document to the index"

* Currently the code prints messages, which we try to avoid (e.g. during 
tests). So either we add to DemoConstants a VERBOSE property that is 
initialized to System.getProperty("tests.verbose"), or you just move all the 
prints to main()?
** In that regard, search() can return a {{ScoreDoc[]}}, which main() can use to 
print results and tests can use to assert on.
** I.e. rather than asserting that search() returned 1 or 2 hits, we can assert 
their order etc. (not saying we have to for this example).
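
A minimal sketch of the VERBOSE option described above, using only the JDK. The class name DemoConstants and the "tests.verbose" property come from the comment; the log() helper is a hypothetical addition for illustration and is not part of the patch.

```java
/** Shared constants for the example code (sketch, not the patch's actual class). */
public final class DemoConstants {

  /** True only when the JVM was started with -Dtests.verbose=true. */
  public static final boolean VERBOSE =
      Boolean.parseBoolean(System.getProperty("tests.verbose", "false"));

  private DemoConstants() {
    // constants holder, not meant to be instantiated
  }

  /** Hypothetical helper: prints the message only when VERBOSE is set. */
  public static void log(String message) {
    if (VERBOSE) {
      System.out.println(message);
    }
  }
}
```

With this shape the example methods stay silent under tests by default, and main() remains free to print unconditionally.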

* In order to better test the example, I would make it take a Directory (e.g. 
index(Directory), search(Directory) or SimpleCoreExample(Directory)) and pass 
from tests newDirectory() (note: there's no space intentionally).
** This will detect incomplete code, e.g. that you don't close the reader in 
search().

* Also, I think the example should make clear that matching is not 
case-sensitive, e.g. index "Apache" and search for "apache".
** main() could also run two searches, to print diverse results
** and tests (and main()) should test multi-word queries too
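
The Directory-based shape suggested in the last two points could look roughly like the sketch below. It is not the attached patch. It assumes a recent Lucene release (8.x or later) with lucene-core and lucene-queryparser on the classpath; the 4.x API current at the time of this thread additionally took a Version argument in the StandardAnalyzer, IndexWriterConfig and QueryParser constructors. The "body" field name and the sample texts are made up for illustration, and ByteBuffersDirectory stands in for the test framework's newDirectory().

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class SimpleCoreExample {

  private static final String BODY = "body"; // hypothetical field name

  private final Directory dir;

  /** The caller owns the Directory, so tests can pass their own. */
  public SimpleCoreExample(Directory dir) {
    this.dir = dir;
  }

  /** Adds one document per text to the index. */
  public void index(String... texts) throws IOException {
    // IndexWriterConfig lets you configure how IndexWriter works
    // as well as how documents are indexed.
    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    try (IndexWriter writer = new IndexWriter(dir, config)) {
      for (String text : texts) {
        Document doc = new Document();
        doc.add(new TextField(BODY, text, Field.Store.NO));
        writer.addDocument(doc);
      }
    }
  }

  /** Returns the top n hits so callers can print them or assert on them. */
  public ScoreDoc[] search(String queryString, int n) throws IOException, ParseException {
    // QueryParser parses a query string into a Query object
    // which is used to match and score documents.
    Query query = new QueryParser(BODY, new StandardAnalyzer()).parse(queryString);
    try (IndexReader reader = DirectoryReader.open(dir)) { // reader is always closed
      return new IndexSearcher(reader).search(query, n).scoreDocs;
    }
  }

  public static void main(String[] args) throws Exception {
    try (Directory dir = new ByteBuffersDirectory()) {
      SimpleCoreExample example = new SimpleCoreExample(dir);
      example.index("Apache Lucene is a search library", "Plain Java code");
      // StandardAnalyzer lowercases terms, so "apache" matches "Apache".
      for (ScoreDoc hit : example.search("apache", 10)) {
        System.out.println("doc=" + hit.doc + " score=" + hit.score);
      }
      // A multi-word query; QueryParser ORs the terms by default.
      for (ScoreDoc hit : example.search("search library", 10)) {
        System.out.println("doc=" + hit.doc + " score=" + hit.score);
      }
    }
  }
}
```

All printing is confined to main(), and because search() returns the hits, a test can assert on their number and order rather than on console output.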

As a start, it looks great. I think though that it would be better if our 
simple example contained:
** Documents with more than one field, to show different Field types 
(TextField, StringField, DocValuesField)
** Instead of a single search(), have different searchXYZ methods, e.g.
*** searchKeyword (using default field), searchFields (execute fielded search)
*** searchBooleanQuery, searchRangeQuery to show QueryParser's syntax
*** searchSort to sort results

I consider these simple/basic examples, since that's really the essence of 
Lucene -- indexing documents with a few fields and querying them in different 
ways.
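
A rough sketch of what such searchXYZ methods might look like, again assuming a recent Lucene release (8.x or later) rather than the 4.x API of this thread. The class name, the field names ("id", "body", "year") and the sample documents are all hypothetical, and NumericDocValuesField stands in for the "DocValuesField" mentioned above.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.store.Directory;

public class SearchVariantsExample {

  private final Directory dir;
  private final Analyzer analyzer = new StandardAnalyzer();

  public SearchVariantsExample(Directory dir) {
    this.dir = dir;
  }

  /** Indexes three sample documents, each with three different field types. */
  public void index() throws IOException {
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
      addDoc(writer, "a1", "Apache Lucene core", 2011);
      addDoc(writer, "b2", "Apache Solr server", 2013);
      addDoc(writer, "c3", "Lucene facet module", 2012);
    }
  }

  private void addDoc(IndexWriter writer, String id, String body, long year) throws IOException {
    Document doc = new Document();
    doc.add(new StringField("id", id, Field.Store.YES));  // indexed verbatim as a single term
    doc.add(new TextField("body", body, Field.Store.NO)); // analyzed full text
    doc.add(new NumericDocValuesField("year", year));     // per-document value, used for sorting
    writer.addDocument(doc);
  }

  /** Parses the query against the default field "body" and returns up to 10 hits. */
  private ScoreDoc[] run(String queryString, Sort sort) throws IOException, ParseException {
    Query query = new QueryParser("body", analyzer).parse(queryString);
    try (IndexReader reader = DirectoryReader.open(dir)) {
      IndexSearcher searcher = new IndexSearcher(reader);
      return sort == null
          ? searcher.search(query, 10).scoreDocs
          : searcher.search(query, 10, sort).scoreDocs;
    }
  }

  /** Single keyword against the default field. */
  public ScoreDoc[] searchKeyword(String word) throws IOException, ParseException {
    return run(word, null);
  }

  /** Fielded search using the field:value syntax. */
  public ScoreDoc[] searchFields(String field, String value) throws IOException, ParseException {
    return run(field + ":" + value, null);
  }

  /** Boolean syntax: must contain "lucene", must not contain "facet". */
  public ScoreDoc[] searchBooleanQuery() throws IOException, ParseException {
    return run("+lucene -facet", null);
  }

  /** Range syntax: ids between a1 and b2, both ends inclusive. */
  public ScoreDoc[] searchRangeQuery() throws IOException, ParseException {
    return run("id:[a1 TO b2]", null);
  }

  /** Keyword search with hits ordered by ascending year instead of by score. */
  public ScoreDoc[] searchSort(String word) throws IOException, ParseException {
    return run(word, new Sort(new SortField("year", SortField.Type.LONG)));
  }
}
```

Each method returns a ScoreDoc[] so that tests can assert on hit counts and order, in line with the earlier suggestion about search().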
                
> Create example code for core
> ----------------------------
>
>                 Key: LUCENE-3550
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3550
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/other
>            Reporter: Shai Erera
>              Labels: newdev
>         Attachments: LUCENE-3550.patch
>
>
> Trunk has undergone lots of API changes. Some of them are not trivial, and 
> the migration path from 3.x to 4.0 seems hard. I'd like to propose a way 
> to tackle this, by means of live example code.
> The facet module implements this approach. There is live Java code under 
> src/examples that demonstrates some well documented scenarios. The code itself 
> is documented, in addition to javadoc. Also, the code itself is being unit 
> tested regularly.
> We found it very difficult to keep documentation up-to-date -- javadocs 
> always lag behind, Wiki pages get old etc. However, when you have live Java 
> code, you're *forced* to keep it up-to-date. It doesn't compile if you break 
> the API, it fails to run if you change internal impl behavior. If you keep it 
> simple enough, its documentation stays simple too.
> And if we are successful at maintaining it (which we must be, otherwise the 
> build should fail), then people should have an easy experience migrating 
> between releases. So say you take the simple scenario "I'd like to index 
> documents which have the fields ID, date and body". Then you create an 
> example class/method that accomplishes that. And between releases, this code 
> gets updated, and people can follow the changes required to implement that 
> scenario.
> I'm not saying the example code should always stay optimized. We can aim for 
> that, but I won't fool myself into thinking that we'll succeed. But at 
> least we can get it compiled and regularly unit tested.
> I think that it would be good if we introduce the concept of examples such 
> that if a module (core, contrib, modules) has a src/examples, we package it 
> in a .jar and include it with the binary distribution. That's for a first 
> step. We can also have meta examples, under their own module/contrib, that 
> show how to combine several modules together (this might even uncover API 
> problems), but that's definitely a second phase.
> At first, let's do the "unit examples" (ala unit tests) and better start with 
> core. Whatever we succeed at writing for 4.0 will only help users. So let's 
> use this issue to:
> # List example scenarios that we want to demonstrate for core
> # Build the infrastructure in our build system to package and distribute a 
> module's examples.
> Please feel free to list here example scenarios that come to mind. We can 
> then track what's been done and what's not. The more we do the better.

