On 13.11.2012 11:18, Knut Anders Hatlen wrote:
Lubos Kosco <[email protected]> writes:

Hi guys, can you have a look please?
Thanks, Lubos. I had a quick look at the patch (skipped quickly past the

Thank you for the review kah ;)

big changes to the NetBeans project files...), and it looks mostly fine
to me. I'm not a Lucene expert, though.

Some questions:

I saw many changes like this one:

-        }
-        doc.add(new Field("date", date, Field.Store.YES, Field.Index.NOT_ANALYZED));
+        }
+        doc.add(new Field("date", date, StringField.TYPE_STORED));

According to http://lucene.apache.org/core/4_0_0/MIGRATE.html, one
should also call ft.setOmitNorms(false) to preserve the original
semantics for this particular combination of arguments. Is this
something we need to do?

Good catch. I had it there before, so it's my mistake from doing merges (I had some fighting with the reusable components in Analyzers :( , eventually an ApacheCon visit solved it :-D ). I will check all field types used.
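
For reference, here is a minimal sketch of what the MIGRATE notes describe for this combination of arguments, assuming the Lucene 4.0 FieldType API. The class name DateFieldExample and the helper addDate are made up for illustration and are not part of the patch.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.StringField;

// Hypothetical helper class, only for illustration; not part of the patch.
public class DateFieldExample {
    // Copy of StringField.TYPE_STORED with norms kept, to mirror the old
    // Field.Store.YES + Field.Index.NOT_ANALYZED combination.
    private static final FieldType STORED_STRING_WITH_NORMS =
            new FieldType(StringField.TYPE_STORED);

    static {
        STORED_STRING_WITH_NORMS.setOmitNorms(false);
        // Freeze the shared type so it cannot be changed by accident later.
        STORED_STRING_WITH_NORMS.freeze();
    }

    // Adds the "date" field with the adjusted field type.
    static void addDate(Document doc, String date) {
        doc.add(new Field("date", date, STORED_STRING_WITH_NORMS));
    }
}
```

Sharing one frozen FieldType avoids repeating the setOmitNorms(false) call at every doc.add() site. Whether norms are actually needed for each field is a separate question.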


There are also many new instance variables in the analyzers (for the
TokenStreamComponents and tokenizers used in the new createComponents()
methods), but I don't see that they are ever read except in the local
scope where they are assigned a value. Could they be changed to local
variables?

I guess yes. The Analyzer API caches them once they are created, and all subsequent tokenStream calls either reuse the cache or create the object(s).
I will change them to local variables where appropriate and where no other reuse occurs.
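
Roughly what I have in mind, as a sketch only, assuming the Lucene 4.0 Analyzer API. LocalComponentsAnalyzer is a made-up name and WhitespaceTokenizer is just a stand-in for the real OpenGrok tokenizers.

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.util.Version;

// Hypothetical analyzer, only to show the shape of the change.
public class LocalComponentsAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // Local variable instead of an instance field: the Analyzer base class
        // keeps the returned components for reuse, so nothing else needs a reference.
        Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_40, reader);
        return new TokenStreamComponents(source);
    }
}
```

If an analyzer does read the tokenizer somewhere outside createComponents(), the field would have to stay there.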

I also forgot to do formatting, so I will run autoformat from NB on all changed files


Why?
Lucene 4.0 is 300% faster when indexing, 100% faster for queries
Sounds promising! :)

We can easily add regexp queries now
We can easily take latest highlighting
index statistics (and all that can be done from them: better
searching, grouping, categorization/classification)
possible Solr/Tika integrations
(for more search the web please)

The above should be more promising. I actually hope the highlighter and some index statistics will be used soon. I also have the Tika integration for PDF and Open/Libre/MS Office pending, once we're done with Lucene 4.0 ;)


Webrev
http://stargate.cnl.tuke.sk/~taz/webrev-2012-11-09-lucene_40/

not tested(in progress):
updating documents (I am not sure on uid document retrieval, wasn't
obvious to port)

tested: all junits, regression on term count numbers (same numbers of
tokenized terms as in 3.6.1)

what's missing: build system fixes so l40 will get autodownloaded,
Another build problem I had, was that I had to set the
platforms.JDK_1.7.home variable when running Ant. Never seen that
before. It works fine without the patch.

BUILD FAILED
/code/opengrok/trunk/nbproject/build-impl.xml:86: The J2SE Platform is not correctly set up.
  Your active platform is: JDK_1.7, but the corresponding property "platforms.JDK_1.7.home" is not found in the project's properties files.
  Either open the project in the IDE and setup the Platform with the same name or add it manually.
  For example like this:
      ant -Duser.properties.file=<path_to_property_file> jar (where you put the property "platforms.JDK_1.7.home" in a .properties file)
   or ant -Dplatforms.JDK_1.7.home=<path_to_JDK_home> jar (where no properties file is used)

Yes, I haven't played with the build system yet, and NetBeans was upgraded in my environment in the meantime too (which seems to be the cause),
so I won't push anything until I have tested Ant both from the CLI and from NetBeans.
(If I have spare time I might do the same for Eclipse and IDEA. I saw on Bitbucket that J. Ryan Stinnett was developing OpenGrok in IDEA, which is a very good idea ;) )



packaging fixes, lucene compatibility test auto run if lucene
test-framework on classpath
(will add it once I get some time)

I have the lucene compatibility tests running (in a very basic form, but they are) - when the classpath has the lucene test-framework jars

will publish new review today/tomorrow

cheers
Lubos
