Now with all the changes:
https://cr.opensolaris.org/action/browse/opengrok/taz/lucene40/webrev-2012-11-14-lucene_40/
and without the nb7.2 upgrade changes:
- depends on JDK 1.7
- I added a fix for an NPE when the webapp is not initialized (thanks, Lukas)
- source code formatting and cleanups, license and date fixes
- build fixes
- packaging fixes
- the Lucene compatibility class will still fail, however - we need to redesign
the Analyzers, possibly getting rid of caching the whole file in memory, to be
compatible with the test class (next step)
Thanks,
Lubos
On 13.11.2012 13:14, Lubos Kosco wrote:
On 13.11.2012 11:18, Knut Anders Hatlen wrote:
Lubos Kosco <[email protected]> writes:
Hi guys, can you have a look please?
Thanks, Lubos. I had a quick look at the patch (skipped quickly past the
big changes to the NetBeans project files...), and it looks mostly fine
to me. I'm not a Lucene expert, though.
Thank you for the review kah ;)
Some questions:
I saw many changes like this one:
- }
- doc.add(new Field("date", date, Field.Store.YES,
Field.Index.NOT_ANALYZED));
+ }
+ doc.add(new Field("date", date, StringField.TYPE_STORED));
According to http://lucene.apache.org/core/4_0_0/MIGRATE.html, one
should also call ft.setOmitNorms(false) to preserve the original
semantics for this particular combination of arguments. Is this
something we need to do?
Good catch. I had it there before, so it's my mistake from doing the
merges (I had some fighting with the reusable components in Analyzers
:( , in the end an ApacheCon visit solved it :-D ). I will check all the
field types used.
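For reference, a minimal sketch of what the MIGRATE.html note above seems to describe, assuming the Lucene 4.0 core jar is on the classpath. The class name DateFieldSketch, the constant STORED_WITH_NORMS and the helper withDate() are made up for illustration and are not taken from the patch:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.StringField;

// Hypothetical helper class, not part of OpenGrok or of the webrev.
public class DateFieldSketch {

    // Copy of StringField.TYPE_STORED with norms switched back on,
    // as the migration guide suggests for Field.Index.NOT_ANALYZED.
    static final FieldType STORED_WITH_NORMS = createType();

    private static FieldType createType() {
        FieldType ft = new FieldType(StringField.TYPE_STORED);
        ft.setOmitNorms(false);
        ft.freeze(); // guard against later accidental modification
        return ft;
    }

    // Builds a document carrying only the "date" field from the diff above.
    static Document withDate(String date) {
        Document doc = new Document();
        doc.add(new Field("date", date, STORED_WITH_NORMS));
        return doc;
    }
}
```

Sharing one frozen FieldType avoids repeating the setOmitNorms(false) call at every doc.add() site. Whether norms are actually needed for these fields is a separate question that the patch author would have to decide.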
There are also many new instance variables in the analyzers (for the
TokenStreamComponents and tokenizers used in the new createComponents()
methods), but I don't see that they are ever read except in the local
scope where they are assigned a value. Could they be changed to local
variables?
I guess yes: they are cached by the Analyzer APIs once created, and
all subsequent tokenStream calls either reuse the cache or create the object(s).
I will change them where appropriate and where no other reuse occurs.
I also forgot to do the formatting, so I will run autoformat from NB on
all changed files.
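The local-variable variant discussed above could look roughly like the sketch below, assuming Lucene 4.0 core and analyzers-common on the classpath. LocalComponentsAnalyzer and its tokens() helper are hypothetical names, and the whitespace/lowercase chain is only a stand-in for the real OpenGrok tokenizers:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

// Hypothetical analyzer, not one of the OpenGrok analyzers.
public final class LocalComponentsAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // Locals are enough here: the Analyzer base class keeps the returned
        // components and hands them a new Reader on later tokenStream() calls.
        Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_40, reader);
        TokenStream result = new LowerCaseFilter(Version.LUCENE_40, source);
        return new TokenStreamComponents(source, result);
    }

    // Small helper for trying the analyzer out; the field name "full" is arbitrary.
    static List<String> tokens(String text) {
        Analyzer analyzer = new LocalComponentsAnalyzer();
        List<String> out = new ArrayList<String>();
        try {
            TokenStream ts = analyzer.tokenStream("full", new StringReader(text));
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                out.add(term.toString());
            }
            ts.end();
            ts.close();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } finally {
            analyzer.close();
        }
        return out;
    }
}
```

With this shape the analyzer holds no per-instance tokenizer state of its own, so nothing has to be kept in sync with what the base class caches.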
Why?
Lucene 4.0 is 300% faster when indexing, 100% faster for queries
Sounds promising! :)
We can easily add regexp queries now
We can easily take latest highlighting
index statistics (and all that can be done with them: better
searching, grouping, categorization/classification)
possible Solr/Tika integrations
(for more, please search the web)
The items above should be even more promising; I actually hope the
highlighter and some index statistics will be used soon.
I also have the Tika integration for PDF and Open/Libre/MS Office
pending, once we're done with Lucene 4.0 ;)
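As an illustration of the regexp query item in the list above, a minimal sketch assuming Lucene 4.0 core on the classpath. The class RegexpQuerySketch and the method definitionsMatching() are hypothetical, and the field name "defs" is an assumption about the index layout rather than something stated in this thread:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.RegexpQuery;

// Hypothetical helper, not part of the webrev.
public class RegexpQuerySketch {

    // Builds a query matching every term of the (assumed) "defs" field
    // that fully matches the given Lucene regular expression.
    static RegexpQuery definitionsMatching(String regexp) {
        return new RegexpQuery(new Term("defs", regexp));
    }
}
```

For example, definitionsMatching("get[A-Z].*") would target terms such as getter-style names. Note that Lucene uses its own regular expression syntax here, which differs in places from java.util.regex.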
Webrev
http://stargate.cnl.tuke.sk/~taz/webrev-2012-11-09-lucene_40/
not tested (in progress):
updating documents (I am not sure about uid document retrieval; it wasn't
obvious how to port)
tested: all junits, regression on term count numbers (same number of
tokenized terms as in 3.6.1)
what's missing: build system fixes so Lucene 4.0 gets autodownloaded,
Another build problem I had was that I had to set the
platforms.JDK_1.7.home variable when running Ant. I have never seen that
before. It works fine without the patch.
BUILD FAILED
/code/opengrok/trunk/nbproject/build-impl.xml:86: The J2SE Platform
is not correctly set up.
Your active platform is: JDK_1.7, but the corresponding property
"platforms.JDK_1.7.home" is not found in the project's properties files.
Either open the project in the IDE and setup the Platform with the
same name or add it manually.
For example like this:
ant -Duser.properties.file=<path_to_property_file> jar (where
you put the property "platforms.JDK_1.7.home" in a .properties file)
or ant -Dplatforms.JDK_1.7.home=<path_to_JDK_home> jar (where no
properties file is used)
Yes, I haven't played with the build system yet, and NetBeans was upgraded
in my env in between too (which seems to be the cause),
so I won't push anything until I test Ant both from the CLI and from NetBeans
(if I have spare time I might do the same for Eclipse and IDEA -
I saw on Bitbucket that J. Ryan Stinnett was developing OpenGrok in
IDEA, which is a very good idea ;) )
packaging fixes, automatic run of the Lucene compatibility tests if the
Lucene test-framework is on the classpath
(will add it once I get some time)
I have the Lucene compatibility tests running (in a very basic form,
but they are running) when the classpath has the Lucene test-framework jars.
I will publish a new review today/tomorrow.
cheers
Lubos
_______________________________________________
opengrok-dev mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/opengrok-dev