Hi Shad,
We are currently considering upgrading to the latest Lucene.Net version (from
3.0.3, which our application already uses) and contributing to the community to
speed up the release of 4.8.0. Before that, it would be good to know:
1. How far is Lucene.Net 4.8.0-beta00005 from a state where it could be used in
production? Are there any release plans already?
2. What are the known open issues (e.g. missing features, bugs) that are yet to
be addressed?
Thanks & Regards,
Juliya
On Wednesday, May 30, 2018 12:29 AM, Shad Storhaug <[email protected]>
wrote:
Farhad,
> 1. The current version of Lucene.Net is 3.0.3 which was released on October
> 10, 2012. I am assuming this was on par with the Java code of the same
> version.
Correct.
> 2. The current effort that you describe is targeting 4.8.0 and possibly
> 4.8.1 of the Lucene codebase.
Correct. There have been some discussions about upgrading the project to 4.8.1,
and there are only about 120 files that differ between the versions. We are
somewhere in the middle now because a lot of the recent contributions were
brought over from 4.8.1. The differences to the Lucene core library were
significant, though, and IMO it would be best to wait until we have released
4.8.0 before considering a full upgrade to 4.8.1. The performance benefits will
probably be worth the relatively minor effort before upgrading to a higher
major version of Lucene.
> 3. The current Java version of Lucene is 7.3.1.
Correct. However, the hump we are just about over is the big one. There were
major changes to the project structure from 3.x to 4.x, and the project size
also increased by more than a factor of 10. By contrast, the changes to the
project structure are minor going forward. I tried getting a line comparison
the same way I did between 4.8.0 and 4.8.1, but it was thrown off by
restructured code comments and other changes that have no effect on the actual
executable code. So it is hard to get a feel from the repo for how much has
changed.
That said, the structure and layout of the classes is easily more than 90% the
same. I am confident that, after the release, the project can be upgraded to
the latest version without going through a complete port again, by doing
file-by-file comparisons and porting only the diff into Lucene.Net. I have
outlined the procedure here:
https://github.com/apache/lucenenet/pull/174#issuecomment-251614795
I don't see any real reason why the next jump couldn't be all the way to the
latest Lucene version in a small fraction of the time it took to port 4.8.0 by
changing the individual files and leaving the project structure relatively
unchanged.
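
As a rough illustration of the file-by-file comparison described above (not
the procedure from the linked pull request comment), the following sketch lists
which source files differ between two local checkouts of the Java code. The
function name `changed_files`, the directory layout, and the `.java` filter are
assumptions made for this example.

```python
import filecmp
import os


def changed_files(old_root, new_root, suffix=".java"):
    """Return relative paths of files added, removed, or modified between two trees."""

    def collect(root):
        # Gather every matching file as a path relative to the tree root.
        found = set()
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name.endswith(suffix):
                    full = os.path.join(dirpath, name)
                    found.add(os.path.relpath(full, root))
        return found

    old_files = collect(old_root)
    new_files = collect(new_root)

    # Files present in both trees count as modified only if their bytes differ.
    modified = sorted(
        rel
        for rel in old_files & new_files
        if not filecmp.cmp(
            os.path.join(old_root, rel),
            os.path.join(new_root, rel),
            shallow=False,
        )
    )
    return {
        "added": sorted(new_files - old_files),
        "removed": sorted(old_files - new_files),
        "modified": modified,
    }
```

A call such as `changed_files("lucene-4.8.0", "lucene-4.8.1")` (hypothetical
directory names) would give the list of files whose diffs need porting. Note
that a byte-level comparison like this also flags comment-only changes.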
> Are we going to attract a significantly larger community of users as we
> move to version 4.8.1?
I would say that is a resounding yes. There have been several reports of
bugs/performance issues in 3.0.3. Not to mention, we have heard lots of
positive feedback about how much better the performance is (for the most part)
from 4.8.0.
The download count (of all versions) was averaging about 600 per day 2 years
ago. Now it is up to 950 per day. The bulk of the downloads are 3.0.3, but
getting 4.8.0 out of pre-release will certainly change that.
> We will also be competing with active projects like Elastic Search for .NET
> (NEST) project. Maybe low-level access to Lucene core is not that important
> anymore?
This is a good point, and may be a factor in the number of contributions we
have been getting. However, keep in mind the numbers that Stefan is throwing
out there are CONTRIBUTIONS BY PMC MEMBERS; there have been several
contributions by non-PMC members over the past year. The download counts
indicate the popularity of Lucene.Net as a dependency is growing significantly
despite being stuck at 3.0.3.
Another factor is that having an integrated solution will always outperform an
HTTP API based platform such as Elastic Search, and for many that makes all the
difference in the decision.
Also, I noticed that another NuGet project is now targeting Lucene.Net
4.8.0-beta00005: https://www.nuget.org/packages/Kalix.Leo.Lucene/11.0.10-alpha,
which means there is another potential contributor out there.
Another consideration: You don't really have to know much about Lucene or Java
to be a part of the porting effort. Most of the work can be done with the help
of Google or by searching the codebase to find out how other similar pieces of
code were ported. At this stage in the game, we mostly just need people to
thoroughly test, report, and fix bugs.
Thanks,
Shad Storhaug (NightOwl888)
-----Original Message-----
From: farhad khalafi [mailto:[email protected]]
Sent: Tuesday, May 29, 2018 9:59 PM
To: [email protected]
Cc: [email protected]
Subject: Re: State / Future of the Lucene.Net Project
Stefan,
Thank you for your help in keeping this project alive up to this point and
trying to re-ignite some interest in moving it forward.
I have a few questions that will influence the extent of my possible
involvement with this project.
1. The current version of Lucene.Net is 3.0.3 which was released on October
10, 2012. I am assuming this was on par with the Java code of the same
version.
2. The current effort that you describe is targeting 4.8.0 and possibly
4.8.1 of the Lucene codebase.
3. The current Java version of Lucene is 7.3.1.
As I am not very familiar with advanced features of each version, could you
summarize what major enhancements are included as you move from 3.0.3 to
4.8.1 to 7.3.1?
The version numbers are abstract and don't tell much about feature gaps as
we try to play catch-up with the Java version.
Are we going to attract a significantly larger community of users as we
move to version 4.8.1?
What will be missing as compared to the current Java version?
I fully appreciate the amount of work involved in porting this code even
when using automated tools.
I am just not sure that once the task is accomplished, the users will not
dismiss it as an already "outdated version".
We will also be competing with active projects like Elastic Search for .NET
(NEST) project. Maybe low-level access to Lucene core is not that important
anymore?
Thanks again,
Farhad
On Tue, May 29, 2018 at 7:26 AM, Stefan Bodewig <[email protected]> wrote:
> On 2018-05-28, Alberto León wrote:
>
> > Me too, but please make it easy to prepare the environment. I love code but
> > I hate the sysadmin things
>
> :-)
>
> As I said in a different mail I'm afraid we lack hands, so I doubt
> anybody will magically appear and streamline the build process.
>
> From what I understand building and running tests should work fine as
> long as you've got .NET Core and PowerShell installed on Windows - if
> this is not your environment things become tricky. I'd recommend you try
> building and if things don't work as expected ask on the dev list.
>
> Stefan
>