RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Moray McConnachie
I don't think I'm as hard core on this as Neal, but remember: the
history of the Lucene.NET project is that all the intellectual work, all
the understanding of search, all the new features come from the Lucene
Java folks. Theirs is an immensely respected project, and I trust them
to add new features that will be well-tested and well-researched, and to
have a decent roadmap which I can trust they will execute on. 

Now I know there's been an influx of capable developers to Lucene.NET
who are ready, willing and (I'm going to assume) able to add a lot more
value in a generic .NET implementation as they change it. But it'll take
a while before I trust a .NET dedicated framework which is significantly
diverged from Java in the way I do the line-by-line version. And at what
stage is it not just not a line-by-line port, but not a port at all?

At the same time, I recognise that if this project is going to continue,
and attract good developers, it has to change in this direction.

So that said, I can see why a line-by-line port might not be
sustainable. And most people don't need it. But most of us using Lucene
in production systems do need a system that we can trust and rely on. So
let me chime in with someone else's plea, to keep the general structure
close to Lucene, to keep the same general objects and inheritance
set-up, and to keep the same method names, even if you add other methods
and classes to provide additional functionality. ABSOLUTELY the same
file formats. End users benefit a lot from a high degree of similarity,
with good documentation and help being available from the Java
community.

Yours,
Moray
-
Moray McConnachie
Director of IT    +44 1865 261 600
Oxford Analytica  http://www.oxan.com

-Original Message-
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com] 
Sent: 29 June 2011 20:47
To: lucene-net-u...@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

This has been discussed many times.
Lucene.NET is not valid, the code cannot be trusted, if it is not a
line-by-line port.  It ceases to be Lucene.

- Neal

-Original Message-
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Wednesday, June 29, 2011 1:58 PM
To: lucene-net-dev@lucene.apache.org; lucene-net-u...@lucene.apache.org
Subject: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

 

After the large community response about moving the code base from .NET
2.0 to .NET 4.0, I am trying to figure out what the need is for a
line-by-line port.  Starting with Digy's excellent work on the
conversion to generics, a priority of the 2.9.4g release is that the 2
packages would not be interchangeable.  So faster turnaround from a Java
release won't matter to non-line-by-line users; they will have to wait
until the updates are made to the non-line-by-line code base.

 

My question: is there really a user base for the line-by-line port?
Anyone have a comment?

 

Scott

 

  

 

-
Disclaimer 

This message and any attachments are confidential and/or privileged. If this 
has been sent to you in error, please do not use, retain or disclose them, and 
contact the sender as soon as possible.

Oxford Analytica Ltd
Registered in England: No. 1196703
5 Alfred Street, Oxford
United Kingdom, OX1 4EH
-



Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Noel Lysaght
Can I just plug in my bit and say I agree 100% with what Moray has
outlined.


If we move away from the line-by-line port then over time we'll lose out on
the momentum that is Lucene and the improvements that they make.
Only if the Lucene.NET community has expertise in search and a deep
knowledge of the project, and the community can guarantee that the knowledge
will survive members coming and going, should such a consideration be given.


Only when Lucene.NET has stood on its feet for a number of years after it
has moved out of Apache incubation should consideration be given to
abandoning a line-by-line port.
By all means extend and wrap the libraries in .NET equivalents and .NET
goodness like LINQ (we do this internally in our company at the moment); but
leave the core of the project on a line-by-line port.


Just my tuppence worth.

Kind Regards
Noel





RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Ayende Rahien
As someone from the NHibernate project:
We stopped following Hibernate a while ago, and haven't regretted it.
We have more features, fewer bugs and a better code base.

Sent from my Windows Phone

From: Rory Plaire
Sent: Thursday, June 30, 2011 19:58
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

I don't want to drag this out much longer, but I am curious about the people
who hold the line-by-line sentiment: are you NHibernate users?

-r


Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Itamar Syn-Hershko

NHibernate has a much bigger community and more active devs afaict.


The proposed changes as I understand them are not about changing class 
structure or APIs, but merely touch hunks of code and rewrite them to 
use better .NET practices (yield, generics, LINQ etc). In conjunction 
with a move to .NET 4.0 this would increase readability, improve GC and 
boost performance.
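A minimal sketch of the kind of rewrite described above, given in Java (the upstream language) and not in C#. The class and method names are hypothetical and do not come from Lucene or Lucene.Net. It shows one small method body in a pre-generics style and again with generics and a stream pipeline, which is roughly the Java counterpart of a LINQ query; the observable behaviour is unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; not part of Lucene or Lucene.Net.
final class TermFilterSketch {
    // Pre-generics style, as a literal port might keep it:
    // raw collections and a cast at every use site.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List longTermsRaw(List terms, int minLength) {
        List result = new ArrayList();
        for (int i = 0; i < terms.size(); i++) {
            String term = (String) terms.get(i);
            if (term.length() >= minLength) {
                result.add(term);
            }
        }
        return result;
    }

    // Same behaviour with generics and a stream pipeline.
    // Only the body and the type annotations differ.
    static List<String> longTerms(List<String> terms, int minLength) {
        return terms.stream()
                .filter(term -> term.length() >= minLength)
                .collect(Collectors.toList());
    }
}
```

Because both versions return the same elements in the same order, a test written against one should pass against the other, which is the property that would keep upstream patches easy to apply.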



IMO this doesn't have to be a line-by-line port in order to make porting
of patches easy, which is what Digy seems to be really worried about (and
he's right). As long as the meaning of the code is clear, it shouldn't be a
real problem to apply Java patches to the .NET codebase. And as long as
the test suite keeps being thorough, there's really nothing to fear.



On 30/06/2011 20:15, Ayende Rahien wrote:


As someone from the NHibernate project:
We stopped following Hibernate a while ago, and haven't regretted it.
We have more features, fewer bugs and a better code base.


RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Digy
Michael,
You interpret the report as "whoever commits code wins"? But when I look at it,
I see a lot of talk and no work. The .NET community is not interested in
contributing.
I really don't understand what hinders people from working on Lucene.Net. As I
did for 2.9.4g, grab the code, do whatever you want with it and submit it back.
If it doesn't fit the project's direction it can still find a place in contrib
or in a branch. All of the approaches can live side by side happily in the
Lucene.Net repository.

Troy,
I also don't understand why you are waiting for 2.9.4g. It is a *branch* and has
nothing to do with the trunk. It need not be an official release and can live in
a branch as a PoC.


As a result, I got bored of listening to "this should be done that way". What I
want to see is "I did it that way; should we continue with this?"

DIGY




-Original Message-
From: Troy Howard [mailto:thowar...@gmail.com] 
Sent: Thursday, June 30, 2011 10:47 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

Michael,

I agree with everything you said. My point in saying "whoever commits code
wins" was to illustrate the reality of how and why the project has the
current form.

Building consensus is difficult. It is an essential first step before we can
do something like make a list of bite-sized pieces of work that others can
work on.

This is why my real message of "Let's find a way to accommodate both" is so
important. It allows us to build consensus, so that we can settle on a
direction and structure our work.

Until we accomplish that, it really is "whoever commits code wins", and that
is an unhealthy and unmaintainable way to operate.

From a technical perspective, your statements about the unit tests are
completely accurate. They really need a LOT of reworking. That's the very
first step before making any significant changes. Part of the problem is
that the tests themselves are not well written. The other part is that the
Lucene object model was not designed for testability, and it makes writing
good tests more difficult, and certain tests might not be possible. It will
be difficult to write good unit tests without re-structuring. The biggest
issue is the use of abstract classes with base behaviour vs interfaces or
fully abstracted classes. Makes mocking tough. This is the direction I was
going when I started the Lucere project. I'd like to start in on that work
after the 2.9.4g release.
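To illustrate the mocking point, here is a minimal sketch in Java of code that depends on a narrow interface instead of an abstract class with base behaviour. `DocumentSource`, `WordCounter` and `InMemorySource` are hypothetical names invented for this example; they are not Lucene, Lucene.Net or Lucere types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical narrow interface; not a real Lucene type.
interface DocumentSource {
    int docCount();
    String text(int docId);
}

// Code under test depends only on the interface.
final class WordCounter {
    private final DocumentSource source;

    WordCounter(DocumentSource source) {
        this.source = source;
    }

    // Counts whitespace-separated words across all documents.
    int totalWords() {
        int total = 0;
        for (int id = 0; id < source.docCount(); id++) {
            String text = source.text(id).trim();
            if (!text.isEmpty()) {
                total += text.split("\\s+").length;
            }
        }
        return total;
    }
}

// Hand-written test double: no index, no files, no inherited base behaviour.
final class InMemorySource implements DocumentSource {
    private final List<String> docs = new ArrayList<>();

    InMemorySource add(String text) {
        docs.add(text);
        return this;
    }

    @Override public int docCount() { return docs.size(); }
    @Override public String text(int docId) { return docs.get(docId); }
}
```

With an abstract base class, the test double would inherit whatever the base class does in its constructor and non-abstract methods; with an interface, the stub contains only what the test needs.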

Thanks,
Troy


On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon 
mhern...@wickedsoftware.net wrote:

 I'd say that is all the more reason that we need to work smarter and not
 harder. I'd also say that's a good reason to make sure we build consensus
 rather than just saying "whoever commits code wins".

 And it's a damn good reason to focus on the effort of growing the number of
 contributors and lowering the barrier to submitting patches, breaking things
 down into pieces that people would feel confident to work on without
 being overwhelmed by the complexity of Lucene.Net.

 There is a pretty big gap between Lucene 2.9.x and Lucene 4.0, and the
 internals and index formats are significantly different, including nixing
 the current vint file format and using byte[] array slices for Terms
 instead of char[].
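For readers unfamiliar with the term, "vint" refers to a variable-length integer encoding used in Lucene index files: 7 data bits per byte, low-order bits first, with the high bit of a byte set when another byte follows. The sketch below is a standalone Java illustration of that scheme as I understand it from the Lucene file-format documentation. `VIntSketch` is a hypothetical class, not the Lucene `IndexOutput`/`IndexInput` API, and it makes no claim about what Lucene 4.0 does instead.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical standalone illustration; not the Lucene API.
final class VIntSketch {
    // Encode a non-negative int: 7 data bits per byte, low-order group first;
    // the high bit of a byte is set when another byte follows.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decode a single VInt that starts at index 0 of the given bytes.
    static int decode(byte[] bytes) {
        int pos = 0;
        byte b = bytes[pos++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = bytes[pos++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }
}
```

Small values fit in one byte (anything up to 127), which is why the encoding suits the many small numbers stored in an index.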

 So while porting 1 to 1 will require less knowledge or thought, it's most
 likely going to require more hours of work. And it's definitely not going to
 guarantee the stability of the code or that it's great code.

 I'd have to say that it's not as stable as most would believe at the moment.

 Most of the tests avoid anything that remotely looks like it knows about
 the DRY principle, and there is a static constructor in the core test case
 that throws an exception if it doesn't find an environment variable TEMP,
 which will fail 90% of the tests, and NUnit will be unable to give you a
 clear reason why.  Just to name a few issues I came across working towards
 getting Lucene.Net into CI.  I haven't even started really digging in under
 the covers of the code yet.
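The TEMP problem has a direct Java analogue. An exception thrown from a static initializer surfaces as `ExceptionInInitializerError` on first use of the class, and .NET similarly wraps static-constructor failures in `TypeInitializationException`, so the real cause is easy to miss in a test report. One possible alternative, sketched here with a hypothetical helper (`TempDirSketch.requireEnv` is not part of the Lucene.Net test code), is to look the variable up on demand and fail with a message that names it.

```java
import java.util.Map;

// Hypothetical helper; not part of the Lucene.Net test suite.
final class TempDirSketch {
    // Look the variable up when it is needed and fail with a message that
    // names it, instead of throwing from a static initializer.
    static String requireEnv(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException(
                "Environment variable " + name + " must be set to run the tests");
        }
        return value;
    }
}
```

In real use this would be called as `TempDirSketch.requireEnv(System.getenv(), "TEMP")` from a per-test or per-fixture setup method; passing the map in also makes the check itself easy to test.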

 So my suggestion is to chew on this a bit more and build consensus, avoid
 fracturing people into sides.  Be open to reservations and concerns that
 others have and continue to address them.

 - Michael


 On Thu, Jun 30, 2011 at 2:10 PM, Digy digyd...@gmail.com wrote:

  Although there are a lot of people using Lucene.Net, this is our
  contribution report for the past 5 years.
 
 
 
  https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2QAV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|linversionId=-1issueStatus=allselectedProjectId=12310290reportKey=com.sourcelabs.jira.plugin.report.contributions%3AcontributionreportNext=Next
 
 
  DIGY
 
  -Original Message-
  From: Ayende Rahien [mailto:aye...@ayende.com]
  Sent: Thursday, June 30, 2011 8:16 PM
  To: Rory Plaire; lucene-net-dev@lucene.apache.org
  Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
 
  

RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Digy
 A) I don't want to commit anything that's going to piss a lot of people
 off,
 B) I don't want to spend time/waste time on modifications that are going
 to be rejected.

What I've learnt from the Apache Way is to create a JIRA issue if you are
hesitant.
If no one answers in a reasonable time (mostly), then commit.

DIGY

-Original Message-
From: Michael Herndon [mailto:mhern...@wickedsoftware.net] 
Sent: Thursday, June 30, 2011 11:58 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

@Troy,

I've already started working towards fixing unit testing issues, and
prototyping some things that sure DRY up the testing just so that I can get
the tests running on Mono.

Those changes are currently in a private git repo; however, since we don't
have a CI, I need to make some time to manually test those on at least 3
different OSes (Windows, OS X, and Ubuntu) before putting those back into the
2.9.4g branch.

The reason being I need those in working order so that I can do a write-up
on pulling those from source and at least running the build script to
compile everything and run the tests for you.  I don't know about everyone
else, but that's a starting point I look for when I go to work on something
or commit something back.

They should make their way back sometime this month.  I think the next thing
I'll do is put my money where my mouth is, spend time breaking down the rest
of the CI tasks, then see how much stuff I can get documented into the wiki.
The simple faceted search is a decent starting template.

@Digy I agree with the "talk, no work".

Though coming from the outside in, I still cringe when I make any commits at
the moment (even that little .gitignore file).  A) I don't want to commit
anything that's going to piss a lot of people off, B) I don't want to spend
time/waste time on modifications that are going to be rejected, C) it took
a good deal of going through things before I felt comfortable even making
a commit, and D) yes, I know I just need to get over it and so does everyone
else (hence the obsession with the unit tests at the moment).

And I think a key to helping people get over it, including myself, is to
make the point you had more clear across the board:

*If it doesn't fit to the project's direction it can still find a place in
contrib or in branch. All of the approaches can live side by side happily in
the Lucene.Net repository.* +1, because that makes me feel there is more
leeway to experiment and any decent effort will at least go somewhere to
live and not be wasted.


Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Troy Howard
DIGY - Re: "why do I wait": that's mostly because I intend to make some deep
changes, which would make merging the 2.9.4g branch back to trunk difficult.
So, it's easier to merge those changes first. Also, I won't have enough time
to make my changes until a little way in the future, but probably do have
the time to put together another release, so I'll do that first because it
fits with my work/life schedule.

Thanks,
Troy



Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

2011-06-30 Thread Troy Howard
Scott -

The idea of the automated port is still worth doing. Perhaps it makes sense
for someone more passionate about the line-by-line idea to do that work?

I would say, focus on what makes sense to you. Being productive, regardless
of the specific direction, is what will be most valuable. Once you start,
others will join and momentum will build. That is how these things work.

I like DIGY's approach too, but the problem with it is that it is a
never-ending manual task. The theory behind the automated port is that it
may reduce the manual work. It is complicated, but once it's built and
works, it will save a lot of future development hours. If it's built in a
sufficiently general manner, it could be useful for other projects like
Lucene.Net that want to automate a port from Java to C#.

It might make sense for that to be a separate project from Lucene.Net
though.

-T


On Thu, Jun 30, 2011 at 2:13 PM, Scott Lombard lombardena...@gmail.com wrote:

 Ok I think I asked the wrong question.  I am trying to figure out where to
 put my time.  I was thinking about working on the automated porting system,
 but when I saw the response to the .NET 4.0 discussions I started to
 question if that is the right direction.  The community seemed to be more
 interested in the .NET features.

 The complexity of the automated tool is going to become very high and will
 probably end up with a line-for-line style port.  So I keep asking myself
 whether the automated tool is worth it.  I don't think it is.

 I like the method Digy has been using for porting the code.  So I guess
 for me the real question is: Digy, where do you see 2.9.4g going next and
 what do you need help on?

 Scott




  -Original Message-
  From: Digy [mailto:digyd...@gmail.com]
  Sent: Thursday, June 30, 2011 4:20 PM
  To: lucene-net-dev@lucene.apache.org
  Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
 
  Michael,
  You interpret the report as "whoever commits code wins"? But when I look
  at it, I see a lot of talk, no work. The .NET community is not interested
  in contributing.
  I really don't understand what hinders people from working on Lucene.Net.
  As I did for 2.9.4g, grab the code, do whatever you want with it, and
  submit it back. If it doesn't fit the project's direction, it can still
  find a place in contrib or in a branch. All of the approaches can live
  side by side happily in the Lucene.Net repository.
 
  Troy,
  I also don't understand why you are waiting for 2.9.4g. It is a *branch*
  and has nothing to do with the trunk. It need not be an official release
  and can live in a branch as a PoC.
 
 
  As a result, I got bored of listening to "this should be done that way".
  What I want to see is "I did it that way, should we continue with this?"
 
  DIGY
 
 
 
 
  -Original Message-
  From: Troy Howard [mailto:thowar...@gmail.com]
  Sent: Thursday, June 30, 2011 10:47 PM
  To: lucene-net-dev@lucene.apache.org
  Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
 
  Michael,
 
  I agree with everything you said. My point in saying "whoever commits code
  wins" was to illustrate the reality of how and why the project has its
  current form.
 
  Building consensus is difficult. It is an essential first step before we
  can do something like make a list of bite-sized pieces of work that others
  can work on.
 
  This is why my real message of "Let's find a way to accommodate both" is
  so important. It allows us to build consensus, so that we can settle on a
  direction and structure our work.
 
  Until we accomplish that, it really is "whoever commits code wins", and
  that is an unhealthy and unmaintainable way to operate.
 
  From a technical perspective, your statements about the unit tests are
  completely accurate. They really need a LOT of reworking. That's the very
  first step before making any significant changes. Part of the problem is
  that the tests themselves are not well written. The other part is that
  the Lucene object model was not designed for testability, which makes
  writing good tests more difficult, and certain tests might not be
  possible. It will be difficult to write good unit tests without
  restructuring. The biggest issue is the use of abstract classes with base
  behaviour vs. interfaces or fully abstracted classes, which makes mocking
  tough. This is the direction I was going when I started the Lucere
  project. I'd like to start in on that work after the 2.9.4g release.
 
  Thanks,
  Troy
 
 
  On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon 
  mhern...@wickedsoftware.net wrote:
 
   I'd say that is all the more reason that we need to work smarter and
   not harder. I'd also say that's a good reason to make sure we build
   consensus rather than just saying "whoever commits code wins".
  
   And it's a damn good reason to focus on the effort of growing the number
   of contributors and lowering the barrier to submitting patches, breaking
   things down into pieces 

Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

2011-06-30 Thread Troy Howard
Michael -

If you bring those changes from git into a branch in SVN, we can help with
it. It doesn't have to be complete to be committed. :)

Regarding A (angering people)/B (being rejected)/C (feeling comfortable)/D
(getting over it)...

a) Making progress is more important than keeping everyone happy
b) Our goal is to accept things, not reject them. That said, if something
gets rejected due to quality issues, don't be afraid of that, it's a
learning experience for everyone, and it's a good thing. We can work
together to get to something everyone is happy with and learn in the
process.
c) Commit to a branch. Merge when things are right. No one expects branches
to build or be finished. It's OK. I get worried when I merge to trunk or
when I make a release. But I don't do that until I'm pretty sure it's all
legit.
d) Best way to get over it is to start doing it

I know you probably already realize all of this, but I wanted to respond, so
that, in case anyone else out there is struggling with the same set of
fears, they can see that fears that prevent action are more problematic than
any action they might take without those fears.

Thanks,
Troy


On Thu, Jun 30, 2011 at 1:57 PM, Michael Herndon 
mhern...@wickedsoftware.net wrote:

 @Troy,

 I've already started working towards fixing unit testing issues, and
 prototyping some things that should DRY up the testing, just so that I can
 get the tests running on Mono.

 Those changes are currently in a private git repo. However, since we don't
 have CI, I need to make some time to manually test them on at least 3
 different OSes (Windows, OS X, and Ubuntu) before putting them back into the
 2.9.4g branch.

 The reason is that I need those in working order so that I can do a write-up
 on pulling them from source and at least running the build script to
 compile everything and run the tests for you.  I don't know about everyone
 else, but that's a starting point I look for when I go to work on something
 or commit something back.

 They should make their way back sometime this month.  I think the next
 thing I'll do is put my money where my mouth is, spend time breaking down
 the rest of the CI tasks, then see how much stuff I can get documented in
 the wiki. The simple faceted search is a decent starting template.

 @Digy I agree with the "talk, no work".

 Though coming from the outside in, I still cringe when I make any commits
 at the moment (even that little .gitignore file). A) I don't want to commit
 anything that's going to piss a lot of people off. B) I don't want to spend
 time/waste time on modifications that are going to be rejected. C) It took
 a good deal of going through things before I felt comfortable even making
 a commit. D) Yes, I know I just need to get over it, and so does everyone
 else (hence the obsession with the unit tests at the moment).

 And I think a key to rallying people to get over it, including myself, is
 to make the point you made clearer across the board:

 *If it doesn't fit to the project's direction it can still find a place in
 contrib or in branch. All of the approaches can live side by side happily
 in the Lucene.Net repository.* +1, because that makes me feel there is more
 leeway to experiment and any decent effort will at least go somewhere to
 live and not be wasted.


RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

2011-06-30 Thread Digy
I cannot say I like this approach, but until we find an automated way (with good
results), it seems to be the only way we can use.

DIGY

-Original Message-
From: Troy Howard [mailto:thowar...@gmail.com] 
Sent: Friday, July 01, 2011 12:43 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

Scott -

The idea of the automated port is still worth doing. Perhaps it makes sense
for someone more passionate about the line-by-line idea to do that work?

I would say, focus on what makes sense to you. Being productive, regardless
of the specific direction, is what will be most valuable. Once you start,
others will join and momentum will build. That is how these things work.

I like DIGY's approach too, but the problem with it is that it is a
never-ending manual task. The theory behind the automated port is that it
may reduce the manual work. It is complicated, but once it's built and
works, it will save a lot of future development hours. If it's built in a
sufficiently general manner, it could be useful for other projects like
Lucene.Net that want to automate a port from Java to C#.

It might make sense for that to be a separate project from Lucene.Net
though.

-T



Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

2011-06-30 Thread Rory Plaire
So, veering towards action - are there concrete tasks written up anywhere
for the unit tests? If a poor schlep like me wanted to dig in and start to
improve them, where would I get the understanding of what is good and what
needs help?

-r

On Thu, Jun 30, 2011 at 3:29 PM, Digy digyd...@gmail.com wrote:

 I cannot say I like this approach, but until we find an automated way (with
 good results), it seems to be the only way we can use.

 DIGY


Re: code to call arbitrary function on Python modules, and eval()

2011-06-30 Thread Andi Vajda

On Jul 1, 2011, at 0:49, Bill Janssen jans...@parc.com wrote:

 Here's some code implementing a class called PythonModule,

Hmm, no code was received here...

Andi..

 which allows Java code to invoke arbitrary module-level functions,
 and allows use of Python's eval built-in.
 
 The Python code is a bit tricky; is there a better way to cast a Java
 scalar type to its Python equivalent value?



 
 Bill
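
Since the attachment did not come through, the following is only a rough sketch of the general idea seen from the Python side; it is not Bill's PythonModule class and it does not involve JCC or any Java bridge. The helper names call_module_function and eval_expression are hypothetical and exist purely for illustration. It shows how a module-level function can be looked up by name and called, and how the eval built-in can be wrapped.

```python
import importlib


def call_module_function(module_name, function_name, *args, **kwargs):
    """Import a module by name and call one of its module-level functions.

    Hypothetical helper for illustration only; not part of PyLucene or JCC.
    """
    module = importlib.import_module(module_name)
    function = getattr(module, function_name)
    if not callable(function):
        raise TypeError(f"{module_name}.{function_name} is not callable")
    return function(*args, **kwargs)


def eval_expression(expression, variables=None):
    """Evaluate a Python expression string with the eval built-in.

    eval runs arbitrary code, so this should only ever see trusted input.
    """
    local_names = dict(variables or {})
    return eval(expression, {}, local_names)
```

For instance, call_module_function("math", "sqrt", 16.0) looks up math.sqrt and calls it, and eval_expression("x * 2 + 1", {"x": 20}) evaluates the expression with x bound to 20. How Java scalar values would be converted before reaching such helpers is exactly the open question in the message above, and this sketch does not answer it.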
 



[jira] [Commented] (LUCENE-3241) Remove Lucene core's FunctionQuery impls

2011-06-30 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057635#comment-13057635
 ] 

Chris Male commented on LUCENE-3241:


I will re-evaluate the tests and port what I can.

 Remove Lucene core's FunctionQuery impls
 

 Key: LUCENE-3241
 URL: https://issues.apache.org/jira/browse/LUCENE-3241
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/search
Reporter: Chris Male
Assignee: Chris Male
 Fix For: 4.0

 Attachments: LUCENE-3241.patch


 As part of the consolidation of FunctionQuerys, we want to remove Lucene 
 core's impls.  Included in this work, we will make sure that all the 
 functionality provided by the core impls is also provided by the new module.  
 Any tests will be ported across too, to increase the test coverage.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3261) Faceting module userguide

2011-06-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3261:
---

Attachment: facet-userguide.pdf

Attaching the userguide from LUCENE-3079.

 Faceting module userguide
 -

 Key: LUCENE-3261
 URL: https://issues.apache.org/jira/browse/LUCENE-3261
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Attachments: facet-userguide.pdf


 In LUCENE-3079 I've uploaded a userguide for the faceting module. I'd like to 
 discuss where is the best place to include the module. We include it with the 
 code (in our SVN), so that it's always attached to some branch (or in other 
 words a release). That way we can have versions of it per releases that 
 reflect API changes.
 This document is like the file format document, or any other document we put 
 under site-versioned. So we have two places:
 * facet/docs
 * site/userguides
 Unlike the rest of the site, whose PDFs are built automatically by Forrest, we 
 cannot convert ODT to PDF with it, so it's a challenge to put it there. What we 
 do today (in our SVN) is that whoever updates the userguide creates a PDF too; 
 that's easy from OpenOffice.
 I'll upload the file later when I'm in front of it.




[jira] [Commented] (LUCENE-3264) crank up faceting module tests

2011-06-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057668#comment-13057668
 ] 

Shai Erera commented on LUCENE-3264:


Patch looks very good. All tests pass for me (I've applied it on trunk only).

A few things I've noticed:
* Previously the tests took 1m20s to run; now they take 2m55s. I guess it's 
because previously we only created RAMDirs, while now newDirectory picks FSDir 
from time to time (10%?).

* FacetTestUtils.close*() can be removed and its calls replaced by 
IOUtils.closeSafely. This is not critical; it just removes redundant code.

* You added a TODO to CategoryListIteratorTest about the test failing if 
TieredMP is used. In general TieredMP is not good for the taxonomy index, which 
relies on Lucene doc IDs, and therefore segments must be merged in order. LTW 
uses LMP specifically because of that. I will look into the test to understand 
why it would care about doc IDs, since it doesn't use the taxonomy index at 
all.

* There are a few places with code like: assertTrue("Would like to test this 
with deletions!", indexReader.hasDeletions()), and assertTrue("Would like to 
test this with deletions!", indexReader.numDeletedDocs() > 0) which you removed. 
Any reason?

* You added a TODO to TestScoredDocIDsUtils (about "reader is read-only") -- 
you're right, the comment can be deleted.

While reviewing, I was thinking that RandomIndexWriter is used to replace the 
IndexWriter for content indexing. While this is good, it does not cover the 
'taxonomy' indexing. So I wonder if we should have under facet/test/o.a.l.utils 
a RandomTaxonomyWriter which opens RIW internally?

This is very impressive progress Robert, thanks for doing it!

I am +1 to commit, after we resolve the tiny issues I raised above. We can add 
RandomTaxonomyWriter as a follow-on commit.
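
The point above about doc IDs and merge order can be illustrated with a toy model. This is not Lucene code: segments are modeled as plain lists, a doc ID is assumed to be the position of a document in the concatenation of all segments, and the function names (assign_doc_ids, merge_adjacent, merge_arbitrary) are hypothetical. Where the merged segment ends up after a non-adjacent merge is also an assumption of this sketch.

```python
def assign_doc_ids(segments):
    """Map each document to its index-wide doc ID.

    Toy model: a doc ID is the document's position when all segments
    are read one after another, in order.
    """
    doc_ids = {}
    next_id = 0
    for segment in segments:
        for doc in segment:
            doc_ids[doc] = next_id
            next_id += 1
    return doc_ids


def merge_adjacent(segments, start, count):
    """Merge `count` neighbouring segments beginning at index `start`."""
    merged = [doc for seg in segments[start:start + count] for doc in seg]
    return segments[:start] + [merged] + segments[start + count:]


def merge_arbitrary(segments, picks):
    """Merge any chosen segments; the result takes the first pick's slot."""
    picks = sorted(picks)
    merged = [doc for i in picks for doc in segments[i]]
    result = []
    for i, seg in enumerate(segments):
        if i == picks[0]:
            result.append(merged)
        elif i not in picks:
            result.append(seg)
    return result


segments = [["a", "b"], ["c"], ["d", "e"]]
before = assign_doc_ids(segments)
in_order = assign_doc_ids(merge_adjacent(segments, 0, 2))
out_of_order = assign_doc_ids(merge_arbitrary(segments, [0, 2]))
```

In this model, merging the two neighbouring segments leaves every doc ID unchanged, while merging the first and last segment moves document "c" behind "d" and "e" and so changes its ID. Anything that treats a doc ID as a stable identifier, as the taxonomy index is described as doing, would break under the second kind of merge.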

 crank up faceting module tests
 --

 Key: LUCENE-3264
 URL: https://issues.apache.org/jira/browse/LUCENE-3264
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3264.patch


 The faceting module has a large set of good tests.
 Let's switch them over to use all of our test infra (randomindexwriter, random 
 iwconfig, mockanalyzer, newDirectory, ...)
 I don't want to address multipliers and atLeast() etc on this issue, I think 
 we should follow up with that on a separate issue, that also looks at speed 
 and making sure the nightly build is exhaustive.
 For now, let's just get the coverage in, it will be good to do before any 
 refactoring.




[jira] [Commented] (LUCENE-3241) Remove Lucene core's FunctionQuery impls

2011-06-30 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057694#comment-13057694
 ] 

Chris Male commented on LUCENE-3241:


Command for patch:

{code}
svn move lucene/src/java/org/apache/lucene/search/function/NumericIndexDocValueSource.java modules/queries/src/java/org/apache/lucene/queries/function/valuesource/
svn move lucene/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java modules/queries/src/test/org/apache/lucene/queries/function/
svn move lucene/src/test/org/apache/lucene/search/function/TestOrdValues.java modules/queries/src/test/org/apache/lucene/queries/function/
svn --force delete lucene/src/java/org/apache/lucene/search/function
svn --force delete lucene/src/test/org/apache/lucene/search/function
{code}

 Remove Lucene core's FunctionQuery impls
 

 Key: LUCENE-3241
 URL: https://issues.apache.org/jira/browse/LUCENE-3241
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/search
Reporter: Chris Male
Assignee: Chris Male
 Fix For: 4.0

 Attachments: LUCENE-3241.patch, LUCENE-3241.patch


 As part of the consolidation of FunctionQuerys, we want to remove Lucene 
 core's impls.  Included in this work, we will make sure that all the 
 functionality provided by the core impls is also provided by the new module.  
 Any tests will be ported across too, to increase the test coverage.




[jira] [Updated] (LUCENE-3241) Remove Lucene core's FunctionQuery impls

2011-06-30 Thread Chris Male (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male updated LUCENE-3241:
---

Attachment: LUCENE-3241.patch

New patch which incorporates Robert's suggestions.

I have salvaged some of the tests, but there's definitely a big TODO regarding 
the test coverage.

Command coming up.

 Remove Lucene core's FunctionQuery impls
 

 Key: LUCENE-3241
 URL: https://issues.apache.org/jira/browse/LUCENE-3241
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/search
Reporter: Chris Male
Assignee: Chris Male
 Fix For: 4.0

 Attachments: LUCENE-3241.patch, LUCENE-3241.patch


 As part of the consolidation of FunctionQuerys, we want to remove Lucene 
 core's impls.  Included in this work, we will make sure that all the 
 functionality provided by the core impls is also provided by the new module.  
 Any tests will be ported across too, to increase the test coverage.




[jira] [Resolved] (LUCENE-3079) Faceting module

2011-06-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-3079.


Resolution: Fixed

The faceting module is in 3.x and trunk, tests pass, and follow-up issues are 
opened. I think we can close this.

Thanks to everyone for helping get this in so quickly!

 Faceting module
 ---

 Key: LUCENE-3079
 URL: https://issues.apache.org/jira/browse/LUCENE-3079
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Shai Erera
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3079-dev-tools.patch, LUCENE-3079.patch, 
 LUCENE-3079.patch, LUCENE-3079.patch, LUCENE-3079.patch, 
 LUCENE-3079_4x.patch, LUCENE-3079_4x_broken.patch, TestPerformanceHack.java, 
 facet-userguide.pdf


 Faceting is a hugely important feature, available in Solr today but
 not [easily] usable by Lucene-only apps.
 We should fix this, by creating a shared faceting module.
 Ideally, we factor out Solr's faceting impl, and maybe poach/merge
 from other impls (eg Bobo browse).
 Hoss describes some important challenges we'll face in doing this
 (http://markmail.org/message/5w35c2fr4zkiwsz6), copied here:
 {noformat}
 To look at faceting as a concrete example, there are big the reasons 
 faceting works so well in Solr: Solr has total control over the 
 index, knows exactly when the index has changed to rebuild caches, has a 
 strict schema so it can make sense of field types and 
 pick faceting algos accordingly, has multi-phase distributed search 
 approach to get exact counts efficiently across multiple shards, etc...
 (and there are still a lot of additional enhancements and improvements 
 that can be made to take even more advantage of knowledge solr has because 
 it owns the index that we no one has had time to tackle)
 {noformat}
 This is a great list of the things we face in refactoring.  It's also
 important because, if Solr needed to be so deeply intertwined with
 caching, schema, etc., other apps that want to facet will have the
 same needs and so we really have to address them in creating the
 shared module.
 I think we should get a basic faceting module started, but should not
 cut Solr over at first.  We should iterate on the module, fold in
 improvements, etc., and then, once we can fully verify that cutting
 over doesn't hurt Solr (ie lose functionality or performance) we can
 later cutover.




[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3216:


Attachment: LUCENE-3216.patch

We are getting closer to the overall target here. This patch enables each codec 
to decide whether to use CFS for DocValues or write individual files. 

To configure this and more per codec, I introduced a CodecConfig (just 
like IWC) that holds configuration for core codecs and is passed to each codec 
on creation. I added test cases for the config and for nested CFS, in case IW 
or SegmentMerger decides to use CFS too, so DocValues can still safely open the 
CFS.

For test coverage I added a static newCodecConfig() to LuceneTestCase that 
randomly configures a codec per file to use CFS or individual files for 
DocValues, plus other settings I figured make sense in the CodecConfig.

All tests pass and there is no nocommit left, so I think it's close. Review is 
appreciated.
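
As a rough illustration of the trade-off between individual files and one compound file, here is a toy packer. It is not Lucene's CFS format and not the code in the patch; the on-disk layout, the function names write_compound and read_compound, and the example entry names are all assumptions made for this sketch. It only shows the general technique: a small directory of names and sizes followed by the concatenated per-field data.

```python
import io
import struct


def write_compound(entries):
    """Pack {name: bytes} into one blob: entry count, directory, then data.

    Toy layout (an assumption of this sketch, not Lucene's CFS format):
    4-byte count, then per entry a 2-byte name length, the UTF-8 name and
    an 8-byte data size, followed by all data blocks in directory order.
    """
    out = io.BytesIO()
    out.write(struct.pack(">I", len(entries)))
    for name, data in entries.items():
        encoded = name.encode("utf-8")
        out.write(struct.pack(">H", len(encoded)))
        out.write(encoded)
        out.write(struct.pack(">Q", len(data)))
    for data in entries.values():
        out.write(data)
    return out.getvalue()


def read_compound(blob):
    """Inverse of write_compound: return {name: bytes}."""
    src = io.BytesIO(blob)
    (count,) = struct.unpack(">I", src.read(4))
    directory = []
    for _ in range(count):
        (name_len,) = struct.unpack(">H", src.read(2))
        name = src.read(name_len).decode("utf-8")
        (size,) = struct.unpack(">Q", src.read(8))
        directory.append((name, size))
    # Data blocks follow the directory in the same order.
    return {name: src.read(size) for name, size in directory}


# Hypothetical per-field doc values held in memory until the segment flushes.
per_field = {"price.dat": b"\x00\x01\x02", "rating.dat": b"\x09\x08"}
blob = write_compound(per_field)
restored = read_compound(blob)
```

Writing one blob per segment instead of one file per field keeps the number of open files down, at the cost of holding all per-field data in memory until the flush, which matches the trade-off described in the issue.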

 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch


 Currently we store DocValues per field, which results in at least one 
 file per field that uses DocValues (or at most two per field per segment, 
 depending on the implementation). Instead, we should by default try to pack 
 DocValues into a single file if possible. To enable this we need to hold all 
 DocValues in memory during indexing and write them to disk once we flush a 
 segment. 
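
 As a rough illustration of the idea in the description above (buffer all 
 values in memory while indexing, then write every field into one file when 
 the segment is flushed), here is a minimal sketch. It is not the patch and not 
 Lucene's DocValues code: the class SegmentValuesBuffer, its methods, and the 
 on-disk layout are all hypothetical, and a byte array stands in for the single 
 per-segment file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not Lucene's DocValues implementation. */
public class SegmentValuesBuffer {
    // Field name -> values held in memory while the segment is being indexed.
    private final Map<String, List<Long>> pending = new TreeMap<String, List<Long>>();

    /** Buffers one value for the given field. */
    public void add(String field, long value) {
        List<Long> values = pending.get(field);
        if (values == null) {
            values = new ArrayList<Long>();
            pending.put(field, values);
        }
        values.add(value);
    }

    /** Packs all buffered fields into one byte stream (the "single file") and clears the buffer. */
    public byte[] flush() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(pending.size());            // number of fields
            for (Map.Entry<String, List<Long>> e : pending.entrySet()) {
                out.writeUTF(e.getKey());            // field name
                out.writeInt(e.getValue().size());   // number of values
                for (long v : e.getValue()) {
                    out.writeLong(v);
                }
            }
            out.flush();
            pending.clear();
            return bytes.toByteArray();
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked if they do.
            throw new IllegalStateException(e);
        }
    }

    /** Reads a packed stream back into a field -> values map. */
    public static Map<String, List<Long>> read(byte[] packed) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(packed));
            Map<String, List<Long>> fields = new TreeMap<String, List<Long>>();
            int fieldCount = in.readInt();
            for (int i = 0; i < fieldCount; i++) {
                String name = in.readUTF();
                int count = in.readInt();
                List<Long> values = new ArrayList<Long>(count);
                for (int j = 0; j < count; j++) {
                    values.add(in.readLong());
                }
                fields.put(name, values);
            }
            return fields;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

 The trade-off the sketch makes visible is the one the description names: one 
 output per segment instead of one per field, at the cost of keeping every 
 value in memory until flush.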




[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3216:


Attachment: LUCENE-3239.patch

since the vote has passed here is a patch to cut over the build and references 
to 1.6

 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch, 
 LUCENE-3239.patch





[jira] [Resolved] (LUCENE-3142) benchmark/stats package is obsolete and unused - remove it

2011-06-30 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen resolved LUCENE-3142.
-

Resolution: Fixed

r1141465: trunk
r1141468: 3x

 benchmark/stats package is obsolete and unused - remove it
 --

 Key: LUCENE-3142
 URL: https://issues.apache.org/jira/browse/LUCENE-3142
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/benchmark
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor

 This seems like a leftover from the original benchmark implementation and can 
 thus be removed.




[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3216:


Comment: was deleted

(was: since the vote has passed here is a patch to cut over the build and 
references to 1.6)

 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch





[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3216:


Attachment: (was: LUCENE-3239.patch)

 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch





[jira] [Updated] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3239:


Attachment: LUCENE-3239.patch

This patch moves the build and metadata to Java 1.6.

 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Attachments: LUCENE-3239.patch


 It's been discussed here and there, but I think we need to drop Java 5 
 support, for these reasons:
 * It's totally untested by any continuous build process. Testing Java 5 only 
 when there is a release candidate ready is not enough. If we are to claim 
 support then we need a Hudson job actually running the tests with Java 5.
 * It's now unmaintained, so bugs have to be hacked around, tests disabled, 
 or warnings placed, and some things simply cannot be fixed. We 
 cannot actually support something that is no longer maintained: we do find 
 JRE bugs (http://wiki.apache.org/lucene-java/SunJavaBugs) and it's important 
 that bugs actually get fixed; we cannot do everything with hacks.
 * Because of its limitations, we do things like allow 20% slower grouping 
 speed. I find it hard to believe we are sacrificing performance for this.
 So, in summary: because we don't test it at all, because it's buggy and 
 unmaintained, and because we are sacrificing performance, I think we need to 
 cut over the build system for the next release to require Java 6.




[jira] [Commented] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057765#comment-13057765
 ] 

Uwe Schindler commented on LUCENE-3239:
---

Patch looks fine, Jenkins already moved.

 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Attachments: LUCENE-3239.patch





[jira] [Commented] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057772#comment-13057772
 ] 

Simon Willnauer commented on LUCENE-3239:
-

I just committed that patch. I will continue with all the *.java TODOs.

 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Attachments: LUCENE-3239.patch





[jira] [Updated] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3239:


Attachment: LUCENE-3239.patch

Here is a patch that fixes almost all TODOs except the one in NativeFSLock. 
I think we should open a separate issue for that. I didn't convert all the 
ArrayUtils yet; I think we can do that later in a follow-up too. 



 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Attachments: LUCENE-3239.patch, LUCENE-3239.patch





[jira] [Commented] (LUCENE-3260) need a test that uses termsenum.seekExact() (which returns true), then calls next()

2011-06-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057778#comment-13057778
 ] 

Michael McCandless commented on LUCENE-3260:


Thanks Shai!

The 200+ iterations are exceptionally fast since they only do 1 TermsEnum op 
per iter (it's the indexing that'll be slow in this test -- for that I do 
numDocs = atLeast(10)).  Also, this bug only happens when seekExact is followed 
by next, only on certain terms, and only on a multi-seg index.  So it seems an 
OK investment of CPU for test coverage ;)
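
To make the property under test concrete: after seekExact returns true for a 
term, next() must return that term's successor. The sketch below checks this 
over many random iterations against a toy enumerator backed by a TreeSet. 
ToyTermsEnum and its methods are hypothetical stand-ins written for this 
illustration; they are not Lucene's TermsEnum, and the toy has no segments, so 
it only shows the shape of the check, not the multi-segment bug itself.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.TreeSet;

/** Toy stand-in for a terms enumerator; hypothetical, not Lucene's TermsEnum. */
public class ToyTermsEnum {
    private final TreeSet<String> terms;
    private String current; // null means "not positioned on a term"

    public ToyTermsEnum(String... terms) {
        this.terms = new TreeSet<String>(Arrays.asList(terms));
    }

    /** Positions on the term and returns true only if the term exists. */
    public boolean seekExact(String term) {
        if (terms.contains(term)) {
            current = term;
            return true;
        }
        current = null;
        return false;
    }

    /** Positions on the smallest term >= target and returns it, or null if there is none. */
    public String seekCeil(String target) {
        current = terms.ceiling(target);
        return current;
    }

    /** Advances past the current term (or to the first term if unpositioned); null at the end. */
    public String next() {
        if (current == null) {
            current = terms.isEmpty() ? null : terms.first();
        } else {
            current = terms.higher(current);
        }
        return current;
    }

    /**
     * Randomly picks existing terms, seeks to each with seekExact, and verifies
     * that next() yields the following term in sorted order. Assumes the given
     * terms are non-empty and distinct.
     */
    public static boolean seekExactThenNextIsConsistent(String[] terms, long seed, int iterations) {
        String[] sorted = terms.clone();
        Arrays.sort(sorted);
        Random random = new Random(seed);
        ToyTermsEnum te = new ToyTermsEnum(terms);
        for (int i = 0; i < iterations; i++) {
            int idx = random.nextInt(sorted.length);
            if (!te.seekExact(sorted[idx])) {
                return false; // the term exists, so this must succeed
            }
            String expected = idx + 1 < sorted.length ? sorted[idx + 1] : null;
            String actual = te.next();
            if (expected == null ? actual != null : !expected.equals(actual)) {
                return false; // the enum was not properly positioned
            }
        }
        return true;
    }
}
```

Each iteration performs a single seek plus a single next, which is why a few 
hundred iterations stay cheap compared with building the index.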

 need a test that uses termsenum.seekExact() (which returns true), then calls 
 next()
 ---

 Key: LUCENE-3260
 URL: https://issues.apache.org/jira/browse/LUCENE-3260
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Michael McCandless
 Attachments: LUCENE-3260.patch


 I tried to do some seekExact (where the result must exist) then next()ing in 
 the faceting module, and it seems like there could be a bug here.
 I think we should add a test that mixes seekExact/seekCeil/next like this, to 
 ensure that if seekExact returns true, the enum is properly positioned.




[jira] [Commented] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057780#comment-13057780
 ] 

Uwe Schindler commented on LUCENE-3239:
---

+1 as a start

 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Attachments: LUCENE-3239.patch, LUCENE-3239.patch





[jira] [Commented] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057781#comment-13057781
 ] 

Simon Willnauer commented on LUCENE-3239:
-

bq. +1 as a start
alright I'll kick it in... we are on 1.6 YAY!

 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Attachments: LUCENE-3239.patch, LUCENE-3239.patch





[jira] [Assigned] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reassigned LUCENE-3239:
---

Assignee: Simon Willnauer

 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
Assignee: Simon Willnauer
 Attachments: LUCENE-3239.patch, LUCENE-3239.patch





[jira] [Created] (LUCENE-3265) Cut over to Java 6 API where needed / possible

2011-06-30 Thread Simon Willnauer (JIRA)
Cut over to Java 6 API where needed / possible
--

 Key: LUCENE-3265
 URL: https://issues.apache.org/jira/browse/LUCENE-3265
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0


Since we are on 1.6 on trunk, we should try to reduce duplications like those 
in ArrayUtils and cut over to the Java 6 API.
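
As an example of the kind of duplication meant here, the sketch below compares 
a hand-rolled array-growing helper with java.util.Arrays.copyOf, which was 
added in Java 6. The class GrowArrayExample and its methods are made up for 
this illustration; which ArrayUtils methods are actually affected is not 
stated in this issue, so treat the pairing as an assumption.

```java
import java.util.Arrays;

/** Hypothetical example class; not part of Lucene. */
public class GrowArrayExample {
    /** Pre-Java-6 style: allocate the new array and copy by hand. */
    public static int[] growManually(int[] array, int newLength) {
        int[] grown = new int[newLength];
        System.arraycopy(array, 0, grown, 0, Math.min(array.length, newLength));
        return grown;
    }

    /** Java 6 style: Arrays.copyOf truncates or zero-pads to newLength. */
    public static int[] growWithJava6(int[] array, int newLength) {
        return Arrays.copyOf(array, newLength);
    }
}
```

Both methods return the same result for any non-negative newLength, so a 
cut-over of this kind should be behavior-preserving; whether it is also 
performance-neutral would need measuring.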




[jira] [Created] (LUCENE-3266) Improve FileLocking based on Java 1.6

2011-06-30 Thread Simon Willnauer (JIRA)
Improve FileLocking based on Java 1.6 
--

 Key: LUCENE-3266
 URL: https://issues.apache.org/jira/browse/LUCENE-3266
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/store
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0


Snippet from NativeFSLockFactory:

{noformat}
/*
* The javadocs for FileChannel state that you should have
* a single instance of a FileChannel (per JVM) for all
* locking against a given file (locks are tracked per 
* FileChannel instance in Java 1.4/1.5). Even using the same 
* FileChannel instance is not completely thread-safe with Java 
* 1.4/1.5 though. To work around this, we have a single (static) 
* HashSet that contains the file paths of all currently 
* locked locks.  This protects against possible cases 
* where different Directory instances in one JVM (each 
* with their own NativeFSLockFactory instance) have set 
* the same lock dir and lock prefix. However, this will not 
* work when LockFactorys are created by different 
* classloaders (eg multiple webapps). 
* 
* TODO: Java 1.6 tracks system wide locks in a thread safe manner 
* (same FileChannel instance or not), so we may want to 
* change this when Lucene moves to Java 1.6.
*/
{noformat}

since we are on 1.6 we should improve this if possible.
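
The behavior the TODO refers to can be observed with plain java.nio, without 
any Lucene classes. In the sketch below, a file is locked through one 
FileChannel and a second lock is then attempted through a different 
FileChannel in the same JVM; on Java 6 and later the second attempt is expected 
to fail with OverlappingFileLockException. The class JvmWideLockCheck is 
hypothetical, and the sketch assumes a filesystem that supports file locking 
and a JVM running with default settings.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical demo class; not part of Lucene. */
public class JvmWideLockCheck {
    /**
     * Locks a temp file through one channel, then tries to lock it again through
     * a second channel. Returns true if the second attempt is rejected with
     * OverlappingFileLockException, i.e. the JVM tracks the lock across channels.
     */
    public static boolean secondChannelIsRejected() {
        try {
            File file = File.createTempFile("write", ".lock");
            file.deleteOnExit();
            RandomAccessFile first = new RandomAccessFile(file, "rw");
            RandomAccessFile second = new RandomAccessFile(file, "rw");
            try {
                FileLock held = first.getChannel().tryLock();
                if (held == null) {
                    return false; // another process holds the lock
                }
                try {
                    FileLock other = second.getChannel().tryLock();
                    if (other != null) {
                        other.release();
                    }
                    return false; // the second channel was not rejected
                } catch (OverlappingFileLockException expected) {
                    return true;
                } finally {
                    held.release();
                }
            } finally {
                first.close();
                second.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If this check holds, the static HashSet of locked paths may be redundant within 
one classloader, though the sketch says nothing about the multiple-classloader 
case mentioned in the comment.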




[jira] [Resolved] (LUCENE-3239) drop java 5 support

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3239.
-

   Resolution: Fixed
Fix Version/s: 4.0
Lucene Fields: [New, Patch Available]  (was: [New])

moving out here, created LUCENE-3265 and LUCENE-3266 as followup issues

 drop java 5 support
 -

 Key: LUCENE-3239
 URL: https://issues.apache.org/jira/browse/LUCENE-3239
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3239.patch, LUCENE-3239.patch





[jira] [Commented] (LUCENE-3264) crank up faceting module tests

2011-06-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057792#comment-13057792
 ] 

Robert Muir commented on LUCENE-3264:
-

{quote}
Previously the tests took 1m20s to run, now they take 2m55s. I guess it's 
because previously we only created RAMDirs, while now newDirectory picks FSDir 
from time to time (10%?).
{quote}

I don't think it's from FSDir; this is now very rarely picked. Anyway, as 
said in the issue summary, for a number of reasons I don't want to address 
this on this issue; I want to address the coverage first.

{quote}
FacetTestUtils.close*() can be removed and calls replaced by 
IOUtils.closeSafely. This is not critical, just remove redundant code.
{quote}

ah, you are right. let's change this.
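
For readers unfamiliar with the pattern, the point of a shared close helper is 
that every resource gets closed even when an earlier close() throws. The sketch 
below shows one way to do that. CloseAll.closeAll is a hypothetical stand-in 
written for this illustration; it is not the signature or exact behavior of 
IOUtils.closeSafely.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper; not Lucene's IOUtils. */
public class CloseAll {
    /**
     * Closes every non-null resource, continuing past failures.
     * Returns the first IOException encountered, or null if all closed cleanly.
     */
    public static IOException closeAll(Closeable... resources) {
        IOException first = null;
        for (Closeable resource : resources) {
            if (resource == null) {
                continue; // tolerate resources that were never opened
            }
            try {
                resource.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e; // remember the first failure, keep closing the rest
                }
            }
        }
        return first;
    }
}
```

A test utility with one close method per resource type can usually be replaced 
by a single varargs call of this kind.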

{quote}
You added a TODO to CategoryListIteratorTest about the test failing if TieredMP 
is used. In general TieredMP is not good for the taxonomy index, which relies 
on Lucene doc IDs, and therefore segments must be merged in-order. LTW uses LMP 
specifically because of that. I will look into the test to understand why it 
would care about doc IDs, since it doesn't use the taxonomy index at all.
{quote}

Right, as you said this is for the main index, not the taxonomy index. So I 
think the test just relies upon Lucene doc IDs, but I didn't want to just change 
the test without saying why.

{quote}
There are a few places with code like: assertTrue("Would like to test this with 
deletions!", indexReader.hasDeletions()), and assertTrue("Would like to test 
this with deletions!", indexReader.numDeletedDocs() > 0) which you removed. Any 
reason?
{quote}

Mostly to prevent the tests from failing. RandomIndexWriter randomly optimizes 
sometimes, so occasionally there are no deletions. I think this is fine 
(actually better) as far as coverage goes, since the deleted docs are then 
occasionally null, etc.

{quote}
You added a TODO to TestScoredDocIDsUtils (about reader is read-only) – you're 
right, the comment can be deleted.
{quote}

OK, I'll nuke this.

{quote}
We can add RandomTaxonomyWriter as a follow-on commit.
{quote}

Yes, let's do this separately.


 crank up faceting module tests
 --

 Key: LUCENE-3264
 URL: https://issues.apache.org/jira/browse/LUCENE-3264
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3264.patch


 The faceting module has a large set of good tests.
 Let's switch them over to use all of our test infra (RandomIndexWriter, random 
 iwconfig, MockAnalyzer, newDirectory, ...).
 I don't want to address multipliers and atLeast() etc. on this issue; I think 
 we should follow up with that on a separate issue that also looks at speed 
 and making sure the nightly build is exhaustive.
 For now, let's just get the coverage in; it will be good to do before any 
 refactoring.




[jira] [Commented] (LUCENE-3265) Cut over to Java 6 API where needed / possible

2011-06-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057794#comment-13057794
 ] 

Robert Muir commented on LUCENE-3265:
-

I think we should be careful here: any performance tests need to also be done 
on -client!

 Cut over to Java 6 API where needed / possible
 --

 Key: LUCENE-3265
 URL: https://issues.apache.org/jira/browse/LUCENE-3265
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0





[JENKINS] Lucene-Solr-tests-only-flexscoring-branch - Build # 66 - Failure

2011-06-30 Thread Apache Jenkins Server
Build: 
https://builds.apache.org/job/Lucene-Solr-tests-only-flexscoring-branch/66/

3 tests failed.
REGRESSION:  
org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDistributed

Error Message:
Severe errors in solr configuration.  Check your log files for more detailed 
information on what may be wrong.  
- 
java.lang.RuntimeException: java.io.FileNotFoundException: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-flexscoring-branch/checkout/solr/example/multicore/core0/data/index/org.apache.solr.core.RefCntRamDirectory@38ca6cea
 lockFactory=org.apache.lucene.store.simplefslockfact...@6af2da21-write.lock 
(No such file or directory)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:378)  at 
org.apache.solr.core.SolrCore.init(SolrCore.java:501)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:653)  at 
org.apache.solr.core.CoreContainer.load(CoreContainer.java:406)  at 
org.apache.solr.core.CoreContainer.load(CoreContainer.java:291)  at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:240)
  at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:93)  at 
org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)  at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)  at 
org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)  
at 
org.mortbay.jetty.servlet.ServletHandler.updateMappings(ServletHandler.java:1104)
  at 
org.mortbay.jetty.servlet.ServletHandler.setFilterMappings(ServletHandler.java:1140)
  at 
org.mortbay.jetty.servlet.ServletHandler.addFilterWithMapping(ServletHandler.java:940)
  at 
org.mortbay.jetty.servlet.ServletHandler.addFilterWithMapping(ServletHandler.java:895)
  at org.mortbay.jetty.servlet.Context.addFilter(Context.java:207)  at 
org.apache.solr.client.solrj.embedded.JettySolrRunner$1.lifeCycleStarted(JettySolrRunner.java:98)
  at 
org.mortbay.component.AbstractLifeCycle.setStarted(AbstractLifeCycle.java:140)  
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:52)  at 
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:123)
  at org.apache.solr.client.sol  Severe errors in solr configuration.  Check 
your log files for more detailed information on what may be wrong.  
- 
java.lang.RuntimeException: java.io.FileNotFoundException: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-flexscoring-branch/checkout/solr/example/multicore/core0/data/index/org.apache.solr.core.RefCntRamDirectory@38ca6cea
 lockFactory=org.apache.lucene.store.simplefslockfact...@6af2da21-write.lock 
(No such file or directory)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:378)  at 
org.apache.solr.core.SolrCore.init(SolrCore.java:501)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:653)  at 
org.apache.solr.core.CoreContainer.load(CoreContainer.java:406)  at 
org.apache.solr.core.CoreContainer.load(CoreContainer.java:291)  at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:240)
  at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:93)  at 
org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)  at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)  at 
org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)  
at 
org.mortbay.jetty.servlet.ServletHandler.updateMappings(ServletHandler.java:1104)
  at 
org.mortbay.jetty.servlet.ServletHandler.setFilterMappings(ServletHandler.java:1140)
  at 
org.mortbay.jetty.servlet.ServletHandler.addFilterWithMapping(ServletHandler.java:940)
  at 
org.mortbay.jetty.servlet.ServletHandler.addFilterWithMapping(ServletHandler.java:895)
  at org.mortbay.jetty.servlet.Context.addFilter(Context.java:207)  at 
org.apache.solr.client.solrj.embedded.JettySolrRunner$1.lifeCycleStarted(JettySolrRunner.java:98)
  at 
org.mortbay.component.AbstractLifeCycle.setStarted(AbstractLifeCycle.java:140)  
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:52)  at 
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:123)
  at org.apache.solr.client.sol  request: 
http://localhost:15720/example/core0/update?commit=true&waitFlush=true&waitSearcher=true&wt=javabin&version=2

Stack Trace:


request: 
http://localhost:15720/example/core0/update?commit=true&waitFlush=true&waitSearcher=true&wt=javabin&version=2
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at 

Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
Hmm, are you concerned about the extra Math.min that happens in the
copyOf method?
I don't see how that relates to intrinsics and Java 1.7.

I don't have strong feelings here; I'm just checking whether you mixed
something up in the comment you put there... I am happy to keep the old
and now current code.

simon

On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:
    
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified: 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
 ==
 --- 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  (original)
 +++ 
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
  Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -    buffer = Arrays.copyOf(buffer, newLength);
 +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
 +    // buffer = Arrays.copyOf(buffer, newLength);
 +    byte[] newBuffer = new byte[newLength];
 +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +    buffer = newBuffer;
   }

   /**




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Uwe Schindler
Hi Robert,

you reverted a use of Arrays.copyOf() on primitive types, which is implemented
*exactly* like this in the Arrays.java code!

The slow ones are <T> T[] copyOf(T[] array, int newlen)

(because they use reflection).
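
As a rough illustration of why the generic overload costs more (a sketch only, not the JDK source; the names GenericCopyDemo and copyGeneric are made up for this example): a generic copy has to allocate an array with the same runtime component type as its input, and that takes a reflective java.lang.reflect.Array.newInstance call. A byte[] copy needs no such step.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

// Hypothetical demo class, not part of Lucene or the JDK.
final class GenericCopyDemo {

    // Hand-written generic copy. The new array must have the same runtime
    // component type as the source, so it is allocated reflectively.
    @SuppressWarnings("unchecked")
    static <T> T[] copyGeneric(T[] original, int newLength) {
        T[] copy = (T[]) Array.newInstance(
                original.getClass().getComponentType(), newLength);
        // Math.min keeps the copy in bounds when newLength is smaller.
        System.arraycopy(original, 0, copy, 0,
                Math.min(original.length, newLength));
        return copy;
    }

    public static void main(String[] args) {
        String[] in = {"a", "b"};
        String[] out = copyGeneric(in, 4);
        // The runtime type is preserved, and the contents match the library call.
        System.out.println(out.getClass().getSimpleName());
        System.out.println(Arrays.equals(out, Arrays.copyOf(in, 4)));
    }
}
```

The reflective allocation is the only part that has no counterpart in the primitive overloads such as copyOf(byte[], int).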

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Robert Muir
Because on 32-bit Windows at least, -client is still the default on
most JREs out there.

I realize people don't care about -client, but I will police
foo[].clone() / Arrays.copyOf etc. to prevent problems.

There are comments about this stuff on the relevant bug reports
(Oracle's site is down, sorry) linked to this issue.
https://issues.apache.org/jira/browse/LUCENE-2674

Sorry, I don't think we should use foo[].clone() / Arrays.copyOf, I
think we should always use arraycopy.




Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
Robert, I agree, but doesn't that apply to Arrays.copyOf(Object[], int)
only? Here we use a specialized primitive version.

simon




Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Dawid Weiss
Arrays.copyOf(primitive) is actually System.arraycopy by default. If
intrinsics are used it can only get faster. For object types it will
probably be a bit slower for -client because of a runtime check for
the component type.

Dawid




RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Uwe Schindler
Robert, 

as noted in my other email, it is only slow for the generic Object[] method (as 
it uses j.l.reflect.Array.newInstance(Class componentType)). We are talking 
here about byte[], and the Arrays method is implemented with the same 3 lines 
of code that Simon replaced. The only difference is a Math.min(), which is 
intrinsic (it is used because Arrays.copyOf supports shrinking, so 
System.arraycopy() needs an upper limit to avoid an AIOOBE).
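
To make the comparison concrete, here is a small sketch (the class GrowDemo and its method names are invented for illustration; this is not the Lucene code). It assumes only java.util.Arrays and System.arraycopy from the JDK, and shows that when growing a byte[] the manual three lines and Arrays.copyOf give the same result.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Lucene.
final class GrowDemo {

    // Manual growth, in the style of the committed change:
    // allocate the new array, then copy every existing byte.
    static byte[] growManual(byte[] buffer, int newLength) {
        byte[] newBuffer = new byte[newLength];
        System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
        return newBuffer;
    }

    // Library call. For byte[] it also accepts a smaller newLength,
    // in which case the result is truncated.
    static byte[] growWithCopyOf(byte[] buffer, int newLength) {
        return Arrays.copyOf(buffer, newLength);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        // Growing: both approaches produce identical arrays.
        System.out.println(
                Arrays.equals(growManual(data, 8), growWithCopyOf(data, 8)));
        // Shrinking: only the library call supports it.
        System.out.println(Arrays.toString(growWithCopyOf(data, 2)));
    }
}
```

The manual version throws an IndexOutOfBoundsException if asked to shrink, which is the case the Math.min() in Arrays.copyOf guards against. A grow() method is never called that way, so for this use the two are interchangeable.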

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Uwe Schindler
We had an issue about this with FST's array growing in Mike's code; in fact 
it is *much* slower for the generic Arrays' <T> T[] copyOf(T[]...), with T extends 
Object (uses slow reflection).

For primitives it can only get faster in later JVMs, which is why we want to 
change all ArrayUtils.grow() to use this (and we don’t have a generic one there 
for the above reason).

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
On Thu, Jun 30, 2011 at 3:26 PM, Uwe Schindler u...@thetaphi.de wrote:
 We had an issue about this with FST's array growing in Mike's code; in fact 
 it is *much* slower for the generic Arrays' <T> T[] copyOf(T[]...), with T extends 
 Object (uses slow reflection).

 For primitives it can only get faster in later JVMs, which is why we want to 
 change all ArrayUtils.grow() to use this (and we don’t have a generic one 
 there for the above reason).

+1 - I don't see why this would be any slower... if we can get
improvements we should go for it. The issues and bug reports are all
for non-primitive copyOf methods, so I don't see how this should affect
us.

simon




[jira] [Commented] (SOLR-2565) Prevent IW#close and cut over to IW#commit

2011-06-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057829#comment-13057829
 ] 

Mark Miller commented on SOLR-2565:
---

Committed - there is still some wiki work to do.

 Prevent IW#close and cut over to IW#commit
 --

 Key: SOLR-2565
 URL: https://issues.apache.org/jira/browse/SOLR-2565
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.0

 Attachments: SOLR-2565.patch


 Spin-off from SOLR-2193. We already have a branch to work on this issue here 
 https://svn.apache.org/repos/asf/lucene/dev/branches/solr2193 
 The main goal here is to prevent solr from closing the IW and use IW#commit 
 instead. AFAIK the main issues here are:
 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 2. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 3. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 4. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.
 Eventually this is a preparation for NRT support in Solr which I will create 
 a followup issue for.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Reopened] (SOLR-2193) Re-architect Update Handler

2011-06-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reopened SOLR-2193:
---

  Assignee: Mark Miller  (was: Robert Muir)

 Re-architect Update Handler
 ---

 Key: SOLR-2193
 URL: https://issues.apache.org/jira/browse/SOLR-2193
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, 
 SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch


 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Cleanup - drop DirectUpdateHandler(2) line - move to something like 
 UpdateHandler, DefaultUpdateHandler
 2. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 if (directupdatehandler2)
   success
  else
   failish
 3. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 4. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 5. Keep NRT support in mind.
 6. Keep microsharding in mind (maintain logical index as multiple physical 
 indexes)
 7. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.




[jira] [Commented] (SOLR-2193) Re-architect Update Handler

2011-06-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057833#comment-13057833
 ] 

Mark Miller commented on SOLR-2193:
---

This issue is superseded by: SOLR-2565 Prevent IW#close and cut over to 
IW#commit

 Re-architect Update Handler
 ---

 Key: SOLR-2193
 URL: https://issues.apache.org/jira/browse/SOLR-2193
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, 
 SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch


 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Cleanup - drop DirectUpdateHandler(2) line - move to something like 
 UpdateHandler, DefaultUpdateHandler
 2. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 if (directupdatehandler2)
   success
  else
   failish
 3. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 4. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 5. Keep NRT support in mind.
 6. Keep microsharding in mind (maintain logical index as multiple physical 
 indexes)
 7. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.




[jira] [Commented] (SOLR-2193) Re-architect Update Handler

2011-06-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057834#comment-13057834
 ] 

Mark Miller commented on SOLR-2193:
---

bq. Curious; why is the resolution status invalid?

Dunno - it's not invalid. I've re-resolved as duplicate

 Re-architect Update Handler
 ---

 Key: SOLR-2193
 URL: https://issues.apache.org/jira/browse/SOLR-2193
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, 
 SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch


 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Cleanup - drop DirectUpdateHandler(2) line - move to something like 
 UpdateHandler, DefaultUpdateHandler
 2. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 if (directupdatehandler2)
   success
  else
   failish
 3. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 4. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 5. Keep NRT support in mind.
 6. Keep microsharding in mind (maintain logical index as multiple physical 
 indexes)
 7. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.




[jira] [Resolved] (SOLR-2193) Re-architect Update Handler

2011-06-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-2193.
---

Resolution: Duplicate

 Re-architect Update Handler
 ---

 Key: SOLR-2193
 URL: https://issues.apache.org/jira/browse/SOLR-2193
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, 
 SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch


 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Cleanup - drop DirectUpdateHandler(2) line - move to something like 
 UpdateHandler, DefaultUpdateHandler
 2. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 if (directupdatehandler2)
   success
  else
   failish
 3. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 4. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 5. Keep NRT support in mind.
 6. Keep microsharding in mind (maintain logical index as multiple physical 
 indexes)
 7. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.




[jira] [Assigned] (SOLR-2565) Prevent IW#close and cut over to IW#commit

2011-06-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-2565:
-

Assignee: Mark Miller

 Prevent IW#close and cut over to IW#commit
 --

 Key: SOLR-2565
 URL: https://issues.apache.org/jira/browse/SOLR-2565
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2565.patch


 Spin-off from SOLR-2193. We already have a branch to work on this issue here 
 https://svn.apache.org/repos/asf/lucene/dev/branches/solr2193 
 The main goal here is to prevent solr from closing the IW and use IW#commit 
 instead. AFAIK the main issues here are:
 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 2. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 3. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 4. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.
 Eventually this is a preparation for NRT support in Solr which I will create 
 a followup issue for.
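
A minimal sketch of the "lazy IW init, commit instead of close" goal, under stated assumptions: `Writer` is a hypothetical stand-in for the IndexWriter, and `LazyWriterHolder` is an illustrative name, not a class from the patch or from Solr.

```java
import java.util.function.Supplier;

final class LazyWriterHolder {
    // Hypothetical stand-in for the index writer; not a Lucene or Solr type.
    interface Writer {
        void add(String doc);
        void commit();
    }

    private final Supplier<Writer> factory;
    private Writer writer; // created on first use, then kept open

    LazyWriterHolder(Supplier<Writer> factory) {
        this.factory = factory;
    }

    // Lazy init: the writer is only opened when the first update arrives.
    private synchronized Writer writer() {
        if (writer == null) {
            writer = factory.get();
        }
        return writer;
    }

    void add(String doc) {
        writer().add(doc);
    }

    // Commit makes changes durable but leaves the same writer open,
    // so no second writer is ever created on the same index.
    synchronized void commit() {
        if (writer != null) {
            writer.commit();
        }
    }
}
```

With this shape, a commit before any update is a no-op, and repeated add/commit cycles reuse a single writer instance.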




[Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale

2011-06-30 Thread Matt Warren
I think that the code here shows a bug in Lucene.NET, see
http://gist.github.com/1056231.
This happens when using 2.9.2.

After some digging I think that it's due to the way it does a Prefix search.

The main problem is shown by this code http://gist.github.com/1056242.
If the Locale is Danish, this returns FALSE, weird eh!!
   "daab".StartsWith("da") //false
But this works as expected
   "daab".StartsWith("da", StringComparison.InvariantCulture) //true

The line of code that has this problem is the TermCompare(..) function in
PrefixTermEnum.cs,
see
http://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/core/Search/PrefixTermEnum.cs
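
For comparison, a minimal Java sketch of the behaviour the port would need to match. Java's String.startsWith compares UTF-16 code units and ignores the default locale, whereas .NET's single-argument StartsWith uses the current culture; an ordinal comparison (StringComparison.Ordinal in .NET) is the closest equivalent. The class and method names below are hypothetical.

```java
import java.util.Locale;

final class OrdinalPrefix {
    // Hypothetical helper: compares UTF-16 code units only, no collation rules.
    static boolean startsWithOrdinal(String text, String prefix) {
        return text.regionMatches(0, prefix, 0, prefix.length());
    }

    // Runs the same check with a Danish default locale to show that
    // neither the helper nor String.startsWith depends on it.
    static boolean checkUnderDanishLocale() {
        Locale saved = Locale.getDefault();
        try {
            Locale.setDefault(Locale.forLanguageTag("da-DK"));
            return startsWithOrdinal("daab", "da") && "daab".startsWith("da");
        } finally {
            Locale.setDefault(saved);
        }
    }
}
```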


[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #166: POMs out of sync

2011-06-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/166/

No tests ran.

Build Log (for compile errors):
[...truncated 10375 lines...]






Re: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale

2011-06-30 Thread Ben West
Hey Matt,

This is issue 420: https://issues.apache.org/jira/browse/LUCENENET-420

I think the theory so far has been that the user should manage the culture 
rather than Lucene. If you disagree could you post on that issue ticket?

Thanks,
-Ben


- Original Message -
From: Matt Warren mattd...@gmail.com
To: lucene-net-...@lucene.apache.org
Cc: 
Sent: Thursday, June 30, 2011 9:28 AM
Subject: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish 
Locale

I think that the code here shows a bug in Lucene.NET, see
http://gist.github.com/1056231.
This happens when using 2.9.2.

After some digging I think that it's due to the way it does a Prefix search.

The main problem is shown by this code http://gist.github.com/1056242.
If the Locale is Danish, this returns FALSE, weird eh!!
   "daab".StartsWith("da") //false
But this works as expected
   "daab".StartsWith("da", StringComparison.InvariantCulture) //true

The line of code that has this problem is the TermCompare(..) function in
PrefixTermEnum.cs,
see
http://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/core/Search/PrefixTermEnum.cs



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Robert Muir
I think Dawid is correct here... so we should change this back? Still,
personally, when I see array clone() or copyOf() it makes me concerned. I
know these are as fast as arraycopy in recent versions, but depending on
which variant is used, and whether you use -server, they can be slower. In
general I just don't want us to have performance regressions on, say, Windows
32-bit over this stuff. Personally I think arraycopy is a sure-fire bet every
time, but I'll concede the point that copyOf might not be slower for the
primitive versions. I think in JDK 7 we will not have this issue, as -client
sorta goes away in favor of the tiered thing? Anyway, I think we should
proceed with caution as far as moving things over to copyOf: I don't
see any evidence that it's ever faster, but it's definitely sometimes slower.
On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:
 Arrays.copyOf(primitive) is actually System.arraycopy by default. If
 intrinsics are used it can only get faster. For object types it will
 probably be a bit slower for -client because of a runtime check for
 the component type.

 Dawid

 On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
 because on windows 32bit at least, -client is still the default on
 most jres out there.

 i realize people don't care about -client, but i will police
 foo[].clone() / arrays.copyOf etc to prevent problems.

 There are comments about this stuff on the relevant bug reports
 (oracle's site is down, sorry) linked to this issue.
 https://issues.apache.org/jira/browse/LUCENE-2674

 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
 think we should always use arraycopy.

 On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
 hmm are you concerned about the extra Math.min that happens in the
 copyOf method?
 I don't see how that relates to intrinsic and java 1.7

 I don't have strong feelings here just checking if you mix something
 up in the comment you put there... I am happy to keep the old and now
 current code

 simon

 On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:

 
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified:
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL:
http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff

==
 ---
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
(original)
 +++
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -buffer = Arrays.copyOf(buffer, newLength);
 +// It actually should be: (Java 1.7, when its intrinsic on all machines)
 +// buffer = Arrays.copyOf(buffer, newLength);
 +byte[] newBuffer = new byte[newLength];
 +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +buffer = newBuffer;
   }

   /**
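
The two variants debated in this thread produce the same array; only their speed on particular JVM configurations is in question. A minimal sketch of both, for side-by-side reading. The class and method names are hypothetical, and the Math.min guard is an addition (not in the commit) so that the arraycopy variant also tolerates a smaller newLength.

```java
import java.util.Arrays;

final class GrowSketch {
    // Shape of the code committed in r1141510: explicit allocation
    // followed by System.arraycopy.
    static byte[] growWithArraycopy(byte[] buffer, int newLength) {
        byte[] newBuffer = new byte[newLength];
        System.arraycopy(buffer, 0, newBuffer, 0, Math.min(buffer.length, newLength));
        return newBuffer;
    }

    // Shape of the code it replaced: Arrays.copyOf truncates or
    // zero-pads to newLength.
    static byte[] growWithCopyOf(byte[] buffer, int newLength) {
        return Arrays.copyOf(buffer, newLength);
    }
}
```

Any performance difference would have to be measured on the JVMs of interest (for example -client on 32-bit Windows); the sketch makes no claim either way.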







Re: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale

2011-06-30 Thread Matt Warren
Thanks for the info, after reading issue 420 it makes sense now

On 30 June 2011 15:38, Ben West bwsithspaw...@yahoo.com wrote:

 Hey Matt,

 This is issue 420: https://issues.apache.org/jira/browse/LUCENENET-420

 I think the theory so far has been that the user should manage the culture
 rather than Lucene. If you disagree could you post on that issue ticket?

 Thanks,
 -Ben


 - Original Message -
 From: Matt Warren mattd...@gmail.com
 To: lucene-net-...@lucene.apache.org
 Cc:
 Sent: Thursday, June 30, 2011 9:28 AM
 Subject: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish
 Locale

 I think that the code here shows a bug in Lucene.NET, see
 http://gist.github.com/1056231.
 This happens when using 2.9.2.

 After some digging I think that it's due to the way it does a Prefix
 search.

 The main problem is shown by this code http://gist.github.com/1056242.
 If the Locale is Danish, this returns FALSE, weird eh!!
 "daab".StartsWith("da") //false
 But this works as expected
 "daab".StartsWith("da", StringComparison.InvariantCulture) //true

 The line of code that has this problem is the TermCompare(..) function in
 PrefixTermEnum.cs,
 see

 http://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/core/Search/PrefixTermEnum.cs




[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3216:


Attachment: LUCENE-3216.patch

One more iteration, adding a NestedCompoundDirectory that uses the parent's 
openInputSlice method for efficiency.

 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216_floats.patch


 currently we are storing docvalues per field which results in at least one 
 file per field that uses docvalues (or at most two per field per segment 
 depending on the impl.). Yet, we should try to by default pack docvalues into 
 a single file if possible. To enable this we need to hold all docvalues in 
 memory during indexing and write them to disk once we flush a segment. 




[jira] [Commented] (LUCENE-3260) need a test that uses termsenum.seekExact() (which returns true), then calls next()

2011-06-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057896#comment-13057896
 ] 

Shai Erera commented on LUCENE-3260:


I see. Thanks for the clarification. +1 to commit.

 need a test that uses termsenum.seekExact() (which returns true), then calls 
 next()
 ---

 Key: LUCENE-3260
 URL: https://issues.apache.org/jira/browse/LUCENE-3260
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Michael McCandless
 Attachments: LUCENE-3260.patch


 I tried to do some seekExact (where the result must exist) then next()ing in 
 the faceting module,
 and it seems like there could be a bug here.
 I think we should add a test that mixes seekExact/seekCeil/next like this, to 
 ensure that
 if seekExact returns true, that the enum is properly positioned.
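
A minimal sketch of the invariant such a test would check, using a self-contained stand-in rather than Lucene's TermsEnum. SortedTermsSketch and its string-based API are hypothetical; only the method names seekExact, seekCeil and next are taken from the issue.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class SortedTermsSketch {
    private final TreeSet<String> terms;
    private String current;     // term the enum is positioned on, or null
    private boolean exhausted;  // true once iteration has run past the last term

    SortedTermsSketch(String... terms) {
        this.terms = new TreeSet<>(Arrays.asList(terms));
    }

    // Positions on target only if it exists; returns whether it was found.
    boolean seekExact(String target) {
        exhausted = false;
        current = terms.contains(target) ? target : null;
        return current != null;
    }

    // Positions on the smallest term >= target; returns it, or null if none.
    String seekCeil(String target) {
        current = terms.ceiling(target);
        exhausted = (current == null);
        return current;
    }

    // Advances to the term after the current position; null when exhausted.
    // In this sketch an unpositioned enum starts from the first term.
    String next() {
        if (exhausted) {
            return null;
        }
        current = (current == null)
                ? (terms.isEmpty() ? null : terms.first())
                : terms.higher(current);
        exhausted = (current == null);
        return current;
    }
}
```

The property under test: after seekExact returns true for a term, next() must return that term's successor in sorted order.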




[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057894#comment-13057894
 ] 

Michael McCandless commented on LUCENE-3216:


Looks great!

So this means, if you use default StandardCodec, and 3 fields store
doc values, and main CFS is off but doc values CFS is on, you'll see
a cfs file holding the 3-6 sub-files that your docvalues created,
right?

But eg if some fields use another codec, then that codec will have its
own CFS for any fields it has with docvalues (this is the TODO)?
That seems fine for starters.

I like CodecConfig, but I'm not sure it should hold things specific
only to 1 codec, like the Pulsing cutoff?  The other settings seem
more widely applicable... though I guess even terms cache size is not
used by various codecs, but it is by enough to have it in
CodecConfig, I think?

CodecConfig needs @experimental?

For the nested test... couldn't you createCompoundOutput directly from
an opened CompoundFileDirectory?  (Vs creating externally & copying
in).


 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216_floats.patch


 currently we are storing docvalues per field which results in at least one 
 file per field that uses docvalues (or at most two per field per segment 
 depending on the impl.). Yet, we should try to by default pack docvalues into 
 a single file if possible. To enable this we need to hold all docvalues in 
 memory during indexing and write them to disk once we flush a segment. 




[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057900#comment-13057900
 ] 

Simon Willnauer commented on LUCENE-3216:
-

{quote}
So this means, if you use default StandardCodec, and 3 fields store
doc values, and main CFS is off but doc values CFS is on, you'll see
a cfs file holding the 3-6 sub-files that your docvalues created,
right?{quote}
Correct!

{quote}
But eg if some fields use another codec, then that codec will have its
own CFS for any fields it has with docvalues (this is the TODO)?
That seems fine for starters.{quote}

again correct. So what I have in mind is a global cfs that a codec can pull 
via PerDocWriteState or something that holds all of them but for now having 
this per codec is fine IMO. I will create a follow up for this.

bq. For the nested test... couldn't you createCompoundOutput directly from an 
opened CompoundFileDirectory? (Vs creating externally & copying in).
Yes I could but this functionality is tricky and not needed currently so I left 
it out for now.

{quote}I like CodecConfig, but I'm not sure it should hold things specific
only to 1 codec, like the Pulsing cutoff? The other settings seem
more widely applicable... though I guess even terms cache size is not
used by various codecs, but it is by enough to have it in
CodecConfig, I think?{quote}

I am not sure here, I had the same thought but when you look at Solr and other 
high level users they need to configure stuff somehow so I put all reasonable 
core stuff in there. I think it's ok to have this for only one codec. Thoughts?



 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216_floats.patch


 currently we are storing docvalues per field which results in at least one 
 file per field that uses docvalues (or at most two per field per segment 
 depending on the impl.). Yet, we should try to by default pack docvalues into 
 a single file if possible. To enable this we need to hold all docvalues in 
 memory during indexing and write them to disk once we flush a segment. 




[jira] [Commented] (LUCENE-3264) crank up faceting module tests

2011-06-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057901#comment-13057901
 ] 

Robert Muir commented on LUCENE-3264:
-

{quote}
I don't understand. I thought that you said so regarding introducing atLeast 
and iterations, and I'm ok with that. I was just asking, since all you've done 
is move to use newDir, newIWC and RandomIW, how come the tests running time got 
that much longer? If it's not FSDir, do you have any idea what can cause that? 
Will RandomIW stall indexing randomly, or maybe it's newIWC which chooses to 
flush more often?
{quote}

I think the slowdown is basically linear (the tests run 2x or 3x as slow). Let 
me explain some of the reasons why you have this slowdown over just normal 
indexing without using randomiw/mockdirectorywrapper/etc:
# we call checkIndex on every directory we create after it's closed. I think 
this is the right thing to do always... it does slow down the tests a bit.
# we do sometimes get crappy indexing params, crazy merge params, ridiculous 
IndexReader/Writer params (e.g. termIndexInterval=1). I think sometimes these 
non-optimal params slow things down.
# occasionally we do things like randomly fully or partially optimize, yield(), 
etc.

So while Lucene's defaults are pretty good, we are testing a bunch of 
non-default parameters and doing a bunch of other crazy things... so these slow 
down the tests!

That being said, I'm working on the speed issue at least a little here, because 
I really want to get these test improvements in, although I really didn't want 
to work on this here (I think 1 minute extra *temporarily* to the build is no 
big deal for the additional coverage).


 crank up faceting module tests
 --

 Key: LUCENE-3264
 URL: https://issues.apache.org/jira/browse/LUCENE-3264
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3264.patch


 The faceting module has a large set of good tests.
 Let's switch them over to use all of our test infra (randomindexwriter, random 
 iwconfig, mockanalyzer, newDirectory, ...)
 I don't want to address multipliers and atLeast() etc on this issue, I think 
 we should follow up with that on a separate issue, that also looks at speed 
 and making sure the nightly build is exhaustive.
 For now, let's just get the coverage in, it will be good to do before any 
 refactoring.




[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-06-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057903#comment-13057903
 ] 

Simon Willnauer commented on LUCENE-2793:
-

Varun this patch looks great. I am about to commit it. Can you now work through 
the nocommits, fix them or post questions here?

simon

 Directory createOutput and openInput should take an IOContext
 -

 Key: LUCENE-2793
 URL: https://issues.apache.org/jira/browse/LUCENE-2793
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/store
Reporter: Michael McCandless
Assignee: Varun Thacker
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch


 Today for merging we pass down a larger readBufferSize than for searching 
 because we get better performance.
 I think we should generalize this to a class (IOContext), which would hold 
 the buffer size, but then could hold other flags like DIRECT (bypass OS's 
 buffer cache), SEQUENTIAL, etc.
 Then, we can make the DirectIOLinuxDirectory fully usable because we would 
 only use DIRECT/SEQUENTIAL during merging.
 This will require fixing how IW pools readers, so that a reader opened for 
 merging is not then used for searching, and vice versa.  Really, it's only 
 all the open file handles that need to be different -- we could in theory 
 share del docs, norms, etc, if that were somehow possible.
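
A minimal sketch of the kind of class proposed here, assuming a value object that carries a buffer size plus a set of flags. Everything in it is hypothetical: the name IOContextSketch, the factory methods, and the buffer sizes (placeholders, not Lucene's actual values) are for illustration only.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

final class IOContextSketch {
    // Flags named in the proposal: bypass the OS buffer cache, sequential access.
    enum Flag { DIRECT, SEQUENTIAL }

    final int readBufferSize;
    final Set<Flag> flags;

    private IOContextSketch(int readBufferSize, Set<Flag> flags) {
        this.readBufferSize = readBufferSize;
        this.flags = Collections.unmodifiableSet(flags);
    }

    // Searching: small buffer, no special flags (placeholder size).
    static IOContextSketch forSearch() {
        return new IOContextSketch(1024, EnumSet.noneOf(Flag.class));
    }

    // Merging: larger buffer, direct and sequential I/O (placeholder size).
    static IOContextSketch forMerge() {
        return new IOContextSketch(4096, EnumSet.of(Flag.DIRECT, Flag.SEQUENTIAL));
    }
}
```

A directory's createOutput/openInput could then accept such a context instead of a bare buffer size, which is what lets a merge-only reader request DIRECT/SEQUENTIAL without affecting searchers.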




[jira] [Commented] (LUCENE-3264) crank up faceting module tests

2011-06-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057906#comment-13057906
 ] 

Shai Erera commented on LUCENE-3264:


Thanks Robert. This makes sense to me.

bq. although I really didn't want to work on this here (I think 1 minute extra 
temporarily to the build is no big deal for the additional coverage)

I apologize if that caused you to do that work here. I really only wanted to 
understand. By all means, commit the changes. The explanation makes sense and 
I'm ok with it. We can speed up things later.

 crank up faceting module tests
 --

 Key: LUCENE-3264
 URL: https://issues.apache.org/jira/browse/LUCENE-3264
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3264.patch, LUCENE-3264.patch


 The faceting module has a large set of good tests.
 Let's switch them over to use all of our test infra (randomindexwriter, random 
 iwconfig, mockanalyzer, newDirectory, ...)
 I don't want to address multipliers and atLeast() etc on this issue, I think 
 we should follow up with that on a separate issue, that also looks at speed 
 and making sure the nightly build is exhaustive.
 For now, let's just get the coverage in, it will be good to do before any 
 refactoring.




[jira] [Resolved] (LUCENE-3260) need a test that uses termsenum.seekExact() (which returns true), then calls next()

2011-06-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-3260.


   Resolution: Fixed
Fix Version/s: 4.0

 need a test that uses termsenum.seekExact() (which returns true), then calls 
 next()
 ---

 Key: LUCENE-3260
 URL: https://issues.apache.org/jira/browse/LUCENE-3260
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3260.patch


 I tried to do some seekExact (where the result must exist) then next()ing in 
 the faceting module,
 and it seems like there could be a bug here.
 I think we should add a test that mixes seekExact/seekCeil/next like this, to 
 ensure that
 if seekExact returns true, that the enum is properly positioned.




[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057935#comment-13057935
 ] 

Robert Muir commented on LUCENE-3216:
-

{quote}
I am not sure here, I had the same thought but when you look at Solr and other 
high level users they need to configure stuff somehow so I put all reasonable 
core stuff in there. I think it's ok to have this for only one codec. Thoughts?
{quote}

I don't like CodecConfig actually. It doesn't make sense that it contains all 
these codec-specific parameters, which should be private to the codec. I think 
lucene's codecs should just be APIs and have ordinary ctors.

As far as higher-level stuff like Solr, we can improve it there so it's easier 
for users to configure this stuff, for example the Solr codec configuration 
allows you to specify a codecproviderfactory that takes arbitrary nested xml 
and parses it however you want.

The only problem is we don't have a *concrete* (e.g. non-mock/test) 
implementation in Solr that actually exposes all of what lucene can offer... I 
would prefer we instead just do this, and make a SolrCodecProviderFactory that 
lets you configure skip intervals, pulsing cutoffs, and all these other 
codec-specific options in a type-safe way.


 Store DocValues per segment instead of per field
 

 Key: LUCENE-3216
 URL: https://issues.apache.org/jira/browse/LUCENE-3216
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, 
 LUCENE-3216_floats.patch


 currently we are storing docvalues per field which results in at least one 
 file per field that uses docvalues (or at most two per field per segment 
 depending on the impl.). Yet, we should try to by default pack docvalues into 
 a single file if possible. To enable this we need to hold all docvalues in 
 memory during indexing and write them to disk once we flush a segment. 




[jira] [Updated] (SOLR-1674) improve analysis tests, cut over to new API

2011-06-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-1674:
--

Assignee: Robert Muir  (was: Mark Miller)

 improve analysis tests, cut over to new API
 ---

 Key: SOLR-1674
 URL: https://issues.apache.org/jira/browse/SOLR-1674
 Project: Solr
  Issue Type: Test
  Components: Schema and Analysis
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 4.0

 Attachments: SOLR-1674.patch, SOLR-1674.patch, SOLR-1674_speedup.patch


 This patch
 * converts all analysis tests to use the new tokenstream api
 * converts most tests to use the more stringent assertion mechanisms from 
 lucene
 * adds new tests to improve coverage
 Most bugs found by more stringent testing have been fixed, with the exception 
 of SynonymFilter.
 The problems with this filter are more serious, the previous tests were 
 essentially a no-op.
 The new tests for SynonymFilter test the current behavior, but have FIXMEs 
 with what I think the old test wanted to expect in the comments.




[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #167: POMs out of sync

2011-06-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/167/

No tests ran.

Build Log (for compile errors):
[...truncated 7426 lines...]






RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

2011-06-30 Thread Digy
Although there are a lot of people using Lucene.Net, this is our
contribution report for the past 5 years.

https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2QAV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|lin&versionId=-1&issueStatus=all&selectedProjectId=12310290&reportKey=com.sourcelabs.jira.plugin.report.contributions%3Acontributionreport&Next=Next


DIGY

-Original Message-
From: Ayende Rahien [mailto:aye...@ayende.com] 
Sent: Thursday, June 30, 2011 8:16 PM
To: Rory Plaire; lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

As someone from the NHibernate project:
We stopped following Hibernate a while ago, and haven't regretted it.
We have more features, fewer bugs and a better code base.

Sent from my Windows Phone

From: Rory Plaire
Sent: Thursday, June 30, 2011 19:58
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

I don't want to drag this out much longer, but I am curious about people who
hold the line-by-line sentiment: are you NHibernate users?

-r

On Thu, Jun 30, 2011 at 2:39 AM, Noel Lysaght <lysag...@hotmail.com> wrote:

 Can I just plug in my bit and say I agree 100% with what Moray has outlined
 below.

 If we move away from the line-by-line port then over time we'll lose out
 on the momentum that is Lucene and the improvements that they make.
 Only if the Lucene.NET community has expertise in search and a deep
 knowledge of the project, and the community can guarantee that the knowledge
 will survive members coming and going, should such a consideration be given.

 Only when Lucene.NET has stood on its feet for a number of years after it has
 moved out of Apache incubation should consideration be given to abandoning a
 line-by-line port.
 By all means extend and wrap the libraries in .NET equivalents and .NET
 goodness like LINQ (we do this internally in our company at the moment), but
 leave the core of the project on a line-by-line port.

 Just my tuppence worth.

 Kind Regards
 Noel


 -Original Message-
 From: Moray McConnachie
 Sent: Thursday, June 30, 2011 10:25 AM
 To: lucene-net-u...@lucene.apache.org
 Cc: lucene-net-...@incubator.apache.org
 Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

 I don't think I'm as hard core on this as Neal, but remember: the
 history of the Lucene.NET project is that all the intellectual work, all
 the understanding of search, all the new features come from the Lucene
 Java folks. Theirs is an immensely respected project, and I trust them
 to add new features that will be well-tested and well-researched, and to
 have a decent roadmap which I can trust they will execute on.

 Now I know there's been an influx of capable developers to Lucene.NET
 who are ready, willing and (I'm going to assume) able to add a lot more
 value in a generic .NET implementation as they change it. But it'll take
 a while before I trust a .NET dedicated framework which is significantly
 diverged from Java in the way I do the line-by-line version. And at what
 stage is it not just not a line-by-line port, but not a port at all?

 At the same time, I recognise that if this project is going to continue,
 and attract good developers, it has to change in this direction.

 So that said, I can see why a line-by-line port might not be
 sustainable. And most people don't need it. But most of us using Lucene
 in production systems do need a system that we can trust and rely on. So
 let me chime in with someone else's plea, to keep the general structure
 close to Lucene, to keep the same general objects and inheritance
 set-up, and to keep the same method names, even if you add other methods
 and classes to provide additional functionality. ABSOLUTELY the same
 file formats. End users benefit a lot from a high degree of similarity,
 with good documentation and help being available from the Java
 community.

 Yours,
 Moray
 -
 Moray McConnachie
 Director of IT+44 1865 261 600
 Oxford Analytica  http://www.oxan.com

 -Original Message-
 From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
 Sent: 29 June 2011 20:47
 To: lucene-net-u...@lucene.apache.org
 Cc: lucene-net-...@incubator.apache.org
 Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

 This has been discussed many times.
 Lucene.NET is not valid, the code cannot be trusted, if it is not a
 line-by-line port.  It ceases to be Lucene.

 - Neal

 -Original Message-
 From: Scott Lombard [mailto:lombardena...@gmail.com]
 Sent: Wednesday, June 29, 2011 1:58 PM
 To: lucene-net-...@lucene.apache.org;
 

[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field

2011-06-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057967#comment-13057967
 ] 

Simon Willnauer commented on LUCENE-3216:
-

I will back out the config stuff and make it default to CFS. Somebody who 
needs it will eventually figure out how to make it non-private, or whatever.




Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
On Thu, Jun 30, 2011 at 4:44 PM, Robert Muir <rcm...@gmail.com> wrote:
 I think Dawid is correct here... so we should change this back? Still,
 personally, when I see array clone() or copyOf() it makes me concerned. I
 know these are as fast as arraycopy in recent versions, but depending on
 which variant is used, and whether you use -server, they can be slower... In
 general I just don't want us to have performance regressions on, say, Windows
 32-bit over this stuff. Personally I think arraycopy is a sure-fire bet every
 time, but I'll concede the point that copyOf might not be slower for the
 primitive versions... I think in JDK 7 we will not have this issue, as -client
 sorta goes away in favor of the tiered thing? Anyway, I think we should
 proceed with caution here as far as moving things over to copyOf. I don't
 see any evidence that it's ever faster, but it's definitely sometimes slower.

I don't see any evidence that this is any slower though.

simon
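
For reference, a small sketch that puts the two grow() variants under discussion side by side. GrowSketch and its method names are hypothetical; only the two copying idioms come from the commit. Arrays.copyOf clamps the copied length with Math.min, while the committed arraycopy version assumes newLength >= buffer.length, so the sketch adds the same clamp to make the two behave identically. It says nothing about relative speed.

```java
import java.util.Arrays;

/** Two functionally equivalent ways to resize a byte buffer (illustrative only). */
final class GrowSketch {

    /** The variant removed by the commit: one library call. */
    static byte[] growWithCopyOf(byte[] buffer, int newLength) {
        return Arrays.copyOf(buffer, newLength);
    }

    /** The variant added by the commit, plus a clamp so shrinking is also safe. */
    static byte[] growWithArraycopy(byte[] buffer, int newLength) {
        byte[] newBuffer = new byte[newLength];
        // Copy only as many bytes as fit into the new array.
        System.arraycopy(buffer, 0, newBuffer, 0, Math.min(buffer.length, newLength));
        return newBuffer;
    }

    public static void main(String[] args) {
        byte[] buffer = {1, 2, 3};
        System.out.println(Arrays.toString(growWithCopyOf(buffer, 5)));
        System.out.println(Arrays.toString(growWithArraycopy(buffer, 5)));
    }
}
```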

 On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:
 Arrays.copyOf(primitive) is actually System.arraycopy by default. If
 intrinsics are used it can only get faster. For object types it will
 probably be a bit slower for -client because of a runtime check for
 the component type.

 Dawid

 On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote:
 because on windows 32bit at least, -client is still the default on
 most jres out there.

 I realize people don't care about -client, but I will police
 foo[].clone() / Arrays.copyOf etc. to prevent problems.

 There are comments about this stuff on the relevant bug reports
 (Oracle's site is down, sorry) linked to this issue.
 https://issues.apache.org/jira/browse/LUCENE-2674

 Sorry, I don't think we should use foo[].clone() / Arrays.copyOf; I
 think we should always use arraycopy.

 On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:
 Hmm, are you concerned about the extra Math.min that happens in the
 copyOf method?
 I don't know how that relates to intrinsics and Java 1.7.

 I don't have strong feelings here just checking if you mix something
 up in the comment you put there... I am happy to keep the old and now
 current code

 simon

 On Thu, Jun 30, 2011 at 2:42 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Thu Jun 30 12:42:17 2011
 New Revision: 1141510

 URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
 Log:
 LUCENE-3239: remove use of slow Arrays.copyOf

 Modified:

  lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

 Modified:
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff

 ==
 --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original)
 +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
 @@ -2,7 +2,6 @@ package org.apache.lucene.util;

  import java.io.IOException;
  import java.io.OutputStream;
 -import java.util.Arrays;

  /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
 @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
 -    buffer = Arrays.copyOf(buffer, newLength);
 +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
 +    // buffer = Arrays.copyOf(buffer, newLength);
 +    byte[] newBuffer = new byte[newLength];
 +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
 +    buffer = newBuffer;
   }

   /**







[jira] [Resolved] (LUCENE-3264) crank up faceting module tests

2011-06-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-3264.
-

Resolution: Fixed

ok, committed and backported.

I think we should open followup issue(s):
* speed up the top-k sampling tests (but make sure they are thorough on nightly 
etc still)
* make a RandomTaxonomyWriter
* look at any hardcoded constants like #docs etc and see if we can in general 
add randomization.


 crank up faceting module tests
 --

 Key: LUCENE-3264
 URL: https://issues.apache.org/jira/browse/LUCENE-3264
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3264.patch, LUCENE-3264.patch


 The faceting module has a large set of good tests.
 Let's switch them over to use all of our test infra (randomindexwriter, random 
 iwconfig, mockanalyzer, newDirectory, ...)
 I don't want to address multipliers and atLeast() etc. on this issue; I think 
 we should follow up with that on a separate issue that also looks at speed 
 and making sure the nightly build is exhaustive.
 For now, let's just get the coverage in; it will be good to do before any 
 refactoring.




[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs

2011-06-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057979#comment-13057979
 ] 

Uwe Schindler commented on LUCENE-3246:
---

Hi Mike,
As we now have both variants to read/write BitVectors, would it not be a good 
idea to automatically use the old encoding for liveDocs if more than 50% of 
all bits are unset? This would save disk space if a segment has more 
deletions than live docs. I am not sure whether this can easily be implemented 
and is worth the complexity (which we already have because of both versions).

The patch looks fine!
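
A rough sketch of the selection rule suggested above, under the assumption that the two variants differ in whether the set bits or the cleared bits are written out sparsely. LiveDocsEncodingSketch, the Encoding enum and choose() are hypothetical names, and java.util.BitSet stands in for Lucene's BitVector; this is not how the patch is structured.

```java
import java.util.BitSet;

/** Picks which bits to store sparsely, based on how many docs are still live. */
final class LiveDocsEncodingSketch {

    /** Hypothetical labels for the two on-disk variants. */
    enum Encoding {
        SPARSE_SET_BITS,   // list the set (live) bits; small when few docs are live
        SPARSE_CLEAR_BITS  // list the cleared (deleted) bits; small when few docs are deleted
    }

    /**
     * Returns SPARSE_SET_BITS when more than 50% of the maxDoc bits are unset,
     * i.e. when the segment has more deletions than live docs.
     */
    static Encoding choose(BitSet liveDocs, int maxDoc) {
        int live = liveDocs.cardinality();
        int deleted = maxDoc - live;
        return deleted > live ? Encoding.SPARSE_SET_BITS : Encoding.SPARSE_CLEAR_BITS;
    }

    public static void main(String[] args) {
        BitSet liveDocs = new BitSet(10);
        liveDocs.set(0, 2); // only docs 0 and 1 are live out of 10
        System.out.println(choose(liveDocs, 10));
    }
}
```

A tie falls back to listing the cleared bits, matching the "more than 50%" wording of the comment.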

 Invert IR.getDelDocs -> IR.getLiveDocs
 --

 Key: LUCENE-3246
 URL: https://issues.apache.org/jira/browse/LUCENE-3246
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3246-IndexSplitters.patch, LUCENE-3246.patch, 
 LUCENE-3246.patch


 Spinoff from LUCENE-1536, where we need to fix the low level filtering
 we do for deleted docs to match Filters (ie, a set bit means the doc
 is accepted) so that filters can be pushed all the way down to the
 enums when possible/appropriate.
 This change also inverts the meaning of the first arg to
 TermsEnum.docs/AndPositions (renames from skipDocs to liveDocs).




[jira] [Commented] (LUCENE-3179) OpenBitSet.prevSetBit()

2011-06-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057980#comment-13057980
 ] 

Uwe Schindler commented on LUCENE-3179:
---

Any other comments/microbenchmarks from other committers? Dawid and Paul?

I would like to commit this if nobody objects! What should we do with the then 
obsolete BitUtils methods?

 OpenBitSet.prevSetBit()
 ---

 Key: LUCENE-3179
 URL: https://issues.apache.org/jira/browse/LUCENE-3179
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Paul Elschot
Priority: Minor
 Fix For: 3.3, 4.0

 Attachments: LUCENE-3179-fix.patch, LUCENE-3179-fix.patch, 
 LUCENE-3179-long-ntz.patch, LUCENE-3179-long-ntz.patch, LUCENE-3179.patch, 
 LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch


 Find a previous set bit in an OpenBitSet.
 Useful for parent testing in nested document query execution LUCENE-2454 .
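
To illustrate the operation being discussed, here is a minimal sketch of a "previous set bit" lookup over a long[] word array using Long.numberOfLeadingZeros. PrevSetBitSketch is a hypothetical class and this is not the attached patch; it only assumes the usual layout where bit i lives in word i >> 6.

```java
/** Finds the highest set bit at or below a given index in a long[] bit array. */
final class PrevSetBitSketch {

    /** Returns the index of the last set bit <= index, or -1 if there is none. */
    static int prevSetBit(long[] words, int index) {
        if (index < 0 || words.length == 0) {
            return -1;
        }
        int i = index >> 6; // word that contains the requested bit
        long word;
        if (i >= words.length) {
            // Index is past the end: start from the last word, unmasked.
            i = words.length - 1;
            word = words[i];
        } else {
            // Drop the bits above index inside this word.
            int shift = 63 - (index & 63);
            word = (words[i] << shift) >>> shift;
        }
        while (true) {
            if (word != 0) {
                return (i << 6) + 63 - Long.numberOfLeadingZeros(word);
            }
            if (--i < 0) {
                return -1;
            }
            word = words[i];
        }
    }

    public static void main(String[] args) {
        long[] words = {0b101L, 1L}; // bits 0, 2 and 64 are set
        System.out.println(prevSetBit(words, 70));
        System.out.println(prevSetBit(words, 63));
    }
}
```

For the nested-document use case mentioned in the issue, a lookup of this shape could answer "which marked document precedes this one", but that usage is an assumption here.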




Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Dawid Weiss
 I don't see any evidence that this is any slower though.

You need to run with -client (if the machine is a beast this is tricky
because x64 will pick -server regardless of the command-line setting)
and you need to be copying generic arrays. I think this can be shown
-- a caliper benchmark would be perfect to demonstrate this in
isolation; I may write one if I find a spare moment.

Dawid
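
Until a proper caliper benchmark exists, a naive timing sketch along these lines could at least show how the comparison would be set up for object-typed arrays. CopyTimingSketch and its helpers are hypothetical, the System.nanoTime loop is not a reliable benchmark, and the numbers only mean something when the JVM is actually started in the mode under discussion (for example with -client on a VM that honors it).

```java
import java.util.Arrays;

/** Naive timing harness comparing Arrays.copyOf and System.arraycopy on Object[]. */
final class CopyTimingSketch {

    // Results are written here so the JIT cannot discard the copies entirely.
    static volatile Object sink;

    static Object[] viaCopyOf(Object[] src, int newLength) {
        return Arrays.copyOf(src, newLength);
    }

    static Object[] viaArraycopy(Object[] src, int newLength) {
        Object[] dst = new Object[newLength];
        System.arraycopy(src, 0, dst, 0, Math.min(src.length, newLength));
        return dst;
    }

    /** Runs the task once for warm-up and once timed; returns elapsed nanoseconds. */
    static long timeNanos(Runnable task, int iterations) {
        for (int i = 0; i < iterations; i++) {
            task.run(); // warm-up pass
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        final Object[] src = new Object[1024];
        Arrays.fill(src, "x");
        int iterations = 200_000;
        long copyOf = timeNanos(() -> sink = viaCopyOf(src, 2048), iterations);
        long arraycopy = timeNanos(() -> sink = viaArraycopy(src, 2048), iterations);
        System.out.println("Arrays.copyOf:    " + copyOf + " ns");
        System.out.println("System.arraycopy: " + arraycopy + " ns");
    }
}
```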




Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Michael McCandless
I think it's important Lucene keeps good performance on ordinary
machines/envs.

It's really quite dangerous that the active Lucene devs all use beasts
for development/testing.  We draw false conclusions.

So we really should be testing with -client and if indeed generified
Arrays.copyOf (and anything else) is risky in such envs we should not
use it when System.arraycopy works more consistently.

Mike McCandless

http://blog.mikemccandless.com




Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

2011-06-30 Thread Michael Herndon
I'd say that is all the more reason that we need to work smarter and not
harder. I'd also say that's a good reason to make sure we build consensus
rather than just saying "whoever commits code wins".

And it's a damn good reason to focus on the effort of growing the number of
contributors and lowering the barrier to submitting patches, breaking things
down into pieces that people would feel confident to work on without
being overwhelmed by the complexity of Lucene.Net.

There is a pretty big gap between Lucene 2.9.x and Lucene 4.0, and the
internals and index formats are significantly different, including nixing the
current vint file format and using byte[] array slices for Terms instead of
char[].

So while porting 1 to 1 will require less knowledge or thought, it's most
likely going to require more hours of work. And it's definitely not going to
guarantee the stability of the code or that it's great code.

I'd have to say that it's not as stable as most would believe at the moment.

Most of the tests avoid anything that remotely looks like it knows about the
DRY principle and there is a static constructor in the core test case that
throws an exception if it doesn't find an environment variable TEMP which
will fail 90% of the tests and nunit will be unable to give you a clear
reason why.  Just to name a few issues I came across working towards getting
Lucene.Net into CI.  I haven't even started really digging in under the
covers of the code yet.

So my suggestion is to chew on this a bit more and build consensus, avoid
fracturing people into sides.  Be open to reservations and concerns that
others have and continue to address them.

- Michael



Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Dawid Weiss
 I think it's important Lucene keeps good performance on ordinary
 machines/envs.

Not that this voice will help in anything, but I think the above is
virtually impossible to achieve unless you have a bunch of machines,
OSs and VMs to continually test on and a consistent set of benchmarks
plotted over time... and of course check every single commit for
regression over all these combinations. And even then you'd always
find a case of something being faster or slower on some combination of
hardware/software; optimizing for these differences makes little
sense to me (people struggling with performance on some weird
software/hardware combination can always change the VM vendor or a VM
switch).

Sorry for being so pessimistically unconstructive... :(

Dawid




[jira] [Resolved] (LUCENE-2341) explore morfologik integration

2011-06-30 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-2341.
-

Resolution: Fixed

In trunk. Long live 1.6 support.

 explore morfologik integration
 --

 Key: LUCENE-2341
 URL: https://issues.apache.org/jira/browse/LUCENE-2341
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Robert Muir
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: LUCENE-2341.diff, LUCENE-2341.diff, LUCENE-2341.diff, 
 LUCENE-2341.diff, LUCENE-2341.patch, LUCENE-2341.patch, 
 morfologik-fsa-1.5.2.jar, morfologik-polish-1.5.2.jar, 
 morfologik-stemming-1.5.2.jar


 Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer 
 available:
 http://sourceforge.net/projects/morfologik/
 This works differently than LUCENE-2298, and ideally would be another option 
 for users.




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9208 - Failure

2011-06-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9208/

1 tests failed.
REGRESSION:  org.apache.lucene.facet.util.TestScoredDocIDsUtils.testWithDeletions

Error Message:
Wrong number of (live) documents expected:<65> but was:<64>

Stack Trace:
junit.framework.AssertionFailedError: Wrong number of (live) documents expected:<65> but was:<64>
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1277)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1195)
at org.apache.lucene.facet.util.TestScoredDocIDsUtils.testWithDeletions(TestScoredDocIDsUtils.java:142)




Build Log (for compile errors):
[...truncated 8816 lines...]






Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Michael McCandless
Fair enough, and I agree.

Though the least we could do is rotate in a Windows env, where Java
runs with -client, to our Jenkins.

But simple-to-follow rules like "Don't use Arrays.copyOf; use
System.arraycopy instead" (if indeed System.arraycopy seems to
generally not be slower) seem like a no-brainer.

Why risk Arrays.copyOf, anytime?  Shouldn't we never use it...?

Mike McCandless

http://blog.mikemccandless.com




Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9208 - Failure

2011-06-30 Thread Robert Muir
this one reproduces, and just beasting the test, looks like this test
fails ~ 2% of the time on trunk and branch_3x




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9205 - Failure

2011-06-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9205/

All tests passed

Build Log (for compile errors):
[...truncated 10841 lines...]






Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 9205 - Failure

2011-06-30 Thread Dawid Weiss
javadocs failed. I'll fix it.

Dawid

On Thu, Jun 30, 2011 at 9:35 PM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9205/

 All tests passed

 Build Log (for compile errors):
 [...truncated 10841 lines...]






Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

2011-06-30 Thread Troy Howard
Michael,

I agree with everything you said. My point in saying "whoever commits code
wins" was to illustrate the reality of how and why the project has the
current form.

Building consensus is difficult. It is an essential first step before we can
do something like make a list of bite-sized pieces of work that others can
work on.

This is why my real message of "Let's find a way to accommodate both" is so
important. It allows us to build consensus, so that we can settle on a
direction and structure our work.

Until we accomplish that, it really is "whoever commits code wins", and that
is an unhealthy and unmaintainable way to operate.

From a technical perspective, your statements about the unit tests are
completely accurate. They really need a LOT of reworking. That's the very
first step before making any significant changes. Part of the problem is
that the tests themselves are not well written. The other part is that the
Lucene object model was not designed for testability, and it makes writing
good tests more difficult, and certain tests might not be possible. It will
be difficult to write good unit tests without re-structuring. The biggest
issue is the use of abstract classes with base behaviour vs interfaces or
fully abstracted classes. Makes mocking tough. This is the direction I was
going when I started the Lucere project. I'd like to start in on that work
after the 2.9.4g release.

Thanks,
Troy


On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon 
mhern...@wickedsoftware.net wrote:

 I'd say that is all the more reason that we need to work smarter and not
 harder. I'd also say that's a good reason to make sure we build consensus
 rather than just saying "whoever commits code wins".

 And it's a damn good reason to focus on the effort of growing the number of
 contributors and lowering the barrier to submitting patches, breaking things
 down into pieces that people would feel confident to work on without
 being overwhelmed by the complexity of Lucene.Net.

 There is a pretty big gap between Lucene 2.9.x and Lucene 4.0, and the
 internals and index formats are significantly different, including nixing
 the current vint file format and using byte[] array slices for Terms instead
 of char[].

 So while porting 1 to 1 will require less knowledge or thought, it's most
 likely going to require more hours of work. And it's definitely not going to
 guarantee the stability of the code or that it's great code.

 I'd have to say that it's not as stable as most would believe at the moment.

 Most of the tests avoid anything that remotely looks like it knows about
 the DRY principle, and there is a static constructor in the core test case
 that throws an exception if it doesn't find an environment variable TEMP,
 which will fail 90% of the tests, and NUnit will be unable to give you a
 clear reason why.  Just to name a few issues I came across working towards
 getting Lucene.Net into CI.  I haven't even started really digging in under
 the covers of the code yet.

 So my suggestion is to chew on this a bit more and build consensus, avoid
 fracturing people into sides.  Be open to reservations and concerns that
 others have and continue to address them.

 - Michael


 On Thu, Jun 30, 2011 at 2:10 PM, Digy digyd...@gmail.com wrote:

  Although there are a lot of people using Lucene.Net, this is our
  contribution report for the past 5 years.
 
 
 
 https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2Q
 
 
 AV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|linversionId=-1issue
 
 
 Status=allselectedProjectId=12310290reportKey=com.sourcelabs.jira.plugin.r
  eport.contributions%3AcontributionreportNext=Next
 
 
  DIGY
 
  -Original Message-
  From: Ayende Rahien [mailto:aye...@ayende.com]
  Sent: Thursday, June 30, 2011 8:16 PM
  To: Rory Plaire; lucene-net-...@lucene.apache.org
  Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
 
  As someone from the nhibernate project
  We stopped following hibernate a while ago, and haven't regretted it
  We have more features, fewer bugs and a better code base
 
  Sent from my Windows Phone From: Rory Plaire
  Sent: Thursday, June 30, 2011 19:58
  To: lucene-net-...@lucene.apache.org
  Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
  I don't want to drag this out much longer, but I am curious about people
  who hold the line-by-line sentiment - are you NHibernate users?
 
  -r
 
  On Thu, Jun 30, 2011 at 2:39 AM, Noel Lysaght lysag...@hotmail.com
  wrote:
 
   Can I just plug in my bit and say I agree 100% with what Moray has
  outlined
   below.
  
   If we move away from the line by line port then over time we'll lose out
   on the momentum that is Lucene and the improvements that they make.
   It is only if the Lucene.NET community has expertise in search,  a
  deep
   knowledge of the project and the community can guarantee that the
  knowledge
   will survive members coming and going should such a 

Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Simon Willnauer
On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 I don't see any evidence that this is any slower though.

 You need to run with -client (if the machine is a beast this is tricky
 because x64 will pick -server regardless of the command-line setting)
 and you need to be copying generic arrays. I think this can be shown
 -- a caliper benchmark would be perfect to demonstrate this in
 isolation; I may write one if I find a spare moment.

this is what I want to see. I don't want to discuss based on some bug
reported for a non-primitive version of copyOf, that's all.
it's pointless to discuss if there is no evidence, which I don't see. I
am happy with arraycopy, I would just have appreciated a discussion
before backing the change out.

simon

 Dawid





[jira] [Commented] (LUCENE-1879) Parallel incremental indexing

2011-06-30 Thread hao yan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058072#comment-13058072
 ] 

hao yan commented on LUCENE-1879:
-

Hi, Michael

Is there any recent progress on this topic? I am very interested in this!

 Parallel incremental indexing
 -

 Key: LUCENE-1879
 URL: https://issues.apache.org/jira/browse/LUCENE-1879
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/index
Reporter: Michael Busch
Assignee: Michael Busch
 Fix For: 4.0

 Attachments: parallel_incremental_indexing.tar


 A new feature that allows building parallel indexes and keeping them in sync 
 on a docID level, independent of the choice of the MergePolicy/MergeScheduler.
 Find details on the wiki page for this feature:
 http://wiki.apache.org/lucene-java/ParallelIncrementalIndexing 
 Discussion on java-dev:
 http://markmail.org/thread/ql3oxzkob7aqf3jd

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira






RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

2011-06-30 Thread Scott Lombard
Ok I think I asked the wrong question.  I am trying to figure out where to
put my time.  I was thinking about working on the automated porting system,
but when I saw the response to the .NET 4.0 discussions I started to
question if that is the right direction.  The community seemed to be more
interested in the .NET features.  

The complexity of the automated tool is going to become very high and will
probably end up with a line-for-line style port.  So I keep asking myself
whether the automated tool is worth it.  I don't think it is.  

I like the method Digy has been using for porting the code.  So I guess
for me the real question is: Digy, where do you see 2.9.4g going next and
what do you need help on?  

Scott




 -Original Message-
 From: Digy [mailto:digyd...@gmail.com]
 Sent: Thursday, June 30, 2011 4:20 PM
 To: lucene-net-...@lucene.apache.org
 Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
 
 Michael,
 You interpret the report as "whoever commits code wins"? But when I look
 at it, I see a lot of talk, no work. The .NET community is not interested in
 contributing.
 I really don't understand what hinders people from working on Lucene.Net. As I
 did for 2.9.4g, grab the code, do whatever you want on it and submit it back.
 If it doesn't fit the project's direction it can still find a place in
 contrib or in a branch. All of the approaches can live side by side happily
 in the Lucene.Net repository.
 
 Troy,
 I also don't understand why you are waiting for 2.9.4g. It is a *branch* and
 has nothing to do with the trunk. It need not be an official release and
 can live in a branch as a PoC.
 
 
 As a result, I got bored of listening to "this should be done that way". What
 I want to see is "I did it that way, should we continue with this?".
 
 DIGY
 
 
 
 
 -Original Message-
 From: Troy Howard [mailto:thowar...@gmail.com]
 Sent: Thursday, June 30, 2011 10:47 PM
 To: lucene-net-...@lucene.apache.org
 Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
 
 Michael,
 
 I agree with everything you said. My point in saying "whoever commits code
 wins" was to illustrate the reality of how and why the project has the
 current form.
 
 Building consensus is difficult. It is an essential first step before we
 can do something like make a list of bite-sized pieces of work that others
 can work on.
 
 This is why my real message of "Let's find a way to accommodate both" is
 so important. It allows us to build consensus, so that we can settle on a
 direction and structure our work.
 
 Until we accomplish that, it really is "whoever commits code wins", and
 that is an unhealthy and unmaintainable way to operate.
 
 From a technical perspective, your statements about the unit tests are
 completely accurate. They really need a LOT of reworking. That's the very
 first step before making any significant changes. Part of the problem is
 that the tests themselves are not well written. The other part is that the
 Lucene object model was not designed for testability, and it makes writing
 good tests more difficult, and certain tests might not be possible. It
 will be difficult to write good unit tests without re-structuring. The
 biggest issue is the use of abstract classes with base behaviour vs
 interfaces or fully abstracted classes. Makes mocking tough. This is the
 direction I was going when I started the Lucere project. I'd like to start
 in on that work after the 2.9.4g release.
 
 Thanks,
 Troy
 
 
 On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon 
 mhern...@wickedsoftware.net wrote:
 
  I'd say that is all the more reason that we need to work smarter and
  not harder. I'd also say that's a good reason to make sure we build
  consensus rather than just saying "whoever commits code wins".
 
  And it's a damn good reason to focus on the effort of growing the number
  of contributors and lowering the barrier to submitting patches, breaking
  things down into pieces that people would feel confident to work on without
  being overwhelmed by the complexity of Lucene.Net.
 
  There is a pretty big gap between Lucene 2.9.x and Lucene 4.0, and the
  internals and index formats are significantly different, including nixing
  the current vint file format and using byte[] array slices for Terms
  instead of char[].
 
  So while porting 1 to 1 will require less knowledge or thought, it's
  most likely going to require more hours of work. And it's definitely not
  going to guarantee the stability of the code or that it's great code.
 
  I'd have to say that it's not as stable as most would believe at the
  moment.
 
  Most of the tests avoid anything that remotely looks like it knows about
  the DRY principle, and there is a static constructor in the core test case
  that throws an exception if it doesn't find an environment variable TEMP,
  which will fail 90% of the tests, and NUnit will be unable to give you a
  clear reason why.  Just to name a few issues I came across working towards
  getting
  

Re: managing CHANGES.txt?

2011-06-30 Thread Chris Hostetter

: There's no sense in CHANGES being a 'rolling list', when someone looks
: at 4.0 they should be able to see whats DIFFERENT aka what CHANGED
: from the past release.

I agree completely, the disagreement is *which* past release the list 
should be relative to.

I don't know how many more ways i can say it: I believe that the list of 
changes for 4.0 should be labeled (and contain) "Changes since 3.0" -- 
because that is the most recent past release with a common development 
history.

When we only had a single trunk and the 3.0 release branch was forked 
from the same place as the 2.9 release branch it made sense to think of 
the 3.0 changes list as "Changes since 2.9" because they were genuine 
successors of each other -- any code in 2.9 was by definition in 3.0 unless 
it was modified/removed by something listed in the 3.0 changes.

That is not going to be true for 3.3 and 4.0 (or 3.4 and 4.0, or 3.7 and 
4.0 or whatever our last 3.x release is before 4.0).  

The list of changes 
for a release should always make it clear *exactly* what is different 
between that release and the previous release with common lineage of 
source code -- it may sound weird, but it's what i believe and it's 
consistent with how we've done bug fix releases in the past -- they've 
referred to changes since their parent release, not since the last 
calendar release.

Since no one seems to agree with me on this, I've tried to let this go 
(twice!) by stating my position and conceding that it's not consensus -- 
but if you keep reviving the argument, i'll happily keep restating my 
beliefs.


-Hoss




Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

2011-06-30 Thread Michael McCandless
On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer
simon.willna...@googlemail.com wrote:
 On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
 dawid.we...@cs.put.poznan.pl wrote:
 I don't see any evidence that this is any slower though.

 You need to run with -client (if the machine is a beast this is tricky
 because x64 will pick -server regardless of the command-line setting)
 and you need to be copying generic arrays. I think this can be shown
 -- a caliper benchmark would be perfect to demonstrate this in
 isolation; I may write one if I find a spare moment.

 this is what I want to see. I don't want to discuss based on some bug
 reported for a non-primitive version of copyOf, that's all.
 it's pointless to discuss if there is no evidence, which I don't see. I
 am happy with arraycopy, I would just have appreciated a discussion
 before backing the change out.

I think the burden of proof here is on Arrays.copyOf.

Ie, until we can prove (through benchmarking in different envs) that
it can be trusted, we should just stick with System.arraycopy to
reduce the risk.
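
To make the two options concrete, here is a minimal sketch of growing a byte
buffer both ways, plus a crude timing loop. The class and method names are
made up for illustration and are not taken from UnsafeByteArrayOutputStream.
The timing is wall-clock only, with no warm-up or VM-mode control, so it is
not a substitute for the caliper benchmark discussed above.

```java
import java.util.Arrays;

// Illustrative comparison of the two ways to grow a byte[] buffer.
final class CopyComparison {

    // Arrays.copyOf allocates the new array and copies in one call,
    // zero-padding when newLength is larger than the source.
    static byte[] growWithCopyOf(byte[] src, int newLength) {
        return Arrays.copyOf(src, newLength);
    }

    // Explicit allocation followed by System.arraycopy of the overlapping prefix.
    static byte[] growWithArraycopy(byte[] src, int newLength) {
        byte[] dst = new byte[newLength];
        System.arraycopy(src, 0, dst, 0, Math.min(src.length, newLength));
        return dst;
    }

    // Crude timing of repeated grows; results depend heavily on JIT state and VM flags.
    static long timeNanos(boolean useCopyOf, byte[] src, int newLength, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            byte[] grown = useCopyOf
                ? growWithCopyOf(src, newLength)
                : growWithArraycopy(src, newLength);
            sink += grown.length; // use the result so the loop body is not trivially dead
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) throw new IllegalStateException("unreachable for non-negative lengths");
        return elapsed;
    }

    public static void main(String[] args) {
        byte[] src = new byte[1024];
        System.out.println("copyOf    ns: " + timeNanos(true, src, 2048, 100000));
        System.out.println("arraycopy ns: " + timeNanos(false, src, 2048, 100000));
    }
}
```

Both methods return arrays with identical contents, so the choice between
them only matters for performance, which is the open question in this thread.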

Mike




[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs

2011-06-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058107#comment-13058107
 ] 

Michael McCandless commented on LUCENE-3246:


bq. As we have now both variants to read/write BitVectors, would it be not a 
good idea to automatically use the old encoding for liveDocs, if more than 50% 
of all bits are unset? 

That seems like a good idea?  Ie, handle both sparse set and sparse unset 
compactly?  Though it should be unusual that you have so many deletes against a 
segment (esp. because TMP now targets such segs more aggressively).

We should do this under a new issue (the old code also didn't handle the many 
deletions case sparsely either, just the few deletions case).
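
As a rough illustration of handling both sparse cases, the sketch below stores
whichever side (set or unset bits) has fewer entries as a plain list of
indices. It uses java.util.BitSet and invented names. It is not Lucene's
BitVector or its on-disk format, and it assumes no bit at or beyond `size` is
set.

```java
import java.util.BitSet;

// Illustrative "store the minority side" encoding for a fixed-size bit set.
final class SparseBitsSketch {

    // True when listing the unset bits needs fewer entries than listing the set bits.
    static boolean preferUnset(BitSet bits, int size) {
        int set = bits.cardinality();
        return (size - set) < set;
    }

    // Collects the indices of either the unset or the set bits within [0, size).
    static int[] indices(BitSet bits, int size, boolean unset) {
        int count = unset ? size - bits.cardinality() : bits.cardinality();
        int[] out = new int[count];
        int n = 0;
        for (int i = unset ? bits.nextClearBit(0) : bits.nextSetBit(0);
             i >= 0 && i < size;
             i = unset ? bits.nextClearBit(i + 1) : bits.nextSetBit(i + 1)) {
            out[n++] = i;
        }
        return out;
    }

    // Rebuilds the bit set from the index list and the flag saying which side was stored.
    static BitSet decode(int[] stored, boolean unset, int size) {
        BitSet bits = new BitSet(size);
        if (unset) {
            bits.set(0, size); // start from "all live", then clear the listed entries
        }
        for (int i : stored) {
            if (unset) bits.clear(i); else bits.set(i);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet live = new BitSet(10);
        live.set(0, 10);
        live.clear(3);
        live.clear(7); // two deleted docs out of ten
        boolean unset = preferUnset(live, 10);
        int[] stored = indices(live, 10, unset);
        System.out.println(unset + " " + stored.length + " " + decode(stored, unset, 10).equals(live));
    }
}
```

With few deletions the unset side is the short one, and with many deletions
the set side is, so either extreme stays compact.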

 Invert IR.getDelDocs -> IR.getLiveDocs
 --

 Key: LUCENE-3246
 URL: https://issues.apache.org/jira/browse/LUCENE-3246
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3246-IndexSplitters.patch, LUCENE-3246.patch, 
 LUCENE-3246.patch


 Spinoff from LUCENE-1536, where we need to fix the low level filtering
 we do for deleted docs to match Filters (ie, a set bit means the doc
 is accepted) so that filters can be pushed all the way down to the
 enums when possible/appropriate.
 This change also inverts the meaning first arg to
 TermsEnum.docs/AndPositions (renames from skipDocs to liveDocs).




[jira] [Commented] (SOLR-2623) Solr JMX MBeans do not survive core reloads

2011-06-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058112#comment-13058112
 ] 

Hoss Man commented on SOLR-2623:


Alexey: at first glance, i think i would prefer Shalin's suggestion over your 
patch.  

My main hesitation about your approach is the parameterized close method -- If 
we really go that route i'd much rather see something like a 
SolrCore.preCloseToReleaseResources() method.  But more fundamentally, if we 
unregister the MBeans before creating the new core, there is a window of time 
when the old core is responding to requests, but can't be monitored (and if 
anything goes wrong with creating the new core, the old one will continue to 
handle requests indefinitely but be totally unmonitorable).

That said: i suspect the fix might even be easier than what Shalin proposed 
(which would require SolrCore to pass itself into the JmxMonitoredMap) 
... can't we essentially change 
JmxMonitoredMap.unregister(String,SolrInfoMBean) to have pseudo code like this..

{code}
if (server.isRegistered(name)) {
  MBean existing = server.getMBean(name);
  // only unregister when the registered MBean wraps the instance held in this map
  if (existing instanceof SolrDynamicMBean &&
      ((SolrDynamicMBean) existing).getSolrInfoMBean() == this.get(name)) {
    server.unregisterMBean(name);
  } else {
    // :NOOP: MBean is not ours
  }
}
{code}

...adding a package protected SolrDynamicMBean.getSolrInfoMBean() seems less 
invasive than passing the SolrCore to another class
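
The identity check in the pseudo code can be tried out in isolation. The
sketch below is only a model of the idea: a plain Map stands in for the MBean
server, and every name in it is hypothetical (it uses no Solr or
javax.management classes). It shows why closing the old core after a reload
should leave the new core's registrations alone.

```java
import java.util.HashMap;
import java.util.Map;

// Model of "only unregister what you registered"; a Map stands in for the MBean server.
final class OwnedRegistrySketch {

    // Shared stand-in for the server: name -> bean currently registered under that name.
    private final Map<String, Object> server;
    // Beans registered through this instance (one instance per core).
    private final Map<String, Object> mine = new HashMap<String, Object>();

    OwnedRegistrySketch(Map<String, Object> sharedServer) {
        this.server = sharedServer;
    }

    void register(String name, Object bean) {
        mine.put(name, bean);
        server.put(name, bean); // a reloaded core overwrites the old core's entry
    }

    // Removes the entry only if it is the very instance this core registered.
    boolean unregister(String name) {
        Object existing = server.get(name);
        if (existing != null && existing == mine.get(name)) {
            server.remove(name);
            return true;
        }
        return false; // NOOP: the registered bean belongs to another core
    }

    public static void main(String[] args) {
        Map<String, Object> server = new HashMap<String, Object>();
        OwnedRegistrySketch oldCore = new OwnedRegistrySketch(server);
        OwnedRegistrySketch newCore = new OwnedRegistrySketch(server);
        oldCore.register("searcher", new Object());
        newCore.register("searcher", new Object()); // reload replaces the bean
        System.out.println(oldCore.unregister("searcher")); // old core must not remove it
        System.out.println(server.containsKey("searcher"));
    }
}
```

In this model the old core's close becomes a no-op for names the new core has
taken over, so there is no window in which the live core is unmonitored.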

 Solr JMX MBeans do not survive core reloads
 ---

 Key: SOLR-2623
 URL: https://issues.apache.org/jira/browse/SOLR-2623
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 1.4, 1.4.1, 3.1, 3.2
Reporter: Alexey Serba
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Attachments: SOLR-2623.patch, SOLR-2623.patch, SOLR-2623.patch


 Solr JMX MBeans do not survive core reloads
 {noformat:title=Steps to reproduce}
 sh cd example
 sh vi multicore/core0/conf/solrconfig.xml # enable jmx
 sh java -Dcom.sun.management.jmxremote -Dsolr.solr.home=multicore -jar 
 start.jar
 sh echo 'open 8842 # 8842 is java pid
  domain solr/core0
  beans
  ' | java -jar jmxterm-1.0-alpha-4-uber.jar
 
 solr/core0:id=core0,type=core
 solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=org.apache.solr.handler.StandardRequestHandler
 solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=standard
 solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=/update
 solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=org.apache.solr.handler.XmlUpdateRequestHandler
 ...
 solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=searcher
 solr/core0:id=org.apache.solr.update.DirectUpdateHandler2,type=updateHandler
 sh curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0'
 sh echo 'open 8842 # 8842 is java pid
  domain solr/core0
  beans
  ' | java -jar jmxterm-1.0-alpha-4-uber.jar
 # there's only one bean left after Solr core reload
 solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=Searcher@2e831a91 
 main
 {noformat}
 The root cause of this is Solr core reload behavior:
 # create new core (which overwrites existing registered MBeans)
 # register new core and close old one (we remove/un-register MBeans on 
 oldCore.close)
 The correct sequence is:
 # unregister MBeans from old core
 # create and register new core
 # close old core without touching MBeans




[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-06-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058116#comment-13058116
 ] 

Michael McCandless commented on LUCENE-2793:


To address the nocommits about losing the larger buffer size during merging, 
should we add set/getMergeBufferSize and set/getDefaultBufferSize to those Dir 
impls that do buffering?  (And default to what they are today on trunk, I think 
1 KB and 4 KB?)

 Directory createOutput and openInput should take an IOContext
 -

 Key: LUCENE-2793
 URL: https://issues.apache.org/jira/browse/LUCENE-2793
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/store
Reporter: Michael McCandless
Assignee: Varun Thacker
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch


 Today for merging we pass down a larger readBufferSize than for searching 
 because we get better performance.
 I think we should generalize this to a class (IOContext), which would hold 
 the buffer size, but then could hold other flags like DIRECT (bypass OS's 
 buffer cache), SEQUENTIAL, etc.
 Then, we can make the DirectIOLinuxDirectory fully usable because we would 
 only use DIRECT/SEQUENTIAL during merging.
 This will require fixing how IW pools readers, so that a reader opened for 
 merging is not then used for searching, and vice/versa.  Really, it's only 
 all the open file handles that need to be different -- we could in theory 
 share del docs, norms, etc, if that were somehow possible.
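
A minimal sketch of the kind of value class described above, with a buffer
size plus a set of flags. Everything here is hypothetical: the class name,
the factory methods and the 1 KB / 4 KB sizes (taken from the tentative
numbers in the comment above) are assumptions, not the API from any
LUCENE-2793 patch.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical immutable context passed to Directory.createOutput/openInput.
final class IOContextSketch {

    // Hints named in the issue description; a real patch may define others.
    enum Flag { DIRECT, SEQUENTIAL }

    final int bufferSize;
    final Set<Flag> flags;

    IOContextSketch(int bufferSize, EnumSet<Flag> flags) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.bufferSize = bufferSize;
        this.flags = Collections.unmodifiableSet(EnumSet.copyOf(flags));
    }

    // Assumed default for searching: small buffer, no special hints.
    static IOContextSketch forSearch() {
        return new IOContextSketch(1024, EnumSet.noneOf(Flag.class));
    }

    // Assumed merge context: larger buffer, bypass the OS cache, sequential access.
    static IOContextSketch forMerge() {
        return new IOContextSketch(4096, EnumSet.of(Flag.DIRECT, Flag.SEQUENTIAL));
    }

    boolean has(Flag flag) {
        return flags.contains(flag);
    }

    public static void main(String[] args) {
        IOContextSketch merge = forMerge();
        System.out.println(merge.bufferSize + " " + merge.flags);
    }
}
```

Carrying the hints in one object means a Directory implementation can decide
per call whether to use direct I/O, instead of each caller passing a bare
buffer size.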




[jira] [Created] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib

2011-06-30 Thread Chris Male (JIRA)
check-legal-lucene always checks contrib/queries/lib


 Key: LUCENE-3267
 URL: https://issues.apache.org/jira/browse/LUCENE-3267
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Reporter: Chris Male
Priority: Minor


I've been noticing for awhile that the check-legal-lucene always checks 
/contrib/queries/lib, no matter where it is.  Consequently it never finds the 
directory.  This seems like a waste in our build and for the life of me I have 
no idea why it is necessary.  

Offending line is:

{code}
<arg value="${basedir}/contrib/queries/lib" />
{code}

in check-legal-lucene

Patch will remove this.




[jira] [Resolved] (LUCENE-3241) Remove Lucene core's FunctionQuery impls

2011-06-30 Thread Chris Male (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male resolved LUCENE-3241.


Resolution: Fixed

Committed revision 1141747.

 Remove Lucene core's FunctionQuery impls
 

 Key: LUCENE-3241
 URL: https://issues.apache.org/jira/browse/LUCENE-3241
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/search
Reporter: Chris Male
Assignee: Chris Male
 Fix For: 4.0

 Attachments: LUCENE-3241.patch, LUCENE-3241.patch


 As part of the consolidation of FunctionQuerys, we want to remove Lucene 
 core's impls.  Included in this work, we will make sure that all the 
 functionality provided by the core impls is also provided by the new module.  
 Any tests will be ported across too, to increase the test coverage.




[jira] [Resolved] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-06-30 Thread Chris Male (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male resolved LUCENE-2883.


Resolution: Fixed

Committed revision 1141749.

It's done.  Finally.

 Consolidate Solr & Lucene FunctionQuery into modules
 -

 Key: LUCENE-2883
 URL: https://issues.apache.org/jira/browse/LUCENE-2883
 Project: Lucene - Java
  Issue Type: Task
  Components: core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Chris Male
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: LUCENE-2883.patch


 Spin-off from the [dev list | 
 http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  




[jira] [Commented] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib

2011-06-30 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058138#comment-13058138
 ] 

Chris Male commented on LUCENE-3267:


Woops, committed wrong thing with this issue number.  Oh well.

 check-legal-lucene always checks contrib/queries/lib
 

 Key: LUCENE-3267
 URL: https://issues.apache.org/jira/browse/LUCENE-3267
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Reporter: Chris Male
Priority: Minor
 Attachments: LUCENE-3267.patch


 I've been noticing for awhile that the check-legal-lucene always checks 
 /contrib/queries/lib, no matter where it is.  Consequently it never finds the 
 directory.  This seems like a waste in our build and for the life of me I 
 have no idea why it is necessary.  
 Offending line is:
 {code}
 <arg value="${basedir}/contrib/queries/lib" />
 {code}
 in check-legal-lucene
 Patch will remove this.




[jira] [Updated] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib

2011-06-30 Thread Chris Male (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male updated LUCENE-3267:
---

Attachment: LUCENE-3267.patch

Actual patch for this issue.

Removes the offending /contrib/queries/lib hardcoded check.

Everything seems good.  I'll commit tomorrow.

 check-legal-lucene always checks contrib/queries/lib
 

 Key: LUCENE-3267
 URL: https://issues.apache.org/jira/browse/LUCENE-3267
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Reporter: Chris Male
Priority: Minor
 Attachments: LUCENE-3267.patch


 I've been noticing for awhile that the check-legal-lucene always checks 
 /contrib/queries/lib, no matter where it is.  Consequently it never finds the 
 directory.  This seems like a waste in our build and for the life of me I 
 have no idea why it is necessary.  
 Offending line is:
 {code}
 <arg value="${basedir}/contrib/queries/lib" />
 {code}
 in check-legal-lucene
 Patch will remove this.




[jira] [Commented] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-06-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058142#comment-13058142
 ] 

Robert Muir commented on LUCENE-2883:
-

Thanks for all your hard refactoring work here Chris!

 Consolidate Solr & Lucene FunctionQuery into modules
 -

 Key: LUCENE-2883
 URL: https://issues.apache.org/jira/browse/LUCENE-2883
 Project: Lucene - Java
  Issue Type: Task
  Components: core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Chris Male
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: LUCENE-2883.patch


 Spin-off from the [dev list | 
 http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  




[jira] [Commented] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib

2011-06-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058145#comment-13058145
 ] 

Robert Muir commented on LUCENE-3267:
-

+1

 check-legal-lucene always checks contrib/queries/lib
 

 Key: LUCENE-3267
 URL: https://issues.apache.org/jira/browse/LUCENE-3267
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Reporter: Chris Male
Priority: Minor
 Attachments: LUCENE-3267.patch


 I've been noticing for awhile that the check-legal-lucene always checks 
 /contrib/queries/lib, no matter where it is.  Consequently it never finds the 
 directory.  This seems like a waste in our build and for the life of me I 
 have no idea why it is necessary.  
 Offending line is:
 {code}
 <arg value="${basedir}/contrib/queries/lib" />
 {code}
 in check-legal-lucene
 Patch will remove this.



