[jira] Assigned: (LUCENE-1458) Further steps towards flexible indexing

2009-08-27 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch reassigned LUCENE-1458: - Assignee: Michael Busch (was: Michael McCandless) > Further steps towards flexible inde

[jira] Assigned: (LUCENE-1458) Further steps towards flexible indexing

2009-08-27 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch reassigned LUCENE-1458: - Assignee: Michael McCandless (was: Michael Busch) Oops, didn't want to steal this from

[jira] Updated: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1865: - Attachment: LUCENE-1865-part2.patch untested patch fixing the files i mentioned > Add a ton of missing

[jira] Updated: (LUCENE-1868) update NOTICE.txt

2009-08-27 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1868: Attachment: LUCENE-1868.patch patch to add NOTICE.txt for the persian stopwords file. > update NO

[jira] Updated: (LUCENE-1867) replace collation/lib/icu4j.jar with a smaller icu jar

2009-08-27 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1867: Attachment: icu4j-collation-4.0.jar i have uploaded the jar file i created. (i had to bloat it 200

[jira] Created: (LUCENE-1868) update NOTICE.txt

2009-08-27 Thread Robert Muir (JIRA)
update NOTICE.txt - Key: LUCENE-1868 URL: https://issues.apache.org/jira/browse/LUCENE-1868 Project: Lucene - Java Issue Type: Task Reporter: Robert Muir Fix For: 2.9 From the java-dev discussion,

Re: ICU license info in NOTICE.txt ?

2009-08-27 Thread Mark Miller
Let's make an issue for it. - Mark http://www.lucidimagination.com (mobile) On Aug 27, 2009, at 8:23 PM, Robert Muir wrote: along the same lines, I looked at NOTICE.txt and I think the same notice for Arabic stopwords should apply to Persian (they come from the same place, both BSD-licensed) On

Re: ICU license info in NOTICE.txt ?

2009-08-27 Thread Robert Muir
along the same lines, I looked at NOTICE.txt and I think the same notice for Arabic stopwords should apply to Persian (they come from the same place, both BSD-licensed) On Thu, Aug 27, 2009 at 5:27 PM, Chris Hostetter wrote: > > i notice this file has the full licensing info for ICU... > >   contr

[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-08-27 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748617#action_12748617 ] Michael Busch commented on LUCENE-1458: --- In the current patch the choice of the Code

Signing the release

2009-08-27 Thread Grant Ingersoll
Not sure if we have to, but I believe there is a discussion going on at commun...@a.o concerning signatures and the need to upgrade. Something about number of bits, etc. See recent threads from Robert Burrell Donkin. See http://mail-archives.apache.org/mod_mbox/www-community/200908.mbox/b

[jira] Reopened: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reopened LUCENE-1865: - Darn - didn't mean to miss the java one - though I knowingly skipped some of the jsp/xml. I still h

[jira] Commented: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748611#action_12748611 ] Hoss Man commented on LUCENE-1865: -- as of r808636 it still seems like we're missing boile

RE: competeing license ifo for snowball code?

2009-08-27 Thread Chris Hostetter
: There is a discussion about this at: : :http://issues.apache.org/jira/browse/LUCENE-740 Hmmm... ok. even with that in mind, I don't understand why we need ./contrib/snowball/LICENSE.txt -- all of (lucene) source code is already covered by ./LICENSE.txt right? -Hoss --

Re: svn commit: r807763 - /lucene/java/trunk/build.xml

2009-08-27 Thread Mark Miller
Mark Miller wrote: > Chris Hostetter wrote: > >> : FWIW, committers can get Hudson accounts. See >> >> are you sure about that? I never understood the reason, but the wiki has >> always said... >> >> "if you are a member of an ASF PMC, get in touch and we'll set you up with >> an account." >

Re: svn commit: r807763 - /lucene/java/trunk/build.xml

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : FWIW, committers can get Hudson accounts. See > > are you sure about that? I never understood the reason, but the wiki has > always said... > > "if you are a member of an ASF PMC, get in touch and we'll set you up with > an account." > > : http://wiki.apache.org/gener

RE: competeing license ifo for snowball code?

2009-08-27 Thread Steven A Rowe
There is a discussion about this at: http://issues.apache.org/jira/browse/LUCENE-740 Steve > -Original Message- > From: Chris Hostetter [mailto:hossman_luc...@fucit.org] > Sent: Thursday, August 27, 2009 5:32 PM > To: Lucene Dev > Subject: competeing license ifo for snowball code? >

Re: svn commit: r807763 - /lucene/java/trunk/build.xml

2009-08-27 Thread Chris Hostetter
: FWIW, committers can get Hudson accounts. See are you sure about that? I never understood the reason, but the wiki has always said... "if you are a member of an ASF PMC, get in touch and we'll set you up with an account." : http://wiki.apache.org/general/Hudson. Committers can also get L

competeing license ifo for snowball code?

2009-08-27 Thread Chris Hostetter
can someone explain this to me... http://svn.apache.org/viewvc/lucene/java/trunk/contrib/snowball/LICENSE.txt?view=co http://svn.apache.org/viewvc/lucene/java/trunk/contrib/snowball/SNOWBALL-LICENSE.txt?view=co ...that first one seems like a (very old) mistake. -Hoss ---

ICU license info in NOTICE.txt ?

2009-08-27 Thread Chris Hostetter
i notice this file has the full licensing info for ICU... contrib/collation/lib/ICU-LICENSE.txt ...but isn't there also supposed to be at least a one line mention of this in the top level NOTICE.txt file? -Hoss - To u

RE: Lucene 2.9 release size

2009-08-27 Thread Chris Hostetter
: This prompts the question (in my mind anyway): should source releases include third-party binary jars? if i remember correctly, the historical argument has been that this way the source release contains everything you need to compile the source. except that if i remember correctly (and i'm v

[jira] Commented: (LUCENE-1867) replace collation/lib/icu4j.jar with a smaller icu jar

2009-08-27 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748566#action_12748566 ] Robert Muir commented on LUCENE-1867: - 5,647,316 bytes -> 3,410,075 bytes. > replace
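As a rough check on the figures in the comment above (an illustrative calculation only, not part of the original message), the two reported byte counts work out to a reduction of a little under 40%:

```python
# Byte counts as reported in the LUCENE-1867 comment.
original_bytes = 5_647_316
reduced_bytes = 3_410_075

# Absolute and relative savings from the smaller jar.
saved_bytes = original_bytes - reduced_bytes
reduction_pct = 100 * saved_bytes / original_bytes

print(f"saved {saved_bytes:,} bytes ({reduction_pct:.1f}% smaller)")
```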

[jira] Created: (LUCENE-1867) replace collation/lib/icu4j.jar with a smaller icu jar

2009-08-27 Thread Robert Muir (JIRA)
replace collation/lib/icu4j.jar with a smaller icu jar -- Key: LUCENE-1867 URL: https://issues.apache.org/jira/browse/LUCENE-1867 Project: Lucene - Java Issue Type: Task Component

Re: Lucene 2.9 release size

2009-08-27 Thread Mark Miller
Sounds like a good idea. Robert Muir wrote: > collation could be made smaller, it probably uses entire icu4j jar, > which includes large data pieces unnecessary for collation > > if this sounds like a good idea i will create a smaller one with the below > link: > > http://apps.icu-project.org/dat

[jira] Commented: (LUCENE-1521) "fdx size mismatch" exception in StoredFieldsWriter.closeDocStore() when closing index with 500M documents

2009-08-27 Thread Elliot Metsger (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748558#action_12748558 ] Elliot Metsger commented on LUCENE-1521: Nevermind, it doesn't look like this is a

RE: Lucene 2.9 release size

2009-08-27 Thread Steven A Rowe
Hi Mark, On 8/29/2009 at 4:24 PM, Mark Miller wrote: > It looks like 2.9 will be a much larger release. The 2.4.1 src dist I > have is 5.9MB zipped - the 2.9 version is 15.3 MB zipped. [snip] > collation 0 -> 5.5 The source code under collation is much smaller than 5.5MB - this must mostly be

Re: Lucene 2.9 release size

2009-08-27 Thread Robert Muir
collation could be made smaller, it probably uses entire icu4j jar, which includes large data pieces unnecessary for collation if this sounds like a good idea i will create a smaller one with the below link: http://apps.icu-project.org/datacustom/ On Thu, Aug 27, 2009 at 4:23 PM, Mark Miller wro

CachingTokenFilter extensibility and LUCENE-1685

2009-08-27 Thread David Kaelbling
Hi, Looking at Lucene 2.9 trunk, CachingTokenFilter seems much less extensible than before. In previous releases I subclassed it so I could back the cache with an array and provide random access to the stream. I can't see how to do this any more, and the WeightedSpanTermExtractor.getReaderFor

[jira] Issue Comment Edited: (LUCENE-1521) "fdx size mismatch" exception in StoredFieldsWriter.closeDocStore() when closing index with 500M documents

2009-08-27 Thread Elliot Metsger (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748555#action_12748555 ] Elliot Metsger edited comment on LUCENE-1521 at 8/27/09 1:33 PM: ---

[jira] Commented: (LUCENE-1521) "fdx size mismatch" exception in StoredFieldsWriter.closeDocStore() when closing index with 500M documents

2009-08-27 Thread Elliot Metsger (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748555#action_12748555 ] Elliot Metsger commented on LUCENE-1521: I received this on 2.4.1, not sure if it

Lucene 2.9 release size

2009-08-27 Thread Mark Miller
It looks like 2.9 will be a much larger release. The 2.4.1 src dist I have is 5.9MB zipped - the 2.9 version is 15.3 MB zipped. It all looks legit to me though. I'll post a list of the biggest changes so someone can flag if they think something is off. It appears legit to me though: src dist 2.4.

[jira] Commented: (LUCENE-1815) Geohash encode/decode floating point problems

2009-08-27 Thread Wouter Heijke (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748548#action_12748548 ] Wouter Heijke commented on LUCENE-1815: --- To me it was major since geohashes are THE

[jira] Updated: (LUCENE-1866) better RAT reporting

2009-08-27 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1866: - Attachment: LUCENE-1866.patch per discussion on the user list, we should be looking at a RAT report for

[jira] Created: (LUCENE-1866) better RAT reporting

2009-08-27 Thread Hoss Man (JIRA)
better RAT reporting Key: LUCENE-1866 URL: https://issues.apache.org/jira/browse/LUCENE-1866 Project: Lucene - Java Issue Type: Bug Reporter: Hoss Man the "ant rat-sources" target currently only analyzes sr

[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-08-27 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748538#action_12748538 ] Michael Busch commented on LUCENE-1458: --- {quote} Switches to a new more efficient te

Re: RAT on just src/java ???

2009-08-27 Thread Chris Hostetter
: > from the commandline i'm seeing about what you're seeing, from the ant correction .. even calling RAT directly (via ant's ) contrib takes a few minutes -- but it doesn't chew up RAM (it was the uncompressed dist artifacts that were really fast on the command line i think) : I wonder if yo

Re: RAT on just src/java ???

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : How much RAM is it taking for you? I've got it scanning > > I didn't look into it that hard. > > : demo/test/src/contrib and it takes 6 seconds - the mem does appear to > : pop to like 160MB from 70 real quick - what are you seeing for RAM reqs? > > are you running from t

Re: RAT on just src/java ???

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : How much RAM is it taking for you? I've got it scanning > > I didn't look into it that hard. > > : demo/test/src/contrib and it takes 6 seconds - the mem does appear to > : pop to like 160MB from 70 real quick - what are you seeing for RAM reqs? > > are you running from t

Re: RAT on just src/java ???

2009-08-27 Thread Chris Hostetter
: How much RAM is it taking for you? I've got it scanning I didn't look into it that hard. : demo/test/src/contrib and it takes 6 seconds - the mem does appear to : pop to like 160MB from 70 real quick - what are you seeing for RAM reqs? are you running from the commandline, or from ant? if yo

[jira] Resolved: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1865. - Resolution: Fixed > Add a ton of missing license headers throughout test/demo/contrib >

[jira] Resolved: (LUCENE-1817) it is impossible to use a custom dictionary for SmartChineseAnalyzer

2009-08-27 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-1817. - Resolution: Fixed Committed revision 808570. > it is impossible to use a custom dictionary for

Re: RAT on just src/java ???

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : reason why I did only src/java. I agree we should have it cover all > : sources. > > Hmmm... rat is a memory hog, but the rat ant task is ridiculous (probably > because it only supports being passed filesets containing actual files > to analyze, i can't figure out a

[jira] Commented: (LUCENE-1817) it is impossible to use a custom dictionary for SmartChineseAnalyzer

2009-08-27 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748505#action_12748505 ] Mark Miller commented on LUCENE-1817: - I vote commit it now so it makes the RC - I can

Re: RAT on just src/java ???

2009-08-27 Thread Chris Hostetter
: reason why I did only src/java. I agree we should have it cover all : sources. Hmmm... rat is a memory hog, but the rat ant task is ridiculous (probably because it only supports being passed filesets containing actual files to analyze, i can't figure out a way to just give it a directory (

[jira] Updated: (LUCENE-1817) it is impossible to use a custom dictionary for SmartChineseAnalyzer

2009-08-27 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1817: Lucene Fields: [New, Patch Available] (was: [New]) Fix Version/s: 2.9 Assignee: Rober

[jira] Commented: (LUCENE-1817) it is impossible to use a custom dictionary for SmartChineseAnalyzer

2009-08-27 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748497#action_12748497 ] Mark Miller commented on LUCENE-1817: - I agree Robert - given your concerns, let's drop

Re: svn commit: r808530 - /lucene/java/trunk/src/test/org/apache/lucene/index/TestCheckIndex.java

2009-08-27 Thread Robert Muir
i do not want to delay anything but if someone could provide opinion on javadocs-only patch for LUCENE-1817, i would appreciate it. On Thu, Aug 27, 2009 at 2:35 PM, Mark Miller wrote: > NM - didn't read the code - I see you ||'d it. I'm revving up to do the > RC soon - if anyone has any objections

Re: svn commit: r808530 - /lucene/java/trunk/src/test/org/apache/lucene/index/TestCheckIndex.java

2009-08-27 Thread Mark Miller
NM - didn't read the code - I see you ||'d it. I'm revving up to do the RC soon - if anyone has any objections or anything they want to hold off for - speak up. Mark Miller wrote: > Should this go in the release todo? > > mikemcc...@apache.org wrote: > >> Author: mikemccand >> Date: Thu Aug 27

Re: svn commit: r808530 - /lucene/java/trunk/src/test/org/apache/lucene/index/TestCheckIndex.java

2009-08-27 Thread Mark Miller
Should this go in the release todo? mikemcc...@apache.org wrote: > Author: mikemccand > Date: Thu Aug 27 17:05:53 2009 > New Revision: 808530 > > URL: http://svn.apache.org/viewvc?rev=808530&view=rev > Log: > fix TestCheckIndex to accept either 2.9 or 2.9-dev as valid, from > common-build.xml > >
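The log message describes a test that treats either version string as valid. A minimal sketch of that kind of check follows, written in Python rather than the Java of the real test; the names `ACCEPTED_VERSIONS` and `is_accepted_version` are hypothetical and do not come from TestCheckIndex.java:

```python
# The two version strings named in the commit log message.
ACCEPTED_VERSIONS = ("2.9", "2.9-dev")


def is_accepted_version(version: str) -> bool:
    """Return True if the reported version is one of the accepted strings."""
    return version in ACCEPTED_VERSIONS


print(is_accepted_version("2.9-dev"))
```

Accepting both strings means the same check holds whether the build reports the development or the release version.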

[jira] Created: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Mark Miller (JIRA)
Add a ton of missing license headers throughout test/demo/contrib - Key: LUCENE-1865 URL: https://issues.apache.org/jira/browse/LUCENE-1865 Project: Lucene - Java Issue Type: Ta

[jira] Updated: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1865: Attachment: LUCENE-1865.patch > Add a ton of missing license headers throughout test/demo/contrib

Re: RAT on just src/java ???

2009-08-27 Thread Michael McCandless
I think I originally set it up, and I don't think (can't remember) any reason why I did only src/java. I agree we should have it cover all sources. Mike On Thu, Aug 27, 2009 at 1:36 PM, Mark Miller wrote: > bq. but does anyone know of a reason why it was setup this way? > > No, but pending that

Re: RAT on just src/java ???

2009-08-27 Thread Mark Miller
bq. but does anyone know of a reason why it was setup this way? No, but pending that answer, +1 on making it check all of the source files. Chris Hostetter wrote: > > I noticed that the Release TODO recommends running "ant rat-sources" > to look for possible errors ... but the rat-sources tag is

RAT on just src/java ???

2009-08-27 Thread Chris Hostetter
I noticed that the Release TODO recommends running "ant rat-sources" to look for possible errors ... but the rat-sources tag is setup to only analyze the src/java directory -- not any of the other source files included in the release (contrib, tests, demo, etc...) let alone the full release a

[jira] Resolved: (LUCENE-1858) Update site level documentation

2009-08-27 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1858. - Resolution: Fixed I've gotten the majority of this done. I'm still on the lookout for dead link

Re: Lucene website - benchmarks page

2009-08-27 Thread Mark Miller
No one had touched that page for many years :) That's why I removed it. If we add it back though, that's a great thought - we should mention it and how to do a standard run. Though if we use it to compare 2.4 to 2.9, the mentioning part goes without saying. - Mark Matthew Hall wrote: > That's gre

Re: Lucene website - benchmarks page

2009-08-27 Thread Matthew Hall
That's great... perhaps it should be listed on the Benchmark page though ^^ Matt Mark Miller wrote: We have it - it's the benchmark contrib - just run the micro or standard benchmark alg. Matthew Hall wrote: Sorry to interject, but I think that the most useful thing that could be done to th

Re: Lucene website - benchmarks page

2009-08-27 Thread Mark Miller
We have it - it's the benchmark contrib - just run the micro or standard benchmark alg. Matthew Hall wrote: > Sorry to interject, but I think that the most useful thing that could > be done to this page would be the creation of a standard benchmark > test suite that someone could simply download an

[jira] Resolved: (LUCENE-1864) bogus javadocs for FieldValueHitQuery.fillFields

2009-08-27 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1864. Resolution: Fixed > bogus javadocs for FieldValueHitQuery.fillFields > ---

Re: Lucene website - benchmarks page

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : Prob want to run it on decent hardware as well (eg maybe I shouldn't do > : it with my 5200 rpm laptop drives). > > as long as both are run on the same hardware, and the page lists the > hardware, it's the relative numbers that matter the most. > > > > -Hoss > > > --

Re: Lucene website - benchmarks page

2009-08-27 Thread Matthew Hall
Sorry to interject, but I think that the most useful thing that could be done to this page would be the creation of a standard benchmark test suite that someone could simply download and run on their own systems. That way the benchmarking numbers for all users could be easily compared between boxes

[jira] Resolved: (LUCENE-1860) switch MultiTermQuery to "constant score auto" rewrite by default

2009-08-27 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1860. Resolution: Fixed > switch MultiTermQuery to "constant score auto" rewrite by defa

Re: Lucene website - benchmarks page

2009-08-27 Thread Chris Hostetter
: Prob want to run it on decent hardware as well (eg maybe I shouldn't do : it with my 5200 rpm laptop drives). as long as both are run on the same hardware, and the page lists the hardware, it's the relative numbers that matter the most. -Hoss --

Re: Lucene website - benchmarks page

2009-08-27 Thread Mark Miller
Sure - not a bad idea - need to remember to set autocommit to false on 2.4 though. Not sure if there are any other fair settings - Prob want to run it on decent hardware as well (eg maybe I shouldn't do it with my 5200 rpm laptop drives). note: I also think the wiki already has benchmark stuff t

Re: Lucene website - benchmarks page

2009-08-27 Thread Chris Hostetter
pulling a crap doc from the release seems sound to me. alternately: couldn't we just replace it with the output from the contrib/benchmarker on some of the bigger tests (the full wikipedia ones) comparing 2.4 with 2.9 ? then just make it a pre-release TODO item for the future: update that page

[jira] Commented: (LUCENE-1864) bogus javadocs for FieldValueHitQuery.fillFields

2009-08-27 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748439#action_12748439 ] Michael McCandless commented on LUCENE-1864: I'll fix this... > bogus javadoc

[jira] Assigned: (LUCENE-1864) bogus javadocs for FieldValueHitQuery.fillFields

2009-08-27 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1864: -- Assignee: Michael McCandless > bogus javadocs for FieldValueHitQuery.fillField

[jira] Updated: (LUCENE-1817) it is impossible to use a custom dictionary for SmartChineseAnalyzer

2009-08-27 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1817: Attachment: LUCENE-1817.patch Here is a javadocs-only patch that I think is the best solution. Th

Re: Lucene 2.9 trunk freeze

2009-08-27 Thread Michael McCandless
Nice :) Mike On Thu, Aug 27, 2009 at 12:41 PM, Mark Miller wrote: > of course :) > > I proclaimed we started today, but not the time :) > > -  Mark > > Michael McCandless wrote: >> Can I commit LUCENE-1860? >> >> Mike >> >> On Thu, Aug 27, 2009 at 11:49 AM, Mark Miller wrote: >> >>> We are freezi

[jira] Resolved: (LUCENE-1863) SnowballAnalyzer has a link to net.sf (a package that is empty and needs to be removed).

2009-08-27 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved LUCENE-1863. -- Resolution: Fixed mark fixed in r808472 > SnowballAnalyzer has a link to net.sf (a package that is em

Re: Lucene 2.9 trunk freeze

2009-08-27 Thread Mark Miller
of course :) I proclaimed we started today, but not the time :) - Mark Michael McCandless wrote: > Can I commit LUCENE-1860? > > Mike > > On Thu, Aug 27, 2009 at 11:49 AM, Mark Miller wrote: > >> We are freezing trunk rather than a branch for this release. >> >> I hereby proclaim the freeze

Re: Lucene 2.9 trunk freeze

2009-08-27 Thread Michael McCandless
Can I commit LUCENE-1860? Mike On Thu, Aug 27, 2009 at 11:49 AM, Mark Miller wrote: > We are freezing trunk rather than a branch for this release. > > I hereby proclaim the freeze starts today (unless objections are raised) > - and shortly I will update the versions and build an RC to host for >

Lucene 2.9 trunk freeze

2009-08-27 Thread Mark Miller
We are freezing trunk rather than a branch for this release. I hereby proclaim the freeze starts today (unless objections are raised) - and shortly I will update the versions and build an RC to host for java-user. The releasetodo tells me to remind you of the dos and don'ts: * No new

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Chris Hostetter
: I still have the same thought though - why not? Unless it takes a lot : longer to parse, why hide bad JavaDoc? We may maintain public JavaDoc : for users, but we maintain private JavaDoc for developers as well. if we default it to private, the release will wind up advertising all of the privat

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Mark Miller
Mark Miller wrote: > Chris Hostetter wrote: > >> : > i'm thinking we should change the nightly build to set >> : > -Djavadoc.access=private so we at least expose more errors earlier. >> : > (assuming we also setup the hudson to report stats on javadoc >> : > warnings ... i've seen it in other

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Chris Hostetter
: True enough - I don't think it's super important for release that the : private javadocs are 100% valid. But it's nice if it is regardless :) FWIW: i wasn't trying to suggest that it was, but it helps surface things like LUCENE-1864 which can be really confusing when you start looking at long

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : > i'm thinking we should change the nightly build to set > : > -Djavadoc.access=private so we at least expose more errors earlier. > : > (assuming we also setup the hudson to report stats on javadoc > : > warnings ... i've seen it in other instances but don't know if i

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Mark Miller
True enough - I don't think it's super important for release that the private javadocs are 100% valid. But it's nice if it is regardless :) Uwe Schindler wrote: > Javadoc should normally only contain public methods/classes. Only developers > maybe want to have javadocs with all classes. > > - >

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Chris Hostetter
: > i'm thinking we should change the nightly build to set : > -Djavadoc.access=private so we at least expose more errors earlier. : > (assuming we also setup the hudson to report stats on javadoc : > warnings ... i've seen it in other instances but don't know if it requires : > a special plug

RE: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Uwe Schindler
Javadoc should normally only contain public methods/classes. Only developers maybe want to have javadocs with all classes. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Mark Miller [mailto:markrmil...@gm

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : > you obviously haven't tried "ant javadocs -Djavadoc.access=private" > lately > : > ... i'm working on cleaning that up at the moment. > > : tried it? I'm not even aware of it. Not mentioned in the release todo. > > yeah ... it's admittedly esoteric, but it helps surfa

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Chris Hostetter
: > you obviously haven't tried "ant javadocs -Djavadoc.access=private" lately : > ... i'm working on cleaning that up at the moment. : tried it? I'm not even aware of it. Not mentioned in the release todo. yeah ... it's admittedly esoteric, but it helps surface bugs in docs on private level m

[jira] Created: (LUCENE-1864) bogus javadocs for FieldValueHitQuery.fillFields

2009-08-27 Thread Hoss Man (JIRA)
bogus javadocs for FieldValueHitQuery.fillFields Key: LUCENE-1864 URL: https://issues.apache.org/jira/browse/LUCENE-1864 Project: Lucene - Java Issue Type: Bug Reporter: Hoss Man

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Mark Miller
Chris Hostetter wrote: > : Thanks for the help finishing up the javadoc cleanup Hoss - we almost > : have a clean javadoc run - which is fantastic, because I didn't think it > : was going to be possible. I think it's just this and 1863 and the run is > : clean. > > you obviously haven't tried "ant j

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Chris Hostetter
: Thanks for the help finishing up the javadoc cleanup Hoss - we almost : have a clean javadoc run - which is fantastic, because I didn't think it : was going to be possible. I think it's just this and 1863 and the run is : clean. you obviously haven't tried "ant javadocs -Djavadoc.access=private"

Re: [jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Mark Miller
Hoss Man (JIRA) wrote: > duplicate package.html files in queryParser and analsysis.cn packages > - > > Key: LUCENE-1862 > URL: https://issues.apache.org/jira/browse/LUCENE-1862 > Proje

[jira] Created: (LUCENE-1863) SnowballAnalyzer has a link to net.sf (a package that is empty and needs to be removed).

2009-08-27 Thread Mark Miller (JIRA)
SnowballAnalyzer has a link to net.sf (a package that is empty and needs to be removed). Key: LUCENE-1863 URL: https://issues.apache.org/jira/browse/LUCENE-1863

[jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Hoss Man (JIRA)
duplicate package.html files in queryParser and analsysis.cn packages - Key: LUCENE-1862 URL: https://issues.apache.org/jira/browse/LUCENE-1862 Project: Lucene - Java Issue

[jira] Commented: (LUCENE-1861) Add contrib libs to classpath for javadoc

2009-08-27 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748388#action_12748388 ] Mark Miller commented on LUCENE-1861: - Thanks! Wishes do come true. > Add contrib lib

Re: FuzzyLikeThis query and exact matches

2009-08-27 Thread Mark Harwood
I think those boosts shown are reflecting the edit distance. What we can't see from this is that the Similarity class used in execution is using the same IDF for all terms. The other factors at play will be the term frequency in the doc, its length and any doc boost. I don't have access to the c

[jira] Resolved: (LUCENE-1861) Add contrib libs to classpath for javadoc

2009-08-27 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved LUCENE-1861. -- Resolution: Fixed Fix Version/s: 2.9 Assignee: Hoss Man Committed revision 808419. >

Re: Back-Compat on Contribs

2009-08-27 Thread Chris Hostetter
: releases > 2.9. Robert raised a question if we should mark smartcn as : experimental so that we can change interfaces and public methods etc. : during the refactoring. Would that make sense for 2.9 or is there no : such thing as a back compat policy for modules like that. http://wiki.apache.org

Re: FuzzyLikeThis query and exact matches

2009-08-27 Thread Berkes Adam
After searching for term "desy" which has lot of variants in our index a rewritten (sub)query will look like this: (text:dey^0.22828968 text:des^0.22828968 text:dest^1.1557184 text:desk^1.1557184 text:desi^1.1557184 text:desf^1.1557184 text:desc^1.1557184 text:deny^1.1557184 text:defy^1.155718

Re: FuzzyLikeThis query and exact matches

2009-08-27 Thread Mark Harwood
Despite making IDF a constant the edit distance should remain a factor in the rankings so I would have thought this would give you what you need. Can you supply a more detailed example? Either print the rewritten query or use the explain function Cheers Mark On 27 Aug 2009, at 13:22, Ber

FuzzyLikeThis query and exact matches

2009-08-27 Thread Berkes Adam
Hi, In our Java project we use a (slightly modified) version of FuzzyLikeThis query which "For each source term the fuzzy variants are held in a BooleanQuery with no coord factor (because we are not looking for matches on multiple variants in any one doc). Additionally, a specialized TermQue

[jira] Closed: (LUCENE-1851) 'ant javacc' in root project should also properly create contrib/surround Java files

2009-08-27 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Elschot closed LUCENE-1851. Everything working as expected now. Thanks. > 'ant javacc' in root project should also properly creat