[jira] Closed: (LUCENE-703) Change QueryParser to use ConstantScoreRangeQuery in preference to RangeQuery by default

2006-11-16 Thread Mark Harwood (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-703?page=all ]

Mark Harwood closed LUCENE-703.
---

Resolution: Fixed

> Change QueryParser to use ConstantScoreRangeQuery in preference to RangeQuery by default
> ---
>
> Key: LUCENE-703
> URL: http://issues.apache.org/jira/browse/LUCENE-703
> Project: Lucene - Java
> Issue Type: Improvement
> Components: QueryParser
> Affects Versions: 2.1
> Reporter: Mark Harwood
> Priority: Minor
> Attachments: patch.diff
>
>
> Change QueryParser to default to the new ConstantScoreRangeQuery in
> preference to RangeQuery for range queries. This implementation is
> generally preferable because it:
> a) runs faster
> b) does not let the scarcity of range terms unduly influence the score
> c) avoids any "TooManyBooleanClauses" exception.
> However, if an application really needs the old-fashioned RangeQuery and
> the points above do not matter, the "useOldRangeQuery" property can be
> used to revert to the old behaviour.
> The patch includes extra JUnit tests for this flag, and all other JUnit
> tests pass.
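
To make point (c) concrete, here is a toy Java model. It is not Lucene code:
every class, method and constant in it is made up for illustration. It
assumes the old-style range query expands into one clause per matching term
and fails once a clause cap is exceeded (the cap of 1024 here is an
assumption of the sketch), while the constant-score style only marks
matching documents in a bit set and so has no clause list to overflow.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/**
 * Toy model only: none of these classes are Lucene's. It contrasts a range
 * query expanded into one clause per matching term (capped by a clause
 * limit) with a range query that marks matching documents in a bit set.
 */
class RangeQuerySketch {
    /** Hypothetical clause cap, standing in for a maximum clause count. */
    static final int MAX_CLAUSES = 1024;

    /** One indexed term and the id of the single document containing it. */
    static final class Posting {
        final String term;
        final int docId;

        Posting(String term, int docId) {
            this.term = term;
            this.docId = docId;
        }
    }

    /** Inclusive lexicographic range check. */
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    /** Old style: one clause per term in the range; fails past the cap. */
    static List<String> expandToClauses(List<Posting> index, String lower, String upper) {
        List<String> clauses = new ArrayList<String>();
        for (Posting p : index) {
            if (inRange(p.term, lower, upper)) {
                if (clauses.size() >= MAX_CLAUSES) {
                    throw new IllegalStateException("too many clauses: limit is " + MAX_CLAUSES);
                }
                clauses.add(p.term);
            }
        }
        return clauses;
    }

    /** Constant-score style: set one bit per matching document. */
    static BitSet matchAsBits(List<Posting> index, String lower, String upper) {
        BitSet bits = new BitSet();
        for (Posting p : index) {
            if (inRange(p.term, lower, upper)) {
                bits.set(p.docId);
            }
        }
        return bits;
    }

    /** Builds n postings with zero-padded terms "00000", "00001", ... */
    static List<Posting> buildIndex(int n) {
        List<Posting> index = new ArrayList<Posting>();
        for (int i = 0; i < n; i++) {
            index.add(new Posting(String.format("%05d", i), i));
        }
        return index;
    }

    public static void main(String[] args) {
        List<Posting> index = buildIndex(5000);
        // The bit-set path handles a 5000-term range without any clause list.
        System.out.println(matchAsBits(index, "00000", "04999").cardinality());
        try {
            expandToClauses(index, "00000", "04999");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the wide range only fails on the clause-expansion path. The
sketch does not model scoring, so it says nothing about points (a) and (b).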

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Stephen Hussey is out of the office.

2006-11-16 Thread Stephen Hussey

I will be out of the office starting  11/16/2006 and will not return until
11/29/2006.

I will respond to your message when I return.

Re: svn commit: r475423 - in /lucene/java/trunk: docs/whoweare.html xdocs/whoweare.xml

2006-11-16 Thread Doug Cutting

[EMAIL PROTECTED] wrote:
> Modified:
> lucene/java/trunk/docs/whoweare.html
> lucene/java/trunk/xdocs/whoweare.xml


It looks like you still need to refresh the website.

  ssh people.apache.org
  cd /www/lucene.apache.org/java/docs
  svn up

Doug




Re: svn commit: r475423 - in /lucene/java/trunk: docs/whoweare.html xdocs/whoweare.xml

2006-11-16 Thread Michael McCandless

Doug Cutting wrote:
> [EMAIL PROTECTED] wrote:
>> Modified:
>> lucene/java/trunk/docs/whoweare.html
>> lucene/java/trunk/xdocs/whoweare.xml
>
> It looks like you still need to refresh the website.
>
>   ssh people.apache.org
>   cd /www/lucene.apache.org/java/docs
>   svn up

Woops!  OK I've done this now.

Mike




[jira] Updated: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-11-16 Thread Doron Cohen (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-675?page=all ]

Doron Cohen updated LUCENE-675:
---

Attachment: benchmark.byTask.patch

I am attaching benchmark.byTask.patch - to be applied in the contrib/benchmark 
directory. 

The root package of the byTask classes was changed to 
org.apache.lucene.benchmark.byTask, along the lines of Grant's suggestion. 
This seems better because it keeps all benchmark classes under 
lucene.benchmark.

I added a sample .alg under conf and added some documentation. 

Entry point - documentation wise - is the package doc for 
org.apache.lucene.benchmark.byTask.

Thanks for any comments on this!

PS. Before submitting the patch file, I tried to apply it myself to a clean 
version of the code, just to make sure that it works. But I got errors like 
this -- Could not retrieve revision 0 of "...\byTask\.." -- for every file 
under a new folder. So I am not sure whether it is just my (Windows) svn 
patch-applying utility, or whether it is really impossible to apply a patch 
that creates files in (as yet) nonexistent directories.  I searched the Lucene 
and SVN mailing lists and went through the SVN book again, but nowhere could I 
find the expected behavior for applying a patch containing new directories. 
In fact, "svn diff" would not even show files that are new (again, this is the 
Windows svn 1.4.2 version). (I used Tortoise SVN to create the patch). This is 
rather annoying, and I might be misunderstanding something basic about SVN, 
but I thought it better to share this experience here; it might save some 
time for others trying to apply this patch or other patches
 ...

> Lucene benchmark: objective performance test for Lucene
> ---
>
> Key: LUCENE-675
> URL: http://issues.apache.org/jira/browse/LUCENE-675
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Andrzej Bialecki
> Assigned To: Grant Ingersoll
> Attachments: benchmark.byTask.patch, benchmark.patch, 
> BenchmarkingIndexer.pm, extract_reuters.plx, LuceneBenchmark.java, 
> LuceneIndexer.java, taskBenchmark.zip, timedata.zip, tiny.alg, tiny.properties
>
>
> We need an objective way to measure the performance of Lucene, both indexing 
> and querying, on a known corpus. This issue is intended to collect comments 
> and patches implementing a suite of such benchmarking tests.
> Regarding the corpus: one of the widely used and freely available corpora is 
> the original Reuters collection, available from 
> http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.tar.gz 
> or 
> http://people.csail.mit.edu/u/j/jrennie/public_html/20Newsgroups/20news-18828.tar.gz.
>  I propose to use this corpus as a base for benchmarks. The benchmarking 
> suite could automatically retrieve it from known locations, and cache it 
> locally.




RE: file format inconsistency (any answer?) IMPORTANT

2006-11-16 Thread Samir Abdou

When a field stores the offsets and positions of its terms in term vectors
(in the .tvf file), the file format documentation does not specify how they
are laid out.

But if you look at the writeField() method of TermVectorsWriter, you'll see
that when offsets and positions are requested, they are written to the .tvf
file.

Hope this helps,

Samir
 

-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 15 November 2006 19:36
To: java-dev@lucene.apache.org; [EMAIL PROTECTED]
Subject: Re: file format inconsistency

: There is an inconsistency between the files format page (from Lucene
: website) and the source code. It concerns the positions and offsets of
: term vectors. It seems that documentation (website) is not up to date.
: According to the file format page, offsets and positions are not stored!
: Is that correct?

can you cite exactly what about the fileformats doc leads you to believe
this? ... a quick search for "offsets" and "positions" finds these lines
for me...

 If the third lowest-order bit is set (0x04), term positions are stored
 with the term vectors.
 If the fourth lowest-order bit is set (0x08), term offsets are stored
 with the term vectors.

...and that's just to start with.

-Hoss
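
The two bit tests quoted above can be sketched in Java as follows. This is
only an illustration: the class and method names are hypothetical and not
part of Lucene, and it assumes the flag byte described on the file formats
page has already been read into an int.

```java
/**
 * Sketch: decode the two flag bits quoted from the file formats page.
 * Class and method names are illustrative and are not part of Lucene.
 */
class TermVectorFlagsSketch {
    static final int STORE_POSITIONS = 0x04; // third lowest-order bit
    static final int STORE_OFFSETS = 0x08;   // fourth lowest-order bit

    /** True if term positions are stored with the term vectors. */
    static boolean storesPositions(int bits) {
        return (bits & STORE_POSITIONS) != 0;
    }

    /** True if term offsets are stored with the term vectors. */
    static boolean storesOffsets(int bits) {
        return (bits & STORE_OFFSETS) != 0;
    }

    /** Human-readable summary of both flags. */
    static String describe(int bits) {
        return "positions=" + storesPositions(bits)
                + ", offsets=" + storesOffsets(bits);
    }

    public static void main(String[] args) {
        // 0x0C has both the 0x04 and the 0x08 bits set.
        System.out.println(describe(0x0C));
    }
}
```

The other bits of the flag byte are left alone here, since only these two
are quoted in the thread.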





[jira] Created: (LUCENE-713) File Formats Documentation is not correct for Term Vectors

2006-11-16 Thread Grant Ingersoll (JIRA)
File Formats Documentation is not correct for Term Vectors
--

Key: LUCENE-713
URL: http://issues.apache.org/jira/browse/LUCENE-713
Project: Lucene - Java
Issue Type: Bug
Components: Website
Reporter: Grant Ingersoll
Assigned To: Grant Ingersoll
Priority: Minor


From Samir Abdou on the dev mailing list:

Hi, 

There is an inconsistency between the files format page (from Lucene
website) and the source code. It concerns the positions and offsets of term
vectors. It seems that documentation (website) is not up to date. According
to the file format page, offsets and positions are not stored! Is that
correct?

Many thanks,

Samir
-
Indeed, the term vectors section of the file formats documentation does not 
describe how position and offset info is stored.





Re: [jira] Updated: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-11-16 Thread Paul Elschot
Doron,

On Thursday 16 November 2006 21:17, Doron Cohen (JIRA) wrote:
> ...
> PS. Before submitting the patch file, I tried to apply it myself to a clean
> version of the code, just to make sure that it works. But I got errors like
> this -- Could not retrieve revision 0 of "...\byTask\.." -- for every file
> under a new folder. So I am not sure whether it is just my (Windows) svn
> patch-applying utility, or whether it is really impossible to apply a patch
> that creates files in (as yet) nonexistent directories.  I searched the
> Lucene and SVN mailing lists and went through the SVN book again, but
> nowhere could I find the expected behavior for applying a patch containing
> new directories. In fact, "svn diff" would not even show files that are new

Did you "svn add" the new files locally before doing "svn diff"?

It took me quite a while to get the hang of that. It is mentioned here:
http://wiki.apache.org/jakarta-lucene/HowToContribute

Regards,
Paul Elschot




Re: [jira] Updated: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-11-16 Thread Steven Rowe
Doron Cohen (JIRA) wrote:
>  [ http://issues.apache.org/jira/browse/LUCENE-675?page=all ]
> (I used Tortoise SVN to create the patch).

I haven't tried to use TortoiseSVN to create patches, but my experience
with it for other purposes has been negative enough, especially in
trying to use it on the same working copy on which I use a Cygwin
command-line version of SVN, that I have kicked TortoiseSVN off my
computer, and only use the command-line client.

The cute little icon overlays never seemed to be in sync with reality
anyway, even after refreshing Windows Explorer, and GUIs over
command-line functionality always make me a little paranoid about what
seems like a loss of control, so I haven't mourned its loss too much.

Steve




[jira] Commented: (LUCENE-707) Lucene Java Site docs

2006-11-16 Thread Grant Ingersoll (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-707?page=comments#action_12450567 ] 

Grant Ingersoll commented on LUCENE-707:


I think I have ironed out the issues.  I am not going to submit a patch just 
yet; instead, I ask that people review the site at 
http://people.apache.org/~gsingers/site/  (note: the breadcrumbs at the top 
reflect that it is deployed on my site and will be different on the real 
site).

Here is a summary of what will be in the patch/commit when we are ready:
1. Move xdocs etc. into Forrest common structure under src/site (as is the case 
w/ SOLR and Hadoop)
2. xdocs and docs directory are deleted
3. site directory is cleaned out and the new site is put in there
4. New site update instructions will be the same as: 
http://wiki.apache.org/solr/Website_Update_HOWTO

The "Site Versions" section will eventually link to a nightly build of the 
site (as per issue 708). We are having some infrastructure issues w/ setting 
this up for now, so that part will come later; I am going to decouple that bug 
from this one so as not to hold this one up.


Please let me know what you think, but stick to layout problems and introduced 
errors (i.e. write up a different bug for errors in the documentation that 
already exist).

Thanks!
Grant

> Lucene Java Site docs
> -
>
> Key: LUCENE-707
> URL: http://issues.apache.org/jira/browse/LUCENE-707
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Website
> Environment: N/A
> Reporter: Grant Ingersoll
> Assigned To: Grant Ingersoll
> Priority: Minor
>
> It would be really nice if the Java site docs were consistent with the rest 
> of the Lucene family (namely, with navigation tabs, etc.) so that one can 
> easily go between Nutch, Hadoop, etc.




Re: [jira] Commented: (LUCENE-707) Lucene Java Site docs

2006-11-16 Thread Chris Hostetter

: Please let me know what you think, but stick to layout problems and
: introduced errors (i.e. write up a different bug for errors in the
: documentation that already exist).

I don't know enough about Forrest to explain this, but all of the pages
seem to be getting an empty <title> element (the PDFs don't seem to get
any document titles either)




-Hoss





Re: [jira] Commented: (LUCENE-707) Lucene Java Site docs

2006-11-16 Thread Grant Ingersoll
Good catch.  Shows you how little I look at the title!  Anyway, it
appears that SOLR does it as <header><title>Title</title></header>
whereas our old xdocs are <properties><title>Title</title></properties>.
Apparently, Forrest and Anakia have different interpretations of xdocs.


Updates are on http://people.apache.org/~gsingers/site/

-Grant

On Nov 16, 2006, at 7:04 PM, Chris Hostetter wrote:

> : Please let me know what you think, but stick to layout problems and
> : introduced errors (i.e. write up a different bug for errors in the
> : documentation that already exist).
>
> I don't know enough about Forrest to explain this, but all of the pages
> seem to be getting an empty <title> element (the PDFs don't seem to get
> any document titles either)
>
> -Hoss

