Re: [DISCUSS] Do away with Contrib Committers and make core committers

2010-03-14 Thread Chris Hostetter
: Subject: [DISCUSS] Do away with Contrib Committers and make core committers +1 -Hoss - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]

Re: lucene and solr trunk

2010-03-15 Thread Chris Hostetter
: prime-time as the new solr trunk! Lucene and Solr need to move to a : common trunk for a host of reasons, including single patches that can : cover both, shared tags and branches, and shared test code w/o a test : jar. Without a clearer picture of how people envision development "overhead" wor

Re: #lucene IRC log [was: RE: lucene and solr trunk]

2010-03-16 Thread Chris Hostetter
: with, "if it didn't happen on the lists, it didn't happen". It's the same as +1 But as the IRC channel gets used more and more, it would *also* be nice if there was an archive of the IRC channel so that there is a place to go look to understand the back story behind an idea once it's synthesi

Re: lucene and solr trunk

2010-03-17 Thread Chris Hostetter
: build and nicely gets all dependencies to Lucene and Tika whenever I build : or release, no problem there and certainly no need to have it merged into : Lucene's svn! The key distinction is that Solr is already in "Lucene's svn" -- The question is how to reorg things in a way that makes it easier

Re: Contrib tests fail if core jar is not up to date

2010-03-18 Thread Chris Hostetter
: In addition to what Shai mentioned, I wanted to say that there are : other oddities about how the contrib tests run in ant. For example, : I'm not sure why we create the junitfailed.flag files (I think it has : something to do with detecting top-level that a single contrib : failed). Correct ..

Re: Build failed in Hudson: Lucene-trunk #1144

2010-03-31 Thread Chris Hostetter
: I was wondering yesterday why aren't the required libs checked in to SVN? We Licensing issues. we can't redistribute them (but we can provide the build.xml code to fetch them) -Hoss

Re: Build failed in Hudson: Lucene-trunk #1144

2010-04-01 Thread Chris Hostetter
: No, no, no, Lucene still has no need for maven or ivy for dependency management. : We can just hack around all issues with ant scripts. it doesn't really matter if it's ant scripts, or ivy declarations, or maven pom entries -- the point is the same. We can't distribute the jars, but we can d

Re: Changing the subject for a JIRA-issue (Was: [jira] Created: (LUCENE-2335) optimization: when sorting by field, if index has one segment and field values are not needed, do not load String[] into f

2010-04-08 Thread Chris Hostetter
: > Is it possible to change it? If not, what is the policy here? To open a : > new issue and close the old one? ... : In this case, that would mean either closing this issue and opening a new one, : or taking the discussion to the mailing list where subject headers may be : modified as th

RE: issues.apache.org compromised: please update your passwords

2010-04-14 Thread Chris Hostetter
: I disabled the account by assigning a dummy eMail and gave it a random password. : : I was not able to unassign the issues, as most issues were "Closed", : where no modifications can be done anymore. Reopening and changing Uwe: it may be too late (depending on whether you remember the dummy

Eliminating norms ... completley

2005-10-07 Thread Chris Hostetter
Yonik and I have been looking at the memory requirements of an application we've got. We use a lot of indexed fields, primarily so I can do a lot of numeric tests (using RangeFilter). When I say "a lot" I mean around 8,000 -- many of which are not used by all documents in the index. Now there

Re: Eliminating norms ... completley

2005-10-07 Thread Chris Hostetter
: 2) Can you think of a clean way for individual applications to eliminate : norms (via subclassing the lucene code base - ie: no patching) For completeness, I should mention that one thing I briefly considered was writing a new Directory implementation that would proxy to FSDirectory, but

Re: Eliminating norms ... completley

2005-10-09 Thread Chris Hostetter
Paul, thanx for your suggestions. It seems like they mostly address the issue of improving search time, by eliminating the need to read the norm files from disk -- but the speed of the query isn't as big of a concern for us as the memory footprint. As I understand it, the point when we are reall

Re: Adding information to an index

2005-10-09 Thread Chris Hostetter
: I'm looking to store some additional information in a Lucene index : and I'm looking for advice on how to implement the functionality. : Specifically, I'm planning to store 1) collection frequency count for : each term, 2) actual document length for each document (yes, I looked : at the norm f

Re: Eliminating norms ... completley

2005-10-10 Thread Chris Hostetter
: > Doesn't this cause a problem for highly interactive and large indexes? Since : > every update to the index requires the rewriting of the norms, and : > constructing a new array. : : The original complaint was primarily about search-time memory size, not : update speed. I like the proposed pat

Re: regex-based query contribution

2005-10-13 Thread Chris Hostetter
: A more general solution would be to use a subclass of BooleanQuery that : provides a Weight that flattens all the weights of the subqueries, for example : to the maximum weight, and for the rest works like the usual Weight of : BooleanQuery. I'm not grasping all of the ideas in this thread comp

Re: about numeric range searching with large value sets patches

2005-10-19 Thread Chris Hostetter
I've never really looked at the IntegerRangeQuery submission, but if you think you've found a bug, you should attach your test to the JIRA issue that the original patch bug has been migrated to, so that it's clear to anyone looking at applying it that it may have problems... http://issues.apache

Re: Welcome Yonik Seeley as committer!

2005-10-24 Thread Chris Hostetter
: Last week I proposed to the Lucene PMC that we make Yonik Seeley a : committer on Lucene Java. I am pleased to announce that other PMC : members agreed. Welcome, Yonik! 1) Wah-Hoo! Yonik is definitely one of the smartest guys I've worked with in the past few years. 2) On the subject of comm

Re: Welcome Yonik Seeley as committer!

2005-10-25 Thread Chris Hostetter
: Formally, the process is that someone nominates, and the PMC votes. : When Lucene was part of Jakarta we used to just have the committers : vote, since we had little contact with the Jakarta PMC. But now that : Lucene has its own PMC we can do it the official Apache way. Ahh, I see ... I didn'

Re: Sort by Title Issue

2005-10-27 Thread Chris Hostetter
: I am trying to perform a sort by title field search, but am receiving : the following error. The search seems to have problems with field : values that have multiple words. It sorts single word values with no : problem. Any help will be appreciated. I indexed the title field as : Field.text(

Re: Indexing Remote Documents

2005-10-27 Thread Chris Hostetter
: probably you'll need http client module (commons-httpclient or something) More specifically: when dealing with lucene, the concept of a "document" is very specific: it is an instance of org.apache.lucene.document.Document. how you construct one of these Document objects in your application is

Re: svn commit: r329366 - in /lucene/java/trunk: ./ docs/ src/java/org/apache/lucene/document/ src/java/org/apache/lucene/index/ xdocs/

2005-10-29 Thread Chris Hostetter
: Perhaps something like @since is what we should be using : on that file formats page. It's a little late now for older versions, but it might make sense to move that documentation directly into the code base, where it can be locally linked to from the javadocs, and included directly into jars

Re: Indexing

2005-10-31 Thread Chris Hostetter
: : Taking this to java-dev: Since this is such a common issue, would it : be feasible for Lucene to have some sort of capability to be told : what field is the unique one and automatically update (delete, and : add) a document added with a duplicate of a unique field? This : would probably requi

Re: compatibility of Lucene 1.9

2005-11-09 Thread Chris Hostetter
: And what's the command line to do the svn checkout? It's not apparent : from the Lucene web site. I have the svn client installed. the info is in the wiki, i've linked to it from the FAQ... http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-abe69adac45ac2e9b5c04db87666a6757631 -Hoss

Re: Shared Field Values

2005-11-12 Thread Chris Hostetter
The first thing that occurs to me, is that if the fields you are talking about "sharing" are always indexed, then you can leave them UnStored, and use a FieldCache.StringIndex to get the values. -Hoss

Re: svn commit: r332747 - in /lucene/java/trunk: ./ src/java/org/apache/lucene/search/regex/ src/test/org/apache/lucene/search/regex/

2005-11-16 Thread Chris Hostetter
: > Should we dynamically decide to switch to FieldNormQuery when : > BooleanQuery.maxClauseCount is exceeded? That way queries that : Why not leave that decision to the program using the query? : Something like this: : - catch the TooManyClauses exception, : - adapt (the offending parts of) th

Re: [EMAIL PROTECTED]: Project lucene-java (in module lucene-java) failed

2005-11-17 Thread Chris Hostetter
: It's fixed now. : Sorry bout that... I've already set up a test script to switch my JDK : to 14 before running "ant test". I don't remember the specifics, but isn't there an attribute for the ant task that you can use to tell it whether you want it to compile as 1.4 code or 1.5 code? ... I thou

Removing CVS REpository?

2005-11-18 Thread Chris Hostetter
Replying here to a thread from java-user. Is there any reason not to "cvs remove" all of the files from the old CVS repository for lucene, and check in a single README file explaining that the CVS repository is no longer used, and where they can find the SVN repository? : Date: Fri, 18 Nov 2005

Re: [EMAIL PROTECTED]: Project lucene-java (in module lucene-java) failed

2005-11-18 Thread Chris Hostetter
: I think you are thinking of target="1.4" type of thing. I always : thought this was about binary compatibility of compiled code, not the : language syntax, but I'm not sure. Erik will know. Actually, I'd forgotten about "target" ... I checked and the option i was thinking of is "source"...

Re: "Advanced" query language

2005-12-05 Thread Chris Hostetter
I'm extremely stoked to see this topic come up, but very sad that I didn't have time to read any Lucene mail this past weekend. I'll have to catch up. First off... : Again, we're talking machine-to-machine communication here, not human- : machine. : While there have been several different topic

Re: "Advanced" query language

2005-12-05 Thread Chris Hostetter
: Though, I'd be careful with proposing a variety of equivalent : syntaxes as it may easily lead to more confusion than good. Let's : start with one canonical syntax. If desired, other (more pleasant) : syntaxes may then be converted to that as part of a preprocessing step. Experience has taught

Re: NioFile cache performance

2005-12-09 Thread Chris Hostetter
: I have seen this issue come up several times (perhaps the following is : an oversimplification): : Someone will suggest a performance enhancement and perhaps supply the : code. Then there will be a general discussion about the merits of the : change and the validity of the results, with question

Re: Query.extractTerms

2005-12-09 Thread Chris Hostetter
: : Query.extractTerms throws an exception if called with a non-rewritten : query. Is it enough to document that (I could do that) or is that : something that should be fixed (if possible)? That seems like something that should be a checked Exception (not a RuntimeException) Alternately, extractT

online javadocs are gone?

2005-12-18 Thread Chris Hostetter
Anyone know what happened? these URLs are 403ing... http://lucene.apache.org/java/docs/api/ http://lucene.apache.org/java/docs/api/index.html -Hoss

Re: "Advanced" query language

2005-12-21 Thread Chris Hostetter
I finally got a chance to look at this code today (the best part about the last day before vacation, is no one expects you to get anything done, so you can ignore your "real work" and spend time on things that are more important in the long run) and while I still haven't wrapped my head around al

Re: "Advanced" query language

2005-12-23 Thread Chris Hostetter
: > I think that the ideal API wouldn't require people : > writing ObjectBuilders : > to know anything about sax, or to ever need to : > import anything from : > org.xml.** or javax.xml.** : : Fair enough. I presume we want to maintain the : position that Lucene should not have any dependencies : o

Re: "Advanced" query language

2005-12-30 Thread Chris Hostetter
: I'm personally happier to stick with one approach, : preferably with an existing, standardized interface : which lets me switch implementations. I didn't really : want to have to design a general API for parsing XML : as part of this project. I'm not suggesting that, I'm just saying that the AP

Re: "Advanced" query language

2006-01-04 Thread Chris Hostetter
: I'd still like to keep the parser core reasonably generic (ie : java.lang.Object rather than Query or Filter) because I can see it being : used for instantiating many different types of objects eg requests for : GroupBy , highlighting, indexing, testing etc. : As for your type-safety requiremen

Re: "Advanced" query language

2006-01-04 Thread Chris Hostetter
: This example code looks interesting. If I understand : correctly using this approach requires that builders : like the "q" QueryObjectBuilder instance must be : explicitly registered with each and every builder that : consumes its type of output eg BQOB and FQOB. An correct. : provider for the

Re: Question about FieldInfos

2006-01-10 Thread Chris Hostetter
: Is there some reason not to store all field attributes in one place (*.fnm) ? : Some of them are stored as a one byte-bit mask : in the field infos file (*.fnm), : : isIndexed (IS_INDEXED) : storeTermVector (STORE_TERMVECTOR) : storePositionsWithTermVector (STORE_POSITIONS_WITH_TERMVECTOR) :

Re: BooleanQuery: static setMaxClauseCount(int)?

2006-01-10 Thread Chris Hostetter
I thought the purpose of this method was for applications to specify the largest possible BooleanQuery that could be created in their application (either programmatically, via QueryParser, or as a result of rewriting a non-primitive). Changing this to be non-static would (besides breaking existing

Re: Question about FieldInfos

2006-01-15 Thread Chris Hostetter
: IMO, there's no reason to allow field definitions to be spec'd more : often than once per IndexWriter. Need to add a new field for docs : 501-1000 of a 1000-doc indexing pass? No problem: create a new : IndexWriter, define new fields, and you're off and running. If I understand your argument,

Re: Question about FieldInfos

2006-01-15 Thread Chris Hostetter
: Option 1: Merge field definitions at the segment level rather than : the Document level. The defs stay stored with individual segments, : but everything gets moved into the .fnm file, including : IS_COMPRESSED, IS_BINARY, etc (as I believe Robert was proposing). : : Option 2: Centralize the field

Re: problems with date ranges in queryParser

2006-01-15 Thread Chris Hostetter
: 1.) We now have DateField and DateTools which use different formats. So : QueryParser needs to know which one has been used during indexing. I've a : local patch that adds an appropriate set... method. As much as i dislike the "standard" mechanism for indexing Dates, I'm of the opinion that if p

Re: Handling of colons in QueryParserTokenManager

2006-01-21 Thread Chris Hostetter
if you are flexible in the syntax you are willing to support, you can tell your users that they need to escape the colons that aren't meant as field identifiers... ID:CI\:123 ...alternately, you can tell them they have to quote colons... ID:"CI:123" ...then you can avoid the who
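The two workarounds mentioned in this message (backslash-escaping the colon, or quoting the value) can be sketched in plain Java. This is only an illustration, not Lucene code: `ColonEscaper`, `escapeColons`, and `quoteValue` are hypothetical names, and the sketch assumes the application already knows which part of the input is the field name and which part is the raw value.

```java
// Hypothetical helper, not part of Lucene: prepares a raw value that contains
// colons so it can follow a field name in a query string.
class ColonEscaper {

    // Backslash-escape every colon in the raw value: CI:123 -> CI\:123
    static String escapeColons(String value) {
        return value.replace(":", "\\:");
    }

    // Wrap the raw value in double quotes: CI:123 -> "CI:123"
    // (assumes the value itself contains no double quotes)
    static String quoteValue(String value) {
        return "\"" + value + "\"";
    }

    public static void main(String[] args) {
        String field = "ID";
        String value = "CI:123";
        System.out.println(field + ":" + escapeColons(value)); // ID:CI\:123
        System.out.println(field + ":" + quoteValue(value));   // ID:"CI:123"
    }
}
```

Either form leaves only the first colon as the field separator; which one to ask users for is a matter of taste.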

Re: Filter

2006-01-26 Thread Chris Hostetter
The subject of revamping the Filter API to support more compact filter representations has come up in the past ... At least one patch comes to mind that helps with the issue... https://issues.apache.org/jira/browse/LUCENE-328 ...i'm not intimately familiar with that code, but if i recall corr

RE: Filter

2006-01-26 Thread Chris Hostetter
r out of the box public interface DocIterator { public int doc(); public boolean next(); public boolean skipTo(int target); } : : -Original Message- : From: [EMAIL PROTECTED] : [mailto:[EMAIL PROTECTED] Behalf Of Chris Hostetter : Sent: Thursday, January 26, 2006

Re: Filter

2006-01-27 Thread Chris Hostetter
: > > public interface DocIterator { : > > public int doc(); : > > public boolean next(); : > > public boolean skipTo(int target); : > > } : Btw. the DocNrSkipper referred to earlier has this DocIterator functionality : in one method: : : int nextDocNr(int) :
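To make the quoted DocIterator interface concrete, here is a minimal sketch of one possible implementation backed by a sorted array of document numbers. `SortedArrayDocIterator` is a hypothetical name, and the `skipTo` semantics are an assumption: the sketch always advances at least once and stops at the first document number greater than or equal to the target, which the thread itself does not spell out.

```java
// The interface as quoted in the thread.
interface DocIterator {
    int doc();
    boolean next();
    boolean skipTo(int target);
}

// Hypothetical implementation over a sorted, duplicate-free array of doc numbers.
class SortedArrayDocIterator implements DocIterator {
    private final int[] docs;
    private int pos = -1; // positioned before the first document

    SortedArrayDocIterator(int[] docs) {
        this.docs = docs;
    }

    // Current document number; only valid after next()/skipTo() returned true.
    public int doc() {
        return docs[pos];
    }

    // Advance to the next document; false once the array is exhausted.
    public boolean next() {
        if (pos < docs.length) {
            pos++;
        }
        return pos < docs.length;
    }

    // Assumed semantics: advance at least once, then keep advancing until
    // the current document number is >= target.
    public boolean skipTo(int target) {
        do {
            if (!next()) {
                return false;
            }
        } while (doc() < target);
        return true;
    }
}
```

A compact filter representation would only need to supply these three methods; callers never see whether the documents live in a bit set, an array, or something else.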

Re: Preventing "killer" queries

2006-02-07 Thread Chris Hostetter
Mark, I know you've already committed a patch along these lines (LUCENE-494) and I can see how in a lot of cases that would be a great solution, but i'm still interested in the original idea you proposed (a 'maxDf' in TermQuery) because i anticipate situations in which you don't want to ignore th

Re: Preventing "killer" queries

2006-02-08 Thread Chris Hostetter
: Chris, although I suggested it initially, I'm now a : little uncomfortable in controlling this issue with a : static variable in TermQuery because it doesnt let me : have different settings for different queries, indexes : or fields. Oh i totally agree ... it's the kind of thing you'd only want

Re: 1.9 RC1

2006-02-13 Thread Chris Hostetter
: This is a great time to improve the javadoc. I see lots of blank boxes : which could use a bit of descriptive text, for example: That reminds me about a documentation/release issue that's been rolling around in the back of my mind that seems like it's only going to get worse as future release

Re: changing prefix queries to use a filter instead of expanding terms

2006-02-13 Thread Chris Hostetter
: care about having contribute to the score of the hit. Along those lines I : was thinking about adding some functionality to the code that expands prefix : queries to create a filter and use that instead of just expanding the : individual terms. Can anyone see any major issues with doing it this

updating fieldNorms in mass

2006-02-14 Thread Chris Hostetter
I just noticed the IndexReader.setNorm method(s) today and was extremely stoked -- after rebuilding my dev index from scratch three times last week because I wanted to try out tweaks to Similarity.lengthNorm the idea of being able to directly change the norms without rebuilding from scratch is loo

Re: 1.9 RC1

2006-02-14 Thread Chris Hostetter
: I'd like to push out a 1.9 release candidate in the next week or so. I'm not sure what the ASF/Lucene policy is on keeping Copyright/License statements in source files up to date, but should they all be updated to say "Copyright 2006 The Apache Software Foundation" prior to a 1.9 release? I've

Re: updating fieldNorms in mass

2006-02-14 Thread Chris Hostetter
: > in the case where doc boosts and field boosts aren't used, it seems like : > it would be very easy to write a maintenance app that did something : > like... : > ...does anyone see anything wrong with the overall approach? : : Looks good to me. Implemented and submitted in LUCENE-496. So far

Re: [jira] Created: (LUCENE-498) Remove old @jakarta.apache.org mailing lists

2006-02-16 Thread Chris Hostetter
: Anyone using those addresses, even the new ones, without first : signing up for the list is going to have some issues anyway. I : moderate in a fair number of these sorts of messages, but I also : reject recurring ones and request the sender sign up. Perhaps the best course of action would be

Re: Lucene 1.9 RC1 release available

2006-02-21 Thread Chris Hostetter
: of query). Under the previous versions of QueryParser, I could simply : specify 'riot???' and capture all of those variants. I don't have a strong opinion on this issue, but it seems clear to me that this was a bug in 1.4.3 not a change in the originally intended behavior. queryparsersyntax.h

Re: Lucene 1.9 RC1 release available

2006-02-21 Thread Chris Hostetter
: In either case, what I'm arguing is that the current behavior makes more : sense in the real world of query expressions (that is, makes the most : common query expressions simpler), so why not continue it? I disagree with that statement. People familiar with shell globbing are going to be confus

Re: Lucene 1.9 RC1 release available

2006-02-24 Thread Chris Hostetter
: FYI, I think all of the commits to trunk since the RC1 release are safe : to merge to the 1.9 branch. They're mostly documentation improvements. : So my plan is currently, on Monday, to merge these changes to the 1.9 : branch, then make a 1.9-final release. I'll again announce it to the ...

Re: XML based Query Parser

2006-02-26 Thread Chris Hostetter
: Further to our discussions some time ago I've had some time to put : together an XML-based query parser with support for many "advanced" : query types not supported in the current Query parser. : : More details and code here: http://www.inperspective.com/lucene/LXQuery2.htm So I *finally* got a

Re: XML Query Parser - next steps

2006-02-26 Thread Chris Hostetter
: > A) generate an XML representation of a given : > Query/Filter object. This would solve the current : > parser.parse(Query.toString())round-tripping problem. : : This would be very useful, but couldn't it be added after this was in : contrib? You might reorganize things so that it fits in mor

Re: XML based Query Parser

2006-02-27 Thread Chris Hostetter
: But doesn't sticking with w3c.dom.Element allow the possibility of : standards based tools (eg XPath implementations) to be used by builders : if they so wish? Hmmm... that isn't something i'd considered. You've convinced me. : >3) I'm still confused about how state information could/would be

Re: XML based Query Parser

2006-02-27 Thread Chris Hostetter
: : DOMUtils.getAttributeWithInheritance instead. My one scenario I came : : across where I wanted some context passed down was "fieldName" and this : : is handled simply by leaf nodes walking up the w3c.dom.Node tree until : : you find an Element with this attribute set. : : Hmm, i can see how th

Re: Changes.txt for contrib

2006-03-01 Thread Chris Hostetter
: distribution, we should start documenting their changes. I suggest that we : add a file contrib/CHANGES.txt. This way we don't pollute the top-level : changes file. Having one changes file per contrib project on the other : hand makes it more difficult to get an overview, so one in contrib seems

Online javadocs: 1.9-rc1

2006-03-03 Thread Chris Hostetter
Someone with the necessary permissions to update the javadocs on the website might want to do so, they currently say "Lucene 1.9-rc1 API" which might confuse people (even if the API is exactly the same as 1.9.1) http://lucene.apache.org/java/docs/api/ -Hoss

Re: compile search.jsp

2006-03-04 Thread Chris Hostetter
[email protected] is the appropriate email list to consult with questions about using/configuring/customizing nutch. [EMAIL PROTECTED] is for discussing the core lucene java library. : Date: Sat, 4 Mar 2006 18:36:25 -0800 (PST) : From: Michael Ji <[EMAIL PROTECTED]> : Reply-To: java-

RE: query parsing

2006-03-23 Thread Chris Hostetter
: Any suggestions on what to do then, as the following query exhibits the same behavior : : (+cat) (-dog) : : Due to the implied AND. Removing the parenthesis allows it to work. It : doesn't seem that adding parenthesis in this case should cause the query : to fail??? Adding parens causes QueryPa

Re: FilterIndexReader.getVersion

2006-04-04 Thread Chris Hostetter
: Shouldn't FilterIndexReader in 1.9.1 override IndexReader.getVersion() and : IndexReader.isCurrent()? Currently it doesn't, so getVersion() gives a : NullPointerException, segmentInfos is null. I think you are right, it looks like FilterIndexReader just wasn't updated when those methods were ad

RE: Date Boosting

2006-04-04 Thread Chris Hostetter
: Maybe I'm going about this the wrong way. If you think I am, let me : know. I now realize that this question should be in the lucene users : list but I started it here because I was going to write a new module for : doing this because I couldn't get lucene to do it for me. I'm going to : look

Re: [newbie]problem about range query

2006-04-04 Thread Chris Hostetter
There is a FAQ that covers it, I just updated it since it was somewhat out of date and lacked some of the newest (bestest?) info about dealing with this problem... http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-06fafb5d19e786a50fb3dfb8821a6af9f37aa831 In the future, questions about using

Re: bytecount as prefix

2006-04-11 Thread Chris Hostetter
1) not only does ConstantScoreRangeQuery use a RangeFilter, but TestConstantScoreRangeQuery and TestRangeFilter share a base class that creates the index. 2) perhaps the issue is that corruption is happening when segments are merged -- and most tests don't surface the problem because they tend to

Re: MoreLikeThis

2006-04-16 Thread Chris Hostetter
: Lucene is completely new to me. I just downloaded 1.9.1 and started : experimenting with it. I am a bit confused though. I want to use the : MoreLikeThis class, which appears in the javadoc, but does not exist in : code. Where can I find it? if you look at the way the main javadoc index is ar

Re: how to match Documents from Hits with Documents from Query Spans?

2006-04-18 Thread Chris Hostetter
: For some reason, there is a disagreement between the order the : Documents are returned in hits, and the Documents are referenced (via : order number, starting from 0) in the Spans? When dealing with a Hits instance, documents are iterated over in "results order" -- which may be by score, or ma

Re: using Plucene and Plucene::Simple

2006-04-18 Thread Chris Hostetter
As marvin mentioned, there are some UTF-8 incompatibilities between java lucene and Plucene. Incidentally: your best bet for getting assistance with Plucene is the Plucene mailing lists, as identified at the bottom of "perldoc Plucene" ... http://kasei.com/mailman/listinfo/plucene ...perl

Re: How to get Document (or filename) from Span

2006-04-18 Thread Chris Hostetter
: The question is when I get Spans, I get start/end positions and a : Document order (starting from 0), not the Document object itself from Are you sure about that? Spans.doc() should return you the internal document Identifier which you can pass to indexReader.doc(int) : which I could get a fi

Re: 2.0 release

2006-04-27 Thread Chris Hostetter
: I should have been more clear: I'm not asking for new feature requests. : Rather for known, high-priority, bugs. I don't know if it's high priority, but LUCENE-546 seems to be a trivial bug with a trivial fix ("seems to be", i'm judging purely by the patch) 2.0 also seems like the best time

Re: Turkish Analyzer for lucene

2006-04-28 Thread Chris Hostetter
: Anyway, i am sending you TurkishAnalyzer as attachment.I will be VERY : happy if you upload these codes to: Emre, I don't know anything about Turkish -- but it's always good to have new analyzers: thanks for the contribution. Uploading it to Jira was definitely the best way to submit it. One

Re: A problem running two or more negative Span clauses

2006-05-01 Thread Chris Hostetter
: I am having problems running span queries with more than one : negative clauses: i believe you mean when the exclude clause contains a SpanNear query correct? : Is the span query nested correctly? I'm not very good at reading SpanQuery.toString() output ... but i believe i encountered the s

Re: this == that

2006-05-01 Thread Chris Hostetter
A couple of responses to various comments in this thread... : > Unless it object identity is what is being tested or intern is an : > invariant, I think it is dangerous. It is easy to forget to intern or to : > propagate the pattern via cut and paste to an inappropriate context. interning the St
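The concern quoted in this message, that `==` on Strings is only safe when interning is an invariant, can be seen in a small stand-alone sketch. `InternDemo` and `sameIdentity` are hypothetical names used only for illustration; nothing here is taken from the Lucene code under discussion.

```java
// Hypothetical demo class: identity comparison versus equals() on Strings.
class InternDemo {

    // Reference (identity) comparison, i.e. the "this == that" pattern.
    static boolean sameIdentity(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // Two distinct String objects with the same characters.
        String a = new String("field");
        String b = new String("field");

        System.out.println(sameIdentity(a, b));                   // false
        System.out.println(a.equals(b));                          // true
        // After interning, both references point at the same canonical object.
        System.out.println(sameIdentity(a.intern(), b.intern())); // true
    }
}
```

The identity check is cheap, but it silently returns false the moment one caller forgets to intern, which is the cut-and-paste hazard the quoted text warns about.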

Re: Statistaical evaluation of modifications to a Lucene query based on search logs

2006-05-04 Thread Chris Hostetter
: It's got one difference from yours, in that the terms are allowed to : occur in any order in the sub-phrases (so phrase "C B" from your : original example is scored like "B C"). there's a much bigger difference, in that your technique won't reward documents where B and C are "near" each other, b

Re: Changing Lucene scoring?

2006-05-08 Thread Chris Hostetter
: One of the reasons I am looking at this is because I often need just : yes/no (matches/doesn't match) answers, and don't care for the score. I didn't realize that was an option -- i thought you wanted integer scoring, and the best advice i had for that was to search and replace. But if you jus

Re: Multiple threads searching in Lucene and the synchronized issue. -- solution attached.

2006-05-09 Thread Chris Hostetter
: We found if we were using 2 IndexSearcher, we would get 10% performance : benefit. : But if we increased the number of IndexSearcher from 2, the performance : improvement became slight even worse. Why use more than 2 IndexSearchers? Typically 1 is all you need, except for when you want to

Re: Multiple threads searching in Lucene and the synchronized issue. -- solution attached.

2006-05-09 Thread Chris Hostetter
: > I am fairly certain his code is ok, since it rechecks the initialized state : > in the synchronized block before initializing. : : That "recheck" is why the pattern (or anti-pattern) is called : double-checked locking :-) More specifically, this is functionally half way between example labeled

RE: Multiple threads searching in Lucene and the synchronized issue. -- solution attached.

2006-05-09 Thread Chris Hostetter
: I think you could use a volatile primitive boolean to control whether or not : the index needs to be read, and also mark the index data volatile and it : SHOULD PROBABLY work. : : But as stated, I don't think the performance difference is worth it. My understanding is: 1) volatile will only h

RE: Multiple threads searching in Lucene and the synchronized issue. -- solution attached.

2006-05-10 Thread Chris Hostetter
, I'm not expert i was just going based on what i've read, and apparently i forgot to paste the URL in my last email... http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#dcl : -Original Message- : From: Chris Hostetter [mailto:[EMAIL PROTECTED] : Sent: Wednesday, May
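For readers following the double-checked locking discussion, the variant that the JSR-133 FAQ describes as working under the revised memory model (Java 5 and later) declares the lazily initialized field volatile. The sketch below is not the code posted in the thread: `LazyIndex`, its `data` array, and `size()` are hypothetical stand-ins for whatever expensive index data is being loaded.

```java
// Hypothetical lazily initialized holder; assumes a JSR-133 (Java 5+) JVM.
class LazyIndex {
    // volatile is what makes the unsynchronized first check safe
    private static volatile LazyIndex instance;

    private final int[] data;

    private LazyIndex() {
        // stand-in for the expensive load of index data
        data = new int[] {1, 2, 3};
    }

    static LazyIndex get() {
        LazyIndex local = instance;        // first check, no lock taken
        if (local == null) {
            synchronized (LazyIndex.class) {
                local = instance;          // second check, under the lock
                if (local == null) {
                    local = new LazyIndex();
                    instance = local;      // publish via the volatile write
                }
            }
        }
        return local;
    }

    int size() {
        return data.length;
    }
}
```

On JVMs that predate the revised memory model this idiom is not guaranteed to work, so plain synchronization remains the conservative choice there.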

BooleanWeight.normalize(float) doesn't normalize prohibited clauses?

2006-05-10 Thread Chris Hostetter
I'm looking into some of the issues with LUCENE-557 and it seems that a lot of them are triggered by the way BooleanWeight.normalize is implemented... public void normalize(float norm) { norm *= getBoost(); // incorporate boost for (int i = 0 ; i < weights.

Re: BooleanWeight.normalize(float) doesn't normalize prohibited clauses?

2006-05-11 Thread Chris Hostetter
: > Does anyone know why normalize ignores the prohibited clauses? was that : > just intended to be an optimization (save time calculating stuff for : > clauses we don't care about scoring in depth) ... ? : : A prohibited clause will never occur in any matching document, so it : will never need t

Re: problems calculating norms

2006-05-11 Thread Chris Hostetter
I'm really confused by your example ... I'm assuming eField is a Map.Entry, and eField.getKey() is returning a FieldInfo (although i'm not sure why there's no explicit cast in your code) ... but what is the return type of "eField.getValue()" ? Without understanding what that object is, i can onl

Re: BooleanWeight.normalize(float) doesn't normalize prohibited clauses?

2006-05-11 Thread Chris Hostetter
: If class Explanation would have a boolean attribute indicating whether : or not there was a match, the Explanation for BooleanQuery could : simply use this value from the Explanation of the prohibited clause. I've definitely thought about that a lot initially. But my gut reaction was to try an

Re: accelerate hits.id(i) function: eliminating scoring for the sake of efficiency

2006-05-11 Thread Chris Hostetter
: However what significantly slows us down is the hits.id(i) function. : Can we accelerate it somehow "cleaning" Lucene code itself from : scoring? you said in your last message... : We don't need any scoring in our application domain, but : efficiency is the key because we are getting tens

Re: BooleanWeight.normalize(float) doesn't normalize prohibited clauses?

2006-05-11 Thread Chris Hostetter
: >Boolean match = null; : : As for the thoughts question below: this java-dev, not c-dev :) i could not for the life of me understand this comment until i got to the end of your message... : null for false: long time no see... ...i'm not trying to use null for false, i'm using null to ind

Re: forbid empty field names?

2006-05-12 Thread Chris Hostetter
I'm curious: does the exception only occur if both the field and the value are empty? ... are the Field.Store and Field.Index options you listed necessary for this condition as well? Is it clear why this situation causes the exception? (I don't have any objection to rejecting blank field names

Jira Convention: Resolved vs Closed

2006-05-15 Thread Chris Hostetter
Is there a documented or unspoken policy about the "Resolved" vs "Closed" bug statuses? How/when should a resolved bug be closed? (In my experience policy has tended towards the person fixing the bug to "resolve" it, and the person who opened the bug to "close" once they've verified the fix -- b

SpanNotQuery.hashCode cut/paste error?

2006-05-16 Thread Chris Hostetter
SpanNotQuery's hashCode method makes two references to include.hashCode(), but none to exclude.hashCode() ... this is a mistake yes/no? -Hoss - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL
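To make the concern concrete, here is a minimal sketch of a hashCode that mixes in both sub-queries. It is not the Lucene implementation or the eventual fix: ToySpanNot is a hypothetical class, strings stand in for the include and exclude queries, and the rotate-then-xor mixing is just one reasonable choice.

```java
// Hypothetical simplified query; strings stand in for the two sub-queries.
final class ToySpanNot {
    private final String include;
    private final String exclude;

    ToySpanNot(String include, String exclude) {
        this.include = include;
        this.exclude = exclude;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ToySpanNot)) return false;
        ToySpanNot other = (ToySpanNot) o;
        return include.equals(other.include) && exclude.equals(other.exclude);
    }

    @Override
    public int hashCode() {
        int h = include.hashCode();
        h = Integer.rotateLeft(h, 1);     // keep include and exclude from cancelling out
        h ^= exclude.hashCode();          // exclude participates, unlike the reported code
        return h;
    }
}
```

A hashCode that uses include twice and never touches exclude still honours the equals/hashCode contract; the cost is that queries differing only in their exclude clause always collide.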

Re: OpenBitSet

2006-05-16 Thread Chris Hostetter
: I measured also on different densities, and it looks about the same. : When I find a few spare minutes will make one PerfTest that generates : gnuplot diagrams. Would be interesting to see how all key methods behave : as a function of density/size. I was thinking the same thing ... i just haven'

Re: weird behavior of IndexReader.indexExists()

2006-05-17 Thread Chris Hostetter
: I put Lock in IndexReader.indexExists function, and tested for a few days : It worked fine. I never had that mystery problem. : : How can I put the patch in a JIRA issue? Please take a look at the recently added FAQ "How do I contribute an improvement?"... http://wiki.apache.org/jakarta-lucene/L

RE: non indexed field searching?

2006-05-17 Thread Chris Hostetter
: I don't see anything related to searching using non-indexed fields. Could : you maybe point me at the class(es) that implement this functionality? I think Erik was referring more specifically to the statement... : > it is just : > very difficult to perform some complex queries efficiently without

Re: Lucene 2.0

2006-05-18 Thread Chris Hostetter
: Could someone enumerate what needs to be done before 2.0 is released. : From following this thread, it was stated that 2.0 was 1.9 with : deprecations removed. : Recently it appears to be becoming much more than that. I believe Doug's suggestion was to hold off just long enough to fix any egre

Re: Lucene 2.0

2006-05-18 Thread Chris Hostetter
: I wouldn't mind seeing 415 being fixed, but I seem to be missing a way one : changes "Fix Version". it's a property that can be changed from the edit screen .. but 415 is weird, there is no "Edit" link in the Operations nav (as opposed to every other LUCENE issue i've ever looked at ) -Hoss

Re: ParseException with escaped quotes in a phrase

2006-05-18 Thread Chris Hostetter
: Looks like QueryParser doesn't handle escaped quotes when inside a phrase: I believe you are correct. could you file a Jira issue for this, preferably with your main function converted to a JUnit test function that can be added to TestQueryParser? (it doesn't take much to write a JUnit test f

Re: Explaining a filter; Scorer extending Matcher; (was: BooleanWeight.normalize(float) doesn't normalize prohibited clauses?)

2006-05-22 Thread Chris Hostetter
: In case Explanation is also to explain what a Filter does, it would need to : have both a match flag and a score value. that's a good point, i hadn't considered the possibility of "explaining" filters much ... but there's no reason why the "value" of an explanation couldn't be an optional part
