corrupted index Lucene 4.4

2013-10-23 Thread Chris
Hi, I am running solr 4.4 & one of my collections seems to have a corrupted index... I tried doing - java -cp lucene-core-4.4.0.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /solr2/example/solr/w1/data/index/ -fix But it didn't help...gives - ERROR: could not read any segments

Re: corrupted index Lucene 4.4

2013-10-23 Thread Chris
less.com> wrote: > How did this corruption happen? > > If you "ls" your index directory, is there any segments_N file? > > Mike McCandless > > http://blog.mikemccandless.com > > > On Wed, Oct 23, 2013 at 9:01 AM, Chris wrote: > > Hi, > > >

Re: corrupted index Lucene 4.4

2013-10-23 Thread Chris
egments_N file, CheckIndex is unusable; a > readable segments_N file is currently necessary to recover anything > from the index. > > Mike McCandless > > http://blog.mikemccandless.com > > > On Wed, Oct 23, 2013 at 9:44 AM, Chris wrote: > > Hi Mike, > > >

Re: corrupted index Lucene 4.4

2013-10-23 Thread Chris
3 at 8:16 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Wed, Oct 23, 2013 at 10:33 AM, Chris wrote: > > I am not exactly sure if the commit() was run, as i am inserting each > row & > > doing a commit right away. My solr will not load the index >

Re: corrupted index Lucene 4.4

2013-10-23 Thread Chris
commit (e.g. maybe > every few hours or something). > > Also, be sure your IO system is "healthy" / does not disregard fsync, > and if the index is really important, back it up to a different > storage device every so often. > > Mike McCandless > > http://blog.mikemccan

Re: corrupted index Lucene 4.4

2013-10-23 Thread Chris
Hi Mike, Thanks, I have asked there also, they are investigating, will let you know if something turns up on that front :) On Thu, Oct 24, 2013 at 1:30 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > Hi Chris, > > Sorry, I don't know much about Solr cloud;

Re: corrupted index Lucene 4.4

2013-10-29 Thread Chris
pointers on how to resolve this one? I have seen that this occurs mostly for japanese chinese characters. Warm Regards, Chris On Thu, Oct 24, 2013 at 1:30 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > Hi Chris, > > Sorry, I don't know much about Solr cloud;

Re: Question about Boolean Operators

2008-01-01 Thread Chris
(new TermQuery(), BooleanClause.Occur.MUST_NOT); // for - you can find the information with lucene javadoc http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/ Classes with BooleanQuery and BooleanClause.Occur above Chris. Search Team in PCHOME co. in Taiwan. 2008/1/2

Re: A question about ParalellMultiSearcher and RMI

2008-04-19 Thread Chris
1.5~2G )... But if you only have 3G, I think 8G of memory is good enough. above Chris. PCHOME ,Search Team @ Taiwan 2008/4/19, 王建新 <[EMAIL PROTECTED]>: > > I want to use RAMDirectory to raise the performance of lucene. > So I cu

Re: How to add PageRank score with lucene's relevant score in sorting (with Paralle Index modify)

2008-05-29 Thread Chris
>> >> their > >> >> > pagerank scores. i give a query to it , every docs returned have a > >> >> > lucene-score, mark it as R (relevant score), and i also have its > >> >> > pagerank score, mark it as P, what i need is i want to sort the > >> search > >> >> > result base on the value "P+R". You know if i store the pagerank > >> score > >> >> in > >> >> > index and get it every search time , then compute P+R , then sort > it , > >> >> this > >> >> > way is too slow. in my system , when the search hits 50 result > , > >> the > >> >> > sort may cost about 20s. > >> >> > Sorry for my poor english. Anyone has a good idea? > >> >> > > >> >> > Best > >> >> > Jarvis > >> >> > > >> >> > >> > > >> > > > > > > -- > > - > -- Chris Lin [EMAIL PROTECTED] Taipei , Taiwan. ---

Re: batch indexing

2007-05-02 Thread Chris
the max docs and size with the temporary Indexing action? If I am wrong, please tell me. Thank you. = Chris Lin http://search20.portal20.com.tw [EMAIL PROTECTED] Taipei , Taiwan. --- 2007/4/29, Erick Erickson

Re: Improving Lucene Search Performance

2011-12-09 Thread Chris Hostetter
: Subject: Improving Lucene Search Performance : In-Reply-To: : : References: : <161fd7d0-e01f-42f2-a02a-a4e4b182c...@ebi.ac.uk><347A161B-6C7B-4DC3-ACD0-9A804E2 : dd...@ebi.ac.uk><007613f0-8529-47a3-95c4-7839e1d3e...@ebi.ac.uk> : https://people.apache.org/~hossman/#threadhijack Thr

Re: highlighter: how can I get locations of fragments?

2011-12-14 Thread Chris Hostetter
: Subject: highlighter: how can I get locations of fragments? : References: <4ee79b27.1010...@wyona.com> : In-Reply-To: <4ee79b27.1010...@wyona.com> https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not

RE: Boolean OR does not work as described

2012-01-03 Thread Chris Hostetter
: if you want to mix and/or in one query, always use parenthesis. The or better yet, train yourself not to use AND, OR and NOT... http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/ -Hoss - To unsubscribe

Re: Using dismax features in Lucene

2012-01-10 Thread Chris Hostetter
: The book said that dismax query was similar but different to : : DisjunctionMaxQuery the dismax *parser* in Solr is relatively simple, the majority of the code in it relates to parsing config options, reporting debugging, etc... if you wanted to do something similar in non-Solr java code m

Re: Unsubscribe failure

2012-01-13 Thread Chris Hostetter
If anyone is having problems unsubscribing from, or subscribing to, any apache mailing list, please contact the list moderators by adding "-owner" to the name of the list, ie... java-user-ow...@lucene.apache.org ...and avoid emailing the entire list with messages that 99.99% of the subscr

Re: When does Query Parser do its analysis ?

2012-02-01 Thread Chris Hostetter
: So it seems like it just broke the text up at spaces, and does text analysis : within getFieldQuery(), but how can it make the assumption that text should : only be broken at whitespace ? whitespace is a significant metacharacter to the Queryparser - it is used to distinguish multiple clauses

Re: Short circuit AND or subquerying in lucene for performance

2012-02-15 Thread Chris Hostetter
: Basically for queries such as field1:foo AND field2:*bar, I think it : would be highly beneficial to restrict evaluation of the second field on : the result of the first to avoid scanning the index in its entirety due : to the leading wildcard. This is exactly how the BooleanQuery class in Luce

RE: SweetSpotSimilarity

2012-02-15 Thread Chris Hostetter
: sloppyFreq(distance). hyperbolicTf() only comes into play if you : override the tf method in your own subclass to call it instead of the : baselineTf which it normally calls. I also didn't get what it was : trying to do. Correct, as documented... http://lucene.apache.org/core/old_versioned

RE: Short circuit AND or subquerying in lucene for performance

2012-02-16 Thread Chris Hostetter
: Is there a way to run a subquery in Lucene, i.e. running a query only on : the result of a first query to avoid scanning the whole index ? : Is is worth forwarding this request to the developers, do you think it : is feasible to implement such a short circuit operator where the term is : "late"

data extraction architecture

2012-02-23 Thread chris chisolm
I'm relatively new to this field and I have a problem that seems to be solvable in lots of different ways, and I'm looking for some recommendations on how to approach a data refining pipeline. I'm not sure where to look for this type of architecture description. My best finds so far have been som

RE: SweetSpotSimilarity

2012-02-28 Thread Chris Hostetter
ear how tweaking the settings affect the formula : Another problem mentioned in the e-mail thread Chris linked is "people : who know the 'sweetspot' of their data.", but I have yet to find a : definition of what is meant by "sweetspot", so I couldn't say whether

RE: SweetSpotSimilarity

2012-02-28 Thread Chris Hostetter
: i'll try to get some graphs commited and linked to from the javadocs that : make it more clear how tweaking the settings affect the formula http://svn.apache.org/viewvc?rev=1294920&view=rev -Hoss - To unsubscribe, e-mail:

RE: SweetSpotSimilarity

2012-03-05 Thread Chris Hostetter
: very small to occasionally very large. It also might be the case that : cover letters and e-mails while short might not be really something to : heavily discount. The lower discount range can be ignored by setting : the min of any sweet spot to 1. Then one starts to wonder if there is : r

Is Java 7 now safe with Lucene?

2012-03-06 Thread Chris Bamford
Hi there, Is Java7 now safe to use with Lucene? If so, is there a minimum Lucene version I must use with it? Thanks, - Chris

Re: Repeatability of results

2012-04-04 Thread Chris Hostetter
: OK this could make sense (floating point math is frustrating!). : : But, Lucene generally scores one document at a time, so in theory just : changing its docid shouldn't alter the order of float operations. i haven't thought this through, but couldn't scorer re-ordering in BooleanScorer2 poss

Memory question

2012-05-15 Thread Chris Bamford
that indexes are mapped into non-heap memory? If so, how can I monitor the space my process is using if I cache open IndexSearchers? The details are: Sun 64-bit JVM on Linux. Lucene 3.6 running in 2.3 compatibility mode (as we are in the process of a migration to 3.6) Thanks, - Chris

Re: RE: Memory question

2012-05-15 Thread Chris Bamford
overs. My server caches indexsearchers and then closes them based on how full the heap is getting. My worry is that if the bulk of the memory is being allocated outside the Jvm, how can I make sensible decisions? Thanks for any pointers / info. Chris -Original Message- Fr

Re: Memory question

2012-05-15 Thread Chris Bamford
n't sound right!) and am I right in thinking that it is some sort of monitoring code pulled into your server via a jar? (I'm confused why it would have its own GC cycle...) - So are you suggesting I play with my own JVM's (Sun/Oracle) parameters to achieve a simil

Re: Memory question

2012-05-16 Thread Chris Bamford
or apps that are sensitive (from a user >experience) from hanging during GC time. > >See http://docs.oracle.com/javase/6/docs/technotes/guides/vm/cms-6.html > >Best Regards > >Lutz > >-Original Message- >From: Chris Bamford [mailto:chris.bamf...@

Approches/semantics for arbitrarily combining boolean and proximity search operators?

2012-05-16 Thread Chris Harris
I'm working on a product for librarians and similar people, who apparently expect to be able to combine classic boolean operators (i.e. AND, OR, NOT) with proximity operators (especially w/n and pre/n -- which basically map to unordered and ordered SpanQueries with slop n, respectively) in unrestri

Re: Approches/semantics for arbitrarily combining boolean and proximity search operators?

2012-05-17 Thread Chris Harris
First impression is, that's a reasonably clever way to get the user intent basically right without having to add a new SpanQuery. Have you come up with any edge cases where it could do something unexpected? So far I've thought of one, though you could argue it has more to do with the "minimum/lazy

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Chris Hostetter
: the point is that I keep the readers open to share them across search. Is : this wrong? your goal is fine, but where in your code do you think you are doing that? I don't see any readers ever being shared. You open new ones (which are never closed) in every call to getSearcher() : > >
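The sharing pattern discussed above (open once, hand the same instance to every search) can be sketched in plain Java. This is only an illustration of the caching idea, not Lucene code: the class name SharedReaderCacheSketch, the opener function, and the path strings are all hypothetical, and a real version would also need to close readers when they are no longer needed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: one shared instance per index path, opened lazily.
final class SharedReaderCacheSketch<R> {
    private final Map<String, R> openByPath = new ConcurrentHashMap<>();
    private final Function<String, R> opener;

    SharedReaderCacheSketch(Function<String, R> opener) {
        this.opener = opener;
    }

    // Returns the already-open instance for this path, opening it only on first use.
    R get(String path) {
        return openByPath.computeIfAbsent(path, opener);
    }

    // Number of distinct paths currently held open.
    int openCount() {
        return openByPath.size();
    }

    public static void main(String[] args) {
        AtomicInteger opens = new AtomicInteger();
        // The "reader" here is a plain Object stand-in (an assumption for the demo).
        SharedReaderCacheSketch<Object> cache =
            new SharedReaderCacheSketch<>(path -> { opens.incrementAndGet(); return new Object(); });
        Object first = cache.get("/index/a");
        Object second = cache.get("/index/a");
        System.out.println(first == second); // same instance is reused
        System.out.println(opens.get());     // opener ran once
    }
}
```

The point of the sketch is that getSearcher()-style code should look up an existing instance rather than open a new one on every call.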

Re: Memory question

2012-05-21 Thread Chris Bamford
dings -- thanks again. So far so good - cheers to everyone for your valuable suggestions and insight. Chris - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-u

Re: Bizarre Search order request

2012-05-25 Thread Chris Lu
Nothing like this yet. But you don't need to do everything in one search request. You can send one search request to learn the match distribution for each document type, and then send 3 requests for 3 document types each. -- Chris Lu - Instant Scalable Full

Re: Bizarre Search order request

2012-05-25 Thread Chris Hostetter
: For example, if I display of 20 results, I might want to limit it to a : maximum of 10 "mail", 10 "blog" and 10 "website" documents. Which ones : get displayed and how they were ordered would depend on the normal : relevancy ranking, but, for example, once I had 10 "mail" objects to : displ
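The per-type limit described in the quoted question can be sketched as a post-processing step over an already relevance-sorted hit list. This is a minimal plain-Java illustration under stated assumptions, not a Lucene feature: the Hit class, the type labels, and the capPerType method are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: cap how many hits of each type appear on one page.
final class PerTypeCapSketch {
    /** A search hit: a document id plus its type label (e.g. "mail", "blog"). */
    static final class Hit {
        final String id;
        final String type;
        Hit(String id, String type) { this.id = id; this.type = type; }
    }

    /**
     * Walks hits in relevance order, keeps at most maxPerType of each type,
     * and stops once pageSize hits have been kept.
     */
    static List<String> capPerType(List<Hit> ranked, int pageSize, int maxPerType) {
        Map<String, Integer> keptPerType = new HashMap<>();
        List<String> page = new ArrayList<>();
        for (Hit hit : ranked) {
            if (page.size() >= pageSize) {
                break; // page is full
            }
            int alreadyKept = keptPerType.getOrDefault(hit.type, 0);
            if (alreadyKept >= maxPerType) {
                continue; // this type has used up its quota; skip to the next hit
            }
            keptPerType.put(hit.type, alreadyKept + 1);
            page.add(hit.id);
        }
        return page;
    }

    public static void main(String[] args) {
        List<Hit> ranked = new ArrayList<>();
        ranked.add(new Hit("m1", "mail"));
        ranked.add(new Hit("m2", "mail"));
        ranked.add(new Hit("b1", "blog"));
        ranked.add(new Hit("m3", "mail"));
        ranked.add(new Hit("w1", "website"));
        System.out.println(capPerType(ranked, 4, 2)); // prints [m1, m2, b1, w1]
    }
}
```

Because lower-ranked hits may be needed to fill the page once a type hits its cap, the caller would have to fetch more than pageSize hits from the search in the first place.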

Re: Approches/semantics for arbitrarily combining boolean and proximity search operators?

2012-05-25 Thread Chris Harris
ortions of the query, and considered only binary versions of and/or.) a not/n b means "a, not within n words of b". I don't think it can be implemented directly using existing SpanQueries, but I think it's probably easy to extend SpanQuery to do the job. On Wed, May 16, 201

Re: need to find locations of query hits in doc: works fine for regular text but not for phone numbers

2012-06-14 Thread Chris Hostetter
: Subject: need to find locations of query hits in doc: works fine for regular : text but not for phone numbers : Message-ID: : References: <1339635547170-3989548.p...@n3.nabble.com> : In-Reply-To: <1339635547170-3989548.p...@n3.nabble.com> https://people.apache.org/~hossman/#threadhijack Threa

zero sized cfs files in index lead to IOException: read past EOF

2012-06-19 Thread Chris Gioran
rse is the reason for the exception above. Here is the file listing: -rw-r--r-- 1 chris chris 36 Jun 6 16:42 _drr_1.del -rw-r--r-- 1 chris chris 47794 Jun 5 21:15 _drr.fdt -rw-r--r-- 1 chris chris 6476 Jun 5 21:15 _drr.fdx -rw-r--r-- 1 chris chris 23 Jun 5 21:15 _drr.fnm -rw-r--r-- 1 chris c

Re: zero sized cfs files in index lead to IOException: read past EOF

2012-06-19 Thread Chris Gioran
L 5. No, just happened twice, no clear pattern. Yes, the exception happened on site and afterwards the store was given to me - everything in there works but that index. Thank you for your response, i'll get back if i have more information. CG > Mike McCandless > > http://blog.mike

Any CommonGrams-inspired tricks to speed up other proximity query types?

2012-06-21 Thread Chris Harris
CommonGrams provides a neat trick for optimizing slow phrase queries that contain common words. (E.g. Hathi Trust has some data showing how effective this can be.) Unfortunately, it does nothing for other positi

RE: How to unsubscribe from this list?

2012-06-25 Thread Chris Hostetter
G.Long: I'm Replying to list so this info is visible to anyone who is curious, but if you have specific followup questions, please reply to java-user-owner@lucene ... : Thanks. I tried this but it did not work so asking :). 1) sending an unsubscribe request will trigger an automated response

Re: Mapping Lucene search results with a relational database

2012-07-03 Thread Chris Lu
Can you index the rule1 and rule2 fields into the documents, and when searching with the keywords, also append rule1:foo and rule2:bar to the query? Chris - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http

Re: change of API Javadoc interface funtionality in 4.0.x

2012-07-18 Thread Chris Hostetter
: What is the sense of removing the "Index" from the API Javadoc for Lucene and Solr? It was heavily bloating the size of the releases... https://issues.apache.org/jira/browse/LUCENE-3977 It's pretty easy to turn this back on and rebuild the docs locally. Feel free to open a jira and submit

[ANNOUNCE] Lucene/Solr @ ApacheCon Europe - August 13th Deadline for CFP and Travel Assistance applications

2012-08-06 Thread Chris Hostetter
ApacheCon Europe will be happening 5-8 November 2012 in Sinsheim, Germany at the Rhein-Neckar-Arena. Early bird tickets go on sale this Monday, 6 August. http://www.apachecon.eu/ The Lucene/Solr track is shaping up to be quite impressive this year, so make your plans to attend an

Seeking more moderators for java-user@lucene

2012-08-27 Thread Chris Hostetter
Greetings subscribers to java-user@lucene. I've been offline for the past ~5 days, and when i looked at my email again this morning I found a message to java-user@lucene sitting in the moderator queue since Aug 22nd. Messages sitting in the queue that long are a good indication that we don'

RE: Seeking more moderators for java-user@lucene

2012-08-28 Thread Chris Hostetter
: I have tried multiple times to unsubscribe, and it never works. Could you unsubscribe me? Anyone having trouble unsubscribing should read the help page on the wiki and follow the instructions there if they need more help... https://wiki.apache.org/solr/Unsubscribing%20from%20mailing%20lists

Re: ResourceLoader?

2012-08-29 Thread Chris Male
> use > > > it? > > > > > > > Where is it deprecated? What does the deprecation message say? > > > > -- > > lucidworks.com > > > > - > > To unsubscribe, e-mail: java

Re: Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method

2012-09-06 Thread Chris Hostetter
: Converted to U+000 by what, I wonder? Javadoc shouldn't be doing that. If : it does, I wonder if we need \\u instead? apparently it is... https://mail-archives.apache.org/mod_mbox/harmony-dev/200802.mbox/%3c47b2f7ae.2000...@gmail.com%3E -Hoss --

Re: Lucene 4.0 PerFieldAnalyzerWrapper question

2012-09-25 Thread Chris Male
nents function that specifies which tokenizer to use with which > field, though. It looks like the PerFieldAnalyzerWrapper itself assumes > that the same tokenizer will be used with all fields, as its wrapComponents > function ignores the fieldname parameter. I would appreciate any help

Re: Lucene 4.0 PerFieldAnalyzerWrapper question

2012-09-25 Thread Chris Male
ou've wrapped through getWrappedAnalyzer. You can avoid all this entirely of course by not extending Analyzer but instead just instantiating a PerFieldAnalyzerWrapper instance directly instead of your MyPerFieldAnalyzer. On Wed, Sep 26, 2012 at 12:25 PM, Mike O'Leary wrote: > Hi Chri

Re: Lucene 4.0 PerFieldAnalyzerWrapper question

2012-09-25 Thread Chris Male
Mike, On Wed, Sep 26, 2012 at 1:05 PM, Mike O'Leary wrote: > Hi Chris, > So if I change my analyzer to inherit from AnalyzerWrapper, I need to > define a getWrappedAnalyzer function and a wrapComponents function. I think > getWrappedAnalyzer is straightforward, but I don't

Re: short search terms

2012-09-26 Thread Chris Hostetter
: I have a key field that will only ever have a length of 3 characters. I am : using a StandardAnalyzer and a QueryParser to create the Query : (parser.parse(string)), and an IndexReader and IndexSearcher to execute the : query (searcher(query)). I can't seem to find a setter to allow for a 3 : ch

Re: Is there anything in Lucene 4.0 that provides 'absolute' scoring so that i can compare the scoring results of different searches ?

2012-10-25 Thread Chris Hostetter
https://wiki.apache.org/lucene-java/LuceneFAQ#Can_I_filter_by_score.3F https://wiki.apache.org/lucene-java/ScoresAsPercentages The fundamental problem of attempting to compare scores for different searches is the same in your situation as in the goal of trying to "normalize" scores to a fixed r

Re: Why QueryParser isn't in API?

2012-11-12 Thread Chris Male
> didn't find in api. And a simple example in docs still used the class. > And are there > another methods to replace it. Thx! > > Harry Yu -- Chris Male | Open Source Search Developer | elasticsearch | www.elasticsearch.com

Re: Question about ordering rule of SpanNearQuery

2012-11-21 Thread Chris Hostetter
: I am confused with the ordering rule about SpanNearQuery. For example, I : indicate the slop in SpanNearQuery is 10. And the results are all the : qualified documents. Is it true that any document with shorter distance ... : it still uses tf-idf algorithm to rank the docs. Or there is

Re: Which token filter can combine 2 terms into 1?

2012-12-21 Thread Chris Hostetter
: Unfortunately, no...I am not combine every two term into one. I am : combining a specific pair. I'm confused ... you've already said that you expect you will need a custom filter because your usecase is very special -- and you haven't given us many details about exactly when/why/how you want t

Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Chris Hostetter
: I keep getting an NPE when trying to add a Doc to an IndexWriter. I've : minimized my code to very basic code. what am I doing wrong? pseudo-code: can you post a full test that other people can run to try and reproduce? it doesn't even have to be a junit test -- just some complete Java code

Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Chris Hostetter
: thanks for your reply. please see attached. I tried to maintain the : structure of the code that I need to use in the library I'm building. I think : it should work for you as long as you remove the package declaration at the : top. I can't currently try your code, but skimming through it i'

Re: Large Index Query Help!

2013-01-29 Thread Chris Hostetter
: Subject: Large Index Query Help! : References: <1359429227142-4036943.p...@n3.nabble.com> https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh emai

Usage of ToParentBlockJoinCollector

2013-02-11 Thread Chris Bamford
some sample code somewhere I can refer to? Thanks! - Chris

More questions on BlockJoinQuery

2013-02-11 Thread Chris Bamford
ll search() twice (one returning TopDocs and the other GroupDocs via the Collector) and join them myself? Or does one of these calls return me both types of documents, grouped and sorted? I hope this makes sense. I'm happy to provide more detail if required. Thanks, - Chris

Re: More questions on BlockJoinQuery

2013-02-12 Thread Chris Bamford
t the parent objects (with td=s.search(q, 10)) or just the children (with the Collector), but not both! Am I to call search() twice (one returning TopDocs and the other GroupDocs via the Collector) and join them myself? Or does one of these calls return me both types of documents, grouped and sorted? I hope this makes sense. I'm happy to provide more detail if required. Thanks, - Chris

Re: ApacheCon meetup

2013-02-19 Thread Chris Hostetter
: Subject: ApacheCon meetup : : Any other Lucene/Solr enthusiasts attending ApacheCon in Portland next week? I won't make it to ApacheCon this year (first time in a long time actually) but I'm fairly certain there will be a Lucene MeetUp of some kind -- there always is. This is usually organi

Re: More questions on BlockJoinQuery

2013-02-20 Thread Chris Bamford
Thanks Mike. I have downloaded the source tarball for 4.1.0 and have tried to get it working, but am having a few problems getting it to fit with my environment (intelliJ / Maven). Where is the best forum to discuss such issues? Chris -Original Message- From: Michael McCandless

Re: More questions on BlockJoinQuery

2013-02-20 Thread Chris Bamford
ct "lucene". Total time: 0 seconds What have I done wrong? Thanks! - Chris -Original Message- From: Steve Rowe To: java-user@lucene.apache.org Sent: Wed, 20 Feb 2013 16:29 Subject: Re: More questions on BlockJoinQuery Hi Chris, This mailing list is fine for discussing Intel

RE: Searching for keywords .net,c#,...

2013-02-26 Thread Chris Hostetter
: which seems to override incrementToken() ( guess as I don't know java ) : however using lucene.net 3.0.3, I can override Lucene.Net is a completely separate project from Lucene, with its own APIs, release cycles, and user community. Your best bet at getting help from people who are familiar

Re: More questions on BlockJoinQuery

2013-02-28 Thread Chris Bamford
right-click / Run any of the unit tests. I am clearly missing a step or two, just not sure what! (My Project SDK is correctly set to java 1.6.) Please can someone tell me what I need to do... Thanks - Chris -Original Message- From: Steve Rowe To: java-user@lucene.apache.org

Re: Migrating SnowballAnalyzer to 4.1

2013-02-28 Thread Chris Hostetter
: Subject: Migrating SnowballAnalyzer to 4.1 : References: : : : In-Reply-To: : https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh emai

Loading lucene_solr_4_1_0 into IntelliJ

2013-03-05 Thread Chris Bamford
ml -rw-r--r-- 1 cbamford staff 510 5 Mar 14:45 vcs.xml -rw-r--r-- 1 cbamford staff 18895 5 Mar 14:45 workspace.xml which seems fine ?? I have also tried running other ant targets (test and generate-maven-artifacts), but to no avail. Is there a step or two I am

Re: Loading lucene_solr_4_1_0 into IntelliJ

2013-03-05 Thread Chris Bamford
Hi Steve, Turns out IntelliJ was all confused and just needed a restart. I can now run the tests I'm interested in :-D Thanks for all your help. Cheers, - Chris -Original Message- From: Steve Rowe To: java-user@lucene.apache.org Sent: Tue, 5 Mar 2013 15:23 Subjec

Multi-value fields in Lucene 4.1

2013-03-22 Thread Chris Bamford
index with term vectors and positions if it helps. Thanks, - Chris

Re: StandardAnalyzer class not present in Lucene 4.2.0

2013-03-25 Thread Chris Hostetter
: Thank you very much Arjen. I had to separately download and install the : jar. it was not present in my lucene installation directory. I had : downloaded the lucene zip file and ran the command "ant" after extracting : it. Did i miss anything.? if you download & build lucene from source, then

Re: Why does index boosting a field to 2.0f on a document have such a dramatic effect

2013-04-04 Thread Chris Hostetter
: At index time I boost the alias field of a small set of documents, setting the : boost to 2.0f, which I thought meant equivalent to doubling the score this doc : would get over another doc, everything else being equal. 1) you haven't shown us enough details to be certain, but based on the code

Re: ERROR help me please ,org.apache.lucene.search.IndexSearcher.(Ljava/lang/String;)V

2013-05-17 Thread Chris Hostetter
: Well IndexSearcher doesn't have a constructor that accepts a string, : maybe you should pass in an indexreader instead? specifically: the code you are trying to run was compiled against a version of lucene in which the IndexSearcher class had a constructor that accepted a single string argumen

Re: Read an solr index with two different lucene formats

2013-06-14 Thread Chris Hostetter
: I used solr to query the index, and verified that each document does have a : non-blank date field. I suspect that it's because the lucene-3.6 api I am : using can not read datefield correctly from documents written in lucene 1.4 : format. how did you verify that they all have a non-blank valu

Please Help solve problem of bad read performance in lucene 4.2.1

2013-07-07 Thread Chris Zhang
hi , Sorry to interrupt you, but I am really confused by the bad performance of lucene 4.2.1. Recently I migrated my project from lucene 3.0 to 4.2.1 . After simple tests I found that both indexing and reading performance of lucene 4 can not match the older version. Indexing code snippets are as

Re: Please Help solve problem of bad read performance in lucene 4.2.1

2013-07-07 Thread Chris Zhang
Thanks Adrien, In my project, almost all hit docs are supposed to be fetched for every query, that's why I am upset by the poor reading performance. Maybe I should store field values which are expected to be stored in high performance storage engine. In the above test case, time consuming of readi

Re: Please Help solve problem of bad read performance in lucene 4.2.1

2013-07-07 Thread Chris Zhang
ry performance in 4.x vs. 3.x? That's the true, proper > measure of Lucene and Solr performance. > > -- Jack Krupansky > > -Original Message- From: Chris Zhang > Sent: Sunday, July 07, 2013 12:26 PM > To: java-user@lucene.apache.org > Subject: Re: Please

ANNOUNCE: CFP Lucene/Solr Revolution EU 2013 (Deadline August 2nd)

2013-07-08 Thread Chris Hostetter
(NOTE: cross-posted to various lists, please reply only to general@lucene w/ any questions or follow ups) The Call for Papers for Lucene/Solr Revolution EU 2013 is currently open. http://www.lucenerevolution.org/2013/call-for-papers Lucene/Solr Revolution is the biggest open source conferen

Re: QueryParser for DisjunctionMaxQuery, et al.

2013-07-23 Thread Chris Hostetter
: Subject: QueryParser for DisjunctionMaxQuery, et al. : References: <1374578398714-4079673.p...@n3.nabble.com> : In-Reply-To: <1374578398714-4079673.p...@n3.nabble.com> https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing

ANNOUNCE: Lucene/Solr Revolution EU 2013: Registration & Community Voting

2013-08-26 Thread Chris Hostetter
(NOTE: cross-posted to various lists, please reply only to general@lucene w/ any questions or follow ups) 2 Announcements folks should be aware of regarding the upcoming Lucene/Solr Revolution EU 2013 in Dublin... # 1) Registration Now Open Registration is now open for Lucene/Solr Revolu

Re: is there some dangerous bug in lucene?

2010-05-11 Thread Chris Lu
If you are using field cache for field A, and updating field A, isn't it normal that the field A is not updated? Field cache is keyed via index reader, it won't be efficient to reload the field cache for each updateDocument(). -- Chris Lu - Instant Scalable

Re: Will doc ids ever change if nothing is deleted?

2010-05-14 Thread Chris Harris
; deleted, and if (per my original question) only deletions would trigger > renumbering, then the doc ids from a search result could be used on an index > with a newer version. > > Thanks, > Chris > > On Thu, May 13, 2010 at 9:51 PM, Erick Erickson > wrote: > >> Why do

Re: Will doc ids ever change if nothing is deleted?

2010-05-14 Thread Chris Lu
documents are added, the id is N+1. Of course, if some documents from other segments are merged. The documents in one segment will "lose" its doc id. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net

Re: Sorting and Empty (non-existing) Fields

2010-05-19 Thread Chris Hostetter
: Now I want to search something on the first field and want the results : sorted by relevance, then by the first field, then by the second field. first off: if your primary sort is on relevancy, there are going to be very few cases where your secondary sort comes into play -- the scoring form

Re: Test File locks

2010-05-26 Thread Chris Hostetter
It would be helpful to know: 1) what version of Lucene you are using 2) what exactly line 167 of LibraryBuilder looks like (ie: what options are you using when instantiating the IndexWriter) 3) what filesystems are in use on each of the two different machines you are using. 4) does it really say

RE: Docs with any score are collected in the Collector implementations

2010-06-02 Thread Chris Hostetter
: Thanks, have overseen this implementation. How to get solr configured to : use this wrapper collector? Or is this the wrong mailing list for this : question? :) : : As far as I read the solr code it is not meant to configure the collectors at all without touching the code... correct ... Col

RE: Docs with any score are collected in the Collector implementations

2010-06-02 Thread Chris Hostetter
: that's probably because I move from lucene to solr. : : We will need to filter them from the result manually then first. Can you explain why? ... in particular, can you explain what types of queries you have that produce negative scores for matches, but where you don't want to see those matc

[ANN] Free Webinar: June 24: How Cisco uses Lucene/Solr w/ Social Networks

2010-06-17 Thread Chris Hostetter
(cross posted announcement, please keep any replies to gene...@lucene) On behalf of Lucid Imagination, I'd like to invite folks to a free Webinar we're hosting on June 24th... How Cisco’s Pulse uses Lucene/Solr to put Social Networks to Work Thursday, June 24, 2010 9am

Re: Inserting data from multiple databases in same index

2010-07-22 Thread Chris Lu
several boxes and achieve sharded search. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title

Re: Databases

2010-07-23 Thread Chris Lu
-time data import. Or you would have to put a hook in your program to write new content to the index. Anyway, you can get it to work, but maybe not as simply as you expected. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net

RE: Term browsing much slower in Lucene 3.x.x

2010-07-29 Thread Chris Hostetter
: > My other question is whether there are planned performance : > enhancements to address this loss of performance? : : These APIs are very different in the next major release (4.0) of : Lucene, so except for problems spotted by users like you, there's not : much more dev happening against them

Re: Lucene applicability

2010-08-25 Thread Chris Lu
uld need a mechanism to get prepared and rebuild the index when you need to. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com

Re: Combine data from index and db before sorting and pagination

2010-09-01 Thread Chris Lu
"category_2", take doc5 and doc10 for example, after all the reindexing effort, the only change is: "category_1": doc1,doc2. "category_2": doc3,doc4,doc5,doc7,doc8,doc10. Of course, to support this efficiently could be a big change, affecting all the nice

Re: Federated search with opensearch or proprietary APIs for Atlassian

2010-09-02 Thread Chris Lu
more flexible with the structure, even dealing with data beyond Atlassian products. I guess that's the reason Google did not rely on each website's own search mechanism. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.d

Re: does lucene support Database full text search

2010-09-10 Thread Chris Lu
Lucene does not support database directly. You need to pump data into Lucene. You can use DBSight, which has a built-in high performance crawler for any databases. It also has integrated Chinese analyzers, including IKAnalyzer, which is the best one I found so far. -- Chris Lu

RE: Unexpected Results - using should and must in boolean query

2010-09-17 Thread Chris Hostetter
: If you have some MUST terms, but you also want to have at least one of a : list of other terms (like 5 SHOULD clauses), the trick is to separate both: : Create a BooleanQuery with 2 MUST clauses, one is your required TermQuery : and the second clause is itself a BooleanQuery with all the SHOULD
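The nesting trick quoted above (required clauses on the outside, the optional terms grouped into one inner clause that is itself required) can be illustrated without Lucene. The following is a plain-Java sketch of the matching rule only, assuming a document is modeled as a set of terms; the class and method names are hypothetical and none of this is Lucene API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of "all required terms AND at least one of the optional terms".
final class NestedBooleanSketch {
    /** True when the document has every required term and at least one optional term. */
    static boolean matches(Set<String> docTerms, List<String> required, List<String> anyOf) {
        for (String term : required) {
            if (!docTerms.contains(term)) {
                return false; // an outer required clause failed
            }
        }
        for (String term : anyOf) {
            if (docTerms.contains(term)) {
                return true; // the inner optional group is satisfied by one hit
            }
        }
        return false; // no optional term present, so the required inner group fails
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("lucene", "index", "merge"));
        // Required term present and one optional term present.
        System.out.println(matches(doc, Arrays.asList("lucene"), Arrays.asList("merge", "commit")));
        // Required term present but none of the optional terms.
        System.out.println(matches(doc, Arrays.asList("lucene"), Arrays.asList("commit", "fsync")));
    }
}
```

Without the grouping, optional clauses sitting next to a required clause would only influence ranking, which is why the inner group has to be required as a whole.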

Re: High frequency term for the searched query

2010-11-04 Thread Chris Lu
After you get the query object, you can use IndexSearcher's function docFreq(), like this final Set<Term> terms = new HashSet<Term>(); query = searcher.rewrite(query); query.extractTerms(terms); for(Term t : terms){ int frequency = irs.getSearcher().docFreq(t); } -- -- Chr

Re: High frequency term for the searched query

2010-11-04 Thread Chris Lu
After you get the query object, you can use IndexSearcher's function docFreq(), like this final Set<Term> terms = new HashSet<Term>(); query = searcher.rewrite(query); query.extractTerms(terms); for(Term t : terms){ int frequency = searcher.docFreq(t); } -- -- Chris Lu - In
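What the loop above computes, per query term, is a document frequency: the number of documents that contain the term. A minimal plain-Java sketch of that quantity follows, assuming each document is modeled as a set of terms; DocFreqSketch and its method are hypothetical names and this is not how Lucene stores or looks up the value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of document frequency over an in-memory collection.
final class DocFreqSketch {
    /** For each query term, counts how many documents contain it at least once. */
    static Map<String, Integer> docFreq(List<Set<String>> docs, List<String> queryTerms) {
        Map<String, Integer> frequencies = new LinkedHashMap<>();
        for (String term : queryTerms) {
            int count = 0;
            for (Set<String> doc : docs) {
                if (doc.contains(term)) {
                    count++; // counted once per document, however often it occurs
                }
            }
            frequencies.put(term, count);
        }
        return frequencies;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = new ArrayList<>();
        docs.add(new HashSet<>(Arrays.asList("lucene", "index")));
        docs.add(new HashSet<>(Arrays.asList("lucene", "search")));
        docs.add(new HashSet<>(Arrays.asList("solr", "search")));
        System.out.println(docFreq(docs, Arrays.asList("lucene", "search", "merge")));
    }
}
```

Picking the highest-frequency term of the query is then a matter of taking the entry with the largest count.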
