background merge hit exception

2009-01-02 Thread Brian Whitman
I am getting this on a 10GB index (via solr 1.3) during an optimize:
Jan 2, 2009 6:51:52 PM org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: background merge hit exception: _ks4:C2504982 _oaw:C514635 _tll:C827949 _tdx:C18372 _te8:C19929 _tej:C22201 _1agw:C1717926 _1agz:C1 into

Re: background merge hit exception

2009-01-02 Thread Brian Whitman
there any other
> exceptions prior to this one, or, any previous problems with the OS/IO
> system?
>
> Can you run CheckIndex (java org.apache.lucene.index.CheckIndex to see
> usage) and post the output?
> Mike
>
> Brian Whitman wrote:
>
> > I am getting this on a 10

Re: background merge hit exception

2009-01-02 Thread Brian Whitman
e [-fix was not specified]
On Fri, Jan 2, 2009 at 3:47 PM, Brian Whitman wrote:
> I will but I bet I can guess what happened -- this index has many
> duplicates in it as well (same uniqueKey id multiple times) - this happened
> to us once before and it was because the solr server went down

Re: background merge hit exception

2009-01-02 Thread Brian Whitman
s the same error.
On Fri, Jan 2, 2009 at 5:26 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
> Also, this (Solr server going down during an add) should not be able to
> cause this kind of corruption.
> Mike
>
> Yonik Seeley wrote:
>
> > On Fri, Jan 2, 2009

Re: background merge hit exception

2009-01-03 Thread Brian Whitman
> It's very strange that CheckIndex -fix did not resolve the issue. After
> fixing it, if you re-run CheckIndex on the index do you still see that
> original one broken segment present? CheckIndex should have removed
> reference to that one segment.
>
I just ran it again, and it detected the

solr / lucene engineering positions in Boston, MA USA @ the Echo Nest

2010-09-10 Thread Brian Whitman
Hi all, brief message to let you know that we're in heavy hire mode at the Echo Nest. As many of you know we are very heavy solr/lucene users (~1bn documents across many many servers) and a lot of our staff have been working with and contributing to the projects over the years. We are a "music inte

searching by field's TF vector (not MoreLikeThis)

2007-02-01 Thread Brian Whitman
I'm looking for a way to search by a field's internal TF vector representation. MoreLikeThis does not seem to be what I want-- it constructs a text query based on the top scoring TF-IDF terms. I want to query by TF vector directly, bypassing the tokens. Lucene understandably has knowledge
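To make the distinction in the question concrete, here is a minimal sketch of what ranking by a raw term-frequency vector could look like, as opposed to building a text query from top TF-IDF terms. It is plain Python and does not use or imitate any Lucene API; the function names (`cosine`, `rank_by_tf`) and the choice of cosine similarity as the comparison are assumptions made only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse TF vectors (term -> frequency)."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_tf(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by similarity to query_vec.

    doc_vecs maps a hypothetical document id to its TF vector.
    """
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Here the query is itself a vector of term counts, so no tokenizing or query parsing is involved; a real index would need access to stored term vectors to do the same.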

Re: searching by field's TF vector (not MoreLikeThis)

2007-02-03 Thread Brian Whitman
On Feb 1, 2007, at 7:13 PM, Brian Whitman wrote: I'm looking for a way to search by a field's internal TF vector representation. MoreLikeThis does not seem to be what I want-- it constructs a text query based on the top scoring TF-IDF terms. I want to query by TF vecto

copying fields between documents in different indexes

2007-02-16 Thread Brian Whitman
Using the lucene API, is there a way to copy the contents and parameters of fields between documents in different indexes? Without requiring the field to be stored or needing to pass around the fulltext contents of the field. I guess I am looking for doc.add(new Field("contentsNew", copyFr