Modifying a tokenized field entry

2009-08-19 Thread Matt Dufrasne
Let's assume I have an index structured as below:

name    species    color
max     cat        grey
sam     dog        brown
lucy    cat        white
.       .          .
.       .          .
poe     dog        blond
joe     cat        red
pam     dog        brown



The species and color fields are tokenized, indexed and stored.

Now let's assume that I want to change the term cat to feline and dog to canine.

From what I have been reading, I would have to delete each Document (row) and 
re-add it with the new term.
Since the original cat and dog terms are indexed, tokenized and stored, it 
seems like there should be a way to update just the terms cat and dog to their 
new values.

Is there already a way to do this? Did I just miss it?
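
A minimal sketch of the delete-and-re-add approach described above, written against a plain in-memory map rather than a real index so that it stays self-contained. The Row class, the rename helper and the use of name as a unique key are assumptions made for illustration only. With Lucene itself, the replace step would typically be IndexWriter.updateDocument(Term, Document), which deletes the documents matching the term and then adds the new one; that relies on the fields being stored so the old values can be read back.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class RenameTermsSketch {

    /** One stored document (row): name, species, color. */
    static final class Row {
        final String name;
        final String species;
        final String color;

        Row(String name, String species, String color) {
            this.name = name;
            this.species = species;
            this.color = color;
        }
    }

    /**
     * Rebuilds every row whose species has a replacement and stores the
     * rebuilt row under the same key. Returns the number of rows changed.
     * With a real IndexWriter, setValue() would instead be an
     * updateDocument(new Term("name", old.name), newDoc) call.
     */
    static int rename(Map<String, Row> index, Map<String, String> replacements) {
        int changed = 0;
        for (Map.Entry<String, Row> entry : index.entrySet()) {
            Row old = entry.getValue();
            String newSpecies = replacements.get(old.species);
            if (newSpecies != null) {
                entry.setValue(new Row(old.name, newSpecies, old.color));
                changed++;
            }
        }
        return changed;
    }

    /** The six rows that are visible in the example table. */
    static Map<String, Row> sampleIndex() {
        Map<String, Row> index = new LinkedHashMap<>();
        index.put("max", new Row("max", "cat", "grey"));
        index.put("sam", new Row("sam", "dog", "brown"));
        index.put("lucy", new Row("lucy", "cat", "white"));
        index.put("poe", new Row("poe", "dog", "blond"));
        index.put("joe", new Row("joe", "cat", "red"));
        index.put("pam", new Row("pam", "dog", "brown"));
        return index;
    }

    public static void main(String[] args) {
        Map<String, Row> index = sampleIndex();
        Map<String, String> replacements = new HashMap<>();
        replacements.put("cat", "feline");
        replacements.put("dog", "canine");

        int changed = rename(index, replacements);
        System.out.println("rows rewritten: " + changed);
        for (Row row : index.values()) {
            System.out.println(row.name + " " + row.species + " " + row.color);
        }
    }
}
```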


Re: Is there a way to check for field uniqueness when indexing?

2009-08-19 Thread Daniel Shane

But in that case, I assume Solr does a commit per document added.

Let's say I wanted to index a collection of 1 million pages. Would it 
take much longer if I committed at each insertion rather than committing at 
the end?
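
For context, a commit is normally the expensive step, since it has to flush and sync the index files, so committing once per document is usually much slower than committing in batches or once at the end. The sketch below only counts commits to make the difference concrete; CountingWriter and indexAll are hypothetical stand-ins, not Lucene or Solr classes, and the batch size of 10000 is an arbitrary assumption.

```java
class BatchCommitSketch {

    /** Hypothetical stand-in for an index writer; it only counts calls. */
    static final class CountingWriter {
        int added = 0;
        int commits = 0;

        void add(String doc) {
            added++;
        }

        void commit() {
            commits++;
        }
    }

    /**
     * Adds numDocs documents, committing after every batchSize additions
     * and once more at the end if anything is still uncommitted.
     */
    static void indexAll(CountingWriter writer, int numDocs, int batchSize) {
        int pending = 0;
        for (int i = 0; i < numDocs; i++) {
            writer.add("page-" + i);
            pending++;
            if (pending == batchSize) {
                writer.commit();
                pending = 0;
            }
        }
        if (pending > 0) {
            writer.commit();
        }
    }

    public static void main(String[] args) {
        CountingWriter perDocument = new CountingWriter();
        indexAll(perDocument, 1000000, 1);

        CountingWriter batched = new CountingWriter();
        indexAll(batched, 1000000, 10000);

        System.out.println("commits, one per document: " + perDocument.commits);
        System.out.println("commits, batches of 10000: " + batched.commits);
    }
}
```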


Daniel Shane

Grant Ingersoll wrote:



On Aug 13, 2009, at 10:33 AM, Daniel Shane wrote:


Does anyone have an idea on how I could check an index that is in the 
process of being indexed (things added, things deleted) for the 
uniqueness of a given field *at the time I index a document*?



Solr has de-duplication built-in at indexing time: 
http://wiki.apache.org/solr/Deduplication


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) 
using Solr/Lucene:

http://www.lucidimagination.com/search


-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org






custom scorer

2009-08-19 Thread Chris Salem
Hello,
I'm trying to write a custom scorer that only uses the term frequency function 
from the DefaultSimilarity class. The problem is that documents with lower 
frequencies are returned with higher scores than documents with higher 
frequencies.  Here's the code:
searcher.setSimilarity(new DefaultSimilarity() {
    // Every factor except tf() is forced to 1 so that only the
    // term frequency should influence the score.
    public float lengthNorm(String field, int numTerms) {
        return 1;
    }
    public float idf(int docFreq, int numDocs) {
        return 1;
    }
    public float coord(int overlap, int maxoverlap) {
        return 1;
    }
    public float queryNorm(float sumOfSquaredWeights) {
        return 1;
    }
    public float sloppyFreq(int distance) {
        return 1;
    }
});
Any idea why this wouldn't be working?
Sincerely,
Chris Salem 


Re: custom scorer

2009-08-19 Thread Grant Ingersoll

Are you setting the Similarity before indexing, too, on the IndexWriter?
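
The question matters because lengthNorm is applied when a document is indexed and the result is stored in the field's norms, so overriding it only on the searcher probably does not undo the default length normalization already written to the index. Below is a rough model of that effect, assuming the default tf of sqrt(freq) and the default lengthNorm of 1/sqrt(numTerms), and ignoring boosts and the lossy one-byte encoding of norms. The class and method names are made up for illustration.

```java
class NormSketch {

    /** Default term-frequency factor: sqrt(freq). */
    static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    /** Default length norm, computed at index time: 1/sqrt(numTerms). */
    static float indexTimeNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /**
     * Simplified score with idf, coord and queryNorm forced to 1 at search
     * time: only tf and the norm already stored in the index remain.
     */
    static float score(int freq, int numTerms) {
        return tf(freq) * indexTimeNorm(numTerms);
    }

    public static void main(String[] args) {
        // A long document with four matches versus a short one with one match.
        float longDoc = score(4, 400);
        float shortDoc = score(1, 4);
        System.out.println("freq=4, 400 terms: " + longDoc);
        System.out.println("freq=1,   4 terms: " + shortDoc);
        System.out.println("short document ranks first: " + (shortDoc > longDoc));
    }
}
```

If this is what is happening, the same Similarity would need to be set on the IndexWriter as well, and the documents re-indexed, before the overridden lengthNorm has any effect.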


