Sean,

I don't understand why the idField stuff is removed from the Lucene stuff.   It 
appears to have been dropped for MAHOUT-379 (rev 936183) and then commented out 
on this commit, but this is pretty important functionality for people coming 
from Lucene.  Without it, one has no way of mapping the vectors back to the 
original documents.  It's one thing to change how we use vector labels; it's 
another to completely remove the functionality.

It appears that we need to switch to using the NamedVector when idField is not 
null.
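
To make that concrete, here is a rough sketch of what I have in mind. It is 
not the actual Mahout code: SimpleVector, DenseSimpleVector, NamedSimpleVector 
and IdFieldLabeling are hypothetical stand-ins I made up so the snippet 
compiles on its own. In the real iterator the wrapper would presumably be 
NamedVector, and the id value would come from the stored idField of the 
Lucene document.

```java
// Hypothetical stand-in for Mahout's Vector, reduced to what the sketch needs.
interface SimpleVector {
  double get(int index);
  int size();
}

// Hypothetical dense implementation, only here so the sketch is self-contained.
final class DenseSimpleVector implements SimpleVector {
  private final double[] values;

  DenseSimpleVector(double... values) {
    this.values = values.clone();
  }

  @Override
  public double get(int index) {
    return values[index];
  }

  @Override
  public int size() {
    return values.length;
  }
}

// Hypothetical stand-in for a named wrapper: delegates to the wrapped vector
// and carries the document id as its name.
final class NamedSimpleVector implements SimpleVector {
  private final SimpleVector delegate;
  private final String name;

  NamedSimpleVector(SimpleVector delegate, String name) {
    this.delegate = delegate;
    this.name = name;
  }

  String getName() {
    return name;
  }

  @Override
  public double get(int index) {
    return delegate.get(index);
  }

  @Override
  public int size() {
    return delegate.size();
  }
}

// Hypothetical helper: wrap the vector only when an idField was configured
// and the document actually has a value for it.
final class IdFieldLabeling {
  private IdFieldLabeling() {
  }

  static SimpleVector label(SimpleVector vector, String idField, String idValue) {
    if (idField == null || idValue == null) {
      return vector; // no id available, return the vector unchanged
    }
    return new NamedSimpleVector(vector, idValue);
  }
}
```

Callers that pass a null idField get plain vectors back, so nothing changes 
for them; callers that do pass one can read the name to get back to the 
original document.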

-Grant

Begin forwarded message:

> From: sro...@apache.org
> Date: May 27, 2010 2:02:23 PM EDT
> To: comm...@mahout.apache.org
> Subject: svn commit: r948935 [3/3] - in /mahout/trunk: 
> buildtools/src/main/resources/ 
> core/src/main/java/org/apache/mahout/cf/taste/eval/ 
> core/src/main/java/org/apache/mahout/cf/taste/hadoop/ 
> core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/ core/src/m...
> Reply-To: d...@mahout.apache.org
> 
> Modified: mahout/trunk/utils/src/main/java/org/apache/mahout/utils/vectors/lucene/LuceneIterable.java
> URL: http://svn.apache.org/viewvc/mahout/trunk/utils/src/main/java/org/apache/mahout/utils/vectors/lucene/LuceneIterable.java?rev=948935&r1=948934&r2=948935&view=diff
> ==============================================================================
> --- mahout/trunk/utils/src/main/java/org/apache/mahout/utils/vectors/lucene/LuceneIterable.java (original)
> +++ mahout/trunk/utils/src/main/java/org/apache/mahout/utils/vectors/lucene/LuceneIterable.java Thu May 27 18:02:20 2010
> @@ -32,17 +32,17 @@ import org.apache.mahout.math.Vector;
>  * {@link Vector}. The Field used to create the Vector currently must have Term Vectors stored for it.
>  */
> public class LuceneIterable implements Iterable<Vector> {
> -  
> +
> +  public static final double NO_NORMALIZING = -1.0;
> +
>   private final IndexReader indexReader;
>   private final String field;
> -  private final String idField;
> -  private final FieldSelector idFieldSelector;
> +  //private final String idField;
> +  //private final FieldSelector idFieldSelector;
> 
>   private final VectorMapper mapper;
>   private double normPower = NO_NORMALIZING;
> -  
> -  public static final double NO_NORMALIZING = -1.0;
> -  
> +
>   public LuceneIterable(IndexReader reader, String idField, String field, VectorMapper mapper) {
>     this(reader, idField, field, mapper, NO_NORMALIZING);
>   }
> @@ -70,9 +70,9 @@ public class LuceneIterable implements I
>     if (normPower != NO_NORMALIZING && normPower < 0) {
>       throw new IllegalArgumentException("normPower must either be -1 or >= 0");
>     }
> -    idFieldSelector = new SetBasedFieldSelector(Collections.singleton(idField), Collections.<String>emptySet());
> +    //idFieldSelector = new SetBasedFieldSelector(Collections.singleton(idField), Collections.<String>emptySet());
>     this.indexReader = reader;
> -    this.idField = idField;
> +    //this.idField = idField;
>     this.field = field;
>     this.mapper = mapper;
>     this.normPower = normPower;


