Hi Jeff,
This is still not clear to me. If the name represents the ID of the
document that the vector was built from, then we could probably have a
wrapper class that holds both the vector and the name, rather than having
the name as part of AbstractVector. I might be missing something here.
Kindly clarify.
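To make the suggestion concrete, here is a minimal sketch of such a wrapper. The class name LabeledVector and its accessors are hypothetical, and a plain double[] stands in for the real vector type so that the example compiles without Mahout on the classpath.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical wrapper for illustration only. A double[] stands in for
// the wrapped vector; the name is kept outside the vector itself.
final class LabeledVector {
    private final String name;      // e.g. a document ID or a term; may be null
    private final double[] values;  // stand-in for the wrapped vector

    LabeledVector(String name, double[] values) {
        this.name = name;
        this.values = values.clone(); // defensive copy
    }

    String getName() {
        return name;
    }

    double[] getValues() {
        return values.clone();
    }

    // Equality is decided on the fields directly, so a null name and an
    // empty name are treated as different labels.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof LabeledVector)) {
            return false;
        }
        LabeledVector other = (LabeledVector) o;
        return Objects.equals(name, other.name)
            && Arrays.equals(values, other.values);
    }

    @Override
    public int hashCode() {
        return 31 * Objects.hashCode(name) + Arrays.hashCode(values);
    }
}
```

With this layout the vector class itself would carry no name, so vectors that never need a label would not have to deal with a null or empty one.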
Thanks
Pallavi
On 03/17/2010 06:44 PM, Jeff Eastman wrote:
Pallavi Palleti wrote:
Hi,
Could someone kindly let me know the significance of the instance
variable "name" in AbstractVector? It causes problems when I write a
vector to a file, read it back, and compare it with the original vector
if the value of "name" is null. While writing to the file, "name" is set
to an empty string if it is null. So the vector read from the file has a
different value (not null), asFormatString returns different values for
the two vectors, and the comparison concludes that they are different.
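The round trip described above can be reproduced without Mahout. The following is only a sketch: NameRoundTrip and its methods are hypothetical names, the "lossy" pair imitates the null-to-empty-string behaviour, and the "null-safe" pair shows one possible fix (a presence flag), which is an assumption and not the actual Mahout code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-alone helper; it only serializes the name field.
final class NameRoundTrip {

    // Imitates the reported behaviour: a null name is written as "".
    static byte[] writeLossy(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(name == null ? "" : name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static String readLossy(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One possible fix: write a presence flag so that null survives.
    static byte[] writeNullSafe(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBoolean(name != null);
            if (name != null) {
                out.writeUTF(name);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static String readNullSafe(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readBoolean() ? in.readUTF() : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The lossy pair turns null into "", so the name changes.
        System.out.println(readLossy(writeLossy(null)) == null);       // prints false
        // The flagged pair preserves null.
        System.out.println(readNullSafe(writeNullSafe(null)) == null); // prints true
    }
}
```

Any comparison that includes the name, such as one based on asFormatString, would see the original and the re-read vector as different in the lossy case.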
Thanks
Pallavi
The "name" instance variable was added in MAHOUT-65 along with the
"labelBindings" feature so that e.g. a term vector can retain its term
in its state. I guess the problem you are seeing is an interaction
between the vector Writable implementation - which incorrectly handles
null - and the JSON produced by asFormatString. <rant> I've said this
before and, not to belabor the point, using a JSON encoding to compare
vectors for equality has a host of related problems, most recently
with lazy lengthSquared. If Vector implemented Printable instead, then
asFormatString(bindings) could probably be crafted to eliminate these
problems and be usable for such comparisons. </rant>
Jeff