Hi,
I've run into something very strange and can't figure out what's wrong.
I use FileDataModel to read a dataset from disk:
DataModel model = new FileDataModel(new File("./data/all_data.data"));
int numUsers = model.getNumUsers();
On one machine it produces this output:
15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Creating FileDataModel for file .\data\all_data.data
15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 100000
15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 943 users
15-Dec-2009 14:29:33 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 943 users
This is correct.
On another machine it seems to read a second file as well. It gives me
this output:
15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Creating FileDataModel for file .\data\all_data.data
15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 100000
15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
15-Dec-2009 14:35:15 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 1000000 lines
15-Dec-2009 14:35:15 org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 1000209
15-Dec-2009 14:35:17 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 6040 users
I have two datasets, but for some reason the second machine also reads the
larger one (1000209 lines, 6040 users) from somewhere, even though my code
only points at all_data.data.
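To narrow this down, here is a small check I can run on both machines. It is only a sketch under my own assumptions: the class name DataDirCheck and the method filesSharingPrefix are made up for illustration, and the idea that other files in the same directory with a similar name might be involved is just a guess on my part. It prints where the relative path actually resolves to and which files next to it share the part of the name before the first '.'.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DataDirCheck {

    // Hypothetical helper: returns the sorted names of all files in dataFile's
    // directory whose names start with dataFile's name up to its first '.'.
    static List<String> filesSharingPrefix(File dataFile) {
        File dir = dataFile.getAbsoluteFile().getParentFile();
        String name = dataFile.getName();
        int dot = name.indexOf('.');
        String prefix = dot < 0 ? name : name.substring(0, dot);

        List<String> matches = new ArrayList<>();
        // list() returns null if dir is not a readable directory
        String[] names = (dir == null) ? null : dir.list();
        if (names != null) {
            Arrays.sort(names);
            for (String n : names) {
                if (n.startsWith(prefix)) {
                    matches.add(n);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Same relative path as in my code above
        File dataFile = new File("./data/all_data.data");
        System.out.println("Working directory: " + System.getProperty("user.dir"));
        System.out.println("Resolved path:     " + dataFile.getAbsolutePath());
        System.out.println("Exists: " + dataFile.exists()
                + ", size in bytes: " + dataFile.length());
        System.out.println("Files sharing the prefix: " + filesSharingPrefix(dataFile));
    }
}
```

If the second machine prints a different resolved path, or lists more files than the first one, that would at least tell me where the extra data comes from.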
Thanks a lot.