SparseVectorsFromSequenceFiles StandardAnalyzer ClassNotFoundException issue

2014-06-03 Thread Terry Blankers
Hello, can anyone please give me a clue as to what I may be missing here? I'm trying to run a SparseVectorsFromSequenceFiles job via ToolRunner from a Java project and I'm getting the following exception: Error: java.lang.ClassNotFoundException:

Re: SparseVectorsFromSequenceFiles StandardAnalyzer ClassNotFoundException issue

2014-06-03 Thread Terry Blankers
, Terry Blankers te...@amritanet.com wrote: Hello, can anyone please give me a clue as to what I may be missing here? I'm trying to run a SparseVectorsFromSequenceFiles job via ToolRunner from a Java project and I'm getting the following exception: Error: java.lang.ClassNotFoundException

Re: clusterdump samplePoints parameter

2014-04-18 Thread Terry Blankers
PM, Suneel Marthi wrote: It's the max. no. of points to include from each cluster in the clusterdump. If not specified, all points would be included. On Tuesday, March 18, 2014 11:25 PM, Terry Blankers te...@amritanet.com wrote: Hi all, Can someone please answer a quick question about

Re: lucene2seq error: field does not exist in the index

2014-04-18 Thread Terry Blankers
something like 2 or 3 docs in xml format would be sufficient for a test? Regards, Terry On Thursday, April 10, 2014 11:34 PM, Terry Blankers te...@amritanet.com wrote: Hi All, I'm very new to trying to use lucene2seq so I'm not sure if it's just user error, but I'm experiencing some

Re: lucene2seq error: field does not exist in the index

2014-04-18 Thread Terry Blankers
as well as being stored, which would be a bug because lucene2seq is designed to load stored fields. Cheers, Frank On Fri, Apr 11, 2014 at 5:33 AM, Terry Blankers te...@amritanet.com wrote: Hi All, I'm very new to trying to use lucene2seq so I'm not sure if it's just user error, but I'm

lucene2seq error: field does not exist in the index

2014-04-10 Thread Terry Blankers
Hi All, I'm very new to trying to use lucene2seq so I'm not sure if it's just user error, but I'm experiencing some unexpected behavior when running lucene2seq against my Solr index (4.7.1). I've tried using both 0.9 and the trunk build of Mahout. (And BTW, I have been able to successfully run

clusterdump - structure of JSON output

2014-04-02 Thread Terry Blankers
Hi all, I'm working on some automated analysis of the clusterdump output using '-of = JSON'. While digging into the structure of the representation of the data I've noticed something that seems a little odd to me. In order to access the data for a particular cluster, the 'cluster', 'n', 'c'

Re: clusterdump - structure of JSON output

2014-04-02 Thread Terry Blankers
Thanks for the confirmation, I prefer your solution for c and r values. I've created https://issues.apache.org/jira/browse/MAHOUT-1505 On 4/2/14, 2:53 PM, Andrew Musselman wrote: Looks like a bug to me as well; I would have expected something similar to what you were expecting except maybe

Re: clusterdump samplePoints parameter

2014-03-19 Thread Terry Blankers
, Terry Blankers te...@amritanet.com wrote: Hi all, Can someone please answer a quick question about the --samplePoints parameter in the clusterdump utility? I understand it specifies the number of points returned per cluster. But are the points per cluster ordered or ranked in any way before

clusterdump samplePoints parameter

2014-03-18 Thread Terry Blankers
Hi all, Can someone please answer a quick question about the --samplePoints parameter in the clusterdump utility? I understand it specifies the number of points returned per cluster. But are the points per cluster ordered or ranked in any way before this truncation occurs? Thanks, Terry