Anyone? Thanks
On Wed, Jan 30, 2013 at 10:42 AM, Vinay B, <vybe3...@gmail.com> wrote:

> Just a set of mahout commands. Here they are.
>
> https://gist.github.com/4674331
>
> For what it's worth, the relevant solr config from the schema was
>
> <field name="text" type="text_general" indexed="true" stored="false"
> multiValued="true" termVectors="true"/>
>
> Thank You
>
> On Wed, Jan 30, 2013 at 4:37 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>
>> Can you gist (gist.github.org) or pastebin your code?
>>
>> On Jan 29, 2013, at 5:12 PM, vybe3142 wrote:
>>
>> > Reposting - I wasn't subscribed to the group earlier
>> >
>> >
>> > VS
>> >
>> > first ingesting the data into SOLR and then invoking mahout on the SOLR
>> > index (clustering on the contents of the field "text")
>> >
>> > defined as
>> >
>> > Field: text
>> > Field-Type: org.apache.solr.schema.TextField
>> > Properties: Indexed, Tokenized, Multivalued, TermVector Stored
>> > Schema: Indexed, Tokenized, Multivalued, TermVector Stored
>> > Index: (unstored field)
>> > PI Gap: 100
>> > Docs: 21578
>> >
>> > Index Analyzer:
>> > org.apache.solr.analysis.TokenizerChain
>> > Query Analyzer:
>> > org.apache.solr.analysis.TokenizerChain
>> >
>> > and executing a "similar" command set
>> >
>> > I get vastly differing results:
>> >
>> > The lucene / kmeans approach yields 20 clusters whereas the solr approach
>> > yields just one cluster.
>> >
>> > I'm obviously doing something wrong. Any pointers?
>> >
>> > Thanks
>> >
>> > --
>> > View this message in context:
>> > http://lucene.472066.n3.nabble.com/Clustering-using-Solr-Index-vs-Lucene-Index-Different-Results-tp4037198.html
>> > Sent from the Mahout User List mailing list archive at Nabble.com.
>>
>> --------------------------------------------
>> Grant Ingersoll
>> http://www.lucidworks.com