Anyone?

Thanks

On Wed, Jan 30, 2013 at 10:42 AM, Vinay B, <vybe3...@gmail.com> wrote:

>
> Just a set of mahout commands. Here they are.
>
> https://gist.github.com/4674331
>
> For what it's worth, the relevant Solr config from the schema was
>
>  <field name="text" type="text_general" indexed="true" stored="false"
>         multiValued="true" termVectors="true"/>
>
> Thank You
>
> On Wed, Jan 30, 2013 at 4:37 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>
>> Can you gist (gist.github.com) or pastebin your code?
>>
>> On Jan 29, 2013, at 5:12 PM, vybe3142 wrote:
>>
>> > Reposting - I wasn't subscribed to the group earlier
>> >
>> >
>> > VS
>> >
>> > first ingesting the data into SOLR and then invoking mahout on the SOLR
>> > index (clustering on the contents of the field "text")
>> >
>> > defined as
>> >
>> > Field: text
>> > Field-Type: org.apache.solr.schema.TextField
>> > Properties: Indexed, Tokenized, Multivalued, TermVector Stored
>> > Schema: Indexed, Tokenized, Multivalued, TermVector Stored
>> > Index: (unstored field)
>> > PI Gap: 100
>> > Docs: 21578
>> >
>> > Index Analyzer:
>> > org.apache.solr.analysis.TokenizerChain
>> > Query Analyzer:
>> > org.apache.solr.analysis.TokenizerChain
>> > and executing a "similar" command set
>> >
>> > I get vastly differing results:
>> >
>> > The Lucene / k-means approach yields 20 clusters, whereas the Solr approach
>> > yields just one cluster.
>> >
>> > I'm obviously doing something wrong. Any pointers?
>> >
>> > Thanks
>> >
>> >
>> >
>> > --
>> > View this message in context:
>> http://lucene.472066.n3.nabble.com/Clustering-using-Solr-Index-vs-Lucene-Index-Different-Results-tp4037198.html
>> > Sent from the Mahout User List mailing list archive at Nabble.com.
>>
>> --------------------------------------------
>> Grant Ingersoll
>> http://www.lucidworks.com
>>
>
