Re: Clustering Question

sarath pr Wed, 06 Apr 2011 01:11:56 -0700

I am using the NetBeans IDE.
I call CanopyDriver.run to create the initial clusters and
KMeansDriver.run to cluster the news articles.
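
For what it is worth, here is a rough sketch of the step that comes after reading clusteredPoints/part-m-00000. It assumes each clustered point can be read back as a (cluster ID, article ID) pair, which, as far as I understand, requires the vectors to have been created as named vectors so that the article ID travels with each point. Reading the SequenceFile itself is not shown. ClusterAssignments, Assignment and groupByCluster are hypothetical names made up for illustration, and the sample IDs are invented, not real output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ClusterAssignments {

    /** One clustered point: the cluster it landed in and the article it came from. */
    static final class Assignment {
        final int clusterId;
        final String articleId;

        Assignment(int clusterId, String articleId) {
            this.clusterId = clusterId;
            this.articleId = articleId;
        }
    }

    /** Groups article IDs by cluster ID, keeping input order within each cluster. */
    static Map<Integer, List<String>> groupByCluster(List<Assignment> points) {
        Map<Integer, List<String>> byCluster = new TreeMap<>();
        for (Assignment p : points) {
            byCluster.computeIfAbsent(p.clusterId, k -> new ArrayList<>())
                     .add(p.articleId);
        }
        return byCluster;
    }

    public static void main(String[] args) {
        // Invented sample pairs; in practice these would come from the
        // clusteredPoints file (assumption: key = cluster ID, and the
        // article ID is carried by the point's vector name).
        List<Assignment> points = new ArrayList<>();
        points.add(new Assignment(12, "news-001"));
        points.add(new Assignment(40, "news-002"));
        points.add(new Assignment(12, "news-003"));

        for (Map.Entry<Integer, List<String>> e : groupByCluster(points).entrySet()) {
            System.out.println("cluster " + e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A sorted map is used only so that clusters print in a stable order; any map would do.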


On 4/6/11, Grant Ingersoll <grant.ingers...@gmail.com> wrote:
> What commands are you running to do the actual clustering?
>
>
> On Apr 3, 2011, at 4:27 AM, sarath pr wrote:
>
>> SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
>>     new Path(inputDir, "documents.seq"), Text.class, Text.class);
>>
>> for (int i = 0; i < s.length; i++) {
>>     writer.append(new Text(s[i][0]), new Text(s[i][1]));
>> }
>> writer.close();
>>
>> Here Text(s[i][0]) is a string value, the ID of a news article, and
>> Text(s[i][1]) is the article text. I have clustered some 100+ news
>> articles like this, and I get the output in
>> clusteredPoints/part-m-00000. My question: is it possible to extract
>> the article ID (i.e. the Text(s[i][0]) that I appended) and the
>> corresponding cluster ID from the part-m-00000 file?
>>
>> Does anyone know?
>>
>> --
>> Thank You..!!
>> Sarath Ramachandran
>> sarath.amr...@gmail.com
>> +919995024287
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
>

-- 
Thank You..!!
Sarath Ramachandran
sarath.amr...@gmail.com
+919995024287
