I think Ted meant that you should just write a script to aggregate the MovieLens
data by user id.  It should be pretty straightforward.
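
For illustration, here is a minimal sketch of such a script in Python. It is not Mahout code. It assumes each input line looks like user::movie::rating::timestamp (the layout of ratings.dat in the MovieLens 1M download) with no header line; pass sep="," if your file is comma-separated. The function names group_ratings_by_user and to_dense_rows are hypothetical, and filling unrated movies with 0.0 is just one possible choice.

```python
from collections import defaultdict


def group_ratings_by_user(lines, sep="::"):
    """Return {user_id: {movie_id: rating}} built from rating lines.

    Assumes the first three fields are user id, movie id, rating
    and that there is no header line.
    """
    by_user = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, movie_id, rating = line.split(sep)[:3]
        by_user[int(user_id)][int(movie_id)] = float(rating)
    return dict(by_user)


def to_dense_rows(by_user, missing=0.0):
    """Turn the grouped ratings into one dense row per user.

    Columns are the sorted movie ids seen in the data; movies a user
    has not rated get the `missing` value (an assumption of this sketch).
    """
    movie_ids = sorted({m for ratings in by_user.values() for m in ratings})
    rows = {
        user_id: [ratings.get(m, missing) for m in movie_ids]
        for user_id, ratings in by_user.items()
    }
    return movie_ids, rows
```

To use it, open the ratings file, pass the file object to group_ratings_by_user, and write each row from to_dense_rows out as one CSV line per user. That gives one row per user, which is the shape Ted was describing.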

On Jun 13, 2013, at 10:05 AM, Neetha <netasu...@gmail.com> wrote:

> Thank you for the reply. How can we group by user?
> 
> 
> On Thu, Jun 13, 2013 at 3:41 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> 
>> You need to group by user before converting to vector to get sensible
>> clustering.
>> 
>> 
>> On Wed, Jun 12, 2013 at 1:06 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>> 
>>> The CSVVectorIterator in the Integration package will take in a CSV file
>>> and produce vectors.  It assumes that each row is the equivalent of a
>>> DenseVector (does MovieLens fit that?).  If you need otherwise, I'd suggest
>>> starting with the code and modifying it to fit your needs.
>>> 
>>> 
>>> -Grant
>>> 
>>> On Jun 12, 2013, at 6:11 AM, Neetha <netasu...@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> 
>>>> I am using the MovieLens 1M dataset.
>>>> 
>>>> I need to run K-means clustering using Mahout and Hadoop. As I understand
>>>> it, the first step is to convert the data into a sequence file, then into
>>>> vector format, and then apply the clustering algorithm. My question is: do
>>>> I need to convert the MovieLens rating.csv file into a sequence file? If
>>>> so, what are the commands for applying the clustering using Mahout and
>>>> Hadoop?
>>>> 
>>>> Thanking you,
>>>> Neetha Suan Thampi
>>> 
>>> --------------------------------------------
>>> Grant Ingersoll | @gsingers
>>> http://www.lucidworks.com

--------------------------------------------
Grant Ingersoll | @gsingers
http://www.lucidworks.com




