Re: mahout output of seq2sparse is empty
Depends on what you are trying to do. Are you trying classification or clustering?
On Wed, Mar 4, 2015 at 1:08 AM, Raghuveer alwaysra...@yahoo.com.invalid wrote:
> Yes, you are right, it was a directory. I see the part-m-0 file. Can you kindly suggest how to run Mahout on this file? Should I run classification or clustering? Can you please share a sample? Thanks very much.
Re: mahout output of seq2sparse is empty
Yes, you are right, it was a directory. I see the part-m-0 file. Can you kindly suggest how to run Mahout on this file? Should I run classification or clustering? Can you please share a sample? Thanks very much.
On Wednesday, March 4, 2015 11:06 AM, Andrew Musselman andrew.mussel...@gmail.com wrote:
> I don't have a terminal in front of me, but are you sure tfidf-vectors is a file, not a directory?
Re: mahout output of seq2sparse is empty
I don't have a terminal in front of me, but are you sure tfidf-vectors is a file, not a directory?
On Tuesday, March 3, 2015, Raghuveer alwaysra...@yahoo.com.invalid wrote:
> [...]
> drwxr-xr-x - admin supergroup 0 /upload/output4/tfidf-vectors
> [...]
> I should have got something in tfidf-vectors, shouldn't I? Can you kindly suggest what I am missing?
mahout output of seq2sparse is empty
I have a data file of the format:
    src_ip, dest_ip, packet, bytes_transferred, src_port, dest_port, start_timestamp
    71.105.62.168, 38.106.70.147, 1, 54, 55704, 52747, 1341775056478
    38.106.70.147, 71.105.62.168, 2, 1568, 52747, 55704, 1341775056478
From my reading, I think text fields like src_ip should first be converted to numbers. How can I do this?
I ran the following commands successfully, without any errors:
    ./mahout seqdirectory --input /upload/20120708-0031-0060.csv --output /upload/output1
    ./mahout seq2sparse -i /upload/output1 -o /upload/output2
but the output is empty:
    drwxr-xr-x   - admin supergroup    0 /upload/output4/df-count
    -rw-r--r--   2 admin supergroup 2008 /upload/output4/dictionary.file-0
    -rw-r--r--   2 admin supergroup 1593 /upload/output4/frequency.file-0
    drwxr-xr-x   - admin supergroup    0 /upload/output4/tf-vectors
    drwxr-xr-x   - admin supergroup    0 /upload/output4/tfidf-vectors
    drwxr-xr-x   - admin supergroup    0 /upload/output4/tokenized-documents
    drwxr-xr-x   - admin supergroup    0 /upload/output4/wordcount
I should have got something in tfidf-vectors, shouldn't I? Can you kindly suggest what I am missing? Thanks in advance.
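
On the question of turning fields like src_ip into numbers: one possible approach, sketched here in plain Python outside Mahout, is to map each dotted-quad address to its 32-bit integer and parse the remaining columns as integers. This is only an illustration under assumptions. The function names `ip_to_int` and `rows_to_vectors` are hypothetical, the sample rows are the two quoted in the message, and whether an integer-encoded IP is a meaningful feature for clustering or classification is a modeling decision the thread does not settle. Note also that seq2sparse builds term-frequency vectors from tokenized text, so numeric records like these would probably need a different vectorization step.

```python
import csv
import io
import ipaddress

# The header and the two sample rows quoted in the message above.
SAMPLE = """src_ip,dest_ip,packet,bytes_transferred,src_port,dest_port,start_timestamp
71.105.62.168, 38.106.70.147, 1, 54, 55704, 52747, 1341775056478
38.106.70.147, 71.105.62.168, 2, 1568, 52747, 55704, 1341775056478
"""


def ip_to_int(text):
    """Map a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(text.strip()))


def rows_to_vectors(handle):
    """Yield one list of integers per CSV data row, skipping the header."""
    reader = csv.reader(handle, skipinitialspace=True)
    next(reader)  # discard the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        src, dest, *rest = row
        yield [ip_to_int(src), ip_to_int(dest)] + [int(v) for v in rest]


vectors = list(rows_to_vectors(io.StringIO(SAMPLE)))
for vec in vectors:
    print(vec)
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle. The resulting numeric rows would still have to be written in whatever vector format the chosen Mahout job expects.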
Re: mahout output
It’s been a while since I used it, but doesn’t it give all recs for all users? You can write code to grab the matrices and create a server, but the job doesn’t do that.
On Feb 17, 2015, at 5:56 AM, Hartwig Anzt ha...@icl.utk.edu wrote:
> Hey! I am trying to understand the Mahout ALS output. Using 'mahout seqdumper' and 'mahout vectordump' I can convert the HDF5 files into human-readable data. However, I do not understand the meaning. I would expect to get something like a matrix, i.e. three matrices: user/item correlation, item/user correlation, and a small matrix of the feature size.
> Hartwig
mahout output
Hey! I am trying to understand the Mahout ALS output. Using 'mahout seqdumper' and 'mahout vectordump' I can convert the HDF5 files into human-readable data. However, I do not understand the meaning. I would expect to get something like a matrix, i.e. three matrices: user/item correlation, item/user correlation, and a small matrix of the feature size.
Hartwig
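
As background for reading the dumped vectors: a matrix factorization such as ALS generally produces two factor matrices, one row of latent features per user and one per item, and a predicted preference is the dot product of a user row with an item row. The sketch below illustrates that relationship in plain Python. It assumes each dumped record is one such row keyed by a user or item ID, which the thread does not confirm. The IDs, feature values, and function names are made up for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy factors with two latent features per row.
user_features = {1: [1.0, 0.5], 2: [0.25, 1.5]}
item_features = {10: [2.0, 1.0], 20: [0.5, 2.0]}


def predict(user_id, item_id):
    """Predicted preference: user row dotted with item row."""
    return dot(user_features[user_id], item_features[item_id])


def top_items(user_id, n=1):
    """Return the n item IDs with the highest predicted preference."""
    scores = {item: predict(user_id, item) for item in item_features}
    return sorted(scores, key=scores.get, reverse=True)[:n]


for user in user_features:
    print(user, top_items(user))
```

Serving recommendations from the factors, as the reply suggests, would amount to loading both matrices and running something like `top_items` per request.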
Parsing mahout output
Hi,
I want to convert the output of the CF modules (ALS, SVD, etc.) to CSV. How do I do the conversion? I want to look at the latent features.
Thanks,
Jamal
Re: Parsing mahout output
Hi Jamal,
Maybe you can use the getUserFeatures() and getItemFeatures() methods in the Factorization class to look at the latent features.
Best Regards,
Peng Zhang
On May 21, 2014, at 5:42 AM, jamal sasha jamalsha...@gmail.com wrote:
> Hi,
> I want to convert the output of the CF modules (ALS, SVD, etc.) to CSV. How do I do the conversion? I want to look at the latent features.
> Thanks,
> Jamal
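
Once the latent features are available as plain arrays, whether read through the Factorization class or parsed from dumped text, writing them to CSV is straightforward. The following is a minimal sketch in plain Python, not part of Mahout. The function name `write_features_csv`, the column names `f0`, `f1`, and the sample values are all hypothetical, and it assumes every ID has the same number of features.

```python
import csv
import io


def write_features_csv(features, handle, id_label):
    """Write one row per ID: the ID followed by its latent feature values."""
    if not features:
        return  # nothing to write
    num_features = len(next(iter(features.values())))
    writer = csv.writer(handle, lineterminator="\n")
    # Header: the ID column, then one column per latent feature.
    writer.writerow([id_label] + [f"f{k}" for k in range(num_features)])
    for entity_id in sorted(features):
        writer.writerow([entity_id] + list(features[entity_id]))


# Hypothetical user factors with two latent features each.
user_features = {7: [0.5, -1.25], 3: [2.0, 0.75]}

buffer = io.StringIO()
write_features_csv(user_features, buffer, "user_id")
print(buffer.getvalue())
```

Passing an open file in place of `buffer` writes the same rows to disk, and the same function can be reused for the item features with a different `id_label`.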