Re: mahout output of seq2sparse is empty

2015-03-03 Thread Suneel Marthi
It depends on what you are trying to do. Are you trying classification or clustering?


Re: mahout output of seq2sparse is empty

2015-03-03 Thread Raghuveer
Yes, you are right, it was a directory. I see the part-m-0 file. Can you
kindly suggest how to run Mahout on this file? Should I run classification
or clustering? Can you please share a sample?
Thanks very much.


Re: mahout output of seq2sparse is empty

2015-03-03 Thread Andrew Musselman
I don't have a terminal in front of me but are you sure tfidf-vectors is a
file, not a directory?


mahout output of seq2sparse is empty

2015-03-03 Thread Raghuveer
I have a data file of the format:

src_ip, dest_ip, packet, bytes_transferred, src_port, dest_port, start_timestamp
71.105.62.168, 38.106.70.147, 1, 54, 55704, 52747, 1341775056478
38.106.70.147, 71.105.62.168, 2, 1568, 52747, 55704, 1341775056478

First, from my reading, I think text fields like src_ip should be converted
to numbers. How can I do this?

I ran the following commands successfully, without any errors:

./mahout seqdirectory --input /upload/20120708-0031-0060.csv --output /upload/output1
./mahout seq2sparse -i /upload/output1 -o /upload/output2

but the output is empty:

drwxr-xr-x   - admin supergroup     0  /upload/output4/df-count
-rw-r--r--   2 admin supergroup  2008  /upload/output4/dictionary.file-0
-rw-r--r--   2 admin supergroup  1593  /upload/output4/frequency.file-0
drwxr-xr-x   - admin supergroup     0  /upload/output4/tf-vectors
drwxr-xr-x   - admin supergroup     0  /upload/output4/tfidf-vectors
drwxr-xr-x   - admin supergroup     0  /upload/output4/tokenized-documents
drwxr-xr-x   - admin supergroup     0  /upload/output4/wordcount

I should have got something in tfidf-vectors, shouldn't I? Can you kindly
suggest what I am missing? Thanks in advance.
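
One possible way to turn the IP address columns into numbers, as asked above, is to map each dotted-quad IPv4 address to its 32-bit integer value. The following is only a minimal Python sketch and not a Mahout feature: the function name `row_to_features` is hypothetical, and it assumes every row has the seven comma-separated fields shown in the sample. Whether a single integer per address is a useful feature for a particular model is a separate question.

```python
import ipaddress


def row_to_features(line):
    """Convert one CSV row of the sample format into a list of integers.

    Assumed field order: src_ip, dest_ip, packet, bytes_transferred,
    src_port, dest_port, start_timestamp.
    """
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    # An IPv4 address is a 32-bit value, so int() gives a stable numeric form.
    src_ip = int(ipaddress.IPv4Address(fields[0]))
    dest_ip = int(ipaddress.IPv4Address(fields[1]))
    # The remaining columns are already integers in the sample data.
    rest = [int(f) for f in fields[2:]]
    return [src_ip, dest_ip] + rest


if __name__ == "__main__":
    sample = "71.105.62.168, 38.106.70.147, 1, 54, 55704, 52747, 1341775056478"
    print(row_to_features(sample))
```

The header line of the file would have to be skipped before calling the function, since `src_ip` itself is not a valid address.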


Re: mahout output

2015-02-18 Thread Pat Ferrel
It’s been a while since I used it but doesn’t it give all recs for all users? 
You can write code to grab the matrices and create a server but the job doesn’t 
do that.
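
To illustrate what "grabbing the matrices" could look like: a matrix factorization such as ALS yields one feature vector per user and one per item, and a predicted preference is the dot product of the two. The sketch below is plain Python with made-up vectors. The names `user_features`, `item_features` and `recommend` are hypothetical, and nothing here reads actual Mahout output.

```python
def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend(user_id, user_features, item_features, top_n=2):
    """Rank all items for one user by predicted preference (user . item)."""
    u = user_features[user_id]
    scored = [(item_id, dot(u, v)) for item_id, v in item_features.items()]
    # Highest predicted preference first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Made-up factors with 2 latent features, for illustration only.
    user_features = {1: [1.0, 0.0], 2: [0.0, 1.0]}
    item_features = {10: [2.0, 0.5], 20: [0.5, 3.0], 30: [1.0, 1.0]}
    print(recommend(1, user_features, item_features))
```

A real service would probably also filter out items the user has already rated. That step is left out here to keep the sketch short.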



mahout output

2015-02-17 Thread Hartwig Anzt

hey!

I am trying to understand the Mahout ALS output.
Using 'mahout seqdumper' and 'mahout vectordump' I can convert the HDF5
files into human-readable data. However, I do not understand what the data
means. I would expect to get something like matrices, i.e. three of them:
user/item correlation, item/user correlation, and a small matrix of the
feature size.


Hartwig


Parsing mahout output

2014-05-20 Thread jamal sasha
Hi,
  I want to convert the output of the cf module (als, svd, etc.) to CSV.
How do I do the conversion?
I want to look at the latent features.
Thanks
Jamal


Re: Parsing mahout output

2014-05-20 Thread Peng Zhang
Hi Jamal,

Maybe you can use the getUserFeatures() and getItemFeatures() methods in the
Factorization class to look at the latent features.

Best Regards,
Peng Zhang
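
Once the feature vectors have been read out (for instance through the Factorization class mentioned above), writing them to CSV is a small step. The following is a sketch in Python, assuming the vectors are already available as a mapping from ID to a list of floats. The function `features_to_csv` and the sample data are hypothetical and are not part of Mahout.

```python
import csv
import io


def features_to_csv(features, out):
    """Write one row per ID: the ID followed by its latent feature values."""
    writer = csv.writer(out, lineterminator="\n")
    for key in sorted(features):
        writer.writerow([key] + list(features[key]))


if __name__ == "__main__":
    # Made-up latent features for two users, for illustration only.
    features = {7: [0.12, -0.5, 1.0], 3: [0.0, 0.25, -1.5]}
    buf = io.StringIO()
    features_to_csv(features, buf)
    print(buf.getvalue(), end="")
```

Passing an open file object instead of the `io.StringIO` buffer would write the same rows to disk.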




