Re: mahout output of seq2sparse is empty

2015-03-03 Thread Suneel Marthi
It depends on what you are trying to do. Are you trying classification or clustering?


Re: mahout output of seq2sparse is empty

2015-03-03 Thread Raghuveer
Yes, you are right, it was a directory. I see the part-m-0 file. Can you
kindly suggest how to run Mahout on this file? Should I run classification
or clustering? Can you please share a sample?
Thanks very much.


Re: mahout output of seq2sparse is empty

2015-03-03 Thread Andrew Musselman
I don't have a terminal in front of me, but are you sure tfidf-vectors is a
file, not a directory?

On Tuesday, March 3, 2015, Raghuveer alwaysra...@yahoo.com.invalid wrote:

 I have a data file of the format:

 src_ip, dest_ip, packet, bytes_transferred, src_port, dest_port, start_timestamp
 71.105.62.168, 38.106.70.147, 1, 54, 55704, 52747, 1341775056478
 38.106.70.147, 71.105.62.168, 2, 1568, 52747, 55704, 1341775056478

 Firstly, from my reading, I think text fields like src_ip should be
 converted to numbers. How can I do this? I ran the following commands
 successfully, without any errors:

 ./mahout seqdirectory --input /upload/20120708-0031-0060.csv --output /upload/output1
 ./mahout seq2sparse -i /upload/output1 -o /upload/output2

 but the output is empty:

 drwxr-xr-x   - admin supergroup     0  /upload/output4/df-count
 -rw-r--r--   2 admin supergroup  2008  /upload/output4/dictionary.file-0
 -rw-r--r--   2 admin supergroup  1593  /upload/output4/frequency.file-0
 drwxr-xr-x   - admin supergroup     0  /upload/output4/tf-vectors
 drwxr-xr-x   - admin supergroup     0  /upload/output4/tfidf-vectors
 drwxr-xr-x   - admin supergroup     0  /upload/output4/tokenized-documents
 drwxr-xr-x   - admin supergroup     0  /upload/output4/wordcount

 I should have got something in tfidf-vectors, shouldn't I? Can you kindly
 suggest what I am missing? Thanks in advance.
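
On the question of converting fields like src_ip to numbers, here is a minimal sketch in plain Python. It is not part of Mahout. It assumes IPv4 addresses and the seven-column layout shown in the question; the names COLUMNS, ip_to_int and row_to_numbers are hypothetical helpers made up for illustration. Whether a single 32-bit integer is a useful feature for classification or clustering is a separate modeling decision that this sketch does not settle.

```python
import ipaddress

# Column order copied from the header line in the question.
COLUMNS = ["src_ip", "dest_ip", "packet", "bytes_transferred",
           "src_port", "dest_port", "start_timestamp"]


def ip_to_int(ip):
    """Map a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip.strip()))


def row_to_numbers(line):
    """Turn one comma-separated record into a list of integers.

    Assumption: the first two fields are IPv4 addresses and the
    remaining five fields are already plain integers.
    """
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(fields)))
    return [ip_to_int(fields[0]), ip_to_int(fields[1])] + [int(f) for f in fields[2:]]


if __name__ == "__main__":
    sample = "71.105.62.168, 38.106.70.147, 1, 54, 55704, 52747, 1341775056478"
    print(row_to_numbers(sample))
```

Each record then becomes a fixed-length list of numbers, which is the shape a numeric vector needs, unlike the term-frequency vectors that seq2sparse builds from free text.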