I think you want LDAPrintTopics?

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Dhruv Kumar
Sent: Thursday, July 07, 2011 11:29 AM
To: [email protected]
Subject: Re: how to transfer the sequence file into readable format

Sequence files store key/value pairs in a binary, optionally compressed
format. To read a sequence file and display its keys and values in
human-readable form, you can use SequenceFile.Reader:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html

I don't know what key and value types LDA writes, but in general you can do
the following, assuming the key is an IntWritable and the value is a
DoubleWritable.

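import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
SequenceFile.Reader reader = new SequenceFile.Reader(fs,
    new Path("/path/to/output/of/LDA"), conf);
// Reusable holders; their types must match the file's key and value classes.
IntWritable key = new IntWritable();
DoubleWritable value = new DoubleWritable();

// next() fills key and value, and returns false at the end of the file.
while (reader.next(key, value)) {
  System.out.println(key + "\t" + value);
}
reader.close();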


There may also be a convenient command-line utility for LDA output, which
someone else can point out. However, you can always write your own simple
class, as shown above, to read any sequence file.
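
If you don't know the key and value types up front, here is a minimal sketch
of such a class that asks the file itself, since the sequence file header
records both class names. SequenceFileDump and dump() are names I made up for
illustration, not part of Hadoop or Mahout. It assumes the keys and values are
Writables and that their classes are on the classpath (for LDA output that
probably means the Mahout jars as well).

```java
import java.io.IOException;
import java.io.PrintStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical helper class, not part of Hadoop or Mahout.
public class SequenceFileDump {

  // Prints every record as "key<TAB>value" and returns the record count.
  // Assumes both key and value classes implement Writable.
  public static long dump(Configuration conf, Path path, PrintStream out)
      throws IOException {
    FileSystem fs = path.getFileSystem(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
    try {
      // Instantiate holders from the class names stored in the file header.
      Writable key =
          (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
      Writable value =
          (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
      long count = 0;
      // next() refills the same two objects for each record.
      while (reader.next(key, value)) {
        out.println(key + "\t" + value);
        count++;
      }
      return count;
    } finally {
      reader.close();
    }
  }

  public static void main(String[] args) throws IOException {
    // args[0] is the path to one sequence file, e.g. a part file.
    dump(new Configuration(), new Path(args[0]), System.out);
  }
}
```

The printed text is whatever each class's toString() returns, so how readable
the output is depends on the value type.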





On Thu, Jul 7, 2011 at 1:53 PM, wine lover <[email protected]> wrote:

> Dear All,
>
> After running LDA analysis, I got the docTopic file, which is a regular
> sequence-file. How to transfer it into a readable format? I searched
> vectordumper, or vectordump, but did not get any useful results, such as how
> to use it in command-line? Thanks.
>
