I think you want LDAPrintTopics?

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Dhruv Kumar
Sent: Thursday, July 07, 2011 11:29 AM
To: [email protected]
Subject: Re: how to transfer the sequence file into readable format

Sequence files store key/value pairs in a binary format, optionally compressed. To read a sequence file and print its keys and values in human-readable form, you can use SequenceFile.Reader:

http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html

I don't know what LDA outputs, but in general you can do the following, assuming the key is an IntWritable and the value is a DoubleWritable:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.SequenceFile;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader =
        new SequenceFile.Reader(fs, new Path("/path/to/output/of/LDA"), conf);
    IntWritable key = new IntWritable();
    DoubleWritable value = new DoubleWritable();
    // next() fills key and value in place and returns false at end of file
    while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
    }
    reader.close();

There may also be a convenient command-line utility for LDA, which someone else can point out. However, you can always write your own simple class, as shown above, to read any sequence file.

On Thu, Jul 7, 2011 at 1:53 PM, wine lover <[email protected]> wrote:

> Dear All,
>
> After running LDA analysis, I got the docTopic file, which is a regular
> sequence-file. How to transfer it into a readable format? I searched
> vectordumper, or vectordump, but did not get any useful results, such as how
> to use it in command-line? Thanks.
>
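
Regarding the IntWritable/DoubleWritable assumption in the reply above: if you are not sure which key and value types a sequence file uses, one option is to look at the file header, which, as far as I understand the format, starts with the bytes "SEQ", a version byte, and then the key and value class names. The following is only a rough sketch using plain java.io, with no Hadoop dependency. SequenceFileHeaderPeek and its methods are hypothetical names made up for this example, and it assumes the class names are shorter than 128 bytes and stored with a single-byte length prefix, as recent format versions do.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop or Mahout: reads only the start of a
// SequenceFile header to report the key and value class names.
final class SequenceFileHeaderPeek {

    /** Returns {keyClassName, valueClassName} read from the start of the stream. */
    static String[] peek(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[3];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("Not a SequenceFile: missing SEQ magic bytes");
        }
        in.readByte(); // format version byte, not used by this sketch
        String keyClass = readShortString(in);
        String valueClass = readShortString(in);
        return new String[] { keyClass, valueClass };
    }

    // Assumption: the string is shorter than 128 bytes, so its length prefix
    // fits in one non-negative byte. Longer strings are rejected, not decoded.
    private static String readShortString(DataInputStream in) throws IOException {
        int length = in.readByte();
        if (length < 0) {
            throw new IOException("Multi-byte length prefix not handled by this sketch");
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

To try it on a local copy of the file, pass a java.io.FileInputStream for that copy to SequenceFileHeaderPeek.peek(...) and print the two strings it returns. Once the class names are known, the matching Writable types can be used in place of IntWritable and DoubleWritable in the reader loop.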
