In general, serializing to text and then parsing it back into a different format will be slower than using a purpose-built class that can serialize itself. The tradeoff, of course, is that going to text is often more convenient from a developer-time perspective.
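As a rough illustration of the purpose-built option, here is a minimal sketch of a value class that writes a list of document IDs in binary instead of as a space-separated string. The class name DocIdList and the use of int IDs are assumptions made for the example. The two methods have the same signatures as Hadoop's Writable interface (write(DataOutput) and readFields(DataInput)); in a real job the class would also declare "implements org.apache.hadoop.io.Writable", which is left out here so the sketch compiles with only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical value type for the inverted-index case: a list of document IDs.
// Assumption: document IDs are ints rather than strings like "d1".
final class DocIdList {
    private final List<Integer> docIds = new ArrayList<>();

    // Writable implementations need a no-arg constructor so the framework
    // can create an empty instance and then fill it via readFields().
    DocIdList() {}

    DocIdList(List<Integer> ids) {
        docIds.addAll(ids);
    }

    List<Integer> getDocIds() {
        return docIds;
    }

    // Same signature as Writable.write: a count followed by each ID as 4 bytes.
    public void write(DataOutput out) throws IOException {
        out.writeInt(docIds.size());
        for (int id : docIds) {
            out.writeInt(id);
        }
    }

    // Same signature as Writable.readFields: clear first, because instances
    // may be reused across records.
    public void readFields(DataInput in) throws IOException {
        docIds.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            docIds.add(in.readInt());
        }
    }

    public static void main(String[] args) throws IOException {
        DocIdList original = new DocIdList(Arrays.asList(1, 2, 3, 4));

        // Round-trip through a byte buffer, as the framework would between steps.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));

        DocIdList copy = new DocIdList();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));

        System.out.println(copy.getDocIds());
    }
}
```

The later step then reads the IDs straight into a list, with no concat on the way out and no split on the way in. Whether that pays for the extra class depends on how much of the job's time actually goes to string handling, so it is worth measuring before converting everything.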
- Aaron

On Mon, Apr 20, 2009 at 2:23 PM, chintan bhatt <chin1...@hotmail.com> wrote:
>
> Hi all,
> I want to ask you about the performance difference between using the Text
> class and using a custom Class which implements Writable interface.
>
> Lets say in InvertedIndex problem when I emit token and a list of document
> Ids which contains it , using Text we usually Concat the list of document
> ids with space as a separator "d1 d2 d3 d4" etc..If I need the same values
> in a later step of map reduce, I need to split the value string to get the
> list of all document Ids. Is it not better to use Writable List instead??
>
> I need to ask it because I am using too many Concats and Splits in my
> project to use documents total tokens count, token frequency in a particular
> document etc..
>
>
> Thanks in advance,
> Chintan