I had a similar problem earlier, and I just used split and awk to split the file.
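
As a rough illustration of splitting by the hash of the key, here is a small Python sketch. It is not the split/awk commands themselves, just the same idea in one place. It assumes each line is tab-separated with the key in the first field. The names `split_by_key` and `num_parts` and the `part-NNNNN` file names are made up for the example.

```python
import zlib
from pathlib import Path


def split_by_key(input_path, output_dir, num_parts=4, sep="\t"):
    """Copy each line of input_path into one of num_parts files, chosen by key hash.

    Assumption: the key is the first sep-separated field of each line.
    Returns the list of output paths (part-00000, part-00001, ...).
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"part-{i:05d}" for i in range(num_parts)]
    outputs = [p.open("w", encoding="utf-8") for p in paths]
    try:
        with open(input_path, encoding="utf-8") as src:
            for line in src:
                key = line.rstrip("\n").split(sep, 1)[0]
                # crc32 gives the same value on every run, unlike Python's
                # built-in hash() for strings, so a key always maps to the
                # same piece.
                bucket = zlib.crc32(key.encode("utf-8")) % num_parts
                outputs[bucket].write(line)
    finally:
        for f in outputs:
            f.close()
    return paths
```

All lines with the same key end up in the same piece, so an application that knows the key can compute the same hash and open only that one file.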

2009/3/20 Akira Kitada <akit...@gmail.com>
> Hi,
>
> Can I split a input file into pieces based on the key? (Probably the
> hash value of the key)
> Considering Hadoop streaming is a kind of shell pipelines,
> it seems to be impossible to do this, but I wanted to double-check
> this to be sure.
>
> Background: The output(an index file) is so large (more than 10G) that
> it slows down my applications using that file without splitting it into
> pieces.
>
> Thanks in advance.
> -- http://daily.appspot.com/food/