You need to store your data in a column-based format. Check out Hive's RCFile and its InputFormat.

Yong
Date: Mon, 23 Dec 2013 21:37:23 +0800
Subject: Any method to get input splits by column?
From: samliuhad...@gmail.com
To: user@hadoop.apache.org

Hi,

By default, MR InputFormat classes break an input file into splits by rows. However, we have a special requirement for our MR app: get input splits by column. Is there a good method for this?

Thanks!
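
For illustration, the column-based layout suggested in the reply can be sketched in plain Java. This is only a rough sketch of the idea (group the values of each column together so that one column can be read without scanning the others). It does not use Hive, and it is not RCFile's actual on-disk format or API; the class name ColumnarSketch and the method toColumns are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not Hive's RCFile API or file format.
class ColumnarSketch {

    // Transpose row-oriented records into one list per column.
    // Assumes every row has at least numColumns fields.
    static List<List<String>> toColumns(List<String[]> rows, int numColumns) {
        List<List<String>> columns = new ArrayList<>();
        for (int c = 0; c < numColumns; c++) {
            columns.add(new ArrayList<>());
        }
        for (String[] row : rows) {
            for (int c = 0; c < numColumns; c++) {
                columns.get(c).add(row[c]);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Two made-up rows with three fields each.
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "alice", "30"},
            new String[] {"2", "bob", "25"});

        List<List<String>> columns = toColumns(rows, 3);

        // Read only the second column; the other columns are not touched.
        System.out.println(columns.get(1)); // prints [alice, bob]
    }
}
```

Once the data is stored grouped by column like this, a reader that needs a single column can skip the rest, which is the property a column-oriented file format and its InputFormat are meant to provide.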