Hi Shuai,
It should certainly be possible to do it that way, but I would recommend
against it. If you look at HadoopRDD, it's doing all sorts of little
bookkeeping that you would most likely want to mimic, e.g., tracking the
number of bytes and records that are read, setting up all the hadoop
It should be *possible* to do what you want ... but if I understand you
right, there isn't really any very easy way to do it. I think you would
need to write your own subclass of RDD, with its own logic for how the
input files get divided among partitions. You can probably subclass