Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2850

@NamanRastogi I think we can further optimize this function.

1. We can enable parallel reading and set the parallelism when creating a CarbonReader;
2. Inside CarbonReader, we handle the concurrent processing;
3. The interfaces of CarbonReader should stay the same as before; there is no need to add more interfaces. By calling hasNext or next, the user gets the next record without caring which RecordReader it belongs to.

The user interface looks like below:
```
CarbonReader reader = CarbonReader.builder(dataDir).parallelism(3).build();
while (reader.hasNext()) {
  reader.next();
}
reader.close();
```
To keep it simple, the parallelism can default to 1, which means we process the RecordReaders one by one. Setting the parallelism to a higher value, such as 3 in the example above, means we process the RecordReaders in a thread pool of that size.
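
A rough sketch of how point 2 could work internally, to make the idea concrete. This is not CarbonData code: `ParallelRecordIterator` is a hypothetical name, plain `Iterator<T>` instances stand in for the RecordReaders, and the queue capacity of 1024 is an arbitrary choice. Worker threads drain the sources into a shared queue, and the caller only ever sees `hasNext`/`next`/`close`. Records are assumed to be non-null, and error propagation from a failing source is left out.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of CarbonData): merges several record sources
// behind a single hasNext()/next() interface.
class ParallelRecordIterator<T> implements Iterator<T>, AutoCloseable {
    // Sentinel placed on the queue once every source has been drained.
    private static final Object END = new Object();

    // Bounded so that fast workers cannot outrun a slow consumer without limit.
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>(1024);
    private final ExecutorService pool;
    private Object peeked; // next record, END when finished, null if not fetched yet

    ParallelRecordIterator(List<Iterator<T>> sources, int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        pool = Executors.newFixedThreadPool(parallelism);
        AtomicInteger remaining = new AtomicInteger(sources.size());
        if (sources.isEmpty()) {
            queue.add(END);
        }
        for (Iterator<T> source : sources) {
            // One task per source; at most `parallelism` tasks run at a time.
            pool.execute(() -> {
                try {
                    while (source.hasNext()) {
                        queue.put(source.next());
                    }
                    // The last source to finish signals the end of the stream.
                    if (remaining.decrementAndGet() == 0) {
                        queue.put(END);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // close() was called
                }
            });
        }
        pool.shutdown(); // accept no new tasks; submitted ones still run
    }

    @Override
    public boolean hasNext() {
        if (peeked == null) {
            try {
                peeked = queue.take(); // blocks until a record or END arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                peeked = END;
            }
        }
        return peeked != END;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T record = (T) peeked;
        peeked = null;
        return record;
    }

    @Override
    public void close() {
        pool.shutdownNow(); // interrupt workers that are still reading
        peeked = END;       // later hasNext() calls return false
    }
}
```

With a parallelism of 1 the single worker thread takes the sources in submission order, so records come out source by source, which matches the proposed default. With a higher value, records from different sources interleave in no guaranteed order.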
---