Re: ORC tables loading

2015-11-17 Thread James Pirz
Thanks Alan. I get your point about parallel I/O. But what about data transfer over the network? Is all the conversion happening locally (I mean, will the blocks of text data on a specific node be stored on the same node as ORC)? Or does some re-partitioning need to happen? My majo

Re: ORC tables loading

2015-11-17 Thread Alan Gates
The reads and writes both happen in parallel, so as more nodes become available for reading and writing, at least in this case, the time stays roughly the same. Alan. James Pirz November 16, 2015 at 21:23 Hi, I am using Hive 1.2 with ORC tables on Hadoop 2.6 on a cluste

ORC tables loading

2015-11-16 Thread James Pirz
Hi, I am using Hive 1.2 with ORC tables on Hadoop 2.6 on a cluster. I load data into an ORC table by reading the data from an external table over raw text files and using an INSERT statement: INSERT INTO TABLE myorctab SELECT * FROM mytxttab; I ran a simple scale-up test to find out how the loading t