Re: Data loading from Datanode

2011-12-07 Thread Vikas Srivastava
Hey, yes, it's possible to load data into Hadoop through Hive, but you can't decide where the data file is stored (on which node). Only the NameNode can decide that. Regards, Vikas Srivastava On Thu, Dec 8, 2011 at 12:49 PM, Savant, Keshav <keshav.c.sav...@fisglobal.com> wrote: > Hi All,

Data loading from Datanode

2011-12-07 Thread Savant, Keshav
Hi All, Is it possible to load data into HDFS using a Hive LOAD DATA query from any of the DataNodes? In other words, can we insert files into a DataNode directly (or from Hive installed on a DataNode) and have the master node sync with the DataNodes later? Keshav C Savant

RE: Hive query taking too much time

2011-12-07 Thread Savant, Keshav
You are right, Wojciech Langiewicz; we did the same thing and I posted my results yesterday. We are now planning to do this with a shell script, because our environment is dynamic and files keep arriving. We will schedule the shell script as a cron job. A question on this: we are planning to mer

failed to start hbase!!

2011-12-07 Thread siyuan.tong
# start-hbase.sh
SFserver176: Exception in thread "regionserver60020" java.lang.NullPointerException
SFserver176: at org.apache.hadoop.hbase.regionserver.HRegionServer.join(HRegionServer.java:1417)
SFserver176: at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.jav

Re: Hive query taking too much time

2011-12-07 Thread Wojciech Langiewicz
Hi, In this case it's much easier and faster to merge all files using these commands:
cat *.csv > output.csv
hive -e "load data local inpath 'output.csv' into table $table"
On 07.12.2011 07:00, Vikas Srivastava wrote: hey if u having the same col of all the files then you can easily merge by s
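The merge-then-load approach from the message above could be wrapped in a small script, for example when it has to run repeatedly. This is only a sketch under some assumptions: the function names `merge_csv` and `load_into_hive` are hypothetical, the CSV files are assumed to have no header rows and to end with a newline, and the merged file is written under a name that does not end in `.csv` so it is never picked up as its own input.

```shell
#!/bin/sh
# Hypothetical helper: concatenate every .csv file in a directory
# into a single output file (same effect as `cat *.csv > output`).
merge_csv() {
    src_dir=$1
    out_file=$2
    : > "$out_file"                  # create or truncate the output file
    for f in "$src_dir"/*.csv; do
        [ -f "$f" ] || continue      # skip when the glob matched nothing
        cat "$f" >> "$out_file"
    done
}

# Hypothetical helper: load the merged file into a Hive table using
# the same statement quoted in the thread. Requires `hive` on PATH.
load_into_hive() {
    merged_file=$1
    table=$2
    hive -e "load data local inpath '$merged_file' into table $table"
}
```

A run might look like `merge_csv /data/incoming /tmp/merged.out && load_into_hive /tmp/merged.out my_table`, where both paths and the table name are placeholders.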

RE: Hive query taking too much time

2011-12-07 Thread Savant, Keshav
Hi Wojciech Langiewicz/Paul Mackles, I tried your suggestion and it worked; performance has improved manyfold. Here are the results from my testing after implementing your suggestion:
Number of Files on HDFS | File Size | Select count(*) time taken in seconds | Select count(*) result