Re: LZO compression implementation in Hive

2013-06-17 Thread Sanjay Subramanian
:-) Not sure how to add a page… maybe the admin needs to grant me permission. From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Monday, June 17, 2013 11:50 PM To: "user@hive.apach

Re: hive to hbase mapping

2013-06-17 Thread Sanjay Subramanian
How about you have two streams from your data generation source, one to HBase and one to Hive? Moving data out of HBase may not be trivial, especially if the data sizes are large… From: Mario Casola <mario.cas...@gmail.com> Reply-To: "user@hive.apache.org"

Re: LZO compression implementation in Hive

2013-06-17 Thread Sanjay Subramanian
Sure… would love to add the LZO compression in Hive. Is there a specific page structure you want me to add to in Confluence? https://cwiki.apache.org/confluence Thanks, Sanjay From: Lefty Leverenz <le...@hortonworks.com> Reply-To: "user@h

Re: LZO compression implementation in Hive

2013-06-17 Thread Lefty Leverenz
Perhaps you'd like to write up your insights in the Hive wiki, and others could add their insights. Then the information would be available to all, immediately. – Lefty On Mon, Jun 17, 2013 at 4:39 PM, Ramki Palle wrote: > Hi Sanjay, > > Can you quickly give your insights on this topic, if p

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-17 Thread Shaun Clowes
Thanks for following up Ted, I couldn't work out why the progress tracking was being forced on for Dynamic Partition inserts so thanks for your helpful explanation. I'll raise a JIRA issue regarding the problem. Do you have any idea for an alternate approach? I could have a go at implementing a fix

Re: LZO compression implementation in Hive

2013-06-17 Thread Ramki Palle
Hi Sanjay, Can you quickly give your insights on this topic, if possible? Regards, Ramki. On Mon, May 20, 2013 at 2:51 PM, Sanjay Subramanian < sanjay.subraman...@wizecommerce.com> wrote: > Hi Programming Hive Book authors > > Maybe a lot of you have already successfully implemented this but o

RE: Powershell script and Hive command dfs moveFromLocal dont' work ??

2013-06-17 Thread Matouk IFTISSEN
I have resolved the problem. Instead of a Hive script, I used the hadoop command in a PowerShell script like this: . . $Hadoop_Home = "C:\apps\dist\hadoop-1.1.0-SNAPSHOT\bin\hadoop" $script_transfert_local_hdfs = "fs -moveFromLocal D:\Users\admin\Desktop\rep_depot\fichier_log\ /spir_rep_courant" . .

Re: Powershell script and Hive command dfs moveFromLocal dont' work ??

2013-06-17 Thread Edward Capriolo
When your processing gets this complicated, hive-thrift is probably better than shell scripting. On Monday, June 17, 2013, Matouk IFTISSEN wrote: > Hello > > I have a real problem when I launch a PowerShell script that executes another Hive script below (that moves a file from local to HDFS (hortonwo

Powershell script and Hive command dfs moveFromLocal dont' work ??

2013-06-17 Thread Matouk IFTISSEN
Hello, I have a real problem when I launch a PowerShell script that executes another Hive script below (that moves a file from local to HDFS (Hortonworks on Azure)): Hive script in file "mon_scipt_hive": dfs -moveFromLocal D:\Users\admin\Desktop\rep_depot\sous_rep\ /test_rep_courant and Powershell

Re: hive to hbase mapping

2013-06-17 Thread Mario Casola
Hi, for the first question the answer is yes: with a 500,000-row HBase table, the job completes successfully. Second, the jobs are in the running state. Attached you can see a syslog of one of the jobs. Third, I've tried to set the "hbase.zookeeper.quorum" property but nothing changed. Let me know

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-17 Thread Ted Xu
Hi Shaun, Your findings are valid. Hive uses Hadoop job counters to report fatal errors, so the client can kill the MapReduce job before it completes. With regard to your case, because Hive wants to kill the MapReduce job when there are too many partitions using Dynamic Partitioning, counters repor

Re: When to use bucketed tables with/instead of partitioned tables

2013-06-17 Thread bejoy_ks
Hi Stephen, In addition to join optimization, bucketing helps a lot with sampling as well. It lets you choose the sample space (i.e., n buckets out of m). Regards Bejoy KS Sent from remote device, Please excuse typos -Original Message- From: Stephen Boesch Date: Sun, 16 Jun 2013 11:20:49
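
The point above about choosing a sample space of n buckets out of m can be illustrated with a small sketch. This is a simplified model written for this listing, not Hive's implementation: it assumes an integer key whose hash is the value itself, and the names bucket_of, sample_buckets and user_id are hypothetical. In HiveQL the corresponding feature is the TABLESAMPLE(BUCKET x OUT OF y ON col) clause, where x is 1-based.

```python
# Simplified model of bucket sampling (an illustration, not Hive's code).
# Assumption: the bucketing column is an integer and its hash is the value itself.

def bucket_of(key: int, num_buckets: int) -> int:
    """Return the 0-based bucket for an integer key."""
    # Mask to a non-negative 31-bit value before taking the modulus.
    return (key & 0x7FFFFFFF) % num_buckets


def sample_buckets(rows, key, num_buckets, wanted):
    """Keep only the rows whose key falls into one of the wanted buckets."""
    wanted = set(wanted)
    return [r for r in rows if bucket_of(r[key], num_buckets) in wanted]


# Twelve hypothetical rows, bucketed four ways on user_id.
rows = [{"user_id": i} for i in range(1, 13)]

# Reading one bucket out of four touches roughly a quarter of the data.
sample = sample_buckets(rows, "user_id", num_buckets=4, wanted=[0])
print([r["user_id"] for r in sample])
```

Because rows are assigned to buckets by the hash of the key, picking one bucket yields a repeatable subset and, when the table is already bucketed on that column, only the files for the chosen buckets need to be read.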

Re: Hive Query having virtual column INPUT__FILE__NAME in where clause gives exception

2013-06-17 Thread Jitendra Kumar Singh
Thanks for the replies, guys. The following query also did not work: hive> select count(*), filename from (select INPUT__FILE__NAME as filename from netflow) tmp where filename='vzb.1351794600.0' group by filename; FAILED: SemanticException java.lang.RuntimeException: cannot find field input__file__name from