Re: Hive ExIm from on-premise HDP to Amazon EMR

2016-01-07 Thread Elliot West
Further investigation appears to show this going wrong in a copy phase of the plan. The correctly functioning HDFS → HDFS import copy stage looks like this: STAGE PLANS: Stage: Stage-1 Copy source: hdfs://host:8020/staging/my_table/year_month=2015-12 destination: hdfs://host:8020

How can I get job progress info with java jdbc client?

2016-01-07 Thread Heng Chen

Re: Impact of partitioning on certain queries

2016-01-07 Thread Jörn Franke
This observation is correct, and it is the same behavior you see in other databases that support partitions. Usually you should avoid many small partitions. > On 07 Jan 2016, at 23:53, Mich Talebzadeh wrote: > > Ok we hope that partitioning improves performance where the predicate is on >

Re: How to ensure that the record value of Hive on MapReduce and Hive on Spark are completely consistent?

2016-01-07 Thread Xuefu Zhang
If the record counts are in sync, then the chance of any value disagreement is very low because Hive on Spark and Hive on MR are basically running the same byte code. If there is anything wrong specific to Spark, then the disparity would be much bigger than that. I suggest you test your produ

Re: How to ensure that the record value of Hive on MapReduce and Hive on Spark are completely consistent?

2016-01-07 Thread Jone Zhang
2016-01-08 11:37 GMT+08:00 Jone Zhang : > We made a comparison of the number of records between Hive on MapReduce > and Hive on Spark. And they are in good agreement. > But how to ensure that the record values of Hive on MapReduce and Hive on > Spark are completely consistent? > Do you have any sug

How to ensure that the record value of Hive on MapReduce and Hive on Spark are completely consistent?

2016-01-07 Thread Jone Zhang
We made a comparison of the number of records between Hive on MapReduce and Hive on Spark, and they are in good agreement. But how can we ensure that the record values of Hive on MapReduce and Hive on Spark are completely consistent? Do you have any suggestions? Best wishes. Thanks.
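
One way to go beyond matching record counts is to compare the rows themselves as multisets, ignoring order. The following is only a minimal sketch, not something proposed in the thread: it assumes both result sets have already been fetched into memory as sequences of rows, and the function name diff_result_sets and the sample rows are hypothetical. For large tables the same idea would more likely be applied to per-row hashes or per-partition aggregates computed inside Hive.

```python
from collections import Counter


def diff_result_sets(rows_a, rows_b):
    """Compare two result sets as multisets of rows, ignoring row order.

    Returns two dicts mapping row -> surplus count: rows that appear more
    often in rows_a than in rows_b, and vice versa. Both dicts are empty
    when the result sets contain exactly the same rows.
    """
    count_a = Counter(tuple(row) for row in rows_a)
    count_b = Counter(tuple(row) for row in rows_b)
    # Counter subtraction keeps only positive surpluses.
    return dict(count_a - count_b), dict(count_b - count_a)


# Hypothetical sample data standing in for the output of the same query
# run once on each execution engine.
mr_rows = [(1, "a", 10.0), (2, "b", 20.0), (2, "b", 20.0)]
spark_rows = [(2, "b", 20.0), (1, "a", 10.0), (2, "b", 21.0)]

only_mr, only_spark = diff_result_sets(mr_rows, spark_rows)
print("only in MR result:", only_mr)
print("only in Spark result:", only_spark)
```

Both sample result sets have three rows, so a count comparison alone would report agreement; the multiset comparison surfaces the one row whose value differs.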

Impact of partitioning on certain queries

2016-01-07 Thread Mich Talebzadeh
OK, we hope that partitioning improves performance where the predicate is on partitioned columns. I have two tables. One is a basic table called smallsales, defined as below: CREATE TABLE `smallsales`( `prod_id` bigint,

RE: Trying to run simple Java script against Hive in Eclipse on Windows

2016-01-07 Thread Mich Talebzadeh
Thanks Nick, What I did was this: 1. Copied over $HIVE_HOME/lib from Linux host to my PC 2. Copied over $HADOOP_HOME from Linux host to my PC 3. In Eclipse in project (right click), choose Build Path --> Configure Build Path --> Choose external JARs 4. Add all JAR files from

adding jars - hive on spark cdh 5.4.3

2016-01-07 Thread Ophir Etzion
I'm trying to add jars before running a query using hive on spark on cdh 5.4.3. I've tried applying the patch in https://issues.apache.org/jira/browse/HIVE-12045 (manually, as the patch was made against a different hive version) but still haven't succeeded. Did anyone manage to do ADD JAR successfully with

Hive ExIm from on-premise HDP to Amazon EMR

2016-01-07 Thread Elliot West
Hello, Following on from my earlier post concerning syncing Hive data from an on premise cluster to the cloud, I've been experimenting with the IMPORT/EXPORT functionality to move data from an on-premise HDP cluster to Amazon EMR. I started out with some simple Exports/Imports as these can be the

Re: query execution time in hive

2016-01-07 Thread Awhan Patnaik
Thanks. On Thu, Jan 7, 2016 at 4:17 PM, Mich Talebzadeh wrote: > As far as I can see the easiest option is to use > > > > select from_unixtime(unix_timestamp(), 'dd/MM/ HH:mm:ss.ss') AS > StartTime; > > > > .. your query here > > > > select from_unixtime(unix_timestamp(), 'dd/MM/ HH:mm:s
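
The quoted suggestion brackets the query with two timestamp selects. The same idea can be sketched on the client side; this is an illustration under stated assumptions rather than anything from the thread. The helper name timed and the run_query callable are hypothetical stand-ins for whatever actually submits the query, and the measured time includes client and network overhead, not only execution inside Hive.

```python
import time
from datetime import datetime


def timed(run_query):
    """Run a zero-argument callable and report when it started and how long it took.

    run_query is a hypothetical placeholder for the code that submits the
    query and waits for its result.
    """
    started_at = datetime.now()   # wall-clock start, for logging
    t0 = time.monotonic()         # monotonic clock, for measuring duration
    result = run_query()
    elapsed_seconds = time.monotonic() - t0
    return result, started_at, elapsed_seconds


# Stand-in workload; replace the lambda with the real query call.
result, started_at, elapsed_seconds = timed(lambda: sum(range(1000)))
print(f"started {started_at:%d/%m/%Y %H:%M:%S}, took {elapsed_seconds:.3f}s")
```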

Trying to run simple Java script against Hive in Eclipse on Windows

2016-01-07 Thread Mich Talebzadeh
Hi Java Gurus, I have written a simple Java program that works fine when I run it on Linux as hduser (the OS owner for Hadoop, Hive etc.). When I create a project in Eclipse on Windows and have copied over $HIVE_HOME/lib from Linux to Windows and added as external libraries, the execution