Further investigation appears to show this going wrong in a copy phase of
the plan. The correctly functioning HDFS → HDFS import copy stage looks
like this:
STAGE PLANS:
  Stage: Stage-1
    Copy
      source: hdfs://host:8020/staging/my_table/year_month=2015-12
      destination: hdfs://host:8020
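For context, a plan like the one above comes from an EXPORT/IMPORT round trip. A minimal sketch (table and partition names are taken from the path shown; the staging directory is hypothetical):

```sql
-- Export one partition to an HDFS staging directory, then import it
-- into a copy of the table. Paths/table names are illustrative only.
EXPORT TABLE my_table PARTITION (year_month='2015-12')
  TO '/staging/my_table';

IMPORT TABLE my_table_copy PARTITION (year_month='2015-12')
  FROM '/staging/my_table';
```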
This observation is correct, and it is the same behavior as you see in other
databases supporting partitions. Usually you should avoid many small partitions.
> On 07 Jan 2016, at 23:53, Mich Talebzadeh wrote:
>
> Ok we hope that partitioning improves performance where the predicate is on
>
If the number of records is in sync, then the chance of any value
disagreement is very low, because Hive on Spark and Hive on MR are basically
running the same byte code. If there were anything wrong specific to Spark,
then the disparity would be much bigger than that. I suggest you test your
produ
2016-01-08 11:37 GMT+08:00 Jone Zhang :
> We made a comparison of the number of records between Hive on MapReduce
> and Hive on Spark. And they are in good agreement.
> But how to ensure that the record values of Hive on MapReduce and Hive on
> Spark are completely consistent?
> Do you have any sug
We made a comparison of the number of records between Hive on MapReduce and
Hive on Spark. And they are in good agreement.
But how to ensure that the record values of Hive on MapReduce and Hive on
Spark are completely consistent?
Do you have any suggestions?
Best wishes.
Thanks.
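One common way to go beyond row counts (not suggested in the thread itself, so treat this as an assumption) is to run the same checksum-style aggregate under both engines and compare the results. Table and column names below are hypothetical; `hash()` is Hive's built-in per-row hash function:

```sql
-- Run once with each engine and compare the two result rows;
-- identical values strongly suggest the record contents agree.
SET hive.execution.engine=mr;     -- repeat with hive.execution.engine=spark
SELECT count(*)              AS row_count,
       sum(hash(col1, col2)) AS row_checksum
FROM   my_table;
```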
OK, we hope that partitioning improves performance where the predicate is on
partitioned columns.
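The point above can be sketched with a hypothetical partitioned table: when the predicate is on the partition column, Hive prunes to the matching partition directories instead of scanning the whole table.

```sql
-- Illustrative table; a predicate on the partition column (year_month)
-- lets Hive read only that partition's files.
CREATE TABLE sales (prod_id BIGINT, amount DOUBLE)
PARTITIONED BY (year_month STRING);

SELECT sum(amount)
FROM   sales
WHERE  year_month = '2015-12';   -- only this partition is scanned
```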
I have two tables. One is a basic table called smallsales, defined as below:
CREATE TABLE `smallsales`(
  `prod_id` bigint,
Thanks Nick,
What I did was this:
1. Copied over $HIVE_HOME/lib from the Linux host to my PC
2. Copied over $HADOOP_HOME from the Linux host to my PC
3. In Eclipse, in the project (right click), choose Build Path --> Configure
Build Path --> Choose external JARs
4. Add all JAR files from
I'm trying to add jars before running a query using Hive on Spark on CDH
5.4.3.
I've tried applying the patch in
https://issues.apache.org/jira/browse/HIVE-12045 (manually, as the patch was
done on a different Hive version) but still haven't succeeded.
Did anyone manage to do ADD JAR successfully with
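For reference, this is the kind of statement being attempted (the jar path is hypothetical); the issue in the thread is that on Hive on Spark the added jar may not reach the Spark executors:

```sql
-- Register a jar for the session so its UDFs/classes can be used.
ADD JAR /tmp/my-udfs.jar;
```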
Hello,
Following on from my earlier post concerning syncing Hive data from an
on-premise cluster to the cloud, I've been experimenting with the
IMPORT/EXPORT functionality to move data from an on-premise HDP cluster to
Amazon EMR. I started out with some simple Exports/Imports, as these can be
the
Thanks.
On Thu, Jan 7, 2016 at 4:17 PM, Mich Talebzadeh wrote:
> As far as I can see the easiest option is to use
>
>
>
> select from_unixtime(unix_timestamp(), 'dd/MM/ HH:mm:ss.ss') AS
> StartTime;
>
>
>
> .. your query here
>
>
>
> select from_unixtime(unix_timestamp(), 'dd/MM/ HH:mm:s
Hi Java Gurus,
I have written a simple Java program that works fine when I run it on Linux
as hduser (the OS owner for Hadoop, Hive, etc.).
When I create a project in Eclipse on Windows, having copied over
$HIVE_HOME/lib from Linux to Windows and added it as external libraries, the
execution