> I'll try the simplest query I can reduce it to with loads of memory and see
> if that gets anywhere. Other pointers are much appreciated.
Looks like something I'm testing right now (to make the memory setting
cost-based).
https://issues.apache.org/jira/browse/HIVE-21399
A less
Not sure why you are using SparkThriftServer. OOTB HiveServer2 would be
good enough for this.
Is there any specific reason for moving from tez to spark as execution
engine?
~Rajesh.B
On Mon, Mar 11, 2019 at 9:45 PM Daniel Mateus Pires
wrote:
> Hi there,
>
> I would like to run Hive using
hi Patrick,
Usually a DISTINCT is preferred on primary key columns instead of the
entire table - something typically referred to as SKEWNESS in the
traditional RDBMS world.
Doing it on an array column will typically add to the woes further.
A typical workaround for this done by me in past is to fall back
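A rough illustration of that idea, with a made-up table delta whose primary
key is id:

-- DISTINCT on the key column only, rather than the whole (wide, array-bearing) row
SELECT DISTINCT id FROM delta;

-- if full rows are needed, keep one row per key instead of hashing every column
SELECT *
FROM (
  SELECT d.*, row_number() OVER (PARTITION BY id ORDER BY id) AS rn
  FROM delta d
) t
WHERE rn = 1;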
Hello,
I've just implemented a pipeline based on Apache Flink to synchronize
data between MySQL and Hive (transactional + bucketed) onto an HDP
cluster. The Flink jobs run on YARN.
I've used Orc files but without ACID properties.
Then, we've created external tables on these hdfs directories that
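For illustration only (the schema, names and path below are made up), an
external table over such a directory of Orc delta files looks like this:

CREATE EXTERNAL TABLE orders_delta (
  id         BIGINT,
  status     STRING,
  updated_at TIMESTAMP
)
STORED AS ORC
LOCATION '/data/flink/orders_delta';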
Venkatesh:
Increasing the memory: I've tried even bigger settings; that only made the
error appear after roughly twice as much time.
Dev:
So I know which table is giving the issue. Following your previous
suggestion I did a SELECT DISTINCT * FROM DELTA, which caused the same
issue, so I think the DISTINCT is
hi Patrick,
If it sounds worth trying, please do the same (a rough sketch follows the list):
1. Create physical table from table 1. (with filter clause)
2. Create physical table from table 2. (with filter clause)
3. Create interim table 2_1 with the DISTINCT clause.
4. Create interim table 2_2 with the UNION clause.
5. Do an
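A minimal sketch of steps 1-4 with made-up names (step 5 is cut off above):

-- 1. and 2.: materialize the filtered inputs
CREATE TABLE gold_phys  AS SELECT * FROM table1 WHERE local_date = '2019-01-01';
CREATE TABLE delta_phys AS SELECT * FROM table2 WHERE local_date = '2019-01-01';
-- 3.: interim table with the DISTINCT clause
CREATE TABLE t2_1 AS SELECT DISTINCT * FROM delta_phys;
-- 4.: interim table with the UNION clause
CREATE TABLE t2_2 AS
SELECT * FROM gold_phys
UNION ALL
SELECT * FROM t2_1;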
Patrick,
Can you bump up the mapper memory and see if it helps?
SET mapreduce.map.memory.mb=3072;
SET mapreduce.map.java.opts=-Xmx2560m;
Regards,
Venkatesh
On Mon, Mar 11, 2019 at 7:29 AM Patrick Duin wrote:
> Hi,
>
> I'm running into an OOM issue trying to do a UNION ALL on a bunch of AVRO
>
Hi there,
I would like to run Hive using Spark as the execution engine and I'm pretty
confused by the setup.
For reference, I'm using AWS EMR.
First, I'm confused about the difference between running Hive with Spark as
its execution engine, sending queries to Hive using HiveServer2 (Thrift),
and
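Not an answer to the EMR specifics, but for context: which engine HiveServer2
uses is just a configuration setting, e.g.

SET hive.execution.engine=spark;  -- valid values are mr, tez and spark

and Hive on Spark additionally needs the Spark libraries on Hive's classpath.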
Hi,
I've just implemented a pipeline to synchronize data between MySQL and Hive
(transactional + bucketed) onto an HDP cluster.
I've used Orc files but without ACID properties.
Then, we've created external tables on these hdfs directories that contain
these delta Orc files.
Then, MERGE INTO
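A sketch of what the MERGE step can look like (table and column names are
made up; the target must be a transactional table):

MERGE INTO orders AS t
USING orders_delta AS d
ON t.id = d.id
WHEN MATCHED THEN UPDATE SET status = d.status, updated_at = d.updated_at
WHEN NOT MATCHED THEN INSERT VALUES (d.id, d.status, d.updated_at);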
Very good question. Yes, that gives the same problem.
On Mon, 11 Mar 2019 at 16:28, Devopam Mittra wrote:
> Can you please try doing SELECT DISTINCT * FROM DELTA into a physical
> table first ?
> regards
> Dev
>
>
> On Mon, Mar 11, 2019 at 7:59 PM Patrick Duin wrote:
>
>> Hi,
>>
>> I'm
Can you please try doing SELECT DISTINCT * FROM DELTA into a physical table
first ?
regards
Dev
On Mon, Mar 11, 2019 at 7:59 PM Patrick Duin wrote:
> Hi,
>
> I'm running into an OOM issue trying to do a UNION ALL on a bunch of AVRO
> files.
>
> The query is something like this:
>
> with gold as
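The physical table suggested above could be created along these lines (names
are made up):

CREATE TABLE delta_dedup STORED AS ORC AS
SELECT DISTINCT * FROM delta;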
Hi,
I'm running into an OOM issue trying to do a UNION ALL on a bunch of AVRO
files.
The query is something like this:
with gold as ( select * from table1 where local_date=2019-01-01),
delta as ( select * from table2 where local_date=2019-01-01)
insert overwrite table3 PARTITION
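The query above is cut off; everything in the sketch below beyond the quoted
fragment is guesswork (quoted date literals, a dynamic partition on
local_date, and a DISTINCT over the UNION ALL):

WITH gold  AS (SELECT * FROM table1 WHERE local_date = '2019-01-01'),
     delta AS (SELECT * FROM table2 WHERE local_date = '2019-01-01')
INSERT OVERWRITE TABLE table3 PARTITION (local_date)
SELECT DISTINCT *
FROM (SELECT * FROM gold
      UNION ALL
      SELECT * FROM delta) unioned;
-- dynamic partitioning assumes local_date is the trailing column and that
-- hive.exec.dynamic.partition.mode=nonstrict is set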
Both tables have NO partitions.
From:
"Sahibdeep Singh"
To:
user@hive.apache.org
Date:
2019/03/11 13:09
Subject:
Re: How to update the metadata of a Hive table?
Does the new location have the old partitions as well? Or do some partitions
lie in the old location and some in the new location?
On Sun, Mar
No, it is NOT an external table.
Both tables are managed tables and have no partitions.
The analyze command only updates the row count, not other metadata such as
the table size.
From:
"Ashutosh Bapat"
To:
user@hive.apache.org
Date:
2019/03/11 13:01
Subject:
Re: How to update the metadata of a
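For reference, the statistics commands being discussed are along these lines
(the table name is made up):

-- recompute basic table statistics (row count, sizes)
ANALYZE TABLE my_table COMPUTE STATISTICS;
-- column statistics are a separate pass
ANALYZE TABLE my_table COMPUTE STATISTICS FOR COLUMNS;
-- inspect what the metastore currently records
DESCRIBE FORMATTED my_table;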