Re: out of memory using Union operator and array column type

2019-03-11 Thread Gopal Vijayaraghavan
> I'll try the simplest query I can reduce it to with loads of memory and see if that gets anywhere. Other pointers are much appreciated.
Looks like something I'm testing right now (to make the memory setting cost-based). https://issues.apache.org/jira/browse/HIVE-21399 A less

Re: Running Hive on Spark

2019-03-11 Thread Rajesh Balamohan
Not sure why you are using SparkThriftServer. OOTB HiveServer2 would be good enough for this. Is there any specific reason for moving from Tez to Spark as the execution engine? ~Rajesh.B
On Mon, Mar 11, 2019 at 9:45 PM Daniel Mateus Pires wrote:
> Hi there, I would like to run Hive using

Re: out of memory using Union operator and array column type

2019-03-11 Thread Devopam Mittra
Hi Patrick, a DISTINCT is usually better applied to primary-key columns than to the entire table; the underlying problem is typically referred to as SKEWNESS in the traditional RDBMS world. Doing it on an array column will typically add to the woes. A workaround I have used in the past is to fall back

How to update Hive ACID tables in Flink

2019-03-11 Thread David Morin
Hello, I've just implemented a pipeline based on Apache Flink to synchronize data between MySQL and Hive (transactional + bucketed) on an HDP cluster. The Flink jobs run on YARN. I've used ORC files, but without ACID properties. Then we created external tables on the HDFS directories that

Re: out of memory using Union operator and array column type

2019-03-11 Thread Patrick Duin
Venkatesh: Increasing the memory: I've tried even bigger settings; that only made the error appear after roughly twice as much time. Dev: I know which table is causing the issue. Following your previous suggestion I ran SELECT DISTINCT * FROM DELTA, which causes the same issue, so I think the DISTINCT is

Re: out of memory using Union operator and array column type

2019-03-11 Thread Devopam Mittra
Hi Patrick, if it sounds worth trying, please do the following:
1. Create a physical table from table 1 (with the filter clause).
2. Create a physical table from table 2 (with the filter clause).
3. Create interim table 2_1 with the DISTINCT clause.
4. Create interim table 2_2 with the UNION clause.
5. Do an
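The staging steps listed above could be scripted roughly as follows. This is a minimal sketch in Python that only renders HiveQL strings; it does not connect to Hive. The function name `staging_statements`, the `tmp_` table names, and the choice to UNION ALL the staged gold table with the de-duplicated delta are assumptions for illustration, since the thread shows neither the full query nor step 5.

```python
def staging_statements(gold_src, delta_src, partition_filter, prefix="tmp_"):
    """Render one CREATE TABLE ... AS SELECT per staging step.

    All table names derived from `prefix` are hypothetical; the thread
    does not name the interim tables.
    """
    gold = f"{prefix}gold"
    delta = f"{prefix}delta"
    delta_distinct = f"{prefix}delta_distinct"
    unioned = f"{prefix}union"
    return [
        # Steps 1 and 2: materialise each filtered source as a physical table.
        f"CREATE TABLE {gold} AS SELECT * FROM {gold_src} WHERE {partition_filter}",
        f"CREATE TABLE {delta} AS SELECT * FROM {delta_src} WHERE {partition_filter}",
        # Step 3: run the DISTINCT on its own, against the staged delta.
        f"CREATE TABLE {delta_distinct} AS SELECT DISTINCT * FROM {delta}",
        # Step 4: run the UNION on its own (assumed here to be UNION ALL).
        f"CREATE TABLE {unioned} AS "
        f"SELECT * FROM {gold} UNION ALL SELECT * FROM {delta_distinct}",
    ]


if __name__ == "__main__":
    for stmt in staging_statements("table1", "table2", "local_date = '2019-01-01'"):
        print(stmt + ";")
```

Running each statement separately makes it easier to see which step exhausts memory, which is the point of staging the DISTINCT and the UNION in separate tables.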

Re: out of memory using Union operator and array column type

2019-03-11 Thread Venkatesh Selvaraj
Patrick, can you bump up the mapper memory and see if it helps? SET mapreduce.map.memory.mb=3072; SET mapreduce.map.java.opts=-Xmx2560m; Regards, Venkatesh
On Mon, Mar 11, 2019 at 7:29 AM Patrick Duin wrote:
> Hi, I'm running into oom issue trying to do a Union all on a bunch of AVRO
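One way to keep the two settings above consistent is to derive the JVM heap from the container size. The following is a minimal Python sketch that only formats the SET statements; the helper name `map_memory_settings` and the `headroom_mb` parameter are hypothetical. It assumes the heap should stay below the YARN container size so that non-heap JVM memory still fits, which matches the 3072 / 2560 pair suggested in the message.

```python
def map_memory_settings(container_mb, headroom_mb=512):
    """Render the two mapper memory settings as HiveQL SET statements.

    container_mb: size requested for each map container, in MB.
    headroom_mb:  memory left outside the JVM heap (an assumed default).
    """
    if headroom_mb <= 0 or headroom_mb >= container_mb:
        raise ValueError("headroom_mb must be positive and smaller than container_mb")
    heap_mb = container_mb - headroom_mb
    return [
        f"SET mapreduce.map.memory.mb={container_mb};",
        f"SET mapreduce.map.java.opts=-Xmx{heap_mb}m;",
    ]


if __name__ == "__main__":
    # Reproduces the values from the message: 3072 MB container, 2560 MB heap.
    for line in map_memory_settings(3072, headroom_mb=512):
        print(line)
```

As Patrick reports elsewhere in the thread, larger settings only delayed the error, so this is a tuning aid and not a fix for the underlying query.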

Running Hive on Spark

2019-03-11 Thread Daniel Mateus Pires
Hi there, I would like to run Hive using Spark as the execution engine and I'm pretty confused about the setup. For reference, I'm using AWS EMR. First, I'm confused about the difference between running Hive with Spark as its execution engine, sending queries to Hive using HiveServer2 (Thrift), and

Re: Read Hive ACID tables in Spark or Pig

2019-03-11 Thread David Morin
Hi, I've just implemented a pipeline to synchronize data between MySQL and Hive (transactional + bucketed) on an HDP cluster. I've used ORC files, but without ACID properties. Then we created external tables on the HDFS directories that contain these delta ORC files. Then, MERGE INTO

Re: out of memory using Union operator and array column type

2019-03-11 Thread Patrick Duin
Very good question. Yes, that does give the same problem.
On Mon, Mar 11, 2019 at 16:28, Devopam Mittra wrote:
> Can you please try doing SELECT DISTINCT * FROM DELTA into a physical table first? Regards, Dev
> On Mon, Mar 11, 2019 at 7:59 PM Patrick Duin wrote:
>> Hi, I'm

Re: out of memory using Union operator and array column type

2019-03-11 Thread Devopam Mittra
Can you please try doing SELECT DISTINCT * FROM DELTA into a physical table first? Regards, Dev
On Mon, Mar 11, 2019 at 7:59 PM Patrick Duin wrote:
> Hi, I'm running into oom issue trying to do a Union all on a bunch of AVRO files. The query is something like this: with gold as

out of memory using Union operator and array column type

2019-03-11 Thread Patrick Duin
Hi, I'm running into an OOM issue trying to do a UNION ALL on a bunch of Avro files. The query is something like this: with gold as ( select * from table1 where local_date='2019-01-01'), delta as ( select * from table2 where local_date='2019-01-01') insert overwrite table3 PARTITION

Reply: Re: How to update the metadata of a Hive table?

2019-03-11 Thread luby
Both tables have NO partitions.
From: "Sahibdeep Singh" To: user@hive.apache.org Date: 2019/03/11 13:09 Subject: Re: How to update the metadata of a Hive table?
Does the new location have old partitions as well? OR some partitions lie in old location and some in new location? On Sun, Mar

Reply: Re: How to update the metadata of a Hive table?

2019-03-11 Thread luby
No, it is NOT an external table. Both tables are managed tables and have no partitions. The ANALYZE command only updates the row count, not other metadata such as table size.
From: "Ashutosh Bapat" To: user@hive.apache.org Date: 2019/03/11 13:01 Subject: Re: How to update the metadata of a