Re: Query Failures
https://community.cloudera.com/t5/Support-Questions/Map-and-Reduce-Error-Java-heap-space/td-p/45874

On Fri, Feb 14, 2020, 6:58 PM David Mollitor wrote:
> Hive has many optimizations. One is that it will load the data directly
> from storage (HDFS) if it's a trivial query. For example:
>
> Select * from table limit 10;
>
> In natural language it says "give me any ten rows (if available) from the
> table." You don't need the overhead of launching a full mapreduce job for
> this. Just read the rows from the file directly.
>
> Adding additional predicates on the query requires a mapreduce job to do
> the heavy lifting. The error message you're getting is probably the result
> of a failed mapreduce job. Nine times out of ten, the problem is that the
> mappers/reducers are not granted enough memory for their YARN containers.
Re: Query Failures
Hive has many optimizations. One is that it reads the data directly from storage (HDFS) when the query is trivial. For example:

Select * from table limit 10;

In natural language it says "give me any ten rows (if available) from the table." You don't need the overhead of launching a full MapReduce job for this; Hive just reads the rows from the file directly.

Adding predicates to the query generally requires a MapReduce job to do the heavy lifting. The error message you're getting is probably the result of a failed MapReduce job. Nine times out of ten, the problem is that the mappers/reducers are not granted enough memory for their YARN containers.

On Tue, Feb 11, 2020, 10:41 AM Pau Tallada wrote:
> Hi,
>
> Do you have more complete tracebacks?
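As a rough sketch of the memory point in the reply above: on a Hive installation that runs on MapReduce over YARN, the container and heap sizes can usually be raised per session before re-running a failing query. The property names below are standard Hadoop MapReduce settings, but the values, the table `web_logs`, and the column `status` are assumptions for illustration, not taken from this thread. Very old Hadoop versions may use different property names, and the cluster may cap requests through `yarn.scheduler.maximum-allocation-mb`.

```sql
-- Illustrative values only; size them to your cluster's YARN limits.
-- Container size requested from YARN for each mapper and reducer, in MB.
SET mapreduce.map.memory.mb=4096;
SET mapreduce.reduce.memory.mb=8192;

-- JVM heap inside each container, kept below the container size
-- (roughly 80% here) to leave room for non-heap memory.
SET mapreduce.map.java.opts=-Xmx3276m;
SET mapreduce.reduce.java.opts=-Xmx6553m;

-- Re-run the failing query in the same session (hypothetical table/column).
SELECT DISTINCT status FROM web_logs;
```

If the query then succeeds, the same values could be set as defaults by an administrator rather than per session. If it still fails, the task logs of the failed job are the next place to look.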
Re: Query Failures
Hi,

Do you have more complete tracebacks?

Message from Charles Givre on Tue., 11 Feb. 2020 at 2:54:
> Hello Everyone!
> I recently joined a project that has a Hive/Impala installation, and we are
> experiencing a significant number of query failures. We are using an older
> version of Hive, and unfortunately there's nothing I can do about that,
> but I'm wondering how I can make Hive do better with queries to give our
> users a better experience.
>
> For example, I can execute a basic SELECT * query or SELECT query
> without issues.
>
> However, if I attempt to:
> 1. Add filters
> 2. Do a SELECT DISTINCT
> 3. Perform basic aggregation
>
> I get errors like this: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.
>
> Could someone point me to some good guides for querying Hive and/or
> assisting my engineers in preventing these errors?
> Thanks,

--
Pau Tallada Crespí
Dep. d'Astrofísica i Cosmologia
Port d'Informació Científica (PIC)
Tel: +34 93 170 2729
--
Query Failures
Hello Everyone!

I recently joined a project that has a Hive/Impala installation, and we are experiencing a significant number of query failures. We are using an older version of Hive, and unfortunately there's nothing I can do about that, but I'm wondering how I can make Hive do better with queries to give our users a better experience.

For example, I can execute a basic SELECT * query or SELECT query without issues.

However, if I attempt to:
1. Add filters
2. Do a SELECT DISTINCT
3. Perform basic aggregation

I get errors like this: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask.

Could someone point me to some good guides for querying Hive and/or assisting my engineers in preventing these errors?

Thanks,
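One way to see why the plain SELECT behaves differently from the three cases listed above is to compare query plans with Hive's EXPLAIN statement. This is only a sketch: the table `web_logs` and the column `status` are hypothetical, and which queries stay fetch-only depends on the Hive version and the `hive.fetch.task.conversion` setting.

```sql
-- Hypothetical table and column names, for illustration only.

-- Trivial query: the plan is expected to show only a fetch stage.
EXPLAIN SELECT * FROM web_logs LIMIT 10;

-- Filter, DISTINCT, and aggregation: on an older Hive running on MapReduce,
-- these plans typically include a Map Reduce stage.
EXPLAIN SELECT * FROM web_logs WHERE status = 500;
EXPLAIN SELECT DISTINCT status FROM web_logs;
EXPLAIN SELECT status, COUNT(*) FROM web_logs GROUP BY status;
```

A plan that contains a Map Reduce stage means the query is submitted to the cluster as a job, which is where an error such as the MapRedTask failure above would originate.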