I am trying to enable column statistics usage with Parquet tables. This is
the query I am executing. However, on explain, I see that even though *Basic
stats: COMPLETE* is shown, *Column stats* is shown as *NONE*.
Can someone please explain what else I need to do to debug/fix this?
set
Looks like it's caused by HIVE-7314. Could you try that with
hive.cache.expr.evaluation=false?
Thanks,
Navis
2014-07-24 14:34 GMT+09:00 丁桂涛(桂花) dinggui...@baixing.com:
Yes. The output is correct: [tp, p, sp].
I developed the UDF in Java using Eclipse and exported the jar file into
the auxlib directory.
Hello,
I have a csv file with columns that contain commas within a quoted string.
Ex: column name *'Issue'*, value *Other (phone, health
club, etc)*.
*Question:* What should the data type of 'Issue' be? Or how should I format
the table (ROW FORMAT DELIMITED ... TERMINATED BY ...) so that
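For reference (not an answer given in this thread): ROW FORMAT DELIMITED cannot honor quote characters, so embedded commas split fields. One common approach is a CSV SerDe instead; a sketch, assuming the OpenCSVSerde class that ships with later Hive releases (table and column names are hypothetical):

```sql
-- Note: OpenCSVSerde treats every column as STRING; cast in queries as needed.
CREATE TABLE complaints (
  issue       STRING,
  description STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\""
);
```

With this SerDe, a value like `"Other (phone, health club, etc)"` loads as a single field despite the embedded commas.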
Stuck. Need help.
I created a small table with multiple partitions, desc (id int, term int)
partitioned by id. Whenever I run analyze on any id I get perfectly good
answers. I am unable to figure out the difference each file is making.
New table
Table Parameters:
Yeah. After setting hive.cache.expr.evaluation=false, all queries output
expected results.
And I found that it's related to the getDisplayString function in the UDF.
At first the function returned the same string regardless of its parameters,
and I had to set hive.cache.expr.evaluation = false.
But
Can anyone please help with this ?
I followed the advice here
http://stackoverflow.com/questions/20390217/mapreduce-job-in-headless-environment-fails-n-times-due-to-am-container-exceptio
and added the following properties to mapred-site.xml, but I am still getting
the same error.
Hey Guys,
I'm working with HiveServer2. I know the HiveServer holds a session for each
client, and closes it when the client executes 'CloseSession'.
But if the client is forced to terminate, like Ctrl+Z or kill -9, the
session in HiveServer will not be closed.
Does there exist a
https://issues.apache.org/jira/browse/HIVE-5799 is for that kind of case,
but it is not included in any release yet.
Thanks,
Navis
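For context: HIVE-5799 eventually shipped as idle-session/operation timeouts for HiveServer2. A sketch of the relevant settings, assuming a release that includes the patch (values here are illustrative, not recommendations):

```sql
-- Close sessions idle for more than 1 hour (milliseconds).
set hive.server2.idle.session.timeout=3600000;
-- Cancel operations idle for more than 1 hour.
set hive.server2.idle.operation.timeout=3600000;
-- How often the background thread checks for expired sessions.
set hive.server2.session.check.interval=900000;
```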
2014-07-24 20:04 GMT+09:00 Zhanghe (D) crane.zh...@huawei.com:
Hey Guys,
I'm working with HiveServer2. I know the HiveServer holds a session for
each client, and
Well, the problem didn't exactly get solved, but I observed that this behavior
persists when I partition my table by a date-type column; otherwise it's
working. Maybe it's worth filing an issue.
Thank you
From: Navdeep Agrawal [mailto:navdeep_agra...@symantec.com]
Sent: Thursday, July 24, 2014 1:22 PM
To:
I am trying to aggregate one column of decimal type, which returns null. If I
cast this column to double, it returns a value. Following are the steps to
recreate this scenario:
CREATE TABLE salestemp(sku int, sales decimal);
LOAD DATA LOCAL INPATH
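A likely cause (an assumption, not confirmed in this thread): since Hive 0.13, a bare DECIMAL defaults to DECIMAL(10,0), so input values with fractional digits can fail to parse and load as NULL, making SUM over the column NULL. Declaring precision and scale explicitly avoids this:

```sql
-- Explicit precision/scale so fractional sales values load correctly.
CREATE TABLE salestemp (sku INT, sales DECIMAL(10,2));

-- With well-formed input, the aggregate should then be non-NULL.
SELECT SUM(sales) FROM salestemp;
```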
Hi All,
I hope I’m not duplicating a previous question, but I couldn’t find any search
functionality for the user list archives.
I have written a relatively simple Python script that is meant to take a field
from a Hive query and transform it (just some string processing through a dict)
given
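The usual way to plug such a script into a query is Hive's TRANSFORM clause with a streaming script; a minimal sketch (script, table, and column names are hypothetical):

```sql
-- Ship the script to the cluster, then stream rows through it.
ADD FILE transform.py;

SELECT TRANSFORM (raw_field)
       USING 'python transform.py'
       AS (mapped_field)
FROM some_table;
```

The script reads tab-separated rows on stdin and writes transformed rows to stdout, one per input line.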
You have to explicitly specify the column list in the analyze command to
gather column stats.
This command will only collect basic stats like number of rows, total file
size, raw data size, number of files.
analyze table user_table partition(dt='2014-06-01',hour='00') compute
statistics;
To
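Concretely, the same command with an explicit column list (column names here are hypothetical) gathers column-level stats:

```sql
-- FOR COLUMNS with an explicit list computes per-column statistics
-- (min/max, NDV, nulls) in addition to the basic stats.
ANALYZE TABLE user_table PARTITION (dt='2014-06-01', hour='00')
COMPUTE STATISTICS FOR COLUMNS user_id, page_views;
```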
If I do a join of a table based on a txt file and a table based on HBase, and
say the latter is very large, is Hive smart enough to utilize the HBase
table's index to do the join, instead of implementing this as a regular
MapReduce job, where each table is scanned fully, bucketed on join keys, and
kind of found this
http://hortonworks.com/blog/hbase-via-hive-part-1/
From a performance perspective, there are things Hive can do today (ie,
not dependent on data types) to take advantage of HBase. There’s also
the possibility of an HBase-aware Hive to make use of HBase tables as
intermediate
Are you trying to read the Avro file directly in your UDF? If so, that is not
the correct way to do it in a UDF.
Hive supports Avro files natively. I don't know your UDF requirement, but here
is what I would normally do:
Create the table in Hive using AvroContainerInputFormat:
create external
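A sketch of such a table definition, using the standard Avro SerDe and container input/output format classes (the table name, location, and schema path are hypothetical):

```sql
-- External table over existing Avro files; the schema is read from
-- the .avsc file referenced in TBLPROPERTIES.
CREATE EXTERNAL TABLE my_avro_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/path/to/avro/data'
TBLPROPERTIES ('avro.schema.url'='hdfs:///path/to/schema.avsc');
```

The UDF then operates on ordinary Hive columns; it never touches the Avro files directly.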
if we have a huge table, and every hour only 1% of it has some updates,
it would be a huge waste to slurp in the whole table through an MR job and
write out the new table.
instead, if we store this table in HBASE, and use the current HBase+Hive
integration, as long as we can do upsert, then we
Hi Yang. That's correct. You should check out the HBase UDFs in Klout's
Brickhouse library
https://github.com/klout/brickhouse/tree/master/src/main/java/brickhouse/hbase
On Jul 24, 2014 8:07 PM, Yang tedd...@gmail.com wrote:
if we have a huge table, and every 1 hour only 1% of that has some
I don't think the HBase-Hive integration is that smart, i.e. able to utilize
the index existing in HBase. But I think it depends on the version you are
using.
From my experience, there is a lot of room for improvement in the HBase-Hive
integration, especially pushing down logic into the HBase engine.
I am trying to create a table in Hive. It's a very long script containing a
large number of columns, and it also contains complex fields like STRUCT,
ARRAY, etc.
* Cannot create the full table in one shot using a CREATE TABLE statement, so
need to first run CREATE and then ALTER
* If fields
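As an illustration of the CREATE-then-ALTER approach with complex types (table and column names are hypothetical):

```sql
-- Start with a subset of the columns, including complex types.
CREATE TABLE events (
  id   INT,
  tags ARRAY<STRING>,
  loc  STRUCT<lat:DOUBLE, lon:DOUBLE>
);

-- Add the remaining columns in subsequent statements.
ALTER TABLE events ADD COLUMNS (attrs MAP<STRING,STRING>);
```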
What version of hive are you using? What file format are you using?
Thanks
Prasanth Jayachandran
On Jul 24, 2014, at 5:03 PM, azaz.ras...@wipro.com azaz.ras...@wipro.com
wrote:
I am trying to Create a table in Hive. It’s a very long script contained
large number of columns and also contains
Are you using MySQL or Postgres for the Metastore database?
On Jul 24, 2014 9:08 PM, Prasanth Jayachandran
pjayachand...@hortonworks.com wrote:
What version of hive are you using? What file format are you using?
Thanks
Prasanth Jayachandran
On Jul 24, 2014, at 5:03 PM,
The following article about using Klout's Brickhouse library to access an
HBase table as a map through its key might be useful.
http://brickhouseconfessions.wordpress.com/2013/08/06/squash-the-long-tail-with-brickhouses-hbase-udfs/
On Jul 24, 2014 8:56 PM, Andrew Mains andrew.ma...@kontagent.com
Hi,
The actual useful part of the error is:
Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
If you do a search for this plus EC2 in Google you will find a couple of
results that point to memory exhaustion issues. You should try increasing
the configured memory
try
select sum(sales) from salestemp where sales is not null;
On Thu, Jul 24, 2014 at 11:10 PM, Abhishek Gayakwad a.gayak...@gmail.com
wrote:
I am trying to aggregate one column of decimal type, which is returning me
null. If I cast this column to double it returns me some value. following
Thanks all. I created a new database and it works fine there.
On Sat, Jul 19, 2014 at 1:37 PM, Lefty Leverenz leftylever...@gmail.com
wrote:
And now it's documented in the DDL wiki:
- Use Database
Hi,
Thanks for your reply. Have been following links for the past two days now.
Finally got Hadoop natively compiled. Let's see if that solves the problem.
Yes, increasing the memory was on my list, but I think I tried that and it
didn't work.
Memory can be an issue, as it is working perfectly fine for