Subject: Successful Implementation of Bucket Map Join
Hi,
I hope this message finds you well.
I wanted to express my gratitude for the detailed instructions you provided
on setting up the Bucket Map Join. Your guidance proved to be extremely
helpful and, following your steps, I am pleased to con
Hi,
I understand you are trying MapReduce! I recommend you use Tez unless you
have special reasons. Tez is the recommended engine and I guess more
community members use Hive 3 on Tez. It means you are more likely to get
answers when you encounter trouble.
Quickly, I succeeded in enabling Bucket M
Hello,
First of all, I would like to express my gratitude for your responses and
assistance. I’m currently encountering a scenario where my Hive is not
choosing BucketMapJoin, and I wonder whether this is due to its underlying
execution engine, which is MapReduce.
In addition, I am operating in a
Hi smart li,
As far as I tried with Hive 3.1.2 on Tez, Bucket Map Join was probably
triggered. My configurations could be different from yours, though.
# How I tested
## hive-site.xml
https://github.com/zookage/zookage/blob/v0.2.3/kubernetes/base/common/config/hive/hive-site.xml
## Prepare tes
Hello Hive Users,
I’m currently trying to understand how Bucket Map Join works in Hive, but
I’m encountering some issues that I need help with. Here’s what I did:
Firstly, I created a Hive table using the following statement:
create table map_join_tb(
id int
)
clustered by (id) into 32 buckets;
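A minimal sketch of how a bucket map join between two tables bucketed like the one above might be set up and inspected. The second table, map_join_tb_2, is hypothetical and does not appear in the original post. The three properties are existing Hive settings, but whether the planner actually chooses a bucket map join depends on the Hive version, the execution engine, and the table sizes, so treat this as a starting point rather than a guaranteed recipe.

```sql
-- Assumption: map_join_tb_2 is a hypothetical second table, bucketed on the
-- same column into the same number of buckets as map_join_tb.
CREATE TABLE map_join_tb_2 (
  id INT
)
CLUSTERED BY (id) INTO 32 BUCKETS;

-- Allow the planner to convert joins into map joins.
SET hive.auto.convert.join = true;
-- Bucket map join switch used with the MapReduce engine.
SET hive.optimize.bucketmapjoin = true;
-- Bucket map join switch used with the Tez engine.
SET hive.convert.join.bucket.mapjoin.tez = true;

-- Inspect the plan to see which join operator was chosen.
EXPLAIN
SELECT a.id
FROM map_join_tb a
JOIN map_join_tb_2 b
  ON a.id = b.id;
```

The EXPLAIN output is the place to check the result: if the plan still shows an ordinary shuffle join or a plain map join, the settings or table layout did not meet the planner's conditions.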
Hi Julien,
See my answers below:
> On Sep 19, 2019, at 21:55, Julien Phalip wrote:
>
> Hi,
>
> I'm interested in a new config property that was added as part of HIVE-20508
> <https://issues.apache.org/jira/browse/HIVE-20508>, and had a few questions:
>
>
Hi,
I'm interested in a new config property that was added as part of HIVE-20508
<https://issues.apache.org/jira/browse/HIVE-20508>, and had a few questions:
1) The update was merged
<https://github.com/apache/hive/commit/494b771ac02455d1a162570fa5fd26be55e0152c>
into
the maste
Hi everybody,
I have some questions concerning the from_utc_timestamp date function in
Hive:
1. I have a datetime value which is meant to be in UTC time. By using
from_utc_timestamp I want to get the correct unix timestamp out of this date. I
would think that I don't have to d
For question 1, if hive.server2.enable.doAs is set to true, the AppMaster
fails to connect to LLAP daemons (from my experiments).
--- Sungwoo
On Fri, May 18, 2018 at 1:02 AM, Sungwoo Park wrote:
> Hello,
>
> I have a couple of questions on LLAP and hive.server2.enable.doAs. I've
Hello,
I have a couple of questions on LLAP and hive.server2.enable.doAs. I've
learned that LLAP does not support hive.server2.enable.doAs=true, but what
if we disable LLAP IO? If LLAP IO is disabled and no cache is used in LLAP
daemons, I guess it should be okay to allow hive.se
Hi All-
I have a few questions on Hive. I have been going through the documentation and
checked with a couple of people I know, but couldn't get a satisfactory answer. I
would appreciate it if someone could shed light on this.
MAP JOIN - During the map join phase - before the actual MR task gets
Hi, I'm a newbie to Hive and recently I need to do some data warehousing
work with Hive.
The business data is inside a SQL Server database and I need to
extract the data from the tables inside the database first into
HDFS and then into Hive.
The current design is like this, I first use Apa
Can someone help answer the questions? Thanks
--
Sent from my NetEase Mail tablet edition
On 2016-01-28 22:11:29, Todd wrote:
Hi,
I am using Hive 0.14, and I am using JDBC to connect to the Hive Thrift server to
run queries. I have encountered two issues:
1. When the query is issued, how can I get the job id (mapreduce
Hi,
I am using Hive 0.14, and I am using JDBC to connect to the Hive Thrift server to
run queries. I have encountered two issues:
1. When the query is issued, how can I get the job id (of the MapReduce job that runs the
query), so that I have a chance to kill the job?
2. I want to execute a sql file
ployees, unless expressly so stated. It is
> the responsibility of the recipient to ensure that this email is virus
> free, therefore neither Peridale Ltd, its subsidiaries nor their employees
> accept any responsibility.
>
>
>
> *From:* Xuefu Zhang [mailto:xzh...@cloudera.c
sponsibility.
From: Xuefu Zhang [mailto:xzh...@cloudera.com]
Sent: 28 November 2015 20:53
To: user@hive.apache.org
Cc: d...@hive.apache.org
Subject: Re: Answers to recent questions on Hive on Spark
You should be able to set that property as any other Hive property: just do
"set hi
Mich
>
>
>
> *From:* Xuefu Zhang [mailto:xzh...@cloudera.com]
> *Sent:* 28 November 2015 04:35
> *To:* user@hive.apache.org
> *Cc:* d...@hive.apache.org
> *Subject:* Re: Answers to recent questions on Hive on Spark
>
>
>
> Okay. I think I know what problem you have now.
idale Technology Ltd, its
subsidiaries or their employees, unless expressly so stated. It is the
responsibility of the recipient to ensure that this email is virus free,
therefore neither Peridale Ltd, its subsidiaries nor their employees accept any
responsibility.
From: Xuefu Zhang [mailto
...@cloudera.com]
Sent: 28 November 2015 04:35
To: user@hive.apache.org
Cc: d...@hive.apache.org
Subject: Re: Answers to recent questions on Hive on Spark
Okay. I think I know what problem you have now. To run Hive on Spark,
spark-assembly.jar is needed and it's also recommended that you
cloudera.com]
> *Sent:* 28 November 2015 2:12
> *To:* user@hive.apache.org; d...@hive.apache.org
> *Subject:* Answers to recent questions on Hive on Spark
>
>
>
> Hi there,
>
> There seems to be increasing interest in Hive on Spark from Hive users.
> I understand that there have been a
,
Xuefu
On Fri, Nov 27, 2015 at 4:34 PM, Mich Talebzadeh
wrote:
> Hi,
>
>
>
> Thanks for heads up and comments.
>
>
>
> Sounds like when it comes to using spark as the execution engine for Hive,
> we are in no man’s land so to speak. I have opened questions in bo
questions on Hive on Spark
Hi there,
There seems to be increasing interest in Hive on Spark from Hive users. I
understand that there have been a few questions or problems reported and I can
see some frustration sometimes. It's impossible for the Hive on Spark team to
respond to every inquiry even th
Hi,
Thanks for heads up and comments.
Sounds like when it comes to using spark as the execution engine for Hive, we
are in no man’s land so to speak. I have opened questions in both Hive and
Spark user forums. Not much of luck for reasons that you alluded to.
Ok just to clarify the
Hi there,
There seems to be increasing interest in Hive on Spark from Hive users. I
understand that there have been a few questions or problems reported and I
can see some frustration sometimes. It's impossible for the Hive on Spark team
to respond to every inquiry even though we wish we
dress":"1.74.164.206","url":"http://ustream.tv/molestie/lorem.jpg"}}
I have two questions now:
1) Is there a way to map the single fields of the struct to its own columns
of a hive table?
2) Is there a way to query a union type? For example, if I want to query
the
Hi everyone,
Consider the SQL:
SELECT thumbnail(product_image)
FROM advertisements
WHERE product_name = 'Brownie';
The product_image field is a reference to a multi-megabyte image object.
The thumbnail method reads in this object,
I hope you don't mind me cc'ing user-group so that this q&a is
available for others as well.
The grant/revoke based authorization models (including the new
SQL-standards based authorization in Hive 0.13) do not automatically
ensure that the user has the necessary privileges on HDFS dirs and files.
T
Hi
I tried to answer as best I could. I think Beeline logging could still use
some improvements.
On Tue, Apr 15, 2014 at 7:48 AM, Uber Slacker wrote:
> Hi folks. I've been trying out Beeline and have a couple questions. I
> haven't seen a ton of documentation or examples
Hi folks. I've been trying out Beeline and have a couple questions. I
haven't seen a ton of documentation or examples of people using Beeline, so
I wanted to bounce these off you...
- When I use Beeline in embedded mode - connecting with jdbc:hive2:// (I'm
running on t
Thanks. It worked for me now when i use it as an empty string.
From: Krishnan K
To: "user@hive.apache.org" ; Raj Hadoop
Sent: Thursday, October 17, 2013 11:11 AM
Subject: Re: Hive Query Questions - is null in WHERE
For string columns, nu
For string columns, null will be interpreted as an empty string and for
others, it will be interpreted as null...
On Wednesday, October 16, 2013, Raj Hadoop wrote:
> All,
>
> When a query is executed like the below
>
> select field1 from table1 where field1 is null;
>
> I am getting the resul
All,
When a query is executed like the below
select field1 from table1 where field1 is null;
I am getting results which have empty values or nulls in field1. How does
is null work in Hive queries?
Thanks,
Raj
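To make the question above concrete, here is a short sketch that separates true NULLs from empty strings, reusing table1 and field1 from the query in the message. The final statement is an assumption: it only applies if table1 is a text table read by Hive's default LazySimpleSerDe, where the serialization.null.format property controls which stored value is read back as NULL.

```sql
-- Rows where field1 is a true NULL.
SELECT field1 FROM table1 WHERE field1 IS NULL;

-- Rows where field1 is an empty string, which is not NULL.
SELECT field1 FROM table1 WHERE field1 = '';

-- Count both cases side by side to see what the table really contains.
SELECT
  SUM(CASE WHEN field1 IS NULL THEN 1 ELSE 0 END) AS null_rows,
  SUM(CASE WHEN field1 = ''    THEN 1 ELSE 0 END) AS empty_rows
FROM table1;

-- Assumption: text table using LazySimpleSerDe. With this setting, empty
-- fields in the underlying files are read back as NULL.
ALTER TABLE table1 SET SERDEPROPERTIES ('serialization.null.format' = '');
```

Running the count query before and after changing the SerDe property shows whether the empty values were stored as empty strings or were NULLs all along.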
Select * from table without any where condition will never run an MR job.
Hive does not cache your query results. If you rerun your query, everything
will be repeated on each run.
How many days you should keep data... as long as it means something to you
or it has some value in storing for futu
Hi
I was wondering if it is right to assume:
1. The first time we create a table in Hive and load it, running the
first query like
Select * from Table1
will result in an MR job running and will get the data to us.
If we run the same query a second time, the MR job will not run but will res
Hello Everyone:
I'm using Hive 0.9 with the MapR ODBC driver against Cloudera. I'm relatively new to
Hadoop and have been looking at it for about a week to add it to our query tool.
CREATE DATABASE test;
CREATE TABLE test.table1
(col1 int);
describe table1;
I'm getting the message "Table table1 does
fColumns=meta.getColumnCount();
> >
> >
> >
> > System.out.println("Result:");
> >
> > while (res.next()) {
> >
> > for (int i=1;i<=numberOfColumns;i++){
> >
> > System.out.print(Str
System.out.println("Result:");
>
> while (res.next()) {
>
> for (int i=1;i<=numberOfColumns;i++){
>
> System.out.print(String.valueOf("\t" +
> res.getString(i)));
>
> }
>
>
Monday, September 17, 2012 12:39 AM
To: hive-u...@hadoop.apache.org
Subject: Questions about Hive
Note: I am a newbie to Hive.
Can someone please answer the following questions?
1) Does Hive provide APIs (like HBase does) that can be used to retrieve data
from the tables in Hive from a Java
works out. Thanks.
>
>
>
> On Sun, Sep 16, 2012 at 10:51 PM, Tim Robertson > wrote:
>
>> Note: I am a newbie to Hive.
>>>
>>> Can someone please answer the following questions?
>>>
>>> 1) Does Hive provide APIs (like HBase does) that
etezza
can.
If performance isn't a concern, then I guess it could be a useful tool.
Will try it out & see how it works out. Thanks.
On Sun, Sep 16, 2012 at 10:51 PM, Tim Robertson
wrote:
> Note: I am a newbie to Hive.
>>
>> Can someone please answer the following ques
>
> Note: I am a newbie to Hive.
>
> Can someone please answer the following questions?
>
> 1) Does Hive provide APIs (like HBase does) that can be used to retrieve
> data from the tables in Hive from a Java program? I heard somewhere that
> the data can be accesse
s
Anand B
From: Something Something [mailto:mailinglist...@gmail.com]
Sent: 17 September 2012 11:09
To: hive-u...@hadoop.apache.org
Subject: Questions about Hive
Note: I am a newbie to Hive.
Can someone please answer the following questions?
1) Does Hive provide APIs (like HBase does) that c
Note: I am a newbie to Hive.
Can someone please answer the following questions?
1) Does Hive provide APIs (like HBase does) that can be used to retrieve
data from the tables in Hive from a Java program? I heard somewhere that
the data can be accessed with JDBC (style) APIs. True?
2) I
Hi,
When I try to use Hive Indexing, I have the following questions.
1. Does Indexing have the same performance on both the partitioned and
non-partitioned tables? How about bucketed and un-bucked tables?
2. Is it possible for us to build index of function of indexed columns,
like
al Message-
From: Avdeev V. M. [mailto:ls...@list.ru]
Sent: Saturday, 02 June 2012 18:43
To: ruben.devr...@hyves.nl
Cc: user@hive.apache.org
Subject: Re[2]: table design and performance questions
Thanks for the information, Ruben.
1. I found the issue https://issues.apache.org/jira/browse
Thanks for the information, Ruben.
1. I found the issue https://issues.apache.org/jira/browse/HIVE-1642
Does it mean that the MAPJOIN hint has been obsolete since 2010 and I can avoid this hint
entirely?
2. Sorry for the stupid questions, but I still can't understand bucketing.
Partitioning is ok,
.
I'm assuming sequencefiles are faster, but I wouldn't really know :( need
someone else to tell us more about that ;)
-Original Message-
From: Avdeev V. M. [mailto:ls...@list.ru]
Sent: Monday, May 28, 2012 7:17 AM
To: user@hive.apache.org
Subject: table design and performanc
Question from novice.
Where can I read about table design best practices? I have a measure table with
millions of rows and many dimension tables with less than 1000 rows each. I
can't find a way to get an optimal design for both kinds of tables. Are there
performance tuning guides or a performance FAQ?
Have a look at the code for the LazySerDes. When you deserialize in the
SerDe, you don't actually have to deserialize all the columns. Deserialize
could return an object that is not actually deserialized and you can write
an ObjectInspector that deserializes a field from that structure but only
wh
1) Is there a way in initialize() of a SerDe to know if it is being used as
a Serializer or a Deserializer? If not, can I define the Serializer and
Deserializer separately instead of defining a SerDe (so I have two
initialize methods)?
2) Is there a way to find out which columns are being used? sa
Hi all,
Does anyone know anything about HBQL?
Is it good or bad? How is its performance?
ogs..
>
> Finally, is there any tutorials using Java API of Hive and Hbase???
>
> Date: Thu, 19 Jan 2012 13:36:30 +0530
> Subject: Re: Questions
> From: hadooph...@gmail.com
> To: user@hive.apache.org
>
>
> hey Dalia ,
>
> A: bot
und that Hive queries are faster according to some blogs..
>
> Finally, is there any tutorials using Java API of Hive and Hbase???
> --
> Date: Thu, 19 Jan 2012 13:36:30 +0530
> Subject: Re: Questions
> From: hadooph...@gmail.com
> To: user@hive.apache.o
+0530
Subject: Re: Questions
From: hadooph...@gmail.com
To: user@hive.apache.org
hey
Dalia ,
A: Both are good; it's up to you and what kind of data you are processing through them.
For many rows and billions of columns you can use HBase, and if you need to update
data on a regular basis then you can use HBase, for
e like
Python, Ruby, Java, and others. It's fast and good to use.
regards
Vikas Srivastava
2012/1/19 Dalia Sobhy
> Dear all,
> I want to ask a couple of questions:
> Which is better use Hive or Hive/Hbase or Hbase?What about the RCFILe?Is
> there any tutorials for HiveThrift API usin
Dear all, I want to ask a couple of questions: Which is better to use: Hive,
Hive/HBase, or HBase? What about RCFile? Are there any tutorials for the HiveThrift
API using Java, or even any examples, because I am confused by a lot of methods which
I cannot understand... Thanks. Please reply ASAP because this part i
f the performed operations (e.g.
embarrassingly parallel queries, parallel aggregation queries, parallel
joins, etc.). For all queries, I performed experiments with increasing
selectivity factor to see how the system reacts to an increase in the amount
of processed data.
I will try to illustrate my quest
ID2
The explain plan at the end of this mail, with parts I need clarification on
in red.
Questions:
1. what is the meaning of the "Map Reduce" line at the beginning?
2. is Select Operator a map-side only operation? if so, why is the reduce
output operator indented inside it?
3. what is
have been reading
> http://wiki.apache.org/hadoop/Hive/Tutorial to get better understanding of
> Hive
>
> I am sorry for the really basic questions, but I have some confusion. Here are
> a couple of questions:
>
>
>1. What is the difference between internal and external tables in Hive?
>2.
Hi All,
I am new to Hive and have been reading
http://wiki.apache.org/hadoop/Hive/Tutorial to get better understanding of
Hive
I am sorry for the really basic questions, but I have some confusion. Here are
a couple of questions:
1. What is the difference between internal and external tables in Hive
On May 26, 2011, at 1:28 PM, Guy Bayes wrote:
Crap sorry hit send too early
questions
1: Job overhead of generating statistics on the fly with
set hive.stats.autogather=true;?
Overhead is minimal. The only notable overhead is inserting a row into an
RDBMS/HBase at the end of a task. At
Crap sorry hit send too early
questions
1: Job overhead of generating statistics on the fly with
set hive.stats.autogather=true;?
2: Are stat descriptions in describe table extended implemented? I've
gathered stats on a table but do not see the expected entries (rowNum = ,
etc) in the des
Hello all, I'm new to this list,
I was wondering if anyone could answer a couple questions about the
implementation of statistics in 0.7?
I've reviewed
http://wiki.apache.org/hadoop/Hive/StatsDev
and have the following q
Hive 0.6 has support for multiple databases/schemas. Is this feature
mature enough to be used in production? Are there any particular
features known to not work with databases (I know you cannot run
queries using multiple databases at the same time)? Currently, there
doesn't seem to be an easy way
64 matches