Hi,
Currently I am submitting multiple Hive jobs using the Hive CLI with "hive -f"
from different scripts. I can see all these jobs in the application tracker,
and they get processed in parallel.
Now I plan to switch to HiveServer2 and submit jobs using the beeline client
from multiple scripts.
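For reference, a minimal sketch of the two invocation styles; host, port, user, and script names are placeholders, and beeline's -f option may not be available in every Hive version (use !run inside beeline otherwise):

```shell
# Old style: Hive CLI, several scripts launched in parallel from the shell
hive -f job_a.hql &
hive -f job_b.hql &
wait

# HiveServer2 style: beeline connects over JDBC and runs the same script
beeline -u jdbc:hive2://hs2-host:10000 -n myuser -p mypass -f job_a.hql
```

Parallelism works the same way: launch several beeline invocations in the background and the jobs queue through HiveServer2.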
I'm using Hadoop 1.0.4. Suspecting some compatibility issues, I moved from
Hive 0.13 to Hive 0.12, but the exceptions related to SLF4J still persist.
I'm unable to move forward with Hive to finalize a critical product design.
Can somebody please help me?
On Wed, Jul 9, 2014 at 11:25 AM, Sarath Chandra
The "small" table can be any size. You want the small table to be
/path/to/table/b here, because that will result in more parallelism. There
is a Hive JIRA ticket on theta joins that you might want to look at.
On Thu, Jul 10, 2014 at 10:23 PM, Malligarjunan S
wrote:
> Hello Edwards,
>
> Thank you very
Hello Edwards,
Thank you very much for the update.
What size do you mean by a "small" table? In our case the small table will
have a minimum of 1 million records.
Can we use this UDTF? How much of a time improvement will there be?
Appreciate your help!
Thanks and Regards
SankarS
On Thu, Jul 10, 2014 at 11:26
The URL for the HBase-Hive integration:
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
references old versions: HBase 0.92.0 and Hadoop 0.20.x.
Are there any significant changes to these docs that anyone might (a) have
pointers to, or (b) be able/willing to mention here as important?
There is a max limit for the number of tables returned (maxRows), but it does
not help because the call internally fetches all the table objects and then
truncates the result set based on the maxRows argument. So the cost depends
only on the total number of tables your database has.
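A toy Python sketch of the behavior being described; the function and names are illustrative, not the actual metastore API:

```python
def list_tables(metastore_tables, max_rows):
    """Mimics the described behavior: fetch everything, then truncate."""
    fetched = [t for t in metastore_tables]   # cost scales with total table count
    return fetched[:max_rows]                 # truncation happens only afterwards

# The call still touches all 10,000 table objects even though only 5 are returned.
tables = list_tables(["t%d" % i for i in range(10000)], max_rows=5)
```

This is why maxRows caps the result size but not the latency of the call.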
I have also been ex
For simpler use, Zeppelin (http://zeppelin-project.org) runs Hive queries
with a web-based editor, and it has a crontab-style scheduler.
Best,
moon
On Fri, Jul 11, 2014 at 8:52 AM, Martin, Nick wrote:
> Oozie has a workflow action for Hive to execute scripts. You can also
> configure an Oozie co
Oozie has a workflow action for Hive to execute scripts. You can also configure
an Oozie coordinator to run the Hive workflow at desired intervals. There are
lots of Oozie config options for workflows and coordinators, so check out the
documentation.
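For example, a minimal coordinator definition sketch; the paths, dates, and schema version are placeholders, so check the Oozie docs for the version you run:

```xml
<coordinator-app name="hive-every-15-min" frequency="${coord:minutes(15)}"
                 start="2014-07-11T00:00Z" end="2015-07-11T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.2">
  <action>
    <workflow>
      <!-- workflow.xml at this path contains the Hive action -->
      <app-path>${nameNode}/apps/hive-workflow</app-path>
    </workflow>
  </action>
</coordinator-app>
```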
Sent from my iPhone
On Jul 10, 2014, at 6:09 PM, "Cheng Ju C
Yeah, it's expected that a 0.13 client is not able to talk to an older server.
However, the other direction is fine; that is, an old 0.12 client should be
able to talk to a 0.13 server.
--Xuefu
On Thu, Jul 10, 2014 at 3:09 PM, Edward Capriolo
wrote:
> 2014-07-10 22:00:03 ERROR HiveConnection:425 - Error op
So, I have a query like this:
select
  user.id,
  ud_name.value as name,
  ud_age.value as age
from user
left outer join user_data ud_name
  on user.id = ud_name.user_id and ud_name.key = 'name'
left outer join user_data ud_age
  on user.id = ud_age.user_id and ud_age.key = 'age'
...
;
With multiple joins
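For context, the joins above pivot key/value rows into columns (one join per key). A minimal Python sketch of the same pivot; the table contents are made up:

```python
# Each user_data row is (user_id, key, value); the query turns matching
# 'name'/'age' rows into columns on the user row, with left outer join
# semantics (missing keys become None/NULL).
users = [1, 2]
user_data = [(1, "name", "alice"), (1, "age", "30"), (2, "name", "bob")]

def lookup(uid, key):
    # Mirrors: left outer join user_data ud on user.id = ud.user_id and ud.key = key
    return next((v for u, k, v in user_data if u == uid and k == key), None)

rows = [(uid, lookup(uid, "name"), lookup(uid, "age")) for uid in users]
```

Each additional key pivoted this way adds another self-join of user_data in the SQL version.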
Hi,
Hopefully it won't be an extra mail for you guys, because I keep getting
delivery errors.
I am looking for any scheduling implementation for Hive jobs (e.g. some Hive
command has to be executed every 15 minutes). There are supposed to be ways
to achieve this, but I haven't found a stable way fr
2014-07-10 22:00:03 ERROR HiveConnection:425 - Error opening session
org.apache.thrift.TApplicationException: Required field 'client_protocol'
is unset! Struct:TOpenSessionReq(client_protocol:null)
at
org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
This is hive 13
Hi,
I am looking for any scheduling implementation for Hive jobs (e.g. some Hive
command has to be executed every 15 minutes). There are supposed to be ways
to achieve this, but I haven't found a stable way from the online community.
Any suggestions?
Btw, my Hive version is 0.13 and my Hadoop version is
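The simplest scheduling approach is plain cron; a sketch, where the script path and log location are hypothetical:

```shell
# crontab -e entry: run a Hive script every 15 minutes and append output to a log
*/15 * * * * hive -f /path/to/job.hql >> /var/log/hive-job.log 2>&1
```

This gives no dependency handling or retry logic; that is what tools like Oozie add on top.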
What do you see in your hiveserver2 logs? There might be a clue there.
On Thu, Jul 10, 2014 at 1:17 PM, Hang Chan wrote:
> I tried using the username and password but still getting the same error.
>
> # hive --service beeline --verbose=true -u jdbc:hive2://hiveservice:11000
> -n root -p foo
>
I tried using the username and password but still getting the same error.
# hive --service beeline --verbose=true -u jdbc:hive2://hiveservice:11000
-n root -p foo
issuing: !connect jdbc:hive2://hiveservice:11000 root foo
scan complete in 36ms
Connecting to jdbc:hive2://hiveservice:11000
Error: Inv
Oh, somewhere in the email thread I thought HTTP transport mode was being
used. If that's not the case, then you should be able to log in using:
hive --service beeline -u jdbc:hive2://hiveservice:11000 -n $USER -p fakepwd
Even though it doesn't do authentication, hiveserver2 still needs a
usernam
There is no magic. Hopefully one table is smaller than the other. You could
write a UDTF to do something like what this MR job is doing.
Make a mapper that runs over table A:
FileInputFormat.setInputPaths(job, new Path("/path/to/table/a"));
Then inside the mapper:
private Configuration conf;
protected void setup(Context context) {
  this.conf = context.getConfiguration();
}
public void map
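A minimal Python sketch of the map-side idea being described: hold the small table in memory, stream the big table through the mapper, and emit one pair per small-table row. Table contents are made up:

```python
small = ["b1", "b2"]        # /path/to/table/b, loaded into each mapper's memory
big = ["a1", "a2", "a3"]    # /path/to/table/a, streamed through map()

def mapper(row):
    # Emitting one output per small-table row yields the cross product;
    # parallelism comes from splitting the big table across many mappers.
    return [(row, s) for s in small]

pairs = [p for row in big for p in mapper(row)]
```

This is why the small table should be the in-memory side: the streamed (big) table determines how many mappers can run in parallel.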
Why is this failing? I am calling it in the SELECT list.
hive> ADD JAR /root/apache-hive-0.13.1-bin/lib/hive-contrib-0.13.1.jar;
hive> CREATE TEMPORARY FUNCTION rowSequence AS
'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';
hive> CREATE TABLE LOYALTY_CARDS AS
> SELECT DISTINCT CARD_NB
Hello Edward,
Thank you very much for helping me.
I am new to Hive. Could you please provide a sample MapReduce job?
Regards,
Sankar S
On Thu, Jul 10, 2014 at 8:19 AM, Edward Capriolo
wrote:
> Hive cross product stinks . I have a map reduce job that will do it
>
>
> On Wednesday, July 9
Nope, still not working. I don't believe I have http enabled.
# hive --service beeline --verbose=true -u
"jdbc:hive2://hiveservice:10001/default?hive.server2.transport.mode=http;hive.server2.thrift.http.path=cliservice"
issuing: !connect
jdbc:hive2://hiveservice:10001/default?hive.server2.transpo
Scaling a Hadoop cluster with Hive has the following issues:
1. Adding a compute node (scaling up) when load on the cluster is high
decreases the execution time of the queries, but there is still a large
time lag because the new node works on data pulled from other nodes.
2. The process of removing a node