Try
SELECT
client, receive_day, receive_hour as start_time, receive_hour+1 as end_time
FROM some_table
WHERE client='xyz' AND receive_day=7
ORDER BY start_time;
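For anyone who wants to see what the suggested query returns before running it on the real table, here is a minimal local sketch. It is not Hive: it uses Python's built-in sqlite3 module as a stand-in, and the column types and sample rows are made up for illustration. Only the table and column names come from the query above.

```python
import sqlite3

# SQLite stands in for Hive here purely so the query can run locally.
# Column types and sample rows are assumptions, not data from the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (client TEXT, receive_day INTEGER, receive_hour INTEGER)"
)
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?)",
    [("xyz", 7, 14), ("xyz", 7, 9), ("abc", 7, 10), ("xyz", 8, 3)],
)

# Same shape as the suggested query: each row's hour becomes a one-hour window.
QUERY = """
SELECT client, receive_day, receive_hour AS start_time, receive_hour + 1 AS end_time
FROM some_table
WHERE client = 'xyz' AND receive_day = 7
ORDER BY start_time
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With these sample rows, only the two 'xyz' records for day 7 come back, ordered by start hour. Note that a receive_hour of 23 would give an end_time of 24, which may or may not be what is wanted.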
On Mon, Oct 22, 2012 at 4:41 PM, dyuti a hadoop.hiv...@gmail.com wrote:
Hi all,
I have a hive table with 235 million records.
Hi,
I am new to data warehousing in Hadoop. This might be a trivial question
but I was unable to find any answers in the mailing list.
My questions are:
A person has an existing data warehouse that uses a star schema
(implemented in a MySQL database). How can it be migrated to Hadoop?
I can use Sqoop
Hi Austin
You can import the existing tables into Hive as they are using Sqoop. Hive is a
wrapper over MapReduce that gives you the flexibility to create optimized
MapReduce jobs using SQL-like syntax. There is no relational style maintained in
Hive, and don't treat Hive as a typical
Austin,
There are some great questions, simply asked, in your email.
Data warehousing and the Hadoop ecosystem go hand in hand. I don't think you need
to move all data from your warehouse to Hive and HBase. This is the key :) you
need to understand where you should use Hive and where can you
Hi,
I have one name node machine, and under it there are 4 slave machines to
run the jobs.
The way users run queries is
- They ssh into the name node machine
- They start Hive and submit their queries
Currently multiple users log in with the same credentials and submit queries.
Whenever 2
Hi
Are your Hive queries waiting even though there are task slots available
on your cluster?
If task slots are getting exhausted and you need parallelism here, then you may
need to look at using the fair scheduler and a separate user
account for each user so that each
Hi,
Thank you so much for your help. It works great.
Regards,
dti
On Mon, Oct 22, 2012 at 2:18 PM, MiaoMiao liy...@gmail.com wrote:
Try
SELECT
client, receive_day, receive_hour as start_time, receive_hour+1 as end_time
FROM some_table
WHERE client='xyz' AND receive_day=7
ORDER BY
Bejoy is right. I just want to say explicitly that the scheduler
configuration is orthogonal to the use of Hive (i.e., the same
problem arises with Pig or standard MapReduce jobs).
Regards
Bertrand
PS: There is also the capacity scheduler.
On Mon, Oct 22, 2012 at 2:18 PM, Bejoy KS
Hi Bejoy and Bertrand
Thanks for the quick reply.
I think task slots are not available in my cluster because I have only 4
slave machines.
I am a beginner with Hive, so could you let me know how I can check
whether task slots are available?
I have different user credentials to log in
Hi
From the JobTracker web UI you can get the total number of map and reduce
slots. From the same web UI you can also get the number of running map and reduce
tasks. Subtracting the second value from the first gives you the available slots.
The fair scheduler is a property of MapReduce, not of Hive.
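The slot arithmetic described above can be written down as a small helper. This is only an illustrative sketch: the function name and the numbers are hypothetical, and the two inputs are assumed to be read manually from the JobTracker web UI rather than fetched through any API.

```python
def available_slots(total_slots: int, running_tasks: int) -> int:
    """Return free slots: total configured slots minus currently running tasks."""
    if total_slots < 0 or running_tasks < 0:
        raise ValueError("slot and task counts must be non-negative")
    # Clamp at zero in case the two readings were taken at different moments.
    return max(total_slots - running_tasks, 0)


# Hypothetical readings: 4 slaves with 2 map slots each, 5 map tasks running.
total_map_slots = 4 * 2
free_map_slots = available_slots(total_map_slots, running_tasks=5)
print(free_map_slots)
```

The same calculation applies separately to map slots and reduce slots. If the result is zero while queries are waiting, the cluster is out of slots rather than Hive being stuck.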
Hi All,
Is it true that Pig's JOIN operation is not as efficient as Hive's?
I have just tried both and found differences in the results of a JOIN query.
Hive returned the same results as MySQL, but Pig returned lower counts than the Hive
join.
Please shed some light on JOINs in Pig and Hive.
Regards
Yogesh
Hi,
We have a requirement to print the column headers in the
generated file when executing a query. We are using the Hive JDBC client to execute
the query.
Regards,
Venugopal