Re: Strange error in Hive - Insert INTO

2013-08-19 Thread Jérôme Verdier
Hi, thanks for your replies. So all that said, I see that the columns in your create table statement don't match the columns in your outermost select statement. In particular, DT_JOUR is listed as the 6th column in your create table statement but it appears to be the 2nd column in your select

Re: Hive and Lzo Compression

2013-08-19 Thread w00t w00t
My scenario is a bit different: I am using external tables. I uploaded some LZO-compressed files into HDFS, generated the LZO index files, and finally created the external table without the specific storage as clause. A SELECT statement on the table still works. Does it work

Re: Hive and Lzo Compression

2013-08-19 Thread Nitin Pawar
As per my understanding, it's not file extensions, as compressed files can be renamed to anything without extensions. First check whether the file is compressed: if not, proceed directly to read it; if yes, find out the compression codec and use it. You can see by running a file command on any
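The check described in this message (is the file compressed, and if so with which codec) can be sketched by looking at a file's leading "magic" bytes, which is roughly what the file command does. This is only an illustration in Python: the names MAGIC_BYTES, detect_codec and detect_file_codec are hypothetical, the table covers just three codecs, and nothing here is claimed to be how Hive or Hadoop select a codec.

```python
import bz2
import gzip

# Leading "magic" bytes for a few common codecs. This table is an assumption
# of the sketch: it is illustrative, not exhaustive, and not taken from Hive.
MAGIC_BYTES = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\x89LZO\x00\r\n\x1a\n": "lzop",
}


def detect_codec(header):
    """Return a codec name if the header starts with known magic bytes, else None."""
    for magic, name in MAGIC_BYTES.items():
        if header.startswith(magic):
            return name
    return None


def detect_file_codec(path):
    """Read the first bytes of a local file and classify them."""
    with open(path, "rb") as f:
        return detect_codec(f.read(16))


if __name__ == "__main__":
    # Compress a small payload in memory and classify the results.
    print(detect_codec(gzip.compress(b"hello")))
    print(detect_codec(bz2.compress(b"hello")))
    print(detect_codec(b"plain text"))
```

Because the decision uses only the file contents, renaming a file or dropping its extension does not change the result, which is the point the message makes.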

question about hive SQL

2013-08-19 Thread ch huang
Hi all, I am not very familiar with HQL, and my problem is that I now have 2 queries. Q1: select page_url, original_category,token from media_visit_info group by page_url, original_category,token limit 10 Q2: select original_category as code , weight from media_visit_info where page_url='X'

Number of reducers in a join

2013-08-19 Thread S Imig
Hello, I'm curious why there is only one reducer for the join below, and why I cannot change the number of reducers using mapred.reduce.tasks. Can someone shed some light? Thanks! hive> set mapred.reduce.tasks=5; hive> create table nAllTran as select nTranDtl.tran_key, tran_line_qty,

Invalid method name: 'execute'

2013-08-19 Thread Guillermo Alvarado
Hi everyone, I am using the Python client https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-Python to execute some queries against Hive, and I am getting this error: Traceback (most recent call last): File "test_hive.py", line 20, in <module> client.execute("SHOW TABLES") File

Re: question about hive SQL

2013-08-19 Thread Sanjay Subramanian
Here is my stab at it. I have not tested it, but this should get you started. The following points are important: 1. I added a WHERE clause in the subquery to limit the data set by any partition you may have. 2. You have to write a collect UDF to use it. Wampler/Capriolo's book in Chapter 13.Functions -

Hive Authorization (ROLES AND PRIVILEGES) does not work with hiveserver2 ?

2013-08-19 Thread Sanjay Subramanian
0: jdbc:hive2://dev-thdp5:1 CREATE ROLE sas_role; No rows affected (0.16 seconds) 0: jdbc:hive2://dev-thdp5:1 CREATE EXTERNAL TABLE IF NOT EXISTS keyword_impressions_log (date_ STRING,server STRING,impression_id STRING,search_session_id STRING,channel_id INT,visit_id BIGINT,visitor_id

Re: question about get part result from hive

2013-08-19 Thread Stephen Sprague
Maybe it's too obvious, but there is the limit keyword as well. On Sun, Aug 18, 2013 at 9:43 PM, Nitin Pawar nitinpawar...@gmail.com wrote: Hive does support sampling. Please look at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Sampling We normally partition data based on

Re: Problem with rank() and dense_rank()

2013-08-19 Thread Harish Butani
Regarding the issue posted with rank and dense_rank. The example posted was: CREATE TABLE test (a INT); EXPLAIN SELECT DENSE_RANK() OVER (PARTITION BY a), a FROM test; Some comments on this: 1. The underlying issue is the bug that ranking functions had to be lowercase. This was fixed in

Re: Bug when adding multiple partitions

2013-08-19 Thread Navis류승우
Looks like a bug. I'll fix that. 2013/8/15 Jan Dolinár dolik@gmail.com: Hi everyone, Consider following DDL: CREATE TABLE partition_test (a INT) PARTITIONED BY (b INT); ALTER TABLE partition_test ADD PARTITION (b=1) location '/tmp/test1' PARTITION

Re: Bug when adding multiple partitions

2013-08-19 Thread Navis류승우
https://issues.apache.org/jira/browse/HIVE-5122 2013/8/20 Navis류승우 navis@nexr.com: Looks like a bug. I'll fix that. 2013/8/15 Jan Dolinár dolik@gmail.com: Hi everyone, Consider following DDL: CREATE TABLE partition_test (a INT) PARTITIONED BY (b INT); ALTER