Re: Hive 0.12.0 mysql metastore exception

2014-01-22 Thread Jov
2014/1/23 Jander g 

> Caused by: java.sql.BatchUpdateException: Duplicate entry 'default' for
> key 2


What HQL did you run? It looks like Hive tried to insert 'default' into a
metastore table, which violates a unique key.
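For what it's worth, a duplicate 'default' entry usually points at the DBS
table in the metastore, whose NAME column carries a unique key; a hedged way
to check from MySQL:

-- A sketch, assuming a MySQL metastore: look for the existing 'default' row.
SELECT DB_ID, NAME FROM DBS WHERE NAME = 'default';

If that row already exists, re-running the schema initialization script would
trip exactly this kind of unique-key violation.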


Jov
blog: http://amutu.com/blog


Re: How to prevent user drop table in Hive metadata?

2013-11-30 Thread Jov
You can hack the metastore database, e.g. create a trigger on the TBLS table
that does some checks and raises an error when rows are deleted from it; a
sketch follows.
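A minimal sketch of that idea, assuming a MySQL metastore (MySQL 5.5+ for
SIGNAL); a DROP TABLE in Hive ultimately deletes from TBLS, so the trigger
blocks it:

DELIMITER $$
CREATE TRIGGER protect_tbls
BEFORE DELETE ON TBLS
FOR EACH ROW
BEGIN
  -- Reject every delete; add your own checks here to allow certain cases.
  SIGNAL SQLSTATE '45000'
    SET MESSAGE_TEXT = 'Dropping Hive tables is disabled';
END$$
DELIMITER ;

Keep in mind this is a hack against Hive internals: the metastore schema is
not a public API, and such a trigger may break metastore upgrades.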

jov
On 2013-11-23 at 3:36 AM, "Echo Li" wrote:

> Good Friday!
>
> I was trying to apply a certain level of security in our hive data
> warehouse by modifying the access mode of directories and files on HDFS to
> 755. I think that's good enough to keep a new user from removing data;
> however, the user can still drop the table definition in the hive cli. It
> seems "revoke" doesn't help much. Is there any way to prevent this?
>
>
> Thanks,
> Echo
>


Re: exception when using Hive 0.12 with MySQL metastore

2013-10-19 Thread Jov
Can you confirm the script content? There may be a bug; if so, you can open an issue. A hedged way to check is sketched below.
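Assuming a MySQL metastore, you can compare the columns the failing INSERT
expects with what the schema script actually created:

-- The failing INSERT references CD_ID, FCOMMENT, COLUMN_NAME, TYPE_NAME, INTEGER_IDX.
SHOW COLUMNS FROM COLUMNS_V2;
-- If one of them (e.g. FCOMMENT) is missing, adding it manually is a possible
-- stopgap until the script is fixed (unverified; the column type is an assumption):
-- ALTER TABLE COLUMNS_V2 ADD COLUMN FCOMMENT VARCHAR(256);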

jov
On Oct 20, 2013 1:11 PM, "Zhang Xiaoyu"  wrote:
>
> Hi, Jov,
> Thanks. I understand that turning on those two properties resolves the
> problem. But I ran the Hive 0.12 schema script; I assume it should have
> created all required tables.
>
> Johnny
>
>
> On Sat, Oct 19, 2013 at 7:44 PM, Jov  wrote:
>>
>> jov
>>
>>
>> On Oct 20, 2013 8:07 AM, "Zhang Xiaoyu"  wrote:
>> >
>> > Hi, all,
>> > When I use Hive 0.12 with a MySQL metastore, I set these properties in
>> > hive-site.xml:
>> > datanucleus.autoCreateSchema = false
>> > datanucleus.autoCreateTables = false
>> You should set these properties to true; then Hive will automatically add
>> any new columns it needs.
>>
>> >
>> > In beeline, "show tables" works fine, but creating a new table throws the
>> > exception below. Any idea? Since I created the metastore tables with the
>> > Hive 0.12 schema script, it shouldn't complain about missing columns in
>> > the metastore tables.
>> >
>> > Thanks,
>> > Johnny
>> >
>> > -
>> >
>> > FAILED: Error in metadata: MetaException(message:javax.jdo.JDODataStoreException: Add request failed : INSERT INTO `COLUMNS_V2` (`CD_ID`,`FCOMMENT`,`COLUMN_NAME`,`TYPE_NAME`,`INTEGER_IDX`) VALUES (?,?,?,?,?)
>> > at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:422)
>> > at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:745)
>> > at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:765)
>> > at org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:638)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:601)
>> > at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
>> > at sun.proxy.$Proxy6.createTable(Unknown Source)
>> > at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1081)
>> > at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1114)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:601)
>> > at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>> > at sun.proxy.$Proxy8.create_table_with_environment_context(Unknown Source)
>> > at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:464)
>> > at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:453)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:601)
>> > at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
>> > at sun.proxy.$Proxy10.createTable(Unknown Source)
>> > at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:593)
>> > at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3784)
>> > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:256)
>> > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
>> > at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>> > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
>> > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
>> > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
>> > at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
>> > at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:193)
>> > at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
>> > at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
>

Re: exception when using Hive 0.12 with MySQL metastore

2013-10-19 Thread Jov
jov
On Oct 20, 2013 8:07 AM, "Zhang Xiaoyu"  wrote:
>
> Hi, all,
> When I use Hive 0.12 with a MySQL metastore, I set these properties in
> hive-site.xml:
> datanucleus.autoCreateSchema = false
> datanucleus.autoCreateTables = false
You should set these properties to true; then Hive will automatically add any
new columns it needs. A sketch of the resulting hive-site.xml fragment follows.
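For reference, a sketch of the corresponding hive-site.xml fragment (property
names taken from the message above; verify them against your Hive version):

<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.autoCreateTables</name>
  <value>true</value>
</property>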
>
> In beeline, "show tables" works fine, but creating a new table throws the
> exception below. Any idea? Since I created the metastore tables with the
> Hive 0.12 schema script, it shouldn't complain about missing columns in the
> metastore tables.
>
> Thanks,
> Johnny
>
> -
>
> FAILED: Error in metadata: MetaException(message:javax.jdo.JDODataStoreException: Add request failed : INSERT INTO `COLUMNS_V2` (`CD_ID`,`FCOMMENT`,`COLUMN_NAME`,`TYPE_NAME`,`INTEGER_IDX`) VALUES (?,?,?,?,?)
> at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:422)
> at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:745)
> at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:765)
> at org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:638)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
> at sun.proxy.$Proxy6.createTable(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1081)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1114)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
> at sun.proxy.$Proxy8.create_table_with_environment_context(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:464)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:453)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
> at sun.proxy.$Proxy10.createTable(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:593)
> at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3784)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:256)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
> at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:193)
> at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:40)
> at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:37)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:526)
> at org.apache.hive.service.auth

Re: [ANNOUNCE] New Hive Committer - Yin Huai

2013-09-03 Thread Jov
congratulations!

Jov
blog: http://amutu.com/blog


2013/9/4 Carl Steinbach 

> The Apache Hive PMC has voted to make Yin Huai a committer on the Apache
> Hive project.
>
> Please join me in congratulating Yin!
>
> Thanks.
>
> Carl
>


Re: Where can we see the results of Select * from states

2013-05-24 Thread Jov
You can write query results into the filesystem using "INSERT OVERWRITE
[LOCAL] DIRECTORY directory1 SELECT ... FROM ...".

More detail:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Writingdataintofilesystemfromqueries
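For example, a minimal sketch for your table (the output directory name is
hypothetical; Hive writes one or more files such as 000000_0 under it):

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/myStates'
SELECT * FROM states;

For ad hoc use, shell redirection of the CLI output also works, e.g.
hive -e 'select * from states' > myStates.txt. Note that a plain SELECT in
the CLI does not leave a result file behind in HDFS; the rows are fetched
from a temporary directory that is cleaned up after the query, which is why
you cannot find a part file under the NameNode browser.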


2013/5/24 Sai Sai 

> I have created an external table called states under a database called
> test,
> Then loaded the table successfully;
> Then I tried:
>
> Select * from states;
>
> It successfully executes MR and displays the results in the console but
> wondering where to look in hdfs to see these results.
>
> I have looked under all the dirs in filesystem for the below url but
> cannot see the results part file.
>
> http://localhost.localdomain:50070/dfshealth.jsp
>
> Also if i would like the results to save to a specific file from a query
> how to do it?
>
> For Ex:
> Select * from states > myStates.txt ;
> Is there something like this.
> Thanks
> Sai
>
>
>


-- 
Jov
blog: http://amutu.com/blog


Re: Difference between like %A% and %a%

2013-05-24 Thread Jov
2013/5/24 Sai Sai 

> abbreviation l


Unlike MySQL, strings in Hive are case sensitive, so '%A%' does not match the
same rows as '%a%'.
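For example (hypothetical table t with a string column name):

SELECT * FROM t WHERE name LIKE '%A%';   -- matches 'Alabama' but not 'kansas'
SELECT * FROM t WHERE name LIKE '%a%';   -- matches 'kansas' but not 'OHIO'
-- to match case-insensitively, normalize first with the lower() UDF:
SELECT * FROM t WHERE lower(name) LIKE '%a%';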


-- 
Jov
blog: http://amutu.com/blog


Re: Thrift Hive client for CDH 4.1 HiveServer2?

2013-01-05 Thread Jov
Here:
https://issues.apache.org/jira/browse/HIVE-2935
https://cwiki.apache.org/Hive/hiveserver2-thrift-api.html
HiveServer2 is currently a CDH extension.

I think you can use the find command (e.g. find . -name '*.thrift') to search
the CDH source directory for the .thrift files.

2013/1/5 David Morel 

> On 4 Jan 2013, at 16:04, Jov wrote:
>
> they are in the src/service/if and src/metastore/if
>
> Cool. But these would be files for HiveServer, not HiveServer2 which has a
> different API, right? After finally generating the libs, it turns out they
> work fine on the old-style hive server, but produce this in hiveserver2's
> log:
>
> 13/01/04 20:09:11 ERROR server.TThreadPoolServer: Error occurred during
> processing of message.
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
> at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:218)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:170)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.thrift.transport.TTransportException
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
> at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:124)
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
> at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:40)
> at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:215)
> ... 4 more
>
> Where should I start looking (meaning I haven't a clue)? Thanks!
> David
>
> On 2013-01-04 at 7:16 AM, "David Morel" wrote:
>
> Hi all (and happy New Year!) Is it possible to build a perl Thrift client
> for HiveServer2 (from Cloudera's 4.1.x)? I'm following the instructions
> found here:
> http://stackoverflow.com/questions/5289164/perl-thrift-client-to-hive
> Downloaded Hive from Cloudera's site, then I'm a bit lost: where do I find
> these thrift files that I need to build the perl libs? I have the thrift
> compiler working ok, but that's as far as I got.
>
>


-- 
jov
blog: http://amutu.com/blog


Re: Thrift Hive client for CDH 4.1 HiveServer2?

2013-01-04 Thread Jov
they are in the src/service/if and src/metastore/if
On 2013-01-04 at 7:16 AM, "David Morel" wrote:

> Hi all (and happy New Year!)
>
> Is it possible to build a perl Thrift client for HiveServer2 (from
> Cloudera's 4.1.x) ?
>
> I'm following the instructions found here:
> http://stackoverflow.com/questions/5289164/perl-thrift-client-to-hive
>
> Downloaded Hive from Cloudera's site, then I'm a bit lost: where do I find
> these thrift files that I need to build the perl libs? I have the thrift
> compiler working ok, but that's as far as I got.
>
> Any help would be most welcome
>
> Thanks!
>
> D.Morel
>


Re: Hive Client connect error

2011-06-25 Thread Jov
You can build fb303.jar from the Apache Thrift source. You may need
libthrift.jar too, but I'm not sure.
On 2011-06-25 at 4:00 PM, "Ayon Sinha" wrote:
>
> I'm following the instructions on http://wiki.apache.org/hadoop/Hive/HiveJDBCInterface
> I have the HiveDriver show up in the list but when I try to connect I get:
> ClassNotFound for com.facebook.fb303.FacebookService$IFace
>
> I have the following classes in SQuirrel's extra classpath:
>
> hadoop-core-0.20.2-cdh3u0.jar
> hive-cli-0.7.0-cdh3u0.jar            // not needed
> hive-anttasks-0.7.0-cdh3u0.jar       // not needed
> hive-contrib-0.7.0-cdh3u0.jar        // not needed
> hive-exec-0.7.0-cdh3u0.jar
> hive-common-0.7.0-cdh3u0.jar
> hive-hbase-handler-0.7.0-cdh3u0.jar  // not needed
> hive-hwi-0.7.0-cdh3u0.jar            // not needed
> hive-jdbc-0.7.0-cdh3u0.jar
> hive-metastore-0.7.0-cdh3u0.jar
> hive-serde-0.7.0-cdh3u0.jar
> hive-service-0.7.0-cdh3u0.jar
> hive-shims-0.7.0-cdh3u0.jar          // not sure
>
> -Ayon
> See My Photos on Flickr
> Also check out my Blog for answers to commonly asked questions.


Re: How to change the hive.metastore.warehouse.dir ?

2011-05-17 Thread Jov
2011/5/18 jinhang du :
> hi,
> The default value is "/user/hive/warehouse" in hive.site.xml. After I
> changed the directory to a path on HDFS, I got the exception.
>
> FAILED: Error in metadata: MetaException(message:Got exception:
> org.apache.hadoop.security.
> AccessControlException org.apache.hadoop.security.AccessControlException:
> Permission denied:
> user=root, access=WRITE, inode="output

Make sure the user running Hive (here, root) has read/write access to the new
path. To set permissions, see
http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#chmod

> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask
>
> Is this failure related to the hadoop-site.xml or something?
> Thanks for your help.
>
> --
> dujinhang
>


Re: Using PostgreSQL as Hive metastore

2011-05-16 Thread Jov
This is caused by Hive auto-creating some metastore tables when it finds they
have not been created yet.

When you create a table without partitions or indexes, Hive does not touch
the partition or index metastore tables, and all goes well. But when you drop
the table, Hive queries the partition and index metastore tables, finds they
have not been created, and creates them; that DDL needs some locks, and this
is what deadlocks PostgreSQL.

There are three ways to work around it (method 1 is sketched after this list):

1. Connect to PostgreSQL as a superuser and select * from pg_stat_activity to
find the process id of the backend whose status is "<IDLE> in transaction";
terminate that backend so the partition and index metastore tables can be
created successfully, then re-run the DROP TABLE statement.

2. Run some non-DDL statements in Hive. For example, after creating table t,
run:

show partitions t;
show index on t;

These statements create the metastore tables without causing the deadlock.

3. See:

http://www.datanucleus.org/products/accessplatform_2_0/rdbms/schematool.html

and use the Hive JDO file source_root/metastore/src/model/package.jdo as the
input file to initialize the metastore DB in PostgreSQL up front.
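A sketch of method 1, assuming PostgreSQL 8.4 (the pg_stat_activity column
names changed in later releases) and a superuser connection:

-- Find the backend stuck "<IDLE> in transaction".
SELECT procpid, usename, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction';

-- Terminate it (12345 is a hypothetical procpid taken from the query above).
SELECT pg_terminate_backend(12345);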


good luck!

2011/5/16 Mike 

> Hi all,
>
>   I am trying to use PostgreSQL as the local metastore in Hive. I have the
> postgresql-8.4-701.jdbc4.jar in the path. I use hive-0.7 in my system.
>
>
> I have the following properties added to hive-site.xml.
>
>
> <property>
>   <name>hive.metastore.local</name>
>   <value>true</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionURL</name>
>   <value>jdbc:postgresql://localhost:5432/hivedb</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionDriverName</name>
>   <value>org.postgresql.Driver</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionUserName</name>
>   <value>**</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionPassword</name>
>   <value>**</value>
> </property>
>
> I created a table using Hive Cli, and it went good. But when I try to
> DROP the table the hive cli hangs and is not responding.
>
>
> I found a similar thread here :
> http://search-hadoop.com/m/VYmqW1ejsiA&subj=Re+Can+t+drop+table , but I
> did not quite get the solution.
>
> Can anyone help me on this?
>
>
> Thanks,
> Mike


Re: Question about how to compile Hive project off-line?

2011-05-11 Thread Jov
I do it like this:

1. Download the Hive source.
2. Compile Hive in Eclipse on my Windows machine, which can connect to the
net; this downloads everything needed to the src/build dir.
3. On the Linux machine, mkdir src/build; copy src/build/hadoopcore and
src/ivy from Windows to Linux.
4. Edit src/build-common.xml and make the "ivy-retrieve" target do nothing,
e.g.:

<target name="ivy-retrieve">
  <!-- intentionally empty: skip dependency retrieval for offline builds -->
</target>

5. Compile Hive: ant -Doffline=true package





2011/5/11 Alex.Wang :
> Hello everyone.
> I am a student who begins to do some research on Hive project.
> I think the first step is to have a successful build of Hive.
> I have downloaded the Hive source code, uploaded to the Linux machine,
> unzipped and entered the source directory.
> Then I ran the "ant" command and got an error message related to the ivy
> package.
> The complete message is:
> ivy-download:
> [get] Getting: 
> http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
> [get] To: /path/hive-0.7.0/src/build/ivy/lib/ivy-2.1.0.jar
> [get] Error getting
> http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
> to /path/src/build/ivy/lib/ivy-2.1.0.jar
>
> BUILD FAILED:
> /path/src/shims/build.xml: 62: The following error occurred while
> executing this line:
> /path/src/build-common.xml: 125: java.net.UnknownHostException: 
> repot.maven.org
>
> In fact, the Linux machine is an offline machine. That is, it only
> accepts incoming data and doesn't have the internet access needed to
> download the required ivy package.
>
> However, I can download the ivy package on my PC and upload it to
> this Linux machine. So is anyone here willing to tell me how to modify the
> necessary files to enable an offline build of Hive?
>
> Any help is greatly appreciated. Thank you.
> Best regards.
>


Re: Apache Log Date Format

2011-05-06 Thread Jov
On 2011-05-07 at 6:48 AM, "bichonfrise74" wrote:
>
> Hi,
>
> I am using this to load the apache log into Hadoop via Hive (my version is 0.4.1).
>
> CREATE TABLE apache_log (
>   ...
>   logdate STRING,
>   ...
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES (
>   "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*)
\\[(\\w+\/\\w+\/\\w+)\:(\\d+:\\d+:\\d+) ...
> ...
>
> The date is coming in this format: dd/MMM/yyyy.
> I would like to be able to load the data using this date format: yyyy-MMM-dd.
>
> 1. Has anyone done this before, loading the date in a different format?
> 2. Also, how do you specify in the create table statement above that the
> partition is the logdate?
> 3. And when I tried to convert the old date into unixtime format via this
> sql, hive complains.
>
> hive> select from_unixtime( unix_timestamp( logdate, 'dd/MMM/yyyy')) from apache_log;
> FAILED: Error in semantic analysis: line 1:7 Function Argument Type Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters [class org.apache.hadoop.io.LongWritable]

The unix_timestamp function returns a bigint, while this from_unixtime only
accepts an int parameter, so you should use a cast:
from_unixtime(cast(unix_timestamp(logdate, 'dd/MMM/yyyy') as int))
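A fuller sketch of the conversion, assuming the dd/MMM/yyyy input pattern
above and a Hive build whose from_unixtime accepts an output format string
(if yours does not, apply the cast shown above and reformat in a separate
step):

SELECT from_unixtime(
         cast(unix_timestamp(logdate, 'dd/MMM/yyyy') as int),
         'yyyy-MM-dd')
FROM apache_log;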

> Has anyone encountered these issues before?
>
> Thanks.


Re: INSERT OVERWRITE LOCAL DIRECTORY -- Why it creates multiple files

2011-03-29 Thread Jov
Try adding a LIMIT clause; the global limit is enforced by a final
single-reducer stage, which writes a single file:

INSERT OVERWRITE LOCAL DIRECTORY
'/home/hdp-user/hiveadmin_dirs/outbox/apachetest'
Select host, identity, user, time, request
from raw_apachelog
where ds = '2011-03-22-001500' limit 32;
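If rewriting the query is not an option, a hedged alternative is to enable
Hive's small-file merge step, assuming your Hive version applies it to
directory writes (older releases did not):

-- Sketch: ask Hive to merge small output files at the end of the job.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

Since the output is plain text, concatenating the files afterwards is equally
safe.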


2011/3/30 V.Senthil Kumar :
> Hello,
>
> I have a hive query which does a simple select and writes the results to a
> local file system.
>
>
> For example, a query like this,
>
> INSERT OVERWRITE LOCAL DIRECTORY
> '/home/hdp-user/hiveadmin_dirs/outbox/apachetest'
> Select host, identity, user, time, request
> from raw_apachelog
> where ds = '2011-03-22-001500';
>
> Now this creates two files under the apachetest folder. This table has only
> 32 rows. Is there any way I can make Hive create only a single file?
>
>
> Appreciate your help :)
>
> Thanks,
> Senthil
>


Re: skew join optimization

2011-03-20 Thread Jov
2011/3/20 Igor Tatarinov :
> I have the following join that takes 4.5 hours (with 12 nodes) mostly
> because of a single reduce task that gets the bulk of the work:
> SELECT ...
> FROM T
> LEFT OUTER JOIN S
> ON T.timestamp = S.timestamp and T.id = S.id
> This is a 1:0/1 join so the size of the output is exactly the same as the
> size of T (500M records). S is actually very small (5K).
> I've tried:
> - switching the order of the join conditions
> - using a different hash function setting (jenkins instead of murmur)
> - using SET hive.auto.convert.join = true;

Are you sure your query was converted to a map join? If not, try an explicit
MAPJOIN hint (a sketch follows).
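For example, a sketch against the quoted query (the select list is elided as
in the original); since S is tiny (~5K rows), each mapper loads it into
memory and the single skewed reducer disappears:

SELECT /*+ MAPJOIN(S) */ ...
FROM T
LEFT OUTER JOIN S
  ON T.timestamp = S.timestamp AND T.id = S.id;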


> - using SET hive.optimize.skewjoin = true;
> but nothing helped :(
> Anything else I can try?
> Thanks!


Re: Database/Schema , INTERVAL and SQL IN usages in Hive

2011-02-21 Thread Jov
On 2011-02-21 at 10:54 PM, "Bejoy Ks" wrote:
>
> Hi Experts
>  I'm using hive for a few projects and i found it a great tool in
hadoop to process end to end structured data. Unfortunately I'm facing a few
challenges out here as follows
>
> Availability of database/schemas in Hive
> I'm having multiple projects running in hive each having fairly large
number of tables. With this much tables all together it is  looking a bit
messed up. Is there any option of creating database/schema in Hive so that I
can maintain the tables in different databases/schemas corresponding to each
project.

Recent versions already support database DDL, so you can use CREATE DATABASE;
a minimal sketch follows.
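A minimal sketch (the database and table names are hypothetical):

CREATE DATABASE project_a;
USE project_a;
-- tables created from now on live under project_a
CREATE TABLE orders (id INT, amount DOUBLE);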

> Using INTERVAL
> I need to replicate a job running in a Teradata EDW into Hive, and I'm
> facing a challenge here: I'm not able to identify a usage within Hive
> corresponding to INTERVAL in Teradata. Here is the snippet where I'm
> facing the issue:
> *** where r1.seq_id = r4.seq_id and r4.mc_datetime >= (r1.rc_datetime +
> INTERVAL '05' HOUR)
> In this query how do I replicate the last part in Hive, i.e.
> (r1.rc_datetime + INTERVAL '05' HOUR), where it adds 5 hours to the
> obtained timestamp rc_datetime.
> *The where condition is part of a very large query involving multiple
> table joins.

Hive does not have a date or timestamp data type; all such values are
strings. But you can write your own UDF, or combine the built-in date
functions, to get a similar effect; a sketch follows.
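For example, a hedged sketch of (r1.rc_datetime + INTERVAL '05' HOUR) using
only built-ins, assuming rc_datetime is a string like 'yyyy-MM-dd HH:mm:ss'
(the cast is needed because from_unixtime takes an int parameter):

where r1.seq_id = r4.seq_id
  and r4.mc_datetime >= from_unixtime(
        cast(unix_timestamp(r1.rc_datetime) + 5 * 3600 as int))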

>
> Using IN
> How do we replicate the SQL IN function in Hive?
> i.e. *** where r1.seq_id = r4.seq_id and r1.PROCCESS_PHASE IN
> ('Production', 'Stage', 'QA', 'Development')
> The last part of the query is where I'm facing the challenge:
> r1.PROCCESS_PHASE IN ('Production', 'Stage', 'QA', 'Development')
> *The where condition is part of a very large query involving multiple
> table joins.

You can use OR, e.g.

'x in (1, 2)' can be rewritten as 'x = 1 or x = 2'
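Applied to the quoted predicate:

where r1.seq_id = r4.seq_id
  and (r1.PROCCESS_PHASE = 'Production'
    or r1.PROCCESS_PHASE = 'Stage'
    or r1.PROCCESS_PHASE = 'QA'
    or r1.PROCCESS_PHASE = 'Development')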

> Please advise.
>
> Regards
> Bejoy KS


Re: problem while performing union on twotables

2011-02-19 Thread Jov
If you want UNION (i.e. with duplicate elimination), you should do it like
this:

select distinct ... from (
  subquery1
  union all
  subquery2
) t

So UNION = UNION ALL plus DISTINCT; a concrete version for the tables in this
thread follows.
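Applied to the query quoted below (same pattern, with de-duplication added):

INSERT OVERWRITE TABLE tab3
SELECT DISTINCT t3.col1, t3.col2
FROM (
  SELECT id AS col1, name AS col2 FROM tab1
  UNION ALL
  SELECT id AS col1, name AS col2 FROM tab2
) t3;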
On 2011-02-19 at 12:11 AM, "sangeetha s" wrote:
> Hi,
>
> Thanks Jov and Ajo
>
> Changing from Union to Union all solved the issue. But do we need to
> specify all the fields in the sub query?
> Actually I had used the following query.
>
> INSERT OVERWRITE TABLE tab3 SELECT t3.col1, t3.col2 FROM (SELECT id AS
> col1, name AS col2 FROM tab1 UNION ALL SELECT id AS col1, name AS col2
> FROM tab2) t3;
>
> The above query overwrites the result in the table tab3. If I am doing
> such an operation on a table that has many fields, is it necessary to
> specify all the column names as part of the sub query? If not, will the
> query *SELECT t3.* FROM (SELECT * FROM tab1 UNION ALL SELECT * FROM
> tab2) t3;* work? As I am using this to update my table which has many
> columns, I need this. Well, I will check it first and get back to you.
>
> Thank you,
>
>
>
> 2011/2/18 Jov 
>
>> Hive 0.4.1 does not support UNION, only UNION ALL.
>> On 2011-02-18 at 3:12 PM, "sangeetha s" wrote:
>>
>> >
>> > Hi,
>> >
>> > I am trying to perform a union of two tables which have identical
>> > schemas and distinct data. There are two tables, 'oldtable' and
>> > 'newtable'. The old table contains the information of old users and the
>> > new table will contain the information of new users. I am trying to
>> > update the new entries in the old table using the following query.
>> >
>> > INSERT OVERWRITE TABLE oldtable SELECT * FROM (SELECT * FROM oldtable
>> > UNION select * from newtable);
>> >
>> > Executing the above query results in:
>> > FAILED: Parse Error: line 1.68 mismatched input 'select' expecting ALL
>> > in query operator.
>> >
>> > I am currently using Hadoop version 0.20.2 and Hive 0.4.1 from the
>> > Apache distribution. Will Hive 0.4.1 support UNION and UNION ALL
>> > queries? Is the above problem due to the Hive version or due to a wrong
>> > query? Also I am not using any partitions in these tables and the table
>> > format is TextFile.
>> >
>> > Any pointers in this regard would be highly helpful.
>> >
>> > --
>> > Regards,
>> > Sangita
>>
>
> --
> Regards,
> Sangita


Re: problem while performing union on twotables

2011-02-18 Thread Jov
Hive 0.4.1 does not support UNION, only UNION ALL.
On 2011-02-18 at 3:12 PM, "sangeetha s" wrote:
>
> Hi,
>
> I am trying to perform a union of two tables which have identical
> schemas and distinct data. There are two tables, 'oldtable' and 'newtable'.
> The old table contains the information of old users and the new table will
> contain the information of new users. I am trying to update the new entries
> in the old table using the following query.
>
> INSERT OVERWRITE TABLE oldtable SELECT * FROM (SELECT * FROM oldtable
> UNION select * from newtable);
>
> Executing the above query results in:
> FAILED: Parse Error: line 1.68 mismatched input 'select' expecting ALL in
> query operator.
>
> I am currently using Hadoop version 0.20.2 and Hive 0.4.1 from the Apache
> distribution. Will Hive 0.4.1 support UNION and UNION ALL queries? Is the
> above problem due to the Hive version or due to a wrong query? Also I am
> not using any partitions in these tables and the table format is TextFile.
>
> Any pointers in this regard would be highly helpful.
>
> --
> Regards,
> Sangita