RE: Hive HBase integration use case
You may want to try creating a UDF/UDTF Hive function.
-Vivek

From: G.S.Vijay raajaa [gsvijayraa...@gmail.com]
Sent: 01 February 2013 18:55
To: user@hive.apache.org
Subject: Hive HBase integration use case

Hi,

I would like to use HBase as the data store and Hive for data warehousing. The issue with the integration is that the HBase table uses composite keys with the following structure:

HBase rowkey: Hash(customer_id)+customer_id+time+event_id
Column: usage : value

The structure of the HBase rowkey makes every entry distinct. Is it possible to split the rowkey and map its parts to columns of the Hive table? I am trying to create a Hive table with the column structure: customer_id, event_id, time, usage. That would let me aggregate data by grouping on a column (time or event_id). Any thoughts? If the Hive HBase integration does not handle this directly, can you suggest any other means?

Regards,
Vijay Raajaa G S

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
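Following Vivek's UDF suggestion, the rowkey-splitting logic could look roughly like the sketch below. Assumptions not taken from the thread: the key parts are joined with a '|' delimiter as hash|customer_id|time|event_id, and the class/field names are illustrative. In real Hive code this would live inside a class extending org.apache.hadoop.hive.ql.exec.UDF, applied to the :key column of the HBase-backed table.

```java
// Sketch only: split a composite HBase rowkey into Hive-friendly columns.
public final class RowKeySplitter {

    /** Returns {customer_id, time, event_id}, or null for a malformed key. */
    public static String[] split(String rowKey) {
        if (rowKey == null) {
            return null;
        }
        // assumed layout: hash|customer_id|time|event_id
        String[] parts = rowKey.split("\\|");
        if (parts.length != 4) {
            return null; // unexpected layout
        }
        // drop the hash prefix (parts[0]); keep the queryable fields
        return new String[] { parts[1], parts[2], parts[3] };
    }

    public static void main(String[] args) {
        String[] cols = split("a1b2|cust42|20130201|click");
        System.out.println(cols[0] + "," + cols[1] + "," + cols[2]); // cust42,20130201,click
    }
}
```

With the parts exposed as columns (or via a UDTF emitting one row per key), GROUP BY time or event_id becomes straightforward.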
RE: Running commands at hive cli or hive thrift startup
UDFs are tricky. The only way I can think of is to add them to the function registry (https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java) and recompile Hive.

What about https://cwiki.apache.org/Hive/plugindeveloperkit.html ?

From: Mark Grover [grover.markgro...@gmail.com]
Sent: 14 December 2012 15:11
To: user@hive.apache.org
Subject: Re: Running commands at hive cli or hive thrift startup

No, .hiverc only works for the CLI. UDFs are tricky. The only way I can think of is to add them to the function registry (https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java) and recompile Hive.

On Mon, Dec 10, 2012 at 8:01 AM, John Omernik j...@omernik.com wrote:
Will that work for my thrift server connections?

On Sun, Dec 9, 2012 at 7:56 PM, विनोद सिंह vi...@vinodsingh.com wrote:
Put a .hiverc file in your home directory containing the commands; the Hive CLI will execute all of them at startup.
Thanks, Vinod

On Sun, Dec 9, 2012 at 10:25 PM, John Omernik j...@omernik.com wrote:
I am looking for ways to streamline some of my analytics. One thing I notice is that when I use the Hive CLI, or connect to my Hive Thrift server, there are some commands I always end up running for my session. If I have multiple CLIs or connections to Thrift, I have to run them each time; if I lose a connection to Thrift, I have to run them again, and so on. My thought was: is there a way to have certain commands executed upon opening a Hive CLI or a connection to a Hive Thrift server? These commands include a USE command to get me to a specific database (perhaps there is a default-database config variable?) and loading all the temporary functions (UDFs) I use. For example, I have a UDF to do URL decoding:

CREATE TEMPORARY FUNCTION uridecode AS 'org.domain.analytics.URIDECODE';

Can I get this to run automagically at Hive CLI start or Thrift server connection?
If not, could we build in a way to add UDFs to Hive permanently, without a recompile? I would welcome discussion on this!
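For the CLI half of the question, Vinod's suggestion can be written down as a ~/.hiverc file. A sketch follows; the database name and jar path are hypothetical, and only the CREATE TEMPORARY FUNCTION line comes from the thread (with its closing quote added):

```sql
-- ~/.hiverc: the Hive CLI runs these commands at startup
-- (CLI only; Thrift server connections do not read this file).
USE my_analytics_db;                 -- hypothetical default database
ADD JAR /path/to/analytics-udfs.jar; -- hypothetical jar containing the UDF class
CREATE TEMPORARY FUNCTION uridecode AS 'org.domain.analytics.URIDECODE';
```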
RE: Load data in (external table) from symbolic link
Looks like MapR is complaining about the mounted directory, and somehow it is not accessible.
-Vivek

From: Hadoop Inquirer [hadoop.inqui...@gmail.com]
Sent: 08 December 2012 04:47
To: user@hive.apache.org
Subject: Load data in (external table) from symbolic link

Hi,

I am trying to create an external table in Hive by pointing it to a file that has symbolic links in its path reference. Hive seems to complain with the following error, indicating that it thinks the symbolic link is a file:

java.io.IOException: Open failed for file: /dir1/dir2/dir3_symlink, error: Invalid argument (22)
at com.mapr.fs.MapRClient.open(MapRClient.java:190)
at com.mapr.fs.MapRFileSystem.open(MapRFileSystem.java:327)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:460)
at org.apache.hadoop.mapred.LineRecordReader.init(LineRecordReader.java:93)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:54)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:237)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:383)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:336)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
at org.apache.hadoop.mapred.Child.main(Child.java:264)

Any help would be appreciated.
RE: handling null argument in custom udf
Could you please look into and share your task log / attempt log for the complete error trace, or the actual error behind this?
-Vivek

From: Søren [s...@syntonetic.com]
Sent: 04 December 2012 20:28
To: user@hive.apache.org
Subject: Re: handling null argument in custom udf

Thanks. Did you mean I should handle null in my UDF or in my SerDe? I did try to check for null inside the code in my UDF, but it fails even before it gets called. This is from when the UDF fails:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text com.company.hive.myfun.evaluate(java.lang.Object,java.lang.Object) on object com.company.hive.myfun@1412332 of class com.company.hive.myfun with arguments {0:java.lang.Object, null} of size 2

It looks like there is a null, or is this error message misleading?

On 04/12/2012 15:43, Edward Capriolo wrote:
There is no null argument. You should handle the null case in your code:

if (argA == null) ...

Or optionally you could use a GenericUDF, but a regular one should handle what you are doing.

On Tuesday, December 4, 2012, Søren s...@syntonetic.com wrote:
Hi Hive community,

I have a custom UDF, say myfun, written in Java, which I use like this:

select myfun(col_a, col_b) from mytable where ...

col_b is a string type and sometimes it is null. When that happens, my query crashes with:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {col_a:val,col_b:null}
...
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text

public final class myfun extends UDF {
public Text evaluate(final Text argA, final Text argB) {

I'm unsure how this should be fixed in a proper way. Is the framework looking for an overload of evaluate that would accept the null argument? I should add that the table is declared using my own JSON SerDe reading from S3.
I'm not processing nulls in my SerDe in any special way, because Hive seems to handle null the right way when it is not passed to my own UDF. Is there anyone out there with ideas or experience on this issue?

Thanks in advance,
Søren
RE: handling null argument in custom udf
The way a UDF works is that you need to tell your ObjectInspector about your primitive or Java types. So in your case, even if the value is null, you should be able to treat it as a String or some other concrete object; the invocation of evaluate() should then know the type of the Java object.
-Vivek

From: Vivek Mishra
Sent: 05 December 2012 15:36
To: user@hive.apache.org
Subject: RE: handling null argument in custom udf

Could you please look into and share your task log / attempt log for the complete error trace, or the actual error behind this?
-Vivek

From: Søren [s...@syntonetic.com]
Sent: 04 December 2012 20:28
To: user@hive.apache.org
Subject: Re: handling null argument in custom udf

Thanks. Did you mean I should handle null in my UDF or in my SerDe? I did try to check for null inside the code in my UDF, but it fails even before it gets called. This is from when the UDF fails:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text com.company.hive.myfun.evaluate(java.lang.Object,java.lang.Object) on object com.company.hive.myfun@1412332 of class com.company.hive.myfun with arguments {0:java.lang.Object, null} of size 2

It looks like there is a null, or is this error message misleading?

On 04/12/2012 15:43, Edward Capriolo wrote:
There is no null argument. You should handle the null case in your code:

if (argA == null) ...

Or optionally you could use a GenericUDF, but a regular one should handle what you are doing.

On Tuesday, December 4, 2012, Søren s...@syntonetic.com wrote:
Hi Hive community,

I have a custom UDF, say myfun, written in Java, which I use like this:

select myfun(col_a, col_b) from mytable where ...

col_b is a string type and sometimes it is null. When that happens, my query crashes with:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {col_a:val,col_b:null}
...
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text

public final class myfun extends UDF {
public Text evaluate(final Text argA, final Text argB) {

I'm unsure how this should be fixed in a proper way. Is the framework looking for an overload of evaluate that would accept the null argument? I should add that the table is declared using my own JSON SerDe reading from S3. I'm not processing nulls in my SerDe in any special way, because Hive seems to handle null the right way when it is not passed to my own UDF. Is there anyone out there with ideas or experience on this issue?

Thanks in advance,
Søren
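Edward's advice, handling the null case inside evaluate(), can be sketched as below. This is a plain-Java illustration, not the real class: in Hive the class would extend org.apache.hadoop.hive.ql.exec.UDF and take/return org.apache.hadoop.io.Text, and the concatenation stands in for whatever myfun actually computes. Note that the "{0:java.lang.Object, null}" in Søren's trace also hints that the SerDe may be handing Hive a raw Object rather than a Text/String, which is the type mismatch Vivek's reply addresses.

```java
// Sketch of a null-tolerant UDF evaluate(): check each argument and
// return null (SQL NULL) instead of dereferencing a null value.
public final class MyFunSketch {

    public static String evaluate(String argA, String argB) {
        if (argA == null || argB == null) {
            return null; // propagate SQL NULL rather than crash
        }
        return argA + ":" + argB; // stand-in for the real logic
    }

    public static void main(String[] args) {
        System.out.println(evaluate("val", null)); // prints null
        System.out.println(evaluate("a", "b"));    // prints a:b
    }
}
```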
RE: MapJoin error: .hashtable file not found
Hi,

I am using Hive 0.9.0 with Windows 2008 Server. I did try debugging into the code, but no luck in understanding it. Can somebody explain how exactly such .hashtable files get created? What could be the reason they are not created even though the gz files are there?
-Vivek

From: Mark Grover [grover.markgro...@gmail.com]
Sent: 28 November 2012 11:41
To: user@hive.apache.org
Subject: Re: MapJoin error: .hashtable file not found

Vivek,
What version of Hive are you using? And on what OS?
Mark

On Tue, Nov 27, 2012 at 9:34 PM, Vivek Mishra vivek.mis...@impetus.co.in wrote:
Any pointers? Any help? Can somebody explain how exactly such .hashtable files get created? What could be the reason they are not created even though the gz files are there?
-Vivek

From: Vivek Mishra
Sent: 24 November 2012 11:52
To: user@hive.apache.org
Subject: MapJoin error: .hashtable file not found

Hi,

I am trying to run a MapJoin query and somehow I am getting the error below. I can see that the file is not in the specified directory. I did some debugging, but no luck there as well.
Here is the error:

nOperator:Load back 1 hashtable file from tmp file uri:c:/hadoop/hdfs/mapred/local/taskTracker/distcache/-6923915657555089159_-359650783_821098197/localhost/tmp/hive-vivek.mishra/hive_2012-11-24_11-12-30_853_4076782611151037010/-mr-10004/HashTable-Stage-5/Stage-5.tar.gz/MapJoin-mapfile10--.hashtable
2012-11-24 11:13:45,389 ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Load Distributed Cache Error
2012-11-24 11:13:45,390 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: c:\hadoop\hdfs\mapred\local\taskTracker\distcache\-6923915657555089159_-359650783_821098197\localhost\tmp\hive-vivek.mishra\hive_2012-11-24_11-12-30_853_4076782611151037010\-mr-10004\HashTable-Stage-5\Stage-5.tar.gz\MapJoin-mapfile10--.hashtable (The system cannot find the file specified)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
at org.apache.hadoop.mapred.Child.main(Child.java:265)
Also, I have verified that there is no OOM issue either. Any idea?
-Vivek

Neustar VP and Impetus CEO to present on 'Innovative information services powered by Cloud and Big Data technologies' at Cloud Expo - Santa Clara, Nov 6th. http://www.impetus.com/events#2. Check out Impetus contribution to build Luminar - a new business unit at Entravision. http://lf1.me/MS/
RE: MapJoin error: .hashtable file not found
Thanks Ashutosh. I will give it a try.
-Vivek

From: Ashutosh Chauhan [hashut...@apache.org]
Sent: 28 November 2012 12:43
To: user@hive.apache.org
Subject: Re: MapJoin error: .hashtable file not found

Hi Vivek,
I would encourage you to try out the latest trunk or the hive-0.10 branch. A lot of work has gone in since 0.9 to make Hive work better on Windows.
Ashutosh

On Tue, Nov 27, 2012 at 11:08 PM, Vivek Mishra vivek.mis...@impetus.co.in wrote:
Hi,
I am using Hive 0.9.0 with Windows 2008 Server. I did try debugging into the code, but no luck in understanding it. Can somebody explain how exactly such .hashtable files get created? What could be the reason they are not created even though the gz files are there?
-Vivek

From: Mark Grover [grover.markgro...@gmail.com]
Sent: 28 November 2012 11:41
To: user@hive.apache.org
Subject: Re: MapJoin error: .hashtable file not found

Vivek,
What version of Hive are you using? And on what OS?
Mark

On Tue, Nov 27, 2012 at 9:34 PM, Vivek Mishra vivek.mis...@impetus.co.in wrote:
Any pointers? Any help? Can somebody explain how exactly such .hashtable files get created? What could be the reason they are not created even though the gz files are there?
-Vivek

From: Vivek Mishra
Sent: 24 November 2012 11:52
To: user@hive.apache.org
Subject: MapJoin error: .hashtable file not found

Hi,
I am trying to run a MapJoin query and somehow I am getting the error below. I can see that the file is not in the specified directory. I did some debugging, but no luck there as well.
Here is the error:

nOperator:Load back 1 hashtable file from tmp file uri:c:/hadoop/hdfs/mapred/local/taskTracker/distcache/-6923915657555089159_-359650783_821098197/localhost/tmp/hive-vivek.mishra/hive_2012-11-24_11-12-30_853_4076782611151037010/-mr-10004/HashTable-Stage-5/Stage-5.tar.gz/MapJoin-mapfile10--.hashtable
2012-11-24 11:13:45,389 ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Load Distributed Cache Error
2012-11-24 11:13:45,390 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: c:\hadoop\hdfs\mapred\local\taskTracker\distcache\-6923915657555089159_-359650783_821098197\localhost\tmp\hive-vivek.mishra\hive_2012-11-24_11-12-30_853_4076782611151037010\-mr-10004\HashTable-Stage-5\Stage-5.tar.gz\MapJoin-mapfile10--.hashtable (The system cannot find the file specified)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
at org.apache.hadoop.mapred.Child.main(Child.java:265)
Also, I have verified that there is no OOM issue either. Any idea?
-Vivek
MapJoin error: .hashtable file not found
Hi,

I am trying to run a MapJoin query and somehow I am getting the error below. I can see that the file is not in the specified directory. I did some debugging, but no luck there as well. Here is the error:

nOperator:Load back 1 hashtable file from tmp file uri:c:/hadoop/hdfs/mapred/local/taskTracker/distcache/-6923915657555089159_-359650783_821098197/localhost/tmp/hive-vivek.mishra/hive_2012-11-24_11-12-30_853_4076782611151037010/-mr-10004/HashTable-Stage-5/Stage-5.tar.gz/MapJoin-mapfile10--.hashtable
2012-11-24 11:13:45,389 ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Load Distributed Cache Error
2012-11-24 11:13:45,390 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: c:\hadoop\hdfs\mapred\local\taskTracker\distcache\-6923915657555089159_-359650783_821098197\localhost\tmp\hive-vivek.mishra\hive_2012-11-24_11-12-30_853_4076782611151037010\-mr-10004\HashTable-Stage-5\Stage-5.tar.gz\MapJoin-mapfile10--.hashtable (The system cannot find the file specified)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
at org.apache.hadoop.mapred.Child.main(Child.java:265)

Also, I have verified that there is no OOM issue either. Any idea?
-Vivek
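While the root cause is being tracked down, one way to sidestep the missing .hashtable file is to keep Hive from converting the join to a map join at all, since the local hashtable files are only produced for map joins. A possible session-level workaround, assuming Hive 0.9-era settings (verify against your version):

```sql
-- Fall back to a plain reduce-side (common) join for this session:
set hive.auto.convert.join=false;
-- If the query carries an explicit /*+ MAPJOIN(small_table) */ hint,
-- remove the hint as well; small_table is a placeholder name.
```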
RE: hive select count(*) query exception
Did you copy the same jars into the classpath for Hadoop? Is HADOOP_HOME for Hive set to this same one? I tried with the 0.20.2 version from the Apache site only.

From: jingjung Ng [jingjun...@gmail.com]
Sent: 18 December 2011 11:02
To: user@hive.apache.org
Subject: Re: hive select count(*) query exception

I am using hadoop-0.20.2-cdh3u1 from Cloudera.
Jing

On Sat, Dec 17, 2011 at 12:18 AM, Vivek Mishra vivek.mis...@impetus.co.in wrote:
Which version of Hadoop are you experimenting with? AFAIK, only 0.20.x works fine. I tried versions 0.20.x but no luck.
Vivek

From: alo alt [wget.n...@googlemail.com]
Sent: 16 December 2011 14:59
To: user@hive.apache.org
Subject: Re: hive select count(*) query exception

Hi,
It looks like the user who runs the statement does not have the correct rights:
org.apache.hadoop.fs.permission.FsPermission$2.init())'
- Alex

On Fri, Dec 16, 2011 at 8:59 AM, jingjung Ng jingjun...@gmail.com wrote:
Hi,
I have a simple Hive select count(*) query, which results in the following exception. I am using Cloudera cdh3u1 (hadoop/hbase/hive). However, I am able to do select * from t1 from the Hive CLI. Here is the output after running select count(*) from t1:
hive> select count(*) from t1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: java.lang.NoSuchMethodException: org.apache.hadoop.fs.permission.FsPermission$2.init()
at org.apache.hadoop.ipc.Client.call(Client.java:1107)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
at $Proxy4.setPermission(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy4.setPermission(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.setPermission(DFSClient.java:855)
at org.apache.hadoop.hdfs.DistributedFileSystem.setPermission(DistributedFileSystem.java:560)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:123)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:839)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
at
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:657)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:209)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:286)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:513)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(IPC server unable to read call parameters: java.lang.NoSuchMethodException: org.apache.hadoop.fs.permission.FsPermission$2.init())'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
hive>

Thanks, Jing.

--
Alexander Lorenz
http://mapredit.blogspot.com
RE: Hive server not starting...on EC2 Ubuntu 10.04 instance
Hi,

Additionally, verify this property in hive-default.xml:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

From: Jasper Knulst [jasper.knu...@incentro.com]
Sent: 18 December 2011 23:02
To: user@hive.apache.org
Subject: Re: Hive server not starting...on EC2 Ubuntu 10.04 instance

Hi Periya,
Try removing both .lck (lock) files in the metastore_db folder. The Derby db can only support one user, so if you have used Hive from the CLI, it is probably locked by another user.
Cheers, Jasper

On 18 Dec 2011 at 17:42, Periya.Data periya.d...@gmail.com wrote:
Hi Vivek,
I tried doing it with sudo and later also changed the permissions. Neither worked.

root@domU-12-31-39-0E-C9-33:/var/lib/hive/metastore# ls -l
total 4
drwxr-xr-x 5 root root 4096 2011-12-18 03:19 metastore_db
root@domU-12-31-39-0E-C9-33:/var/lib/hive/metastore# chmod 777 metastore_db/
root@domU-12-31-39-0E-C9-33:/var/lib/hive/metastore# ls -l
total 4
drwxrwxrwx 5 root root 4096 2011-12-18 03:19 metastore_db
root@domU-12-31-39-0E-C9-33:/var/lib/hive/metastore# hive --service hiveserver
Starting Hive Thrift Server
^Croot@domU-12-31-39-0E-C9-33:/var/lib/hive/metastore#

-PD

On Sun, Dec 18, 2011 at 4:34 AM, Vivek Mishra vivek.mis...@impetus.co.in wrote:
Try issuing it with sudo, as metastore_db is locked for the root user. Else do sudo chmod 777 on the metastore_db folder.
Vivek

From: Periya.Data [periya.d...@gmail.com]
Sent: 18 December 2011 10:29
To: user@hive.apache.org
Subject: Hive server not starting...on EC2 Ubuntu 10.04 instance

Hi all,
I am trying to start the Hive server, but after the command it looks like nothing is happening. I am not even getting a prompt.
Here are some details:
- Machine: EC2 Ubuntu 10.04 LTS
- Hive version: 0.7.1-cdh3u2 (as seen from hive-default.xml)
- Hadoop version: 0.20.2
- I currently have an embedded Derby database as my metastore. (I plan to move it to a remote MySQL DB later; for now, I am the only user.)

root@domU-12-31-39-0E-C9-33:/usr/lib/hive/conf# hive --service hiveserver
Starting Hive Thrift Server

It just hangs here... nothing happens for 10 minutes; I had to Ctrl-C to get out. It looks like it is unable to talk to the metastore. I am able to run the Hive shell, create and drop tables, and run queries from the shell.

I did the following:
- HIVE_PORT=1 hive --service hiveserver (same problem)
- Tried changing ports, but same problem.

Log file (/tmp/user/hive.log):

org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection.
at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1028)
at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1013)
at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:1712)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:289)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:209)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:286)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:485)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) Caused by: javax.jdo.JDOFatalDataStoreException: Cannot get a connection, pool error Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. NestedThrowables: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:298) at org.datanucleus.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:601
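Jasper's suggestion above can be sketched as a shell session. This is a hypothetical illustration against a throwaway directory; the real lock files (typically db.lck and dbex.lck) live under the metastore_db directory shown in the thread, and any process holding the embedded metastore must be stopped before you delete them.

```shell
# Illustrative only: a stand-in for /var/lib/hive/metastore/metastore_db.
demo=/tmp/metastore_db_demo
mkdir -p "$demo"
# The Derby lock files a crashed or concurrent Hive process leaves behind.
touch "$demo/db.lck" "$demo/dbex.lck"

# With no other process using the embedded metastore, the locks are safe to delete.
rm -f "$demo"/*.lck
ls -A "$demo"   # prints nothing: both lock files are gone
```

Because embedded Derby allows only a single JVM to boot the database, a stale lock from one process makes every later connection fail in read-only mode, which matches the "read-only user ... not permitted" error in the log above.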
RE: hive select count(*) query exception
Which version of Hadoop are you experimenting with? AFAIK, only 0.20.X works fine. I tried 0.20.x versions but no luck.

Vivek

From: alo alt [wget.n...@googlemail.com]
Sent: 16 December 2011 14:59
To: user@hive.apache.org
Subject: Re: hive select count(*) query exception

Hi,

It looks like the user who runs the statement does not have the correct rights:
org.apache.hadoop.fs.permission.FsPermission$2.<init>())'

- Alex

On Fri, Dec 16, 2011 at 8:59 AM, jingjung Ng jingjun...@gmail.com wrote:

Hi,

I have a simple Hive select count(*) query which results in the following exception. I am using Cloudera cdh3u1 (hadoop/hbase/hive). However, I am able to do select * from t1 from the Hive CLI. Here is the output after running select count(*) from t1:

hive> select count(*) from t1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: java.lang.NoSuchMethodException: org.apache.hadoop.fs.permission.FsPermission$2.<init>()
    at org.apache.hadoop.ipc.Client.call(Client.java:1107)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at $Proxy4.setPermission(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy4.setPermission(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.setPermission(DFSClient.java:855)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setPermission(DistributedFileSystem.java:560)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:123)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:839)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:657)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:209)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:286)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:513)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(IPC server unable to read call parameters: java.lang.NoSuchMethodException: org.apache.hadoop.fs.permission.FsPermission$2.<init>())'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
hive>

Thanks,
Jing.

--
Alexander Lorenz
http://mapredit.blogspot.com

Think of the environment: please don't print this email unless you really need to.

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error.
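A NoSuchMethodException raised while the IPC server deserializes call parameters is characteristic of a Hadoop client/server jar mismatch rather than a pure permissions problem: the FsPermission inner class the server expects does not exist in the client's hadoop-core jar. A hedged sketch of the diagnosis follows; the version strings are made up for illustration — in practice you would compare the hadoop-core jar bundled under $HIVE_HOME/lib with the output of `hadoop version` on the cluster.

```shell
# Hypothetical values; substitute what you actually find on your machines.
client_jar_version="0.20.2-cdh3u0"   # e.g. from: ls $HIVE_HOME/lib/hadoop-core-*.jar
cluster_version="0.20.2-cdh3u1"      # e.g. from: hadoop version (on the cluster)

# If the two differ, align the client jar with the cluster before chasing
# HDFS permission settings.
if [ "$client_jar_version" = "$cluster_version" ]; then
  echo "versions match: look elsewhere (e.g. staging-dir permissions)"
else
  echo "version mismatch: replace the hadoop-core jar on the client side"
fi
```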
RE: Hive not reflecting hdfs data
Please check the metastore_db location. That should help.

Vivek

From: abhishek pathak [mailto:forever_yours_a...@yahoo.co.in]
Sent: Thursday, March 10, 2011 5:05 PM
To: Hive mailing list
Subject: Hive not reflecting hdfs data

Hi,

I am a Hive newbie. I am managing a setup where data is regularly fed into HDFS using Flume. However, Hive does not show the data that was recently added to HDFS. It used to earlier, but somehow it is not updating now. The queries I fire all return answers based on the old HDFS data and do not reflect the newer data added there. Is there a configuration that is messed up? Is there some way I can check where the external table is pointing?

Regards,
Abhishek Pathak
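On the "where is the external table pointing" question: DESCRIBE EXTENDED prints the metastore's record of a table, including its HDFS location. A small sketch, with a made-up table name (`flume_events`); the hive invocation itself is only shown in a comment, since it needs a live cluster:

```shell
# Write the query to a script file; run it later with: hive -f /tmp/check_location.hql
cat > /tmp/check_location.hql <<'EOF'
-- Look for "location:hdfs://..." in the output of this statement.
DESCRIBE EXTENDED flume_events;
EOF
cat /tmp/check_location.hql
```

If the location differs from where Flume is writing, the table (or its partitions) is simply pointing at the wrong directory.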
Hive import issue
Hi,

I am a newbie to Hive. When I try to import data to HBase via a table managed by Hive, I get the following errors:

mismatched input 'Timestamp' expecting Identifier in column specification
mismatched input 'data' expecting Identifier in column specification

Removing these columns or renaming them to something like 'data_something' makes it work. Any idea why this is happening? And what is the full list of keywords that cannot be used as column names? Any help will be greatly appreciated.

Vivek
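The parser errors above come from 'timestamp' and 'data' colliding with words in Hive's grammar. Besides renaming, one workaround is to quote the identifier in backticks; the DDL below is a sketch with made-up table and column names, and backtick quoting of reserved words may not be available in very old Hive releases.

```shell
# Write the DDL to a file; run it later with: hive -f /tmp/create_events.hql
cat > /tmp/create_events.hql <<'EOF'
CREATE TABLE events (
  `timestamp` STRING,   -- backticks let the parser accept a reserved word
  `data`      STRING,
  payload     STRING
)
STORED AS TEXTFILE;
EOF
grep -n '`' /tmp/create_events.hql   # shows the two backtick-quoted columns
```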
Running a HiveClient with create external table HBase
Hi,

Currently I am facing random behavior while trying to create a Java client for Hive/HBase integration.

Case: I am trying to create a Hive table for an existing HBase table, so I have started the Hive server via ./hive --service hiveserver. In the logs I can see it printing my SQL with CREATE EXTERNAL TABLE, but somehow that table is not getting created in Hive. The interesting point is that running the same SQL from the Hive command line works fine. This behavior is random: sometimes it does show me all the created tables in Hive (when I use SHOW TABLES). Does it have something to do with 'metastore_db'? Any idea?
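One plausible explanation, hinted at by the 'metastore_db' question: with the default embedded-Derby ConnectionURL (jdbc:derby:;databaseName=metastore_db;create=true), each Hive process creates metastore_db relative to its current working directory, so a Thrift server and a CLI started from different directories can silently use two different metastores. A sketch of the effect using plain directories (no Hive involved):

```shell
# Simulate two Hive processes started from different working directories.
base=/tmp/metastore_cwd_demo
mkdir -p "$base/cli" "$base/server"
(cd "$base/cli"    && mkdir -p metastore_db)   # what a CLI started here would create
(cd "$base/server" && mkdir -p metastore_db)   # what a Thrift server started here would create
ls -d "$base"/*/metastore_db                   # two independent "metastores"
```

The usual fix is to point javax.jdo.option.ConnectionURL at an absolute databaseName path, or at a shared database such as MySQL, so that every process sees the same metastore.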
RE: Hive/HBase integration issue.
Added a post at: http://mevivs.wordpress.com/2010/11/24/hivehbase-integration/ — sharing it in case it is useful.

Vivek

-----Original Message-----
From: Vivek Mishra
Sent: Friday, November 19, 2010 10:36 AM
To: user@hive.apache.org
Subject: RE: Hive/HBase integration issue.

Hi,

Just found that it is related to the HIVE-1264 JIRA. Thanks for all the help.

Vivek

-----Original Message-----
From: John Sichi [mailto:jsi...@fb.com]
Sent: Friday, November 19, 2010 1:02 AM
To: user@hive.apache.org
Subject: Re: Hive/HBase integration issue.

This is unrelated to Hive/HBase integration; it looks like a Hadoop version issue.

JVS

On Nov 17, 2010, at 9:56 PM, Vivek Mishra wrote:

Hi,

Currently I am facing an issue with Hive/HBase integration:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.util.Shell.getGROUPS_COMMAND()[Ljava/lang/String;

StackTrace:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.util.Shell.getGROUPS_COMMAND()[Ljava/lang/String;
    at org.apache.hadoop.security.UnixUserGroupInformation.getUnixGroups(UnixUserGroupInformation.java:320)
    at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:243)
    at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
    at org.apache.hadoop.hive.ql.Driver.<init>(Driver.java:273)
    at org.apache.hadoop.hive.ql.processors.CommandProcessorFactory.get(CommandProcessorFactory.java:49)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:131)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:302)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

I believe it is because of jdk6 backward compatibility. I tried to set -Dsun.lang.ClassLoader.allowArraySyntax=true, but unfortunately it didn't work. Any help will be greatly appreciated.

Thanks and Regards,
Vivek Mishra