Re: How to set an empty value to hive.querylog.location to disable the creation of hive history file
How about setting it to /dev/null? Not sure if that would help in your case; just a hack. Regards.

On Thu, Dec 6, 2012 at 2:14 PM, Bing Li sarah.lib...@gmail.com wrote:

Hi, all

According to https://cwiki.apache.org/Hive/adminmanual-configuration.html, if I set hive.querylog.location to an empty string, it won't create the structured log. I edited hive-site.xml in HIVE_HOME/conf and added the following setting:

<property>
  <name>hive.querylog.location</name>
  <value></value>
</property>

BUT it didn't work: when I launch HIVE_HOME/bin/hive, it still creates a history file in /tmp/user.name, which is the default directory for this property. Do you know how to set an EMPTY value in hive-site.xml?

Thanks,
- Bing
Re: handling null argument in custom udf
Right, thanks for all the help. It turned out that checking for null in the code did help; no mystery. I did try that earlier, but the attempt got lost somehow. Thanks for the advice on using GenericUDF.

cheers
Søren

On 05/12/2012 11:10, Vivek Mishra wrote:

The way a UDF works is that you need to tell your ObjectInspector about your primitive or Java types. So in your case, even if the value is null, you should be able to assign it as a String or any other object. Then the invocation of the evaluate() function should know about the type of the Java object.
-Vivek

From: Vivek Mishra
Sent: 05 December 2012 15:36
To: user@hive.apache.org
Subject: RE: handling null argument in custom udf

Could you please look into and share your task log / attempt log for the complete error trace or the actual error behind this?
-Vivek

From: Søren [s...@syntonetic.com]
Sent: 04 December 2012 20:28
To: user@hive.apache.org
Subject: Re: handling null argument in custom udf

Thanks. Did you mean I should handle null in my UDF or my SerDe? I did try to check for null inside the code in my UDF, but it fails even before it gets called. This is from when the UDF fails:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text com.company.hive.myfun.evaluate(java.lang.Object,java.lang.Object) on object com.company.hive.myfun@1412332 of class com.company.hive.myfun with arguments {0:java.lang.Object, null} of size 2

It looks like there is a null, or is this error message misleading?

On 04/12/2012 15:43, Edward Capriolo wrote:

There is no null argument. You should handle the null case in your code:

if (argA == null)

Or optionally you could use a GenericUDF, but a regular one should handle what you are doing.
On Tuesday, December 4, 2012, Søren s...@syntonetic.com wrote:

Hi Hive community,

I have a custom UDF, say myfun, written in Java, which I use like this:

select myfun(col_a, col_b) from mytable where ...

col_b is a string type and sometimes it is null. When that happens, my query crashes with:

---
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"col_a":"val","col_b":null}
...
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text
---

public final class myfun extends UDF {
    public Text evaluate(final Text argA, final Text argB) {

I'm unsure how this should be fixed in a proper way. Is the framework looking for an overload of evaluate that would accept the null argument? I should mention that the table is declared using my own JSON SerDe reading from S3. I'm not processing nulls in my SerDe in any special way, because Hive seems to handle null correctly when it is not passed to my own UDF. Does anyone out there have ideas or experience with this issue?

thanks in advance
Søren

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
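Edward's suggestion, a plain null check at the top of evaluate(), can be sketched as below. The class name and the concatenation behavior are hypothetical stand-ins for the poster's myfun, and String replaces org.apache.hadoop.io.Text so the null-handling pattern is self-contained; in a real Hive UDF the class would extend org.apache.hadoop.hive.ql.exec.UDF.

```java
// Sketch of the null-check pattern for a Hive UDF. In real code this class
// would extend org.apache.hadoop.hive.ql.exec.UDF and take Text arguments;
// String stands in here so the example compiles without Hive on the classpath.
public final class MyFun {
    public String evaluate(final String argA, final String argB) {
        if (argA == null || argB == null) {
            // Propagate NULL instead of dereferencing the argument,
            // matching how built-in Hive functions treat NULL inputs.
            return null;
        }
        return argA + ":" + argB;
    }
}
```

The same guard at the top of evaluate() in the Text-based UDF avoids the NullPointerException that surfaces as "Unable to execute method" in the stack trace above.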
RE: How to set an empty value to hive.querylog.location to disable the creation of hive history file
It's not supported now. I think you can raise it in JIRA.

Regards,
Ransom

From: Bing Li [mailto:sarah.lib...@gmail.com]
Sent: Thursday, December 06, 2012 5:06 PM
To: user@hive.apache.org
Subject: Re: How to set an empty value to hive.querylog.location to disable the creation of hive history file

It will exit with an error like:

FAILED: Failed to open Query Log: /dev/null/hive_job_log_xxx.txt

and point out that the path is not a directory.

2012/12/6 Jithendranath Joijoide pixelma...@gmail.com

How about setting it to /dev/null? Not sure if that would help in your case; just a hack. Regards.

On Thu, Dec 6, 2012 at 2:14 PM, Bing Li sarah.lib...@gmail.com wrote:

Hi, all

According to https://cwiki.apache.org/Hive/adminmanual-configuration.html, if I set hive.querylog.location to an empty string, it won't create the structured log. I edited hive-site.xml in HIVE_HOME/conf and added the following setting:

<property>
  <name>hive.querylog.location</name>
  <value></value>
</property>

BUT it didn't work: when I launch HIVE_HOME/bin/hive, it still creates a history file in /tmp/user.name, which is the default directory for this property. Do you know how to set an EMPTY value in hive-site.xml?

Thanks,
- Bing
Mapping existing HBase table with many columns to Hive.
Hello,

How can I map an HBase table with the following layout to Hive, using the CREATE EXTERNAL TABLE command from the shell (or another programmatic way)?

The HBase table's layout is as follows:

Rowkey: 16 bytes, a UUID that had the dashes removed and the 32 hex chars converted into two 8-byte longs.
Columns (qualifiers): timestamps, i.e. the bytes of a long converted using Hadoop's Bytes.toBytes(long). There can be many of those in a single row.
Values: the bytes of a Java string.

I am unsure of which datatypes to use. I am pretty sure there is no way I can sensibly map the row key to anything other than binary, but maybe the columns (which are longs) and the values (which are strings) can be mapped to their corresponding Hive datatypes.

I include an extract of what a row looks like in HBase shell below:

hbase(main):009:0> scan 'hits'
ROW COLUMN+CELL
\x00\x00\x06\xB1H\x89N\xC3\xA5\x83\x0F\xDD\x1E\xAE\xDC column=t:\x00\x00\x01;2\xE6Q\x06, timestamp=1267737987733, value=blahaha
\x00\x00\x06\xB1H\x89N\xC3\xA5\x83\x0F\xDD\x1E\xAE\xDC column=t:\x00\x00\x01;2\xE6\xFB@, timestamp=1354012104967, value=testtest

Thank you,
/David
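For reference, the row-key construction David describes (a UUID's 32 hex chars packed as two 8-byte longs) could look something like the sketch below; the class and method names are hypothetical, not taken from his code.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public final class RowKeys {
    /**
     * Packs a UUID into a 16-byte HBase row key: the most-significant
     * 64 bits first, then the least-significant 64 bits, both big-endian.
     * This matches "32 hex chars converted into two 8-byte longs".
     */
    public static byte[] toRowKey(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }
}
```

A key built this way is pure binary, which is why mapping it to anything other than Hive's binary type is awkward.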
Re: How is it that every hive release in maven depends on
These jars are pulled in by DataNucleus, which is a dependency of hive-metastore. The DataNucleus project manages its own repositories for these jars: http://www.datanucleus.org/downloads/maven2

chris

From: Edward Capriolo edlinuxg...@gmail.com
Reply-To: user@hive.apache.org
Date: Thursday, December 6, 2012 8:56 AM
To: user@hive.apache.org
Subject: How is it that every hive release in maven depends on

http://mvnrepository.com/artifact/org.apache.hive/hive-metastore/0.9.0

javax.jdo : jdo2-api : 2.3-ec

2.3-ec is not in Maven Central. All our poms seem to reference this. What is the deal here?
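One workaround Chris's answer implies is adding the DataNucleus repository (the URL from his message) to the consuming project's pom.xml so Maven can resolve jdo2-api 2.3-ec. This is a sketch under standard Maven conventions, not an official Hive recommendation, and the repository id is arbitrary:

```xml
<!-- Hypothetical addition to the consuming project's pom.xml -->
<repositories>
  <repository>
    <id>datanucleus</id>
    <url>http://www.datanucleus.org/downloads/maven2</url>
  </repository>
</repositories>
```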
Re: Mapping existing HBase table with many columns to Hive.
Hi David,

First of all, your columns are not longs. They are binary as well. Currently, as Hive stands, there is no support for binary qualifiers. However, I recently submitted a patch for that [1]. Feel free to give it a shot and let me know if you see any issues. With that patch, you can directly give your qualifiers to Hive as they look here (\x00\x00\x01;2\xE6Q\x06).

Until then, the only option you have is to use a map to map all your columns under the column family t. An example to do that would be:

CREATE EXTERNAL TABLE hbase_table_1(key int, value map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,t:")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");

Also, as far as your key goes, it is a composite key. There is also an existing patch for the support of that here [2].

Hope that helps.

[1] https://issues.apache.org/jira/browse/HIVE-3553
[2] https://issues.apache.org/jira/browse/HIVE-2599

On Thu, Dec 6, 2012 at 12:56 PM, David Koch ogd...@googlemail.com wrote:

Hello,

How can I map an HBase table with the following layout to Hive, using the CREATE EXTERNAL TABLE command from the shell (or another programmatic way)?

The HBase table's layout is as follows:

Rowkey: 16 bytes, a UUID that had the dashes removed and the 32 hex chars converted into two 8-byte longs.
Columns (qualifiers): timestamps, i.e. the bytes of a long converted using Hadoop's Bytes.toBytes(long). There can be many of those in a single row.
Values: the bytes of a Java string.

I am unsure of which datatypes to use. I am pretty sure there is no way I can sensibly map the row key to anything other than binary, but maybe the columns (which are longs) and the values (which are strings) can be mapped to their corresponding Hive datatypes.

I include an extract of what a row looks like in HBase shell below:

hbase(main):009:0> scan 'hits'
ROW COLUMN+CELL
\x00\x00\x06\xB1H\x89N\xC3\xA5\x83\x0F\xDD\x1E\xAE\xDC column=t:\x00\x00\x01;2\xE6Q\x06, timestamp=1267737987733, value=blahaha
\x00\x00\x06\xB1H\x89N\xC3\xA5\x83\x0F\xDD\x1E\xAE\xDC column=t:\x00\x00\x01;2\xE6\xFB@, timestamp=1354012104967, value=testtest

--
Swarnim
Re: Mapping existing HBase table with many columns to Hive.
Hello Swarnim,

Thank you for your answer. I will try the options you pointed out.

/David

On Thu, Dec 6, 2012 at 9:10 PM, kulkarni.swar...@gmail.com wrote:
Locking in HIVE : How to use locking/unlocking features using hive java API ?
Hi,

I'm building / designing a backup and restore tool for Hive data for Disaster Recovery scenarios. I'm trying to understand the locking behavior of Hive, which currently supports ZooKeeper for locking. My thought process is like this (early design):

1. Back up the metadata of Hive.
2. Back up the data for Hive tables on S3, HDFS, or NFS.
3. Restore table(s):
   a. Only data
   b. Schema and data

So, to achieve the 1st task, this is the flow I'm thinking of:

a. Check whether there is any exclusive lock on the table whose metadata needs to be backed up.
   If YES: don't do anything; wait and retry at the configured frequency for the configured number of attempts.
   If NO: get the metadata of the table and create the DDL statement for Hive, including table / partitions etc.

For the 2nd task:

a. Check whether the table has any exclusive lock. If NOT, take a shared lock, start the copy, and release the shared lock once done. If YES, wait and retry.

For the 3rd task (restoring):

a. Only data: check if there is any lock on the table. If NO, take the exclusive lock, insert the data into the table, and release the lock. If YES, wait and retry.
b. Schema and data: check if there is any lock on the table/partition. If NO, drop and create the table/partitions. If YES, wait and retry. Once the schema is created: take the exclusive lock, insert the data, and release the lock.

Now I'm going to run this kind of job from my scheduler / WF engine. I need input on the following questions:

a. Does this overall approach look good?
b. How can I take and release different locks explicitly using the Hive API? Ref: https://cwiki.apache.org/confluence/display/Hive/Locking
   If I understood correctly, as per this, Hive still doesn't support locking explicitly at the API level. Is there any plan or patch to get this done? I saw some classes like ZooKeeperHiveLock etc., but I need to dig further to see if I can use these classes for the locking features.

Thanks for your time and effort.

Regards,
Manish
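The wait-and-retry step that recurs throughout Manish's flow can be sketched as a small helper. Everything here is hypothetical: the class name, the BooleanSupplier probe (which in a real tool would wrap an actual lock-status check), and the fixed sleep between attempts. It is not part of any real Hive API.

```java
import java.util.function.BooleanSupplier;

public final class LockRetry {
    /**
     * Polls isFree until it reports the table is unlocked, or until
     * maxAttempts checks have failed, sleeping waitMillis between checks.
     * Returns true if the caller may proceed, false if the lock persisted.
     */
    public static boolean acquireWithRetry(BooleanSupplier isFree,
                                           int maxAttempts,
                                           long waitMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isFree.getAsBoolean()) {
                return true; // no conflicting lock observed: safe to proceed
            }
            Thread.sleep(waitMillis); // configured retry frequency
        }
        return false; // lock was still held after all attempts
    }
}
```

Note that a check-then-act sequence like this is inherently racy unless the eventual lock acquisition itself is atomic (as ZooKeeper-based locks are), so the probe is best treated as an optimization, not a correctness guarantee.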