RE: Hive Custom UDF - "hive.aux.jars.path" not working
Hi,

You need to point hive.aux.jars.path at the jar file itself, not just at the directory. For example:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$ grep aux hive-site.xml
<property>
  <name>hive.aux.jars.path</name>
  <value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist/{URJARNAME}.jar</value>
</property>

You are using CLI mode, so after changing the value it is enough to start a new shell. Hive can also be started in another mode, HiveServer; in that case you need to restart the Hive server after changing the value.

Thanks,
Chinna Rao Lalam

From: Amit Sharma [mailto:amitsharma1...@gmail.com]
Sent: Tuesday, August 23, 2011 3:35 AM
To: user@hive.apache.org
Subject: Re: Hive Custom UDF - "hive.aux.jars.path" not working

Hi Vaibhav, excuse my ignorance, as I'm a little new to Hive. What do you mean by restarting the Hive server? I am using the Hive interactive shell for my work, so I start the shell after modifying the config variable. Which server do I need to restart?

Amit
Re: hive-0.7.1: TestCliDriver FAILED
From the result, we can see that the difference between the source file and the target file is only the path, which should be masked when the files are compared.

Bing

On Mon, Aug 22, 2011, 李 冰 wrote:
From: 李 冰
Subject: hive-0.7.1: TestCliDriver FAILED
To: user@hive.apache.org
Cc: d...@hive.apache.org
Date: Mon, Aug 22, 2011, 10:43 PM
Re: Hive Custom UDF - "hive.aux.jars.path" not working
Hi Vaibhav,

Excuse my ignorance, as I'm a little new to Hive. What do you mean by restarting the Hive server? I am using the Hive interactive shell for my work, so I start the shell after modifying the config variable. Which server do I need to restart?

Amit

On Mon, Aug 22, 2011 at 2:49 PM, Aggarwal, Vaibhav wrote:
> Did you restart the Hive server after modifying the hive-site.xml settings?
>
> I think you need to restart the server to pick up the latest settings in
> the config file.
>
> Thanks
> Vaibhav
>
> From: Amit Sharma [mailto:amitsharma1...@gmail.com]
> Sent: Monday, August 22, 2011 2:42 PM
> To: user@hive.apache.org
> Subject: Hive Custom UDF - "hive.aux.jars.path" not working
RE: Hive Custom UDF - "hive.aux.jars.path" not working
Did you restart the Hive server after modifying the hive-site.xml settings? I think you need to restart the server to pick up the latest settings in the config file.

Thanks
Vaibhav

From: Amit Sharma [mailto:amitsharma1...@gmail.com]
Sent: Monday, August 22, 2011 2:42 PM
To: user@hive.apache.org
Subject: Hive Custom UDF - "hive.aux.jars.path" not working
Hive Custom UDF - "hive.aux.jars.path" not working
Hi,

I built custom UDFs for Hive and they seem to work fine when I explicitly register the jars using the "add jar" command or put them in the environment variable "HIVE_AUX_JARS_PATH". But if I add the path as a configuration variable in the hive-site.xml file and try to register the function using "create temporary function ... as '...'", Hive cannot find the jar. Any idea what's going on here?

Here is the snippet from hive-site.xml:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$ grep aux hive-site.xml
<property>
  <name>hive.aux.jars.path</name>
  <value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist</value>
</property>

Amit
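For comparison, the explicit registration path that does work can be sketched as below; the jar path, function name, and class name are placeholders for illustration, not the poster's actual names:

```sql
-- Placeholder jar, function, and class names, for illustration only.
ADD JAR /path/to/my-udfs.jar;
CREATE TEMPORARY FUNCTION my_upper AS 'com.example.hive.udf.MyUpper';
SELECT my_upper(name) FROM some_table LIMIT 10;
```

With hive.aux.jars.path set correctly, the ADD JAR step should become unnecessary, which is exactly the part failing here.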
One Schema Per Partition? (Multiple schemas per table?)
I found a set of slides from Facebook online about Hive that claims you can have a schema per partition in the table. This is exciting to us, because we have a table like so:

  id    int
  name  string
  level int
  date  string

And it's broken up into partitions by date. However, on a particular date last year, the table dramatically changed its schema to:

  id      int
  level   int
  date    string
  name_id int

So now if I do "select * from table" in Hive, the data is completely garbled for whichever portion of data doesn't fit the Hive schema. We are considering re-writing the datafiles so they're the same before/after that date, but if Hive supports having two entirely different schemas depending on the partition, that'd be really convenient, since these datafiles are hundreds of gigabytes in size (and we do sort of like the idea of knowing how the datafile looked back then...). This page:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable%2FPartitionStatements

doesn't seem to have an appropriate example, so I'm left wondering. Has anyone done anything like this?

-- Tim Ellis
Data Architect, Riot Games
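If the slides refer to the fact that Hive stores column metadata per partition, one hedged sketch (table name and dates invented for illustration) is to alter the table's columns between creating the old-layout and new-layout partitions; on many Hive versions, already-created partitions keep the column list they were created with:

```sql
-- Hypothetical table; dates chosen only for illustration.
CREATE TABLE game_events (id INT, name STRING, level INT)
PARTITIONED BY (dt STRING);

ALTER TABLE game_events ADD PARTITION (dt='2010-06-01');  -- old layout

-- Switch the column list; partitions added afterwards pick it up,
-- while the already-created partition keeps its old metadata.
ALTER TABLE game_events REPLACE COLUMNS (id INT, level INT, name_id INT);

ALTER TABLE game_events ADD PARTITION (dt='2011-06-01');  -- new layout
```

Whether queries then deserialize each partition with its own column list depends on the SerDe and file format in use, so this is a sketch to test, not a confirmed recipe.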
Re: Passing table properties to the InputFormat
I have been able to get the table properties in the InputFormat as below, but I am not sure whether this is the correct way or whether there is a better one:

Properties tableProperties = Utilities.getMapRedWork(job)
    .getPathToPartitionInfo()
    .get(getInputPaths(job)[0].toString())
    .getTableDesc()
    .getProperties();

From: Shantian Purkad
To: "user@hive.apache.org"
Sent: Saturday, August 20, 2011 5:01 PM
Subject: Passing table properties to the InputFormat

Hi,

I have a custom InputFormat that reads multiple lines as one row, based on the number of columns in the table. I want to dynamically pass it the table properties (such as the number of columns in the table, their data types, etc., just like what you get in a SerDe). How can I do that? If that is not possible and a SerDe is an option, how can I use my custom record reader in a SerDe?

My table definition is:

create table delimited_data_serde (
  col1 int,
  col2 string,
  col3 int,
  col4 string,
  col5 string,
  col6 string
)
STORED AS
  INPUTFORMAT 'fwrk.hadoop.input.DelimitedInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

The input format needs the property 'total.fields.count'='6'. If I set it with

set total.fields.count=6;

it works; however, I would have to change this property for every table that uses the custom InputFormat before querying that table. How can I automatically get a handle to the table properties in the InputFormat?

Regards,
Shantian
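One way to keep the property with the table rather than the session is TBLPROPERTIES; whether the custom InputFormat then sees it depends on it reading the table descriptor's properties (as in the Utilities snippet at the top of this message), so treat this as a sketch, not a confirmed fix:

```sql
-- Sketch: attach the custom property to the table definition itself,
-- so it travels with the table metadata instead of a per-session "set".
create table delimited_data_serde (
  col1 int, col2 string, col3 int,
  col4 string, col5 string, col6 string
)
STORED AS
  INPUTFORMAT 'fwrk.hadoop.input.DelimitedInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
TBLPROPERTIES ('total.fields.count' = '6');
```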
Local and remote metastores
Hi everyone,

Does anyone know the differences between local and remote Hive metastores? Are there features that are only provided by the remote metastore (like authorization)? Is the use of a local metastore recommended in production?

Many thanks,
Alex
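For context, the two modes are usually distinguished in hive-site.xml along these lines; the host names and URLs below are placeholders:

```xml
<!-- Local (embedded) metastore: the Hive client talks JDBC
     directly to the backing database. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://db-host/metastore</value> <!-- placeholder URL -->
</property>

<!-- Remote metastore: clients talk Thrift to a separate
     metastore service process. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value> <!-- placeholder host -->
</property>
```

With a local metastore, every client needs database credentials; the remote service centralizes that access, which is one reason it is commonly preferred in production.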
Re: org.apache.hadoop.fs.ChecksumException: Checksum error:
I try using hadoop fs -copyToLocal. I also get a stack trace, like this: 11/08/22 10:53:57 INFO fs.FSInputChecker: Found checksum error: b[1024, 1536]=31325431393a32313a31315a7c3137342e3235332e3234352e3232377c39376261623664642d353062342d343461612d383235642d6537336238646434336563337c36373842303935453945304431374635383833344135464336423341424646357c342e327c313931393638200a323031312d30352d31325431393a32313a31315a7c3137342e3235332e3234352e3232377c39376261623664642d353062342d343461612d383235642d6537336238646434336563337c36373842303935453945304031374635383833344135464336423341424646357c342e322e317c313931393638200a323031312d30352d31325431393a32323a33395a7c3137342e3235332e3234352e3232377c39376261623664642d353062342d343461612d383235642d6537336238646434336563337c36373842303935453945304431374635383833344135464336423341424646357c362e322e317c313837373837200a323031312d30352d31325431393a32323a34335a7c3137342e3235332e3234352e3232377c39376261623664642d353062342d343461612d383235642d6537336238646434336563337c36373842303935453945304431374635383833344135464336423341424646357c362e337c3138373738375f61745f706f736974696f6e5f3835200a323031312d30352d31325431393a32323a34335a7c3137342e org.apache.hadoop.fs.ChecksumException: Checksum error: /blk_2722854101062410251:of:/user/hive/warehouse/att_log/collect_time=1314024490064/load.dat at 64635904 at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277) at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241) at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189) at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158) at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1158) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1718) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1770) at java.io.DataInputStream.read(DataInputStream.java:83) at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:53) at 
org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:72) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:320) at org.apache.hadoop.fs.FsShell.copyToLocal(FsShell.java:248) at org.apache.hadoop.fs.FsShell.copyToLocal(FsShell.java:199) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1754) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1880) 11/08/22 10:53:57 WARN hdfs.DFSClient: Found Checksum error for blk_2722854101062410251_1038 from 192.168.50.192:50010 at 64635904 11/08/22 10:53:57 INFO hdfs.DFSClient: Could not obtain block blk_2722854101062410251_1038 from any node: java.io.IOException: No live nodes contain current block copyToLocal: Checksum error: /blk_2722854101062410251:of:/user/hive/warehouse/att_log/collect_time=1314024490064/load.dat at 64635904

I managed to load two files (by using the Java API copyFromLocal call and then a 'load data inpath' call to load the data into the table). hadoop fsck does not show a corrupted block until I run the 'select count(*)' query after loading the second file. 'hadoop fs -copyToLocal' also only fails after hadoop fsck shows the corrupted block; for the first loaded file, 'hadoop fs -copyToLocal' works fine. It does look like the problem is with HDFS.

I originally discovered this issue on a two-node cluster with a replication factor of 2, but I am now testing on a pseudo-distributed install with only one node and a replication factor of 1.

I am using text files. I would like to try using sequencefiles; I understand the "io.skip.checksum.errors" setting only applies to sequencefiles. But the only way I know to load data into a table stored as sequencefile is to first load the text file into a table stored as textfile and then use an 'insert into select' to move the data into the sequencefile table. That 'insert into select' already fails with the same problem as running a query on the textfile table. Is there any other way to load a sequencefile table?

On Fri, Aug 19, 2011 at 8:57 PM, Aggarwal, Vaibhav wrote:
> This is a really curious case.
>
> How many replicas of each block do you have?
>
> Are you able to copy the data directly using the HDFS client?
> You could try the hadoop fs -copyToLocal command and see if it can copy the
> data from hdfs correctly.
>
> That would help you verify that the issue really is at the HDFS layer (though it
> does look like that from the stack trace).
>
> Which file format are you using?
>
> Thanks
> Vaibhav
>
> -Original Message-
> From: W S Chung [mailto:qp.wsch...@gmail.com]
> Sent: Friday, August 19, 2011 3:26 PM
> To: user@hive.apache.org
> Subject: org.apache.hadoop.fs.ChecksumException: Checksum error:
>
> For some reason, my questions sent two days ago again never shows up, even
> though I can goo
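For intuition about the error format above ("b[1024, 1536] ... at 64635904"): HDFS checksums data in fixed-size chunks (io.bytes.per.checksum, 512 bytes by default) using CRC32, and reports the byte range of the chunk whose checksum no longer matches. A minimal standalone sketch of that idea, not the actual HDFS implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

public class ChunkChecksum {
    // Mirrors io.bytes.per.checksum's default of 512 bytes.
    static final int CHUNK = 512;

    // One CRC32 value per fixed-size chunk, as a stand-in for
    // the per-chunk checksums HDFS stores alongside each block.
    static List<Long> chunkChecksums(byte[] data) {
        List<Long> sums = new ArrayList<>();
        for (int off = 0; off < data.length; off += CHUNK) {
            CRC32 crc = new CRC32();
            crc.update(data, off, Math.min(CHUNK, data.length - off));
            sums.add(crc.getValue());
        }
        return sums;
    }

    public static void main(String[] args) {
        byte[] block = new byte[2048];
        for (int i = 0; i < block.length; i++) block[i] = (byte) i;
        List<Long> expected = chunkChecksums(block);

        block[1100] ^= 0xFF; // corrupt a single byte in the third chunk

        List<Long> actual = chunkChecksums(block);
        for (int i = 0; i < expected.size(); i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                // Reports the mismatching chunk's byte range,
                // in the same spirit as "Found checksum error: b[1024, 1536]".
                System.out.println("Checksum error: b[" + i * CHUNK + ", " + (i + 1) * CHUNK + "]");
            }
        }
    }
}
```

This is only a model of why a single flipped byte surfaces as one specific 512-byte range in the DFSClient log; the real verification happens in FSInputChecker against stored .crc metadata.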
hive-0.7.1: TestCliDriver FAILED
Hi, all

When I try to run the standard test cases in Hive 0.7.1 against the Sun 1.6 JDK, I found that TestCliDriver failed. The version of the JDK I used is:

java version "1.6.0_27-ea"
Java(TM) SE Runtime Environment (build 1.6.0_27-ea-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.2-b03, mixed mode)

My steps:
1. ant clean
2. ant package
3. ant test

Here is a snapshot of the failure:

[junit] Done query: script_env_var2.q
[junit] Begin query: script_pipe.q
[junit] junit.framework.AssertionFailedError: Client execution results failed with error code = 1
[junit] See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get more logs.
[junit] at junit.framework.Assert.fail(Assert.java:47)
[junit] at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_script_pipe(TestCliDriver.java:21067)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at junit.framework.TestCase.runTest(TestCase.java:154)
[junit] at junit.framework.TestCase.runBare(TestCase.java:127)
[junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
[junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
[junit] at junit.framework.TestResult.run(TestResult.java:109)
[junit] at junit.framework.TestCase.run(TestCase.java:118)
[junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
[junit] at junit.framework.TestSuite.run(TestSuite.java:203)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
[junit] diff -a -I file: -I pfile: -I hdfs: -I /tmp/ -I invalidscheme: -I lastUpdateTime -I lastAccessTime -I [Oo]wner -I CreateTime -I LastAccessTime -I Location -I transient_lastDdlTime -I last_modified_ -I java.lang.RuntimeException -I at org -I at sun -I at java -I at junit -I Caused by: -I LOCK_QUERYID: -I grantTime -I [.][.][.] [0-9]* more -I USING 'java -cp /home/libing/hive-0.7.1/src/build/ql/test/logs/clientpositive/script_pipe.q.out /home/libing/hive-0.7.1/src/ql/src/test/results/clientpositive/script_pipe.q.out
[junit] 143c143,144
[junit] < POSTHOOK: Output: file:/tmp/libing/hive_2011-08-21_23-27-41_670_8767305526316071428/-mr-1
[junit] ---
[junit] > POSTHOOK: Output: file:/tmp/sdong/hive_2011-02-10_17-04-27_817_7785884157237702561/-mr-1
[junit] > 238 val_238 238 val_238
[junit] Exception: Client execution results failed with error code = 1
[junit] See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get more logs.
[junit] Begin query: select_as_omitted.q

Have you seen this before?

Thanks