Loading data in Hive 0.11 - permission issue
Hi,

I'm using Hive 0.11, downloaded as a tarball from Apache's website. I have a Linux user called admin, and I invoke the Hive CLI as this user. In the Hive terminal I created a table as follows:

hive> create table ptest (pkey INT, skey INT, fkey INT, rkey INT, units INT) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;
OK
Time taken: 0.241 seconds

When I try to load data into the table, I get the following error:

hive> LOAD DATA LOCAL INPATH '/home/admin/sample.csv' OVERWRITE INTO TABLE ptest;
Copying data from file:/home/admin/sample.csv
Copying file: file:/home/admin/sample.csv
Loading data to table default.ptest
rmr: DEPRECATED: Please use 'rm -r' instead.
rmr: Permission denied: user=admin, access=ALL, inode="/user/hive_0.11/warehouse/ptest":root:hive:drwxr-xr-x
Failed with exception Permission denied: user=admin, access=ALL, inode="/user/hive_0.11/warehouse/ptest":root:hive:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSubAccess(FSPermissionChecker.java:174)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:144)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4684)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2794)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2757)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2740)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:621)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:406)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44094)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

When I looked into the warehouse directory:

hive> dfs -ls /user/hive_0.11/warehouse;
Found 1 item
drwxr-xr-x   - root hive          0 2013-08-05 17:04 /user/hive_0.11/warehouse/ptest

It seems the directory is owned by the root user even though the table was created from the Hive CLI invoked as the user admin. I'm unable to figure out why the owner of the table has been assigned as root. Could anyone please help me out?

Thank you,
Sachin
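[Editor's note, not part of the original thread] For readers hitting the same error, a common workaround, assuming the root ownership is accidental (for example, Hive was first launched under sudo, which created the warehouse directory as root), is to hand the directory back to the user running the CLI. A sketch reusing the paths from this thread; adjust users, groups, and paths for your cluster:

```sql
-- Sketch of a possible fix (assumes the root ownership is unintended):
-- re-own the table directory from inside the Hive CLI, then retry the load.
hive> dfs -chown -R admin:hive /user/hive_0.11/warehouse/ptest;
hive> dfs -chmod -R 775 /user/hive_0.11/warehouse/ptest;
hive> LOAD DATA LOCAL INPATH '/home/admin/sample.csv' OVERWRITE INTO TABLE ptest;
```

The same chown/chmod can be issued from a shell as an HDFS superuser; the essential point is only that the directory owner (or group permissions) must allow the user the CLI runs as to write and delete under the table directory.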
Sequence file compression in Hive
Hi,

I have a table stored as SEQUENCEFILE in Hive 0.10, facts520_normal_seq.

Now I wish to create another table, also stored as a SEQUENCEFILE, but compressed using the Gzip codec. So I set the compression codec and type (BLOCK) and then executed the following query:

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
SET mapred.output.compression.type=BLOCK;

create table test1facts520_gzip_seq as select * from facts520_normal_seq;

The table got created and was compressed as well:

[root@aana1 comp_data]# sudo -u hdfs hadoop fs -ls /user/hive/warehouse/facts_520.db/test1facts520_gzip_seq
Found 5 items
-rw-r--r--   3 admin supergroup   38099145 2013-06-10 17:56 /user/hive/warehouse/facts_520.db/test1facts520_gzip_seq/00_0.gz
-rw-r--r--   3 admin supergroup   31450189 2013-06-10 17:56 /user/hive/warehouse/facts_520.db/test1facts520_gzip_seq/01_0.gz
-rw-r--r--   3 admin supergroup   20764259 2013-06-10 17:56 /user/hive/warehouse/facts_520.db/test1facts520_gzip_seq/02_0.gz
-rw-r--r--   3 admin supergroup   21107597 2013-06-10 17:56 /user/hive/warehouse/facts_520.db/test1facts520_gzip_seq/03_0.gz
-rw-r--r--   3 admin supergroup   12202692 2013-06-10 17:56 /user/hive/warehouse/facts_520.db/test1facts520_gzip_seq/04_0.gz

However, when I checked the table properties, it was surprising to see that the table had been stored as a textfile!
hive> show create table test1facts520_gzip_seq;
OK
CREATE TABLE test1facts520_gzip_seq(
  fact_key bigint,
  products_key int,
  retailers_key int,
  suppliers_key int,
  time_key int,
  units int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/test1facts520_gzip_seq'
TBLPROPERTIES (
  'numPartitions'='0',
  'numFiles'='5',
  'transient_lastDdlTime'='1370867198',
  'numRows'='0',
  'totalSize'='123623882',
  'rawDataSize'='0')
Time taken: 0.15 seconds

So I tried adding the STORED AS clause to my earlier create table statement and created a new table:

create table test3facts520_gzip_seq STORED AS SEQUENCEFILE as select * from facts520_normal_seq;

This time, the output table got stored as a SEQUENCEFILE:

hive> show create table test3facts520_gzip_seq;
OK
CREATE TABLE test3facts520_gzip_seq(
  fact_key bigint,
  products_key int,
  retailers_key int,
  suppliers_key int,
  time_key int,
  units int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION
  'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/test3facts520_gzip_seq'
TBLPROPERTIES (
  'numPartitions'='0',
  'numFiles'='5',
  'transient_lastDdlTime'='137086',
  'numRows'='0',
  'totalSize'='129811519',
  'rawDataSize'='0')
Time taken: 0.135 seconds

But the compression itself did not happen!
[root@aana1 comp_data]# sudo -u hdfs hadoop fs -ls /user/hive/warehouse/facts_520.db/test3facts520_gzip_seq
Found 5 items
-rw-r--r--   3 admin supergroup   40006368 2013-06-10 18:06 /user/hive/warehouse/facts_520.db/test3facts520_gzip_seq/00_0
-rw-r--r--   3 admin supergroup   33026961 2013-06-10 18:06 /user/hive/warehouse/facts_520.db/test3facts520_gzip_seq/01_0
-rw-r--r--   3 admin supergroup   21797242 2013-06-10 18:05 /user/hive/warehouse/facts_520.db/test3facts520_gzip_seq/02_0
-rw-r--r--   3 admin supergroup   22171637 2013-06-10 18:05 /user/hive/warehouse/facts_520.db/test3facts520_gzip_seq/03_0
-rw-r--r--   3 admin supergroup   12809311 2013-06-10 18:05 /user/hive/warehouse/facts_520.db/test3facts520_gzip_seq/04_0

Is there anything I have done wrong, or have I missed something? Any help would be greatly appreciated!

Thank you,
Sachin
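[Editor's note] For comparison, here is how the two halves of the experiment are usually combined. This is a sketch, and it assumes the SET statements take effect in the same CLI session as the CTAS (these settings are per-session and do not persist, which is one possible explanation for the uncompressed result above):

```sql
-- Sketch: compression settings and CTAS issued in the SAME session.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
SET mapred.output.compression.type=BLOCK;

-- Storage format stated explicitly; 'facts520_gzip_seq_v2' is a
-- hypothetical table name for this sketch.
CREATE TABLE facts520_gzip_seq_v2
STORED AS SEQUENCEFILE
AS SELECT * FROM facts520_normal_seq;
```

Note also that compressed sequence files keep plain file names (no .gz suffix), because the codec is recorded inside the sequence-file header; file sizes, not extensions, are the thing to compare when checking whether compression took effect.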
Compression in Hive
Hi,

I have been testing the usefulness of compression in Hive, and I have a general question: are there particular cases where compression in Hive actually proves useful while running MR jobs? Any pointers/examples would be really useful!

Thank you,
Sachin
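[Editor's note] One concrete case where compression tends to pay off (a sketch of common practice, not a benchmark): multi-stage queries that shuffle a lot of intermediate data between MR jobs. Compressing that intermediate traffic with a cheap codec such as Snappy reduces disk and network I/O at modest CPU cost:

```sql
-- Compress the intermediate data Hive ships between MR stages
-- (settings are per-session):
SET hive.exec.compress.intermediate=true;
SET mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;

-- Optionally compress the final query output as well:
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
```

The usual trade-off: Gzip compresses harder but costs more CPU; Snappy is fast with a lower ratio, which suits intermediate data that is written once and read once.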
Compression in Hive using different file formats
Hi,

I was testing compression in Hive using different file formats. I have a table stored as a sequence file, facts_normal_seq. Now I wish to create another table, facts_snappy_seq, using the Snappy compression codec. Is this the correct way to do it?

CREATE TABLE facts_snappy_seq ( , ) ROW FORMAT STORED AS SEQUENCEFILE;

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

INSERT OVERWRITE TABLE facts_snappy_seq SELECT * FROM facts_normal_seq;

When I populate the table in this manner, the files in HDFS do not seem to have the .snappy extension. Any pointers in this regard would be really helpful.

Thank you,
Sachin
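[Editor's note] A possible explanation rather than a fix: for SEQUENCEFILE tables the missing .snappy extension is expected, because the codec is recorded in each sequence file's header rather than in its name (the .snappy/.gz suffix only appears for plain text output files). One way to check whether compression actually happened, sketched with a hypothetical file name:

```sql
-- 'dfs -text' decodes sequence files (including their codec) transparently,
-- so readable output here confirms the file is a valid sequence file:
hive> dfs -text /user/hive/warehouse/facts_snappy_seq/000000_0;

-- The raw header, viewed with 'dfs -cat', names the codec class: look for
-- 'org.apache.hadoop.io.compress.SnappyCodec' near the top of the output.
hive> dfs -cat /user/hive/warehouse/facts_snappy_seq/000000_0;
```

The path and file name above are illustrative; substitute the actual files listed by `dfs -ls` on the table's directory.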
Re: Textfile compression using Gzip codec
Hi Stephen,

Thank you for your reply. But it's the silliest of errors on my side: a typo! The codec is org.apache.hadoop.io.compress.GzipCodec and not org.apache.hadoop.io.compress.GZipCodec.

I regret making that mistake.

Thank you,
Sachin

On Thu, Jun 6, 2013 at 10:07 PM, Stephen Sprague wrote:

> Hi Sachin,
> Like you say, looks like something to do with the GZipCodec all right. And
> that would make sense given your original problem.
>
> Yeah, one would think it'd be in there by default, but for whatever reason
> it's not finding it; at least the problem is now identified.
>
> Now _my guess_ is that maybe your hadoop core-site.xml file might need to
> list the codecs available under the property name
> "io.compression.codecs". Can you chase that up as a possibility and let us
> know what you find out?
>
> On Thu, Jun 6, 2013 at 4:02 AM, Sachin Sudarshana wrote:
>
>> Hi Stephen,
>>
>> hive> show create table facts520_normal_text;
>> OK
>> CREATE TABLE facts520_normal_text(
>>   fact_key bigint,
>>   products_key int,
>>   retailers_key int,
>>   suppliers_key int,
>>   time_key int,
>>   units int)
>> ROW FORMAT DELIMITED
>>   FIELDS TERMINATED BY ','
>>   LINES TERMINATED BY '\n'
>> STORED AS INPUTFORMAT
>>   'org.apache.hadoop.mapred.TextInputFormat'
>> OUTPUTFORMAT
>>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
>> LOCATION
>>   'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/facts520_normal_text'
>> TBLPROPERTIES (
>>   'numPartitions'='0',
>>   'numFiles'='1',
>>   'transient_lastDdlTime'='1369395430',
>>   'numRows'='0',
>>   'totalSize'='545216508',
>>   'rawDataSize'='0')
>> Time taken: 0.353 seconds
>>
>> The syserror log shows this:
>>
>> java.lang.IllegalArgumentException: Compression codec
>> org.apache.hadoop.io.compress.GZipCodec was not found.
>>         at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
>>         at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:543)
>>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
>>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> Caused by: java.lang.ClassNotFoundException: Class
>> org.apache.hadoop.io.compress.GZipCodec not found
>>         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
>>         at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:82)
>>         ... 21 more
>> java.lang.IllegalArgumentException: Compression codec
>> org.apache.hadoop.io.compress.GZipCodec was not found.
>>         at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:8
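[Editor's note] To summarize the resolution of this thread in one place: the codec class name is case sensitive, so the working settings are (GzipCodec with a lowercase 'z', not GZipCodec):

```sql
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
SET mapred.output.compression.type=BLOCK;
```

The `ClassNotFoundException` in the stack trace above is exactly what a misspelled codec class produces, since Hadoop loads the codec by reflection from the configured string.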
Re: Textfile compression using Gzip codec
ot found.
        at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
        at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
        ... 14 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.io.compress.GZipCodec not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
        at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:82)
        ... 16 more

It says that GZipCodec is not found. Aren't the Snappy, GZip and BZip codecs available in Hadoop by default?

Thank you,
Sachin

On Wed, Jun 5, 2013 at 11:58 PM, Stephen Sprague wrote:

> well... the HiveException has the word "metadata" in it. Maybe that's a
> hint or a red herring. :)  Let's try the following:
>
> 1. show create table facts520_normal_text;
>
> 2. anything useful at this URL?
> http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_02
> or is it just the same stack dump?
>
> On Wed, Jun 5, 2013 at 3:17 AM, Sachin Sudarshana wrote:
>
>> Hi,
>>
>> I have hive 0.10 + (CDH 4.2.1 patches) installed on my cluster.
>>
>> I have a table facts520_normal_text stored as a textfile. I'm trying to
>> create a compressed table from this table using GZip codec.
>>
>> hive> SET hive.exec.compress.output=true;
>> hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GZipCodec;
>> hive> SET mapred.output.compression.type=BLOCK;
>>
>> hive>
>>     > Create table facts520_gzip_text
>>     > (fact_key BIGINT,
>>     > products_key INT,
>>     > retailers_key INT,
>>     > suppliers_key INT,
>>     > time_key INT,
>>     > units INT)
>>     > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>>     > LINES TERMINATED BY '\n'
>>     > STORED AS TEXTFILE;
>>
>> hive> INSERT OVERWRITE TABLE facts520_gzip_text SELECT * from facts520_normal_text;
>>
>> When I run the above queries, the MR job fails.
>>
>> The error that the Hive CLI itself shows is the following:
>>
>> Total MapReduce jobs = 3
>> Launching Job 1 out of 3
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> Starting Job = job_201306051948_0010, Tracking URL = http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
>> Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201306051948_0010
>> Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 0
>> 2013-06-05 21:09:42,281 Stage-1 map = 0%, reduce = 0%
>> 2013-06-05 21:10:11,446 Stage-1 map = 100%, reduce = 100%
>> Ended Job = job_201306051948_0010 with errors
>> Error during job, obtaining debugging information...
>> Job Tracking URL: http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
>> Examining task ID: task_201306051948_0010_m_04 (and more) from job job_201306051948_0010
>> Examining task ID: task_201306051948_0010_m_01 (and more) from job job_201306051948_0010
>>
>> Task with the most failures (4):
>> -----
>> Task ID:
>>   task_201306051948_0010_m_02
>>
>> URL:
>>   http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_02
>> -----
>> Diagnostic Messages for this Task:
>> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
>> processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
>>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at
Textfile compression using Gzip codec
Hi,

I have Hive 0.10 + (CDH 4.2.1 patches) installed on my cluster.

I have a table facts520_normal_text stored as a textfile. I'm trying to create a compressed table from this table using the GZip codec.

hive> SET hive.exec.compress.output=true;
hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GZipCodec;
hive> SET mapred.output.compression.type=BLOCK;

hive>
    > Create table facts520_gzip_text
    > (fact_key BIGINT,
    > products_key INT,
    > retailers_key INT,
    > suppliers_key INT,
    > time_key INT,
    > units INT)
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    > LINES TERMINATED BY '\n'
    > STORED AS TEXTFILE;

hive> INSERT OVERWRITE TABLE facts520_gzip_text SELECT * from facts520_normal_text;

When I run the above queries, the MR job fails. The error that the Hive CLI itself shows is the following:

Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201306051948_0010, Tracking URL = http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201306051948_0010
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 0
2013-06-05 21:09:42,281 Stage-1 map = 0%, reduce = 0%
2013-06-05 21:10:11,446 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201306051948_0010 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
Examining task ID: task_201306051948_0010_m_04 (and more) from job job_201306051948_0010
Examining task ID: task_201306051948_0010_m_01 (and more) from job job_201306051948_0010

Task with the most failures (4):
-----
Task ID:
  task_201306051948_0010_m_02

URL:
  http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_02
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
        at org.apach

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 3   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

I'm unable to figure out why this is happening. It looks like the data is not being copied properly. Or is it that the GZip codec is not supported for textfiles?

Any help with this issue is greatly appreciated!

Thank you,
Sachin
Re: io.compression.codecs not found
Hi Bejoy,

Thanks for the reply. I would like to know which codecs are available by default in the Hadoop system, from which I can choose what to set in core-site.xml. For example, the LZO compression codecs are not available by default, and we have to install the required libraries for them.

Thank you,
Sachin

On Thu, May 23, 2013 at 7:55 PM, wrote:

> Go to $HADOOP_HOME/config, open and edit core-site.xml.
>
> Add a new property 'io.compression.codecs' and assign the required
> compression codecs as its value.
>
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> ----------
> From: Sachin Sudarshana
> Date: Thu, 23 May 2013 19:46:37 +0530
> To:
> ReplyTo: user@hive.apache.org
> Subject: Re: io.compression.codecs not found
>
> Hi,
>
> I'm not using CM. I have installed CDH 4.2.1 using Linux packages.
>
> Thank you,
> Sachin
>
> On Thu, May 23, 2013 at 7:13 PM, Sanjay Subramanian <
> sanjay.subraman...@wizecommerce.com> wrote:
>
>> This property needs to be set in core-site.xml. If u r using
>> clouderamanager then ping me I will tell u how to set it there. Out of the
>> box hive works beautifully with gzip and snappy. And if u r using lzo then
>> needs some plumbing. Depends on what ur usecase is I can provide guidance.
>>
>> Regards
>> Sanjay
>>
>> Sent from my iPhone
>>
>> On May 23, 2013, at 3:33 AM, "Sachin Sudarshana" wrote:
>>
>> > Hi,
>> >
>> > I'm trying to run some queries on compressed tables in Hive 0.10. I
>> > wish to know what all compression codecs are available which i can make
>> > use of.
>> > However, when i run set io.compression.codecs in the hive CLI, it
>> > throws an error saying the io.compression.codecs is not found.
>> >
>> > I'm unable to figure out why its happening. Has it (the hiveconf
>> > property) been removed from 0.10?
>> >
>> > Any help is greatly appreciated!
>> >
>> > Thank you,
>> > Sachin
>>
>> CONFIDENTIALITY NOTICE
>> ======================
>> This email message and any attachments are for the exclusive use of the
>> intended recipient(s) and may contain confidential and privileged
>> information. Any unauthorized review, use, disclosure or distribution is
>> prohibited. If you are not the intended recipient, please contact the
>> sender by reply email and destroy all copies of the original message along
>> with any attachments, from your computer system. If you are the intended
>> recipient, please be advised that the content of this message is subject to
>> access, review and disclosure by the sender's Email System Administrator.
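[Editor's note] For readers wondering what the finished property looks like, here is a hedged sketch of a core-site.xml entry. The exact list depends on the distribution and on which native libraries are installed (Snappy and LZO in particular need native support), so treat each class name as something to verify on your own cluster before listing it:

```xml
<!-- Sketch only: include a codec here only if its class is actually on the
     classpath, or jobs will fail at codec-loading time. -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```

After editing core-site.xml, the relevant daemons (and any long-running Hive sessions) need a restart to pick up the change.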
Re: io.compression.codecs not found
Hi,

I'm not using CM. I have installed CDH 4.2.1 using Linux packages.

Thank you,
Sachin

On Thu, May 23, 2013 at 7:13 PM, Sanjay Subramanian < sanjay.subraman...@wizecommerce.com> wrote:

> This property needs to be set in core-site.xml. If u r using
> clouderamanager then ping me I will tell u how to set it there. Out of the
> box hive works beautifully with gzip and snappy. And if u r using lzo then
> needs some plumbing. Depends on what ur usecase is I can provide guidance.
>
> Regards
> Sanjay
>
> Sent from my iPhone
>
> On May 23, 2013, at 3:33 AM, "Sachin Sudarshana" wrote:
>
> > Hi,
> >
> > I'm trying to run some queries on compressed tables in Hive 0.10. I wish
> > to know what all compression codecs are available which i can make use of.
> > However, when i run set io.compression.codecs in the hive CLI, it throws
> > an error saying the io.compression.codecs is not found.
> >
> > I'm unable to figure out why its happening. Has it (the hiveconf
> > property) been removed from 0.10?
> >
> > Any help is greatly appreciated!
> >
> > Thank you,
> > Sachin
io.compression.codecs not found
Hi,

I'm trying to run some queries on compressed tables in Hive 0.10, and I wish to know which compression codecs are available for me to use. However, when I run "set io.compression.codecs" in the Hive CLI, it throws an error saying that io.compression.codecs is not found.

I'm unable to figure out why this is happening. Has this hiveconf property been removed in 0.10?

Any help is greatly appreciated!

Thank you,
Sachin
Re: Finding maximum across a row
Hi Bejoy,

I am new to UDFs in Hive. Could you send me any links/tutorials from which I can learn about writing a UDF?

Thanks!

On Fri, Mar 1, 2013 at 10:22 PM, wrote:

> Hi Sachin
>
> AFAIK there isn't one at the moment. But you can easily achieve this using
> a custom UDF.
>
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> ----------
> From: Sachin Sudarshana
> Date: Fri, 1 Mar 2013 22:16:37 +0530
> To:
> ReplyTo: user@hive.apache.org
> Subject: Finding maximum across a row
>
> Hi,
>
> Is there any function/method to find the maximum across a row in hive?
>
> Suppose i have a table like this:
>
> ColA  ColB  ColC
>    2     5     7
>    3     2     1
>
> I want the function to return
>
> 7
> 1
>
> Its urgently required. Any help would be greatly appreciated!
>
> --
> Thanks and Regards,
> Sachin Sudarshana

--
Thanks and Regards,
Sachin Sudarshana
Finding maximum across a row
Hi,

Is there any function/method to find the maximum across a row in Hive?

Suppose I have a table like this:

ColA  ColB  ColC
   2     5     7
   3     2     1

I want the function to return

7
1

It's urgently required. Any help would be greatly appreciated!

--
Thanks and Regards,
Sachin Sudarshana
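[Editor's note] Hive releases of this era had no built-in row-wise maximum (a greatest() function arrived only in later versions), so besides the custom-UDF route suggested in the replies, one workaround that needs no Java at all is a nested CASE. A sketch against a hypothetical table t with the three columns from the example:

```sql
-- Row-wise maximum of three columns without a UDF.
-- 't', ColA, ColB, ColC mirror the example in the question.
SELECT CASE
         WHEN ColA >= ColB AND ColA >= ColC THEN ColA
         WHEN ColB >= ColC THEN ColB
         ELSE ColC
       END AS row_max
FROM t;
```

This pattern grows quickly with the number of columns, which is where a generic UDF taking a variable argument list becomes the cleaner choice.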
Request to add me onto the list
Hi, I request you to kindly add me onto this list. -- Thanks and Regards, Sachin Sudarshana
Re: Security for Hive
Hi,

I have read about roles, user privileges, group privileges, etc. But these roles can be created by any user, for any database/table. I would like to know if there is a specific 'administrator' for Hive who can log on with his credentials and is the only one entitled to create roles, grant privileges, etc.

Thank you.

On Fri, Feb 22, 2013 at 4:19 PM, Jagat Singh wrote:

> You might want to read this
>
> https://cwiki.apache.org/Hive/languagemanual-auth.html
>
> On Fri, Feb 22, 2013 at 9:44 PM, Sachin Sudarshana <
> sachin.sudarsh...@gmail.com> wrote:
>
>> Hi,
>>
>> I have just started learning about hive.
>> I have configured Hive to use mysql as the metastore instead of derby.
>> If I wish to use GRANT and REVOKE commands, i can use it with any user. A
>> user can issue GRANT or REVOKE commands to any other users' table since
>> both the users' tables are present in the same warehouse.
>>
>> Isn't there a concept of superuser/admin in hive who alone has the
>> authority to issue these commands?
>>
>> Any answer is greatly appreciated!
>>
>> --
>> Thanks and Regards,
>> Sachin Sudarshana

--
Thanks and Regards,
Sachin Sudarshana
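[Editor's note] For anyone following along, the commands being discussed look roughly like this (a sketch; the table and user names are hypothetical). In Hive's legacy authorization of this era any user can run them, which is exactly the gap the question identifies: there is no built-in superuser, so HDFS permissions on the warehouse directory are the real backstop:

```sql
-- Sketch of the role/privilege commands discussed above.
CREATE ROLE analyst;
GRANT SELECT ON TABLE facts TO ROLE analyst;   -- 'facts' is a hypothetical table
GRANT ROLE analyst TO USER sachin;
SHOW GRANT USER sachin ON TABLE facts;
```

Because the metastore stores but does not enforce an admin identity here, locking down who may GRANT typically means restricting metastore access and HDFS directory ownership rather than relying on these statements alone.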