Re: Reply: [ANNOUNCE] New Hive Committer - Wei Zheng
\o/

On Thu, Mar 10, 2016 at 11:31 AM, 谭成灶 wrote:
> Congratulations, Wei !
>
> --
> From: Madhu Thalakola
> Sent: 2016/3/10 21:47
> To: user@hive.apache.org
> Cc: d...@hive.apache.org; w...@apache.org
> Subject: Re: [ANNOUNCE] New Hive Committer - Wei Zheng
>
> Congratulations Wei Zheng
>
> Thanks,
> Madhu
> Help ever, Hurt never

--
Daniel Lopes, B.Eng
Data Scientist - BankFacil
CREA/SP 5069410560
<http://edital.confea.org.br/ConsultaProfissional/cartao.aspx?rnp=2613651334>
Mob +55 (18) 99764-2733
Ph +55 (11) 3522-8009
http://about.me/dannyeuu

Av. Nova Independência, 956, São Paulo, SP
Bairro Brooklin Paulista
CEP 04570-001
https://www.bankfacil.com.br
hive memory error: GC overhead limit exceeded
Hi,

Anyone know this error? Running on Amazon EMR.

2016-02-19 10:32:34 Starting to launch local task to process map join; maximum memory = 932184064
#
# java.lang.OutOfMemoryError: GC overhead limit exceeded
# -XX:OnOutOfMemoryError="kill -9 %p kill -9 %p"
#   Executing /bin/sh -c "kill -9 15759 kill -9 15759"...
Execution failed with exit status: 137
Obtaining error information
Task failed!
Task ID:
  Stage-35
Logs:
/var/log/hive/user/hadoop/hive.log

Best,
Daniel Lopes
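[Editor's note] The OOM occurs in the local task that builds the map-join hash table, so the usual workarounds are to skip the in-memory join or give the client JVM more heap. A sketch, not verified against this EMR setup; the threshold value below is illustrative:

```sql
-- Option 1: fall back to a regular shuffle join instead of a local
-- in-memory map join (trades speed for stability).
SET hive.auto.convert.join=false;

-- Option 2: keep map joins, but lower the size (in bytes) below which
-- a table is considered small enough to load into memory.
SET hive.auto.convert.join.noconditionaltask.size=10000000;

-- Option 3: give the client JVM that runs the local task more heap,
-- e.g. before starting the Hive CLI:  export HADOOP_HEAPSIZE=2048
```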
Spark 1.5.2 + Hive 1.0.0 in Amazon EMR 4.2.0
Hi,

I get this error when trying to write a Spark DataFrame to a Hive table stored as TextFile:

sqlContext.sql('INSERT OVERWRITE TABLE analytics.client_view_stock SELECT * FROM client_view_stock')

(analytics.client_view_stock is the Hive table; client_view_stock is the Spark temp table.)

Error:

15/11/30 21:40:14 INFO latency: StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 5ADBECA2D82A7C17), S3 Extended Request ID: RcPfjgWaeXG62xyVRrAr91sVQNxktqbXUPJgK2cvZlf6SKEAOnWCtV9X9K1Vp9dAyDhGALQRBcU=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[5ADBECA2D82A7C17], ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[214.69], HttpRequestTime=[214.245], HttpClientReceiveResponseTime=[212.513], RequestSigningTime=[0.16], HttpClientSendRequestTime=[0.112],
15/11/30 21:40:21 INFO Hive: Replacing src:s3://my-bucket/output/2015/11/29/client_view_stock/.hive-staging_hive_2015-11-30_21-19-48_942_238078420083598647-1/-ext-1/part-00199, dest: s3://my-bucket/output/2015/11/29/client_view_stock/part-00199, Status:true
-chgrp: '' does not match expected pattern for group
Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...
15/11/30 21:40:21 INFO latency: StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[2509AE55A8D71A61], ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[137.387], HttpRequestTime=[136.721], HttpClientReceiveResponseTime=[134.805], RequestSigningTime=[0.235], ResponseProcessingTime=[0.169], HttpClientSendRequestTime=[0.145],
15/11/30 21:40:21 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Invalid method name: 'alter_table_with_cascade'

Thanks!

Daniel Lopes
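[Editor's note] "Invalid method name" from the metastore usually indicates a client/server version mismatch: alter_table_with_cascade appears to have been added around Hive 1.2, while Spark 1.5 bundles a Hive 1.2 client and EMR 4.2.0 runs a Hive 1.0 metastore. One possible workaround (a sketch, not verified on EMR) is telling Spark to speak the older metastore protocol in spark-defaults.conf:

```properties
# Assumed settings; adjust version and jar resolution to the actual cluster.
spark.sql.hive.metastore.version  1.0.0
spark.sql.hive.metastore.jars     maven
```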
Re: Best way to load CSV file into Hive
Hello,

If you have a file with different types of data, it's preferable to use another file format, such as TSV, ORC or Parquet.

Best,

On Fri, Oct 30, 2015 at 4:16 PM, Vijaya Narayana Reddy Bhoomi Reddy <vijaya.bhoomire...@whishworks.com> wrote:
> Hi,
>
> I have a CSV file which contains a hundred thousand rows and about 200+ columns. Some of the columns have free-text information, which means they might contain characters like commas, colons, quotes etc. within the column content.
>
> What is the best way to load such a CSV file into Hive?
>
> Another serious issue: I stored the file in a location in HDFS and then created an external Hive table on it. However, upon running CREATE EXTERNAL TABLE using the HDP Hive View, the original CSV is no longer present in the folder where it is meant to be. Not sure how HDP processes it or where it is stored? My understanding was that EXTERNAL tables wouldn't be moved from their original HDFS location?
>
> Request someone to help out!
>
> Thanks & Regards
> Vijay
>
> The contents of this e-mail are confidential and for the exclusive use of the intended recipient. If you receive this e-mail in error please delete it from your system immediately and notify us either by e-mail or telephone. You should not copy, forward or otherwise disclose the content of the e-mail. The views expressed in this communication may not necessarily be the view held by WHISHWORKS.
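[Editor's note] If the data has to stay CSV, Hive's OpenCSVSerde handles quoted fields with embedded commas. A sketch, with the table name, columns, and location invented for illustration:

```sql
-- Hypothetical external table over quoted CSV. Quoted fields may contain
-- commas, colons and escaped quotes. Note: OpenCSVSerde reads every
-- column as STRING, so cast to other types in queries as needed.
CREATE EXTERNAL TABLE my_db.raw_events (
  id          STRING,
  description STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "\\"
)
STORED AS TEXTFILE
LOCATION '/user/hadoop/raw_events';
```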
Hive 1.0.0 Error: cannot be cast
I'm using Hive 1.0.0 and I got this error:

Query ID = hadoop_20151023210202_9d73cf48-62f0-4d47-ae26-f5dfff0a24d9
Total jobs = 3
Execution log at: /var/log/hive/tmp/hadoop/hadoop_20151023210202_9d73cf48-62f0-4d47-ae26-f5dfff0a24d9.log
2015-10-23 09:03:14 Starting to launch local task to process map join; maximum memory = 1013645312
2015-10-23 09:03:23 Dump the side-table for tag: 1 with group count: 27 into file: file:/var/log/hive/tmp/hadoop/ccfa168d-3b18-4551-b253-070682b406a0/hive_2015-10-23_21-02-55_298_3710383116074055586-1/-local-10004/HashTable-Stage-8/MapJoin-mapfile311--.hashtable
2015-10-23 09:03:26 Uploaded 1 File to: file:/var/log/hive/tmp/hadoop/ccfa168d-3b18-4551-b253-070682b406a0/hive_2015-10-23_21-02-55_298_3710383116074055586-1/-local-10004/HashTable-Stage-8/MapJoin-mapfile311--.hashtable (1216 bytes)
2015-10-23 09:03:26 End of local task; Time Taken: 11.675 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1445632064722_0029, Tracking URL = http://ip-10-252-112-226.sa-east-1.compute.internal:20888/proxy/application_1445632064722_0029/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1445632064722_0029
Hadoop job information for Stage-8: number of mappers: 1; number of reducers: 0
2015-10-23 21:03:43,140 Stage-8 map = 0%, reduce = 0%
2015-10-23 21:04:25,621 Stage-8 map = 100%, reduce = 0%
Ended Job = job_1445632064722_0029 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1445632064722_0029_m_00 (and more) from job job_1445632064722_0029

Task with the most failures(4):
-
Task ID:
  task_1445632064722_0029_m_00

URL:
  http://ip-10-252-112-226.sa-east-1.compute.internal:8088/taskdetails.jsp?jobid=job_1445632064722_0029&tipid=task_1445632064722_0029_m_00
-
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"bi_lead_id":2459,"consultor":2,"etapa":"Atendimento","tempo_ate":null,"duracao":null,"data":"2015-03-16 18:08:41"}
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:65)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:452)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"bi_lead_id":2459,"consultor":2,"etapa":"Atendimento","tempo_ate":null,"duracao":null,"data":"2015-03-16 18:08:41"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176)
    ...
8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.HiveVarcharWritable
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:311)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
    ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.HiveVarcharWritable
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector.copyObject(WritableHiveVarcharObjectInspector.java:109)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:305)
    at org.apache.hadoop.hive.ql.exec.JoinUtil.computeValues(JoinUtil.java:193)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getFilteredValue(CommonJoinOperator.java:408)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:295)
    ... 13 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-8: Map: 1  HDFS Read: 0  HDFS Write: 0  FAIL
Total MapReduce CPU Time Spent: 0 msec
Command exiting with ret '2'

Best,
Daniel Lopes
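[Editor's note] This ClassCastException typically means one side of the join declares a column as VARCHAR while the data actually arrives as a plain STRING (for example, a table schema that disagrees with a partition schema). Two hedged workarounds, with table names invented for illustration ("etapa" is taken from the failing row above):

```sql
-- Option 1: align the declared column type with what the SerDe produces.
ALTER TABLE my_db.my_table CHANGE etapa etapa STRING;

-- Option 2: cast explicitly so both join sides agree on the type.
SELECT a.bi_lead_id
FROM my_db.table_a a
JOIN my_db.table_b b
  ON (CAST(a.etapa AS STRING) = CAST(b.etapa AS STRING));
```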
Better way to do UDF's for Hive
Hi,

I'd like to know a good way to write a UDF for a single field, like:

SELECT
  tbl.id AS id,
  tbl.name AS name,
  tbl.city AS city,
  state_from_city(tbl.city) AS state
FROM my_db.my_table tbl;

Native Java? Python over Hadoop Streaming? I prefer Python, but I don't know how to do it in a good way.

Thanks,

Daniel Lopes
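[Editor's note] One Python option is Hive's TRANSFORM clause with a streaming script: Hive pipes selected columns to the script as tab-separated lines on stdin and reads tab-separated lines back from stdout. A minimal sketch; the script name, the city-to-state lookup, and the assumption that the city is the last selected column are all invented for illustration:

```python
#!/usr/bin/env python
# Hypothetical streaming UDF for Hive TRANSFORM. Usage from HiveQL:
#
#   ADD FILE state_from_city.py;
#   SELECT TRANSFORM (id, name, city)
#     USING 'python state_from_city.py'
#     AS (id, name, city, state)
#   FROM my_db.my_table;
#
import sys

# Assumed lookup table for illustration; in practice this could be
# loaded from a file shipped alongside the script with ADD FILE.
CITY_TO_STATE = {
    "Sao Paulo": "SP",
    "Rio de Janeiro": "RJ",
    "Belo Horizonte": "MG",
}


def state_from_city(city):
    """Return the state abbreviation, or \\N (Hive's NULL marker) if unknown."""
    return CITY_TO_STATE.get(city, "\\N")


def main():
    # Hive sends one row per line, columns separated by tabs.
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        city = fields[-1]  # assumes city is the last column in TRANSFORM(...)
        print("\t".join(fields + [state_from_city(city)]))


if __name__ == "__main__":
    main()
```

A generic (non-Hive-specific) Java UDF extending org.apache.hadoop.hive.ql.exec.UDF is the other common route; the streaming approach is slower per-row but far easier to iterate on in Python.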
Re: Failed to execute command on hive - Relative path in absolute URI
Hi,

To reference a Hive configuration property, use: ${hiveconf:NAME_OF_PROPERTY}

Bye

Daniel Lopes

On Tue, Sep 29, 2015 at 4:53 AM, IT CTO wrote:
>
> ---------- Forwarded message ---------
> From: IT CTO
> Date: Tue, Sep 29, 2015 at 10:51 AM
> Subject: Failed to execute command on hive - Relative path in absolute URI
> To:
>
> After connecting to Hive, both using the shell and Beeline, I am getting the following error on every command I make. What is wrong?
>
> Failed with exception java.io.IOException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:user.name}
>
> My hive-site.xml properties point to /tmp/${system:user.name} and I do see the /tmp/cto directory in it with the hive log file.
>
> Eran
> --
> Eran | "You don't need eyes to see, you need vision" (Faithless)
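[Editor's note] For context, a minimal sketch of how hiveconf substitution works; the property name, value, and table are invented for illustration:

```sql
-- Launch:  hive -hiveconf run_date=2015-09-29 -f daily.hql
-- Inside daily.hql, ${hiveconf:run_date} is substituted as plain text
-- before the statement is compiled:
SELECT *
FROM logs
WHERE dt = '${hiveconf:run_date}';

-- The same property can also be set inside a session:
SET run_date=2015-09-29;
```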
Re: Organising Hive Scripts
Thanks Erwan, I needed this too, and it works fine for me.

Daniel Lopes

On Sat, Sep 12, 2015 at 12:11 PM, Erwan MAS wrote:
> Hi,
>
> Hive has a "source" keyword, so you can split your big script into multiple parts.
>
> So you can have a script_part.hql, and inside it you call all your small parts:
>
> source script_a.hql ;
> source script_b.hql ;
> source script_c.hql ;
>
> --
> Erwan MAS
Re: Subquery in select statement
Sorry, but is there something I can do in this case?

Daniel Lopes

On Thu, Sep 3, 2015 at 4:12 PM, Daniel Lopes wrote:
> Hi,
>
> Is there something like this that I can do?
>
> SELECT
>   tb1.id,
>   (SELECT tb3.field
>    FROM database.table2 tb2
>    JOIN database.table3 tb3 ON (tb3.id = tb2.table3_id)
>    ORDER BY tb3.date DESC
>    LIMIT 1) AS tb3_field
> FROM database.table1 tb1
>
> Best,
>
> Daniel Lopes
Re: [ANNOUNCE] New Hive Committer - Lars Francke
Congrats!

Daniel Lopes

On Tue, Sep 8, 2015 at 6:34 PM, Lars Francke wrote:
> Thank you so much everyone!
>
> Looking forward to continuing to work with all of you.
>
> On Tue, Sep 8, 2015 at 3:26 AM, kulkarni.swar...@gmail.com wrote:
>> Congrats!
>>
>> On Mon, Sep 7, 2015 at 3:54 AM, Carl Steinbach wrote:
>>> The Apache Hive PMC has voted to make Lars Francke a committer on the Apache Hive Project.
>>>
>>> Please join me in congratulating Lars!
>>>
>>> Thanks.
>>>
>>> - Carl
>>
>> --
>> Swarnim
Subquery in select statement
Hi,

Is there something like this that I can do?

SELECT
  tb1.id,
  (SELECT tb3.field
   FROM database.table2 tb2
   JOIN database.table3 tb3 ON (tb3.id = tb2.table3_id)
   ORDER BY tb3.date DESC
   LIMIT 1) AS tb3_field
FROM database.table1 tb1

Best,

Daniel Lopes
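[Editor's note] Hive 1.x does not accept scalar subqueries in the SELECT list. Since the subquery above is not correlated with tb1, one possible rewrite (a sketch against the same placeholder table names) computes the newest value once and cross-joins it onto every row:

```sql
-- Compute the single newest tb3.field once, then attach it to every row
-- of table1. Semantics match the original uncorrelated scalar subquery.
SELECT
  tb1.id,
  latest.field AS tb3_field
FROM database.table1 tb1
CROSS JOIN (
  SELECT tb3.field
  FROM database.table2 tb2
  JOIN database.table3 tb3 ON (tb3.id = tb2.table3_id)
  ORDER BY tb3.date DESC
  LIMIT 1
) latest;
```

If a per-row (correlated) "latest" was actually intended, the usual Hive rewrite is a derived table with ROW_NUMBER() OVER (PARTITION BY the linking key ORDER BY tb3.date DESC), joined back on that key with rn = 1.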