Re: Reply: [ANNOUNCE] New Hive Committer - Wei Zheng
\o/

On Thu, Mar 10, 2016 at 11:31 AM, 谭成灶 wrote:
> Congratulations, Wei !
>
> --
> From: Madhu Thalakola
> Sent: 2016/3/10 21:47
> To: user@hive.apache.org
> Cc: d...@hive.apache.org; w...@apache.org
> Subject: Re: [ANNOUNCE] New Hive Committer - Wei Zheng
>
> Congratulations Wei Zheng
>
> Thanks,
> Madhu
> Help ever, Hurt never

--
Daniel Lopes, B.Eng
Data Scientist - BankFacil
CREA/SP 5069410560
<http://edital.confea.org.br/ConsultaProfissional/cartao.aspx?rnp=2613651334>
Mob +55 (18) 99764-2733
Ph +55 (11) 3522-8009
http://about.me/dannyeuu

Av. Nova Independência, 956, São Paulo, SP
Bairro Brooklin Paulista
CEP 04570-001
https://www.bankfacil.com.br
hive memory error: GC overhead limit exceeded
Hi,

Anyone know this error? Running on Amazon EMR.

2016-02-19 10:32:34 Starting to launch local task to process map join; maximum memory = 932184064
#
# java.lang.OutOfMemoryError: GC overhead limit exceeded
# -XX:OnOutOfMemoryError="kill -9 %p kill -9 %p"
#   Executing /bin/sh -c "kill -9 15759 kill -9 15759"...
Execution failed with exit status: 137
Obtaining error information
Task failed!
Task ID:
  Stage-35
Logs:
/var/log/hive/user/hadoop/hive.log

Best,
Daniel Lopes
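[Editor's note] The OOM occurs in the local task that builds the map-join hash table, so the usual workarounds are to skip the in-memory join or give the client JVM more heap. A sketch, not verified against this EMR setup; the threshold value below is illustrative:

```sql
-- Option 1: fall back to a regular shuffle join instead of a local
-- in-memory map join (trades speed for stability).
SET hive.auto.convert.join=false;

-- Option 2: keep map joins, but lower the size (in bytes) below which
-- a table is considered small enough to load into memory.
SET hive.auto.convert.join.noconditionaltask.size=10000000;

-- Option 3: give the client JVM that runs the local task more heap,
-- e.g. before starting the Hive CLI:  export HADOOP_HEAPSIZE=2048
```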
Spark 1.5.2 + Hive 1.0.0 in Amazon EMR 4.2.0
Hi,

I get this error when trying to write a Spark DataFrame to a Hive table stored as TextFile:

sqlContext.sql('INSERT OVERWRITE TABLE analytics.client_view_stock SELECT * FROM client_view_stock')

(analytics.client_view_stock is the Hive table; client_view_stock is the Spark temp table.)

Error:

15/11/30 21:40:14 INFO latency: StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 5ADBECA2D82A7C17), S3 Extended Request ID: RcPfjgWaeXG62xyVRrAr91sVQNxktqbXUPJgK2cvZlf6SKEAOnWCtV9X9K1Vp9dAyDhGALQRBcU=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[5ADBECA2D82A7C17], ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[214.69], HttpRequestTime=[214.245], HttpClientReceiveResponseTime=[212.513], RequestSigningTime=[0.16], HttpClientSendRequestTime=[0.112],
15/11/30 21:40:21 INFO Hive: Replacing src:s3://my-bucket/output/2015/11/29/client_view_stock/.hive-staging_hive_2015-11-30_21-19-48_942_238078420083598647-1/-ext-1/part-00199, dest: s3://my-bucket/output/2015/11/29/client_view_stock/part-00199, Status:true
-chgrp: '' does not match expected pattern for group
Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...
15/11/30 21:40:21 INFO latency: StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[2509AE55A8D71A61], ServiceEndpoint=[https://my-bucket.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[137.387], HttpRequestTime=[136.721], HttpClientReceiveResponseTime=[134.805], RequestSigningTime=[0.235], ResponseProcessingTime=[0.169], HttpClientSendRequestTime=[0.145],
15/11/30 21:40:21 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Invalid method name: 'alter_table_with_cascade'

Thanks!

Daniel Lopes
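[Editor's note] "Invalid method name" from the metastore usually indicates a client/server version mismatch: alter_table_with_cascade appears to have been added around Hive 1.2, while Spark 1.5 bundles a Hive 1.2 client and EMR 4.2.0 runs a Hive 1.0 metastore. One possible workaround (a sketch, not verified on EMR) is telling Spark to speak the older metastore protocol in spark-defaults.conf:

```properties
# Assumed settings; adjust version and jar resolution to the actual cluster.
spark.sql.hive.metastore.version  1.0.0
spark.sql.hive.metastore.jars     maven
```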
Re: Best way to load CSV file into Hive
Hello,

If you have a file with different types of data, it's preferable to use another file format, such as TSV, ORC or Parquet.

Best,

On Fri, Oct 30, 2015 at 4:16 PM, Vijaya Narayana Reddy Bhoomi Reddy <vijaya.bhoomire...@whishworks.com> wrote:
> Hi,
>
> I have a CSV file which contains a hundred thousand rows and about 200+ columns. Some of the columns have free-text information, which means they might contain characters like commas, colons, quotes etc. within the column content.
>
> What is the best way to load such a CSV file into Hive?
>
> Another serious issue: I stored the file in a location in HDFS and then created an external Hive table on it. However, upon running CREATE EXTERNAL TABLE using the HDP Hive View, the original CSV is no longer present in the folder where it is meant to be. Not sure how HDP processes it or where it is stored? My understanding was that EXTERNAL tables wouldn't be moved from their original HDFS location?
>
> Request someone to help out!
>
> Thanks & Regards
> Vijay
>
> The contents of this e-mail are confidential and for the exclusive use of the intended recipient. If you receive this e-mail in error please delete it from your system immediately and notify us either by e-mail or telephone. You should not copy, forward or otherwise disclose the content of the e-mail. The views expressed in this communication may not necessarily be the view held by WHISHWORKS.
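[Editor's note] If the data has to stay CSV, Hive's OpenCSVSerde handles quoted fields with embedded commas. A sketch, with the table name, columns, and location invented for illustration:

```sql
-- Hypothetical external table over quoted CSV. Quoted fields may contain
-- commas, colons and escaped quotes. Note: OpenCSVSerde reads every
-- column as STRING, so cast to other types in queries as needed.
CREATE EXTERNAL TABLE my_db.raw_events (
  id          STRING,
  description STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "\\"
)
STORED AS TEXTFILE
LOCATION '/user/hadoop/raw_events';
```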
Hive 1.0.0 Error: cannot be cast
I'm using Hive 1.0.0 and I got this error:

Query ID = hadoop_20151023210202_9d73cf48-62f0-4d47-ae26-f5dfff0a24d9
Total jobs = 3
Execution log at: /var/log/hive/tmp/hadoop/hadoop_20151023210202_9d73cf48-62f0-4d47-ae26-f5dfff0a24d9.log
2015-10-23 09:03:14 Starting to launch local task to process map join; maximum memory = 1013645312
2015-10-23 09:03:23 Dump the side-table for tag: 1 with group count: 27 into file: file:/var/log/hive/tmp/hadoop/ccfa168d-3b18-4551-b253-070682b406a0/hive_2015-10-23_21-02-55_298_3710383116074055586-1/-local-10004/HashTable-Stage-8/MapJoin-mapfile311--.hashtable
2015-10-23 09:03:26 Uploaded 1 File to: file:/var/log/hive/tmp/hadoop/ccfa168d-3b18-4551-b253-070682b406a0/hive_2015-10-23_21-02-55_298_3710383116074055586-1/-local-10004/HashTable-Stage-8/MapJoin-mapfile311--.hashtable (1216 bytes)
2015-10-23 09:03:26 End of local task; Time Taken: 11.675 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1445632064722_0029, Tracking URL = http://ip-10-252-112-226.sa-east-1.compute.internal:20888/proxy/application_1445632064722_0029/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1445632064722_0029
Hadoop job information for Stage-8: number of mappers: 1; number of reducers: 0
2015-10-23 21:03:43,140 Stage-8 map = 0%, reduce = 0%
2015-10-23 21:04:25,621 Stage-8 map = 100%, reduce = 0%
Ended Job = job_1445632064722_0029 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1445632064722_0029_m_00 (and more) from job job_1445632064722_0029

Task with the most failures(4):
-
Task ID:
  task_1445632064722_0029_m_00

URL:
  http://ip-10-252-112-226.sa-east-1.compute.internal:8088/taskdetails.jsp?jobid=job_1445632064722_0029&tipid=task_1445632064722_0029_m_00
-
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"bi_lead_id":2459,"consultor":2,"etapa":"Atendimento","tempo_ate":null,"duracao":null,"data":"2015-03-16 18:08:41"}
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:65)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:452)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"bi_lead_id":2459,"consultor":2,"etapa":"Atendimento","tempo_ate":null,"duracao":null,"data":"2015-03-16 18:08:41"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176)
    ...
8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.HiveVarcharWritable
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:311)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
    ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.HiveVarcharWritable
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector.copyObject(WritableHiveVarcharObjectInspector.java:109)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:305)
    at org.apache.hadoop.hive.ql.exec.JoinUtil.computeValues(JoinUtil.java:193)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getFilteredValue(CommonJoinOperator.java:408)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:295)
    ... 13 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-8: Map: 1  HDFS Read: 0  HDFS Write: 0  FAIL
Total MapReduce CPU Time Spent: 0 msec
Command exiting with ret '2'

Best,
Daniel Lopes
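[Editor's note] This ClassCastException typically means one side of the join declares a column as VARCHAR while the data actually arrives as a plain STRING (for example, a table schema that disagrees with a partition schema). Two hedged workarounds, with table names invented for illustration ("etapa" is taken from the failing row above):

```sql
-- Option 1: align the declared column type with what the SerDe produces.
ALTER TABLE my_db.my_table CHANGE etapa etapa STRING;

-- Option 2: cast explicitly so both join sides agree on the type.
SELECT a.bi_lead_id
FROM my_db.table_a a
JOIN my_db.table_b b
  ON (CAST(a.etapa AS STRING) = CAST(b.etapa AS STRING));
```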
Better way to do UDF's for Hive
Hi,

I'd like to know a good way to write a UDF for a single field, like:

SELECT
  tbl.id AS id,
  tbl.name AS name,
  tbl.city AS city,
  state_from_city(tbl.city) AS state
FROM my_db.my_table tbl;

Native Java? Python over Hadoop Streaming? I prefer Python, but I don't know how to do it in a good way.

Thanks,

Daniel Lopes
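[Editor's note] One Python option is Hive's TRANSFORM clause with a streaming script: Hive pipes selected columns to the script as tab-separated lines on stdin and reads tab-separated lines back from stdout. A minimal sketch; the script name, the city-to-state lookup, and the assumption that the city is the last selected column are all invented for illustration:

```python
#!/usr/bin/env python
# Hypothetical streaming UDF for Hive TRANSFORM. Usage from HiveQL:
#
#   ADD FILE state_from_city.py;
#   SELECT TRANSFORM (id, name, city)
#     USING 'python state_from_city.py'
#     AS (id, name, city, state)
#   FROM my_db.my_table;
#
import sys

# Assumed lookup table for illustration; in practice this could be
# loaded from a file shipped alongside the script with ADD FILE.
CITY_TO_STATE = {
    "Sao Paulo": "SP",
    "Rio de Janeiro": "RJ",
    "Belo Horizonte": "MG",
}


def state_from_city(city):
    """Return the state abbreviation, or \\N (Hive's NULL marker) if unknown."""
    return CITY_TO_STATE.get(city, "\\N")


def main():
    # Hive sends one row per line, columns separated by tabs.
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        city = fields[-1]  # assumes city is the last column in TRANSFORM(...)
        print("\t".join(fields + [state_from_city(city)]))


if __name__ == "__main__":
    main()
```

A generic (non-Hive-specific) Java UDF extending org.apache.hadoop.hive.ql.exec.UDF is the other common route; the streaming approach is slower per-row but far easier to iterate on in Python.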
Re: Failed to execute command on hive - Relative path in absolute URI
Hi,

To reference a Hive configuration property, use: ${hiveconf:NAME_OF_PROPERTY}

Bye

Daniel Lopes

On Tue, Sep 29, 2015 at 4:53 AM, IT CTO wrote:
>
> ---------- Forwarded message ---------
> From: IT CTO
> Date: Tue, Sep 29, 2015 at 10:51 AM
> Subject: Failed to execute command on hive - Relative path in absolute URI
> To:
>
> After connecting to Hive, both using the shell and Beeline, I am getting the following error on every command I make. What is wrong?
>
> Failed with exception java.io.IOException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:user.name}
>
> My hive-site.xml properties point to /tmp/${system:user.name} and I do see the /tmp/cto directory in it with the hive log file.
>
> Eran
> --
> Eran | "You don't need eyes to see, you need vision" (Faithless)
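[Editor's note] For context, a minimal sketch of how hiveconf substitution works; the property name, value, and table are invented for illustration:

```sql
-- Launch:  hive -hiveconf run_date=2015-09-29 -f daily.hql
-- Inside daily.hql, ${hiveconf:run_date} is substituted as plain text
-- before the statement is compiled:
SELECT *
FROM logs
WHERE dt = '${hiveconf:run_date}';

-- The same property can also be set inside a session:
SET run_date=2015-09-29;
```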
Re: Organising Hive Scripts
Thanks Erwan, I needed this too, and it works fine for me.

Daniel Lopes

On Sat, Sep 12, 2015 at 12:11 PM, Erwan MAS wrote:
> Hi,
>
> Hive has a "source" keyword, so you can split your big script into multiple parts.
>
> So you can have a script_part.hql, and inside it you call all your small parts:
>
> source script_a.hql ;
> source script_b.hql ;
> source script_c.hql ;
>
> --
> Erwan MAS
Re: Subquery in select statement
Sorry, but is there something I can do in this case?

Daniel Lopes

On Thu, Sep 3, 2015 at 4:12 PM, Daniel Lopes wrote:
> Hi,
>
> Is there something like this that I can do?
>
> SELECT
>   tb1.id,
>   (SELECT tb3.field
>    FROM database.table2 tb2
>    JOIN database.table3 tb3 ON (tb3.id = tb2.table3_id)
>    ORDER BY tb3.date DESC
>    LIMIT 1) AS tb3_field
> FROM database.table1 tb1
>
> Best,
>
> Daniel Lopes
Re: [ANNOUNCE] New Hive Committer - Lars Francke
Congrats!

Daniel Lopes

On Tue, Sep 8, 2015 at 6:34 PM, Lars Francke wrote:
> Thank you so much everyone!
>
> Looking forward to continuing to work with all of you.
>
> On Tue, Sep 8, 2015 at 3:26 AM, kulkarni.swar...@gmail.com wrote:
>> Congrats!
>>
>> On Mon, Sep 7, 2015 at 3:54 AM, Carl Steinbach wrote:
>>> The Apache Hive PMC has voted to make Lars Francke a committer on the Apache Hive Project.
>>>
>>> Please join me in congratulating Lars!
>>>
>>> Thanks.
>>>
>>> - Carl
>>
>> --
>> Swarnim
Subquery in select statement
Hi,

Is there something like this that I can do?

SELECT
  tb1.id,
  (SELECT tb3.field
   FROM database.table2 tb2
   JOIN database.table3 tb3 ON (tb3.id = tb2.table3_id)
   ORDER BY tb3.date DESC
   LIMIT 1) AS tb3_field
FROM database.table1 tb1

Best,

Daniel Lopes
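[Editor's note] Hive 1.x does not accept scalar subqueries in the SELECT list. Since the subquery above is not correlated with tb1, one possible rewrite (a sketch against the same placeholder table names) computes the newest value once and cross-joins it onto every row:

```sql
-- Compute the single newest tb3.field once, then attach it to every row
-- of table1. Semantics match the original uncorrelated scalar subquery.
SELECT
  tb1.id,
  latest.field AS tb3_field
FROM database.table1 tb1
CROSS JOIN (
  SELECT tb3.field
  FROM database.table2 tb2
  JOIN database.table3 tb3 ON (tb3.id = tb2.table3_id)
  ORDER BY tb3.date DESC
  LIMIT 1
) latest;
```

If a per-row (correlated) "latest" was actually intended, the usual Hive rewrite is a derived table with ROW_NUMBER() OVER (PARTITION BY the linking key ORDER BY tb3.date DESC), joined back on that key with rn = 1.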