Re: New lines causing new rows

2014-08-17 Thread Andre Araujo
Hi, Charles, What's the storage format for the raw data source? What's the definition of your view? On 18 August 2014 04:20, Charles Robertson wrote: > HI all, > > I am loading some data into a Hive table, and one of the fields contains > text which I believe contains new line characters. I ha

Re: Hive Statistics

2014-07-23 Thread Andre Araujo
Hi, Navdeep, Please note that the configuration for the stats database is separate from the configuration for the metastore db. Can you confirm you have both to use a mysql db? The properties for the stats db are: hive.stats.dbclass= hive.stats.dbconnectionstring= On 23 July 2014 16:07, Navdee

Re: exchange partition documentation

2014-07-20 Thread Andre Araujo
eed some guidance. The syntax >> confused me originally -- see comments on HIVE-4095 >> <https://issues.apache.org/jira/browse/HIVE-4095?focusedCommentId=13819885&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13819885>. >> I'll

Re: exchange partition documentation

2014-07-20 Thread Andre Araujo
Indeed! The documentation is a fair bit off. I've tested the below on Hive 0.12 on CDH and it works fine. Lefty, would you please update the documentation on the two pages below? --- Source: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-

Re: Errors while creating a new table using existing table schema

2014-07-19 Thread Andre Araujo
Vidya, I'm not sure I've understood your problem correctly. But if you want to create a table in the default database you can do either: use default; create table jobs_ex2 as select ... OR create table default.jobs_ex2 as select ... With that you don't need to specify the LOCATION clause. Howe

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
Hi, Yang, you're running your mapreduce jobs in Hadoop's local mode, and in that mode all the Hive MR logging is handled through log4j on your local machine, which is what this log file is about. The log location and naming is controlled by the property log4j.appender.FA.File in the Hive log4j pr

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
and where is it located? On 19 July 2014 10:58, Andre Araujo wrote: > Can you give us an excerpt of the contents of this log? > > > On 19 July 2014 04:38, Yang wrote: > >> thanks guys. anybody knows what generates the log like " >> myuser_201407

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
enerate this, looks from hive. > > > On Fri, Jul 18, 2014 at 12:28 AM, Andre Araujo wrote: > >> Make sure the directory you specify has the sticky bit set, otherwise >> users will have permission problems: >> >> chmod 1777 >> >> >> On 18 July 201

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
Make sure the directory you specify has the sticky bit set, otherwise users will have permission problems: chmod 1777 On 18 July 2014 14:19, Satish Mittal wrote: > You can configure the following property in > $HIVE_HOME/conf/hive-log4j.properties: > > hive.log.dir= > > The default value of t
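
A minimal sketch of the same permission setup done from Python rather than the shell, in case the directory is created by a provisioning script. The directory name below is a made-up placeholder (the actual path is not shown in the message above); mode 1777 means world-writable with the sticky bit set, so users can only remove their own log files.

```python
import os
import stat
import tempfile


def make_shared_log_dir(path):
    """Create `path` if needed and set mode 1777 (rwx for all, plus sticky bit)."""
    os.makedirs(path, exist_ok=True)
    # os.chmod is not affected by the umask, so the mode is applied as given.
    os.chmod(path, 0o1777)
    return stat.S_IMODE(os.stat(path).st_mode)


# Hypothetical location; substitute the directory you configured for the Hive logs.
log_dir = os.path.join(tempfile.mkdtemp(), "hive-logs")
mode = make_shared_log_dir(log_dir)
print(oct(mode))
```

Checking `mode & stat.S_ISVTX` afterwards is a quick way to confirm that the sticky bit really is set.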

Re: beeline remote client not connecting to hiveserver2

2014-07-02 Thread Andre Araujo
t.sasl.qop=auth > hive.server2.transport.mode=binary > > > On Wed, Jul 2, 2014 at 10:55 PM, Andre Araujo wrote: > >> Did you explicitly change the HiveServer2 port to 11000? The default is >> 10000. >> >> Can you provide the output of the following ? >

Re: beeline remote client not connecting to hiveserver2

2014-07-02 Thread Andre Araujo
hiveservice". I confirmed that the hostname is indeed resolvable in dns. > I also tried using the ip address in place of the hostname and I still get > the same error. > > > On Wed, Jul 2, 2014 at 5:36 PM, Andre Araujo wrote: > >> If the name "hiveserver" is

Re: beeline remote client not connecting to hiveserver2

2014-07-02 Thread Andre Araujo
If the name "hiveserver" is not resolved correctly, that's the exact error you'd be getting. Does "nslookup hiveservice" resolve the name successfully? Try using the fully qualified name instead. On 3 July 2014 07:01, Szehon Ho wrote: > I believe you should be able to put in anything by default.

Re: beeline remote client not connecting to hiveserver2

2014-07-02 Thread Andre Araujo
Is the name "hiveservice" being resolved successfully? If there's a problem with the name resolution, that's the exact message you'd get. Does "nslookup hiveservice" resolve the name? Try using the fully qualified name instead. On 3 July 2014 03:16, Hang Chan wrote: > beeline does not seem to

Re: Need urgent help

2014-07-02 Thread Andre Araujo
Did you manage to load the jar from the local file? On 2 July 2014 16:57, wrote: > This copy was successful. The file was not corrupted > > > Sent from Samsung Mobile > > > Original message ---- > From: Andre Araujo > Date:01/07/2014 23:45 (GMT-08:0

Re: Need urgent help

2014-07-01 Thread Andre Araujo
supergroup 621942 2014-05-20 21:33 > hdfs://xx.xx.xx.xxx:9000/user/ipg_intg_user/AP/scripts/lib/wsaUtils.jar > > > > *From:* Andre Araujo [mailto:ara...@pythian.com] > *Sent:* Tuesday, July 01, 2014 10:46 PM > *To:* user > > *Subject:* Re: Need urgent help > >

Re: Need urgent help

2014-07-01 Thread Andre Araujo
What's the result of the command below? hadoop fs -ls hdfs://xx.xx.xx.xxx:9000/user/ipg_intg_user/AP/scripts/lib/wsaUtils.jar On 2 July 2014 14:07, wrote: > Can you please elaborate on the permission > > > > > > Hi, > > Cannot add a jar to hive classpath. > > Once I launch HIVE, I type -> ADD

Re: Reg: Merging Rows

2014-06-24 Thread Andre Araujo
...SUM(COALESCE(col1, 0)), ... On 25 June 2014 08:01, Krishnan Narayanan wrote: > Try coalesce > > > On Tue, Jun 24, 2014 at 2:49 PM, sumit ghosh wrote: > >> Did you try sum(col1), sum(col2) ... group by id >> >> >> On Tuesday, 24 June 2014 1:23 PM, usha hive wrote: >> >> >> Hi, >> >> I
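
A small sketch of why the COALESCE matters when merging rows with SUM. It uses Python's built-in sqlite3 module as a stand-in for Hive, and the table name, columns and rows are invented for illustration (the original poster's data is not shown above). SUM already skips NULLs, but a group where every value is NULL sums to NULL unless the value is coalesced to 0 first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 INTEGER, col2 INTEGER)")
# Made-up rows: id 1 has its values spread over two rows, id 2 has only NULLs.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 10, None), (1, None, 20), (2, None, None)],
)

plain = conn.execute(
    "SELECT id, SUM(col1), SUM(col2) FROM t GROUP BY id ORDER BY id"
).fetchall()
merged = conn.execute(
    "SELECT id, SUM(COALESCE(col1, 0)), SUM(COALESCE(col2, 0)) "
    "FROM t GROUP BY id ORDER BY id"
).fetchall()

print(plain)   # [(1, 10, 20), (2, None, None)]
print(merged)  # [(1, 10, 20), (2, 0, 0)]
```

Whether a 0 or a NULL is the right answer for an all-NULL group depends on what the merged rows are used for.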

Re: hive/hbase integration

2014-06-23 Thread Andre Araujo
Brian, I'm successfully using Hive/HBase integration on CDH5 (Hive 0.12.0 and HBase 0.96.1). Can you explain your setup and configuration in detail? Cheers, Andre' On 24 June 2014 02:14, Brian Jeltema wrote: > I’m running Hive 0.12 on Hadoop V2 (Ambari installation) and have been > trying to

Re: DDLTask. Database does not exist:

2014-06-22 Thread Andre Araujo
Robert, check your hive.metastore.uris variable. It seems that your Hive sessions are using a local derby metastore, which is created in your local directory by default, instead of using a shared metastore. Cheers, Andre On 22 June 2014 10:57, Grandl Robert wrote: > Ok. It seems if I start h

Re: How to load json data with nested arrays into hive?

2014-06-21 Thread Andre Araujo
Hi, Chris, I like the Json serde solution better, but there's another alternative to achieve what you're trying to do. Using the Brickhouse's json_split UDF (https://github.com/klout/brickhouse) and the mdmp_raw_data table you already have, one can do the following: ADD JAR file:///.../brickhouse

Re:

2014-06-18 Thread Andre Araujo
This seems to be fixed by HIVE-6005, which is included in HIVE-5263. On 19 June 2014 00:25, Clay McDonald wrote: > I’m trying to run the following hive join query and get the following > error. Any suggestions? > > > > > > > > > > hive> select count(B.txn_id) AS CNT FROM txn_hdr_combined AS B

Re:

2014-06-18 Thread Andre Araujo
Could you send a "show create table" for the two tables involved? On 19 June 2014 00:25, Clay McDonald wrote: > I’m trying to run the following hive join query and get the following > error. Any suggestions? > > > > > > > > > > hive> select count(B.txn_id) AS CNT FROM txn_hdr_combined AS B JOI

Re: Need help in Date format

2014-06-12 Thread Andre Araujo
You may find this a cheeky trick, but it works a treat :) select printf('%s-%02.0f-%s', substr(started_dt,1,2), (2+instr('JanFebMarAprMayJunJulAugSepOctNovDec', substr(started_dt,4,3)))/3, substr(started_dt,8,4) ) On 13 June 2014 10:01, Krishnan Narayanan w
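
To see why the arithmetic in the query above works, here is a line-by-line port to Python. It assumes the input looks like DD-Mon-YYYY (for example 12-Jun-2014), which is what the substr offsets suggest; the actual format of started_dt is not shown in the thread. Each month abbreviation starts at 1-based position 1, 4, 7, ... in the lookup string, so adding 2 and dividing by 3 gives the month number.

```python
MONTHS = "JanFebMarAprMayJunJulAugSepOctNovDec"


def reformat_date(started_dt):
    day = started_dt[0:2]    # substr(started_dt, 1, 2)
    abbr = started_dt[3:6]   # substr(started_dt, 4, 3)
    year = started_dt[7:11]  # substr(started_dt, 8, 4)
    # instr() is 1-based, str.find() is 0-based, hence the +1.
    pos = MONTHS.find(abbr) + 1
    month = (2 + pos) / 3
    return "%s-%02.0f-%s" % (day, month, year)


print(reformat_date("12-Jun-2014"))  # 12-06-2014
print(reformat_date("01-Dec-2013"))  # 01-12-2013
```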

Re: How to extract multi-match in one line with regexp_extract function?

2014-06-10 Thread Andre Araujo
Unfortunately, regexp_extract only works with a single index. If the fields in your line are delimited by some specific character, just use the split function. If not, you can use a combination of regexp_replace and split, like the example below: select split(regexp_replace('World Cup 2014', '(.*
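
The query above is cut off, so the following is only a sketch of the general regexp_replace-then-split idea, written in Python. The pattern and the "|" delimiter are assumptions chosen for this illustration, not the ones from the original message: the regular expression rewrites the whole line as its captured groups joined by a delimiter, and a split on that delimiter then yields all matches at once.

```python
import re

line = "World Cup 2014"

# Hypothetical pattern: two words followed by a number, each in its own group.
# The replacement keeps only the groups, joined by a delimiter that cannot
# occur in the captured text.
delimited = re.sub(r"^(\w+) (\w+) (\d+)$", r"\1|\2|\3", line)
fields = delimited.split("|")

print(fields)  # ['World', 'Cup', '2014']
```

The delimiter has to be a character that never appears in the captured values, otherwise the split returns too many fields.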

Re: why sometime drop table will take very long time?

2014-03-30 Thread Andre Araujo
If your table has a large number of files, it may take a while for Hive to remove all of them from HDFS and thus there's a delay. To be 100% sure, bump up the logging level of the Hive client to DEBUG and check where the time is being spent. On 31 March 2014 14:02, ch huang wrote: > hi,maillis

Re: Group By and Partion By in Hive

2014-01-29 Thread Andre Araujo
The first is available in Hive since early versions (in versions older than 0.6.0, though, use COUNT(1) instead). The second is available in Hive starting in version 0.11.0. On 29 January 2014 21:41, unmesha sreeveni wrote: > In SQL we have partion by and group by > > select deptno, count(*) c
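
The two queries being compared are truncated above, so this sketch assumes they are a count(*) with GROUP BY deptno and a count(*) OVER (PARTITION BY deptno). The employee rows are invented. It mimics in plain Python what each form returns: GROUP BY collapses each department to one row, while the windowed form keeps every input row and attaches the department count to it.

```python
from collections import Counter

# Made-up (ename, deptno) rows.
rows = [("alice", 10), ("bob", 10), ("carol", 20)]

counts = Counter(deptno for _, deptno in rows)

# GROUP BY deptno: one output row per department.
grouped = sorted(counts.items())

# count(*) OVER (PARTITION BY deptno): one output row per input row.
windowed = [(ename, deptno, counts[deptno]) for ename, deptno in rows]

print(grouped)   # [(10, 2), (20, 1)]
print(windowed)  # [('alice', 10, 2), ('bob', 10, 2), ('carol', 20, 1)]
```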

Re: Hive dynamic partitions generate multiple files

2014-01-29 Thread Andre Araujo
s Engineer > Phone: +45.27.30.60.35 > > > > On Wed, Jan 29, 2014 at 2:53 AM, Andre Araujo wrote: > >> Why do you need exactly one file? This is transparent to Hive and it >> should treat it seamlessly. Unless you have external requirements (reading >> files from

Re: Hive dynamic partitions generate multiple files

2014-01-28 Thread Andre Araujo
Software Systems Engineer > Phone: +45.27.30.60.35 > > > > On Wed, Jan 29, 2014 at 1:16 AM, Andre Araujo wrote: > >> Hi, Cosmin, >> >> Have you tried using DISTRIBUTE BY to distribute the query's data by the >> partitioning columns? >> That

Re: Hive dynamic partitions generate multiple files

2014-01-28 Thread Andre Araujo
Hi, Cosmin, Have you tried using DISTRIBUTE BY to distribute the query's data by the partitioning columns? That way all the data for each partition should be sent to the same reducer and should be written to a single file in each partition, I think. If your data is being distributed by a differen
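
A toy simulation of the reasoning above, not of Hive itself. It assumes that each reducer writes one file for every partition it receives rows for, and it uses crc32 as a stand-in for whatever hash Hive applies to the DISTRIBUTE BY columns. The column names dt and user_id are invented. When rows are distributed by the partition column, every partition ends up with exactly one writer; distributing by another column can spread a partition over several reducers.

```python
import zlib
from collections import defaultdict


def writers_per_partition(rows, distribute_key, partition_key, num_reducers):
    """Count how many reducers (and so files) write into each partition."""
    writers = defaultdict(set)
    for row in rows:
        # Stand-in hash: rows with equal distribute keys reach the same reducer.
        key_bytes = repr(distribute_key(row)).encode("utf-8")
        reducer = zlib.crc32(key_bytes) % num_reducers
        writers[partition_key(row)].add(reducer)
    return {part: len(reducers) for part, reducers in writers.items()}


# Made-up rows: 30 rows spread over three daily partitions.
rows = [{"dt": "2014-01-%02d" % (i % 3 + 1), "user_id": i} for i in range(30)]

by_partition = writers_per_partition(
    rows, lambda r: r["dt"], lambda r: r["dt"], num_reducers=4
)
by_other = writers_per_partition(
    rows, lambda r: r["user_id"], lambda r: r["dt"], num_reducers=4
)

print(by_partition)  # every partition has exactly one writer
print(by_other)      # partitions may have several writers
```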

Re: casting complex data types for outputs of custom scripts

2014-01-14 Thread Andre Araujo
I had a similar issue in the past when trying to cast an empty array to array(). By default Hive assumes it's an array(). I don't think there's currently a Hive syntax to cast values to complex data types. If there's one, I'd love to know what it is :) On 14 January 2014 10:22, rohan monga wrote

RCFILE and "\n" characters

2014-01-06 Thread Andre Araujo
The example below shows that the RCFILE SerDe doesn't handle "\n" in string fields correctly. It seems that the SerDe uses "\n" internally as a record delimiter but it's failing to de/serialize it correctly when it appears within a field. Is that correct? Any ideas on how to work around that? Tha

Re: Typecasting arrays

2013-12-18 Thread Andre Araujo
> now. If it's added now then well and good > > I normally typecast individual element of the array and then join them > back (mostly via udf). > > I will see if I can find that code > > > On Thu, Dec 19, 2013 at 11:47 AM, Andre Araujo wrote: > >> Hi, all, >>

Typecasting arrays

2013-12-18 Thread Andre Araujo
Hi, all, Is there a way to typecast arrays in Hive? What I want is for Hive, in a specific select where I specify an empty array as the value for one of the columns, to treat the empty array as array, instead of the default, which is array. I searched the documentation but couldn't find an