RE: user Digest 19 Jul 2020 19:53:41 -0000 Issue 3895

2020-12-04 Thread Jack Yang
unsubscribe

Re: Tez query failed with OutOfMemoryError: Java heap space

2017-07-06 Thread Yang, Xin
Here is the version information: Hive: 1.2.1, Tez: 0.8.5, Hadoop: 2.6.0-cdh5.8.3. Please let me know if you need more information. Regards, Xin From: "Yang, Xin" <xiy...@visa.com> Date: Thursday, June 29, 2017 at 11:48 AM To: "user@hive.apache.org<mailto:user@h

Re: Big + small + small 3 table mapjoin?

2016-03-08 Thread Yang
ah never mind I found that we are using an old version of hive without this feature On Tue, Mar 8, 2016 at 9:57 AM, Yang wrote: > by documentation I'm referring to this: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins#LanguageManualJoins-MapJoinRestriction

Re: Big + small + small 3 table mapjoin?

2016-03-08 Thread Yang
8, 2016 at 9:31 AM, Yang wrote: > From the documentation it says that if my tables are small enough and i > set the convert join parameters, without the join hints hive should be able > to convert the joins into 1 mapjoin in 1 mr job > > But in practice i found that it always end

Big + small + small 3 table mapjoin?

2016-03-08 Thread Yang
The documentation says that if my tables are small enough and I set the convert join parameters, Hive should be able to convert the joins into 1 mapjoin in 1 MR job, without the join hints. But in practice I found that it always ends up in 2 MR jobs (2 map joins). What is wrong?

Re: hacking the hive ql parser?

2015-12-29 Thread Yang
thanks! On Tue, Dec 29, 2015 at 12:14 PM, Edward Capriolo wrote: > hive --service lineage 'hql' exists i believe. > > On Tue, Dec 29, 2015 at 3:05 PM, Yang wrote: > >> I'm trying to create a utility to parse out the data lineage (i.e. DAG >> depe

hacking the hive ql parser?

2015-12-29 Thread Yang
r with the parser code structure of hive, could anybody give me some tips on where to start? (I see the .g files, but I'm not sure where the rest is. I am more familiar with the ASTvisitor paradigm in antlr, but can't find similar files in the parser dir.) thanks Yang

RE: multiple users for hive access

2015-07-07 Thread Jack Yang
user access. You can configure it to server mode or use other metadata store like mysql etc. Here's the tutorial for how to configure derby server mode https://cwiki.apache.org/confluence/display/Hive/HiveDerbyServerMode On Tue, Jul 7, 2015 at 1:50 PM, Jack Yang <j...@uow.edu.au>

multiple users for hive access

2015-07-06 Thread Jack Yang
Hi all, I would like multiple users to be able to access Hive. Has anyone tried that before? Is there a tutorial or link I can study? Best regards, Jack

Re: 38 digits vs 35 digits for Decimal type?

2015-02-24 Thread Yang
Sorry, my bad: the doc from hive says 38 bits, I misread that. I didn't intend to send the last email and thought it was in the "draft" box only...

38 digits vs 35 digits for Decimal type?

2015-02-21 Thread Yang
ECIMAL type was introduced? Why not follow oracle convention? Right now what is my best strategy for copying a table with NUMBER column from oracle to hive? Thanks Yang

Re: select on parquet hive tables always gives NULL ?

2015-02-19 Thread Yang
ah... found out. my issue is that hive 0.13 doesn't handle this correctly. could be a bug. used 0.14, it works. btw the UNION[int, null] translates to parquet as a field "optional int32 myfieldName", I found this by calling ParquetFileReader.readFooter() On Thu, Feb 19, 2015 at

Re: Parquet support for Timestamp in 0.14

2015-02-19 Thread Yang
015 at 12:08 PM, Yang wrote: > Szehon: > > another question related to the types support: > > if I convert an avro field of UNION to parquet, does hive support that > UNION field ? a UNION is needed because avro field can not take NULL, and I > have to define every field a

Re: Parquet support for Timestamp in 0.14

2015-02-19 Thread Yang
Szehon: another question related to the types support: if I convert an avro field of UNION to parquet, does hive support that UNION field ? a UNION is needed because avro field can not take NULL, and I have to define every field as an UNION of original type and NULL. Thanks Yang On Mon, Feb 9

select on parquet hive tables always gives NULL ?

2015-02-19 Thread Yang
"string","doc":""},{"name":"nullableInt","type":["int","null"],"doc":""}],"version":"1424373511441"} the following is the parquet hive table def. I also attached the sample par

Re: Parquet support for Timestamp in 0.14

2015-02-09 Thread Yang
Thanks Szehon! On Tue, Feb 3, 2015 at 7:33 PM, Szehon Ho wrote: > Hi Yang > > I saw you posted this question in several places, I gave an answer in > HIVE-6394 as I saw that one first, to the timestamp query. > > Can't speak about date support, as it's not in m

failed to create an external hive table on parquet files (hive 0.14)

2015-02-03 Thread Yang
ect * from parquet_test gives just NULL. I tried to create an internal table with parquet format, it does work. and selection works too. but then after I point the location of an external table to that new internal table, it still selects NULL on output. Thanks Yang

Parquet support for Timestamp in 0.14

2015-02-02 Thread Yang
g/jira/browse/HIVE-8119 are we going to have a different on-disk binary encoding than the "int32" specified in the above doc? thanks Yang

possible to pass in a list of values as param ?

2014-10-27 Thread Yang
iveconf myargs="'1','2','3','4'" or ='1,2,3,4' neither seems to work what is the best way to do this? thanks! yang

will I get conflict if I run 2 "INSERT INTO TABLE" in parallel ?

2014-10-23 Thread Yang
antee correctness of data. if there is conflict, would "INSERT OVERWRITE PARTITION" get conflicts ? the different processes indeed process different partitions thanks Yang

'split' table into multiple partitions, only 1 mapper 1 reducer is launched

2014-10-06 Thread Yang
I have 400k rows in table A, about 50 bytes each row, now I want to split all the rows in A and insert into B, which is the same table but partitioned on the row_id. I fired off my hive query, but it only generated 1 mapper and 1 reducer, so it's very slow. what settings can I set to use more ma

HIVE not creating _SUCCESS ?

2014-08-10 Thread Yang
I'm very surprised to find that HIVE actually does not create a _SUCCESS in its output file dir.. I had thought that since every MR job by default creates a _SUCCESS, and HIVE is compiled into MR anyway, HIVE kind of naturally behaves the same as any MR program. I verified that the wc example and

doing upsert possible?

2014-07-24 Thread Yang
if we have a huge table, and every 1 hour only 1% of that has some updates, it would be a huge waste to slurp in the whole table through MR job and write out the new table. instead, if we store this table in HBASE, and use the current HBase+Hive integration, as long as we can do upsert, then we ca

Re: does the HBase-Hive integration support using HBase index (primary key or secondary index) in the JOIN implementation?

2014-07-24 Thread Yang
e partitions already. On Thu, Jul 24, 2014 at 2:03 PM, Yang wrote: > if I do a join of a table based on txt file and a table based on HBase, > and say the latter is very large, is HIVE smart enough to utilize the HBase > table's index to do the join, instead of implementing thi

does the HBase-Hive integration support using HBase index (primary key or secondary index) in the JOIN implementation?

2014-07-24 Thread Yang
, and then the matching items found out through the reducer? thanks Yang

random NPE in HiveInputFormat.init() ??

2014-07-18 Thread Yang
JqRISUQ&bvm=bv.71198958,d.cGU second one suggests that you have to set the resource manager instead of leaving it empty. since we do have a valid server value set to the yarn.resourcemanager.address , and the above error only appears about 20% time, does it mean that our resourcemanager is unstable? thanks Yang

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
s log? >> >> >> On 19 July 2014 04:38, Yang wrote: >> >>> thanks guys. anybody knows what generates the log like " >>> myuser_20140716143232_d76043ed-1c4b-42a0-bf0a-2816377a6a2a.log" ? I >>> checked our application code, it doesn'

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
2014-07-18 15:03:37,774 INFO mr.ExecDriver (SessionState.java:printInfo(537)) - Execution log at: /tmp/myuser/myuser_20140718150303_56bf6bb0-db30-4dbc-807c-9023ce4103f4.log 2014-07-18 15:03:37,864 WARN conf.Configuration (Configuration.java:loadProperty(2358)) - file:/tmp/myuser/hive_2014-07-18_

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
>> hive.log.dir= >> >> The default value of this property is ${java.io.tmpdir}/${user.name}. >> >> Thanks, >> Satish >> >> >> On Thu, Jul 17, 2014 at 11:58 PM, Yang wrote: >> >>> we just moved to hadoop2.0 (HDP2.1 distro). it turns out that

how to control hive log location on 0.13?

2014-07-17 Thread Yang
I change the location of both the logs , by some per-script params ? (i.e. we can't afford to change the system hive-site.xml or /etc/hive/conf etc) Thanks a lot Yang

Re: local-mode doesn't work?

2014-06-18 Thread Yang
es to local is less of a chore since PIG does not need the metastore On Wed, Jun 18, 2014 at 11:14 AM, Yang wrote: > I tried to run hive in local mode to debug a script (possibly with UDF) so > that I could attach it to eclipse for debugging. > > I followed the advice of > http

local-mode doesn't work?

2014-06-18 Thread Yang
or information Task failed! Task ID: Stage-1 Logs: /tmp/yyang15/hive.log FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask it seems that hive is trying to access a local jar but assuming the base dir to be HDFS. did anybody get this to work? Thanks Yang

problem with delimiters (control A)

2014-05-17 Thread Jack Yang
Hi All, I have a local file called mytest.txt (restored in hdfs already). The content is like this: $ cat -A HDFSLOAD_DIR/mytest.txt 49139801^A25752451^Aunknown$ 49139801^A24751754^Aunknown$ 49139801^A2161696^Anice$ To load this raw data above, I then defined the table like this in HQL: create

Re: HiveMetaStoreClient only sees one of my DBs ?

2013-12-30 Thread Yang
e ip=unknown-ip-addr > cmd=get_all_databases > (org.apache.hadoop.hive.metastore.HiveMetaStore.audit) > default > > why is it showing only 1 db? what settings of default are different from > the others to enable it to be shown? also I wonder how is that HiveConf() > initialized ? how does it even know the hive port and config settings ? is > it hardcoded to /etc/hive/conf/hive-site.xml ? > > thanks > Yang

HiveMetaStoreClient only sees one of my DBs ?

2013-12-30 Thread Yang
ized ? how does it even know the hive port and config settings ? is it hardcoded to /etc/hive/conf/hive-site.xml ? thanks Yang

How to make hive run multiple queries simultaneously?

2013-11-20 Thread Lin Yang
Hi, all, I'm running Hive-0.9.0-cdh4.1.2 on a cluster consisting of 3 nodes, and I was wondering how I could run multiple queries simultaneously. BTW, I have set "hive.exec.parallel" to true in hive-site.xml, but it doesn't work. Thanks. -- Lin Yang

Re: NPE org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable

2013-10-12 Thread xinyan Yang
ry hive trunk? Looks like it is a bug fixed after the > release of 0.11. > > Thanks, > > Yin > > > On Fri, Oct 11, 2013 at 9:21 AM, xinyan Yang wrote: > >> Development environment,hive 0.11、hadoop 1.0.3 >> >> >> 2013/10/11 xinya

Re: NPE org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable

2013-10-11 Thread xinyan Yang
Development environment: hive 0.11, hadoop 1.0.3 2013/10/11 xinyan Yang > Hi, > when I run this SQL, it fails. Can anyone give me some advice? > > > select

NPE org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable

2013-10-11 Thread xinyan Yang
Hi, when I run this SQL, it fails. Can anyone give me some advice? select e.udid as udid,e.app_id as app_id from acorn_3g.ClientChannelDefine cc join ( select udid,app_id,from_

Re: UDF error?

2013-09-30 Thread Yang
ok I found the reason, as I modified the jar file, though I re-ran "ADD .MyUdf.jar; create temporary function ; ", it doesn't take effect. I have to get out of hive session, then rerun these again. On Mon, Sep 30, 2013 at 1:47 PM, Yang wrote: > I wrote a super s

Re: UDF error?

2013-09-30 Thread Yang
= false) > public class UDFRowSequence extends UDF { > > Hope this helps! > Tim > > > > On Mon, Sep 30, 2013 at 10:47 PM, Yang wrote: > >> I wrote a super simple UDF, but got some errors: >> >> UDF: >> >> package yy; >> im

UDF error?

2013-09-30 Thread Yang
_() so I'm declaring a UDF with arg of long, so that should work for a bigint (more importantly it's complaining not long vs bigint, but bigint vs void ). I tried changing both to int, same failure thanks! yang

Re: how to treat an existing partition data file as a table?

2013-09-30 Thread Yang
thanks guys, I found that the table is not partitioned, so I guess no way out... On Mon, Sep 30, 2013 at 9:31 AM, Olga L. Natkovich wrote: > You need to specify a table partition from which you want to sample. > > Olga > > *From:* Yang [m

how to treat an existing partition data file as a table?

2013-09-29 Thread Yang
" pointing to only one of the data files used by the original table mytable ? this way the total files to be scanned is much smaller. thanks! yang

Re: Exception comes out when counting the rows

2013-04-23 Thread YouPeng Yang
managed > tables and external tables locations. > > Regards, > Ramki. > > > On Mon, Apr 22, 2013 at 2:01 AM, YouPeng Yang > wrote: > >> >> Hi hive users >> >> Sorry for missing the title on the previous mail. >> >> This is my firs

What's the URL of Hive IRC Channel

2013-04-22 Thread YouPeng Yang
Hi, does anyone know the exact URL of the Hive IRC channel? I tried http://irc.freenode.net/#hive, and it does not work. Regards

Exception comes out when counting the rows

2013-04-22 Thread YouPeng Yang
Hi hive users, sorry for missing the title on the previous mail. This is my first time posting a question here. I got an exception when I counted the rows of my hive table after I had loaded the data: hive>create EXTERNAL TABLE NMS_CMTS_CPU_CDX_TEST (CMTSID INT,MSEQ INT,GOTTIME BIGI

[no subject]

2013-04-22 Thread YouPeng Yang
Hi hive users, this is my first time posting a question here. I got an exception when I counted the rows of my hive table after I had loaded the data: hive>create EXTERNAL TABLE NMS_CMTS_CPU_CDX_TEST (CMTSID INT,MSEQ INT,GOTTIME BIGINT,CMTSINDEX INT,CPUTOTAL INT,DESCR STRING) ROW FORMA

substr() index out of range exception in hive 0.8.1

2013-01-24 Thread Yu Yang
Hi All, I'm working on hive 0.8.1 and have met the following problem. I use the function substr(item,-4,1) to process one item in a hive table, and there is one row in which the content of the item is "ba_s0一朝忽觉京梦醒,半世浮沉雨打萍--衣俊卿小n实录010", and then the job failed. I checked the task log, it appeared java.lang.Strin

RE: Hive double-precision question

2012-12-07 Thread Lauren Yang
This sounds like https://issues.apache.org/jira/browse/HIVE-2586 , where comparing float/doubles will not work because of the way floating point numbers are represented. Perhaps there is a comparison between a float and double type because of some internal representation in the Java library, o
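The float-versus-double mismatch described in this reply can be reproduced outside Hive. The following is a minimal Python sketch, assuming only IEEE 754 single and double precision; it does not call Hive, and the helper name `to_float32` is made up for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to single precision, then widen it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


as_double = 0.1
as_float = to_float32(0.1)

# The widened float still carries the single-precision rounding error,
# so an exact equality test against the double fails.
print(as_float == as_double)             # False
# Comparing within a tolerance is the usual workaround.
print(abs(as_float - as_double) < 1e-6)  # True
```

Values that are exactly representable in both widths, such as 0.5, still compare equal, which is why this kind of problem tends to show up only for some rows.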

RE: hive-site.xml not found on classpath

2012-11-30 Thread Lauren Yang
You can see if the classpath is being passed correctly to hadoop by putting in an echo statement around line 150 of the hive cli script where it passes the CLASSPATH variable to HADOOP_CLASSPATH. # pass classpath to hadoop export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${CLASSPATH}" You could also

Re: How to apply data mining on Hive?

2012-06-09 Thread jason Yang
might fit your bill too. >> >> Good luck, >> Mark >> >> On Fri, Jun 8, 2012 at 1:25 AM, jason Yang wrote: >> >>> Hi, Mark. >>> >>> Thank you for your reply. >>> >>> I have read the User Guide, but I'm still wonde

Re: How to apply data mining on Hive?

2012-06-08 Thread jason Yang
Hi, Sreenath all right, I will check it out. thank you~ 2012/6/8 Sreenath Menon > Kindly check out Apache Mahout and whether it satisfies your needs. > -- YANG, Lin

Re: How to apply data mining on Hive?

2012-06-07 Thread jason Yang
ossible. > > This is a good place to get started and learn more about Hive: > https://cwiki.apache.org/confluence/display/Hive/GettingStarted > > Welcome and good luck! > > Mark > > > On Thu, Jun 7, 2012 at 10:10 PM, jason Yang wrote: > >> Hi, dear friends. >

How to apply data mining on Hive?

2012-06-07 Thread jason Yang
Hi, dear friends. I was wondering what the popular way to do data mining on Hive is. Since the data in Hive is distributed over the cluster, is there any tool or solution that could parallelize the data mining? Any suggestion would be appreciated. -- YANG, Lin

RE: Specifying a double precision in HiveQL

2011-02-24 Thread Paul Yang
Hacky, but maybe something like select concat( cast(num as int), '.' , cast(abs(num)*100 as int) % 100) from (select 1.234 as num from src limit 1) a; ? -Original Message- From: Aurora Skarra-Gallagher [mailto:aur...@yahoo-inc.com] Sent: Thursday, February 24, 2011 11:31 AM To: user@hi
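The arithmetic behind the concat trick in this reply can be checked without a Hive session. The sketch below mirrors it in Python, assuming the casts truncate toward zero the way Python's `int()` does; the function name and the `pad` parameter are hypothetical additions for illustration, not part of the original query.

```python
def two_decimals_hack(num: float, pad: bool = False) -> str:
    """Mimic concat(cast(num as int), '.', cast(abs(num)*100 as int) % 100)."""
    whole = int(num)                   # cast(num as int)
    frac = int(abs(num) * 100) % 100   # cast(abs(num)*100 as int) % 100
    # pad=True zero-pads the fraction; the original expression does not
    return f"{whole}.{frac:02d}" if pad else f"{whole}.{frac}"


print(two_decimals_hack(1.234))           # 1.23
print(two_decimals_hack(1.05))            # 1.5
print(two_decimals_hack(1.05, pad=True))  # 1.05
```

The second call shows one limit of the hack as written: a fractional part below 10 loses its leading zero. The result is also truncated rather than rounded.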

RE: OutOfMemory errors on joining 2 large tables.

2011-02-22 Thread Paul Yang
Have you taken a look at the distribution of your join keys? If there are a couple join keys that occur much more frequently than others, the reducers handling those keys will have more load and may be subject to an OOM. -Original Message- From: Bennie Schut [mailto:bsc...@ebuddy.com] S
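One way to look at the join-key distribution suggested here is to count how often each key occurs in a sample of rows. This is a minimal Python sketch on an in-memory list; the function name `key_skew` and the sample data are invented for illustration, and on a real table the same count would more likely come from a GROUP BY query.

```python
from collections import Counter


def key_skew(keys, top=3):
    """Return (key, count, share of all rows) for the most frequent keys."""
    counts = Counter(keys)
    total = sum(counts.values())
    return [(k, n, n / total) for k, n in counts.most_common(top)]


# Hypothetical sample: one key accounts for most of the rows
sample = ["a"] * 8 + ["b", "c"]
for key, n, share in key_skew(sample):
    print(key, n, f"{share:.0%}")
```

A key that holds a large share of the rows means a single reducer receives most of the data for the join, which matches the load imbalance described in the reply.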

RE: Only a single expression in the SELECT clause is supported with UDTF's

2010-11-08 Thread Paul Yang
In your original query, I think if you put parentheses around p,k it should have worked: select taxonDensityUDTF(kingdom_concept_id, phylum_concept_id) as (p,k) ... -Original Message- From: Tim Robertson [mailto:timrobertson...@gmail.com] Sent: Monday, November 08, 2010 5:53 AM To: user

RE: Search the newest partition of one table in view

2010-10-20 Thread Paul Yang
I don't think it is possible to use just a view to get that effect. If you're generating the query programmatically, it'd be possible to either use the Thrift service or process the output of 'show partitions' to get the latest partition date. From: lei liu [mailto:liulei...@gmail.com] Sent: We
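Processing the output of 'show partitions' could look roughly like the Python sketch below. It assumes the output has one partition per line in `name=value` form (with `/` between levels), that the partition column is called `ds`, and that its values are ISO-style dates that sort correctly as strings. The column name, the date format, and the function name are assumptions for illustration, not details from the thread.

```python
from typing import Optional


def latest_partition(output: str, column: str = "ds") -> Optional[str]:
    """Return the largest value of `column` found in 'show partitions' output."""
    values = []
    for line in output.splitlines():
        # A multi-level partition looks like "ds=2010-10-19/hr=01"
        for part in line.strip().split("/"):
            name, sep, value = part.partition("=")
            if sep and name == column:
                values.append(value)
    # ISO dates compare correctly as plain strings
    return max(values) if values else None


listing = "ds=2010-10-18\nds=2010-10-20\nds=2010-10-19\n"
print(latest_partition(listing))  # 2010-10-20
```

The returned value can then be substituted into the generated query's WHERE clause, which is the programmatic approach the reply describes.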

RE: USING .. AS column names

2010-10-13 Thread Paul Yang
For insert overwrite, the column names don't matter - the order of the columns dictates how they are inserted into the table so the behavior is not specific to the transform clause. Also, when you use AS with transform, you're just assigning column aliases to the output of the transform. For exa