unsubscribe
Here's the version information:
Hive: 1.2.1
Tez: 0.8.5
Hadoop: 2.6.0-cdh5.8.3
Please let me know if you need more information.
Regards,
Xin
From: "Yang, Xin" <xiy...@visa.com>
Date: Thursday, June 29, 2017 at 11:48 AM
To: "user@hive.apache.org"
Ah, never mind. I found that we are using an old version of Hive without this
feature.
On Tue, Mar 8, 2016 at 9:57 AM, Yang wrote:
> by documentation I'm referring to this:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins#LanguageManualJoins-MapJoinRestriction
On Tue, Mar 8, 2016 at 9:31 AM, Yang wrote:
> From the documentation it says that if my tables are small enough and I
> set the convert-join parameters, without the join hints Hive should be able
> to convert the joins into one mapjoin in one MR job
>
> But in practice I found that it always end
From the documentation it says that if my tables are small enough and I set
the convert-join parameters, without the join hints Hive should be able to
convert the joins into one mapjoin in one MR job.
But in practice I found that it always ends up in 2 MR jobs (2 map joins).
What is wrong?
thanks!
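For reference, the auto-conversion discussed above is governed by a few settings; a minimal sketch (the size threshold is an example value only):

```sql
SET hive.auto.convert.join=true;
-- merge multiple map joins into a single map-only job when the small tables fit
SET hive.auto.convert.join.noconditionaltask=true;
-- combined size of all small tables, in bytes (illustrative value)
SET hive.auto.convert.join.noconditionaltask.size=10000000;
```

If the combined small-table size exceeds the threshold, Hive falls back to separate join stages, which would match the 2-job behavior described above.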
On Tue, Dec 29, 2015 at 12:14 PM, Edward Capriolo
wrote:
> hive --service lineage 'hql' exists, I believe.
>
> On Tue, Dec 29, 2015 at 3:05 PM, Yang wrote:
>
>> I'm trying to create a utility to parse out the data lineage (i.e. DAG
>> depe
r with the parser code structure of hive, could
anybody give me some tips on where to start?
(I see the .g files, but I'm not sure where the rest is. I am more familiar
with the ASTVisitor paradigm in ANTLR, but can't find similar files in the
parser dir)
thanks
Yang
user access. You can configure it in server mode or use another metadata
store like MySQL. Here's the tutorial for how to configure Derby server
mode:
https://cwiki.apache.org/confluence/display/Hive/HiveDerbyServerMode
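For the MySQL alternative, these are the relevant hive-site.xml properties; host, database name, and credentials below are placeholders:

```xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-host:3306/metastore</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>changeme</value>
</property>
```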
On Tue, Jul 7, 2015 at 1:50 PM, Jack Yang
mailto:j...@uow.edu.au&g
Hi all,
I would like to have multiple users access Hive.
Has anyone tried that before?
Is there any tutorial or link I can study from?
Best regards,
Jack
Sorry, my bad: the Hive doc says 38 digits; I misread that.
I didn't intend to send the last email and thought it was in the "Drafts"
box only...
DECIMAL type was
introduced? Why not follow the Oracle convention? Right now, what is my best
strategy for copying a table with a NUMBER column from Oracle to Hive?
Thanks
Yang
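For reference, Hive's DECIMAL supports up to 38 digits of precision, so an Oracle NUMBER(p,s) column can usually map directly. A sketch (table and column names are hypothetical; the DECIMAL(precision, scale) syntax requires Hive 0.13+):

```sql
-- Illustrative mapping of an Oracle NUMBER(18,4) column into Hive
CREATE TABLE orders_copy (
  order_id BIGINT,
  amount   DECIMAL(18,4)   -- max precision in Hive is 38 digits
);
```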
Ah... found out: my issue is that Hive 0.13 doesn't handle this correctly.
Could be a bug.
With 0.14, it works.
BTW, the UNION [int, null] translates to Parquet as a field "optional int32
myfieldName"; I found this by calling ParquetFileReader.readFooter().
On Thu, Feb 19, 2015 at 12:08 PM, Yang wrote:
> Szehon:
>
> another question related to the types support:
>
> if I convert an Avro field of UNION to Parquet, does Hive support that
> UNION field? A UNION is needed because an Avro field cannot take NULL, and I
> have to define every field a
Szehon:
another question related to the types support:
If I convert an Avro field of UNION to Parquet, does Hive support that
UNION field? A UNION is needed because an Avro field cannot take NULL, and I
have to define every field as a UNION of the original type and NULL.
Thanks
Yang
On Mon, Feb 9
t;string","doc":""},{"name":"nullableInt","type":["int","null"],"doc":""}],"version":"1424373511441"}
the following is the parquet hive table def. I also attached the sample
par
Thanks Szehon!
On Tue, Feb 3, 2015 at 7:33 PM, Szehon Ho wrote:
> Hi Yang
>
> I saw you posted this question in several places, I gave an answer in
> HIVE-6394 as I saw that one first, to the timestamp query.
>
> Can't speak about about date support, as its not in m
select * from parquet_test gives just NULL.
I tried to create an internal table with Parquet format; it does work,
and selection works too.
But then, after I point the location of an external table to that new
internal table, it still selects NULL on output.
Thanks
Yang
g/jira/browse/HIVE-8119
are we going to have a different on-disk binary encoding than the "int32"
specified in the above doc?
thanks
Yang
iveconf myargs="'1','2','3','4'"
or ='1,2,3,4'
neither seems to work
what is the best way to do this?
thanks!
yang
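One approach that generally works for passing a list of arguments is Hive's variable substitution: pass the list as one quoted string and splice it in with ${hiveconf:...}. A sketch (table and variable names are illustrative):

```sql
-- Invoked as: hive --hiveconf myargs="'1','2','3','4'" -f query.hql
SELECT *
FROM my_table                      -- hypothetical table
WHERE id IN (${hiveconf:myargs});  -- expands to '1','2','3','4' before parsing
```

The substitution is purely textual and happens before the query is parsed, which is why the quoting has to be embedded in the value itself.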
antee correctness of data.
If there is a conflict, would "INSERT OVERWRITE PARTITION" get conflicts?
The different processes indeed process different partitions.
thanks
Yang
I have 400k rows in table A, about 50 bytes each row. Now I want to split
all the rows in A and insert into B, which is the same table but
partitioned on the row_id.
I fired off my Hive query, but it only generated 1 mapper and 1 reducer,
so it's very slow. What settings can I set to use more ma
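For the split-and-insert described above, a dynamic-partition insert with an explicit reducer count is one way to get more parallelism. A sketch, assuming hypothetical column names and an example reducer count:

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET mapred.reduce.tasks=32;        -- example value; tune for the cluster

INSERT OVERWRITE TABLE B PARTITION (row_id)
SELECT col1, col2, row_id          -- hypothetical column names
FROM A
DISTRIBUTE BY row_id;              -- spread rows across the reducers
```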
I'm very surprised to find that Hive actually does not create a _SUCCESS in
its output file dir.
I had thought that since every MR job by default creates a _SUCCESS, and
Hive is compiled into MR anyway, Hive kind of naturally behaves the same as
any MR program. I verified that the wc example and
if we have a huge table, and every 1 hour only 1% of that has some updates,
it would be a huge waste to slurp in the whole table through MR job and
write out the new table.
instead, if we store this table in HBASE, and use the current HBase+Hive
integration, as long as we can do upsert, then we ca
e partitions already.
On Thu, Jul 24, 2014 at 2:03 PM, Yang wrote:
> if I do a join of a table based on txt file and a table based on HBase,
> and say the latter is very large, is HIVE smart enough to utilize the HBase
> table's index to do the join, instead of implementing thi
, and
then the matching items found out through the reducer?
thanks
Yang
JqRISUQ&bvm=bv.71198958,d.cGU
The second one suggests that you have to set the resource manager instead of
leaving it empty.
Since we do have a valid server value set for
yarn.resourcemanager.address, and the above error only appears about 20% of
the time, does it mean that our ResourceManager is unstable?
thanks
Yang
s log?
>>
>>
>> On 19 July 2014 04:38, Yang wrote:
>>
>>> thanks guys. anybody knows what generates the log like "
>>> myuser_20140716143232_d76043ed-1c4b-42a0-bf0a-2816377a6a2a.log" ? I
>>> checked our application code, it doesn'
2014-07-18 15:03:37,774 INFO mr.ExecDriver
(SessionState.java:printInfo(537)) - Execution log at:
/tmp/myuser/myuser_2014071815030
3_56bf6bb0-db30-4dbc-807c-9023ce4103f4.log
2014-07-18 15:03:37,864 WARN conf.Configuration
(Configuration.java:loadProperty(2358)) -
file:/tmp/myuser/hive_2014-07-18_
;> hive.log.dir=
>>
>> The default value of this property is ${java.io.tmpdir}/${user.name}.
>>
>> Thanks,
>> Satish
>>
>>
>> On Thu, Jul 17, 2014 at 11:58 PM, Yang wrote:
>>
>>> we just moved to hadoop2.0 (HDP2.1 distro). it turns out that
I change the location of both the logs , by some per-script
params ? (i.e. we can't afford to change the system hive-site.xml or
/etc/hive/conf etc)
Thanks a lot
Yang
es to local is less of a chore
since PIG does not need the metastore
On Wed, Jun 18, 2014 at 11:14 AM, Yang wrote:
> I tried to run hive in local mode to debug a script (possibly with UDF) so
> that I could attach it to eclipse for debugging.
>
> I followed the advice of
> http
or information
Task failed!
Task ID:
Stage-1
Logs:
/tmp/yyang15/hive.log
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
It seems that Hive is trying to access a local jar but assuming the base
dir to be HDFS. Did anybody get this to work?
Thanks
Yang
Hi All,
I have a local file called mytest.txt (already stored in HDFS). The content
is like this:
$ cat -A HDFSLOAD_DIR/mytest.txt
49139801^A25752451^Aunknown$
49139801^A24751754^Aunknown$
49139801^A2161696^Anice$
To load this raw data above, I then defined the table like this in HQL:
create
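The DDL cut off above would, under the usual conventions (^A in the dump is \001, Hive's default field delimiter; the column names below are guesses from the sample rows), look something like:

```sql
CREATE TABLE mytest (
  user_id BIGINT,    -- hypothetical column names
  item_id BIGINT,
  label   STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE;

LOAD DATA INPATH 'HDFSLOAD_DIR/mytest.txt' INTO TABLE mytest;
```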
e ip=unknown-ip-addr
> cmd=get_all_databases
> (org.apache.hadoop.hive.metastore.HiveMetaStore.audit)
> default
>
> why is it showing only 1 DB? What settings of 'default' are different from
> the others to enable it to be shown? Also, I wonder how that HiveConf() is
> initialized. How does it even know the Hive port and config settings? Is
> it hardcoded to /etc/hive/conf/hive-site.xml?
>
>
> thanks
> Yang
>
ized ? how does it even know the hive port and config settings ? is
it hardcoded to /etc/hive/conf/hive-site.xml ?
thanks
Yang
Hi, all,
I'm running Hive-0.9.0-cdh4.1.2 on a cluster consisting of 3 nodes, and I was
wondering how I can run multiple queries simultaneously.
BTW, I have set "hive.exec.parallel" to true in hive-site.xml, but it
doesn't work.
Thanks.
--
Lin Yang
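One thing worth noting about the question above: hive.exec.parallel only parallelizes independent stages within a single query; running several queries at the same time requires separate CLI sessions (or a HiveServer). The setting itself, for reference (the thread count is an example value):

```sql
SET hive.exec.parallel=true;
SET hive.exec.parallel.thread.number=8;  -- example value
```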
ry hive trunk? Looks like it is a bug fixed after the
> release of 0.11.
>
> Thanks,
>
> Yin
>
>
> On Fri, Oct 11, 2013 at 9:21 AM, xinyan Yang wrote:
>
>> Development environment: Hive 0.11, Hadoop 1.0.3
>>
>>
>> 2013/10/11 xinya
Development environment: Hive 0.11, Hadoop 1.0.3
2013/10/11 xinyan Yang
> Hi,
> When I run this SQL, it fails; can anyone give me some advice?
>
>
> select
Hi,
When I run this SQL, it fails; can anyone give me some advice?
select e.udid as udid,e.app_id as app_id
from acorn_3g.ClientChannelDefine cc
join (
select udid,app_id,from_
OK, I found the reason: I modified the jar file, but even though I re-ran "ADD
.MyUdf.jar; create temporary function ; ", it doesn't take effect.
I have to get out of the Hive session, then rerun these again.
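On versions where the session does pick up a replaced jar, the usual sequence is to delete and re-add the jar, then drop and re-create the temporary function; as noted above, some versions still require a fresh CLI session. Paths are illustrative, and the class name is a guess from the package shown later in the thread:

```sql
DELETE JAR /path/to/MyUdf.jar;      -- remove the stale registration, if any
ADD JAR /path/to/MyUdf.jar;
DROP TEMPORARY FUNCTION IF EXISTS my_udf;
CREATE TEMPORARY FUNCTION my_udf AS 'yy.MyUdf';  -- hypothetical class name
```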
On Mon, Sep 30, 2013 at 1:47 PM, Yang wrote:
> I wrote a super s
= false)
> public class UDFRowSequence extends UDF {
>
> Hope this helps!
> Tim
>
>
>
> On Mon, Sep 30, 2013 at 10:47 PM, Yang wrote:
>
>> I wrote a super simple UDF, but got some errors:
>>
>> UDF:
>>
>> package yy;
>> im
_()
So I'm declaring a UDF with an arg of long, so that should work for a bigint
(more importantly, it's complaining not about long vs. bigint, but bigint vs.
void). I tried changing both to int; same failure.
thanks!
yang
thanks guys, I found that the table is not partitioned, so I guess no way
out...
On Mon, Sep 30, 2013 at 9:31 AM, Olga L. Natkovich wrote:
> You need to specify a table partition from which you want to sample.
>
> Olga
>
> *From:* Yang [m
ot; pointing to only one of the data files
used by the original table mytable ?
This way the total number of files to be scanned is much smaller.
thanks!
yang
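One caveat for the idea above: an external table's LOCATION must be a directory, not a file, so the usual trick is to copy the one data file into its own directory first. A sketch, with hypothetical paths and columns (the schema must match the original table's):

```sql
-- Assumes the single file was copied to /user/yang/sample_dir/ beforehand
CREATE EXTERNAL TABLE mytable_sample (
  col1 STRING,   -- hypothetical columns; must mirror mytable
  col2 INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
LOCATION '/user/yang/sample_dir/';
```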
managed
> tables and external tables locations.
>
> Regards,
> Ramki.
>
>
> On Mon, Apr 22, 2013 at 2:01 AM, YouPeng Yang
> wrote:
>
>>
>> Hi hive users
>>
>> Sorry for missing the title on the previous mail.
>>
>> This is my firs
Hi
Does anyone know the exact URL of the Hive IRC channel?
I tried http://irc.freenode.net/#hive, and it does not work.
Regards
Hi hive users
Sorry for missing the title on the previous mail.
This is my first time to post a question here.
I have gotten an exception when I count the rows of my hive table after I
have loaded the data:
hive>create EXTERNAL TABLE NMS_CMTS_CPU_CDX_TEST (CMTSID INT,MSEQ
INT,GOTTIME BIGI
Hi hive users
This is my first time to post a question here.
I have gotten an exception when I count the rows of my hive table after I
have loaded the data:
hive>create EXTERNAL TABLE NMS_CMTS_CPU_CDX_TEST (CMTSID INT,MSEQ
INT,GOTTIME BIGINT,CMTSINDEX INT,CPUTOTAL INT,DESCR STRING) ROW FORMA
Hi All,
I'm working on Hive 0.8.1 and met the following problem.
I use the function substr(item, -4, 1) to process one item in a Hive table,
and there is one row in which the content of the item is
"ba_s0一朝忽觉京梦醒,半世浮沉雨打萍--衣俊卿小n实录010"; then the job failed.
I checked the task log; it appeared
java.lang.Strin
This sounds like https://issues.apache.org/jira/browse/HIVE-2586 , where
comparing float/doubles will not work because of the way floating point numbers
are represented.
Perhaps there is a comparison between a float and double type because of some
internal representation in the Java library, o
You can see if the classpath is being passed correctly to hadoop by putting in
an echo statement around line 150 of the hive cli script where it passes the
CLASSPATH variable to HADOOP_CLASSPATH.
# pass classpath to hadoop
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${CLASSPATH}"
You could also
might fit your bill too.
>>
>> Good luck,
>> Mark
>>
>> On Fri, Jun 8, 2012 at 1:25 AM, jason Yang wrote:
>>
>>> Hi, Mark.
>>>
>>> Thank you for your reply.
>>>
>>> I have read the User Guide, but I'm still wonde
Hi, Screenath
all right, I will check it out. thank you~
2012/6/8 Sreenath Menon
> Kindly check out Apache Mahout and whether it satisfies your needs.
>
--
YANG, Lin
ossible.
>
> This is a good place to get started and learn more about Hive:
> https://cwiki.apache.org/confluence/display/Hive/GettingStarted
>
> Welcome and good luck!
>
> Mark
>
>
> On Thu, Jun 7, 2012 at 10:10 PM, jason Yang wrote:
>
>> Hi, dear friends.
>
Hi, dear friends.
I was wondering what's the popular way to do data mining on Hive?
Since the data in Hive is distributed over the cluster, is there any tool
or solution that could parallelize the data mining?
Any suggestion would be appreciated.
--
YANG, Lin
Hacky, but maybe something like
select concat( cast(num as int), '.' , cast(abs(num)*100 as int) % 100) from
(select 1.234 as num from src limit 1) a;
?
-Original Message-
From: Aurora Skarra-Gallagher [mailto:aur...@yahoo-inc.com]
Sent: Thursday, February 24, 2011 11:31 AM
To: user@hi
Have you taken a look at the distribution of your join keys? If there are a
couple join keys that occur much more frequently than others, the reducers
handling those keys will have more load and may be subject to an OOM.
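If skewed keys do turn out to be the cause, Hive's skew-join optimization can divert the hot keys into a separate follow-up job instead of overloading one reducer; the threshold below is illustrative:

```sql
SET hive.optimize.skewjoin=true;
SET hive.skewjoin.key=100000;  -- rows per key before it is treated as skewed
```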
-Original Message-
From: Bennie Schut [mailto:bsc...@ebuddy.com]
S
In your original query, I think if you put parenthesis around p,k it should
have worked:
select taxonDensityUDTF(kingdom_concept_id, phylum_concept_id) as (p,k) ...
-Original Message-
From: Tim Robertson [mailto:timrobertson...@gmail.com]
Sent: Monday, November 08, 2010 5:53 AM
To: user
I don't think it is possible to use just a view to get that effect. If you're
generating the query programmatically, it'd be possible to either use the
Thrift service or process the output of 'show partitions' to get the latest
partition date.
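A sketch of the 'show partitions' approach mentioned above, assuming partition names sort chronologically; table name, partition column, and file names are hypothetical:

```sql
-- From a shell wrapper (illustrative):
--   latest=$(hive -e "SHOW PARTITIONS my_table" | sort | tail -1 | cut -d= -f2)
--   hive --hiveconf latest_dt="$latest" -f query.hql
SELECT * FROM my_table WHERE dt = '${hiveconf:latest_dt}';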
From: lei liu [mailto:liulei...@gmail.com]
Sent: We
For insert overwrite, the column names don't matter - the order of the columns
dictate how they are inserted into the table so the behavior is not specific to
the transform clause.
Also, when you use AS with transform, you're just assigning column aliases to
the output of the transform. For exa
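A minimal illustration of the aliasing described above ('/bin/cat' simply echoes its input, so the output columns equal the input columns; table and column names are illustrative):

```sql
SELECT TRANSFORM (user_id, movie_title)
USING '/bin/cat'
AS (uid STRING, title STRING)   -- aliases for the script's output columns
FROM my_table;
```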