I tried the following, and it works:
SessionState session = new SessionState(conf); // conf must contain enough
// information, such as access to the Hive metastore database (the
// MySQL one)
session.setIsSilent(true);
session.setIsVerbose(true);
SessionState.start(session);
Driver driver = new
You can directly embed the Hive client library in your Java program and
use it without running a Hive service. My blog post at
http://csgrad.blogspot.com/2010/04/to-use-language-other-than-java-say.html
describes how to run Hive queries from Jython. Something very similar
should work for Java.
D
Hello,
I would like to embed Hive (client) in my application to execute a sequence
of queries. Right now I do it using the CLI (hive -f myScript.sql). The problem
with this approach is that I do not get a return / error code to know the
status of the query programmatically.
So my question is what is the bes
Hello all,
I am new to Amazon services and have started learning about
Amazon EMR, S3 and EC2. I have been reading about EMR because I want to
deploy my task using Amazon EMR only. But I am facing some problems
while running the sample Hive program:
1) I created the bucket in which I s
Hi Ruben,
Looks like pastie is down (http://pastie.org/) because of recent DDoS attacks.
Can you please post your queries elsewhere?
Mark
Mark Grover, Business Intelligence Analyst
OANDA Corporation
www: oanda.com www: fxtrade.com
e: mgro...@oanda.com
"Best Trading Platform" - World Finance
Hi, I'm running some Hive jobs on Amazon Elastic MapReduce.
The versions for Hive and Hadoop as reported by the instance:
hadoop@ip-10-64-33-113:~$ hive --version
Hive version 0.7.1.4
hadoop@ip-10-64-33-113:~$ hadoop version
Hadoop 0.20.205
I'm able to specify a table and execute a query using
I figured out how to do this. The problem is that you need to add the
number of fields you want to be returned to the
StandardStructObjectInspector that is returned in the initialize()
method of the UDTF.
-Thomas
From: Ryabin, Thomas
Sent: Tuesday, April 24, 2012 12:46 PM
To: user@hive.apa
Hi Bhavesh,
If you copy your jar over to the master node of your EMR cluster and install Sqoop
like Kyle suggested, you can run your jar on the master node, just like you did
on your local cluster before. Just make sure that the Hive JDBC drivers are
available to the jar and that you connect to the Hive
Hi,
Is it ever possible to omit the partition variable name when discovering
partitions? I'm sure I've seen this demonstrated, but of course when it's
needed, I can't find it. Can anyone clarify?
I have a number of date-named directories in Amazon AWS S3, containing data
stored in sequen
Hello,
I have created a custom UDTF called "test_udtf". This function takes 4
parameters. Right now I can use this function like so in the following
query:
SELECT test_udtf(product, store, 'test0', 'test1') as (col0, col1, col2,
col3) from products join stores;
The problem is that I want t
I got the (rather big) log here in a github gist:
https://gist.github.com/2480893
And I also attached the plan.xml it was using to the gist.
When loading the members_map (11mil records, 320mb, 30b per record), it seems
to take about 198b per record in the members_map, resulting in crashing aro
Hi Soren
If you can collect or organize the log files into date-based subdirectories in S3,
then you can partition the table by date. With partitions you can query a
subset of your data based on date. You can organize the data into date folders
during Flume ingestion itself.
Regards
Bejoy K
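
As a rough sketch of the date-based layout described above, the following Java snippet generates one ALTER TABLE ... ADD PARTITION statement per date folder. The table name `logs`, the partition column `dt`, and the base path `s3://my-bucket/logs` are hypothetical placeholders and are not taken from this thread. It also assumes the table was created as an external table partitioned by that column.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class DatePartitionSketch {
    // Builds one ADD PARTITION statement per day in [start, end], inclusive.
    // Table name, partition column and base path are hypothetical placeholders.
    static List<String> addPartitionStatements(String table, String column,
                                               String basePath,
                                               LocalDate start, LocalDate end) {
        List<String> statements = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            // LocalDate prints as yyyy-MM-dd, which doubles as the folder name here.
            statements.add(String.format(
                "ALTER TABLE %s ADD PARTITION (%s='%s') LOCATION '%s/%s/'",
                table, column, d, basePath, d));
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> statements = addPartitionStatements(
            "logs", "dt", "s3://my-bucket/logs",
            LocalDate.of(2012, 4, 23), LocalDate.of(2012, 4, 24));
        for (String s : statements) {
            System.out.println(s + ";");
        }
    }
}
```

The generated statements can be saved to a script and run with the CLI; a query that filters on the partition column then only reads the matching date folders.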
Hi Hive community
We are collecting huge amounts of data into Amazon S3 using Flume.
In Elastic MapReduce, we have so far managed to create an external Hive
table on JSON-formatted gzipped log files in S3 using a customized
serde. The log files are collected and stored in one single folder wi
Hi Ruben
The operation you are seeing in your log is the preparation of the hash table for
the smaller table. This hash table file is compressed and loaded into
the Distributed Cache, and from there it is used for map-side joins. From your
console log the hash table size/data size has gone to nearly 1.5
This seems like it would work.
Thanks,
Thomas
-Original Message-
From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Tuesday, April 24, 2012 3:31 AM
To: user@hive.apache.org
Subject: Re: Possible to use regex column specification with WHERE
clause?
you may want to have a programmat
Looks like a good use case.
Created an improvement request:
https://issues.apache.org/jira/browse/HIVE-2980
On Tue, Apr 24, 2012 at 9:10 AM, Sukhendu Chakraborty <
sukhendu.chakrabo...@gmail.com> wrote:
> Thanks Nitin. I am aware of what Hive is doing. The question is, is it
> okay not return an
Here are both tables:
$ hdfs -count /user/hive/warehouse/hyves_goldmine.db/members_map
1 1 247231757
hdfs://localhost:54310/user/hive/warehouse/hyves_goldmine.db/members_map
$ hdfs -count /user/hive/warehouse/hyves_goldmine.db/visit_stats
442 441
Shashwat,
I think he wanted to put a regex in the WHERE clause to derive a column
name, instead of in the SELECT clause.
On Tue, Apr 24, 2012 at 4:45 PM, shashwat shriparv <
dwivedishash...@gmail.com> wrote:
> Use this to generate probable strings Some examples are here :
>
> regexp_extract(s, '^([a-z
Use this to generate probable strings. Some examples are here:
regexp_extract(s, '^([a-zA-Z0-9]{2}\.)?([a-zA-Z0-9]{3}-?){3}')
select regexp_extract(request, ' (\\S*) HTTP', 1) from logfile;
select regexp_extract('junk:text:ua123','ua[0-9]+',0) from dual
and pass it in your query with an OR condition.
This operation is erroring out on the Hive client itself before starting a
map, so splitting to mappers is out of the question.
Can you do a dfs count for the members_map table's HDFS location and tell us
the result?
On Tue, Apr 24, 2012 at 2:06 PM, Ruben de Vries wrote:
> Hmm I must be doing something
Hmm I must be doing something wrong, the members_map table is 300ish MB.
When I execute the following query:
SELECT
/*+ MAPJOIN(members_map) */
date_int,
members_map.gender AS gender,
'generic',
COUNT( memberId ) AS unique,
SUM( `generic`['count'] ) AS count,
SUM( `gen
Hi Ruben
The map join hint is provided to Hive using the "MAPJOIN" keyword:
SELECT /*+ MAPJOIN(b) */ a.key, a.value FROM a join b on a.key = b.key
To use a map-side join, some Hive configuration properties need to be enabled.
For plain map-side joins:
hive>SET hive.auto.convert.join=true;
Latest versions
Hi Mark,
Looks great to me! Thanks for adding it.
-Justin
On Tue, Apr 24, 2012 at 5:55 AM, Mark Grover wrote:
> Added a tiny blurb here:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-UDFinternals
> Comments/suggestions welcome!
>
> Thanks for brin
You may want to take a programmatic approach to this and
provide Hive with a final query.
You can solve this by resolving your regular expression outside
Hive and then providing the resulting query to Hive.
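
A minimal sketch of that programmatic approach, in Java: the regular expression is matched against a list of column names in the application, and the resulting WHERE clause is assembled before the query is handed to Hive. The table name, column names, regex and condition below are hypothetical; in practice the column list would have to come from the table's schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class RegexWhereSketch {
    // Selects the column names that fully match the regex and ORs one
    // predicate per matching column into a WHERE clause.
    // All names passed in are illustrative; nothing here calls Hive itself.
    static String buildQuery(String table, List<String> columns,
                             String columnRegex, String condition) {
        Pattern pattern = Pattern.compile(columnRegex);
        List<String> predicates = new ArrayList<>();
        for (String column : columns) {
            if (pattern.matcher(column).matches()) {
                predicates.add(column + " " + condition);
            }
        }
        if (predicates.isEmpty()) {
            throw new IllegalArgumentException("no column matches " + columnRegex);
        }
        return "SELECT * FROM " + table + " WHERE " + String.join(" OR ", predicates);
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("id", "price_usd", "price_eur", "name");
        System.out.println(buildQuery("products", columns, "price_.*", "> 100"));
    }
}
```

The string returned by `buildQuery` is an ordinary query with no regex left in it, so Hive only ever sees the final form.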
On 4/23/12, Ryabin, Thomas wrote:
> Hi,
>
>
>
> I know that it is possible to
If you are doing a map-side join, make sure the table members_map is
small enough to hold in memory.
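
A back-of-the-envelope check along these lines, sketched in Java. The record count (11 million) and the roughly 198 bytes per in-memory record are the figures reported elsewhere in this thread; the 50% safety fraction is an arbitrary assumption for illustration, and the heap that matters is that of the JVM building the hash table, not necessarily the one running this snippet.

```java
public class MapJoinMemorySketch {
    // Estimated in-memory size of the hash table: records times bytes per record.
    static long estimatedBytes(long records, long bytesPerRecord) {
        return Math.multiplyExact(records, bytesPerRecord);
    }

    // True if the estimate fits within the given fraction of the heap.
    // The fraction is a hypothetical safety margin, not a Hive setting.
    static boolean fitsInHeap(long estimatedBytes, long heapBytes, double safetyFraction) {
        return estimatedBytes <= (long) (heapBytes * safetyFraction);
    }

    public static void main(String[] args) {
        long estimate = estimatedBytes(11_000_000L, 198L); // figures from the thread
        long heap = Runtime.getRuntime().maxMemory();      // heap of this JVM only
        System.out.println("estimated hash table size: " + estimate + " bytes");
        System.out.println("fits in half of this JVM's heap: "
            + fitsInHeap(estimate, heap, 0.5));
    }
}
```

With those figures the estimate is 2,178,000,000 bytes, or about 2.18 GB, which is far more than the roughly 300 MB the table occupies on disk.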
On 4/24/12, Ruben de Vries wrote:
> Wow thanks everyone for the nice feedback!
>
> I can force a mapside join by doing /*+ STREAMTABLE(members_map) */ right?
>
>
> Cheers,
>
> Ruben de Vries
>
> ---