Thanks very much Nick, it's yours for the taking.
-- Lefty
On Tue, Sep 9, 2014 at 2:37 PM, Martin, Nick wrote:
> Lefty, that’s the single best description of indexes/partitions I’ve yet
> encountered. Stealing it.
>
>
>
> Nice ☺
>
>
>
> *From:* Lefty Leverenz [mailto:leftylever...@gmail.com]
>
Thanks a lot for your reply. I changed the following parameters from Cloudera
Manager:
mapred.tasktracker.map.tasks.maximum = 2 (it was 1 before)
mapred.tasktracker.reduce.tasks.maximum = 2 (it was 1 before)
Could you please tell me which parameters to set and how to change them ...
Regards,Am
It uses YARN now, so you need to set your container memory and CPU resources,
then set the MapReduce physical memory and CPU cores. The number of mappers and
reducers is calculated from the resources you give to each mapper and reducer.
Pengcheng
Sent from my iPhone
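A rough sketch of the arithmetic described in the reply above, in Java. The property names in the comments are the standard YARN and MapReduce 2 settings; the class name, method name, and every numeric value are illustrative assumptions, not recommendations. The real scheduler applies further rules (allocation rounding, queue limits), and depending on its configuration it may not count CPU at all, so treat the result as an upper-bound estimate.

```java
public class ContainerEstimate {

    /**
     * Estimates how many task containers fit on one node at the same time.
     * The limit is whichever resource (memory or vcores) runs out first.
     */
    static int containersPerNode(int nodeMemoryMb, int nodeVcores,
                                 int taskMemoryMb, int taskVcores) {
        int byMemory = nodeMemoryMb / taskMemoryMb;
        int byCpu = nodeVcores / taskVcores;
        return Math.min(byMemory, byCpu);
    }

    public static void main(String[] args) {
        // All values below are assumed for illustration only.
        int nodeMemoryMb = 8192;   // yarn.nodemanager.resource.memory-mb
        int nodeVcores = 4;        // yarn.nodemanager.resource.cpu-vcores
        int mapMemoryMb = 1024;    // mapreduce.map.memory.mb
        int mapVcores = 1;         // mapreduce.map.cpu.vcores
        int reduceMemoryMb = 2048; // mapreduce.reduce.memory.mb
        int reduceVcores = 1;      // mapreduce.reduce.cpu.vcores

        System.out.println("Concurrent map containers per node: "
                + containersPerNode(nodeMemoryMb, nodeVcores, mapMemoryMb, mapVcores));
        System.out.println("Concurrent reduce containers per node: "
                + containersPerNode(nodeMemoryMb, nodeVcores, reduceMemoryMb, reduceVcores));
    }
}
```

Under these assumptions, raising the per-task memory lowers the number of concurrent tasks, which is the YARN counterpart of the old fixed map/reduce slot counts.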
> On Sep 9, 2014, at 7:55 PM, Amit
I think one of the issues is the number of MapReduce slots for the cluster. Can
anyone please let me know how to increase the MapReduce slots?
From: amitkrdu...@outlook.com
To: user@hive.apache.org
Subject: PIG heart beat freeze using hue + cdh 5.1
Date: Tue, 9 Sep 2014 17:55:01 -0500
Hi I have
Hi
Can anyone please let me know how to increase the MapReduce slots? I am
getting an infinite heartbeat when I run a Pig script from Hue on Cloudera CDH 5.1.
Thanks, Amit
Hi, I have only 604 rows in the Hive table.
When I run A = LOAD 'revenue' USING org.apache.hcatalog.pig.HCatLoader(); DUMP
A; it starts printing heartbeat messages repeatedly and never leaves this state.
Can someone please help? I am getting the following exception:
2014-09-09 17:27:45,844 [JobControl] IN
Hi Furcy,
Thanks for sharing. I modified my code to mark the map variables
"transient" but still got the same error. This is the code:
public class fun_name extends GenericUDTF {
private PrimitiveObjectInspector stringOI = null;
transient Map> mapObject;
transient Map eventDetails;
On 9/6/14, 9:36 AM, Alain Petrus wrote:
I am wondering whether it is possible to use a Hive index with the ORC format.
Does it make sense?
ORC maintains its own indexes within the file - one index record every
10,000 rows (orc.row.index.stride / orc.create.index).
You can take advantage of it du
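
To make the "one index record every 10,000 rows" figure concrete, here is a small Java sketch of the arithmetic only. It does not use the ORC library; the class and method names are hypothetical, and it assumes row positions are counted from zero within a single stripe.

```java
public class OrcStrideSketch {

    /** Number of index records needed to cover the given rows (ceiling division). */
    static long indexEntries(long rows, long stride) {
        return (rows + stride - 1) / stride;
    }

    /** Zero-based index record (row group) that covers the given row position. */
    static long rowGroupOf(long rowPosition, long stride) {
        return rowPosition / stride;
    }

    public static void main(String[] args) {
        long stride = 10_000L; // value of orc.row.index.stride quoted in the message
        long rows = 25_000L;   // assumed row count, for illustration only

        System.out.println("Index records: " + indexEntries(rows, stride));
        System.out.println("Row 12345 is covered by index record "
                + rowGroupOf(12_345L, stride));
    }
}
```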
Well, here is me talking to myself: but in case someone else runs across
this, I changed the hive metastore connect timeout to 600 seconds (per the
JIRA below for Hive 0.14) and now my problem has gone away. It looks like
the timeout was causing some craziness.
https://issues.apache.org/jira/brows
Lefty, that’s the single best description of indexes/partitions I’ve yet
encountered. Stealing it.
Nice ☺
From: Lefty Leverenz [mailto:leftylever...@gmail.com]
Sent: Tuesday, September 09, 2014 2:28 PM
To: user@hive.apache.org
Subject: Re: Indexes vs Partitions in hive
Others can give technical explanations, but I'll give you a simple analogy:
a book might have an index as well as chapters. Both help you find
information more quickly. The index directs you to particular information,
and chapters partition the book into smaller pieces that are organized
around
Hi Anusha,
1. Well, not quite. What my solution gives you is only a way to move your
data from 's3://some-bucket/pageviews/dt=20120311/key=ACME1234/site=
example.com/Output-file-1' to 's3://some-bucket/pageviews/20120311/ACME1234/
example.com/Output-file-1'. You could actually do this via the linu
I ran with debug logging, and this is interesting, there was a loss of
connection to the metastore client RIGHT before the partition mention
above... as data was looking to be moved around... I wonder if the timing
on that is bad?
14/09/09 12:47:37 [main]: INFO exec.MoveTask: Partition is: {day=nu
Thanks Nishanth. I have thousands of records inserted into dynamically
partitioned tables.
1) Do you think converting the path for every record is the ideal solution,
or did I misunderstand your answer?
2) Is there any way we can set things up so the initial path is formed as we need (only
with Column value
You can use a regex to solve this. If you're using this file path in Java,
you could try something like the following:
String s = "s3://some-bucket/pageviews/dt=20120311/key=ACME1234/site=example.com/Output-file-1";
// Strip each "name=" partition prefix (dt=, key=, site=) from the path.
System.out.println(s.replaceAll("[a-z]{2,4}=", ""));
If you'd
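
A slightly stricter variant of the same regex idea, as a sketch. It removes a "name=" prefix only when it begins a path segment (directly after a '/'), so an '=' elsewhere in a file name is left alone. The class and method names are hypothetical; only java.util.regex from the standard library is used.

```java
import java.util.regex.Pattern;

public class PartitionPathCleaner {

    // Matches "name=" only when it directly follows a '/', i.e. at the start of a segment.
    private static final Pattern KEY_PREFIX = Pattern.compile("(?<=/)[^/=]+=");

    /** Turns .../dt=20120311/key=ACME1234/... into .../20120311/ACME1234/... */
    static String stripPartitionKeys(String path) {
        return KEY_PREFIX.matcher(path).replaceAll("");
    }

    public static void main(String[] args) {
        String s = "s3://some-bucket/pageviews/dt=20120311/key=ACME1234/"
                + "site=example.com/Output-file-1";
        System.out.println(stripPartitionKeys(s));
    }
}
```

Note that this only rewrites the path string; it does not move or rename any objects in S3.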
I am doing a dynamic partition load in Hive 0.13 using ORC files. This has
always worked in the past, both with MapReduce V1 and YARN. I am working
with Mesos now, and trying to troubleshoot this weird error:
Failed with exception AlreadyExistsException(message:Partition already
exists
What's
My table has dynamic partitions and creates the file path as
s3://some-bucket/pageviews/dt=20120311/key=ACME1234/site=example.com/Output-file-1
Is there something I can do so that the path is always
s3://some-bucket/pageviews/20120311/ACME1234/example.com/Output-file-1
Please help me
We use our own library: simple constructions like files in HDFS that work
like pid/lock files. A file like /flags/tablea/process1 could mean "hey, I'm
working on table a, leave it alone". It accomplishes the exact same thing with
less fuss; it is also much easier for an external process/scheduler/shell
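
A minimal sketch of the flag-file idea, in Java. This is not the poster's library: it uses the local filesystem through java.nio.file as a stand-in for HDFS (an HDFS version would use the Hadoop FileSystem API instead), and the class and method names are hypothetical. The point it illustrates is that creating the flag must be a single atomic "create only if absent" step, not a separate exists-then-create check.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FlagFileLock {

    /** Returns true if this caller created the flag, false if it already existed. */
    static boolean tryAcquire(Path flag) {
        try {
            Files.createDirectories(flag.getParent());
            Files.createFile(flag); // atomic: fails if the file already exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // someone else is working on this table
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Removes the flag so that other processes may proceed. */
    static void release(Path flag) {
        try {
            Files.deleteIfExists(flag);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Local stand-in for a path such as /flags/tablea/process1.
        Path flag = Paths.get(System.getProperty("java.io.tmpdir"),
                "flags-" + System.nanoTime(), "tablea", "process1");
        if (tryAcquire(flag)) {
            try {
                System.out.println("Working on table a");
            } finally {
                release(flag);
            }
        } else {
            System.out.println("Table a is busy, leaving it alone");
        }
    }
}
```

One design caveat of any flag-file scheme: if the process dies before calling release, the flag stays behind and has to be cleaned up by hand or by a timeout policy.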
You cannot modify or rename the paths of partitions created by dynamic
partitioning.
Having column=value in the path is the default implementation for
partitions.
On Tue, Sep 9, 2014 at 5:18 AM, anusha Mangina
wrote:
>
> I need a table partitioned by country and then city . I created
Hi,
I think I encountered this kind of serialization problem when writing UDFs.
Usually, marking every field of the UDF as *transient* does the trick.
I guess the error means that Kryo tries to serialize the UDF class and
everything that is inside, and by marking them as transient
you ensure tha
Yes. It does now.
Thanks
Prasanth Jayachandran
On Sep 9, 2014, at 12:30 AM, Abhishek Agarwal wrote:
> Thanks Prasanth. Does it also mean that a query reading the nested.k column will
> invariably read nested.v as well, even if the nested.v column is not used in the
> query?
>
> On Mon, Sep 8, 2014
Thanks Prasanth. Does it also mean that a query reading the nested.k column
will invariably read nested.v as well, even if the nested.v column is not used
in the query?
On Mon, Sep 8, 2014 at 11:29 PM, Prasanth Jayachandran <
pjayachand...@hortonworks.com> wrote:
> Hi
>
> ORC stores nested fields as separ
Hi,
We also encountered this in Hive 0.13. We need to enable concurrency in our
daily ETL workflows (to prevent a child ETL job from reading its parent's output
while the parent is still running).
We found that in Hive 0.13, sometimes when you open the Hive CLI shell it
outputs the message "conflicting lock present for defa