Hi,
I have uploaded a file 'passwd' into hdfs path /user/Haldar/passwd.
But when I am executing,
A = LOAD '/user/haldar/passwd2' using PigStorage(':');
I am getting error, and the log says
Error starting action [pig]. ErrorType [FAILED], ErrorCode [It should never
happen], Message [Permission d
> The Oozie servers in your Oozie HA setup actually are all active; that is,
> they are all processing jobs at the same time -- there is no failover.
I see. So, this is more of an active-active configuration.
I got answers for all my questions.
Thanks a lot Robert! Your help is greatly appreciated.
Yeah this is exactly the type of functionality that I was looking for. I
would certainly make use of such a feature.
I’m curious, how did your implementation go about defining the inputs to
the shards?
-Chris
On 2/20/14, 5:24 PM, "Alejandro Abdelnur" wrote:
>This is what I refer to as sharding; it can be seen as a special type of
>fork/join where all shards perform the same actions on different datasets.
This is what I refer to as sharding: it can be seen as a special type of
fork/join where all the shards perform the same actions on different
datasets, and the number of shards depends on the number of datasets.
A while ago I rewrote the workflow lib, cleaning it up a bit and adding
this capability.
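In today's workflow XML this has to be written as a static fork/join; a
minimal sketch (node names, action bodies, and the shard count are made up
for illustration) might look like:

```xml
<fork name="shard-fork">
    <path start="shard-1"/>
    <path start="shard-2"/>
</fork>

<action name="shard-1">
    <!-- same action as shard-2, pointed at dataset 1 -->
    ...
    <ok to="shard-join"/>
    <error to="fail"/>
</action>

<action name="shard-2">
    <!-- same action as shard-1, pointed at dataset 2 -->
    ...
    <ok to="shard-join"/>
    <error to="fail"/>
</action>

<join name="shard-join" to="end"/>
```

The limitation being discussed is that the number of `<path>` elements is
fixed when the XML is written, whereas true sharding would derive it from
the number of datasets at run time.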
How would I invoke a sub-workflow from the Java API? Just create a
workflow that only contains a sub-workflow?
On 2/20/14, 4:47 PM, "Mona Chitnis" wrote:
>If you use the sub-workflow construct, then it would do some error
>reporting for you. If a sub-workflow fails, the parent workflow also gets
If you use the sub-workflow construct, it will do some error reporting for
you: if a sub-workflow fails, the parent workflow is also updated to
FAILED. Also, in Oozie 4.0, OOZIE-1264 ("The 'parent' property of a
subworkflow should be the ID of the parent workflow") helps get the
dependency between the parent and its sub-workflows.
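For reference, a sub-workflow action is declared roughly like this (the
app path and node names are placeholders):

```xml
<action name="run-child">
    <sub-workflow>
        <app-path>${childWfPath}</app-path>
        <!-- pass the parent's configuration down to the child workflow -->
        <propagate-configuration/>
    </sub-workflow>
    <ok to="next-step"/>
    <error to="fail"/>
</action>
```

If the child workflow fails, the parent follows the `<error>` transition,
which is the error reporting described above.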
Thanks, Mona. This is easy to fix.
Richard.
-----Original Message-----
From: Mona Chitnis [mailto:chit...@yahoo-inc.com]
Sent: Thursday, February 20, 2014 3:49 PM
To: user@oozie.apache.org
Subject: Re: Coordinator action TIMEDOUT when no timeout is set
Hi Richard,
The default timeout is in fact changed from -1 (infinity) to 2 hours.
Mona,
Thanks. That is the road I’m headed down at the moment.
I’ll create a Java action which takes the files (or a path glob, or
something) as input, creates multiple Oozie tasks based on that input, and
then ‘waits’ for those tasks to complete.
A feature like this built into the workflow c
Hi Richard,
The default timeout has in fact changed from -1 (infinite) to 2 hours, to
avoid wasting CPU cycles checking for nonexistent data.
Property in oozie-site.xml:
    <property>
        <name>oozie.service.coord.normal.default.timeout</name>
        <value>120</value>
        <description>Default timeout for a coordinator action input check,
        in minutes.</description>
    </property>
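If a particular coordinator should keep the old wait-forever behaviour,
the timeout can also be overridden per job in the coordinator XML (a
sketch, with the app name and attributes abbreviated):

```xml
<coordinator-app name="my-coord" frequency="${coord:days(1)}" ...>
    <controls>
        <!-- -1 = wait indefinitely for input data; the value is in minutes -->
        <timeout>-1</timeout>
    </controls>
    ...
</coordinator-app>
```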
Hi Chris,
There isn't currently a way to declare a dynamic number of parallel tasks
within a single Oozie workflow XML. But you can do it programmatically:
using the Oozie Java API, you can start a dynamic number of sub-workflows
based on the number of outputs.
On 2/20/14, 7:05 AM, "Heller, Chris" wrote:
>Hi,
>
>I’m trying to figure out the best way to implement a workflow in Oozie.
Do you have an authentication setup different from Kerberos?
On 2/20/14, 6:22 AM, "Roshan Punnoose" wrote:
>Hi all,
>
>I am running into an issue using the Hive Action (and Shell Action to
>interact with the hive command line) where the Delegation Token doesn't
>seem to be propagated for the proxy user.
Hi,
I’m trying to figure out the best way to implement a workflow in Oozie.
I am creating a workflow which splits an input into multiple outputs.
Then for each output I want to run another process over each.
The trouble is I cannot know a priori how many outputs I will have, and so
to post pro
>
> Is there any known data migration tool which can migrate data from derby
> to let's say mysql?
Migrating the Oozie data out of Derby to another database is somewhat
tricky. You can take a look at this procedure given on the Cloudera
Community forums, but I can't guarantee that it will work an
Actually, my confusion was here (just answered my own question).
If the logs are continually arriving, having the coordinator run once a
day means that 24 new logs are grabbed each time.
Thanks.
On Thu, Feb 20, 2014 at 11:21 AM, Purshotam Shah wrote:
>
> Yes. If your dataset "1HourLogs"
Yes. If your dataset "1HourLogs" is hourly, then every time it is going to
look for the 23 previous hourly logs + 1.
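Assuming an hourly dataset consumed by a daily coordinator, the sliding
window described above would be expressed roughly like this (the URI
template, dataset name, and dates are placeholders):

```xml
<datasets>
    <dataset name="1HourLogs" frequency="${coord:hours(1)}"
             initial-instance="2014-01-01T00:00Z" timezone="UTC">
        <uri-template>hdfs:///logs/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
    </dataset>
</datasets>
<input-events>
    <data-in name="input" dataset="1HourLogs">
        <!-- the 24 hourly instances ending at the coordinator's nominal time -->
        <start-instance>${coord:current(-23)}</start-instance>
        <end-instance>${coord:current(0)}</end-instance>
    </data-in>
</input-events>
```

Because the coordinator itself materializes once a day, the nominal time
advances by 24 hours between runs, so each action sees 24 new instances
rather than a window shifted by one.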
On 2/20/14, 9:03 AM, "Scott Preddy" wrote:
>Will the snippet below run over the same 23 logs it ran over the previous
>hour (i.e. just bumping up
>the log iterator by 1) each hour, or is o
Will the snippet below run over the same 23 logs it ran over the previous
hour (i.e. just bumping the log iterator up by 1) each hour, or is Oozie
going to run the action once 24 logs are present, and then not kick off
the action again until 24 new logs are present? I think it is the former,
but just making sure.
Thanks a lot for the prompt reply Robert!
I do have a lot of data in my Derby DB which I now need to migrate to one
of MySQL/Oracle/Postgres.
Is there any known data migration tool which can migrate data from derby to
let's say mysql?
I also have one more question. Since traditionally oozie datab
Hi all,
I am running into an issue using the Hive Action (and Shell Action to
interact with the hive command line) where the Delegation Token doesn't
seem to be propagated for the proxy user:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation
Token can be issued only with kerberos or web authentication
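On a kerberized cluster, one common way to get a metastore delegation
token to a Hive action is Oozie's credentials mechanism; a sketch follows
(the URI, principal, and names are placeholders, and the exact credential
type available depends on your Oozie version):

```xml
<credentials>
    <credential name="hive_auth" type="hcat">
        <property>
            <name>hcat.metastore.uri</name>
            <value>thrift://metastore-host:9083</value>
        </property>
        <property>
            <name>hcat.metastore.principal</name>
            <value>hive/_HOST@EXAMPLE.COM</value>
        </property>
    </credential>
</credentials>
...
<!-- the action references the credential by name -->
<action name="hive-query" cred="hive_auth">
    ...
</action>
```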
Hey:
Yesterday I set up a daily coordinator with an input dataset. It is scheduled
to run every day at 00:00 and process the dataset. I don't have the piece that
creates the dataset automated yet, and was planning to manually create the
dataset each morning while I work on the automation pieces.