Well, a user doesn't really know how many jobs will be scheduled, so their
order is not something that should matter. A Pig script should really be
seen as a graph of operators. Your problem was that a dependency between
two operators was implicit. Exec allows you to 'flush' the existing graph and
mak
> at
>
> org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:103)
> at
>
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>
> Any idea what's going wrong?
>
> Thanks,
> Anup
>
--
Bertrand Dechoux
Or Lipstick : https://github.com/Netflix/Lipstick
It's Netflix this time instead of Twitter. ;)
http://techblog.netflix.com/2013/06/introducing-lipstick-on-apache-pig.html
But by simply running the script, the information you are looking for will
be displayed at the end of the job.
Bertrand
O
I would say either generate the script using another language (e.g. Python)
or use a true programming language with an API having the same level of
abstraction (e.g. Java and Cascading).
Bertrand
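
To illustrate the first option, here is a minimal sketch of generating a Pig
script from Python. The relation names, the 'country' field, the paths and
the generate_script helper are all hypothetical; they only show the idea of
emitting repeated Pig Latin statements from a loop.

```python
from string import Template

# Hypothetical template: the aliases, the 'country' field and the output
# paths are illustrative assumptions, not taken from the original thread.
STATEMENT = Template(
    "$alias = FILTER data BY country == '$country';\n"
    "STORE $alias INTO 'result-$country' USING PigStorage('\\t');\n"
)


def generate_script(input_path, countries):
    """Return Pig Latin text with one FILTER/STORE pair per country."""
    lines = [
        "data = LOAD '%s' USING PigStorage('\\t') "
        "AS (country:chararray, value:long);\n" % input_path
    ]
    for index, country in enumerate(countries):
        # Each country gets its own alias so the statements do not collide.
        lines.append(
            STATEMENT.substitute(alias="part_%d" % index, country=country)
        )
    return "".join(lines)


if __name__ == "__main__":
    print(generate_script("input.tsv", ["australia", "france"]))
```

The generated text can then be written to a file and passed to pig as usual.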
On Thu, Jul 18, 2013 at 8:44 AM, Something Something <
mailinglist...@gmail.com> wrote:
> There must be
n an
> example
> > with 10 maps (we forget the reducers now), it means that each map will
> read
> > more or less 50MB?
> >
> >
> >
> > On 10 June 2013 11:21, Bertrand Dechoux wrote:
> >
> > > I wasn't clear. Specifying the size of the f
ls
> > explain it very clearly. I want to split a 500MB single txt in HDFS into
> > multiple files using Pig latin. Is it possible? E.g.,
> >
> > A = LOAD 'myfile.txt' USING PigStorage() AS (t);
> > STORE A INTO 'multiplefiles' USING PigStorage(); -- and here creates
>
nto 'result-australia-0' using PigStorage('\t');
> > >
> > > to store the data in HDFS. But the problem is that this creates one
> > > 500MB file. Instead, I want to save several 64MB files. How do I do
> > > this?
> > >
> > > --
> > > Best regards,
> > >
> >
>
>
>
> --
> Best regards,
>
--
Bertrand Dechoux
Hi,
The command line and its output explain what the required parameters and
the inputs/outputs of a script are. I was wondering: is there a simple way
to extract them automatically from the script?
For the parameters, I could parse the file with my own logic; inputs/outputs
should also be doable
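
As a rough sketch of the "own logic" approach for parameters, the Python
below scans a script for $name references and subtracts names given a value
with %default or %declare. It is a naive regex pass under stated assumptions:
it ignores /* */ comments, would mis-handle '--' inside string literals, and
the required_parameters helper is hypothetical, not part of Pig.

```python
import re

# Assumption: parameters are referenced as $name; $0, $1, ... are positional
# field references and are skipped because they start with a digit.
REFERENCE = re.compile(r"\$([A-Za-z_]\w*)")
# Assumption: defaults are declared with %default or %declare on their own line.
DECLARATION = re.compile(
    r"^\s*%(?:default|declare)\s+([A-Za-z_]\w*)", re.MULTILINE
)
# Single-line '--' comments are dropped before scanning.
COMMENT = re.compile(r"--.*$", re.MULTILINE)


def required_parameters(script_text):
    """Return the sorted names referenced in the script but never declared."""
    text = COMMENT.sub("", script_text)
    declared = set(DECLARATION.findall(text))
    referenced = set(REFERENCE.findall(text))
    return sorted(referenced - declared)


if __name__ == "__main__":
    example = (
        "%default output 'out'\n"
        "A = LOAD '$input' AS (name, age);\n"
        "B = FOREACH A GENERATE $0, $1;\n"
        "STORE B INTO '$output';\n"
    )
    print(required_parameters(example))
```

Extracting inputs and outputs reliably would need more than regular
expressions, since LOAD and STORE paths can themselves depend on parameters.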
ira/browse/PIG-3317
>
> So you will be able to set the properties in PigContext and pass it to
> PigServer.
>
> The patch is not committed yet, but it's likely to be in next release.
>
> Thanks,
> Cheolsoo
>
>
>
> On Mon, May 13, 2013 at 2:25 AM, Bertrand Decho
Hi,
I am using PigTest in order to verify a script reading and storing data in
avro format.
However, at the moment, the script fails due to the optimisation rule
ColumnMapKeyPrune.
I know I can disable it using the -optimizer_off flag. But is there a way
to do that using PigTest?
It seems to me