above files are produced by the PIG STORE command.
>> >
>> > I want to read the files starting with "part-m-" using a PIG command.
>> >
>> > When I tried using Data= LOAD '\Output2\part-m-*' AS ( );
>> > it did not work and threw an error.
>> >
>> > How do I read these files in a single LOAD statement?
>> >
>> > Thanks
>> >
>> >
>>
--
Harsh J
3272013
>
>
> And finally it says exp/03272013 though the directory exists, as it gets
> created by the STORE command.
>
> What is wrong with this?
--
Harsh J
>> Do I need to create a UDF for this, or is there something out there?
>>
>> Thanks,
>> Ben
>>
--
Harsh J
lve this?
>
> Thanks,
> Keren
>
> --
> Keren Ouaknine
> Web: www.kereno.com
--
Harsh J
problem, and I cannot find any paper
> about how to install it or how to use it
> with PIG, so if you have an installation or configuration file, could you share it with me?
> Thank you.
>
>
>
> Best Regards
>
> Malone
>
>
> 2012-05-24
>
>
--
Harsh J
m new to Hadoop and trying to learn PIG.
> Can someone help me understand XML parsing using PIG?
>
> Thanks
> Krishnan
--
Harsh J
> So I want PIG to compress its data with LZO and MapReduce to use Snappy, but
> as I see in the tasktracker details (Map Bytes Out), the data is not compressed at
> all, which reduces performance a lot (IO is at 100% most of the time)... What am
> I doing wrong and how do I fix it?
>
>
> Thanks,
> Marek M.
--
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about
12
> Map Plan
> Local Rearrange[tuple]{tuple}(true) - scope-14
> | |
> | Project[tuple][*] - scope-13
> |
> |---IDs: New For Each(false)[bag] - scope-9
> | |
> | Project[bytearray][0] - scope-7
> |
> |---Data: Load(/AllStateInputs/input.csv:PigStorage(',')) - scope-6
> Reduce Plan
> UniqueID: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-11
> |
> |---New For Each(true)[bag] - scope-17
> | |
> | Project[tuple][0] - scope-16
> |
> |---Package[tuple]{tuple} - scope-15
> Global sort: false
>
>
>
> Thanks,
> Praveenesh
--
Harsh J
Customer Ops. Engineer, Cloudera
e way I implemented it is: I create a List of Tuples, add
> all Tuples from the DataBag to the List, and then use a custom Collections.sort. But that
> way I lose a lot of resources and memory. Does anybody know a different way?
>
>
> Thanks,
> Marek M.
--
Harsh J
Customer Ops. Engineer, Cloudera
Gianmarco/others,
This got solved over at common-user@hadoop and yes it was that. Cross posting
blues >_>.
On 31-Dec-2011, at 3:44 AM, Gianmarco De Francisci Morales wrote:
> Looks like a network issue.
> Are you behind a proxy server?
>
> Cheers,
> --
> Gianmarco
>
>
>
> On Fri, Dec 30, 20
ew IOException("config()")));
> }
> synchronized(Configuration.class) {
> REGISTRY.put(this, null);
> }
> }
>
> Log is in debug mode.
>
> Can anyone please help me with this?
>
> Regards,
> JD
--
Harsh J
ind anything on the web.
>
> Cheers,
> Thomas
>
--
Harsh J
Hey Gianmarco,
On Fri, Apr 15, 2011 at 5:00 PM, Gianmarco wrote:
> Avro is not an easy option on the hadoop side.
I am just a little curious about this: could you explain why you feel that
way about Avro on M/R?
--
Harsh J
air, and I am considering the use
> of two Hadoop counters (via reporter). They would count the number
> of records read/written, so that they would be available via
> the job WebUI.
>
> Is this an overly expensive proposition?
>
> Thanks,
>
> Andreas
>
--
Harsh J
http://harshj.com
on this mailing list, but all
> my other JARs seem to import fine ...
>
> Any help is appreciated,
> Robert.
>
--
Harsh J
www.harshj.com
mething obvious?
>
> --jacob
> @thedatachef
>
>
--
Harsh J
www.harshj.com
ip the
json-simple library along. But you may want to be careful about the
version of Jackson Core/Mapper in place inside your Hadoop. There are
much more recent updates of it available, with benefits.
Perhaps, if you feel like it, you can contribute your change back to
elephant-bird [2]. I think they're open to newer-Pig related changes.
[1] - https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/load/LzoJsonLoader.java
[2] - https://github.com/kevinweil/elephant-bird
--
Harsh J
www.harshj.com
not find any related
> document.
>
> best regards,
> c.b.
>
--
Harsh J
www.harshj.com
at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:258)
> ... 7 more
> 3092 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Failed to produce result in: "file:///output"
> 3092 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Failed!
--
Harsh J
www.harshj.com