Hi Scott,
I also work with a lot of gzipped files, and I used to get the same error
sometimes.
I started checking the gzip files before processing them. In fact, I check
them immediately after I put them on HDFS.
What I do is cat the gzip file and check the output with gzip -t.
For example, all the
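The integrity check described above (stream the file and test it the way gzip -t does) could also be sketched in Python. This is only an illustration, not the script used in this thread: the function name gzip_is_intact and the chunk size are assumptions, and the stream could be any binary file object, for instance the output of a cat of the file on HDFS.

```python
import gzip
import io
import zlib


def gzip_is_intact(stream, chunk_size=1 << 20):
    """Decompress the whole stream and discard the data, similar in spirit to `gzip -t`.

    `stream` is any binary file-like object (hypothetical helper, not a Pig or Hadoop API).
    Returns True if decompression reaches the end without an error, False otherwise.
    """
    try:
        with gzip.GzipFile(fileobj=stream, mode="rb") as gz:
            # Read in chunks so large log files are never held in memory at once.
            while gz.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches, EOFError covers truncation,
        # zlib.error covers corrupt compressed data.
        return False


# Small self-contained demonstration with in-memory data.
good = gzip.compress(b"log line\n" * 1000)
print(gzip_is_intact(io.BytesIO(good)))         # complete stream
print(gzip_is_intact(io.BytesIO(good[:-10])))   # stream with its tail cut off
```

Running a check like this right after the upload catches truncated or corrupt files before a job fails on them halfway through.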
I am glad that you got this in a replicatable form! I have seen this error
as well (where the output is just the last value repeated instead of the
multiples that you want), but wasn't able to give a concrete example.
2011/2/16 James Kebinger jkebin...@gmail.com
Hello all, I've been scratching
Hello,
I'll preface this by saying that I know very, very little Java and I am
just learning Pig.
My situation is that I am aggregating logs with Flume into a single
logfile. All my logs are in JSON format and then gzip'd before being added
to S3. I have 3 types of log lines in each
This is weird, because in my case it seems to be nondeterministic. I have a
text file, thing.txt, that is simply
http://www.guardian.co.uk/
asjlkdajlkdad
askjldajlksdjlkasjdlkajslkdjalds
asdjaskdjlasjdlkad
http://www.guardian.co.uk/adsasd
http://www.guardian.co.uk/sadasd
Interesting, maybe I should file a bug report then?
On Thu, Feb 17, 2011 at 10:41 AM, Jonathan Coveney jcove...@gmail.com wrote:
I am glad that you got this in a replicatable form! I have seen this error
as well (where the output is just the last value repeated instead of the
multiples that
https://issues.apache.org/jira/browse/PIG-1859
On Thu, Feb 17, 2011 at 12:32 PM, James Kebinger jkebin...@gmail.com wrote:
Interesting, maybe I should file a bug report then?
On Thu, Feb 17, 2011 at 10:41 AM, Jonathan Coveney jcove...@gmail.com wrote:
I am glad that you got this in a
Guys,
Does Pig read the _logs directories from a script's output?
What I want is to read a Pig output dir (or several) from Pig scripts.
But I just want the part- files, not the .part-crc or _logs files.
Thanks
--
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG
Files starting with . are also ignored.
-Richard
On 2/17/11 3:23 PM, Ramesh, Amit amram...@amazon.com wrote:
Directory names starting with underscores are ignored, but I am not certain
about .* files/directories.
Amit
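The rule given in these replies (names starting with an underscore or a dot are skipped) can be sketched as a small filter. This is only an illustration of the naming rule as stated in this thread, not Pig's or Hadoop's actual code; the function names and the sample paths are made up for the example.

```python
import posixpath


def is_hidden(path):
    """True if the last path component starts with '_' or '.' (rule as stated in this thread)."""
    name = posixpath.basename(path.rstrip("/"))
    return name.startswith("_") or name.startswith(".")


def visible_inputs(paths):
    """Keep only the entries that would not be skipped under that rule."""
    return [p for p in paths if not is_hidden(p)]


# Hypothetical listing of an output directory.
listing = [
    "out/part-00000",
    "out/.part-00000.crc",
    "out/_logs",
    "out/part-00001",
]
print(visible_inputs(listing))
```

Under this rule, pointing a load at the output directory would pick up only the part- files.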
On 2/17/11 3:12 PM, Charles Gonçalves charles...@gmail.com wrote: