When I use the distributed cache, I find that when the file is larger than
100 MB or contains more than 10 million records, it cannot be cached in
memory. I tried setting io.sort.mb to 200 MB, but it still does not work.
Any suggestions would be welcome. Thank you!
In the open source community no book can ever be fully up to date, so we
have to live with this :)
I would suggest you start with this book and read the latest
documentation on the Pig website side by side to see the latest features.
Good luck
On Fri, Nov 16, 2012 at 8:41 PM, Majid Azimi
Unfortunately I've realised that boundscript.describe doesn't return a
string. It returns void but prints to stdout. This means I have to go
through a rather painful process of calling a separate Python process that
calls boundscript.describe and then capture the stdout of that process in
order to
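A minimal sketch of the workaround described above, assuming a standard
CPython 3 interpreter is available to launch the child process. CHILD_CODE
is a hypothetical stand-in for a script that calls boundscript.describe;
the schema string it prints is made up for illustration, and capture_stdout
is an illustrative helper name, not part of Pig.

```python
import subprocess
import sys

# Hypothetical stand-in for a child script that calls boundscript.describe,
# which prints to stdout and returns nothing. The printed schema is invented.
CHILD_CODE = 'print("a: {id: int, name: chararray}")'


def capture_stdout(code):
    """Run `code` in a separate Python process and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,      # collect the child's stdout
        universal_newlines=True,     # decode bytes to str
        check=True,                  # raise if the child exits non-zero
    )
    return result.stdout.strip()


schema_text = capture_stdout(CHILD_CODE)
```

The cost of this approach is one extra interpreter start-up per call, which
is why returning the schema as a string from the API itself would be simpler.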
I agree with Mr. Jagat.
Regards,
Mohammad Tariq
On Fri, Nov 16, 2012 at 3:26 PM, Jagat Singh jagatsi...@gmail.com wrote:
In the open source community no book can ever be fully up to date, so we
have to live with this :)
I would suggest you start with this book and read the latest
documentation on
It is a bit dated but an excellent resource for learning Pig. We give each
new data engineer a copy! Probably the biggest change from my point of view
is that JsonStorage() is now built in as of 0.10, so one does not need to
wrangle with a custom loader. When I started a couple of years back, the only
In the Java interface, there is a getInputSchema() method. You could make
this available on the Python side of things. This would be a useful
addition.
2012/11/16 Martin Goodson mar...@qubitproducts.com
Unfortunately I've realised that boundscript.describe doesn't return a
string. It returns
That sounds reasonable, I've run into the same problem. Do you mind
submitting a patch?
On Fri, Nov 16, 2012 at 12:48 PM, pablomar
pablo.daniel.marti...@gmail.com wrote:
Hi all,
I'm using Pig 0.9.2 (Apache Pig version 0.9.2-cdh4.0.1, to be precise).
I had a case today in which I needed to clean up
Just for the record, I'm posting the solution to my problem here.
Thank you for your help.
In the end the problem seems to be with the JsonLoader I was using. I don't
know why exactly, but it seems to have a bug with my strings.
I finally changed my code to use