Hi All,
You can ignore this mail. I've found the configuration parameters I was
looking for, i.e. pig.maxCombinedSplitSize and pig.splitCombination.
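For anyone who finds this thread later: both can be set with `set` statements
at the top of the script, or passed in programmatically. A minimal, untested
Java sketch via PigServer (the 512 MB value and the script name are
illustrative, not from this thread):

import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class CombinedSplitDemo {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    // Enable split combination and cap each combined split at 512 MB.
    // 536870912 = 512 * 1024 * 1024; pick a value that suits your data.
    props.setProperty("pig.splitCombination", "true");
    props.setProperty("pig.maxCombinedSplitSize", "536870912");

    PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
    pig.registerScript("myscript.pig"); // hypothetical script name
  }
}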
Regards,
Sandeep
On Tue, Jan 5, 2016 at 4:39 PM, sandeep das wrote:
> Hi All,
>
> I have a Pig script that runs over YARN. Each MAP task created by this Pig
> script is taking 128 MB as input and no more than that. [...]
I want to use MapReduce to sample data by some conditions. If I find
enough data, I want to output it to the reducers and stop all mappers
(including those already running and those not yet started). Is there any
way to do this?
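There is no clean built-in way for one map task to kill its peers; what a
task can do is stop consuming its own input split once it has sampled
enough. A minimal, untested sketch of that per-task early exit (class name,
cap, and sampling condition are all illustrative):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SamplingMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
  private static final int MAX_SAMPLES_PER_TASK = 1000; // illustrative cap
  private int emitted = 0;

  @Override
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    try {
      // The default Mapper.run() loops until the split is exhausted; here we
      // also stop as soon as this task has emitted enough samples.
      while (emitted < MAX_SAMPLES_PER_TASK && context.nextKeyValue()) {
        map(context.getCurrentKey(), context.getCurrentValue(), context);
      }
    } finally {
      cleanup(context);
    }
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    if (matchesCondition(value)) {
      context.write(NullWritable.get(), value);
      emitted++;
    }
  }

  private boolean matchesCondition(Text value) {
    return value.toString().contains("sample-me"); // placeholder condition
  }
}

Stopping tasks that are already running elsewhere (or not yet scheduled)
would need the job client to monitor progress and kill the whole job once
the reducer side has collected enough data.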
Hah, we definitely do have those properties. Thank you for this!
On Tue, Jan 5, 2016 at 5:23 PM, Chris Nauroth wrote:
> If this is an Ambari-managed cluster that was transitioned from non-HA to
> running in HA mode, then you might be interested in a known Ambari bug:
>
> https://issues.apache.org/jira/browse/AMBARI-13946
If this is an Ambari-managed cluster that was transitioned from non-HA to
running in HA mode, then you might be interested in a known Ambari bug:
https://issues.apache.org/jira/browse/AMBARI-13946
If your hdfs-site.xml contains the "non-HA properties" described in that issue,
then a viable workaround is to remove those properties from hdfs-site.xml.
I have never tried it myself, so I have no idea if it works. But this JIRA
(https://issues.apache.org/jira/browse/HADOOP-11261) seems to indicate that you
might want to set “fs.s3a.endpoint”.
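An untested sketch of what that might look like from Java (the credentials
and bucket name are placeholders; depending on the Hadoop version you may
need to disable SSL explicitly rather than put http:// in the endpoint):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class S3aEndpointCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point s3a at the fake-S3 server instead of AWS.
    conf.set("fs.s3a.endpoint", "http://s3rver:8000");
    conf.setBoolean("fs.s3a.connection.ssl.enabled", false);
    conf.set("fs.s3a.access.key", "dummy"); // illustrative credentials
    conf.set("fs.s3a.secret.key", "dummy");
    // "test-bucket" is a hypothetical bucket name.
    FileSystem fs = FileSystem.get(new URI("s3a://test-bucket/"), conf);
    System.out.println("Connected to " + fs.getUri());
  }
}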
Thanks
Anu
From: Han JU <ju.han.fe...@gmail.com>
Date: Tuesday, January 5, 2016 at 9:35 AM
To:
Hello,
For test purposes we need to configure a custom S3 endpoint for s3n/s3a.
More precisely, we need to test that Parquet correctly writes its content
to S3.
We've set up an s3rver, so the endpoint should be `http://s3rver:8000`. I've
tried different methods but no luck so far.
Things I've tried
Hi All,
I have a Pig script that runs over YARN. Each MAP task created by this Pig
script is taking 128 MB as input and no more than that.
I want to increase the input size of each map task. I've read that the
input split size is determined using the following formula:
max(min split size, min(block size, max split size))
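For reference, that is the computation FileInputFormat performs, i.e.
splitSize = max(minSize, min(maxSize, blockSize)). A small worked example
(values are illustrative; the property names in the comments are the
Hadoop 2.x ones):

public class SplitSizeDemo {
  public static void main(String[] args) {
    long blockSize = 128L * 1024 * 1024; // dfs.blocksize = 128 MB
    long minSize = 1L;                   // mapreduce.input.fileinputformat.split.minsize (effective default)
    long maxSize = Long.MAX_VALUE;       // mapreduce.input.fileinputformat.split.maxsize (default)

    // With the defaults, the split size collapses to the block size,
    // which is why each map task reads exactly 128 MB.
    long splitSize = Math.max(minSize, Math.min(maxSize, blockSize));
    System.out.println(splitSize / (1024 * 1024) + " MB"); // prints 128 MB

    // Raising the minimum split size above the block size makes each
    // map task read more than one block.
    long raisedMin = 256L * 1024 * 1024; // 256 MB
    long biggerSplit = Math.max(raisedMin, Math.min(maxSize, blockSize));
    System.out.println(biggerSplit / (1024 * 1024) + " MB"); // prints 256 MB
  }
}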