ApacheCon North America 2019 Schedule Now Live!

2019-06-12 Thread Rich Bowen

Dear Apache Enthusiast,

(You’re receiving this message because you’re subscribed to one or more 
Apache Software Foundation project user mailing lists.)


We’re thrilled to announce the schedule for our upcoming conference, 
ApacheCon North America 2019, in Las Vegas, Nevada. See it now at 
https://www.apachecon.com/acna19/schedule.html  The event will be held 
September 9th through 12th at the Flamingo Hotel.  Register today at 
https://www.apachecon.com/acna19/register.html


Our schedule features keynotes by James Gosling, the father of Java; 
Samaira Mehta, founder and CEO of the Billion Kids Can Code project; and 
David Brin, noted science fiction author and futurist. And we’ll have a 
discussion panel featuring some of the founders of The Apache Software 
Foundation, talking about the past as well as their vision for the future.


ApacheCon is the flagship convention of the ASF, and features tracks 
curated by many of our project communities: Apache Drill, Apache Karaf, 
Big Data, TomcatCon, Apache CloudStack, Integration, Apache Cassandra, 
Streaming, Geospatial software, Graph processing, Internet of Things, 
Community, Machine Learning, Apache Traffic Control, Apache Beam, 
Observability, OFBiz, and Mobile app development.


The Hackathon, which will run all day, every day, is the place to meet 
your project community, and get some serious work knocked out in a short 
focused time. The BarCamp is the place to discuss the topics that are 
important to you, with your colleagues, in an unconference format.


We offer financial assistance for travel and lodging for those who want 
to come to ApacheCon but are unable to afford it. Apply at 
http://apache.org/travel/ by June 21st to be considered for this.


If you’re unable to make it to North America, we’ll also be running 
ApacheCon Europe in Berlin in October. Details of that event are at 
https://aceu19.apachecon.com/


Follow us on Twitter - @ApacheCon - for all the latest updates. See you 
in Las Vegas!


Rich Bowen, VP Conferences, The ASF
rbo...@apache.org
http://apachecon.com/


-
To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
For additional commands, e-mail: user-h...@hadoop.apache.org



Re: RM web got HTTP ERROR 500

2019-06-12 Thread Prabhu Josephraj
Hi Kevin,

 It looks like different versions of the hadoop-yarn-api jar are in the classpath of
the YARN ResourceManager. Can you remove the older jars, if any, from the classpath?
lsof -p  or adding -verbose in YARN_OPTS in the yarn.cmd file will help
to find the wrong jars.

Thanks,
Prabhu Joseph

On Wed, Jun 12, 2019 at 8:36 PM kevin su  wrote:

> Hi all,
>
> I have already restarted my cluster, but I still get the same error.
>
> my *yarn-site.xml*
> <configuration>
>  <property>
>   <name>yarn.nodemanager.aux-services</name>
>   <value>mapreduce_shuffle</value>
>  </property>
> </configuration>
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: user-h...@hadoop.apache.org
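
The duplicate-jar check suggested in the reply above can be sketched as a small script. This is a minimal sketch, not a Hadoop tool: the function name find_conflicting_jars and the example paths are hypothetical, and it assumes jar files follow the usual "<artifact>-<version>.jar" naming.

```python
import re
from collections import defaultdict

# Assumes the common "<artifact>-<version>.jar" naming, e.g. hadoop-yarn-api-2.8.5.jar.
JAR_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_conflicting_jars(jar_paths):
    """Return {artifact: sorted versions} for artifacts found in more than one version.

    Hypothetical helper for illustration; jar_paths is any iterable of file paths.
    """
    versions = defaultdict(set)
    for path in jar_paths:
        # Accept both Windows and POSIX separators and keep only the file name.
        filename = path.replace("\\", "/").rsplit("/", 1)[-1]
        match = JAR_RE.match(filename)
        if match:
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


if __name__ == "__main__":
    # Hypothetical classpath entries, for illustration only.
    example = [
        "C:\\hadoop\\share\\hadoop\\yarn\\hadoop-yarn-api-2.7.3.jar",
        "C:\\hadoop\\share\\hadoop\\yarn\\lib\\hadoop-yarn-api-3.1.2.jar",
        "C:\\hadoop\\share\\hadoop\\yarn\\hadoop-yarn-common-3.1.2.jar",
    ]
    for artifact, found in find_conflicting_jars(example).items():
        print(artifact, "->", ", ".join(found))
```

The list of paths would have to come from the ResourceManager's actual classpath, for example from the open files reported by lsof as suggested above; how that list is collected is left out of this sketch.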


Cluster utilization low with "Skipping scheduling" messages in logs

2019-06-12 Thread Ori Popowski
Hi,

We use Hadoop 2.8.5 on EMR for a MapReduce job that reads data from S3.
The job has 13K mappers, and the cluster is 200 r5.xlarge machines.

The cluster is _extremely_ underutilized. We've gone through all the
configuration values that could cause this problem, and everything looks
fine.

No failing jobs, no decommissioned nodes.

The log of the Resource Manager is full of these messages:

2019-06-12 08:43:05,307 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Trying to fulfill reservation for
application application_1560249941099_0004 on node:
ip-XX-XXX-XX-XXX.us-west-2.compute.internal:8041
2019-06-12 08:43:05,308 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Skipping scheduling since node
ip-XX-XXX-XX-XXX.us-west-2.compute.internal:8041 is reserved by application
appattempt_1560249941099_0004_01

I suspect that this is the cause.

This message appears for every node in the cluster.

Can you please help us figure this out?

Thanks
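
To see how widespread the reservations are, the "Skipping scheduling" messages quoted above could be tallied per node and application attempt. The following is a minimal sketch under stated assumptions: each log record sits on a single line in the actual ResourceManager log (the line breaks above come from e-mail wrapping), and the function name and sample host names are hypothetical.

```python
import re
from collections import Counter

# Pattern taken from the wording of the log message quoted in this thread.
SKIP_RE = re.compile(
    r"Skipping scheduling since node (?P<node>\S+) "
    r"is reserved by application (?P<attempt>\S+)"
)


def tally_skipped_nodes(log_lines):
    """Count 'Skipping scheduling' messages per (node, application attempt)."""
    counts = Counter()
    for line in log_lines:
        match = SKIP_RE.search(line)
        if match:
            counts[(match.group("node"), match.group("attempt"))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; in practice, iterate over the ResourceManager log file.
    sample = [
        "2019-06-12 08:43:05,308 INFO CapacityScheduler (ResourceManager Event Processor): "
        "Skipping scheduling since node host-a.example:8041 is reserved by application "
        "appattempt_1560249941099_0004_01",
        "2019-06-12 08:43:06,101 INFO CapacityScheduler (ResourceManager Event Processor): "
        "Skipping scheduling since node host-b.example:8041 is reserved by application "
        "appattempt_1560249941099_0004_01",
    ]
    for (node, attempt), count in tally_skipped_nodes(sample).most_common():
        print(node, attempt, count)
```

If nearly every node shows up as reserved by the same application attempt, that would support the suspicion that the reservations are what keeps the nodes idle; the script only counts messages and does not explain why the reservations are made.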


Re: [DISCUSS] HDFS roadmap/wish list

2019-06-12 Thread Julien Laurenceau
Hi,

I am not sure whether this is already on a roadmap or supported, but I
would appreciate these two features:

- First feature: I would like to be able to use a dedicated directory
in HDFS as a /tmp directory backed by RAMFS, for high-performance checkpointing
of Spark jobs without using Alluxio or Ignite.
My current issue is that RAMFS is only useful with a replication factor
of 1 (in order to avoid the network).
My default replication factor is 3, but I need a way to set a
replication factor of 1 on a specific directory (/tmp) for all new writes
to this directory.
Currently, if I use "hdfs setrep 1 /tmp", it only applies to blocks already
written.
For example, this could be done by specifying the replication factor at the
storage policy level.
In my view, this would make the Lazy_Persist storage policy much more
useful.

From the doc:
> Note 1: The Lazy_Persist policy is useful only for single
> replica blocks. For blocks with more than one replicas, all the replicas
> will be written to DISK since writing only one of the replicas to RAM_DISK
> does not improve the overall performance.

In the current state of HDFS configuration, I only see the following hack
(not tested) to implement such a solution: configure HDFS replication x1
as the default and use Erasure Coding RS(6,3) for the main
storage by attaching an EC policy to all directories except /tmp.

hdfs ec -setPolicy -path <path> [-policy <policy>]
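
A quick arithmetic check of what the hack above would cost in raw storage, assuming the usual definitions: n-way replication stores n full copies, and RS(6,3) stores 6 data units plus 3 parity units per stripe. The helper names below are hypothetical and only illustrate the calculation.

```python
def replication_overhead(replicas):
    """Raw bytes stored per logical byte with n-way replication."""
    return float(replicas)


def erasure_coding_overhead(data_units, parity_units):
    """Raw bytes stored per logical byte with a Reed-Solomon (data, parity) layout."""
    return (data_units + parity_units) / data_units


if __name__ == "__main__":
    # Default replication x3: three full copies.
    print("replication x3:", replication_overhead(3))
    # RS(6,3) for the main storage: 9 units stored for every 6 units of data.
    print("RS(6,3):", erasure_coding_overhead(6, 3))
    # Replication x1 for /tmp: a single copy, no redundancy.
    print("replication x1:", replication_overhead(1))
```

Under these assumptions the main storage would use 1.5 raw bytes per logical byte instead of 3, while /tmp would keep a single copy with no redundancy, which is the trade-off the hack accepts for checkpoint data.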



- Second feature: bandwidth throttling dedicated to re-replication in
case of a failed datanode.
Something similar to the option dedicated to the balancer,
dfs.datanode.balance.bandwidthPerSec, but only for re-replication.

Thanks and regards
JL

On Mon, Jun 10, 2019 at 19:08, Wei-Chiu Chuang wrote:

> Hi!
>
> I am soliciting feedback on HDFS roadmap items and wish-list items for
> future Hadoop releases. A community meetup
> is happening soon, and perhaps we can use this thread to converge on things
> we should talk about there.
>
> I am aware of several major features that merged into trunk, such as RBF,
> Consistent Standby Serving Reads, as well as some recent features that
> merged into 3.2.0 release (storage policy satisfier).
>
> What else should we be doing? I have a laundry list of supportability
> improvement projects, mostly about improving performance or making
> performance diagnostics easier. I can share the list if folks are
> interested.
>
> Are there things we should do to make developers' lives easier or things
> that would be nice to have for downstream applications? I know Sahil Takiar
> made a series of improvements in HDFS for Impala recently, and those
> improvements are applicable to other downstreamers such as HBase. Or would
> it help if we provide more Hadoop API examples?
>