Re: Spark Job on YARN Hogging the entire Cluster resource

2016-02-24 Thread Prabhu Joseph
YARN-2026 has fixed the issue.

On Thu, Feb 25, 2016 at 4:17 AM, Prabhu Joseph 
wrote:

> You are right, Hamel. It should get 10 TB / 2. In hadoop-2.7.0 it works
> fine, but in hadoop-2.5.1 it gets only 10TB/230, with the same
> configuration used in both versions.
> So I think a JIRA must have fixed the issue after hadoop-2.5.1.
>
> On Thu, Feb 25, 2016 at 1:28 AM, Hamel Kothari 
> wrote:
>
>> The instantaneous fair share is what Queue B should get according to the
>> code (and my experience). Assuming your queues are all equal it would be
>> 10TB/2.
>>
>> I can't help much more unless I can see your config files and ideally
>> also the YARN Scheduler UI to get an idea of what your queues/actual
>> resource usage is like. Logs from each of your Spark applications would
>> also be useful. Basically the more info the better.
>>
>> On Wed, Feb 24, 2016 at 2:52 PM Prabhu Joseph 
>> wrote:
>>
>>> Hi Hamel,
>>>
>>> Thanks for looking into the issue. What I don't understand is: after
>>> preemption, what share does the second queue get if the first queue holds
>>> the entire cluster resource without releasing it, the instantaneous fair
>>> share or the steady-state fair share?
>>>
>>>  There are queues A and B (230 queues in total), and the total cluster
>>> resource is 10TB, 3000 cores. If a job is submitted into queue A, it gets
>>> 10TB, 3000 cores and does not release any resource. Now if a second job is
>>> submitted into queue B, preemption will definitely happen, but what share
>>> does queue B get after preemption? *Is it <10TB, 3000> / 2 or
>>> <10TB, 3000> / 230?*
>>>
>>> We find that after preemption queue B gets only <10TB, 3000> / 230, because
>>> the first job is holding the resource. If the first job releases the
>>> resource, the second queue gets <10TB, 3000> / 2 based on higher priority
>>> and reservation.
>>>
>>> The question is: how much does preemption try to take back from queue A if
>>> it holds the entire resource without releasing it? I cannot share the
>>> actual configuration, but the answer to this question will help us.
>>>
>>>
>>> Thanks,
>>> Prabhu Joseph
>>>
>>>
>>>
>>>
>>> On Wed, Feb 24, 2016 at 10:03 PM, Hamel Kothari 
>>> wrote:
>>>
 If all queues are identical, this behavior should not be happening.
 Preemption as designed in fair scheduler (IIRC) takes place based on the
 instantaneous fair share, not the steady state fair share. The fair
 scheduler docs aren't super helpful on this, but the Monitoring section does
 say that preemption won't take place if you're below your instantaneous fair
 share (which might imply that it would occur if you were over your inst.
 fair share and someone had requested resources). The code for
 FairScheduler.resToPreempt also seems to use getFairShare rather than
 getSteadyFairShare() for preemption, which would imply that it uses the
 instantaneous fair share rather than the steady state one.

 Could you share your YARN site/fair-scheduler and Spark configurations?
 Could you also share the YARN Scheduler UI (specifically the top of the
 RM which shows how many resources are in use)?

 Since it's not likely due to steady state fair share, some other
 possible reasons why this might be taking place (this is not remotely
 conclusive but with no information this is what comes to mind):
 - You're not reaching
 yarn.scheduler.fair.preemption.cluster-utilization-threshold. Perhaps
 due to core/memory ratio inconsistency with the cluster.
 - Your second job doesn't have a sufficient level of parallelism to
 request more executors than it is receiving (perhaps there are fewer
 than 13 tasks at any point in time) and you don't have
 spark.dynamicAllocation.minExecutors set?

 -Hamel

 On Tue, Feb 23, 2016 at 8:20 PM Prabhu Joseph <
 prabhujose.ga...@gmail.com> wrote:

> Hi All,
>
>  A YARN cluster with 352 Nodes (10TB, 3000 cores) runs the Fair
> Scheduler with a root queue having 230 queues.
>
> Each Queue is configured with maxResources equal to the Total Cluster
> Resource. When a Spark job is submitted into a queue A, it is given
> 10TB, 3000 cores according to the instantaneous Fair Share, and it holds
> the entire resource without releasing it. After some time, when another job
> is submitted into another queue B, it will get the Fair Share of 45GB 

Re: [build system] additional jenkins downtime next thursday

2016-02-24 Thread shane knapp
the security update has been released, and it's a doozy!

https://wiki.jenkins-ci.org/display/SECURITY/Security+Advisory+2016-02-24

i will be putting jenkins into quiet mode ~7am PST tomorrow morning
for the upgrade, and expect to be back up and building by 9am PST at
the latest.

amp-jenkins-worker-08 will also be getting a reboot to test out a fix for:
https://github.com/apache/spark/pull/9893

shane

On Wed, Feb 17, 2016 at 10:47 AM, shane knapp  wrote:
> the security release has been delayed until next wednesday morning,
> and i'll be doing the upgrade first thing thursday morning.
>
> i'll update everyone when i get more information.
>
> thanks!
>
> shane
>
> On Thu, Feb 11, 2016 at 10:19 AM, shane knapp  wrote:
>> there's a big security patch coming out next week, and i'd like to
>> upgrade our jenkins installation so that we're covered.  it'll be
>> around 8am, again, and i'll send out more details about the upgrade
>> when i get them.
>>
>> thanks!
>>
>> shane




Re: Spark 1.6.1

2016-02-24 Thread Yin Yang
Have you tried using scp?

scp file i...@people.apache.org

Thanks

On Wed, Feb 24, 2016 at 5:04 PM, Michael Armbrust 
wrote:

> Unfortunately I don't think that's sufficient, as they don't seem to support
> sftp in the same way they did before.  We'll still need to update our
> release scripts.
>
> On Wed, Feb 24, 2016 at 2:09 AM, Yin Yang  wrote:
>
>> Looks like access to people.apache.org has been restored.
>>
>> FYI
>>
>> On Mon, Feb 22, 2016 at 10:07 PM, Luciano Resende 
>>  wrote:
>>
>>>
>>>
>>> On Mon, Feb 22, 2016 at 9:08 PM, Michael Armbrust <
>>> mich...@databricks.com> wrote:
>>>
 An update: people.apache.org has been shut down so the release scripts
 are broken. Will try again after we fix them.


>>> If you skip uploading to people.a.o, it should still be available in
>>> nexus for review.
>>>
>>> The other option is to add the RC into
>>> https://dist.apache.org/repos/dist/dev/
>>>
>>>
>>>
>>> --
>>> Luciano Resende
>>> http://people.apache.org/~lresende
>>> http://twitter.com/lresende1975
>>> http://lresende.blogspot.com/
>>>
>>>
>


Re: Spark 1.6.1

2016-02-24 Thread Michael Armbrust
Unfortunately I don't think that's sufficient, as they don't seem to support
sftp in the same way they did before.  We'll still need to update our
release scripts.

On Wed, Feb 24, 2016 at 2:09 AM, Yin Yang  wrote:

> Looks like access to people.apache.org has been restored.
>
> FYI
>
> On Mon, Feb 22, 2016 at 10:07 PM, Luciano Resende 
>  wrote:
>
>>
>>
>> On Mon, Feb 22, 2016 at 9:08 PM, Michael Armbrust wrote:
>>
>>> An update: people.apache.org has been shut down so the release scripts
>>> are broken. Will try again after we fix them.
>>>
>>>
>> If you skip uploading to people.a.o, it should still be available in
>> nexus for review.
>>
>> The other option is to add the RC into
>> https://dist.apache.org/repos/dist/dev/
>>
>>
>>
>> --
>> Luciano Resende
>> http://people.apache.org/~lresende
>> http://twitter.com/lresende1975
>> http://lresende.blogspot.com/
>>
>>


Spark HANA jdbc connection issue

2016-02-24 Thread Dushyant Rajput
Hi,

Will this be resolved in any forthcoming release?

https://issues.apache.org/jira/browse/SPARK-10625

Rgds,
Dushyant.


Re: Spark Job on YARN Hogging the entire Cluster resource

2016-02-24 Thread Prabhu Joseph
You are right, Hamel. It should get 10 TB / 2. In hadoop-2.7.0 it works
fine, but in hadoop-2.5.1 it gets only 10TB/230, with the same
configuration used in both versions.
So I think a JIRA must have fixed the issue after hadoop-2.5.1.

On Thu, Feb 25, 2016 at 1:28 AM, Hamel Kothari 
wrote:

> The instantaneous fair share is what Queue B should get according to the
> code (and my experience). Assuming your queues are all equal it would be
> 10TB/2.
>
> I can't help much more unless I can see your config files and ideally also
> the YARN Scheduler UI to get an idea of what your queues/actual resource
> usage is like. Logs from each of your Spark applications would also be
> useful. Basically the more info the better.
>
> On Wed, Feb 24, 2016 at 2:52 PM Prabhu Joseph 
> wrote:
>
>> Hi Hamel,
>>
>> Thanks for looking into the issue. What I don't understand is: after
>> preemption, what share does the second queue get if the first queue holds
>> the entire cluster resource without releasing it, the instantaneous fair
>> share or the steady-state fair share?
>>
>>  There are queues A and B (230 queues in total), and the total cluster
>> resource is 10TB, 3000 cores. If a job is submitted into queue A, it gets
>> 10TB, 3000 cores and does not release any resource. Now if a second job is
>> submitted into queue B, preemption will definitely happen, but what share
>> does queue B get after preemption? *Is it <10TB, 3000> / 2 or
>> <10TB, 3000> / 230?*
>>
>> We find that after preemption queue B gets only <10TB, 3000> / 230, because
>> the first job is holding the resource. If the first job releases the
>> resource, the second queue gets <10TB, 3000> / 2 based on higher priority
>> and reservation.
>>
>> The question is: how much does preemption try to take back from queue A if
>> it holds the entire resource without releasing it? I cannot share the
>> actual configuration, but the answer to this question will help us.
>>
>>
>> Thanks,
>> Prabhu Joseph
>>
>>
>>
>>
>> On Wed, Feb 24, 2016 at 10:03 PM, Hamel Kothari 
>> wrote:
>>
>>> If all queues are identical, this behavior should not be happening.
>>> Preemption as designed in fair scheduler (IIRC) takes place based on the
>>> instantaneous fair share, not the steady state fair share. The fair
>>> scheduler docs aren't super helpful on this, but the Monitoring section does
>>> say that preemption won't take place if you're below your instantaneous fair
>>> share (which might imply that it would occur if you were over your inst.
>>> fair share and someone had requested resources). The code for
>>> FairScheduler.resToPreempt also seems to use getFairShare rather than
>>> getSteadyFairShare() for preemption, which would imply that it uses the
>>> instantaneous fair share rather than the steady state one.
>>>
>>> Could you share your YARN site/fair-scheduler and Spark configurations?
>>> Could you also share the YARN Scheduler UI (specifically the top of the
>>> RM which shows how many resources are in use)?
>>>
>>> Since it's not likely due to steady state fair share, some other
>>> possible reasons why this might be taking place (this is not remotely
>>> conclusive but with no information this is what comes to mind):
>>> - You're not reaching
>>> yarn.scheduler.fair.preemption.cluster-utilization-threshold. Perhaps
>>> due to core/memory ratio inconsistency with the cluster.
>>> - Your second job doesn't have a sufficient level of parallelism to
>>> request more executors than it is receiving (perhaps there are fewer
>>> than 13 tasks at any point in time) and you don't have
>>> spark.dynamicAllocation.minExecutors set?
>>>
>>> -Hamel
>>>
>>> On Tue, Feb 23, 2016 at 8:20 PM Prabhu Joseph <
>>> prabhujose.ga...@gmail.com> wrote:
>>>
 Hi All,

  A YARN cluster with 352 Nodes (10TB, 3000 cores) runs the Fair Scheduler
 with a root queue having 230 queues.

 Each Queue is configured with maxResources equal to the Total Cluster
 Resource. When a Spark job is submitted into a queue A, it is given
 10TB, 3000 cores according to the instantaneous Fair Share, and it holds
 the entire resource without releasing it. After some time, when another job
 is submitted into another queue B, it will get the Fair Share of 45GB and 13
 cores, i.e. (10TB, 3000 cores)/230, using Preemption. Now if some more jobs
 are submitted into queue B, all the jobs in B have to share the 45GB and 13
 cores, whereas the job in queue A holds the entire 

how about a custom coalesce() policy?

2016-02-24 Thread Nezih Yigitbasi
Hi Spark devs,

I have sent an email about my problem some time ago where I want to merge a
large number of small files with Spark. Currently I am using Hive with the
CombineHiveInputFormat and I can control the size of the output files with
the max split size parameter (which is used for coalescing the input splits
by the CombineHiveInputFormat). My first attempt was to use coalesce(), but
since coalesce() only considers the target number of partitions, the output
file sizes varied wildly.

What I think could be useful is an optional PartitionCoalescer parameter (a
new interface) in the coalesce() method (or maybe we can add a new method?)
that callers can implement for custom coalescing strategies. For my use case
I have already implemented a SizeBasedPartitionCoalescer that coalesces
partitions by looking at their sizes and using a max split size parameter,
similar to the CombineHiveInputFormat (I also had to expose HadoopRDD to get
access to the individual split sizes, etc.).
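
For concreteness, a rough Scala sketch of the kind of pluggable hook being
proposed; the trait, its signature, and the size-based packing below are
hypothetical illustrations, not an existing Spark API or the exact
implementation referred to above:

import scala.collection.mutable.ArrayBuffer
import org.apache.spark.Partition
import org.apache.spark.rdd.RDD

// Hypothetical callback: decide how parent partitions are grouped into
// coalesced partitions (each inner array becomes one output partition).
trait PartitionCoalescer extends Serializable {
  def coalesce(maxPartitions: Int, parent: RDD[_]): Array[Array[Partition]]
}

// Illustrative size-based strategy: pack parent partitions greedily until a
// group reaches roughly maxSplitBytes, similar in spirit to CombineHiveInputFormat.
// partitionSize is a placeholder for however the caller measures a partition.
class SizeBasedPartitionCoalescer(maxSplitBytes: Long, partitionSize: Partition => Long)
  extends PartitionCoalescer {

  override def coalesce(maxPartitions: Int, parent: RDD[_]): Array[Array[Partition]] = {
    val groups = ArrayBuffer[Array[Partition]]()
    val current = ArrayBuffer[Partition]()
    var currentBytes = 0L
    for (p <- parent.partitions) {
      if (current.nonEmpty && currentBytes + partitionSize(p) > maxSplitBytes) {
        groups += current.toArray
        current.clear()
        currentBytes = 0L
      }
      current += p
      currentBytes += partitionSize(p)
    }
    if (current.nonEmpty) groups += current.toArray
    groups.toArray
  }
}

// The call site might then look something like this (hypothetical overload):
//   rdd.coalesce(numPartitions, shuffle = false,
//                coalescer = Some(new SizeBasedPartitionCoalescer(maxBytes, sizeOf)))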

What do you guys think about such a change, can it be useful to other users
as well? Or do you think that there is an easier way to accomplish the same
merge logic? If you think it may be useful, I already have an
implementation and I will be happy to work with the community to contribute
it.

Thanks,
Nezih


Spark Summit (San Francisco, June 6-8) call for presentations due in less than a week

2016-02-24 Thread Reynold Xin
Just want to send a reminder in case people don't know about it. If you are
working on (or with, using) Spark, consider submitting your work to Spark
Summit, coming up in June in San Francisco.

https://spark-summit.org/2016/call-for-presentations/

Cheers.


Re: ORC file writing hangs in pyspark

2016-02-24 Thread James Barney
Thank you for the suggestions. We looked at the live Spark UI and YARN app
logs and found what we think is the issue: in Spark 1.5.2, the FPGrowth
algorithm doesn't require you to specify the number of partitions for your
input data. Without specifying it, however, FPGrowth puts all of its data
into one partition. As a result, only one executor is responsible for writing
the ORC file from the resulting DataFrame that FPGrowth produces. That's what
was causing the job to hang.

After specifying the number of partitions in FPGrowth, the write step
proceeds and finishes quickly.
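
For anyone hitting the same thing, a minimal sketch of the fix in the Scala
MLlib API (PySpark's FPGrowth.train accepts a numPartitions argument as
well); the support threshold, partition count, and output path are
placeholders:

import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.hive.HiveContext

// transactions: one Array[String] of items per basket, built from the Hive query results
def mineAndWrite(transactions: RDD[Array[String]], hiveContext: HiveContext): Unit = {
  import hiveContext.implicits._

  val model = new FPGrowth()
    .setMinSupport(0.01)     // placeholder support threshold
    .setNumPartitions(200)   // spread the frequent-itemset RDD across many partitions
    .run(transactions)

  // With the result partitioned, more than one executor participates in the ORC write.
  model.freqItemsets
    .map(fi => (fi.items.mkString(","), fi.freq))
    .toDF("items", "freq")
    .write.orc("/data/staged/raw_result")   // placeholder path
}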

Thank you again for the suggestions

On Tue, Feb 23, 2016 at 9:28 PM, Zhan Zhang  wrote:

> Hi James,
>
> You can try writing with another format, e.g. Parquet, to see whether it is
> an ORC-specific issue or a more general one.
>
> Thanks.
>
> Zhan Zhang
>
> On Feb 23, 2016, at 6:05 AM, James Barney  wrote:
>
> I'm trying to write an ORC file after running the FPGrowth algorithm on a
> dataset of around just 2GB in size. The algorithm performs well and can
> display results if I take(n) the freqItemSets() of the result after
> converting that to a DF.
>
> I'm using Spark 1.5.2 on HDP 2.3.4 and Python 3.4.2 on Yarn.
>
> I get the results from querying a Hive table, also ORC format, running a
> number of maps, joins, and filters on the data.
>
> When the program attempts to write the files:
> result.write.orc('/data/staged/raw_result')
> size_1_buckets.write.orc('/data/staged/size_1_results')
> filter_size_2_buckets.write.orc('/data/staged/size_2_results')
>
> The first path, /data/staged/raw_result, is created with a _temporary
> folder, but the data is never written. The job hangs at this point,
> apparently indefinitely.
>
> Additionally, no logs are recorded or available for the jobs on the
> history server.
>
> What could be the problem?
>
>
>


Re: Spark Job on YARN Hogging the entire Cluster resource

2016-02-24 Thread Hamel Kothari
The instantaneous fair share is what Queue B should get according to the
code (and my experience). Assuming your queues are all equal it would be
10TB/2.

I can't help much more unless I can see your config files and ideally also
the YARN Scheduler UI to get an idea of what your queues/actual resource
usage is like. Logs from each of your Spark applications would also be
useful. Basically the more info the better.

On Wed, Feb 24, 2016 at 2:52 PM Prabhu Joseph 
wrote:

> Hi Hamel,
>
> Thanks for looking into the issue. What I don't understand is: after
> preemption, what share does the second queue get if the first queue holds
> the entire cluster resource without releasing it, the instantaneous fair
> share or the steady-state fair share?
>
>  There are queues A and B (230 queues in total), and the total cluster
> resource is 10TB, 3000 cores. If a job is submitted into queue A, it gets
> 10TB, 3000 cores and does not release any resource. Now if a second job is
> submitted into queue B, preemption will definitely happen, but what share
> does queue B get after preemption? *Is it <10TB, 3000> / 2 or
> <10TB, 3000> / 230?*
>
> We find that after preemption queue B gets only <10TB, 3000> / 230, because
> the first job is holding the resource. If the first job releases the
> resource, the second queue gets <10TB, 3000> / 2 based on higher priority
> and reservation.
>
> The question is: how much does preemption try to take back from queue A if
> it holds the entire resource without releasing it? I cannot share the
> actual configuration, but the answer to this question will help us.
>
>
> Thanks,
> Prabhu Joseph
>
>
>
>
> On Wed, Feb 24, 2016 at 10:03 PM, Hamel Kothari 
> wrote:
>
>> If all queues are identical, this behavior should not be happening.
>> Preemption as designed in fair scheduler (IIRC) takes place based on the
>> instantaneous fair share, not the steady state fair share. The fair
>> scheduler docs aren't super helpful on this, but the Monitoring section does
>> say that preemption won't take place if you're below your instantaneous fair
>> share (which might imply that it would occur if you were over your inst.
>> fair share and someone had requested resources). The code for
>> FairScheduler.resToPreempt also seems to use getFairShare rather than
>> getSteadyFairShare() for preemption, which would imply that it uses the
>> instantaneous fair share rather than the steady state one.
>>
>> Could you share your YARN site/fair-scheduler and Spark configurations?
>> Could you also share the YARN Scheduler UI (specifically the top of the
>> RM which shows how many resources are in use)?
>>
>> Since it's not likely due to steady state fair share, some other possible
>> reasons why this might be taking place (this is not remotely conclusive but
>> with no information this is what comes to mind):
>> - You're not reaching
>> yarn.scheduler.fair.preemption.cluster-utilization-threshold. Perhaps
>> due to core/memory ratio inconsistency with the cluster.
>> - Your second job doesn't have a sufficient level of parallelism to
>> request more executors than it is receiving (perhaps there are fewer
>> than 13 tasks at any point in time) and you don't have
>> spark.dynamicAllocation.minExecutors set?
>>
>> -Hamel
>>
>> On Tue, Feb 23, 2016 at 8:20 PM Prabhu Joseph 
>> wrote:
>>
>>> Hi All,
>>>
>>>  A YARN cluster with 352 Nodes (10TB, 3000 cores) runs the Fair Scheduler
>>> with a root queue having 230 queues.
>>>
>>> Each Queue is configured with maxResources equal to the Total Cluster
>>> Resource. When a Spark job is submitted into a queue A, it is given
>>> 10TB, 3000 cores according to the instantaneous Fair Share, and it holds
>>> the entire resource without releasing it. After some time, when another job
>>> is submitted into another queue B, it will get the Fair Share of 45GB and 13
>>> cores, i.e. (10TB, 3000 cores)/230, using Preemption. Now if some more jobs
>>> are submitted into queue B, all the jobs in B have to share the 45GB and 13
>>> cores, whereas the job in queue A holds the entire cluster resource,
>>> affecting the other jobs.
>>>  This kind of issue often happens when a Spark job is submitted first and
>>> holds the entire cluster resource. What is the best way to fix this issue?
>>> Can we make preemption happen based on the instantaneous fair share instead
>>> of the steady-state fair share, and will that help?
>>>
>>> Note:
>>>
>>> 1. We do not want to give weight to a particular queue, because all the
>>> 240 queues are 

Re: Spark Job on YARN Hogging the entire Cluster resource

2016-02-24 Thread Prabhu Joseph
Hi Hamel,

Thanks for looking into the issue. What I don't understand is: after
preemption, what share does the second queue get if the first queue holds
the entire cluster resource without releasing it, the instantaneous fair
share or the steady-state fair share?

 There are queues A and B (230 queues in total), and the total cluster
resource is 10TB, 3000 cores. If a job is submitted into queue A, it gets
10TB, 3000 cores and does not release any resource. Now if a second job is
submitted into queue B, preemption will definitely happen, but what share
does queue B get after preemption? *Is it <10TB, 3000> / 2 or
<10TB, 3000> / 230?*

We find that after preemption queue B gets only <10TB, 3000> / 230, because
the first job is holding the resource. If the first job releases the
resource, the second queue gets <10TB, 3000> / 2 based on higher priority
and reservation.
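
To make the arithmetic concrete, an illustrative calculation (plain Scala,
not YARN code), using the numbers above:

// Steady-state fair share divides the cluster across all configured queues;
// instantaneous fair share divides it across only the queues that currently have apps.
val totalMemGB   = 10 * 1024   // 10TB
val totalCores   = 3000
val allQueues    = 230         // queues configured under root
val activeQueues = 2           // queues A and B with running or pending apps

val steadyShare = (totalMemGB / allQueues, totalCores / allQueues)         // ~(45GB, 13 cores)
val instShare   = (totalMemGB / activeQueues, totalCores / activeQueues)   // (5TB, 1500 cores)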

The question is: how much does preemption try to take back from queue A if
it holds the entire resource without releasing it? I cannot share the
actual configuration, but the answer to this question will help us.


Thanks,
Prabhu Joseph




On Wed, Feb 24, 2016 at 10:03 PM, Hamel Kothari 
wrote:

> If all queues are identical, this behavior should not be happening.
> Preemption as designed in fair scheduler (IIRC) takes place based on the
> instantaneous fair share, not the steady state fair share. The fair
> scheduler docs aren't super helpful on this, but the Monitoring section does
> say that preemption won't take place if you're below your instantaneous fair
> share (which might imply that it would occur if you were over your inst.
> fair share and someone had requested resources). The code for
> FairScheduler.resToPreempt also seems to use getFairShare rather than
> getSteadyFairShare() for preemption, which would imply that it uses the
> instantaneous fair share rather than the steady state one.
>
> Could you share your YARN site/fair-scheduler and Spark configurations?
> Could you also share the YARN Scheduler UI (specifically the top of the
> RM which shows how many resources are in use)?
>
> Since it's not likely due to steady state fair share, some other possible
> reasons why this might be taking place (this is not remotely conclusive but
> with no information this is what comes to mind):
> - You're not reaching
> yarn.scheduler.fair.preemption.cluster-utilization-threshold. Perhaps due
> to core/memory ratio inconsistency with the cluster.
> - Your second job doesn't have a sufficient level of parallelism to
> request more executors than it is receiving (perhaps there are fewer
> than 13 tasks at any point in time) and you don't have
> spark.dynamicAllocation.minExecutors set?
>
> -Hamel
>
> On Tue, Feb 23, 2016 at 8:20 PM Prabhu Joseph 
> wrote:
>
>> Hi All,
>>
>>  A YARN cluster with 352 Nodes (10TB, 3000 cores) runs the Fair Scheduler
>> with a root queue having 230 queues.
>>
>> Each Queue is configured with maxResources equal to the Total Cluster
>> Resource. When a Spark job is submitted into a queue A, it is given
>> 10TB, 3000 cores according to the instantaneous Fair Share, and it holds
>> the entire resource without releasing it. After some time, when another job
>> is submitted into another queue B, it will get the Fair Share of 45GB and 13
>> cores, i.e. (10TB, 3000 cores)/230, using Preemption. Now if some more jobs
>> are submitted into queue B, all the jobs in B have to share the 45GB and 13
>> cores, whereas the job in queue A holds the entire cluster resource,
>> affecting the other jobs.
>>  This kind of issue often happens when a Spark job is submitted first and
>> holds the entire cluster resource. What is the best way to fix this issue?
>> Can we make preemption happen based on the instantaneous fair share instead
>> of the steady-state fair share, and will that help?
>>
>> Note:
>>
>> 1. We do not want to give weight to a particular queue, because all the
>> 240 queues are critical.
>> 2. Changing the queues into a nested hierarchy does not solve the issue.
>> 3. Adding maxResources to each queue would keep the first job from taking
>> the entire cluster resource, but configuring an optimal maxResources for
>> 230 queues is difficult, and the first job then cannot use the entire
>> cluster resource when the cluster is idle.
>> 4. We do not want to handle this in the Spark ApplicationMaster, since we
>> would then need to do the same for every other YARN application type with
>> similar behavior. We want YARN to control this behavior by killing
>> resources held by the first job for a long period.
>>
>>
>> Thanks,

Re: Build fails

2016-02-24 Thread Marcelo Vanzin
The error is right there. Just read the output more carefully.

On Wed, Feb 24, 2016 at 11:37 AM, Minudika Malshan
 wrote:
> [INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-versions) @
> spark-parent_2.11 ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion
> failed with message:
> Detected Maven Version: 3.3.3 is not in the allowed range 3.3.9.




Re: Build fails

2016-02-24 Thread Minudika Malshan
Here is the full output.
@Yin: yeah, it seems to be a problem with the Maven version. I am going to
update Maven.
@Marcelo: yes, I couldn't decide what was wrong at first :)

Thanks for your help!


[INFO] Scanning for projects...
[INFO]

[INFO] Reactor Build Order:
[INFO]
[INFO] Spark Project Parent POM
[INFO] Spark Project Test Tags
[INFO] Spark Project Sketch
[INFO] Spark Project Launcher
[INFO] Spark Project Networking
[INFO] Spark Project Shuffle Streaming Service
[INFO] Spark Project Unsafe
[INFO] Spark Project Core
[INFO] Spark Project GraphX
[INFO] Spark Project Streaming
[INFO] Spark Project Catalyst
[INFO] Spark Project SQL
[INFO] Spark Project ML Library
[INFO] Spark Project Tools
[INFO] Spark Project Hive
[INFO] Spark Project Docker Integration Tests
[INFO] Spark Project REPL
[INFO] Spark Project Assembly
[INFO] Spark Project External Twitter
[INFO] Spark Project External Flume Sink
[INFO] Spark Project External Flume
[INFO] Spark Project External Flume Assembly
[INFO] Spark Project External Akka
[INFO] Spark Project External MQTT
[INFO] Spark Project External MQTT Assembly
[INFO] Spark Project External ZeroMQ
[INFO] Spark Project External Kafka
[INFO] Spark Project Examples
[INFO] Spark Project External Kafka Assembly
[INFO]

[INFO]

[INFO] Building Spark Project Parent POM 2.0.0-SNAPSHOT
[INFO]

[INFO]
[INFO] --- maven-clean-plugin:3.0.0:clean (default-clean) @
spark-parent_2.11 ---
[INFO]
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-versions) @
spark-parent_2.11 ---
[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion
failed with message:
Detected Maven Version: 3.3.3 is not in the allowed range 3.3.9.
[INFO]

[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM ... FAILURE [  0.811 s]
[INFO] Spark Project Test Tags  SKIPPED
[INFO] Spark Project Sketch ... SKIPPED
[INFO] Spark Project Launcher . SKIPPED
[INFO] Spark Project Networking ... SKIPPED
[INFO] Spark Project Shuffle Streaming Service  SKIPPED
[INFO] Spark Project Unsafe ... SKIPPED
[INFO] Spark Project Core . SKIPPED
[INFO] Spark Project GraphX ... SKIPPED
[INFO] Spark Project Streaming  SKIPPED
[INFO] Spark Project Catalyst . SKIPPED
[INFO] Spark Project SQL .. SKIPPED
[INFO] Spark Project ML Library ... SKIPPED
[INFO] Spark Project Tools  SKIPPED
[INFO] Spark Project Hive . SKIPPED
[INFO] Spark Project Docker Integration Tests . SKIPPED
[INFO] Spark Project REPL . SKIPPED
[INFO] Spark Project Assembly . SKIPPED
[INFO] Spark Project External Twitter . SKIPPED
[INFO] Spark Project External Flume Sink .. SKIPPED
[INFO] Spark Project External Flume ... SKIPPED
[INFO] Spark Project External Flume Assembly .. SKIPPED
[INFO] Spark Project External Akka  SKIPPED
[INFO] Spark Project External MQTT  SKIPPED
[INFO] Spark Project External MQTT Assembly ... SKIPPED
[INFO] Spark Project External ZeroMQ .. SKIPPED
[INFO] Spark Project External Kafka ... SKIPPED
[INFO] Spark Project Examples . SKIPPED
[INFO] Spark Project External Kafka Assembly .. SKIPPED
[INFO]

[INFO] BUILD FAILURE
[INFO]

[INFO] Total time: 1.721 s
[INFO] Finished at: 2016-02-25T01:03:12+05:30
[INFO] Final Memory: 28M/217M
[INFO]

[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce
(enforce-versions) on project spark-parent_2.11: Some Enforcer rules have
failed. Look above for specific messages explaining why the rule failed. ->
[Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException


Minudika Malshan

Re: Build fails

2016-02-24 Thread Marcelo Vanzin
Well, did you do what the message instructed you to do and look
above the message you copied for the more specific messages explaining why
the build failed?

On Wed, Feb 24, 2016 at 11:28 AM, Minudika Malshan
 wrote:
> Hi,
>
> I am trying to build from spark source code which was cloned from
> https://github.com/apache/spark.git.
> But it fails with following error.
>
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce
> (enforce-versions) on project spark-parent_2.11: Some Enforcer rules have
> failed. Look above for specific messages explaining why the rule failed. ->
> [Help 1]
>
> Please help me to get it fixed.
>
> Thanks and regards..
> Minudika
>
>
> Minudika Malshan
> Undergraduate
> Department of Computer Science and Engineering
> University of Moratuwa.
>
>



-- 
Marcelo




Build fails

2016-02-24 Thread Minudika Malshan
Hi,

I am trying to build from spark source code which was cloned from
https://github.com/apache/spark.git.
But it fails with following error.

[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce
(enforce-versions) on project spark-parent_2.11: Some Enforcer rules have
failed. Look above for specific messages explaining why the rule failed. ->
[Help 1]

Please help me to get it fixed.

Thanks and regards..
Minudika


Minudika Malshan
Undergraduate
Department of Computer Science and Engineering
University of Moratuwa.


Re: Spark Job on YARN Hogging the entire Cluster resource

2016-02-24 Thread Hamel Kothari
If all queues are identical, this behavior should not be happening.
Preemption as designed in fair scheduler (IIRC) takes place based on the
instantaneous fair share, not the steady state fair share. The fair
scheduler docs aren't super helpful on this, but the Monitoring section does
say that preemption won't take place if you're below your instantaneous fair
share (which might imply that it would occur if you were over your inst.
fair share and someone had requested resources). The code for
FairScheduler.resToPreempt also seems to use getFairShare rather than
getSteadyFairShare() for preemption, which would imply that it uses the
instantaneous fair share rather than the steady state one.

Could you share your YARN site/fair-scheduler and Spark configurations?
Could you also share the YARN Scheduler UI (specifically the top of the
RM which shows how many resources are in use)?

Since it's not likely due to steady state fair share, some other possible
reasons why this might be taking place (this is not remotely conclusive but
with no information this is what comes to mind):
- You're not reaching
yarn.scheduler.fair.preemption.cluster-utilization-threshold. Perhaps due
to core/memory ratio inconsistency with the cluster.
- Your second job doesn't have a sufficient level of parallelism to request
more executors than it is receiving (perhaps there are fewer than 13
tasks at any point in time) and you don't have
spark.dynamicAllocation.minExecutors set? (See the sketch below.)
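
As a concrete illustration of that last point, a minimal sketch of pinning an
executor floor under dynamic allocation; the property names are standard
Spark settings, the values are placeholders, not recommendations:

import org.apache.spark.SparkConf

// With dynamic allocation, Spark requests executors based on pending tasks, so a job
// with little parallelism may ask for very few. minExecutors forces a floor anyway.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")         // external shuffle service is required on YARN
  .set("spark.dynamicAllocation.minExecutors", "20")    // placeholder floor
  .set("spark.dynamicAllocation.maxExecutors", "200")   // placeholder ceiling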

-Hamel

On Tue, Feb 23, 2016 at 8:20 PM Prabhu Joseph 
wrote:

> Hi All,
>
>  A YARN cluster with 352 Nodes (10TB, 3000 cores) runs the Fair Scheduler
> with a root queue having 230 queues.
>
> Each Queue is configured with maxResources equal to the Total Cluster
> Resource. When a Spark job is submitted into a queue A, it is given
> 10TB, 3000 cores according to the instantaneous Fair Share, and it holds
> the entire resource without releasing it. After some time, when another job
> is submitted into another queue B, it will get the Fair Share of 45GB and 13
> cores, i.e. (10TB, 3000 cores)/230, using Preemption. Now if some more jobs
> are submitted into queue B, all the jobs in B have to share the 45GB and 13
> cores, whereas the job in queue A holds the entire cluster resource,
> affecting the other jobs.
>  This kind of issue often happens when a Spark job is submitted first and
> holds the entire cluster resource. What is the best way to fix this issue?
> Can we make preemption happen based on the instantaneous fair share instead
> of the steady-state fair share, and will that help?
>
> Note:
>
> 1. We do not want to give weight to a particular queue, because all the
> 240 queues are critical.
> 2. Changing the queues into a nested hierarchy does not solve the issue.
> 3. Adding maxResources to each queue would keep the first job from taking
> the entire cluster resource, but configuring an optimal maxResources for
> 230 queues is difficult, and the first job then cannot use the entire
> cluster resource when the cluster is idle.
> 4. We do not want to handle this in the Spark ApplicationMaster, since we
> would then need to do the same for every other YARN application type with
> similar behavior. We want YARN to control this behavior by killing
> resources held by the first job for a long period.
>
>
> Thanks,
> Prabhu Joseph
>
>


Re: Spark 1.6.1

2016-02-24 Thread Yin Yang
Looks like access to people.apache.org has been restored.

FYI

On Mon, Feb 22, 2016 at 10:07 PM, Luciano Resende 
 wrote:

>
>
> On Mon, Feb 22, 2016 at 9:08 PM, Michael Armbrust 
>  wrote:
>
>> An update: people.apache.org has been shut down so the release scripts
>> are broken. Will try again after we fix them.
>>
>>
> If you skip uploading to people.a.o, it should still be available in nexus
> for review.
>
> The other option is to add the RC into
> https://dist.apache.org/repos/dist/dev/
>
>
>
> --
> Luciano Resende
> http://people.apache.org/~lresende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>
>