Re: getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Cody Koeninger
Thanks.  That looks pretty similar to what I'm doing, the difference
being getPeers vs getMemoryStatus.  It seems like they're both backed
by the same blockManagerInfo, but getPeers filters in a way that's
closer to what I need.  Is there a reason to prefer getMemoryStatus?
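
For reference, the two calls look roughly like this side by side (a
sketch against the 1.6-era private[spark] APIs, so it only compiles
from inside an org.apache.spark package; the helper names here are
just illustrative):

import org.apache.spark.SparkEnv
import org.apache.spark.storage.BlockManagerId

// getPeers drops the driver and the requesting block manager on the
// master side, so called from the driver it returns just the executors
def executorsViaGetPeers(): Seq[BlockManagerId] = {
  val bm = SparkEnv.get.blockManager
  bm.master.getPeers(bm.blockManagerId)
}

// getMemoryStatus covers every block manager, driver included,
// so the driver has to be filtered out by hand
def executorsViaMemoryStatus(): Seq[BlockManagerId] = {
  SparkEnv.get.blockManager.master.getMemoryStatus.keys.toSeq
    .filterNot(_.isDriver)
}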

On Thu, Mar 3, 2016 at 5:15 PM, Shixiong(Ryan) Zhu wrote:
> You can take a look at
> "org.apache.spark.streaming.scheduler.ReceiverTracker#getExecutors"
>
>> On Thu, Mar 3, 2016 at 3:10 PM, Reynold Xin wrote:
>>
>> What do you mean by consistent? Throughout the life cycle of an app,
>> executors can come and go, so the set of executors really has no
>> consistency. Do you just need it for a specific job?
>>
>>
>>
>>> On Thu, Mar 3, 2016 at 3:08 PM, Cody Koeninger wrote:
>>>
>>> I need getPreferredLocations to choose a consistent executor for a
>>> given partition in a stream.  In order to do that, I need to know what
>>> the current executors are.
>>>
>>> I'm currently grabbing them from the block manager master's getPeers(),
>>> which works, but I don't know if that's the most reasonable way to do
>>> it.
>>>
>>> Relevant code:
>>>
>>>
>>> https://github.com/koeninger/spark-1/blob/aaef0fc6e7e3aae18e4e03271bc0707d09d243e4/external/kafka-beta/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala#L107
>>>
>>
>




Re: getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Shixiong(Ryan) Zhu
You can take a look at
"org.apache.spark.streaming.scheduler.ReceiverTracker#getExecutors"

On Thu, Mar 3, 2016 at 3:10 PM, Reynold Xin wrote:

> What do you mean by consistent? Throughout the life cycle of an app,
> executors can come and go, so the set of executors really has no
> consistency. Do you just need it for a specific job?
>
>
>
> On Thu, Mar 3, 2016 at 3:08 PM, Cody Koeninger wrote:
>
>> I need getPreferredLocations to choose a consistent executor for a
>> given partition in a stream.  In order to do that, I need to know what
>> the current executors are.
>>
>> I'm currently grabbing them from the block manager master's getPeers(),
>> which works, but I don't know if that's the most reasonable way to do
>> it.
>>
>> Relevant code:
>>
>>
>> https://github.com/koeninger/spark-1/blob/aaef0fc6e7e3aae18e4e03271bc0707d09d243e4/external/kafka-beta/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala#L107
>>
>>
>


Re: getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Reynold Xin
What do you mean by consistent? Throughout the life cycle of an app,
executors can come and go, so the set of executors really has no
consistency. Do you just need it for a specific job?



On Thu, Mar 3, 2016 at 3:08 PM, Cody Koeninger wrote:

> I need getPreferredLocations to choose a consistent executor for a
> given partition in a stream.  In order to do that, I need to know what
> the current executors are.
>
> I'm currently grabbing them from the block manager master's getPeers(),
> which works, but I don't know if that's the most reasonable way to do
> it.
>
> Relevant code:
>
>
> https://github.com/koeninger/spark-1/blob/aaef0fc6e7e3aae18e4e03271bc0707d09d243e4/external/kafka-beta/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala#L107
>
>


getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Cody Koeninger
I need getPreferredLocations to choose a consistent executor for a
given partition in a stream.  In order to do that, I need to know what
the current executors are.

I'm currently grabbing them from the block manager master's getPeers(),
which works, but I don't know if that's the most reasonable way to do
it.

Relevant code:

https://github.com/koeninger/spark-1/blob/aaef0fc6e7e3aae18e4e03271bc0707d09d243e4/external/kafka-beta/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala#L107
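
To make the "consistent executor" idea concrete, one way to do it (a
sketch, not the code at the link above) is to sort whatever executor
list you get back and map each partition index onto that sorted list,
so a given partition keeps landing on the same executor for as long as
the executor set doesn't change:

import org.apache.spark.Partition
import org.apache.spark.scheduler.ExecutorCacheTaskLocation

// The executors argument would come from one of the approaches discussed
// in this thread (getPeers, getMemoryStatus, ...); ExecutorCacheTaskLocation
// is private[spark], so this too only works from an org.apache.spark package.
def preferredLocation(
    part: Partition,
    executors: Seq[ExecutorCacheTaskLocation]): Seq[String] = {
  if (executors.isEmpty) {
    Seq.empty
  } else {
    // sort so every call sees the executors in the same order
    val sorted = executors.sortBy(e => (e.host, e.executorId))
    // the toString form ("executor_<host>_<id>") is what the scheduler
    // parses back into an executor-specific task location
    Seq(sorted(part.index % sorted.size).toString)
  }
}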
