Re: HashedRelation Memory Pressure on Broadcast Joins

2016-03-03 Thread Rishi Mishra
Hi Davies, when you say "UnsafeRow could come from UnsafeProjection, so we should copy the rows for safety," do you mean that the underlying state might change because of some state-update APIs, or is it due to some other rationale? Regards, Rishitesh Mishra, SnappyData.
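[For the archive reader, a minimal Scala sketch of the behavior Davies is describing, assuming the Catalyst internals of this era: UnsafeProjection reuses one backing buffer across calls, so a returned UnsafeRow is only valid until the next apply().]

    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
    import org.apache.spark.sql.types.{LongType, StructField, StructType}

    val schema = StructType(Seq(StructField("id", LongType)))
    val proj = UnsafeProjection.create(schema)

    // Anything that buffers projected rows (e.g. while building a
    // HashedRelation) must copy() them; otherwise every buffered slot
    // ends up pointing at the same mutated buffer.
    def bufferAll(input: Iterator[InternalRow]): Array[InternalRow] =
      input.map(row => proj(row).copy()).toArray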

Re: getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Cody Koeninger
Thanks. That looks pretty similar to what I'm doing, the difference being getPeers vs. getMemoryStatus. It seems they're both backed by the same blockManagerInfo, but getPeers filters in a way that looks close to what I need. Is there a reason to prefer getMemoryStatus?
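[For the archive, a hedged sketch of the two options being compared; both BlockManagerMaster methods are private[spark] internals in this era, so this is illustrative rather than public API.]

    import org.apache.spark.SparkEnv
    import org.apache.spark.storage.BlockManagerId

    val bm = SparkEnv.get.blockManager

    // Peers of this block manager: the result excludes the caller
    // itself and the driver's block manager.
    val viaPeers: Seq[BlockManagerId] = bm.master.getPeers(bm.blockManagerId)

    // Memory status of every registered block manager: the key set
    // includes the driver, so it typically needs an extra filter.
    val viaMemoryStatus: Seq[BlockManagerId] =
      bm.master.getMemoryStatus.keys.toSeq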

Re: getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Shixiong(Ryan) Zhu
You can take a look at "org.apache.spark.streaming.scheduler.ReceiverTracker#getExecutors".

Re: getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Reynold Xin
What do you mean by consistent? Throughout the life cycle of an app, executors can come and go, so the set really has no consistency. Do you just need it for a specific job?

getting a list of executors for use in getPreferredLocations

2016-03-03 Thread Cody Koeninger
I need getPreferredLocations to choose a consistent executor for a given partition in a stream. In order to do that, I need to know what the current executors are. I'm currently grabbing them from the block manager master's getPeers(), which works, but I don't know if that's the most reasonable approach.
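[A minimal sketch of the pattern being described, with hypothetical names: PinnedRDD and currentExecutors are placeholders, not Spark classes. The idea is to hash the partition index into whatever executor list is obtained, so a partition keeps landing on the same executor while membership is stable.]

    import scala.reflect.ClassTag
    import org.apache.spark.{Partition, SparkContext}
    import org.apache.spark.rdd.RDD

    abstract class PinnedRDD[T: ClassTag](sc: SparkContext)
        extends RDD[T](sc, Nil) {

      // Whatever executor source ends up being used (getPeers,
      // getMemoryStatus, ...), rendered as the host strings the
      // scheduler matches preferred locations against.
      protected def currentExecutors: IndexedSeq[String]

      override protected def getPreferredLocations(split: Partition): Seq[String] = {
        val execs = currentExecutors
        if (execs.isEmpty) Nil
        // Same partition index -> same executor, while the list is stable.
        else Seq(execs(split.index % execs.size))
      }
    }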

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Sean Owen
FWIW, I was running this with OpenJDK 1.8.0_66.

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Tim Preece
Regarding the failure in org.apache.spark.streaming.kafka.DirectKafkaStreamSuite, "offset recovery": we have been seeing the very same problem with the IBM JDK for quite a long time (since at least July 2015). It is intermittent, and we had dismissed it as a test-case problem.

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Sean Owen
@Yin Yang, see https://issues.apache.org/jira/browse/SPARK-12426. Docker has to be running locally for these tests to pass, which I think is a little surprising. However, I still get a Docker error, below. For me, +0 I guess. The signatures and hashes are all fine, but as usual I'm getting test failures.

Re: [Proposal] Enabling time series analysis on spark metrics

2016-03-03 Thread Karan Kumar
Precisely. I found a JIRA in this regard: SPARK-10610. On Wed, Mar 2, 2016 at 3:36 AM, Reynold Xin wrote: > Is the suggestion just to use a different config (and maybe fallback to appid) in order to publish metrics?
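[An illustration with hypothetical metric keys, not Spark's actual sink code, of why keying metrics by the per-run application id fragments time series, which is the problem the proposal and SPARK-10610 are getting at.]

    // The application id changes on every submission, so each run of the
    // same job starts a brand-new series in the metrics backend.
    val appId   = "app-20160303123456-0001" // new on every run
    val appName = "my-streaming-job"        // stable across runs

    val perRunSeries = s"$appId.driver.jvm.heap.used"   // fragmented history
    val stableSeries = s"$appName.driver.jvm.heap.used" // continuous history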

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Yin Yang
When I ran the test suite using the following command:

    build/mvn clean -Phive -Phive-thriftserver -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.0 package

I got a failure in Spark Project Docker Integration Tests:

    16/03/02 17:36:46 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down;

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Tim Preece
I just created the following pull request (against master, but I would like it on 1.6.1) for the isolated-classloader fix (SPARK-13648): https://github.com/apache/spark/pull/11495

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Tim Preece
I have been testing 1.6.1 RC1 using the IBM Java SDK. I noticed a problem (with the org.apache.spark.sql.hive.client.VersionsSuite tests) after a recent Spark 1.6.1 change: https://github.com/apache/spark/commit/f7898f9e2df131fa78200f6034508e74a78c2a44