Question about avoiding reflection with isolated class loader

2015-05-28 Thread Guozhang Wang
Hi, I have a question that is probably related to SPARK-1870 https://issues.apache.org/jira/browse/SPARK-1870. Basically I have also encountered an issue with separate class loaders while developing a programming framework where I have to use reflection inside the application code. To
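
A minimal sketch of the pattern in question (all names here are illustrative, not from the thread): reflective lookups like `Class.forName` inside framework code resolve against the defining class loader unless you explicitly route them through the thread's context class loader, which can be swapped to an isolated one.

```scala
import java.net.{URL, URLClassLoader}

object IsolatedReflection {
  // Temporarily install `loader` as the context class loader, restoring the
  // previous one afterwards, so reflection inside `body` sees the isolated loader.
  def withContextLoader[T](loader: ClassLoader)(body: => T): T = {
    val previous = Thread.currentThread().getContextClassLoader
    Thread.currentThread().setContextClassLoader(loader)
    try body
    finally Thread.currentThread().setContextClassLoader(previous)
  }

  def main(args: Array[String]): Unit = {
    // An isolated child loader; in practice the URL array would point at app jars.
    val isolated = new URLClassLoader(Array.empty[URL], getClass.getClassLoader)
    val cls = withContextLoader(isolated) {
      // Resolve via the context loader instead of the caller's defining loader.
      Class.forName("java.lang.String", true,
        Thread.currentThread().getContextClassLoader)
    }
    println(cls.getName)
  }
}
```

The key point is that application code must ask the context loader (or a loader handed to it) rather than calling the one-argument `Class.forName`, which pins resolution to the framework's own loader.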

Re: [VOTE] Release Apache Spark 1.4.0 (RC1)

2015-05-28 Thread Peter Rudenko
Also have the same issue - all tests fail because of HiveContext / derby lock. Cause: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set

Re: flaky tests scaled timeouts

2015-05-28 Thread Reynold Xin
My understanding is that it is good to have this, but a large fraction of flaky tests are actually also race conditions, etc. On Thu, May 28, 2015 at 11:55 AM, Imran Rashid iras...@cloudera.com wrote: Hi, I was just fixing a problem with too short a timeout on one of the unit tests I added

flaky tests scaled timeouts

2015-05-28 Thread Imran Rashid
Hi, I was just fixing a problem with too short a timeout on one of the unit tests I added (https://issues.apache.org/jira/browse/SPARK-7919), and I was wondering if this is a common problem w/ a lot of our flaky tests. It's really hard to know what to set the timeouts to -- you set the timeout so
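
One way to make a whole suite's timeouts tunable at once, sketched here under assumptions (the environment variable name `SPARK_TEST_TIMEOUT_SCALE` is hypothetical, not an actual Spark knob): multiply every base timeout by a machine-wide scale factor, so slow CI hosts can stretch them uniformly.

```scala
import scala.concurrent.duration._

object ScaledTimeouts {
  // Scale factor read once from the environment; defaults to 1.0 (no change).
  private val scale: Double =
    sys.env.get("SPARK_TEST_TIMEOUT_SCALE").map(_.toDouble).getOrElse(1.0)

  // Stretch a base timeout by the machine-wide factor.
  def scaled(base: FiniteDuration): FiniteDuration =
    (base.toMillis * scale).toLong.millis

  def main(args: Array[String]): Unit = {
    // With no scale set, a 10 second base timeout is unchanged.
    println(scaled(10.seconds).toMillis)
  }
}
```

Tests then call `scaled(10.seconds)` instead of hard-coding `10.seconds`, keeping per-test values short on fast machines without flaking on slow ones.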

Re: Representing a recursive data type in Spark SQL

2015-05-28 Thread Jeremy Lucas
Hey Reynold, Thanks for the suggestion. Maybe a better definition of what I mean by a recursive data structure is rather what might resemble (in Scala) the type Map[String, Any]. With a type like this, the keys are well-defined as strings (as this is JSON) but the values can be basically any
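
As a sketch of what `Map[String, Any]` is standing in for here (the type names below are my own, not from the thread): a JSON-like value is naturally a recursive ADT, where object values can themselves contain objects to arbitrary depth, which is exactly what makes it hard to express as a fixed SQL schema.

```scala
// A recursive JSON-style value: string keys, arbitrarily nested values.
sealed trait JValue
final case class JString(s: String) extends JValue
final case class JNumber(d: Double) extends JValue
final case class JObject(fields: Map[String, JValue]) extends JValue
final case class JArray(items: List[JValue]) extends JValue

object RecursiveJson {
  def main(args: Array[String]): Unit = {
    // "params" nests another object inside the top-level one.
    val doc = JObject(Map(
      "name"   -> JString("a"),
      "params" -> JObject(Map("param1" -> JNumber(1.0)))))
    println(doc.fields.size)
  }
}
```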

Re: Representing a recursive data type in Spark SQL

2015-05-28 Thread Matei Zaharia
Your best bet might be to use a map&lt;string,string&gt; in SQL and make the keys be longer paths (e.g. params_param1 and params_param2). I don't think you can have a map in some of them but not in others. Matei On May 28, 2015, at 3:48 PM, Jeremy Lucas jeremyalu...@gmail.com wrote: Hey Reynold,
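
A small sketch of the flattening Matei suggests (helper names are mine): recursively walk the nested structure and emit a flat map whose keys are underscore-joined paths like `params_param1`.

```scala
object FlattenKeys {
  // Flatten nested maps into a single Map with path-style keys.
  def flatten(prefix: String, value: Any): Map[String, String] = value match {
    case m: Map[_, _] =>
      m.flatMap { case (k, v) => flatten(s"${prefix}_$k", v) }.toMap
    case other =>
      Map(prefix -> other.toString)
  }

  def main(args: Array[String]): Unit = {
    val nested = Map("param1" -> "a", "param2" -> Map("x" -> "1"))
    println(flatten("params", nested).toSeq.sorted.mkString(","))
  }
}
```

The resulting `Map[String, String]` has a uniform value type, sidestepping the problem of some keys holding maps and others holding scalars.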

Absence of version 1.3 of spark-assembly jar

2015-05-28 Thread Pala M Muthaia
I am looking to take dependency on spark-assembly jar, version 1.3.0, for our spark code unit tests, but it's not available on maven central (only older versions are available). Looks like it's not getting released anymore, is that right? Our internal build system prevents us from including

Re: [VOTE] Release Apache Spark 1.4.0 (RC1)

2015-05-28 Thread Yin Huai
Justin, If you are creating multiple HiveContexts in tests, you need to assign a temporary metastore location for every HiveContext (like what we do at here https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L527-L543). Otherwise, they
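
A hedged sketch of that advice: give each test its own Derby metastore directory so two HiveContexts never contend for the same lock. The JDBC URL shape matches the one in the error report above; the helper itself is illustrative.

```scala
import java.nio.file.Files

object TempMetastore {
  // Build a Derby JDBC URL pointing at a fresh temporary directory,
  // so each HiveContext gets an isolated metastore_db.
  def uniqueMetastoreUrl(): String = {
    val dir = Files.createTempDirectory("metastore").toFile
    s"jdbc:derby:;databaseName=${dir.getAbsolutePath}/metastore_db;create=true"
  }

  def main(args: Array[String]): Unit = {
    // In a real test this would be applied before creating the HiveContext, e.g.:
    //   hiveContext.setConf("javax.jdo.option.ConnectionURL", uniqueMetastoreUrl())
    println(uniqueMetastoreUrl().startsWith("jdbc:derby:"))
  }
}
```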

Re: Representing a recursive data type in Spark SQL

2015-05-28 Thread Reynold Xin
I think it is fairly hard to support recursive data types. What I've seen in one other proprietary system in the past is to let the user define the depth of the nested data types, and then just expand the struct/map/list definition to the maximum level of depth. Would this solve your problem?
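
A toy illustration of that unrolling idea (the schema-string format below is made up for the example, not Spark SQL's DDL): a recursive node type is expanded to a user-chosen maximum depth, yielding a finite nested schema.

```scala
object DepthExpansion {
  // Unroll a recursive "node with an optional child of the same shape"
  // to a fixed depth, bottoming out with null.
  def expandedSchema(depth: Int): String =
    if (depth == 0) "null"
    else s"struct<value:string,child:${expandedSchema(depth - 1)}>"

  def main(args: Array[String]): Unit = {
    println(expandedSchema(2))
  }
}
```

Anything nested deeper than the chosen limit simply cannot be represented, which is the trade-off Reynold describes.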

Re: [build system] jenkins downtime tomorrow morning ~730am PDT

2015-05-28 Thread shane knapp
well, i started early and am pretty much done. sadly, i had to roll back most of the plugin updates (which doesn't surprise me), but the system and jenkins core updates went swimmingly. anyways, we're up and building again! now, back to my coffee... :) On Wed, May 27, 2015 at 2:11 PM, shane

Streaming data + Blocked Model

2015-05-28 Thread Debasish Das
Hi, We want to keep the model, once created, loaded in memory through the Spark batch context, since blocked matrix operations are required to optimize runtime. The data is streamed in through Kafka / raw sockets and a Spark Streaming context. We want to run some prediction operations with the
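
A minimal sketch of the pattern being described, stripped of Spark specifics (`BlockedModel`, `predict`, and the iterator standing in for a Kafka stream are all placeholders, not Spark APIs): a model built once by a batch job is held in memory and queried as records arrive.

```scala
// Placeholder model: a dot product over a weight vector, standing in for
// the blocked-matrix model the batch job would build.
final case class BlockedModel(weights: Array[Double]) {
  def predict(features: Array[Double]): Double =
    weights.zip(features).map { case (w, x) => w * x }.sum
}

object StreamingPrediction {
  def main(args: Array[String]): Unit = {
    val model = BlockedModel(Array(0.5, 2.0)) // built once by the batch context
    // Stand-in for records arriving from Kafka / a socket stream.
    val incoming = Iterator(Array(1.0, 1.0), Array(2.0, 0.0))
    incoming.foreach(f => println(model.predict(f)))
  }
}
```

In a real Spark Streaming job the model would typically be broadcast or held in a singleton on the executors, and `predict` applied inside the stream's transformation.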