No problem :) Glad to hear that!

On Thu, Jul 16, 2015 at 8:22 PM, Koert Kuipers <ko...@tresata.com> wrote:
> that solved it, thanks!
>
> On Thu, Jul 16, 2015 at 6:22 PM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> thanks i will try 1.4.1
>>
>> On Thu, Jul 16, 2015 at 5:24 PM, Yin Huai <yh...@databricks.com> wrote:
>>
>>> Hi Koert,
>>>
>>> For the classloader issue, you probably hit
>>> https://issues.apache.org/jira/browse/SPARK-8365, which has been fixed
>>> in Spark 1.4.1. Can you try 1.4.1 and see if the exception disappears?
>>>
>>> Thanks,
>>>
>>> Yin
>>>
>>> On Thu, Jul 16, 2015 at 2:12 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>
>>>> i am using scala 2.11
>>>>
>>>> spark jars are not in my assembly jar (they are "provided"), since i
>>>> launch with spark-submit
>>>>
>>>> On Thu, Jul 16, 2015 at 4:34 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>
>>>>> spark 1.4.0
>>>>>
>>>>> spark-csv is a normal dependency of my project and is in the assembly
>>>>> jar that i use
>>>>>
>>>>> but i also tried adding spark-csv with --packages for spark-submit,
>>>>> and got the same error
>>>>>
>>>>> On Thu, Jul 16, 2015 at 4:31 PM, Yin Huai <yh...@databricks.com> wrote:
>>>>>
>>>>>> We do this in SparkILoop (
>>>>>> https://github.com/apache/spark/blob/master/repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkILoop.scala#L1023-L1037).
>>>>>> What version of Spark are you using? How did you add the spark-csv
>>>>>> jar?
>>>>>>
>>>>>> On Thu, Jul 16, 2015 at 1:21 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>>>
>>>>>>> has anyone tried to make a HiveContext only if the class is available?
>>>>>>>
>>>>>>> i tried this:
>>>>>>>
>>>>>>> implicit lazy val sqlc: SQLContext = try {
>>>>>>>   Class.forName("org.apache.spark.sql.hive.HiveContext", true,
>>>>>>>       Thread.currentThread.getContextClassLoader)
>>>>>>>     .getConstructor(classOf[SparkContext])
>>>>>>>     .newInstance(sc)
>>>>>>>     .asInstanceOf[SQLContext]
>>>>>>> } catch { case e: ClassNotFoundException => new SQLContext(sc) }
>>>>>>>
>>>>>>> it compiles fine, but i get classloader issues when i actually use
>>>>>>> it on a cluster. for example:
>>>>>>>
>>>>>>> Exception in thread "main" java.lang.RuntimeException: Failed to
>>>>>>> load class for data source: com.databricks.spark.csv
>>>>>>>     at scala.sys.package$.error(package.scala:27)
>>>>>>>     at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:216)
>>>>>>>     at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:229)
>>>>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
>>>>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
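[editor's note] For readers following along: the pattern Koert describes (load a class by name on the thread's context classloader, construct it reflectively, fall back when it is absent) can be sketched without any Spark dependency. This is a minimal illustration of the technique only, not Spark's implementation; the object and method names (`ReflectiveFallback`, `instantiate`) are made up for the example.

```scala
// Sketch of the "use the class if present, else fall back" pattern from
// the thread above, generalized away from SparkContext/HiveContext so it
// runs standalone. Names here are illustrative, not from Spark.
object ReflectiveFallback {

  /** Try to construct `className` via its one-argument constructor using
    * the current thread's context classloader; if the class is not on the
    * classpath, return `fallback` instead (evaluated lazily, by-name). */
  def instantiate[T](className: String,
                     argType: Class[_],
                     arg: AnyRef,
                     fallback: => T): T =
    try {
      Class.forName(className, true, Thread.currentThread.getContextClassLoader)
        .getConstructor(argType)
        .newInstance(arg)
        .asInstanceOf[T]
    } catch {
      // Class.forName throws ClassNotFoundException when the class
      // (e.g. HiveContext in a build without the hive profile) is absent.
      case _: ClassNotFoundException => fallback
    }
}
```

With a class that exists (`java.lang.StringBuilder`) the reflective path is taken; with a bogus name the fallback is returned. Note this only handles a *missing* class; as Yin points out above, the separate classloader problem on a cluster was SPARK-8365, fixed in 1.4.1, and no fallback logic works around that.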