Hi Ted,

I meant: given that we have spark-shell and spark-sql, what is the advantage of building self-contained applications? We still need to submit them via spark-submit. Basically, what is the use case for self-contained programs? That is, we build the code, create the class and run it independently of spark-shell? I can also run the code from Apache Zeppelin through the notebook.
Thanks

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 5 March 2016 at 00:32, Ted Yu <yuzhih...@gmail.com> wrote:

Answers to the first two questions are 'yes'.

Not clear on what the 3rd question is asking.

On Fri, Mar 4, 2016 at 4:28 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Thanks, now all working. Also, SELECTs from temp tables are part of sqlContext, not HiveContext.

This is the final code that works (below).

A couple of questions if I may:

1. This works pretty effortlessly in spark-shell. Is this because $CLASSPATH already includes all the needed jars?
2. The import section imports the needed classes. So basically import org.apache.spark.sql.functions._ imports all the methods of the functions object?
3. What is the reason we should use sbt to build custom jars from Spark code, as opposed to running the code in a file against spark-shell? Any particular use case for it?

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.Row
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.types._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._
//
object Sequence {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Sequence").setMaster("local[*]").set("spark.driver.allowMultipleContexts", "true")
    val sc = new SparkContext(conf)
    // Note that the import below should be done only after an instance of
    // org.apache.spark.sql.SQLContext is created. It should be written as:
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._
    val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    val a = Seq(("Mich",20), ("Christian", 18), ("James",13), ("Richard",16))
    // Sort option 1 using tempTable
    val b = a.toDF("Name","score").registerTempTable("tmp")
    sqlContext.sql("select Name,score from tmp order by score desc").show
    // Sort option 2 with FP
    a.toDF("Name","score").sort(desc("score")).show
  }
}

On 4 March 2016 at 23:58, Chandeep Singh <c...@chandeep.com> wrote:

That is because an instance of org.apache.spark.sql.SQLContext doesn’t exist in the current context and is required before you can use any of its implicit methods.

As Ted mentioned, importing org.apache.spark.sql.functions._ will take care of the below error.

On Mar 4, 2016, at 11:35 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Thanks. It is like a war of attrition. I always thought that you add imports before the class itself, not within the class? What is the reason for it, please?
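[Editorial aside: the import-placement question above comes down to Scala allowing imports from a value (a stable identifier) — the value must exist before its members can be imported, which is why import sqlContext.implicits._ has to sit inside the method, after the val. A minimal plain-Scala sketch, with hypothetical names (Ctx, RichInt, doubled) and no Spark required:]

```scala
object InstanceImportDemo {
  class Ctx {
    // mirrors sqlContext.implicits: an object nested inside an instance
    object implicits {
      implicit class RichInt(n: Int) {
        def doubled: Int = n * 2
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val ctx = new Ctx      // the instance must exist first...
    import ctx.implicits._ // ...only then can its members be imported
    println(3.doubled)     // prints 6: the implicit class is now in scope
  }
}
```

[The same rule explains the earlier "not found: object sqlContext" error: a file-level import sqlContext.implicits._ fails because no value named sqlContext exists at that point.]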
This is my code:

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.Row
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.types._
import org.apache.spark.sql.SQLContext
//
object Sequence {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Sequence").setMaster("local[*]").set("spark.driver.allowMultipleContexts", "true")
    val sc = new SparkContext(conf)
    // Note that the import below should be done only after an instance of
    // org.apache.spark.sql.SQLContext is created. It should be written as:
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._
    val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    val a = Seq(("Mich",20), ("Christian", 18), ("James",13), ("Richard",16))
    // Sort option 1 using tempTable
    val b = a.toDF("Name","score").registerTempTable("tmp")
    HiveContext.sql("select Name,score from tmp order by score desc").show
    // Sort option 2 with FP
    a.toDF("Name","score").sort(desc("score")).show
  }
}

And now the last failure is:

[info] [SUCCESSFUL ] org.scala-lang#jline;2.10.5!jline.jar (104ms)
[info] Done updating.
[info] Compiling 1 Scala source to /home/hduser/dba/bin/scala/Sequence/target/scala-2.10/classes...
[info] 'compiler-interface' not yet compiled for Scala 2.10.5. Compiling...
[info] Compilation completed in 15.779 s
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:21: not found: value desc
[error] a.toDF("Name","score").sort(desc("score")).show
[error]                             ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed

On 4 March 2016 at 23:25, Chandeep Singh <c...@chandeep.com> wrote:

This is what you need:

val sc = new SparkContext(sparkConf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

On Mar 4, 2016, at 11:03 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi Ted,

This is my code:

import org.apache.spark.SparkConf
import org.apache.spark.sql.Row
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.types._
import org.apache.spark.sql.SQLContext
//
object Sequence {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Sequence").setMaster("local[*]").set("spark.driver.allowMultipleContexts", "true")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    val a = Seq(("Mich",20), ("Christian", 18), ("James",13), ("Richard",16))
    // Sort option 1 using tempTable
    val b = a.toDF("Name","score").registerTempTable("tmp")
    sql("select Name,score from tmp order by score desc").show
    // Sort option 2 with FP
    a.toDF("Name","score").sort(desc("score")).show
  }
}

And the error I am getting now is:

[info] downloading
https://repo1.maven.org/maven2/org/scala-lang/jline/2.10.5/jline-2.10.5.jar ...
[info] [SUCCESSFUL ] org.scala-lang#jline;2.10.5!jline.jar (103ms)
[info] Done updating.
[info] Compiling 1 Scala source to /home/hduser/dba/bin/scala/Sequence/target/scala-2.10/classes...
[info] 'compiler-interface' not yet compiled for Scala 2.10.5. Compiling...
[info] Compilation completed in 12.462 s
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:16: value toDF is not a member of Seq[(String, Int)]
[error] val b = a.toDF("Name","score").registerTempTable("tmp")
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:17: not found: value sql
[error] sql("select Name,score from tmp order by score desc").show
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:19: value toDF is not a member of Seq[(String, Int)]
[error] a.toDF("Name","score").sort(desc("score")).show
[error] ^
[error] three errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 88 s, completed Mar 4, 2016 11:12:46 PM

On 4 March 2016 at 22:52, Ted Yu <yuzhih...@gmail.com> wrote:

Can you show your code snippet?

Here is an example:

val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

On Fri, Mar 4, 2016 at 1:55 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi Ted,

I am getting the following error after adding that import:

[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:5: not found: object sqlContext
[error] import sqlContext.implicits._
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:15: value toDF is not a member of Seq[(String, Int)]

On 4 March 2016 at 21:39, Ted Yu <yuzhih...@gmail.com> wrote:

Can you add the following into your code?
import sqlContext.implicits._

On Fri, Mar 4, 2016 at 1:14 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi,

I have a simple Scala program as below:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
object Sequence {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Sequence")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    val a = Seq(("Mich",20), ("Christian", 18), ("James",13), ("Richard",16))
    // Sort option 1 using tempTable
    val b = a.toDF("Name","score").registerTempTable("tmp")
    sql("select Name,score from tmp order by score desc").show
    // Sort option 2 with FP
    a.toDF("Name","score").sort(desc("score")).show
  }
}

I build this using the sbt tool as below:

cat sequence.sbt
name := "Sequence"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.0.0"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "1.5.0"

But it fails compilation as below:

[info] Compilation completed in 12.366 s
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:15: value toDF is not a member of Seq[(String, Int)]
[error] val b = a.toDF("Name","score").registerTempTable("tmp")
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:16: not found: value sql
[error] sql("select Name,score from tmp order by score desc").show
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:18: value toDF is not a member of Seq[(String, Int)]
[error] a.toDF("Name","score").sort(desc("score")).show
[error] ^
[error] three errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 95 s, completed Mar 4, 2016 9:06:40 PM

I think I am missing some dependencies here.
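[Editorial aside: the descending sort the thread is building toward can be sanity-checked in plain Scala, without Spark. This is a hypothetical standalone sketch using sortBy on the tuples, not the thread's DataFrame sort(desc("score")):]

```scala
object SortDemo {
  def main(args: Array[String]): Unit = {
    // the same data the thread uses
    val a = Seq(("Mich", 20), ("Christian", 18), ("James", 13), ("Richard", 16))
    // sortBy on the negated score sorts descending, mirroring the
    // intended output of the DataFrame version
    val sorted = a.sortBy(-_._2)
    println(sorted) // List((Mich,20), (Christian,18), (Richard,16), (James,13))
  }
}
```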