Hi Ayan, I succeeded with Tableau, but I still can't import metadata from Hive into Oracle BI.
Does that mean Oracle BI still can't connect to STS?

Regards,
Chanh

> On Jul 15, 2016, at 11:44 AM, ayan guha <guha.a...@gmail.com> wrote:
>
> It's possible that the transport protocols are not matching; that's what Simba is complaining about. Try changing the protocol to SASL?
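For reference, a transport mismatch like the one Simba reports usually traces back to HiveServer2's authentication mode, which STS reads from hive-site.xml. A minimal sketch of the relevant property; the value shown is illustrative, not taken from this thread:

    <property>
      <name>hive.server2.authentication</name>
      <!-- NONE means plain SASL (the usual default); NOSASL means raw binary
           Thrift. The client's transport setting must match this choice. -->
      <value>NONE</value>
    </property>

In the Simba driver the matching knob is typically called "Thrift Transport"; if the two sides disagree, the handshake fails with an opaque protocol error like the one described above.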
> On Fri, Jul 15, 2016 at 1:20 PM, Chanh Le <giaosu...@gmail.com> wrote:
> Hi Ayan,
> Thanks, I got it.
> Did you have any problems connecting Oracle BI to STS?
>
> I get this error:
> <IMG_15072016_095351.png>
>
> If I use Tableau:
> <Screen Shot 2016-07-15 at 10.19.40 AM.png>
>
>> On Jul 15, 2016, at 10:03 AM, ayan guha <guha.a...@gmail.com> wrote:
>>
>> This looks like Spark code. I am not sure this is what you intended to use with STS? I think it should be an INSERT OVERWRITE command (SQL).
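To make the INSERT OVERWRITE suggestion concrete: the hourly refresh can be expressed in SQL through STS instead of rewriting the Parquet directory from a separate Spark job. A minimal sketch, where website and website_staging are hypothetical table names (the staging table holds the fresh hour of data):

    -- Overwrite the table's contents through the metastore, so STS never
    -- keeps references to deleted part-xxx files. Names are illustrative.
    INSERT OVERWRITE TABLE website
    SELECT * FROM website_staging;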
>> On Fri, Jul 15, 2016 at 12:22 PM, Chanh Le <giaosu...@gmail.com> wrote:
>> Hi Ayan,
>>
>> Spark code:
>>
>>   df.write.mode(SaveMode.Overwrite).parquet(dataPath)
>>
>> I overwrite the current data frame every hour to update the data.
>> dataPath is /etl_info/WEBSITE
>> That means the part-xxx files change every hour as well.
>> The same thing happens in Spark if I register it as a tempTable: I need to drop and recreate it every time the data changes.
>>
>> Regards,
>> Chanh
>>
>>> On Jul 14, 2016, at 10:39 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>
>>> Can you kindly share the essential parts of your Spark code?
>>>
>>> On 14 Jul 2016 21:12, "Chanh Le" <giaosu...@gmail.com> wrote:
>>> Hi Ayan,
>>> Thank you for your suggestion. I switched to STS as the primary engine; Zeppelin just calls it through JDBC.
>>> But my data gets updated, so every hour I need to rewrite some of it, and that causes an issue with the Parquet files.
>>> The scenario: I have a Parquet file NETWORK, and every hour I need to overwrite it with a new one. When that happens I get:
>>> Caused by: alluxio.exception.FileDoesNotExistException: Path /etl_info/WEBSITE/part-r-00001-a26015c3-6de5-4a10-bda8-d985d7241953.snappy.parquet does not exist.
>>> So I need to drop the old table and create a new one.
>>>
>>> Is there any way to work around this?
>>>
>>> Regards,
>>> Chanh
>>>
>>>> On Jul 14, 2016, at 11:36 AM, ayan guha <guha.a...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Thanks for the information. However, I still strongly believe you should be able to set the URI in STS's hive-site.xml and then create the table through JDBC.
>>>>
>>>> On Thu, Jul 14, 2016 at 1:49 PM, Chanh Le <giaosu...@gmail.com> wrote:
>>>> Hi Ayan,
>>>> I found that Zeppelin 0.6.0 somehow can't set the hive.metastore.warehouse.dir property.
>>>> Changing $ZEP_DIR/conf/hive-site.xml works for other settings; for instance, changing hive.metastore.metadb.dir is fine. Only hive.metastore.warehouse.dir has no effect.
>>>>
>>>> This is weird.
>>>>
>>>>> On Jul 13, 2016, at 4:53 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>>>
>>>>> I would suggest you restart Zeppelin and STS.
>>>>>
>>>>> On Wed, Jul 13, 2016 at 6:35 PM, Chanh Le <giaosu...@gmail.com> wrote:
>>>>> Hi Ayan,
>>>>>
>>>>> I don't know whether I did something wrong, but I still couldn't set the hive.metastore.warehouse.dir property.
>>>>>
>>>>> I set hive-site.xml in all three locations (Spark, Zeppelin, and Hive) but it still didn't work:
>>>>>
>>>>> zeppelin/conf/hive-site.xml
>>>>> spark/conf/hive-site.xml
>>>>> hive/conf/hive-site.xml
>>>>>
>>>>> <Screen Shot 2016-07-13 at 3.32.49 PM.png>
>>>>>
>>>>> My hive-site.xml:
>>>>>
>>>>> <configuration>
>>>>>   <property>
>>>>>     <name>hive.metastore.metadb.dir</name>
>>>>>     <value>alluxio://master1:19998/metadb</value>
>>>>>     <description>Required by the metastore server, or if the uris argument below is not supplied.</description>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>hive.metastore.warehouse.dir</name>
>>>>>     <value>alluxio://master1:19998/warehouse</value>
>>>>>     <description>Required by the metastore server, or if the uris argument below is not supplied.</description>
>>>>>   </property>
>>>>> </configuration>
>>>>>
>>>>> Is there anything I can do?
>>>>>
>>>>> Regards,
>>>>> Chanh
>>>>>
>>>>>> On Jul 13, 2016, at 12:43 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> CREATE TABLE always goes through Hive. In Hive, when you create a database, the default metadata location is driven by the hive.metastore.metadb.dir property and the data location by hive.metastore.warehouse.dir (both set in hive-site.xml). So you do not need to set these properties in Zeppelin.
>>>>>>
>>>>>> What you can do:
>>>>>> a. Modify hive-site.xml to include those properties, if they are not already set. Use the same hive-site.xml to run STS. Then connect through JDBC, create the table, and you should find the metadata and data in your desired locations.
>>>>>> b. I think you can set these properties per session, the same way you would in the Hive CLI.
>>>>>> c. You can create tables/databases with a LOCATION clause, in case you need a non-standard path (see the sketch below).
>>>>>>
>>>>>> Best,
>>>>>> Ayan
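As a concrete illustration of option (c): a table can be pinned to an explicit Alluxio path, sidestepping the warehouse-dir question. A minimal sketch; the table name and columns are hypothetical, the path matches the one used elsewhere in this thread, and it assumes the Alluxio client jar is on the Hive/STS classpath:

    -- An external table whose data lives at a fixed Alluxio location;
    -- dropping it removes only the metadata, never the Parquet files.
    CREATE EXTERNAL TABLE website_ext (id BIGINT, url STRING)
    STORED AS PARQUET
    LOCATION 'alluxio://master1:19998/etl_info/WEBSITE';

Because the location is recorded in the metastore, any client of that same metastore (the Zeppelin Spark interpreter or STS) resolves the same files, whatever warehouse directory it happens to have picked up.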
>>>>>> On Wed, Jul 13, 2016 at 3:20 PM, Chanh Le <giaosu...@gmail.com> wrote:
>>>>>> Hi Ayan,
>>>>>> Thank you for replying.
>>>>>> But I want to create a table in Zeppelin and store the metadata in Alluxio, which is why I tried to set hive.metastore.warehouse.dir=alluxio://master1:19998/metadb, so I can share data with STS.
>>>>>> The JDBC route you mentioned I have already tried, and it works, but I can't easily create tables the Spark way.
>>>>>>
>>>>>> Regards,
>>>>>> Chanh
>>>>>>
>>>>>>> On Jul 13, 2016, at 12:06 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I quickly tried with the available Hive interpreter:
>>>>>>>
>>>>>>> <image.png>
>>>>>>>
>>>>>>> Please try similarly.
>>>>>>>
>>>>>>> I will try with the JDBC interpreter, but I need to add it to Zeppelin first :)
>>>>>>>
>>>>>>> Best,
>>>>>>> Ayan
>>>>>>>
>>>>>>> On Wed, Jul 13, 2016 at 1:53 PM, Chanh Le <giaosu...@gmail.com> wrote:
>>>>>>> Hi Ayan,
>>>>>>> How do I set the Hive metastore in Zeppelin? I tried but without success.
>>>>>>> What I did was add it to the Spark interpreter settings:
>>>>>>>
>>>>>>> <Screen Shot 2016-07-13 at 10.50.53 AM.png>
>>>>>>>
>>>>>>> And I also tried it in a notebook:
>>>>>>>
>>>>>>> %sql
>>>>>>> set hive.metastore.metadb.dir=alluxio://master1:19998/metadb
>>>>>>>
>>>>>>> %sql
>>>>>>> set hive.metastore.warehouse.dir=alluxio://master1:19998/metadb
>>>>>>>
>>>>>>> %spark
>>>>>>> sqlContext.setConf("hive.metastore.warehouse.dir", "alluxio://master1:19998/metadb")
>>>>>>> sqlContext.setConf("hive.metastore.metadb.dir", "alluxio://master1:19998/metadb")
>>>>>>> sqlContext.read.parquet("alluxio://master1:19998/etl_info/WEBSITE").saveAsTable("tests_5")
>>>>>>>
>>>>>>> But I get this:
>>>>>>>
>>>>>>> <Screen Shot 2016-07-13 at 10.53.10 AM.png>
>>>>>>>
>>>>>>>> On Jul 11, 2016, at 1:26 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> When you say "Zeppelin and STS", I am assuming you mean the Spark interpreter and the JDBC interpreter, respectively.
>>>>>>>>
>>>>>>>> Through Zeppelin you can either run your own Spark application (using Zeppelin's own Spark context) via the Spark interpreter, OR you can access STS, which is a separate Spark application with its own Spark context, via the JDBC interpreter. There should not be any need for these two contexts to coexist.
>>>>>>>>
>>>>>>>> If you want to share data, save it to Hive from either context, and you should be able to see the data from the other context.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Ayan
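A minimal sketch of that sharing pattern, in the thread's own Scala and SQL; the table name shared_events is hypothetical, and the handoff works through a shared metastore rather than a shared SparkContext:

    %spark
    import org.apache.spark.sql.SaveMode
    // In Zeppelin's own Spark context: persist the DataFrame as a Hive table,
    // recording its schema and file location in the metastore.
    sqlContext.read.parquet("alluxio://master1:19998/etl_info/WEBSITE")
      .write.mode(SaveMode.Overwrite).saveAsTable("shared_events")

    %jdbc
    -- In STS's context, reached over JDBC: the table is now visible by name.
    select count(*) from shared_events

This only works when both sides point at the same metastore, which is exactly what the hive-site.xml discussion earlier in the thread is about.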
>>>>>>>> On Mon, Jul 11, 2016 at 3:00 PM, Chanh Le <giaosu...@gmail.com> wrote:
>>>>>>>> Hi Ayan,
>>>>>>>> I tested it and it works fine, but one thing still confuses me: what if my (technical) users want to write code in Zeppelin that applies changes to a Hive table?
>>>>>>>> Zeppelin and STS can't share a Spark context, so does that mean we need separate processes? Is there any way to use the same Spark context as STS?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Chanh
>>>>>>>>
>>>>>>>>> On Jul 11, 2016, at 10:05 AM, Takeshi Yamamuro <linguin....@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> ISTM multiple SparkContexts are not recommended in Spark.
>>>>>>>>> See: https://issues.apache.org/jira/browse/SPARK-2243
>>>>>>>>>
>>>>>>>>> // maropu
>>>>>>>>>
>>>>>>>>> On Mon, Jul 11, 2016 at 12:01 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Can you try using the JDBC interpreter with STS? We have been using Zeppelin+STS on YARN for a few months now without much issue.
>>>>>>>>>
>>>>>>>>> On Mon, Jul 11, 2016 at 12:48 PM, Chanh Le <giaosu...@gmail.com> wrote:
>>>>>>>>> Hi everybody,
>>>>>>>>> We are using Spark to query big data, and we currently use Zeppelin as the UI for technical users.
>>>>>>>>> Now we also need a UI for business users, so we use Oracle BI tools and have set up a Spark Thrift Server (STS) for them.
>>>>>>>>>
>>>>>>>>> When I run Zeppelin and STS together, I get this error:
>>>>>>>>>
>>>>>>>>> INFO [2016-07-11 09:40:21,905] ({pool-2-thread-4} SchedulerFactory.java[jobStarted]:131) - Job remoteInterpretJob_1468204821905 started by scheduler org.apache.zeppelin.spark.SparkInterpreter835015739
>>>>>>>>> INFO [2016-07-11 09:40:21,911] ({pool-2-thread-4} Logging.scala[logInfo]:58) - Changing view acls to: giaosudau
>>>>>>>>> INFO [2016-07-11 09:40:21,912] ({pool-2-thread-4} Logging.scala[logInfo]:58) - Changing modify acls to: giaosudau
>>>>>>>>> INFO [2016-07-11 09:40:21,912] ({pool-2-thread-4} Logging.scala[logInfo]:58) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(giaosudau); users with modify permissions: Set(giaosudau)
>>>>>>>>> INFO [2016-07-11 09:40:21,918] ({pool-2-thread-4} Logging.scala[logInfo]:58) - Starting HTTP Server
>>>>>>>>> INFO [2016-07-11 09:40:21,919] ({pool-2-thread-4} Server.java[doStart]:272) - jetty-8.y.z-SNAPSHOT
>>>>>>>>> INFO [2016-07-11 09:40:21,920] ({pool-2-thread-4} AbstractConnector.java[doStart]:338) - Started SocketConnector@0.0.0.0:54818
>>>>>>>>> INFO [2016-07-11 09:40:21,922] ({pool-2-thread-4} Logging.scala[logInfo]:58) - Successfully started service 'HTTP class server' on port 54818.
>>>>>>>>> INFO [2016-07-11 09:40:22,408] ({pool-2-thread-4} SparkInterpreter.java[createSparkContext]:233) - ------ Create new SparkContext local[*] -------
>>>>>>>>> WARN [2016-07-11 09:40:22,411] ({pool-2-thread-4} Logging.scala[logWarning]:70) - Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
>>>>>>>>>
>>>>>>>>> Does that mean I need to allow multiple contexts? This is only a local test in local mode; what would happen if I deploy on a Mesos cluster?
>>>>>>>>>
>>>>>>>>> I'd appreciate any suggestions. Thanks.
>>>>>>>>>
>>>>>>>>> Chanh
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>> Ayan Guha
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ---
>>>>>>>>> Takeshi Yamamuro
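For completeness, the JDBC-interpreter route recommended above amounts to pointing Zeppelin at STS's Thrift endpoint rather than constructing a second SparkContext in the same JVM. A minimal sketch of the Zeppelin JDBC interpreter settings; the host and user are assumptions, and 10000 is the conventional HiveServer2/STS port:

    default.driver   org.apache.hive.jdbc.HiveDriver
    default.url      jdbc:hive2://sts-host:10000/default
    default.user     giaosudau

The org.apache.hive:hive-jdbc artifact needs to be on the interpreter's classpath. With this setup only STS owns a SparkContext; %jdbc paragraphs are plain SQL clients, so the SPARK-2243 warning does not arise, in local mode or on a Mesos cluster.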