Hey Michael, Cheng,

Thanks for the replies. Sadly I can't remember the specific error, so I'm
going to chalk it up to user error, especially since others on the list
have not had a problem.

@michael By the way, I was at the Spark 1.1 meetup yesterday. Great event,
very informative; cheers, and keep them coming!

@cheng Got it, cheers. Fortunately we don't have to deal with this use
case, but that's good to know (especially the $SPARK_HOME bit).
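
For anyone who finds this thread in the archives: the $SPARK_HOME bit boils
down to dropping a minimal hive-site.xml into $SPARK_HOME/conf that points at
the existing metastore. A rough sketch (the host below is a placeholder, and
9083 is just the usual metastore port; use whatever your Hive setup runs):

    <configuration>
      <!-- Point Spark SQL / the Thrift server at an existing Hive metastore. -->
      <property>
        <name>hive.metastore.uris</name>
        <!-- Placeholder; replace with your metastore host and port. -->
        <value>thrift://your-metastore-host:9083</value>
      </property>
    </configuration>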




On Wed, Aug 27, 2014 at 3:36 PM, Cheng Lian <lian.cs....@gmail.com> wrote:

> Hey Matt, if you want to access existing Hive data, you still need to
> run a Hive metastore service and provide a proper hive-site.xml (just
> drop it in $SPARK_HOME/conf).
>
> Could you provide the error log you saw?
>
>
>
> On Wed, Aug 27, 2014 at 12:09 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> I would expect that to work.  What exactly is the error?
>>
>>
>> On Wed, Aug 27, 2014 at 6:02 AM, Matt Chu <m...@kabam.com> wrote:
>>
>>> (apologies for sending this twice, first via nabble; didn't realize it
>>> wouldn't get forwarded)
>>>
>>> Hey, I know it's not officially released yet, but I'm trying to
>>> understand (and run) the Thrift-based JDBC server, in order to enable
>>> remote JDBC access to our dev cluster.
>>>
>>> Before asking about details, is my understanding of this correct?
>>> `sbin/start-thriftserver.sh` starts a JDBC/Hive server that doesn't require
>>> running a Hive+MR cluster (i.e. just Spark, or Spark+YARN)?
>>>
>>> Assuming yes, I'm hopeful that it all basically works and that some of the
>>> documentation just needs to be cleaned up:
>>>
>>> - I found a release page implying that 1.1 will be released "pretty
>>> soon-ish":
>>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>> - I can find recent (within the last 30 days or so) activity with promising
>>> titles: ["Updated Spark SQL README to include the hive-thriftserver
>>> module"](https://github.com/apache/spark/pull/1867),
>>> ["[SPARK-2410][SQL] Merging Hive Thrift/JDBC server (with Maven profile
>>> fix)"](https://github.com/apache/spark/pull/1620)
>>>
>>> Am I following all the right email threads, issue trackers, and
>>> whatnot?
>>>
>>> Specifically, I tried:
>>>
>>> 1. Building off of `branch-1.1`, synced as of ~today (2014 Aug 25)
>>> 2. Running `sbin/start-thriftserver.sh` in `yarn-client` mode
>>> 3. I can see the process running and the Spark context/app created in the
>>> YARN logs, and I can connect to the Thrift server on the default port of
>>> 10000 using `bin/beeline` (rough commands sketched below)
>>> 4. However, when I try to see what the cluster has via `show tables;`, the
>>> logs show a connection error to some (what I assume to be) random port.
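>>>
>>> For reference, the rough command sequence was along these lines (the build
>>> profiles and the beeline host are my best guesses, so treat this as a
>>> sketch rather than exact commands):
>>>
>>>     # Build with YARN, Hive, and the Thrift server module (profile names assumed)
>>>     mvn -Pyarn -Phive -Phive-thriftserver -DskipTests clean package
>>>
>>>     # Start the Thrift/JDBC server against YARN
>>>     ./sbin/start-thriftserver.sh --master yarn-client
>>>
>>>     # Connect with beeline on the default port, then run `show tables;`
>>>     ./bin/beeline -u jdbc:hive2://localhost:10000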
>>>
>>> So what service am I forgetting (or am too ignorant) to run? Or did I
>>> misunderstand, and we do need a live Hive instance to back the Thrift server?
>>> Or is this a YARN-specific issue?
>>>
>>> I've only recently started learning the ecosystem and community, so apologies
>>> for the long post and all the questions. :)
>>>
>>> Matt
>>>
>>
>>
>
