Re: Griffin on Docker - modify Hive metastore Uris

Lionel Liu Fri, 13 Apr 2018 02:20:29 -0700

Hi Enrico,

I think you need to copy hive-site.xml into the Spark config directory, or
pass hive-site.xml explicitly on the spark-shell command line.
spark-shell creates its sqlContext at startup, so calling setConf after
that will not work.
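
A minimal sketch of the first option, under stated assumptions: the
metastore URI is a placeholder, and SPARK_CONF_DIR defaults to a scratch
directory here so the snippet is safe to try. Neither value comes from
the Docker image.

```shell
#!/bin/sh
# Sketch: give spark-shell a hive-site.xml before it creates its sqlContext.
# Assumptions (not from the thread): the metastore URI below is a placeholder,
# and SPARK_CONF_DIR defaults to a scratch directory so this is safe to try.
METASTORE_URI="${METASTORE_URI:-thrift://metastore.example.com:9083}"
SPARK_CONF_DIR="${SPARK_CONF_DIR:-/tmp/spark-conf-sketch}"

mkdir -p "$SPARK_CONF_DIR"

# Write a minimal hive-site.xml that carries only the metastore URI.
cat > "$SPARK_CONF_DIR/hive-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>${METASTORE_URI}</value>
  </property>
</configuration>
EOF

# Export so that a spark-shell started from this shell reads this directory.
export SPARK_CONF_DIR

# Then start the shell with the same flags as before, for example:
#   spark-shell --deploy-mode client --master yarn \
#     --packages=org.apache.hadoop:hadoop-aws:2.6.5
echo "wrote $SPARK_CONF_DIR/hive-site.xml"
```

In practice SPARK_CONF_DIR would point at the image's real Spark config
directory (typically $SPARK_HOME/conf), since the other files there, such
as spark-defaults.conf, are read from the same place.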



Thanks,
Lionel

On Thu, Apr 12, 2018 at 6:04 PM, Enrico D'Urso <a-edu...@hotels.com> wrote:

> Hi,
>
> After further investigation, I noticed that Spark is pointing to the east
> AWS region by default.
> Any suggestions for forcing it to use us-west-2?
>
> Thanks
>
> From: Enrico D'Urso <a-edu...@hotels.com>
> Date: Wednesday, April 11, 2018 at 3:55 PM
> To: Lionel Liu <lionel...@apache.org>, "dev@griffin.incubator.apache.org"
> <dev@griffin.incubator.apache.org>
> Subject: Re: Griffin on Docker - modify Hive metastore Uris
>
> Hi Lionel,
>
> Thank you for your email.
>
> Right now, I am testing the Spark cluster using the spark-shell available
> on your Docker image. I just wanted to test it before running any ‘measure
> job’, to tackle any configuration issues first.
> I start the shell as follows:
> spark-shell --deploy-mode client --master yarn
> --packages=org.apache.hadoop:hadoop-aws:2.6.5
>
> I am fetching hadoop-aws:2.6.5 because 2.6.5 is the Hadoop version that is
> included in the Docker image.
> So far, so good. Then I also set the right Hive metastore URI:
> sqlContext.setConf("hive.metastore.uris", metastoreURI)
>
> The problem arises when I try to fetch any table, for instance:
> sqlContext.sql("Select * from hcom_data_prod_.testtable").take(2)
>
> The table does exist, but I get an error back saying that:
>
> Caused by: java.io.FileNotFoundException: File s3://hcom-xxXXXxx/yyy/testtable/sentdate=2017-10-13 does not exist.
>
> But it does exist; basically, AWS is responding with an HTTP 404.
> I think I would get the same error if I tried to run any ‘measure job’, so
> I prefer to tackle this earlier.
>
> Are you aware of any S3 endpoint misconfiguration with old versions of
> hadoop-aws?
>
> Many thanks,
>
> Enrico
>
>
> From: Lionel Liu <lionel...@apache.org>
> Date: Wednesday, April 11, 2018 at 3:34 AM
> To: "dev@griffin.incubator.apache.org" <dev@griffin.incubator.apache.org>,
> Enrico D'Urso <a-edu...@hotels.com>
> Subject: Re: Griffin on Docker - modify Hive metastore Uris
>
> Hi Enrico,
>
> The Griffin service only needs to get metadata from the Hive metastore
> service; it doesn't actually fetch Hive table data.
> The Griffin measure module, which runs on the Spark cluster, does need to
> fetch Hive table data, so you need to pass the AWS credentials to it when
> you submit it. I recommend you try submitting the measure module from the
> shell first.
>
>
>
> Thanks,
> Lionel
>
> On Tue, Apr 10, 2018 at 9:48 PM, Enrico D'Urso <a-edu...@hotels.com> wrote:
> Hi,
>
> I have just set up the Griffin Docker image and it seems to work OK; I am
> able to view the sample data that comes by default.
>
> Now, I would like to test the metrics a bit against a subset of a
> table that I have in our Hive instance.
> In particular, the configuration is as follows:
> - Hive Metastore on RDS (MySQL on Amazon)
> - Actual data on Amazon S3
>
> The machine on which Docker is running has access to the metastore and
> can also potentially fetch data from S3.
>
> I connected to the Docker container and am now looking at the following file:
> /root/service/config/application.properties
>
> in which I see the hive.metastore.uris property, which I can potentially modify.
> I would also need to pass the AWS credentials to Griffin so that it can
> fetch data from S3.
>
> Does anyone have experience with this?
>
> Thanks,
>
> E.
>
>
