Hi Enrico,

I think you need to copy hive-site.xml into the Spark config directory, or explicitly pass hive-site.xml on the spark-shell command line. The Spark shell creates its sqlContext at startup, so calling setConf after that will not take effect.
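As a rough sketch of the first option (copying hive-site.xml into the Spark config directory), the shell function below writes a minimal hive-site.xml that contains only hive.metastore.uris. The function name write_hive_site, the URI thrift://metastore-host:9083, and the use of $SPARK_HOME/conf as the config directory are assumptions for illustration, not values taken from the Docker image, so adjust them to your setup.

```shell
# Sketch: write a minimal hive-site.xml into a Spark config directory.
# Usage: write_hive_site <conf_dir> <metastore_uri>
# Both arguments are placeholders chosen by the caller.
write_hive_site() {
  conf_dir="$1"
  metastore_uri="$2"
  mkdir -p "$conf_dir"
  # Overwrites any existing hive-site.xml in that directory.
  cat > "$conf_dir/hive-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>${metastore_uri}</value>
  </property>
</configuration>
EOF
}

# Example call (hypothetical host and port; assumes SPARK_HOME is set):
# write_hive_site "$SPARK_HOME/conf" "thrift://metastore-host:9083"
```

After the file is in place, start spark-shell again so that the sqlContext reads the metastore URI when it is created, instead of calling setConf afterwards.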
Thanks,
Lionel

On Thu, Apr 12, 2018 at 6:04 PM, Enrico D'Urso <a-edu...@hotels.com> wrote:

> Hi,
>
> After further investigation, I noticed that Spark is pointing to the east
> Aws region, by default.
> Any suggestion to force it to use us-west2?
>
> Thanks
>
> From: Enrico D'Urso <a-edu...@hotels.com>
> Date: Wednesday, April 11, 2018 at 3:55 PM
> To: Lionel Liu <lionel...@apache.org>, "dev@griffin.incubator.apache.org"
> <dev@griffin.incubator.apache.org>
> Subject: Re: Griffin on Docker - modify Hive metastore Uris
>
> Hi Lionel,
>
> Thank you for your email.
>
> Right now, I am testing Spark cluster using the Spark-shell available on
> your Docker image. I just wanted to test it before running any ‘measure
> job’ to tackle any configuration issue.
> I start the shell as follows:
> spark-shell --deploy-mode client --master yarn --packages=org.apache.hadoop:hadoop-aws:2.6.5
>
> I am fetching Hadoop-aws:2.6.5 as 2.6.5 is the Hadoop version that is
> included in the Docker image.
> So far, so good, then I also set the right Hive metastore URI:
> sqlContext.setConf("hive.metastore.uris", metastoreURI)
>
> the problem arises when I try to fetch any table for instance:
> sqlContext.sql("Select * from hcom_data_prod_.testtable").take(2)
>
> the table does exist, but I get an error back saying that:
>
> Caused by: java.io.FileNotFoundException: File s3://hcom-xxXXXxx/yyy/testtable/sentdate=2017-10-13 does not exist.
>
> But it does exist, basically AWS is responding with 404 http message.
> I think I would get the same error if I try to run any ‘measure job’, so I
> prefer to tackle this earlier.
>
> Are you aware of any S3 endpoint misconfiguration with old version of
> Hadoop-aws?
>
> Many thanks,
>
> Enrico
>
>
> From: Lionel Liu <lionel...@apache.org>
> Date: Wednesday, April 11, 2018 at 3:34 AM
> To: "dev@griffin.incubator.apache.org" <dev@griffin.incubator.apache.org>,
> Enrico D'Urso <a-edu...@hotels.com>
> Subject: Re: Griffin on Docker - modify Hive metastore Uris
>
> Hi Enrico,
>
> Griffin service only need to get metadata from hive metastore service, it
> doesn't fetch hive table data actually.
> Griffin measure, which runs on spark cluster, needs to fetch hive table
> data, you need to pass the AWS credentials to it when submit. I recommend
> you try the shell-submit way to submit the measure module first.
>
> Thanks,
> Lionel
>
> On Tue, Apr 10, 2018 at 9:48 PM, Enrico D'Urso <a-edu...@hotels.com> wrote:
> Hi,
>
> I have just set up the Griffin Docker image and it seems to work ok, I am
> able to view the sample data that comes by default.
>
> Now, I would like to test a bit the metrics things against a subset of a
> table that I have in our Hive instance;
> In particular the configuration is as follows:
> - Hive Metastore on RDS (Mysql on Amazon)
> - Actual data on Amazon S3
>
> The machine in which Docker is running has access to the metastore and
> also can potentially fetch data from S3.
>
> I connected into the Docker image and now I am checking the following file:
> /root/service/config/application.properties
>
> in which I see the hive.metastore.uris that I can potentially modify.
> I would also need to pass to Griffin the AWS credentials to be able to
> fetch data from S3.
>
> Anyone has experience on this?
>
> Thanks,
>
> E.