Re: Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Benjamin Schaff
Hi,

Thanks for the answer.

I am running a custom build of Spark 1.6.2, i.e. the one described in the
Hive documentation, built without the Hive jars.
I set it up in hive-env.sh.
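
In hive-env.sh that amounts to pointing Hive at the Spark home, something
like this (the path here is just a placeholder for where my build lives):

  export SPARK_HOME=/opt/spark-1.6.2-bin-hadoop2.6-without-hive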

I created the istari table as in the documentation, ran an INSERT on it and
then a GROUP BY.
Everything ran correctly on the Spark standalone cluster, with no exceptions
anywhere.
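
The sanity check was essentially the one from the getting-started guide,
along these lines (the columns and values here are only an illustration):

  set hive.execution.engine=spark;

  create table istari (name string, color string) stored as orc;
  insert into table istari values ('Gandalf', 'grey'), ('Saruman', 'white');
  select color, count(*) from istari group by color;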

Do you have any other suggestions?

Thanks.

On Wed, 14 Sep 2016 at 13:55, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi,
>
> You are using Hive 2. What is the Spark version that runs as the Hive
> execution engine?
>
> I cannot see spark.home in your hive-site.xml so I cannot figure it out.
>
> BTW you are using Spark standalone as the mode. I tend to use yarn-client.
>
> Now back to the above issue. Do other queries work OK with Hive on Spark?
>
> Some of those performance parameters can be set in the Hive session itself
> or through an init file:
>
> set spark.home=/usr/lib/spark-1.6.2-bin-hadoop2.6;
> set spark.master=yarn;
> set spark.deploy.mode=client;
> set spark.executor.memory=8g;
> set spark.driver.memory=8g;
> set spark.executor.instances=6;
> set spark.ui.port=;
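>
> The same settings can also be put in a file and passed to the CLI with
> the -i option, e.g. (the file name is just an example):
>
>   hive -i spark-settings.hql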
>
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 14 September 2016 at 18:28, Benjamin Schaff <benjamin.sch...@gmail.com>
> wrote:
>
>> Hi,
>>
>> After several days of trying to figure out the problem, I am stuck with a
>> ClassCastException when running a query with Hive on Spark against ORC
>> tables that I updated with the streaming mutation API of Hive 2.0.
>>
>> The context is the following:
>>
>> For Hive:
>>
>> The version is the latest available from the website, 2.1.
>> I wrote some Scala code to insert data into an ORC table with the
>> streaming mutation API, following the example provided somewhere in the
>> Hive repository.
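>>
>> Schematically it is the client/coordinator sequence from the mutate
>> package, roughly like this (method names are from memory of that
>> example; the metastore URI and the factory class are my own):
>>
>>   val client = new MutatorClientBuilder()
>>     .metaStoreUri("thrift://hmaster:9083")
>>     .addSinkTable("default", "hc__member", false)
>>     .build()
>>   client.connect()
>>
>>   val transaction = client.newTransaction()
>>   val tables = client.getTables()
>>   transaction.begin()
>>
>>   val coordinator = new MutatorCoordinatorBuilder()
>>     .metaStoreUri("thrift://hmaster:9083")
>>     .table(tables.get(0))
>>     .mutatorFactory(new MemberMutatorFactory())  // my own factory
>>     .build()
>>
>>   coordinator.insert(partitionValues, record)   // one call per record
>>   coordinator.close()
>>
>>   transaction.commit()
>>   client.close()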
>>
>> The table looks like this:
>>
>> +---------------------------------------------------------+
>> |                     createtab_stmt                       |
>> +---------------------------------------------------------+
>> | CREATE TABLE `hc__member`( |
>> |   `rdv_core__key` bigint,  |
>> |   `rdv_core__domainkey` string,|
>> |   `rdftypes` array,|
>> |   `rdv_org__firstname` string, |
>> |   `rdv_org__middlename` string,|
>> |   `rdv_org__lastname` string,  |
>> |   `rdv_org__gender` string,|
>> |   `rdv_org__city` string,  |
>> |   `rdv_org__state` string, |
>> |   `rdv_org__countrycode` string,   |
>> |   `rdv_org__addresslabel` string,  |
>> |   `rdv_org__zip` string)   |
>> | CLUSTERED BY ( |
>> |   rdv_core__key)   |
>> | INTO 24 BUCKETS|
>> | ROW FORMAT SERDE   |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
>> | STORED AS INPUTFORMAT  |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'|
>> | OUTPUTFORMAT   |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'   |
>> | LOCATION   |
>> |   'hdfs://hmaster:8020/user/hive/warehouse/hc__member' |
>> | TBLPROPERTIES (|
>> |   'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"

>> I built the Spark distribution myself, removing the Hive-related
>> dependencies, so I don't think the problem comes from there.
>>
>> Do you have any recommendations on how I can proceed to find the root
>> cause of this problem?
>>
>> Thanks in advance.
>>
>> PS: I made the mistake of posting this on the dev mailing list earlier;
>> please ignore that message, and sorry for the double post.
>>
>> Regards,
>> Benjamin Schaff