Re: It seems that the result of Hive on Spark is wrong and the results of Hive and Hive on Spark are not the same

2015-12-22 Thread Xuefu Zhang
It seems that the plan isn't quite right, possibly due to union all optimization in Spark. Could you create a JIRA for this? CC Chengxiang as he might have some insight. Thanks, Xuefu On Tue, Dec 22, 2015 at 3:39 AM, Jone Zhang wrote: > Hive 1.2.1 on Spark1.4.1 > >

Re: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Alan Gates
Also note that transactions only work with MR or Tez as the backend. The required work to have them work with Spark hasn't been done. Alan. Mich Talebzadeh December 22, 2015 at 9:43 Dropped and created table tt as follows: drop table if exists tt; create table
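In practice that means switching the execution engine back to MR or Tez before running ACID statements. A minimal sketch (the table name tt comes from this thread; the delete predicate is made up for illustration):

    -- Run transactional DML on MR (or Tez) rather than Spark
    set hive.execution.engine=mr;          -- or: tez
    delete from tt where object_id = 42;   -- hypothetical predicate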

Re: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Alan Gates
Correct. What doesn't work in Spark is actually the transactions themselves, because there's a piece on the execution side that needs to send heartbeats to the metastore saying a transaction is still alive. That hasn't been implemented for Spark. It's very simple and could be done (see

RE: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Mich Talebzadeh
Sounds like any delete only removes one row from the table in Hive on Spark! A delete from the table (which actually has only 10 rows) removes one row only! delete from tt; INFO : Query Hive on Spark job[6] stages: INFO : 12 INFO : 13 INFO : Status: Running (Hive on Spark job[6]) INFO :

RE: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Mich Talebzadeh
Thanks for the feedback, Alan. It seems that one can do INSERTs with Hive on Spark but no updates or deletes. Is this correct? Cheers, Mich Talebzadeh

Re: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Elliot West
What concurrency properties do you have set? They should look something like those described here: https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration On Tuesday, 22 December 2015, Mich Talebzadeh wrote: > Dropped and created
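For reference, the transaction settings on that wiki page look roughly like the sketch below; the values are illustrative, and the compactor properties normally belong on the metastore side rather than per session:

    -- Client-side settings needed for ACID tables (per the Hive Transactions wiki)
    set hive.support.concurrency=true;
    set hive.enforce.bucketing=true;                -- not needed on Hive 2.0+
    set hive.exec.dynamic.partition.mode=nonstrict;
    set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
    -- Metastore-side settings that enable compaction
    set hive.compactor.initiator.on=true;
    set hive.compactor.worker.threads=1;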

hive metadata apis

2015-12-22 Thread Rachna Jotwani
The HCatalog APIs do not return an ordinal position (the order in which the columns were created) for Hive table columns or view columns. Is it possible to get that information? Thanks, Rachna

unique-id for the mapper task with tez execution engine

2015-12-22 Thread Amey Barve
Hi All, Thanks in advance! I am running Hive queries with the MR engine, and I wanted to get a unique-id from the mapper task, so I used the following property from the configuration: conf.get("mapreduce.task.id"); Now I want to run the same Hive queries with the Tez engine, and I want to know what should be my

Re: unique-id for the mapper task with tez execution engine

2015-12-22 Thread Gopal Vijayaraghavan
Hi, (x-posts to bcc:) On 12/22/15, 9:19 PM, "Amey Barve" wrote: > conf.get("mapreduce.task.id"); > Now I want to run same hive queries with tez engine and I want to know what should be my unique-id. Is there any property from configuration or other that can give me

Re: Low performance map join when join key types are different

2015-12-22 Thread Zhiwen Sun
Thanks to Gopal. But why does disabling mapjoin give better performance when we don't cast to string (users are always lazy)? Is the join key comparison in the reduce stage quicker? Zhiwen Sun On Wed, Dec 23, 2015 at 2:36 AM, Gopal Vijayaraghavan wrote: > > > We

Re: unique-id for the mapper task with tez execution engine

2015-12-22 Thread Gopal Vijayaraghavan
Hi, > So what do you suggest to get unique-id for mapper task with tez execution engine? > conf.get("mapreduce.task.partition"); > Is this correct? Yes, that is correct - but it can only be unique within a Mapper vertex. Tez plans sort of look like this for complex queries

error while defining custom schema in Spark 1.5.0

2015-12-22 Thread Divya Gehlot
Hi, I am a newbie to Apache Spark, using the CDH 5.5 QuickStart VM, which has Spark 1.5.0. I am working on a custom schema and getting an error: import org.apache.spark.sql.hive.HiveContext >> >> scala> import org.apache.spark.sql.hive.orc._ >> import org.apache.spark.sql.hive.orc._ >> >> scala> import

Low performance map join when join key types are different

2015-12-22 Thread Zhiwen Sun
Hi all: We found that when we join on keys of two different types, Hive will convert all join keys to double. Consider this simple query: explain > select * > from table_a a > join table_b b on a.id = b.id > If the type of a.id is int while b.id's type is string, Hive will convert a.id and b.id to

UnsupportedOperationException Schema for type String => Int is not supported

2015-12-22 Thread zml张明磊
Hi, Spark version: 1.4.1. Running the code I get the following error; how can I fix the code so it runs correctly? I don't know why the schema doesn't support this type. If I use callUDF instead of udf, everything is good. Thanks, Minglei. val index:(String => (String => Int)) =

Re: It seems that the result of Hive on Spark is wrong and the results of Hive and Hive on Spark are not the same

2015-12-22 Thread Jone Zhang
Hive 1.2.1 on Spark1.4.1 2015-12-22 19:31 GMT+08:00 Jone Zhang : > *select * from staff;* > 1 jone 22 1 > 2 lucy 21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > > *select id,date_ from trade union all select id,"test" from trade ;* > 1 201510210908 > 2 201509080234 >

It seems that the result of Hive on Spark is wrong and the results of Hive and Hive on Spark are not the same

2015-12-22 Thread Jone Zhang
*select * from staff;* 1 jone 22 1 2 lucy 21 1 3 hmm 22 2 4 james 24 3 5 xiaoliu 23 3 *select id,date_ from trade union all select id,"test" from trade ;* 1 201510210908 2 201509080234 2 201509080235 1 test 2 test 2 test *set hive.execution.engine=spark;* *set spark.master=local;* *select

Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Mich Talebzadeh
Hi, I am trying this code on table tt, defined as an ORC table: 0: jdbc:hive2://rhes564:10010/default> show create table tt; +-+--+ | createtab_stmt |

Re: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Elliot West
Hi, The input/output formats do not appear to be ORC; have you tried 'stored as orc'? Additionally, you'll need to set the property 'transactional=true' on the table. Do you have the original create table statement? Cheers - Elliot. On Tuesday, 22 December 2015, Mich Talebzadeh
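For illustration, a table declared transactional at create time might look like the sketch below; the table name and column list are hypothetical, and bucketing is required for ACID tables on Hive 1.x:

    -- Hypothetical ORC table declared transactional up front
    create table tt_acid (
      object_id    bigint,
      object_name  varchar(30)
    )
    clustered by (object_id) into 4 buckets
    stored as orc
    tblproperties ('transactional'='true');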

RE: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Mich Talebzadeh
Thanks Elliot, It sounds like that table was created as "create table tt as select * from t". Although the original table t was created as transactional, as shown below, the table tt is not! 0: jdbc:hive2://rhes564:10010/default> show create table t;

RE: Attempt to do update or delete using transaction manager that does not support these operations. (state=42000,code=10294)

2015-12-22 Thread Mich Talebzadeh
Dropped and created table tt as follows: drop table if exists tt; create table tt ( owner varchar(30) ,object_name varchar(30) ,subobject_name varchar(30) ,object_id bigint ,data_object_id bigint ,object_type

Re: unique-id for the mapper task with tez execution engine

2015-12-22 Thread Amey Barve
Thanks Gopal! So what do you suggest to get a unique-id for the mapper task with the Tez execution engine? conf.get("mapreduce.task.partition"); Is this correct? Regards, Amey On Wed, Dec 23, 2015 at 10:58 AM, Gopal Vijayaraghavan wrote: > Hi, > > (x-posts to bcc:) > > On

Re: Low performance map join when join key types are different

2015-12-22 Thread Gopal Vijayaraghavan
> But why disable mapjoin has better performance when we don't use cast to string (user always lazy)? > Join key values comparison in reduce stage is more quickly? The HashMap is slower than the full-sort + sorted-merge-join. It shouldn't be, but it hits

Re: Low performance map join when join key types are different

2015-12-22 Thread Gopal Vijayaraghavan
> We found that when we join on two different type keys, hive will convert all join key to Double. This is because of type coercions for BaseCompare, so that String:Integer comparisons with "<=" will work similarly to "=". > b.id to double. When the conversion occurs, map join will become
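One way to avoid the coercion, assuming the string column really holds numeric ids, is to cast one side explicitly so both keys end up with the same native type instead of double; a rough sketch using the table and column names from this thread:

    -- Cast the string key explicitly rather than letting Hive coerce both sides to double
    explain
    select *
    from table_a a
    join table_b b
      on a.id = cast(b.id as int);   -- or: cast(a.id as string) = b.id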

Re: unique-id for the mapper task with tez execution engine

2015-12-22 Thread Amey Barve
Ok, thanks. Can I get this from some conf, to be absolutely sure that I get a unique id? Regards, Amey On Wed, Dec 23, 2015 at 12:06 PM, Gopal Vijayaraghavan wrote: > Hi, > > > So what do you suggest to get unique-id for mapper task with tez > execution engine? > > >