Re: ORC does not support type conversion from INT to STRING.

2016-07-19 Thread Mich Talebzadeh
Is that a distro from Hortonworks? In that case what Matthew mentioned may be valid, unless you go through the pain of inserting using the CAST function? HTH
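A minimal sketch of the CAST workaround Mich refers to; the table and column names here are hypothetical:

  -- Rewrite the column explicitly instead of relying on ORC to convert INT to STRING at read time
  CREATE TABLE target_orc (id STRING, amount INT) STORED AS ORC;
  INSERT INTO TABLE target_orc
  SELECT CAST(id AS STRING), amount
  FROM source_table;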

Re: ORC does not support type conversion from INT to STRING.

2016-07-19 Thread Mahender Sarangam
But we are using Hive version 1.2. On 7/19/2016 12:43 PM, Mich Talebzadeh wrote: In Hive 2, I don't see this issue with INSERT/SELECT from an INT to a STRING column!

Re: ORC does not support type conversion from INT to STRING.

2016-07-19 Thread Mich Talebzadeh
In Hive 2, I don't see this issue with INSERT/SELECT from an INT to a STRING column!
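A sketch of the scenario under discussion, assuming a simple ORC table (names are made up). On Hive 1.2 the final SELECT can fail with "ORC does not support type conversion from INT to STRING" because the existing ORC files still store the column as INT, while Mich reports the read succeeds on Hive 2:

  CREATE TABLE t_orc (id INT, name STRING) STORED AS ORC;
  INSERT INTO t_orc VALUES (1, 'a');
  -- Metastore schema now says STRING, but the ORC files on disk still say INT
  ALTER TABLE t_orc CHANGE id id STRING;
  SELECT * FROM t_orc;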

Re: ORC does not support type conversion from INT to STRING.

2016-07-19 Thread Mahender Sarangam
Thanks Matthew. Currently we are on Hive 1.2 only. Is there any setting like "hive.metastore.disallow.incompatible.col.type.changes=false;" in Hive 1.2, or any workaround apart from reloading the entire table data? As a quick workaround, we are reloading the entire data. Can you please share with u
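A hedged sketch of how that property is typically used (table and column names are hypothetical; whether Hive 1.2 honours it this way would need to be verified, and since it is a metastore-side property it may have to be set in hive-site.xml on the metastore rather than per session):

  -- Relax the metastore check on incompatible column type changes
  SET hive.metastore.disallow.incompatible.col.type.changes=false;
  ALTER TABLE my_orc_table CHANGE old_int_col old_int_col STRING;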

RE: Hive External Storage Handlers

2016-07-19 Thread Lavelle, Shawn
I am using the compiled version of spark-sql, but the API seems to have changed and the storage handler is not receiving the pushdown predicate as it did on Hive 0.11 on Shark 0.9.2. We have only written our own storage handler. Specifically, the FILTER_EXPR_CONF-like parameters are not being set

Re: Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-19 Thread Ashok Kumar
Thanks Mich, looking forward to it :) On Tuesday, 19 July 2016, 19:13, Mich Talebzadeh wrote: Hi all, This will be in London tomorrow, Wednesday 20th July, starting at 18:00 for refreshments with kick-off at 18:30, a 5-minute walk from Canary Wharf Station, Jubilee Line. If you wish y

Re: Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-19 Thread Mich Talebzadeh
Hi all, This will be in London tomorrow, Wednesday 20th July, starting at 18:00 for refreshments with kick-off at 18:30, a 5-minute walk from Canary Wharf Station, Jubilee Line. If you wish you can register and get more info here. It will be in La Tasc

Re: Hive on TEZ + LLAP

2016-07-19 Thread Mich Talebzadeh
Sounds like, if I am correct, joining a fact table (store_sales) with two dimensions? Cool, thanks.

Re: Hive on TEZ + LLAP

2016-07-19 Thread Gopal Vijayaraghavan
> What was the type (Parquet, text, ORC etc) and row count for each of the three > tables above? I always use ORC for flat columnar data. ORC is designed to be ideal if you have measures/dimensions normalized into tables - most SQL workloads don't start with an indefinite-depth tree. hive> select count(1
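For context, a minimal ORC table definition of the kind Gopal describes; the column list and compression setting are illustrative only:

  CREATE TABLE store_sales_orc (
    ss_sold_date_sk    BIGINT,
    ss_item_sk         BIGINT,
    ss_ext_sales_price DECIMAL(7,2)
  )
  STORED AS ORC
  TBLPROPERTIES ('orc.compress'='ZLIB');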

Re: Hive External Storage Handlers

2016-07-19 Thread Jörn Franke
The main reason is that if you compile it yourself then nobody else can understand what you did, whereas any distribution can be downloaded and people can follow what you did. As far as I recall, you had described several problems that the distributions did not have (e.g. you could not compile Tez, Spark

Re: Hive on TEZ + LLAP

2016-07-19 Thread Mich Talebzadeh
Thanks. In this sample query: select i_brand_id brand_id, i_brand brand, sum(ss_ext_sales_price) ext_price from date_dim, store_sales, item where date_dim.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and i_manager_id=36 and
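The preview cuts the query off; a hedged reconstruction of the full statement, assuming it follows the usual TPC-DS query 55 shape (the date filters, GROUP BY, ORDER BY and LIMIT below are assumptions, not taken from the thread):

  select i_brand_id brand_id, i_brand brand, sum(ss_ext_sales_price) ext_price
  from date_dim, store_sales, item
  where date_dim.d_date_sk = store_sales.ss_sold_date_sk
    and store_sales.ss_item_sk = item.i_item_sk
    and i_manager_id = 36
    and d_moy = 12      -- assumed filter
    and d_year = 2001   -- assumed filter
  group by i_brand, i_brand_id
  order by ext_price desc, i_brand_id
  limit 100;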

Re: hive external table on gzip

2016-07-19 Thread Jörn Franke
Gzip is transparently handled by Hive (* by the formats available in Hive; if it is a custom format it depends on that format). What format is the table (CSV? JSON?)? Depending on that, you simply choose the corresponding serde and it transparently does the decompression. Keep in mind that gzip is not spl

RE: Hive External Storage Handlers

2016-07-19 Thread Lavelle, Shawn
Thanks all. Perhaps moving to 2.0.0 will be the answer. We are trying to move to Spark-SQL, but I wasn't sure how much of Hive the HiveContext supports – such as the external table API. The problem I encountered with Spark-SQL 1.6 was that the storage handler's predicates are not being pu

Re: hive external table on gzip

2016-07-19 Thread Mich Talebzadeh
Pretty simple: --1 Move the gz file or files into HDFS; multiple files can be in that staging directory: hdfs dfs -copyFromLocal /*.gz hdfs://rhes564:9000/data/stg/ --2 Create an external table; just one will do: CREATE EXTERNAL TABLE stg_t2 ... STORED AS TEXTFILE LOCATION '/data/stg/' --3 Creat
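A fuller sketch of the same three steps; the local path, host name and column layout below are illustrative, and the truncated third step is an assumption (loading the staged data into a final table):

  -- 1. Stage the gzipped files in HDFS (local path is illustrative)
  hdfs dfs -copyFromLocal /var/tmp/stage/*.gz hdfs://rhes564:9000/data/stg/

  -- 2. External table over the staging directory; gzipped text files are decompressed transparently
  CREATE EXTERNAL TABLE stg_t2 (
    col1 STRING,   -- illustrative columns; match the actual file layout
    col2 STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE
  LOCATION '/data/stg/';

  -- 3. (Assumption for the truncated step) copy into a managed ORC table for efficient querying
  CREATE TABLE t2 STORED AS ORC AS SELECT * FROM stg_t2;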

RE: hive external table on gzip

2016-07-19 Thread Amatucci, Mario, Vodafone Group
Hi, I have huge gzip files on HDFS and I'd like to create an external table on top of them. Any code example? Cheers. PS: I cannot use snappy or LZO due to some constraints. -- Kind regards, Mario Amatucci