Re: How to increase the mapper number in my case

2016-01-08 Thread Ankit Bhatnagar
Check these: mapred.max.split.size, mapred.min.split.size. On Friday, January 8, 2016 6:41 PM, Todd wrote: Hi, I have Hadoop (2.6.0, pseudo distributed mode) and Hive (1.2.1) installed on my local machine. I have a table A, its underlying file takes up 8 HDFS blocks. When I run a quer
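As a rough sketch of the suggestion above, the split-size properties can be set per session before running the query. The byte values here are illustrative assumptions, not recommendations. The sketch also assumes the session uses Hive's default input format (CombineHiveInputFormat), which groups blocks into larger splits, so a smaller maximum split size would be expected to produce more mappers. Whether it actually does depends on the file size and format.

```sql
-- Illustrative values only (assumed, not taken from the thread).
-- Cap each input split at 32 MB so that more splits (and mappers) are created.
set mapred.max.split.size=33554432;
-- Keep the minimum split size well below the maximum.
set mapred.min.split.size=1;

-- Hadoop 2 also accepts the newer property name for the maximum:
-- set mapreduce.input.fileinputformat.split.maxsize=33554432;

select count(1) from A;
```

The settings apply only to the current session, so they can be tried without touching hive-site.xml.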

Does hive(1.2.1) support to automatically detect parquet schema?

2016-01-08 Thread Todd
Hi, I would like to ask whether Hive (1.2.1) supports automatically detecting the Parquet schema. Thanks.

How to increase the mapper number in my case

2016-01-08 Thread Todd
Hi, I have Hadoop (2.6.0, pseudo distributed mode) and Hive (1.2.1) installed on my local machine. I have a table A, its underlying file takes up 8 HDFS blocks. When I run a query like select count(1) from A, I see from the result only 1 mapper task. I thought it should be equal to the block num

bitmap index on FACT table

2016-01-08 Thread Mich Talebzadeh
Hi, I have the usual SALES fact table with 5 million rows, partitioned by year and month. I created 5 bitmap indexes, all on the foreign keys from the DIMENSION tables, as below: 0: jdbc:hive2://rhes564:10010/default> show index on sales; +---+---+---

Benefit of following setting.

2016-01-08 Thread mahender bigdata
Hi, I have a doubt about the following settings, for which I could not find a clear explanation: * SET hive.optimize.index.filter=false; * set hive.mapjoin.hybridgrace.hashtable=false; * set hive.optimize.null.scan=false; * Is there any negative by enabling hive.optimize.index.filter al

Re: Hive UDF accessing https request

2016-01-08 Thread Sergey Shelukhin
To start with, you can remove the try-catch so that the exception is not swallowed and you can see if an error occurs. However, note that this is an anti-pattern for any reasonable-sized dataset. From: Prabhu Joseph mailto:prabhujose.ga...@gmail.com>> Reply-To: "user@hive.apache.org

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Ophir Etzion
It didn't work, assuming I did the right thing. In the properties you could see {"key":"hive.aux.jars.path","value":"file:///data/loko/foursquare.web-hiverc/current/hadoop-hive-serde.jar,file:///data/loko/foursquare.web-hiverc/current/hadoop-hive-udf.jar","isFinal":false,"resource":"programatical

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Edward Capriolo
Yes, you can add UDFs via ADD JAR. But strangely, the classpath of 'the driver' of the Hive process does not seem to be able to utilize InputFormats and SerDes that have been added to the session via ADD JAR. At one point I understood why. This is probably something we should ticket and come up wi

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Ophir Etzion
Thanks! In certain use cases you could, but I forgot about the aux thing; that's probably it. On Fri, Jan 8, 2016 at 12:24 PM, Edward Capriolo wrote: > You can not 'add jar' input formats and serde's. They need to be part of > your auxlib. > > On Fri, Jan 8, 2016 at 12:19 PM, Ophir Etzion > wrote:

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Edward Capriolo
You can not 'add jar' input formats and serde's. They need to be part of your auxlib. On Fri, Jan 8, 2016 at 12:19 PM, Ophir Etzion wrote: > I tried now. still getting > > 16/01/08 16:37:34 ERROR exec.Utilities: Failed to load plan: > hdfs://hadoop-alidoro-nn-vip/tmp/hive/hive/c2af9882-38a9-42b

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Ophir Etzion
I tried now. Still getting: 16/01/08 16:37:34 ERROR exec.Utilities: Failed to load plan: hdfs://hadoop-alidoro-nn-vip/tmp/hive/hive/c2af9882-38a9-42b0-8d17-3f56708383e8/hive_2016-01-08_16-36-41_370_3307331506800215903-3/-mr-10004/3c90a796-47fc-4541-bbec-b196c40aefab/map.xml: org.apache.hive.com.eso

Re: Impact of partitioning on certain queries

2016-01-08 Thread Jörn Franke
https://snippetessay.wordpress.com/2015/07/25/hive-optimizations-with-indexes-bloom-filters-and-statistics/ Maybe a compact index makes more sense if you have high cardinality columns > On 08 Jan 2016, at 10:11, Mich Talebzadeh wrote: > > Interesting point below: > > Well you use a text format

RE: Impact of partitioning on certain queries

2016-01-08 Thread Mich Talebzadeh
Thanks, helpful. 0: jdbc:hive2://rhes564:10010/default> explain dependency select * from sales where year = 2001 and month = 12; +---+-

Re: Impact of partitioning on certain queries

2016-01-08 Thread Jörn Franke
Try explain dependency > On 08 Jan 2016, at 10:47, Mich Talebzadeh wrote: > > Thanks Gopal. > > Basically the following is true: > > 1.The storage layer is HDFS > 2.The execution engine is MR, Tez, Spark etc > 3.The access layer is Hive > > When we say the access layer is Hive,

RE: Impact of partitioning on certain queries

2016-01-08 Thread Mich Talebzadeh
Thanks Gopal. Basically the following is true: 1. The storage layer is HDFS 2. The execution engine is MR, Tez, Spark etc 3. The access layer is Hive When we say the access layer is Hive, is the assumption correct that we are referring to optimiser (loosely related to the opti

Re: Impact of partitioning on certain queries

2016-01-08 Thread Gopal Vijayaraghavan
> Ok we hope that partitioning improves performance where the predicate is >on partitioned columns Nope. Partitioning *only* improves performance if your queries run with set hive.mapred.mode=strict; That's the "use strict" easy way to make sure you're writing good queries. Even then, schem
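To illustrate the strict-mode point above, here is a minimal sketch against the sales table discussed in this thread, which is partitioned by year and month. The prod_id column is an assumption borrowed from the Oracle example elsewhere in the thread and may not exist in the Hive table.

```sql
set hive.mapred.mode=strict;

-- In strict mode Hive rejects this query, because it has no predicate on a
-- partition column and would scan every partition (prod_id is an assumed column).
select count(*) from sales where prod_id = 10;

-- This one is accepted: the predicate on year and month lets Hive prune partitions.
select count(*) from sales where year = 2001 and month = 12 and prod_id = 10;
```

Strict mode does not make a query faster by itself. It only refuses queries that cannot benefit from partition pruning.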

RE: Impact of partitioning on certain queries

2016-01-08 Thread Mich Talebzadeh
Interesting point below: Well you use a text format for your data so you should not be surprised. For text based formats, such as csv, you can always use the hive bitmap index. How can one create a bitmap index in Hive please? Dr Mich Talebzadeh LinkedIn
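For reference, a minimal sketch of the bitmap index DDL as it exists in Hive 1.2.1 (index support was removed in later Hive releases). The index name sales_prod_bix and the column prod_id are hypothetical and used only for illustration.

```sql
-- Create a bitmap index on one column of the sales table.
-- WITH DEFERRED REBUILD creates the index definition without populating it.
create index sales_prod_bix
on table sales (prod_id)
as 'BITMAP'
with deferred rebuild;

-- Populate (or refresh) the index.
alter index sales_prod_bix on sales rebuild;

-- List the indexes defined on the table.
show index on sales;
```

The rebuild step runs as a regular Hive job, so on a large table it may take a while.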

Re: Impact of partitioning on certain queries

2016-01-08 Thread Jörn Franke
Well you use a text format for your data so you should not be surprised. For text based formats, such as csv, you can always use the hive bitmap index. I do not think it makes a lot of sense to compare here processing csv files and internal tables of a relational database. > On 08 Jan 2016, at

Hive UDF accessing https request

2016-01-08 Thread Prabhu Joseph
Hi Experts, I am trying to write a Hive UDF which makes an https request and returns a result based on the response. From plain Java, the https response comes back, but the response for the https request made from the UDF is null. Can anyone review the below and share the correct steps to do this. create temporary func

RE: Impact of partitioning on certain queries

2016-01-08 Thread Mich Talebzadeh
Well that is debatable. The following table sales is partitioned in Oracle but has local bitmap indexes that help the query. select * from sales where prod_id = 10; no rows selected Execution Plan -- Plan hash value: 5112