Indexing

2017-12-30 Thread Sachit Murarka
Hello, I have seen some blogs saying that indexing is not recommended and that we can use the ORC format instead. Can you please advise? I could not find any official statement. Kind Regards, Sachit Murarka

Re: Hive indexing optimization

2015-06-30 Thread John Pullokkaran
To: user@hive.apache.org Subject: RE: Hive indexing optimization I've attached the output. Thanks. B Subject: Re: Hive indexing optimization From: jpullokka

RE: Hive indexing optimization

2015-06-29 Thread Bennie Leo
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Thank you, B Subject: Re: Hive indexing optimization From

Re: Hive indexing optimization

2015-06-29 Thread John Pullokkaran
user@hive.apache.org Subject: RE: Hive indexing optimization Here is the explain output: STAGE PLANS: Stage: Stage-1 Tez Edges: Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (SIMPLE_EDGE) Vertices: Map 1 Map Operator Tree

RE: Hive indexing optimization

2015-06-29 Thread Bennie Leo
I've attached the output. Thanks. B Subject: Re: Hive indexing optimization From: jpullokka...@hortonworks.com To: user@hive.apache.org Date: Mon, 29 Jun 2015 19:17:44 + Could you post explain extended output? From: Bennie Leo tben...@hotmail.com Reply

Re: Hive indexing optimization

2015-06-27 Thread John Pullokkaran
"SELECT StartIp, EndIp, Country FROM ipv4geotable" should have been rewritten as a scan against index table. Bitmap indexes seem to support inequalities (=, , =). Post the explain plan. On 6/26/15, 8:56 PM, Gopal Vijayaraghavan gop...@apache.org wrote: Hi, Hive indexes won't really help you

RE: Hive indexing optimization

2015-06-26 Thread Bennie Leo
; ? I don't know how I could include this within my current query. Cheers, B Subject: Re: Hive indexing optimization From: jpullokka...@hortonworks.com To: user@hive.apache.org Date: Fri, 26 Jun 2015 01:27:21 + Set hive.optimize.index.filter=true; Thanks John From: Bennie Leo

Re: Hive indexing optimization

2015-06-26 Thread Gopal Vijayaraghavan
Hi, Hive indexes won't really help you speed up that query right now, because of the plan it generates due to the = clauses. CREATE TABLE ipv4table AS SELECT logon.IP, ipv4.Country FROM (SELECT * FROM logontable WHERE isIpv4(IP)) logon LEFT OUTER JOIN (SELECT StartIp, EndIp, Country FROM

Hive indexing optimization

2015-06-25 Thread Bennie Leo
Hi, I am attempting to optimize a query using indexing. My current query converts an ipv4 address to a country using a geolocation table. However, the geolocation table is fairly large and the query takes an impractical amount of time. I have created indexes and set the binary search
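The lookup described in this post (mapping an IPv4 address to the row whose StartIp..EndIp range contains it) can be sketched outside Hive as a binary search over sorted range starts. This is only an illustration of the idea, not how Hive executes the query: the sample rows are made up, and it assumes the ranges do not overlap and that addresses are compared as integers.

```python
import bisect
import ipaddress


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(ip))


# Hypothetical rows shaped like (StartIp, EndIp, Country); not real geolocation data.
RANGES = sorted([
    (ip_to_int("10.0.0.0"), ip_to_int("10.0.0.255"), "AA"),
    (ip_to_int("10.0.1.0"), ip_to_int("10.0.1.255"), "BB"),
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "CC"),
])
STARTS = [start for start, _, _ in RANGES]


def country_for(ip):
    """Return the country whose range contains ip, or None if no range matches."""
    value = ip_to_int(ip)
    # Find the last range whose start is <= value (assumes non-overlapping ranges).
    i = bisect.bisect_right(STARTS, value) - 1
    if i < 0:
        return None
    _, end, country = RANGES[i]
    return country if value <= end else None
```

Each lookup costs O(log n) comparisons, whereas a join on "StartIp <= IP AND IP <= EndIp" without a usable index compares every address against every range.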

Re: Hive indexing optimization

2015-06-25 Thread John Pullokkaran
user@hive.apache.org Subject: Hive indexing optimization Hi, I am attempting to optimize a query using indexing. My current query converts an ipv4 address to a country using a geolocation table. However, the geolocation table is fairly large

Hive Indexing and ORC

2014-09-06 Thread Alain Petrus
Hello, Is it possible to create an index on a table stored as ORC and compressed with Snappy? Does it make sense? I am wondering whether Hive indexing is a mature feature. Thanks, Alain

Indexing in Hive

2014-04-18 Thread saquib khan
Hi, For large tables, it takes a lot of time to load the indexes into the index table. Is there any way we can reduce the index load time? CREATE TABLE SE_TX_SUMMARY (COUNTY string, BLOCKGROUPID string, GROUPING_ID int) PARTITIONED BY (EXPOSED_TIME int) row format delimited fields terminated by

Indexing in Hive 0.12 on a partitioned and bucketed table

2014-03-20 Thread Sagar Mehta
Hi Guys, We have a Hive 0.12 ORC table that is partitioned on year, month, day, hour and is bucketed by one column. So far so good - We are seeing a good speed-up compared to the non-ORC format. - Now we want to add an index on another commonly used column. My question was -

Predicate pushdown/indexing on ORC file

2013-11-07 Thread Avrilia Floratou
Hi all, I'm using hive-12. I have a file that contains 10 integer columns stored in ORC format. The ORC file is zlib compressed and indexing is enabled. I'm running a simple select count(*) with a predicate of the form (Col1 =0 OR col2 = 0 etc). The predicate touches all 10 columns but its

Re: Predicate pushdown/indexing on ORC file

2013-11-07 Thread Prasanth Jayachandran
have a file that contains 10 integer columns stored in ORC format. The ORC file is zlib compressed and indexing is enabled. I'm running a simple select count(*) with a predicate of the form (Col1 =0 OR col2 = 0 etc). The predicate touches all 10 columns but its selectivity is 0 (none
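As background for this thread: ORC's built-in index keeps min/max statistics for each group of rows, and predicate pushdown uses them to skip groups that cannot contain a match. The sketch below shows that reasoning for an OR of equality predicates like the one in the question. It is a simplified model with made-up statistics, not the ORC reader's code, and the function names are hypothetical.

```python
def can_skip(stats, wanted):
    """Decide whether one row group can be skipped.

    stats:  {column: (min, max)} for the row group
    wanted: {column: value}, read as "col1 = v1 OR col2 = v2 ..."
    An OR predicate allows skipping only if every disjunct is ruled out.
    """
    return all(not (stats[col][0] <= value <= stats[col][1])
               for col, value in wanted.items())


def groups_to_read(row_groups, wanted):
    """Return the indexes of the row groups that still have to be scanned."""
    return [i for i, stats in enumerate(row_groups) if not can_skip(stats, wanted)]


# Made-up statistics for three row groups of a two-column file.
SAMPLE_GROUPS = [
    {"col1": (1, 9), "col2": (5, 20)},   # neither column can be 0: skipped
    {"col1": (-3, 4), "col2": (1, 2)},   # col1 range contains 0: read
    {"col1": (1, 2), "col2": (0, 0)},    # col2 range contains 0: read
]
```

With an OR across all ten columns, a row group is skipped only when none of the ten min/max ranges contains the searched value, so the benefit depends heavily on how the data is distributed within the file.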

Help me understand Hive indexing.

2013-11-06 Thread Heller, Chris
Hi, I am new to Hive, and am trying to set up an index on a Hive table to improve query performance. I am presently using the CDH 4.2 Hadoop distribution, which ships with Hive 0.10, so from what I have read table index support should be available. What I am seeing though is that when I go and

Review improvement request: Hive indexing doc

2013-06-28 Thread Lefty Leverenz
The stub of an Indexing user doc in the Hive wiki's Language Manual now includes some simple examples, adapted from the test suite. Would someone who uses Hive indexes please review it and make any necessary corrections and additions? For example, I omitted examples of indexes on partitioned tables

Hive 0.9 and Indexing

2012-07-26 Thread John Omernik
I am playing with Hive indexing and am a little discouraged by the gap between the potential seen and the amount of documentation around indexing. I am running Hive 0.9 and started playing with indexing as follows: I have a table logs that has a bunch of fields but for this, let's say three

RE: Hive 0.9 and Indexing

2012-07-26 Thread Connell, Chuck
I do not have answers to any of your questions, but I appreciate you raising them. My team is very interested in Hive indexing as well, so I look forward to this discussion. Chuck Connell Nuance R&D Data Team Burlington, MA From: John Omernik [mailto:j...@omernik.com] Sent: Thursday, July 26

Problem with indexing in Hive

2012-07-26 Thread Ablimit Aji
I have written a custom index handler and wanted to test it. However, Hive is not using it. So I tested with the simple table (pokes (int foo, string bar)) which comes with the Hive distribution for testing purposes. Then I created a compact index and ran set hive.optimize.index.filter=true; However, upon

Indexing in hive

2012-05-16 Thread Raghunath, Ranjith
I am currently using hive 0.7.1 and creating indexes based on columns in the where clause. However, when I run the explain plan I do not see the index being leveraged. The syntax that I am using to build the index is as follows: CREATE INDEX x ON TABLE t(j) AS

Indexing

2011-10-07 Thread Avrilia Floratou
Hi, I'd like to know what the current status of indexing in Hive is. What I've found so far is that the user has to manually set the index table for each query. Something like this: ** insert overwrite directory /tmp/index_result select `_bucketname

Re: Indexing Help

2011-08-05 Thread Shouguo Li
on a side note, i'm looking at adding indexes to our hive tables as well, is there a performance/space trade off comparison or metrics? thx! On Wed, Aug 3, 2011 at 10:52 AM, Siddharth Ramanan siddharth.rama...@gmail.com wrote: Hi all, I have used compact index for my table and the

Indexing .gz files

2011-08-03 Thread Martin Konicek
Hi, can indexes work on gzipped files? The index gets built without errors using ALTER INDEX syslog_index ON syslog PARTITION(dt='2011-08-03') REBUILD; but when querying, no results are returned (and no errors reported). The query should be correct because with plaintext files it works.

Re: Indexing .gz files

2011-08-03 Thread yongqiang he
Unfortunately it does not, because a .gz file cannot be split. 2011/8/3 Martin Konicek martin.koni...@gmail.com: Hi, can indexes work on gzipped files? The index gets built without errors using ALTER INDEX syslog_index ON syslog PARTITION(dt='2011-08-03') REBUILD; but when querying, no

Indexing help

2011-07-28 Thread Siddharth Ramanan
Hi, I have a table which has close to a billion rows. I am trying to create an index for the table, but when I run the alter command, I always end up with map-reduce jobs with errors. The same runs fine for small tables, though. I also notice that the number of reducers is set to 24, even if set

Error while indexing the LZO file

2011-07-27 Thread Ankit Jain
Hi all, I tried to index the LZO file but got the following error while indexing it: java.lang.ClassCastException: com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast to com.hadoop.compression.lzo.LzopDecompressor