Alternatives to self join

2016-07-26 Thread Buntu Dev
I'm currently self-joining a table 4 times on varying conditions. Although it works fine, I'm not sure whether there are alternatives that would perform better. Please let me know. Thanks!
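
One common alternative, sketched below under the assumption that the four joins pull different rows of the same table distinguished by some discriminating column (the table, column, and value names here are placeholders), is a single scan with conditional aggregation:

  SELECT id,
         MAX(CASE WHEN event_type = 'view'     THEN ts END) AS view_ts,
         MAX(CASE WHEN event_type = 'click'    THEN ts END) AS click_ts,
         MAX(CASE WHEN event_type = 'purchase' THEN ts END) AS purchase_ts,
         MAX(CASE WHEN event_type = 'refund'   THEN ts END) AS refund_ts
  FROM events
  GROUP BY id;

This reads the table once instead of four times; whether it actually beats the self-joins depends on the real join conditions, so treat it as a pattern rather than a drop-in replacement.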

org.apache.hadoop.hive.serde2.io.DoubleWritable cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable error

2016-05-06 Thread Buntu Dev
I created a table using SparkSQL and loaded Parquet data into the table, but when I attempt to do a 'SELECT * FROM tbl' I keep running into this error: Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.Double
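
A hedged guess at the usual cause: the Parquet files store a column as DOUBLE while the table declares it as DECIMAL, so the reader attempts an invalid cast. One way to check is to compare the table DDL against the Parquet file schema and, if they disagree, re-declare the column to match the file type (the column and table names below are placeholders):

  DESCRIBE FORMATTED tbl;
  -- if e.g. `amount` is decimal(10,2) in the table but double in the Parquet files:
  ALTER TABLE tbl CHANGE amount amount DOUBLE;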

Cannot convert column 2 from string to map error

2016-02-29 Thread Buntu Dev
When attempting to insert a null value into a map column type, I run into this error: Cannot convert column 2 from string to map. Here is my Avro schema and the table definition: "fields": [ {"name": "src", "type": ["null", "string"], "default": null}, {"name": "ui
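
A workaround that is sometimes suggested for this class of error: a bare NULL literal is typed as string, so give it an explicit map type instead, for example via an IF whose non-null branch fixes the type (a sketch only, with placeholder names, since Hive versions differ in how they coerce the literal):

  SELECT src, IF(1 = 0, map('', ''), NULL) AS ui_map
  FROM staging_tbl;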

Combine rows with json string and map

2016-02-23 Thread Buntu Dev
I'm looking for ideas on how to go about merging columns from 2 tables. One of the tables has a JSON string column that needs to be added to the map column of the other table. json string: {"type": "fruit", "name":"apple"} map: {'type' -> 'fruit', 'f' -> 'foo', 'b' -> 'bar'} The resulting map fie
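
One way to sketch this with built-ins only (no map-merge UDF): flatten the existing map into 'k:v' pairs, strip the JSON punctuation down to the same 'k:v' form, and rebuild a single map with str_to_map. The table, column, and join key names below are placeholders, and it assumes a flat JSON object with plain string values:

  SELECT j.id,
         str_to_map(
           concat(
             regexp_replace(j.js, '[{}" ]', ''),                  -- type:fruit,name:apple
             ',',
             concat_ws(',', collect_list(concat(e.k, ':', e.v)))  -- type:fruit,f:foo,b:bar
           ),
           ',', ':') AS merged
  FROM json_tbl j
  JOIN map_tbl m ON j.id = m.id
  LATERAL VIEW explode(m.props) e AS k, v
  GROUP BY j.id, j.js;

Duplicate keys collapse to a single entry in str_to_map, so if both sides can carry the same key the precedence has to be decided up front.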

Re: Convert string to map

2016-01-20 Thread Buntu Dev
I found the Brickhouse Hive UDF `json_map`, which seems to convert a JSON string to a map of the given type. Thanks! On Wed, Jan 20, 2016 at 2:03 PM, Buntu Dev wrote: > I got a JSON string of the form: > > {"k1":"v1","k2":"v2","k3":"v3"} > > How would I go about converting this to a map? > > Thanks! >
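
For flat JSON with plain string values there is also a built-in-only route; a minimal sketch:

  SELECT str_to_map(regexp_replace('{"k1":"v1","k2":"v2","k3":"v3"}', '[{}"]', ''), ',', ':');
  -- strips the braces and quotes down to k1:v1,k2:v2,k3:v3, then splits into a map

This breaks as soon as values contain commas, colons, braces, or escaped quotes, so a proper JSON UDF such as the Brickhouse one is the safer choice for real data.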

Convert string to map

2016-01-20 Thread Buntu Dev
I got a JSON string of the form: {"k1":"v1","k2":"v2","k3":"v3"} How would I go about converting this to a map? Thanks!

Best practices for using Parquet

2016-01-19 Thread Buntu Dev
I'm looking to convert an existing Avro dataset into Parquet and wanted to know if there are any other performance-related properties that I can set, such as compression, block size, etc., to take advantage of Parquet. I could only find the `parquet.compression` property but it would be good to know i
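
For reference, a sketch of the knobs usually mentioned for Parquet output; the property names are standard Parquet/Hive settings, but the values below are placeholders rather than recommendations:

  SET parquet.compression=SNAPPY;       -- or GZIP / UNCOMPRESSED
  SET parquet.block.size=134217728;     -- Parquet row-group size in bytes
  SET parquet.page.size=1048576;        -- Parquet page size in bytes

  CREATE TABLE events_parquet
  STORED AS PARQUET
  TBLPROPERTIES ('parquet.compression'='SNAPPY')
  AS SELECT * FROM events_avro;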

Re: Hive HLL for appx count distinct

2015-12-30 Thread Buntu Dev
Thanks Gopal! In the hive-hll-udf, you seem to mention RRD. Is that something supported by Hive? Will go over the Data Sketches as well, thanks for the pointer :) On Wed, Dec 30, 2015 at 4:29 PM, Gopal Vijayaraghavan wrote: > > > I'm trying to explore the HLL UDF option to compute # of u

Hive HLL for appx count distinct

2015-12-30 Thread Buntu Dev
I'm trying to explore the HLL UDF option to compute the # of unique users for each time range (week, month, yr, etc.) and wanted to know if it's possible to just maintain an HLL struct for each day and then use those per-day structs to compute the uniques for various time ranges instead of running
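
The per-day-sketch idea in outline, with hypothetical UDF names (hll_build / hll_merge / hll_count stand in for whatever the chosen HLL library actually calls its build, merge, and cardinality functions, and the table names are placeholders):

  -- build one sketch per day
  CREATE TABLE daily_hll AS
  SELECT dt, hll_build(user_id) AS sketch
  FROM events
  GROUP BY dt;

  -- estimate uniques for an arbitrary range by merging the daily sketches
  SELECT hll_count(hll_merge(sketch)) AS approx_uniques
  FROM daily_hll
  WHERE dt BETWEEN '2015-12-01' AND '2015-12-31';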

Re: Convert Array to string

2015-10-05 Thread Buntu Dev
Looks like concat_ws does the job, thanks! On Mon, Oct 5, 2015 at 1:16 PM, Buntu Dev wrote: > I have a column of type Array generated by the collect_set function and > want to concatenate the strings separated by some delimiter. Is there any > built-in function to handle this? > > Thanks! >
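
A minimal sketch of the pattern, with placeholder names:

  SELECT id, concat_ws(',', collect_set(tag)) AS tags
  FROM tag_table
  GROUP BY id;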

Convert Array to string

2015-10-05 Thread Buntu Dev
I have a column of type Array generated by the collect_set function and want to concatenate the strings separated by some delimiter. Is there any built-in function to handle this? Thanks!

How to debug avro parse exception: "java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 26"

2015-09-21 Thread Buntu Dev
I have a table with a .avro file, and when I attempt to run a simple query like "select count(*) from " I run into the exception below. I don't get any more info about what exactly is wrong with the Avro records. Is there any other way to debug this issue? Thanks! ~~ Error: java.io.IOException
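
One way to narrow this down outside Hive, assuming the Avro command-line tools are available (the jar version and file path below are placeholders), is to decode the file directly; a corrupt file or a schema mismatch usually produces a more specific error than the Hive wrapper exception:

  java -jar avro-tools-1.7.7.jar tojson /path/to/file.avro > /dev/null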

SemanticException Partition spec {p1=null, p2=null} contains non-partition columns

2015-08-21 Thread Buntu Dev
I'm running into this error while doing a dynamic partition insert. Here's how I created the table: CREATE TABLE `part_table`( `c1` bigint, `c2` bigint, `c3` bigint) PARTITIONED BY (`p1` string, `p2` string) STORED AS PARQUET; Here is the insert statement: SET hive.exec.dynamic.pa
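
That error usually means the PARTITION clause of the insert names columns the table is not actually partitioned by (or spells them differently). A sketch of a dynamic-partition insert matching the DDL above, with a placeholder source table:

  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  INSERT OVERWRITE TABLE part_table PARTITION (p1, p2)
  SELECT c1, c2, c3, p1, p2   -- partition columns last, in PARTITION() order
  FROM source_table;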

Re: HBase and Hive integration

2015-06-22 Thread Buntu Dev
Thanks Sanjiv. I've updated the Hive config setting the hbase.zookeeper.quorum to point to the appropriate zookeeper. On Tue, Jun 23, 2015 at 10:53 AM, Buntu Dev wrote: > Thanks Sanjiv. > > On 6/23/15, @Sanjiv Singh wrote: > > Hi Buntu, > > > > > > Hive

Re: HBase and Hive integration

2015-06-22 Thread Buntu Dev
Thanks Sanjiv. On 6/23/15, @Sanjiv Singh wrote: > Hi Buntu, > > > Hive config to provide zookeeper quorum for the HBase cluster > > > --hiveconf hbase.zookeeper.quorum=## > > > Regards > Sanjiv Singh > Mob : +091 9990-447-339 > > On Fri,
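
For reference, a sketch of the two pieces involved: pointing Hive at the HBase cluster's ZooKeeper quorum and mapping an HBase table into Hive (hosts, table, and column names are placeholders):

  SET hbase.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com;

  CREATE EXTERNAL TABLE hbase_pages (rowkey STRING, views BIGINT)
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,stats:views')
  TBLPROPERTIES ('hbase.table.name' = 'pages');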

Re: Query in Pattern of string value of Date

2015-06-18 Thread Buntu Dev
Did you already check out the built-in Date Functions supported by Hive: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions You might also want to search for any existing UDFs if the Date Functions do not satisfy your needs, or write one yourself using
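
A couple of those built-ins as a quick illustration (the input format string is just an example):

  SELECT to_date('2015-06-18 10:30:00');                                          -- 2015-06-18
  SELECT from_unixtime(unix_timestamp('18/06/2015', 'dd/MM/yyyy'), 'yyyy-MM-dd'); -- 2015-06-18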

Query HBase table based on timestamp

2015-06-17 Thread Buntu Dev
My use case is to query time series data ingested into an HBase table, with a web page name or URL as the row key and related properties as column qualifiers. The properties for a web page are dynamic, i.e., the column qualifiers are dynamic for a given timestamp. I would like to create a Hive manag
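
For dynamic column qualifiers, the usual approach with the HBase storage handler is to map the whole column family to a Hive map column so that every qualifier becomes a map key; a sketch with placeholder names:

  CREATE EXTERNAL TABLE page_props (page STRING, props MAP<STRING, STRING>)
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,p:')
  TBLPROPERTIES ('hbase.table.name' = 'pages');

Exposing the HBase cell timestamp to Hive is much more limited, so whether this covers the time-range part of the query depends on how the timestamps are stored.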

Re: HBase and Hive integration

2015-06-12 Thread Buntu Dev
a-hive-pt2/ > > On Fri, Jun 5, 2015 at 10:56 AM, Sean Busbey wrote: > >> +user@hive >> -user@hbase to bcc >> >> Hi! >> >> This question is better handled by the hive user list, so I've copied >> them in and moved the hbase user list to bcc

Re: Create function using custom UDF

2015-04-24 Thread Buntu Dev
me.]function_name AS class_name > > [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ]; > > > > > > *From:* Buntu Dev [mailto:buntu...@gmail.com] > *Sent:* April 24, 2015 12:20 > *To:* user@hive.apache.org > *Subject:* Re: Create function usin
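
The syntax quoted there, filled in with placeholder names, registers a permanent function whose jar is fetched from HDFS for each session, so no ADD JAR is needed afterwards:

  CREATE FUNCTION mydb.my_udf AS 'com.example.hive.MyUDF'
  USING JAR 'hdfs:///user/me/udfs/my-udf.jar';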

Re: Create function using custom UDF

2015-04-23 Thread Buntu Dev
///home/me/my.jar,file:///home/you/your.jar,file:///home/us/our.jar > > > > > > On Thu, Apr 23, 2015 at 11:13 PM, Buntu Dev wrote: > >> Thanks but is there a way to make it available to other users and avoid >> 'add jar ' step? >> >> On Thu,

Re: Create function using custom UDF

2015-04-23 Thread Buntu Dev
function in each > session. > If you use the same Jar files and functions frequently, you can add those > statements to your $HOME/.hiverc file. > > > > On Apr 23, 2015, at 10:37 PM, Buntu Dev wrote: > > > > Does the JAR need to be added for every session before usin
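
The .hiverc approach is just a file of statements the CLI runs at startup; a minimal sketch with placeholder paths and names:

  -- contents of $HOME/.hiverc
  ADD JAR /home/me/my-udf.jar;
  CREATE TEMPORARY FUNCTION my_udf AS 'com.example.hive.MyUDF';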

Create function using custom UDF

2015-04-23 Thread Buntu Dev
Does the JAR need to be added for every session before using the custom UDF created using "CREATE FUNCTION"? I'm using Hive 0.13 and was able to add a custom UDF successfully and use it in a sample query. But in subsequent Hive sessions, I do see the function but I get this error when using th

Re: Querying Uniontype with Hive

2015-03-23 Thread Buntu Dev
Still trying to figure out if there is any way to query uniontype fields directly in HiveQL, or if there is any existing UDF to help. Thanks! On Tue, Feb 24, 2015 at 10:43 AM, Buntu Dev wrote: > Hi, > > This might've been asked previously but I couldn't find any e

Error creating a partitioned view

2015-03-12 Thread Buntu Dev
I have a 'log' table which is currently partitioned by year, month, and day. I'm looking to create a partitioned view on top of the 'log' table but am running into this error: hive> CREATE VIEW log_view PARTITIONED ON (pagename,year,month,day) AS SELECT pagename year,month,day,uid,properties FROM log
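
Two things stand out (hedged, since the statement is cut off): the SELECT list appears to be missing a comma after pagename, and Hive expects the PARTITIONED ON columns of a view to be the last columns of its SELECT. A sketch of the corrected shape:

  CREATE VIEW log_view PARTITIONED ON (pagename, year, month, day) AS
  SELECT uid, properties, pagename, year, month, day
  FROM log;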

Re: Create custom UDF

2015-03-05 Thread Buntu Dev
ric/GenericUDFAddMonths.java > > > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInitCap.java > > > On Tue, Mar 3, 2015 at 2:43 PM, Buntu Dev wrote: > >> I couldn't find any official documentation on how to creat

Re: Create custom UDF

2015-03-05 Thread Buntu Dev
/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInitCap.java > > > On Tue, Mar 3, 2015 at 2:43 PM, Buntu Dev wrote: > >> I couldn't find any official documentation on how to create a UDF and mvn >> dependencies for building the project except for this page:

Re: Create custom UDF

2015-03-03 Thread Buntu Dev
Thanks Pradeep. On Tue, Mar 3, 2015 at 2:53 PM, Pradeep Gollakota wrote: > This is what I use: > > <dependency> > <groupId>org.apache.hive</groupId> > <artifactId>hive-exec</artifactId> > <version>0.12.0</version> > <scope>provided</scope> > </dependency> > > I don't believe anything else is needed. > > On Tue, Mar 3, 2015 at 2:43 PM, B

Create custom UDF

2015-03-03 Thread Buntu Dev
I couldn't find any official documentation on how to create a UDF and the mvn dependencies for building the project except for this page: https://cwiki.apache.org/confluence/display/Hive/HivePlugins Can anyone help me with what's needed to construct the pom? Thanks!

Querying Uniontype with Hive

2015-02-24 Thread Buntu Dev
Hi, This might've been asked previously but I couldn't find any examples of how to query uniontype in Hive. I have this field in the table: `location` uniontype,boolean> How do I go about querying: "select location.latitiude, location.latitude from ..." since I get this error: . Operator is on

Matchpath usage examples?

2015-01-21 Thread Buntu Dev
Are there any usage examples for the Matchpath UDF? I have time series data and want to generate a funnel report; is Matchpath suitable for such use cases? Thanks!

Hive Insert overwrite creating a single file with large block size

2015-01-09 Thread Buntu Dev
I have a bunch of small Avro files (<5MB) and a table defined against those files. I created a new table and did an 'INSERT OVERWRITE' selecting from the existing table, but did not find any option to provide the file block size. It currently creates a single file per partition. How do I specify the ou
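
There is no direct output-block-size option on INSERT OVERWRITE; what usually gets tweaked instead, since each mapper or reducer writes its own output file, is the HDFS block size of the written files and the number of writers. The settings below are existing Hadoop/Hive knobs, but the values are placeholders and the right combination depends on the job:

  SET dfs.block.size=134217728;                        -- HDFS block size of the files being written
  SET mapred.max.split.size=268435456;                 -- smaller splits -> more mappers -> more output files
  SET hive.exec.reducers.bytes.per.reducer=268435456;  -- controls file count when a reduce stage is involved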

Insert into partitioned table from unpartitioned table

2014-12-22 Thread Buntu Dev
Hi -- the destination table is partitioned on columns that are not in the source table, and I get this error when attempting to do an INSERT OVERWRITE. How do I go about fixing this? Thanks: SET hive.exec.dynamic.partition = true; SET hive.exec.dynamic.partition.mode = nonstrict; INSERT OVERWRIT
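
Since the partition columns do not exist in the source table, their values have to be supplied explicitly: either as constants in a static partition spec, or computed in the SELECT for a dynamic insert. Two sketches with placeholder names:

  -- static: every row lands in one partition
  INSERT OVERWRITE TABLE dest PARTITION (dt='2014-12-22')
  SELECT c1, c2 FROM src;

  -- dynamic: derive the partition value per row, listed last in the SELECT
  INSERT OVERWRITE TABLE dest PARTITION (dt)
  SELECT c1, c2, to_date(event_ts) AS dt FROM src;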

How to query avro map column

2014-12-09 Thread Buntu Dev
Hi -- I've got a dataset in Avro format with one of the columns of 'map' data type, defined as: {"name": "data", "type": {"type": "map", "values": "string"}} If the data column has, say: {"param1":"value1","param2":"value2"}, how do I go about writing a Hive query to extract only 'param1'? Th
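
Map columns are indexed by key in HiveQL, so with the 'data' column from the schema above the usual form is:

  SELECT data['param1'] FROM tbl;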