I'm currently self-joining a table four times, each time on different conditions.
It works fine, but I'm not sure whether there are any alternatives that would
perform better. Please let me know.
Thanks!
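One pattern that sometimes replaces repeated self-joins is conditional aggregation: scan the table once and isolate each condition with a CASE expression. A hypothetical sketch (table, column, and condition names are made up; whether it applies depends on what the four join conditions actually do):

```sql
-- One scan instead of four self-joins: each CASE isolates one condition.
SELECT
  id,
  MAX(CASE WHEN event = 'view'  THEN ts END) AS view_ts,
  MAX(CASE WHEN event = 'click' THEN ts END) AS click_ts,
  MAX(CASE WHEN event = 'buy'   THEN ts END) AS buy_ts
FROM events
GROUP BY id;
```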
I created a table using SparkSQL and loaded Parquet data into it, but
when I attempt a 'SELECT * FROM tbl' I keep running into this error:
Error: java.io.IOException:
org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ClassCastException:
org.apache.hadoop.hive.serde2.io.Double
When attempting to insert a null value into a map column type,
I run into this error:
Cannot convert column 2 from string to map
Here is my Avro schema and the table definition:
"fields": [
{"name": "src", "type": ["null", "string"], "default": null},
{"name": "ui
I'm looking for ideas on how to go about merging columns from two tables. One
of the tables has a JSON string column whose contents need to be added to the
map column of the other table.
json string: {"type": "fruit", "name":"apple"}
map: {'type' -> 'fruit', 'f' -> 'foo', 'b' -> 'bar'}
The resulting map fie
I found the brickhouse Hive UDF `json_map` which seems to convert a JSON
string to a map of the given type.
Thanks!
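For what it's worth, a sketch of the merge using brickhouse's `json_map` plus its `combine` UDF (which, if I read the brickhouse docs right, merges two maps); table, column, and join-key names below are hypothetical:

```sql
-- Hypothetical sketch: convert the JSON column to a map at query time,
-- then merge it with the existing map column from the other table.
SELECT combine(b.props_map, json_map(a.json_col, 'string,string'))
FROM json_tbl a
JOIN map_tbl b ON a.id = b.id;
```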
On Wed, Jan 20, 2016 at 2:03 PM, Buntu Dev wrote:
> I got json string of the form:
>
> {"k1":"v1","k2":"v2","k3":"v3"}
>
> How would I go about converting this to a map?
>
> Thanks!
>
I got json string of the form:
{"k1":"v1","k2":"v2","k3":"v3"}
How would I go about converting this to a map?
Thanks!
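If the keys and values are simple (no embedded commas, colons, or escaped quotes), the built-in `str_to_map` can handle it after stripping the JSON punctuation with `regexp_replace`; a fragile sketch (column name `js` is hypothetical):

```sql
-- Strip the braces and quotes, then split entries on ',' and pairs on ':'.
-- Fragile: breaks if any key or value contains , : or "
SELECT str_to_map(
         regexp_replace(regexp_replace(js, '^\\{|\\}$', ''), '"', ''),
         ',', ':')
FROM tbl;
```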
I'm looking to convert an existing Avro dataset into Parquet and wanted to
know if there are any other performance-related properties I can set,
such as compression, block size, etc., to take advantage of Parquet.
I could only find the `parquet.compression` property, but it would be good to know
i
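For reference, besides `parquet.compression` there are a few other Parquet writer properties I've seen mentioned; exact names and defaults may vary by Parquet version, so treat these as a sketch:

```sql
SET parquet.compression=SNAPPY;        -- also GZIP or UNCOMPRESSED
SET parquet.block.size=268435456;      -- row group size in bytes
SET parquet.page.size=1048576;         -- page size in bytes
SET parquet.enable.dictionary=true;    -- dictionary encoding on/off
```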
Thanks Gopal!
In the hive-hll-udf, you seem to mention RRD. Is that something
supported by Hive?
Will go over the Data Sketches as well, thanks for the pointer :)
On Wed, Dec 30, 2015 at 4:29 PM, Gopal Vijayaraghavan wrote:
>
> > I'm trying to explore the HLL UDF option to compute # of u
I'm trying to explore the HLL UDF option to compute the number of unique users
for each time range (week, month, year, etc.) and wanted to know if it's
possible to just maintain an HLL struct for each day and then use those
per-day structs to compute the uniques for various time ranges instead of
running
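The idea I have in mind, sketched with hypothetical UDF names (the actual names depend on which HLL UDF package is used), is to persist one serialized HLL sketch per day and merge them at query time:

```sql
-- Hypothetical UDFs: hll() builds a sketch, union_hll() merges sketches,
-- hll_count() reads out the cardinality estimate.
CREATE TABLE daily_hll AS
SELECT ds, hll(user_id) AS users_hll FROM log GROUP BY ds;

-- Uniques for any time range = merge of that range's daily sketches.
SELECT hll_count(union_hll(users_hll))
FROM daily_hll
WHERE ds BETWEEN '2015-12-01' AND '2015-12-07';
```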
Looks like concat_ws does the job, thanks!
On Mon, Oct 5, 2015 at 1:16 PM, Buntu Dev wrote:
> I've a column of type Array<string> generated by the collect_set function and
> want to concatenate the strings separated by some delimiter. Is there any
> built-in function to handle this?
>
> Thanks!
>
I've a column of type Array<string> generated by the collect_set function and
want to concatenate the strings separated by some delimiter. Is there any
built-in function to handle this?
Thanks!
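For the record, `concat_ws` accepts an array of strings directly, so it pairs naturally with `collect_set` (table and column names here are illustrative):

```sql
SELECT concat_ws(',', collect_set(tag)) AS tags
FROM posts
GROUP BY post_id;
```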
I've got a table backed by an .avro file, and when I attempt to run a simple
query like: "select count(*) from " I run into the exception below. I don't
get any more info about what exactly is wrong with the Avro records. Is
there any other way to debug this issue?
Thanks!
~~
Error: java.io.IOException
I'm running into this error while doing a dynamic partition insert. Here's
how I created the table:
CREATE TABLE `part_table`(
`c1` bigint,
`c2` bigint,
`c3` bigint)
PARTITIONED BY (p1 string, `p2` string)
STORED AS PARQUET;
Here is the insert statement:
SET hive.exec.dynamic.pa
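For context, a complete version of the kind of statement I mean would look roughly like this (the source table and its columns are hypothetical; the partition values must come last in the SELECT, in partition-column order):

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE part_table PARTITION (p1, p2)
SELECT c1, c2, c3, src_p1, src_p2   -- partition values last, in order
FROM source_tbl;
```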
Thanks Sanjiv. I've updated the Hive config setting the
hbase.zookeeper.quorum to point to the appropriate zookeeper.
On Tue, Jun 23, 2015 at 10:53 AM, Buntu Dev wrote:
> Thanks Sanjiv.
>
> On 6/23/15, @Sanjiv Singh wrote:
> > Hi Buntu,
> >
> >
> > Hive
Thanks Sanjiv.
On 6/23/15, @Sanjiv Singh wrote:
> Hi Buntu,
>
>
> Hive config to provide zookeeper quorum for the HBase cluster
>
>
> --hiveconf hbase.zookeeper.quorum=##
>
>
> Regards
> Sanjiv Singh
> Mob : +091 9990-447-339
>
> On Fri,
Did you already checkout the built-in Date Functions supported by Hive:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
Also, you might want to search for any existing UDFs if the Date Functions do
not satisfy your needs, or write one yourself using
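A few of the built-ins from that page, for illustration:

```sql
SELECT to_date('2015-06-05 10:56:00');        -- '2015-06-05'
SELECT date_add('2015-06-05', 7);             -- '2015-06-12'
SELECT datediff('2015-06-12', '2015-06-05');  -- 7
SELECT from_unixtime(unix_timestamp());       -- current time as a string
```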
My use case is to query time series data ingested into an HBase table
containing a web page name or URL as the row key and related properties as
column qualifiers. The properties for the web page are dynamic, i.e., the
column qualifiers are dynamic for a given timestamp.
I would like to create a Hive manag
a-hive-pt2/
>
> On Fri, Jun 5, 2015 at 10:56 AM, Sean Busbey wrote:
>
>> +user@hive
>> -user@hbase to bcc
>>
>> Hi!
>>
>> This question is better handled by the hive user list, so I've copied
>> them in and moved the hbase user list to bcc
me.]function_name AS class_name
>
> [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
>
>
>
>
>
> *From:* Buntu Dev [mailto:buntu...@gmail.com]
> *Sent:* April 24, 2015 12:20
> *To:* user@hive.apache.org
> *Subject:* Re: Create function usin
///home/me/my.jar,file:///home/you/your.jar,file:///home/us/our.jar
>
>
>
>
>
> On Thu, Apr 23, 2015 at 11:13 PM, Buntu Dev wrote:
>
>> Thanks but is there a way to make it available to other users and avoid
>> 'add jar ' step?
>>
>> On Thu,
function in each
> session.
> If you use the same Jar files and functions frequently, you can add those
> statements to your $HOME/.hiverc file.
>
>
> > On Apr 23, 2015, at 10:37 PM, Buntu Dev wrote:
> >
> > Does the JAR need to be added for every session before usin
Does the JAR need to be added for every session before using the custom UDF
created using "CREATE FUNCTION"?
I'm using Hive 0.13 and was able to add a custom UDF successfully and use
it in a sample query. But in the subsequent Hive sessions, I do see the
function but I get this error when using th
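If I understand the Hive 0.13 docs correctly, registering the function with a USING JAR clause that points at a shared location (e.g. HDFS) should make it resolvable in new sessions without an explicit `add jar`; a sketch with a hypothetical path and class name:

```sql
-- Permanent function (Hive 0.13+); the jar path and class are hypothetical.
CREATE FUNCTION mydb.my_udf AS 'com.example.hive.MyUDF'
USING JAR 'hdfs:///libs/my-udf.jar';
```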
Still trying to figure out if there is any way to query directly or if
there is any existing UDF to help query union type fields in HiveQL.
Thanks!
On Tue, Feb 24, 2015 at 10:43 AM, Buntu Dev wrote:
> Hi,
>
> This might've been asked previously but I couldn't find any e
I've got a 'log' table which is currently partitioned by year, month, and day.
I'm looking to create a partitioned view on top of the 'log' table but am
running into this error:
hive> CREATE VIEW log_view PARTITIONED ON (pagename,year,month,day) AS
SELECT pagename year,month,day,uid,properties FROM log
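From the Hive docs on partitioned views, the PARTITIONED ON columns must be the last columns in the view's SELECT (and the statement above also seems to be missing a comma between `pagename` and `year`); a corrected sketch:

```sql
CREATE VIEW log_view PARTITIONED ON (pagename, year, month, day) AS
SELECT uid, properties, pagename, year, month, day   -- partition cols last
FROM log;
```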
ric/GenericUDFAddMonths.java
>
>
> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInitCap.java
>
>
> On Tue, Mar 3, 2015 at 2:43 PM, Buntu Dev wrote:
>
>> I couldn't find any official documentation on how to creat
/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInitCap.java
>
>
> On Tue, Mar 3, 2015 at 2:43 PM, Buntu Dev wrote:
>
>> I couldn't find any official documentation on how to create a UDF and mvn
>> dependencies for building the project except for this page:
Thanks Pradeep.
On Tue, Mar 3, 2015 at 2:53 PM, Pradeep Gollakota
wrote:
> This is what I use:
>
>
> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-exec</artifactId>
>   <version>0.12.0</version>
>   <scope>provided</scope>
> </dependency>
>
>
> I don't believe anything else is needed.
>
> On Tue, Mar 3, 2015 at 2:43 PM, B
I couldn't find any official documentation on how to create a UDF and mvn
dependencies for building the project except for this page:
https://cwiki.apache.org/confluence/display/Hive/HivePlugins
Can anyone help me with what's needed to construct the pom?
Thanks!
Hi,
This might've been asked previously but I couldn't find any examples of how
to query uniontype in Hive.
I have this field in the table:
`location`
uniontype,boolean>
How do I go about querying: "select location.latitude, location.latitude
from ..." since I get this error:
. Operator is on
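One thing I've since found: newer Hive versions (2.2+, if I remember right) ship an `extract_union` UDF that exposes each union alternative as a nullable struct field, which would allow something like the hypothetical query below; on older versions a custom UDF seems to be the only route:

```sql
-- Hypothetical: extract_union turns the uniontype into a struct whose
-- fields tag_0, tag_1, ... correspond to the union's alternatives.
SELECT extract_union(location).tag_0.latitude
FROM places;
```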
Are there any usage examples for the Matchpath UDF? I've got time series data
and want to generate a funnel report; is Matchpath suitable for such use
cases?
Thanks!
I've got a bunch of small Avro files (<5MB) and a table over those files. I
created a new table and did an 'INSERT OVERWRITE' selecting from the existing
table, but did not find any option to provide the file block size. It
currently creates a single file per partition.
How do I specify the ou
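Settings I've seen suggested for controlling output file sizes on insert (names are from the Hive wiki; defaults vary by version):

```sql
SET hive.merge.mapfiles=true;               -- merge outputs of map-only jobs
SET hive.merge.mapredfiles=true;            -- merge outputs of map-reduce jobs
SET hive.merge.size.per.task=256000000;     -- target size of merged files
SET hive.merge.smallfiles.avgsize=16000000; -- trigger merge below this average
```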
Hi -- I've got a destination table partitioned on columns that are not in
the source table, and I get this error when attempting an INSERT
OVERWRITE. How do I go about fixing this? Thanks:
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
INSERT OVERWRIT
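For completeness, when the partition columns don't exist in the source table, their values have to be supplied in the SELECT, either derived from other columns or as constants; a hypothetical sketch:

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE dest PARTITION (year, month)
SELECT c1, c2,
       year(event_date)  AS year,   -- derived partition values,
       month(event_date) AS month   -- listed last in partition order
FROM source_tbl;
```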
Hi -- I've got a dataset in Avro format with one of the columns of 'map' data
type, defined as:
{"name": "data", "type": {"type": "map", "values": "string"}}
If the data column has say:
{"param1":"value1","param2":"value2"}
How do I go about writing a hive query to only extract 'param1' column?
Th
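For the record, map columns in Hive support bracket indexing, so extracting a single key looks like this (the table name is illustrative):

```sql
SELECT data['param1'] FROM avro_tbl;
```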