Re: Where are jars stored for permanent functions

2016-06-09 Thread dhruv kapatel
To add a JAR permanently, the recommended ways are as follows:

   1. Add it in hive-site.xml:

      hive.aux.jars.path
      file://localpath/yourjar.jar

   2. Copy the JAR file to the ${HIVE_HOME}/auxlib/ folder (create
      it if it does not exist).

Source: Apache Hive Essentials book
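
For reference, the hive-site.xml setting above is normally written as a full
property element (the jar path is the same placeholder used above):

```xml
<property>
  <name>hive.aux.jars.path</name>
  <!-- comma-separated list of jar URIs; placeholder path -->
  <value>file://localpath/yourjar.jar</value>
</property>
```

Hive reads this property at startup, so the jars should be available to every
new session without an explicit ADD JAR.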


On 10 June 2016 at 02:20, Jason Dere  wrote:

> Hive doesn't currently handle storing of the JARs. Doing ADD JAR only
> adds the jars to the current session, but won't help for other sessions.
>
> The permanent functions syntax allows you to specify JAR paths when you
> create the function. These should be on HDFS or other non-local path.
>
>
> create function ST_GeomFromWKT as 'com.esri.hadoop.hive.ST_GeomFromWKT'
> using jar 'hdfs:/path/to/spatial-sdk-hive-1.1.jar',
> jar 'hdfs:/path/to/esri-geometry-api-1.2.1.jar';
>
>
> --
> *From:* Marcin Tustin 
> *Sent:* Wednesday, June 08, 2016 1:10 PM
> *To:* user@hive.apache.org
> *Subject:* Where are jars stored for permanent functions
>
> Hi All,
>
> I just added local jars to my hive session, created permanent functions,
> and found that they are available across sessions and machines. This is of
> course excellent, but I'm wondering where those jars are being stored. In
> what setting or default directory would I find them?
>
> My session was:
>
> add jars /mnt/storage/spatial-sdk-hive-1.1.jar
> /mnt/storage/esri-geometry-api-1.2.1.jar;
>
> create function ST_GeomFromWKT as 'com.esri.hadoop.hive.ST_GeomFromWKT';
>
>
> Then that function was available via the thriftserver.
>
>
> Thanks,
>
> Marcin
>


-- 

With Regards,
Kapatel Dhruv V
mobile: +91 9909214243


Re: hive with ua paser (external) library gives IOException

2016-04-15 Thread dhruv kapatel
There is no proper stack trace showing where the exception actually occurs.
The exception I got on the Hive terminal is posted in the Stack Overflow
question.

On 16 April 2016 at 01:35, Jason Dere <jd...@hortonworks.com> wrote:

> And no stack trace with that error?
>
> ------
> *From:* dhruv kapatel <kapateldh...@gmail.com>
> *Sent:* Friday, April 15, 2016 11:49 AM
> *To:* user@hive.apache.org
> *Subject:* Re: hive with ua paser (external) library gives IOException
>
> I checked both the Hive and Hadoop logs.
> There are no log entries related to this, which is why it is difficult to
> debug.
>
> On 15 April 2016 at 23:55, Denise Rogers <datag...@aol.com> wrote:
>
>> Good call. Is there a way to direct where logs should be written to and
>> stored ?
>>
>> Regards,
>> Denise
>> Cell - (860)989-3431
>>
>> Sent from my iPhone
>>
>> On Apr 15, 2016, at 1:44 PM, Jason Dere <jd...@hortonworks.com> wrote:
>>
>> Maybe check the Hive logs or the Hadoop logs of the tasks that were
>> started by the query. Looks like that error happens when an exception is
>> hit while trying to invoke the UDF, it might be logged somewhere.
>>
>>
>>
>> --
>> *From:* dhruv kapatel <kapateldh...@gmail.com>
>> *Sent:* Friday, April 15, 2016 3:36 AM
>> *To:* user@hive.apache.org
>> *Subject:* hive with ua paser (external) library gives IOException
>>
>> All the required external jars are included in the UDF jar.
>> You can find full details in the Stack Overflow question:
>>
>> http://stackoverflow.com/questions/36643795/hive-ua-parser-udf-gives-ioexception
>>
>> Any hint regarding this will be appreciated.
>>
>>
>
>




Re: hive with ua paser (external) library gives IOException

2016-04-15 Thread dhruv kapatel
I checked both the Hive and Hadoop logs.
There are no log entries related to this, which is why it is difficult to
debug.

On 15 April 2016 at 23:55, Denise Rogers <datag...@aol.com> wrote:

> Good call. Is there a way to direct where logs should be written to and
> stored ?
>
> Regards,
> Denise
> Cell - (860)989-3431
>
> Sent from my iPhone
>
> On Apr 15, 2016, at 1:44 PM, Jason Dere <jd...@hortonworks.com> wrote:
>
> Maybe check the Hive logs or the Hadoop logs of the tasks that were
> started by the query. Looks like that error happens when an exception is
> hit while trying to invoke the UDF, it might be logged somewhere.
>
>
>
> --
> *From:* dhruv kapatel <kapateldh...@gmail.com>
> *Sent:* Friday, April 15, 2016 3:36 AM
> *To:* user@hive.apache.org
> *Subject:* hive with ua paser (external) library gives IOException
>
> All the required external jars are included in the UDF jar.
> You can find full details in the Stack Overflow question:
>
> http://stackoverflow.com/questions/36643795/hive-ua-parser-udf-gives-ioexception
>
> Any hint regarding this will be appreciated.
>
>




hive with ua paser (external) library gives IOException

2016-04-15 Thread dhruv kapatel
All the required external jars are included in the UDF jar.
You can find full details in the Stack Overflow question:
http://stackoverflow.com/questions/36643795/hive-ua-parser-udf-gives-ioexception

Any hint regarding this will be appreciated.



Re: How to append one column to an existing array column in Hive?

2016-03-20 Thread dhruv kapatel
You can try this:

select concat("A", concat_ws("", array('B','C')));

Set the separator according to your requirement. For a "," separator:

select concat_ws(",", "A", concat_ws(",", array('B','C')));
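
If an actual array (rather than a concatenated string) is needed, as in the
original question, one possible sketch is to flatten, append, and re-split.
This uses the hypothetical column and table names from the question and
assumes the array elements contain no commas:

```sql
select
    string_column_A,
    array_column_B,
    -- join the existing elements, append the new value, then
    -- split the result back into an array
    split(concat_ws(',', concat_ws(',', array_column_B), string_column_A),
          ',') as AB
from onetable;
```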



On 20 March 2016 at 01:17, Rex X  wrote:

> For example, to append columnA to an existing array-type column B
>
> select
>     string_column_A,
>     array_column_B,
>     append(array_column_B, string_column_A) as AB
> from onetable;
>
> I did not find any append function as above in Hive.
>
> To be more accurate, I should say "set" instead of "array" above, since I
> expect no duplicates. But the duplication here is not a big deal.
>
> What's the best way to make this in Hive? I have checked the hive
> documentation, but cannot find any relevant information to do this.
>
>
>




what are the exact performance criteria when you are comparing pig and hive?

2016-03-10 Thread dhruv kapatel
Hi,

I've gone through the https://issues.apache.org/jira/browse/HIVE-396
benchmarks. My question is: can we generalize that Hive will be faster in
all use cases, e.g.:

   - weblogs
   - xml
   - json

Does Hive performance depend on the use case or on the type of data being
processed?





hive cast string to date in 'dd/MMM/yyyy' format order by and group by issue

2016-03-08 Thread dhruv kapatel
Hi,

I have dates stored as [27/Feb/2016:00:24:31 +0530] in string format. I want
to cast them to dd/MMM/yyyy format and also perform a group by on that value.
For more details and the queries I tried, see the question on Stack Overflow:
http://stackoverflow.com/questions/35668624/hive-cast-string-to-date-in-dd-mmm--format-order-by-and-group-by-issue
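
Not from the thread, but as a rough sketch of one common approach (the table
name logs and column name access_time are hypothetical, and it assumes the
surrounding brackets are stored as part of the string):

```sql
select
    -- skip the leading '[' and parse the dd/MMM/yyyy:HH:mm:ss portion
    to_date(from_unixtime(unix_timestamp(substr(access_time, 2, 20),
                                         'dd/MMM/yyyy:HH:mm:ss'))) as log_date,
    count(*) as hits
from logs
group by to_date(from_unixtime(unix_timestamp(substr(access_time, 2, 20),
                                              'dd/MMM/yyyy:HH:mm:ss')));
```

Grouping and ordering on the parsed yyyy-MM-dd value avoids the
string-ordering problem of the dd/MMM/yyyy form.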




Re: Which one should i use for benchmark tasks in hive & hadoop

2016-03-06 Thread dhruv kapatel
Thank you very much.




Fwd: Which one should i use for benchmark tasks in hive & hadoop

2016-03-05 Thread dhruv kapatel
Hi,

I am comparing the performance of Pig and Hive on weblog data.
I was reading the Pig and Hive benchmarks below, in which one statement
on page 10 says: "The CPU time
required by a job running on 10 node cluster will (more or less) be the same
than the time required to run the same job on a 1000 node cluster. However
the real time it takes the job to complete on the 1000 node cluster will be
100 times less than if it were to run on a 10 node cluster."

How can a job take the same CPU time on clusters of such different capacity?

In this benchmark they considered both real time and cumulative CPU time.
Since real time is also affected by other processes, which time should I
consider as the actual performance measure of Pig and Hive?

See question below for more details.

http://stackoverflow.com/questions/35500987/which-one-should-i-use-for-benchmark-tasks-in-hadoop-usersys-time-or-total-cpu

http://www.ibm.com/developerworks/library/ba-pigvhive/pighivebenchmarking.pdf
