Re: Hive servers in same cluster use different hive-log4j.properties files.

2017-07-28 Thread Mungeol Heo
Check 
https://community.hortonworks.com/questions/115159/hive-servers-in-same-cluster-use-different-hive-lo.html#answer-115162

On Fri, Jul 28, 2017 at 3:12 PM, Mungeol Heo  wrote:
> Hello,
>
> Here are log entries from two hiveserver2.log files.
>
> --- first which is strange ---
>
> 2017-07-28 14:46:10,051 DEBUG [main]: server.HiveServer2
> (HiveServer2.java:main(586)) - Logging initialized using configuration
> in jar:file:/usr/hdp/2.6.0.3-8/hive/lib/hive-common-1.2.1000.2.6.0.3-8.jar!/hive-log4j.properties
>
> --- second ---
>
> 2017-07-28 14:43:01,534 DEBUG [main]: server.HiveServer2
> (HiveServer2.java:main(586)) - Logging initialized using configuration
> in file:/etc/hive/2.6.0.3-8/0/conf.server/hive-log4j.properties
>
> What is causing this problem?
>
> Any help would be greatly appreciated.
>
> Thank you.


Re: How to set the logging level of the hiveserver2.log

2017-07-28 Thread Mungeol Heo
The answer to the second question is that the two servers use different
hive-log4j.properties files.

Check 
https://community.hortonworks.com/questions/115159/hive-servers-in-same-cluster-use-different-hive-lo.html#answer-115162

On Fri, Jul 28, 2017 at 2:15 PM, Mungeol Heo  wrote:
> The answer to the first question is given below.
>
> In Ambari, go to Hive -> Configs -> Advanced hive-log4j, replace
> hive.root.logger=INFO,DRFA with hive.root.logger=DEBUG,DRFA, and
> restart HiveServer2.
>
> Reference: 
> https://community.hortonworks.com/questions/115152/how-to-set-the-logging-level-of-the-hiveserver2log.html
>
> On Fri, Jul 28, 2017 at 12:32 PM, Mungeol Heo  wrote:
>> Hello.
>>
>> As mentioned in the title, I would like to know how to set the
>> logging level of the hiveserver2.log file in HDP.
>>
>> I am also curious why one of my hiveserver2.log files is logged at
>> DEBUG level while another is not.
>>
>> --- First hiveserver2.log which has debug log ---
>>
>> 2017-05-10 13:47:34,201 DEBUG [main]: common.LogUtils
>> (LogUtils.java:logConfigLocation(147)) - Using hive-site.xml found on
>> CLASSPATH at /etc/hive/2.5.0.0-1245/0/conf.server/hive-site.xml
>>
>> 2017-05-10 13:47:34,206 DEBUG [main]: server.HiveServer2
>> (HiveServer2.java:main(558)) - Logging initialized using configuration
>> in 
>> jar:file:/usr/hdp/2.5.0.0-1245/hive/lib/hive-common-1.2.1000.2.5.0.0-1245.jar!/hive-log4j.properties
>>
>> 2017-05-10 13:47:34,227 INFO [main]: server.HiveServer2
>> (HiveStringUtils.java:startupShutdownMessage(693)) - STARTUP_MSG:
>>
>> --- Second hiveserver2.log which has no debug log ---
>>
>> 2017-03-13 20:02:37,674 INFO [main]: server.HiveServer2
>> (HiveStringUtils.java:startupShutdownMessage(693)) - STARTUP_MSG:
>>
>> Any help would be greatly appreciated.


Hive servers in same cluster use different hive-log4j.properties files.

2017-07-27 Thread Mungeol Heo
Hello,

Here are log entries from two hiveserver2.log files.

--- first which is strange ---

2017-07-28 14:46:10,051 DEBUG [main]: server.HiveServer2
(HiveServer2.java:main(586)) - Logging initialized using configuration
in jar:file:/usr/hdp/2.6.0.3-8/hive/lib/hive-common-1.2.1000.2.6.0.3-8.jar!/hive-log4j.properties

--- second ---

2017-07-28 14:43:01,534 DEBUG [main]: server.HiveServer2
(HiveServer2.java:main(586)) - Logging initialized using configuration
in file:/etc/hive/2.6.0.3-8/0/conf.server/hive-log4j.properties

What is causing this problem?

Any help would be greatly appreciated.

Thank you.
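
The two startup messages above differ only in where hive-log4j.properties was loaded from: inside the hive-common jar in the first case, a conf.server directory in the second. A minimal Python sketch for checking this on each HiveServer2 host follows. It assumes the log text has already been read into a string, and the function name log4j_config_source is made up for illustration.

```python
import re

# Matches the HiveServer2 startup message quoted in this thread. The message
# may be wrapped across lines, so any whitespace is accepted between words.
CONFIG_LINE = re.compile(r"Logging initialized using configuration\s+in\s+(\S+)")


def log4j_config_source(log_text):
    """Return (kind, location) for the first startup message found, else None.

    kind is "jar" when the properties file was loaded from inside a jar and
    "file" otherwise (for example from a conf.server directory).
    """
    match = CONFIG_LINE.search(log_text)
    if match is None:
        return None
    location = match.group(1)
    kind = "jar" if location.startswith("jar:") else "file"
    return kind, location
```

Running this against the hiveserver2.log of each server should show at a glance which server fell back to the copy inside the jar.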


Re: How to set the logging level of the hiveserver2.log

2017-07-27 Thread Mungeol Heo
The answer to the first question is given below.

In Ambari, go to Hive -> Configs -> Advanced hive-log4j, replace
hive.root.logger=INFO,DRFA with hive.root.logger=DEBUG,DRFA, and
restart HiveServer2.

Reference: 
https://community.hortonworks.com/questions/115152/how-to-set-the-logging-level-of-the-hiveserver2log.html
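
For illustration only, the property change described above can be sketched in Python as a text substitution on the contents of a hive-log4j.properties file. The helper name set_root_logger is hypothetical. On an Ambari-managed cluster the change should still be made through Ambari as described, since, as far as I know, Ambari rewrites the file from its own stored configuration.

```python
import re

# The property named in the Ambari instructions above.
ROOT_LOGGER = re.compile(r"^hive\.root\.logger\s*=.*$", re.MULTILINE)


def set_root_logger(properties_text, level, appender="DRFA"):
    """Return properties_text with hive.root.logger set to '<level>,<appender>'."""
    new_line = "hive.root.logger={},{}".format(level, appender)
    if ROOT_LOGGER.search(properties_text):
        # Replace only the first definition and leave every other line as-is.
        return ROOT_LOGGER.sub(lambda _match: new_line, properties_text, count=1)
    # Property not present: append it instead of silently doing nothing.
    separator = "" if properties_text == "" or properties_text.endswith("\n") else "\n"
    return properties_text + separator + new_line + "\n"
```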

On Fri, Jul 28, 2017 at 12:32 PM, Mungeol Heo  wrote:
> Hello.
>
> As mentioned in the title, I would like to know how to set the
> logging level of the hiveserver2.log file in HDP.
>
> I am also curious why one of my hiveserver2.log files is logged at
> DEBUG level while another is not.
>
> --- First hiveserver2.log which has debug log ---
>
> 2017-05-10 13:47:34,201 DEBUG [main]: common.LogUtils
> (LogUtils.java:logConfigLocation(147)) - Using hive-site.xml found on
> CLASSPATH at /etc/hive/2.5.0.0-1245/0/conf.server/hive-site.xml
>
> 2017-05-10 13:47:34,206 DEBUG [main]: server.HiveServer2
> (HiveServer2.java:main(558)) - Logging initialized using configuration
> in 
> jar:file:/usr/hdp/2.5.0.0-1245/hive/lib/hive-common-1.2.1000.2.5.0.0-1245.jar!/hive-log4j.properties
>
> 2017-05-10 13:47:34,227 INFO [main]: server.HiveServer2
> (HiveStringUtils.java:startupShutdownMessage(693)) - STARTUP_MSG:
>
> --- Second hiveserver2.log which has no debug log ---
>
> 2017-03-13 20:02:37,674 INFO [main]: server.HiveServer2
> (HiveStringUtils.java:startupShutdownMessage(693)) - STARTUP_MSG:
>
> Any help would be greatly appreciated.


How to set the logging level of the hiveserver2.log

2017-07-27 Thread Mungeol Heo
Hello.

As mentioned in the title, I would like to know how to set the
logging level of the hiveserver2.log file in HDP.

I am also curious why one of my hiveserver2.log files is logged at
DEBUG level while another is not.

--- First hiveserver2.log which has debug log ---

2017-05-10 13:47:34,201 DEBUG [main]: common.LogUtils
(LogUtils.java:logConfigLocation(147)) - Using hive-site.xml found on
CLASSPATH at /etc/hive/2.5.0.0-1245/0/conf.server/hive-site.xml

2017-05-10 13:47:34,206 DEBUG [main]: server.HiveServer2
(HiveServer2.java:main(558)) - Logging initialized using configuration
in 
jar:file:/usr/hdp/2.5.0.0-1245/hive/lib/hive-common-1.2.1000.2.5.0.0-1245.jar!/hive-log4j.properties

2017-05-10 13:47:34,227 INFO [main]: server.HiveServer2
(HiveStringUtils.java:startupShutdownMessage(693)) - STARTUP_MSG:

--- Second hiveserver2.log which has no debug log ---

2017-03-13 20:02:37,674 INFO [main]: server.HiveServer2
(HiveStringUtils.java:startupShutdownMessage(693)) - STARTUP_MSG:

Any help would be greatly appreciated.


The remaining connections of IPC client keeps increasing.

2017-07-27 Thread Mungeol Heo
Hello.

I found the log entry below in the hiveserver2.log file.

2017-07-10 23:35:02,389 DEBUG [IPC Client (1639759054) connection to
host.name/10.10.10.18:8020 from hdfs]: ipc.Client
(Client.java:run(1025)) - IPC Client (1639759054) connection to
host.name/10.10.10.18:8020 from hdfs: stopped, remaining connections
25

The problem is that the number of remaining connections keeps increasing
or stays high.

What are these connections?
How can I check them?
What is causing this problem?
How can I solve it?

Any help would be greatly appreciated.

Thank you.
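
One way to watch the figure reported in the log entry above is to extract it over time. This is a rough Python sketch that only reads the counter out of the log. It does not explain what the connections are, and it assumes each entry sits on a single line in the actual file (the entry quoted above is wrapped by the mail client). The function name remaining_connections is made up for illustration.

```python
import re

# One ipc.Client "stopped" entry per line: timestamp at the start,
# connection count at the end.
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*"
    r"stopped, remaining connections (?P<count>\d+)\s*$"
)


def remaining_connections(lines):
    """Return a list of (timestamp, count) pairs, in the order they appear."""
    samples = []
    for line in lines:
        match = ENTRY.match(line)
        if match:
            samples.append((match.group("ts"), int(match.group("count"))))
    return samples
```

Passing an open hiveserver2.log file object as lines gives a series that can be printed or plotted to see whether the count really grows or just fluctuates.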


Re: how to customize tez query app name

2017-04-20 Thread Mungeol Heo
Try --hiveconf hive.session.id=session_id_name.
Then the job name will be HIVE-session_id_name.
AFAIK, this is the best option for what you want.
If there is a better way, please share it here.
Hope it helps.
Thank you.

On Sat, Jan 21, 2017 at 6:56 AM, Gopal Vijayaraghavan  wrote:
>
>> So no one has a solution?
> …
>> “mapreduce.job.name” works for M/R queries, not Tez.
>
> Depends on the Hive version you're talking about.
>
> https://issues.apache.org/jira/browse/HIVE-12357
>
> That doesn't help you with YARN, but only with the TezUI (since each YARN AM 
> runs > 1 queries).
>
> For something like an ETL workload, I suspect we can name CLI sessions, but 
> not queries independently.
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java#L314
>
> Cheers,
> Gopal
>
>
>


Need suggestion for using rlike

2015-05-18 Thread Mungeol Heo
Hi,

I am using hundreds of "rlike" conditions in a HiveQL query to match
specific URLs in an Apache access log.
The query is really slow when the target data is large.
I have tried several ways to improve the performance of this kind of query.
Unfortunately, nothing has worked as I expected.
Any help would be great.
Thank you.

- mungeol
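
One approach sometimes suggested for this situation, not confirmed anywhere in this thread, is to merge the individual patterns into a single alternation so that each row is tested against one regular expression instead of hundreds. The Python sketch below only illustrates the idea: the three URL patterns are invented, and whether this helps in Hive would have to be measured on the real data.

```python
import re


def combine_patterns(patterns):
    """Join individual regexes into one alternation of non-capturing groups."""
    return "|".join("(?:{})".format(pattern) for pattern in patterns)


# Hypothetical URL patterns standing in for the hundreds used in the real query.
URL_PATTERNS = [r"^/products/\d+$", r"^/cart(/.*)?$", r"^/search\?q="]

# Compiled once and reused for every URL.
COMBINED = re.compile(combine_patterns(URL_PATTERNS))


def matches_any(url):
    """True if the URL matches at least one of the original patterns."""
    return COMBINED.search(url) is not None
```

Each pattern is wrapped in a non-capturing group so that anchors and alternations inside one pattern cannot leak into its neighbours.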


Re: Partition Columns

2015-05-14 Thread Mungeol Heo
Hi, Appan.

You can simply check the amount of data your query reads from the
table, or the number of mappers launched for that query.
From that, you can tell whether it is filtering by partition or
scanning the whole table.
Of course, it is a lazy approach, but you can give it a try.
I think Query 1 should work fine, because I run a lot of queries of
that kind and they work fine for me.

Thanks,
mungeol

On Fri, May 15, 2015 at 8:31 AM, Appan Thirumaligai
 wrote:
> I agree with you Viral. I see the same behavior as well. We are on Hive 0.13
> for the cluster where I'm testing this.
>
> On Thu, May 14, 2015 at 2:16 PM, Viral Bajaria 
> wrote:
>>
>> Hi Appan,
>>
>> In my experience I have seen that Query 2 does not use partition pruning
>> because it's not a straight up filtering and involves using functions (aka
>> UDFs).
>>
>> What version of Hive are you using ?
>>
>> Thanks,
>> Viral
>>
>>
>>
>> On Thu, May 14, 2015 at 1:48 PM, Appan Thirumaligai
>>  wrote:
>>>
>>> Hi,
>>>
>>> I have a question on Hive Optimizer. I have a table with partition
>>> columns, e.g., Sales partitioned by year, month, day. Assume that I have two
>>> years worth of data on this table. I'm running two queries on this table.
>>>
>>> Query 1: Select * from Sales where year=2015 and month = 5 and day
>>> between 1 and 7
>>>
>>> Query 2: Select * from Sales where concat_ws('-',cast(year as
>>> string),lpad(cast(month as string),2,'0'),lpad(cast(day as string),2,'0'))
>>> between '2015-01-01' and '2015-01-07'
>>>
>>> When I ran Explain command on the above two queries I get a Filter
>>> operation for the 2nd Query and there is no Filter Operation for the first
>>> query.
>>>
>>> My question is: Do both queries use the partitions or is it used only in
>>> Query 1 and for Query 2 it will be a scan of all the data?
>>>
>>> Thanks for your help.
>>>
>>> Thanks,
>>> Appan
>>
>>
>


[HQL] How to compare same column between rows?

2014-11-17 Thread Mungeol Heo
Hi,

My question is whether Hive can compare the same column across rows.
For instance, I have a table which contains data like below.

name, value
a, 1
a, 2
b, 1
c, 1
a, 13
b, 11

What I need is to compare the 'value' column between rows that have the same 'name'.

For the name 'a'
first 'a' comes, then count = 1
next 'a' comes, then, compare 'value', if '2 - 1 >= 10' then 'count + 1'
next 'a' comes, then, compare 'value', if '13 - 2 >= 10' then 'count + 1'

For the name 'b'
first 'b' comes, then count = 1
next 'b' comes, then, compare 'value', if '11 - 1 >= 10' then 'count + 1'

For the name 'c'
first 'c' comes, then count = 1

The result I expect is shown below.

name, count
a, 2
b, 2
c, 1

Is it possible without a UDF?
If I have to use a UDF, does one already exist?

I am going to write a UDF for this, but I would still like to confirm
the points above.
Any help would be great.
Thanks.

Best regards,

- Mungeol
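
To pin down the counting rule described above, here is a small Python sketch of the same logic. It assumes the rows are processed in the order listed, since the sample table has no explicit ordering column, and the name count_jumps is made up. As far as I know, the previous-row comparison could be expressed in HiveQL with the LAG() window function partitioned by name, but that would need a column to order by.

```python
def count_jumps(rows, threshold=10):
    """Count per name: 1 for the first row, plus 1 for each later row whose
    value exceeds the previous value for that name by at least threshold."""
    counts = {}
    previous = {}
    for name, value in rows:
        if name not in counts:
            counts[name] = 1  # first occurrence of this name
        elif value - previous[name] >= threshold:
            counts[name] += 1
        previous[name] = value  # always compare against the latest row
    return counts


# The sample table from the message above, in the order given.
ROWS = [("a", 1), ("a", 2), ("b", 1), ("c", 1), ("a", 13), ("b", 11)]
```

With the sample rows, count_jumps(ROWS) gives a = 2, b = 2, c = 1, which matches the expected result listed in the message.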