[jira] Commented: (HIVE-1508) Add cleanup method to HiveHistory class
[ https://issues.apache.org/jira/browse/HIVE-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977185#action_12977185 ] Mohit Sikri commented on HIVE-1508: --- Great finding, but how do we curtail (or put a limit on) the number of these files generated? Since I don't see any mechanism for deleting such files (say, deleting files older than an hour or so), they may eventually pile up to the point of consuming significant disk space. I don't see any significance in these files once the session expires, or in keeping very old session information in them. Kindly suggest. Add cleanup method to HiveHistory class --- Key: HIVE-1508 URL: https://issues.apache.org/jira/browse/HIVE-1508 Project: Hive Issue Type: Bug Components: Metastore, Server Infrastructure Reporter: Anurag Phadke Assignee: Edward Capriolo Priority: Blocker Fix For: 0.7.0 Attachments: hive-1508-1-patch.txt Running the Hive server for a long time (90 minutes) results in too many open file handles, eventually causing the server to crash as it runs out of file handles. Actual bug as described by Carl Steinbach: the hive_job_log_* files are created by the HiveHistory class. This class creates a PrintWriter for writing to the file, but never closes the writer. It looks like we need to add a cleanup method to HiveHistory that closes the PrintWriter and does any other necessary cleanup. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
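The shape of the fix Carl describes can be sketched roughly as follows. This is a standalone illustration, not Hive's actual HiveHistory code; the class and method names here are hypothetical:

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;

// Minimal sketch of a session-scoped history log that, unlike the
// original HiveHistory, releases its file handle when the session ends.
public class SessionHistory {
    private final PrintWriter writer;
    private final Path logFile;

    public SessionHistory(Path logFile) throws IOException {
        this.logFile = logFile;
        // One PrintWriter (and therefore one OS file handle) per session.
        this.writer = new PrintWriter(Files.newBufferedWriter(logFile));
    }

    public void log(String event) {
        writer.println(event);
    }

    // The cleanup method HIVE-1508 asks for: flush and close the writer
    // so a long-running server does not leak one handle per session.
    public void cleanup() {
        writer.flush();
        writer.close();
    }

    public Path getLogFile() {
        return logFile;
    }
}
```

Calling cleanup() at session teardown is what bounds the number of open handles, independent of how many sessions the server has served.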
Regarding Hive History File(s).
Hello All, What is the purpose of maintaining the Hive history files, which contain session information like session start, query start, query end, task start, task end, etc.? Are they used later (say, by a tool) for some purpose? I don't see these files being deleted from the system; is any configuration needed to enable deletion, or is there a design strategy/decision/rationale for not deleting them at all? Also, in these files I don't see a session-end message being logged; is it reserved for future use? -Mohit
Re: Regarding Hive History File(s).
On Tue, Jan 4, 2011 at 7:03 AM, Mohit mohitsi...@huawei.com wrote: [...] HiveHistory was added a while ago, between 3.0 and 4.0 (iirc). A tool to view the files is HiveHistoryViewer in the API. I am not exactly sure who is doing what with that data. The Web Interface does use it to provide links to the JobTracker, so it is helpful for tracing all the dependent jobs of a query after the fact. There is a ticket open to customize the file location. I was also thinking we should allow the user to supply a 'none' to turn the feature off. As for cleanup and management, cron and rm seem like a good fit.
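Edward's cron-and-rm suggestion amounts to a periodic retention sweep over the history directory. The same policy can be sketched in Java; the glob pattern matches the hive_job_log_* names mentioned in HIVE-1508, but the directory and age threshold are deployment choices, not Hive defaults:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;

// Sketch of a retention sweep: delete hive_job_log_* files whose last
// modification is older than a cutoff. Equivalent in spirit to a cron
// job running find/rm over the history directory.
public class HistoryLogSweeper {
    public static int deleteOlderThan(Path dir, Instant cutoff) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(dir, "hive_job_log_*")) {
            for (Path f : files) {
                if (Files.getLastModifiedTime(f).toInstant().isBefore(cutoff)) {
                    Files.delete(f);
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

A scheduler (cron, or a timer thread inside the server) would call deleteOlderThan with, say, a cutoff of now minus a few hours.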
[jira] Updated: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-1818: --- Fix Version/s: 0.7.0 Status: Patch Available (was: Open) Call frequency and duration metrics for HiveMetaStore via jmx - Key: HIVE-1818 URL: https://issues.apache.org/jira/browse/HIVE-1818 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1818-vs-1054860.patch, HIVE-1818.patch As recently brought up on the hive-dev mailing list, it would be useful if the HiveMetaStore had some sort of instrumentation capability to measure the frequency of calls to the various HiveMetaStore methods and the time spent in those calls. There are already incrementCounter() and logStartFunction()/logStartTableFunction(), etc. calls in HiveMetaStore, and they could be refactored/repurposed to make calls that expose JMX MBeans as well. Alternatively, a Metrics subsystem could be introduced that the incrementCounter()/etc. calls are refactored to use. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether it is enabled and, if so, on what port. And once we have the capability to instrument and expose MBeans, other subsystems could adopt and use this system as well.
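The refactor the ticket suggests, routing the existing bookkeeping calls through one metrics object that tracks both call frequency and cumulative duration, can be sketched as below. This is a hypothetical illustration, not the patch's actual classes; in the real patch the numbers would additionally be published as JMX MBean attributes, which is omitted here:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: per-call counters and cumulative durations, the raw data a
// JMX MBean for the metastore would expose.
public class CallMetrics {
    private final ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, AtomicLong> nanos = new ConcurrentHashMap<>();

    // Called where logStartFunction()/incrementCounter() are today.
    public long startFunction(String call) {
        counts.computeIfAbsent(call, k -> new AtomicLong()).incrementAndGet();
        return System.nanoTime();
    }

    // Called when the metastore method returns.
    public void endFunction(String call, long startedAtNanos) {
        nanos.computeIfAbsent(call, k -> new AtomicLong())
             .addAndGet(System.nanoTime() - startedAtNanos);
    }

    public long callCount(String call) {
        AtomicLong c = counts.get(call);
        return c == null ? 0 : c.get();
    }

    public long totalNanos(String call) {
        AtomicLong n = nanos.get(call);
        return n == null ? 0 : n.get();
    }
}
```

callCount and totalNanos map naturally onto MBean attribute getters, which is what makes the numbers visible in jconsole or any JMX client.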
Build failed in Hudson: Hive-trunk-h0.20 #467
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/467/
-- [...truncated 14626 lines...]
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.seq
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/complex.seq
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/json.txt
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/ql/test/logs/negative/unknown_table1.q.out https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/ql/src/test/results/compiler/errors/unknown_table1.q.out
[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket0.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket20.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.seq
[junit] Loading data to table
[jira] Updated: (HIVE-1101) Extend Hive ODBC to support more functions
[ https://issues.apache.org/jira/browse/HIVE-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1101: - Component/s: (was: Drivers) ODBC Extend Hive ODBC to support more functions -- Key: HIVE-1101 URL: https://issues.apache.org/jira/browse/HIVE-1101 Project: Hive Issue Type: New Feature Components: ODBC Reporter: Ning Zhang Assignee: Ning Zhang Attachments: HIVE-1101.patch, unixODBC-2.2.14-p2-HIVE-1101.tar.gz Currently the Hive ODBC driver only supports a minimal list of functions, enough to make some applications work. Other applications require support for more functions, including:
* SQLCancel
* SQLFetchScroll
* SQLGetData
* SQLGetInfo
* SQLMoreResults
* SQLRowCount
* SQLSetConnectAttr
* SQLSetStmtAttr
* SQLEndTran
* SQLPrepare
* SQLNumParams
* SQLDescribeParam
* SQLBindParameter
* SQLGetConnectAttr
* SQLSetEnvAttr
* SQLPrimaryKeys (not an ODBC API? Hive does not support primary keys yet)
* SQLForeignKeys (not an ODBC API? Hive does not support foreign keys yet)
We should support as many of them as possible.
[jira] Resolved: (HIVE-1859) Hive's tinyint datatype is not supported by the Hive JDBC driver
[ https://issues.apache.org/jira/browse/HIVE-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-1859. -- Resolution: Duplicate Hive's tinyint datatype is not supported by the Hive JDBC driver Key: HIVE-1859 URL: https://issues.apache.org/jira/browse/HIVE-1859 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.5.0 Environment: Create a Hive table containing a tinyint column. Then using the Hive JDBC driver execute a Hive query that selects data from this table. An error is then encountered. Reporter: Guy le Mar java.sql.SQLException: Could not create ResultSet: org.apache.hadoop.hive.serde2.dynamic_type.ParseException: Encountered byte at line 1, column 47. Was expecting one of: bool ... i16 ... i32 ... i64 ... double ... string ... map ... list ... set ... required ... optional ... skip ... tok_int_constant ... IDENTIFIER ... } ... at org.apache.hadoop.hive.jdbc.HiveResultSet.initDynamicSerde(HiveResultSet.java:120) at org.apache.hadoop.hive.jdbc.HiveResultSet.init(HiveResultSet.java:74) at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:178) at com.quest.orahive.HiveJdbcClient.main(HiveJdbcClient.java:117)
[jira] Updated: (HIVE-1860) Hive's smallint datatype is not supported by the Hive JDBC driver
[ https://issues.apache.org/jira/browse/HIVE-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1860: - Fix Version/s: 0.7.0 Hive's smallint datatype is not supported by the Hive JDBC driver - Key: HIVE-1860 URL: https://issues.apache.org/jira/browse/HIVE-1860 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.5.0 Environment: Create a Hive table containing a smallint column. Then using the Hive JDBC driver execute a Hive query that selects data from this table. An error is then encountered. Reporter: Guy le Mar Fix For: 0.7.0 java.sql.SQLException: Inrecognized column type: i16 at org.apache.hadoop.hive.jdbc.HiveResultSetMetaData.getColumnType(HiveResultSetMetaData.java:132)
[jira] Resolved: (HIVE-1860) Hive's smallint datatype is not supported by the Hive JDBC driver
[ https://issues.apache.org/jira/browse/HIVE-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-1860. -- Resolution: Duplicate Hive's smallint datatype is not supported by the Hive JDBC driver - Key: HIVE-1860 URL: https://issues.apache.org/jira/browse/HIVE-1860 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.5.0 Environment: Create a Hive table containing a smallint column. Then using the Hive JDBC driver execute a Hive query that selects data from this table. An error is then encountered. Reporter: Guy le Mar Fix For: 0.7.0 java.sql.SQLException: Inrecognized column type: i16 at org.apache.hadoop.hive.jdbc.HiveResultSetMetaData.getColumnType(HiveResultSetMetaData.java:132)
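Both this issue and HIVE-1859 come down to the driver's mapping from server-reported type names to JDBC types missing entries. Conceptually the fix is an exhaustive mapping like the sketch below, which is a standalone illustration rather than the driver's actual code (the exact set of type-name strings is an assumption):

```java
import java.sql.SQLException;
import java.sql.Types;

// Illustrative mapping from the Thrift/serde type names the Hive server
// reports to java.sql.Types constants. HIVE-1859/1860 arise because the
// driver lacked the "byte" (tinyint) and "i16" (smallint) cases.
public class HiveTypeMapping {
    public static int toJdbcType(String thriftType) throws SQLException {
        switch (thriftType) {
            case "byte":   return Types.TINYINT;   // Hive tinyint
            case "i16":    return Types.SMALLINT;  // Hive smallint
            case "i32":    return Types.INTEGER;
            case "i64":    return Types.BIGINT;
            case "bool":   return Types.BOOLEAN;
            case "double": return Types.DOUBLE;
            case "string": return Types.VARCHAR;
            default:
                throw new SQLException("Unrecognized column type: " + thriftType);
        }
    }
}
```

An exhaustive switch with a loud default is preferable to silently returning a catch-all type, since it turns a missing mapping into a clear SQLException like the one reported here.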
[jira] Updated: (HIVE-1477) Specific JDBC driver's jar
[ https://issues.apache.org/jira/browse/HIVE-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1477: - Component/s: (was: Drivers) JDBC Specific JDBC driver's jar -- Key: HIVE-1477 URL: https://issues.apache.org/jira/browse/HIVE-1477 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Jerome Boulon Today we need to include Hadoop's jar in the client-side installation, but since the JDBC driver uses Thrift, a smaller jar with only the Thrift classes should be enough. This would avoid distributing the Hadoop jar on the client side.
[jira] Updated: (HIVE-1381) Async cancel for JDBC connection.
[ https://issues.apache.org/jira/browse/HIVE-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1381: - Component/s: (was: Drivers) JDBC Async cancel for JDBC connection. - Key: HIVE-1381 URL: https://issues.apache.org/jira/browse/HIVE-1381 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.5.0 Reporter: Jerome Boulon
[jira] Resolved: (HIVE-1052) Hive jdbc client - throws exception when query was too long
[ https://issues.apache.org/jira/browse/HIVE-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-1052. -- Resolution: Won't Fix We're not maintaining the 0.4.0 branch. Please upgrade to 0.6.0. Resolving as wontfix. Hive jdbc client - throws exception when query was too long --- Key: HIVE-1052 URL: https://issues.apache.org/jira/browse/HIVE-1052 Project: Hive Issue Type: Bug Components: Drivers, Query Processor Affects Versions: 0.4.0 Reporter: Vu Hoang I tried the query below {noformat} select columns from table where columnS='AAA' or columnS='BBB' or columnS='CCC' or ... etc {noformat} It took a long time and threw an exception afterwards ... for hive jdbc {noformat} FAILED: Unknown exception: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=DrWho, access=WRITE, inode=hive-asadm:asadm:asadm:rwxr-xr-x {noformat} ... for the hive shell {noformat} FAILED: Parse Error: line 1:21 mismatched input 'from' expecting EOF () {noformat} :(
[jira] Updated: (HIVE-1555) JDBC Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1555: - Component/s: (was: Drivers) JDBC JDBC Storage Handler Key: HIVE-1555 URL: https://issues.apache.org/jira/browse/HIVE-1555 Project: Hive Issue Type: New Feature Components: JDBC Reporter: Bob Robertson Original Estimate: 24h Remaining Estimate: 24h With the Cassandra and HBase Storage Handlers I thought it would make sense to include a generic JDBC RDBMS Storage Handler, so that you could import a standard DB table into Hive. Many people must want to perform HiveQL joins, etc. against tables in other systems.
[jira] Updated: (HIVE-187) ODBC driver
[ https://issues.apache.org/jira/browse/HIVE-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-187: Component/s: (was: Drivers) ODBC ODBC driver --- Key: HIVE-187 URL: https://issues.apache.org/jira/browse/HIVE-187 Project: Hive Issue Type: New Feature Components: Clients, ODBC Reporter: Raghotham Murthy Assignee: Eric Hwang Fix For: 0.4.0 Attachments: HIVE-187.1.patch, HIVE-187.2.patch, HIVE-187.3.patch, hive-187.4.patch, thrift_64.r790732.tgz, thrift_home_linux_32.tgz, thrift_home_linux_64.tgz, unixODBC-2.2.14-1.tgz, unixODBC-2.2.14-2.tgz, unixODBC-2.2.14-3.tgz, unixODBC-2.2.14-hive-patched.tar.gz, unixODBC-2.2.14.tgz, unixodbc.patch We need to provide a small number of functions to get basic query execution and retrieval of results. This is based on the tutorial provided here: http://www.easysoft.com/developer/languages/c/odbc_tutorial.html The minimum set of ODBC functions required is:
SQLAllocHandle - for environment, connection, statement
SQLSetEnvAttr
SQLDriverConnect
SQLExecDirect
SQLNumResultCols
SQLFetch
SQLGetData
SQLDisconnect
SQLFreeHandle
If required, the plan would be to do the following:
1. generate c++ client stubs for the thrift server
2. implement the required functions in c++ by calling the c++ client
3. make the c++ functions in (2) extern C and then use those in the odbc SQL* functions
4. provide a .so (in linux) which can be used by the ODBC clients.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-567: Component/s: (was: Drivers) JDBC jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz Instead of trying to get a complete implementation of JDBC, it's probably more useful to pick reporting/analytics software out there and implement the JDBC methods necessary to get it working. This JIRA is a first attempt at that.
[jira] Updated: (HIVE-679) Integrate JDBC driver with SQuirrelSQL for querying
[ https://issues.apache.org/jira/browse/HIVE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-679: Component/s: (was: Drivers) JDBC Integrate JDBC driver with SQuirrelSQL for querying --- Key: HIVE-679 URL: https://issues.apache.org/jira/browse/HIVE-679 Project: Hive Issue Type: New Feature Components: JDBC Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.4.0 Attachments: HIVE-679.1.patch, HIVE-679.2.branch-0.4.patch, HIVE-679.2.patch Implement the JDBC methods required to support querying and other basic commands using the SQuirrelSQL tool.
[jira] Updated: (HIVE-1126) Missing some Jdbc functionality like getTables getColumns and HiveResultSet.get* methods based on column name.
[ https://issues.apache.org/jira/browse/HIVE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1126: - Component/s: (was: Drivers) JDBC Missing some Jdbc functionality like getTables getColumns and HiveResultSet.get* methods based on column name. -- Key: HIVE-1126 URL: https://issues.apache.org/jira/browse/HIVE-1126 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Bennie Schut Assignee: Bennie Schut Fix For: 0.7.0 Attachments: HIVE-1126-1.patch, HIVE-1126-2.patch, HIVE-1126-3.patch, HIVE-1126-4.patch, HIVE-1126-5.patch, HIVE-1126-6.patch, HIVE-1126-7.patch, HIVE-1126.patch, HIVE-1126_patch(0.5.0_source).patch I've been using the Hive JDBC driver more and more and was missing some functionality, which I have added: HiveDatabaseMetaData.getTables, using show tables to get the info from Hive, and HiveDatabaseMetaData.getColumns, using describe tablename to get the columns. This makes using something like SQuirreL a lot nicer, since you have the list of tables and can just click on the content tab to see what's in a table. I also implemented HiveResultSet.getObject(String columnName), so you can call most get* methods based on the column name.
[jira] Updated: (HIVE-1378) Return value for map, array, and struct needs to return a string
[ https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1378: - Component/s: (was: Drivers) JDBC Return value for map, array, and struct needs to return a string - Key: HIVE-1378 URL: https://issues.apache.org/jira/browse/HIVE-1378 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Jerome Boulon Assignee: Steven Wong Fix For: 0.7.0 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.7.patch, HIVE-1378.patch In order to be able to select/display any data from the Hive JDBC driver, the return value for map, array, and struct needs to be a string.
[jira] Updated: (HIVE-1380) JDBC connection to be able to reattach to same session
[ https://issues.apache.org/jira/browse/HIVE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1380: - Component/s: (was: Drivers) JDBC JDBC connection to be able to reattach to same session -- Key: HIVE-1380 URL: https://issues.apache.org/jira/browse/HIVE-1380 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.5.0 Reporter: Jerome Boulon
[jira] Updated: (HIVE-1688) In the MapJoinOperator, the code uses tag as alias, which is not always true
[ https://issues.apache.org/jira/browse/HIVE-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1688: - Component/s: (was: Drivers) Query Processor In the MapJoinOperator, the code uses tag as alias, which is not always true Key: HIVE-1688 URL: https://issues.apache.org/jira/browse/HIVE-1688 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.6.0, 0.7.0 Reporter: Liyin Tang Assignee: Liyin Tang Fix For: 0.7.0 Original Estimate: 24h Remaining Estimate: 24h In the MapJoinOperator and SMBMapJoinOperator, the code uses the tag as the alias, which is not always correct. Actually, alias = order[tag].
[jira] Updated: (HIVE-1815) The class HiveResultSet should implement batch fetching.
[ https://issues.apache.org/jira/browse/HIVE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1815: - Component/s: (was: Drivers) JDBC The class HiveResultSet should implement batch fetching. Key: HIVE-1815 URL: https://issues.apache.org/jira/browse/HIVE-1815 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.5.0 Environment: Custom Java application using the Hive JDBC driver to connect to a Hive server, execute a Hive query and process the results. Reporter: Guy le Mar Fix For: 0.6.0 When using the Hive JDBC driver, you can execute a Hive query and obtain a HiveResultSet instance that contains the results of the query. Unfortunately, HiveResultSet can then only fetch a single row of these results from the Hive server at a time. As a consequence, it's extremely slow to fetch a result set of anything other than a trivial size. It would be nice for HiveResultSet to be able to fetch N rows from the server at a time, so that performance is adequate for applications that involve human interaction. (From memory, I think it took me around 20 minutes to fetch 4000 rows.)
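The batching the ticket asks for is a standard cursor-buffering pattern: pull up to N rows per server round trip and hand them out from a local buffer. A self-contained sketch, where fetchN stands in for a hypothetical "fetch N rows" server call (not the driver's actual API):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

// Sketch of client-side batch fetching: one server call per fetchSize
// rows instead of one call per row, which is what made the original
// HiveResultSet so slow for non-trivial result sets.
public class BatchingCursor implements Iterator<String> {
    private final IntFunction<List<String>> fetchN; // stand-in for the server call
    private final int fetchSize;
    private final List<String> buffer = new ArrayList<>();
    private boolean exhausted = false;

    public BatchingCursor(IntFunction<List<String>> fetchN, int fetchSize) {
        this.fetchN = fetchN;
        this.fetchSize = fetchSize;
    }

    @Override
    public boolean hasNext() {
        if (buffer.isEmpty() && !exhausted) {
            List<String> batch = fetchN.apply(fetchSize);
            if (batch.isEmpty()) exhausted = true;
            else buffer.addAll(batch);
        }
        return !buffer.isEmpty();
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return buffer.remove(0);
    }
}
```

With fetchSize = 100, fetching 4000 rows costs about 40 round trips instead of 4000, which is where the interactive-use speedup comes from.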
[jira] Updated: (HIVE-1816) Reporting of (seemingly inconsequential) transport exception has major impact on performance
[ https://issues.apache.org/jira/browse/HIVE-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1816: - Component/s: (was: Drivers) JDBC Reporting of (seemingly inconsequential) transport exception has major impact on performance Key: HIVE-1816 URL: https://issues.apache.org/jira/browse/HIVE-1816 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.5.0 Environment: Custom Java application using the Hive JDBC driver to connect to a Hive server, execute a Hive query and process the results. Reporter: Guy le Mar Priority: Minor During the process of executing a Hive query and then fetching the results, the following stack trace is continually output to stderr. For the query I executed, 47 MB of this text was generated. As a consequence, the performance of the application itself suffered. (Redirecting stderr to a file halved the time it took my application to fetch the results - from 2 minutes down to 70 seconds.) Note, this also occurs if you use an application such as SQuirrel SQL (http://www.squirrelsql.org) to execute a Hive query using the Hive JDBC driver. The stack trace that is repeatedly reported is...
org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol$SimpleTransportTokenizer.fillTokenizer(TCTLSeparatedProtocol.java:215) at org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol$SimpleTransportTokenizer.init(TCTLSeparatedProtocol.java:210) at org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol.internalInitialize(TCTLSeparatedProtocol.java:336) at org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol.initialize(TCTLSeparatedProtocol.java:417) at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.initialize(DynamicSerDe.java:94) at org.apache.hadoop.hive.jdbc.HiveResultSet.initDynamicSerde(HiveResultSet.java:117) at org.apache.hadoop.hive.jdbc.HiveResultSet.init(HiveResultSet.java:74) at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:178) at com.quest.orahive.HiveJdbcClient.main(HiveJdbcClient.java:117)
[jira] Updated: (HIVE-143) Remove the old Metastore
[ https://issues.apache.org/jira/browse/HIVE-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-143: Affects Version/s: (was: 0.6.0) (was: 0.4.0) Remove the old Metastore Key: HIVE-143 URL: https://issues.apache.org/jira/browse/HIVE-143 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.3.0 Reporter: Johan Oskarsson Assignee: Prasad Chakka Priority: Minor Fix For: 0.4.0 Attachments: hive-143.patch It is my understanding that there are two metastores: an HDFS-based one that isn't used anymore, and a new one based on SQL databases. This causes some confusion and extra work for new developers, myself included. Am I correct in thinking that the old Metastore won't be used anymore and could be removed?
[jira] Created: (HIVE-1879) Remove hive.metastore.metadb.dir property from hive-default.xml and HiveConf
Remove hive.metastore.metadb.dir property from hive-default.xml and HiveConf Key: HIVE-1879 URL: https://issues.apache.org/jira/browse/HIVE-1879 Project: Hive Issue Type: Bug Components: Configuration, Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach The file-based MetaStore implementation was removed in HIVE-143. We also need to remove the hive.metastore.metadb.dir property from hive-default.xml and HiveConf, as well as the references to this property that currently appear in HiveMetaStoreClient.
[jira] Commented: (HIVE-1643) support range scans and non-key columns in HBase filter pushdown
[ https://issues.apache.org/jira/browse/HIVE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977472#action_12977472 ] John Sichi commented on HIVE-1643: -- Notes for working on this: Background is in http://wiki.apache.org/hadoop/Hive/FilterPushdownDev
* In HiveHBaseTableInputFormat, newIndexPredicateAnalyzer needs to add additional operators (and stop restricting the allowed column names). And then convertFilter needs to set up corresponding HBase filter conditions based on the predicates it finds. Note that for inequality conditions on the key, it's necessary to muck with startRow/stopRow (not just the filter evaluator).
* See also the comment in HBaseStorageHandler.decomposePredicate. Currently, it can only accept a single predicate. If you want to be able to support AND of multiple predicates (using HBase's FilterList) then this will need to be relaxed.
* Beware of the fact that until HIVE-1538 gets committed, it is more difficult to make sure that the HBase-level filtering is working as expected. The reason is that without HIVE-1538, a second copy of the filter gets applied within Hive (regardless of how the filter was decomposed when it was pushed down to HBase). So even if HBase doesn't filter out everything you're expecting it to, you won't notice in the results since Hive will do the filtering again.
support range scans and non-key columns in HBase filter pushdown Key: HIVE-1643 URL: https://issues.apache.org/jira/browse/HIVE-1643 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.7.0 Reporter: John Sichi Assignee: John Sichi Fix For: 0.7.0 HIVE-1226 added support for WHERE rowkey=3. We would like to support WHERE rowkey BETWEEN 10 and 20, as well as predicates on non-rowkeys (plus conjunctions etc). Non-rowkey conditions can't be used to filter out entire ranges, but they can be used to push the per-row filter processing as far down as possible.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1880) Hive should verify that entries in hive.metastore.uris are Thrift URIs
Hive should verify that entries in hive.metastore.uris are Thrift URIs -- Key: HIVE-1880 URL: https://issues.apache.org/jira/browse/HIVE-1880 Project: Hive Issue Type: Bug Components: Configuration, Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach The hive.metastore.uris configuration property contains a list of Thrift URIs for remote Thrift metastores. These values are used if the user has specified a non-local metastore configuration by setting hive.metastore.local=false. HiveMetaStoreClient.openStore(URI) currently makes the assumption that the URI is a Thrift Binary Protocol endpoint. We should first check to make sure that the scheme of the URI is thrift before attempting to open a Thrift binary connection to the host and port specified in the URI. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
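The proposed scheme check is straightforward; here is a hedged sketch using only java.net.URI (the class and method names are illustrative, not the real HiveMetaStoreClient code):

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch of the proposed validation: before opening a Thrift
// connection, verify that an entry from hive.metastore.uris actually uses
// the "thrift" scheme and names a usable host:port endpoint.
public class MetastoreUriCheck {
    static boolean isThriftUri(String s) {
        try {
            URI uri = new URI(s);
            return "thrift".equalsIgnoreCase(uri.getScheme())
                && uri.getHost() != null
                && uri.getPort() != -1;  // a binary endpoint needs host:port
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

With a check like this in place, a misconfigured `http://` entry would be rejected up front instead of failing obscurely when the Thrift handshake breaks.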
[jira] Commented: (HIVE-1880) Hive should verify that entries in hive.metastore.uris are Thrift URIs
[ https://issues.apache.org/jira/browse/HIVE-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977475#action_12977475 ] Carl Steinbach commented on HIVE-1880: -- It would also probably be useful to log a warning message if the configuration has hive.metastore.local=true and values set for hive.metastore.uris, since in this case the value of the latter property is ignored. Hive should verify that entries in hive.metastore.uris are Thrift URIs -- Key: HIVE-1880 URL: https://issues.apache.org/jira/browse/HIVE-1880 Project: Hive Issue Type: Bug Components: Configuration, Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach The hive.metastore.uris configuration property contains a list of Thrift URIs for remote Thrift metastores. These values are used if the user has specified a non-local metastore configuration by setting hive.metastore.local=false. HiveMetaStoreClient.openStore(URI) currently makes the assumption that the URI is a Thrift Binary Protocol endpoint. We should first check to make sure that the scheme of the URI is thrift before attempting to open a Thrift binary connection to the host and port specified in the URI. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
January Hive Contributors Meeting
Announcing a new Meetup for Hive Contributors Group! *What*: January Hive Contributors Meeting (http://www.meetup.com/Hive-Contributors-Group/calendar/15919127/) *When*: Tuesday, January 11, 2011 5:00 PM *Where*: Cloudera 210 Portage Ave Palo Alto, CA 94306 The next Hive Contributors Meeting will convene on Tuesday January 11th at 5pm at Cloudera's offices in Palo Alto. Please RSVP if you plan to attend this event. RSVP to this Meetup: http://www.meetup.com/Hive-Contributors-Group/calendar/15919127/
[jira] Created: (HIVE-1882) Remove CHANGES.txt
Remove CHANGES.txt -- Key: HIVE-1882 URL: https://issues.apache.org/jira/browse/HIVE-1882 Project: Hive Issue Type: Task Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach I propose that we remove the CHANGES.txt file for the following reasons: * It's a headache to maintain. * It contains a lot of errors. * It's redundant since this information is available in JIRA and via source control. * The RELEASE_NOTES.txt file now contains the same information auto-generated by JIRA. We should update this file as part of the release process instead of updating CHANGES.txt on every commit. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Remove CHANGES.txt from Hive build
Hi, I just filed HIVE-1882 which proposes removing CHANGES.txt from the Hive build. I think the following points support this change: * It's a headache to maintain. * It contains a lot of errors. * It's redundant since this information is available in JIRA and via source control. * The RELEASE_NOTES.txt file contains the same information auto-generated using JIRA. We should update this file as part of the release process instead of updating CHANGES.txt on every commit. I plan to bring this up at the contrib meeting on Tuesday, but wanted to mention it in advance so people have a chance to think about it. Thanks. Carl
[jira] Updated: (HIVE-1881) Add an option to use FsShell to delete dir in warehouse
[ https://issues.apache.org/jira/browse/HIVE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1881: - Component/s: Metastore Description: @Yongqiang: What's the motivation for doing this? Add an option to use FsShell to delete dir in warehouse --- Key: HIVE-1881 URL: https://issues.apache.org/jira/browse/HIVE-1881 Project: Hive Issue Type: Improvement Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-1881.1.patch @Yongqiang: What's the motivation for doing this? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1881) Add an option to use FsShell to delete dir in warehouse
[ https://issues.apache.org/jira/browse/HIVE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977555#action_12977555 ] Carl Steinbach commented on HIVE-1881: -- Review posted here: https://reviews.apache.org/r/210/ Add an option to use FsShell to delete dir in warehouse --- Key: HIVE-1881 URL: https://issues.apache.org/jira/browse/HIVE-1881 Project: Hive Issue Type: Improvement Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-1881.1.patch @Yongqiang: What's the motivation for doing this? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Review Request: HIVE-1881: Add an option to use FsShell to delete dir in warehouse
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/210/#review84 --- http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/SessionConfStore.java https://reviews.apache.org/r/210/#comment154 I think this should be HiveConf instead of Configuration. http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/SessionConfStore.java https://reviews.apache.org/r/210/#comment153 I think you can simplify the interface by making getSessionConfStore() private, and then calling it from getConf() and setConf() which can now be made static. Then you'll be able to call SessionConfStore.getConf() instead of SessionConfStore.getSessionConfStore().getConf() http://svn.apache.org/repos/asf/hive/trunk/conf/hive-default.xml https://reviews.apache.org/r/210/#comment155 I don't understand the motivation for this change, but assuming that FsShell provides features unavailable in FileSystem, is there any reason why we can't replace the FileSystem based implementation with the new one that uses FsShell? - Carl On 2011-01-04 16:41:46, Carl Steinbach wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/210/ --- (Updated 2011-01-04 16:41:46) Review request for hive. Summary --- Review https://issues.apache.org/jira/secure/attachment/12467491/HIVE-1881.1.patch This addresses bug HIVE-1881. 
https://issues.apache.org/jira/browse/HIVE-1881 Diffs - http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1055171 http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/SessionConfStore.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/conf/hive-default.xml 1055171 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1055171 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1055171 Diff: https://reviews.apache.org/r/210/diff Testing --- Thanks, Carl
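The static-facade refactoring suggested in the review can be sketched as follows. A ThreadLocal map of strings stands in for the real per-session HiveConf, and the key-based getConf(String)/setConf(String, String) signatures are a simplification of the actual interface; the point is only the shape of the API, where the store lookup becomes a private detail:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the suggested refactoring: getSessionConfStore() goes private,
// and static getConf()/setConf() call it internally, so callers write
// SessionConfStore.getConf(...) rather than
// SessionConfStore.getSessionConfStore().getConf(...).
public class SessionConfStore {
    private static final ThreadLocal<SessionConfStore> STORE =
        ThreadLocal.withInitial(SessionConfStore::new);

    // Stand-in for the real per-session HiveConf object.
    private final Map<String, String> conf = new HashMap<>();

    // The lookup is now an implementation detail.
    private static SessionConfStore getSessionConfStore() {
        return STORE.get();
    }

    public static String getConf(String key) {
        return getSessionConfStore().conf.get(key);
    }

    public static void setConf(String key, String value) {
        getSessionConfStore().conf.put(key, value);
    }
}
```

The design win is purely ergonomic: call sites shrink and the session-scoping mechanism can change later without touching them.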
[jira] Updated: (HIVE-1862) Revive partition filtering in the Hive MetaStore
[ https://issues.apache.org/jira/browse/HIVE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mac Yang updated HIVE-1862: --- Status: Patch Available (was: Open) Revive partition filtering in the Hive MetaStore Key: HIVE-1862 URL: https://issues.apache.org/jira/browse/HIVE-1862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Devaraj Das Fix For: 0.7.0 Attachments: HIVE-1862.1.patch.txt HIVE-1853 downgraded the JDO version. This makes the feature of partition filtering in the metastore unusable. This jira is to keep track of the lost feature and discussing approaches to bring it back. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1862) Revive partition filtering in the Hive MetaStore
[ https://issues.apache.org/jira/browse/HIVE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mac Yang updated HIVE-1862: --- Attachment: HIVE-1862.1.patch.txt Datanucleus 2.0.3 does not support the get() method on Collection, which the partition filtering code depends on in order to retrieve the value for a particular partition and use it for filtering. The submitted patch is a quick work around. It uses the substring() function to extract the partition value out of the partitionName field, and thus eliminates the need of the get() method. However, this approach does not work if the partition value contains special characters. This is because the partitionName has the special characters escaped. Hence the partition value generated using the substring() approach is also in the escaped form. Here is the list of special characters for reference purpose, '', '#', '%', '\'', '*', '/', ':', '=', '?', '\\', '\u007F', '{', ']' While this solution is incomplete, I am hoping this submission will trigger more suggestions and ideas. Revive partition filtering in the Hive MetaStore Key: HIVE-1862 URL: https://issues.apache.org/jira/browse/HIVE-1862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Devaraj Das Fix For: 0.7.0 Attachments: HIVE-1862.1.patch.txt HIVE-1853 downgraded the JDO version. This makes the feature of partition filtering in the metastore unusable. This jira is to keep track of the lost feature and discussing approaches to bring it back. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
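The substring() workaround Mac describes, and its escaping pitfall, can be illustrated with a plain-Java model. The partitionName layout and the percent-style escape shown here are assumptions for illustration, not the exact metastore encoding:

```java
// Sketch of the workaround: given a partitionName like
// "ds=2011-01-04/region=us", extract the value for one partition key by
// slicing between "key=" and the next "/". The pitfall: special characters
// are stored escaped in partitionName, so the sliced value comes back in
// escaped form and fails to match the user's literal filter value.
public class PartitionNameSketch {
    static String valueOf(String partitionName, String key) {
        String marker = key + "=";
        int start = partitionName.indexOf(marker);
        if (start < 0) return null;          // key not present
        start += marker.length();
        int end = partitionName.indexOf('/', start);
        return end < 0 ? partitionName.substring(start)
                       : partitionName.substring(start, end);
    }
}
```

For a name like `ts=12%3A30` the extracted value is the escaped `12%3A30`, not `12:30`, which is exactly why the patch is incomplete for values containing the listed special characters.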
[jira] Commented: (HIVE-1862) Revive partition filtering in the Hive MetaStore
[ https://issues.apache.org/jira/browse/HIVE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977583#action_12977583 ] Mac Yang commented on HIVE-1862: A quick note about the patch. The work around is implemented in ExpressionTree.java. The rest of the patch just added back old code that was removed as part of HIVE-1853. Revive partition filtering in the Hive MetaStore Key: HIVE-1862 URL: https://issues.apache.org/jira/browse/HIVE-1862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Devaraj Das Fix For: 0.7.0 Attachments: HIVE-1862.1.patch.txt HIVE-1853 downgraded the JDO version. This makes the feature of partition filtering in the metastore unusable. This jira is to keep track of the lost feature and discussing approaches to bring it back. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1881) Add an option to use FsShell to delete dir in warehouse
[ https://issues.apache.org/jira/browse/HIVE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Yongqiang updated HIVE-1881: --- Description: @Yongqiang: What's the motivation for doing this? This is to work with some internal hacky codes about doing delete. There should be no impact if you use open source hadoop. But the idea here is to give users 2 options to do the delete. In Facebook, we have some customized code in FsShell which can control whether the delete should go through trash or not. was:@Yongqiang: What's the motivation for doing this? Add an option to use FsShell to delete dir in warehouse --- Key: HIVE-1881 URL: https://issues.apache.org/jira/browse/HIVE-1881 Project: Hive Issue Type: Improvement Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-1881.1.patch @Yongqiang: What's the motivation for doing this? This is to work with some internal hacky codes about doing delete. There should be no impact if you use open source hadoop. But the idea here is to give users 2 options to do the delete. In Facebook, we have some customized code in FsShell which can control whether the delete should go through trash or not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1151) Add 'show version' command to Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1151: - Component/s: CLI Add 'show version' command to Hive CLI -- Key: HIVE-1151 URL: https://issues.apache.org/jira/browse/HIVE-1151 Project: Hive Issue Type: New Feature Components: CLI, Clients Affects Versions: 0.6.0 Reporter: Carl Steinbach Assignee: Carl Steinbach At a minimum this command should return the version information obtained from the hive-cli jar. Ideally this command will also return version information obtained from each of the hive jar files present in the CLASSPATH, which will allow us to quickly detect cases where people are using incompatible jars. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1881) Add an option to use FsShell to delete dir in warehouse
[ https://issues.apache.org/jira/browse/HIVE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977591#action_12977591 ] He Yongqiang commented on HIVE-1881: {quote} I don't understand the motivation for this change, but assuming that FsShell provides features unavailable in FileSystem, is there any reason why we can't replace the FileSystem based implementation with the new one that uses FsShell? {quote} Yeah, we can replace it completely. But there is an overhead of using FsShell since it requires a new process. We just want to go to the new code path only when needed; for normal cases, just keep the old behavior. {quote} I think you can simplify the interface by making getSessionConfStore() private, and then calling it from getConf() and setConf() which can now be made static. Then you'll be able to call SessionConfStore.getConf() instead of SessionConfStore.getSessionConfStore().getConf() {quote} Will do it and upload a new patch. Thanks! Add an option to use FsShell to delete dir in warehouse --- Key: HIVE-1881 URL: https://issues.apache.org/jira/browse/HIVE-1881 Project: Hive Issue Type: Improvement Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-1881.1.patch @Yongqiang: What's the motivation for doing this? This is to work with some internal hacky codes about doing delete. There should be no impact if you use open source hadoop. But the idea here is to give users 2 options to do the delete. In Facebook, we have some customized code in FsShell which can control whether the delete should go through trash or not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1881) Add an option to use FsShell to delete dir in warehouse
[ https://issues.apache.org/jira/browse/HIVE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977594#action_12977594 ] Carl Steinbach commented on HIVE-1881: -- I'm concerned that this patch introduces two new configuration properties that don't make sense to anyone outside of Facebook. I think we need to avoid doing this since it makes the configuration process more complicated (it's already complicated enough), and also introduces an untested code path. Instead, I'd like to propose that we define a MetaStoreFs interface that defines createDir and deleteDir methods, etc, along with a default implementation and the ability to plug in other implementations by setting a new hive.metastore.fs.impl configuration property. What do you think? Add an option to use FsShell to delete dir in warehouse --- Key: HIVE-1881 URL: https://issues.apache.org/jira/browse/HIVE-1881 Project: Hive Issue Type: Improvement Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-1881.1.patch @Yongqiang: What's the motivation for doing this? This is to work with some internal hacky codes about doing delete. There should be no impact if you use open source hadoop. But the idea here is to give users 2 options to do the delete. In Facebook, we have some customized code in FsShell which can control whether the delete should go through trash or not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
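Carl's proposed MetaStoreFs interface might look roughly like this. This is a sketch under stated assumptions: the interface, the default implementation, and the hive.metastore.fs.impl property do not exist yet, and java.io.File stands in for Hadoop's Path/FileSystem types to keep the sketch self-contained:

```java
import java.io.File;

// Hypothetical shape of the proposed pluggable filesystem shim: a small
// interface with a default implementation, with alternatives (e.g. an
// FsShell-backed one) selected via a new hive.metastore.fs.impl property.
interface MetaStoreFs {
    boolean createDir(File dir);
    boolean deleteDir(File dir);
}

class DefaultMetaStoreFs implements MetaStoreFs {
    @Override public boolean createDir(File dir) {
        return dir.mkdirs() || dir.isDirectory();
    }

    @Override public boolean deleteDir(File dir) {
        // Recursive delete, analogous to FileSystem.delete(path, true).
        File[] children = dir.listFiles();
        if (children != null) {
            for (File c : children) {
                if (c.isDirectory()) deleteDir(c); else c.delete();
            }
        }
        return dir.delete();
    }
}
```

This keeps the Facebook-specific FsShell variant out of the default code path while still leaving a supported extension point, which is the crux of Carl's objection to the two new properties.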
[jira] Updated: (HIVE-1881) Add an option to use FsShell to delete dir in warehouse
[ https://issues.apache.org/jira/browse/HIVE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Yongqiang updated HIVE-1881: --- Attachment: HIVE-1881.2.patch Add an option to use FsShell to delete dir in warehouse --- Key: HIVE-1881 URL: https://issues.apache.org/jira/browse/HIVE-1881 Project: Hive Issue Type: Improvement Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-1881.1.patch, HIVE-1881.2.patch @Yongqiang: What's the motivation for doing this? This is to work with some internal hacky codes about doing delete. There should be no impact if you use open source hadoop. But the idea here is to give users 2 options to do the delete. In Facebook, we have some customized code in FsShell which can control whether the delete should go through trash or not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
RE: Regarding Hive History File(s).
Hmm, ok. I think creating and cleaning up resources should be the responsibility of the same system; let's not hand it over to the cron utility, since users may not know (or need to know) which files to delete, when to delete them, or where from. What about a timer task that cleans up files older than a configured elapsed time, say an hour or a week? I'm raising a new JIRA for this and will provide the patch. Ok, you are talking about HIVE-1708. If it is about changing the file location, one can do that by overriding the property hive.querylog.location in hive-default.xml. I will comment on that. -Mohit *** This e-mail and attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient's) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! -Original Message- From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Tuesday, January 04, 2011 8:03 PM To: mohitsi...@huawei.com Cc: hive-...@hadoop.apache.org; c...@cloudera.com Subject: Re: Regarding Hive History File(s). On Tue, Jan 4, 2011 at 7:03 AM, Mohit mohitsi...@huawei.com wrote: Hello All, What is the purpose of maintaining hive history files which contain session information like session start, query start, query end, task start, task end etc.? Are they being used later (say by a tool) for some purpose? I don't see these files being getting deleted from the system ;any configuration needed to be set to enable deletion or Is there any design strategy/decision/rationale for not deleting them at all? Also, in these files I don't see the session end message being logged, is it reserved for future use? 
-Mohit *** This e-mail and attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient's) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! HiveHistory was added a while ago between 3.0 and 4.0 (iirc). A tool to view them is HiveHistoryViewer in the API. I am not exactly sure who is doing what with that data. The Web Interface does use it to provide links to the JobTracker. So it helpful for trying to trace all the dependant jobs of a query after the fact. There is a ticket open to customize the file location. I was also thinking we should allow the user to supply a 'none' to turn off the feature. As for clean up and management cron and rm seem like a good fit.
[jira] Commented: (HIVE-1708) make hive history file configurable
[ https://issues.apache.org/jira/browse/HIVE-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977633#action_12977633 ] Mohit Sikri commented on HIVE-1708: --- Well, that's not the case I observed. I added the below property into hive-default.xml:

<property>
  <name>hive.querylog.location</name>
  <value>/tmp/tansactionhist</value>
  <description>Location for the hive query log. Default value is /tmp/${user.name}</description>
</property>

and it is creating files under the /tmp/transactionhist directory. Kindly confirm once. make hive history file configurable --- Key: HIVE-1708 URL: https://issues.apache.org/jira/browse/HIVE-1708 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Currently, it is derived from System.getProperty(user.home)/.hivehistory; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1883) Periodic cleanup of Hive History log files.
Periodic cleanup of Hive History log files. --- Key: HIVE-1883 URL: https://issues.apache.org/jira/browse/HIVE-1883 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.6.0 Environment: Hive 0.6.0, Hadoop 0.20.1 SUSE Linux Enterprise Server 11 (i586) VERSION = 11 PATCHLEVEL = 0 Reporter: Mohit Sikri After starting Hive and running queries, transaction history files get created in the /tmp/root folder. We should periodically remove the files (not all of them) that are too old to represent any significant information. Solution: a scheduled timer task that cleans up log files older than a configured age. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
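The cleanup HIVE-1883 proposes could be sketched as a TimerTask plus a pure staleness check. All names, the maxAge parameter, and the hive_job_log_ filename prefix are illustrative, not actual Hive code:

```java
import java.io.File;
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical sketch of a scheduled history-file cleanup: periodically
// delete hive_job_log_* files older than a configured age.
public class HistoryCleanupSketch {
    // Pure staleness predicate, separated out so it is easy to test.
    static boolean isStale(String name, long lastModified, long now, long maxAgeMillis) {
        return name.startsWith("hive_job_log_") && now - lastModified > maxAgeMillis;
    }

    // Sweep one directory; returns the number of files deleted.
    static int deleteOlderThan(File dir, long maxAgeMillis) {
        int deleted = 0;
        File[] files = dir.listFiles();
        if (files == null) return 0;           // dir missing or unreadable
        long now = System.currentTimeMillis();
        for (File f : files) {
            if (f.isFile()
                && isStale(f.getName(), f.lastModified(), now, maxAgeMillis)
                && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    // Run the sweep on a fixed schedule using a daemon timer thread.
    static Timer schedule(final File dir, final long maxAgeMillis, long periodMillis) {
        Timer timer = new Timer("hive-history-cleanup", /*isDaemon=*/true);
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() { deleteOlderThan(dir, maxAgeMillis); }
        }, periodMillis, periodMillis);
        return timer;
    }
}
```

Note this is the in-process alternative Mohit argues for; the thread that follows makes the case that cron/logrotate outside the server is the more conventional choice.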
[jira] Updated: (HIVE-1840) Support ALTER DATABASE to change database properties
[ https://issues.apache.org/jira/browse/HIVE-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1840: - Attachment: HIVE-1840.patch Support ALTER DATABASE to change database properties Key: HIVE-1840 URL: https://issues.apache.org/jira/browse/HIVE-1840 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Ning Zhang Attachments: HIVE-1840.patch This is a follow-up to HIVE-1836 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1840) Support ALTER DATABASE to change database properties
[ https://issues.apache.org/jira/browse/HIVE-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1840: - Status: Patch Available (was: Open) Support ALTER DATABASE to change database properties Key: HIVE-1840 URL: https://issues.apache.org/jira/browse/HIVE-1840 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Ning Zhang Attachments: HIVE-1840.patch This is a follow-up to HIVE-1836 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Regarding Hive History File(s).
Hi Mohit, Usually it's the Ops/IT staff that ends up managing things like a production HiveServer instance, and in a UNIX shop I suspect that most of these folks are already going to be familiar with using cron and logrotate ( http://linuxcommand.org/man_pages/logrotate8.html) to manage the logs produced by their other server systems. Building a log rotation feature into HiveServer defies this convention and will force people to learn how to configure a new log rotation system specific to HiveServer. It also requires us to write, debug, document and maintain code that isn't really necessary. I think the best approach is to take advantage of what already exists by documenting Hive's logging behavior in the Admin manual and providing a sample logrotate configuration file. Thanks. Carl On Tue, Jan 4, 2011 at 9:41 PM, Mohit mohitsi...@huawei.com wrote: hmm, ok , I think the process of creating and cleanup of resources should be the part of the same system, lets not hand it over to cron utility, users might not be knowing or need not to know what files to delete, when to delete, from where to delete. What about a timer task which cleans up these files older than the configured elapsed time say a deleting files an hour old or a week old.? I'm raising new JIRA for this and will provide the patch. Ok, you are talking about HIVE-1708, WELL If it is about changing the file location, one can do that by overriding the property *hive.querylog.location *by adding into hive-default.xml. I will comment on that. -Mohit *** This e-mail and attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient's) is prohibited. 
If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! -Original Message- From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Tuesday, January 04, 2011 8:03 PM To: mohitsi...@huawei.com Cc: hive-...@hadoop.apache.org; c...@cloudera.com Subject: Re: Regarding Hive History File(s). On Tue, Jan 4, 2011 at 7:03 AM, Mohit mohitsi...@huawei.com wrote: Hello All, What is the purpose of maintaining hive history files which contain session information like session start, query start, query end, task start, task end etc.? Are they being used later (say by a tool) for some purpose? I don't see these files being getting deleted from the system ;any configuration needed to be set to enable deletion or Is there any design strategy/decision/rationale for not deleting them at all? Also, in these files I don't see the session end message being logged, is it reserved for future use? -Mohit *** This e-mail and attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient's) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! HiveHistory was added a while ago between 3.0 and 4.0 (iirc). A tool to view them is HiveHistoryViewer in the API. I am not exactly sure who is doing what with that data. The Web Interface does use it to provide links to the JobTracker. So it helpful for trying to trace all the dependant jobs of a query after the fact. There is a ticket open to customize the file location. I was also thinking we should allow the user to supply a 'none' to turn off the feature. As for clean up and management cron and rm seem like a good fit.
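For reference, the logrotate approach Carl suggests might look something like the following. This is an untested, illustrative stanza; the glob and path assume hive.querylog.location left at its /tmp/<user> default, and the intent is removal of finished session logs rather than classic rotation of a growing file:

```
# Illustrative logrotate stanza for Hive history files (path assumed).
/tmp/hive/hive_job_log_*.txt {
    daily
    rotate 0
    missingok
    notifempty
}
```

Edward's cron-plus-rm suggestion would achieve the same effect with a scheduled find-and-delete over the same directory; either way the policy lives in standard Ops tooling rather than inside HiveServer.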
[jira] Commented: (HIVE-1881) Add an option to use FsShell to delete dir in warehouse
[ https://issues.apache.org/jira/browse/HIVE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977666#action_12977666 ] Namit Jain commented on HIVE-1881: -- Talking offline with Yongqiang, the facebook specific implementation of this interface need not be checked in open source, nor is there any need to document the new configuration parameter in open source, since this parameter only makes sense in the facebook enviroment Add an option to use FsShell to delete dir in warehouse --- Key: HIVE-1881 URL: https://issues.apache.org/jira/browse/HIVE-1881 Project: Hive Issue Type: Improvement Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-1881.1.patch, HIVE-1881.2.patch @Yongqiang: What's the motivation for doing this? This is to work with some internal hacky codes about doing delete. There should be no impact if you use open source hadoop. But the idea here is to give users 2 options to do the delete. In Facebook, we have some customized code in FsShell which can control whether the delete should go through trash or not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.