[jira] Commented: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster

2009-12-22 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793704#action_12793704
 ] 

Suresh Antony commented on HIVE-809:


I am on vacation from 12/11/09 to 1/5/10.


> Create a copier to copy data from scribe hdfs cluster to main DW cluster
> 
>
> Key: HIVE-809
> URL: https://issues.apache.org/jira/browse/HIVE-809
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Attachments: patch_809_1.txt
>
>
> Currently we have scribe hdfs, which writes scribe data directly to an HDFS 
> cluster. But in most cases this cluster will not be used for accessing the 
> data.
> This data needs to be copied to a cluster from which you can access the scribe 
> data using Hive or some other tool.
> This copier should be able to copy large amounts of data on a near-real-time 
> basis.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster

2009-09-02 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-809:
---

Status: Patch Available  (was: Open)

Submitted a patch for the scribe HDFS to main HDFS copier.


> Create a copier to copy data from scribe hdfs cluster to main DW cluster
> 
>
> Key: HIVE-809
> URL: https://issues.apache.org/jira/browse/HIVE-809
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.4.0
>
> Attachments: patch_809_1.txt
>
>
> Currently we have scribe hdfs, which writes scribe data directly to an HDFS 
> cluster. But in most cases this cluster will not be used for accessing the 
> data.
> This data needs to be copied to a cluster from which you can access the scribe 
> data using Hive or some other tool.
> This copier should be able to copy large amounts of data on a near-real-time 
> basis.




[jira] Updated: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster

2009-09-02 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-809:
---

Attachment: patch_809_1.txt

patch for scribe data copier. 

> Create a copier to copy data from scribe hdfs cluster to main DW cluster
> 
>
> Key: HIVE-809
> URL: https://issues.apache.org/jira/browse/HIVE-809
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.4.0
>
> Attachments: patch_809_1.txt
>
>
> Currently we have scribe hdfs, which writes scribe data directly to an HDFS 
> cluster. But in most cases this cluster will not be used for accessing the 
> data.
> This data needs to be copied to a cluster from which you can access the scribe 
> data using Hive or some other tool.
> This copier should be able to copy large amounts of data on a near-real-time 
> basis.




[jira] Created: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster

2009-08-31 Thread Suresh Antony (JIRA)
Create a copier to copy data from scribe hdfs cluster to main DW cluster


 Key: HIVE-809
 URL: https://issues.apache.org/jira/browse/HIVE-809
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Suresh Antony
Assignee: Suresh Antony
Priority: Minor
 Fix For: 0.4.0


Currently we have scribe hdfs, which writes scribe data directly to an HDFS 
cluster. But in most cases this cluster will not be used for accessing the data.
This data needs to be copied to a cluster from which you can access the scribe 
data using Hive or some other tool.
This copier should be able to copy large amounts of data on a near-real-time 
basis.






[jira] Updated: (HIVE-563) UDF for parsing the URL

2009-06-16 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-563:
---

Attachment: patch_563.txt.2

Removed String.split().
Added a second evaluate method where the user can specify the 'QUERY' key as a 
separate argument, e.g.:
parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k2'),
parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1')


> UDF for parsing the URL
> ---
>
> Key: HIVE-563
> URL: https://issues.apache.org/jira/browse/HIVE-563
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_563.txt, patch_563.txt.1, patch_563.txt.2
>
>
> Needs a udf to extract the parts of url from url string. 




[jira] Updated: (HIVE-563) UDF for parsing the URL

2009-06-15 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-563:
---

Attachment: patch_563.txt.1

 * UDF to extract specific parts from URL
 * parse_url('http://facebook.com/path/p1.php?query=1', 'HOST') will return 
'facebook.com'
 * parse_url('http://facebook.com/path/p1.php?query=1', 'PATH') will return 
'/path/p1.php'
 * parse_url('http://facebook.com/path/p1.php?query=1', 'QUERY') will return 
'query=1'
 * parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'REF') will return 
'Ref'
 * parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'PROTOCOL') will 
return 'http'
 * Possible values are HOST,PATH,QUERY,REF,PROTOCOL,AUTHORITY,FILE,USERINFO
 * You can also get the value of a particular key in QUERY using the syntax 
QUERY: followed by the key name, e.g. QUERY:k1.
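
The behaviour listed above can be approximated outside Hive. The following is a
minimal sketch in Python, assuming that urllib.parse splits these example URLs
the same way java.net.URL does; it is not the code from the patch, and the
function body is only an illustration of the part names and the QUERY:key
syntax.

```python
from urllib.parse import urlsplit


def parse_url(url, part):
    """Return one component of `url`, or None if it is absent or unknown.

    Illustrative stand-in for the UDF described above, not the patch itself.
    """
    u = urlsplit(url)

    # QUERY:<key> form: look up a single key in the raw query string.
    if part.startswith("QUERY:"):
        wanted = part[len("QUERY:"):]
        for pair in u.query.split("&"):
            key, sep, value = pair.partition("=")
            if sep and key == wanted:
                return value
        return None

    # Everything before the last '@' in the authority is the user info.
    userinfo, at, _ = u.netloc.rpartition("@")
    parts = {
        "HOST": u.hostname,
        "PATH": u.path,
        "QUERY": u.query,
        "REF": u.fragment,
        "PROTOCOL": u.scheme,
        "AUTHORITY": u.netloc,
        # FILE is taken here to mean path plus query string (an assumption).
        "FILE": u.path + ("?" + u.query if u.query else ""),
        "USERINFO": userinfo if at else "",
    }
    return parts.get(part) or None
```

Note that this sketch returns the raw, still percent-encoded value for a query
key; whether the UDF decodes values is not stated in this thread.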

> UDF for parsing the URL
> ---
>
> Key: HIVE-563
> URL: https://issues.apache.org/jira/browse/HIVE-563
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_563.txt, patch_563.txt.1
>
>
> Needs a udf to extract the parts of url from url string. 




[jira] Updated: (HIVE-563) UDF for parsing the URL

2009-06-15 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-563:
---

Attachment: patch_563.txt.1

 UDF to extract specific parts from URL
 parse_url('http://facebook.com/path/p1.php?query=1', 'HOST') will return 
'facebook.com'
 parse_url('http://facebook.com/path/p1.php?query=1', 'PATH') will return 
'/path/p1.php'
 parse_url('http://facebook.com/path/p1.php?query=1', 'QUERY') will return 
'query=1'
 parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'REF') will return 
'Ref'
 parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'PROTOCOL') will 
return 'http'
 Possible values are HOST,PATH,QUERY,REF,PROTOCOL,AUTHORITY,FILE,USERINFO
 You can also get the value of a particular key in QUERY using the syntax 
QUERY: followed by the key name, e.g. QUERY:k1.

> UDF for parsing the URL
> ---
>
> Key: HIVE-563
> URL: https://issues.apache.org/jira/browse/HIVE-563
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_563.txt
>
>
> Needs a udf to extract the parts of url from url string. 




[jira] Updated: (HIVE-563) UDF for parsing the URL

2009-06-15 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-563:
---

Attachment: (was: patch_563.txt.1)

> UDF for parsing the URL
> ---
>
> Key: HIVE-563
> URL: https://issues.apache.org/jira/browse/HIVE-563
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_563.txt
>
>
> Needs a udf to extract the parts of url from url string. 




[jira] Updated: (HIVE-563) UDF for parsing the URL

2009-06-15 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-563:
---

Attachment: patch_563.txt

parse_url -- UDF
 Format:
  parse_url(url, URL_PART_NAME)
Possible URL parts are: HOST,PATH,QUERY,REF,PROTOCOL,AUTHORITY,FILE,USERINFO
Example:
parse_url('http://facebook.com/path/p1.php?query=1', 'HOST') will return 
'facebook.com'
parse_url('http://facebook.com/path/p1.php?query=1', 'PATH') will return 
'/path/p1.php'

Definition of parts can be obtained from:
http://www.j2ee.me/j2se/1.4.2/docs/api/java/net/URL.html

> UDF for parsing the URL
> ---
>
> Key: HIVE-563
> URL: https://issues.apache.org/jira/browse/HIVE-563
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_563.txt
>
>
> Needs a udf to extract the parts of url from url string. 




[jira] Created: (HIVE-563) UDF for parsing the URL

2009-06-15 Thread Suresh Antony (JIRA)
UDF for parsing the URL
---

 Key: HIVE-563
 URL: https://issues.apache.org/jira/browse/HIVE-563
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Server Infrastructure
Reporter: Suresh Antony
Assignee: Suresh Antony


Need a UDF to extract the parts of a URL from a URL string.





[jira] Updated: (HIVE-376) In strict mode do not allow join without "ON" condition

2009-03-27 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-376:
---

  Component/s: Configuration
 Priority: Minor  (was: Major)
Affects Version/s: 0.4.0
Fix Version/s: 0.4.0
   Issue Type: New Feature  (was: Bug)

> In strict mode do not allow join without  "ON" condition
> 
>
> Key: HIVE-376
> URL: https://issues.apache.org/jira/browse/HIVE-376
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Configuration
>Affects Versions: 0.4.0
>Reporter: Suresh Antony
>Priority: Minor
> Fix For: 0.4.0
>
>
> In strict mode do not allow join without  "ON" condition. Such a join results 
> in a Cartesian product and an explosion of data. Very few people want to run 
> a join without a join condition. Usually it is a mistake.




[jira] Created: (HIVE-376) In strict mode do not allow join without "ON" condition

2009-03-27 Thread Suresh Antony (JIRA)
In strict mode do not allow join without  "ON" condition


 Key: HIVE-376
 URL: https://issues.apache.org/jira/browse/HIVE-376
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Suresh Antony


In strict mode do not allow join without  "ON" condition. Such a join results 
in a Cartesian product and an explosion of data. Very few people want to run a 
join without a join condition. Usually it is a mistake.
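
To make the size of the problem concrete, here is a toy model in Python. It is
only a sketch of the proposed check, not Hive's planner; the function name
`join` and the `strict` flag are hypothetical stand-ins for the strict-mode
setting.

```python
from itertools import product


def join(left, right, on=None, strict=False):
    """Toy nested-loop join; `strict` is a hypothetical stand-in flag."""
    if on is None:
        if strict:
            # The behaviour proposed in this issue: refuse the query outright.
            raise ValueError("strict mode: join without ON condition is not allowed")
        # No condition: every left row pairs with every right row.
        return list(product(left, right))
    return [(l, r) for l, r in product(left, right) if on(l, r)]


# Two inputs of 100 rows each give 100 * 100 = 10,000 output rows without a
# condition, but only 100 rows with an equality condition.
unconditioned = join(range(100), range(100))
conditioned = join(range(100), range(100), on=lambda l, r: l == r)
```

With two tables of a million rows each, the same unconditioned join would
produce 10^12 rows, which is why rejecting it up front is attractive.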




[jira] Updated: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with no tasks

2009-03-13 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-349:
---

Attachment: patch_349_1.txt

> HiveHistory: TestCLiDriver fails if there are test cases with  no tasks
> ---
>
> Key: HIVE-349
> URL: https://issues.apache.org/jira/browse/HIVE-349
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.3.0
>
> Attachments: patch_349_1.txt
>
>
> TestCLIDriver Fails for some test cases.




[jira] Updated: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with no tasks

2009-03-13 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-349:
---

Summary: HiveHistory: TestCLiDriver fails if there are test cases with  no 
tasks  (was: HiveHistory: TestCLiDriver fails if there are test cases with  
tasks)

> HiveHistory: TestCLiDriver fails if there are test cases with  no tasks
> ---
>
> Key: HIVE-349
> URL: https://issues.apache.org/jira/browse/HIVE-349
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.3.0
>
>
> TestCLIDriver Fails for some test cases.




[jira] Assigned: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with tasks

2009-03-13 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony reassigned HIVE-349:
--

Assignee: Suresh Antony

> HiveHistory: TestCLiDriver fails if there are test cases with  tasks
> 
>
> Key: HIVE-349
> URL: https://issues.apache.org/jira/browse/HIVE-349
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.3.0
>
>
> TestCLIDriver Fails for some test cases.




[jira] Created: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with tasks

2009-03-13 Thread Suresh Antony (JIRA)
HiveHistory: TestCLiDriver fails if there are test cases with  tasks


 Key: HIVE-349
 URL: https://issues.apache.org/jira/browse/HIVE-349
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.3.0
Reporter: Suresh Antony
 Fix For: 0.3.0


TestCLIDriver Fails for some test cases.




[jira] Updated: (HIVE-327) row count getting printed wrongly

2009-03-05 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-327:
---

Attachment: patch_327_1.txt

Fixed the problem: the RowCount hash map is now associated with the query id.
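
The idea behind the fix can be sketched as follows. This is a minimal Python
illustration, assuming row counts are stored per table; the class and method
names are hypothetical and are not taken from the patch.

```python
from collections import defaultdict


class RowCountRegistry:
    """Hypothetical holder for per-table row counts, keyed by query id."""

    def __init__(self):
        # query id -> {table name -> rows inserted}
        self._counts = defaultdict(dict)

    def record(self, query_id, table, rows):
        self._counts[query_id][table] = rows

    def counts_for(self, query_id):
        # Only the counts of the requested query are returned, so an earlier
        # query in the same session cannot leak into a later one's output.
        return dict(self._counts.get(query_id, {}))
```

Keying by query id means a session-wide map no longer has to be cleared at
exactly the right moment between queries.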

> row count getting printed wrongly
> -
>
> Key: HIVE-327
> URL: https://issues.apache.org/jira/browse/HIVE-327
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.2.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_327_1.txt
>
>
> When multiple queries are executed in same session, row count of the first 
> query is getting printed for subsequent queries. 




[jira] Created: (HIVE-327) row count getting printed wrongly

2009-03-05 Thread Suresh Antony (JIRA)
row count getting printed wrongly
-

 Key: HIVE-327
 URL: https://issues.apache.org/jira/browse/HIVE-327
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.2.0
Reporter: Suresh Antony
Assignee: Suresh Antony
 Fix For: 0.2.0


When multiple queries are executed in same session, row count of the first 
query is getting printed for subsequent queries. 





[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.

2009-02-25 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-79:
--

Attachment: patch_79_5.txt

The test failed yesterday because a new test was added. The patch changes the 
plan for the filesink operator, so a newly added test case fails the unit test.

> Print number of rows inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt, 
> patch_79_4.txt, patch_79_5.txt
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.

2009-02-24 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-79:
--

Attachment: patch_79_4.txt

Resolved the conflicts

> Print number of rows inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt, 
> patch_79_4.txt
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Commented: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.

2009-02-17 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674327#action_12674327
 ] 

Suresh Antony commented on HIVE-79:
---

If so, can I commit this patch?

> Print number of rows inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Commented: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.

2009-02-16 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673919#action_12673919
 ] 

Suresh Antony commented on HIVE-79:
---

Trying to add test case ...
We run our test cases in local mode.
Looks like there are no counters getting created in local mode. Need to 
investigate more.

> Print number of rows inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.

2009-02-12 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-79:
--

Attachment: patch_79_3.txt

1. Added printing of the row count to the output.
2. Also added the row count to the query history.
3. Changed the test outputs because fileSinkDesc changed: added the table id 
to the file sink descriptor.

> Print number of rows inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Updated: (HIVE-79) Print number of raws inserted to table(s) when the query is finished.

2009-01-30 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-79:
--

Attachment: patch_79_2.txt

Implemented the feedback from Namit.
Resetting the id and map in the reset() call.
 

> Print number of raws inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt, patch_79_2.txt
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Updated: (HIVE-79) Print number of raws inserted to table(s) when the query is finished.

2009-01-29 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-79:
--

Attachment: patch_79_1.txt

This patch logs the inserted row count to the Hive query log.
The logged format will be:
TaskEnd TASK_ROWS_INSERTED="tmp_suresh_12:181687,tmp_suresh_13:181687"

Made changes to the semantic analyzer to keep track of the id-to-table-name map.

HiveHistory converts the id back to the table name and writes it to the 
structured query log.
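
A consumer of the log could read the row counts back out of such a line. The
following Python sketch assumes the value is always a comma-separated list of
table:count entries, as in the sample line above; the helper name is
hypothetical.

```python
import re


def parse_rows_inserted(line):
    """Return {table name: row count} from a TASK_ROWS_INSERTED log line."""
    match = re.search(r'TASK_ROWS_INSERTED="([^"]*)"', line)
    if match is None:
        return {}
    counts = {}
    for entry in match.group(1).split(","):
        # Split on the last ':' so the count is always the final field.
        table, sep, rows = entry.rpartition(":")
        if sep:
            counts[table] = int(rows)
    return counts


sample = 'TaskEnd TASK_ROWS_INSERTED="tmp_suresh_12:181687,tmp_suresh_13:181687"'
parsed = parse_rows_inserted(sample)
```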

> Print number of raws inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Assigned: (HIVE-79) Print number of raws inserted to table(s) when the query is finished.

2009-01-29 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony reassigned HIVE-79:
-

Assignee: Suresh Antony

> Print number of raws inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
>
> It is good to print the number of rows inserted into each table at end of 
> query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Updated: (HIVE-257) Put the structed hhive query location in hive-site.xml

2009-01-28 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-257:
---

Attachment: patch_257_2.txt

Last patch contained all the changes...

> Put the structed hhive query location in hive-site.xml
> --
>
> Key: HIVE-257
> URL: https://issues.apache.org/jira/browse/HIVE-257
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.2.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_257.txt, patch_257_2.txt
>
>
> Put the structured Hive query log location in hive-site.xml. Also change the 
> name of the query log to add a random integer to the file name, so that 
> multiple sessions do not overwrite the same file.




[jira] Updated: (HIVE-257) Put the structed hhive query location in hive-site.xml

2009-01-28 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-257:
---

Attachment: patch_257.txt

> Put the structed hhive query location in hive-site.xml
> --
>
> Key: HIVE-257
> URL: https://issues.apache.org/jira/browse/HIVE-257
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.2.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_257.txt
>
>
> Put the structured Hive query log location in hive-site.xml. Also change the 
> name of the query log to add a random integer to the file name, so that 
> multiple sessions do not overwrite the same file.




[jira] Created: (HIVE-257) Put the structed hhive query location in hive-site.xml

2009-01-28 Thread Suresh Antony (JIRA)
Put the structed hhive query location in hive-site.xml
--

 Key: HIVE-257
 URL: https://issues.apache.org/jira/browse/HIVE-257
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.2.0
Reporter: Suresh Antony
Assignee: Suresh Antony
 Fix For: 0.2.0


Put the structured Hive query log location in hive-site.xml. Also change the 
name of the query log to add a random integer to the file name, so that 
multiple sessions do not overwrite the same file.
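
The naming scheme could look roughly like this. It is a Python sketch under
stated assumptions: the base name "hive_job_log", the ".txt" extension and the
helper name are hypothetical, and the directory would come from whatever
property hive-site.xml ends up exposing.

```python
import os
import random


def query_log_path(log_dir, base_name="hive_job_log", rng=random):
    """Build a per-session query log path with a random integer suffix.

    `base_name` and the ".txt" extension are hypothetical choices.
    """
    suffix = rng.randint(0, 2**31 - 1)
    return os.path.join(log_dir, f"{base_name}_{suffix}.txt")


path = query_log_path("/tmp/hive")
```

A random suffix makes collisions between concurrent sessions unlikely rather
than impossible; a scheme needing a hard guarantee would have to create the
file exclusively and retry on conflict.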




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2009-01-28 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176_2.txt

added hive.querylog.location

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt, 
> patch_176.txt, patch_176.txt, patch_176.txt, patch_176.txt, patch_176_2.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Commented: (HIVE-176) structured log for obtaining query stats/info

2009-01-27 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667944#action_12667944
 ] 

Suresh Antony commented on HIVE-176:


Create the Hive job history directory if it does not exist.

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt, 
> patch_176.txt, patch_176.txt, patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2009-01-27 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176.txt

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt, 
> patch_176.txt, patch_176.txt, patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2009-01-21 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176.txt

Made the following changes:

Added a "TIME" key to every line. Its value is the result of 
"System.currentTimeMillis()".

Also added a QueryId instead of using the query string as the query id. Added 
a new conf variable, "hive.query.id".
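
A rough sketch of what a log line with a TIME key and a separate query id could look like. Only "TIME", "hive.query.id" and System.currentTimeMillis() come from the comment above; the class name, the QUERY_ID key, the id scheme and the space-separated key="value" layout are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the code from the patch.
class QueryIdSketch {
    static final String QUERY_ID_CONF = "hive.query.id";

    // Assumed id scheme: user name plus creation time. The patch may differ.
    static String makeQueryId(String user, long nowMillis) {
        return user + "_" + nowMillis;
    }

    // Builds one log line that carries the query id and a TIME key.
    static String logLine(String entryType, Map<String, String> conf, long nowMillis) {
        return entryType
            + " QUERY_ID=\"" + conf.get(QUERY_ID_CONF) + "\""
            + " TIME=\"" + nowMillis + "\"";
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Map<String, String> conf = new HashMap<>();
        conf.put(QUERY_ID_CONF, makeQueryId("someuser", now));
        System.out.println(logLine("QueryStart", conf, now));
    }
}
```

Keying lines on a short id rather than the full query string keeps each line compact and lets two runs of the same query text be told apart.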





> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt, 
> patch_176.txt, patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Commented: (HIVE-176) structured log for obtaining query stats/info

2009-01-19 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665148#action_12665148
 ] 

Suresh Antony commented on HIVE-176:


Changed hive.joblog.location to hive.querylog.location.

Also changed e.printStackTrace().

Changed CliDriver: moved SessionState.start() after conf variable 
initialization. Otherwise, a conf setting that changed hive.querylog.location 
had no effect, because HiveHistory was initialized before hiveconf was 
parsed.
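
The ordering problem can be illustrated with a small stand-alone sketch. None of these classes are Hive's: History is a stand-in for a logger that reads its location once at construction, and the default path is an invented example value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the initialization-order issue.
class InitOrderSketch {
    static final String KEY = "hive.querylog.location";
    static final String DEFAULT_LOCATION = "/tmp/default-querylog"; // example value

    // Stand-in for a history logger that reads its location once.
    static final class History {
        final String location;

        History(Map<String, String> conf) {
            this.location = conf.getOrDefault(KEY, DEFAULT_LOCATION);
        }
    }

    // Old order: the history object is created before the override is parsed.
    static String startThenParse(String override) {
        Map<String, String> conf = new HashMap<>();
        History history = new History(conf);
        conf.put(KEY, override); // too late, the location was already read
        return history.location;
    }

    // New order: the override is parsed first, then the history object is created.
    static String parseThenStart(String override) {
        Map<String, String> conf = new HashMap<>();
        conf.put(KEY, override);
        return new History(conf).location;
    }

    public static void main(String[] args) {
        System.out.println("start then parse: " + startThenParse("/data/querylog"));
        System.out.println("parse then start: " + parseThenStart("/data/querylog"));
    }
}
```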

Thanks for all the feedback.

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt, 
> patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2009-01-19 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176.txt

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt, 
> patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Commented: (HIVE-176) structured log for obtaining query stats/info

2009-01-15 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664204#action_12664204
 ] 

Suresh Antony commented on HIVE-176:


*  inferNumReducers(): instead of two calls to the hivehistory - can just 
make one call at the end of the function when the numReducers has been set for 
sure. We could also set NUM_REDUCERS to 0 when no reducer is specified (more 
informative imho).
--- Made it a single call after this function call.
* I still don't see why HAS_REDUCE_TASKS and NUM_REDUCE_TASKS are 
meaningful counters. what is the use case?
--- Removed both of these variables
* In TestHiveHistory - please use setup() method or constructor to do 
initialization. also a negative test case would be good (to check if negative 
job status is being captured for example).
--- moved this code to setUp()
* HiveHistoryViewer - indentation is badly off. I think we are following a 
general convention of '} else {' as well (and curly braces on same line as 
function/class declaration - viz 'void init() {').
--- Re-formatted using the Eclipse formatter.
* JOB_STATUS and TASK_STATUS are both unused.
* i couldn't understand this code block in parseHiveHistory:
  + if (!line.trim().endsWith("\"")) {
  +   continue;
  + }
  can u explain.
--- The format is key="value", so a line that does not end with " means the 
value contains a newline and continues on the next line.

* parseLine: confused that we have a reg ex group for the key - but are not 
using it .. seems weird - if u had groups for both key and value u wouldn't 
need to split. alternately u can rely on just the split.
--- Cut and pasted this code from the JobHistory parser.
* getHiveHistory - i don't think it's a good idea to initialize hivehistory 
object on demand:
  a) u always need it
  b) it prints stuff to the console (log file location). if u want a 
deterministic location for this log - we should just initialize hivehistory at 
session initialization so that the log file location always comes at the 
beginning of the session (and not at some random point when the code actually 
requires it)

--- Moved HiveHistory initialization to the constructor of SessionState.
* it would be good to have an example of the hive history file/format 
checked in somewhere with a pointer to it from the documentation (either in 
README or wiki).
--- Put short summary about the HistoryLog in internal wiki.
   http://www.intern.facebook.com/intern/wiki/index.php/HiveQueryLog
* another easy and comprehensive test to add is in TestCliDriver. This is 
generated code that fires a bunch of queries - we should be easily able to use 
HiveHistoryViewer to assert that query status is successful for all queries in 
positive tests.
--- Added a HiveHistory check to TestCliDriver. For this to work, 
SessionState is constructed in the constructor of QTestUtil. Not sure whether 
this is the correct way.
--- Changed TestCliDriver.vm to check the history file.

One thing i am concerned about overall is the use of the term 'job' for what is 
essentially a hive query. I think this creates a lot of room for confusion - 
since in the hadoop ecosystem job means hadoop job. (we have also overloaded 
the word task in Hive - which is unfortunate - but almost too late now). If 
possible - i would really appreciate if we could replace 'job' with 'query' 
wherever applicable. (s/startJob/startQuery/ for example).
--- Changed all Job references to Query.

--- Should we always create the history file? History will be disabled by 
default and enabled by setting a jobconf parameter, 'enable.job.history'.
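
The key="value" format and the endsWith check discussed above can be sketched as a small parser. This is an illustration under stated assumptions, not the HiveHistoryViewer code: the class name, the regular expression and the sample keys are hypothetical, and values containing embedded double quotes are not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of parsing key="value" records whose values may span lines.
class HistoryParseSketch {
    // One group for the key and one for the value, so no extra split is needed.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> records = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        for (String line : text.split("\n", -1)) {
            pending.append(line);
            // A line that does not end with a quote is an unfinished value:
            // keep the newline and wait for the rest of the record.
            if (!line.trim().endsWith("\"")) {
                pending.append('\n');
                continue;
            }
            Map<String, String> record = new LinkedHashMap<>();
            Matcher m = PAIR.matcher(pending);
            while (m.find()) {
                record.put(m.group(1), m.group(2));
            }
            records.add(record);
            pending.setLength(0);
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "QueryStart QUERY_ID=\"q1\" QUERY_STRING=\"select *\nfrom src\" TIME=\"1\"\n"
            + "QueryEnd QUERY_ID=\"q1\" TIME=\"2\"\n";
        for (Map<String, String> record : parse(sample)) {
            System.out.println(record);
        }
    }
}
```

Using capture groups for both key and value follows the reviewer's suggestion that a separate split step is then unnecessary.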




> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2009-01-15 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176.txt

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt, 
> patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2009-01-11 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176.txt

Fixed review comments and added a simple test case.

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Commented: (HIVE-176) structured log for obtaining query stats/info

2009-01-05 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660868#action_12660868
 ] 

Suresh Antony commented on HIVE-176:


Attached a patch with the following changes:
1. HiveHistory is stored in a SessionState variable. HiveHistory is created on 
the first call that logs job history.
2. Removed all references to HiveHistory from CliDriver.
3. Added a new config variable, "hive.joblog.location", for the location of 
the history file. The scratch directory is not the default location because 
we remove the scratch directory at the end of the run.

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2009-01-05 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176.txt

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt, patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-202) LINEAGE is not working for join quries

2008-12-31 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-202:
---

Attachment: patch_202.txt

New patch with test case included.

> LINEAGE is not  working for join quries
> ---
>
> Key: HIVE-202
> URL: https://issues.apache.org/jira/browse/HIVE-202
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.2.0
> Environment: lineage is not working for join queries
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_202.txt, patch_202.txt
>
>
> lineage is not giving input tables in case of join queries.




[jira] Commented: (HIVE-202) LINEAGE is not working for join quries

2008-12-31 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660125#action_12660125
 ] 

Suresh Antony commented on HIVE-202:


Added a new patch with a test case included.

> LINEAGE is not  working for join quries
> ---
>
> Key: HIVE-202
> URL: https://issues.apache.org/jira/browse/HIVE-202
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.2.0
> Environment: lineage is not working for join queries
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_202.txt, patch_202.txt
>
>
> lineage is not giving input tables in case of join queries.




[jira] Commented: (HIVE-202) LINEAGE is not working for join quries

2008-12-30 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660010#action_12660010
 ] 

Suresh Antony commented on HIVE-202:


This query is already there in the test...

FROM (
  FROM (
    FROM src1 src1
    SELECT src1.key AS c1, src1.value AS c2
    WHERE src1.key > 10 and src1.key < 20
  ) a
  RIGHT OUTER JOIN (
    FROM src2 src2
    SELECT src2.key AS c3, src2.value AS c4
    WHERE src2.key > 15 and src2.key < 25
  ) b
  ON (a.c1 = b.c3)
  SELECT a.c1 AS c1, a.c2 AS c2, b.c3 AS c3, b.c4 AS c4
) c
SELECT c.c1, c.c2, c.c3, c.c4


> LINEAGE is not  working for join quries
> ---
>
> Key: HIVE-202
> URL: https://issues.apache.org/jira/browse/HIVE-202
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.2.0
> Environment: lineage is not working for join queries
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_202.txt
>
>
> lineage is not giving input tables in case of join queries.




[jira] Updated: (HIVE-202) LINEAGE is not working for join quries

2008-12-30 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-202:
---

Attachment: patch_202.txt

> LINEAGE is not  working for join quries
> ---
>
> Key: HIVE-202
> URL: https://issues.apache.org/jira/browse/HIVE-202
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.2.0
> Environment: lineage is not working for join queries
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_202.txt
>
>
> lineage is not giving input tables in case of join queries.




[jira] Assigned: (HIVE-202) LINEAGE is not working for join quries

2008-12-30 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony reassigned HIVE-202:
--

Assignee: Suresh Antony

> LINEAGE is not  working for join quries
> ---
>
> Key: HIVE-202
> URL: https://issues.apache.org/jira/browse/HIVE-202
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.2.0
> Environment: lineage is not working for join queries
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
>
> lineage is not giving input tables in case of join queries.




[jira] Created: (HIVE-202) LINEAGE is not working for join quries

2008-12-30 Thread Suresh Antony (JIRA)
LINEAGE is not  working for join quries
---

 Key: HIVE-202
 URL: https://issues.apache.org/jira/browse/HIVE-202
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Clients
Affects Versions: 0.2.0
 Environment: lineage is not working for join queries
Reporter: Suresh Antony
Priority: Minor
 Fix For: 0.2.0


Lineage is not giving input tables in case of join queries.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2008-12-29 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Fix Version/s: 0.2.0
Affects Version/s: 0.2.0
 Release Note: Hive History Logging
   Status: Patch Available  (was: Open)

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.2.0
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-176) structured log for obtaining query stats/info

2008-12-29 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-176:
---

Attachment: patch_176.txt

logging hive history
Files changed:
1. Driver.java
2. CliDriver.java
3. ExecDriver.java

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Joydeep Sen Sarma
> Attachments: patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Assigned: (HIVE-176) structured log for obtaining query stats/info

2008-12-29 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony reassigned HIVE-176:
--

Assignee: Suresh Antony

> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Joydeep Sen Sarma
>Assignee: Suresh Antony
> Attachments: patch_176.txt
>
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-148) extend bin/hive to include the lineage tool

2008-12-18 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-148:
---

Attachment: patch_148.txt

Adding lineage service to the hive binary.

> extend bin/hive to include the lineage tool 
> 
>
> Key: HIVE-148
> URL: https://issues.apache.org/jira/browse/HIVE-148
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Clients
>Affects Versions: 0.2.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_148.txt
>
>
> bin/hive is currently used only to execute queries. Add options to bin/hive 
> to output the lineage info given the query as the input.




[jira] Updated: (HIVE-148) extend bin/hive to include the lineage tool

2008-12-18 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-148:
---

Fix Version/s: 0.2.0
Affects Version/s: 0.2.0
 Release Note: Adding lineage service extension to hive binary
   Status: Patch Available  (was: Open)

Adding lineage service extension to hive binary

> extend bin/hive to include the lineage tool 
> 
>
> Key: HIVE-148
> URL: https://issues.apache.org/jira/browse/HIVE-148
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Clients
>Affects Versions: 0.2.0
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Fix For: 0.2.0
>
> Attachments: patch_148.txt
>
>
> bin/hive is currently used only to execute queries. Add options to bin/hive 
> to output the lineage info given the query as the input.




[jira] Commented: (HIVE-176) structured log for obtaining query stats/info

2008-12-15 Thread Suresh Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656855#action_12656855
 ] 

Suresh Antony commented on HIVE-176:


Some thoughts about logging:
  - Thinking of following the same format as Hadoop's logging for jobs.
  - Each query will have a separate file. The location of the file will be 
    printed to stdout while running the query.
  - Entries in the file will have the following format:
      id=value key1=value1 key2=value2 .
  - Different entry types can be:
      - QueryStart
      - QueryEnd
      - QueryStepStart
      - QueryStepEnd
      - QueryStepProgress
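
To show how a monitoring script might consume such a log, here is a small sketch that reports which queries have started but not yet ended. It assumes, beyond what is proposed above, that the entry type is the first token on each line, that id=value is the second token, and that values contain no spaces. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical consumer of the proposed per-query log format.
class RunningQueriesSketch {
    // Entry types as listed in the proposal above.
    enum EntryType { QueryStart, QueryEnd, QueryStepStart, QueryStepEnd, QueryStepProgress }

    // Returns the ids that have a QueryStart entry but no QueryEnd entry yet.
    static Set<String> running(List<String> lines) {
        Set<String> open = new LinkedHashSet<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length < 2 || !tokens[1].startsWith("id=")) {
                continue; // not an entry in the assumed layout
            }
            EntryType type;
            try {
                type = EntryType.valueOf(tokens[0]);
            } catch (IllegalArgumentException e) {
                continue; // unknown entry type
            }
            String id = tokens[1].substring("id=".length());
            if (type == EntryType.QueryStart) {
                open.add(id);
            } else if (type == EntryType.QueryEnd) {
                open.remove(id);
            }
        }
        return open;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "QueryStart id=q1 user=a",
            "QueryStart id=q2 user=b",
            "QueryStepStart id=q1 step=1",
            "QueryEnd id=q1 status=0");
        System.out.println("still running: " + running(lines));
    }
}
```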





> structured log for obtaining query stats/info
> -
>
> Key: HIVE-176
> URL: https://issues.apache.org/jira/browse/HIVE-176
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Joydeep Sen Sarma
>
> Josh  wrote:
> When launching off hive queries using hive -e is there a way to get the job 
> id so that I can just queue them up and go check their statuses later? What's 
> the general pattern for queueing and monitoring without using the libraries 
> directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it 
> and use whatever queuing or monitoring they wish. It's also probably just a 
> 30 minute project for someone already familiar with the code. I suggest ^A 
> separated key=value pairs per log line.




[jira] Updated: (HIVE-147) Need a tool for extracting lineage info from hive sql

2008-12-12 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-147:
---

Status: Patch Available  (was: Open)

> Need a tool for extracting lineage info from hive sql
> -
>
> Key: HIVE-147
> URL: https://issues.apache.org/jira/browse/HIVE-147
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_147.txt, patch_147.txt
>
>
> Need a tool to extract the lineage information from a Hive query.
> This tool should take a Hive query as input and output the input and 
> output tables used by the query.




[jira] Updated: (HIVE-147) Need a tool for extracting lineage info from hive sql

2008-12-12 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-147:
---

Attachment: patch_147.txt

Adding license text to the newly added files.

> Need a tool for extracting lineage info from hive sql
> -
>
> Key: HIVE-147
> URL: https://issues.apache.org/jira/browse/HIVE-147
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_147.txt, patch_147.txt
>
>
> Need a tool to extract the lineage information from a Hive query.
> This tool should take a Hive query as input and output the input and 
> output tables used by the query.




[jira] Updated: (HIVE-147) Need a tool for extracting lineage info from hive sql

2008-12-09 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-147:
---

Attachment: patch_147.txt

LineageInfo.java prints the input/output tables for a given query. It uses the 
new tree walker architecture.

> Need a tool for extracting lineage info from hive sql
> -
>
> Key: HIVE-147
> URL: https://issues.apache.org/jira/browse/HIVE-147
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Suresh Antony
>Assignee: Suresh Antony
> Attachments: patch_147.txt
>
>
> Need a tool to extract the lineage information from a Hive query.
> This tool should take a Hive query as input and output the input and 
> output tables used by the query.




[jira] Created: (HIVE-148) extend bin/hive to include the lineage tool

2008-12-09 Thread Suresh Antony (JIRA)
extend bin/hive to include the lineage tool 


 Key: HIVE-148
 URL: https://issues.apache.org/jira/browse/HIVE-148
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Clients
Reporter: Suresh Antony
Assignee: Suresh Antony


bin/hive is currently used only to execute queries. Add options to bin/hive to
output the lineage info for a query given as input.




[jira] Created: (HIVE-147) Need a tool for extracting lineage info from hive sql

2008-12-09 Thread Suresh Antony (JIRA)
Need a tool for extracting lineage info from hive sql
-

 Key: HIVE-147
 URL: https://issues.apache.org/jira/browse/HIVE-147
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Suresh Antony


Need a tool to extract the lineage information from a Hive query.
This tool should take a Hive query as input and output the input and
output tables used by the query.





[jira] Assigned: (HIVE-147) Need a tool for extracting lineage info from hive sql

2008-12-09 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony reassigned HIVE-147:
--

Assignee: Suresh Antony

> Need a tool for extracting lineage info from hive sql
> -
>
> Key: HIVE-147
> URL: https://issues.apache.org/jira/browse/HIVE-147
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: Suresh Antony
> Assignee: Suresh Antony
>
> Need a tool to extract the lineage information from a Hive query.
> This tool should take a Hive query as input and output the input and
> output tables used by the query.




[jira] Created: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.

2008-11-25 Thread Suresh Antony (JIRA)
Print number of rows inserted to table(s) when the query is finished.
--

 Key: HIVE-79
 URL: https://issues.apache.org/jira/browse/HIVE-79
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.19.0
Reporter: Suresh Antony
Priority: Minor
 Fix For: 0.19.0


It would be good to print the number of rows inserted into each table at the end
of a query. For example:
insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;

This query could print something like:
tab1 rows=100
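
As a rough sketch of the proposed output format (not Hive code), the per-table summary could be produced from a mapping of table name to row count. The function name format_row_counts and the dictionary input are assumptions made for illustration; how Hive would actually gather the counts is not specified here.

```python
from typing import Dict, List


def format_row_counts(rows_by_table: Dict[str, int]) -> List[str]:
    """Return one 'table rows=N' line per table, in insertion order."""
    return [f"{table} rows={count}" for table, count in rows_by_table.items()]


# Hypothetical counts for the example query above.
for line in format_row_counts({"tab1": 100}):
    print(line)
```

Returning one line per table keeps the format usable for multi-table inserts, where a single query writes to more than one destination.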
